Using find_all in BS4 to get text as a list

I'll start by saying I'm very new to Python. I've been building a Discord bot with discord.py and Beautiful Soup 4. Here's where I'm at:

@commands.command(hidden=True)
async def roster(self):
    """Gets a list of CD's members"""
    url = "http://www.clandestine.pw/roster.html"
    async with aiohttp.get(url) as response:
        soupObject = BeautifulSoup(await response.text(), "html.parser")
    try:
        text = soupObject.find_all("font", attrs={'size': '4'})
        await self.bot.say(text)
    except:
        await self.bot.say("Not found!")

Here's the output:

[Screenshot not preserved: the bot posts the raw find_all result, square brackets and HTML tags included.]

Now, I've tried using get_text() in several different ways to strip the brackets and HTML tags from this output, but it throws an error each time. How can I either do that, or put this data into an array or list and then print just the plain text?

Solution

Replace

text = soupObject.find_all("font", attrs={'size': '4'})

with this:

all_font_tags = soupObject.find_all("font", attrs={'size': '4'})
list_of_inner_text = [x.text for x in all_font_tags]
# If you want to print the text as a comma-separated string
text = ', '.join(list_of_inner_text)
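
The reason get_text() fails on the original result is that find_all returns a ResultSet, which is a list of Tag objects. The text methods live on each Tag, so calling get_text() on the list itself raises an AttributeError. Here is a minimal, self-contained sketch of the same idea that runs without Discord or a network request. The SAMPLE_HTML string and the member_names helper are hypothetical stand-ins for illustration; the real roster page's markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the roster page; the real markup may differ.
SAMPLE_HTML = """
<html><body>
<font size="4">Alice</font>
<font size="4"> Bob </font>
<font size="2">not a member</font>
</body></html>
"""


def member_names(html):
    """Return the stripped text of every <font size="4"> tag as a list of str."""
    soup = BeautifulSoup(html, "html.parser")
    tags = soup.find_all("font", attrs={"size": "4"})
    # find_all returns a list-like ResultSet of Tag objects, so get_text()
    # must be called on each Tag rather than on the ResultSet itself.
    return [tag.get_text(strip=True) for tag in tags]


names = member_names(SAMPLE_HTML)
print(names)             # ['Alice', 'Bob']
print(", ".join(names))  # Alice, Bob
```

Using get_text(strip=True) instead of .text also trims surrounding whitespace, which is probably what you want before joining the names into a single message.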