How do I get the value of a soup.select?

Mangs · Published 2022-08-25 07:16:08

Question

<h2 class="hello-word"><a href="http://www.google.com">Google</a></h2>

How do I grab the value of the a tag (Google)?

print(soup.select("h2 > a"))

This returns the entire a tag, but I just want the value. Also, there could be multiple h2 elements on the page. How do I filter for the one with the class hello-word?

Answers

You can append .hello-word to h2 in the CSS selector (h2.hello-word) to match only h2 tags with the class hello-word, and then select their child a. Note that soup.select() returns a list of all matches, so you can iterate over it and read each element's .text to get the text. Example:

for i in soup.select("h2.hello-word > a"):
    print(i.text)

Example/Demo (I added a few elements of my own, one with a slightly different class, to show how the selector works):

>>> from bs4 import BeautifulSoup
>>> s = """<h2 class="hello-word"><a href="http://www.google.com">Google</a></h2>
... <h2 class="hello-word"><a href="http://www.google.com">Google12</a></h2>
... <h2 class="hello-word2"><a href="http://www.google.com">Google13</a></h2>"""

>>> soup = BeautifulSoup(s,'html.parser')

>>> for i in soup.select("h2.hello-word > a"):
...     print(i.text)
...
Google
Google12
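
If only the first match is needed, or the link target is wanted as well as the text, something along these lines may help. This is a minimal sketch and not part of the original answer: it reuses the demo markup above, and the variable names html, texts and first are just illustrative. It uses select_one(), which returns the first match or None, and get_text(strip=True), which returns the text with surrounding whitespace removed.

```python
from bs4 import BeautifulSoup

# Same markup as the demo above (the variable name `html` is illustrative).
html = """
<h2 class="hello-word"><a href="http://www.google.com">Google</a></h2>
<h2 class="hello-word"><a href="http://www.google.com">Google12</a></h2>
<h2 class="hello-word2"><a href="http://www.google.com">Google13</a></h2>
"""

soup = BeautifulSoup(html, "html.parser")

# select() always returns a list, so collect the text of every match.
texts = [a.get_text(strip=True) for a in soup.select("h2.hello-word > a")]
print(texts)  # ['Google', 'Google12']

# select_one() returns only the first match, or None if nothing matches.
first = soup.select_one("h2.hello-word > a")
if first is not None:
    print(first.get_text(strip=True))  # Google
    print(first["href"])               # http://www.google.com
```

Checking for None before reading .text or an attribute avoids an AttributeError or TypeError on pages where the selector matches nothing.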
