BeautifulSoup Scraping - 'find' method does not return any children in 'div' tag

Mangs · Published 2022-08-24 17:22:00

Question

I am using BeautifulSoup to extract information from [http://financials.morningstar.com/company-profile/c.action?t=AAPL][1], specifically the 'CIK' field in the 'Operation Details' section, as shown in the [image][1].

This is the code I have used:

```
import requests
from bs4 import BeautifulSoup

page = requests.get('http://financials.morningstar.com/company-profile/c.action?t=AAPL')
soup = BeautifulSoup(page.content, 'html5lib')
div = soup.find(name='div', attrs={'id': 'OperationDetails'})
```

When I run ```print(div)```, the output is an empty tag.

However, when I inspect the page, the 'div' tag with **id='OperationDetails'** does have child tags under it. Am I missing something here?

I am new to BeautifulSoup and was practicing on this website. What is wrong, and how do I get the 'table' element that contains the information (CIK) I am looking for?

Sincere thanks.
Edit: I am really sorry, I don't know why Stack Overflow is removing the image and website links after the question is posted. Please let me know if you need any additional details and I will respond as quickly as I can. Thanks again.

Answers

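The "Operation Details" panel is not part of the initial HTML. It is loaded from a separate URL, which is why the 'div' you get from the main page is empty. You can request that URL directly, for example: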

```
import requests
from bs4 import BeautifulSoup

# Endpoint that serves the individual panels of the company profile page.
url = (
    "http://financials.morningstar.com/cmpind/company-profile/component.action"
)

query = {
    "component": "OperationDetails",
    "t": "XNAS:AAPL",  # <--- change to your ID
    "region": "usa",
    "culture": "en-US",
    "cur": "",
}

soup = BeautifulSoup(requests.get(url, params=query).content, "html.parser")

# Find the <th> whose text contains "CIK", then read the <td> that follows it.
cik = (
    soup.find(lambda tag: tag.name == "th" and "CIK" in tag.text)
    .find_next("td")
    .text
)
print(cik)
```

This prints the CIK value for the requested ticker.
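If you want to check this kind of problem without hitting the network, the sketch below may help. It parses two made-up HTML snippets: one that mimics an empty placeholder 'div' like the one in the question, and one that mimics a panel table with a 'CIK' row. The snippets, the value `1234567`, and the helper names `extract_cik` and `is_empty_placeholder` are assumptions for illustration only; the real Morningstar markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, NOT copied from the real site.
STATIC_PAGE = '<html><body><div id="OperationDetails"></div></body></html>'
COMPONENT_HTML = """
<table>
  <tr><th>Fiscal Year Ends</th><td>September</td></tr>
  <tr><th>CIK</th><td> 1234567 </td></tr>
</table>
"""


def is_empty_placeholder(html, div_id):
    """Return True if the div with this id exists but has no child tags."""
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", id=div_id)
    # div.find(True) returns the first descendant tag, or None if there is none.
    return div is not None and div.find(True) is None


def extract_cik(html):
    """Return the text of the <td> after the 'CIK' <th>, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    header = soup.find(lambda tag: tag.name == "th" and "CIK" in tag.get_text())
    if header is None:
        return None  # no 'CIK' header in this document
    cell = header.find_next("td")
    return cell.get_text(strip=True) if cell is not None else None


print(is_empty_placeholder(STATIC_PAGE, "OperationDetails"))
print(extract_cik(STATIC_PAGE))
print(extract_cik(COMPONENT_HTML))
```

Unlike the chained `.find(...).find_next(...).text` call, `extract_cik` returns `None` instead of raising `AttributeError` when the header is missing, which makes it easier to see whether you fetched the right URL. With the request from the answer, it could be called as `extract_cik(requests.get(url, params=query).text)`.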
