Web Scraping Google Finance

Mangs · Posted 2022-08-25 02:45:48

Question

I'm trying to teach myself how to scrape stock data from the web. I'm quite a newbie, so please excuse any stupid questions I may ask.

Here's my code for scraping the price; I'm trying to scrape the P/E ratio as well.

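import urllib.request
from bs4 import BeautifulSoup

start = 'http://www.google.com/finance?cid=694653'

page = urllib.request.urlopen(start)
soup = BeautifulSoup(page, 'html.parser')  # name the parser explicitly

# Price: works
P = soup.find('span', {'id': 'ref_694653_l'})
print(P.get_text())

# P/E ratio, indirect: works, but depends on the cell's position
pe = soup.find_all('td', {'class': 'val'})
print(pe[5].get_text())

# P/E ratio, direct: this is the attempt that raises the error
pe = soup.find('td', {'data-snapfield': 'pe_ratio'})
print(pe.td.next_sibling.get_text())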

I can get the price, and I managed to get the P/E ratio, but only indirectly, by indexing into the list of cells. When I try to use next_sibling or next_element on the P/E cell, I get an error saying there is no such attribute.

I'm having trouble figuring out how to scrape data from a table, since it's usually laid out in rows and the tags around the data, like <td> or <tr>, are very common.

So I just wanted to ask for some help scraping the P/E ratio.

Thanks guys

Answers

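pe is already the <td class="key"> cell, so pe.td looks for a <td> nested inside it and returns None; that is where the AttributeError comes from. The first next_sibling is the newline between the two cells, so call it twice on pe itself: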

>>> pe = soup.find('td',{'data-snapfield':'pe_ratio'})
>>> pe
<td class="key" data-snapfield="pe_ratio">P/E
</td>
>>> print(pe.td.next_sibling.get_text())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'next_sibling'
>>> pe
<td class="key" data-snapfield="pe_ratio">P/E
</td>
>>> pe.td
>>> pe.next_sibling
u'\n'
>>> pe.next_sibling.next_sibling
<td class="val">29.69
</td>
>>> pe.next_sibling.next_sibling.get_text()
u'29.69\n'
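
If you would rather not count whitespace nodes, find_next_sibling('td') skips over them and returns the next <td> directly. Here is a minimal sketch of that idea, run against a small inline HTML string modeled on the markup in the session above rather than the live page. The snap_value and snap_table helper names, the table wrapper, and the second "eps" row with its value are made up for illustration; the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet modeled on the cells shown in the session above.
# The "eps" row and its value are invented for illustration only.
HTML = """
<table>
<tr>
<td class="key" data-snapfield="pe_ratio">P/E
</td>
<td class="val">29.69
</td>
</tr>
<tr>
<td class="key" data-snapfield="eps">EPS
</td>
<td class="val">1.23
</td>
</tr>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")


def snap_value(soup, field):
    """Return the text of the <td> that follows the given key cell, or None."""
    key = soup.find("td", {"data-snapfield": field})
    if key is None:
        return None
    # find_next_sibling skips the newline text node between the two cells
    val = key.find_next_sibling("td")
    return val.get_text(strip=True) if val is not None else None


def snap_table(soup):
    """Map each data-snapfield name to the text of its neighboring value cell."""
    result = {}
    # attrs={"data-snapfield": True} matches any <td> that has the attribute
    for key in soup.find_all("td", attrs={"data-snapfield": True}):
        val = key.find_next_sibling("td")
        if val is not None:
            result[key["data-snapfield"]] = val.get_text(strip=True)
    return result


print(snap_value(soup, "pe_ratio"))
print(snap_table(soup))
```

Looking cells up by their data-snapfield attribute avoids relying on position (as pe[5] does), so it should keep working if rows are added or reordered, assuming the attribute names stay the same.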
