How to scrape football results from SofaScore with Python
Question
I'm working on this project in Python 3.8. I need to download the data into a pandas DataFrame and eventually write the 2018 and 2019 results for all Premier League teams to a database (SQL or Access). I'm trying to use BeautifulSoup. I have code that works for soccerbase.com, but it doesn't work for sofascore.com. @oppressionslayer has helped with the code so far. Can anyone help me?
import json
import pandas as pd
import requests
from bs4 import BeautifulSoup as bs
url = "https://www.sofascore.com/football///json"
r = requests.get(url)
soup = bs(r.content, 'lxml')
json_object = json.loads(r.content)
json_object['sportItem']['tournaments'][0]['events'][0]['homeTeam']['name']      # 'Sheffield United'
json_object['sportItem']['tournaments'][0]['events'][0]['awayTeam']['name']      # 'Manchester United'
json_object['sportItem']['tournaments'][0]['events'][0]['homeScore']['current']  # 3
json_object['sportItem']['tournaments'][0]['events'][0]['awayScore']['current']
print(json_object)
How do I loop this code to get all the teams? My goal is one row per match, in the form ["Event date", "Competition", "Home Team", "Home Score", "Away Team", "Away Score", "Score"], for example: 31/10/2019, Premier League, Chelsea, 1, Manchester United, 2, 1-2.
I'm a beginner; how can I get this?
Answer
This code works. It doesn't capture every field the site exposes, but it is a solid scraper:
import simplejson as json
import pandas as pd
import requests

url = "https://www.sofascore.com/football///json"
r = requests.get(url)
json_object = json.loads(r.content)

headers = ['Tournament', 'Home Team', 'Home Score', 'Away Team',
           'Away Score', 'Status', 'Start Date']
consolidated = []

for tournament in json_object['sportItem']['tournaments']:
    rows = []
    for event in tournament["events"]:
        row = []
        row.append(tournament["tournament"]["name"])
        row.append(event["homeTeam"]["name"])
        # Matches that have not started yet have no 'current' score
        if "current" in event["homeScore"]:
            row.append(event["homeScore"]["current"])
        else:
            row.append(-1)
        row.append(event["awayTeam"]["name"])
        if "current" in event["awayScore"]:
            row.append(event["awayScore"]["current"])
        else:
            row.append(-1)
        row.append(event["status"]["type"])
        row.append(event["formatedStartDate"])
        rows.append(row)
    df = pd.DataFrame(rows, columns=headers)
    consolidated.append(df)

pd.concat(consolidated).to_csv(r'Path.csv', sep=',',
                               encoding='utf-8-sig', index=False)
Courtesy of Praful Surve (@praful-surve)
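The question also asks for the results to end up in a SQL database rather than a CSV file. A minimal sketch of that last step, using pandas' `to_sql` with the standard-library `sqlite3` driver; the table name `epl_results` and the single sample row stand in for the consolidated DataFrame built above and are illustrative assumptions:

```python
import sqlite3
import pandas as pd

# Illustrative stand-in for the consolidated DataFrame built by the scraper;
# the sample row and the table name "epl_results" are assumptions.
df = pd.DataFrame(
    [["Premier League", "Chelsea", 1, "Manchester United", 2,
      "finished", "31/10/2019"]],
    columns=["Tournament", "Home Team", "Home Score", "Away Team",
             "Away Score", "Status", "Start Date"],
)

# An in-memory SQLite database; point this at a file path to persist it
conn = sqlite3.connect(":memory:")
df.to_sql("epl_results", conn, if_exists="replace", index=False)

# Read the table back to confirm the round trip
stored = pd.read_sql("SELECT * FROM epl_results", conn)
conn.close()
```

For Access instead of SQLite, the same `to_sql` call works through a SQLAlchemy engine backed by an ODBC driver.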