
I'm looking to sharpen my data science skills. I am practicing pulling data from a URL on a sports site, and the JSON file has multiple nested dictionaries. I would like to use this data to build my own custom version of the leaderboard in matplotlib, etc., but I am having a hard time getting the JSON into a workable DataFrame.

The main website is: https://www.usopen.com/scoring.html

Looking at what the page loads in the background, I believe the live info is being pulled from the link listed in the short code below. I'm working in Jupyter notebooks, and I can pull the data successfully.

But as you can see, it returns multiple nested dictionaries, which makes it very difficult to build a simple DataFrame.

I was just looking to get player, score to par, total, and round pulled. Any help would be greatly appreciated, thank you!

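import json
import urllib.request  # plain `import urllib` does not reliably load the `request` submodule

import pandas as pd

url = "https://gripapi-static-pd.usopen.com/gripapi/leaderboard.json"
# Fetch the raw JSON and parse it into nested Python dicts and lists
with urllib.request.urlopen(url) as response:
    data = json.loads(response.read())
print(data)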

Answers

Here is a simple and quick solution. A better one might exist using json_normalize from pandas, but this works fairly well for your use case.

def func(x):
    # Only rows with no missing fields are converted; other rows return None
    if not x.isnull().any():
        return (
            x['round'],
            x['player']['firstName'],
            x['player']['identifier'],
            x['toParToday']['value'],
            x['totalScore']['value'],
        )

df = pd.DataFrame(data['standings'])
df['round'] = data['currentRound']['name']
df = df[['player', 'toPar', 'toParToday', 'totalScore', 'round']]
# Build one tuple per row, then drop the None results from incomplete rows
info = df.apply(func, axis=1).dropna()
info_df = pd.DataFrame(info.tolist(), columns=['Round', 'player_name', 'pid', 'to_par_today', 'totalScore'])
info_df.head()
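
As for the json_normalize route mentioned above, here is a minimal sketch of how it might look. It assumes the feed has the same shape the code above relies on (a `standings` list whose entries hold `player`, `toParToday`, and `totalScore` dictionaries, plus `currentRound['name']`). The `sample` dictionary, its values, and the `flatten_standings` helper are made up for illustration and are not taken from the real feed.

```python
import pandas as pd

# Hypothetical sample that mirrors only the keys used in the answer above;
# the real leaderboard.json may contain more fields or differ in shape.
sample = {
    "currentRound": {"name": "Round 1"},
    "standings": [
        {
            "player": {"firstName": "Alex", "identifier": "p001"},
            "toParToday": {"value": -2},
            "totalScore": {"value": 68},
        },
        {
            "player": {"firstName": "Sam", "identifier": "p002"},
            "toParToday": {"value": 1},
            "totalScore": {"value": 71},
        },
    ],
}


def flatten_standings(data):
    """Flatten the nested standings into one row per player (illustrative helper)."""
    # json_normalize expands nested dicts into dotted column names,
    # e.g. {"player": {"firstName": ...}} becomes "player.firstName".
    flat = pd.json_normalize(data["standings"])
    flat["round"] = data["currentRound"]["name"]
    # Map the dotted names to the column names used in the answer above.
    wanted = {
        "round": "Round",
        "player.firstName": "player_name",
        "player.identifier": "pid",
        "toParToday.value": "to_par_today",
        "totalScore.value": "totalScore",
    }
    return flat[list(wanted)].rename(columns=wanted)


leaderboard = flatten_standings(sample)
print(leaderboard)
```

With the real feed you would pass `data` instead of `sample`. If some entries lack one of these keys, json_normalize fills the gap with NaN rather than raising, so a `dropna()` afterwards may still be needed.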