Home > Software design >  Pandas read nested JSON to data frame
Pandas read nested JSON to data frame

Time:01-14

Below is a snippet of my JSON file. There is 1 play, multiple playId's, and with numerous timestamps. I am trying to get a data frame of: 'timeStamp', 'playID', 'x', 'y', 'typ', 'id'. The majority of those fields would come from the mergedPositionalData key.

Currently I am stuck here:

mergedPositionalData_df = pd.json_normalize(data, record_path=['mergedPositionalData'])

positions_df = pd.json_normalize(selected_dfs.positions[0])

positions_df

With the output looking like:

    oo  x   qi  y   typ id
0   0   -11.418268  0   161.538639  2   16
1   0   85.623169   0   84.475133   2   15
2   0   -98.474940  0   97.518917   2   17
3   0   -2.319136   0   -6.008502   2   14
4   0   -14.389485  0   128.571621  1   6
5   0   12.693297   0   45.052760   1   1

{
   "Play":{
      "gamePK":634339,
      "playId":"a5c4cf6a-36c5-4ca9-99eb-0130a61ce917",
      "MeasurementTimeCodeOffset":"0",
      "gameDate":"2021-04-27",
      "MeasurementTime":"2021-04-27T22:53:03.0500000Z",
      "MeasurementTimeCode":"302840648"
   },
   "mergedPositionalData":[
      {
         "timeStamp":"2021-04-27T22:53:10.537",
         "positions":[
            {
               "oo":0,
               "x":-11.379039063602816,
               "qi":0,
               "y":161.56143950775393,
               "typ":2,
               "id":16
            },
            {
               "oo":0,
               "x":87.40255331551018,
               "qi":0,
               "y":92.80044305786649,
               "typ":2,
               "id":15
            },
            {
               "oo":0,
               "x":-100.73931207626782,
               "qi":0,
               "y":101.58825375625815,
               "typ":2,
               "id":17
            },
            {
               "oo":0,
               "x":-1.467968670442033,
               "qi":0,
               "y":-2.7976281571602843,
               "typ":2,
               "id":14
            },
            {
               "oo":0,
               "x":10.196311102524582,
               "qi":0,
               "y":104.21368921515476,
               "typ":1,
               "id":6
            },
            {
               "oo":0,
               "x":24.704141676368593,
               "qi":0,
               "y":35.0912115150846,
               "typ":1,
               "id":1
            },
            {
               "oo":0,
               "x":46.33992652361183,
               "qi":0,
               "y":42.76105101928655,
               "typ":3,
               "id":18
            },
            {
               "oo":0,
               "x":42.69370698593863,
               "qi":0,
               "y":91.42190261594243,
               "typ":1,
               "id":4
            },
            {
               "oo":0,
               "x":-90.22524146032416,
               "qi":0,
               "y":36.699031575991725,
               "typ":3,
               "id":19
            },
            {
               "oo":0,
               "x":69.93003976639605,
               "qi":0,
               "y":46.40282249559731,
               "typ":1,
               "id":3
            },
            {
               "oo":0,
               "x":22.41643570271914,
               "qi":0,
               "y":0.30107752906411767,
               "typ":1,
               "id":2
            },
            {
               "oo":0,
               "x":-5.153584805924734,
               "qi":0,
               "y":76.48601999888726,
               "typ":1,
               "id":5
            },
            {
               "oo":0,
               "x":-79.23511414132038,
               "qi":0,
               "y":201.69368087106704,
               "typ":1,
               "id":7
            },
            {
               "oo":0,
               "x":118.9001757311804,
               "qi":0,
               "y":162.72692084189558,
               "typ":1,
               "id":9
            },
            {
               "oo":0,
               "x":16.58415668961945,
               "qi":0,
               "y":254.91739260908565,
               "typ":1,
               "id":8
            },
            {
               "oo":0,
               "x":-25.939845342000044,
               "qi":0,
               "y":-2.127047070372252,
               "typ":1,
               "id":10
            }
         ]
      }]}

Any help is appreciated. Thanks

CodePudding user response:

You can select the value of mergedPositionalData first, then make use of meta argument of pandas.json_normalize().

import json
import pandas as pd

with open('data.json') as f:
    data = json.load(f)

df = pd.json_normalize(data['mergedPositionalData'], record_path='positions', meta='timeStamp')
print(df)

   oo          x  qi           y  typ  id                timeStamp
0   0 -11.379039   0  161.561440    2  16  2021-04-27T22:53:10.537
1   0  87.402553   0   92.800443    2  15  2021-04-27T22:53:10.537
  •  Tags:  
  • Related