I'm fairly new to pandas (and Python) and I'm having trouble filtering a DataFrame that has a column of nested dicts.
For context, I'm using the requests library to fetch JSON from a web API and then building the DataFrame from it:
import pandas as pd
import requests
req = requests.get('url')
df = pd.DataFrame.from_dict(req.json()['results'])
This leads to:
   id                            data  score
0   1   {'rank': 1, 'active': 'true'}    178
1   2  {'rank': 3, 'active': 'false'}    125
Now, I need to filter this using the score (score > 150) and the active value from the data column (active == 'true'). I can easily do the first part with:
df.query('score > 150')
But I'm not able to filter by 'active' as well. I have tried to pull the active values out of the data column separately, without success. I've read some of the official pandas documentation and a lot of similar questions, and tried their solutions, but no luck so far :(
Thanks in advance.
Answer:
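Expand the dicts in the data column into their own columns, then filter on both conditions. Because active holds the strings 'true' and 'false' rather than booleans, compare it explicitly:
df = df.join(pd.DataFrame(df['data'].tolist(), index=df.index))
df[(df['score'] > 150) & (df['active'] == 'true')]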
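For anyone who wants to try this without the web request, here is a minimal self-contained sketch. The `results` list is a stand-in I made up to mirror the two rows shown in the question (it is not the real API response), and the names `expanded`, `flat` and `filtered` are just illustrative.

```python
import pandas as pd

# Hypothetical stand-in for req.json()['results'], mirroring the question's rows.
results = [
    {"id": 1, "data": {"rank": 1, "active": "true"}, "score": 178},
    {"id": 2, "data": {"rank": 3, "active": "false"}, "score": 125},
]
df = pd.DataFrame(results)

# Turn each dict in 'data' into its own columns, keeping df's index for alignment.
expanded = pd.DataFrame(df["data"].tolist(), index=df.index)

# Replace the nested column with the expanded ones.
flat = df.drop(columns="data").join(expanded)

# 'active' contains the strings 'true'/'false', so compare against the string.
filtered = flat[(flat["score"] > 150) & (flat["active"] == "true")]
print(filtered)
```

If the real API returns actual booleans instead of strings for active, the comparison would need to change accordingly (for example, using the column directly as the mask).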
