I have a DataFrame that looks something like this:
   request  data
0  a        {'uid': 123}
1  a        {'type': 'POST', 'code': 200}
2  a        {}
3  b        {'uid': 456}
4  b        {'type': 'GET', 'code': 200}
5  b        {'args': 'some data'}
Code to replicate:
import pandas as pd

data = [
    ['a', 'a', 'a', 'b', 'b', 'b'],
    [{'uid': 123}, {'type': 'POST', 'code': 200}, {},
     {'uid': 456}, {'type': 'GET', 'code': 200}, {'args': 'some data'}]
]
cols = ['request', 'data']
# Pair each request label with its dict to build the rows
df = pd.DataFrame(data=list(zip(data[0], data[1])), columns=cols)
I want to expand the dicts in the data column into columns of their own, then collapse the table to the minimum number of rows (one per request), so the resulting DataFrame would be:
   request  uid  type  code  args
0  a        123  POST  200   None
1  b        456  GET   200   some data
CodePudding user response:
You can expand the dictionaries into columns with apply(pd.Series), then aggregate with groupby and first, which keeps the first non-null value of each column within each group:
df['data'].apply(pd.Series).groupby(df['request']).first()
Output:
           uid  type   code       args
request
a        123.0  POST  200.0       None
b        456.0   GET  200.0  some data
Alternatively, merge the dictionaries of each group with collections.ChainMap:
from collections import ChainMap

(df.groupby('request')['data']
   .apply(lambda s: pd.Series(ChainMap(*s.values)))
   .unstack(level=1)
)
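To make the ChainMap behaviour concrete: it looks a key up in its mappings from left to right, so when two dicts in a group share a key, the earliest row wins, which matches what first() does. The sketch below is a variation on the approach above, not the original answer's code. It merges each group into a plain dict and builds the frame with DataFrame.from_dict; the names merged_demo, merged and result are illustrative.

```python
from collections import ChainMap

import pandas as pd

df = pd.DataFrame({
    'request': ['a', 'a', 'a', 'b', 'b', 'b'],
    'data': [{'uid': 123}, {'type': 'POST', 'code': 200}, {},
             {'uid': 456}, {'type': 'GET', 'code': 200}, {'args': 'some data'}],
})

# On a shared key, the leftmost mapping takes precedence
merged_demo = dict(ChainMap({'code': 200}, {'code': 404, 'uid': 1}))

# Merge all dicts of each request into a single dict
merged = {req: dict(ChainMap(*grp)) for req, grp in df.groupby('request')['data']}

# One row per request; keys missing from a group become NaN
result = (
    pd.DataFrame.from_dict(merged, orient='index')
      .rename_axis('request')
      .reset_index()
)
print(result)
```

The column order follows ChainMap's iteration order, so it may differ from the order in the question; select the columns explicitly if the order matters.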
