I have a df with a wallet column (which I want to use as the index and a positions_rewards column, which is the main target of this question.
Each row in positions_rewards is a list of dictionaries (each row has a different amount of dictionaries within their list).
I want to create a multiindex df with wallet as the index, with columns being the keysshared amongst all dictionaries (position and position_rewards).
Basically, each wallet has many positions and corresponding position rewards for each position.
wallet amount positions_rewards
0 0x00000000000000000000 0 [{'position': 'LUSD', 'position_rewards': '0'}, {'position': 'ALCX', 'position_rewards': '1000'}]
1 0x000566b53e028d21e104 719082156191632 [{'position': 'LUSD', 'position_rewards': '0'}, {'position': 'ALCX', 'position_rewards': '420'}]
2 0xb72ca7ee0aa8ad85a4ff 4656542682597557130970 [{'position': 'SUSHI_LP', 'position_rewards': 1200}]
3 0x81f01fed84a5bb03813a 777565117641038730248 [{'position': 'SUSHI_LP', 'position_rewards': 300}, {'position': 'UNI_LP', 'position_rewards': 30000}]
4 0x3726a511f7ff6a417e05 1566258882202552609066 [{'position': 'ALCX', 'position_rewards': 2000}]
My code is as follows (just a simple DataFrame construction out of a list of nested dictionaries (hence the position_rewardscolumn being the only one still with "unexploded" values):
df = pd.DataFrame(rewards_list)
Notes:
- Ignore
amountcolumn. - Is a better idea to create a
DataFramewith repeatingwalletrows and then set that as index (or not) to be able to performgroupby()operations? My preference is multiindex for now, since I will append other data on that wallet index. So I don't want to have repeatingwalletrows since it will be harder to mergedfson that instead of on adfwith unique indexes.
CodePudding user response:
Try:
df = df.drop(columns="amount").explode("positions_rewards")
df = pd.concat([df, df.pop("positions_rewards").apply(pd.Series)], axis=1)
print(df)
Prints:
wallet position position_rewards
0 0x00000000000000000000 LUSD 0
0 0x00000000000000000000 ALCX 1000
1 0x000566b53e028d21e104 LUSD 0
1 0x000566b53e028d21e104 ALCX 420
2 0xb72ca7ee0aa8ad85a4ff SUSHI_LP 1200
3 0x81f01fed84a5bb03813a SUSHI_LP 300
3 0x81f01fed84a5bb03813a UNI_LP 30000
4 0x3726a511f7ff6a417e05 ALCX 2000
If I understand you correctly, you can use df.pivot afterwards to obtain your dataframe:
print(
df.pivot(
index="wallet",
columns="position",
values="position_rewards",
).fillna(0)
)
Prints:
position ALCX LUSD SUSHI_LP UNI_LP
wallet
0x00000000000000000000 1000 0 0 0
0x000566b53e028d21e104 420 0 0 0
0x3726a511f7ff6a417e05 2000 0 0 0
0x81f01fed84a5bb03813a 0 0 300 30000
0xb72ca7ee0aa8ad85a4ff 0 0 1200 0
CodePudding user response:
Try this:
df = df.explode(['positions_rewards'])
df['positions_rewards'].apply(pd.Series)
