I have the following MultiIndex DataFrame:
offset  A  B  C
0       0  0  100
           1  200
           2  300
           3  400
        1  0  10
           1  20
           2  30
           3  40
...
To group by A and sum the values of C, I run:
df.droplevel('B').sum(level = [0, 1], axis = 0)
The expected output is:
offset  A  C
0       0  1000
0       1  100
...
However, the actual output is (column C is discarded):
offset  A
0       0
        1
1       0
        1
...
What am I doing wrong, and why is column C discarded?
CodePudding user response:
C is also a level of your MultiIndex, so the DataFrame has no data columns and nothing is actually summed.
If you change your code to df.droplevel('B').sum(level=[0, 1, 2], axis=0), you will see C in the output: it is not discarded, it is part of the index.
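A quick way to confirm this is to inspect the index levels and the columns. The sketch below rebuilds a frame shaped like the one in the question; the construction is an assumption, since the original code that created df is not shown.

```python
import pandas as pd

# Hypothetical reconstruction of the frame from the question:
# every field, including C, is an index level, so there are no columns.
index = pd.MultiIndex.from_tuples(
    [(0, 0, b, c) for b, c in enumerate([100, 200, 300, 400])]
    + [(0, 1, b, c) for b, c in enumerate([10, 20, 30, 40])],
    names=["offset", "A", "B", "C"],
)
df = pd.DataFrame(index=index)

print(list(df.index.names))  # C shows up here, as an index level
print(list(df.columns))      # no data columns, so there is nothing to sum
```

If C appears in df.index.names and df.columns is empty, any aggregation only has index labels to work with.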
Running your code also gives me a warning:
FutureWarning: Using the level keyword in DataFrame and Series aggregations is deprecated and will be removed in a future version. Use groupby instead. df.sum(level=1) should use df.groupby(level=1).sum()
Instead, move C out of the index and aggregate with groupby, for example:
res = df.reset_index(level=-1).groupby(level=[0,1]).sum()
print(res)
             C
offset A
0      0  1000
       1   100
C is now a regular column rather than an index level.
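For reference, here is a self-contained sketch of the same approach that refers to the levels by name instead of by position. The way df is built is an assumption based on the sample data in the question; only reset_index, groupby, and sum are used on it.

```python
import pandas as pd

# Hypothetical reconstruction of the question's frame (all four fields in the index).
index = pd.MultiIndex.from_tuples(
    [(0, 0, b, c) for b, c in enumerate([100, 200, 300, 400])]
    + [(0, 1, b, c) for b, c in enumerate([10, 20, 30, 40])],
    names=["offset", "A", "B", "C"],
)
df = pd.DataFrame(index=index)

# Turn the C level into a real column, then sum it per (offset, A) group.
res = df.reset_index(level="C").groupby(level=["offset", "A"]).sum()
print(res)
```

Using level names rather than positions keeps the call readable and avoids surprises if the order of the index levels changes.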
