Home > Software design >  How to Reindex MultiIndex Dataframe on Multiple Levels?
How to Reindex MultiIndex Dataframe on Multiple Levels?

Time:01-08

So I currently have the following dataframe which I aggregated and I have a multiindex that looks like the following:

Date     Country_Band      Value      Decimal
May 2021 Non-US            2-14       0.11
         US                2-14       0.22
                           1          0.33
                           15         0.44
         Non-US            1          0.55
                           15         0.66

I want to organize and group these in a way to obtain the below:

Date     Country_Band      Value      Decimal
May 2021 US                1          0.33
                           2-14       0.22
                           15         0.44

         Non-US            1          0.55
                           2-14       0.11
                           15         0.66

This is the index of a larger dataframe. I first tried to do the following code:

df_march_agg = df_march_agg.reindex(['US', 'Non-US'], level='Country_Band')

Which worked in getting the country band group, however, the value is still not in numerical order:

Date     Country_Band      Value      Decimal
May 2021 US                2-14       0.22
                           1          0.33 
                           15         0.44

         Non-US            2-14       0.11
                           1          0.55
                           15         0.66

I tried then doing the same:

df_march_agg = df_march_agg.reindex(['1', '2-14', '15 '], level='Value')

But this then undid the previous reindex. Any idea of what I am missing or need to add in order to get both in order?

Cheers!

CodePudding user response:

One idea with ordered categoricals in MultiIndex.set_levels, so possible use DataFrame.sort_index:

df.index = (df.index.set_levels(pd.CategoricalIndex(df.index.levels[1], 
                                                   ordered=True,
                                                   categories=['US', 'Non-US']), 
                                                   level=1)
                    .set_levels(pd.CategoricalIndex(df.index.levels[2], 
                                                   ordered=True, 
                                                   categories=['1', '2-14', '15 ']), 
                                                   level=2))

df = df.sort_index()
print (df)
                             Decimal
Date     Country_Band Value         
May 2021 US           1         0.33
                      2-14      0.22
                      15        0.44
         Non-US       1         0.55
                      2-14      0.11
                      15        0.66

Another idea with DataFrame.reindex with MultiIndex.from_product:

mux = pd.MultiIndex.from_product([['May 2021'],
                                  ['US', 'Non-US'],
                                  ['1', '2-14', '15 ']], 
                                  names=['Date','Country_Band','Value'])

df = df.reindex(mux)
print (df)
                             Decimal
Date     Country_Band Value         
May 2021 US           1         0.33
                      2-14      0.22
                      15        0.44
         Non-US       1         0.55
                      2-14      0.11
                      15        0.66
  •  Tags:  
  • Related