Transpose multi-indexed grouped table into a formatted descriptive table (python/pandas)

Time:01-11

I have a df that looks a bit like this:

import pandas as pd

df = pd.DataFrame({'id': [1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3],
                   'eth': ['H','H','H','H','H','H','H','H', 'W', 'W', 'W', 'W', 'W', 'W', 'W', 'W', 'B', 'B', 'B', 'B'],
                   'sex': ['F','F','F','F','F','F','F','F','M','M','M','M','M','M','M','M','F','F','F','F'],
                   'outcome': ["eating", "drinking", "eating", "drinking","eating", "drinking","eating", "drinking","eating", "drinking","eating", "drinking","eating", "drinking","eating", "drinking","eating", "drinking","eating", "drinking"],
                   'mean': [3.4,4.5,5.6,7.8,9.8,3.4,6.5,7.8,9.5,1.2,3.4,5.6,7.6,6.5,4.3,12.3,3.2,4.3,5.6,7.8]})

df.head()

    id  eth sex outcome   mean
0   1   H   F   eating    3.4
1   1   H   F   drinking  4.5
2   1   H   F   eating    5.6
3   1   H   F   drinking  7.8
4   1   H   F   eating    9.8

I then get the mean and std of the mean column by outcome and ethnicity:

ethnicity = df.groupby(['outcome','eth']).agg({'mean':['mean','std']})

ethnicity

               mean          
               mean       std
outcome  eth                 
drinking B    6.050  2.474874
         H    5.875  2.267708
         W    6.400  4.564355
eating   B    4.400  1.697056
         H    6.325  2.657536
         W    6.200  2.846050

I would like to turn this multi-index groupby table into a table like this, with one row per ethnicity, one column per outcome, and each cell formatted as "mean (std)":

desired = pd.DataFrame({'eth': ['B', 'H', 'W'],
                        'drinking': ["6.05 (2.47)", "5.88 (2.27)", "6.40 (4.56)"],
                        'eating': ["4.40 (1.70)", "6.33 (2.66)", "6.20 (2.85)"]})

desired

  eth     drinking       eating
0   B  6.05 (2.47)  4.40 (1.70)
1   H  5.88 (2.27)  6.33 (2.66)
2   W  6.40 (4.56)  6.20 (2.85)

How can I get from the grouped table with the multi-index to the desired table?

Thanks in advance.

CodePudding user response:

You can build the formatted strings right after the groupby, then unstack the outcome level:

out = df.groupby(['outcome','eth'])['mean'].agg(['mean','std'])
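# Convert both statistics to strings, join each row as "mean(std)", then pivot 'outcome' into columns
out = out.astype(str).agg(lambda x: x['mean'] + '(' + x['std'] + ')', axis=1).unstack(level=0).reset_index()
out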
Out[597]: 
outcome eth                  drinking                     eating
0         B  6.05(2.4748737341529163)    4.4(1.6970562748477134)
1         H  5.875(2.267708094089713)  6.325(2.6575364531836625)
2         W    6.4(4.564354645876385)    6.2(2.8460498941515415)
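The output above keeps the full float precision because the numbers are converted with astype(str) before any rounding. A minimal sketch of one way to get two fixed decimals instead, assuming the same data as in the question (the names stats, fmt, cells and table are only illustrative):

```python
import pandas as pd

# Same values as the question's df, restricted to the columns used here.
df = pd.DataFrame({
    'eth': ['H'] * 8 + ['W'] * 8 + ['B'] * 4,
    'outcome': ['eating', 'drinking'] * 10,
    'mean': [3.4, 4.5, 5.6, 7.8, 9.8, 3.4, 6.5, 7.8, 9.5, 1.2,
             3.4, 5.6, 7.6, 6.5, 4.3, 12.3, 3.2, 4.3, 5.6, 7.8],
})

stats = df.groupby(['outcome', 'eth'])['mean'].agg(['mean', 'std'])

# Format each statistic to exactly two decimals, then join as "mean (std)".
fmt = '{:.2f}'.format
cells = stats['mean'].map(fmt) + ' (' + stats['std'].map(fmt) + ')'

# Move 'outcome' into the columns and make 'eth' a regular column.
table = cells.unstack('outcome').reset_index()
print(table)
```

Formatting with '{:.2f}' keeps trailing zeros (4.40 rather than 4.4). Means that sit exactly on a rounding boundary, such as 6.325, can still come out as either 6.32 or 6.33 depending on the floating-point representation.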

CodePudding user response:

One possible solution:

(ethnicity['mean']
 .round(2).astype(str)
 .apply(' ('.join, axis=1).add(')')
 .unstack('outcome')
 )

Output:

outcome     drinking       eating
eth                              
B        6.05 (2.47)    4.4 (1.7)
H        5.88 (2.27)  6.32 (2.66)
W         6.4 (4.56)   6.2 (2.85)
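This result still differs from the desired frame in shape: eth is the index rather than a column, and the columns carry an outcome label. A possible variation, sketched under the assumption that ethnicity is built exactly as in the question, formats each row with an f-string and then flattens the result (cells and table are illustrative names):

```python
import pandas as pd

# Same values as the question's df, restricted to the columns used here.
df = pd.DataFrame({
    'eth': ['H'] * 8 + ['W'] * 8 + ['B'] * 4,
    'outcome': ['eating', 'drinking'] * 10,
    'mean': [3.4, 4.5, 5.6, 7.8, 9.8, 3.4, 6.5, 7.8, 9.5, 1.2,
             3.4, 5.6, 7.6, 6.5, 4.3, 12.3, 3.2, 4.3, 5.6, 7.8],
})

ethnicity = df.groupby(['outcome', 'eth']).agg({'mean': ['mean', 'std']})

# ethnicity['mean'] selects the outer column level, leaving 'mean' and 'std'.
stats = ethnicity['mean']

# Build each cell row by row; the f-string keeps two decimals in both numbers.
cells = stats.apply(lambda row: f"{row['mean']:.2f} ({row['std']:.2f})", axis=1)

table = (cells.unstack('outcome')
              .rename_axis(columns=None)  # drop the 'outcome' label above the columns
              .reset_index())             # turn the 'eth' index into an ordinary column
print(table)
```

The row-wise apply is slower than vectorized string operations, which should not matter for a summary table of this size.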