Home > OS >  (Python)Transforming dataframe
(Python)Transforming dataframe

Time:02-05

My goal is to transform dataframe. Source and Target form is like this. And the date column of taget is index. How to transform source table to target form? (I tried pd.DataFrame([sum(list(df.values()), []), but it doesnot work)

#Source form 
#date is 2021-11-24

import pandas as pd
df = pd.DataFrame({'A': [10, 20, 30],'B': [100, 200, 300]})

  A    B
 10  100
 20  200
 30  300


#Target form  (date is index)

# date        A0 A1 A2  B0  B1  B2
# 2021-11-24   10 20 30 100 200 300 

CodePudding user response:

I would first pre-process the df using pd.melt to bring the column names down to the table under a new column called variable.

df_melt = pd.melt(df, var_name='variable', value_name='value')

Then append numerical suffix to the column names

df_melt['variable']  = df_melt.groupby('variable').cumcount().astype(str)

Now, it's a good time to put the date in,

df_melt['date'] = pd.to_datetime('2021-11-24')

so that I can use it as the index for my final table. variable is also set as the index so that I can use unstack to bring it up to the column names.

df_melt.set_index(['date', 'variable']).unstack().droplevel(0, axis=1)

The last droplevel takes care of the unwanted level of column name.

CodePudding user response:

You can do it like this:

df_out = df.unstack().to_frame().T
df_out.columns = [f'{i}{j}' for i, j in df_out.columns]
df_out['date'] = pd.to_datetime('2021-11-14')
print(df_out)

Output:

   A0  A1  A2   B0   B1   B2       date
0  10  20  30  100  200  300 2021-11-14
  •  Tags:  
  • Related