My goal is to transform dataframe. Source and Target form is like this. And the date column of taget is index. How to transform source table to target form? (I tried pd.DataFrame([sum(list(df.values()), []), but it doesnot work)
#Source form
#date is 2021-11-24
import pandas as pd
df = pd.DataFrame({'A': [10, 20, 30],'B': [100, 200, 300]})
A B
10 100
20 200
30 300
#Target form (date is index)
# date A0 A1 A2 B0 B1 B2
# 2021-11-24 10 20 30 100 200 300
CodePudding user response:
I would first pre-process the df using pd.melt to bring the column names down to the table under a new column called variable.
df_melt = pd.melt(df, var_name='variable', value_name='value')
Then append numerical suffix to the column names
df_melt['variable'] = df_melt.groupby('variable').cumcount().astype(str)
Now, it's a good time to put the date in,
df_melt['date'] = pd.to_datetime('2021-11-24')
so that I can use it as the index for my final table. variable is also set as the index so that I can use unstack to bring it up to the column names.
df_melt.set_index(['date', 'variable']).unstack().droplevel(0, axis=1)
The last droplevel takes care of the unwanted level of column name.
CodePudding user response:
You can do it like this:
df_out = df.unstack().to_frame().T
df_out.columns = [f'{i}{j}' for i, j in df_out.columns]
df_out['date'] = pd.to_datetime('2021-11-14')
print(df_out)
Output:
A0 A1 A2 B0 B1 B2 date
0 10 20 30 100 200 300 2021-11-14
