Home > Software design >  Create a new variable in flat tables based on existing column
Create a new variable in flat tables based on existing column

Time:02-02

I have got the following dataframe:

lst=[['01012021','A',5],['01012021','B',-10],['01012021','C',20],['01022021','A',20],['01022021','B',-30],['01022021','C',40]]
df2=pd.DataFrame(lst,columns=['Date','Art','AuM'])

I would like to create in column Art a new variable (D) by adding A, B and C. It should be grouped by the columns date and Art. The result looks like this:

lst=[['01012021','A',5],['01012021','B',-10],['01012021','C',20],['01012021','D',15],['01022021','A',20],['01022021','B',-30],['01022021','C',40],['01022021','D',30]]
df2=pd.DataFrame(lst,columns=['Date','Art','AuM'])

Thank you for your help.

CodePudding user response:

You can groupby Date, assign 'D' as a new column Art to the result and append to the orginal dataframe:

df2.append(df2.groupby('Date', as_index=False).AuM.sum().assign(Art='D'))

If needed, you can sort by appending .sort_values(['Date', 'Art']):

       Date Art  AuM
0  01012021   A    5
1  01012021   B  -10
2  01012021   C   20
0  01012021   D   15
3  01022021   A   20
4  01022021   B  -30
5  01022021   C   40
1  01022021   D   30
  •  Tags:  
  • Related