Considering the two dataframes
>>> df1
Dr Cr Opening Balance
0 B2 B2 0.0
1 B1 B1 100.0
2 D1 D1 0.0
3 F1 F1 -100.0
>>> df2
Date Amount Dr Cr
0 2021-12-01 452.25 B1 D1
1 2022-01-01 100.00 B1 D1
2 2022-01-02 100.00 B1 D1
I am trying to add a column to df1 which gives the total of Opening Balance sum of Dr from df2 - sum of Cr from df2
In this case, df1 would become:
>>> df1
Dr Cr Opening Balance Total
0 B2 B2 0.0 0.0
1 B1 B1 100.0 752.25
2 D1 D1 0.0 -652.25
3 F1 F1 -100.0 -100.0
Thanks
CodePudding user response:
One of more straight-forward and relatively debug friendly approach is to group df2 based on Dr and Cr, join the results to df1 and add/subtract the values:
dr = df2.groupby('Dr')['Amount'].sum().rename('Dr Amount')
cr = df2.groupby('Cr')['Amount'].sum().rename('Cr Amount')
df3 = df1.join(dr, 'Dr').join(cr, 'Cr').fillna(0)
df3['Total'] = df3['Opening Balance'] df3['Dr Amount'] - df3['Cr Amount']
# Dr Cr Opening Balance Dr Amount Cr Amount Total
# 0 B2 B2 0.0 0.00 0.00 0.00
# 1 B1 B1 100.0 652.25 0.00 752.25
# 2 D1 D1 0.0 0.00 652.25 -652.25
# 3 F1 F1 -100.0 0.00 0.00 -100.00
# Optional clean-up:
df3.drop(columns=['Dr Amount', 'Cr Amount'], inplace=True)
Note that the fillna(0) on third row is needed only for rows where there is no corresponding Dr or Cr in df2
