Home > Software engineering >  Multiply two Dataframes if the column names and the condition on a column value matches
Multiply two Dataframes if the column names and the condition on a column value matches

Time:01-27

I have two dataframes(namely a and b):

a =

enter image description here

and b =

enter image description here

And I am looking for a way to multiply the values in each column of Dataframe a with the corresponding column value of Dataframe b if the dates are matching. For example:values in column aa_10 of Dataframe a for the date 2021-01-19 with the corresponding value in column aa_10 of Dataframe b for the date 2021_01_19 (i.e., the column- NaN, 2.0, 3.0,4.0 of Dataframe a to be multiplied with the value 4 in Dataframe b) and so on.

Desired result:

enter image description here

Sample Data:

import numpy as np
import pandas as pd
  
d = {'Date'  : pd.Series([20210119, 20210119, 20210119, 20210119, 20210122, 20210122, 20210122, 20210122]),
     'To'  : pd.Series(['aa', 'bb', 'cc', 'dd', 'aa', 'bb', 'cc', 'dd']),
     'aa_10' : pd.Series([np.nan, 2, 3, 4, np.nan, 4, 3, 2]),
     'bb_11' : pd.Series([6, np.nan, 8, 9, 9, np.nan, 7, 8]),
     'cc_12' : pd.Series([1, 2, np.nan, 4, 4, 3, np.nan, 5]),
     'dd_13' : pd.Series([6, 7, 8, np.nan, 8, 6, 9, np.nan])}
  
# creates Dataframe.
a = pd.DataFrame(d)
a['Date'] = pd.to_datetime(a['Date'], format='%Y%m%d')
# print the data.
display (a)

# Initialize data to Dicts of series.
d = {'Date'  : pd.Series([20190110, 20210119, 20210121, 20210122]),
     'aa_10' : pd.Series([2, 4, 1, 2]),
     'bb_11' : pd.Series([1, 3, 5, 4]),
     'cc_12' : pd.Series([10, 12, 4, 2]),
     'dd_13' : pd.Series([2, 1, 2, 5])}
  
# creates Dataframe.
b = pd.DataFrame(d)
b['Date'] = pd.to_datetime(b['Date'], format='%Y%m%d')  
# print the data.
display(b)

CodePudding user response:

Set_index as date and multiply

s=a.set_index(['Date', 'To']).mul(b.set_index(['Date'])).reset_index()




   Date  To  aa_10  bb_11  cc_12  dd_13
0 2021-01-19  aa    NaN   18.0   12.0    6.0
1 2021-01-19  bb    8.0    NaN   24.0    7.0
2 2021-01-19  cc   12.0   24.0    NaN    8.0
3 2021-01-19  dd   16.0   27.0   48.0    NaN
4 2021-01-22  aa    NaN   36.0    8.0   40.0
5 2021-01-22  bb    8.0    NaN    6.0   30.0
6 2021-01-22  cc    6.0   28.0    NaN   45.0
7 2021-01-22  dd    4.0   32.0   10.0    NaN

CodePudding user response:

You can try with reindex , then mul the value and update your original a

b_n = b.set_index('Date').reindex(a.Date)
b_n.index = a.index
a.update(a.mul(b_n))
a
        Date  To  aa_10  bb_11  cc_12  dd_13
0 2021-01-19  aa    NaN   18.0   12.0    6.0
1 2021-01-19  bb    8.0    NaN   24.0    7.0
2 2021-01-19  cc   12.0   24.0    NaN    8.0
3 2021-01-19  dd   16.0   27.0   48.0    NaN
4 2021-01-22  aa    NaN   36.0    8.0   40.0
5 2021-01-22  bb    8.0    NaN    6.0   30.0
6 2021-01-22  cc    6.0   28.0    NaN   45.0
7 2021-01-22  dd    4.0   32.0   10.0    NaN
  •  Tags:  
  • Related