I have two df1 and df2
raw_data1 = {
'dogs': [42, 39, 0, 15, 23, 0, 68, 81, 86],
'cats': [52, 41, 79, 0, 34, 47, 19, 22, 0],
'sheep': [62, 37, 0, 51, 67, 32, 0, 89, 73],
}
raw_data2 = {
'dogs': [42, 39, 0, 15, 23, 0, 68, 81, 86],
'cats': [52, 41, 79, 0, 34, 47, 19, 22, 59],
'sheep': [62, 37, 84, 51, 67, 0, 23, 89, 73],
'lizards': [72, 43, 0, 26, 53, 88, 88, 34, 69],
'birds': [82, 35, 77, 0, 18, 12, 45, 56, 0],
}
df1 = pd.DataFrame(raw_data1,
index=pd.Index(['row_1', 'row_2', 'row_3', 'row_4', 'row_5', 'row_6', 'row_7', 'row_8', 'row_9'], name='Rows'),
columns=pd.Index(['dogs', 'cats', 'sheep'], name='animals'))
df2 = pd.DataFrame(raw_data2,
index=pd.Index(['row_1', 'row_2', 'row_3', 'row_4', 'row_5', 'row_6', 'row_7', 'row_8', 'row_9'], name='Rows'),
columns=pd.Index(['dogs', 'cats', 'sheep', 'lizards', 'birds'], name='animals'))
When I do the following code
df3 = (df2 * df1.loc['row_8'] * 10).where(df1 == 0)
I'm getting all 5 columns in df3 but I want only the column names which are in df1. How do we do multiply with respect to df1?
CodePudding user response:
You could use reindex_like to ensure that the indexes of df2 will be identical to that of df1:
(df2.reindex_like(df1) * df1.loc['row_8'] * 10).where(df1 == 0)
Or, slice the output with the columns of df1:
(df2 * df1.loc['row_8'] * 10).where(df1 == 0)[df1.columns]
output:
animals dogs cats sheep
Rows
row_1 NaN NaN NaN
row_2 NaN NaN NaN
row_3 0.0 NaN 74760.0
row_4 NaN 0.0 NaN
row_5 NaN NaN NaN
row_6 0.0 NaN NaN
row_7 NaN NaN 20470.0
row_8 NaN NaN NaN
row_9 NaN 12980.0 NaN
You can also prefilter:
df2.loc[:, df1.columns].mul(df1.loc['row_8'] * 10).where(df1 == 0)
CodePudding user response:
You can also use Index.intersection method:
df3 = (df2[df2.columns.intersection(df1.columns)] * df1.loc['row_8'] * 10).where(df1 == 0)
Output:
animals dogs cats sheep
Rows
row_1 NaN NaN NaN
row_2 NaN NaN NaN
row_3 0.0 NaN 74760.0
row_4 NaN 0.0 NaN
row_5 NaN NaN NaN
row_6 0.0 NaN NaN
row_7 NaN NaN 20470.0
row_8 NaN NaN NaN
row_9 NaN 12980.0 NaN
