I am trying to convert excel if else condition in python dataframe columns, can anyone help me out in this:
Input: df
Name1 Name2 Name3 Name4 Value1 Value2 Value3 Value4 MaxValue
0 John1 John2 John3 John4 10 3 5 7 10
1 Sony1 Sony2 Sony3 Sony4 2 12 4 8 12
2 Mark1 Mark2 Mark3 Mark4 5 13 0 3 13
3 Biky1 Biky2 Biky3 Biky4 7 7 5 44 44
4 Rose1 Rose2 Rose3 Rose4 7 0 9 7 9
Name values may not be ended with 1/2/3 etc this may have different name also.
Output: How to calculate the Final_Name column
Name1 Name2 Name3 Name4 Value1 Value2 Value3 Value4 MaxValue Final_Name
0 John1 John2 John3 John4 10 3 5 7 10 John1
1 Sony1 Sony2 Sony3 Sony4 2 12 4 8 12 Sony2
2 Mark1 Mark2 Mark3 Mark4 5 13 0 3 13 Mark2
3 Biky1 Biky2 Biky3 Biky4 7 7 5 44 44 Biky4
4 Rose1 Rose2 Rose3 Rose4 7 0 9 7 9 Rose3
In excel we, can write something like this:
=IF(I2=H2,D2,IF(I2=G2,C2,IF(I2=F2,B2,IF(I2=E2,A2,""))))
CodePudding user response:
You can first filter the df into two parts , then we use the value position locate the Name
v = df.filter(regex = '^Value')
name = df.filter(regex = '^Name')
df['out'] = name.values[df.index, v.columns.get_indexer(v.idxmax(1))]
df
Out[188]:
Name1 Name2 Name3 Name4 Value1 Value2 Value3 Value4 MaxValue out
0 John1 John2 John3 John4 10 3 5 7 10 John1
1 Sony1 Sony2 Sony3 Sony4 2 12 4 8 12 Sony2
2 Mark1 Mark2 Mark3 Mark4 5 13 0 3 13 Mark2
3 Biky1 Biky2 Biky3 Biky4 7 7 5 44 44 Biky4
4 Rose1 Rose2 Rose3 Rose4 7 0 9 7 9 Rose3
CodePudding user response:
You can first create a column that will show you which 'Name' column should you return back using idxmax(). Then you can stack() your 'Name' columns and merge this result with your created column above based on index and 'Name':
# Create a helper column
v_c = [c for c in df if c.startswith('Value')]
df['id_col'] = df[v_c].idxmax(axis=1).str.replace('Value','Name')
# Merge the helper column with your stacked 'Name' columns
n_c = df.filter(like='Name').columns
res = pd.merge(df[n_c].stack().reset_index(),df[['id_col']].reset_index(),left_on=['level_0','level_1'], right_on=['index','id_col'])[0]
# Assign as a column
df['Final_Name'] = res
prints:
Name1 Name2 Name3 Name4 ... Value4 MaxValue id_col Final_Name
0 John1 John2 John3 John4 ... 7 10 Name1 John1
1 Sony1 Sony2 Sony3 Sony4 ... 8 12 Name2 Sony2
2 Mark1 Mark2 Mark3 Mark4 ... 3 13 Name2 Mark2
3 Biky1 Biky2 Biky3 Biky4 ... 44 44 Name4 Biky4
4 Rose1 Rose2 Rose3 Rose4 ... 7 9 Name3 Rose3
[5 rows x 11 columns]

