I have a dataframe df1:
| col1 | col2 | col3 |
|---|---|---|
| Apple | Apple | Apple |
| orange | 0 | orange |
| Cake | 0 | 0 |
| 0 | Banana | 0 |
| 0 | grape | grape |
Logic: add the common value to a new column. If a value other than zero is present in only one or two of the columns, compare them and add the common value.
I want to add a new column (New_col) to this dataframe with the values shown below:
| col1 | col2 | col3 | New_col |
|---|---|---|---|
| Apple | Apple | Apple | Apple |
| orange | 0 | orange | orange |
| Cake | 0 | 0 | Cake |
| 0 | Banana | 0 | Banana |
| 0 | grape | grape | grape |
Any suggestions on how to do this? Thank you in advance!
CodePudding user response:
Assuming you want the first non-zero value per row, you can mask the zeros, back-fill along each row with bfill, and then take the first column:
df['NewCol'] = df.mask(df.eq('0')).bfill(axis=1).iloc[:,0]
NB. I am also assuming that 0 is a string. You can adapt this code to your needs.
Output:
| | col1 | col2 | col3 | NewCol |
|---|---|---|---|---|
| 0 | Apple | Apple | Apple | Apple |
| 1 | orange | 0 | orange | orange |
| 2 | Cake | 0 | 0 | Cake |
| 3 | 0 | Banana | 0 | Banana |
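For reference, here is a self-contained sketch of the mask-and-bfill approach applied to the data from the question. It assumes the zeros are stored as the string '0' (if they are integers, compare with 0 instead), and the helper names cols and masked are only illustrative:

```python
import pandas as pd

# Sample data from the question; zeros are assumed to be the string "0".
df = pd.DataFrame({
    "col1": ["Apple", "orange", "Cake", "0", "0"],
    "col2": ["Apple", "0", "0", "Banana", "grape"],
    "col3": ["Apple", "orange", "0", "0", "grape"],
})

# Restrict the operation to the source columns so that re-running the
# assignment does not pick up the new column itself.
cols = ["col1", "col2", "col3"]

# Replace every "0" with NaN, then back-fill along each row so the first
# non-zero value ends up in the leftmost column.
masked = df[cols].mask(df[cols].eq("0"))
df["New_col"] = masked.bfill(axis=1).iloc[:, 0]

print(df)
```

A row that contains only zeros would end up with NaN in New_col, because there is nothing to back-fill from.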
CodePudding user response:
An alternative to @mozway's solution: for each row, use .ne("0") and .idxmax inside .apply to find the label of the first value different from '0', then assign that value to the new column.
>>> df["new_col"] = df.apply(lambda x: x[x.ne("0").idxmax()], axis=1)
>>> df
col1 col2 col3 new_col
0 Apple Apple Apple Apple
1 orange 0 orange orange
2 Cake 0 0 Cake
3 0 Banana 0 Banana
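A runnable sketch of this row-wise variant on the question's data might look like the following. It again assumes the zeros are the string '0'; the cols list and the first_non_zero helper are hypothetical names added here for readability:

```python
import pandas as pd

# Sample data from the question; zeros are assumed to be the string "0".
df = pd.DataFrame({
    "col1": ["Apple", "orange", "Cake", "0", "0"],
    "col2": ["Apple", "0", "0", "Banana", "grape"],
    "col3": ["Apple", "orange", "0", "0", "grape"],
})

cols = ["col1", "col2", "col3"]


def first_non_zero(row: pd.Series) -> str:
    # row.ne("0") is True where the value differs from "0"; idxmax returns
    # the label of the first True, which is then used to look up the value.
    return row[row.ne("0").idxmax()]


df["new_col"] = df[cols].apply(first_non_zero, axis=1)

print(df)
```

Note that if a row contains only zeros, idxmax falls back to the first column, so new_col would hold '0' for that row. Because .apply with axis=1 calls the function once per row, this is likely to be slower than the vectorised mask and bfill approach on large frames.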
