I have this df
| A | B |
|---|---|
| 111 | 4 |
| 111 | 4 |
| 112 | 0 |
| 112 | 2 |
| 113 | 3 |
| 113 | 3 |
| 114 | nan |
| 114 | 1 |
I want to replace nan and 0 values with other values from col B for the corresponding item from col A as follows:
| A | B |
|---|---|
| 111 | 4 |
| 111 | 4 |
| 112 | 2 |
| 112 | 2 |
| 113 | 3 |
| 113 | 3 |
| 114 | 1 |
| 114 | 1 |
I tried this but this not returning the correct values
df['B'].fillna(0)
df=df.merge(df[B > 0].groupby('$LINK:NO').size().reset_index(name='B'), on='A')
CodePudding user response:
Replace values less or equal 0 to missing values in Series.where, so possible get first non missing values per groups by GroupBy.transform with GroupBy.first:
df['B'] = (df.assign(new = df['B'].where(df['B'].gt(0)))
.groupby('A')['new']
.transform('first'))
print (df)
A B
0 111 4.0
1 111 4.0
2 112 2.0
3 112 2.0
4 113 3.0
5 113 3.0
6 114 1.0
7 114 1.0
Another idea is sorting use max:
df['B'] = df.sort_values('B').groupby('A').transform('max')
print (df)
A B
0 111 4.0
1 111 4.0
2 112 2.0
3 112 2.0
4 113 3.0
5 113 3.0
6 114 1.0
7 114 1.0
