I need to fill the null values in City column based on Country column. If the country is same then it should fill the NaN value with the frequent city name corresponding to the country.
Input:
ID City Country
0 New York USA
1 NaN USA
2 London UK
3 Mumbai IND
4 Sydney AUS
5 NaN AUS
6 Sydney AUS
7 Brisbane AUS
Output:
ID City Country
0 New York USA
1 New York USA
2 London UK
3 Mumbai IND
4 Sydney AUS
5 Sydney AUS
6 Sydney AUS
7 Brisbane AUS
CodePudding user response:
Idea is replace possible empty strings to NaNs and then replace values of group by first non NaNs values:
df['City'] = (df.groupby('Country')['City']
.transform('first'))
Or forward and back filling missing values:
df['City'] = (df.groupby('Country')['City']
.transform(lambda x: x.ffill().bfill()))
