My data looks somewhat like this:
| key | city | currentCityKey |
|---|---|---|
| 1 | Boston | NaN |
| 2 | New York | 1 |
| 3 | Concord | 2 |
I'd like to create a new column with the current city name, which needs to be found by finding the city that corresponds to the currentCityKey value, returning this:
| key | city | currentCityKey | currentCity |
|---|---|---|---|
| 1 | Boston | NaN | NaN |
| 2 | New York | 1 | Boston |
| 3 | Concord | 2 | New York |
I've tried a lot of things, but I run into problems trying to return the value from the city column. This should just be a simple if currentCityKey is notnull then city where currentCityKey == key, but I just can't seem to get from this logic to actual code.
Raw input:
df = pd.DataFrame({'key': [1, 2, 3],
'city': ['Boston', 'New York', 'Concord'],
'currentCityKey': [nan, 1.0, 2.0]})
CodePudding user response:
Create a mapper with the 'key' and 'city' columns and use map on 'currentCityKey' column to obtain 'currentCity' column:
df['currentCity'] = df['currentCityKey'].map(df.set_index('key')['city'])
Output:
key city currentCityKey currentCity
0 1 Boston NaN NaN
1 2 New York 1.0 Boston
2 3 Concord 2.0 New York
