Calculated column based on secondary key in Pandas Dataframe-CodePudding

My data looks somewhat like this:

key	city	currentCityKey
1	Boston	NaN
2	New York	1
3	Concord	2

I'd like to create a new column with the current city name, which needs to be found by finding the city that corresponds to the currentCityKey value, returning this:

key	city	currentCityKey	currentCity
1	Boston	NaN	NaN
2	New York	1	Boston
3	Concord	2	New York

I've tried a lot of things, but I run into problems trying to return the value from the city column. This should just be a simple if currentCityKey is notnull then city where currentCityKey == key, but I just can't seem to get from this logic to actual code.

Raw input:

df = pd.DataFrame({'key': [1, 2, 3],
 'city': ['Boston', 'New York', 'Concord'],
 'currentCityKey': [nan, 1.0, 2.0]})

CodePudding user response：

Create a mapper with the 'key' and 'city' columns and use map on 'currentCityKey' column to obtain 'currentCity' column:

df['currentCity'] = df['currentCityKey'].map(df.set_index('key')['city'])

Output:

   key      city  currentCityKey currentCity
0    1    Boston             NaN         NaN
1    2  New York             1.0      Boston
2    3   Concord             2.0    New York