Let’s say I have the following Pandas dataframe, where the 'key' column only contains unique strings:
import pandas as pd
df = pd.DataFrame({'key':['b','d','c','a','e','f'], 'value': [0,0,0,0,0,0]})
df
key value
0 b 0
1 d 0
2 c 0
3 a 0
4 e 0
5 f 0
Now I have a list of unique keys and a list of corresponding values:
keys = ['a', 'b', 'c', 'd']
values = [1, 2, 3, 4]
I want to update the 'value' column in the same order of the lists, so that each row has matched 'key' and 'value' (a to 1, 'b' to 2, 'c' to 3, 'd' to 4). I am using the following code, but the dataframe seems to update values from top to bottom, which I don't quite understand
df.loc[df['key'].isin(keys),'value'] = values
df
key value
0 b 1
1 d 2
2 c 3
3 a 4
4 e 0
5 f 0
To be clear, I am expecting to get
key value
0 b 2
1 d 4
2 c 3
3 a 1
4 e 0
5 f 0
Any suggestions?
CodePudding user response:
Use map:
dd = dict(zip(keys, values))
df['value'] = df['key'].map(dd).fillna(df['value'])
CodePudding user response:
keys = ['a', 'b', 'c', 'd']
values = [1, 2, 3, 4]
# form a dictionary with keys and values list
d=dict(zip(keys, values))
# update the value where mapping exists using LOC and MAP
df.loc[df['key'].map(d).notna(), 'value'] =df['key'].map(d)
df
key value
0 b 2
1 d 4
2 c 3
3 a 1
4 e 0
5 f 0
CodePudding user response:
with a temporary dataframe:
temp_df = df.set_index('key')
temp_df.loc[keys] = np.array(values).reshape(-1, 1)
df = temp_df.reset_index()
