I have the following df:
| idx | value |
|---|---|
| 0 | 1 |
| 1 | 2 |
| 2 | 3 |
I want to create a new column based on a looking up the value for a given row in conjunction with a constant value in the following matrix:
| x | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 1 | 0.3 | 0.4 |
| 2 | 0.3 | 1 | 0.2 |
| 3 | 0.4 | 0.2 | 1 |
So if constant == 1 then the resulting dataframe would look like:
| idx | value | lookup |
|---|---|---|
| 0 | 1 | 1 |
| 1 | 2 | 0.3 |
| 2 | 3 | 0.4 |
Existing answers I've seen reference:
pandas.lookupwhich is deprecatednumpy.selectornumpy.wherewhich aren't really scalable if the lookup matrix gets large
I think there might be a solution around pandas.merge but I'm struggling to get my head around it. Would anyone have any suggestions?
I should probably mention that speed is really important so a vectorised solution would be preferable.
To replicate:
df = pd.DataFrame({"value": [1, 2, 3]})
df_lookup = pd.DataFrame(
index=[1, 2, 3],
data={1: [1, 0.3, 0.4], 2: [0.3, 1, 0.2], 3: [0.4, 0.2, 0.1]}
)
CodePudding user response:
IIUC, you can directly map the row to value column
constant = 1
df['lookup'] = df['value'].map(df_lookup.loc[constant, :])
print(df)
value lookup
0 1 1.0
1 2 0.3
2 3 0.4
CodePudding user response:
If you want to make your life easy and the value column in df is non-repeating, then you could set it as the index and simply use loc.
constant = 1
df['mapped'] = df_lookup.loc[df.index, constant]
Which gives me :
value lookup
1 1 1.0
2 2 0.3
3 3 0.4
However, if it is repeating, this would not work. Then you could use .merge such as :
constant = 1
df = df.merge(df_lookup[constant], left_on='value', right_index=True)
