How can I merge two data frames when the column has a slight offset than the column I am merging to?
df1 =
| col1 | col2 |
|---|---|
| 1 | a |
| 2 | b |
| 3 | c |
df2 =
| col1 | col3 |
|---|---|
| 1.01 | d |
| 2 | e |
| 2.95 | f |
so, the merged column would end up like this even though the values in col1 are slightly different.
df_merge =
| col1 | col2 | col3 |
|---|---|---|
| 1 | a | d |
| 2 | b | e |
| 3 | c | f |
I have seen scenarios like this where "col1" is a string, but I'm wondering if it's possible to do this with something like pandas.merge() in the scenario where there is slight numerical offset (e.g /- 0.05).
CodePudding user response:
Lets do merge_asof with tolerance parameter
pd.merge_asof(
df1.astype({'col1': 'float'}).sort_values('col1'),
df2.sort_values('col1'),
on='col1',
direction='nearest',
tolerance=.05
)
col1 col2 col3
0 1.0 a d
1 2.0 b e
2 3.0 c f
PS: if the dataframes are already sorted on col1 then there is no need to sort again.
