so, I have two dataframes and I'm attempting the match the column 'numbers' from dataframe 1 to the content of the full dataframe 2, and extract the column header and use that as a label/new column in dataframe 1. Struggling to build a pipeline for this though. I am working in Python.
dataframe 1
| numbers |
|---|
| 100 |
| 101 |
| 102 |
| 103 |
| 200 |
| 201 |
| 202 |
| 203 |
| 300 |
| 301 |
| 302 |
| 303 |
dataframe 2:
| construction | fields | plates |
|---|---|---|
| 100 | 200 | 300 |
| 101 | 201 | 301 |
| 102 | 202 | 302 |
| 103 | 203 | 303 |
output [with new column label matched from dataframe 2]:
| numbers | label |
|---|---|
| 100 | construction |
| 101 | construction |
| 102 | construction |
| 103 | construction |
| 200 | fields |
| 201 | fields |
| 202 | fields |
| 203 | fields |
| 300 | plates |
| 301 | plates |
| 302 | plates |
| 303 | plates |
CodePudding user response:
Use melt to flatten your second dataframe then merge it with your first dataframe:
>>> df1.merge(df2.melt(var_name='label', value_name='numbers'), on='numbers')
numbers label
0 100 construction
1 101 construction
2 102 construction
3 103 construction
4 200 fields
5 201 fields
6 202 fields
7 203 fields
8 300 plates
9 301 plates
10 302 plates
11 303 plates
After melt, your second dataframe looks like:
>>> df2.melt(var_name='label', value_name='numbers')
label numbers
0 construction 100
1 construction 101
2 construction 102
3 construction 103
4 fields 200
5 fields 201
6 fields 202
7 fields 203
8 plates 300
9 plates 301
10 plates 302
11 plates 303
Another way with map:
df1['label'] = df1['numbers'].map(df2.melt().set_index('value').squeeze())
print(df)
# Output
numbers label
0 100 construction
1 101 construction
2 102 construction
3 103 construction
4 200 fields
5 201 fields
6 202 fields
7 203 fields
8 300 plates
9 301 plates
10 302 plates
11 303 plates
