I have a dataframe where I would like to append values to if the id value is the same. I do not wish to have duplicate rows in my df1 dataframe
Data
df1
id stat
aa11rr y
aa11rr y
aa11rr y
bb123 n
bb123 n
df2
loco country temp
aa11rr ny 55
aa11rr ny 60
aa11rr ny 65
bb123 ca 75
bb123 ca 70
aa11rr ny 70
bb123 ca 90
bb123 ca 95
Desired
id stat loco country temp
aa11rr y aa11rr ny 55
aa11rr y aa11rr ny 55
aa11rr y aa11rr ny 60
bb123 n bb123 ca 75
bb123 n bb123 ca 70
Doing
pd.concat([df1, df2.reindex(df1.index)], axis=1)
However, I would like my smaller dataset to not have more rows than it started with
CodePudding user response:
IIUC
extract = lambda x: pd.merge(x, df2, left_on='id', right_on='loco').head(len(x))
out = df1.groupby('id').apply(extract).reset_index(drop=True)
print(out)
# Output
id stat loco country temp
0 aa11rr y aa11rr ny 55
1 aa11rr y aa11rr ny 60
2 aa11rr y aa11rr ny 65
3 bb123 n bb123 ca 75
4 bb123 n bb123 ca 70
CodePudding user response:
I am not sure if your example reflects what you want to do. However, getting all rows from df2 to df1 where they agree on df1.id and df2.loco is called left outer join. You can do so by pandas.DataFrame.merge() method. Documentation is here. This should do the trick:
df1.merge(df1, left_on='id', right_on='loco', how='left')
