Home > OS >  Update/merge dataframes based values in each row
Update/merge dataframes based values in each row

Time:01-05

I have two dataframes:

import pandas as pd
df1 = pd.DataFrame({
    "id": [1,2,3,4,5,6], "c1": [1,2,3,4,5,6], "c2": [1,2,3,4,5,6], "c3": [1,2,3,4,5,6]
})

and

df2 = pd.DataFrame({
    "id": [1,2,3,4,5,6], "column": ["c1", "c2", "c3", "c1", "c2", "c3"], "new-value": [10,20,30,40,50,60]
})

I would like to update df1 based on information from df2 so that the result is:

df3 = pd.DataFrame({
    "id": [1,2,3,4,5,6], "c1": [10,2,3,40,5,6], "c2": [1,20,3,4,50,6], "c3": [1,2,30,4,5,60]
})
  1. Is it possible to do this using pandas?
  2. Are update/merge viable options for this?

CodePudding user response:

We can reshape df2 using pivot, then use it to substitute values in df1

df1.replace(df2.pivot(*df2.columns)).fillna(df1)

   id    c1    c2    c3
0   1  10.0   1.0   1.0
1   2   2.0  20.0   2.0
2   3   3.0   3.0  30.0
3   4  40.0   4.0   4.0
4   5   5.0  50.0   5.0
5   6   6.0   6.0  60.0

CodePudding user response:

It is doable, but I am not sure if it is the best way to do such operation.

  1. The hard way:
df1 = df1.set_index('id')
for idx, row in df2.iterrows():
    df1.loc[row['id'], row['column']] = row['new-value']
df1.reset_index()
  1. The more pythonic way:
df2 = df2.pivot(index='id', columns='column', values='new-value').reset_index()
df1.update(df2)

CodePudding user response:

It is possible with pandas with the following code:

ids = df2["id"].to_list()
columns = df2.columns.to_list()
new_values = df2["new-value"].to_list()

for id in ids:
    for col in columns:
        for value in new_values:
            df3.loc[id, col] = value

Probably not optimal if your datasets are very large.

  •  Tags:  
  • Related