Home > database >  How to add new column in dataframe based on the other dataframe?
How to add new column in dataframe based on the other dataframe?

Time:01-12

Hi I have 2 dataframes but both are not same. I have to update one based on the 2nd.

Example: df1:

Region  Sub_Region  Run_Date    Status   Reason
ASPAC   CRM         2022-01-11  Success
ASPAC   Genesys     2022-01-11  Failed
LATAM   CRM         2022-01-11  Success

df2:

Region   Sub_Region   Max_Load_Date
NA       CRM          2021-08-12
ASPAC    CRM          2021-03-15
LATAM    CRM          2021-10-11
NA       Genesys      2021-12-18
ASPAC    Genesys      2021-11-18
LATAM    Genesys      2021-04-16

In Final Output:

Region  Sub_Region  Run_Date    Status   Reason  Max_Load_Date
ASPAC   CRM         2022-01-11  Success          2021-03-15
ASPAC   Genesys     2022-01-11  Failed
LATAM   CRM         2022-01-11  Success          2021-10-11

The df1 will update when the status = 'Success'

CodePudding user response:

Filter out your dataframe after merge:

df1['Max_Load_Date'] = df1.merge(df2, on=['Region', 'Sub_Region'], how='left') \
                          .query("Status == 'Success'")['Max_Load_Date']
print(df)

# Output
  Region Sub_Region    Run_Date   Status  Reason Max_Load_Date
0  ASPAC        CRM  2022-01-11  Success     NaN    2021-03-15
1  ASPAC    Genesys  2022-01-11   Failed     NaN           NaN
2  LATAM        CRM  2022-01-11  Success     NaN    2021-10-11
  •  Tags:  
  • Related