How to update column value of one dataframe with other dataframe value in pandas?-CodePudding

I have following 2 dataframes -

df1:

                            job                                              e_msg
0  f_maf_crm_customer_request_global_jgvcc  permission denied for relation f_maf_custome

df2:

                 master_job                           error_msg      current_count
0    JGCC_Genesys_Conversations_Aggregates
1  f_maf_crm_customer_request_global_jgvcc                              100

if df1 is having the same job name so df2 should update error_msg with df1.e_msg and current_count should change as 0.

In final output:

             master_job                               error_msg                            current_count
0    JGCC_Genesys_Conversations_Aggregates
1  f_maf_crm_customer_request_global_jgvcc  permission denied for relation f_maf_custome     0

CodePudding user response：

A simple approach would be this one liner.

df2.loc[df2.master_job.isin(df1.job), ['error_msg', 'current_count']] = [df1['e_msg'].values, 0]

This gives the output you want. Please note that I have abbreviated the error_msg for brevity. You can test it out yourself and see that the output message is the same as the one you are using.

    job                                     error_msg             current_count
0   JGCC_Genesys_Conversations_Aggregates       
1   f_maf_crm_customer_request_global_jgvcc permission denied...    0.0

What that single line of code does is quite simple. It checks if the records in df1 exist in df2 and if it does, then the values in df2 are changed accordingly to match your requirement. You can read more about isin here.

CodePudding user response：

You can try an outer join on the two dataframes from the merge function in pandas. If you don't know what joins are then refer to this webpage.

Code -

merged_df = pd.merge(df1, df2, on = "master_job", how = "outer")
merged_df

Output -

	master_job	error_msg_x	current_count
0	JGCC_Genesys_Conversations_Aggregates	nan
1	f_maf_crm_customer_request_global_jgvcc	permission denied for relation f_maf_custome	100

Here we are getting two error_msg columns because error_msg_x is from df1 and error_msg_y is from df2. Because error_msg_y isn't of any use to use, we can drop it using the code below.

merged_df = merged_df.drop(columns = "error_msg_y")

We can merge the two dataframes and drop the error_msg_y column in just one line using this code,

merged_df = pd.merge(df1, df2, on = "master_job", how = "right").drop(columns = "error_msg_y")