I have these two dataframes:
Df1 = pd.DataFrame({'street1': ['1, 2', '1, 3', '1, 4', '123']})
Df2 = pd.DataFrame({'street2': ['1, 1', '1, 2', '1, 2']})
I added a column that tells me if the values of Df1 are in Df2 with the following code.
Df1["InDf2"] = Df1.street1.isin(Df2.street2).astype('bool')
I need to make another column Df1['Df2Index'], containing the index number (or numbers) where Df2 is equal to Df1. How can I do this?
CodePudding user response:
reset the index of the second DataFrame to get the index as a column. then merge on the desired columns. after that drop the extra column and NaNs created in the other columns.
Df1 = Df1\
.merge(Df2.reset_index(), how='outer', right_on='street2', left_on='street1')\
.drop('street2', axis=1)\
.dropna(subset=['street1'])
output:
street1 InDf2 index
0 1, 2 True 1.0
1 1, 2 True 2.0
2 1, 3 False NaN
3 1, 4 False NaN
4 123 False NaN
CodePudding user response:
lambda function helps to do that
l = df2['street2'].values
df1['InDf2'] = df1.apply(lambda row: 'True' '-' str(list(df2[df2['street2'] == row['street1']].index.values)) if row['street1'] in l else 'False-Not Exist', axis = 1)
df1[['InDf2', 'Df2_index']] = df1['InDf2'].str.split('-', expand=True)
df1
street1 InDf2 Df2_index
0 1, 2 True [1, 2]
1 1, 3 False Not_Exist
2 1, 4 False Not_Exist
3 123 False Not_Exist
