How to compare 2 Excel sheets and extract the common values?-CodePudding

Table I

c1	c2	c3
Ant	Bee	Apple
A	B	C
4	5	6

Table II

A1	A2	c3
G	H	Apple
H	s	C
8	asd	4

I would like to know if we can compare these 2 excel workbook (comparison based on the column c3) and extract only the similar values along with the entire row value of that matched value using Pandas

Expected Output:

c1	c2	c3	A1	A2
Ant	Bee	Apple	G	H
A	B	C	H	s

CodePudding user response：

Here's my solution. Basically, you should:

merge dfs with outer join on "c3"

drop all rows that have any NaN values

  import pandas as pd

  my_df = pd.DataFrame({'c1': ['Ant', 'A', 4],
                        'c2': ['Bee', 'B', 5],
                        'c3': ['Apple', 'C', 6]
                        })


  my_df2 = pd.DataFrame({'A1': ['G', 'H', 8],
                         'A2': ['H', 's', 'asd'],
                         'c3': ['Apple', 'C', 4]
                        })


  final = my_df.merge(my_df2, on='c3', how='outer').dropna()

Result:

        c1 |  c2 |    c3 | A1  | A2
    -------------------------------
       Ant | Bee | Apple | G   | H
         A |  B  |    C  | H   | s

CodePudding user response：

print (pd.merge(df1, df2, on='c3'))