Home > Net >  How do I read two CSV files and compare 1 column from both and then write to a new file where column
How do I read two CSV files and compare 1 column from both and then write to a new file where column

Time:01-25

I have 2 CSV's with some names in them. There are few duplicates in each CSV, I need to compare them and if the same names exist in both CSV, update in new CSV.

CSV 1

ID, Names
    1, Bill
    2, Smith

CSV 2

ID, Names
    1, James
    2, Gates
    3, Bill

As first row in CSV 1 matches with 3rd row in CSV 2

New CSV [Desired Output]

ID, Names
    1, Bill

I've tried this:

import pandas as pd

df1 = pd.read_csv('csv1.csv')
df2 = pd.read_csv('csv2.csv')

df1['flag'] = 'old'
df2['flag'] = 'new'

df = pd.concat([df1, df2])

dups = df.drop_duplicates(df.columns.difference(['flag']), keep = False)
dups.to_csv('new_CSV.csv', index = False)

CodePudding user response:

Try with merge

out = df1.merge(df2[['Names']],how='inner',on='Names')
Out[44]: 
   ID  Names
0   1   Bill
  •  Tags:  
  • Related