Newbie with Python. I have 2 csv files that both have student last name and student first name (Re-enrolledonline.csv and currentschoolroster.csv - the number of columns and column headings are different but both files contain this information). I then need to create a CSV with a list of students that are NOT found in Re-enrolledonline.csv but ARE in currentschoolroster.csv. I can match one column without an issue with the code below but I can't seem to find a good resource on how to only write to the new file if BOTH first name and last name columns are matched.
Thank you!
import pandas as pd
f1 = pd.read_csv('/users/Desktop/CompareFiles/currentschoolroster.csv')
f2 = pd.read_csv('/users/Desktop/CompareFiles/Re-enrolledonline.csv')
notmatched = f1[~f1.Last_Name.isin(f2.StudentLastName)]
notmatched.to_csv('/users/Desktop/CompareFiles/notmatched.csv')
CodePudding user response:
Pandas uses & for and, | for or in indexing: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#boolean-indexing
That should work fine:
notmatched = f1[~f1.Last_Name.isin(f2.StudentLastName) \
& (~f1.First_Name.isin(f2.StudentFirstName))]
PS. Beware, if the CSVs were big, isin() would be very slow.
