When I compute match = a.isin(b), the number of matches doesn't equal the number of matches in match2 = b.isin(a). Here a and b are DataFrame columns (Series), and a match is each True value in the result. I think of a.isin(b) as returning True for those elements of a found in b, and b.isin(a) as returning True for those elements of b found in a. I would expect an equal number of matches; why is that not the case? I have len(match) >> len(match2). Is this possible?
CodePudding user response:
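I think you are confused about what isin does. It tests each element of the calling Series individually, so a value that occurs several times is counted every time it occurs. Consider: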
import pandas as pd
a = pd.Series([1, 1, 1, 2, 2, 3])
b = pd.Series([1, 2, 2, 4])
Then a.isin(b) has the same length (and index) as a:
pd.Series([True, True, True, True, True, False])
while b.isin(a) has the same length (and index) as b:
pd.Series([True, True, True, False])
What will be the same? The unique values:
set(a[a.isin(b)]) == set(b[b.isin(a)])
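Putting the pieces above together, here is a minimal runnable sketch using the same example Series. The variable names n_match, n_match2, common_from_a and common_from_b are only illustrative and not from the question. Note that len(match) is always just len(a), so the number of True values has to be counted with sum() rather than len():

```python
import pandas as pd

a = pd.Series([1, 1, 1, 2, 2, 3])
b = pd.Series([1, 2, 2, 4])

match = a.isin(b)    # one boolean per element of a
match2 = b.isin(a)   # one boolean per element of b

# Count the True values; len() would only return the Series lengths.
n_match = int(match.sum())
n_match2 = int(match2.sum())

# The distinct matched values agree even though the counts differ.
common_from_a = set(a[match])
common_from_b = set(b[match2])

print(len(match), len(match2))         # lengths of a and b
print(n_match, n_match2)               # numbers of True values
print(common_from_a == common_from_b)  # same set of shared values
```

The counts differ because a contains the value 1 three times while b contains it once, and each occurrence is tested separately.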
