In a dataframe, each row in column B contains a list, which is stored with the object data type. I want to compare each row in B with another list. If they are equal in value, I want to add a new column indicating TRUE, otherwise FALSE.
df
B
"1st", "2nd", "3rd"
"2nd", "1st", "4th"
"1st", "2nd", "3rd"
I use iterrows() to loop over the rows:
for index, row in df[['B']].iterrows():
    set(row['B']) == set(['1st', '2nd', '3rd'])
The error is TypeError: 'float' object is not iterable, but I don't have any numeric values in the comparison. Why does this happen? Also, if I print(set(row['B']) == set(['1st', '2nd', '3rd'])), it shows TRUE/FALSE. Why is that?
CodePudding user response:
That error indicates that there are some NaN values in "B". NaN is a float, so set() cannot iterate over it. Check whether
len(df['B']) == len(df['B'].dropna())
is True; if it is False, "B" contains NaN.
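To see which rows are affected rather than only whether any exist, something like the following may help. It is a minimal sketch on made-up sample data (the dataframe contents and the names missing and n_missing are assumptions, not taken from the question); it relies on Series.isna() to flag the missing entries.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the last row is NaN, as the error suggests.
df = pd.DataFrame({"B": [["1st", "2nd", "3rd"], ["2nd", "1st", "4th"], np.nan]})

# NaN is a float, which is why set(row['B']) raises
# "TypeError: 'float' object is not iterable" on such a row.
print(type(np.nan))

# Boolean mask that is True where "B" is missing.
missing = df["B"].isna()

# Show only the offending rows, then count them.
print(df[missing])
n_missing = int(missing.sum())
print(n_missing)
```

If n_missing is greater than zero, those rows need to be skipped or filled before calling set() on them.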
Also, instead of iterrows, you can use a list comprehension that checks whether each value is a list:
import numpy as np
import pandas as pd
df = pd.DataFrame({'B':[["1st", "2nd", "3rd"], ["2nd", "1st", "4th"], ["1st", "2nd", "3rd"], np.nan]})
df['new_col'] = [set(x)==set(['1st', '2nd', '3rd']) if isinstance(x, list) else False for x in df['B']]
Output:
new_col
0 True
1 False
2 True
3 False
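If the same check is needed in more than one place, it could be wrapped in a small function and applied to the column. This is only a sketch: TARGET and matches_target are hypothetical names, and the sample rows are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical reference values to compare each row against.
TARGET = {"1st", "2nd", "3rd"}

def matches_target(value):
    """Return True only for a list whose distinct items equal TARGET."""
    if not isinstance(value, list):
        # NaN (a float) and any other non-list value count as a mismatch.
        return False
    return set(value) == TARGET

df = pd.DataFrame({"B": [["1st", "2nd", "3rd"],
                         ["3rd", "2nd", "1st"],
                         ["1st", "2nd", "4th"],
                         np.nan]})

# Apply the check element by element.
df["new_col"] = df["B"].apply(matches_target)
print(df)
```

Note that comparing sets ignores order and duplicates, so the second row above also counts as a match. If order matters, comparing the lists directly (value == ['1st', '2nd', '3rd']) would be the stricter test.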
