Issue detecting nan in for loop using numpy

Why can't I detect the np.nan value in data using np.isnan() in the list comprehension below? Does the list comprehension transform the type of values in some way?

import numpy as np
import pandas as pd

data = pd.DataFrame({'col':['a', 'b', np.nan]})

[print('NaN') if np.isnan(i) else print('Not NaN') for i in data.col]

CodePudding user response：

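The list comprehension does not change the values; you get into trouble with np.isnan() because of the mixed types in the column. The strings 'a' and 'b' reach np.isnan() as plain Python strings, and np.isnan() raises a TypeError on anything that is not numeric. From pandas' docs: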

Because NaN is a float, a column of integers with even one missing values is cast to floating-point dtype (see Support for integer NA for more)

Therefore you should consider, as @saeedghadiri suggested, using pd.isna():

[print('NaN') if pd.isna(i) else print('Not NaN') for i in data.col]
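
As a side note, the same check can be done without a Python-level loop. This is only a minimal sketch, assuming the data frame from the question; the names mask and labels are made up for illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'col': ['a', 'b', np.nan]})

# Series.isna() tests every element at once and accepts mixed-type columns
mask = data['col'].isna()

# Map the boolean mask to the two labels printed in the question
labels = np.where(mask, 'NaN', 'Not NaN').tolist()
print(labels)
```

Building a list of labels first, instead of calling print() inside a comprehension, also avoids creating a throwaway list of None values.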

CodePudding user response：

If we look closer at your code, col has 3 values: 'a', 'b' and np.nan. The first two are strings, the third is a float.

However, np.isnan is not designed for string types, so it raises a TypeError. The following will work:

data = pd.DataFrame({'col':[1, 2, np.nan]})

[print('NaN') if np.isnan(i) else print('Not NaN') for i in data.col]

If you want to detect np.nan among values of arbitrary types (strings included), you should use pd.isna instead.
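
To make the difference between the two frames visible, here is a small sketch that inspects the element types and catches the error. The names mixed, numeric, element_types and raised are hypothetical and only serve the illustration.

```python
import numpy as np
import pandas as pd

mixed = pd.DataFrame({'col': ['a', 'b', np.nan]})
numeric = pd.DataFrame({'col': [1, 2, np.nan]})

# The all-numeric column is stored as floats, so np.isnan works on it
print(numeric['col'].dtype)

# Iterating the mixed column converts nothing: the strings stay str
# and the missing value stays a float
element_types = [type(v).__name__ for v in mixed['col']]
print(element_types)

# np.isnan only accepts numeric input, so a str raises TypeError
try:
    np.isnan('a')
    raised = False
except TypeError:
    raised = True
print(raised)
```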

CodePudding user response：

I prefer to use pd.isna every time I check for NaNs. So your code changes to:

[print('NaN') if pd.isna(i) else print('Not NaN') for i in data.col]

The output would be:

Not NaN
Not NaN
NaN

CodePudding user response：

As stated in the answer here, use pd.isnull(i) (an alias of pd.isna) instead of np.isnan(i), because np.isnan() doesn't work for the str type.

[print('NaN') if pd.isnull(i) else print('Not NaN') for i in data.col]
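
For what it's worth, pd.isnull and pd.isna can be used interchangeably. The sketch below checks that and shows which scalars they treat as missing; the values list is an arbitrary example, not taken from the question.

```python
import numpy as np
import pandas as pd

# pd.isnull is the same function object as pd.isna
print(pd.isnull is pd.isna)

# None, np.nan and pd.NaT count as missing; ordinary strings and numbers do not
values = ['a', None, np.nan, pd.NaT, 1.5]
flags = [bool(pd.isna(v)) for v in values]
print(flags)
```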