How to ignore NaNs when reassigning values to variables

I imported and concatenated several CSV files. All of them contain the variable "prac_type", but the observations are coded differently: some files use strings (yes, no, unsure) while others use numbers (1, 2, 3). Here is a look at the variable:

print(df.prac_type.unique())

[nan 1.0 2.0 3.0 '1' '2' 'Unsure']

I want 1.0 to merge into '1' (they represent the same thing), 2.0 to become '2', and both 3.0 and 'Unsure' to become '3'. The variable should end up like this:

print(df.prac_type.unique())
[ '1' '2' '3']

I tried doing this:

prac_dic = {'1.0': 1,'2.0': 2 , '3.0':3, 'Unsure':3}
  
df.prac_type = [prac_dic[item] for item in df.prac_type]
print(df.prac_type.unique())

But I get an error (KeyError: nan) because my variable prac_type has NaNs. I don't want to drop the NaNs though. So how can I get my code to ignore the missing values and reassign the values?

CodePudding user response：

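Add an explicit check for NaN so that missing values are passed through instead of being looked up (this assumes pandas is imported as pd and numpy as np):

df.prac_type = [prac_dic[item] if pd.notnull(item) else np.nan for item in df.prac_type]

Note that every non-missing value in the column (for example the float 1.0 and the string '1') still needs a matching key in prac_dic, otherwise the lookup raises a KeyError for that value.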

https://pandas.pydata.org/docs/reference/api/pandas.isnull.html
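
To make the NaN check concrete, here is a minimal sketch. The sample DataFrame is made up to mirror the values printed in the question, and the dictionary keys are an assumption: they are written to match the values actually stored in the column (floats and strings) rather than the '1.0'-style strings from the original attempt.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the unique values shown in the question.
df = pd.DataFrame({"prac_type": [np.nan, 1.0, 2.0, 3.0, "1", "2", "Unsure"]})

# Assumed mapping: one key per value that really occurs in the column.
# The float 1.0 and the string '1' are different keys, so both are listed.
prac_dic = {
    1.0: "1", 2.0: "2", 3.0: "3",
    "1": "1", "2": "2", "3": "3",
    "Unsure": "3",
}

# Look up non-missing values only; keep NaN where the value is missing.
df["prac_type"] = [
    prac_dic[item] if pd.notna(item) else np.nan
    for item in df["prac_type"]
]

print(df["prac_type"].unique())
```

The NaN rows stay in the DataFrame, so nothing is dropped; only the non-missing values are recoded.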

CodePudding user response：

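Try dict.get, which returns None instead of raising a KeyError when a key (such as NaN) is not in the dictionary:

df.prac_type = [prac_dic.get(item) for item in df.prac_type]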
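
A small sketch of how dict.get behaves here, next to Series.map, which is an alternative not mentioned in the answer above. The sample data and the dictionary keys are assumptions chosen to match the values printed in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the unique values shown in the question.
df = pd.DataFrame({"prac_type": [np.nan, 1.0, 2.0, 3.0, "1", "2", "Unsure"]})

# Assumed mapping with keys that match the stored values (floats and strings).
prac_dic = {
    1.0: "1", 2.0: "2", 3.0: "3",
    "1": "1", "2": "2", "3": "3",
    "Unsure": "3",
}

# dict.get: any value without a matching key, including NaN, becomes None.
via_get = [prac_dic.get(item) for item in df["prac_type"]]

# Series.map with a dict: values without a matching key, including NaN,
# become NaN in the result.
via_map = df["prac_type"].map(prac_dic)

print(via_get)
print(via_map.unique())
```

One thing to keep in mind with either approach: a value that is simply missing from the dictionary (a typo, an unexpected code) is silently turned into a missing value instead of raising an error, so it can be worth checking the unique values before and after.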