I am working on the Medical Data Visualizer challenge and am reassigning DataFrame values based on some conditions, like this:
df['cholesterol'].loc[df['cholesterol'] == 1] = 0  # normalizing cholesterol values
df['cholesterol'].loc[df['cholesterol'] > 1] = 1  # normalizing cholesterol values
It seems to work. However, I am also working with Jupyter Notebooks, and when I run the snippet with this code I get the following warning:
/usr/local/lib/python3.7/dist-packages/pandas/core/indexing.py:670: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
Even though it works, I am wondering whether this is the best way to do it, or whether there is another way I should reassign those values.
I am looking for the best practice here so that it does not cause errors later.
Thank you
EDIT:
id    age  gender  height  weight  ap_hi  ap_lo  cholesterol  gluc  smoke  alco  active  cardio
 0  18393       2     168    62.0    110     80            1     1      0     0       1       0
 1  20228       1     156    85.0    140     90            3     1      0     0       1       1
 2  18857       1     165    64.0    130     70            3     1      0     0       0       1
 3  17623       2     169    82.0    150    100            1     1      0     0       1       1
 4  17474       1     156    56.0    100     60            1     1      0     0       0       0
CodePudding user response:
Reference: Why does assignment fail when using chained indexing?
Use:
# df.loc[<rows to select>, <column to set>] = <new value>
df.loc[df['cholesterol'] == 1, 'cholesterol'] = 0
df.loc[df['cholesterol'] > 1, 'cholesterol'] = 1
print(df)
# Output
   cholesterol
0            0
1            1
2            1
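This avoids the SettingWithCopyWarning because each assignment is a single .loc call on df itself. The original df['cholesterol'].loc[...] is chained indexing: it first selects the column and then assigns into that intermediate object, which may be a copy rather than the original DataFrame.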
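Since the goal is just "0 if the value is 1, otherwise 1", the two assignments can also be collapsed into one comparison. This is only a sketch of an alternative, not part of the original answer, and it assumes the column holds the values 1, 2, 3 with no missing entries (a NaN would become 0 here, whereas the .loc version leaves it untouched):

```python
import pandas as pd

df = pd.DataFrame({'cholesterol': [1, 2, 3]})

# (df['cholesterol'] > 1) is a boolean Series; casting it to int gives
# 0 where the value is 1 and 1 where the value is greater than 1.
# Assumption: the column only contains 1, 2 or 3 and has no NaN.
df['cholesterol'] = (df['cholesterol'] > 1).astype(int)

print(df)
```

This assigns a whole column back to df in one step, so there is no chained indexing involved either.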
Setup:
df = pd.DataFrame({'cholesterol': [1, 2, 3]})
print(df)
# Output
   cholesterol
0            1
1            2
2            3
Update
With your sample:
>>> df[['id', 'age', 'cholesterol']]
   id    age  cholesterol
0   0  18393            0
1   1  20228            1
2   2  18857            1
3   3  17623            0
4   4  17474            0
>>> df
   id    age  gender  height  weight  ap_hi  ap_lo  cholesterol  gluc  smoke  alco  active  cardio
0   0  18393       2     168    62.0    110     80            0     1      0     0       1       0
1   1  20228       1     156    85.0    140     90            1     1      0     0       1       1
2   2  18857       1     165    64.0    130     70            1     1      0     0       0       1
3   3  17623       2     169    82.0    150    100            0     1      0     0       1       1
4   4  17474       1     156    56.0    100     60            0     1      0     0       0       0
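To reproduce the output above without the full dataset, here is a minimal sketch that rebuilds only the id, age and cholesterol columns from the sample rows in the question and applies the two .loc assignments. The order of the two statements matters: if the > 1 step ran first, every value would be 1 by the time the == 1 step ran, and the whole column would end up as 0.

```python
import pandas as pd

# Three columns of the sample rows from the question (values before normalizing).
df = pd.DataFrame({
    'id': [0, 1, 2, 3, 4],
    'age': [18393, 20228, 18857, 17623, 17474],
    'cholesterol': [1, 3, 3, 1, 1],
})

# Map 1 -> 0 first, then everything still greater than 1 -> 1.
df.loc[df['cholesterol'] == 1, 'cholesterol'] = 0
df.loc[df['cholesterol'] > 1, 'cholesterol'] = 1

print(df[['id', 'age', 'cholesterol']])
```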
