I have the following df:
colA colB colC
12 33 66
13 35 67
14 44 77
15 55 79
18 56 81
I would like to replace the values of colB and colC with None starting from index 2 all the way to the end of df. The expected output is:
colA colB colC
12 33 66
13 35 67
14 None None
15 None None
18 None None
CodePudding user response:
Use DataFrame.loc with the index labels and a list of column names. Selecting the labels with df.index[2:] works for any index:
df.loc[df.index[2:], ['colB','colC']] = None
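To illustrate why df.index[2:] is the index-agnostic form, here is a minimal sketch that uses string row labels. The labels "a" to "e" are an assumption made for this example and are not in the question:

```python
import pandas as pd

# Same data as in the question, but with string labels instead of 0..4
# (the labels "a".."e" are an assumption made for this example).
df = pd.DataFrame(
    {
        "colA": [12, 13, 14, 15, 18],
        "colB": [33, 35, 44, 55, 56],
        "colC": [66, 67, 77, 79, 81],
    },
    index=["a", "b", "c", "d", "e"],
)

# df.index[2:] yields the labels from the third row onward: "c", "d", "e".
df.loc[df.index[2:], ["colB", "colC"]] = None
print(df)
```

Here df.index[2:] slices the index by position and loc then looks the rows up by those labels, so the call does not depend on what the labels are.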
If the DataFrame has the default RangeIndex, you can slice with 2: directly:
df.loc[2:, ['colB','colC']] = None
print(df)
colA colB colC
0 12 33.0 66.0
1 13 35.0 67.0
2 14 NaN NaN
3 15 NaN NaN
4 18 NaN NaN
Because the columns are numeric, the None values are converted to NaN, which turns the columns into floats.
If you need integers with missing values, use the nullable Int64 dtype:
df[['colB','colC']] = df[['colB','colC']].astype('Int64')
print(df)
colA colB colC
0 12 33 66
1 13 35 67
2 14 <NA> <NA>
3 15 <NA> <NA>
4 18 <NA> <NA>
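As an end-to-end sketch of the steps above, assuming the default RangeIndex from the question, the following rebuilds the frame, assigns None, converts to Int64, and checks the resulting dtypes:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "colA": [12, 13, 14, 15, 18],
        "colB": [33, 35, 44, 55, 56],
        "colC": [66, 67, 77, 79, 81],
    }
)

# Assigning None to integer columns upcasts them to float64 with NaN.
df.loc[2:, ["colB", "colC"]] = None
dtype_after_assignment = str(df["colB"].dtype)

# Convert back to integers using the nullable Int64 dtype (capital "I").
df[["colB", "colC"]] = df[["colB", "colC"]].astype("Int64")

print(dtype_after_assignment)
print(df.dtypes)
print(df)
```

Note that the dtype name is case-sensitive: "Int64" is the nullable extension dtype, while "int64" is the plain NumPy dtype, which cannot hold missing values.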
CodePudding user response:
You can do something like this:
df.loc[2:, "colB":] = None
This uses loc to select the rows from label 2 onward and the columns from colB through the last column (here colB and colC), and then assigns None to that selection. Note that 2: is a label slice, so it matches row positions only when the DataFrame has the default RangeIndex.
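A small sketch of how these loc slices behave, assuming the frame from the question. Unlike ordinary Python slices, label slices in loc include both endpoints, and an open-ended "colB": runs to the last column:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "colA": [12, 13, 14, 15, 18],
        "colB": [33, 35, 44, 55, 56],
        "colC": [66, 67, 77, 79, 81],
    }
)

# Label slices are inclusive on both ends: rows 2 AND 3 are selected.
subset = df.loc[2:3, "colB":]
print(subset)

# The open-ended form from the answer: rows 2..end, columns colB..last.
df.loc[2:, "colB":] = None
print(df)
```

If more columns are later added after colC, "colB": would cover them as well, so an explicit list such as ['colB', 'colC'] is the safer choice when only those two columns should change.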
CodePudding user response:
Apart from pandas.DataFrame.loc (which jezrael mentions), one can use pandas.DataFrame.iloc as follows:
df.iloc[2:, 1:] = None
[Out]:
colA colB colC
0 12 33.0 66.0
1 13 35.0 67.0
2 14 NaN NaN
3 15 NaN NaN
4 18 NaN NaN
Note that colB and colC are now float64, because NaN is a float. If one doesn't want those columns to be float64, one approach is to convert them to the nullable integer dtype pandas.Int64Dtype, as follows:
df[['colB', 'colC']] = df[['colB', 'colC']].astype(pd.Int64Dtype())
[Out]:
colA colB colC
0 12 33 66
1 13 35 67
2 14 <NA> <NA>
3 15 <NA> <NA>
4 18 <NA> <NA>
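Since iloc works purely by position, the hard-coded column position 1 can also be looked up by name. The sketch below does that with Index.get_loc. The string row labels are an assumption for illustration, chosen to show that iloc ignores them:

```python
import pandas as pd

# String row labels are an assumption for this example; iloc ignores them.
df = pd.DataFrame(
    {
        "colA": [12, 13, 14, 15, 18],
        "colB": [33, 35, 44, 55, 56],
        "colC": [66, 67, 77, 79, 81],
    },
    index=["a", "b", "c", "d", "e"],
)

# Look up the integer position of colB instead of hard-coding 1.
start = df.columns.get_loc("colB")

# Rows from position 2 onward, columns from colB's position onward.
df.iloc[2:, start:] = None

# Optional: restore integer columns using the nullable Int64 dtype.
df[["colB", "colC"]] = df[["colB", "colC"]].astype(pd.Int64Dtype())
print(df)
```

Unlike loc, iloc slices follow normal Python rules, so the end of a slice is exclusive.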
