I have a dataframe with 1000 date columns. Every cell holds a numeric value. I'd like to replace any cell value larger than 10 with 10, across all the columns.
For individual columns I know how to do it, but I'm not sure how to do it for all the columns:
SPM_data.loc[SPM_data > 10] = 10
I also tried it this way and got an error:
def max_value(cell):
    if cell > 10:
        return 10
SPM_data = SPM_data.apply(max_value)
SPM_data
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I ended up doing it this way with a for loop. It's not efficient, but it works. Does anyone have a better, faster approach? Thanks.
def max_value(cell):
    if cell > 10:
        return 10
    else:
        return cell
for col in SPM_data:
    SPM_data.loc[:, col] = SPM_data.loc[:, col].apply(max_value)
CodePudding user response:
Maybe you're looking for:
mask = SPM_data > 10
SPM_data[mask] = 10
Here, mask is a boolean dataframe with the same shape as SPM_data, and the second line assigns 10 to every cell where the mask is True.
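As a side note on the error in the question: DataFrame.apply passes each whole column to the function as a Series, so `cell > 10` produces a boolean Series and the `if` cannot reduce it to a single True/False. A vectorized alternative to the mask is `clip`, sketched below. The small frame here is a hypothetical stand-in for SPM_data, since the real data is not shown.

```python
import pandas as pd

# Hypothetical stand-in for SPM_data: numeric values under date-like column names
SPM_data = pd.DataFrame({
    "2021-01-01": [3, 12, 10],
    "2021-01-02": [25, 7, 11],
})

# Cap every value at 10 in a single vectorized call
capped = SPM_data.clip(upper=10)

# Same result with where: keep values that satisfy the condition, use 10 elsewhere
capped_where = SPM_data.where(SPM_data <= 10, 10)

print(capped)
```

Both calls return a new dataframe and leave the original untouched, so assign the result back (`SPM_data = SPM_data.clip(upper=10)`) if the change should stick.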
CodePudding user response:
This replaces any value greater than 10 with 10 in a given column:
import pandas as pd
df = pd.DataFrame({'Name': ['Sam', 'Andrea', 'Alex', 'Robin', 'Kia', 'Jhon'],
                   'Age': [14, 25, 7, 8, 21, 45]})
print(df)
print('')
df.loc[df['Age'] > 10, 'Age'] = 10
print(df)
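The example above caps a single column. If the frame mixes text and numeric columns, one possible way to cover all the numeric ones at once is to select them by dtype first. This is only a sketch: the 'Score' column is a hypothetical addition so that there is more than one numeric column to cap.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Sam', 'Andrea', 'Alex', 'Robin', 'Kia', 'Jhon'],
                   'Age': [14, 25, 7, 8, 21, 45],
                   # 'Score' is a hypothetical extra numeric column for illustration
                   'Score': [9.5, 12.0, 3.0, 10.5, 8.0, 11.0]})

# Pick out the numeric columns, then cap them all at 10 in one step
num_cols = df.select_dtypes(include='number').columns
df[num_cols] = df[num_cols].clip(upper=10)

print(df)
```

When every column is numeric, as in the question, the selection step is unnecessary and `df.clip(upper=10)` on the whole frame is enough.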
