My pandas data frame is as below. IN the "price" column, I want to convert non digit values to nan. Thanks.
Input:
ID, name, qty, price
1, pen, 4, 55
2, book, 5, ai
3, PENCIL, 5, 100
4, book2, 10, si
output:
ID, name, qty, price
1, pen, 4, 55
2, book, 5, nan
3, PENCIL, 5, 100
4, book2, 10, nan
CodePudding user response:
Use pd.to_numeric:
df['price'] = pd.to_numeric(df['price'], errors='coerce')
If they're strings, use mask and str.isnumeric:
df['price'] = df['price'].mask(~df['price'].str.isnumeric())
mask automatically converts data that match the condition to NaN.
Or use str.isaplha:
df['price'] = df['price'].mask(df['price'].str.isalpha())
CodePudding user response:
You could use str.replace here:
df["price"] = df["price"].str.replace(r'^.*\D.*$', np.NaN)
The regex pattern ^.*\D.*$ will match any price entry having at least one non digit character, and will selectively replace with nan.
