I would like to filter a DataFrame with a lambda if-condition
I have "product_name" and "category1" columns. If "product_name" does not contain any of the words ("boxer", "boxers", "sock", "socks"), I would like to set the "category1" column to "Other". However, the code below changes every row to "Other", even rows that contain "sock".
import pandas as pd

df = pd.DataFrame({
    'product_name': ["blue shirt", " medium boxers", "red jackets ", "blue sock"],
})
df["category1"]=df.apply(lambda x: "Other" if ("boxer","boxers","sock","socks" not in x["product_name"] ) else x["category1"], axis=1)
I expected the results below:
df = pd.DataFrame({
'product_name': ["blue shirt", " medium boxers", "red jackets ", "blue sock"],
'category1': ["Other", np.nan, "Other", np.nan],})
Thank you for your support.
CodePudding user response:
You could use str.contains:
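import numpy as np

items = ("boxer", "boxers", "sock", "socks")
# Join the words into a regex alternation so a row matches if it contains any of them
pattern = '|'.join(items)
df["category1"] = np.where(df['product_name'].str.contains(pattern),
                           np.nan,   # value if True
                           'Other')  # value if False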
Output:
     product_name category1
0      blue shirt     Other
1   medium boxers       nan
2    red jackets      Other
3       blue sock       nan
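
As for why the lambda in the question marks every row as "Other": the expression in parentheses is parsed as a four-element tuple whose last element is the `not in` test, and a non-empty tuple is always truthy. The sketch below shows that, followed by two possible fixes. The names `keywords`, `category_apply` and `category_mask` are made up for illustration, and the shortened keyword list assumes plain substring matching is what you want ("boxers" and "socks" are already matched by "boxer" and "sock"). Also note that the lowercase `nan` printed above suggests `np.where` stored the text "nan" rather than a real missing value, because it mixes a float with strings; the second variant keeps a real NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "product_name": ["blue shirt", " medium boxers", "red jackets ", "blue sock"],
})

# Hypothetical shortened list: "boxers"/"socks" contain "boxer"/"sock" as substrings.
keywords = ("boxer", "sock")

# The condition from the question is a tuple, and any non-empty tuple is truthy,
# so the "Other" branch is taken for every row, even for "blue sock".
original_condition = ("boxer", "boxers", "sock", "socks" not in "blue sock")
assert bool(original_condition) is True

# Fix 1 (row-wise): test each keyword separately and combine the tests with any().
df["category_apply"] = df["product_name"].apply(
    lambda name: np.nan if any(k in name for k in keywords) else "Other"
)

# Fix 2 (vectorized): start from "Other" everywhere, then blank out matching rows.
# Series.mask replaces values where the condition is True with a real missing value.
has_keyword = df["product_name"].str.contains("|".join(keywords))
df["category_mask"] = pd.Series("Other", index=df.index).mask(has_keyword)

print(df)
```

With real missing values, `df["category_mask"].isna()` and `fillna` behave as expected, which would not be the case for the text "nan".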
