I have a CSV file with the following columns:
"Date","Time","TimeZone","Name","Type","Status","Currency","Gross","Fee","Net","From Email Address","To Email Address","Transaction ID","Shipping Address","Address Status","Item Title","Item ID","Shipping and Handling Amount","Insurance Amount","Sales Tax","Option 1 Name","Option 1 Value","Option 2 Name","Option 2 Value","Reference Txn ID","Invoice Number","Custom Number","Quantity","Receipt ID","Balance","Address Line 1","Address Line 2/District/Neighborhood","Town/City","State/Province/Region/County/Territory/Prefecture/Republic","Zip/Postal Code","Country","Contact Phone Number","Subject","Note","Country Code","Balance Impact"
I am trying to grab only the rows of data that contain the string Chain × Jewelry × Necklace in the Item Title column.
The name under each item title is different. For example, one might be Chain × Jewelry × Necklace Popcorn Necklace, while others are blank values, but I want all rows that contain Chain × Jewelry × Necklace.
How can I use pandas to pull the rows containing this string? I am having trouble. Any help is greatly appreciated, thank you.
CodePudding user response:
Try this:
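import pandas as pd
df = pd.read_csv('path/to/your/file.csv')
# Keep rows whose Item Title contains the literal string and whose Name is not blank
df = df[df['Item Title'].fillna('').str.contains('Chain × Jewelry × Necklace', regex=False) & df['Name'].fillna('').str.len().gt(0)]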
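To see the filter in isolation, here is a minimal sketch that uses a small in-memory DataFrame instead of the real file. The sample rows are hypothetical, and the sketch filters on Item Title only, since that is the condition the question describes. It relies on the `regex` and `na` parameters of `Series.str.contains`.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real CSV export.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", None, "Dana"],
    "Item Title": [
        "Chain × Jewelry × Necklace Popcorn Necklace",
        "Ring × Jewelry",
        "Chain × Jewelry × Necklace",
        None,
    ],
})

target = "Chain × Jewelry × Necklace"

# regex=False treats the target as a literal substring;
# na=False makes missing titles count as "no match" instead of NaN.
mask = df["Item Title"].str.contains(target, regex=False, na=False)
matches = df[mask]
print(matches)
```

Passing `na=False` does the same job as calling `fillna('')` first, so either form should work for blank titles.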
CodePudding user response:
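You can use a regex with lookaheads, which matches titles containing all three words in any order (na=False treats blank titles as non-matches):
df[df["Item Title"].str.contains(r"^(?=.*\bChain\b)(?=.*\bJewelry\b)(?=.*\bNecklace\b).*", regex=True, na=False)]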
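As a rough illustration of how the lookahead approach behaves, the sketch below runs it against a few hypothetical titles. Each `(?=.*\bWORD\b)` lookahead checks that the word appears somewhere in the title, so the order of the words does not matter. Because `str.contains` performs a search, the lookaheads alone are enough here and nothing needs to follow them.

```python
import pandas as pd

# Hypothetical titles, not taken from the real file.
titles = pd.Series([
    "Chain × Jewelry × Necklace Popcorn Necklace",
    "Necklace × Chain × Jewelry",   # same words, different order
    "Chain × Jewelry × Bracelet",   # missing "Necklace"
    None,                           # blank title
])

# Three independent lookaheads anchored at the start of the string.
pattern = r"^(?=.*\bChain\b)(?=.*\bJewelry\b)(?=.*\bNecklace\b)"

# na=False maps blank titles to False so the mask is purely boolean.
mask = titles.str.contains(pattern, regex=True, na=False)
print(mask.tolist())  # [True, True, False, False]
```

Note that this is looser than a literal substring match: it also accepts titles where the three words appear in a different order or are separated by other text.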
