Home > Enterprise >  Working with .startswith and is_unique in python to create new column
Working with .startswith and is_unique in python to create new column

Time:02-05

Example: I have the first three columns and need to create the new order column based on the previous ones. The new order should say no if the order number is repeated. However, it should say yes if the order number is unique or if the item starts with a "WC". The problem is that the two science courses have duplicated order number but since it starts with a WC, the value in new order column should be 'Yes'. I know i should use is_unique and .startswith to do this but i am not sure how to combine these two to get the result.

Item Name Order Number New Order
Calc Math 100 No
Algebra Math 100 No
Geometry Math 300 Yes
WCChem Science 200 Yes
WCBio Science 200 Yes

Dictionary to construct the DataFrame:

dic = {'Item': ['Calc', 'Algebra', 'Geometry', 'WCChem', 'WCBio'],
       'Name': ['Math', 'Math', 'Math', 'Science', 'Science'],
       'Order Number': [100, 100, 300, 200, 200],
       'New Order': ['No', 'No', 'Yes', 'Yes', 'Yes']}

CodePudding user response:

You can use ~duplicated (with keep=False to select all duplicates) to identify non-duplicate values and use str.startswith to identify the "Item" values that start with "WC". Now since you want a "Yes" if either of these are True, you use | operator and wrap it in a np.where to assign "Yes", "No" values:

df['New_Order'] = np.where(~df['Order_Number'].duplicated(keep=False) | df['Item'].str.startswith('WC'), 'Yes', 'No')

Output:

       Item     Name  Order Number New Order
0      Calc     Math           100        No
1   Algebra     Math           100        No
2  Geometry     Math           300       Yes
3    WCChem  Science           200       Yes
4     WCBio  Science           200       Yes
  •  Tags:  
  • Related