I have this dataframe and need to extract the package info (ML, KG, PZA, LT, UN, etc.) from the Description column. I'm pretty new to pandas. This is the dataframe right now:
| SKU | Description |
|---|---|
| 1 | TRIDENT 6S SANDIA 9GR |
| 2 | CANAST RABBIT F1 A 1UN |
| 3 | HAND SOAP VITAMIN E 442 ML. |
I need to extract 9GR, 1UN, 442 ML, etc. and put them into another column. Any ideas? I'd really appreciate it. Greetings.
CodePudding user response:
You can use this regex:
pkg = ['ML', 'KG', 'PZA', 'LT', 'UN', 'GR']
# \d+ matches one or more digits, \s* allows an optional space before the unit
df['package'] = df['Description'].str.extract(fr"\b(\d+\s*(?:{'|'.join(pkg)}))\b")
print(df)
# Output
   SKU                  Description package
0    1        TRIDENT 6S SANDIA 9GR     9GR
1    2       CANAST RABBIT F1 A 1UN     1UN
2    3  HAND SOAP VITAMIN E 442 ML.  442 ML
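
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the three sample rows from the question, so the dataframe contents are only the ones shown above. The names `units`, `unit_alt`, `pattern`, and `parts` are illustrative, and wrapping each unit in `re.escape` is an added precaution rather than something the answer requires. The second extraction, which splits the quantity and the unit into separate columns via named groups, is an optional extension that was not asked for in the question.

```python
import re
import pandas as pd

# Sample data copied from the question (only these three rows are known).
df = pd.DataFrame({
    "SKU": [1, 2, 3],
    "Description": [
        "TRIDENT 6S SANDIA 9GR",
        "CANAST RABBIT F1 A 1UN",
        "HAND SOAP VITAMIN E 442 ML.",
    ],
})

# Units to look for; extend this list as needed.
units = ["ML", "KG", "PZA", "LT", "UN", "GR"]
# Escape each unit so it is treated literally inside the regex.
unit_alt = "|".join(re.escape(u) for u in units)

# One or more digits, optional whitespace, then one of the units,
# bounded by word boundaries so "6S" or "F1" do not match.
pattern = rf"\b(\d+\s*(?:{unit_alt}))\b"

# expand=False returns a Series, which assigns cleanly to a single column.
df["package"] = df["Description"].str.extract(pattern, expand=False)

# Optional: named groups yield one column per group (values stay strings).
parts = df["Description"].str.extract(rf"\b(?P<qty>\d+)\s*(?P<unit>{unit_alt})\b")

print(df)
print(parts)
```

Rows whose description contains none of the listed units get `NaN` in the new column, so it may be worth checking `df["package"].isna()` on the real data to spot units missing from the list.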
