I want to convert my DataFrame from this:
| Name | Qualities |
|---|---|
| boba fet | 1. Fighting 2. Running 3.swimming |
| enigma | 1. Dodging bullets while running, cooking food 2. Sleep walking |
to the format below:
| Name | Qualities |
|---|---|
| boba fet | Fighting |
| boba fet | Running |
| boba fet | Swimming |
| enigma | Dodging bullets while running, cooking food |
| enigma | Sleep walking |
Even if the text contains a comma, it should be split into rows only at the numbering.
I tried the following, but didn't get the desired result:
df.assign(Qualities = df.Qualities.str.split('1.')).explode('Qualities')
CodePudding user response:
You can split using a regex that matches the number and period as the delimiter. This pattern leaves an empty entry before the first number and some stray whitespace, so strip the values and drop the empty rows.
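import pandas as pd

df = pd.DataFrame({'Name': ['boba fet', 'enigma'],
                   'Qualities': ['1. Fighting 2. Running 3.swimming',
                                 '1. Dodging bullets while running, cooking food 2. Sleep walking']})

# Split on each "N." marker: one or more digits, a literal period, optional whitespace
df['Qualities'] = df.Qualities.str.split(r'\d+\.\s*')
df = df.explode('Qualities')
# Remove leftover whitespace, then drop the empty string produced before the first marker
df['Qualities'] = df['Qualities'].str.strip()
print(df.loc[df['Qualities'].ne('')])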
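If you would rather not create empty strings and filter them out afterwards, one possible variation is to capture the text after each numbering with `str.findall` instead of splitting on it. This is only a sketch: the `pattern` and `result` names are mine, and it assumes the qualities themselves never contain a digit followed by a period (something like "version 2.0" would be cut in the wrong place).

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["boba fet", "enigma"],
    "Qualities": [
        "1. Fighting 2. Running 3.swimming",
        "1. Dodging bullets while running, cooking food 2. Sleep walking",
    ],
})

# Capture the text that follows each "N." marker, stopping just before
# the next marker or the end of the string. Surrounding whitespace is
# left outside the capture group, so no separate strip is needed.
pattern = r"\d+\.\s*(.*?)\s*(?=\d+\.|$)"

result = (
    df.assign(Qualities=df["Qualities"].str.findall(pattern))  # list of captures per row
      .explode("Qualities")                                    # one row per captured item
      .reset_index(drop=True)
)
print(result)
```

Commas are untouched because the pattern only looks at the numberings, and `reset_index(drop=True)` replaces the repeated index that `explode` leaves behind.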
