I want to split the strings in a column on hyphens, keep only some of the resulting parts, and store them in new columns.
Data
id                     type
hello-srp-lap-555-aaa  aa
hello-sss-lap-555-aaa  vv
happy-srp-bb-578-aaa   c
Desired
id         type  original               sep    sep1
hello-lap  aa    hello-srp-lap-555-aaa  hello  lap
hello-lap  vv    hello-sss-lap-555-aaa  hello  lap
happy-bb   c     happy-srp-bb-578-aaa   happy  bb
Doing
df[['id', 'original', 'sep', 'sep1']] = df['id'].str.split('-', 1, expand=True)
Any suggestions are appreciated. The new columns are not being generated.
CodePudding user response:
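Use rsplit, which splits from the right. With n=3 it yields four parts, so the result fits the four target columns:
df[['id', 'original', 'sep', 'sep1']] = df['id'].str.rsplit(pat='-', n=3, expand=True)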
df
   id         type  original  sep  sep1
0  hello-srp  aa    lap       555  aaa
1  hello-sss  vv    lap       555  aaa
2  happy-srp  c     bb        578  aaa
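For reference, here is a minimal sketch of how the number of resulting columns depends on n and on the split direction. The sample frame is rebuilt from the question's data, and the names left, right and full are only illustrative. Assigning to a list of columns works only when the number of parts matches the number of target columns.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "id": ["hello-srp-lap-555-aaa", "hello-sss-lap-555-aaa", "happy-srp-bb-578-aaa"],
    "type": ["aa", "vv", "c"],
})

# One split from the left: two columns ('hello', 'srp-lap-555-aaa')
left = df["id"].str.split(pat="-", n=1, expand=True)

# Three splits from the right: four columns ('hello-srp', 'lap', '555', 'aaa')
right = df["id"].str.rsplit(pat="-", n=3, expand=True)

# No limit: one column per hyphen-separated piece, five here
full = df["id"].str.split(pat="-", expand=True)

print(left.shape[1], right.shape[1], full.shape[1])
```

With this data the three shapes are 2, 4 and 5 columns, which is why a split with n=1 cannot fill four target columns.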
CodePudding user response:
Split on -, expand, and keep only the relevant pieces (positions 0 and 2). Then join the two new columns with a hyphen to rebuild id:
df['original'] = df['id']
df[['sep','sep1']] = df['id'].str.split('-', expand=True)[[0,2]]
df['id'] = df[['sep','sep1']].apply('-'.join, axis=1)
Output:
   id         type  original               sep    sep1
0  hello-lap  aa    hello-srp-lap-555-aaa  hello  lap
1  hello-lap  vv    hello-sss-lap-555-aaa  hello  lap
2  happy-bb   c     happy-srp-bb-578-aaa   happy  bb
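A self-contained version of the same idea, as a sketch: the frame is rebuilt from the question's data, parts is just an illustrative name, and plain string concatenation stands in for the row-wise apply. It assumes every id has at least three hyphen-separated pieces.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "id": ["hello-srp-lap-555-aaa", "hello-sss-lap-555-aaa", "happy-srp-bb-578-aaa"],
    "type": ["aa", "vv", "c"],
})

# Split every id into one column per hyphen-separated piece
parts = df["id"].str.split(pat="-", expand=True)

df["original"] = df["id"]   # keep the untouched value
df["sep"] = parts[0]        # first piece, e.g. 'hello'
df["sep1"] = parts[2]       # third piece, e.g. 'lap'

# Rebuild id from the two kept pieces
df["id"] = df["sep"] + "-" + df["sep1"]

print(df)
```

Vectorised concatenation avoids calling a Python function once per row, which may matter on larger frames.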
