Current data frame:
| Name | ID |
|---|---|
| Peter | School_09 |
| John | School_23 |
How I want it:
| Name | ID |
|---|---|
| Peter | 09 |
| John | 23 |
CodePudding user response:
You can use str.replace to remove everything up to and including the last underscore:
df["ID"] = df["ID"].str.replace(r'.*_', '', regex=True)
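To see what the `.*_` pattern actually removes, here is a small sketch using plain `re.sub`, which is what a regex replace boils down to. The helper name `strip_prefix` and the extra sample values (`North_School_07`, `42`) are made up for illustration and are not from the question.

```python
import re

def strip_prefix(value: str) -> str:
    # `.*` is greedy, so `.*_` matches everything up to and
    # including the LAST underscore in the string.
    return re.sub(r'.*_', '', value)

# Hypothetical sample values; only the first two come from the question.
samples = ["School_09", "School_23", "North_School_07", "42"]
stripped = [strip_prefix(s) for s in samples]
print(stripped)  # ['09', '23', '07', '42']
```

Note that a value without any underscore is left unchanged, so this approach does not guarantee the result is numeric.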
CodePudding user response:
You can use str.extract with the \d+$ regex (one or more trailing digits) to keep only the trailing digits:
df['ID'] = df['ID'].str.extract(r'(\d+)$')
output:
Name ID
0 Peter 09
1 John 23
To get a numeric type instead of strings, combine this with pd.to_numeric:
df['ID'] = pd.to_numeric(df['ID'].str.extract(r'(\d+)$', expand=False), errors='coerce')
output:
Name ID
0 Peter 9
1 John 23
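For reference, a self-contained sketch that rebuilds the frame from the question and keeps both variants side by side. The column names `ID_str` and `ID_num` are only for illustration; in practice you would probably overwrite `ID` directly.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "Name": ["Peter", "John"],
    "ID": ["School_09", "School_23"],
})

# expand=False returns a Series rather than a one-column DataFrame.
digits = df["ID"].str.extract(r'(\d+)$', expand=False)

df["ID_str"] = digits  # keeps the leading zero: "09", "23"
# Rows with no trailing digits would become NaN because of errors="coerce".
df["ID_num"] = pd.to_numeric(digits, errors="coerce")  # 9, 23

print(df)
```

Keep the string version if the leading zero matters (for example, when the ID is a code rather than a quantity), since converting to a number drops it.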
