I am trying to remove any occurrence of 'Doctor', 'Honorable', and 'Professor' from a variable in a dataframe. Here is an example of the dataframe:
| Name |
|---|
| professor Rick Smith |
| Mark M. Tarleton |
| Doctor Charles M. Alexander |
| Professor doctor Todd Mckenzie |
| Carl L. Darla |
| Honorable Billy Darlington |
Each row may contain several, one, or none of the titles 'Doctor', 'Honorable', and 'Professor'. The titles may also appear in upper case or lower case.
Any help would be much appreciated!
CodePudding user response:
Use a regex with str.replace:
regex = r'(?:Doctor|Honorable|Professor)\s*'
df['Name'] = df['Name'].str.replace(regex, '', regex=True, case=False)
Output:
                   Name
0            Rick Smith
1      Mark M. Tarleton
2  Charles M. Alexander
3         Todd Mckenzie
4         Carl L. Darla
5      Billy Darlington
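One caveat: the pattern above matches the titles anywhere, including inside a longer word. Below is a minimal sketch of a stricter variant using only the standard `re` module, assuming you want whole-word matches only. The names `TITLES`, `TITLE_RE`, and `strip_titles`, and the extra "Cory Doctorow" row, are made up for illustration and are not part of the question.

```python
import re

# Titles to strip; the list itself comes from the question.
TITLES = ["Doctor", "Honorable", "Professor"]

# \b restricts matches to whole words, so a surname such as "Doctorow" is left alone.
# re.IGNORECASE covers "professor", "DOCTOR", and so on.
TITLE_RE = re.compile(
    r"\b(?:" + "|".join(map(re.escape, TITLES)) + r")\b\s*",
    re.IGNORECASE,
)


def strip_titles(name: str) -> str:
    """Remove every listed title from name and trim leftover whitespace."""
    return TITLE_RE.sub("", name).strip()


names = [
    "professor Rick Smith",
    "Mark M. Tarleton",
    "Doctor Charles M. Alexander",
    "Professor doctor Todd Mckenzie",
    "Carl L. Darla",
    "Honorable Billy Darlington",
    "Cory Doctorow",  # hypothetical extra row, not in the question
]
cleaned = [strip_titles(n) for n in names]
print(cleaned)
```

The same compiled pattern should also work in pandas, e.g. `df['Name'].str.replace(TITLE_RE, '', regex=True).str.strip()`. In that case leave out `case=False`, since pandas does not allow `case` or `flags` to be set when `pat` is a compiled regex; the flag is already part of the pattern.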
