The dataframe for my problem looks like this:
| Name | UID | search_text |
|---|---|---|
| B | 14 | kj |
| S | 2 | hsa,isd |
| D | 10 | sa,ad,ad |
| E | 99 | pid, pd,dd,ef |
| G | 8 | dd |
I want each value in the search_text column to be replaced by the first word before the first comma, with any surrounding whitespace stripped. (I don't want to map and replace the values manually.) The result should look like this:
| Name | UID | search_text |
|---|---|---|
| B | 14 | kj |
| S | 2 | hsa |
| D | 10 | sa |
| E | 99 | pid |
| G | 8 | dd |
Is there any convenient way to do that?
CodePudding user response:
Use Series.str.split and keep the first element of each split:
df['search_text'] = df['search_text'].str.split(',').str[0]
print(df)
Name UID search_text
0 B 14 kj
1 S 2 hsa
2 D 10 sa
3 E 99 pid
4 G 8 dd
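For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the split. The `n=1` argument and the trailing `.str.strip()` are additions of mine, not part of the answer above: `n=1` stops splitting after the first comma, and `.str.strip()` is only there in case a value has spaces around the first token.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["B", "S", "D", "E", "G"],
    "UID": [14, 2, 10, 99, 8],
    "search_text": ["kj", "hsa,isd", "sa,ad,ad", "pid, pd,dd,ef", "dd"],
})

# Split once on the first comma and keep the part before it.
# .str.strip() is an optional extra that trims spaces around that part.
df["search_text"] = df["search_text"].str.split(",", n=1).str[0].str.strip()

print(df)
```

Rows without a comma (such as `kj` and `dd`) pass through unchanged, because splitting them yields a one-element list.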
CodePudding user response:
Extract the leading run of word characters from each string:
df['search_text'] = df['search_text'].str.extract(r'(^\w+)')
Name UID search_text
0 B 14 kj
1 S 2 hsa
2 D 10 sa
3 E 99 pid
4 G 8 dd
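A self-contained sketch of the regex approach, again using the sample data from the question. Passing `expand=False` is my own choice here so that `str.extract` returns a Series rather than a one-column DataFrame. The `edge` Series is a hypothetical value added only to illustrate a caveat: because the pattern is anchored at the start and `\w` does not match whitespace, a value that begins with a space produces no match.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["B", "S", "D", "E", "G"],
    "UID": [14, 2, 10, 99, 8],
    "search_text": ["kj", "hsa,isd", "sa,ad,ad", "pid, pd,dd,ef", "dd"],
})

# Capture the run of word characters (letters, digits, underscore)
# at the start of each string; expand=False yields a Series.
df["search_text"] = df["search_text"].str.extract(r"^(\w+)", expand=False)
print(df)

# Hypothetical edge case: a leading space means the anchored pattern
# cannot match, so the result is missing (NaN) rather than "ab".
edge = pd.Series([" ab,cd"]).str.extract(r"^(\w+)", expand=False)
print(edge)
```

If the real data can contain leading spaces or non-word characters such as hyphens inside the first token, splitting on the comma is likely the safer choice.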
