I have a column of text. How can I cut off the part of each string starting at the term "see less"?
For example:
df = pd.DataFrame({'col':['hi there see less new way is comming','today is summer, see less , is a lovely day']})
That looks like:
col
'hi there see less new way is comming'
'today is summer, see less , is a lovely day'
Desired output:
col
'hi there'
'today is summer,'
CodePudding user response:
You can use str.split to split on 'see less' and then use .str[0] to take the first part. The space before 'see less' stays in the result; append .str.strip() if you want to remove it.
df['col'] = df['col'].str.split('see less').str[0]
Output:
col
0 hi there
1 today is summer,
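As a minimal sketch of the split approach, the snippet below also strips the leftover whitespace and limits the split to the first occurrence with n=1. The third row ('no marker here') is a hypothetical extra case, added only to show that rows without 'see less' pass through unchanged.

```python
import pandas as pd

df = pd.DataFrame({'col': ['hi there see less new way is comming',
                           'today is summer, see less , is a lovely day',
                           'no marker here']})  # hypothetical row without the marker

# n=1 splits only at the first 'see less'; .str[0] keeps the text before it,
# and .str.strip() removes the space that preceded the marker.
result = df['col'].str.split('see less', n=1).str[0].str.strip()
print(result.tolist())
```

Rows that never contain 'see less' yield a one-element list from the split, so .str[0] returns the original string.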
CodePudding user response:
You can use a simple regex and str.replace:
df['col'] = df['col'].str.replace('see less.*', '', regex=True)
output:
col
0 hi there
1 today is summer,
If you also want to remove any non-word characters (spaces, punctuation) just before it:
df['col'].str.replace(r'\W*see less.*', '', regex=True)
output:
col
0 hi there
1 today is summer
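To make the difference between the two replacements visible, here is a small sketch that runs both on the sample data and compares the raw strings. The \W* prefix is an assumption of this sketch: it consumes any run of spaces and punctuation directly before 'see less'.

```python
import pandas as pd

df = pd.DataFrame({'col': ['hi there see less new way is comming',
                           'today is summer, see less , is a lovely day']})

# Removes 'see less' and everything after it; the preceding space is kept.
plain = df['col'].str.replace('see less.*', '', regex=True)

# Also removes any non-word characters (spaces, commas) right before the marker.
trimmed = df['col'].str.replace(r'\W*see less.*', '', regex=True)

print(plain.tolist())
print(trimmed.tolist())
```

Printing the lists rather than the Series shows the trailing whitespace that the default Series display hides.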
CodePudding user response:
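Use str.extract with a lazy capture group, so the whitespace before see less is not included in the match:
df['col'] = df['col'].str.extract(r'(.*?)\s*see less', expand=False)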
print(df)
# Output
col
0 hi there
1 today is summer,
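One thing to watch with str.extract is that rows where the pattern does not match come back as NaN. The sketch below uses a lazy group and falls back to the original text for such rows; the third row is a hypothetical example added to trigger that case.

```python
import pandas as pd

df = pd.DataFrame({'col': ['hi there see less new way is comming',
                           'today is summer, see less , is a lovely day',
                           'no marker here']})  # hypothetical row without the marker

# Lazy group (.*?) stops before the whitespace that precedes 'see less';
# expand=False returns a Series instead of a one-column DataFrame.
extracted = df['col'].str.extract(r'(.*?)\s*see less', expand=False)

# Rows without the marker are NaN after extract; restore their original text.
result = extracted.fillna(df['col'])
print(result.tolist())
```

If dropping unmatched rows is preferable to keeping them, extracted.dropna() could be used instead of the fillna step.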
