I'm starting to learn regex in order to match words in pandas DataFrame columns and replace them with other values.
df['col1']=df['col1'].str.replace(r'(?i)unlimi\w*', 'Unlimited', regex=True)
This pattern matches different variations of the word Unlimited. But some values in the column contain not just one word, but two or more, for example:
[Unlimited, Unlimited (on-net), Unlimited (on-off-net)]
I was wondering if there is a way to match all of the words in the previous example with a single regex line.
CodePudding user response:
You can use
df['col1']=df['col1'].str.replace(r'(?i)unlimi\w*(?:\s*\([^()]*\))?', 'Unlimited', regex=True)
See the regex demo.
The (?i)unlimi\w*(?:\s*\([^()]*\))? regex matches:
(?i) - the regex to the right is case insensitive
unlimi - a fixed string
\w* - zero or more word chars
(?:\s*\([^()]*\))? - an optional sequence of
    \s* - zero or more whitespaces
    \( - a ( char
    [^()]* - zero or more chars other than ( and )
    \) - a ) char.
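To check the pattern before applying it to the column, it can be tried on a few strings with the standard re module. This is only a minimal sketch: the sample values are hypothetical, modeled on the ones in the question, and the names PATTERN, samples and cleaned are made up for illustration.

```python
import re

# Pattern from the answer above: "unlimi", any trailing word chars,
# then an optional parenthesized suffix such as " (on-net)".
PATTERN = re.compile(r"(?i)unlimi\w*(?:\s*\([^()]*\))?")

# Hypothetical sample values (not taken from a real dataset).
samples = ["Unlimited", "unlimited (on-net)", "UNLIMITED (on-off-net)", "500 MB"]

# Replace every match with the canonical spelling; non-matching values are left as-is.
cleaned = [PATTERN.sub("Unlimited", s) for s in samples]
print(cleaned)
```

Values that do not contain "unlimi" pass through unchanged, so the same pattern should be safe to run over a mixed column.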
