How to remove a special character based on negative pattern matching using regular expression

Time:01-06

I have a sample string like hello \ \\\world \ \\\\ this \234 \ is \Pattern\ and I want to turn it into hello \world this 234 is \Pattern

One way to do this is to loop over every character in the string and, if it is a \ and the next character is not a word character, replace it with a space. That is simple but inefficient. There must be a way to do it with a regular expression.
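For reference, the character-by-character approach described above might look like the sketch below. The name strip_stray_backslashes is made up for illustration. It assumes "word" means an ASCII letter (the desired output turns \234 into 234), and it drops the backslash instead of replacing it with a space, then collapses the leftover whitespace so the result matches the desired output.

```python
def strip_stray_backslashes(text: str) -> str:
    """Drop each backslash that is not directly followed by an ASCII letter."""
    kept = []
    for i, ch in enumerate(text):
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        if ch == "\\" and not (next_ch.isascii() and next_ch.isalpha()):
            continue  # stray backslash: skip it
        kept.append(ch)
    # Collapse the runs of whitespace left behind and trim both ends.
    return " ".join("".join(kept).split())


# The sample ends in a backslash, which a raw string literal cannot do,
# so the last character is appended separately.
sample = r"hello \ \\\world \ \\\\ this \234 \ is \Pattern" + "\\"
print(strip_stray_backslashes(sample))
```

This is only a baseline to compare the regex answers against.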

I can find every \ followed by a word character using r'\\\w ' and any single \ followed by a space using \\\s, but these don't handle cases like \\\ \( \ . How can this be done?

CodePudding user response:

Maybe use:

\\(?![A-Za-z])\s*

Replace each match with an empty string.

  • \\ - a literal backslash (escaped);
  • (?![A-Za-z]) - negative lookahead asserting that the next character is not an ASCII letter;
  • \s* - zero or more whitespace characters (greedy), so the whitespace after a removed backslash is removed with it.
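A minimal Python sketch of this pattern applied to the question's sample string. The variable names are illustrative, and the trailing backslash is appended separately because a raw string literal cannot end in one.

```python
import re

# A backslash not followed by an ASCII letter, plus any whitespace after it.
STRAY = re.compile(r"\\(?![A-Za-z])\s*")

sample = r"hello \ \\\world \ \\\\ this \234 \ is \Pattern" + "\\"
cleaned = STRAY.sub("", sample)
print(cleaned)  # hello \world this 234 is \Pattern
```

Because \s* consumes the whitespace that follows each removed backslash, no separate whitespace cleanup is needed for this input. Note that \s also matches newlines, so line breaks after a stray backslash would be removed too.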

CodePudding user response:

Try this regex:

\\(?=[\W\d]|$)

Replace every match with an empty string.

Explanation

  • \\ - matches a literal \
  • (?=[\W\d]|$) - positive lookahead asserting that the \ matched above is followed by a digit or a non-word character, or is at the end of the string. Every such \ is replaced with an empty string; whitespace is left untouched.
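A rough Python sketch of this substitution on the question's sample string. The second step, collapsing whitespace, is an addition that is not part of this answer; it is included only because the pattern removes backslashes and nothing else. Variable names are illustrative.

```python
import re

sample = r"hello \ \\\world \ \\\\ this \234 \ is \Pattern" + "\\"

# Step 1: drop every backslash followed by a non-word character, a digit,
# or the end of the string. Whitespace is not touched by this pattern.
no_stray = re.sub(r"\\(?=[\W\d]|$)", "", sample)

# Step 2 (an addition, not part of the answer): collapse leftover whitespace.
cleaned = re.sub(r"\s+", " ", no_stray).strip()
print(cleaned)  # hello \world this 234 is \Pattern
```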

CodePudding user response:

You can use a lookahead:

import re

s = r"hello \  \\\world \  \\\\ this  \234 \ is \pattern\'"

# Remove every run of backslashes that is not immediately followed by a letter.
s2 = re.sub(r'\\*(?![a-zA-Z])', '', s)
print(s2)

output (consecutive spaces shown collapsed; the regex removes only backslashes): hello \world this 234 is \pattern'

How the regex works:

\\*          # match zero or more \ characters (greedy)
(?![a-zA-Z]) # only if what follows is not an ASCII letter
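The three patterns in this thread agree on the question's sample but are not interchangeable. As a small sketch, the made-up input below (not taken from the question) includes a backslash followed by an underscore: the two letter-based lookaheads remove that backslash, while the [\W\d] lookahead keeps it because an underscore counts as a word character.

```python
import re

# Patterns copied from the three answers; the dictionary keys are just labels.
PATTERNS = {
    "answer 1": r"\\(?![A-Za-z])\s*",
    "answer 2": r"\\(?=[\W\d]|$)",
    "answer 3": r"\\*(?![a-zA-Z])",
}

# Hypothetical edge cases: underscore, digit, letter.
text = r"\_ \1 \a"

results = {label: re.sub(pattern, "", text) for label, pattern in PATTERNS.items()}
for label, result in results.items():
    print(f"{label}: {result}")
```

Which behaviour is right depends on whether the escapes to keep are meant to be "backslash plus letter" or "backslash plus any word character other than a digit".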