Python: remove all punctuation before and after an alphanumeric string, but leave punctuation within

Time:01-27

Is there a way to use a regex to strip all the punctuation from the start and end of a string, while leaving the punctuation and other special symbols inside the string unchanged?

For example:

$$//[email protected]<~> becomes [email protected]

While:

       becomes 

And:

**A  B** becomes A  B

P.S. There are no spaces in the strings under analysis.

CodePudding user response:

Use re.sub() to remove all non-word characters at the beginning and end.

string = re.sub(r'^\W+|\W+$', '', string)

^ matches the beginning of the string, and $ matches the end. \W+ matches a sequence of one or more non-word characters, that is, anything other than letters, digits, and the underscore.
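As a minimal sketch of this approach, the snippet below wraps the substitution in a function. The helper name strip_outer_punctuation and the sample strings are made up for illustration, and the pattern assumes a + quantifier after each \W.

```python
import re

# Hypothetical helper name, used only for illustration.
def strip_outer_punctuation(text: str) -> str:
    # ^\W+ removes the leading run of non-word characters,
    # \W+$ removes the trailing run; everything in between is untouched.
    return re.sub(r'^\W+|\W+$', '', text)

print(strip_outer_punctuation("$$//user.name@host<~>"))  # user.name@host
print(strip_outer_punctuation("**A-B**"))                # A-B
print(strip_outer_punctuation("_x_!"))                   # _x_
```

The last call shows a side effect of \W: the underscore counts as a word character, so leading or trailing underscores are kept.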

CodePudding user response:

You could use re.findall here and search for the pattern \w+(?:\W+\w+)*:

import re

inp = ["$$//[email protected]<~>", "**A  B**"]
# Take the first (and only expected) match for each input string
output = [re.findall(r'\w+(?:\W+\w+)*', x)[0] for x in inp]
print(output)  # ['[email protected]', 'A  B']

The regex pattern used above matches a run of word characters, followed by zero or more repetitions of a run of non-word characters and another run of word characters.
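Indexing the result of re.findall with [0] raises IndexError when a string contains no word characters at all. A possible variant, sketched here with re.search and a hypothetical helper name extract_core, returns an empty string in that case instead:

```python
import re

# Hypothetical helper; returns "" when the input has no word characters.
def extract_core(text: str) -> str:
    # A run of word characters, optionally followed by repeated
    # (non-word run + word run) pairs, so the match always ends on a word character.
    match = re.search(r'\w+(?:\W+\w+)*', text)
    return match.group(0) if match else ""

print(extract_core("**A-B**"))        # A-B
print(repr(extract_core("!!!")))      # ''
```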

CodePudding user response:

This regex matches everything between the first and last letter in the string, including those letters. This should work:

[a-zA-Z]([\s\S]*)[a-zA-Z]

Where:

  • [a-zA-Z] matches an upper- or lower-case letter
  • ([\s\S]*) matches 0 or more of any character (the [^] form is JavaScript syntax and is rejected by Python's re module)
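A rough sketch of the letter-to-letter idea in Python follows. It rests on two assumptions: [\s\S] is swapped in for "any character", because Python's re module does not accept [^], and the helper name between_letters is invented for the example.

```python
import re

# Assumption: [\s\S] stands in for "any character, including newlines".
LETTER_TO_LETTER = re.compile(r'[a-zA-Z][\s\S]*[a-zA-Z]')

# Hypothetical helper; returns "" when fewer than two letters are present.
def between_letters(text: str) -> str:
    match = LETTER_TO_LETTER.search(text)
    return match.group(0) if match else ""

print(between_letters("**A-B**"))         # A-B
print(between_letters("$$9a.b9<"))        # a.b
print(repr(between_letters("**A**")))     # ''
```

Two limits show up in the last two calls: digits at the edges are dropped because only letters anchor the match, and a string with a single letter produces no match because the pattern needs a letter at both ends.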