Is there a way with regex to strip all the punctuation from the beginning and end of a string while leaving the punctuation and other special symbols inside the string unchanged?
For example:
$$//[email protected]<~> becomes [email protected]
While:
becomes
And:
**A B** becomes A B
P.S. The strings under analysis contain no spaces.
CodePudding user response:
Use re.sub() to remove all non-word characters at the beginning and end.
import re
string = re.sub(r'^\W+|\W+$', '', string)
^ matches the beginning of the string, and $ matches the end. \W+ matches a sequence of one or more non-word characters, that is, anything other than letters, digits, and the underscore.
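As a minimal sketch of the approach above, the substitution can be wrapped in a small helper and tried on a few inputs. The function name strip_outer_punctuation and the sample strings are made up for illustration; they are not from the question.

```python
import re

# Hypothetical helper name, used only for this illustration.
def strip_outer_punctuation(text: str) -> str:
    # ^\W+ matches a leading run of non-word characters,
    # \W+$ matches a trailing run; anything in between is left alone.
    return re.sub(r'^\W+|\W+$', '', text)

print(strip_outer_punctuation("**A+B**"))       # A+B
print(strip_outer_punctuation("$$//a.b-c<~>"))  # a.b-c
print(strip_outer_punctuation("__x__"))         # __x__ (underscore counts as a word character)
```

Note the last case: because \w includes the underscore, leading and trailing underscores are not stripped. If that matters, a character class such as [^a-zA-Z0-9] could be used in place of \W.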
CodePudding user response:
You could use re.findall here and search for the pattern \w+(?:\W+\w+)*:
import re
inp = ["$$//[email protected]<~>", "**A B**"]
output = [re.findall(r'\w+(?:\W+\w+)*', x)[0] for x in inp]
print(output) # ['[email protected]', 'A B']
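The regex pattern used above matches a run of word characters, followed by zero or more repetitions of a run of non-word characters and another run of word characters. Because the match has to start and end on a word character, the leading and trailing punctuation is left out.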
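One thing to watch with the findall version: indexing [0] raises IndexError when the input contains no word characters at all. A hedged variant using re.search with an empty-string fallback might look like the sketch below; the names CORE and core_of and the sample inputs are assumptions made for this example.

```python
import re

# Same pattern as above, compiled once. CORE and core_of are illustrative names.
CORE = re.compile(r'\w+(?:\W+\w+)*')

def core_of(text: str) -> str:
    # search() returns None when there is no word character to anchor on,
    # so fall back to an empty string instead of raising.
    match = CORE.search(text)
    return match.group(0) if match else ''

samples = ["**A+B**", "<<x.y.z>>", "!!!"]
print([core_of(s) for s in samples])  # ['A+B', 'x.y.z', '']
```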
CodePudding user response:
This regex matches everything from the first letter to the last letter in the string, including those two letters. This should work:
[a-zA-Z]([^]*)[a-zA-Z]
Where:
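- [a-zA-Z] matches a single upper- or lower-case letter
- ([^]*) matches 0 or more of any character, including newlines. The empty negated class [^] is JavaScript syntax; Python's re module rejects it, so use [\s\S]* there instead.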
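For readers using Python's re module, here is a rough equivalent of the letter-to-letter idea, written with . and re.DOTALL in place of the any-character class. This is a sketch under that assumption, and the names LETTER_SPAN and between_letters are invented for the example.

```python
import re

# Assumed Python rendering of the pattern: re.DOTALL lets . match newlines too.
LETTER_SPAN = re.compile(r'[a-zA-Z].*[a-zA-Z]', re.DOTALL)

def between_letters(text):
    # The greedy .* runs to the end of the string and backtracks to the last letter.
    m = LETTER_SPAN.search(text)
    return m.group(0) if m else None

print(between_letters("**A+B**"))    # A+B
print(between_letters("99a-b-c99"))  # a-b-c
print(between_letters("**A**"))      # None
```

Two behaviours differ from the \W-based answers: digits at the edges are stripped along with the punctuation, and a string with fewer than two letters produces no match at all.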
