I want to write a regular expression that matches words of even length.
For example, the output I want from the list containing the words:
{"blue", "ah", "sky", "wow", "neat"} is {"blue", "ah", "neat"}.
I know that the expression \w{2} or \w{4} would match 2-letter or 4-letter words, but what I want is something that works for all even lengths. I tried using \w{%2==0} but it doesn't work.
CodePudding user response:
You can repeat two word characters as a group one or more times, either between the anchors ^ (start of the string) and $ (end of the string), or between word boundaries \b:
^(?:\w{2})+$
import re
strings = [
"blue",
"ah",
"sky",
"wow",
"neat"
]
for s in strings:
    # re.match already anchors at the start, so only $ is needed here
    m = re.match(r"(?:\w{2})+$", s)
    if m:
        print(m.group())
Output
blue
ah
neat
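The word-boundary variant mentioned in this answer is useful when the words sit inside a longer string rather than in a list. A minimal sketch, assuming a made-up sentence in `text` (the names `even_word` and `text` are illustrative only):

```python
import re

# \b asserts a word boundary on both sides, so only whole words
# made of an even number of word characters are matched.
even_word = re.compile(r"\b(?:\w{2})+\b")

text = "blue ah sky wow neat"  # hypothetical input
print(even_word.findall(text))
# => ['blue', 'ah', 'neat']
```

Because the group is non-capturing, findall returns the full matches rather than the last repeated pair.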
CodePudding user response:
If you need no extra validation for the strings in your set, you can simply use
words = {"blue", "ah", "sky", "wow", "neat"}
print([w for w in words if len(w) % 2 == 0])
# => ['ah', 'blue', 'neat'] (a set is unordered, so the order may vary)
If you want to make sure the words you return are made of letters, you can use
import re
words = {"blue", "ah", "sky", "wow", "neat"}
rx = re.compile(r'(?:[^\W\d_]{2})+') # For any Unicode letter words
# rx = re.compile(r'(?:[a-zA-Z]{2})+') # For ASCII only letter words
print( [w for w in words if rx.fullmatch(w)] )
# => ['blue', 'ah', 'neat'] (a set is unordered, so the order may vary)
The (?:[^\W\d_]{2})+ pattern matches one or more repetitions of two consecutive Unicode letters. Combined with fullmatch, it requires the whole string to consist of an even number of letters.
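To see where the letters-only pattern differs from the \w-based one, here is a small comparison sketch. The sample strings and the names `word_chars`, `letters_only`, and `samples` are made up for illustration:

```python
import re

word_chars = re.compile(r"(?:\w{2})+")          # letters, digits and underscore
letters_only = re.compile(r"(?:[^\W\d_]{2})+")  # excludes digits and underscore

# "\u00fcber" is "über" written with an escape for the ü
samples = ["blue", "ab12", "a_b2", "\u00fcber", "sky"]

print([s for s in samples if word_chars.fullmatch(s)])
# => ['blue', 'ab12', 'a_b2', 'über']
print([s for s in samples if letters_only.fullmatch(s)])
# => ['blue', 'über']
```

Both patterns reject "sky" because of its odd length; only the letters-only pattern also rejects the strings containing digits or underscores.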
