I need to find out whether a word (for example "output") is present in a snake_case name.
This means the regex must match all of the following:
- output
- output_of_my_program
- my_output_from_program
- program_output
That is, output, output_, _output_, and _output need to be matched.
Currently I have three individual regex patterns to cover all the cases:
"^output[^a-z]_?", "_output_", "_output$"
I have tried to combine the three into one, [a-z]*_?[^a-z]output_?[a-z]*, but this fails for certain cases. Is it possible to combine the three patterns into one?
Edit: The other keywords I am interested in are "in", "input", and "out". The challenge is to avoid matching words such as "introspection" and cases such as "my_programoutput".
CodePudding user response:
You don't include it in your test cases, but I assume part of the intent is to explicitly not match a string like my_floutputs_from_program. You could do this with a tricky regex, but I'd just use split and in:
for s in (
    'output',
    'output_of_my_program',
    'my_output_from_program',
    'program_output',
    'input',
    'my_floutputs_from_program'
):
    print(f"{s}: {'output' in s.split('_')}")
output: True
output_of_my_program: True
my_output_from_program: True
program_output: True
input: False
my_floutputs_from_program: False
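The same split-based check extends to the other keywords listed in the question's edit. A minimal sketch, assuming a hypothetical helper named find_keywords (not part of the answer above) that returns the keywords occurring as whole underscore-separated segments:

```python
# Keywords taken from the question's edit
KEYWORDS = {"in", "input", "out", "output"}

def find_keywords(name, keywords=KEYWORDS):
    """Return the keywords that occur as whole snake_case segments of name."""
    # Hypothetical helper for illustration: the set intersection keeps
    # only segments that are exactly equal to one of the keywords.
    return set(name.split("_")) & set(keywords)

for s in ("my_output_from_program", "read_in_file", "introspection", "my_programoutput"):
    print(f"{s}: {sorted(find_keywords(s))}")
```

With this approach "introspection" and "my_programoutput" yield an empty result, because neither contains a keyword as a complete segment.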
CodePudding user response:
Is there a reason why 'output' in s would not work? You do not have to use regex to solve this, unless it is necessitated as part of the project.
>>> strings = [
...     'output',
...     'output_of_my_program',
...     'my_output_from_program',
...     'program_output',
...     'input'
... ]
>>>
>>> for s in strings:
...     print('output' in s)
...
True
True
True
True
False
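One thing to keep in mind with a plain substring test is that it is also true when the keyword sits inside a longer segment, which the question's edit wants to avoid. A small sketch comparing the two checks on those cases (the variable names here are only illustrative):

```python
# Compare a plain substring test with a whole-segment test
# on the cases mentioned in the question's edit.
cases = [
    ("output", "my_programoutput"),
    ("in", "introspection"),
    ("output", "program_output"),
]

results = []
for keyword, s in cases:
    substring_hit = keyword in s           # True whenever the letters appear anywhere
    segment_hit = keyword in s.split("_")  # True only for a whole snake_case segment
    results.append((substring_hit, segment_hit))
    print(f"{keyword!r} in {s!r}: substring={substring_hit}, segment={segment_hit}")
```

Only the last case is a hit under both checks; the first two are substring hits but not segment hits.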
CodePudding user response:
If you can use the re library, you can just use the search function with the string required to find.
import re
txt = "my_output_from_program"
result = re.search("output", txt)
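re.search returns a match object when the pattern is found and None otherwise, so the result still has to be tested. A short sketch of how that could look; note that an unanchored search, like a substring test, also matches inside a longer segment:

```python
import re

txt = "my_output_from_program"
result = re.search("output", txt)

if result is not None:
    # span() gives the start and end indices of the match within txt
    print(result.group(), result.span())
else:
    print("no match")

# The bare pattern has no boundary checks, so this is also a match
embedded = re.search("output", "my_programoutput")
print(embedded is not None)
```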
CodePudding user response:
You can use a pattern with either a word boundary or a positive lookaround assertion for _ on the left or the right if you don't want partial matches.
(?:\b|(?<=_))output(?:\b|(?=_))
The pattern matches:
- (?:\b|(?<=_)) Match either a word boundary, or assert _ to the left
- output Match literally
- (?:\b|(?=_)) Match either a word boundary, or assert _ to the right
import re

pattern = r"(?:\b|(?<=_))output(?:\b|(?=_))"
strings = [
    "output",
    "output_of_my_program",
    "my_output_from_program",
    "program_output",
    "my_programoutput"
]
for s in strings:
    m = re.search(pattern, s)
    # Print only the strings where "output" is a whole segment
    if m:
        print(f"{s} --> {m.group()}")
Output
output --> output
output_of_my_program --> output
my_output_from_program --> output
program_output --> output
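The same pattern can be extended to the other keywords from the question's edit by putting an alternation in the middle. A minimal sketch, assuming a hypothetical helper named build_pattern (not part of the answer above):

```python
import re

def build_pattern(keywords):
    """Compile a regex matching any keyword as a whole snake_case segment."""
    # Hypothetical helper for illustration; re.escape guards against
    # keywords that contain regex metacharacters.
    alternation = "|".join(re.escape(k) for k in keywords)
    return re.compile(rf"(?:\b|(?<=_))(?:{alternation})(?:\b|(?=_))")

pattern = build_pattern(["in", "input", "out", "output"])

for s in ["input_file", "read_in_file", "program_output",
          "introspection", "my_programoutput"]:
    m = pattern.search(s)
    print(f"{s} --> {m.group() if m else None}")
```

Because the assertion after the alternation must also succeed, "in" is rejected at the start of "input_file" and the engine backtracks to "input"; "introspection" and "my_programoutput" give no match at all.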
