I want to match any string that starts with a . followed by a word, optionally followed by whitespace and any further characters. My current pattern is:
r"^\.(\w+)(?:\s+(.+)\b)?"
For example:
should match
.just one two
.just
.blah one@nine
.blah
.jargon blah
should not match
.jargon
I want the second group to be mandatory if the first group is jargon.
CodePudding user response:
One approach would be to phrase your requirement using an alternation:
^\.(?:(?!jargon\b)\w+(?: \S+)*|jargon(?: \S+)+)$
This pattern says to match:
^                        from the start of the input
\.                       match a dot
(?:
    (?!jargon\b)\w+      match a first term which is NOT "jargon"
    (?: \S+)*            then match optional following terms, zero or more times
    |                    OR
    jargon               match "jargon" as the first term
    (?: \S+)+            then match mandatory following terms, one or more times
)
$                        end of the input
Here is a sample Python script:
import re

inp = [".just one two", ".just", ".blah one@nine", ".blah", ".jargon blah", ".jargon"]
# Keep only the inputs that satisfy the pattern; ".jargon" alone is rejected.
matches = [x for x in inp if re.search(r'^\.(?:(?!jargon\b)\w+(?: \S+)*|jargon(?: \S+)+)$', x)]
print(matches) # ['.just one two', '.just', '.blah one@nine', '.blah', '.jargon blah']
CodePudding user response:
You could attempt to match the following regular expression:
^\.(?!jargon$)\w+(?= .|$).*
If successful, this will match the entire string. If one simply wants to know whether the string conforms to the requirements, .* can be dropped.
(?!jargon$) is a negative lookahead that asserts that the period is not immediately followed by 'jargon' at the end of the string.
(?= .|$) is a positive lookahead that asserts that the string of word characters is either followed by a space and then any character, or ends the string.
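As a quick check, the lookahead pattern above can be run against the examples from the question. This is only a sketch: PATTERN is the expression from this answer, while GROUPED is a hypothetical variant (not part of the original answer) that adds capture groups, on the assumption that the first word and the remainder are wanted separately, as in the question's own attempt.

```python
import re

# Pattern from the answer above.
PATTERN = re.compile(r'^\.(?!jargon$)\w+(?= .|$).*')

# Hypothetical variant with capture groups (an assumption, not the answer's pattern):
# group 1 is the first word, group 2 is the optional remainder after a single space.
GROUPED = re.compile(r'^\.(?!jargon$)(\w+)(?: (.+))?$')

samples = [".just one two", ".just", ".blah one@nine", ".blah", ".jargon blah", ".jargon"]

# Strings accepted by the lookahead pattern.
matched = [s for s in samples if PATTERN.search(s)]

# Captured groups for each accepted string.
groups = {}
for s in samples:
    m = GROUPED.search(s)
    if m:
        groups[s] = m.groups()

print(matched)
print(groups)
```

With the grouped variant, a missing second part shows up as None in group 2, so the caller can tell ".just" apart from ".just one two" without a second regex.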
