I am trying to create a regular expression which matches multiple groups, so the values between the groups can be extracted. Each group looks identical.
Let's consider the following example; note that the line breaks are intentional:
dog 1
wuff
wuff
cat
123
XYZ
dog 1
wuff
wuff
cat
456
ABC
dog 1
wuff
wuff
cat
789
Thus, with the right regular expression I want to get the output:
123
XYZ
456
ABC
789
On regex101.com I tried:
(?s)(?:dog.*cat)
which matches everything from the first occurrence of dog to the last occurrence of cat.
In addition I tried:
(?s)(?:dog.*(cat){1})
which, with my limited knowledge, should match the first occurrence of cat and then end the group, but it does not.
I appreciate any help.
CodePudding user response:
You may use this regex in MULTILINE mode to capture the values that follow each dog ... cat block:
^dog\b(?:.*\n)+?cat\n(.*(?:\n.*)*?)(?=\ndog|\Z)
Your values are available in capture group #1.
RegEx Details:
^: Match start of a line
dog\b: Match the word dog followed by a word boundary
(?:.*\n)+?: Match anything followed by a line break. Repeat this 1+ times (lazy)
cat\n: Match cat followed by a newline
(.*(?:\n.*)*?): Capture the multiline values you're interested in, in the first capture group
(?=\ndog|\Z): Lookahead to assert that either a line break followed by dog, or the end of input, is ahead of the current position
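To try the pattern outside regex101, here is a minimal sketch in Python. The choice of language is an assumption, since the question does not name one, and the names TEXT, PATTERN and extract_values are made up for illustration. It also assumes the lazy group is written as (?:.*\n)+? (one or more lines) and that the input has no trailing newline.

```python
import re

# Sample input from the question, joined with real line breaks.
TEXT = "\n".join([
    "dog 1", "wuff", "wuff", "cat", "123", "XYZ",
    "dog 1", "wuff", "wuff", "cat", "456", "ABC",
    "dog 1", "wuff", "wuff", "cat", "789",
])

# re.MULTILINE lets ^ match at the start of every line.
# Assumption: the lazy group is (?:.*\n)+? (one or more lines, as few as possible).
PATTERN = re.compile(
    r"^dog\b(?:.*\n)+?cat\n(.*(?:\n.*)*?)(?=\ndog|\Z)",
    re.MULTILINE,
)


def extract_values(text):
    """Return every line found between a 'cat' line and the next 'dog' line."""
    values = []
    # With a single capture group, findall returns the group's text per match.
    for block in PATTERN.findall(text):
        values.extend(block.splitlines())
    return values


print(extract_values(TEXT))
```

Each match yields one multiline block in group #1, so the sketch splits the blocks into lines to get the flat list of values asked for in the question.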
