I am trying to capture the text between two delimiters, but only when there is also a line feed between the delimiters. For example, given the following text:
Organisation Name <<me.company.name>>
ABN/ACN <<me.company.abn>>
Contact Name <<me.name>>
<<me.PhoneNumber
Another line>>
Email <<me.emailAddress>>
I want to return only <<me.PhoneNumber \n Another line>>
The \n could be anywhere. Basically, I only want matches that have at least one \n within the << >>, and all other << >> should be ignored.
The pattern I have so far is <<(.*?\n*)*?>> but this captures all << >> (I'm using C#)
Here is an example of what I have tried: https://regex101.com/r/sb0wCs/1
Thanks so much for your help
CodePudding user response:
You can use
<<((?:(?!<<|>>).)*?\n(?s:.)*?)>>
See the regex demo. Details:
<< - a << string
((?:(?!<<|>>).)*?\n(?s:.)*?) - Group 1:
  (?:(?!<<|>>).)*? - any zero or more chars (other than newline chars) that do not start a >> or << char sequence, as few as possible
  \n - a LF char
  (?s:.)*? - any zero or more chars (including newline chars), as few as possible
>> - a >> string
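Since the question mentions C#, here is a minimal sketch of how this pattern might be used with Regex.Matches. The sample text is copied from the question; the variable names are only illustrative, and the code assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample text from the question; only the PhoneNumber placeholder spans two lines.
string text =
    "Organisation Name <<me.company.name>>\n" +
    "ABN/ACN <<me.company.abn>>\n" +
    "Contact Name <<me.name>>\n" +
    "<<me.PhoneNumber\n" +
    "Another line>>\n" +
    "Email <<me.emailAddress>>";

// Pattern from the answer, written as a verbatim string so the
// backslashes reach the regex engine unchanged.
var pattern = @"<<((?:(?!<<|>>).)*?\n(?s:.)*?)>>";

MatchCollection matches = Regex.Matches(text, pattern);
foreach (Match m in matches)
{
    // Group 1 holds the text between << and >>, without the delimiters.
    Console.WriteLine(m.Groups[1].Value);
}
```

With this input, the single-line placeholders should be skipped and only the two-line one reported.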
CodePudding user response:
You can try this: <<[^>]*?\n[^>]*>>
Test regex here: https://regex101.com/r/vD3EgE/2
<<[^>]*?\n[^>]*>>
<< match literal <<
[^>]*? match any char that is not > as few as possible
\n match a newline
[^>]* match any char that is not > as many times as possible (this one is greedy)
>> match literal >>
This will match only if there is a \n between << and >>.
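A rough C# sketch of this pattern, assuming C# 9 or later for top-level statements. The first input is shortened from the question; the second input, with a lone > inside the delimiters, is a made-up case added to show where the [^>] approach stops matching.

```csharp
using System;
using System.Text.RegularExpressions;

// Shortened sample from the question.
string text =
    "Contact Name <<me.name>>\n" +
    "<<me.PhoneNumber\n" +
    "Another line>>\n" +
    "Email <<me.emailAddress>>";

var pattern = @"<<[^>]*?\n[^>]*>>";

// There is no capture group here, so Value includes the << and >> delimiters.
Match m = Regex.Match(text, pattern);
Console.WriteLine(m.Success);
Console.WriteLine(m.Value);

// Hypothetical input: [^>] cannot step over the single '>' after "a ",
// so this placeholder is not matched even though it contains a newline.
bool matchesLoneAngle = Regex.IsMatch("<<a > b\nc>>", pattern);
Console.WriteLine(matchesLoneAngle);
```

If the placeholder text can never contain a > character, this limitation does not matter and the pattern stays simpler than a lookahead-based one.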
CodePudding user response:
In your pattern <<(.*?\n*)*?>> you have a capture group and all parts are optional, including the newline, so the non-greedy quantifier *? can match until the first occurrence of >>
Also when repeating a capture group, the group value will hold the value of the last iteration, so instead you can put the capture group without a quantifier around the whole part that you want to capture.
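To illustrate the point about repeated capture groups, here is a small C# sketch. The input and both patterns are made up for the demonstration and are not taken from the question; top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input and patterns, chosen only to illustrate repeated groups.
string input = "a,b,c,";

// The quantifier is applied to the capture group itself,
// so the group value is overwritten on every iteration.
Match repeated = Regex.Match(input, @"(\w,)+");
string lastIteration = repeated.Groups[1].Value;         // value of the final iteration only
int iterationCount = repeated.Groups[1].Captures.Count;  // .NET also keeps each iteration in Captures

// The capture group wraps the whole repeated part,
// and the repetition happens in a non-capturing group inside it.
Match wrapped = Regex.Match(input, @"((?:\w,)+)");
string whole = wrapped.Groups[1].Value;

Console.WriteLine(lastIteration);
Console.WriteLine(iterationCount);
Console.WriteLine(whole);
```

The Captures collection is specific to .NET; in most other regex flavors only the last iteration is available, which is why wrapping the whole part in one group is the more portable choice.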
If your strings start at the beginning of the line, you can use anchors and match at least a single line in between that does not start with either << or >>
^\s*<<(.*(?:\r?\n(?!<<|>>).*)+\r?\n)\s*>>$
Explanation
^ Start of string
\s*<< Match optional leading whitespace chars and <<
( Capture group 1
  .* Match the rest of the line
  (?:\r?\n(?!<<|>>).*)+ Match a newline, and repeat at least 1 line not starting with << or >>
  \r?\n Match a newline
) Close group 1
\s*>> Match optional leading whitespace chars and >>
$ End of string
See a regex demo.
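A tentative C# sketch of the anchored pattern. It assumes the quantifier after the non-capturing group is + (the explanation says "repeat at least 1 line"), that RegexOptions.Multiline is used so ^ and $ also match at line boundaries, and that the closing >> sits on its own line. The first input is a made-up variation of the question's text with the >> moved to a separate line.

```csharp
using System;
using System.Text.RegularExpressions;

var pattern = @"^\s*<<(.*(?:\r?\n(?!<<|>>).*)+\r?\n)\s*>>$";

// Hypothetical layout: the closing >> is on a line of its own.
string ownLine =
    "Contact Name <<me.name>>\n" +
    "<<me.PhoneNumber\n" +
    "Another line\n" +
    ">>\n" +
    "Email <<me.emailAddress>>";

Match m = Regex.Match(ownLine, pattern, RegexOptions.Multiline);
Console.WriteLine(m.Success);
// Group 1 includes the newline that precedes the closing >>.
Console.WriteLine(m.Groups[1].Value);

// Layout from the question: >> directly follows "Another line" on the same line,
// so the newline required before >> is missing and the pattern does not match.
string sameLine =
    "<<me.PhoneNumber\n" +
    "Another line>>\n" +
    "Email <<me.emailAddress>>";

bool sameLineMatches = Regex.IsMatch(sameLine, pattern, RegexOptions.Multiline);
Console.WriteLine(sameLineMatches);
```

So this variant fits input where the delimiters open and close on line boundaries; for the inline layout shown in the question, one of the unanchored patterns is probably the better fit.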
