I need a regex that will get all the text occurrences between parentheses, keeping in mind that all the content is enclosed between the word BEGIN at the start and the chars ---- at the end.
Input example:
BEGIN ) Tj\nET37.66 533 Td\n( Td\n(I NEED THIS TEXT ) Tj\nET\nBT\n37.334 Td\n(AND ALSO NEED THIS TEXT ) Tj\nET\nBT\n37.55 Td\n(------------
Expected matches:
I NEED THIS TEXT
AND ALSO NEED THIS TEXT
I already did something like (?<=BEGIN).*(?=\(--) for the outer pattern, but I couldn't figure out how to get all the text occurrences inside parentheses within it.
CodePudding user response:
With the Python PyPI regex library, you can use
(?s)(?:\G(?!^)\)|BEGIN)(?:(?!\(--).)*?\((?!--)\K[^()]*
Details:
(?s) - a DOTALL inline modifier making . match line break chars
(?:\G(?!^)\)|BEGIN) - either BEGIN or the end of the previous successful match and a ) right after
(?:(?!\(--).)*? - any char, zero or more but as few as possible occurrences, that does not start a (-- char sequence
\( - a ( char
(?!--) - right after (, there should be no --
\K - match reset operator: what was matched before is discarded from the overall match memory buffer
[^()]* - zero or more chars other than ( and )
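Note that \G and \K are not supported by Python's built-in re module, which is why the PyPI regex library is needed for the pattern above. If installing it is not an option, a rough stdlib-only sketch could use a different, two-step technique instead: first cut out the region between BEGIN and the closing (-- marker, then collect every parenthesized run inside it. The function name extract_parenthesized is made up for illustration, the \n sequences in the sample are assumed to be real newlines, and stripping the surrounding whitespace is an assumption based on the expected matches in the question.

```python
import re


def extract_parenthesized(text):
    """Return the text of every "(...)" group found between BEGIN and "(--"."""
    # Step 1: isolate everything after the first BEGIN and before the first "(--".
    # re.DOTALL lets "." cross line breaks.
    region = re.search(r"BEGIN(.*?)\(--", text, flags=re.DOTALL)
    if region is None:
        return []
    # Step 2: inside that region, take each "(" ... ")" pair that has no other
    # parenthesis in between. A stray "(" or ")" on its own is simply skipped.
    groups = re.findall(r"\(([^()]*)\)", region.group(1))
    # Assumption: surrounding whitespace is not wanted in the results.
    return [g.strip() for g in groups]


sample = (
    "BEGIN ) Tj\nET37.66 533 Td\n( Td\n(I NEED THIS TEXT ) Tj\nET\nBT\n"
    "37.334 Td\n(AND ALSO NEED THIS TEXT ) Tj\nET\nBT\n37.55 Td\n(------------"
)

print(extract_parenthesized(sample))
# ['I NEED THIS TEXT', 'AND ALSO NEED THIS TEXT']
```

Splitting the work into two passes keeps each pattern simple, at the cost of not being a single regex as the question asked.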
CodePudding user response:
Try:
\(((?:(?!BEGIN).)*?)\)(?=.*---)
\(((?:(?!BEGIN).)*?)\) - match everything between ( and ), but not BEGIN
(?=.*---) - .*--- must follow after this match
