Match string between delimiters, but ignore matches with specific substring

Time:01-20

I have to parse all the text inside parentheses, except for text that contains "GST".

e.g:

(AUSTRALIAN RED CROSS – ATHERTON)
(Total GST for this Invoice $1,104.96)
today for a quote (07) 55394226 − [email protected] − this applies to your Nerang services.

expected parsed value:

AUSTRALIAN RED CROSS – ATHERTON

I am trying:

^\(((?!GST).)*$

But it's only matching the value and not grouping correctly.

https://regex101.com/r/HndrUv/1

What would be the correct regex for the same?

CodePudding user response:

This regex should work to get the expected string:

^\((?!.*GST)(.*)\)$

It first uses a negative lookahead, (?!.*GST), to check that GST does not appear anywhere in the rest of the line. If that check passes, (.*) captures the text.

(?!.*GST)(.*)

That is then surrounded by \( and \), which match the literal parentheses and keep them out of the capturing group.

\((?!.*GST)(.*)\)

Finally, add the beginning-of-line (^) and end-of-line ($) anchors to get the result.

^\((?!.*GST)(.*)\)$

The expected value is saved in the first capture group (.*).
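
As a quick check, here is a minimal Python sketch that applies this pattern to the three sample lines from the question, one line at a time. The choice of Python and the variable names (pattern, lines, parsed) are assumptions for illustration; the question does not say which language or engine is used.

```python
import re

# Pattern from the answer above: the lookahead rejects any line containing
# "GST", and group 1 holds the text between the outer parentheses.
pattern = re.compile(r"^\((?!.*GST)(.*)\)$")

# The three sample lines from the question, tested one line at a time.
lines = [
    "(AUSTRALIAN RED CROSS – ATHERTON)",
    "(Total GST for this Invoice $1,104.96)",
    "today for a quote (07) 55394226 − [email protected] − this applies to your Nerang services.",
]

parsed = []
for line in lines:
    m = pattern.match(line)
    if m:
        parsed.append(m.group(1))

print(parsed)  # ['AUSTRALIAN RED CROSS – ATHERTON']
```

The second line is rejected by the lookahead, and the third never matches because it does not start with an opening parenthesis.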

CodePudding user response:

You can use

^\((?![^()]*\bGST\b)([^()]*)\)$

Details:

  • ^ - start of string
  • \( - a ( char
  • (?![^()]*\bGST\b) - a negative lookahead that fails the match if, immediately to the right of the current location, there are zero or more chars other than ) and ( and then GST as a whole word (remove \bs if you do not need whole word matching)
  • ([^()]*) - Group 1: any zero or more chars other than ) and (
  • \) - a ) char
  • $ - end of string
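
The breakdown above can be tried out with a short Python sketch. It assumes the input is one multi-line string, so re.MULTILINE is used to let ^ and $ match at each line; the "(GSTIN 1234)" line is a made-up input added only to show what the \b word boundaries change.

```python
import re

# Pattern from the answer; re.MULTILINE is an assumption so that ^ and $
# anchor at every line of a multi-line string.
strict = re.compile(r"^\((?![^()]*\bGST\b)([^()]*)\)$", re.MULTILINE)

# Same pattern with the \b word boundaries removed, for comparison.
loose = re.compile(r"^\((?![^()]*GST)([^()]*)\)$", re.MULTILINE)

text = (
    "(AUSTRALIAN RED CROSS – ATHERTON)\n"
    "(Total GST for this Invoice $1,104.96)\n"
    "(GSTIN 1234)\n"  # hypothetical line, not from the question
)

# With \b, only the whole word "GST" excludes a line, so "GSTIN" is kept.
strict_matches = strict.findall(text)
print(strict_matches)  # ['AUSTRALIAN RED CROSS – ATHERTON', 'GSTIN 1234']

# Without \b, any occurrence of "GST" excludes the line.
loose_matches = loose.findall(text)
print(loose_matches)  # ['AUSTRALIAN RED CROSS – ATHERTON']
```

Because the lookahead uses [^()]* rather than .*, it only inspects the text up to the next parenthesis, so a "GST" appearing later in the input does not affect an earlier parenthesized group.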

Bonus:

If substrings inside longer texts need to be matched too, remove the ^ and $ anchors from the above regex.
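
A small Python sketch of the unanchored variant follows, assuming the sample lines are joined into one longer string (the joined text and variable names are illustrative, not from the question).

```python
import re

# Same pattern with the ^ and $ anchors removed, so parenthesized groups
# can be found anywhere inside a longer text.
unanchored = re.compile(r"\((?![^()]*\bGST\b)([^()]*)\)")

# Sample lines from the question joined into one string (illustrative).
text = (
    "(AUSTRALIAN RED CROSS – ATHERTON) "
    "(Total GST for this Invoice $1,104.96) "
    "today for a quote (07) 55394226"
)

found = unanchored.findall(text)
print(found)  # ['AUSTRALIAN RED CROSS – ATHERTON', '07']
```

Note that without the anchors the area code "(07)" is matched as well, since it is a parenthesized group without GST. If that is not wanted, the anchored version or an extra filter on the results would be needed.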
