I have a feeling this will get closed as a duplicate as it seems like this would be a common ask...but in my defense, I have searched SO as well as Google and could not find anything.
I'm trying to search SQL files using ripgrep, but I want to exclude matches that are part of a single-line SQL comment.
- I do not need to worry about multi-line /* foobar */ comments, only single-line -- foobar comments.
- I do not need to capture anything.
- I don't need to worry about the possibility of the string being part of text, like SELECT '-- foobar' or SELECT '--', foobar. I'm okay with those false exclusions.
Match examples:
- Match: SELECT foobar
- Match: , foobar
- Exclude: SELECT baz -- foobar
- Exclude: -- foobar
- Exclude: ---foobar
- Exclude: -- blah foobar blah
AKA, search for foobar but ignore the match if -- occurs at any point before it on that line.
I have tried things like negative lookbehinds and other methods, but I can't seem to get them to work.
This answer seemed like it would get me there: https://stackoverflow.com/a/13873003/3474677
That answer says to use (?<!grad\()vec to find matches of vec that are not preceded by grad(. Translated to my use case, that gives (?<!--)foobar. This works, but only for excluding lines that contain --foobar; it does not handle the other scenarios.
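To see why the lookbehind falls short, here is a minimal sketch that runs it against the six example lines from above. It uses Python's re module as a stand-in for ripgrep's PCRE2 engine (an assumption made so the check runs without ripgrep; a fixed-width lookbehind like this one should behave the same way in both).

```python
import re

# Assumption: Python's `re` stands in for ripgrep with --pcre2.
lookbehind = re.compile(r"(?<!--)foobar")

lines = [
    "SELECT foobar",         # wanted: match
    ", foobar",              # wanted: match
    "SELECT baz -- foobar",  # wanted: exclude
    "-- foobar",             # wanted: exclude
    "---foobar",             # wanted: exclude
    "-- blah foobar blah",   # wanted: exclude
]

# The lookbehind only inspects the two characters immediately before
# "foobar", so any "--" further to the left goes unnoticed.
lookbehind_hits = [line for line in lines if lookbehind.search(line)]

for line in lines:
    status = "match" if line in lookbehind_hits else "exclude"
    print(f"{status:8} {line}")
```

Only ---foobar is excluded, because it is the only line where -- sits directly against foobar.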
Worst case scenario, I can just pipe the results from ripgrep into another filter and exclude lines that match --.*foobar but I'm hoping to find an all-in-one solution.
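For reference, the two-stage fallback could look roughly like this. It is sketched in Python rather than as a shell pipeline, and the function name two_stage_filter is hypothetical: keep the lines that contain the word, then drop any line that matches --.*foobar.

```python
import re


def two_stage_filter(lines, word="foobar"):
    """Keep lines containing `word`, then drop lines where `--` precedes it.

    Hypothetical helper for illustration; it mirrors piping search results
    into a second filter that excludes `--.*foobar`.
    """
    contains_word = re.compile(re.escape(word))
    in_comment = re.compile(r"--.*" + re.escape(word))
    return [
        line
        for line in lines
        if contains_word.search(line) and not in_comment.search(line)
    ]


sample = [
    "SELECT foobar",
    ", foobar",
    "SELECT baz -- foobar",
    "-- foobar",
    "---foobar",
    "-- blah foobar blah",
]

filtered = two_stage_filter(sample)
print(filtered)
```

One caveat with this approach: a line such as foobar -- foobar is dropped entirely, even though its first foobar sits outside the comment.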
Answer:
Per the comments, with ripgrep and the --pcre2 flag enabled, you can use:
^(?:(?!--).)*\bfoobar\b
- ^ Start of string
- (?: Non-capture group
- (?!--). Negative lookahead: assert that from the current position there is not -- directly to the right. If that assertion is true, match any character except a newline
- )* Close the non-capture group and optionally repeat it
- \bfoobar\b Match the word foobar
See a regex demo.
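As a sanity check, the pattern can be run against the six example lines from the question. This sketch uses Python's re module as a stand-in for ripgrep's PCRE2 engine (an assumption; the pattern only uses a lookahead, a non-capture group and word boundaries, which both engines support).

```python
import re

# Assumption: Python's `re` stands in for ripgrep with --pcre2.
pattern = re.compile(r"^(?:(?!--).)*\bfoobar\b")

# Expected outcome for each example line from the question.
cases = {
    "SELECT foobar": True,
    ", foobar": True,
    "SELECT baz -- foobar": False,
    "-- foobar": False,
    "---foobar": False,
    "-- blah foobar blah": False,
}

# The tempered dot `(?:(?!--).)*` refuses to step onto a "--", so the
# match can only reach "foobar" if no "--" appears before it on the line.
results = {line: bool(pattern.search(line)) for line in cases}

for line, matched in results.items():
    status = "match" if matched else "exclude"
    print(f"{status:8} {line}")
```

Unlike a lookbehind, this anchors at the start of the line and checks every position on the way to foobar, which is why -- anywhere earlier on the line blocks the match.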
