In regex, we have greedy and lazy quantifiers. The greedy quantifier {n,m} matches the preceding atom/character/group a minimum of n and a maximum of m occurrences, inclusive.
If I have a collection of strings:
a
aa
aaa
aaaa
aaaaaaaaaa
With a{2,4}, it matches:
- nothing on first line
- aa on second
- aaa on third
- aaaa on fourth
- (aaaa), (aaaa), and (aa) on fifth line
That makes sense.
However, with the lazy quantifier a{2,4}?, I get:
- nothing on first line
- aa on second line
- aa on third line
- (aa) and (aa) on fourth line
- (aa), (aa), (aa), (aa), and (aa) on fifth line
That also makes sense: it matches as few characters as possible.
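As a quick check of the two lists above, here is a minimal sketch. It assumes Python's re module rather than Vim (where the lazy form is written with {-2,4}), and the variable names are only illustrative. It also runs the open-ended lazy form a{2,}? for comparison.

```python
import re

# The five sample lines from the question.
lines = ["a", "aa", "aaa", "aaaa", "a" * 10]

# Non-overlapping matches per line for each quantifier form.
greedy = [re.findall(r"a{2,4}", s) for s in lines]      # greedy, bounded
lazy = [re.findall(r"a{2,4}?", s) for s in lines]       # lazy, bounded
lazy_open = [re.findall(r"a{2,}?", s) for s in lines]   # lazy, no max

for s, g, l in zip(lines, greedy, lazy):
    print(f"{s!r}: greedy={g} lazy={l}")

# With nothing after the quantifier, the max makes no difference here.
print(lazy == lazy_open)
```

When nothing follows the quantifier, the bounded and open-ended lazy forms give the same matches on these inputs, which is the observation behind the question.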
The part I want to clarify: is there any use in giving a lazy quantifier a max value m (in this case, the 4 in {2,4}?)? Isn't the result always the same as with {2,}?
Is there a scenario where passing a max (like the 4 in {2,4}?) is useful in a lazy quantifier?
Disclaimer: I am actually using the regular expression to search inside Vim (/a{-2,4}), not in any scripting language. I think the principle of the question is still the same.
CodePudding user response:
It matters when you need to consider what follows the lazily quantified expression. Laziness keeps the quantified expression from consuming characters that a later expression in the concatenation could match. Consider matching the string aaaaab from its start:
- The string is not matched by ^a{2,4}?b, as there are too many a's for a{2,4} to match. (Without the anchor, a search would still find aaaab, starting at the second a.)
- The string is matched by ^a{2,}?b, since it can match as many a's as necessary.
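A minimal sketch of the aaaaab example, assuming Python's re module rather than Vim. Note that re.match only tries the pattern at the start of the string, while re.search may begin the match at a later position, so the two behave differently for the bounded form.

```python
import re

text = "aaaaab"

# Anchored at the start: at most four a's may precede b, but there are five.
bounded = re.match(r"a{2,4}?b", text)

# Anchored at the start: the lazy quantifier expands until b can match.
unbounded = re.match(r"a{2,}?b", text)

# Unanchored: the search can skip the first a and match from position 1.
searched = re.search(r"a{2,4}?b", text)

print(bounded)             # None
print(unbounded.group())   # aaaaab
print(searched.group(), searched.start())  # aaaab 1
```

So the max in a lazy quantifier acts as a hard limit on how far the quantifier may expand when a following expression forces it to grow.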
