I have been doing regular expression for 25 years but I don't understand why this regex is not a match (using Perl syntax):
"unify" =~ /[iny]{3}/
# as in
perl -e 'print "Match\n" if "unify" =~ /[iny]{3}/'
Can someone help solve that riddle?
CodePudding user response:
The letters n, i, and y aren't all adjacent. There's an f in between them.
/[iny]{3}/ matches any string that contains a substring of three letters taken from the set {i, n, y}. The letters can be in any order; they can even be repeated.
Choosing three characters three times, with replacement, means there are 33 = 27 matching substrings:
iii,iin,iiy,ini,inn,iny,iyi,iyn,iyynii,nin,niy,nni,nnn,nny,nyi,nyn,nyyyii,yin,yiy,yni,ynn,yny,yyi,yyn,yyy
To match non-adjacent letters you can do:
[iny].*[iny].*[iny]
Or:
[iny](.*[iny]){2}
Or with an extra trailing .*:
([iny].*){3}
(This last option will work fine on its own since your search is unanchored, but might not be suitable as part of a larger regex. The final .* could match more than you intend.)
CodePudding user response:
The {3} atom means "exactly three consecutive matches of the preceding element." While all of the letters in your character class are present in the string, they are not consecutive as they are separated by other characters in your string.
It isn't the order of items in the character class that's at issue. It's the fact that you can't match any combination of the three letters in your character class where exactly three of them are directly adjacent to one another in your example string.
CodePudding user response:
That pattern looks for three consecutive occurrences of the letters i, n, or y. You do not have three consecutive occurrences.
Perhaps you meant to use [inf] or [ify]?
CodePudding user response:
Looks like you are looking for 3 consecutive letters, so yours should not match
[iny]{3} //no match
[unf]{3} //no match
[nif]{3} //matches nif
[nify]{3} //matches nif
[ify]{3} //matches ify
[uni]{3} //matches uni
Hope that helps somewhat :)
CodePudding user response:
The quantifier {3} in the pattern [iny]{3} means to match a character with that pattern (either i or n or y), and then another character with the same pattern, and then another. Three -- one after another. So your string unify doesn't have that, but can muster two at most, ni.
That's been explained in other answer already. What I'd like to add is an answer to a clarification in comments: how to check for these characters appearing 3 times in the string, scattered around at will. Apart from matching that whole substring, we can use a lookahead:
(?=[iny].*[iny].*[iny])
This does not "consume" any characters but rather "looks" ahead for the pattern, not advancing the engine from its current position. As such it can be useful in combination with other matches.
A Perl example with that, to copy-paste on the command line:
perl -wE'say "Match" if "unify" =~ /(?=[iny].*[iny].*[iny])/'
