I would like to know if the following constraint can be checked with regex: "Must be at least 5 characters, of which 4 should be letters"
I know how to express the "must be at least 5 characters" constraint, but I'm not sure about "of which 4 should be letters", or whether that is even possible with regex.
CodePudding user response:
Yes, it is possible. Please use the following regex.
(?=.{5,})\w*[a-z]\w*[a-z]\w*[a-z]\w*[a-z]\w*
Explanation
(?= : Lookahead assertion, asserts that the following regex matches
. : Any character
{5,} : Not less than 5 repetitions
) : Close lookahead
\w*[a-z]\w*[a-z]\w*[a-z]\w*[a-z]\w* : Four letters from a to z, with any number of word characters before, between and after them
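NOTE: The lookahead (?=.{5,}) asserts that at least 5 characters follow. Because the rest of the pattern is built from \w and [a-z], it only accepts word characters and only counts lowercase letters unless you add a case-insensitive flag.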
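For anyone who wants to try this pattern, here is a minimal sketch in Python (my choice of language, the question does not name one). The helper name is_valid is made up for illustration, and re.fullmatch is used so the pattern has to cover the whole string.

```python
import re

# Pattern copied from the answer above, unchanged.
PATTERN = re.compile(r"(?=.{5,})\w*[a-z]\w*[a-z]\w*[a-z]\w*[a-z]\w*")

def is_valid(s: str) -> bool:
    # fullmatch requires the pattern to match the entire string
    return PATTERN.fullmatch(s) is not None

print(is_valid("abcd1"))  # True: five characters, four letters
print(is_valid("abcd"))   # False: only four characters
print(is_valid("ab1c2"))  # False: only three letters
print(is_valid("abcd!"))  # False: "!" is not matched by \w
print(is_valid("ABCD1"))  # False: [a-z] is lowercase only
```

The last two results may or may not be what you want, so check them against your actual requirement.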
CodePudding user response:
You can also use this pattern:
(?i)(?=.*[a-z].*[a-z].*[a-z].*[a-z].*).{5,}
Here, the positive lookahead (?=.*[a-z].*[a-z].*[a-z].*[a-z].*) asserts that the string contains four letters (case does not matter because of (?i)), either adjacent or separated by other characters. Once that condition is met, .{5,} matches any string that is at least 5 characters long.
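A quick way to check this pattern, sketched in Python (an assumption on my part, since no language was given). The name is_valid is hypothetical, and re.fullmatch is used so that the whole string has to match.

```python
import re

# Pattern copied from the answer above; (?i) makes [a-z] case-insensitive.
PATTERN = re.compile(r"(?i)(?=.*[a-z].*[a-z].*[a-z].*[a-z].*).{5,}")

def is_valid(s: str) -> bool:
    return PATTERN.fullmatch(s) is not None

print(is_valid("AbCd!"))    # True: four letters plus one other character
print(is_valid("a-b-c-d"))  # True: letters do not have to be adjacent
print(is_valid("abc12"))    # False: only three letters
print(is_valid("abcd"))     # False: only four characters
```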
CodePudding user response:
What language or tool are you using?
This sounds like one of those things that doesn't need to be a single regex.
Here's "at least four letters"
[a-z].*[a-z].*[a-z].*[a-z]
and here's "at least five characters"
.{5,}
Or, if you're in a language like PHP, skip the regex for the length check entirely and be explicit:
strlen($str) >= 5
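To show the two-step idea end to end, here is a rough sketch in Python rather than PHP (the language is my assumption, and is_valid is a made-up helper name). It uses the "at least four letters" pattern from above with search, plus a plain length check.

```python
import re

# "At least four letters", as in the pattern above.
FOUR_LETTERS = re.compile(r"[a-z].*[a-z].*[a-z].*[a-z]")

def is_valid(s: str) -> bool:
    # Plain length check first, then look for four letters anywhere in the string
    return len(s) >= 5 and FOUR_LETTERS.search(s) is not None

print(is_valid("1abcd"))    # True
print(is_valid("a1b2c3d"))  # True
print(is_valid("abcd"))     # False: too short
print(is_valid("ab12c"))    # False: only three letters
```

Two small checks are usually easier to read and to change later than one combined pattern.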
CodePudding user response:
You can even do this without lookahead! Consider the following RegEx:
(.+[a-z].*[a-z].*[a-z].*[a-z].*)|(.*[a-z].+[a-z].*[a-z].*[a-z].*)|(.*[a-z].*[a-z].+[a-z].*[a-z].*)|(.*[a-z].*[a-z].*[a-z].+[a-z].*)|(.*[a-z].*[a-z].*[a-z].*[a-z].+)
Depending on your engine you may have to anchor this using ^ and $.
How it was generated: the quantifier that forces a non-empty gap is simply shifted through every position. The four letters are mandatory, but the fifth character can be at any position.
If possible, you should avoid using RegEx for this though, or combine a RegEx that checks whether four letters are present (.*[a-z].*[a-z].*[a-z].*[a-z].*) with a simple length check.
If you need exactly 4 characters to be letters, replace . with [^a-z].
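The "shift the quantifier through" construction can also be generated instead of typed out. The following is only a sketch in Python (my choice of language); build_pattern and is_valid are hypothetical helper names, and .+ is assumed to be the quantifier that marks the one gap that must not be empty.

```python
import re

def build_pattern(letter: str = "[a-z]", count: int = 4) -> str:
    # `count` letters leave count + 1 gaps; each alternative forces
    # exactly one of those gaps to be non-empty (.+ instead of .*).
    alternatives = []
    for forced in range(count + 1):
        gaps = [".+" if i == forced else ".*" for i in range(count + 1)]
        # Interleave: gap, letter, gap, letter, ..., gap
        body = gaps[0] + "".join(letter + gap for gap in gaps[1:])
        alternatives.append(f"({body})")
    return "|".join(alternatives)

PATTERN = re.compile(build_pattern())

def is_valid(s: str) -> bool:
    # fullmatch plays the role of anchoring with ^ and $
    return PATTERN.fullmatch(s) is not None

print(is_valid("ab1cd"))  # True: the extra character sits in the middle gap
print(is_valid("abcde"))  # True: a fifth letter also counts as the extra character
print(is_valid("abcd"))   # False: no fifth character
```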
If you can use regular grammars, this can be written way shorter:
S → %aA | .S'
S' → %aA' | .S'
A → %aB | .A'
A' → %aB' | .A'
B → %aC | .B'
B' → %aC' | .B'
C → %aD | .C'
C' → %aD' | .C'
D → .D'
D' → .D' | ε
where S is the start symbol, . stands for any character and %a for any letter. Five states are needed to keep track of how many letters have been read so far (zero to four); each state X also needs a counterpart X' to keep track of whether the additional fifth character has been read yet.
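The same bookkeeping can be written as a small state machine. This is a sketch in Python under my own assumptions: it is a deterministic version of the idea rather than a literal translation of the grammar, "letter" means a lowercase ASCII letter, and accepts is a made-up name. The state is the number of letters counted so far (capped at four) plus a flag that plays the role of the primed states.

```python
import string

LETTERS = set(string.ascii_lowercase)  # assumption: "letter" means a-z

def accepts(s: str) -> bool:
    letters = 0    # letters counted so far, capped at 4 (states S, A, B, C, D)
    extra = False  # True once an additional character was read (primed states)
    for ch in s:
        if ch in LETTERS and letters < 4:
            letters += 1
        else:
            # A non-letter, or a letter beyond the fourth, is the extra character
            extra = True
    # Accept only with four letters and at least one more character
    return letters == 4 and extra

print(accepts("abcd1"))    # True
print(accepts("a1b2c3d"))  # True
print(accepts("abcd12"))   # True
print(accepts("abcd"))     # False: no fifth character
print(accepts("ab1c2"))    # False: only three letters
```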
