I am working with Java regexes, but I guess the principles apply to every regex flavor.
I have these requirements for the segment a regex should match:
- have at least 3 times 'a'
- have at least 3 times 'b'
- occurrences of 'a' and 'b' can be in any order
Inspired by this post I came up with the following regex (regex101):
(?=([b]*[a]){3})(?=([a]*[b]){3})[ab]
I am struggling with adding a new requirement:
- Match if there is no or at least 3 'c'
- as above, 'c' can occur anywhere in the segment
Examples for valid sequences:
aaabbb
ababab
aaabbbccc
abcabcabc
ababcabcc
Examples for invalid sequences (as a whole):
aaabbbc
aabbb
abbccc
abcabca
My thoughts so far:
Having at least 3 'c'
(?=([bc]*[a]){3})(?=([ac]*[b]){3})(?=([ab]*[c]){3,})[abc]
Combining this and the above solution in a crude manner (regex101), which is basically just a large "either none or at least 3":
((?=([bc]*[a]){3})(?=([ac]*[b]){3})(?=([ab]*[c]){3,})[abc] |(?=([b]*[a]){3})(?=([a]*[b]){3})[ab] )
Finally, the question: Is there a better way to achieve this using other methods, like or-ing the 'c'-requirement look-ahead, nested look-aheads or something entirely different?
CodePudding user response:
(?=^(?:.*a){3}.*$)(?=^(?:.*b){3}.*$)(?=^(?:.*c){3}.*$|^[^c]*$).*
Short Explanation
- (?=^(?:.*a){3}.*$) Assert that the string contains at least 3 'a'
- (?=^(?:.*b){3}.*$) Assert that the string contains at least 3 'b'
- (?=^(?:.*c){3}.*$|^[^c]*$) Assert that the string contains at least 3 'c' or that the string does not contain any 'c'
- .* Match the whole string that passes all assertions
Also, see the regex demo and Java example
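As a minimal sketch of how this pattern could be checked against the sample sequences from the question (the class name AbcLookaheadDemo and the helper isValid are hypothetical names chosen for illustration, not part of the answer):

```java
import java.util.regex.Pattern;

class AbcLookaheadDemo {
    // Pattern copied verbatim from the answer above.
    static final Pattern P = Pattern.compile(
        "(?=^(?:.*a){3}.*$)(?=^(?:.*b){3}.*$)(?=^(?:.*c){3}.*$|^[^c]*$).*");

    // Hypothetical helper: true if the whole input passes all three assertions.
    static boolean isValid(String s) {
        return P.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "aaabbb", "ababab", "aaabbbccc", "abcabcabc", "ababcabcc", // valid in the question
            "aaabbbc", "aabbb", "abbccc", "abcabca"                    // invalid in the question
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Because the pattern uses `.*`, it only counts occurrences and does not restrict the alphabet: in this sketch an input such as "aaabbbx" is also accepted.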
CodePudding user response:
You could assert 3 times a and 3 times b, and then optionally match at least 3 times a c
Add anchors ^ and $ to assert the start and the end of the string.
Note that you don't have to put a single char like [a] in a character class:
^(?=([bc]*a){3})(?=([ca]*b){3})[ab]*(?:c[ab]*c[ab]*c[abc]*)?$
Explanation
- ^ Start of string
- (?=([bc]*a){3}) Assert 3 times an 'a' char
- (?=([ca]*b){3}) Assert 3 times a 'b' char
- [ab]* Match optional chars 'a' or 'b'
- (?: Non capture group
- c[ab]*c[ab]*c Match 3 times a 'c' char
- [abc]* Match optional 'a', 'b' and 'c' chars
- )? Close the non capture group and make it optional
- $ End of string
As you don't really need the capture groups, you can use non capture groups (?: instead for the repetition:
^(?=(?:[bc]*a){3})(?=(?:[ca]*b){3})[ab]*(?:c[ab]*c[ab]*c[abc]*)?$
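A short sketch of how the anchored pattern might be used from Java, again run against the question's samples (AnchoredAbcDemo and isValid are hypothetical names used only for this illustration):

```java
import java.util.regex.Pattern;

class AnchoredAbcDemo {
    // Non-capturing variant of the pattern from the answer above.
    static final Pattern P = Pattern.compile(
        "^(?=(?:[bc]*a){3})(?=(?:[ca]*b){3})[ab]*(?:c[ab]*c[ab]*c[abc]*)?$");

    // Hypothetical helper: the whole input must match the pattern.
    static boolean isValid(String s) {
        return P.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "aaabbb", "ababab", "aaabbbccc", "abcabcabc", "ababcabcc", // valid in the question
            "aaabbbc", "aabbb", "abbccc", "abcabca"                    // invalid in the question
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Since every consuming part of this pattern is limited to the characters a, b and c, an input containing anything else (for example "aaabbbx") is rejected in this sketch.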
CodePudding user response:
You can use
(?<![abc]) # No "a", "b", "c" allowed immediately on the left
(?=(?:[bc]*a){3}) # At least three "a"s
(?=(?:[ac]*b){3}) # At least three "b"s
(?: # Either
(?=[ab]*(?![abc])) # only "a" or "b"s allowed until a location not followed with "a", "b" or "c"
| # or
(?=(?:[ab]*c){3}) # At least three "c"s
)
[abc]+ # Match and consume one or more "a", "b" or "c" chars
See the regex demo.
As a single line:
(?<![abc])(?=(?:[bc]*a){3})(?=(?:[ac]*b){3})(?:(?=[ab]*(?![abc]))|(?=(?:[ab]*c){3}))[abc]+
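Unlike the anchored patterns, this one is meant to pick segments out of a longer text. The following is a rough sketch of how that could look in Java, using Pattern.COMMENTS so the commented layout can be kept. It assumes the final character class carries a `+` quantifier, as the "one or more" comment suggests. SegmentFinderDemo and findSegments are hypothetical names, and the sample text is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SegmentFinderDemo {
    // Verbose form of the pattern. Pattern.COMMENTS makes the regex engine
    // ignore whitespace and '#' comments. The trailing '+' is an assumption
    // based on the "one or more" comment in the answer.
    static final Pattern P = Pattern.compile(
          "(?<![abc])              # no a, b or c immediately on the left\n"
        + "(?=(?:[bc]*a){3})       # at least three a\n"
        + "(?=(?:[ac]*b){3})       # at least three b\n"
        + "(?:                     # either\n"
        + "  (?=[ab]*(?![abc]))    #   only a and b up to the end of the segment\n"
        + "  |                     # or\n"
        + "  (?=(?:[ab]*c){3})     #   at least three c\n"
        + ")\n"
        + "[abc]+                  # consume the whole segment\n",
        Pattern.COMMENTS);

    // Hypothetical helper: collect every qualifying segment found in the text.
    static List<String> findSegments(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = P.matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample text: two qualifying segments and three that do not qualify.
        System.out.println(findSegments("aaabbb xx aaabbbc abcabcabc aabbb"));
    }
}
```

With this sample text the sketch finds the segments aaabbb and abcabcabc. The segment aaabbbc is skipped because it contains only one c, and the lookbehind prevents a match from starting in the middle of it.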
