How to prevent the same letter in a row in regex

Time:01-20

I am supposed to write a regex that matches only strings in which U and W alternate (starting with either letter), or the empty string. I am only allowed to use ?, *, , | and (). If I write (WU)*|(UW)*, then it matches neither a single U or W nor UWU. But if I add a U or W to it, then it already matches too much. I'm sure the solution is simple, but I just can't figure it out.

So allowed is:

  • U

  • W

  • UWU

  • WUWU

not valid:

  • UU

  • WW

  • WWUUU

  • UUW

CodePudding user response:

W?(UW)*U?|U?(WU)*W?

This matches either an optional W followed by zero or more UW pairs and an optional U, or an optional U followed by zero or more WU pairs and an optional W. Because every part is optional, including the first character, the empty string matches as well.
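
As a quick sanity check, the pattern can be run against the strings from the question. This is only a sketch in Python (an assumption, since the question names no language). It uses re.fullmatch so that the whole string has to match, and the names PATTERN and is_alternating are made up for illustration.

```python
import re

# Pattern from the answer above; fullmatch makes it cover the whole string.
PATTERN = re.compile(r"W?(UW)*U?|U?(WU)*W?")

def is_alternating(s: str) -> bool:
    """Return True if s is empty or strictly alternates between U and W."""
    return PATTERN.fullmatch(s) is not None

# Example strings taken from the question, plus the empty string.
allowed = ["", "U", "W", "UWU", "WUWU"]
not_valid = ["UU", "WW", "WWUUU", "UUW"]

for s in allowed:
    print(repr(s), is_alternating(s))    # True for each
for s in not_valid:
    print(repr(s), is_alternating(s))    # False for each
```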

CodePudding user response:

You can use

^U?(WU)*W?$


Note that without anchors you can't make sure the regex matches the full string, if all requirements have to go into the pattern itself. Alternatively, many languages provide functions that only succeed when the whole string matches, to name a few:

  • C++ - std::regex_match
  • Python - re.fullmatch
  • Java - String.matches / Pattern.matches
  • Kotlin - kotlin.text.matches or Regex.matchEntire
  • Groovy - with the ==~ operator
  • Scala - same as in Java, or with a match block if the regex is not declared as .unanchored
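
To illustrate the difference, here is a small sketch using Python's re module, one of the options listed above. The variable names are hypothetical. It compares an unanchored search, an anchored search, and re.fullmatch on the invalid string UU.

```python
import re

UNANCHORED = r"U?(WU)*W?"
ANCHORED = r"^U?(WU)*W?$"

# re.search looks for a match anywhere, so without anchors the pattern
# "accepts" UU by matching only its first U.
search_unanchored = bool(re.search(UNANCHORED, "UU"))

# With the anchors in the pattern, the same call rejects UU.
search_anchored = bool(re.search(ANCHORED, "UU"))

# re.fullmatch requires the whole string to match, so no anchors are needed.
full_invalid = bool(re.fullmatch(UNANCHORED, "UU"))
full_valid = bool(re.fullmatch(UNANCHORED, "UWU"))

print(search_unanchored, search_anchored, full_invalid, full_valid)
# True False False True
```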

Pattern details:

  • ^ - start of string
  • U? - an optional U
  • (WU)* - zero or more sequences of WU
  • W? - an optional W char
  • $ - end of string.

Note: in Ruby (Onigmo regex library), ^ and $ match at line boundaries rather than only at the string boundaries, so you need the \A and \z anchors instead, i.e.

\AU?(WU)*W?\z
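
The effect can be imitated in Python, which is used here only as a stand-in for Ruby. This sketch assumes that the re.MULTILINE flag gives ^ and $ a line-based meaning similar to the one they have in Ruby, and it uses \Z, which is how Python's re module spells Ruby's \z. The variable names are made up for illustration.

```python
import re

# Assumption: re.MULTILINE is used only to mimic Ruby's line-based ^ and $.
line_anchored = re.compile(r"^U?(WU)*W?$", re.MULTILINE)

# \A and \Z match only at the very start and very end of the string.
string_anchored = re.compile(r"\AU?(WU)*W?\Z")

text = "UU\nW"  # invalid as a whole, but its second line is a valid "W"

line_result = bool(line_anchored.search(text))      # second line matches
string_result = bool(string_anchored.search(text))  # whole string does not

print(line_result, string_result)
# True False
```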