I need a regex pattern to validate a String in Java, so I built the pattern
[A-Z][a-z]*\s?[A-Z]?[a-z]*$ for these conditions:
- Should start with caps
- Every subsequent word should also start with caps
- No numbers included
- No two consecutive spaces allowed
Pattern.matches("[A-Z][a-z]*\s?[A-Z]?[a-z]*$","Joe V") returns false for me in Java.
But the same pattern returns true for the data "Joe V" in regexr.com.
What might be the cause?
CodePudding user response:
JavaScript has regex literals, whereas in Java a regex is written as an ordinary string literal. Java uses \ to introduce escape sequences in strings (like \n), so to get an actual \ character you have to escape it with another \. Any \ in your regex should therefore be written as \\ in Java source.
Thus your regex / code should be:
Pattern.matches("[A-Z][a-z]*\\s?[A-Z]?[a-z]*$", "Joe V")
which returns true.
P.S. Since Java 15, \s in a Java string literal is an escape sequence for a single space character; earlier versions reject it as an illegal escape.
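To illustrate the escaping rule, here is a minimal sketch. The class name EscapeDemo and the helper isValid are made up for this example; only Pattern.matches comes from the JDK. It prints the string the regex engine actually receives and checks a few inputs.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // In Java source, "\\s" is the two-character string \s,
    // which the regex engine then reads as "any whitespace character".
    static final String NAME_REGEX = "[A-Z][a-z]*\\s?[A-Z]?[a-z]*$";

    // Hypothetical helper: true if the whole input matches NAME_REGEX.
    static boolean isValid(String input) {
        return Pattern.matches(NAME_REGEX, input);
    }

    public static void main(String[] args) {
        System.out.println(NAME_REGEX);         // prints [A-Z][a-z]*\s?[A-Z]?[a-z]*$
        System.out.println(isValid("Joe V"));   // true
        System.out.println(isValid("Joe\tV"));  // true: \s also matches a tab
        System.out.println(isValid("joe V"));   // false: must start with A-Z
    }
}
```

Note that the regex class \s covers tabs and newlines as well as spaces, so it is broader than a literal space.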
CodePudding user response:
You can use
Pattern.matches("[A-Z][a-z]*(?:\\s[A-Z][a-z]*)*","Joe V")
Pattern.matches("\\p{Lu}\\p{Ll}*(?:\\s\\p{Lu}\\p{Ll}*)*","Joe V")
See the regex demo #1 and regex demo #2.
Note that .matches requires a full string match, hence the use of ^ and $ anchors on the testing site and their absence in the code.
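The full-match behaviour can be seen by comparing matches() with find() on the same Matcher. This is only a sketch; the class FullMatchDemo and its two helper methods are hypothetical names chosen for the example.

```java
import java.util.regex.Pattern;

public class FullMatchDemo {
    // Same pattern as above, without ^ and $ anchors.
    static final Pattern NAME = Pattern.compile("[A-Z][a-z]*(?:\\s[A-Z][a-z]*)*");

    // matches(): the entire input must fit the pattern.
    static boolean fullMatch(String s) {
        return NAME.matcher(s).matches();
    }

    // find(): it is enough that some substring fits the pattern.
    static boolean partialMatch(String s) {
        return NAME.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("Joe V"));        // true
        System.out.println(fullMatch("Joe V 42"));     // false: " 42" is left over
        System.out.println(partialMatch("Joe V 42"));  // true: "Joe V" is found inside
    }
}
```

If you validate with find() instead of matches(), you would need to add the ^ and $ anchors yourself.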
Details:
- ^ - start of string (implied in .matches)
- [A-Z] / \p{Lu} - an (Unicode) uppercase letter
- [a-z]* / \p{Ll}* - zero or more (Unicode) lowercase letters
- (?:\s[A-Z][a-z]*)* / (?:\s\p{Lu}\p{Ll}*)* - zero or more sequences of
  - \s - one whitespace
  - [A-Z][a-z]* / \p{Lu}\p{Ll}* - an uppercase (Unicode) letter and then zero or more (Unicode) lowercase letters
- $ - end of string (implied in .matches)
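As a quick check against the conditions in the question, the two patterns can be tried on a handful of inputs. This is a sketch under assumptions: the class NameValidator, its method names, and the sample strings are invented for illustration and do not come from the original post.

```java
import java.util.regex.Pattern;

public class NameValidator {
    // ASCII-only variant
    static final Pattern ASCII_NAME =
            Pattern.compile("[A-Z][a-z]*(?:\\s[A-Z][a-z]*)*");
    // Unicode-aware variant: \p{Lu} = uppercase letter, \p{Ll} = lowercase letter
    static final Pattern UNICODE_NAME =
            Pattern.compile("\\p{Lu}\\p{Ll}*(?:\\s\\p{Lu}\\p{Ll}*)*");

    static boolean isAsciiName(String s) {
        return ASCII_NAME.matcher(s).matches();
    }

    static boolean isUnicodeName(String s) {
        return UNICODE_NAME.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAsciiName("Joe V"));    // true
        System.out.println(isAsciiName("Joe  V"));   // false: two consecutive spaces
        System.out.println(isAsciiName("Joe v"));    // false: second word starts lowercase
        System.out.println(isAsciiName("Joe5"));     // false: digits are not allowed
        // \u00C9 is an accented uppercase E, outside A-Z but inside \p{Lu}
        System.out.println(isAsciiName("\u00C9mile Zola"));    // false
        System.out.println(isUnicodeName("\u00C9mile Zola"));  // true
    }
}
```

Because each word after the first must be preceded by exactly one \s, two consecutive spaces can never match, which covers the last condition without any extra lookahead.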
