Regex to extract four digits using Java Pattern

Time:01-21

I'm trying to extract the four digits before the file extension using Java's Pattern and Matcher. It throws a "no group found" exception. Can someone help me with this?

String fileName = "20210101-000000_first_second_1234.csv";
Pattern pattern = Pattern.compile("\\\\d{4}");
System.out.println(pattern.matcher(fileName).group(4));

I would like to get 1234 from fileName. I compiled the pattern from the regex \\\\d{4}, which I expected to return four groups, so the fourth group should return 1234. Instead, it throws a "group not found" exception.

CodePudding user response:

The "\\\\d{4}" string literal defines a \\d{4} regex that matches a \dddd string (a backslash and then four d chars). You try to access Group 4, but there is no capturing group defined in your regex. Besides, you can't access match groups before actually running the matcher with Matcher#find or Matcher#matches.
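The failure modes described above can be reproduced with a small sketch. The class name GroupPitfalls and the helper tryGroup are hypothetical names introduced here for illustration; only the java.util.regex calls come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupPitfalls {
    // Hypothetical helper: returns the requested group, "no match" if find() fails,
    // or the simple name of the exception thrown by Matcher#group(int).
    static String tryGroup(String regex, String input, boolean callFind, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (callFind && !m.find()) {
            return "no match";
        }
        try {
            return m.group(group);
        } catch (IllegalStateException | IndexOutOfBoundsException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        String fileName = "20210101-000000_first_second_1234.csv";

        // "\\\\d{4}" is the regex \\d{4}: a literal backslash followed by four 'd' chars.
        System.out.println(tryGroup("\\\\d{4}", fileName, true, 0));  // no match
        System.out.println(tryGroup("\\\\d{4}", "\\dddd", true, 0));  // \dddd

        // group(...) before find()/matches() has no match to read from.
        System.out.println(tryGroup("\\d{4}", fileName, false, 4));   // IllegalStateException

        // After find(), group 4 still does not exist: the regex has no capturing groups.
        System.out.println(tryGroup("\\d{4}", fileName, true, 4));    // IndexOutOfBoundsException

        // Group 0 is the whole match, and the first match is the leading "2021".
        System.out.println(tryGroup("\\d{4}", fileName, true, 0));    // 2021
    }
}
```

Note that the last call returns 2021, not 1234: successive matches are not numbered groups, so the pattern itself has to pin the match to the end of the name.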

You can use

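import java.util.regex.Matcher;
import java.util.regex.Pattern;

String fileName = "20210101-000000_first_second_1234.csv";
// Four digits, followed by a dot and one or more non-dot chars up to the end of the string
Pattern pattern = Pattern.compile("\\d{4}(?=\\.[^.]+$)");
Matcher m = pattern.matcher(fileName);
if (m.find()) {
    System.out.println(m.group()); // 1234
}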

Details:

  • \d{4} - four digits
  • (?=\.[^.]+$) - a positive lookahead that requires a . char and then one or more chars other than . till end of string.

Note also the added Matcher m = pattern.matcher(fileName) line and the if (m.find()) check, which tests whether there is a match. Only when there is a match can the value be retrieved from group 0 with m.group().
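If you would rather read the digits from a numbered group, as the question attempted, one possible variant replaces the lookahead with an explicit capturing group. This is a sketch of an alternative, not the approach given above; the class name TrailingDigits and the method extract are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrailingDigits {
    // Group 1 captures four digits that are directly followed by ".<extension>" at the end.
    private static final Pattern FOUR_DIGITS_BEFORE_EXT =
            Pattern.compile("(\\d{4})\\.[^.]+$");

    // Hypothetical helper: returns the captured digits, or an empty Optional if nothing matches.
    static Optional<String> extract(String fileName) {
        Matcher m = FOUR_DIGITS_BEFORE_EXT.matcher(fileName);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extract("20210101-000000_first_second_1234.csv")); // Optional[1234]
        System.out.println(extract("no_digits_here.csv"));                    // Optional.empty
    }
}
```

Here group 1 exists because the regex contains one pair of capturing parentheses. Returning an Optional keeps the "no match" case explicit instead of letting an exception escape.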
