How to detect keyword in String without spaces?

Time:01-22

Basically, my desired outcome is to split a string on known keywords regardless of whether whitespace separates them. Below is my current implementation; assume the parameter is String line = "sum:=5;":

private static String[] nextLineAsToken(String line) {
    return line.split("\\s (?=(:=|<|>|=))");
}

Expected:

String[] {"sum", ":=", "5;"};

Actual:

String[] {"sum:=5;"};

I have a feeling this isn't possible, but it would be great to hear from you guys. Thanks.

CodePudding user response:

Your main problem is that you wrote \s instead of \s*, which requires whitespace to be present before a split can happen instead of making it optional. The other problem is that your regex only splits before operators, never after them.

Use this regex:

\s*(?=(:=|<|>|(?<!:)=))|(?<=(=|<|>))\s*

Or as Java:

return line.split("\\s*(?=(:=|<|>|(?<!:)=))|(?<=(=|<|>))\\s*");

This uses a lookahead to split before operators and a lookbehind to split after them.

\s* has been added to consume any spaces between terms.

Note also the negative look behind (?<!:) within the look ahead to prevent splitting between : and =.
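
To try the regex above end to end, here is a minimal sketch that wraps it in the asker's nextLineAsToken method. The class name SplitDemo, the constant name, and the filter step are assumptions added for illustration. The filter is defensive: when whitespace surrounds an operator, a zero-width match right after the consumed whitespace may leave an empty string in the array, and dropping empties keeps the result to the three expected tokens.

```java
import java.util.Arrays;

class SplitDemo {
    // Regex from the answer above: split before an operator (lookahead) or
    // after one (lookbehind), consuming optional whitespace on either side.
    private static final String OPERATOR_BOUNDARY =
            "\\s*(?=(:=|<|>|(?<!:)=))|(?<=(=|<|>))\\s*";

    static String[] nextLineAsToken(String line) {
        return Arrays.stream(line.split(OPERATOR_BOUNDARY))
                // Assumption: drop any empty strings the split may produce.
                .filter(token -> !token.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(nextLineAsToken("sum:=5;")));
        System.out.println(Arrays.toString(nextLineAsToken("sum := 5;")));
    }
}
```

Both calls in main should yield the tokens sum, := and 5; whether or not the input contains spaces.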

CodePudding user response:

Here is example code that captures the parts of your input as regex groups. Whitespace around the operator is skipped, and each group is then printed in a for loop:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Example {
    public static void main(String[] args) {
        final String regex = "(\\w*)\\s*(:=)\\s*(\\d*;)";
        final String string = "sum:=5;";
        
        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);
        
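        // Each find() call locates the next statement matching the pattern.
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));

            // Groups 1..n hold the identifier, the operator, and the value.
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }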
    }
}

And this is the output:

Full match: sum:=5;
Group 1: sum
Group 2: :=
Group 3: 5;
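
The pattern above only fits lines shaped like identifier := number;. If the other operators from the question (<, > and =) should be handled too, one possible variation is to match tokens one at a time with Matcher.find() instead of capturing fixed groups. This is a sketch under assumptions, not part of the answer above: the class name TokenizerSketch, the method name tokenize, and the rule that any run of non-whitespace, non-operator characters counts as an operand are all illustrative choices.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenizerSketch {
    // Alternatives are tried left to right, so ":=" is tested before "=".
    // Any run of characters that is neither whitespace nor an operator
    // character is treated as an operand (an assumption of this sketch).
    // A lone ":" matches nothing and is skipped, as is whitespace.
    private static final Pattern TOKEN =
            Pattern.compile(":=|[<>=]|[^\\s<>=:]+");

    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(line);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("sum:=5;"));
        System.out.println(tokenize("x := y > 3;"));
    }
}
```

Because find() skips characters that match none of the alternatives, whitespace never appears in the result and no empty tokens need filtering.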