Find a matching substring using a Java regular expression

Time:02-04

In Java, consider the following strings. The input can be any one of them:

        "59USD-300kg-25mb_4G-48p/min(Incl. tax)-70y-2gb 1gb_Toffee-1km(mm/hr)" OR
        "59USD-300kg-25mb_4G-48p/min(Incl. tax)-70y-2gb 1gb_Toffee" OR
        "59USD-300kg-25mb_4G-48p/min(Incl. tax)-70y" OR
        "59USD-300kg-25mb_4G-48p/min(Incl. tax)" OR
        "59USD-300kg-25mb_4G" OR
        "59USD-300kg" OR
        "59USD"

The hyphen (-) separates the parts of each string.

Given a keyword as a parameter, I want to get the part of the string that contains it, like this:

        String sourceStr = "59USD-300kg-25mb_4G-48p/min(Incl. tax)-70y-2gb 1gb_Toffee-1km(mm/hr)";
        // keyword: "USD", expected result:
        String expectString = "59USD";


        String sourceStr = "59USD-300kg-25mb_4G-48p/min(Incl. tax)-70y-2gb 1gb_Toffee";
        // keyword: "gb", expected result:
        String expectString = "2gb 1gb_Toffee";

        String sourceStr = "59USD-300kg-25mb_4G-48p/min(Incl. tax)";
        // keyword: "min", expected result:
        String expectString = "48p/min(Incl. tax)";

CodePudding user response:

If you are comfortable with regular expressions, you could do this:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    void lookForKeyword(String message, String keyword) {
        System.out.println("Looking for keyword \"" + keyword + "\" in string \"" + message + "\"");
        // Group 1 captures the whole dash-delimited part that contains the keyword.
        String pattern = "^.*?-?([^-]*" + keyword + "[^-]*)-?.*$";
        Matcher matcher = Pattern.compile(pattern).matcher(message);
        if (matcher.matches()) {
            System.out.println("Found : \"" + matcher.group(1) + "\"");
        }
    }

    void test() {
        lookForKeyword("59USD-300kg-25mb_4G-48p/min(Incl. tax)-70y-2gb 1gb_Toffee-1km(mm/hr)", "USD");
        lookForKeyword("59USD-300kg-25mb_4G-48p/min(Incl. tax)-70y-2gb 1gb_Toffee", "gb");
        lookForKeyword("59USD-300kg-25mb_4G-48p/min(Incl. tax)", "min");
    }

Output :

Looking for keyword "USD" in string "59USD-300kg-25mb_4G-48p/min(Incl. tax)-70y-2gb 1gb_Toffee-1km(mm/hr)"
Found : "59USD"
Looking for keyword "gb" in string "59USD-300kg-25mb_4G-48p/min(Incl. tax)-70y-2gb 1gb_Toffee"
Found : "2gb 1gb_Toffee"
Looking for keyword "min" in string "59USD-300kg-25mb_4G-48p/min(Incl. tax)"
Found : "48p/min(Incl. tax)"
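One caveat with building the pattern by concatenation is that a keyword containing regex metacharacters, such as "(mm/hr)", would be interpreted as pattern syntax. The following is a minimal sketch of a variant that quotes the keyword with Pattern.quote and returns the result instead of printing it. The class name SegmentFinder and the method findSegment are hypothetical, and the sketch assumes the keyword itself contains no dash.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SegmentFinder {

    // Hypothetical helper: returns the first dash-separated part containing the keyword.
    // Assumes the keyword contains no dash.
    public static Optional<String> findSegment(String message, String keyword) {
        // Pattern.quote makes the keyword match literally, so "(" or "." are not regex syntax.
        Pattern pattern = Pattern.compile("(?:^|-)([^-]*" + Pattern.quote(keyword) + "[^-]*)");
        Matcher matcher = pattern.matcher(message);
        // find() scans for the first part that matches; no need to match the whole string.
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String s = "59USD-300kg-25mb_4G-48p/min(Incl. tax)-70y-2gb 1gb_Toffee-1km(mm/hr)";
        System.out.println(findSegment(s, "(mm/hr)")); // prints Optional[1km(mm/hr)]
        System.out.println(findSegment(s, "gb"));      // prints Optional[2gb 1gb_Toffee]
        System.out.println(findSegment(s, "EUR"));     // prints Optional.empty
    }
}
```

Returning an Optional makes the "no match" case explicit for the caller, which the printing version silently skips.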

CodePudding user response:

A simple solution is to split the string around the dashes ("-"), then iterate over the parts and check each one for your keyword. You have to decide what to do when there are multiple matches or no matches at all. The following code contains two basic implementations: one collects all matches in a list, and one stops after the first match. The first returns an empty list when there are no matches; the second returns null.

import java.util.ArrayList;
import java.util.List;

public class KeywordMatcher {

    private static List<String> getKeywordMatches(String s, String keyword) {
        List<String> ret = new ArrayList<>();
        String[] parts = s.split("-");
        for (String part : parts) {
            if(part.contains(keyword))
                ret.add(part);
        }
        
        return ret;
    }

    private static String getFirstKeywordMatch(String s, String keyword) {
        String[] parts = s.split("-");
        for (String part : parts) {
            if(part.contains(keyword))
                return part;
        }
        
        return null;
    }

    public static void main(String[] args) {
        String s ="59USD-300kg-25mb_4G-48p/min(Incl. tax)-70y-2gb 1gb_Toffee-65USD-1km(mm/hr)";
        
        System.out.println(getKeywordMatches(s, "USD")); // prints [59USD, 65USD]
        System.out.println(getKeywordMatches(s, "min")); // prints [48p/min(Incl. tax)]

        System.out.println(getFirstKeywordMatch(s, "USD")); // prints 59USD
        System.out.println(getFirstKeywordMatch(s, "min")); // prints 48p/min(Incl. tax)

    }
}

A more sophisticated approach searches the string directly for the next divider ("-") and the next keyword occurrence. Walking the string to its end while tracking the relative positions of dividers and keywords gives the same result with less memory, because it avoids the array and the per-part substrings that split allocates. However, the implementation can be cumbersome and hard to read, so I suggest the split-based version above unless you have specific performance requirements or the strings to be searched are megabytes in size.
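As a rough illustration of the index-based idea, here is a simplified sketch rather than a full divider-by-divider scan. It assumes the keyword is non-empty and contains no dash, so every occurrence lies inside a single part. It locates the first keyword occurrence with indexOf and then expands to the neighbouring dividers. The names SegmentScanner and firstSegmentContaining are hypothetical.

```java
public class SegmentScanner {

    // Hypothetical helper: returns the first dash-separated part containing the keyword,
    // or null if there is none. Assumes a non-empty keyword without dashes.
    public static String firstSegmentContaining(String s, String keyword) {
        if (keyword.isEmpty() || keyword.indexOf('-') >= 0) {
            throw new IllegalArgumentException("keyword must be non-empty and contain no dash");
        }
        int hit = s.indexOf(keyword);                      // first occurrence of the keyword
        if (hit < 0) {
            return null;                                   // keyword not present at all
        }
        int start = s.lastIndexOf('-', hit) + 1;           // just after the previous divider, or 0
        int end = s.indexOf('-', hit + keyword.length());  // next divider after the keyword
        if (end < 0) {
            end = s.length();                              // keyword is in the last part
        }
        return s.substring(start, end);                    // the only allocation: the result itself
    }

    public static void main(String[] args) {
        String s = "59USD-300kg-25mb_4G-48p/min(Incl. tax)-70y-2gb 1gb_Toffee-1km(mm/hr)";
        System.out.println(firstSegmentContaining(s, "min")); // prints 48p/min(Incl. tax)
        System.out.println(firstSegmentContaining(s, "gb"));  // prints 2gb 1gb_Toffee
        System.out.println(firstSegmentContaining(s, "EUR")); // prints null
    }
}
```

Because the first occurrence of the keyword always lies in the first part that contains it, this returns the same value as getFirstKeywordMatch for keywords without dashes.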

CodePudding user response:

If the structure of the input string is consistent, it can be described by a regular expression with named groups, and the group names can then be used to get the appropriate "field" from the matched string.

The pattern for a group is as follows: (?<USD>[^-]+). The name of the group goes in angle brackets, and [^-]+ matches one or more non-dash characters.

The first group is followed by N nested optional named groups.

String[] strs = {
    "59USD-300kg-25mb_4G-48p/min(Incl. tax)-70y-2gb 1gb_Toffee-1km(mm/hr)",
    "59USD-300kg-25mb_4G-48p/min(Incl. tax)-70y-2gb 1gb_Toffee",
    "59USD-300kg-25mb_4G-48p/min(Incl. tax)-70y",
    "59USD-300kg-25mb_4G-48p/min(Incl. tax)",
    "59USD-300kg-25mb_4G",
    "59USD-300kg",
    "59USD"
};
Pattern data = Pattern.compile("(?<USD>[^-]+)(-(?<kg>[^-]+)(-(?<mb>[^-]+)(-(?<min>[^-]+)(-(?<y>[^-]+)(-(?<gb>[^-]+)(-(?<km>[^-]+))?)?)?)?)?)?");
for (String str : strs) {
    Matcher m = data.matcher(str);
    if (m.matches()) {
        System.out.println(str);
        System.out.println("\tUSD:\t" + m.group("USD"));
        System.out.println("\tkg :\t" + m.group("kg"));
        System.out.println("\tmb :\t" + m.group("mb"));
        System.out.println("\tmin:\t" + m.group("min"));
        System.out.println("\ty  :\t" + m.group("y"));
        System.out.println("\tgb :\t" + m.group("gb"));
        System.out.println("\tkm :\t" + m.group("km"));
        System.out.println("----");
    }
}

Output:

59USD-300kg-25mb_4G-48p/min(Incl. tax)-70y-2gb 1gb_Toffee-1km(mm/hr)
    USD:    59USD
    kg :    300kg
    mb :    25mb_4G
    min:    48p/min(Incl. tax)
    y  :    70y
    gb :    2gb 1gb_Toffee
    km :    1km(mm/hr)
----
59USD-300kg-25mb_4G-48p/min(Incl. tax)-70y-2gb 1gb_Toffee
    USD:    59USD
    kg :    300kg
    mb :    25mb_4G
    min:    48p/min(Incl. tax)
    y  :    70y
    gb :    2gb 1gb_Toffee
    km :    null
----
59USD-300kg-25mb_4G-48p/min(Incl. tax)-70y
    USD:    59USD
    kg :    300kg
    mb :    25mb_4G
    min:    48p/min(Incl. tax)
    y  :    70y
    gb :    null
    km :    null
----
59USD-300kg-25mb_4G-48p/min(Incl. tax)
    USD:    59USD
    kg :    300kg
    mb :    25mb_4G
    min:    48p/min(Incl. tax)
    y  :    null
    gb :    null
    km :    null
----
59USD-300kg-25mb_4G
    USD:    59USD
    kg :    300kg
    mb :    25mb_4G
    min:    null
    y  :    null
    gb :    null
    km :    null
----
59USD-300kg
    USD:    59USD
    kg :    300kg
    mb :    null
    min:    null
    y  :    null
    gb :    null
    km :    null
----
59USD
    USD:    59USD
    kg :    null
    mb :    null
    min:    null
    y  :    null
    gb :    null
    km :    null
----
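To tie the named groups back to the keyword lookup in the question, the keyword can be passed directly as the group name. The sketch below assumes the keyword is exactly one of the group names in the pattern (USD, kg, mb, min, y, gb, km). The class NamedGroupLookup and the method fieldFor are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupLookup {

    // Same pattern as above: one required group followed by nested optional groups.
    private static final Pattern DATA = Pattern.compile(
        "(?<USD>[^-]+)(-(?<kg>[^-]+)(-(?<mb>[^-]+)(-(?<min>[^-]+)(-(?<y>[^-]+)(-(?<gb>[^-]+)(-(?<km>[^-]+))?)?)?)?)?)?");

    // Hypothetical helper: returns the field named by the keyword, or null when the
    // input does not match, the field is absent, or the keyword is not a group name.
    public static String fieldFor(String input, String keyword) {
        Matcher m = DATA.matcher(input);
        if (!m.matches()) {
            return null;
        }
        try {
            return m.group(keyword); // null if the optional group did not participate
        } catch (IllegalArgumentException e) {
            return null;             // no group with that name in the pattern
        }
    }

    public static void main(String[] args) {
        String s = "59USD-300kg-25mb_4G-48p/min(Incl. tax)";
        System.out.println(fieldFor(s, "min")); // prints 48p/min(Incl. tax)
        System.out.println(fieldFor(s, "gb"));  // prints null
    }
}
```

Unlike the substring-based approaches, this lookup is positional: "gb" always means the sixth field, regardless of what text that field contains.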