I have two strings, e.g.:
Long sentences may be used for several reasons: To develop tension. While a short sentence is the ultimate sign of the tension, long sentences could be used to develop this tension to a point of culmination. To give vivid descriptions.
Long sentences may be used for several reasons: To develop tension. While a short sentence is the sign ultimate of the tension, long sentences could be used to develop this tension to a point of culmination. To give vivid descriptions.
In the second string, the word ultimate has changed its position.
if (string1.equalsIgnoreCase(string2)) returns false, but I want the result to be true, since the contents of the strings are the same (even though the order is not).
CodePudding user response:
You could count the occurrences of every word in each String and compare the results:
String phrase = "Long sentences may be used for several reasons: To develop tension. While a short sentence is the ultimate sign of the tension, long sentences could be used to develop this tension to a point of culmination. To give vivid descriptions.";
String phrase2 = "Long sentences may be used for several reasons: To develop tension. While a short sentence is the sign ultimate of the tension, long sentences could be used to develop this tension to a point of culmination. To give vivid descriptions.";
// Split on runs of non-word characters, then count how often each word occurs
Map<String,Long> wordCount = Arrays.stream(phrase.toLowerCase().split("\\W+"))
    .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
Map<String,Long> wordCount2 = Arrays.stream(phrase2.toLowerCase().split("\\W+"))
    .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
System.out.println(wordCount.equals(wordCount2));
- The first step is to apply toLowerCase() to your String. Remove this step if you want your comparison to be case-sensitive. "Hello world" => "hello world"
- Then you split() the String around the matches of the following regex: \W+ to obtain an array. This regex matches one or more non-word characters. "hello world" => ["hello", "world"]
- You call Arrays.stream() on this array to get a Stream.
- You collect the elements of the Stream using Collectors.groupingBy() to associate every word with its number of occurrences. Function.identity() is a function that returns its input. {"hello": 1, "world": 1}
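The steps above can also be wrapped in a reusable method. This is only a minimal sketch: the names WordBag, countWords, and sameWords are made up for illustration and are not part of the answer. It additionally drops the empty token that split() produces when the text starts with punctuation or whitespace, which would otherwise be counted as a word.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class WordBag {

    // Hypothetical helper: maps each lower-cased word to its number of occurrences.
    public static Map<String, Long> countWords(String text) {
        return Arrays.stream(text.toLowerCase().split("\\W+"))
                // split() yields a leading "" when the text starts with a non-word character
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Hypothetical helper: true when both texts contain the same words the same number of times.
    public static boolean sameWords(String a, String b) {
        return countWords(a).equals(countWords(b));
    }

    public static void main(String[] args) {
        System.out.println(sameWords("the ultimate sign", "The sign ultimate")); // true
        System.out.println(sameWords("the sign", "the sign sign"));              // false
    }
}
```

Note that, by default, \W in Java treats anything outside [a-zA-Z_0-9] as a separator, so accented letters would split a word unless the pattern is compiled with Pattern.UNICODE_CHARACTER_CLASS.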
CodePudding user response:
This is an alternative version of Tom's answer. It should be faster for smallish strings, since it does not spend time and memory counting words. Tom's version will likely be better for very long strings (which tend to contain many duplicate words), because comparing word counts should be much faster than comparing the full text once it reaches the million-word range or so.
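public static boolean equalWordsAndCountsIgnoringOrder(String s1, String s2) {
    // Split on runs of non-word characters to get the lower-cased words
    String[] w1 = s1.toLowerCase().split("\\W+");
    String[] w2 = s2.toLowerCase().split("\\W+");
    // Sorting puts equal words in the same positions in both arrays
    Arrays.sort(w1);
    Arrays.sort(w2);
    return Arrays.equals(w1, w2);
}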
We first convert the strings into arrays with their words lower-cased. Then we sort those arrays. Finally, we test that the exact same words appear in the exact same places.
Usage:
System.out.println(equalWordsAndCountsIgnoringOrder(phrase, phrase2));
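If most of the inputs you compare differ in their number of words, one possible tweak is to check the array lengths before sorting. The sketch below is a variation on the method above, not a replacement for it; SortedWords and equalWordsFast are hypothetical names, and whether the early exit pays off depends on your data.

```java
import java.util.Arrays;

public class SortedWords {

    // Hypothetical variant: same comparison as above, plus a length check before sorting.
    public static boolean equalWordsFast(String s1, String s2) {
        String[] w1 = s1.toLowerCase().split("\\W+");
        String[] w2 = s2.toLowerCase().split("\\W+");
        if (w1.length != w2.length) {
            return false; // different word totals can never match, so skip both sorts
        }
        Arrays.sort(w1);
        Arrays.sort(w2);
        return Arrays.equals(w1, w2);
    }

    public static void main(String[] args) {
        System.out.println(equalWordsFast("a short sentence", "Sentence a short"));       // true
        System.out.println(equalWordsFast("a short sentence", "a short short sentence")); // false
    }
}
```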
