I'm trying to remove all characters from a String, except
- numbers
- letters (ignoring case)
- commas (,) and
- dashes (-)
Currently I have a chain of regex replacements that works:
.replaceAll("\\[", "").replaceAll("\\]", "").replaceAll("\"", "")
Example
Given that input with UUIDs formatted as string:
["edfcb406-b37e-4729-899e-93ea2af71e83","98e75d74-abe2-4c08-8340-06aa9b2faf0b"]
I want as expected result:
edfcb406-b37e-4729-899e-93ea2af71e83,98e75d74-abe2-4c08-8340-06aa9b2faf0b
Is there a better regex to get the expected result?
CodePudding user response:
Use Wiktor's commented solution.
Regular expression to clean UUIDs
[^-,a-zA-Z0-9]
Replacing every match with an empty string removes all non-UUID characters (the separating commas are kept).
The regex matches using a negated (^) character class of allowed UUID characters:
- dashes -
- commas , (to allow a comma-separated list of UUIDs)
- 26 lowercase a-z and uppercase A-Z alphabetic letters
- digits 0-9
Because these characters are excluded from the match, they are kept, while all other (dirty / noisy) characters are replaced with the empty string and thus removed. The result is a comma-separated list of UUIDs (CSV).
See the demo on IDEone:
String input = "[\"edfcb406-b37e-4729-899e-93ea2af71e83\",\"98e75d74-abe2-4c08-8340-06aa9b2faf0b\"]";
System.out.println("Input: " + input);
String ouputAttempt = input.replaceAll("\\[", "").replaceAll("\\]", "").replaceAll("\"", "");
System.out.println("ouputAttempt: " + ouputAttempt);
String cleanedUUIDs = input.replaceAll("[^-,a-zA-Z0-9]", "");
System.out.println("cleanedUUIDs: " + cleanedUUIDs);
Prints expected:
Input: ["edfcb406-b37e-4729-899e-93ea2af71e83","98e75d74-abe2-4c08-8340-06aa9b2faf0b"]
ouputAttempt: edfcb406-b37e-4729-899e-93ea2af71e83,98e75d74-abe2-4c08-8340-06aa9b2faf0b
cleanedUUIDs: edfcb406-b37e-4729-899e-93ea2af71e83,98e75d74-abe2-4c08-8340-06aa9b2faf0b
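For the sample input both approaches give the same output, but they differ once the input contains other characters. The following is only a minimal sketch to illustrate that difference; the class name CleanDemo, the method names and the input with a space after the comma are made up for this illustration.

```java
public class CleanDemo {

    // Blacklist: remove only the three characters the question strips: [ ] and "
    public static String stripBracketsAndQuotes(String input) {
        return input.replaceAll("[\\[\\]\"]", "");
    }

    // Whitelist: remove everything that is not a dash, comma, letter or digit
    public static String keepUuidChars(String input) {
        return input.replaceAll("[^-,a-zA-Z0-9]", "");
    }

    public static void main(String[] args) {
        // hypothetical input: same UUIDs as above, but with a space after the comma
        String input = "[\"edfcb406-b37e-4729-899e-93ea2af71e83\", \"98e75d74-abe2-4c08-8340-06aa9b2faf0b\"]";
        System.out.println(stripBracketsAndQuotes(input)); // the space after the comma survives
        System.out.println(keepUuidChars(input));          // the space is removed as well
    }
}
```

Which one fits depends on the input: the blacklist leaves unexpected characters in place, while the whitelist silently drops them.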
Bonus: comma-separated values to a collection
Then you can split the result string (by comma) to get
- a String[] array of UUIDs or
- a List (both may contain duplicates) or
- a Set (for only unique values)
like:
String[] arrayOfUUIDs = cleanedUUIDs.split(",");
List<String> sequenceOfUUIDs = Arrays.asList(arrayOfUUIDs); // ordered, but not unique (may contain duplicates)
Set<String> uniqueUUIDs = new HashSet<>(sequenceOfUUIDs); // unique, but not ordered (Set.of(...) would throw IllegalArgumentException on duplicates)
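If you need the values to be both unique and in their original order, a LinkedHashSet may help. This is just a self-contained sketch under that assumption; the class name SplitDemo and its helper methods are hypothetical, and the short tokens in main stand in for real UUIDs.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SplitDemo {

    // Split a comma-separated string into a list: keeps order and duplicates
    public static List<String> toList(String csv) {
        return Arrays.asList(csv.split(","));
    }

    // Collect into a LinkedHashSet: duplicates are dropped, first-seen order is kept
    public static Set<String> toOrderedSet(String csv) {
        return new LinkedHashSet<>(toList(csv));
    }

    public static void main(String[] args) {
        // placeholder tokens instead of full UUIDs, with one duplicate
        String cleaned = "aaa-1,bbb-2,aaa-1";
        System.out.println(toList(cleaned));       // three entries, duplicate included
        System.out.println(toOrderedSet(cleaned)); // two entries, in order of first appearance
    }
}
```

Note that splitting an empty string yields an array with one empty string, so you may want to check for empty input before splitting.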
See also:
- Searching for UUIDs in text with regex
- iHateRegex: Regex for uuid
- Regex Pattern: UUID & GUID Regular Expression
- Java docs: class java.util.UUID, and its factory method UUID.fromString(s) to parse a UUID from a String
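As a rough sketch of how UUID.fromString could be combined with the cleaned string: each comma-separated token is parsed into a java.util.UUID, which also acts as a validity check. The class name UuidParseDemo and the method parseAll are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class UuidParseDemo {

    // Parse every comma-separated token into a java.util.UUID.
    // UUID.fromString throws IllegalArgumentException for a malformed token.
    public static List<UUID> parseAll(String csv) {
        List<UUID> result = new ArrayList<>();
        for (String token : csv.split(",")) {
            result.add(UUID.fromString(token));
        }
        return result;
    }

    public static void main(String[] args) {
        String cleanedUUIDs = "edfcb406-b37e-4729-899e-93ea2af71e83,98e75d74-abe2-4c08-8340-06aa9b2faf0b";
        for (UUID uuid : parseAll(cleanedUUIDs)) {
            System.out.println(uuid);
        }
    }
}
```

This way a token that only looks roughly like a UUID is rejected early instead of being passed on as a plain String.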
CodePudding user response:
A great idea from xehpuk's comment: Looks like a JSON string of UUIDs.
Parsing UUIDs given as JSON array of strings
Your given input ["edfcb406-b37e-4729-899e-93ea2af71e83", "98e75d74-abe2-4c08-8340-06aa9b2faf0b"], coincidentally (or not), is valid JSON (try it), an array containing 2 strings (UUIDs).
JSON applies
A JSON array basically is an ordered list of zero or more elements (e.g. quoted strings) enclosed in square-brackets, like this:
Either empty [] or filled ["A", "", null, " 0815", "Long-text or number 7, even quotes escaped\" allowed."]. Both are valid JSON and can be parsed independently of platform and programming language.
The former is an empty list, the latter is a representation of these 5 elements:
- A
- empty string
- null as null-reference in Java
- " 0815" (a number as text, with a leading space)
- Long-text or number 7, even quotes escaped" allowed. (with double-quote inside)
Using Jackson to parse/extract UUIDs from JSON-format
So we can parse that using a JSON parser in Java, like FasterXML's Jackson.
Adapted and extended from:
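import java.io.IOException; // can be thrown when reading/writing JSON
import java.util.List;
// basic Jackson classes for parsing JSON
import com.fasterxml.jackson.core.JsonParseException;
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.type.TypeFactory;
String json = "[\"edfcb406-b37e-4729-899e-93ea2af71e83\",\"98e75d74-abe2-4c08-8340-06aa9b2faf0b\"]";
ObjectMapper mapper = new ObjectMapper();
// readValue throws a checked exception (a subclass of IOException), so call it where that is caught or declared
// variant 1: List via an explicitly constructed collection type
List<String> list = mapper.readValue(json, TypeFactory.defaultInstance().constructCollectionType(List.class, String.class));
// variant 2: plain String array
String[] array = mapper.readValue(json, String[].class);
// variant 3: List via TypeReference, which keeps the generic type information
List<String> anotherList = mapper.readValue(json, new TypeReference<List<String>>(){});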
See also:
- Convert JSON Array to a Java Array or List with Jackson
- Baeldung: Jackson - Marshall String to JsonNode, tutorial on the low-level JsonNode tree that can be traversed to get all values (e.g. text-values inside an array)
- Baeldung: articles tagged with Jackson
