I have to consume a web service (WS) that sends its XML data inside a CDATA section. The output I get is the following:
&lt;parent>
&lt;child1>
&lt;xmltag1>4 años &lt; 8 &lt;/xmltag1>
&lt;xmltag2>3 años &lt; 12 &lt;/xmltag2>
&lt;/child1>
&lt;/parent>
I have to turn this data into usable XML so I can work with it.
It should look like:
<parent>
<child1>
<xmltag1>4 años &lt; 8 </xmltag1>
<xmltag2>3 años &lt; 12 </xmltag2>
</child1>
</parent>
I have tried various Java functions, such as StringEscapeUtils.unescapeXml(string);
I guess there could be a way to get that result with a regex, something like:
string.replaceAll("<{0}>", "</{0}>");
CodePudding user response:
You can use
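String fixedXml = text.replaceAll("&lt;(/?\\w+(?:\\s[^>]*)?>)", "<$1");
Details:
- &lt; - a literal &lt; string
- (/?\w+(?:\s[^>]*)?>) - Group 1 ($1):
  - /? - an optional / char
  - \w+ - one or more word chars
  - (?:\s[^>]*)? - an optional sequence of a whitespace char and then any zero or more chars other than >
  - > - a > char.
The replacement, <$1, is a literal < followed by the Group 1 value, so only a &lt; that starts something shaped like a tag is unescaped.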
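The replacement can be tried out with a small program along these lines. This is only a sketch: it assumes the service escapes just the opening < of each tag as &lt; and leaves > as-is, and the class name UnescapeTags and the method unescapeTags are made up for illustration.

```java
import java.util.regex.Pattern;

final class UnescapeTags {
    // "&lt;" followed by something shaped like a tag: an optional "/",
    // a tag name, optional attributes, and a literal ">".
    // Group 1 captures everything after "&lt;" so it can be reused as-is.
    private static final Pattern ESCAPED_TAG =
            Pattern.compile("&lt;(/?\\w+(?:\\s[^>]*)?>)");

    // Hypothetical helper: turns "&lt;tag>" into "<tag>" and leaves any
    // "&lt;" that is not followed by a tag-like sequence untouched.
    static String unescapeTags(String text) {
        return ESCAPED_TAG.matcher(text).replaceAll("<$1");
    }

    public static void main(String[] args) {
        // Sample input modeled on the question (assumed shape of the payload).
        String text = "&lt;parent>\n"
                + "&lt;child1>\n"
                + "&lt;xmltag1>4 años &lt; 8 &lt;/xmltag1>\n"
                + "&lt;xmltag2>3 años &lt; 12 &lt;/xmltag2>\n"
                + "&lt;/child1>\n"
                + "&lt;/parent>";
        System.out.println(unescapeTags(text));
    }
}
```

Note that the pattern is a heuristic rather than an XML parser: escaped text such as "a &lt;b and c > d" also looks like a tag to it and would be unescaped, so it is only safe when the text content cannot contain that shape.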
