var str = '&amp;<![CDATA[&amp;]]><![CDATA[&amp;]]>&amp;';
In the above string, I want to convert only the &amp; inside the CDATA sections to &, not every &amp;.
Expected Output: &amp;<![CDATA[&]]><![CDATA[&]]>&amp;
I tried below regular expression
str.trim().replace(/^(\/\/\s*)?<!\[CDATA\[|(\/&\/\s*)?\]\]>$/g, '&');
But the above code does not work as expected. I am not good at regular expressions. I went through different answers on Stack Overflow but could not find a good way to fix this. Could you please guide me?
CodePudding user response:
For this particular string you can apply /(?<=CDATA\[)[&a-z;]+(?=]])/g
You can use positive lookbehind and lookahead:
- (?<=CDATA\[) is a positive lookbehind. It only allows a match that starts directly after CDATA[
- (?=]]) is a positive lookahead. It only allows a match that ends directly before ]]
- [&a-z;]+ matches one or more characters that are lowercase letters, & or ;
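As a rough sketch of how those lookarounds behave, the snippet below applies the pattern to the string from the question. It assumes the entity appears as the literal text `&amp;` in the input, and the variable names are only illustrative. Lookbehind needs an ES2018-capable engine.

```javascript
// Assumption: the entity is present as the literal text "&amp;".
const input = '&amp;<![CDATA[&amp;]]><![CDATA[&amp;]]>&amp;';

// The lookbehind and lookahead pin the match to the full content of a
// CDATA section without consuming the CDATA[ and ]] markers themselves.
const cdataContent = /(?<=CDATA\[)[&a-z;]+(?=\]\])/g;

const converted = input.replace(cdataContent, '&');
console.log(converted);

// Content with characters outside [&a-z;] (here: spaces) is not matched,
// so this string comes back unchanged.
const mixed = '<![CDATA[this &amp; that]]>';
const mixedResult = mixed.replace(cdataContent, '&');
console.log(mixedResult === mixed);
```

Because the whole match is replaced with a single `&`, this only fits CDATA sections whose entire content is the entity, which is why it is offered for this particular string only.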
If I've got your idea correctly, it would be better to use XML parsers to manipulate a document.
Here you can find a sample js code.
CodePudding user response:
If you want to replace any &amp; in CDATA, regardless of what comes before and after (within CDATA):
str.trim().replace(/<!\[CDATA\[.*?\]\]>/g, m => m.replace(/&amp;/g, '&'));
results in
"&amp;<![CDATA[&]]><![CDATA[&]]>&amp;"
This first matches the CDATA sections and replaces each one with the result of a function; the function replaces all &amp; with &.
Because that function is only applied to CDATA sections, any &amp; outside of CDATA will not be changed.
Example with more characters in CDATA:
var str = '&amp;<![CDATA[Oh look at this: &amp; Haha!]]>&amp;';
str.trim().replace(/<!\[CDATA\[.*?\]\]>/g, m => m.replace(/&amp;/g, '&'));
result:
"&amp;<![CDATA[Oh look at this: & Haha!]]>&amp;"
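Two details are easy to trip over with this approach: `replace` with a plain string pattern only changes the first occurrence, and `.` does not match line breaks. The sketch below wraps the same idea in a helper that handles both. The name `decodeAmpInCdata` is hypothetical, and the code assumes the entity is the literal text `&amp;`.

```javascript
// Hypothetical helper; assumes the entity is the literal text "&amp;".
function decodeAmpInCdata(text) {
  // [\s\S]*? lazily matches any characters, including line breaks,
  // up to the first ]]> that closes the section.
  return text.replace(/<!\[CDATA\[[\s\S]*?\]\]>/g, section =>
    // The g flag converts every occurrence inside the section.
    section.replace(/&amp;/g, '&')
  );
}

const multi = '&amp;<![CDATA[a &amp; b\nc &amp; d]]>&amp;';
const multiResult = decodeAmpInCdata(multi);
console.log(multiResult);
```

The entities outside the CDATA section are left as they are, since the callback only ever sees the matched section.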
CodePudding user response:
If you have control over the data you receive, it is better to fix the data upstream. If not, you can use nested replaces:
- the outer replace identifies each <![CDATA[...]]> section
- the inner replace converts every &amp; inside that CDATA section
Both use the g flag to replace multiple times.
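[
'&amp;<![CDATA[&amp;]]><![CDATA[&amp;]]>&amp;',
'&amp;<![CDATA[this &amp; that]]>&amp;'
].forEach(str => {
// outer replace: every CDATA section; inner replace: every &amp; within it
let result = str.replace(/<!\[CDATA\[[^\]]*\]\]>/g, m => m.replace(/&amp;/g, '&'));
console.log(result);
});
Output:
&amp;<![CDATA[&]]><![CDATA[&]]>&amp;
&amp;<![CDATA[this & that]]>&amp;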
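One caveat with the `[^\]]*` pattern is that it cannot cross a `]` character, so a CDATA section whose content contains a bracket is skipped. The comparison below is a small sketch of that edge case, using a made-up sample string and assuming the entity is the literal text `&amp;`.

```javascript
// Made-up sample: the CDATA content contains a "]" character.
const sample = '<![CDATA[a[0] &amp; b]]>&amp;';

// [^\]]* stops at the first "]", so this section is never matched.
const strict = sample.replace(/<!\[CDATA\[[^\]]*\]\]>/g, m => m.replace(/&amp;/g, '&'));

// A lazy [\s\S]*? runs up to the first "]]>" instead.
const lazy = sample.replace(/<!\[CDATA\[[\s\S]*?\]\]>/g, m => m.replace(/&amp;/g, '&'));

console.log(strict);
console.log(lazy);
```

If the data can contain brackets inside CDATA, the lazy variant is probably the safer choice.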
