I receive three parameters in a single string. Each parameter has the form: quote, name, quote, equals sign, quote, text, quote. Parameters are separated by a space. Example 1:
"param1"="Peter" "param2"="Harald" "param3"="Marie"
With java.util.regex.Matcher I can find every name and text using the following regex:
"([^"]*)"\s*=\s*"([^"]*)"
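For reference, a minimal sketch of how this regex can be applied with java.util.regex.Matcher. The class name SimpleParamDemo and the helper parse are made up for illustration; only the regex itself comes from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SimpleParamDemo {
    // The regex from above as a Java string literal:
    // each " is written as \" and each \s as \\s.
    private static final Pattern PARAM =
            Pattern.compile("\"([^\"]*)\"\\s*=\\s*\"([^\"]*)\"");

    // Hypothetical helper: collects name -> text in order of appearance.
    static Map<String, String> parse(String input) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = PARAM.matcher(input);
        while (m.find()) {
            result.put(m.group(1), m.group(2)); // group 1 = name, group 2 = text
        }
        return result;
    }

    public static void main(String[] args) {
        String example1 =
                "\"param1\"=\"Peter\" \"param2\"=\"Harald\" \"param3\"=\"Marie\"";
        // prints {param1=Peter, param2=Harald, param3=Marie}
        System.out.println(parse(example1));
    }
}
```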
Now, however, the text may itself contain a quotation mark, which is escaped with a backslash. Example 2:
"param1"="Peter" "param2"="Har\"ald" "param3"="Marie"
I have built the following regex:
"([^"]*)"\s*=\s*("([^"]*(\\")*[^"]*)*[^\\]")
This works well for example 2, but is not a universal solution.
If the backslash is at the end of a parameter value, the solution no longer works. Example 3:
"param1"="Peter" "param2"="Harald\" "param3"="Marie"
If the backslash is at the end of the value, the matcher interprets "Harald\" " as the value of parameter 2 instead of "Harald\".
Do you have a universal solution for this problem? Thanks in advance for your input.
Kind regards Dominik
Answer:
You may use this regex in Java:
\"([^\"]*)\"\h*=\h*(\"[^\\\"]*(?:\\(?=\"(?:\h|$))|(?:\\.[^\\\"]*))*\")
RegEx details:
\"([^\"]*)\" : Match a quoted string as the parameter name; the name is captured in group #1
\h*=\h* : Match = surrounded by optional horizontal whitespace
( : Start capture group #2
\" : Match the opening "
[^\\\"]* : Match 0 or more non-quote, non-backslash characters
(?: : Start non-capture group
\\ : Match a \
(?=\"(?:\h|$)) : Must be followed by a " that has whitespace or the end of the line after it
| : OR
(?:\\.[^\\\"]*) : Match an escaped character followed by 0 or more non-quote, non-backslash characters
)* : End non-capture group; repeat it 0 or more times
\" : Match the closing "
) : End capture group #2
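As a rough sketch of how the regex above could be used from Java code: inside a string literal every backslash of the regex has to be doubled and every quote written as \". The class name EscapedParamDemo and the helper parse are hypothetical, and stripping the outer quotes from group 2 is an extra step added here for illustration, not something the regex does.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EscapedParamDemo {
    // The regex from this answer as a Java string literal
    // (split in two only for readability).
    private static final Pattern PARAM = Pattern.compile(
            "\"([^\"]*)\"\\h*=\\h*"
            + "(\"[^\\\\\"]*(?:\\\\(?=\"(?:\\h|$))|(?:\\\\.[^\\\\\"]*))*\")");

    // Hypothetical helper: maps each name (group 1) to its value (group 2)
    // with the surrounding quotes removed. Escapes inside the value are kept.
    static Map<String, String> parse(String input) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = PARAM.matcher(input);
        while (m.find()) {
            String quoted = m.group(2);
            result.put(m.group(1), quoted.substring(1, quoted.length() - 1));
        }
        return result;
    }

    public static void main(String[] args) {
        String example2 =
                "\"param1\"=\"Peter\" \"param2\"=\"Har\\\"ald\" \"param3\"=\"Marie\"";
        String example3 =
                "\"param1\"=\"Peter\" \"param2\"=\"Harald\\\" \"param3\"=\"Marie\"";
        // prints {param1=Peter, param2=Har\"ald, param3=Marie}
        System.out.println(parse(example2));
        // prints {param1=Peter, param2=Harald\, param3=Marie}
        System.out.println(parse(example3));
    }
}
```

Keep in mind that the lookahead is a heuristic: a backslash followed by a quote and then whitespace is always treated as the end of the value. A value such as "He said \"hi\" there" would therefore be cut off after \"hi\", because the input format itself is ambiguous in that case.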
