I need to split a string in C#, using a space as the delimiter while keeping quoted substrings intact. This part works.
Additionally, I want to allow the escape sequence \" so that quotes can be included inside a quoted substring.
Example of what I need:
One Two "Three Four" "Five \"Six\""
To:
- One
- Two
- Three Four
- Five "Six"
This is the regex I am currently using. It works for all cases except "Five \"Six\"":
//Split on spaces unless in quotes
List<string> matches = Regex.Matches(input, @"[\""].+?[\""]|[^ ]+")
.Cast<Match>()
.Select(x => x.Value.Trim('"'))
.ToList();
I'm looking for any regex that would do the trick.
CodePudding user response:
You can use
var input = "One Two \"Three Four\" \"Five \\\"Six\\\"\"";
// Console.WriteLine(input); // => One Two "Three Four" "Five \"Six\""
List<string> matches = Regex.Matches(input, @"(?s)""(?<r>[^""\\]*(?:\\.[^""\\]*)*)""|(?<r>\S+)")
.Cast<Match>()
.Select(x => Regex.Replace(x.Groups["r"].Value, @"\\(.)", "$1"))
.ToList();
foreach (var s in matches)
Console.WriteLine(s);
See the C# demo.
The result is
One
Two
Three Four
Five "Six"
The (?s)"(?<r>[^"\\]*(?:\\.[^"\\]*)*)"|(?<r>\S+) regex matches:
- (?s) - a RegexOptions.Singleline equivalent to make . match newlines, too
- "(?<r>[^"\\]*(?:\\.[^"\\]*)*)" - a ", then Group "r" capturing zero or more chars other than " and \, followed by zero or more sequences of an escaped char and zero or more chars other than " and \, and then a closing " is matched
- | - or
- (?<r>\S+) - Group "r": one or more non-whitespace chars.
The .Select(x => Regex.Replace(x.Groups["r"].Value, @"\\(.)", "$1")) takes the Group "r" value and unescapes (deletes a \ before) all escaped chars.
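For reuse, the match and unescape steps above can be wrapped in a small helper. This is only a sketch: the class name QuotedSplitter and the method name Split are hypothetical, while the pattern and the unescape step are the ones from the answer. Note that the unescape step is also applied to unquoted tokens, so a backslash in a bare word is consumed as well.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuotedSplitter // hypothetical name, for illustration only
{
    // Same pattern as in the answer: a quoted token with backslash escapes,
    // or a run of non-whitespace chars. Both alternatives capture into "r".
    private static readonly Regex TokenRegex = new Regex(
        @"(?s)""(?<r>[^""\\]*(?:\\.[^""\\]*)*)""|(?<r>\S+)");

    public static List<string> Split(string input)
    {
        return TokenRegex.Matches(input)
            .Cast<Match>()
            // Drop the backslash in front of every escaped char.
            .Select(m => Regex.Replace(m.Groups["r"].Value, @"\\(.)", "$1"))
            .ToList();
    }

    public static void Main()
    {
        // The raw input is: One Two "Three Four" "Five \"Six\""
        var tokens = Split("One Two \"Three Four\" \"Five \\\"Six\\\"\"");
        foreach (var token in tokens)
            Console.WriteLine(token);
        // Prints:
        // One
        // Two
        // Three Four
        // Five "Six"
    }
}
```

Compiling the regex once into a static field avoids re-parsing the pattern on every call. An empty pair of quotes ("") produces an empty token, since Group "r" is allowed to match zero chars.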
