I am trying to remove the whitespace that appears after the ":" character in each line of this header:
batman: 100
robin: OFXSGML
superman: 102
wonderwoman: NONE
joker: USASCII
harley: 1252
aquaman: NONE
flash: NONE
iris: NONE
This is the regex pattern I use to match this exact header, but I keep running into problems when I try to delete the whitespace. Any help is appreciated.
^batman:\s100 robin:\sOFXSGML superman:\s102 wonderwoman:\s NONE joker:\sUSASCII harley: 1252 aquaman:\s NONE flash:\sNONE iris:\sNONE$
CodePudding user response:
Your pattern uses literal spaces between the fields, but the header spans several lines. To match all of the lines, replace each of those spaces with \s wherever the header crosses a newline.
You can then post-process the match by replacing :\s with : but note that the pattern is a very precise match.
If you want to be more flexible, you can use a capture group to capture everything up to and including the : and then match the spaces after it.
^([^\s:]+:)[\p{Zs}\t]+(?=\S)
The pattern matches:
^ Start of string (or start of each line when RegexOptions.Multiline is set)
([^\s:]+:) Capture group 1: match 1+ non-whitespace chars other than : and then match the :
[\p{Zs}\t]+ Match 1+ spaces or tabs
(?=\S) Positive lookahead: assert a non-whitespace char to the right (if there has to be one, else you can omit this part)
In the replacement, use group 1: $1
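A minimal sketch of how that pattern might be applied in C#, assuming the header is held in one multi-line string and each key sits on its own line. RegexOptions.Multiline is an addition here so that ^ matches at the start of every line; the variable names and the shortened sample input are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

// Shortened sample input; the last line uses a tab plus a space after the colon.
string header = "batman: 100\nrobin: OFXSGML\nharley:\t 1252";

// Multiline makes ^ match at the start of each line, not only at the start of the string.
string cleaned = Regex.Replace(
    header,
    @"^([^\s:]+:)[\p{Zs}\t]+(?=\S)",
    "$1",                        // keep "key:" and drop the spaces/tabs that follow it
    RegexOptions.Multiline);

Console.WriteLine(cleaned);
```

Because the character class [\p{Zs}\t] excludes newlines, a key with an empty value would not have its line break swallowed.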
CodePudding user response:
var yourString = @"batman: 100 robin: OFXSGML superman: 102 wonderwoman: NONE joker: USASCII harley: 1252 aquaman: NONE flash: NONE iris: NONE";
yourString = Regex.Replace(yourString, "(?<=:) ", "");
CodePudding user response:
Shouldn't be any more complex than:
string source = @"
batman: 100
robin: OFXSGML
superman: 102
wonderwoman: NONE
joker: USASCII
harley: 1252
aquaman: NONE
flash: NONE
iris: NONE
".Trim();
Regex rx = new Regex(@"(?<=:)\s+");
string result = rx.Replace(source, "");
(?<=:) is a zero-width positive lookbehind: it anchors the match on a :, without it being a part of the match.
\s+ matches 1 or more whitespace characters (SP, HT, CR, LF, VT, FF).
That changes:
batman: 100
robin: OFXSGML
superman: 102
wonderwoman: NONE
joker: USASCII
harley: 1252
aquaman: NONE
flash: NONE
iris: NONE
into
batman:100
robin:OFXSGML
superman:102
wonderwoman:NONE
joker:USASCII
harley:1252
aquaman:NONE
flash:NONE
iris:NONE
Alternatively, you can include the : in the match. It just changes the replacement text:
Regex rx = new Regex(@":\s+");
string result = rx.Replace(source, ":");
If you care about the value of the key preceding the colon-plus-whitespace, use named capture groups and a match evaluator.
Here the regular expression (?<key>\w+)\s*:\s* matches:
(?<key>\w+) - a sequence of 1 or more word characters (letters, digits or _), captured in the group named key, followed by
\s* - zero or more whitespace characters, followed by
: - a literal colon character, followed by
\s* - zero or more whitespace characters
The match evaluator looks at the capturing group named key. If it is any of batman, robin, or superman, any whitespace preceding or following the colon is removed; otherwise, the match itself is returned unchanged.
Regex rx = new Regex(@"(?<key>\w+)\s*:\s*");
string result = rx.Replace(source, (Match m) => {
    string replacement;
    string key = m.Groups["key"].Value;
    switch (key) {
        case "batman":
        case "robin":
        case "superman":
            // Rebuild the match as "key:" with no surrounding whitespace.
            replacement = key + ":";
            break;
        default:
            // Leave every other key exactly as it was matched.
            replacement = m.Value;
            break;
    }
    return replacement;
});
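For reference, a self-contained sketch of the same idea, with the keys factored out into a HashSet instead of a switch. The name keysToTighten and the shortened sample input are assumptions made for illustration, not part of the original header handling.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical set of keys whose whitespace around the colon should be removed.
var keysToTighten = new HashSet<string> { "batman", "robin", "superman" };

// Shortened sample input; "robin" has whitespace on both sides of the colon.
string source = "batman: 100\nrobin : OFXSGML\nwonderwoman: NONE";

string result = Regex.Replace(source, @"(?<key>\w+)\s*:\s*", m =>
{
    string key = m.Groups["key"].Value;
    // Rewrite only the listed keys; return every other match unchanged.
    return keysToTighten.Contains(key) ? key + ":" : m.Value;
});

Console.WriteLine(result);
```

With this input, batman and robin lose the whitespace around the colon, while the wonderwoman line keeps its space.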
