I need to replace some characters in a string with a \ plus the original character.
So given this string and array:
string origin = @"words&sales -test\strange";
string[] specialChars = new string[]{"\\", "&", "-", "?",......};
I want to get:
"words\&sales \-test\\strange"
Notice that the \ itself is one of the characters to find and replace.
Thanks.
CodePudding user response:
Generally speaking, the fastest way to build String values in C#/.NET is with a StringBuilder, even if you're transforming another String value.
The other problem is the "best" way to determine which char values should be escaped: if the set of escapable characters is fixed at compile-time, then use a switch statement, which the compiler can lower to a jump table or a short chain of comparisons; that is typically faster than a runtime HashSet<Char> lookup for determining set membership:
e.g.:
static String Escape( String input )
{
    StringBuilder sb = new StringBuilder( capacity: 5 * input.Length / 4 ); // Assuming 25% length increase due to escaping.
    foreach( Char c in input )
    {
        switch( c )
        {
            case '\\':
            case '&':
            case '-':
            case '?':
                _ = sb.Append( '\\' ).Append( c );
                break;
            default:
                _ = sb.Append( c );
                break;
        }
    }
    return sb.ToString();
}
If the set of escapable characters is defined at runtime then using a HashSet<Char> will likely be the best overall option - though if you know you're only processing chars with Unicode code-points within a limited range (say ASCII-compatible chars in the range 0x00 to 0x7F) then you could use a Boolean[128] array (one slot per code-point) to store the escape flag map.
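As a rough sketch of the flag-map idea, assuming the escapable characters are all ASCII: the class name AsciiEscaper and the helper BuildMap are made up for illustration, and characters outside the table are simply never escaped.

```csharp
using System;
using System.Text;

static class AsciiEscaper
{
    // Hypothetical helper: builds a 128-entry flag table, one slot per ASCII code-point (0x00 to 0x7F).
    public static bool[] BuildMap(string escapableChars)
    {
        bool[] map = new bool[128];
        foreach (char c in escapableChars)
        {
            if (c >= 128)
            {
                throw new ArgumentException("Only ASCII characters are supported.", nameof(escapableChars));
            }
            map[c] = true;
        }
        return map;
    }

    public static string Escape(string input, bool[] map)
    {
        // Same 25% growth estimate as the other versions.
        StringBuilder sb = new StringBuilder(capacity: 5 * input.Length / 4);
        foreach (char c in input)
        {
            // Characters beyond the end of the table are passed through unescaped.
            if (c < map.Length && map[c])
            {
                sb.Append('\\');
            }
            sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With the question's input, AsciiEscaper.Escape(@"words&sales -test\strange", AsciiEscaper.BuildMap(@"\&-?")) should give words\&sales \-test\\strange. The map can be built once and reused across calls.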
Using a HashSet<Char>, it would be like this:
static String Escape( String input, IEnumerable<Char> escapableChars )
{
    HashSet<Char> escapeThese = new HashSet<Char>( escapableChars );
    StringBuilder sb = new StringBuilder( capacity: 5 * input.Length / 4 ); // Assuming 25% length increase due to escaping.
    foreach( Char c in input )
    {
        if( escapeThese.Contains( c ) )
        {
            _ = sb.Append( '\\' ).Append( c );
        }
        else
        {
            _ = sb.Append( c );
        }
    }
    return sb.ToString();
}
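The question keeps its special characters in a string[] rather than a collection of Char, so something has to bridge the two. One possible way, assuming every entry is exactly one character long (SpecialChars and ToCharSet are hypothetical names, not part of any library):

```csharp
using System;
using System.Collections.Generic;

static class SpecialChars
{
    // Hypothetical helper: flattens single-character strings into a set of chars.
    public static HashSet<char> ToCharSet(IEnumerable<string> specialChars)
    {
        var set = new HashSet<char>();
        foreach (string s in specialChars)
        {
            // Assumption: each entry holds exactly one character, as in the question's array.
            if (s.Length != 1)
            {
                throw new ArgumentException("Each entry must be a single character.", nameof(specialChars));
            }
            set.Add(s[0]);
        }
        return set;
    }
}
```

The result can be passed straight to the HashSet-based Escape method, e.g. Escape( origin, SpecialChars.ToCharSet( specialChars ) ), since HashSet<Char> implements IEnumerable<Char>.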
Of course, the above code can be optimized further: some suggestions:
- First check to see if the String input even has any escapable characters in the first place: if none of its characters are escapable then just return input directly without having created a new StringBuilder.
- Create an (on-demand) pool of StringBuilder instances instead of creating new instances on every call.
- Allow ReadOnlySpan<Char> instead of String for input and writing output to Span<Char> - you'll need an initial step to calculate the required minimum size of the Span<Char> first though, and pass that info back to the caller.
  - The same minimum-size calculation can be done to have an exactly correct capacity: value for the StringBuilder instead of my (lazy) 25% estimate.
- Add memoization: use a Bloom filter and output cache keyed by the input value.
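A minimal sketch of the first and third suggestions, under some assumptions: the escapable set is small enough that a linear scan of a char array is acceptable, and Span<Char> is available (.NET Core 2.1 or later, or the System.Memory package). ExactEscaper and TryEscape are illustrative names only.

```csharp
using System;
using System.Text;

static class ExactEscaper
{
    public static string Escape(string input, char[] escapableChars)
    {
        // Fast path: nothing to escape, so return the original instance without allocating.
        int first = input.IndexOfAny(escapableChars);
        if (first < 0) return input;

        // Sizing pass: count the escapable characters from the first hit onwards.
        int extra = 0;
        for (int i = first; i < input.Length; i++)
        {
            if (Array.IndexOf(escapableChars, input[i]) >= 0) extra++;
        }

        // Write pass with an exact capacity instead of the 25% estimate.
        StringBuilder sb = new StringBuilder(capacity: input.Length + extra);
        sb.Append(input, 0, first); // Copy the untouched prefix in one call.
        for (int i = first; i < input.Length; i++)
        {
            char c = input[i];
            if (Array.IndexOf(escapableChars, c) >= 0) sb.Append('\\');
            sb.Append(c);
        }
        return sb.ToString();
    }

    // Span-based variant: returns false when the destination is too small.
    // In both cases 'required' reports the number of chars the output needs.
    public static bool TryEscape(ReadOnlySpan<char> input, ReadOnlySpan<char> escapableChars,
                                 Span<char> destination, out int required)
    {
        required = input.Length;
        foreach (char c in input)
        {
            if (escapableChars.IndexOf(c) >= 0) required++;
        }
        if (destination.Length < required) return false;

        int pos = 0;
        foreach (char c in input)
        {
            if (escapableChars.IndexOf(c) >= 0) destination[pos++] = '\\';
            destination[pos++] = c;
        }
        return true;
    }
}
```

A caller of TryEscape can retry with a buffer of at least 'required' chars when the first attempt returns false. Whether the extra sizing pass pays off compared with the estimate depends on the inputs, so it is worth measuring before adopting it.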
