Home > Blockchain >  Identify All non-whitelisted Characters in String (pl/sql)
Identify All non-whitelisted Characters in String (pl/sql)

Time:01-10

I've been given a list of acceptable characters, and if a string contains any characters that are not in this whitelist, I need the query to identify them. This can be done via a SELECT query from DUAL, or via an anonymous block in pl/sql.

Is there a way that I can return a string that contains non whitelisted characters in an input string without looping?

For my requirements, the whitelist characters are:

  • All numbers
  • All letters, upper and lower case
  • Parentheses
  • Curly braces
  • Square brackets
  • Dashes and underscores
  • Plus symbols and equals symbols
  • Commas
  • Forward slashes (/) and pipes

I can do something like:

declare
   myInputString varchar2(256);
   myDirtyCharacters varchar2(256);
begin
   myInputString := '(This is a 123 [email protected]!)';

   myDirtyCharacters := regexp_replace(myInputString,'someRegex');

   dbms_output.put_line('Identified dirty characters: ' || myDirtyCharacters);
end;

I would then expect that the myDirtyCharacters string would contain a all non-whitelisted characters. In my case, it would be "@!", because the at-symbol and the exclamation point are the only non-whitelisted characters.

Use of the regexp_replace function is not a requirement. Any non-looping method could be used.

CodePudding user response:

A (relatively) simple option (from my point of view) is a combination of nested

  • TRANSLATE function which will remove all whitelisted characters except letters and digits,
  • REGEXP_REPLACE which removes alphanumerics from what's left,
  • TRIM which removes unnecessary space characters

Something like this

SQL> with test (col) as
  2    (select '(This (is) a {123 minus - [email protected]! 1 1=2/2 2=4/4)' from dual)
  3  select trim(regexp_replace(translate(col, 'a(){}[]-_ =,/|', 'a'), '[[:alnum:]]', '')) result
  4  from test;

RES
---
@.!

SQL>
  •  Tags:  
  • Related