I have the following code snippet in SQL to select the next piece of text after ABC DEF that's of variable length:
SELECT trim('ABC DEF ' FROM regexp_substr(my_field, 'ABC DEF ([^ ] )')) FROM my_table
Sample Data:
'{random text here} ABC DEF {my_variable_length_keyword} {random text here}'
Expected Output:
{my_variable_length_keyword}
While this works, it only accounts for cases where there is one space after ABC DEF. How would I deal with cases where there are tabs, new lines, or multiple spaces before the next word?
I've tried:
SELECT trim('ABC DEF ' FROM regexp_substr(my_field, 'ABC DEF\s ([^ ] )')) FROM my_table
But this doesn't yield any result.
Can someone please help me out with this? Thank you!
CodePudding user response:
(?<=ABC DEF)\s \S or (?<=ABC DEF)\\s \\S for string literal used in other language.
Trim the leading whitespaces, then you'll get the final output.
Some notes:
(?<=pattern)is positive lookbehind, it works like normal regex but won't be put into match result, place it before strings that you want to match.- in most platform,
patternin lookbehind doesn't support*and. For example,(?<=ABC DEF\s )is unsupported in most cases, you need to move\soutside to match whitespaces and trim them later, check Important Notes About Lookbehind - Don't know what pattern
{my_variable_length_keyword}has though, but it seems it doesn't contain whitespaces,\Scan match one non-whitespace character.
CodePudding user response:
You do not need to use any lookarounds, nor do you need to trim the prefix matched with the regex, since with REGEXP_SUBSTR you can specify a capturing group and extract just its contents.
SELECT regexp_substr(my_field, 'ABC\\s DEF\\s (\\S )', 1, 1, 'e') FROM my_table
This is the regex demo.
Details:
ABC\s DEF\s (\S )-ABC, one or more whitespaces,DEF, one or more whitespaces (this text is just matched) and then one or more non-whitespace chars are captured into Group 1 (note you need to double escape backslashes as the single backslash is used to form string escape sequences, and to introduce a literal\it needs doubling)1- start searching from the first char in the string1- tells the regex engine to extract the first occurrence of the pattern'e'- enables extraction of the subexpression (aka capturing group value) from the resulting match.
