I have to find and remove a substring from the text using regexp in PostgreSQL. The substring corresponds to the condition: <any text between double-quotes containing for|while inside>
Example
Text:
PERFORM xxkkcsort.f_write_log("INFO", "xxkkcsort.f_load_tables__outof_order_system", " Script for data loading: ", false, v_sql, 0);
So, my purpose is to find and remove the substring "Script for data loading: ". When I tried to use the script below:
SELECT regexp_replace(
'PERFORM xxkkcsort.f_write_log("INFO", "xxkkcsort.f_load_tables__outof_order_system", "> Table for loading: "||cc.source_table_name , false, null::text, 0);'
, '(\")(.*(for|while)(\s).*)(\")'
, '');
I have all the texts inside double-quotes replaced. The result looks like:
PERFORM xxkkcsort.f_write_log(||cc.source_table_name , false, null::text, 0);
What's a proper regular expression to solve the issue?
CodePudding user response:
any text between double-quotes containing for|while inside
SELECT regexp_replace(string, '"[^"]*\m(?:for|while)\M[^"]*"', '');
" ... literal " (no special meaning here, so no need to escape it)
[^"]* ... character class including all characters except ", 0-n times
\m ... beginning of a word
(?:for|while) ... two branches in non-capturing parentheses
(regexp_replace() works with simple capturing parentheses, too, but it's cheaper this way since you don't use the captured substring. But try either with the replacement '\1', where it makes a difference ...)
\M ... end of a word
[^"]* ... like above
" ... like above
I dropped \s from your expression, as the task description does not strictly require a white-space character (end of string or punctuation delimiting the word ...).
Related:
CodePudding user response:
You canuse
SELECT regexp_replace(
'PERFORM xxkkcsort.f_write_log("INFO", "xxkkcsort.f_load_tables__outof_order_system", "> Table for loading: "||cc.source_table_name , false, null::text, 0);',
'"[^"]*(for|while)\s[^"]*"',
'') AS Result;
Output:
PERFORM xxkkcsort.f_write_log("INFO", "xxkkcsort.f_load_tables__outof_order_system", ||cc.source_table_name , false, null::text, 0);
See the regex demo and the DB fiddle. Details:
"- a"char[^"]*- zero or more chars other than"(for|while)-fororwhile\s- a whitespace[^"]*"- zero or more chars other than"and then a"char.
