I have a large string stored in table as a single line. I need a select query to split the large string to rows after every 100 characters and it should not split in middle of the word. Basically, the query should find a space after 100 characters and split into new line.
I have used this query, it is splitting after 100 lines, but it is breaking in the middle of words.
SELECT REGEXP_REPLACE ( col_large_string , '(.{100})' , '\1' || CHR (10) ) AS split_to_rows
FROM tab_large_string where string_id = 1;
CodePudding user response:
You do not need (slow) regular expressions and can do it with simple (quicker) string functions.
If you want to replace spaces with newlines then:
WITH bounds ( str, end_pos ) AS (
SELECT col_large_string,
INSTR(col_large_string, ' ', 101)
FROM tab_large_string
UNION ALL
SELECT SUBSTR(str, 1, end_pos - 1)
|| CHR(10)
|| SUBSTR(str, end_pos 1),
INSTR(str, ' ', end_pos 101)
FROM bounds
WHERE end_pos > 0
)
SELECT str AS split_to_lines
FROM bounds
WHERE end_pos = 0;
and if you want to have each line in a new row then:
WITH bounds ( str, start_pos, end_pos ) AS (
SELECT col_large_string,
1,
INSTR(col_large_string, ' ', 101)
FROM tab_large_string
UNION ALL
SELECT str,
end_pos 1,
INSTR(str, ' ', end_pos 101)
FROM bounds
WHERE end_pos > 0
)
SELECT CASE end_pos
WHEN 0
THEN SUBSTR(str, start_pos)
ELSE SUBSTR(str, start_pos, end_pos - start_pos)
END AS split_to_rows
FROM bounds;
If you do want to use regular expressions then:
SELECT REGEXP_REPLACE(
col_large_string,
'(.{100,}?) ',
'\1' || CHR (10)
) AS split_to_lines
FROM tab_large_string
WHERE string_id = 1;
db<>fiddle here
CodePudding user response:
You can use this regular expression:
SELECT REGEXP_REPLACE ( col_large_string , '((\w \s ){100})' , '\1' || CHR (10) ) AS split_to_rows
FROM tab_large_string where string_id = 1;
\w matches one or more occurrence of word character.
\s matches one or more occurrence of space character.
(\w \s ) matches a word followed by space
(\w \s ){100} then matches (a word followed by space) x100.
