How can I pull the records from the first table (Table_A) which has an Attr column having embedded CODE values those contain at least one uncommon CODE value whenever compared to the second table (Table_B: CODE & DESCR cols.) so that I get table_C as the result? This is what I have so far:
Select * From Table_A
Where Attr like '%CODE%' AND
not exist (select * from Table_B
where Table_A.Attr LIKE '%'||Table_B.CODE||'%')
table A
| ID | Attr |
|---|---|
| 1 | CODE = A111 |
| 2 | CODE = 'A111, B222, C333, D444' |
| 3 | CODE = 'D444', 'E555', 'F666' |
| 4 | CODE = 'G777', 'B222' |
| 5 | ITEM = 'AFRD' AND CODE = 'C333' |
| 6 | ITEM = BYNM |
table_B
| CODE | DESCR |
|---|---|
| A111 | djiefljfe |
| D444 | qrrascjg |
| E555 | wpofler |
| F666 | nfosmwfa |
| G777 | losk |
table_C
| ID | Attr |
|---|---|
| 2 | CODE = 'A111, B222, C333, D444' |
| 4 | CODE = 'G777', 'B222' |
| 5 | ITEM = 'AFRD' AND CODE = 'C333' |
CodePudding user response:
You must give Table_A an alias, so that you can reference it in the nested select:
SELECT *
FROM
Table_A A
WHERE
Attr LIKE '%CODE%' AND
NOT EXISTS
(
SELECT *
FROM Table_B
WHERE A.Attr LIKE '%'||Table_B.CODE||'%'
)
Also, the keyword is EXISTS not EXIST.
SQL is not able to parse complex arbitrary expressions like ITEM = 'AFRD' AND CODE = 'C333' and to differentiate between C333 being a code and AFRD being an item. However, if item ids are always different from codes this is not a problem.
See: http://sqlfiddle.com/#!4/995f23/1/0
CodePudding user response:
As long as codes always follow items, you could extract the list of codes with:
regexp_substr(attr, '.*(CODE = ?)(.*?)', 1, 1, null, 2
and remove the spaces and quotes with
translate(regexp_substr(attr, '.*(CODE = ?)(.*?)', 1, 1, null, 2), q'^x' ^', 'x')
to get a simple comma-separated list of values to look for, like G777,B222; and then split that into individual elements, for example using recursive subquery factoring; and find any that don't exist in the other table.
Which is messy, but:
with cte (id, attr, codes) as (
select id, attr,
translate(
regexp_substr(attr, '.*(CODE = ?)(.*?)', 1, 1, null, 2),
q'^x' ^', 'x'
)
from table_a
),
rcte (id, attr, codes, pos, code) as (
select id, attr, codes, 1, regexp_substr(codes, '(.*?)(,|$)', 1, 1, null, 1)
from cte
where codes is not null
union all
select id, attr, codes, pos 1, regexp_substr(codes, '(.*?)(,|$)', 1, pos 1, null, 1)
from rcte
where regexp_substr(codes, '(.*?)(,|$)', 1, pos 1, null, 1) is not null
)
select distinct r.id, r.attr
from rcte r
where not exists (
select null
from table_b b
where b.code = r.code
)
gives
| ID | ATTR |
|---|---|
| 2 | CODE = 'A111, B222, C333, D444' |
| 4 | CODE = 'G777', 'B222' |
| 5 | ITEM = 'AFRD' AND CODE = 'C333' |
The cte gets the simple CSV values; rcte splits those up to individual components (and assumes there will be no empty elements), and then those individual values are checked with not exists. Which gives duplicate id/attr pairs, so distinct removes the duplicates; but you can also see which code(s) didn't exist if you want, by not applying distinct, and potentially using listagg to still get a single result per ID.
It's painful, but that's what happens with data stored like this...
CodePudding user response:
You can use an outer join among Table_A and Table_B in order to obtain unmatched codes after converting comma separated codes to the newly generated rows per each individual code through use of regular expression functions along with a hierarchical query such as
WITH a0 AS
(
SELECT id, CASE WHEN INSTR(attr, 'CODE')>0 THEN
REGEXP_REPLACE(attr,'(.*CODE =) ')
END AS codes
FROM Table_A
), a AS
(
SELECT id, TRIM(BOTH CHR(39) FROM TRIM(REGEXP_SUBSTR(codes,'[^,] ',1,level)) ) AS codes
FROM a0
WHERE codes IS NOT NULL
CONNECT BY LEVEL <= REGEXP_COUNT(codes, ',') 1
AND PRIOR SYS_GUID() IS NOT NULL
AND PRIOR id = id
)
SELECT a.*
FROM Table_A a
JOIN (SELECT DISTINCT id
FROM a
LEFT JOIN Table_B
ON code = codes
WHERE code IS NULL) b
ON a.id = b.id
