I have data like below:
| group | seq | activity |
|---|---|---|
| A | 1 | scan |
| A | 2 | visit |
| A | 3 | pay |
| B | 1 | drink |
| B | 2 | rest |
I expect to have 1 new column "hist" like below:
| group | seq | activity | hist |
|---|---|---|---|
| A | 1 | scan | NULL |
| A | 2 | visit | scan |
| A | 3 | pay | scan, visit |
| B | 1 | drink | NULL |
| B | 2 | rest | drink |
I was trying to solve with LAG function, but LAG only returns one row from previous instead of multiple. Truly appreciate any help!
CodePudding user response:
Use a correlated sub-query:
SELECT t.*,
(SELECT LISTAGG(activity, ',') WITHIN GROUP (ORDER BY seq)
FROM table_name l
WHERE t."GROUP" = l."GROUP"
AND l.seq < t.seq
) AS hist
FROM table_name t
Or a hierarchical query:
SELECT t.*,
SUBSTR(SYS_CONNECT_BY_PATH(PRIOR activity, ','), 3) AS hist
FROM table_name t
START WITH seq = 1
CONNECT BY
PRIOR seq 1 = seq
AND PRIOR "GROUP" = "GROUP"
Or a recursive sub-query factoring clause:
WITH rsqfc ("GROUP", seq, activity, hist) AS (
SELECT "GROUP", seq, activity, NULL
FROM table_name
WHERE seq = 1
UNION ALL
SELECT t."GROUP", t.seq, t.activity, r.hist || ',' || r.activity
FROM rsqfc r
INNER JOIN table_name t
ON (r."GROUP" = t."GROUP" AND r.seq 1 = t.seq)
)
SEARCH DEPTH FIRST BY "GROUP" SET order_rn
SELECT "GROUP", seq, activity, SUBSTR(hist, 2) AS hist
FROM rsqfc
Which, for the sample data:
CREATE TABLE table_name ("GROUP", seq, activity) AS
SELECT 'A', 1, 'scan' FROM DUAL UNION ALL
SELECT 'A', 2, 'visit' FROM DUAL UNION ALL
SELECT 'A', 3, 'pay' FROM DUAL UNION ALL
SELECT 'B', 1, 'drink' FROM DUAL UNION ALL
SELECT 'B', 2, 'rest' FROM DUAL;
All output:
GROUP SEQ ACTIVITY HIST A 1 scan null A 2 visit scan A 3 pay scan,visit B 1 drink null B 2 rest drink
db<>fiddle here
CodePudding user response:
To aggregate strings in Oracle we use LISAGG function.
In general, you need a windowing_clause to specify a sliding window for analytic function to calculate running total.
But unfortunately LISTAGG doesn't support it.
To simulate this behaviour you may use model_clause of the select statement. Below is an example with explanation.
select group_ , activity , seq , hist from t model /*Where to restart calculation*/ partition by (group_) /*Add consecutive numbers to reference "previous" row per group. May use "seq" column if its values are consecutive*/ dimension by ( row_number() over( partition by group_ order by seq asc ) as rn ) measures ( /*Other columnns to return*/ activity , cast(null as varchar2(1000)) as hist , seq ) rules update ( /*Apply this rule sequentially*/ hist[any] order by rn asc = /*Previous concatenated result*/ hist[cv()-1] /*Plus comma for the third row and tne next rows*/ || presentv(activity[cv()-2], ',', '') /**/ /*lus previous row's value*/ || activity[cv()-1] )
GROUP_ | ACTIVITY | SEQ | HIST :----- | :------- | --: | :--------- A | scan | 1 | null A | visit | 2 | scan A | pay | 3 | scan,visit B | drink | 1 | null B | rest | 2 | drink
db<>fiddle here
