Home > Mobile >  Remove groups based on multiple conditions
Remove groups based on multiple conditions

Time:01-12

I have a table similar to the below. My goal is to remove groups for each date where the status moves to either 'Cancelled' or 'Failed', while retaining groups per day that contain other status changes.

Group Status Date
A Pending 2021-01-01 08:00:00
A Cancelled 2021-01-01 13:00:00
A Pending 2021-01-02 08:00:00
A Failed 2021-01-02 13:00:00
A Pending 2021-01-03 08:00:00
A Pending Settlement 2021-01-03 13:00:00
A Pending 2021-01-04 08:00:00
A Settled 2021-01-04 13:00:00
B Pending 2021-01-01 08:00:00
B Cancelled 2021-01-01 13:00:00
B Pending 2021-01-02 08:00:00
B Failed 2021-01-02 13:00:00
B Pending 2021-01-03 08:00:00
B Pending Settlement 2021-01-03 13:00:00
B Pending 2021-01-04 08:00:00
B Settled 2021-01-04 13:00:00

My first attempt was something like:

select GROUP, STATUS, DATE
from TABLE TBL
, (
select GROUP, STATUS, DATE
from TABLE
where STATUS in ('Cancelled','Failed')
) FLAG
where (TBL.GROUP <> FLAG.GROUP and TBL.DATE <> FLAG.DATE)
;

My expected output is shown below EDIT:, however it seems to be taking exceptionally long (>10 mins) even when applying date filters:

Group Status Date
A Pending 2021-01-03 08:00:00
A Pending Settlement 2021-01-03 13:00:00
A Pending 2021-01-04 08:00:00
A Settled 2021-01-04 13:00:00
B Pending 2021-01-03 08:00:00
B Pending Settlement 2021-01-03 13:00:00
B Pending 2021-01-04 08:00:00
B Settled 2021-01-04 13:00:00

CodePudding user response:

You may use the last_value() window function to get the last value within a group and then apply your filter against it.

SELECT "GROUP",
       "STATUS",
       "DATE"
       FROM (SELECT "GROUP",
                    "STATUS",
                    "DATE",
                    last_value("STATUS") OVER (PARTITION BY "GROUP",
                                                            trunc("DATE")
                                               ORDER BY "DATE" ASC) lv
                    FROM "TABLE") x
       WHERE lv NOT IN ('Cancelled',
                        'Failed');

Edit:

To filter out days where the status was 'Cancelled' or 'Failed' anytime during the day, you can use for example the windowed version of of count() with a CASE expression that gives a non-NULL value when the status is 'Cancelled' or 'Failed', or NULL (the default) otherwise.

SELECT "GROUP",
       "STATUS",
       "DATE"
       FROM (SELECT "GROUP",
                    "STATUS",
                    "DATE",
                    count(CASE
                            WHEN "STATUS" IN ('Cancelled',
                                              'Failed') THEN
                              0
                          END) OVER (PARTITION BY "GROUP",
                                                  trunc("DATE")) c
                    FROM "TABLE") x
       WHERE c = 0;
  •  Tags:  
  • Related