Problem: I have a table of transactions (see below) with either (open/start) or (close/end) transaction in the format of date. The task is to merge these transactions with their corresponding dates, however, there can be also cases when the transaction is opened/started, but not closed/ended, in which case only Start Date must be shown, and the Flag 'Y' assigned. The rest of cases where there is an Enddate, the flag will be 'N'. I have been able to set a flag depending on the values with first_value(Flag IGNORE NULLS) over(partition by Customer_ID order by EndDate desc), but Im having a bit harder time with matching the start with end dates.
Logic:
- There can be multiple start & end dates per one day with only time differences.
- There can be just one start & end date transaction per customer per day.
- There can be just start date (open) transaction per customer per day.
- Flag 'N' only for closed transactions, 'Y' for open i.e. no end date.
Current set example looks like this: Note: for this particular customer we have equal amount of transactions (inserts & deletes so I know the flag will be 'N', when we map start and enddates, but the solution also has to work when there are more start dates than enddates for other customers, where the final (newest) start date row will have a flag of 'Y'.
| Customer_Id | Start Date | End Date | Flag |
|---|---|---|---|
| 111 | 02/07/2020 20:58:32.000000 | N | |
| 111 | 02/07/2020 19:18:04.000000 | N | |
| 111 | 01/06/2020 09:38:49.000000 | N | |
| 111 | 01/06/2020 09:36:34.000000 | N | |
| 111 | 29/05/2020 16:58:07.000000 | N | |
| 111 | 02/07/2020 20:57:52.000000 | N | |
| 111 | 02/07/2020 19:17:22.000000 | N | |
| 111 | 29/05/2020 16:58:06.000000 | N | |
| 111 | 01/06/2020 09:38:40.000000 | N | |
| 111 | 01/06/2020 09:36:34.000000 | N |
Expected result:
| Customer_Id | Start Date | End Date | Flag |
|---|---|---|---|
| 111 | 02/07/2020 20:57:52.000000 | 02/07/2020 20:58:32.000000 | N |
| 111 | 02/07/2020 19:17:22.000000 | 02/07/2020 19:18:04.000000 | N |
| 111 | 01/06/2020 09:38:40.000000 | 01/06/2020 09:38:49.000000 | N |
| 111 | 01/06/2020 09:36:34.00000 | 01/06/2020 09:36:34.000000 | N |
| 111 | 29/05/2020 16:58:06.000000 | 29/05/2020 16:58:07.000000 | N |
CodePudding user response:
This will return the matching end date for each start date.
select customer_id
,start_date
,lead(end_date) -- find the next row's end date
over (partition by customer_id
order by coalesce(start_date, end_date)) as new_end
,case when new_end is null then 'Y' else 'N' end as flag
from tab
qualify start_date is not null -- only return starting rows
order by 1,2;
