So i have this task on our company postgresql database.
I need to find all discontinuity in dates between the same ID's. Here is example how it look's on db. There is ID, next we have column 'valid_from' and 'valid_to'.
As you can see only on rows 1 and 2 there is discontinuity between columns 'valid_to' to 'valid_from'.
Between rows 2>3>4 everything is ok.
id valid_from valid_to
1 Name_of_record1 10.10.2011 17:26 21.07.2021 02:47
2 Name_of_record1 28.08.2021 13:00 11.09.2021 21:12
3 Name_of_record1 11.09.2021 21:12 13.09.2021 05:42
4 Name_of_record1 13.09.2021 05:42 31.12.9999 00:00
CodePudding user response:
Using the window function LAG() you can retrieve a column value from a row before the current row, so if previous row valid_to value is different from actual row valid_from value, you have a discontinuity.
SELECT id, valid_from::date, valid_to::date,
CASE WHEN valid_from::date != LAG(valid_to::date) OVER (PARTITION BY id ORDER BY valid_from)
THEN false ELSE true END AS continuity
FROM t;
LAG() will be return null for the first row of each group (rows with the same id, because there is not previous row with the same id). Continuity is set to true for the first row.
Output:
When continuity is false there is a discontinuity.
| id | valid_from | valid_to | continuity |
|---|---|---|---|
| 1 | 2011-10-10 | 2021-07-21 | t |
| 1 | 2021-08-28 | 2021-09-11 | f |
| 1 | 2021-09-11 | 2021-09-13 | t |
| 1 | 2021-09-13 | 9999-12-31 | t |
CodePudding user response:
You may use LEAD to confirm that the values are continuous. eg
Query #1
SELECT
*
FROM (
SELECT
*,
LEAD("valid_from",1,"valid_to") OVER (
PARTITION BY "rec_name"
ORDER BY "valid_from"
)="valid_to" as continuity
FROM
my_table
) t
ORDER BY "id";
| id | rec_name | valid_from | valid_to | continuity |
|---|---|---|---|---|
| 1 | Name_of_record1 | 10.10.2011 17:26 | 21.07.2021 02:47 | false |
| 2 | Name_of_record1 | 28.08.2021 13:00 | 11.09.2021 21:12 | true |
| 3 | Name_of_record1 | 11.09.2021 21:12 | 13.09.2021 05:42 | true |
| 4 | Name_of_record1 | 13.09.2021 05:42 | 31.12.9999 00:00 | true |
Query #2
SELECT
*
FROM (
SELECT
*,
LEAD("valid_from",1,"valid_to") OVER (
PARTITION BY "rec_name"
ORDER BY "valid_from"
)="valid_to" as continuity
FROM
my_table
) t
WHERE continuity=true
ORDER BY "id";
| id | rec_name | valid_from | valid_to | continuity |
|---|---|---|---|---|
| 2 | Name_of_record1 | 28.08.2021 13:00 | 11.09.2021 21:12 | true |
| 3 | Name_of_record1 | 11.09.2021 21:12 | 13.09.2021 05:42 | true |
| 4 | Name_of_record1 | 13.09.2021 05:42 | 31.12.9999 00:00 | true |
Let me know if this works for you.
