I have a database which has following columns
id int
start datetime
end datetime
<other columns
The start and END is datetime of following format YYYY-MM-DD H24:MI:SS.MI. In a perfect world the END of yesterday will become start of today. However, there are times when the job doesn't run and it skips making an entry into the system. I want to write a query which can tell me which date is missing from the table (time doesn't matter).
For example, consider this
START END
2022-01-31 H24:MI:SS.MI 2022-02-01 H24:MI:SS.MI
2022-01-29 H24:MI:SS.MI 2022-01-30 H24:MI:SS.MI
I want to write a query which should be able to tell that start of 2022-01-30 is missing from the table.
EDIT
Bare minimum, the expected output should be 2022-01-30 (i.e., start is missing this date)
CodePudding user response:
I think you are looking for end_dates that don't have a matching start_date — except the last end_date:
with data as (
select $1 start_date, $2 end_date
from values('2022-01-31', '2022-02-01'), ('2022-01-29', '2022-01-30')
)
select end_date
from data
where end_date not in (select start_date from data)
and end_date not in (select max(end_date) from data)
CodePudding user response:
if you have some date with a gap like (I have used number instead of dates, just because they are small to type, and read, but the idea works across the types):
SELECT * FROM VALUES
(1,1,2),
(1,2,3),
(1,4,5),
(1,5,6)
v(job_id, start_id, end_id)
then you can get the (2,3) row AND/OR the (4,5) row by using a LAG/LEAD function where the prior end is not the current start, OR the next start is not the current end.
like:
SELECT v.*,
LAG(end_id, 1, start_id) OVER (PARTITION BY job_id order by start_id) as prior_end,
LEAD(start_id, 1, end_id) OVER (PARTITION BY job_id order by start_id) as next_start
FROM VALUES
(1,1,2),
(1,2,3),
(1,4,5),
(1,5,6)
v(job_id, start_id, end_id)
QUALIFY (start_id != prior_end) OR (end_id != next_start)
ORDER BY 1,2
gives:
| JOB_ID | START_ID | END_ID | PRIOR_END | NEXT_START |
|---|---|---|---|---|
| 1 | 2 | 3 | 2 | 4 |
| 1 | 4 | 5 | 3 | 5 |
and if you want it "cleaner"
SELECT v.*
FROM VALUES
(1,1,2),
(1,2,3),
(1,4,5),
(1,5,6)
v(job_id, start_id, end_id)
QUALIFY (
start_id != LAG(end_id, 1, start_id) OVER (PARTITION BY job_id order by start_id)
) OR (
end_id != LEAD(start_id, 1, end_id) OVER (PARTITION BY job_id order by start_id)
)
ORDER BY 1,2
| JOB_ID | START_ID | END_ID |
|---|---|---|
| 1 | 2 | 3 |
| 1 | 4 | 5 |
