There is a table:
| id | direction_id | created_at |
|---|---|---|
| 1 | 2 | 22 November 2021, 16:00:00 |
| 2 | 2 | 22 November 2021, 16:20:00 |
| 43 | 2 | 22 November 2021, 16:25:00 |
| 455 | 1 | 22 November 2021, 16:27:00 |
| 6567 | 2 | 22 November 2021, 17:36:00 |
| 674556 | 2 | 22 November 2021, 20:01:00 |
| 5243554 | 1 | 22 November 2021, 20:50:00 |
| 5243554 | 1 | 22 November 2021, 21:46:00 |

I need to get the following result:

| 1 | 2 | created_at_by_hour |
|---|---|---|
| 1 | 3 | 22.11.21 17 |
| 1 | 4 | 22.11.21 18 |
| 1 | 4 | 22.11.21 19 |
| 1 | 4 | 22.11.21 20 |
| 2 | 5 | 22.11.21 21 |
| 3 | 5 | 22.11.21 22 |

The 1 and 2 in the header are all the values of direction_id that occur in the table.
created_at is truncated to the hour, and for each hour I need to count how many records satisfy created_at <= created_at_by_hour. If no records were created during some hour, that row should simply repeat the counts of the previous hour.
The table has three fields: id (int), direction_id (int), and created_at (timestamptz), where created_at is the time the record was created. I need an hourly report, based on created_at, of the number of records created up to each hour. The count must be kept separately for each direction_id (there are only two of them, 1 and 2). If no records were created for a direction_id in a given hour, the previous count is repeated, and the result should end at the hour that covers the last created_at.
CodePudding user response:
In my opinion, it is better to generate a series of hourly timestamps between the minimum and maximum created_at, and then count the records of each direction created up to each of those hours.

    with time_range as (
        -- first and last hour boundaries: truncate to the hour, then step one hour forward
        select
            date_trunc('hour', min(created_at)) + interval '1 hour' as min,
            date_trunc('hour', max(created_at)) + interval '1 hour' as max
        from test
    )
    select
        count(*) filter (where direction_id = 1) as "1",
        count(*) filter (where direction_id = 2) as "2",
        to_char(gs.hour, 'dd.mm.yy HH24') as created_at_by_hour
    from
        test t
        cross join time_range tr
        -- each record is counted in every hour at or after its creation time
        inner join generate_series(tr.min, tr.max, interval '1 hour') gs(hour)
            on t.created_at <= gs.hour
    group by gs.hour
    order by gs.hour
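
To try the query, the sample rows from the question can be loaded as below. This is only a setup sketch: the table name `test` follows the query above, the column types follow the question, and the UTC session time zone and `+00` offsets are assumptions made so that the hour labels come out as in the question.

```sql
-- Assumption: a UTC session, so date_trunc and to_char work on the hours shown in the question.
set time zone 'UTC';

-- Table name "test" is taken from the query above; types are as stated in the question.
create table test (
    id           int,         -- no primary key: the sample data repeats id 5243554
    direction_id int,
    created_at   timestamptz
);

-- The eight sample rows from the question (offsets +00 are an assumption).
insert into test (id, direction_id, created_at) values
    (1,       2, '2021-11-22 16:00:00+00'),
    (2,       2, '2021-11-22 16:20:00+00'),
    (43,      2, '2021-11-22 16:25:00+00'),
    (455,     1, '2021-11-22 16:27:00+00'),
    (6567,    2, '2021-11-22 17:36:00+00'),
    (674556,  2, '2021-11-22 20:01:00+00'),
    (5243554, 1, '2021-11-22 20:50:00+00'),
    (5243554, 1, '2021-11-22 21:46:00+00');
```

Because every record is joined to every hour at or after it, the join grows with rows times hours, which is fine for a small table but may become slow on a large one.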
CodePudding user response:
Truncate the date down to the hour, group by it and count. Then use SUM OVER to get a running total of the counts. In order to show missing hours in the table, you must generate a series of hours and outer join your data.

    with hourly as
    (
        select date_trunc('hour', created_at) as hour, direction_id from mytable
    )
    , hours(hour) as
    (
        select *
        from generate_series
        (
            (select min(hour) from hourly), (select max(hour) from hourly), interval '1 hour'
        )
    )
    select
        hours.hour,
        sum(count(*) filter (where hourly.direction_id = 1)) over (order by hour) as "1",
        sum(count(*) filter (where hourly.direction_id = 2)) over (order by hour) as "2"
    from hours
    left join hourly using (hour)
    group by hour
    order by hour;

Demo: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=21d0c838452a09feac4ebc57906829f4
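
Note that the query above labels each row with the start of the hour (16, 17, ...), whereas the result requested in the question labels it with the end of the hour (17, 18, ...) in `dd.mm.yy HH24` format. One possible variant, as a sketch only, shifts each bucket forward by one hour and formats the label. The table name `mytable` follows the query above; the output column name `created_at_by_hour` follows the question. With the question's sample data and a UTC session this should give the rows 17 through 22 with the requested counts.

```sql
with hourly as
(
    -- assign each record to the hour boundary that follows it (16:20 -> 17:00)
    select date_trunc('hour', created_at) + interval '1 hour' as hour, direction_id
    from mytable
)
, hours(hour) as
(
    -- every hour boundary between the first and the last bucket, gaps included
    select *
    from generate_series
    (
        (select min(hour) from hourly), (select max(hour) from hourly), interval '1 hour'
    )
)
select
    -- running totals: hours without records add 0 and so repeat the previous value
    sum(count(*) filter (where hourly.direction_id = 1)) over (order by hour) as "1",
    sum(count(*) filter (where hourly.direction_id = 2)) over (order by hour) as "2",
    to_char(hour, 'dd.mm.yy HH24') as created_at_by_hour
from hours
left join hourly using (hour)
group by hour
order by hour;
```

One difference from a strict `created_at <= hour` comparison: a record created exactly on the hour (for example 17:00:00) is counted from the following row (18) onward in this variant.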
