I have a table of (tstz, value) of which data is inserted at random points in time, usually every second. I want to select and average the time series data, which I can do with:
SELECT
to_timestamp(round((extract(epoch from tstz)) / 10) * 10) AS ts,
AVG(value)
FROM table
GROUP BY ts
ORDER BY ts
the function to_timestamp(round((extract(epoch from tstz)) / 10) * 10) rounds the timestamptz to the nearest provided interval (i.e. 10 seconds in this example). So assuming 10 values per second, 10 values are aggregated into an average. Although, if there is a 10 second period (or any period equal to the period we are averaging at) then the timestamp is "missed", I want the value to instead be null.
I am trying to do with generate_series() so that I have an accurate list of timestamps which I can do successfully. However, I do not know how to make this compatible with the query above and add null values where there are no values to average.
CodePudding user response:
Build a contiguous list of ts periods using generate_series and left outer join it with your query. I am using CTEs for clarity but subqueries can do too.
with t as --your query
(
SELECT
to_timestamp(round((extract(epoch from tstz)) / 10) * 10) AS ts,
AVG(value)
FROM table
GROUP BY ts
),
contiguous_ts_list as
(
select ts from generate_series(
(select min(ts) from t),
(select max(ts) from t),
interval '10 seconds'
) ts
)
select *
from contiguous_ts_list
left outer join t using (ts)
order by ts;
