Given this
| ID | rownum | Value |
|---|---|---|
| 100 | 1 | false |
| 100 | 2 | false |
| 100 | 3 | true |
| 100 | 4 | false |
| 100 | 5 | false |
| 100 | 6 | false |
| 100 | 7 | true |
| 100 | 8 | false |
| 100 | 9 | false |
| 101 | 1 | false |
| 101 | 2 | false |
| 101 | 3 | false |
| 101 | 4 | true |
| 101 | 5 | false |
| 101 | 6 | false |
| 101 | 7 | false |
I need this
| ID | Counts |
|---|---|
| 100 | 1 |
| 101 | 2 |
Basically finding the count of continuous 'false' windows of length 3 (can be any length) by grouping by ID. I tried simple count() function grouping by ID, it did not work, it just returned the total count by groups.
I am using PostgreSQL latest version (14.1). Also, it is a fairly large database (10M rows)
CodePudding user response:
Slightly modified @LukStorms' solution to count groups by chunks of length 3.
SELECT ID, sum(cnt) AS Counts
FROM
(
SELECT ID, Rnk, (COUNT(*) - 3) / 3 1 AS Cnt
FROM
(
SELECT *
, SUM(flag) OVER (PARTITION BY ID ORDER BY rownum) AS Rnk
FROM
(
SELECT ID, rownum, Value
, CASE
WHEN Value = LAG(Value) OVER (PARTITION BY ID ORDER BY rownum)
THEN 0 ELSE 1 END AS flag
FROM your_table
) q1
WHERE Value = false
) q2
GROUP BY ID, Rnk
HAVING COUNT(*) > 2
) q3
GROUP BY ID
ORDER BY ID;
CodePudding user response:
Using LAG and SUM OVER it's possible to calculate a rank per ID for continues values.
Once there's a ranking, it's simple to aggregate on it.
SELECT ID, SUM(FLOOR(RankTotal/3)) AS Counts FROM ( SELECT ID, Rnk, COUNT(*) AS RankTotal FROM ( SELECT * , SUM(flag) OVER (PARTITION BY ID ORDER BY rownum) AS Rnk FROM ( SELECT ID, rownum, Value , CASE WHEN Value = LAG(Value) OVER (PARTITION BY ID ORDER BY rownum) THEN 0 ELSE 1 END AS flag FROM your_table ) q1 WHERE Value = false ) q2 GROUP BY ID, Rnk HAVING COUNT(*) > 2 ) q3 GROUP BY ID ORDER BY ID;
| id | counts |
|---|---|
| 100 | 1 |
| 101 | 2 |
| 102 | 4 |
Demo on db<>fiddle here
