I want to do a window function (like the SUM() OVER() function), but there are two catches:
- I want to consider the last 3 months on my moving sum, but the number of rows are not consistent. Some months have 3 entries, others may have 2, 4, 5, etc;
- There is also a "group" column, and the moving sum should sum only the amounts of the same group.
In summary, a have a table that has the following structure:
| id | date | group | amount |
|---|---|---|---|
| 1 | 2022-01 | group A | 1100 |
| 2 | 2022-01 | group D | 2500 |
| 3 | 2022-02 | group A | 3000 |
| 4 | 2022-02 | group B | 1000 |
| 5 | 2022-02 | group C | 2500 |
| 6 | 2022-03 | group A | 2000 |
| 7 | 2022-04 | group C | 1000 |
| 8 | 2022-05 | group A | 1500 |
| 9 | 2022-05 | group D | 2000 |
| 10 | 2022-06 | group B | 1000 |
So, I want to add a moving sum column, containing the sum the ammount for each group for the last 3 months. The sum should not reset every 3 months, but should consider only the previous values from the 3 months prior, and of the same group.
The end result should look like:
| id | date | group | amount | moving_sum_three_months |
|---|---|---|---|---|
| 1 | 2022-01 | group A | 1100 | 1100 |
| 2 | 2022-01 | group D | 2500 | 2500 |
| 3 | 2022-02 | group A | 3000 | 4100 |
| 4 | 2022-02 | group B | 1000 | 1000 |
| 5 | 2022-02 | group C | 2500 | 2500 |
| 6 | 2022-03 | group A | 2000 | 6100 |
| 7 | 2022-04 | group C | 1000 | 3500 |
| 8 | 2022-05 | group A | 1500 | 3500 |
| 9 | 2022-05 | group D | 2000 | 2000 |
| 10 | 2022-06 | group B | 1200 | 1200 |
The best example to see how the sum work in this example is line 8.
- It considers only lines 8 and 6 for the sum, because they are the only one that meet the criteria;
- Line 1 and 3 do not meet the criteria, because they are more than 3 months old from line 8 date;
- All the other lines are not from group A, so they are also excluded from the sum.
Any ideias? Thanks in advance for the help!
CodePudding user response:
Use SUM() as a window function partitioning the window by group in RANGE mode. Set the frame to go back 3 months prior the current record using INTERVAL '3 months', e.g.
SELECT *, SUM(amount) OVER w AS moving_sum_three_months
FROM t
WINDOW w AS (PARTITION BY "group" ORDER BY "date"
RANGE BETWEEN INTERVAL '3 months' PRECEDING AND CURRENT ROW)
ORDER BY id
Demo: db<>fiddle
