I'm doing SQL exercises at WindowFunctions.com. The task is to cumulatively sum cats' weights from heaviest to lightest. The table looks like:
| name | weight | ... |
|---|---|---|
| Ashes | 4.5 | ... |
| Tigger | 3.8 | ... |
| Smokey | 6.1 | ... |
| ... |
My first guess was
SELECT
name,
SUM(weight) OVER (ORDER BY weight DESC) AS running_total_weight
FROM cats
However, it produces wrong result which sums first two cats into first cumulative sum.
| name | running_total_weight |
|---|---|
| Smokey | 12.2 |
| Oscar | 12.2 |
| ... | |
| Tigger | 61.1 |
If I, however, run the same query ascending, the summation is done correctly, and Tiger has weight 3.8 and not 8.0.
The site suggests solution: rewrite OVER () clause as OVER (ORDER BY weight DESC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW). Why does it works? Why is it needed when summation descending and not needed when descending?
CodePudding user response:
The point is not the unbounded preceding.
The point is the default framing behavior when over (order by...) is used, which is range (and not rows).
select *
--rows
,sum(c) over(order by c rows unbounded preceding) rows1
,sum(c) over(order by c rows between unbounded preceding and current row) rows2
--range
,sum(c) over(order by c) range1
,sum(c) over(order by c range unbounded preceding) range2
,sum(c) over(order by c range between unbounded preceding and current row) range3
from (values (10),(20),(20),(20),(30)) t(c)
| c | rows1 | rows2 | range1 | range2 | range3 |
|---|---|---|---|---|---|
| 10 | 10 | 10 | 10 | 10 | 10 |
| 20 | 30 | 30 | 70 | 70 | 70 |
| 20 | 50 | 50 | 70 | 70 | 70 |
| 20 | 70 | 70 | 70 | 70 | 70 |
| 30 | 100 | 100 | 100 | 100 | 100 |
