This is the table
| id | category | value |
|---|---|---|
| 1 | A | 40 |
| 1 | B | 20 |
| 1 | C | 10 |
| 2 | A | 4 |
| 2 | B | 7 |
| 2 | C | 7 |
| 3 | A | 32 |
| 3 | B | 21 |
| 3 | C | 2 |
I want the result like this
| id | category |
|---|---|
| 1 | A |
| 2 | B |
| 2 | C |
| 3 | A |
CodePudding user response:
For small tables or for only very few rows per user, a subquery with the window function rank() (as demonstrated by The Impaler) is just fine. The resulting sequential scan over the whole table, followed by a sort will be the most efficient query plan.
For more than a few rows per user, this gets increasingly inefficient though.
Typically, you also have a users table holding one distinct row per user. We can leverage that for an alternative query that scales much better - using WITH TIES in a LATERAL JOIN.
Requires Postgres 13 or later.
SELECT u.id, t.*
FROM users u
CROSS JOIN LATERAL (
SELECT t.category
FROM tbl t
WHERE t.id = u.id
ORDER BY t.value DESC
FETCH FIRST 1 ROWS WITH TIES -- !
) t;
See:
- Greater than or equal to ALL() and equal to MAX() speed
- Fetching a minimum of N rows, plus all peers of the last row
This can use a multicolumn index to great effect - which must exist, of course:
CREATE INDEX ON tbl (id, value);
Or:
CREATE INDEX ON tbl (id, value DESC);
Even faster index-only scans become possible with:
CREATE INDEX ON tbl (id, value DESC, category);
Or (the optimum for the query at hand):
CREATE INDEX ON tbl (id, value DESC) INCLUDE (category);
Assuming value is defined NOT NULL, or we have to use DESC NULLS LAST. See:
To keep users in the result that don't have any rows in table tbl, user LEFT JOIN LATERAL (...) ON true. See:
CodePudding user response:
You can use RANK() to identify the rows you want. Then, filtering is easy. For example:
select *
from (
select *,
rank() over(partition by id order by value desc) as rk
from t
) x
where rk = 1
Result:
id category value rk
--- --------- ------ --
1 A 40 1
2 B 7 1
2 C 7 1
3 A 32 1
See running example at DB Fiddle.
