How to return the category with max value for every user in postgresql?-CodePudding

This is the table

id	category	value
1	A	40
1	B	20
1	C	10
2	A	4
2	B	7
2	C	7
3	A	32
3	B	21
3	C	2

I want the result like this

id	category
1	A
2	B
2	C
3	A

CodePudding user response：

For small tables or for only very few rows per user, a subquery with the window function rank() (as demonstrated by The Impaler) is just fine. The resulting sequential scan over the whole table, followed by a sort will be the most efficient query plan.

For more than a few rows per user, this gets increasingly inefficient though.

Typically, you also have a users table holding one distinct row per user. We can leverage that for an alternative query that scales much better - using WITH TIES in a LATERAL JOIN.
Requires Postgres 13 or later.

SELECT u.id, t.*
FROM   users u
CROSS  JOIN LATERAL (
   SELECT t.category
   FROM   tbl t
   WHERE  t.id = u.id
   ORDER  BY t.value DESC
   FETCH  FIRST 1 ROWS WITH TIES  -- !
   ) t;

See:

This can use a multicolumn index to great effect - which must exist, of course:

CREATE INDEX ON tbl (id, value);

Or:

CREATE INDEX ON tbl (id, value DESC);

Even faster index-only scans become possible with:

CREATE INDEX ON tbl (id, value DESC, category);

Or (the optimum for the query at hand):

CREATE INDEX ON tbl (id, value DESC) INCLUDE (category);

Assuming value is defined NOT NULL, or we have to use DESC NULLS LAST. See:

Sort by column ASC, but NULL values first?

To keep users in the result that don't have any rows in table tbl, user LEFT JOIN LATERAL (...) ON true. See:

What is the difference between LATERAL JOIN and a subquery in PostgreSQL?

CodePudding user response：

You can use RANK() to identify the rows you want. Then, filtering is easy. For example:

select *
from (
  select *,
    rank() over(partition by id order by value desc) as rk
  from t
) x
where rk = 1

Result:

 id  category  value  rk 
 --- --------- ------ -- 
 1   A         40     1  
 2   B         7      1  
 2   C         7      1  
 3   A         32     1

See running example at DB Fiddle.