I have this table:
| user_id | name | modified_date | |
|---|---|---|---|
| 1 | John | [email protected] | 2022-01-01 |
| 1 | John | [email protected] | 2022-01-02 |
| 1 | Lucy | [email protected] | 2022-01-03 |
| 2 | Joey | [email protected] | 2021-12-24 |
| 3 | Mike | [email protected] | 2022-01-01 |
| 3 | Mary | [email protected] | 2022-01-02 |
I'm trying to get unique user_id's email with the most recent modified_date.
This is my expected output:
| user_id | name | |
|---|---|---|
| 1 | [email protected] | Lucy |
| 2 | [email protected] | Joey |
| 3 | [email protected] | Mary |
I used limit 1, but the output seems to be randomized.
Can somebody help me to sort this out ?
CodePudding user response:
At the moment, your query probably looks something like:
SELECT user_id, email
FROM email_addresses
GROUP BY user_id
You'll need to do something a little different
SELECT email_addresses.*
FROM email_addresses
LEFT OUTER JOIN email_addresses AS b
ON email_addresses.user_id = b.user_id
AND b.modified_date > email_addresses.modified_date
WHERE b.user_id IS NULL
This works by pairing with a newer record if one exists, and then the where clause requires the pair was unsuccessful.
CodePudding user response:
You can use distinct on:
select distinct on (user_id) *
from t
order by user_id, modified_date desc
CodePudding user response:
you can go for row_number() with partition by
Schema (PostgreSQL v10.0)
create table users(user_id integer, name varchar(10), email varchar(20), modified_date timestamp);
insert into users
values
(1 ,'John', '[email protected]', '2022-01-01'),
(1 ,'John', '[email protected]', '2022-02-01');
Query #1
select user_id, email, name
from
(
SELECT user_id, name, email,row_number() over(partition by user_id order by modified_Date desc) as rnk
FROM users
) as t
where rnk = 1;
| user_id | name | |
|---|---|---|
| 1 | [email protected] | John |
