I have a table like this:
| ID | OtherID | Date |
|---|---|---|
| 1 | z | 2022-09-19 |
| 1 | b | 2021-04-05 |
| 2 | e | 2022-04-05 |
| 3 | t | 2022-07-08 |
| 3 | z | 2021-03-02 |
I want a table like this:
| ID | OtherID | Date |
|---|---|---|
| 1 | z | 2022-09-19 |
| 2 | e | 2022-04-05 |
| 3 | t | 2022-07-08 |
That have distinct pairs consisted of ID-OtherID based on the Date values which are the most recent.
The problem I have now is the relationship between ID and OtherID is 1:M
I've looked at SELECT DISTINCT, GROUP BY, LAG but I couldn't figure it out. I'm sorry if this is a duplicate question. I couldn't find the right keywords to search for the answer.
Update: I use Postgres but would like to know other SQL as well.
CodePudding user response:
This works for many dbms (versions of postgres, mysql and others) but you may need to adapt if something else. You could use a CTE, or a join, or a subquery such as this:
select id, otherid, date
from (
select id, otherid, date,
rank() over (partition by id order by date desc) as id_rank
from my_table
)z
where id_rank = 1
| id | otherid | date |
|---|---|---|
| 1 | z | 2022-09-19T00:00:00.000Z |
| 2 | e | 2022-04-05T00:00:00.000Z |
| 3 | t | 2022-07-08T00:00:00.000Z |
CodePudding user response:
You can use a Common Table Expression (CTE) with ROW_NUMBER() to assign a row number based on the ID column (then return the first row for each ID in the WHERE clause rn = 1):
WITH cte AS
(SELECT ID,
OtherID,
Date,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Date DESC) AS rn
FROM sample_table)
SELECT ID,
OtherID,
Date
FROM cte
WHERE rn = 1;
Result:
| ID | OtherID | Date |
|---|---|---|
| 1 | z | 2022-09-19 |
| 2 | e | 2022-04-05 |
| 3 | t | 2022-07-08 |
Fiddle here.
