I have the following table describing patients visits: Each patient has a visit_id for which he sees a given physician. I am trying to extract the value of the visit_id for which saw his 3rd physician. (3rd physician and not 3rd visit)
| patient | visit_id | physician |
|---|---|---|
| a | 1 | id_1 |
| a | 2 | id_2 |
| a | 3 | id_1 |
| a | 4 | id_3 |
| b | 5 | id_1 |
| b | 6 | id_2 |
| c | 7 | id_1 |
| c | 8 | id_2 |
| c | 9 | id_3 |
so the result would be:
| patient | visit_id |
|---|---|
| a | 4 |
| c | 9 |
Any suggestions?
CodePudding user response:
The following statement returns your result. The innermost subquery eliminates multiple visits to the same physician, then row_number() counts the visits and the outermost select gets the third physician.
select patient, visit
from (select patient, visit, row_number() over (partition by patient order by visit) rn
from ( select patient, min(visit) as visit
from tab
group by patient, physician
) t1
) t2
where t2.rn = 3
Result:
| patient | visit_id |
|---|---|
| a | 4 |
| c | 9 |
See db<>fiddle
CodePudding user response:
You can group by patient and physician to remove "duplicate" physicians and use min for visit_id:
-- test data
WITH dataset (patient, visit_id, physician) AS (
VALUES ('a', 1, 'id_1'),
('a', 2, 'id_2'),
('a', 3, 'id_1'),
('a', 4, 'id_3'),
('b', 5, 'id_1'),
('b', 6, 'id_2'),
('c', 7, 'id_1'),
('c', 8, 'id_2'),
('c', 9, 'id_3')
)
-- query
select patient, visit_id
from (
select *,
row_number() over (partition by patient order by visit_id) rnk
from (
select patient,
min(visit_id) visit_id,
physician
from dataset
group by patient, physician
)
)
where rnk = 3
Output:
| patient | visit_id |
|---|---|
| a | 4 |
| c | 9 |
Note that this query uses presto syntax (as your question has presto tag).
