I have a table that looks like below:
| person | fruit | date |
|---|---|---|
| A | apple | xxxx |
| B | banana | xxxx |
| C | apple | xxxx |
| A | banana | xxxx |
| C | apple | xxxx |
| B | banana | xxxx |
I am interested in persons who have more than one banana in the data set. In this case, it would be person B. I understand how to achieve this by aggregating the data. However, if I want my result to be NOT agrregated and look something like below, what would be the best way?
| person | fruit | date |
|---|---|---|
| B | banana | xxxx |
| B | banana | xxxx |
CodePudding user response:
You can use a window function such as SUM() OVER () along with a conditional aggregation :
SELECT person, fruit, date
FROM (SELECT SUM(CASE
WHEN fruit = 'banana' THEN
1
END) OVER( PARTITION BY person ) AS cnt
FROM t) tt
WHERE cnt > 1
CodePudding user response:
You can achieve this by using window function in subquery (in this case with count_if aggregate function) and filtering the result:
-- sample data
WITH dataset(person, fruit, date) AS (
VALUES ('A', 'apple' ,'xxxx'),
('B', 'banana' ,'xxxx'),
('C', 'apple' ,'xxxx'),
('A', 'banana' ,'xxxx'),
('C', 'apple' ,'xxxx'),
('B', 'banana' ,'xxxx')
)
-- query
select person, fruit, date
from (select *,
count_if(fruit = 'banana') over(partition by person) banana_count
from dataset)
where banana_count > 1
Output:
| person | fruit | date |
|---|---|---|
| B | banana | xxxx |
| B | banana | xxxx |
