Home > Back-end >  SQL: find duplicate entries with additional filter
SQL: find duplicate entries with additional filter

Time:02-01

I have a simple table like so:

Name   data
--   ----
A    1
A    2
A    null
B    1
B    null
C    1
D    1
D    2
E    null
E    null

I want to get the list of Names that:

  • have more than one entry associated with them
  • have data column field set to null in one of those rows. (or any filter on the value of the data field)

ie, i want to return A, B, E

I can do the first part ok but don't know how to do the second part. I tried this:

select Name from Table1
group by Name
having count(*) > 1
and Score is null;

which gives me E only, which is not good enough.

I can of course:

  • select the full set of Names for which I have data = null, as a subquery
  • add a where Name in (...) before the group by

something like:

select Name from Table1
where Name in (
  select distinct Name
  from Table1
  where data is null
)
group by Name
having count(*) > 1;

but that's unnecessarily complicated but is also not generic enough. What If I wanted the ones that have at least two null entries? etc.

Is there an easy/clean way of doing this?

CodePudding user response:

count(*) counts all rows.

count(Score) counts non-null Score values.

select Name from Table1
group by Name
having count(*) > 1               -- more than 1 row
   and count(*) > count(Score)    -- and at least 1 null

If you want to add at least two null entries:

select Name from Table1
group by Name
having count(*) >= count(Score)   2   -- at least 2 null values

CodePudding user response:

You can do intersection of both of your conditions

(select Name from Table1
group by Name
having count(*) > 1)
INTERSECT
(select Name from Table1
where Score is null)

CodePudding user response:

Taking into consideration your additional requirement of "what if ... at least two null entries" I would suggest conditional aggregation.

select Name from Table1
group by Name
having count(*) > 1 AND SUM(CASE WHEN data IS NULL THEN 1 ELSE 0 END) > 0

That makes it flexible enough, all you would need is to change > 0 to > 1 (or any other number).

DEMO (with MySQL)

  •  Tags:  
  • Related