SQL Remove Duplicate Rows with Condition-CodePudding

I have the following table:

TABLE cars
id |     name    |   prodDate   |  fuelType  |  registered  |
1      'Abarth'        2012           1            true
2      'Acura'         2022           2            true
3      'Acura'         2022           2            false
4      'Alfa Romeo'    2013           2            false
5      'Aston Martin'  2005           1            true
6      'Bentley'       2007           1            false
7      'BMW'           2019           2            false
8      'BMW'           2016           2            false
9      'Chevrolet'     2021           2            true
10     'Citroen'       2018           2            false
11     'Ferrari'       2015           2            true
12     'Ferrari'       2015           2            false
13     'Ferrari'       2014           2            false
14     'GMC'           2008           2            true

I need to write SQL script to delete all the duplicate rows (duplicate rows are the ones with the same name value) in which the registered is true. I must consider fuelType = 1 only.

For example row id = 11 will be deleted because there are duplicate Ferrari rows and this is the one with registered = true. Row id = 9 won't be deleted, as there are no duplicate rows with name = Chevrolet.

How can I do this?

CodePudding user response：

We use rank() to find all the duplicates where registered = true and get rid of them.

select id   
      ,name 
      ,proddate 
      ,fueltype 
      ,registered
from   (
       select *
              ,rank() over(partition by name order by registered) as rnk
       from   cars 
       ) cars
where  rnk = 1

id	name	proddate	fueltype	registered
1	Abarth	2012	1	true
3	Acura	2022	2	false
4	Alfa Romeo	2013	2	false
5	Aston Martin	2005	1	true
6	Bentley	2007	1	false
8	BMW	2016	2	false
7	BMW	2019	2	false
9	Chevrolet	2021	2	true
10	Citroen	2018	2	false
12	Ferrari	2015	2	false
13	Ferrari	2014	2	false
14	GMC	2008	2	true

Fiddle

CodePudding user response：

when I normally get into such a situation I normally use a GROUP BY with HAVING take this example, I hope it helps

 SELECT fstName, 
    lstName, 
    COUNT(*) AS Count
    FROM Employee
      GROUP BY fstName,lstName
         HAVING COUNT(*) > 1; //those that have count more than once