I have a table, and I would like the following update command to set isExcluded = 1 to all rows where PhoneID and PhoneName are not duplicates and all rows where the ID doesn't not have the smallest number from a selected PhoneID if those rows do not have a duplicate PhoneID and PhoneName (i.e.: since all rows with PhoneID = 2 are not duplicates, the row containing PhoneName = b has the smallest ID, since it's ID = 3. Therefore, all rows with PhoneID = 2 and ID > 3 will have their IsExcluded set to 1).
| ID | PhoneID | PhoneName | isExcluded |
|---|---|---|---|
| 1 | 1 | a | 0 |
| 2 | 1 | a | 0 |
| 3 | 2 | b | 0 |
| 4 | 2 | c | 0 |
| 5 | 2 | d | 0 |
| 6 | 2 | e | 0 |
| 7 | 3 | c | 0 |
| 8 | 3 | c | 0 |
| 9 | 3 | d | 0 |
| 10 | 3 | d | 0 |
Here's my SQL script that I wrote. It only seems to get the non-duplicates only.
WITH Duplicates AS
(
SELECT
ID, PhoneID, PhoneName, isExcluded,
(ROW_NUMBER() OVER (PARTITION BY PhoneName, PhoneID ORDER BY ID)) AS RowNum
FROM
Phones
)
UPDATE Phones
SET isExcluded = 1
FROM Duplicates d
WHERE (
d.PhoneID = Phones.PhoneID
AND d.PhoneName = Phones.PhoneName
AND d.RowNum =< 1);
SELECT * FROM Phones;
This table should be the result of my command.
| ID | PhoneID | PhoneName | isExcluded |
|---|---|---|---|
| 1 | 1 | a | 0 |
| 2 | 1 | a | 0 |
| 3 | 2 | b | 0 |
| 4 | 2 | c | 1 |
| 5 | 2 | d | 1 |
| 6 | 2 | e | 1 |
| 7 | 3 | c | 0 |
| 8 | 3 | c | 0 |
| 9 | 3 | d | 1 |
| 10 | 3 | d | 1 |
CodePudding user response:
This looks to be a variation of a gaps and islands problem, which you can solve by first grouping the partitions and then using an updatable CTE to assign the isExcluded value
with gp as (
select *,
Row_Number() over(order by id)
- Row_Number() over(partition by phoneid, phonename order by id) gp
from t
), p as (
select *,
case when Min(gp) over(partition by phoneid) <gp then 1 end IsEx
from gp
)
update p set isExcluded = isEx
where IsEx = 1
