I have a table like this:
| column_1 | column_2 | column_3 | column_4 |
|---|---|---|---|
| 41d97f0c-7c0b-441b-b947-d18ad2778899 | 1feb623f-781f-4c0e-9230-861970bd2e91 | random1 | 2022-01-01 |
| 1feb623f-781f-4c0e-9230-861970bd2e91 | 41d97f0c-7c0b-441b-b947-d18ad2778899 | random2 | 2022-01-01 |
| 7c1cbb7b-fefc-4ab8-827d-4e1a9a130da9 | 4722b35b-7ebb-4bb7-bd84-bd290fba5d3e | random3 | 2022-01-02 |
| 1313a680-5f87-4c03-b8a0-4c5743f0f2a4 | da67bcae-dad5-4122-95a2-7f6c32ca85e7 | random4 | 2022-01-03 |
| da67bcae-dad5-4122-95a2-7f6c32ca85e7 | 1313a680-5f87-4c03-b8a0-4c5743f0f2a4 | random5 | 2022-01-04 |
For all rows where column_1 == column_2 and column_2 == column_1, I want to remove one of them as its a duplicate. In above example first 2 rows are duplicates, so I want to ignore 2nd row. Last 2 rows are also duplicates so I want to remove last row. The end result should look like:
| column_1 | column_2 | column_3 | column_4 |
|---|---|---|---|
| 41d97f0c-7c0b-441b-b947-d18ad2778899 | 1feb623f-781f-4c0e-9230-861970bd2e91 | random1 | 2022-01-01 |
| 7c1cbb7b-fefc-4ab8-827d-4e1a9a130da9 | 4722b35b-7ebb-4bb7-bd84-bd290fba5d3e | random3 | 2022-01-02 |
| 1313a680-5f87-4c03-b8a0-4c5743f0f2a4 | da67bcae-dad5-4122-95a2-7f6c32ca85e7 | random4 | 2022-01-03 |
CodePudding user response:
If I understand correctly you just want to keep one row for all distinct values across column_1 and column_2. The below should work.
SELECT
*
FROM
table
WHERE
column_1 NOT IN (SELECT column_2 FROM table)
UNION
SELECT
*
FROM
table
WHERE
column_2 NOT IN (SELECT column_1 FROM table)
CodePudding user response:
You can do:
select *
from t a
left join t b on b.col1 = a col2 and b.col2 = a.col1
where b.col1 is null or a.col1 < b.col1
