Home > Blockchain >  SQL: how to remove duplicate rows when value in Column 1 = value in Column 2 and vice versa
SQL: how to remove duplicate rows when value in Column 1 = value in Column 2 and vice versa

Time:02-01

I have a table like this:

column_1 column_2 column_3 column_4
41d97f0c-7c0b-441b-b947-d18ad2778899 1feb623f-781f-4c0e-9230-861970bd2e91 random1 2022-01-01
1feb623f-781f-4c0e-9230-861970bd2e91 41d97f0c-7c0b-441b-b947-d18ad2778899 random2 2022-01-01
7c1cbb7b-fefc-4ab8-827d-4e1a9a130da9 4722b35b-7ebb-4bb7-bd84-bd290fba5d3e random3 2022-01-02
1313a680-5f87-4c03-b8a0-4c5743f0f2a4 da67bcae-dad5-4122-95a2-7f6c32ca85e7 random4 2022-01-03
da67bcae-dad5-4122-95a2-7f6c32ca85e7 1313a680-5f87-4c03-b8a0-4c5743f0f2a4 random5 2022-01-04

For all rows where column_1 == column_2 and column_2 == column_1, I want to remove one of them as its a duplicate. In above example first 2 rows are duplicates, so I want to ignore 2nd row. Last 2 rows are also duplicates so I want to remove last row. The end result should look like:

column_1 column_2 column_3 column_4
41d97f0c-7c0b-441b-b947-d18ad2778899 1feb623f-781f-4c0e-9230-861970bd2e91 random1 2022-01-01
7c1cbb7b-fefc-4ab8-827d-4e1a9a130da9 4722b35b-7ebb-4bb7-bd84-bd290fba5d3e random3 2022-01-02
1313a680-5f87-4c03-b8a0-4c5743f0f2a4 da67bcae-dad5-4122-95a2-7f6c32ca85e7 random4 2022-01-03

CodePudding user response:

If I understand correctly you just want to keep one row for all distinct values across column_1 and column_2. The below should work.

SELECT 
  * 
FROM 
  table 
WHERE 
  column_1 NOT IN (SELECT column_2 FROM table)
UNION
SELECT 
  * 
FROM 
  table 
WHERE 
  column_2 NOT IN (SELECT column_1 FROM table)

CodePudding user response:

You can do:

select *
from t a
left join t b on b.col1 = a col2 and b.col2 = a.col1
where b.col1 is null or a.col1 < b.col1
  •  Tags:  
  • Related