Home > Back-end >  How do I remove duplicate word in a cell in SQL
How do I remove duplicate word in a cell in SQL

Time:02-01

How do I remove duplicates in the following case in T-SQL?

I have a table with a column Code of type varchar(max).

It contains column value like truck/rail/truck/rail. I need the cell value to be truck/rail.

Other possibility is truck/rail/ship/truck need to be truck/rail/ship.

By using table valued function.

Thanks.

CodePudding user response:

You can use String_Split along with String_agg to remove the duplicates.

DECLARE @t table(id int, val varchar(max))

insert into @t values(1,'truck/rail/truck/rail'), (2,'truck/rail/ship/truck')

SELECT t.id,STRING_AGG(splitval,'/') as newval FROM @t as t
cross apply (
SELECT distinct value from string_split(t.val,'/')) as ca(splitval)
group by t.id
id newval
1 rail/truck
2 rail/ship/truck

Note1: String_Split, does not guarantee order. So, your concatenated results might be in different order from the original list, after duplicates removal. If you want to preserve the order, then we have to go for different solution using xml nodes or json array.

Note2: String_Split was introduced in SQL Server 2016. String_agg was introduced in SQL Server 2017. So, if you are using versions before that, you have to go for recursive CTE and CHARINDEX based solution.

CodePudding user response:

If you know that the error exists, then just do an UPDATE where you replace the truck/rail/truck/rail with truck/rail using the REPLACE(Code,'truck/rail/truck/rail',truck/rail).

The same goes for your truck/rail/ship/truck issue.

If you need automatic detection and correction to be done, that's a whole 'nuther story but could still be done using nested REPLACES. Detection of the issue is the hard part. Personally, I'd be having a talk with the people that are providing the data.

  •  Tags:  
  • Related