Home > database >  Column with most repeated word in other column
Column with most repeated word in other column

Time:01-21

I have a table like

MyText
Car, Car, Blue
Blue, Blue, Car

And I would like to get one like this, using SQL because it would run in SQL Server

MyText mostRepeatedWord
Car, Car, Blue Car
Blue, Blue, Car Blue

I was trying to use string_split modifying next code

    with words as (
    select value as word from 
    string_split('banana, apple, lemons, kiwi, orange, apple', ','))
    SELECT  top 1 word, count(word) as counts
    from words
    group by word
    order by counts desc

what returns me

word counts
apple 2

But I don't know how to replace that fruits string to the values of MyText Column

In the case of 'car, car, blue blue' I don't care which one is returned, the top 1 and order by are in charge of selecting the first one.

Thanks

CodePudding user response:

You can include MyText in the result if you apply to a sub-query with string_split.

select MyText, ca.*
from MyTable
outer apply (
 select top 1 trim(value) as word, count(*) as counts
 from string_split(MyText, ',') s
 group by trim(value) 
 order by count(*) desc
) ca;
MyText word counts
Car, Car, Blue Car 2
Blue, Blue, Car Blue 2

Test on db<>fiddle here

CodePudding user response:

You can use some window functions to partition each group's count and select the top 1 of each using with ties

with c as (
    select mytext, Trim(value) Word,
      Count(*) over(partition by mytext, Trim(value)) cnt
    from t
    cross apply string_split(mytext, ',')
)
select top (1) with ties  mytext, word, cnt MaxCount 
from c
group by mytext, word, cnt
order by row_number() over (partition by mytext order by cnt desc);

DB Fiddle example

  •  Tags:  
  • Related