I have a table like
| MyText |
|---|
| Car, Car, Blue |
| Blue, Blue, Car |
And I would like to get one like this, using SQL because it would run in SQL Server
| MyText | mostRepeatedWord |
|---|---|
| Car, Car, Blue | Car |
| Blue, Blue, Car | Blue |
I was trying to use string_split modifying next code
with words as (
select value as word from
string_split('banana, apple, lemons, kiwi, orange, apple', ','))
SELECT top 1 word, count(word) as counts
from words
group by word
order by counts desc
what returns me
| word | counts |
|---|---|
| apple | 2 |
But I don't know how to replace that fruits string to the values of MyText Column
In the case of 'car, car, blue blue' I don't care which one is returned, the top 1 and order by are in charge of selecting the first one.
Thanks
CodePudding user response:
You can include MyText in the result if you apply to a sub-query with string_split.
select MyText, ca.*
from MyTable
outer apply (
select top 1 trim(value) as word, count(*) as counts
from string_split(MyText, ',') s
group by trim(value)
order by count(*) desc
) ca;
| MyText | word | counts |
|---|---|---|
| Car, Car, Blue | Car | 2 |
| Blue, Blue, Car | Blue | 2 |
Test on db<>fiddle here
CodePudding user response:
You can use some window functions to partition each group's count and select the top 1 of each using with ties
with c as (
select mytext, Trim(value) Word,
Count(*) over(partition by mytext, Trim(value)) cnt
from t
cross apply string_split(mytext, ',')
)
select top (1) with ties mytext, word, cnt MaxCount
from c
group by mytext, word, cnt
order by row_number() over (partition by mytext order by cnt desc);
