Home > Enterprise >  T-SQL : count occurrences per unique substring in a column
T-SQL : count occurrences per unique substring in a column

Time:01-24

I want to count occurrences per unique substring in a column.

SELECT DISTINCT LEFT(code, 3) 
FROM table-with-codes

Output:

code
------------------
VJCrandomthings
PASrandomthings
CAArandomthings
PASrandomthings2
PASrandomthings3

Expected output:

caa 1
pas 3
vjc 1

I tried

SELECT COUNT(DISTINCT LEFT(code, 3)) 
FROM table-with-codes 

but that returns 3 as result.

CodePudding user response:

You need to count with group by

select Left(code,3), Count(*)
from [table-with-codes]
group by Left(code,3)

CodePudding user response:

You should keep the extracted 3 letters in the select statement and Group by.

Query

Select left(code, 3) as code3, count(left(code, 3)) as cnt
From tablename
Group by left(code, 3)
Order by 1;

CodePudding user response:

I don't like having LEFT(code, 3) appear more than once in the query. To improve maintainability and readability I'd refactor it using CROSS APPLY, like this :

SELECT subcode, COUNT(*)
FROM table-with-codes
CROSS APPLY (SELECT LEFT(code, 3)) AS T(subcode)
GROUP BY subcode

CodePudding user response:

You can use enter image description here

CodePudding user response:

You are correct with using left but instead of distinct you should group by

declare @table_with_codes table (code varchar(50))
insert into @table_with_codes values ('VJCrandomthings'), ('PASrandomthings'), ('CAArandomthings'), ('PASrandomthings2'), ('PASrandomthings3')

select left(code, 3) as code_3, 
       count(*) as cnt
from   @table_with_codes 
group by left(code, 3)

how this works
By using the group by you can count all rows that are grouped (and thus only returned once in the result), in this case all rows with the same value in lef(col, 3)

The result will be

code_3  cnt
------  ---
CAA      1
PAS      3
VJC      1
  •  Tags:  
  • Related