I have this homework assignment where I'm attempting to query a table to find the id numbers that are all using the same column value, let's say last name in this case. I'd like to find the ids that use the same last name more than once, and have a column that tells me the total number of unique IDs that used that same last name.
SELECT id, COUNT(*) as ID_count
FROM [table]
WHERE l_name IN
(
SELECT l_name
FROM [table]
GROUP BY l_name HAVING COUNT(*)>1
)
GROUP BY id;
This is what I have so far. It grants me the ID number, but the count(*) is not what I'm going for. What I'm instead trying to get is how many unique IDs have "Smith" as their last name, instead of all the occurrences of one specific ID that has used "Smith".
I've tried different things but I feel like I'm at a roadblock. Any hints or tips are nice; I don't need this problem solved 100%, but I feel as if I can't past the idea of using count(*).
Thanks all.
CodePudding user response:
It sounds like you were already there WITHIN the inner query. Just add the count to it for the output.
SELECT
t1.id,
t1.l_name,
max( PQ.UniqCount ) UniqCount,
COUNT(*) as countForThisSingleID
FROM
[table] t1
JOIN
( SELECT
t.l_name,
COUNT( DISTINCT t.ID ) as UniqCount
FROM
[table] t
GROUP BY
t.l_name
HAVING
COUNT( DISTINCT t.ID ) > 1 ) PQ
on t1.l_name = PQ.l_name
group by
t1.id,
t1.l_name
order by
t1.l_name,
t1.id
So by doing a COUNT( DISTINCT ) on the inner pre-query (alias PQ), for each L_Name, you are getting a count of distinct IDs. I dont know if your [table] has multiple entries for the same ID in it or not, so applying the DISTINCT. Same for the HAVING clause. But at least now the inner pre-query gets the overall distinct counts for a given L_Name value.
Now, doing a JOIN to the outer table on that L_Name will get the corresponding count in the result query, along with showing the l_name that it qualified against. So if you have a table with 18 DISTINCT ID instances of John, 37 of Karen, 11 of Mike, your inner query will get those. Now joined to the outer, you will get the output of EACH instance of John and their corresponding IDs, then all Karen instance and Mike instances.
The count for the outer query is getting the count of the one ID (and name) times that it appears in the table. So if the table had ID = 5, L_Name = John and ID 5 appeared 3 times in the table, the output of his record might look like
ID L_Name countForThisSingleID UniqCount
5 John 3 18
72 John 8 18
127 John 2 18
etc...
Similarly the output would include all Karen's and Mike's within the table (and any others that qualify).
Again, without knowing if your [table] is a unique instance per ID such as a master customer lookup table where it would only appear once vs an order table where the ID may appear more than once for a single person's ID, not positive what your final answer is looking for.
But I think I have given you a bunch to chew on and run with.
