I have browsed the other answers relating to grouping aliased columns but I believe what I am attempting to do is very common and I am super confused as to why this would create such a complex query that would require a ton of subqueries or CTEs. My goal is to look at different columns that are currently in use and determine the percentage of null or distinct values in each column. To do this, I am simply trying to apply the same logic to all of the for multiple different fields (400) in a new column for each column/variable. I would like my result table to look like this:
FIELD COUNT_DISTINCT Null_COUNT
Field_1 | COUNT(DISTINCT Field_1) | SUM(CASE WHEN Field_1 IS NULL THEN 1 ELSE 0 END)
Field_2 | COUNT(DISTINCT Field_2) | SUM(CASE WHEN Field_2 IS NULL THEN 1 ELSE 0 END)
Field_3 | COUNT(DISTINCT Field_3) | SUM(CASE WHEN Field_3 IS NULL THEN 1 ELSE 0 END)
.....
Field_400 | COUNT(DISTINCT Field_400) | SUM(CASE WHEN Field_400 IS NULL THEN 1 ELSE 0 END)
My query has to be edited in Excel as it is extremely long but even on smaller queries, I cannot get this result. So far, I have tried the following:
SELECT
"Field 1",
COUNT(DISTINCT Field_1) AS COUNT_DISTINCT,
"Field 2",
COUNT(DISTINCT Field_2) AS COUNT_DISTINCT,
"Field 3",
COUNT(DISTINCT Field_3) AS COUNT_DISTINCT
FROM XYZ
GROUP BY COUNT_DISTINCT
#this is wrong, cannot group by alias
SELECT
COUNT(DISTINCT Field_1) AS COUNT_DISTINCT,
COUNT(DISTINCT Field_2) AS COUNT_DISTINCT,
COUNT(DISTINCT Field_3) AS COUNT_DISTINCT
FROM XYZ
GROUP BY COUNT(DISTINCT '')
#this syntax was suggested in a similar question but I got the following error:

I really apologize if I just haven't researched this enough but aside from making this query really complex and difficult to scale with all of the variables that I need to pull, are there any recommendations? Thank you so much
CodePudding user response:
It is actually quite simple :-)
No need for GROUP BY.
SELECT
COUNT(DISTINCT Field_1) AS COUNT_DISTINCT_Field_1,
COUNT(DISTINCT Field_2) AS COUNT_DISTINCT_Field_2,
COUNT(DISTINCT Field_3) AS COUNT_DISTINCT_Field_3
FROM XYZ
Previous solution UNPIVOT
with t as
(
SELECT
COUNT(DISTINCT Field_1) AS f1,
COUNT(DISTINCT Field_2) AS f2,
COUNT(DISTINCT Field_3) AS f3
FROM XYZ
)
select *
from t UNPIVOT (COUNT_DISTINCT FOR field IN (f1,f2,f3))
CodePudding user response:
As per my understanding what you are trying to achieve can be done via.
SELECT
"Field 1" as Field,
COUNT(DISTINCT Field_1) AS COUNT_DISTINCT,
SUM(CASE WHEN Field_1 IS NULL THEN 1 ELSE 0 END) as Null_COUNT
FROM XYZ
GROUP BY Field
UNION
SELECT
"Field 2" as Field,
COUNT(DISTINCT Field_2) AS COUNT_DISTINCT,
SUM(CASE WHEN Field_2 IS NULL THEN 1 ELSE 0 END) as Null_COUNT
FROM XYZ
GROUP BY Field
UNION
SELECT
"Field 3" as Field,
COUNT(DISTINCT Field_3) AS COUNT_DISTINCT,
SUM(CASE WHEN Field_3 IS NULL THEN 1 ELSE 0 END) as Null_COUNT
FROM XYZ
GROUP BY Field
..
UNION
..
SELECT
"Field 400" as Field,
COUNT(DISTINCT Field_400) AS COUNT_DISTINCT,
SUM(CASE WHEN Field_400 IS NULL THEN 1 ELSE 0 END) as Null_COUNT
FROM XYZ
GROUP BY Field
