I have the following SQL query:
SELECT
modal_text,
COUNT(CASE WHEN ab_group = "control" THEN 1 ELSE 0 END)
FROM
onboarding_modals
GROUP BY
1
ORDER BY
1;
This doesn't work as expected (it will count more than expected), but when I remove the ELSE 0 in aggregate function, it works as expected:
SELECT
modal_text, COUNT(CASE WHEN ab_group = "control" THEN 1 END)
FROM
onboarding_modals
GROUP BY
1
ORDER BY
1;
Could someone explain me why having the ELSE 0 will make it count more data than it should be?
*It will also work if I use ELSE NULL
CodePudding user response:
Use SUM() instead of COUNT(), as in:
SELECT
modal_text,
SUM(CASE WHEN ab_group = "control" THEN 1 ELSE 0 END)
FROM
onboarding_modals
GROUP BY
1
ORDER BY
1;
CodePudding user response:
Could someone explain me why having the ELSE 0 will make it count more data than it should be?
Becasue COUNT(CASE WHEN ab_group = "control" THEN 1 ELSE 0 END) is different to COUNT(CASE WHEN ab_group = "control" THEN 1 END) let's see a sample below
we can see there will be count when we use count(1) or count(0) except count(null) count function will not be count when the value is null
Query 1:
SELECT COUNT(1)
| COUNT(1) |
|----------|
| 1 |
SELECT COUNT(0)
| COUNT(0) |
|----------|
| 1 |
SELECT COUNT(NULL)
| COUNT(NULL) |
|-------------|
| 0 |
Query 2:
SELECT SUM(1)
| SUM(1) |
|--------|
| 1 |
SELECT SUM(0)
| SUM(0) |
|--------|
| 0 |
SELECT SUM(NULL)
| SUM(NULL) |
|-----------|
| (null) |
CodePudding user response:
Because a COUNT(SomeColumn) doesn't count the NULL's in a column.
COUNT(1) or COUNT(*) count the rows.
And so does a COUNT(CASE WHEN x=1 THEN 1 ELSE 0 END)
This has no NULL's to ignore, because it's either 1 or 0.
But a CASE WHEN x=1 THEN 1 END
is just the implicit shorter syntax for
CASE WHEN x=1 THEN 1 ELSE NULL END
So it's normal to COUNT without the ELSE.
COUNT(DISTINCT CASE WHEN x=1 THEN t.ID END)
If you do want to use an ELSE, then do it with a SUM
SUM(CASE WHEN x=1 THEN 1 ELSE 0 END)
