I'm trying to create a query which allows to categorize the average percentage for specific data per month.
Here's how my dataset presents itself:
| Date | Name | Group | Percent |
|---|---|---|---|
| 2022-01-21 | name1 | gr1 | 5.2 |
| 2022-01-22 | name1 | gr1 | 6.1 |
| 2022-01-26 | name1 | gr1 | 4.9 |
| 2022-02-01 | name1 | gr1 | 3.2 |
| 2022-02-03 | name1 | gr1 | 8.1 |
| 2022-01-22 | name2 | gr1 | 36.1 |
| 2022-01-25 | name2 | gr1 | 32.1 |
| 2022-02-10 | name2 | gr1 | 35.8 |
| ... | ... | ... | ... |
And here's what I want to obtain with my query (based on what I showed of the table):
| Month | <=25% | 25<_<=50% | 50<_<=75% | 75<_<=100% |
|---|---|---|---|---|
| 01 | 1 | 1 | 0 | 0 |
| 02 | 1 | 1 | 0 | 0 |
| ... | ... | ... | ... | ... |
The result needs to:
- Be ordered by month
- Have the average use for each name counted and categorized
So far I know how to get the average of the Percent value per Name:
SELECT Name,
AVG(Percent)
from `table`
where Group = 'gr1'
group by Name
and how to count iterations of Percent in the categories created for the query:
SELECT EXTRACT(MONTH FROM Date) as Month,
COUNT(CASE WHEN Percent <= 25 AND Group = 'gr1' THEN Name END) `_25`,
COUNT(CASE WHEN Percent > 25 AND Percent <= 50 AND Group = 'gr1' THEN Name END) `_50`,
COUNT(CASE WHEN Percent > 50 AND Percent <= 75 AND Group = 'gr1' THEN Name END) `_75`,
COUNT(CASE WHEN Percent > 75 AND Percent <= 100 AND Group = 'gr1' THEN Name END) `_100`,
FROM `table`
GROUP BY Month
ORDER BY Month
but this counts all iterations of every name where I want the average of those values.
I've been struggling to figure out how to combine the two queries or to create a new one that answers my need.
I'm working with the BigQuery service from Google Cloud
CodePudding user response:
This query produces the needed result, based on your example. So basically this combines your 2 queries using subquery, where the subquery is responsible to calculate AVG grouped by Name, Month and Group, and the outer query is for COUNT and "categorization"
SELECT
Month,
COUNT(CASE
WHEN avg <= 25 THEN Name
END) AS _25,
COUNT(CASE
WHEN avg > 25
AND avg <= 50 THEN Name
END) AS _50,
COUNT(CASE
WHEN avg > 50
AND avg <= 75 THEN Name
END) AS _75,
COUNT(CASE
WHEN avg > 75
AND avg <= 100 THEN Name
END) AS _100
FROM
(
SELECT
EXTRACT(MONTH from Date) AS Month,
Name,
AVG(Percent) AS avg
FROM
table1
GROUP BY Month, Name, Group
HAVING Group = 'gr1'
) AS namegr
GROUP BY Month
This is the result:
| Month | _25 | _50 | _75 | _100 |
|---|---|---|---|---|
| 1 | 1 | 1 | 0 | 0 |
| 2 | 1 | 1 | 0 | 0 |
See also Fiddle (BUT on MySql) - http://sqlfiddle.com/#!9/16c5882/9
CodePudding user response:
You can use this query to Group By Month and each Name
SELECT CONCAT(EXTRACT(MONTH FROM Date), ', ', Name) AS DateAndName,
CASE
WHEN AVG(Percent) <= 25 THEN '1'
ELSE '0'
END AS '<=25%',
CASE
WHEN AVG(Percent) > 25 AND AVG(Percent) <= 50 THEN '1'
ELSE '0'
END AS '25<_<=50%',
CASE
WHEN AVG(Percent) > 50 AND AVG(Percent) <= 75 THEN '1'
ELSE '0'
END AS '50<_<=75%',
CASE
WHEN AVG(Percent) > 75 AND AVG(Percent) <= 100 THEN '1'
ELSE '0'
END AS '75<_<=100%'
from DataTable /*change to your table name*/
group by EXTRACT(MONTH FROM Date), Name
order by DateAndName
It gives the following result:
| DateAndName | <=25% | 25<_<=50% | 50<_<=75% | 75<_<=100% |
|---|---|---|---|---|
| 1, name1 | 1 | 0 | 0 | 0 |
| 1, name2 | 0 | 1 | 0 | 0 |
| 2, name1 | 1 | 0 | 0 | 0 |
| 2, name2 | 0 | 1 | 0 | 0 |
