I am trying to calculate rolling percentages for quarterly time periods. Sample data via sql is below:
create table #demo (Month char(7), Type varchar(10), Mileage int);
--low
insert into #demo values ('2021-05', 'low', 5000);
insert into #demo values ('2021-06', 'low', 15000);
insert into #demo values ('2021-07', 'low', 3000);
insert into #demo values ('2021-08', 'low', 3500);
-- med
insert into #demo values ('2021-05', 'med', 6789);
insert into #demo values ('2021-06', 'med', 12876);
insert into #demo values ('2021-07', 'med', 1578);
insert into #demo values ('2021-08', 'med', 3500);
--high
insert into #demo values ('2021-05', 'high', 5000);
insert into #demo values ('2021-06', 'high', 1500);
insert into #demo values ('2021-07', 'high', 2700);
insert into #demo values ('2021-08', 'high', 2968);
Month Type Mileage
2021-05 high 5000
2021-05 low 5000
2021-05 med 6789
2021-06 high 1500
2021-06 low 15000
2021-06 med 12876
2021-07 high 2700
2021-07 low 3000
2021-07 med 1578
2021-08 high 2968
2021-08 low 3500
2021-08 med 3500
My goal is to take this data and calculate the rolling 3 month percentage for each category individually.
Example: for the first 3 months of 2021 I would desire to get the sum(low mileage), sum(med mileage), sum (high mileage) for 2021-05, 2021-06, 2021-07 and then divide each sum by the total of those three months (for all categories). I'm imagining a PIVOT solution would be ideal here. I'm just not sure how to go about it.
So I would expect to see rolling totals like:
Month ThreeMonth_TotalMileage ThreeMonth_Low ThreeMonth_Med ThreeMonth_High
2021-05 NULL NULL NULL NULL
2021-06 NULL NULL NULL NULL
2021-07 53443 23000 21243 9200
2021-08 46622 18800 17954 7168
E.g. ThreeMonth_TotalMileage is the sum of all mileage for 2021-05, 2021-06, 2021-07. This will only show as a running total in 2021-07 since we have 3 months of history we can calculate a sum from, otherwise I'd like it to be null.
3Month_Low is the sum of all low mileage for 2021-05, 2021-06, 2021-07, basically its inclusive of the current month plus the prior 2 for each calc.
Any ideas on how I could try to accomplish this using SQL Server? Thank you!
CodePudding user response:
You can achieve that using Ordered Analytical Function. The idea is to find cumulative sum.
For Pivoting the data from row to column, you can use CASE and MAX.
select
Month
,Max(case when Type='low' then Mileage end) as Month_Low
,Max(case when Type='med' then Mileage end)as Month_Med
,Max(case when Type='high' then Mileage end) as Month_High
,case when ROW_NUMBER() over(order by Month asc)>=3 Then 1 end as is_3month_data_present
from #demo
group by month
| Month | Month_Low | Month_Med | Month_High | is_3month_data_present |
|---|---|---|---|---|
| 2021-05 | 5000 | 6789 | 5000 | null |
| 2021-06 | 15000 | 12876 | 1500 | null |
| 2021-07 | 3000 | 1578 | 2700 | 1 |
| 2021-08 | 3500 | 3500 | 2968 | 1 |
Below SQL should give the desired final result.
select
Month
,case when is_3month_data_present=1 then Sum(Month_Low Month_Med Month_High) over(order by Month ROWS BETWEEN 2 PRECEDING and CURRENT ROW ) end as ThreeMonth_TotalMileage
,case when is_3month_data_present=1 then Sum(Month_Low) over(order by Month ROWS BETWEEN 2 PRECEDING and CURRENT ROW ) end as ThreeMonth_Low
,case when is_3month_data_present=1 then Sum(Month_Med ) over(order by Month ROWS BETWEEN 2 PRECEDING and CURRENT ROW ) end as ThreeMonth_Med
,case when is_3month_data_present=1 then Sum(Month_High) over(order by Month ROWS BETWEEN 2 PRECEDING and CURRENT ROW ) end as ThreeMonth_High
from
(select
Month
,Max(case when Type='low' then Mileage end) as Month_Low
,Max(case when Type='med' then Mileage end)as Month_Med
,Max(case when Type='high' then Mileage end) as Month_High
,case when ROW_NUMBER() over(order by Month asc)>=3 Then 1 end as is_3month_data_present
from #demo
group by Month
)stg
Output:
| Month | ThreeMonth_TotalMileage | ThreeMonth_Low | ThreeMonth_Med | ThreeMonth_High |
|---|---|---|---|---|
| 2021-05 | null | null | null | |
| 2021-06 | null | null | null | |
| 2021-07 | 53443 | 23000 | 21243 | 9200 |
| 2021-08 | 46622 | 21500 | 17954 | 7168 |
Note: Expected output provided in the question is wrong as per the dataset given. ThreeMonth_low is wrong, rest all are correct. For 2021-07, actual value should be 23000 instead of 20300.
DB Fiddle: Try it here
