I have a time series (df) in R for which I would like to calculate percentage change over varying periods:
month x
Jan 1
Feb 4
Mar 5
Apr 3
May 1
Jun 2
I can calculate a month-on-month percentage change for the series using:
df <- df %>%
  mutate(mom_pct = x / lag(x) * 100 - 100)
This results in:
month x mom_pct
Jan 1 NA
Feb 4 300
Mar 5 25
Apr 3 -40
May 1 -66.67
Jun 2 100
However, I cannot work out how to produce a three-month-on-three-month percentage change (i.e. the sum of the last three months divided by the sum of the previous three months, expressed as a percentage change). I have tried the following:
df <- df %>%
  mutate("3mo3m_pct" = (rollapplyr(x, 3, sum, fill = NA)/rollapplyr(lag(x, -3), 3, mean, fill = NA))*100-100)
But this returns an error: "n must be a nonnegative integer scalar, not a double vector of length 1."
CodePudding user response:
This can be done by applying lag() to the three-month rolling sum in the denominator, instead of lagging the input inside rollapplyr():
df <- df %>%
  mutate("3mo3m_pct" = rollapplyr(x, 3, sum, fill = NA) / lag(rollapplyr(x, 3, sum, fill = NA), 3) * 100 - 100)
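To see what this computes, here is a minimal sketch that splits the calculation into intermediate columns on the sample data from the question. The column names roll3 and prev3 are illustrative only and are not part of the original answer. The original attempt most likely failed because dplyr's lag() rejects a negative n, which is what the error message reports.

```r
library(dplyr)
library(zoo)

# Sample data from the question
df <- data.frame(month = month.abb[1:6], x = c(1, 4, 5, 3, 1, 2))

res <- df %>%
  mutate(
    roll3 = rollapplyr(x, 3, sum, fill = NA),  # trailing three-month sum
    prev3 = lag(roll3, 3),                     # the same sum three rows earlier
    `3mo3m_pct` = roll3 / prev3 * 100 - 100    # percentage change between the two
  )
res
```

Only June has six months of history, so it is the only row with a non-missing result: (3 + 1 + 2) / (1 + 4 + 5) gives a change of -40 percent, up to floating-point rounding.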
CodePudding user response:
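Define a pct function that compares the sum of the last three values in a window with the sum of the first three, and apply it over a rolling window of six with rollapplyr: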
library(dplyr)
library(zoo)
pct <- function(x) 100 * (sum(tail(x, 3)) / sum(head(x, 3)) - 1)
df %>% mutate(pct = rollapplyr(x, 6, pct, fill = NA))
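Since the question asks about varying periods, the same idea can be sketched with the period length as a parameter. The helper name period_pct and its argument n are hypothetical, introduced here for illustration; this is one possible generalisation, not part of the original answer.

```r
library(zoo)

# Hypothetical helper: percentage change of the sum of the last n values
# relative to the sum of the n values before them.
period_pct <- function(x, n) {
  f <- function(w) 100 * (sum(tail(w, n)) / sum(head(w, n)) - 1)
  rollapplyr(x, 2 * n, f, fill = NA)  # window covers both periods
}

x <- c(1, 4, 5, 3, 1, 2)  # sample data from the question

mom  <- period_pct(x, 1)  # month on month
m3m3 <- period_pct(x, 3)  # three months on three months
```

With n = 1 this reproduces the month-on-month column from the question, and with n = 3 it gives NA for the first five months and about -40 for June.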
Note
The input in reproducible form
Lines <- "month x
Jan 1
Feb 4
Mar 5
Apr 3
May 1
Jun 2"
df <- read.table(text = Lines, header = TRUE, strip.white = TRUE)
