My R dataframe looks like
df <- data.frame(
Date1 = rep(seq.Date(from = as.Date("2022-08-01"), to = as.Date("2022-09-12"), by = 7), 2),
Date2 = as.Date(c(rep("2022-08-15", times = 7), rep("2022-08-29", times = 7))),
Value = 1:14)
| Date1 | Date2 | Value |
|---|---|---|
| 2022-08-01 | 2022-08-15 | 1 |
| 2022-08-08 | 2022-08-15 | 2 |
| ... | ||
| 2022-08-15 | 2022-08-29 | 10 |
| 2022-08-22 | 2022-08-29 | 11 |
| ... |
Using a magrittr dplyr chain, I want to add a new column that groups the data by Date2 (and any other grouping columns) and shows the mean Value for that group when Date1 is less than Date2 and greater than "Date2 - 2 weeks."
| Date1 | Date2 | Value | CondGrpMean |
|---|---|---|---|
| 2022-08-01 | 2022-08-15 | 1 | 1.5 |
| 2022-08-08 | 2022-08-15 | 2 | 1.5 |
| ... | |||
| 2022-08-15 | 2022-08-29 | 10 | 10.5 |
| 2022-08-22 | 2022-08-29 | 11 | 10.5 |
| ... |
I'm looking at dplyr::across(), dplyr::group_modify(), but can't figure this out in anything like easy syntax that does not require a join.
I'm open to other straightforward solutions, but my current ideal answer would look something like
df %>%
dplyr::group_by(Date2) %>%
dplyr::mutate(CondGrpMean = mean(Value %>%
dplyr::filter(Date1 < Date2 &
Date1 > Date2 - 14)))
The output values for CondGrpMean should be 1.5 for rows 1 - 7 and 10.5 for rows 8 - 14.
CodePudding user response:
1) There is no filter method for numeric vectors. Subscript Value as shown instead:
library(dplyr)
df %>%
group_by(Date2) %>%
mutate(CondGrpMean = mean(Value[Date1 >= Date2-14 & Date1 < Date2])) %>%
ungroup
giving:
# A tibble: 14 × 4
Date1 Date2 Value CondGrpMean
<date> <date> <int> <dbl>
1 2022-08-01 2022-08-15 1 1.5
2 2022-08-08 2022-08-15 2 1.5
3 2022-08-15 2022-08-15 3 1.5
4 2022-08-22 2022-08-15 4 1.5
5 2022-08-29 2022-08-15 5 1.5
6 2022-09-05 2022-08-15 6 1.5
7 2022-09-12 2022-08-15 7 1.5
8 2022-08-01 2022-08-29 8 10.5
9 2022-08-08 2022-08-29 9 10.5
10 2022-08-15 2022-08-29 10 10.5
11 2022-08-22 2022-08-29 11 10.5
12 2022-08-29 2022-08-29 12 10.5
13 2022-09-05 2022-08-29 13 10.5
14 2022-09-12 2022-08-29 14 10.5
1a) A variation of this is:
df %>%
group_by(Date2) %>%
mutate(CondGrpMean = mean(Value[c(Date2 - Date1) %in% 1:14])) %>%
ungroup
2) With base R:
Mean <- function(ix) with(df[ix, ], mean(Value[Date1 >= Date2-14 & Date1 < Date2]))
transform(df, CondGrpMean = ave(1:nrow(df), Date2, FUN = Mean))
CodePudding user response:
Add your condition as a column and use it to subset the Value column that you take the mean of:
library(lubridate)
df %>%
group_by(Date2) %>%
mutate(
cond = Date1 < Date2 & Date1 >= (Date2 - weeks(2)),
CondGrpMean = mean(Value[cond])
)
# # A tibble: 14 × 5
# # Groups: Date2 [2]
# Date1 Date2 Value cond CondGrpMean
# <date> <date> <int> <lgl> <dbl>
# 1 2022-08-01 2022-08-15 1 TRUE 1.5
# 2 2022-08-08 2022-08-15 2 TRUE 1.5
# 3 2022-08-15 2022-08-15 3 FALSE 1.5
# 4 2022-08-22 2022-08-15 4 FALSE 1.5
# 5 2022-08-29 2022-08-15 5 FALSE 1.5
# 6 2022-09-05 2022-08-15 6 FALSE 1.5
# 7 2022-09-12 2022-08-15 7 FALSE 1.5
# 8 2022-08-01 2022-08-29 8 FALSE 10.5
# 9 2022-08-08 2022-08-29 9 FALSE 10.5
# 10 2022-08-15 2022-08-29 10 TRUE 10.5
# 11 2022-08-22 2022-08-29 11 TRUE 10.5
# 12 2022-08-29 2022-08-29 12 FALSE 10.5
# 13 2022-09-05 2022-08-29 13 FALSE 10.5
# 14 2022-09-12 2022-08-29 14 FALSE 10.5
CodePudding user response:
I think you are looking for something like this. You can deal with your conditions using ifelse and only include the Value you want by assigning others NA. Then use na.rm in your mean to get the correct values:
library(dplyr)
df <- data.frame(
Date1 = rep(seq.Date(from = as.Date("2022-08-01"), to = as.Date("2022-09-12"), by = 7), 2),
Date2 = as.Date(c(rep("2022-08-15", times = 7), rep("2022-08-29", times = 7))),
Value = 1:14)
df %>%
group_by(Date2) %>%
mutate(CondGrpMean = mean(ifelse(Date1 < Date2 & Date1 >= Date2 - 14,
Value,
NA), na.rm = T
)
)
#> # A tibble: 14 × 4
#> # Groups: Date2 [2]
#> Date1 Date2 Value CondGrpMean
#> <date> <date> <int> <dbl>
#> 1 2022-08-01 2022-08-15 1 1.5
#> 2 2022-08-08 2022-08-15 2 1.5
#> 3 2022-08-15 2022-08-15 3 1.5
#> 4 2022-08-22 2022-08-15 4 1.5
#> 5 2022-08-29 2022-08-15 5 1.5
#> 6 2022-09-05 2022-08-15 6 1.5
#> 7 2022-09-12 2022-08-15 7 1.5
#> 8 2022-08-01 2022-08-29 8 10.5
#> 9 2022-08-08 2022-08-29 9 10.5
#> 10 2022-08-15 2022-08-29 10 10.5
#> 11 2022-08-22 2022-08-29 11 10.5
#> 12 2022-08-29 2022-08-29 12 10.5
#> 13 2022-09-05 2022-08-29 13 10.5
#> 14 2022-09-12 2022-08-29 14 10.5
