I have the following df:
df = data.frame(id=c(1,1,1,1,1,1),
date=c(as.Date("2000-01-01"), as.Date("2000-07-11"),
as.Date("2000-08-01"), as.Date("2000-12-31"),
as.Date("2002-05-04"), as.Date("2002-06-01")))
I need the following result:
result = data.frame(id=c(1,1,1,1,1,1),
date=c(as.Date("2000-01-01"), as.Date("2000-07-11"),
as.Date("2000-08-01"), as.Date("2000-12-31"),
as.Date("2002-05-04"), as.Date("2002-06-01")),
days_91 = c(0,0,1,0,0,1),
days_182 = c(0,0,1,1,0,1),
days_273 = c(0,1,1,1,0,1),
days_365 = c(0,1,1,1,0,1))
Basically, for each date I want to know whether a prior date exists for the same ID within the last X days.
I assumed a lubridate function for this must exist, but I could not find one.
Result:
| id | date | days_91 | days_182 | days_273 | days_365 |
|---|---|---|---|---|---|
| 1 | 2000-01-01 | 0 | 0 | 0 | 0 |
| 1 | 2000-07-11 | 0 | 0 | 1 | 1 |
| 1 | 2000-08-01 | 1 | 1 | 1 | 1 |
| 1 | 2000-12-31 | 0 | 1 | 1 | 1 |
| 1 | 2002-05-04 | 0 | 0 | 0 | 0 |
| 1 | 2002-06-01 | 1 | 1 | 1 | 1 |
For instance, for row 3 there is a previous date within the last 91, 182, 273 and 365 days. In row 2, however, there is no previous visit within the last 91 or 182 days.
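To make the expected flags easier to check by hand, here is a small base R sketch that prints the number of days between each date and the one before it. The names `dates` and `gap` are only illustrative, and it assumes a single id with dates already sorted, as in the example.

```r
# Same dates as in the example above (a single id, already sorted).
dates <- as.Date(c("2000-01-01", "2000-07-11", "2000-08-01",
                   "2000-12-31", "2002-05-04", "2002-06-01"))

# Days since the previous date; the first row has no previous date, hence NA.
gap <- c(NA, as.numeric(diff(dates)))
gap
# NA 192 21 152 489 28
```

Row 2 is 192 days after row 1, so it is flagged only for the 273 and 365 day windows, while row 4 (152 days) is flagged for 182 days and longer.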
CodePudding user response:
Here is another option using map2 and map_dfc from purrr. For each date and the previous date (in sorted order, within each id), you compare the difference between the two with every element of a numeric vector of window lengths in days (such as 91, 182, etc.).
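library(tidyverse)
my_days <- c(91, 182, 273, 365)
df %>%
  group_by(id) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(days = map2(
    date,
    # The first row of each id has no previous date, so use -Inf (never within any window)
    lag(date, default = as.Date(-Inf, origin = "1970-01-01")),
    \(x, y) {
      # One 0/1 column per window length, named days_<n>
      map_dfc(set_names(my_days, paste0("days_", my_days)), ~ as.integer(x - y < .x))
    }
  )) %>%
  unnest(days)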
Output
id date days_91 days_182 days_273 days_365
<dbl> <date> <int> <int> <int> <int>
1 1 2000-01-01 0 0 0 0
2 1 2000-07-11 0 0 1 1
3 1 2000-08-01 1 1 1 1
4 1 2000-12-31 0 1 1 1
5 1 2002-05-04 0 0 0 0
6 1 2002-06-01 1 1 1 1
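For comparison, the same "gap to the previous date versus each window" idea can be sketched in base R with `ave()` and `outer()`. This is only a rough alternative, not part of the purrr approach above; the names `gap`, `flags` and `result` are made up for illustration, and it uses the same strict `<` comparison.

```r
df <- data.frame(
  id = c(1, 1, 1, 1, 1, 1),
  date = as.Date(c("2000-01-01", "2000-07-11", "2000-08-01",
                   "2000-12-31", "2002-05-04", "2002-06-01"))
)
my_days <- c(91, 182, 273, 365)

# Sort by id and date so that "previous row" means "previous date of that id".
df <- df[order(df$id, df$date), ]

# Gap in days to the previous date within each id; Inf for the first date of an id.
gap <- ave(as.numeric(df$date), df$id, FUN = function(d) c(Inf, diff(d)))

# Compare every gap with every window length: one 0/1 column per window.
flags <- outer(gap, my_days, "<") * 1L
colnames(flags) <- paste0("days_", my_days)

result <- cbind(df, flags)
result
```

Because `outer()` does all comparisons at once, adding another window only means adding a value to `my_days`.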
CodePudding user response:
We can use dplyr to iterate over the window lengths (in days) you want to check, returning 1 if any earlier date for the same id in the 'date' column falls within the previous x days:
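library(dplyr)
dates_check <- c(91, 182, 273, 365) # Window lengths (in days) we want to check
prev_dates <- function(prev_date){
  colname <- paste0('days_', prev_date) # Dynamically create the column name
  # Note: `<<-` overwrites df in the global environment, adding one column per call
  df <<- df %>%
    group_by(id) %>% # Group our data by id
    rowwise() %>% # Perform rowwise operation
    # 1 if another date of the same id lies within the previous prev_date days
    mutate(!!colname := as.integer(any(df$id == id & df$date > date - prev_date & df$date < date)))
}
invisible(lapply(dates_check, prev_dates)) # Called only for its side effect on df
df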
# A tibble: 6 x 6
# Rowwise: id
id date days_91 days_182 days_273 days_365
<dbl> <date> <int> <int> <int> <int>
1 1 2000-01-01 0 0 0 0
2 1 2000-07-11 0 0 1 1
3 1 2000-08-01 1 1 1 1
4 1 2000-12-31 0 1 1 1
5 1 2002-05-04 0 0 0 0
6 1 2002-06-01 1 1 1 1
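If you would rather avoid modifying `df` through `<<-`, the "any earlier date of the same id inside the window" check can also be written as a plain function. The sketch below is base R only; `has_prior()` is a hypothetical helper name and the two-id data is invented purely to show that dates from other ids are ignored.

```r
# Hypothetical helper: for each row, is there an earlier date with the same id
# that lies strictly within the previous `window` days?
has_prior <- function(dates, ids, window) {
  vapply(seq_along(dates), function(i) {
    same_id <- ids == ids[i]
    in_window <- dates < dates[i] & dates > dates[i] - window
    as.integer(any(same_id & in_window))
  }, integer(1))
}

# Made-up data with two ids (not from the question).
ids   <- c(1, 1, 2, 2)
dates <- as.Date(c("2000-01-01", "2000-02-01", "2000-01-15", "2000-06-01"))

flag_91  <- has_prior(dates, ids, 91)
flag_182 <- has_prior(dates, ids, 182)
```

Here the third row gets 0 for both windows even though id 1 has a date 14 days earlier, because only dates with the same id are considered. The function returns a vector, so the result can be assigned to a new column without touching global state.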
