I have three date columns (class Date) and I want to create a new column that holds the earliest of the three dates. This is the code I used:
df1 <- df %>% mutate(timeout = pmin(date1, date2, end_date))
When date1 and date2 are both NA, I would like the date in end_date to be returned in the timeout column, so timeout should never contain NAs. The code above returns NAs. Any assistance would be greatly appreciated.
Answer:
Add na.rm = TRUE to the pmin() call so that it ignores the NAs in each row when calculating the minimum.
library(dplyr)
df %>%
  mutate(timeout = pmin(date1, date2, end_date, na.rm = TRUE))
Output
  id      date1      date2   end_date    timeout
1  1       <NA>       <NA> 2008-01-23 2008-01-23
2  1 2007-10-16 2007-11-01 2008-01-23 2007-10-16
3  2 2007-11-30 2007-11-30 2007-11-30 2007-11-30
4  3 2007-08-17 2007-12-17 2008-12-12 2007-08-17
5  3 2008-11-12 2008-12-12 2008-12-12 2008-11-12
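To see what na.rm changes, here is a minimal sketch in base R. The vectors a, b and end are hypothetical toy values, not the asker's data. It compares pmin() with and without na.rm = TRUE, and shows that a row in which every input is NA still yields NA.

```r
# Hypothetical toy vectors, for illustration only
a   <- as.Date(c(NA, "2007-10-16"))
b   <- as.Date(c(NA, "2007-11-01"))
end <- as.Date(c("2008-01-23", "2008-01-23"))

# Default: any NA among the parallel elements makes that element NA
with_na <- pmin(a, b, end)

# na.rm = TRUE: NAs are skipped, so element 1 falls back to end
without_na <- pmin(a, b, end, na.rm = TRUE)

# If every parallel element is NA, the result is still NA
all_na <- pmin(as.Date(NA), as.Date(NA), na.rm = TRUE)

print(with_na)
print(without_na)
print(all_na)
```

So timeout is guaranteed to be free of NAs only as long as end_date itself is never missing.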
Data
df <- structure(list(id = c(1L, 1L, 2L, 3L, 3L), date1 = structure(c(NA,
13802, 13847, 13742, 14195), class = "Date"), date2 = structure(c(NA,
13818, 13847, 13864, 14225), class = "Date"), end_date = c("2008-01-23",
"2008-01-23", "2007-11-30", "2008-12-12", "2008-12-12")), class = "data.frame", row.names = c("1",
"2", "3", "4", "5"))
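Note that in the sample data above, end_date is stored as a character vector rather than as a Date. The output shows the call still works, but if you would rather not rely on implicit conversion, one option is to convert the column first. The following is a sketch under that assumption, using a hypothetical two-row data frame (df_small) and base R only:

```r
# Hypothetical two-row data frame mirroring the column types in the sample data
df_small <- data.frame(
  date1    = as.Date(c(NA, "2007-10-16")),
  date2    = as.Date(c(NA, "2007-11-01")),
  end_date = c("2008-01-23", "2008-01-23"),  # character, as in the dput() above
  stringsAsFactors = FALSE
)

# Convert explicitly; yyyy-mm-dd strings parse with as.Date()'s default format
df_small$end_date <- as.Date(df_small$end_date)

# Row-wise earliest date, skipping NAs
df_small$timeout <- pmin(df_small$date1, df_small$date2, df_small$end_date,
                         na.rm = TRUE)

print(df_small)
```

The same as.Date() conversion can be done inside mutate() before the pmin() call if you prefer to stay in a dplyr pipeline.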
