I have a panel data set with a column of date entries stored as class "character", some formatted as mm/dd/yyyy and others as dd-mm-yyyy. I want to convert these into a Date vector so that I can subset the data according to a cutoff date. However, as.Date does not work, since the formatting of the entries varies.
df$OPdate <- as.Date(OPdate, format = "%Y-%M-%D")
dfnew = subset(df,OPdate < "2021/3/29")
df_age14 = subset(dfnew, age > 13)
list14 = unique(df_age14$postID)
finaldf = subset(df, postID %in% list14)
This is the code I am trying to run once the dates are formatted correctly. Any suggestions? Thanks in advance.
CodePudding user response:
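If you're sure you only have these two formats, you can use as.Date with tryFormats inside sapply. The sapply is needed because tryFormats is not vectorized: as.Date picks the first format that fits the first non-NA element and then applies it to the whole vector. Since sapply would drop the Date class, strftime is used to return each date as a "yyyy-mm-dd" character string.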
Using toy data.
dates
[1] "02/23/2021" "11/03/2021" "22-03-2021" "23-04-2020" "29-06-2021"
sapply(dates, function(x)
strftime(as.Date(x, tryFormats=c("%d-%m-%Y","%m/%d/%Y"))))
02/23/2021 11/03/2021 22-03-2021 23-04-2020 29-06-2021
"2021-02-23" "2021-11-03" "2021-03-22" "2020-04-23" "2021-06-29"
Working with the data
dates_new <- sapply(dates, function(x)
strftime(as.Date(x, tryFormats=c("%d-%m-%Y","%m/%d/%Y"))))
dates_new > "2021-04-14"
02/23/2021 11/03/2021 22-03-2021 23-04-2020 29-06-2021
FALSE TRUE FALSE FALSE TRUE
# or
as.Date(dates_new) - 23
[1] "2021-01-31" "2021-10-11" "2021-02-27" "2020-03-31" "2021-06-06"
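As an alternative to sapply, each format can be tried over the whole vector and the gaps filled in afterwards. The following is only a sketch under assumptions: the data frame values are invented for illustration (only the column names OPdate, age and postID come from the question), and parse_mixed_dates is a hypothetical helper name, not an existing function.

```r
# Toy data frame: column names follow the question, values are invented.
df <- data.frame(
  postID = c(1, 1, 2, 3, 3),
  age    = c(15, 15, 12, 20, 20),
  OPdate = c("02/23/2021", "11/03/2021", "22-03-2021",
             "23-04-2020", "29-06-2021"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: parse mm/dd/yyyy first, then fill the
# remaining NAs by parsing those entries as dd-mm-yyyy.
parse_mixed_dates <- function(x) {
  out <- as.Date(x, format = "%m/%d/%Y")   # NA where the format does not match
  dash <- is.na(out)
  out[dash] <- as.Date(x[dash], format = "%d-%m-%Y")
  out
}

df$OPdate <- parse_mixed_dates(df$OPdate)

# The subsetting steps from the question, now comparing real Dates.
cutoff   <- as.Date("2021-03-29")
dfnew    <- subset(df, OPdate < cutoff)
df_age14 <- subset(dfnew, age > 13)
list14   <- unique(df_age14$postID)
finaldf  <- subset(df, postID %in% list14)
```

Because the column stays a Date vector, the comparison against the cutoff is a date comparison rather than a string comparison. Entries that match neither format remain NA, which makes them easy to spot with is.na(df$OPdate).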
CodePudding user response:
You can also use the lubridate package (part of the tidyverse) to parse dates written in a known order. For example, mdy("4/1/17") returns "2017-04-01" and dmy("14/10/2021") returns "2021-10-14".
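For a column that mixes both formats, one option is to pick the parser by separator. This is a minimal sketch, assuming (as in the question) that every slash-separated entry is mm/dd/yyyy and every dash-separated entry is dd-mm-yyyy; the dates vector reuses the toy data from the other answer.

```r
library(lubridate)

dates <- c("02/23/2021", "11/03/2021", "22-03-2021",
           "23-04-2020", "29-06-2021")

# Assumption: a "/" means month-day-year, anything else is day-month-year.
slash <- grepl("/", dates, fixed = TRUE)

# Start from an all-NA Date vector and fill each group with its parser.
parsed <- rep(as.Date(NA), length(dates))
parsed[slash]  <- mdy(dates[slash])
parsed[!slash] <- dmy(dates[!slash])

# Date comparison against a cutoff.
parsed < as.Date("2021-03-29")
```

Choosing the parser explicitly avoids guessing on ambiguous entries such as "11/03/2021", which could be read either way if the order were inferred from the string alone.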
