How to handle mm/dd/yyyy and dd-mm-yyyy in one character vector in R Studio

Time:01-26

I have a panel data set with a column of date entries stored as class "character", some in mm/dd/yyyy format and others in dd-mm-yyyy. I want to convert these into a Date vector so that I can subset the data according to a cutoff date. However, as.Date does not work, since the formatting of the entries varies.

df$OPdate <- as.Date(OPdate, format = "%Y-%M-%D")
dfnew = subset(df,OPdate < "2021/3/29")
df_age14 = subset(dfnew, age > 13)
list14 = unique(df_age14$postID)
finaldf = subset(df, postID %in% list14)

This is the code I am trying to run once the dates are formatted correctly. Any suggestions? Thanks in advance

CodePudding user response:

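If you're sure you only have the two formats, you can call as.Date with tryFormats inside sapply. The per-element call is needed because tryFormats is not vectorized: as.Date picks the first format that fits the first non-NA element and applies it to the whole vector. strftime then turns each Date into the desired character string, since sapply would otherwise drop the Date class and return plain numbers.
Using toy data: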

dates
[1] "02/23/2021" "11/03/2021" "22-03-2021" "23-04-2020" "29-06-2021"

sapply(dates, function(x) 
  strftime(as.Date(x, tryFormats=c("%d-%m-%Y","%m/%d/%Y"))))
  02/23/2021   11/03/2021   22-03-2021   23-04-2020   29-06-2021 
"2021-02-23" "2021-11-03" "2021-03-22" "2020-04-23" "2021-06-29" 

Working with the data

dates_new <- sapply(dates, function(x) 
  strftime(as.Date(x, tryFormats=c("%d-%m-%Y","%m/%d/%Y"))))

dates_new > "2021-04-14"
02/23/2021 11/03/2021 22-03-2021 23-04-2020 29-06-2021 
     FALSE       TRUE      FALSE      FALSE       TRUE

# or
as.Date(dates_new) - 23
[1] "2021-01-31" "2021-10-11" "2021-02-27" "2020-03-31" "2021-06-06"
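
As a possible alternative to the per-element sapply call, the two formats can be parsed in two vectorized passes, which keeps the result as a Date vector throughout. This is only a sketch: the data frame below is a made-up stand-in for the asker's data, reusing the column names from the question (OPdate, age, postID), and the names parsed, todo and cutoff are introduced here for illustration.

```r
# Toy stand-in for the asker's data frame; all values are made up.
df <- data.frame(
  postID = c(1, 1, 2, 3, 3),
  age    = c(15, 16, 12, 14, 14),
  OPdate = c("02/23/2021", "11/03/2021", "22-03-2021", "23-04-2020", "29-06-2021"),
  stringsAsFactors = FALSE
)

# First pass: parse as mm/dd/yyyy; entries in the other format become NA.
parsed <- as.Date(df$OPdate, format = "%m/%d/%Y")

# Second pass: fill the remaining entries using dd-mm-yyyy.
todo <- is.na(parsed)
parsed[todo] <- as.Date(df$OPdate[todo], format = "%d-%m-%Y")
df$OPdate <- parsed

# The subsetting steps from the question, now comparing Date against Date.
cutoff   <- as.Date("2021-03-29")
dfnew    <- subset(df, OPdate < cutoff)
df_age14 <- subset(dfnew, age > 13)
list14   <- unique(df_age14$postID)
finaldf  <- subset(df, postID %in% list14)
```

Any entry that matches neither format stays NA after the second pass, so anyNA(df$OPdate) is a quick check for unexpected formats.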

CodePudding user response:

You can use the "lubridate" package from the tidyverse to parse dates written in a known order, e.g. mdy("4/1/17") returns "2017-04-01" and dmy("14/10/2021") returns "2021-10-14".
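
For a vector that mixes both formats, one way to apply these functions is to split the entries by separator first. mdy and dmy accept any non-digit separator, so on their own they cannot tell "03-04-2021" (dd-mm-yyyy) apart from a month-first date. The sketch below assumes the slash reliably marks mm/dd/yyyy entries, as in the question. The toy vector is the one from the other answer, and the names slash and parsed are introduced here for illustration.

```r
library(lubridate)

dates <- c("02/23/2021", "11/03/2021", "22-03-2021", "23-04-2020", "29-06-2021")

# Assumption: entries containing "/" are mm/dd/yyyy, all others are dd-mm-yyyy.
slash <- grepl("/", dates, fixed = TRUE)

# Start from an all-NA Date vector and fill each group with its own parser.
parsed <- as.Date(rep(NA, length(dates)))
parsed[slash]  <- mdy(dates[slash])
parsed[!slash] <- dmy(dates[!slash])

parsed
```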
