I have a vector of dates stored as strings. Some are in YYYY-mm-dd format and some are in YYYY-dd-mm format, and I need to work out how to convert them all to dates. Fortunately, every instance in the YYYY-dd-mm format falls after the 12th day of its month, so the two formats are easy to tell apart.
I have tried:
dates <- c("2017-12-31","2017-12-30","2017-29-12","2017-28-12")
dates <- as.Date(dates, format = c("%Y-%m-%d","%Y-%d-%m"))
This returns the following:
[1] "2017-12-31" "2017-12-30" NA NA
Any help greatly appreciated!
CodePudding user response:
If you want YMD to be the default and YDM to be the fallback for anything YMD fails on, parse with YMD first, then go back and re-parse only the elements that came back as missing values:
result = as.Date(dates, format = "%Y-%m-%d")
result[is.na(result)] = as.Date(dates[is.na(result)], format = "%Y-%d-%m")
result
# [1] "2017-12-31" "2017-12-30" "2017-12-29" "2017-12-28"
This does make me nervous, though: you're assuming that any date where both the month and day numbers are less than 13 is YMD. I wouldn't make that assumption without a very good reason.
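One way to check that assumption before relying on it is to list the strings that parse under both formats but give different dates. This is only a minimal base R sketch; the extra value "2017-05-06" is a made-up example added to show what an ambiguous entry looks like.

```r
# "2017-05-06" is a hypothetical ambiguous entry added for illustration
dates <- c("2017-12-31", "2017-12-30", "2017-29-12", "2017-28-12", "2017-05-06")

# Parse every string under each format; failures come back as NA
ymd_try <- as.Date(dates, format = "%Y-%m-%d")
ydm_try <- as.Date(dates, format = "%Y-%d-%m")

# Ambiguous: valid under both formats, and the two readings disagree
ambiguous <- !is.na(ymd_try) & !is.na(ydm_try) & ymd_try != ydm_try

dates[ambiguous]
# [1] "2017-05-06"
```

If this comes back empty, the YMD-first fallback cannot silently pick the wrong reading for your data.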
CodePudding user response:
Here's a slightly longer approach.
# sample data
dates <- c("2015-02-23","2015-02-12","2015-18-02","2015-25-02")
# libraries
library(testit) #for has_warning
library(lubridate) #for date functions
This function will correct the dates.
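correct_dates = function(dates)
{
  # start from an empty Date vector so that append() returns Dates
  dates_new = as.Date(character())
  for(i in seq_along(dates))
  {
    # ydm() warns when a string cannot be read as year-day-month;
    # in that case fall back to ymd(). Strings that are valid in
    # both orders (e.g. "2015-02-12") are therefore read as ydm.
    if(has_warning(ydm(dates[i])))
      {dates_new = append(dates_new, ymd(dates[i]))}
    else
      {dates_new = append(dates_new, ydm(dates[i]))}
  }
  return(dates_new)
}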
Checking results
> dates
[1] "2015-02-23" "2015-02-12" "2015-18-02" "2015-25-02"
> correct_dates(dates)
[1] "2015-02-23" "2015-12-02" "2015-02-18" "2015-02-25"
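The same ydm-first logic can probably be written without the loop or the testit dependency. The sketch below assumes lubridate's quiet = TRUE argument, which suppresses the parse warning and returns NA for strings that fail; the name result is just illustrative.

```r
library(lubridate)

dates <- c("2015-02-23", "2015-02-12", "2015-18-02", "2015-25-02")

# Try year-day-month on the whole vector; failures become NA without a warning
result <- ydm(dates, quiet = TRUE)

# Re-parse only the failures as year-month-day
failed <- is.na(result)
result[failed] <- ymd(dates[failed], quiet = TRUE)

result
# [1] "2015-02-23" "2015-12-02" "2015-02-18" "2015-02-25"
```

Note that, as in the loop version, "2015-02-12" is read as ydm because it parses successfully under that order.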
CodePudding user response:
Use parse_date_time from lubridate and pass it all the formats your dates can take.
dates <- c("2017-12-31","2017-12-30","2017-29-12","2017-28-12")
as.Date(lubridate::parse_date_time(dates, c('ymd', 'ydm')))
#[1] "2017-12-31" "2017-12-30" "2017-12-29" "2017-12-28"
Like other answers, this gives preference to ymd and only checks the ydm format when a date cannot be identified as ymd.
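If you would rather make that order of preference explicit and avoid the dependency, a base R helper along these lines should behave similarly. parse_with_fallback is a hypothetical name, not a lubridate or base function, and this is a sketch rather than a drop-in replacement for parse_date_time.

```r
# Hypothetical helper: try each format in turn, filling only what is still NA
parse_with_fallback <- function(x, formats) {
  result <- as.Date(rep(NA_character_, length(x)))
  for (fmt in formats) {
    todo <- is.na(result)
    if (!any(todo)) break            # everything parsed, stop early
    result[todo] <- as.Date(x[todo], format = fmt)
  }
  result
}

dates <- c("2017-12-31", "2017-12-30", "2017-29-12", "2017-28-12")
out <- parse_with_fallback(dates, c("%Y-%m-%d", "%Y-%d-%m"))
out
# [1] "2017-12-31" "2017-12-30" "2017-12-29" "2017-12-28"
```

Formats earlier in the vector win, so swapping the two entries would make ydm the default instead.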
