I have a dataset in R that contains minutes and seconds in the following format:
3:00
It is currently stored as a character. How can I convert it to a date-time so that it can be analyzed in a graph? I want to analyze posts on Instagram: I have their likes and video durations and want to plot them. I believe I can't do this unless I convert the column to a date-time.
CodePudding user response:
It is not clear what is meant by date time when we only have minutes and seconds, but the code below turns x into a chron times object, x1, and a lubridate Period object, x2, respectively. Internally a times object is a fraction of a day, and as.numeric(x1) will give that fraction. as.numeric(x2, unit = "sec") will give the number of seconds corresponding to the Period object.
library(chron)
x <- "3:00"
x1 <- times(paste0("0:", x))
library(lubridate)
x2 <- ms(x)
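To make the two representations concrete, here is a minimal sketch that pulls a plain number of seconds out of each object. It assumes the chron and lubridate packages are installed; the variable names frac_day, secs_chron and secs_lub are only illustrative, and period_to_seconds() is used here as an alternative to the as.numeric() call mentioned above.

```r
library(chron)
library(lubridate)

x  <- "3:00"
x1 <- times(paste0("0:", x))   # chron times object for 0:3:00
x2 <- ms(x)                    # lubridate Period: 3 minutes, 0 seconds

# A times object is stored as a fraction of a day,
# so scaling by the seconds in a day gives seconds.
frac_day   <- as.numeric(x1)
secs_chron <- frac_day * 24 * 60 * 60

# lubridate can convert a Period to seconds directly.
secs_lub <- period_to_seconds(x2)
```

Either numeric vector can then be used as an ordinary plot axis.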
CodePudding user response:
I imagine that your data frame looks something like this example I created.
df <- data.frame(time= c("3:00", "4:00", "2:00", "1:00", "6:00", "3:30", "1:10"), video= c(1:7),
likes= c(10, 17, 6, 25, 14, 5, 19))
library(lubridate)
library(tidyverse)
I suggest using the lubridate package to convert to seconds.
df2 <- df %>%
  mutate(time_sec = as.period(ms(time), unit = "sec"))
Then you might consider presenting it as a scatter plot with likes versus time.
ggplot(df2, aes(x = likes, y = time_sec)) +
  geom_point(size = 2, shape = 23)
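Once the durations are plotted as seconds, the axis reads more naturally if the tick labels are turned back into minutes and seconds. The following is a small base R sketch of that idea; sec_to_label is a hypothetical helper name, not something from the answer above, and it assumes whole, non-negative seconds.

```r
# Hypothetical helper: format a number of seconds as "m:ss".
sec_to_label <- function(secs) {
  secs <- as.integer(secs)
  # %/% is integer division (minutes), %% is the remainder (seconds)
  sprintf("%d:%02d", secs %/% 60L, secs %% 60L)
}

labels <- sec_to_label(c(180, 210, 70))
print(labels)
```

If the time column is stored as plain numeric seconds, a function like this could be passed as the labels argument of scale_y_continuous() in ggplot2.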
CodePudding user response:
You could convert time to decimal.
t2dec <- \(x, type=c('mins', 'secs')) {
  type <- match.arg(type)
  ## split "mm:ss" strings into a two-column integer matrix
  r <- x |>
    strsplit(':') |>
    do.call(what=rbind) |>
    type.convert(as.is=TRUE)
  if (type == 'mins') {
    r[, 1] + r[, 2]/60  ## decimal minutes
  } else {
    r[, 1]*60 + r[, 2]  ## total seconds
  }
}
t2dec(dat$time) ## as minutes (default)
# [1] 1.583333 6.066667 7.366667 8.283333 8.850000 15.616667 23.383333 28.033333 33.350000 36.150000
# [11] 37.550000 40.250000 41.933333 42.700000 43.550000 47.833333 49.283333 55.066667 56.600000 59.883333
t2dec(dat$time, 's') ## as seconds (since "s" partially matches "secs")
# [1] 95 364 442 497 531 937 1403 1682 2001 2169 2253 2415 2516 2562 2613 2870 2957 3304 3396 3593
Accordingly:
op <- par(mfrow=c(1, 2))
plot(t2dec(dat$time), dat$x, type='b', col=4, xlab='mins', ylab='likes', main='Videos')
abline(h=0)
plot(t2dec(dat$time, 's'), dat$x, type='b', col=4, xlab='secs', ylab='likes', main='Videos')
abline(h=0)
par(op)
Note: R >= 4.1 is required for the \(x) function shorthand and the |> pipe.
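For older R versions, the same conversion can be sketched without the newer syntax. This is an assumed equivalent rather than the original function: t2dec_compat is a hypothetical name, and storage.mode() is swapped in for type.convert(), so the result is double rather than integer.

```r
# Hypothetical pre-4.1 variant of t2dec(): no \(x) shorthand, no |> pipe.
t2dec_compat <- function(x, type = c("mins", "secs")) {
  type <- match.arg(type)
  parts <- strsplit(x, ":", fixed = TRUE)  # list of c("mm", "ss") pairs
  r <- do.call(rbind, parts)               # character matrix, one row per time
  storage.mode(r) <- "numeric"             # convert in place, keeping the dims
  if (type == "mins") {
    r[, 1] + r[, 2] / 60                   # decimal minutes
  } else {
    r[, 1] * 60 + r[, 2]                   # total seconds
  }
}

secs <- t2dec_compat(c("01:35", "06:04"), "secs")
mins <- t2dec_compat(c("01:35", "06:04"))
```

Like the original, it expects every element to be a well-formed "mm:ss" string and does no input validation.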
Data:
dat <- structure(list(time = c("01:35", "06:04", "07:22", "08:17", "08:51",
"15:37", "23:23", "28:02", "33:21", "36:09", "37:33", "40:15",
"41:56", "42:42", "43:33", "47:50", "49:17", "55:04", "56:36",
"59:53"), x = c(-2L, 2L, -3L, 3L, 7L, 7L, -3L, 0L, 1L, 5L, 3L,
0L, 1L, 10L, 1L, 1L, 6L, 3L, -1L, -1L)), class = "data.frame", row.names = c(NA,
-20L))

