I have a data frame like
date X1 X2 X3
4/16/2019 0:00 1 2 3
4/16/2019 7:00 1 2 3
4/17/2019 0:00 1 2 3
4/17/2019 7:00 1 2 3
I would like to get
date time X1 X2 X3
4/16/2019 c(0,7) c(1,1) c(2,2) c(3,3)
4/17/2019 c(0,7) c(1,1) c(2,2) c(3,3)
where X1 is a list and X1[[1]] is a vector, that is c(1,1).
Is there an efficient way to achieve this? Thank you!
CodePudding user response:
Split the 'date' column into 'date' and 'time' columns at the whitespace (\\s+), group by 'date', then summarise across all the columns by wrapping each one in a list
library(dplyr)
library(tidyr)
library(stringr)
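df1 %>%
  # split "4/16/2019 0:00" into date and time at the whitespace
  separate(date, into = c('date', 'time'), sep = '\\s+') %>%
  # "7:00" -> "7.00" -> 7
  mutate(time = as.numeric(str_replace(time, ":", "."))) %>%
  group_by(date) %>%
  # collapse each column to one list element per date
  summarise(across(everything(), ~ list(.)))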
-output
# A tibble: 2 × 5
date time X1 X2 X3
<chr> <list> <list> <list> <list>
1 4/16/2019 <dbl [2]> <int [2]> <int [2]> <int [2]>
2 4/17/2019 <dbl [2]> <int [2]> <int [2]> <int [2]>
data
df1 <- structure(list(date = c("4/16/2019 0:00", "4/16/2019 7:00",
"4/17/2019 0:00",
"4/17/2019 7:00"), X1 = c(1L, 1L, 1L, 1L), X2 = c(2L, 2L, 2L,
2L), X3 = c(3L, 3L, 3L, 3L)),
class = "data.frame", row.names = c(NA,
-4L))
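For comparison, the same reshaping can be sketched in base R without any packages. This is a swapped-in technique, not the dplyr approach above, and the names parts, long, grp and res are only illustrative. It assumes every 'date' value has exactly one date part and one time part separated by whitespace.

```r
# Same sample data as in the answer above
df1 <- data.frame(
  date = c("4/16/2019 0:00", "4/16/2019 7:00", "4/17/2019 0:00", "4/17/2019 7:00"),
  X1 = c(1L, 1L, 1L, 1L),
  X2 = c(2L, 2L, 2L, 2L),
  X3 = c(3L, 3L, 3L, 3L),
  stringsAsFactors = FALSE
)

# Split "date time" at the whitespace: one row per input row, two columns
parts <- do.call(rbind, strsplit(df1$date, "\\s+"))

# Frame with separate date and time columns ("7:00" -> "7.00" -> 7)
long <- data.frame(
  date = parts[, 1],
  time = as.numeric(sub(":", ".", parts[, 2], fixed = TRUE)),
  df1[-1],
  stringsAsFactors = FALSE
)

# Keep groups in order of first appearance rather than sorted order
grp <- factor(long$date, levels = unique(long$date))

# One row per date; every other column becomes a list-column
res <- data.frame(date = levels(grp), stringsAsFactors = FALSE)
for (col in setdiff(names(long), "date")) {
  res[[col]] <- unname(split(long[[col]], grp))
}

res$X1[[1]]    # the X1 values for the first date, as a plain vector
res$time[[2]]  # the times for the second date
```

As in the dplyr result, each cell of X1, X2, X3 and time holds a whole vector, so res$X1[[1]] can be used directly in further computations.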
CodePudding user response:
Here is an alternative way to do it. Logic:
- separate the date and time columns (by means other than separate, as already provided by akrun)
- group by date
- summarise with across, using list and a lambda with paste (notice the .names argument in summarise)
- use across again with a lambda using paste0
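library(dplyr)
library(readr)
library(lubridate) # provides mdy_hm() and hour()
# df1 as defined in the data section above
df1 %>%
  mutate(date = mdy_hm(date)) %>%
  # extract the hour as a number, then reduce date to a plain Date
  mutate(time = parse_number(sprintf("%d", hour(date))), .before=2,
         date = as.Date(date)) %>%
  group_by(date) %>%
  # paste each group's values into one comma-separated string
  summarise(across(everything(), list(~paste(.,collapse=",")), .names="{col}")) %>%
  # wrap each string as "c(...)"
  mutate(across(-date, ~paste0("c(",.,")")))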
date time X1 X2 X3
<date> <chr> <chr> <chr> <chr>
1 2019-04-16 c(0,7) c(1,1) c(2,2) c(3,3)
2 2019-04-17 c(0,7) c(1,1) c(2,2) c(3,3)
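Note that the columns in this result are character strings such as "c(0,7)" (see the <chr> column types), not list-columns holding vectors. If actual vectors are needed later, the strings would have to be parsed back. A minimal base R sketch, assuming every cell has the form c(a,b,...) with plain numbers; cell, inner and vec are illustrative names:

```r
# A cell as produced by the paste0() step above: text, not a vector
cell <- "c(0,7)"
is.character(cell)

# Strip the leading "c(" and the trailing ")", split on commas, convert to numbers
inner <- gsub("^c\\(|\\)$", "", cell)
vec <- as.numeric(strsplit(inner, ",", fixed = TRUE)[[1]])
vec
```

If the vectors themselves are the goal, keeping them in list-columns from the start avoids this round trip through text.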
