I have a dataset that looks something like this:
ID Minutes Read Comprehension
1 25 1
1 30 1
2 20 2
2 25 2
2 30 1
I want to create a column called "day" that counts the days each person reported reading, such as below:
ID Minutes Read Comprehension Day
1 25 1 1
1 30 1 2
2 20 2 1
2 25 2 2
2 30 1 3
How would I go about doing that? The end goal is to use the "day" column to reshape my data:
df2 <- reshape(df, idvar = "ID", timevar = "day", direction = "wide")
CodePudding user response:
Since your aim is to reshape the data, try:
reshape(transform(df, time = ave(ID, ID, FUN = seq_along)), direction = 'wide', idvar = 'ID')
ID Minutes.Read.1 Comprehension.1 Minutes.Read.2 Comprehension.2 Minutes.Read.3 Comprehension.3
1 1 25 1 30 1 NA NA
3 2 20 2 25 2 30 1
If you only need the day column:
df <- transform(df, day = ave(ID, ID, FUN = seq_along))
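To see what the ave() call is doing, here is a minimal sketch on a hypothetical toy data frame (the name toy and its values are made up for illustration, not taken from the question). It uses seq_along as the function applied within each ID: seq_along always returns 1, 2, ..., n for a group of n rows, whereas seq applied to a single value such as 3 returns 1 2 3, which would not fit a one-row group.

```r
# Hypothetical toy data: IDs are interleaved and ID 3 has a single row
toy <- data.frame(ID = c(2, 1, 2, 3, 1),
                  Minutes = c(20, 25, 25, 40, 30))

# ave() applies FUN to the values of its first argument within each ID group
# and writes the results back into the original row positions
toy$day <- ave(toy$ID, toy$ID, FUN = seq_along)

toy$day
# 1 1 2 1 2
```

Because ave() puts each result back on the row it came from, this does not require the data to be sorted by ID.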
CodePudding user response:
Here is a tidyverse solution:
library(dplyr)
library(tidyr)
df %>%
  group_by(ID) %>%
  mutate(day = row_number()) %>%
  pivot_wider(
    names_from = day,
    values_from = c(MinutesRead, Comprehension)
  )
ID MinutesRead_1 MinutesRead_2 MinutesRead_3 Comprehension_1 Comprehension_2 Comprehension_3
<int> <int> <int> <int> <int> <int> <int>
1 1 25 30 NA 1 1 NA
2 2 20 25 30 2 2 1
Data
df <- structure(list(ID = c(1L, 1L, 2L, 2L, 2L),
  MinutesRead = c(25L, 30L, 20L, 25L, 30L),
  Comprehension = c(1L, 1L, 2L, 2L, 1L)),
  class = "data.frame", row.names = c(NA, -5L))
CodePudding user response:
You could do this as a one-liner, as long as the rows for each ID are adjacent:
df$Day <- unlist(lapply(rle(df$ID)$lengths, seq_len))
df
ID Minutes.Read Comprehension Day
1 1 25 1 1
2 1 30 1 2
3 2 20 2 1
4 2 25 2 2
5 2 30 1 3
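As a rough illustration of why the run-length approach depends on row order, the sketch below builds the counter from two hypothetical ID vectors (ids_sorted and ids_mixed are made-up names, not from the question). It uses base R's sequence(), which concatenates 1..n for each run length, in place of the unlist/seq_len combination.

```r
# IDs stored contiguously: one run per ID
ids_sorted <- c(1, 1, 2, 2, 2)
rle(ids_sorted)$lengths                        # run lengths: 2 3
day_sorted <- sequence(rle(ids_sorted)$lengths)
day_sorted                                     # 1 2 1 2 3

# IDs interleaved: ID 1 is split into two separate runs
ids_mixed <- c(1, 2, 1)
day_mixed <- sequence(rle(ids_mixed)$lengths)
day_mixed                                      # 1 1 1: the second row for ID 1 is not counted as day 2
```

If the rows may be out of order, sorting first, e.g. df <- df[order(df$ID), ], avoids the problem.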
CodePudding user response:
Using lapply
We first call split() to split df by ID. The output of split() is a list, so we use lapply() to process each element (i.e. the rows for one ID) of that list. For each element, we take its number of rows and use seq_len() to create the sequence from 1 to that number, which cbind() attaches as the Day column. Finally, do.call(rbind, ...) stacks the pieces back into a single data.frame.
df <- do.call(
  rbind,
  lapply(split(df, df$ID), function(x) cbind(x, Day = seq_len(nrow(x))))
)
rownames(df) <- NULL # optional: drops the ID-prefixed row names that rbind creates
#> str(df$Day)
# int [1:5] 1 2 1 2 3
Data
df <- data.frame(ID = c(1,1,2,2,2),
                 Minutes = c(25,30,20,25,30),
                 Comprehension = c(1,1,2,2,1))
