I want to subtract the result at time == 0 from the results at all other time levels, for each id.
id <- rep(1:4,each=4)
time <- rep(c(0,5,10,15),4)
a <- c(34,56,67,35)
b <-c(56,78,23,90)
c <- c(23,89,67,78)
df <- data.frame(id,time,a,b,c)
df
   id time  a  b  c
1   1    0 34 56 23
2   1    5 56 78 89
3   1   10 67 23 67
4   1   15 35 90 78
5   2    0 34 56 23
6   2    5 56 78 89
7   2   10 67 23 67
8   2   15 35 90 78
9   3    0 34 56 23
10  3    5 56 78 89
11  3   10 67 23 67
12  3   15 35 90 78
13  4    0 34 56 23
14  4    5 56 78 89
15  4   10 67 23 67
16  4   15 35 90 78
I started like this, but it feels like there must be a more efficient way. Any suggestions? Thanks!
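for (i in 1:length(unique(df$id))) {
  df_id <- df[df$id == i, ]
  for (j in 2:length(time)) {
    # transpose so that each time point becomes a column
    test <- t(df_id[, -1])
    # subtract the time == 0 column from the other three columns
    test[, c(2:4)] - test[, 1]
  }
}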
CodePudding user response:
Here's an option with dplyr -
library(dplyr)
df %>%
  group_by(id) %>%
  mutate(across(a:c, ~. - .[time == 0])) %>%
  ungroup
#      id  time     a     b     c
#   <int> <dbl> <dbl> <dbl> <dbl>
# 1     1     0     0     0     0
# 2     1     5    22    22    66
# 3     1    10    33   -33    44
# 4     1    15     1    34    55
# 5     2     0     0     0     0
# 6     2     5    22    22    66
# 7     2    10    33   -33    44
# 8     2    15     1    34    55
# 9     3     0     0     0     0
#10     3     5    22    22    66
#11     3    10    33   -33    44
#12     3    15     1    34    55
#13     4     0     0     0     0
#14     4     5    22    22    66
#15     4    10    33   -33    44
#16     4    15     1    34    55
Using time == 0 works only if every id is guaranteed to have exactly one row with time == 0. If some ids have no such row, or have more than one, match is probably the better option: it returns the position of the first match, or NA if there is none, so the baseline is always a single value.
df %>% group_by(id) %>% mutate(across(a:c, ~. - .[match(0, time)]))
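To see why match behaves differently from time == 0 inside a group, here is a small base R sketch. The vectors are made-up values for a single hypothetical id, not data from the question.

```r
# Hypothetical group with no time == 0 row
time <- c(5, 10, 15)
a    <- c(56, 67, 35)

no_base_logical <- a[time == 0]       # zero-length vector: no baseline at all
no_base_match   <- a[match(0, time)]  # NA: match() found no 0, so the index is NA

# Hypothetical group with two time == 0 rows
time2 <- c(0, 0, 5)
a2    <- c(34, 40, 56)

two_base_logical <- a2[time2 == 0]       # both baselines: 34 40
two_base_match   <- a2[match(0, time2)]  # only the first baseline: 34

# Subtracting the single baseline picked by match()
a2 - two_base_match  # 0 6 22
```

With match, a missing baseline turns the whole group into NA instead of producing a zero-length result, and duplicated baselines silently resolve to the first one, so check whether that is what you want for your data.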
CodePudding user response:
Use mapply in by.
vc <- c('a', 'b', 'c')
by(df, df$id, \(x) {x[-1, vc] <- mapply(`-`, x[-1, vc], x[1, vc]);x}) |>
  do.call(what=rbind)
#      id time  a   b  c
# 1.1   1    0 34  56 23
# 1.2   1    5 22  22 66
# 1.3   1   10 33 -33 44
# 1.4   1   15  1  34 55
# 2.5   2    0 34  56 23
# 2.6   2    5 22  22 66
# 2.7   2   10 33 -33 44
# 2.8   2   15  1  34 55
# 3.9   3    0 34  56 23
# 3.10  3    5 22  22 66
# 3.11  3   10 33 -33 44
# 3.12  3   15  1  34 55
# 4.13  4    0 34  56 23
# 4.14  4    5 22  22 66
# 4.15  4   10 33 -33 44
# 4.16  4   15  1  34 55
If the time == 0 row is not always the first row of each id, you need a more verbose function body:
{x[x$time != 0, vc] <- mapply(`-`, x[x$time != 0, vc], x[x$time == 0, vc]);x}
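For completeness, here is how that body could slot into the full by call. This is only a sketch: the small data frame df2 is hypothetical (not the question's data) and puts time == 0 in the second row of id 1 on purpose, and it assumes each id has exactly one time == 0 row.

```r
# Hypothetical data: for id 1 the baseline row is not the first row
df2 <- data.frame(
  id   = rep(1:2, each = 3),
  time = c(5, 0, 10, 0, 5, 10),
  a    = c(56, 34, 67, 34, 56, 67),
  b    = c(78, 56, 23, 56, 78, 23)
)
vc <- c("a", "b")

res <- by(df2, df2$id, function(x) {
  base <- x$time == 0  # logical flag for the baseline row
  # subtract the baseline value of each column from the non-baseline rows
  x[!base, vc] <- mapply(`-`, x[!base, vc], x[base, vc])
  x
})
res <- do.call(rbind, res)
res
```

As in the shorter version, the time == 0 rows keep their original values; only the other rows are replaced by differences.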
Data:
df <- structure(list(id = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L,
3L, 3L, 4L, 4L, 4L, 4L), time = c(0, 5, 10, 15, 0, 5, 10, 15,
0, 5, 10, 15, 0, 5, 10, 15), a = c(34, 56, 67, 35, 34, 56, 67,
35, 34, 56, 67, 35, 34, 56, 67, 35), b = c(56, 78, 23, 90, 56,
78, 23, 90, 56, 78, 23, 90, 56, 78, 23, 90), c = c(23, 89, 67,
78, 23, 89, 67, 78, 23, 89, 67, 78, 23, 89, 67, 78)), class = "data.frame", row.names = c(NA,
-16L))
