In this data.table:
dt <- data.table(id=c(1,1,1,2,2,2), time=rep(1:3,2), x=c(1,0,0,0,1,0))
dt
id time x
1: 1 1 1
2: 1 2 0
3: 1 3 0
4: 2 1 0
5: 2 2 1
6: 2 3 0
I need the following:
id time x
1: 1 1 1
2: 1 2 1
3: 1 3 1
4: 2 1 0
5: 2 2 1
6: 2 3 1
that is
- if
x==1attime==1thenx=1at times 2 and 3, byid - if
x==1attime==2thenx=1at time 3, byid
For the first point (I guess the second one will be similar), I have tried approaches mentioned in similar questions I posted before (here and here), but none work:
dt[x==1[time == 1], x := x[time == 1], id]gives an errorsetDT(dt)[, x2:= ifelse(x==1 & time==1, x[time==1], x), by=id]changesxonly attime 1(so, no real change observed)
It would be much easier to work with data.table in wide format, but I keep facing this kind of problem in long format and I don't want to reshape my data all the time
Thank you!
EDIT:
The answer provided by @GregorThomas, dt[, x := cummax(x), by = id], works for the problem that I presented.
Now I ask the same question for a character variable:
dt2 <- data.table(id=c(1,1,1,2,2,2), time=rep(1:3,2), x=c('a','b','b','b','a','b'))
dt2
id time x
1: 1 1 a
2: 1 2 b
3: 1 3 b
4: 2 1 b
5: 2 2 a
6: 2 3 b
In the table above, how could be done the following:
- if
x=='a'attime==1thenx='a'at times 2 and 3, by id - if
x=='a'attime==2thenx='a'at time 3, by id
CodePudding user response:
Using the cumulative maximum function cummax:
dt[, x := cummax(x), by = id]
dt
# id time x
# 1: 1 1 1
# 2: 1 2 1
# 3: 1 3 1
# 4: 2 1 0
# 5: 2 2 1
# 6: 2 3 1
