So my dataframe is structured like so:
| x | s |
|---|---|
| NA | 0 |
| 13 | 0 |
| -3 | 0 |
| 2 | 0 |
| -4 | 0 |
For each row, I would like to take the previous value of s (i.e. lag(s)), add the current value of x to it, and store the result in s.
My output data would therefore look like this:
| x | s |
|---|---|
| NA | 0 |
| 13 | 13 |
| -3 | 10 |
| 2 | 12 |
| -4 | 8 |
I tried the following, but after some fiddling I only managed to get all NAs or all 0s:
mydata$s = lag(mydata$s) + mydata$x
Note - if it helps, I can remove the first row.
CodePudding user response:
It works for me. Setup (lag() here is dplyr's lag(), so dplyr needs to be loaded):
library(dplyr)
mydata <- data.frame(x = c(NA, 13, -3, 2, -4), s = c(0, 13, 10, 12, 8))
mydata$s <- lag(mydata$s) + mydata$x
Gives:
mydata
   x  s
1 NA NA
2 13 13
3 -3 10
4  2 12
5 -4  8
The only difference from your expected output is that my first s is NA. That is expected, since the first x is NA and lag() also returns NA for the first element.
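One thing worth checking if the one-liner misbehaves: the lag() that shifts values is dplyr's. With only base R loaded, lag() resolves to stats::lag(), which adjusts time-series attributes and leaves the values where they are. A small base-R sketch of the difference, where shift1() is a hypothetical helper written only for this illustration:

```r
s <- c(0, 13, 10, 12, 8)

# Hypothetical helper: move every value down one position, padding with NA
shift1 <- function(v) c(NA, head(v, -1))

shifted   <- shift1(s)                    # first element becomes NA, rest move down
unshifted <- as.numeric(stats::lag(s))    # same values in the same positions
```

If shifted and unshifted look the same in your session, lag() is not doing what the answer above assumes.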
CodePudding user response:
You can use cumsum() for this, replacing NA with 0 only inside the calculation so that the original x column is left unchanged.
library(tidyverse)
mydata %>% mutate(s = cumsum(ifelse(is.na(x), 0, x)))
   x  s
1 NA  0
2 13 13
3 -3 10
4  2 12
5 -4  8
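Why cumsum() fits: the rule in the question is a recurrence, s[i] = s[i-1] + x[i], where each step needs the already updated previous s. A vectorized lag() on the original all-zero column never sees those updates. A minimal sketch with an explicit loop, assuming (as above) that an NA in x should count as 0:

```r
mydata <- data.frame(x = c(NA, 13, -3, 2, -4), s = 0)

# Assumption: a missing x contributes nothing to the running total
x0 <- ifelse(is.na(mydata$x), 0, mydata$x)

s <- numeric(nrow(mydata))
s[1] <- x0[1]
for (i in seq_along(x0)[-1]) {
  # each step uses the s computed in the previous iteration
  s[i] <- s[i - 1] + x0[i]
}
mydata$s <- s

# The loop and cumsum() agree
stopifnot(isTRUE(all.equal(s, cumsum(x0))))
```

The loop is only there to make the recurrence visible; cumsum() does the same thing in one vectorized call.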
CodePudding user response:
Base R solution:
mydata$s <- c(mydata$x[1], cumsum(mydata$x[-1]))
Data:
mydata <- data.frame(x = c(NA, 13, -3, 2, -4))
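Note that the one-liner above keeps NA as the first s, and cumsum() propagates NA forward, so an NA anywhere after the first row of x would turn every later s into NA. If you would rather start at 0 and skip missing values, a base-R sketch under the assumption that NA should count as 0:

```r
mydata <- data.frame(x = c(NA, 13, -3, 2, -4))

# Assumption: NA in x is treated as 0; the x column itself is not modified
x0 <- replace(mydata$x, is.na(mydata$x), 0)
mydata$s <- cumsum(x0)
```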
