I would like to create a new column that displays the first year in V2 where the value of V3 appears. However, after null values I would like to put again the first year for the reappearance of V3.
That is, in possession of the following data:
I would like to get a new V4 column as follows:
I appreciate any help.
Below are the data:
structure(list(V1 = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3,
3, 3), V2 = c(2005, 2006, 2007, 2008, 2009, 2005, 2006, 2007,
2008, 2009, 2005, 2006, 2007, 2008, 2009), V3 = c(0, 0, 10, 25,
35, 12, 15, 0, 15, 17, 13, 0, 0, 15, 12)), row.names = c(NA,
15L), class = "data.frame")
CodePudding user response:
Using tidyverse and rleid from data.table you can try the following. You can group_by both V1 as well as a second group based on whether value in V3 is zero. This assumes the years are in chronological order (if not, may need to add arrange by V2 first).
library(tidyverse)
library(data.table)
df %>%
group_by(V1, grp = rleid(V3 != 0)) %>%
mutate(V4 = ifelse(V3 == 0, 0, first(V2))) %>%
ungroup %>%
select(-grp)
Output
V1 V2 V3 V4
<dbl> <dbl> <dbl> <dbl>
1 1 2005 0 0
2 1 2006 0 0
3 1 2007 10 2007
4 1 2008 25 2007
5 1 2009 35 2007
6 2 2005 12 2005
7 2 2006 15 2005
8 2 2007 0 0
9 2 2008 15 2008
10 2 2009 17 2008
11 3 2005 13 2005
12 3 2006 0 0
13 3 2007 0 0
14 3 2008 15 2008
15 3 2009 12 2008


