Conditionally copying reference values in R

I am trying to conditionally copy values from the x column into a new column based on a reference value. For example, in row 1 (time == 1) the ref value is 7, so the newx value should be copied from the row where time == 1 and id == 7. The copied value must always come from the same time block.

If the ref value is 0, the newx value should also be 0.

I have tried a few approaches, and the code below is the closest I have come, but it still isn't working:

        library(dplyr)
        x <- sample(1:50, 24)
        y <- sample(1:50, 24)
        ref <- c(7,7,7,7,0,0,0,0,0,0,0,0,4,3,4,1,8,8,5,8,0,0,0,0)
        id <- rep(seq(1,8,1), 3)
        time <- rep(1:3, each = 8)

    x  y ref id time
1  41 29   7  1    1
2  18 37   7  2    1
3  50 25   7  3    1
4  47  7   7  4    1
5   2 40   0  5    1
6  22 19   0  6    1
7  48  9   0  7    1
8  26 36   0  8    1
9  49 47   0  1    2
10 46 18   0  2    2
11 25 23   0  3    2
12 38  3   0  4    2
13 28 31   4  5    2
14 34  4   3  6    2
15 21 32   4  7    2
16  9 48   1  8    2
17 43 43   8  1    3
18 39 38   8  2    3
19  6 16   5  3    3
20 12 41   8  4    3
21  1 13   0  5    3
22 19 17   0  6    3
23  7 34   0  7    3
24 33 10   0  8    3

        
        df <- as.data.frame(cbind(x, y, ref, id, time))

        df <- df %>%
          group_by(time) %>%
          mutate(Newx = case_when(ref > 0 ~ x[which(id == ref)],
                                  TRUE ~ 0))

CodePudding user response：

You can join df with itself, matching each row's ref against id within the same time. The last mutate only replaces the NAs that the join produces for the ref == 0 rows with 0. You could also use tidyr::replace_na, but I wanted to stick to dplyr only:

df %>%
  left_join(df %>% select(x, id, time) %>% rename(newx = x), by= c("time", "ref" = "id")) %>%
  mutate(newx = ifelse(is.na(newx), 0, newx))

Which results in:

    x  y ref id time newx
1  44 44   7  1    1   36
2  37 26   7  2    1   36
3  40 27   7  3    1   36
4  32 46   7  4    1   36
5  48 33   0  5    1    0
6  31  6   0  6    1    0
7  36  1   0  7    1    0
8  27 11   0  8    1    0
9  26 32   0  1    2    0
10 42 22   0  2    2    0
11 22 21   0  3    2    0
12 15 28   0  4    2    0
13 45 47   4  5    2   15
14 49  4   3  6    2   22
15 25 50   4  7    2   15
16 14  3   1  8    2   26
17 13 42   8  1    3   12
18 38  7   8  2    3   12
19 10 12   5  3    3   50
20  2 40   8  4    3   12
21 50 43   0  5    3    0
22  4  9   0  6    3    0
23 34 49   0  7    3    0
24 12 31   0  8    3    0
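
For comparison, here is a minimal sketch of the tidyr::replace_na variant mentioned above. It assumes the data is built as in the DATA block further down the thread; `lookup` and `res` are names introduced only for this illustration. The exact numbers depend on the random sample, so no output is shown.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data (same construction as the DATA block in this thread)
set.seed(123)
x <- sample(1:50, 24)
y <- sample(1:50, 24)
ref <- c(7, 7, 7, 7, 0, 0, 0, 0, 0, 0, 0, 0, 4, 3, 4, 1, 8, 8, 5, 8, 0, 0, 0, 0)
id <- rep(seq(1, 8, 1), 3)
time <- rep(1:3, each = 8)
df <- data.frame(x, y, ref, id, time)

# Lookup table: one row per (time, id), carrying the x value to copy
lookup <- df %>% select(time, id, newx = x)

res <- df %>%
  # match each row's ref against id, restricted to the same time block
  left_join(lookup, by = c("time", "ref" = "id")) %>%
  # rows with ref == 0 find no partner and come back as NA; set them to 0
  replace_na(list(newx = 0L))
```

Because each (time, id) pair occurs only once in `lookup`, the join should not add rows, so `res` keeps the 24 rows of `df` in their original order.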

CodePudding user response：

Using purrr::map_dbl you could do:

library(purrr)
library(dplyr)

df %>%
  group_by(time) %>%
  mutate(newx = map_dbl(ref, function(ref) if (ref > 0) .data$x[.data$id == ref] else 0)) %>% 
  ungroup()
#> # A tibble: 24 × 6
#>        x     y   ref    id  time  newx
#>    <int> <int> <dbl> <dbl> <int> <dbl>
#>  1    31    17     7     1     1    37
#>  2    15    43     7     2     1    37
#>  3    14    39     7     3     1    37
#>  4     3    12     7     4     1    37
#>  5    42    15     0     5     1     0
#>  6    43    32     0     6     1     0
#>  7    37    42     0     7     1     0
#>  8    48     7     0     8     1     0
#>  9    25     9     0     1     2     0
#> 10    26    41     0     2     2     0
#> # … with 14 more rows

DATA

set.seed(123)

x <- sample(1:50, 24)
y <- sample(1:50, 24)
ref <- c(7, 7, 7, 7, 0, 0, 0, 0, 0, 0, 0, 0, 4, 3, 4, 1, 8, 8, 5, 8, 0, 0, 0, 0)
id <- rep(seq(1, 8, 1), 3)
time <- rep(1:3, each = 8)

df <- data.frame(x, y, ref, id, time)
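
The per-group lookup can also be written in vectorised form with base R's match() instead of purrr. This is only a sketch of an alternative, not something taken from the answers above; `res` is a name introduced here, and it assumes each id appears at most once per time block.

```r
library(dplyr)

# Rebuild the example data (same construction as the DATA block above)
set.seed(123)
x <- sample(1:50, 24)
y <- sample(1:50, 24)
ref <- c(7, 7, 7, 7, 0, 0, 0, 0, 0, 0, 0, 0, 4, 3, 4, 1, 8, 8, 5, 8, 0, 0, 0, 0)
id <- rep(seq(1, 8, 1), 3)
time <- rep(1:3, each = 8)
df <- data.frame(x, y, ref, id, time)

res <- df %>%
  group_by(time) %>%
  # match(ref, id) returns, for every row, the position inside the current
  # time block whose id equals ref (NA when ref is 0, since no id is 0);
  # if_else() then keeps the looked-up x where ref > 0 and uses 0 otherwise
  mutate(newx = if_else(ref > 0, x[match(ref, id)], 0L)) %>%
  ungroup()
```

This is close to the case_when attempt in the question; the difference is that match() looks up each ref individually, whereas id == ref compares the two columns row by row.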