Consider the following dataset:
df <- tibble(v1 = 1:5, v2= 101:105, v3 = c("v1", "v2", "v1", "v2", "v1"))
# A tibble: 5 × 3
v1 v2 v3
<int> <int> <chr>
1 1 101 v1
2 2 102 v2
3 3 103 v1
4 4 104 v2
5 5 105 v1
I would like to generate a new column that takes values from either v1 or v2, depending on which column is listed in v3.
# A tibble: 5 × 4
v1 v2 v3 v4
<int> <int> <chr> <dbl>
1 1 101 v1 1
2 2 102 v2 102
3 3 103 v1 3
4 4 104 v2 104
5 5 105 v1 5
Normally, I would use if_else, or if I had more cases, case_when. However, I have a lot of columns, so I'd rather not have a case_when statement that's many lines long. Is there a way to get R to interpret the values in v3 as column names? I've tried embracing the expression with {{ }} and using the .data[[ ]], but I can't seem to figure out the correct syntax.
CodePudding user response:
Here's a vectorized approach, no need to go row-wise or map it one-by-one.
df %>%
mutate(v4 = cbind(v1,v2)[ cbind(row_number(), match(v3, c("v1", "v2"))) ])
# # A tibble: 5 x 4
# v1 v2 v3 v4
# <int> <int> <chr> <int>
# 1 1 101 v1 1
# 2 2 102 v2 102
# 3 3 103 v1 3
# 4 4 104 v2 104
# 5 5 105 v1 5
CodePudding user response:
A tidyverse option would be rowwise with extraction using cur_data()
library(dplyr)
df %>%
rowwise %>%
mutate(v4 = cur_data()[[v3]]) %>%
ungroup
# A tibble: 5 × 4
v1 v2 v3 v4
<int> <int> <chr> <int>
1 1 101 v1 1
2 2 102 v2 102
3 3 103 v1 3
4 4 104 v2 104
5 5 105 v1 5
Or a compact approach would be get after rowwise
df %>%
rowwise %>%
mutate(v4 = get(v3)) %>%
ungroup
Or in base R, use row/column indexing for faster execution
df$v4 <- as.data.frame(df[1:2])[cbind(seq_len(nrow(df)),
match(df$v3, names(df)))]
df$v4
[1] 1 102 3 104 5
CodePudding user response:
You can try the following base R code with diag as.matrix (or t)
transform(
df,
v4 = diag(as.matrix(df[v3]))
)
or
transform(
df,
v4 = diag(t(df[v3]))
)
which gives
v1 v2 v3 v4
1 1 101 v1 1
2 2 102 v2 102
3 3 103 v1 3
4 4 104 v2 104
5 5 105 v1 5
CodePudding user response:
Here is a way how we could do it with pivot_longer:
- bring into long format with
pivot_longer filterbind_cols()v1andv2
library(tidyr)
library(dplyr)
df %>%
pivot_longer(
-v3,
names_to = "name",
values_to = "v4"
) %>%
filter(v3 == name) %>%
bind_cols(v1 = df$v1, v2=df$v2) %>%
select(v1, v2, v3, v4)
v1 v2 v3 v4
<int> <int> <chr> <int>
1 1 101 v1 1
2 2 102 v2 102
3 3 103 v1 3
4 4 104 v2 104
5 5 105 v1 5
CodePudding user response:
A base R solution:
df <- tibble(v1 = 1:5, v2= 101:105, v3 = c("v1", "v2", "v1", "v2", "v1"))
df$v4 <- apply(df, 1, function(x) x[x[3]])
df
#> # A tibble: 5 × 4
#> v1 v2 v3 v4
#> <int> <int> <chr> <chr>
#> 1 1 101 v1 1
#> 2 2 102 v2 102
#> 3 3 103 v1 3
#> 4 4 104 v2 104
#> 5 5 105 v1 5
Another possible solution, using purrr::pmap_dfr:
library(tidyverse)
df <- tibble(v1 = 1:5, v2= 101:105, v3 = c("v1", "v2", "v1", "v2", "v1"))
df %>%
mutate(pmap_dfr(., ~ list(v4 = if_else(..3 == "v1", ..1, ..2))))
#> # A tibble: 5 × 4
#> v1 v2 v3 v4
#> <int> <int> <chr> <int>
#> 1 1 101 v1 1
#> 2 2 102 v2 102
#> 3 3 103 v1 3
#> 4 4 104 v2 104
#> 5 5 105 v1 5
