I am trying to write a function that can be used within a dplyr pipeline. It should take an arbitrary number of columns as arguments, and replace certain substrings in only those columns. Below is a simple example of what I have so far.
library(tidyverse)
tib <- tibble(
  x = c("cats and dogs", "foxes and hounds"),
  y = c("whales and dolphins", "cats and foxes"),
  z = c("dogs and geese", "cats and mice")
)
filter_words <- function(.data, ...) {
  words_to_filter <- c("cat", "dog")
  .data %>% mutate(
    across(..., ~ gsub(
      paste0(words_to_filter, collapse = "|"),
      "#@!*", ., perl = TRUE
    )
    )
  )
}
filtered_tib <- tib %>%
  filter_words(x, y)
If this worked I would expect:
x                 y                    z
#@!*s and #@!*s   whales and dolphins  dogs and geese
foxes and hounds  #@!*s and foxes      cats and mice
But I get an error:
Error: Can't splice an object of type `closure` because it is not a vector
Run `rlang::last_error()` to see where the error occurred.
Called from: signal_abort(cnd)
I have tried numerous combinations of non-standard evaluation, as gleaned from the tidyverse docs and many questions on SO, and have seen almost as many different errors! Would anyone be able to help get this working? It does work if I replace the dots with everything(), but that does not fit my use case to only filter certain columns.
CodePudding user response:
If you are using a recent tidyverse, the recommended approach is to use the {{ }} (embrace) operator, which defuses the argument and injects it straight into .cols in across. Something like this:
filter_words <- function(.data, .mycols) {
  words_to_filter <- c("cat", "dog")
  .data %>% mutate(
    across({{ .mycols }}, ~ gsub(
      paste0(words_to_filter, collapse = "|"),
      "#@!*", ., perl = TRUE
    )
    )
  )
}
tib %>% filter_words(c(x, y))
You can then treat .mycols as the usual first argument of across and use whatever tidy-select you want. The output is
# A tibble: 2 x 3
  x                y                   z
  <chr>            <chr>               <chr>
1 #@!*s and #@!*s  whales and dolphins dogs and geese
2 foxes and hounds #@!*s and foxes     cats and mice
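To illustrate the "whatever tidy-select you want" point, here is a minimal sketch that calls the same function with a negation and with a predicate. It assumes dplyr 1.0.0 or later; the pattern is built once in a local variable (pattern, a name introduced here) and .x is used in place of ., which does not change the behavior.

```r
library(dplyr)

tib <- tibble(
  x = c("cats and dogs", "foxes and hounds"),
  y = c("whales and dolphins", "cats and foxes"),
  z = c("dogs and geese", "cats and mice")
)

filter_words <- function(.data, .mycols) {
  # Build the regular expression once: "cat|dog"
  pattern <- paste0(c("cat", "dog"), collapse = "|")
  .data %>%
    mutate(across({{ .mycols }}, ~ gsub(pattern, "#@!*", .x, perl = TRUE)))
}

# Bare column names combined with c(), as in the answer
res_xy <- tib %>% filter_words(c(x, y))

# Negation: every column except z (the same selection as c(x, y) here)
res_not_z <- tib %>% filter_words(!z)

# Predicate: every character column, so z is filtered as well
res_chr <- tib %>% filter_words(where(is.character))
```

The trade-off is that the caller has to pass a single selection expression, so several bare names must be wrapped in c().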
CodePudding user response:
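Inside your function, across(..., should instead be across(c(...), because across takes a single selection as its first argument (.cols). When the dots are forwarded directly, only x lands in .cols and y is matched to the next argument, .fns. Wrapping the dots in c() combines them into one selection.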
library(dplyr, warn.conflicts = FALSE)
sessionInfo()$otherPkgs$dplyr$Version
#> [1] "1.0.7"
tib <- tibble(
  x = c("cats and dogs", "foxes and hounds"),
  y = c("whales and dolphins", "cats and foxes"),
  z = c("dogs and geese", "cats and mice")
)
filter_words <- function(.data, ...) {
  words_to_filter <- c("cat", "dog")
  .data %>% mutate(
    across(c(...), ~ gsub(
      paste0(words_to_filter, collapse = "|"),
      "#@!*", ., perl = TRUE
    )
    )
  )
}
tib %>%
  filter_words(x, y)
#> # A tibble: 2 × 3
#>   x                y                   z
#>   <chr>            <chr>               <chr>
#> 1 #@!*s and #@!*s  whales and dolphins dogs and geese
#> 2 foxes and hounds #@!*s and foxes     cats and mice
Created on 2022-01-17 by the reprex package (v2.0.1)
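Because c(...) is an ordinary tidy-select expression, the dots version should also accept selection helpers mixed with bare names. A minimal sketch, assuming dplyr 1.0.0 or later; the pattern variable and the use of .x instead of . are cosmetic changes made here, not part of the answer.

```r
library(dplyr)

tib <- tibble(
  x = c("cats and dogs", "foxes and hounds"),
  y = c("whales and dolphins", "cats and foxes"),
  z = c("dogs and geese", "cats and mice")
)

filter_words <- function(.data, ...) {
  # Build the regular expression once: "cat|dog"
  pattern <- paste0(c("cat", "dog"), collapse = "|")
  .data %>%
    # c(...) turns all forwarded arguments into one selection for .cols
    mutate(across(c(...), ~ gsub(pattern, "#@!*", .x, perl = TRUE)))
}

# A bare name and a helper in the same call: filters x and z, leaves y alone
res_mixed <- tib %>% filter_words(x, starts_with("z"))
```

This keeps the calling convention from the question, filter_words(x, y), with no c() needed at the call site.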
CodePudding user response:
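You may use match.call to capture the dots (...) as character strings and pass them to all_of(). Note that this only handles bare column names, not tidy-select helpers.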
library(dplyr)
filter_words <- function(.data, ...) {
  words_to_filter <- c("cat", "dog")
  args <- as.character(match.call(expand.dots = FALSE)$`...`)
  .data %>% mutate(
    across(all_of(args), ~ gsub(
      paste0(words_to_filter, collapse = "|"),
      "#@!*", ., perl = TRUE
    )
    )
  )
}
tib %>% filter_words(x, y)
#   x                y                   z
#   <chr>            <chr>               <chr>
# 1 #@!*s and #@!*s  whales and dolphins dogs and geese
# 2 foxes and hounds #@!*s and foxes     cats and mice
tib %>% filter_words(x)
# A tibble: 2 x 3
#   x                y                   z
#   <chr>            <chr>               <chr>
# 1 #@!*s and #@!*s  whales and dolphins dogs and geese
# 2 foxes and hounds cats and foxes      cats and mice
