Using as.formula "backwards"

Time:01-25

I am trying to run a Wilcoxon test (wilcox.test) across the columns of a data frame within a script, without having to manually type every variable I want to compare before I run it.

So far, as.formula seems to pick up only the first of the many variables I'm trying to examine. If I run:

n <- names(df)
f <- as.formula(paste(n[!n %in% "cluster"], paste("~ cluster", collapse = " + ")))
f

I get only the first variable ~ cluster, along with the warning:

Using formula(x) is deprecated when x is a character vector of length > 1.
  Consider formula(paste(x, collapse = " ")) instead.

I was wondering if anyone knows how to run this in "reverse", so that I get every one of my variables ~ cluster within a function. If I type them all manually (formula = c(x1, x2, x3 ...) ~ cluster) and run the Wilcoxon test, I get the appropriate output. I just want to define them without typing them out by hand.

CodePudding user response:

If you didn't mean a single formula with all variables on the left-hand side:

as.formula(paste(paste(setdiff(n, 'cluster'), collapse=' + '), '~ cluster'))
# x1 + x2 + x3 ~ cluster

you could build one formula per variable with lapply and setdiff:

foo <- lapply(setdiff(n, 'cluster'), \(x) as.formula(paste(x, '~ cluster')))
foo
# [[1]]
# x1 ~ cluster
# <environment: 0x55b1ca157078>
#
# [[2]]
# x2 ~ cluster
# <environment: 0x55b1ca159708>
#
# [[3]]
# x3 ~ cluster
# <environment: 0x55b1ca1eed50>

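Then pass a single element of the list to the test (df is the data frame from the question),

wilcox.test(foo[[1]], data = df)

or run the test for every formula at once:

lapply(foo, \(f) wilcox.test(f, data = df))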
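The question does not include the data itself, so here is a minimal end-to-end sketch on made-up data. The data frame df, its columns, and the two cluster levels are assumptions for illustration only; wilcox.test with a formula expects the grouping variable to have exactly two levels.

```r
# Toy data (hypothetical): three numeric columns and a two-level grouping factor
set.seed(1)
df <- data.frame(
  x1 = rnorm(20),
  x2 = rnorm(20, mean = 1),
  x3 = runif(20),
  cluster = factor(rep(c("a", "b"), each = 10))
)

vars <- setdiff(names(df), "cluster")

# One formula per variable, named so the results are easy to look up
forms <- lapply(vars, function(v) as.formula(paste(v, "~ cluster")))
names(forms) <- vars

# Run the test for each formula and keep only the p-values
tests <- lapply(forms, function(f) wilcox.test(f, data = df))
p_values <- vapply(tests, function(res) res$p.value, numeric(1))
p_values
```

Naming the list means a single result can be pulled out by variable, e.g. tests[["x2"]], rather than by position.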

Note: the \(x) shorthand for function(x) requires R >= 4.1.
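As a possible alternative to pasting strings, stats::reformulate() can build the same formulas from the term and response names. This is a different technique from the one in the answer, shown only as a sketch; the name foo2 is made up here, and function(v) is used so it also runs on R versions before 4.1.

```r
# Variable names as in the Data section of this answer
n <- c(paste0("x", 1:3), "cluster")

# reformulate(termlabels, response) builds `response ~ termlabels`
foo2 <- lapply(setdiff(n, "cluster"),
               function(v) reformulate("cluster", response = v))

# Deparse each formula back to text to check what was built
built <- vapply(foo2, deparse, character(1))
built
```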


Data:

n <- c(paste0('x', 1:3), 'cluster')
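For context on the warning in the question, here is a small sketch of what paste() returns in each case. paste() is vectorized over its arguments, so pasting three names onto "~ cluster" gives three strings, and as.formula() then warns about a character vector of length > 1. The names pieces and single are made up for illustration.

```r
n <- c(paste0("x", 1:3), "cluster")
vars <- n[!n %in% "cluster"]

# Vectorized paste: one string per variable, so a character vector of length 3
pieces <- paste(vars, "~ cluster")
length(pieces)

# Collapsing the variable names first yields a single string,
# which as.formula() turns into a single formula
single <- paste(paste(vars, collapse = " + "), "~ cluster")
length(single)
f <- as.formula(single)
all.vars(f)
```

Note that in the single-formula version the left-hand side is the expression x1 + x2 + x3, which is usually not what a per-variable test needs; that is why the list of separate formulas is the more likely fit.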