I'm trying to fit a regression model with all two-way interactions, and I have about 70 variables for each observation.
There is one variable, say z, that I want to include as a main effect but exclude from all two-way interactions.
So right now I have something like this:
lm(y ~ .^2, data = d)
I'd like an easy way to do this:
lm(y ~ . (not z)^2, data = d)
I know it's a lot of variables. I'm an academic researcher and just need to see what's significant when I run the model with everything in it. For my purposes, z makes intuitive sense as a main effect but not as part of an interaction.
Thank you!
CodePudding user response:
As Ritchie Sacramento and I discussed in the comments, this should work:
lm(mpg ~ (. - carb)^2 + carb, data = mtcars)
The (. - carb)^2 part expands to the main effects and all two-way interactions of every variable except carb; + carb then adds carb back as a main effect only.
For your data, it would be something like this:
lm(y ~ (. - z)^2 + z, data = d)
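If you want to check the formula before fitting anything, one option is to expand it with terms() and look at the term labels. This is only a sketch on mtcars; the variable names tt, labs and inter are just for illustration.

```r
# Expand the formula against mtcars without fitting a model
tt <- terms(mpg ~ (. - carb)^2 + carb, data = mtcars)
labs <- attr(tt, "term.labels")

# Interaction terms are the labels that contain a colon
inter <- grep(":", labs, value = TRUE)

# TRUE would mean carb leaked into an interaction
any(grepl("carb", inter))

# carb should still be present as a main effect
"carb" %in% labs
```

With ~70 variables this is a cheap way to confirm that z appears exactly once in the expanded term list before you run the full lm() call.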
CodePudding user response:
I think you can put this together with some string manipulation and reformulate().
Sample data (we're not even going to try to fit a model, just figure out how to construct the formula, so this should be OK).
dd <- data.frame(a = 1:3, b=1:3, c=1:3, d=1:3, e=1:3, f=1:3)
Let's suppose a is the response and f is the focal variable that you want to include only as a main effect.
v1 <- paste(setdiff(names(dd), c("a", "f")), collapse = " + ")
v2 <- sprintf("(%s)^2", v1)
form <- reformulate(c(v2, "f"), response = "a")
## a ~ (b + c + d + e)^2 + f
colnames(model.matrix(form, data = dd))
results:
[1] "(Intercept)" "b" "c" "d" "e"
[6] "f" "b:c" "b:d" "b:e" "c:d"
[11] "c:e" "d:e"
Confirming that @FlapJack's answer also works:
colnames(model.matrix(a ~ (. - f)^2 + f, data = dd))
[1] "(Intercept)" "b" "c" "d" "e"
[6] "f" "b:c" "b:d" "b:e" "c:d"
[11] "c:e" "d:e"
(On the other hand, you could use my framework to do more complicated things like include/exclude variables on the basis of regular expressions ...)
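As a rough illustration of the regular-expression idea, here is a sketch that reuses the same toy data and keeps every predictor whose name matches a pattern as a main effect only. The pattern "^[ef]$" and the names main_only_pattern, main_only and interacting are made up for this example; substitute whatever rule fits your variable names.

```r
dd <- data.frame(a = 1:3, b = 1:3, c = 1:3, d = 1:3, e = 1:3, f = 1:3)

response <- "a"
# Hypothetical rule: predictors matching this pattern get a main effect only
main_only_pattern <- "^[ef]$"

predictors  <- setdiff(names(dd), response)
main_only   <- grep(main_only_pattern, predictors, value = TRUE)
interacting <- setdiff(predictors, main_only)

# Two-way interactions among the remaining predictors, plus the main-effect-only terms
inter_term <- sprintf("(%s)^2", paste(interacting, collapse = " + "))
form <- reformulate(c(inter_term, main_only), response = response)

colnames(model.matrix(form, data = dd))
```

Here e and f should show up only as main effects, while b, c and d get all of their pairwise interactions.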
