How can I use the names stored in a character vector as variables in a model? For example, why can't I do this:
loop_variables = c("Age", "BMI", "Height")
for (i in 1:length(loop_variables)) {
basic_logistic_model = glm(outcome ~ loop_variables[i], data = DB, family = "binomial")
summary(basic_logistic_model)
}
I see a lot of R users storing the names of study variables in a vector and then looping over it. What am I doing wrong?
CodePudding user response:
It is the formula that needs to be updated: it has to be built from the string, for which we may use paste or reformulate. In addition, it is better to store the output of summary in an object; a list suits this well.
summary_lst <- vector('list', length(loop_variables))
names(summary_lst) <- loop_variables
for (i in seq_along(loop_variables)) {
# convert the column to factor column
DB[[loop_variables[i]]] <- factor(DB[[loop_variables[i]]])
# create the formula
fmla <- reformulate(loop_variables[i], response = 'outcome')
basic_logistic_model = glm(fmla, data=DB, family="binomial")
# assign the summary output to the list element
summary_lst[[i]] <- summary(basic_logistic_model)
}
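The original loop fails because a formula is not evaluated: `outcome ~ loop_variables[i]` keeps `loop_variables[i]` as a literal expression, so glm() ends up with the one-element string "Age" rather than the Age column. A minimal sketch of the difference, using base R only and the variable names from the question (no data is needed to build the formulas; `literal`, `f1` and `f2` are illustrative names):

```r
loop_variables <- c("Age", "BMI", "Height")
i <- 1

# Written literally, the right-hand side stays an unevaluated expression,
# so the formula refers to `loop_variables` and `i`, not to `Age`
literal <- outcome ~ loop_variables[i]
all.vars(literal)

# reformulate() builds the formula from the string held in the vector
f1 <- reformulate(loop_variables[i], response = "outcome")

# as.formula(paste(...)) is the paste-based alternative
f2 <- as.formula(paste("outcome ~", loop_variables[i]))

all.vars(f1)
identical(deparse(f1), deparse(f2))
```

Both `f1` and `f2` deparse to `outcome ~ Age`, which is what glm() needs.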
CodePudding user response:
In my view, the data transformation part (e.g. converting variables into factors) and the modelling part (e.g. calling glm()) ought to be kept separate rather than mixed inside the loop, for the sake of readability and efficiency.
Here, I will show how to execute looping iterations using purrr::map(), while the data to be analysed is transformed using dplyr::mutate() beforehand.
Package loading
library(purrr) # for `map`, `set_names`
library(dplyr) # for `mutate`
Data transformation
Add new variables that were converted into factors using dummy coding
fct_ToothGrowth <- ToothGrowth |>
  mutate(
    fct_dose = dose |>
      as.factor(),
    ## The numeric variable `len` is cut into
    ## a three-level factor
    fct_len = len |>
      cut(3) |>
      as.factor()
  )
contrasts(fct_ToothGrowth$fct_dose)
contrasts(fct_ToothGrowth$fct_len)
Add new variables that were converted into factors using non-dummy coding
Sum contrast and forward difference coding are used here as examples.
fct_ToothGrowth <- ToothGrowth |>
  mutate(
    fct_dose = `contrasts<-`(
      factor(
        dose,
        levels = c("0.5", "1", "2")
      ),
      ## sum contrast coding (also known as deviation coding)
      value = contr.sum(3)
    ),
    fct_len = `contrasts<-`(
      factor(
        cut(len, 3)
      ),
      ## forward difference coding
      value = MASS::contr.sdif(3)
    )
  )
contrasts(fct_ToothGrowth$fct_dose)
contrasts(fct_ToothGrowth$fct_len)
Looping glm()
explanatory_variables <- c("fct_len", "fct_dose", "len", "dose")
summaries <- map(
  .x = explanatory_variables,
  ## `.x` takes each element of `explanatory_variables` in turn,
  ## so the strings "supp ~ fct_len", ..., "supp ~ dose" are built
  ## and passed to the first argument of `glm()`, namely `formula`
  ~ paste0("supp ~ ", .x) |>
    glm(family = binomial, data = fct_ToothGrowth) |>
    ## keep the summary of each fitted model
    summary()
) |>
  ## name the list elements after the explanatory variables
  set_names(nm = explanatory_variables)
summaries$fct_len
summaries$fct_dose
summaries$len
summaries$dose
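For comparison, the same pattern can be sketched in base R without extra packages. This is only an illustration under stated assumptions: it uses the untransformed built-in ToothGrowth data so that it runs on its own, and the names `predictors`, `fits` and `coef_tables` are hypothetical, not part of the answer above. It also shows how the coefficient tables can be pulled out of the named list in one pass.

```r
predictors <- c("len", "dose")

# Fit one logistic model per predictor and keep its summary;
# setNames() makes lapply() return a list named after the predictors
fits <- lapply(setNames(predictors, predictors), function(v) {
  fmla <- reformulate(v, response = "supp")
  summary(glm(fmla, family = binomial, data = ToothGrowth))
})

# coef() on a summary.glm object returns the coefficient matrix
# (estimate, standard error, z value, p-value)
coef_tables <- lapply(fits, coef)
coef_tables$len
```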
