Mutation of numeric Data into "high" "medium" and "low" produced betwe

Time:01-06

I tried to recode a numeric variable into "high", "medium" and "low" with

library(dplyr)

mdata %>%
  mutate(mvariable = case_when(vari < quantile(vari,0.5) ~ 'low', 
                      between(vari, quantile(vari, 0.5), quantile(vari, 0.75))~'med', 
                      TRUE ~ 'high'))

so that I can use it in a multilevel analysis.

It didn't produce the result I wanted; instead R told me:

between() called on numeric vector with S3 class

What am I doing wrong?

Thanks in advance.

I am using R version 4.1.2 -- "Bird Hippie"

CodePudding user response:

One alternative (without knowing the data) is to avoid between() entirely by reordering the conditions in case_when():

library(tidyverse)
mdata <- mdata %>%
  mutate(mvariable = case_when(vari < quantile(vari, 0.5) ~ 'low',
                               vari > quantile(vari, 0.75) ~ 'high', 
                               TRUE ~ 'med'))
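
As for the warning itself: as far as I can tell, older dplyr versions emit it when the vector passed to between() carries a class attribute (other than Date/POSIXct), for example a labelled column imported from SPSS or Stata. Whether that applies to vari is a guess, since the data were not posted. A minimal base R sketch for checking and stripping such a class, using a made-up vector and a made-up class name in place of mdata$vari:

```r
# Hypothetical stand-in for mdata$vari: a numeric vector carrying an S3 class
# ("labelled_demo" is an invented class name, used only for illustration)
vari <- structure(c(3, 8, 1, 9, 5, 7, 2, 6, 4, 10), class = "labelled_demo")

is.object(vari)   # TRUE when a class attribute is set
class(vari)       # shows which class is attached

# as.numeric() drops the class attribute and keeps the plain values
vari_plain <- as.numeric(vari)
is.object(vari_plain)
```

If class(mdata$vari) shows something other than "numeric" or "integer", converting the column first (for example with mutate(vari = as.numeric(vari))) may be enough to silence the warning.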

CodePudding user response:

Have you tried base::cut()?

library(dplyr)

# Note: 0.5 and 0.75 are fixed cut points on the scale of vari here,
# not quantile probabilities
mdata %>% 
  mutate(mvariable = cut(vari, 
                         breaks=c(-Inf, 0.5, 0.75, Inf), 
                         labels=c("low", "med", "high")))

When labeling, cut() needs exactly one label fewer than it has breaks, because n breaks define n - 1 intervals. So 5 breaks need 4 labels, 4 breaks need 3 labels, and so on.
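
A small sketch of that rule on toy values (the vector x is invented for illustration and is not the asker's data):

```r
x <- c(0.1, 0.6, 0.9)                # toy values
breaks <- c(-Inf, 0.5, 0.75, Inf)    # 4 breaks define 3 intervals
labels <- c("low", "med", "high")    # so 3 labels are needed

groups <- cut(x, breaks = breaks, labels = labels)
groups

# With only 2 labels for 3 intervals, cut() stops with an error;
# try() captures it so the script keeps running
mismatch_fails <- inherits(
  try(cut(x, breaks = breaks, labels = c("low", "high")), silent = TRUE),
  "try-error"
)
mismatch_fails
```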

EDIT:

You could also compute the breaks from quantiles, so that they adapt to your data.

library(dplyr)

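# quantile(vari) returns the 0%, 25%, 50%, 75% and 100% quantiles,
# so elements [3] and [4] are the median and the 75% quantile
mdata %>% 
  mutate(mvariable = cut(vari, 
                         breaks=c(-Inf, quantile(vari)[3], quantile(vari)[4], Inf), 
                         labels=c("low", "med", "high")))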
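
One thing to watch with cut() is where values that fall exactly on a quantile end up. By default the intervals are closed on the right, so a value equal to the median lands in "low", whereas the case_when() in the question would put it in "med". The sketch below uses an invented toy vector to compare the default with right = FALSE; neither variant is claimed to match the original conditions exactly at both boundaries.

```r
vari <- 1:9                                  # toy data, not the asker's
q <- quantile(vari, probs = c(0.5, 0.75))    # median and 75% quantile

# Default: intervals are (lower, upper], so the median itself falls in "low"
right_closed <- cut(vari, breaks = c(-Inf, q, Inf),
                    labels = c("low", "med", "high"))

# right = FALSE: intervals are [lower, upper), so the median falls in "med"
# and the 75% quantile itself falls in "high"
left_closed <- cut(vari, breaks = c(-Inf, q, Inf),
                   labels = c("low", "med", "high"), right = FALSE)

data.frame(vari, right_closed, left_closed)
```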