Does `cor()` only work for numeric variables?

Time:02-06

I'm interested in the correlation of x with y. x is an ordinal (Likert-type) variable. y is a continuous variable.

But when I use cor(x, y, method = "spearman") I get an error saying

'x' must be numeric

Spearman's rho doesn't necessarily require x to be numeric. So I wonder how I can run this function?

set.seed(0)

x <- sample(c("None", "Little", "Often", "Always"), 20, replace = TRUE)
y <- round(runif(length(x), 100, 300))
data <- data.frame(subject = seq_along(x), x, y)

cor(x, y, method = "spearman") # Error: 'x' must be numeric

#data:
   subject      x   y
1        1 Little 255
2        2   None 287
3        3 Always 142
4        4  Often 230
5        5   None 125
6        6 Little 153
7        7   None 177
8        8  Often 103
9        9  Often 176
10      10 Little 274
11      11 Little 168
12      12  Often 196
13      13  Often 220
14      14   None 199
15      15   None 137
16      16   None 265
17      17 Little 234
18      18 Little 259
19      19 Little 122
20      20 Little 245

CodePudding user response:

Spearman's rho does require that the data be ordered, which character vectors are not. Even regular factors are not: they do have an ordering, which is used when listing factor levels, plotting, etc., but this ordering is not assumed to have any statistical meaning. It would make sense if cor() allowed ordered factors (factor(..., ordered = TRUE) or ordered(...)), but it doesn't. As ?cor says:

The inputs must be numeric (as determined by ‘is.numeric’: logical values are also allowed for historical compatibility): the ‘"kendall"’ and ‘"spearman"’ methods make sense for ordered inputs but ‘xtfrm’ can be used to find a suitable prior transformation to numbers.
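The check described in that passage can be seen directly. The following is a minimal sketch on made-up toy data (the vectors x, xf and y here are illustrative assumptions, not the asker's data): neither a character vector nor an ordered factor passes is.numeric(), so cor() rejects both.

```r
# Toy data, assumed for illustration only
x  <- c("None", "Little", "Often", "Always")
xf <- ordered(x, levels = x)   # ordered factor with the intended level order
y  <- c(1.5, 2.0, 3.5, 3.0)

is.numeric(x)                # FALSE: character vector
is.numeric(xf)               # FALSE: factors are not treated as numeric
is.numeric(as.integer(xf))   # TRUE: integer codes are numeric

# cor() stops with an error for the ordered factor; capture the message
res <- tryCatch(
  cor(xf, y, method = "spearman"),
  error = function(e) conditionMessage(e)
)
res
```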

However, assuming that you have a factor variable and the order of levels is what you want, then using as.integer() in cor() should work fine. (In fact, the xtfrm.factor() method is just a wrapper for as.integer().)

xf <- ordered(x, levels = c("None", "Little", "Often", "Always"))
cor(as.integer(xf), y, method = "spearman")
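One thing worth checking with this approach is the level order. As a hedged sketch on small made-up data (x, y and lv below are assumptions for illustration), the example compares explicit levels with the default, which sorts the labels alphabetically and so gives "Always" the lowest code:

```r
# Toy data, assumed for illustration only
x  <- c("None", "Little", "Often", "Always", "Little", "None")
y  <- c(10, 20, 30, 40, 25, 5)
lv <- c("None", "Little", "Often", "Always")   # intended order, low to high

xf <- ordered(x, levels = lv)

# xtfrm() on a factor yields the same integer codes as as.integer()
all(xtfrm(xf) == as.integer(xf))

# Without explicit levels, factor() sorts the labels alphabetically
levels(factor(x))

# Same data, two different level orders, two different correlations
rho_ordered <- cor(as.integer(xf), y, method = "spearman")
rho_alpha   <- cor(as.integer(factor(x)), y, method = "spearman")
c(ordered = rho_ordered, alphabetical = rho_alpha)
```

On this toy data the two coefficients even differ in sign, so it is safer to pass levels explicitly than to rely on the default.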

CodePudding user response:

You can recode the values:

library(dplyr)  # provides %>%, mutate() and recode()

data <- data %>% mutate(x2 = recode(x, "None" = 0, "Little" = 1, "Often" = 2, "Always" = 3))
cor(data$x2, data$y, method = "spearman")
[1] -0.1930743
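If dplyr is not available, a similar recoding can be sketched in base R with match(). This is a swapped-in alternative rather than the answerer's method, and the toy vectors below (x, y, lv) are assumptions for illustration. Any label not listed in lv becomes NA, which makes typos in the labels easy to spot.

```r
# Toy data, assumed for illustration only
x  <- c("None", "Little", "Often", "Always", "Little", "None")
y  <- c(10, 20, 30, 40, 25, 5)
lv <- c("None", "Little", "Often", "Always")   # intended order, low to high

# Position of each label in lv, shifted so that "None" maps to 0
x2 <- match(x, lv) - 1L
x2

# Spearman's rho depends only on ranks, so codes 0..3 and 1..4 give the same result
rho0 <- cor(x2, y, method = "spearman")
rho1 <- cor(x2 + 1L, y, method = "spearman")
all.equal(rho0, rho1)
```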