I'm interested in the correlation of x with y. x is an ordinal (Likert-type) variable. y is a continuous variable.
But when I use cor(x, y, method = "spearman") I get an error saying:
'x' must be numeric
Spearman's rho doesn't necessarily require x to be numeric. So I wonder how I can run this function?
set.seed(0)
x <- sample(c("None", "Little", "Often", "Always"), 20, replace = TRUE)
y <- round(runif(length(x), 100, 300))
data <- data.frame(subject = seq_along(x), x, y)
cor(x, y, method = "spearman") # Error: 'x' must be numeric
#data:
subject x y
1 1 Little 255
2 2 None 287
3 3 Always 142
4 4 Often 230
5 5 None 125
6 6 Little 153
7 7 None 177
8 8 Often 103
9 9 Often 176
10 10 Little 274
11 11 Little 168
12 12 Often 196
13 13 Often 220
14 14 None 199
15 15 None 137
16 16 None 265
17 17 Little 234
18 18 Little 259
19 19 Little 122
20 20 Little 245
CodePudding user response:
Spearman's rho does require the data to be ordered, and neither character vectors nor regular factors are. (This is a little subtle: factors do have a level ordering, which is used when listing levels, plotting, etc., but that ordering is not assumed to have any statistical meaning.) It would make sense for cor() to accept ordered factors (factor(..., ordered = TRUE) or ordered(...)), but it doesn't. As ?cor says:
The inputs must be numeric (as determined by ‘is.numeric’: logical values are also allowed for historical compatibility): the ‘"kendall"’ and ‘"spearman"’ methods make sense for ordered inputs but ‘xtfrm’ can be used to find a suitable prior transformation to numbers.
However, assuming that you have a factor variable and the order of levels is what you want, then using as.integer() in cor() should work fine. (In fact, the xtfrm.factor() method is just a wrapper for as.integer().)
xf <- ordered(x, levels = c("None", "Little", "Often", "Always"))
cor(as.integer(xf), y, method = "spearman")
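To illustrate why the level order matters, here is a minimal sketch on a small made-up data set (the x and y values below are hypothetical, not the question's data). It checks that xtfrm() and as.integer() agree on an ordered factor, and shows that a plain factor(x), which sorts its levels alphabetically, can give a very different rho:

```r
# Hypothetical illustration data (NOT the data from the question)
x <- c("None", "Little", "Often", "Always", "Little", "None", "Often", "Always")
y <- c(120, 150, 210, 260, 140, 100, 230, 300)

# Explicit level order: None < Little < Often < Always
xf <- ordered(x, levels = c("None", "Little", "Often", "Always"))

# For a factor, xtfrm() returns the same integer codes as as.integer()
same_codes <- identical(xtfrm(xf), as.integer(xf))
print(same_codes)

# Spearman's rho with the intended ordering (positive for this toy data)
rho_ordered <- cor(as.integer(xf), y, method = "spearman")
print(rho_ordered)

# Without explicit levels, factor() sorts them alphabetically:
# "Always" "Little" "None" "Often", which is not the intended order
x_alpha <- factor(x)
print(levels(x_alpha))

# The integer codes now follow the alphabetical order, so rho changes
# (negative for this toy data)
rho_alpha <- cor(as.integer(x_alpha), y, method = "spearman")
print(rho_alpha)
```

So as.integer() is only safe once the levels have been set in the order you actually mean.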
CodePudding user response:
You can recode the values as numbers (this needs dplyr):
library(dplyr)  # provides %>%, mutate(), and recode()
data <- data %>% mutate(x2 = recode(x, "None" = 0, "Little" = 1, "Often" = 2, "Always" = 3))
cor(data$x2, data$y, method = "spearman")
[1] -0.1930743
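If you would rather not load a package, the same recoding can be sketched in base R with a named lookup vector or match(). The data below are made up for illustration (not the question's data), and the names scores, x2 and x3 are just placeholders. Since Spearman's rho depends only on ranks, coding the categories 0..3 or 1..4 gives the same result:

```r
# Hypothetical illustration data (NOT the data from the question)
x <- c("None", "Little", "Often", "Always", "Little", "None", "Often", "Always")
y <- c(120, 150, 210, 260, 140, 100, 230, 300)

# Named lookup vector: category label -> numeric score
scores <- c(None = 0, Little = 1, Often = 2, Always = 3)

# Index the lookup vector by label; unname() drops the label names
x2 <- unname(scores[x])

# Any label missing from `scores` would become NA, so check for that
has_unknown <- anyNA(x2)

rho_lookup <- cor(x2, y, method = "spearman")

# Alternative: match() returns each label's position in the level vector (1..4)
x3 <- match(x, c("None", "Little", "Often", "Always"))
rho_match <- cor(x3, y, method = "spearman")

# Both codings preserve the same ordering, so the two rho values agree
print(all.equal(rho_lookup, rho_match))
```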
