Below is a function that extracts p-values from multiple chi-square tests and displays them as a matrix. I'm trying to do the same thing, but extracting residuals instead. Any help is appreciated.
Sample data:
df <- data.frame(
  first_column  = c(rep("E1_C1",5), rep("E1_C2",3),  rep("E2_C2",7), rep("E3_C3",5)),
  second_column = c(rep("E1_C1",3), rep("E1_C2",10), rep("E2_C2",4), rep("E3_C3",3)),
  third_column  = c(rep("E1_C1",7), rep("E1_C2",4),  rep("E2_C2",3), rep("E3_C3",6)),
  fourth_column = c(rep("E1_C1",4), rep("E1_C2",6),  rep("E2_C2",6), rep("E3_C3",4))
)
Chi-square matrix function for P-Values:
chisqmatrix <- function(x) {
  names = colnames(x); num = length(names)
  m = matrix(nrow=num,ncol=num,dimnames=list(names,names))
  for (i in 1:(num-1)) {
    for (j in (i + 1):num) {
      #browser()
      m[j,i] = chisq.test(x[, i, drop = TRUE],x[, j, drop = TRUE])$p.value
    }
  }
  return (m)
}
CodePudding user response:
In your case, each test returns its residuals as a 4x4 matrix, so a result cannot be stored in a single cell of a numeric matrix. The following solution collects the results in a list instead, which also lets you hold matrices of different sizes.
With minimal changes from your original code:
chisqlist <- function(x) {
  names = colnames(x); num = length(names)
  m = list()
  index = 1
  for (i in 1:(num-1)) {
    for (j in (i + 1):num) {
      #browser()
      m[[index]] = chisq.test(x[, i, drop = TRUE],x[, j, drop = TRUE])$residuals
      index = index + 1
    }
  }
  return (m)
}
Edit: I prefer @Onyambu's answer, which I hadn't seen. It would be faster than a nested for loop.
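For reference, here is a minimal loop-free sketch of the same idea. It is not taken from the answer mentioned above; it only assumes that combn() can enumerate the column pairs. The function name chisq_residuals_by_pair and the "_vs_" naming scheme are made up for illustration, and suppressWarnings() is used only because the small sample counts trigger the usual approximation warning.

```r
# Sample data from the question
df <- data.frame(
  first_column  = c(rep("E1_C1", 5), rep("E1_C2", 3),  rep("E2_C2", 7), rep("E3_C3", 5)),
  second_column = c(rep("E1_C1", 3), rep("E1_C2", 10), rep("E2_C2", 4), rep("E3_C3", 3)),
  third_column  = c(rep("E1_C1", 7), rep("E1_C2", 4),  rep("E2_C2", 3), rep("E3_C3", 6)),
  fourth_column = c(rep("E1_C1", 4), rep("E1_C2", 6),  rep("E2_C2", 6), rep("E3_C3", 4))
)

# Hypothetical helper: one residual table per pair of columns, in a named list
chisq_residuals_by_pair <- function(x) {
  # Every unordered pair of column names, e.g. c("first_column", "second_column")
  pairs <- combn(colnames(x), 2, simplify = FALSE)
  res <- lapply(pairs, function(p) {
    # Small expected counts raise an approximation warning; silenced here
    suppressWarnings(chisq.test(x[[p[1]]], x[[p[2]]]))$residuals
  })
  # Label each list element with the pair it came from
  names(res) <- vapply(pairs, paste, character(1), collapse = "_vs_")
  res
}

res <- chisq_residuals_by_pair(df)
names(res)
res[["first_column_vs_second_column"]]
```

Naming the list elements makes it easier to look up a specific pair than the positional index used in chisqlist.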
CodePudding user response:
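Change your function to request $residuals instead of $p.value. This gives the Pearson residuals, (observed - expected) / sqrt(expected). If you want standardized residuals, request $stdres. Because each test returns a table of residuals rather than a single number, the result below is created as a list matrix and filled with [[ ]].
chisqmatrix <- function(x) {
  names = colnames(x); num = length(names)
  # list matrix: each cell can hold a whole table of residuals
  m = matrix(vector("list", num * num), nrow=num, ncol=num, dimnames=list(names,names))
  for (i in 1:(num-1)) {
    for (j in (i + 1):num) {
      #browser()
      m[[j,i]] = chisq.test(x[, i, drop = TRUE],x[, j, drop = TRUE])$residuals
    }
  }
  return (m)
}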
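To see what $residuals and $stdres hold for a single pair, here is a small check on the first two columns of the sample data. It recomputes the Pearson residuals from the $observed and $expected components of the test result and compares them with $residuals. The variable names are illustrative only.

```r
# First two columns of the sample data from the question
first  <- c(rep("E1_C1", 5), rep("E1_C2", 3),  rep("E2_C2", 7), rep("E3_C3", 5))
second <- c(rep("E1_C1", 3), rep("E1_C2", 10), rep("E2_C2", 4), rep("E3_C3", 3))

# Small expected counts raise an approximation warning; silenced here
test <- suppressWarnings(chisq.test(first, second))

# Pearson residuals recomputed by hand from the test's own components
pearson <- (test$observed - test$expected) / sqrt(test$expected)

# Compare the hand-computed values with what chisq.test() reports
matches <- isTRUE(all.equal(as.vector(pearson), as.vector(test$residuals)))
matches

# Both kinds of residuals come back as a table, one cell per level combination
dim(test$residuals)
dim(test$stdres)
```

Each result is a table with one row and one column per category, which is why a single call cannot be assigned to one cell of an ordinary numeric matrix.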
