I have the following problem with the describe function from psych:
I want to describe selected variables of a data frame and then drop a few of the resulting statistics with subset and select. That apparently only works on a data frame, but what I get back is an object of class describe. The code seems to work sometimes and fail other times, which I assume should be impossible. It did work a few times, though, and I was able to save the output arranged exactly the way I wanted. Now it again returns the error that class describe cannot be coerced to a data.frame. I think the problem is that I get a list of lists (at least that is what the environment pane shows). Since I'm a complete newbie in programming, I haven't been able to solve this, even after searching for how to convert this class.
Descriptives = describe(NumericData[5:44], na.rm = TRUE, interp = FALSE,
skew = TRUE, ranges = TRUE, trim = .1, type = 3,
check = TRUE, fast = NULL, quant = c(.25, .50, .75),
IQR = FALSE)
Descriptives = as.data.frame(Descriptives)
Descriptives = subset(Descriptives, select = -c(vars, median, trimmed, mad, range))
colnames(Descriptives) = c("N", "MEAN", "SD", "MIN", "MAX", "SKEW", "KURTOSIS", "SE", "Q1", "MEDIAN", "Q3")
Descriptives = round(Descriptives, digits = 4)
options(max.print = 1000)
print(as.data.frame(Descriptives))
write.table(Descriptives, file = "Descriptives.txt", sep = ",")
CodePudding user response:
Okay, so the error now occurs in every line (counting the first block, the describe call, as line 1). The describe function itself runs fine, but then I can't select the subset of result columns (line 3): it says the argument "subset" is missing. I also can't rename the remaining columns (line 4): the error says I tried to set colnames on an object with fewer than two dimensions. In addition, round (line 5) returns an error about a non-numeric argument to a mathematical function. The error "cannot coerce class ‘"describe"’ to a data.frame", which previously occurred in line 2, now appears when I try to print the result. It worked once, and more recently it at least worked without selecting a subset of the result columns, but now nothing works and I don't understand why. The code has not changed.
Two variables (columns) of the dataset I use:
structure(list(Age = c(24, 23, 44, 48, 35, 56, 64, 29, 20, 62,
35, 31, 32, 60, 57, 66, 46, 18, 52, 63, 64, 35, 54, 58, 61, 52,
52, 33, 49, 28, 22, 27, 40, 53, 18, 19, 43, 44, 26, 28, 38, 18,
50, 45, 23, 38, 50, 36, 72, 62, 33, 28, 29, 42, 48, 42, 29, 70,
27, 33, 22, 62, 67, 20, 32, 22, 32, 67, 29, 55, 49, 19, 52, 20,
30, 24, 18, 24, 23, 22, 19, 20, 29, 22, 20, 19, 21, 18, 22, 22,
18, 24, 22, 24, 19, 25, 24, 25, 20, 21, 23, 39, 60, 53, 47, 48,
40, 29, 24, 27, 21, 21, 27, 22, 20, 23, 36, 22, 25, 27, 66, 54,
54, 64, 49, 40), FTND = c(5, 7, 0, 6, 0, 6, 0, NA, 3, 4, 0, 7,
NA, 0, 4, 3, 4, 1, 0, 6, 0, 5, 0, NA, NA, 3, 0, 2, NA, 0, 0,
0, NA, NA, NA, NA, NA, 4, 0, 10, NA, NA, 8, NA, 3, 7, 0, 0, 5,
2, 0, 6, 7, 0, 4, 2, 0, NA, 0, 0, 0, 0, 0, 0, 4, 0, 0, NA, 3,
NA, NA, NA, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA,
-126L), class = "data.frame")
The object I get from describe is a list of lists that does not contain the result statistics as columns, but instead has one entry for every variable of my original dataset. Also, dput(Descriptives[5:6]) should have printed the variables Age and FTND, but instead it starts with EmQ (which is actually variable/row 9):
structure(list(EmQ = structure(list(descript = "EmQ", units = NULL,
format = NULL, counts = c(n = "97", missing = "29", distinct = "39",
Info = "0.998", Mean = "44.04", Gmd = "12.33", `.05` = "23.6",
`.10` = "27.6", `.25` = "38.0", `.50` = "45.0", `.75` = "50.0",
`.90` = "58.4", `.95` = "60.2"), values = list(value = c(19,
21, 22, 24, 25, 27, 28, 30, 32, 33, 34, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55,
56, 57, 58, 59, 60, 61, 64, 66), frequency = structure(c(2,
2, 1, 2, 2, 1, 1, 2, 2, 2, 3, 1, 3, 1, 2, 2, 3, 4, 3, 4,
8, 7, 3, 6, 2, 4, 3, 1, 1, 1, 1, 4, 2, 1, 2, 3, 2, 2, 1), .Dim = 39L)),
extremes = c(L1 = 19, L2 = 21, L3 = 22, L4 = 24, L5 = 25,
H5 = 59, H4 = 60, H3 = 61, H2 = 64, H1 = 66)), class = "describe"),
EmQ10 = structure(list(descript = "EmQ10", units = NULL,
format = NULL, counts = c(n = "108", missing = "18",
distinct = "19", Info = "0.993", Mean = "10.16", Gmd = "4.346",
`.05` = "4.0", `.10` = "5.7", `.25` = "7.0", `.50` = "10.0",
`.75` = "13.0", `.90` = "15.0", `.95` = "16.0"), values = list(
value = c(2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
14, 15, 16, 17, 18, 19, 20), frequency = structure(c(1,
3, 4, 3, 5, 13, 10, 11, 10, 11, 5, 11, 5, 6, 5, 2,
1, 1, 1), .Dim = 19L)), extremes = c(L1 = 2, L2 = 3,
L3 = 4, L4 = 5, L5 = 6, H5 = 16, H4 = 17, H3 = 18, H2 = 19,
H1 = 20)), class = "describe")), descript = "NumericData[5:44]", dimensions = c(126L,
2L), class = "describe")
This is the data I get with describeBy. It is also a list, with just 2 groups in the first list (?), namely controls and patients. I guess it also holds the result statistics I need, such as trimmed and median, as attributes:
structure(list(NULL, NULL), .Dim = 2L, .Dimnames = list(Group = c(NA_character_,
NA_character_)))
I'm sorry for the very long post; I don't know how to put it better.
CodePudding user response:
This is what I once got and want to get again:
       N   MEAN      SD MIN MAX   SKEW KURTOSIS     SE    Q1 MEDIAN  Q3
Age  126 36.254 15.6578  18  72 0.6067  -0.9925 1.3949 22.25   30.5  49
FTND 107 1.2617  2.3121   0  10 1.7475   2.0378 0.2235     0      0 1.5
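A possible explanation, offered as a guess from the dput output above rather than a confirmed diagnosis: the fields counts, Info, Gmd and extremes look like what Hmisc::describe returns, not psych::describe. If both packages are attached, the one loaded last masks the other's describe, which would explain why identical code works in one session and fails in another. The sketch below assumes that is the cause and calls psych::describe explicitly. The small NumericData here is only a stand-in built from the first ten Age and FTND values posted above, so the full 5:44 column selection is left out.

```r
library(psych)

# Stand-in data: the first ten Age and FTND values from the posted dataset.
# The real NumericData has more columns and 126 rows.
NumericData <- data.frame(
  Age  = c(24, 23, 44, 48, 35, 56, 64, 29, 20, 62),
  FTND = c(5, 7, 0, 6, 0, 6, 0, NA, 3, 4)
)

# Reports which package the bare name `describe` currently resolves to.
# If this is not "psych", another attached package is masking it.
environmentName(environment(describe))

# Calling the function with its namespace avoids the masking problem entirely.
Descriptives <- psych::describe(NumericData, na.rm = TRUE, interp = FALSE,
                                skew = TRUE, ranges = TRUE, trim = .1, type = 3,
                                check = TRUE, fast = NULL,
                                quant = c(.25, .50, .75), IQR = FALSE)

# A psych describe object inherits from data.frame, so this conversion works.
Descriptives <- as.data.frame(Descriptives)

# Drop the statistics that are not wanted, then rename the remaining 11 columns.
Descriptives <- subset(Descriptives,
                       select = -c(vars, median, trimmed, mad, range))
colnames(Descriptives) <- c("N", "MEAN", "SD", "MIN", "MAX", "SKEW",
                            "KURTOSIS", "SE", "Q1", "MEDIAN", "Q3")
Descriptives <- round(Descriptives, digits = 4)

print(Descriptives)
```

If the check reports a package other than psych, either keep the psych:: prefix as above or make sure psych is attached after the other package.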
