I have a very large dataframe which is built as follows: Originaldf
I want to perform a pairwise t test within item A, comparing the measured value within the condition groups. So I would like to see if for all observations pertaining to item A, there is a difference between the measured values of the control group, test group, and placebo group (Condition).
The first thing I did was to split the dataframe into a list using base R's split function.
Listdf <- split(originaldf, originaldf$Item)
This worked and I got a list containing 82 elements with one dataframe corresponding to each item in the original dataframe.
I am now trying to run the pairwise.t.test function on each element of the list. I am relatively new to R and think that writing a loop for this process, though inefficient, would help me understand what is going on in the background. I know there is also the option to use the lapply function. I tried this on Listdf with the following code, which I know is most likely much too simple but was worth a try.
lapply(Listdf, pairwise.t.test(Value, Condition))
However, I get the error Error in factor(g) : object 'Condition' not found. Not sure if there is a way to more specifically reference Condition so that it can be found. I've performed an individual pairwise.t.test on one of the items which worked with the following code.
pairwise.t.test(Listdf$ItemA$Value, Listdf$ItemA$Condition, p.adjust.method = "none")
However, I assume this would not work within the lapply function because I want it to perform the t.test for ItemA, ItemB, ItemC etc...
The loop I have tried so far is as follows:
for (i in Listdf) {
pairwise.t.test(List$i$logAddedConstant, List$i$Condition, p.adjust = "none")
}
For this I get the error "Error in split.default(X, group) : first argument must be a vector". I believe this error corresponds to the original splitting of the original dataframe. However, I don't quite understand why this error would show up this late in the code, because the splitting of the dataframe worked without a problem.
I know I am probably missing something fundamental, but I am quite stumped and have tried multiple options to no avail. If anyone has another idea or suggestion I would be very grateful for the help. Please let me know if I should add some more information.
Answer:
I made a very short example data.frame that is structured like your originaldf:
# two observations per Condition within each Item, so the pooled SD is defined
df <- data.frame(Item = rep(c("A", "B", "C"), times = 6),
                 Value = runif(18),
                 Condition = rep(c("Control", "Test", "Placebo"), each = 6))
Listdf <- split(df, df$Item)
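Before running any tests it can help to look at what split() returned. The following is only a sketch on a toy data frame with the same three columns (the data and the name group_sizes are made up for illustration); it shows that the list elements are named after the items and checks how many observations each Condition has within each item:

```r
# Toy data (an assumption): 3 items, 3 conditions, 2 observations per cell
df <- data.frame(Item = rep(c("A", "B", "C"), times = 6),
                 Value = runif(18),
                 Condition = rep(c("Control", "Test", "Placebo"), each = 6))

Listdf <- split(df, df$Item)

length(Listdf)      # one list element per distinct Item
names(Listdf)       # the elements are named after the Item values
str(Listdf[["A"]])  # each element is a data.frame with the original columns

# Count the observations per Condition within each item
group_sizes <- lapply(Listdf, function(d) table(d$Condition))
group_sizes[["A"]]
```

A group with a single observation has no standard deviation of its own, so checking the counts first can explain NA p-values later on.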
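Using a simple for-loop:
p <- list()
for (i in seq_along(Listdf)) {
  # test Value across the Condition groups within the i-th item
  p[[i]] <- pairwise.t.test(Listdf[[i]]$Value, Listdf[[i]]$Condition, p.adjust.method = "none")
}
names(p) <- names(Listdf)  # label each result with its item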
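As for why the loop in the question failed: in for (i in Listdf), i is each data frame in turn, not a name or an index, and the $ operator takes its right-hand side literally, so Listdf$i looks for an element actually called "i". That lookup returns NULL, and passing NULL to pairwise.t.test is most likely what triggers the "first argument must be a vector" message. A small sketch on made-up toy data (the names results and item are illustrative only) that also loops over the item names instead:

```r
# Toy data (an assumption), same columns as in the question
df <- data.frame(Item = rep(c("A", "B", "C"), times = 6),
                 Value = runif(18),
                 Condition = rep(c("Control", "Test", "Placebo"), each = 6))
Listdf <- split(df, df$Item)

# `$` does not evaluate a loop variable; it looks for an element named "i"
is.null(Listdf$i)   # TRUE, because no element is called "i"

# Looping over the names and indexing with [[ ]] uses the variable's value
results <- list()
for (item in names(Listdf)) {
  d <- Listdf[[item]]
  results[[item]] <- pairwise.t.test(d$Value, d$Condition,
                                     p.adjust.method = "none")
}
names(results)
```

Indexing by name keeps each result labelled with its item, which makes the output easier to read with 82 items.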
Using lapply:
p <- lapply(Listdf, function(d) pairwise.t.test(d$Value, d$Condition, p.adjust.method = "none"))
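The lapply call in the question failed because lapply expects a function as its second argument. pairwise.t.test(Value, Condition) is a call that R evaluates straight away, outside any data frame, so Condition cannot be found. Wrapping the call in a function, as above, fixes that. If it is useful, the p-values can then be gathered into one table. This is only a sketch on made-up toy data; the names long, Group1 and Group2 are illustrative, not part of any API:

```r
# Toy data (an assumption): 3 items, 3 conditions, 2 observations per cell
set.seed(1)
df <- data.frame(Item = rep(c("A", "B", "C"), times = 6),
                 Value = runif(18),
                 Condition = rep(c("Control", "Test", "Placebo"), each = 6))
Listdf <- split(df, df$Item)

p <- lapply(Listdf, function(d) {
  pairwise.t.test(d$Value, d$Condition, p.adjust.method = "none")
})

# Each result stores a lower-triangular matrix of p-values in $p.value.
# Turn every matrix into long format and stack the items.
long <- do.call(rbind, lapply(names(p), function(item) {
  tab <- as.data.frame(as.table(p[[item]]$p.value), stringsAsFactors = FALSE)
  names(tab) <- c("Group1", "Group2", "p.value")
  tab <- tab[!is.na(tab$p.value), ]   # drop the empty upper triangle
  cbind(Item = item, tab)
}))
long
```

With three conditions there are three comparisons per item, so the table has one row per item and pair of conditions.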
