Is there an easy way to pick specific observations from column out of a data frame for a function to

Time:01-06

I am using the gapminder data set from the gapminder package to run an analysis, and I need to pick two of the five continents available.

library(gapminder)

part3 <- gapminder
continent1 <- subset(part3, continent == "Asia")
continent2 <- subset(part3, continent == "Africa")
# As I'm going to t-test I need a two-level factor, so I pick two continents
part3c <- rbind(continent1, continent2)

Question: Is there a way for the user to pick continents for the analysis, e.g. some code that says "pick two from the five available", so that the analysis can be run with different combinations?

Something like the output you get from filtering data in an Excel pivot table? Or do I need to code in the continents each time, as above?

CodePudding user response:

Do you want something like this?
The function combn generates all combinations of the elements of a vector or list, here taken 2 at a time, and applies a function to each combination. The function test_fun first makes sure the two groups are the same size by randomly subsampling the larger one, then runs the t-test.

In the example call, I test equality of mean lifeExp between continents, but any other numeric column can be tested.

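test_fun <- function(X, col){
  # X is a list of two data frames, one per continent
  cols <- c(col, "continent")
  # size of the smaller group
  n <- min(nrow(X[[1]]), nrow(X[[2]]))
  Y <- lapply(X, \(y) {
    # randomly subsample the larger group down to n rows
    if(nrow(y) > n)
      y[sample(nrow(y), n), cols]
    else y[cols]
  })
  Y <- do.call(rbind, Y)
  # two-sample t-test of the chosen column by continent
  t.test(get(col) ~ continent, Y)
}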

sp_part3 <- split(part3, part3$continent)

combn(sp_part3, 2, test_fun, simplify = FALSE, col = "lifeExp")
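
If you only want one specific pair rather than every combination, a small wrapper can take the two continent names as an argument. The following is only a sketch: compare_two and its arguments are hypothetical names, not part of gapminder or of the code above. It also skips the subsampling step, on the assumption that the default Welch t-test is acceptable, since that test does not need equal group sizes.

```r
# Hypothetical helper: t-test of `col` between exactly two chosen groups.
# `data` is a data frame, `picks` a character vector of two group names,
# `col` the numeric column to test, `group` the grouping column.
compare_two <- function(data, picks, col, group = "continent") {
  g <- as.character(data[[group]])
  # Fail early unless exactly two distinct, existing groups were picked.
  stopifnot(length(picks) == 2, picks[1] != picks[2], all(picks %in% g))
  x <- data[[col]][g == picks[1]]
  y <- data[[col]][g == picks[2]]
  # Welch two-sample t-test (the default for t.test with two vectors).
  t.test(x, y)
}

# Example calls, run only if the gapminder package is installed.
if (requireNamespace("gapminder", quietly = TRUE)) {
  gm <- gapminder::gapminder

  # One hard-coded pair; swap in any two of levels(gm$continent).
  print(compare_two(gm, c("Asia", "Africa"), "lifeExp"))

  # All pairs, with each result labelled by the continents compared.
  pairs <- combn(levels(gm$continent), 2, simplify = FALSE)
  res <- lapply(pairs, compare_two, data = gm, col = "lifeExp")
  names(res) <- vapply(pairs, paste, character(1), collapse = " vs ")

  # In an interactive session the user can choose from a menu instead.
  if (interactive()) {
    picks <- select.list(levels(gm$continent), multiple = TRUE,
                         title = "Pick two continents")
    print(compare_two(gm, picks, "lifeExp"))
  }
}
```

The stopifnot check means a call with one, three, or misspelled continent names stops with an error instead of silently testing the wrong groups.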