Create new list/data frame from a value in a list within a data frame in R

I have a large data frame df obtained by running a one-sided t-test on a different data frame:

df <- structure(list(uniqueID = c("101030", "101060"), res = list(structure(list(
    statistic = c(t = 19), parameter = c(df = 20), 
    p.value = 0.00015, conf.int = structure(c(0.389, 
    Inf), conf.level = 0.95), estimate = c(`mean of x` = 0.412), 
    null.value = c(mean = 0.22), stderr = 0.01, 
    alternative = "greater", method = "One Sample t-test", data.name = "mean"), class = "htest"), 
    structure(list(statistic = c(t = 29), parameter = c(df = 20), 
        p.value = 4.5e-05, conf.int = structure(c(0.569, 
        Inf), conf.level = 0.95), estimate = c(`mean of x` = 0.600), 
        null.value = c(mean = 0.22), stderr = 0.01, 
        alternative = "greater", method = "One Sample t-test", 
        data.name = "mean"), class = "htest"))), row.names = c(NA, 
-2L), class = c("tbl_df", "tbl", "data.frame"))

I want to create a new data frame df_new that keeps only the uniqueID and the p.value from each test result:

df_new <- data.frame(uniqueID = c("101030", "101060"), pval = c(0.00015, 4.5e-05))

I know there must be a way to iterate over this data frame. For example, I can access a single p.value with df[[2]][[i]]$p.value, where i is the row number, but I'm at a loss for how to iterate over every row and save the output to either a list or a new data frame. Any help would be greatly appreciated.

CodePudding user response：

If I understand the question correctly, res is a list column, so the easiest way is to iterate over it with one of the apply functions and pull p.value out of each element:

df_new  <- data.frame(
    uniqueID = df$uniqueID,
    pval = sapply(df$res, function(x) x[["p.value"]])
)

Output:


r$> df_new
  uniqueID    pval
1   101030 1.5e-04
2   101060 4.5e-05
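
A slightly stricter variant of the same idea, as a sketch: vapply() does the same job as sapply() but fails loudly if any element does not return exactly one numeric value. The small df built here is only a stand-in for the question's data (just the statistic and p.value fields are reproduced), and the t_stat column is an illustrative extra that was not asked for.

```r
# Minimal stand-in for the question's data: a list column of htest-like objects.
# Only the fields used below are reproduced; the values are copied from the question.
df <- data.frame(uniqueID = c("101030", "101060"), stringsAsFactors = FALSE)
df$res <- list(
  structure(list(statistic = c(t = 19), p.value = 0.00015), class = "htest"),
  structure(list(statistic = c(t = 29), p.value = 4.5e-05), class = "htest")
)

# vapply() checks that every element yields exactly one numeric value.
df_new <- data.frame(
  uniqueID = df$uniqueID,
  pval     = vapply(df$res, function(x) x[["p.value"]], numeric(1)),
  # t_stat is an illustrative extra column; unname() drops the "t" label.
  t_stat   = vapply(df$res, function(x) unname(x[["statistic"]]), numeric(1))
)
print(df_new)
```

If one of the tests had failed and left a NULL or a character string in res, sapply() would quietly return a list, whereas vapply() stops with an error at that element.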

CodePudding user response：

We can also use tidyr's hoist() to pull p.value out of the nested res list column into a regular top-level column:

library(tidyr)
library(dplyr)

hoist(df, .col = res, "p.value") %>%
  select(uniqueID, p.value)

#> # A tibble: 2 × 2
#>   uniqueID  p.value
#>   <chr>       <dbl>
#> 1 101030   0.00015 
#> 2 101060   0.000045

CodePudding user response：

Another possible solution:

library(tidyverse)

df %>% 
  rowwise %>% 
  mutate(pvalue = res %>% flatten %>% .["p.value"] %>% unlist, res = NULL)

#> # A tibble: 2 × 2
#> # Rowwise: 
#>   uniqueID   pvalue
#>   <chr>       <dbl>
#> 1 101030   0.00015 
#> 2 101060   0.000045

Or using purrr:

tibble(uniqueID = df$uniqueID, pvalue = map_dbl(df$res, ~ .x$p.value))
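
If the original data are still available, the p-values can also be collected directly while running the tests, without the intermediate list column. This is only a rough base R sketch: the raw data frame, its value column, and the simulated numbers are all hypothetical, and only mu = 0.22 and alternative = "greater" are taken from the test output shown in the question.

```r
# Hypothetical raw data: 21 observations per ID, which gives df = 20 as in the question.
set.seed(1)
raw <- data.frame(
  uniqueID = rep(c("101030", "101060"), each = 21),
  value    = c(rnorm(21, mean = 0.41, sd = 0.05),
               rnorm(21, mean = 0.60, sd = 0.05))
)

# One one-sided, one-sample t-test per ID; mu = 0.22 matches null.value in the question.
tests <- lapply(split(raw$value, raw$uniqueID), t.test,
                mu = 0.22, alternative = "greater")

# Collect the p-values; names(tests) carries the IDs.
df_new <- data.frame(
  uniqueID = names(tests),
  pval     = vapply(tests, function(x) x$p.value, numeric(1), USE.NAMES = FALSE)
)
print(df_new)
```

Keeping the full htest objects in tests means other fields, such as the estimate or the confidence interval, can be extracted later in the same way.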