How to do a 2-step wrangling and nesting using tidyr/dplyr, using %>% pipe only?

Time:01-12

I need to accomplish a wrangling task with tidyr/dplyr as part of a single %>% pipe, that is, without assigning intermediate results to helper objects. I start from the following tibble, trb:

library(tibble)

trb <-
  tribble(~name,     ~type,    ~dat,
        "john",    "cat",    mtcars,
        "john",    "spider", Puromycin,
        "amanda",  "dog",    ToothGrowth,
        "chris",   "wolf",   PlantGrowth,
        "annie",   "lion",   women,
        "richard", "frog",   trees,
        "liz",     "horse",  USArrests,
        "raul",    "snake",  iris,
        "kate" ,   "bear",   quakes) 

and I want to do a 2-step wrangling (not necessarily in the following order):

  1. lump together john's dat data frames into a named list (with names taken from type); and
  2. shift john's information to the leftmost column while nesting the data of everyone else.

The desired output should therefore be:

desired_output <-
  tribble(~dat_john,                                  ~other_people,
          list("cat" = mtcars, "spider" = Puromycin), trb %>% dplyr::filter(name != "john")
        )

As noted above, it's important to me to get from trb to desired_output using %>% only. Any ideas?

CodePudding user response:

Maybe something like this? It first categorizes the data as john or not, then nests all the data for each category into one list, then pivots those two categories wide.

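library(tidyr); library(dplyr)
trb %>%
  # label each row as john's or someone else's
  mutate(column = if_else(name == "john", "dat_john", "other_people")) %>%
  # nest everything except the label into one tibble per label
  nest(data = -column) %>%
  # spread the two labels into two list-columns
  pivot_wider(names_from = column, values_from = data)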
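
Note that this pipeline leaves dat_john as a nested tibble (with name, type and dat columns) rather than the named list the question asks for. A minimal sketch of one way to finish the job, assuming a smaller stand-in for trb (fewer rows, same columns) and using base R's setNames in one extra mutate step:

```r
library(tibble)
library(dplyr)
library(tidyr)

# Smaller stand-in for trb from the question (same columns, fewer rows)
trb <- tribble(
  ~name,    ~type,    ~dat,
  "john",   "cat",    mtcars,
  "john",   "spider", Puromycin,
  "amanda", "dog",    ToothGrowth,
  "chris",  "wolf",   PlantGrowth
)

result <- trb %>%
  mutate(column = if_else(name == "john", "dat_john", "other_people")) %>%
  nest(data = -column) %>%
  pivot_wider(names_from = column, values_from = data) %>%
  # dat_john holds a single tibble; swap it for a list of john's
  # data frames, named by the type column
  mutate(dat_john = list(setNames(dat_john[[1]]$dat, dat_john[[1]]$type)))

# Names of the list stored in dat_john: "cat" and "spider"
names(result$dat_john[[1]])
```

This only puts dat_john leftmost because john's rows come first in trb; the column order after pivot_wider follows the order in which the labels first appear.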

CodePudding user response:

It is possible to achieve what you want in a single pipeline, although I am not sure why you would want to. Note that you need to make "dat_john" the first factor level and sort the rows by that factor. Otherwise, if john's rows are not the first in the data, his column won't end up leftmost after pivot_wider.

library(dplyr)
library(tidyr)

trb %>% 
  group_by(id = factor(name != "john", labels = c("dat_john", "other_people"))) %>% 
  arrange(id) %>% # use factor and arrange to ensure that john is always the first level
  nest(data = -id) %>% 
  pivot_wider(names_from = id, values_from = data) %>% 
  mutate(dat_john = with(dat_john[[1L]], list(setNames(dat, type))))

Output

# A tibble: 1 x 2
  dat_john         other_people    
  <list>           <list>          
1 <named list [2]> <tibble [7 x 3]>
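
To see why the factor in this second answer puts john first, here is a small base R sketch. The name vector is a hypothetical example in which john is deliberately not the first entry. The levels of a logical vector sort as FALSE before TRUE, so FALSE (the rows where name is "john") receives the first label:

```r
# Hypothetical names, deliberately not starting with john
name <- c("amanda", "john", "chris", "john")

# Levels of a logical sort as FALSE < TRUE, so FALSE (john) gets the first label
id <- factor(name != "john", labels = c("dat_john", "other_people"))

levels(id)        # "dat_john" is the first level
as.character(id)  # label assigned to each element of name

# Ordering by the factor follows level order, so john's rows move to the front
name[order(id)]
```

dplyr's arrange(id) sorts a factor by its level order in the same way, which is what moves john's rows to the top before nest and pivot_wider are applied.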