Pipeline for Vector Manipulations in R as Replacement of dplyr data.frame Manipulations

Time:01-19

A common, well-known dplyr pipeline for manipulating a character column in a data frame looks like this:

library("dplyr")
data.frame(someVector = c("fff", "aaa", "bbb", "ccc", "ddd", "ccc")) %>% 
  distinct(someVector) %>% 
  arrange(someVector) %>% 
  filter(someVector != "bbb") %>% 
  pull(someVector)

This pipeline returns the desired vector result:

[1] "aaa" "ccc" "ddd" "fff"

However, converting the character vector to a data frame and back seems suboptimal. Using the same sequence of functions with the vector as the input

c("fff", "aaa", "bbb", "ccc", "ddd", "ccc") %>% 
   distinct() %>%
   arrange() %>% 
   filter(.data != "bbb")

causes errors:

c("fff", "aaa", "bbb", "ccc", "ddd", "ccc") %>% distinct() 
Error in UseMethod("distinct") : 
  no applicable method for 'distinct' applied to an object of class "character"
   
c("fff", "aaa", "bbb", "ccc", "ddd", "ccc") %>% arrange() 
Error in UseMethod("arrange") : 
  no applicable method for 'arrange' applied to an object of class "character"
 
c("fff", "aaa", "bbb", "ccc", "ddd", "ccc") %>% filter(.data != "bbb")
Error in UseMethod("filter") : 
  no applicable method for 'filter' applied to an object of class "character"
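
The error messages point at S3 dispatch: each verb calls UseMethod(), and no method is found for class "character". A minimal sketch of the same mechanism in base R follows. The generic my_distinct is a hypothetical name made up for illustration and is not part of dplyr.

```r
# Hypothetical generic, defined only for illustration (not part of dplyr).
my_distinct <- function(x, ...) UseMethod("my_distinct")

# A method exists for data frames only.
my_distinct.data.frame <- function(x, ...) unique(x)

# Works: the argument is a data frame, so the method above is dispatched.
df_result <- my_distinct(data.frame(someVector = c("aaa", "aaa", "bbb")))

# Fails: there is no method for class "character" and no default method.
err <- tryCatch(
  my_distinct(c("aaa", "aaa", "bbb")),
  error = function(e) e
)
print(conditionMessage(err))
```

The captured message has the same "no applicable method" form as the dplyr errors above.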

These dplyr verbs have no methods for character vectors, so they need to be replaced with their vector analogues:

c("fff", "aaa", "bbb", "ccc", "ddd", "ccc") %>%
  unique() %>% 
  sort() 

I do not know the vector equivalent of filter() that works in a pipeline. What is the most idiomatic R way to write this pipeline on a vector and get the desired output?

Answer:

You have many options. For example,

this

c("fff", "aaa", "bbb", "ccc", "ddd", "ccc") %>% unique() %>% `[`(. != "bbb") %>% sort() 

this

c("fff", "aaa", "bbb", "ccc", "ddd", "ccc") %>% unique() %>% .[. != "bbb"] %>% sort() 

and this

c("fff", "aaa", "bbb", "ccc", "ddd", "ccc") %>% unique() %>% magrittr::extract(. != "bbb") %>% sort() 

all give

[1] "aaa" "ccc" "ddd" "fff"
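
If you would rather not rely on magrittr's dot placeholder, base R can do the same thing. The following is only a sketch, assuming R >= 4.1 (for the native pipe |> and the \(v) lambda syntax). keep_where is a hypothetical helper written here for illustration, not a function from any package.

```r
# Base R only; the native pipe |> and \(v) lambdas need R >= 4.1.
x <- c("fff", "aaa", "bbb", "ccc", "ddd", "ccc")

# Hypothetical helper: keep the elements for which pred() returns TRUE.
keep_where <- function(v, pred) v[pred(v)]

res_helper <- x |>
  unique() |>
  keep_where(\(v) v != "bbb") |>
  sort()

# setdiff() drops the listed values and also removes duplicates,
# so a separate unique() step is not needed.
res_setdiff <- x |>
  setdiff("bbb") |>
  sort()

# Filter() takes the predicate first, so name it (f =) and let the
# pipe supply the vector as the remaining argument.
res_filter <- x |>
  unique() |>
  Filter(f = \(v) v != "bbb") |>
  sort()

print(res_helper)
```

setdiff() is the shortest option when you are excluding fixed values. A predicate-based helper or Filter() is more flexible when the condition is something other than equality.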