Filtering of dataframe columns displaying a counter intuitive behavior (R)

Time:01-16

Take the data frame below as an example. I need to keep only the columns whose names appear in a filter vector.

test <- data.frame(A = c(1,6,1,2,3) , B = c(1,2,1,1,2), C = c(1,7,6,4,1), D = c(1,1,1,1,1))
filter <- c("A", "B", "C", "D")
filter2 <- c("A","B","D")

To do that I'm using this piece of code:

`%ni%` <- Negate(`%in%`)
test <- test[,-which(names(test) %ni% filter2)]

If I use the filter2 object I get what is expected:

  A B D
1 1 1 1
2 6 2 1
3 1 1 1
4 2 1 1
5 3 2 1

However, if I use the filter object, I get a dataframe with zero columns:

data frame with 0 columns and 5 rows

I expected to get the data frame back untouched, since filter contains every column of test. Why does this happen, and how can I write more reliable code that does not return an empty data frame in this situation?

CodePudding user response:

Use logical negation (!) instead of a negative index (-):

test[,!(names(test) %ni% filter2)]
test[,!(names(test) %ni% filter)]
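Since `!(x %ni% y)` is the same as `x %in% y`, the double negation can be dropped. The following is a minimal sketch of that simplification, not part of the original answer; the `drop = FALSE` argument is an addition here, assumed to be wanted so that the result stays a data frame even when only one column survives.

```r
test <- data.frame(A = c(1, 6, 1, 2, 3), B = c(1, 2, 1, 1, 2),
                   C = c(1, 7, 6, 4, 1), D = c(1, 1, 1, 1, 1))
filter  <- c("A", "B", "C", "D")
filter2 <- c("A", "B", "D")

# Logical index: TRUE for every column whose name appears in the filter.
keep_all  <- test[, names(test) %in% filter,  drop = FALSE]
keep_some <- test[, names(test) %in% filter2, drop = FALSE]

names(keep_all)   # all four columns are kept
names(keep_some)  # column C is dropped

# drop = FALSE keeps a data frame even when a single column is selected.
keep_one <- test[, names(test) %in% "A", drop = FALSE]
is.data.frame(keep_one)
```

A logical index always has one entry per column, so the "nothing to remove" case is simply a vector of all TRUE values and needs no special handling.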

Wrapping the condition in which() and negating the result with - works only when which() returns at least one index:

> which(names(test) %ni% filter2)
[1] 3
> which(names(test) %ni% filter)
integer(0)

Negating integer(0) still gives integer(0), so in that case the index asks for no columns rather than for all of them:

> -which(names(test) %ni% filter)
integer(0)
> -which(names(test) %ni% filter2)
[1] -3

Thus, the subset selects zero columns:

> test[integer(0)]
data frame with 0 columns and 5 rows
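The same pitfall applies to plain vectors, and selecting by name avoids it entirely. The sketch below illustrates both points; `keep_columns` is a hypothetical helper introduced here for illustration, not something from the original post, and it assumes that filter names missing from the data frame should be silently ignored.

```r
# The pitfall is not specific to data frames: -integer(0) is integer(0),
# and an empty index selects nothing.
x <- c(10, 20, 30)
none_dropped_which   <- x[-which(x > 100)]  # empty vector
none_dropped_logical <- x[!(x > 100)]       # all three elements

# Hypothetical helper: keep only the columns of `df` whose names appear in
# `cols`, in the order they have in `df`, ignoring names that are absent.
keep_columns <- function(df, cols) {
  df[, intersect(names(df), cols), drop = FALSE]
}

test <- data.frame(A = c(1, 6, 1, 2, 3), B = c(1, 2, 1, 1, 2),
                   C = c(1, 7, 6, 4, 1), D = c(1, 1, 1, 1, 1))

names(keep_columns(test, c("A", "B", "C", "D")))  # nothing removed
names(keep_columns(test, c("A", "B", "D")))       # C removed
names(keep_columns(test, c("D", "Z", "A")))       # unknown "Z" ignored
```

Selecting by name states what to keep rather than what to drop, so there is no empty "drop" set to mishandle.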