Suppose I have the following dataset, test:
> test = data.frame(location = c("here", "there", "here", "there", "where"), x = 1:5, y = c(6,7,8,6,10))
> test
  location x  y
1     here 1  6
2    there 2  7
3     here 3  8
4    there 4  6
5    where 5 10
Then I want to filter so that, if any row for a location satisfies a condition on y, every row for that location is kept in the dataset. Something like:
test %>% filter_something(y == 6)
  location x y
1     here 1 6
2    there 2 7
3     here 3 8
4    there 4 6
Note that rows 2 and 3 do not have y = 6, yet they stay in the dataset, because there is at least one row where their location matches the 'right' y.
I can solve this by creating another dataset with y == 6 and then doing an inner join with test, but is there a more elegant option? I'm not filtering on just this variable; I'm using other columns too.
CodePudding user response:
We can group_by location, then use any(condition) inside filter, which keeps a whole group when at least one of its rows meets the condition:
library(dplyr)
# keep every row of a location in which at least one row has y == 6
test %>%
  group_by(location) %>%
  filter(any(y == 6)) %>%
  ungroup()
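As a self-contained sketch of the grouped filter, the following rebuilds test from the question and stores the result. Since the question mentions filtering on other columns too, the second call combines y and x inside the same any(); the x <= 2 condition is purely an illustrative assumption, not something taken from the question.

```r
library(dplyr)

# Same data as in the question
test <- data.frame(
  location = c("here", "there", "here", "there", "where"),
  x = 1:5,
  y = c(6, 7, 8, 6, 10)
)

# Keep every row of a location that has at least one row with y == 6
kept <- test %>%
  group_by(location) %>%
  filter(any(y == 6)) %>%
  ungroup()

# Hypothetical extra condition: a single row must have y == 6 AND x <= 2
kept_multi <- test %>%
  group_by(location) %>%
  filter(any(y == 6 & x <= 2)) %>%
  ungroup()
```

Because the conditions are combined with & inside any(), one and the same row has to satisfy both of them. With this data, kept holds the "here" and "there" rows, while kept_multi holds only the "here" rows, since the "there" row with y == 6 has x = 4.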
CodePudding user response:
If we want to use data.table, we could first get the locations associated with y == 6, and then filter on those, all in one line:
library(data.table)
# setDT() converts test to a data.table by reference, so no reassignment is needed
setDT(test)
# keep only the locations associated with y == 6
test <- test[location %in% test[y == 6]$location]
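The same "collect the matching locations, then filter on them" idea can also be sketched in base R, in case neither package is wanted. This is a minimal example that rebuilds test from the question; the names wanted and kept_base are illustrative.

```r
# Same data as in the question
test <- data.frame(
  location = c("here", "there", "here", "there", "where"),
  x = 1:5,
  y = c(6, 7, 8, 6, 10)
)

# Locations that have at least one row with y == 6
wanted <- unique(test$location[test$y == 6])

# Keep every row whose location is among them
kept_base <- test[test$location %in% wanted, ]
```

With this data, wanted contains "here" and "there", so kept_base keeps rows 1 to 4 and drops the "where" row.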
