Based on the code from this link, we can find file names that contain all of several strings:
allpatterns <- function(fnames, patterns) {
  i <- sapply(fnames, function(fn) all(sapply(patterns, grepl, fn)))
  fnames[i]
}
filenames <- c("foo.txt", "bar.R", "foo_quux.py", "quux.c", "quux.foo",
               "foo_bar", "bar.foo.cpp", "foo_bar_quux", "quux_foo.bar", "nothing")
allpatterns(filenames, c("foo", "bar"))
# [1] "foo_bar" "bar.foo.cpp" "foo_bar_quux" "quux_foo.bar"
Now I'd like to go further by adding a condition that excludes certain strings. For example, I want to keep file names that contain both foo and bar but contain neither cpp nor quux, which should give the following result:
# [1] "foo_bar"
How could I achieve that by modifying the code above?
EDIT: the attempt below is credited to an R master; it is inspiring, even though I did not get exactly the expected result with it:
filenames <- c("foo.txt", "bar.R", "foo_quux.py", "quux.c", "quux.foo",
"foo_bar", "bar.foo.cpp", "foo_bar_quux", "quux_foo.bar",
"nothing")
keep <- c("foo", "bar")
drop <- c("cpp", "quux")
paste0('', paste0(keep, collapse = ''))
keep_regex <- paste0("\\b(?:", paste(keep, collapse="|"), ")\\b")
drop_regex <- paste0("\\b(?:", paste(drop, collapse="|"), ")\\b")
result <- filenames[grepl(keep_regex, filenames) &
!grepl(drop_regex, filenames)]
result
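Two things probably explain why this attempt does not return only "foo_bar". First, `_` counts as a word character, so `\\b` finds no boundary between "foo" and "_". Second, joining the strings with `|` asks for foo OR bar, not both. A minimal sketch of both effects, assuming Perl-compatible matching (`perl = TRUE`); the variable names are only illustrative:

```r
names_to_check <- c("foo.txt", "foo_bar")

# `_` is a word character, so "foo" in "foo_bar" is not a whole word
boundary_hits <- grepl("\\bfoo\\b", names_to_check, perl = TRUE)
boundary_hits
# [1]  TRUE FALSE

# "|" means OR: a name containing only one of the strings still matches
or_hits <- grepl("foo|bar", names_to_check)
or_hits
# [1] TRUE TRUE
```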
CodePudding user response:
"foo" OR "bar" without "cpp" and "quux":
filenames[grepl("foo|bar", filenames) & !grepl("cpp|quux", filenames)]
# [1] "foo.txt" "bar.R" "foo_bar"
"foo" AND "bar" without "cpp" and "quux":
filenames[grepl("(?=.*foo)(?=.*bar)", filenames, perl = TRUE) & !grepl("cpp|quux", filenames)]
# [1] "foo_bar"
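If the strings live in vectors, as in the question's `keep` and `drop`, the same two patterns could be assembled with `paste0()` instead of being typed by hand. This is only a sketch and assumes the strings contain no regex metacharacters such as `.` or `(`:

```r
filenames <- c("foo.txt", "bar.R", "foo_quux.py", "quux.c", "quux.foo",
               "foo_bar", "bar.foo.cpp", "foo_bar_quux", "quux_foo.bar", "nothing")
keep <- c("foo", "bar")
drop <- c("cpp", "quux")

# One lookahead per required string: "(?=.*foo)(?=.*bar)"
keep_regex <- paste0("(?=.*", keep, ")", collapse = "")
# Alternation of the excluded strings: "cpp|quux"
drop_regex <- paste(drop, collapse = "|")

result <- filenames[grepl(keep_regex, filenames, perl = TRUE) &
                      !grepl(drop_regex, filenames)]
result
# [1] "foo_bar"
```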
CodePudding user response:
Maybe this function would be of help:
allpatterns <- function(fnames, keep, remove) {
  # Keep a name only if it contains every string in `keep`
  i <- Reduce(`&`, lapply(keep, function(x) grepl(x, fnames)))
  # Drop a name if it contains any string in `remove`
  j <- !Reduce(`|`, lapply(remove, function(x) grepl(x, fnames)))
  fnames[i & j]
}
allpatterns(filenames, c("foo", "bar"), c("cpp", "quux"))
# [1] "foo_bar"
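A possible variant, in case `keep` or `remove` may be empty or may contain characters like `.` that have a special meaning in a regex. The name `filter_names` and the default arguments are my own additions, not part of the function above; `fixed = TRUE` makes `grepl()` treat each string literally, and the initial value passed to `Reduce()` covers the empty case:

```r
filenames <- c("foo.txt", "bar.R", "foo_quux.py", "quux.c", "quux.foo",
               "foo_bar", "bar.foo.cpp", "foo_bar_quux", "quux_foo.bar", "nothing")

# Hypothetical helper: literal matching, tolerant of empty `keep` / `remove`
filter_names <- function(fnames, keep = character(0), remove = character(0)) {
  # TRUE for each name that contains the string `s` literally
  has <- function(s) grepl(s, fnames, fixed = TRUE)
  # Start from all-TRUE / all-FALSE so an empty vector changes nothing
  has_all <- Reduce(`&`, lapply(keep, has), rep(TRUE, length(fnames)))
  has_any <- Reduce(`|`, lapply(remove, has), rep(FALSE, length(fnames)))
  fnames[has_all & !has_any]
}

filter_names(filenames, c("foo", "bar"), c("cpp", "quux"))
# [1] "foo_bar"
filter_names(filenames, c("foo", "bar"))
# [1] "foo_bar" "bar.foo.cpp" "foo_bar_quux" "quux_foo.bar"
```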
