Suppose I have the following objects in memory:
ab
ab_b
ab_pm
ab_pn
c1_ab_b
and I only want to keep ab_pm and ab_pn.
I tried to use negative lookahead in ls() to list ab, ab_b and c1_ab_b for removal:
rm(list = ls(pattern = "ab_?(?!p)"))
However, I got the error:
Error in grep(pattern, all.names, value = TRUE) :
invalid regular expression 'ab_?(?!p)', reason 'Invalid regexp'
I tried my regex at regex101.com, and found it matched all five object names, which suggested my regex was not "invalid", although it did not do what I wanted. My questions are:
- Does ls() in R support negative lookahead? I know grep() needs perl = TRUE to support it, but I do not see a similar argument in the ls() help documentation.
- How do I correctly select the three objects I want to remove?
CodePudding user response:
Your ab_?(?!p) PCRE regex does not match as expected because of backtracking. It matches ab, then an optional _, and then tries the negative lookahead. When the lookahead finds p, backtracking occurs: the engine gives up the optional _ and tries the lookahead again right before _. Since _ is not p, a match is returned.
The correct PCRE regex would be ab(?!_?p), see the regex demo. After matching ab, the regex engine tries the lookahead only once at that position: if the text that follows is an optional _ and then a p, the negative lookahead fails, and with nothing left to backtrack into, the whole match fails there.
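The difference between the two patterns can be checked with grepl() and perl = TRUE. This is a small sketch; the vector nms is only a stand-in for the five object names from the question, so no objects are created.

```r
# Stand-in for the object names in the question (an assumption for illustration).
nms <- c("ab", "ab_b", "ab_pm", "ab_pn", "c1_ab_b")

# Optional "_" outside the lookahead: the engine can backtrack and give up
# the "_", so every name still matches.
loose <- grepl("ab_?(?!p)", nms, perl = TRUE)

# Optional "_" inside the lookahead: after "ab" there is nothing to backtrack into.
strict <- grepl("ab(?!_?p)", nms, perl = TRUE)

nms[loose]
# [1] "ab"      "ab_b"    "ab_pm"   "ab_pn"   "c1_ab_b"
nms[strict]
# [1] "ab"      "ab_b"    "c1_ab_b"
```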
ls() has no perl argument, so it always uses R's default TRE regex library, which does not support lookarounds.
You may use
ab([^_]p|_[^p]|.?$)
See the regex demo. Details:
- ab - ab text
- ([^_]p|_[^p]|.?$) - either of the three alternatives:
  - [^_]p - any char but _ and then p
  - | - or
  - _[^p] - a _ and then any char but p
  - | - or
  - .?$ - any one optional char and then end of string.
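As a rough check, the lookaround-free pattern can be passed straight to ls(). The sketch below uses a scratch environment e (a hypothetical name, not part of the question) so that nothing in the global workspace is touched, and it only verifies the five names from the question.

```r
# Scratch environment holding dummy objects with the names from the question.
e <- new.env()
for (nm in c("ab", "ab_b", "ab_pm", "ab_pn", "c1_ab_b")) {
  assign(nm, 1, envir = e)
}

# ls() uses the default (TRE) engine, so the pattern must avoid lookarounds.
to_remove <- ls(envir = e, pattern = "ab([^_]p|_[^p]|.?$)")

# Remove the selected objects; only ab_pm and ab_pn should remain.
rm(list = to_remove, envir = e)
remaining <- ls(envir = e)
```

In the global workspace the same idea would be rm(list = ls(pattern = "ab([^_]p|_[^p]|.?$)")). It is worth printing the result of ls() before calling rm(), since other object names may match in ways these five do not exercise.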
CodePudding user response:
ls uses grep(pattern, all.names, value = TRUE) internally, so it does not support Perl-style extensions such as lookahead. You can handle that externally, though, by wrapping ls in grep. Note that the pattern below selects the objects to keep, not the ones to remove:
vec <- ls(pattern = "^ab_")
# to try this without creating the objects, use instead:
# vec <- c("ab","ab_b","ab_pm","ab_pn","c1_ab_b")
grep("ab_(?=p)", vec, perl = TRUE, value = TRUE)
# [1] "ab_pm" "ab_pn"
So perhaps a one-liner:
grep("ab_(?=p)", ls(pattern = "^ab_"), value = TRUE, perl = TRUE)
This does a double-grep (once inside ls, once outside); one can always just make it a little more direct with
grep("ab_(?=p)", ls(), value = TRUE, perl = TRUE)
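To get from the list of objects to keep to the actual removal, one option is to take the set difference with everything ls() returns. This is a sketch under assumptions: the scratch environment e and the variable names keep and drop are made up for illustration, and the objects are dummies.

```r
# Scratch environment with dummy objects named as in the question.
e <- new.env()
for (nm in c("ab", "ab_b", "ab_pm", "ab_pn", "c1_ab_b")) {
  assign(nm, 1, envir = e)
}

# Objects to keep: "ab_" immediately followed by "p" (PCRE lookahead).
keep <- grep("ab_(?=p)", ls(envir = e), value = TRUE, perl = TRUE)

# Everything else in the environment gets removed.
drop <- setdiff(ls(envir = e), keep)
rm(list = drop, envir = e)
remaining <- ls(envir = e)
```

Be aware that setdiff() drops every other object in the environment, including unrelated ones. To remove only names containing ab, selecting them directly with grep("ab(?!_?p)", ls(), value = TRUE, perl = TRUE) and passing that to rm(list = ...) is the narrower choice.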
