Say I have the following string:
vector <- "this is a string of text containing stuff. something.com [email protected] and other stuff with something.anything"
I would like to remove any word that contains @ or . inside it, so I would like to remove something.com, [email protected] and something.anything. I do not want to remove "stuff." because the period there only ends the sentence; it is not inside the word. Ideally I would like to be able to use the %>% pipe to do this.
CodePudding user response:
gsub(" ?\\w+[.@]\\S+", "", vector)
[1] "this is a string of text containing stuff. and other stuff with"
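If you want this in a pipeline, one option is to wrap the call in a small function. A minimal sketch, assuming R 4.1 or later for the native `|>` pipe; `remove_dotted_words` is a hypothetical helper name, and `user@example.com` is a placeholder address used only for this example:

```r
# Hypothetical helper: drop space-separated words that have "." or "@" inside them
remove_dotted_words <- function(x) {
  gsub(" ?\\w+[.@]\\S+", "", x)
}

# Placeholder input; user@example.com is an assumed stand-in address
x <- "this is a string of text containing stuff. something.com user@example.com and other stuff with something.anything"

result <- x |> remove_dotted_words()
result
```

Because the string is the first argument, the same helper should also work with magrittr's pipe as `x %>% remove_dotted_words()`. Without a helper, magrittr's dot placeholder lets you write `x %>% gsub(" ?\\w+[.@]\\S+", "", .)`.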
CodePudding user response:
An alternative to the (much more terse/simple) gsub method:
gre <- gregexpr("[^ ]+[.@][^ ]+", vector)
regmatches(vector, gre)
# [[1]]
# [1] "something.com" "[email protected]" "something.anything"
regmatches(vector, gre) <- ""
vector
# [1] "this is a string of text containing stuff.   and other stuff with "
This has the advantage of letting you replace each match with an arbitrary value. Granted, we're just replacing them all with "" here, so it is a little overkill (and note that it leaves the surrounding spaces in place), but if you need to transform each matched substring individually, this is the more powerful mechanism.
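To illustrate the "change each substring" case, here is a rough sketch that upper-cases every match instead of deleting it. The input string and the choice of `toupper` are assumptions made up for this example; any function that returns one value per match should fit in the same spot:

```r
# Made-up example input; user@example.com is a placeholder address
x <- "contact user@example.com or visit something.com today"

# Locate every space-delimited token that has "." or "@" inside it
gre <- gregexpr("[^ ]+[.@][^ ]+", x)
matches <- regmatches(x, gre)

# Assign back a list with one replacement per match
regmatches(x, gre) <- lapply(matches, toupper)
x
# [1] "contact USER@EXAMPLE.COM or visit SOMETHING.COM today"
```

The replacement list has to line up with the matches, which is why it is built by applying a function over `matches` rather than written out by hand.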
