I am struggling with regex.
I have this character vector below:
texts <- c('I-have-text-2-and-text-8','I-have-text-1-and-text-2','I-have-text-7-and-text-8','I-have-text-2-and-text-1','I-have-text-4-and-text-5','I-have-text-11-and-text-12','I-have-text-13-and-text-32','I-have-text-8-and-text-6')
Two words are important to me: text-1 and text-2. I need both of them, in any order.
I want to extract the elements that contain both.
The output should be: [1] "I-have-text-1-and-text-2" "I-have-text-2-and-text-1"
I've been using str_subset from stringr, but I don't know the regex for this.
str_subset(texts, 'regex')
Any help would be appreciated.
CodePudding user response:
Using str_subset, the regex would specify text-1 followed by any characters (.*) and then text-2, or (|) the same two words in the reverse order:
library(stringr)
str_subset(texts, 'text-1.*text-2|text-2.*text-1')
[1] "I-have-text-1-and-text-2" "I-have-text-2-and-text-1"
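As a rough sketch of an alternative, lookaheads avoid writing out both orders: each (?=.*word) asserts, from the start of the string, that the word occurs somewhere, without consuming any characters. The example below uses base R grep with perl = TRUE so it runs without extra packages; the variable name both is only illustrative. str_subset should accept the same pattern, since stringr's regex engine also supports lookaheads.

```r
texts <- c('I-have-text-2-and-text-8', 'I-have-text-1-and-text-2',
           'I-have-text-7-and-text-8', 'I-have-text-2-and-text-1',
           'I-have-text-4-and-text-5', 'I-have-text-11-and-text-12',
           'I-have-text-13-and-text-32', 'I-have-text-8-and-text-6')

# Each lookahead checks, from the start of the string, that one word occurs somewhere.
# Lookaheads need perl = TRUE in base R.
both <- grep("^(?=.*text-1)(?=.*text-2)", texts, perl = TRUE, value = TRUE)
print(both)
```

On this vector it keeps the same two elements as the alternation pattern, and adding a third required word only means adding one more lookahead.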
CodePudding user response:
"Both patterns in any order" sounds complicated for a single regex pattern, but trivial to do in two separate patterns:
texts[str_detect(texts, "text-1") & str_detect(texts, "text-2")]
# [1] "I-have-text-1-and-text-2" "I-have-text-2-and-text-1"
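If the list of required words grows, the same idea can be generalized. This is a minimal sketch in base R, assuming the words are plain substrings (hence fixed = TRUE); required and keep are illustrative names, not part of the question.

```r
texts <- c('I-have-text-2-and-text-8', 'I-have-text-1-and-text-2',
           'I-have-text-7-and-text-8', 'I-have-text-2-and-text-1',
           'I-have-text-4-and-text-5', 'I-have-text-11-and-text-12',
           'I-have-text-13-and-text-32', 'I-have-text-8-and-text-6')

# Words that must all be present (illustrative name)
required <- c("text-1", "text-2")

# One logical vector per word, combined element-wise with &
keep <- Reduce(`&`, lapply(required, function(p) grepl(p, texts, fixed = TRUE)))
print(texts[keep])
```

Reduce combines the per-word results one at a time, so the code does not change when more words are added to required.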
CodePudding user response:
You can use an alternation pattern with | to alternate between text-1 followed by text-2 and vice versa:
grep("text-1.*text-2|text-2.*text-1", texts, value = TRUE)
[1] "I-have-text-1-and-text-2" "I-have-text-2-and-text-1"
The stringr equivalent would be:
str_subset(texts, "text-1.*text-2|text-2.*text-1")
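One caveat worth checking: text-1 is also a prefix of text-10, text-11 and so on, so the pattern above can match longer numbers if they happen to occur together. The sketch below assumes you want the exact numbers only and adds a negative lookahead (?!\\d) after each word. The vector extra is hypothetical; its last element is not from the question and is there only to show the difference.

```r
# Hypothetical input: the last element is not in the question, it only demonstrates the pitfall
extra <- c('I-have-text-1-and-text-2', 'I-have-text-2-and-text-1',
           'I-have-text-10-and-text-25')

# Plain alternation: also matches text-10 ... text-25
loose <- grep("text-1.*text-2|text-2.*text-1", extra, value = TRUE)

# (?!\\d) rejects a match that is immediately followed by another digit (needs perl = TRUE)
strict <- grep("text-1(?!\\d).*text-2(?!\\d)|text-2(?!\\d).*text-1(?!\\d)",
               extra, perl = TRUE, value = TRUE)

print(loose)
print(strict)
```

Here loose keeps all three elements, while strict keeps only the first two. On the vector from the question both patterns give the same result, so this only matters if the data can contain such combinations.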
