I have an issue where I want to extract a pattern from a vector of strings, i.e. extract
c("TAG a", "TAG b", "TAG c") from c("TAG a", "TAG b-3", "TAG c 3")
So far I've tried:
str_vec <- c("TAG a", "TAG b-3", "TAG c 2", "2 TAG d")
stringr::str_extract(str_vec, "TAG .*(?=[\\ \\-])")
Which returns TAG b and c correctly, but doesn't extract TAG a or d.
If I try
stringr::str_extract(str_vec, "TAG .*(?=[\\ \\-]|$)")
TAG a and d are returned correctly, but $ seems to override [\\ \\-], so TAG b and c are returned with their suffixes still attached.
CodePudding user response:
You need
str_vec <- c("TAG a", "TAG b-3", "TAG c 2", "2 TAG d")
stringr::str_extract(str_vec, "TAG [^ -]*")
# => [1] "TAG a" "TAG b" "TAG c" "TAG d"
Details:
TAG  - a fixed string (TAG followed by a space)
[^ -]* - zero or more chars other than a space and -.
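As for why the second attempt in the question keeps the suffixes: .* is greedy, so it first runs to the end of the string, and at that position the $ branch of the lookahead already succeeds. The sketch below compares that pattern with a lazy .*? variant, which is one possible alternative to the negated character class above. The variable names greedy and lazy are only illustrative.

```r
library(stringr)

str_vec <- c("TAG a", "TAG b-3", "TAG c 2", "2 TAG d")

# Greedy: .* consumes everything up to the end of the string, where the
# $ alternative in the lookahead matches, so the suffixes stay attached.
greedy <- str_extract(str_vec, "TAG .*(?=[ \\-]|$)")
print(greedy)
# => [1] "TAG a"   "TAG b-3" "TAG c 2" "TAG d"

# Lazy: .*? grows one character at a time and stops at the first position
# that is followed by a space, a hyphen, or the end of the string.
lazy <- str_extract(str_vec, "TAG .*?(?=[ \\-]|$)")
print(lazy)
# => [1] "TAG a" "TAG b" "TAG c" "TAG d"
```

The negated class [^ -]* is usually the simpler choice, since it needs no lookahead at all.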
CodePudding user response:
How about:
library(stringr)
str_extract(str_vec, "TAG [a-z]")
Output:
[1] "TAG a" "TAG b" "TAG c" "TAG d"
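Note that [a-z] matches exactly one lowercase letter, which is enough for the sample data. A minimal sketch, assuming the tags could be longer than one letter (the long_vec inputs below are made up for illustration and are not from the question), would add a + quantifier:

```r
library(stringr)

# Hypothetical inputs with multi-letter tags (not from the question)
long_vec <- c("TAG ab", "TAG cd-3", "TAG ef 2", "2 TAG gh")

# A single [a-z] keeps only the first letter after "TAG "
one_letter <- str_extract(long_vec, "TAG [a-z]")
print(one_letter)
# => [1] "TAG a" "TAG c" "TAG e" "TAG g"

# [a-z]+ keeps the whole run of lowercase letters
all_letters <- str_extract(long_vec, "TAG [a-z]+")
print(all_letters)
# => [1] "TAG ab" "TAG cd" "TAG ef" "TAG gh"
```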
