I am trying to split strings of the format
x <- "A(B)C"
where A, B and C could be empty strings or any sets of characters except for parentheses. The parentheses are always there - I want to keep them around the characters they enclose, so that the result would be:
"A" "(B)" "C"
So far my best try was:
strsplit(x, "(?<=\\))|(?=\\()", perl = TRUE)
[[1]]
[1] "A" "(" "B)" "C"
but that keeps the opening parenthesis separate. Any ideas?
CodePudding user response:
You can use
x <- "A(B)C"
library(stringr)
str_extract_all(x, "\\([^()]*\\)|[^()]+")
Details:
\([^()]*\) - a (, zero or more characters other than ( and ), and then a )
| - or
[^()]+ - one or more characters other than ( and ).
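The same alternation can also be tried in base R, without stringr. The following is only a sketch: the names pattern and parts are made up for illustration, and it assumes the parentheses are never nested or unbalanced, as described in the question.

```r
# Base R sketch: extract either a parenthesised group or a run of
# non-parenthesis characters (assumes no nested/unbalanced parentheses).
x <- c("A(B)C", "foo(bar)baz", "(B)C", "A()")

# Same regex as above: "(...)" with no parentheses inside, or 1+ other chars.
pattern <- "\\([^()]*\\)|[^()]+"

# gregexpr() finds every match position; regmatches() pulls out the text.
parts <- regmatches(x, gregexpr(pattern, x, perl = TRUE))

parts[[1]]  # "A"   "(B)"   "C"
parts[[2]]  # "foo" "(bar)" "baz"
parts[[3]]  # "(B)" "C"
parts[[4]]  # "A"   "()"
```

Because this extracts matches rather than splitting, empty A, B or C parts simply produce no element (or "()" for an empty B) instead of an empty string.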
CodePudding user response:
library(stringr)
x <- c("A(B)C", "ABC", "0$b")
stringr::str_extract_all(x, "[\\(]?.{1}[\\)]?")
# [[1]]
# [1] "A" "(B)" "C"
#
# [[2]]
# [1] "A" "B" "C"
#
# [[3]]
# [1] "0" "$" "b"
