I am trying to extract nested bracketed types such as [[String]] with a regular expression. Every opening bracket [ needs a matching closing bracket ], so for that input I would expect the following matches:
[[String]]
[String]
String
If I use \[[^\]]+\] it just stops at the first closing bracket it comes across, without taking into account that another bracket was opened in between and so a second closing bracket is needed. Is this at all possible with a regular expression?
Note: The type can be String, [String] or [[String]], so I don't know up front how many brackets there will be.
Answer:
You can use the following PCRE-compliant regex:
(?=((\[(?:\w+|(?2))*])|\b\w+))
See the regex demo. Details:
(?= - start of a positive lookahead (necessary to match overlapping strings)
( - start of Capturing group 1 (it will hold the "matches")
(\[(?:\w+|(?2))*]) - Group 2 (technical, used for recursion): a [ char, then zero or more occurrences of either one or more word chars or the whole Group 2 pattern recursed, and then a ] char
| - or
\b\w+ - a word boundary (necessary since all overlapping matches are being searched for) and then one or more word chars
) - end of Group 1
) - end of the lookahead.
See the PHP demo:
$s = "[[String]]";
// The lookahead consumes no text, so overlapping matches are all found; group 1 holds each one.
if (preg_match_all('~(?=((\[(?:\w+|(?2))*])|\b\w+))~', $s, $m)) {
    print_r($m[1]);
}
Output:
Array
(
    [0] => [[String]]
    [1] => [String]
    [2] => String
)
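Since the question notes that the input may be String, [String] or [[String]], here is a minimal sketch that wraps the same regex in a helper and runs it on all three forms. The names NESTED_TYPE_PATTERN and nestedTypes are made up for this illustration, and the pattern is written with the x (extended) modifier only so that each part can carry a comment; it is meant to be equivalent to the one-line version above.

```php
<?php
// Hypothetical constant name: the same pattern as above, spread out with the
// x modifier so whitespace is ignored and "#" starts a comment.
const NESTED_TYPE_PATTERN = '~
    (?=                      # lookahead: consumes nothing, so matches may overlap
      (                      # group 1: the text reported as a match
        ( \[                 # group 2: an opening bracket,
          (?: \w+ | (?2) )*  #   then word chars or a nested group 2, repeated,
          ]                  #   then the matching closing bracket
        )
        | \b \w+             # or a bare word starting at a word boundary
      )
    )
~x';

// Hypothetical helper: returns every balanced bracketed type plus the inner
// bare word, from the outermost to the innermost.
function nestedTypes(string $s): array
{
    preg_match_all(NESTED_TYPE_PATTERN, $s, $m);
    return $m[1];
}

foreach (['String', '[String]', '[[String]]'] as $input) {
    echo $input, ' => ', implode(', ', nestedTypes($input)), PHP_EOL;
}
```

Each input should yield one match per nesting level plus the bare word, so the nesting depth does not need to be known in advance.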
