I have a regex and test case on
https://regex101.com/r/5Z5Lop/1
^(?<KEY>CONF|ESD|TRACKING)[:;'\s]\s*(?<DATA>.*?)\s*(?:L[:;'\s]\s*\K(?<LINE_DATA>.*?))?(?<INITIALS>\*[a-zA-Z]+)?\s*$
See the LINE_DATA named group.
Is it possible to split that group up into two separate groups?
I want one group LINE_NUMBERS to hold all integers not contained in parentheses.
Then I want a second group, QTYS, to hold all integers that are contained in parentheses.
So currently LINE_DATA yields "1,2,3(4),5(12) ".
Is it possible to have LINE_NUMBERS be [1,2,3,5] (either an array or some kind of string)
and QTYS be [(4),(12)]? Note: I do still want to capture the parentheses.
I would like to do this in the current regex if it's possible and doesn't overly complicate what I currently have.
Right now, I'm obtaining this data through post-processing with separate regexes. I'm using PHP:
preg_match_all('/\d+(?!\s*\))/i', $ret_data['LINE_DATA'], $ret_data['LINE_NUMBERS']);
preg_match_all('/\(\s*\d+\s*\)/i', $ret_data['LINE_DATA'], $ret_data['QUANTITIES']);
Thanks!
CodePudding user response:
You can use a single pattern in the post-processing for the QUANTITIES and the LINE_NUMBERS using an alternation | and removing the empty entries from the result.
$re = '/^(?<KEY>CONF|ESD|TRACKING)[:;\'\s]\s*(?<DATA>.*?)\s*(?:L[:;\'\s]\s*\K(?<LINE_DATA>.*?))?(?<INITIALS>\*[a-zA-Z]+)?\s*$/i';
$str = 'esd: here is my data L: 1,2,3(4),5(12) *sm ';
preg_match($re, $str, $matches);
preg_match_all('/(?<QUANTITIES>\(\d+\))|(?<LINE_NUMBERS>\d+)/', $matches["LINE_DATA"], $numbers);
print_r(array_filter($numbers["QUANTITIES"]));
print_r(array_filter($numbers["LINE_NUMBERS"]));
Output
Array
(
[3] => (4)
[5] => (12)
)
Array
(
[0] => 1
[1] => 2
[2] => 3
[4] => 5
)
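Note that array_filter without a callback keeps the original keys, which is why the output above has gaps, and it also drops the string "0". A minimal sketch, assuming consecutive keys are wanted and that a line number of 0 could occur; the $keepNonEmpty closure is a hypothetical helper, not part of the answer above:

```php
<?php
// LINE_DATA as captured by the main pattern for the example string
$lineData = '1,2,3(4),5(12) ';

preg_match_all('/(?<QUANTITIES>\(\d+\))|(?<LINE_NUMBERS>\d+)/', $lineData, $numbers);

// Hypothetical helper: keep every non-empty string, including "0"
$keepNonEmpty = function ($value) {
    return $value !== '';
};

// array_values() reindexes the filtered arrays from 0
$quantities  = array_values(array_filter($numbers['QUANTITIES'], $keepNonEmpty));
$lineNumbers = array_values(array_filter($numbers['LINE_NUMBERS'], $keepNonEmpty));

print_r($lineNumbers); // 1, 2, 3, 5 with keys 0..3
print_r($quantities);  // (4), (12) with keys 0..1
```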
There could be an option to use the \G anchor to get 2 separate groups for the given example data, but it will make the INITIALS part after it optional:
^(?<KEY>CONF|ESD|TRACKING)[:;'\s]\s*(?<DATA>.*?)\s*L[:;'\s]\s*|\G(?!^)(?:(?<QUANTITIES>\(\d+\))|(?<LINE_NUMBERS>\d+)),?(?:\s*(?<INITIALS>\*[a-zA-Z]+)\s*$)?
^  Start of string
(?<KEY>CONF|ESD|TRACKING)[:;'\s]\s*  The KEY group with alternatives, then match a single char listed in the character class and optional whitespace chars
(?<DATA>.*?)\s*  The DATA group, matching any char non-greedily, followed by optional whitespace chars
L[:;'\s]\s*  Match L, then any of the listed chars and optional whitespace chars
|  Or
\G(?!^)  Assert the position at the end of the previous match, not at the start
(?:  Non-capture group
(?<QUANTITIES>\(\d+\))  Group QUANTITIES, match 1 or more digits between parentheses
|  Or
(?<LINE_NUMBERS>\d+)  Group LINE_NUMBERS, match 1 or more digits
)  Close non-capture group
,?  Match an optional comma
(?:\s*(?<INITIALS>\*[a-zA-Z]+)\s*$)?  Optional non-capture group with group INITIALS
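As a rough sketch of how the \G pattern could be used from PHP: preg_match_all returns one match for the prefix (KEY and DATA) and then one match per number, so the groups have to be collected across matches. The $nonEmpty closure and the $result layout are assumptions made for illustration only:

```php
<?php
$re = '/^(?<KEY>CONF|ESD|TRACKING)[:;\'\s]\s*(?<DATA>.*?)\s*L[:;\'\s]\s*'
    . '|\G(?!^)(?:(?<QUANTITIES>\(\d+\))|(?<LINE_NUMBERS>\d+)),?'
    . '(?:\s*(?<INITIALS>\*[a-zA-Z]+)\s*$)?/i';
$str = 'esd: here is my data L: 1,2,3(4),5(12) *sm ';

// Default PREG_PATTERN_ORDER: each named group gets one entry per match,
// with an empty string where the group did not take part in that match
preg_match_all($re, $str, $m);

// Hypothetical helper: drop the empty placeholders and reindex from 0
$nonEmpty = function (array $values) {
    return array_values(array_filter($values, function ($v) {
        return $v !== '';
    }));
};

$result = [
    'KEY'          => $m['KEY'][0] ?? null,   // set by the first match only
    'DATA'         => $m['DATA'][0] ?? null,  // set by the first match only
    'LINE_NUMBERS' => $nonEmpty($m['LINE_NUMBERS']),
    'QUANTITIES'   => $nonEmpty($m['QUANTITIES']),
    'INITIALS'     => $nonEmpty($m['INITIALS'])[0] ?? null,
];

print_r($result);
```

The \G chain only consumes an optional comma between entries, so input such as "1, 2" (with a space after the comma) would stop the chain after the first number. If such input is expected, the separator part of the pattern would need to allow whitespace.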
