How can I make group 1 differ based on content in the whole string?

In our Python system, I'm trying to isolate the second part of a size so that I can save the values separately.

Because I receive data in many different formats, I have to take a lot of scenarios into consideration. At the same time, our system requires everything to be in group 1 to be identified correctly, which increases the complexity.

This is what I have so far:

(?<=[\/\-])\s*([A-Za-z]+|\w+)+?(?!\d*\s*\)|\d*\)|\w*\))(?!\s*[\/\-]+)

Examples

working

These examples work:

110/116
S/M
S / M
S/M(32-34)
110/116(10-12y)
110/116(S/M)

not working

However, my regex only works correctly on the examples above.

The following 7 inputs cause issues:

S/M / L /XL
S / M / L / XL
S/M / L/XL
S/M/L/XL
S/M/L/XL(30-32)
S/M / L/XL(30-32)
S/M / L / XL(30-32)

How can I capture those cases as shown in the table below?

Case	Input	Expected capture in group 1
1	`S/M / L /XL`	`"L /XL"`
2	`S / M / L / XL`	`"L / XL"`
3	`S/M / L/XL`	`"L/XL"`
4	`S/M/L/XL`	`"L/XL"`
5	`S/M/L/XL(30-32)`	`"L/XL"`
6	`S/M / L/XL(30-32)`	`"L/XL"`
7	`S/M / L / XL(30-32)`	`"L / XL"`

Issue

How can I capture a "/" in the middle, including the whole part after it (like /XL), but without any following parentheses (so not the (30-32))?

For example, for S/M / L / XL(30-32) I want to capture L / XL only.

CodePudding user response：

You can use

(?<=[/-])\s*([A-Z]+(?:\s*/\s*[A-Z]+)?|\d+)\b(?!\s*[/)-])

Details:

(?<=[/-]) - a position immediately preceded by / or -
\s* - zero or more whitespace characters
([A-Z]+(?:\s*/\s*[A-Z]+)?|\d+) - Group 1: either one or more uppercase letters, optionally followed by a / surrounded by zero or more whitespace characters and then one or more uppercase letters; or one or more digits
\b - a word boundary
(?!\s*[/)-]) - immediately to the right of the current location, there must not be zero or more whitespace characters followed by /, ) or -.
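
Since the question mentions a Python system, here is a minimal sketch of how the pattern could be checked against the inputs from the question with the standard `re` module. It assumes the quantifiers in the pattern are `+`, as the description above states. The helper name `second_size` and the `EXAMPLES` mapping are made up for illustration, and the expected values for the first six inputs are my reading of "the second part of a size", since the question only lists explicit expectations for the seven failing cases.

```python
import re

# Pattern from the answer, with the `+` quantifiers written out.
SIZE_RE = re.compile(r"(?<=[/-])\s*([A-Z]+(?:\s*/\s*[A-Z]+)?|\d+)\b(?!\s*[/)-])")


def second_size(text):
    """Return group 1 of the first match, or None if nothing matches.

    Hypothetical helper, not part of the original system.
    """
    match = SIZE_RE.search(text)
    return match.group(1) if match else None


# Input -> expected group 1.
EXAMPLES = {
    # Already working in the question (expected values assumed).
    "110/116": "116",
    "S/M": "M",
    "S / M": "M",
    "S/M(32-34)": "M",
    "110/116(10-12y)": "116",
    "110/116(S/M)": "116",
    # The seven cases from the question's table.
    "S/M / L /XL": "L /XL",
    "S / M / L / XL": "L / XL",
    "S/M / L/XL": "L/XL",
    "S/M/L/XL": "L/XL",
    "S/M/L/XL(30-32)": "L/XL",
    "S/M / L/XL(30-32)": "L/XL",
    "S/M / L / XL(30-32)": "L / XL",
}

if __name__ == "__main__":
    for text, expected in EXAMPLES.items():
        got = second_size(text)
        status = "ok" if got == expected else "MISMATCH"
        print(f"{text!r:26} -> {got!r:10} ({status})")
```

`re.search` returns the leftmost match, so in an input like `S/M/L/XL` the attempt starting after the first `/` is rejected by the negative lookahead (both `M/L` and `M` are followed by `/`), and the first successful match is the one starting at `L`.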