I have written a regex for nginx that should capture the URL and the parameters without the ?. It must only match URLs that contain a ?, and it must split the result into 2 groups.
My regex is: ^(.*)\?(.*)$
It almost works, but it captures the trailing slash, which breaks some things.
As you can see, the trailing / is inside the capture group. So I want to match either ? or /? in a non-capturing group, depending on what is there, but it doesn't work as expected:
Updated regex: ^(.*)(?:\/\?|\?)(.*)$
This still always matches only the ?, I guess because it looks for the smaller match first.
I can't quite work out the right way to drop the trailing slash from the capture group in a single regex.
CodePudding user response:
You can use
^(.*[^\/])?\/?\?(.*)$
^(.*?)\/?\?(.*)$
See the regex demo #1 / regex demo #2.
Details:
The ^(.*[^\/])?\/?\?(.*)$ pattern means:
^ - start of string
(.*[^\/])? - an optional Group 1: any zero or more chars other than line break chars, as many as possible, and then a char other than a /
\/? - an optional / char
\? - a ? char
(.*) - Group 2: any zero or more chars other than line break chars, as many as possible
$ - end of string.
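As a rough check of the breakdown above, here is a minimal Python sketch that compares the two patterns from the question with the first suggested pattern. Python's re module is used only as a stand-in for nginx's regex engine, and the sample URLs and the split_url helper are made up for illustration. It suggests the alternation attempt fails because the greedy (.*) has already consumed the slash by the time the alternation is tried, so only the plain \? branch can match.

```python
import re

# Patterns copied from the thread; the names are illustrative only.
ORIGINAL = re.compile(r"^(.*)\?(.*)$")
ALTERNATION = re.compile(r"^(.*)(?:\/\?|\?)(.*)$")
GREEDY_FIX = re.compile(r"^(.*[^\/])?\/?\?(.*)$")


def split_url(pattern, url):
    """Return the two captured groups, or None if the URL does not match."""
    match = pattern.match(url)
    return match.groups() if match else None


url = "/path/to/page/?a=1&b=2"  # hypothetical sample URL

# Greedy (.*) swallows the slash, so both attempts keep it in Group 1.
print(split_url(ORIGINAL, url))     # ('/path/to/page/', 'a=1&b=2')
print(split_url(ALTERNATION, url))  # ('/path/to/page/', 'a=1&b=2')

# Forcing Group 1 to end in a non-slash char leaves the slash outside it.
print(split_url(GREEDY_FIX, url))   # ('/path/to/page', 'a=1&b=2')

# A URL without a ? is not matched at all.
print(split_url(GREEDY_FIX, "/path/to/page/"))  # None
```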
The ^(.*?)\/?\?(.*)$ means:
^ - start of string
(.*?) - Group 1: any zero or more chars other than line break chars, as few as possible
\/?\?(.*)$ - an optional /, then a ? char, then Group 2 capturing the rest of the string.
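The two suggested patterns agree on simple URLs but can differ on edge cases. The sketch below, again using Python's re as a stand-in for nginx's engine and made-up sample URLs, shows two such cases: a string with more than one ?, and a bare / before the ?. Which behavior is preferable depends on the nginx setup, so treat this as an illustration rather than a recommendation.

```python
import re

# Both patterns are copied from the answer; the names are illustrative only.
GREEDY = re.compile(r"^(.*[^\/])?\/?\?(.*)$")
LAZY = re.compile(r"^(.*?)\/?\?(.*)$")


def split_url(pattern, url):
    """Return the two captured groups, or None if the URL does not match."""
    match = pattern.match(url)
    return match.groups() if match else None


# On a simple URL both patterns drop the trailing slash from Group 1.
print(split_url(GREEDY, "/path/to/page/?a=1&b=2"))  # ('/path/to/page', 'a=1&b=2')
print(split_url(LAZY, "/path/to/page/?a=1&b=2"))    # ('/path/to/page', 'a=1&b=2')

# With two ? chars, the lazy pattern splits at the first one,
# while the greedy pattern splits at the last one.
print(split_url(LAZY, "/a?b?c"))    # ('/a', 'b?c')
print(split_url(GREEDY, "/a?b?c"))  # ('/a?b', 'c')

# With only a / before the ?, the optional group does not participate
# in the greedy pattern (None here), whereas the lazy one captures ''.
print(split_url(LAZY, "/?x=1"))    # ('', 'x=1')
print(split_url(GREEDY, "/?x=1"))  # (None, 'x=1')
```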

