Given a command line
mycommand --optional-arguments their-values <patternOfInterestWithDirectoryPath> arg1 arg2
where patternOfInterestWithDirectoryPath can be any of the following:
path/to/dir
/path/to/dir
path/to/dir/
"path/to/dir"
"/path/to/dir"
"path/to/dir/"
In all of the cases above, the goal is to extract the path itself (path/to/dir or /path/to/dir). The path may or may not be enclosed in double quotes, may or may not have a leading /, and may or may not have a trailing /.
The following regex does match, but it also captures the trailing forward slash:
\S*mycommand\s+(?:-\S+\s+)*\"?([^\"]+)\/?\"?.*
Attaching a negative lookahead like this does not work either:
\S*mycommand\s+(?:-\S+\s+)*"?([^\s"]+(?!\/"))\/?"?.*
Is there a way to exclude characters from the captured text when they are matched by the group but sit at a specific position (e.g. the rightmost one)?
CodePudding user response:
You can use
\S*mycommand\s+(?:-\S+\s+)*(?|"([^"]*?)\/?"|(\S+)(?<!\/)).*
See the regex demo. Details:
\S* - zero or more non-whitespace chars
mycommand - a literal string
\s+ - one or more whitespaces
(?:-\S+\s+)* - zero or more occurrences of -, one or more non-whitespace chars, one or more whitespaces
(?|"([^"]*?)\/?"|(\S+)(?<!\/)) - a branch reset group that matches either:
    "([^"]*?)\/?" - a ", then Group 1 capturing any zero or more chars other than a ", as few as possible, and then an optional / and a " char
    | - or
    (\S+)(?<!\/) - Group 1 (the group ID is still 1 because it is inside a branch reset group): one or more non-whitespace chars with no / at the end
.* - the rest of the line.
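If you need this in Python, note that branch reset groups (?|...) are a PCRE/Perl feature and the built-in re module does not support them. The snippet below is only a sketch of an equivalent approach: it uses two ordinary capturing groups and returns whichever one matched. The helper name extract_dir and the sample command lines are made up for illustration. The samples use an option without a separate value, because (?:-\S+\s+)* only skips tokens that start with -.

```python
import re

# Python's built-in re has no branch reset group (?|...), so this sketch
# uses two ordinary capturing groups and picks whichever one matched.
PATTERN = re.compile(
    r'\S*mycommand\s+'    # the command, optionally preceded by a path prefix
    r'(?:-\S+\s+)*'       # zero or more tokens that start with "-"
    r'(?:"([^"]*?)/?"'    # quoted path: group 1, optional trailing "/" left out
    r'|(\S+)(?<!/))'      # unquoted path: group 2, must not end with "/"
    r'.*'                 # the rest of the line
)


def extract_dir(line):
    """Return the path without quotes or a trailing slash, or None if no match."""
    m = PATTERN.match(line)
    if m is None:
        return None
    # Exactly one of the two groups participates in a successful match.
    return m.group(1) if m.group(1) is not None else m.group(2)


# Hypothetical sample command lines covering the quoted and unquoted forms.
samples = [
    'mycommand --flag path/to/dir arg1 arg2',
    'mycommand --flag /path/to/dir/ arg1 arg2',
    'mycommand --flag "path/to/dir/" arg1 arg2',
    'mycommand --flag "/path/to/dir" arg1 arg2',
]

for sample in samples:
    print(extract_dir(sample))
```

The lookbehind (?<!/) works through backtracking: \S+ first grabs the whole token, the lookbehind rejects a trailing /, and the engine gives back one character so the capture ends just before the slash.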
