I am trying to extract parameter definitions from a Jenkins script and can't work out an appropriate regex (I'm working in Dyalog APL, which supports PCRE8).
Here's what the subject looks like:
pipeline {
    agent none
    parameters {
        string(name: 'foo', defaultValue: 'bar')
        string(name: 'goo', defaultValue: 'hoo')
    }
    stages {
        stage('action') {
            steps {
                echo "foo = ${params.foo}"
            }
        }
    }
}
I would like to get the individual param definitions captured in group 1. In other words, I'm looking for a result that reports two matches: string(name: 'foo', defaultValue: 'bar') and string(name: 'goo', defaultValue: 'hoo'). However, the matches are either too long or too short (depending on greediness).
My regex:
parameters\s*{(\s*\D*\(.*\)\s*)*} (dot matches nl)
Parameter types may vary, so my best idea was to use \D* for those (any number of non-digits). I suspect that this captures more than I expected, but replacing it with \w did not help.
An alternative idea was
parameters\s*{(\s*(\w*)\(([^\)]*)\))*\s*}
which seemed more precise wrt matching parameter types and also the content of the parens - but surprisingly that returned goo only and skipped foo.
What am I missing?
Answer:
Using PCRE you can use this regex in MULTILINE mode:
(?m)(?:^\h*parameters\h*{|(?!^)\G).*\R\h*\w+\(\w+:\h*'\K[^']+
RegEx Details:
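(?m): Enable MULTILINE mode
(?:: Start non-capture group
^\h*parameters\h*{: Match parameters { at the start of a line, allowing horizontal whitespace before parameters and before {
|: OR
(?!^)\G: Continue from the end of the previous match, but not at the start of a line
): End non-capture group
.*: Match anything up to the end of the line
\R: Match a line break
\h*: Match 0 or more horizontal whitespace characters
\w+: Match 1+ word chars (the parameter type)
\(: Match a literal (
\w+: Match 1+ word chars
:: Match a :
\h*: Match 0 or more horizontal whitespace characters
': Match a '
\K: Reset the start of the reported match, discarding everything matched so far
[^']+: Match 1+ of any char that is not ' (this is our parameter name)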
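As a cross-check outside PCRE, here is a minimal Python sketch of the same idea. Python's built-in re module does not support \G, \K, \h or \R, so this swaps in a two-step approach instead of the regex above: first isolate the body of the parameters block, then pick out each definition inside it. The first part also reproduces why the second pattern from the question reports only goo: a capture group inside a repetition is overwritten on every iteration, so only the last definition survives. The names JENKINSFILE and parameter_definitions are made up for this illustration, and the sketch assumes that no definition contains a ) inside its parentheses and that the closing } of the block sits on its own line.

```python
import re

# Sample input from the question (indentation added for readability).
JENKINSFILE = """\
pipeline {
    agent none
    parameters {
        string(name: 'foo', defaultValue: 'bar')
        string(name: 'goo', defaultValue: 'hoo')
    }
    stages {
        stage('action') {
            steps {
                echo "foo = ${params.foo}"
            }
        }
    }
}
"""

# The second pattern from the question, with the braces escaped.
# The outer group repeats, so its captures are overwritten each time
# and only the last iteration is kept.
repeated = re.search(
    r"parameters\s*\{(\s*(\w*)\(([^)]*)\))*\s*\}", JENKINSFILE
)
last_only = repeated.group(3)  # contents of the goo definition only


def parameter_definitions(script):
    """Return every `type(...)` definition inside the parameters block.

    Hypothetical helper: assumes no ')' inside a definition and that
    the block's closing '}' is the first '}' at the start of a line.
    """
    # Step 1: capture the text between `parameters {` and its closing `}`.
    block = re.search(
        r"^[ \t]*parameters[ \t]*\{(.*?)^[ \t]*\}", script, re.M | re.S
    )
    if block is None:
        return []
    # Step 2: one match per definition line, instead of one repeated group.
    return re.findall(r"^[ \t]*(\w+\([^)]*\))", block.group(1), re.M)


definitions = parameter_definitions(JENKINSFILE)
# Pull the parameter name out of each definition.
names = [re.search(r"name:\s*'([^']+)'", d).group(1) for d in definitions]

print(last_only)
print(definitions)
print(names)
```

The same two-step split should carry over to any engine that lacks \G: let one search bound the block, then run a second global search on the captured text so that each definition is its own match.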
