Using GNU Awk 5.0.0, API: 2.0 (GNU MPFR 4.0.2, GNU MP 6.1.2), I want to check for a pattern using match.
My sample text is the following (with a space at the beginning of the line):
 7 Plasmas Mobiles (30%)
Using the following regex, I am able to match the string:
[0-9]{1,} .{1,} \([0-9]{1,}%\)
This live example confirms the match: regexr.com/6n3fh
However, awk's match returns 0:
awk '{print match($0, " [0-9]{1,} .{1,} \([0-9]{1,}%\)")}' reports/test
awk: cmd. line:1: warning: escape sequence `\(' treated as plain `('
awk: cmd. line:1: warning: escape sequence `\)' treated as plain `)'
0
Why is that, and how can I get the expected behavior, i.e. match returning 1?
CodePudding user response:
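In awk, a regex constant is written as /the-regex/ (see Regular Expressions). awk also supports Dynamic Regexps, where the regex is given as a quoted string, as in your command.
awk treats the two styles differently: a double-quoted string is scanned twice, first as a string literal, where escape sequences are processed, and then as a regex. In your command the first pass turns \( and \) into plain ( and ), which is what the warnings report. The regex engine therefore sees unescaped parentheses, which form a group instead of matching the literal characters, so the match fails and match returns 0. To get a backslash through to the regex, write a double backslash in the string, e.g. \\(.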
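One way to observe the first of those two passes is to print the string instead of using it as a regex. The following is a minimal sketch; the show_regex wrapper is a hypothetical name introduced only for illustration.

```shell
# Hypothetical helper: prints the text that remains after awk's
# string-literal pass, i.e. what the regex engine would receive.
show_regex() {
    awk 'BEGIN { print " [0-9]{1,} .{1,} \\([0-9]{1,}%\\)" }'
}

show_regex
# prints:  [0-9]{1,} .{1,} \([0-9]{1,}%\)
```

Each \\ in the string collapses to a single \ during the first pass, so the second (regex) pass still sees \( and \) and treats them as literal parentheses.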
In your case you can either use:
match($0, / [0-9]{1,} .{1,} \([0-9]{1,}%\)/)
or
match($0, " [0-9]{1,} .{1,} \\([0-9]{1,}%\\)")
Example Use/Output
$ echo " 7 Plasmas Mobiles (30%)" | awk '{print match($0, / [0-9]{1,} .{1,} \([0-9]{1,}%\)/)}'
1
and
$ echo " 7 Plasmas Mobiles (30%)" | awk '{print match($0, " [0-9]{1,} .{1,} \\([0-9]{1,}%\\)")}'
1
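As an alternative that is not part of the answer above, the parentheses can be placed in bracket expressions, which need no backslashes and so read the same in a string as in a /.../ constant. This is a minimal sketch: match_pos is a hypothetical helper name, and + is used in place of the equivalent {1,} on the assumption that it keeps the example portable to awk implementations without interval support.

```shell
# Hypothetical helper: prints the position where the pattern matches on
# each input line (0 if it does not match). [(] and [)] match literal
# parentheses without any backslash escaping.
match_pos() {
    awk '{ print match($0, " [0-9]+ .+ [(][0-9]+%[)]") }'
}

echo " 7 Plasmas Mobiles (30%)" | match_pos
# prints: 1
```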
