I'm working with regex in a PCRE2 environment.
In my switch logs I have to capture a text string, which I name "message", that sits at a specific position. The key point is that it is always preceded by a set of characters ending with : but, after them, there may or may not be some additional characters ending with ; and I must be able to skip them.
Let me explain with my current regex and some log samples.
There are three possible cases:
1. (s)[18014]:Recorded command information.
2. (l):User logged out.
3. (s)[18014]:CID=0x11aa2222;The user succeeded in logging out of XXX.
My current regex is:
\(\w+\)\[*\d*\]*\:(?<message>[^\[]+?\.)
that works for cases 1 and 2 because:
- \(\w+\) matches the fact that we always have a (, a literal character and a ).
- \[*\d*\]* matches the [, a number and a ] if they come next (they are absent in case 2).
- \: matches the : that follows in every case.
- (?<message>[^\[]+?\.) captures and names the message; it must not capture if a [ comes right after the :. The capture stops at the first . it meets.
My problem is case 3: after the : the text always begins with CID=<hexadecimal expression>, but it is not limited to this. Other expressions can follow it, always ended by ; so for case 3 I can have CID=<hex expression><other numeric and literal characters>;.
With the current regex, of course, the CID part is included in the message. I must avoid that: if the CID part is present, the message capture must start after the ; that ends it.
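To make the problem concrete, here is a minimal sketch that runs the current pattern against the three samples. It is written in Python rather than PCRE2, so the named group is spelled (?P<message>...) instead of (?<message>...); that spelling, and the names CURRENT, SAMPLES and current_message, are assumptions made only for this illustration.

```python
import re

# The pattern from the question. Python's re module needs (?P<name>...)
# for named groups; everything else is unchanged.
CURRENT = re.compile(r"\(\w+\)\[*\d*\]*\:(?P<message>[^\[]+?\.)")

SAMPLES = [
    "(s)[18014]:Recorded command information.",
    "(l):User logged out.",
    "(s)[18014]:CID=0x11aa2222;The user succeeded in logging out of XXX.",
]

def current_message(line):
    """Return the 'message' group for one log line, or None if nothing matches."""
    m = CURRENT.search(line)
    return m.group("message") if m else None

for line in SAMPLES:
    print(repr(current_message(line)))
```

For the third sample the captured text still starts with CID=0x11aa2222; which is exactly the part that has to be skipped.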
So, we can summarize that:
IF there is no CID word after the :, start capturing; ELSE, skip everything up to the ; and start capturing after it.
CodePudding user response:
The following pattern will match the right part of your test strings.
We look for either a : not followed by CID, using the negative lookahead (?!CID), or a ;. We then capture what follows in the last group.
((:(?!CID))|;)(.*)
see https://regex101.com/r/JRB4Rq/1
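As a quick check outside regex101, the pattern can be tried in Python; this is only a sketch, and the names SEPARATOR and text_after_separator are made up for the example. The pattern itself is unchanged, since it uses no named groups. Because of the two nested groups around the separator, the wanted text lands in group 3.

```python
import re

# Pattern from this answer, unchanged. Group 1 is the separator
# (":" not followed by CID, or ";"), group 2 is the inner ":" alternative,
# group 3 is everything after the separator.
SEPARATOR = re.compile(r"((:(?!CID))|;)(.*)")

def text_after_separator(line):
    """Return the text following the first accepted separator, or None."""
    m = SEPARATOR.search(line)
    return m.group(3) if m else None

print(text_after_separator("(s)[18014]:Recorded command information."))
print(text_after_separator("(l):User logged out."))
print(text_after_separator(
    "(s)[18014]:CID=0x11aa2222;The user succeeded in logging out of XXX."))
```

Note that this pattern does not check the (x)[n] prefix at all, so it relies on the first : or ; in the line being the one in front of the message.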
CodePudding user response:
You could write the pattern as:
\(\w+\)(?:\[\d+\])?:(?:CID=[^;]*;)?(?<message>[^.]+\.)
Explanation
- \(\w+\) Match 1+ word chars between parentheses
- (?:\[\d+\])? Optionally match 1+ digits between square brackets
- : Match the colon (you don't have to escape it)
- (?:CID=[^;]*;)? Optionally match the CID= part till the first semicolon
- (?<message>[^.]+\.) Group message, match 1+ chars other than . and then match the .
See a regex demo.
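The same pattern can be checked against the three samples from the question with a short Python sketch. Python's re module spells the named group (?P<message>...) instead of PCRE2's (?<message>...), so that one change is an adaptation for this example; MESSAGE and extract_message are hypothetical names.

```python
import re

# Pattern from this answer, with the named group written in Python's
# (?P<name>...) form. The optional CID=...; block is consumed before
# the message group starts.
MESSAGE = re.compile(
    r"\(\w+\)(?:\[\d+\])?:(?:CID=[^;]*;)?(?P<message>[^.]+\.)"
)

def extract_message(line):
    """Return the message part of a log line, or None if the line does not match."""
    m = MESSAGE.search(line)
    return m.group("message") if m else None

for line in (
    "(s)[18014]:Recorded command information.",
    "(l):User logged out.",
    "(s)[18014]:CID=0x11aa2222;The user succeeded in logging out of XXX.",
):
    print(extract_message(line))
```

Since [^.]+ stops at the first dot, a message that itself contains a . before its final one would be cut short; the samples shown here do not hit that case.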
