I'm trying to parse some attributes from a modem's AT output. My regex is as follows:
([^:]*):\s*([^\s]*)
Sample output is as follows:
LTE SSC1 bw : 20 MHz LTE SSC1 chan: 2850
LTE SSC2 state:INACTIVE LTE SSC2 band: B20
LTE SSC2 bw : 10 MHz LTE SSC2 chan: 6300
EMM state: Registered Normal Service
RRC state: RRC Connected
IMS reg state: NOT REGISTERED IMS mode: Normal
This mostly works, but it breaks when an attribute's value contains more characters after the first whitespace. For example, the match for "LTE SSC2 bw" has a group 2 value of "10" when it should be "10 MHz".
Ideally, the regex should match each attribute name exactly and capture its full value in a group.
Hope this makes sense and thanks for your help.
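The problem can be reproduced with a short Python sketch. Python is an assumption here, since the question does not name a language, and the sample line assumes the two pairs on a line are separated by a run of spaces, as in typical modem output.

```python
import re

# The regex from the question, unchanged.
ORIGINAL = re.compile(r"([^:]*):\s*([^\s]*)")

# Assumed spacing: several spaces between the two key-value pairs.
line = "LTE SSC2 bw  : 10 MHz      LTE SSC2 chan: 6300"

pairs = ORIGINAL.findall(line)
for key, value in pairs:
    print(repr(key), "->", repr(value))
# 'LTE SSC2 bw  ' -> '10'
# ' MHz      LTE SSC2 chan' -> '6300'
```

Group 2 stops at the first whitespace, so the leftover " MHz" ends up in the next match's group 1.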
Answer:
If there are always at least two spaces between the key-value pairs, you can use
([^:\s][^:]*):[^\S\r\n]*(\S+(?:[^\S\r\n]\S+)*)
Details:
- ([^:\s][^:]*) - Group 1: a char other than whitespace and :, and then zero or more non-: chars
- : - a colon
- [^\S\r\n]* - zero or more whitespaces other than CR and LF chars
- (\S+(?:[^\S\r\n]\S+)*) - Group 2: one or more non-whitespace chars, then zero or more repetitions of a whitespace other than CR and LF chars followed by one or more non-whitespace chars.
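As a rough illustration, the pattern can be applied with Python's `re.findall`. This is only a sketch: Python is an assumption, `parse_pairs` is a hypothetical helper name, and the sample text assumes at least two spaces between pairs on the same line.

```python
import re

# Key: starts with a non-space, non-colon char, runs up to the colon.
# Value: non-whitespace chunks joined by single horizontal whitespace chars.
PAIR = re.compile(r"([^:\s][^:]*):[^\S\r\n]*(\S+(?:[^\S\r\n]\S+)*)")

# Assumed spacing: two or more spaces separate pairs on one line.
sample = (
    "LTE SSC2 bw  : 10 MHz      LTE SSC2 chan: 6300\n"
    "EMM state: Registered Normal Service\n"
    "IMS reg state: NOT REGISTERED  IMS mode: Normal\n"
)

def parse_pairs(text):
    """Return a dict of attribute name -> value (hypothetical helper)."""
    # Keys such as "LTE SSC2 bw  " keep padding before the colon, so strip it.
    return {key.strip(): value for key, value in PAIR.findall(text)}

result = parse_pairs(sample)
print(result["LTE SSC2 bw"])     # 10 MHz
print(result["IMS reg state"])   # NOT REGISTERED
```

Note that a value containing two consecutive spaces would be cut at that point, which is the trade-off this approach relies on.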
Answer:
You can try with this regex:
(?<attribute>[A-Z]{3} [^:]+): *(?<value1>.*?)(?> {2,}|$)(?<value2>[^:]+$)?
The groups you have are the following:
- Group 1 attribute: will contain the attribute name
- Group 2 value1: will contain the attribute value
- Group 3 value2: will contain the optional attribute second value (for the fourth line)
Explanation:
- (?<attribute>[A-Z]{3} [^:]+): Group 1
  - [A-Z]{3}: three uppercase letters
  - a literal space
  - [^:]+: one or more characters other than a colon
- : *: a colon followed by any number of spaces
- (?<value1>.*?): Group 2
  - .*?: any characters, matched lazily (as few as possible)
- (?> {2,}|$): an atomic group (it consumes what it matches, so it is not a lookahead) that matches
  - {2,} after a space: two or more spaces (end of the first inline attribute:value pair)
  - |: or
  - $: end of string (end of the second inline attribute:value pair)
- (?<value2>[^:]+$)?: Group 3, optional
  - [^:]+: one or more characters other than a colon
  - $: end of string
You can refer to each group by its name.
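A possible Python adaptation of this approach is sketched below, with two changes that are not part of the answer: Python spells named groups `(?P<name>...)`, and the atomic group `(?>...)` is replaced with a plain non-capturing group `(?:...)`, because atomic groups are only available from Python 3.11. `parse_line` is a hypothetical helper, and the spacing in the sample lines is assumed. Each line is matched separately so that `$` means the end of that line.

```python
import re

# Python flavour of the pattern: (?P<name>...) for named groups,
# (?:...) swapped in for the atomic group (?>...).
PAIR = re.compile(
    r"(?P<attribute>[A-Z]{3} [^:]+): *(?P<value1>.*?)(?: {2,}|$)(?P<value2>[^:]+$)?"
)

def parse_line(line):
    """Return (attribute, value1, value2) tuples for one line (hypothetical helper)."""
    return [
        (m.group("attribute").strip(), m.group("value1"), m.group("value2"))
        for m in PAIR.finditer(line)
    ]

# Assumed spacing: runs of two or more spaces act as separators.
rows = parse_line("LTE SSC1 bw  : 20 MHz      LTE SSC1 chan: 2850")
emm = parse_line("EMM state: Registered     Normal Service")

print(rows)  # [('LTE SSC1 bw', '20 MHz', None), ('LTE SSC1 chan', '2850', None)]
print(emm)   # [('EMM state', 'Registered', 'Normal Service')]
```

`value2` is `None` whenever the text after the separator contains another colon, since that text then belongs to the next attribute.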
