Regex after end of line with line length being undetermined

Time:02-04

I have some text which I need to obtain between tags 59 and 71A:

:59:Line with random data and length
RANDOM VALUE
TO BE
EXTRACTED
:71A:SHA

The line starting with :59: can be of any length and contain arbitrary data. I need to extract the data between the :59: line and the :71A: line, excluding the content of those two lines themselves.

Edit: One way I could do this is to do a regex search using:

:59:(.*)\n

This will allow me to determine what is present in line 59. I could then use that as the base of my next search to determine what's between line 59 and 71:

(?<=length)[\s\S]+(?=>:71) (doesn't actually work)

CodePudding user response:

Since you haven't specified the language, I'll assume Java (but the technique will work in practically any language):

String target = file.replaceAll("(?ms).*^:59:.*?\n(.*)^:71A:.*", "$1");


CodePudding user response:

Your regex:

(?<=length\n)[\s\S]+(?=>71)

has a typo before 71; it should be (?=:71). Also, [\s\S]+ is greedy, so it will match everything from 'length' in the first line up to the last :71 tag, which I think you don't want. For example, with this input:

:ABC:Line with random data and length
RANDOM VALUE
TO BE
EXTRACTED
:71A:SHA

:59:Line with random data and length
RANDOM VALUE
TO BE
EXTRACTED
:71A:SHA
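
To make the greedy behaviour concrete, here is a minimal Python sketch run against the two-block sample above. The variable names are illustrative only. It also shows that merely switching to a lazy quantifier is not enough, because a lookbehind on 'length' never checks which tag the line starts with:

```python
import re

# The two-block sample from above: an :ABC: block followed by a :59: block.
sample = (
    ":ABC:Line with random data and length\n"
    "RANDOM VALUE\nTO BE\nEXTRACTED\n"
    ":71A:SHA\n"
    "\n"
    ":59:Line with random data and length\n"
    "RANDOM VALUE\nTO BE\nEXTRACTED\n"
    ":71A:SHA\n"
)

# Greedy: runs from the first "length\n" up to the LAST ":71".
greedy = re.search(r"(?<=length\n)[\s\S]+(?=:71)", sample)

# Lazy: stops at the FIRST ":71", but that is still the :ABC: block,
# because nothing in the pattern requires the line to start with :59:.
lazy = re.search(r"(?<=length\n)[\s\S]+?(?=:71)", sample)

print(repr(greedy.group(0)))
print(repr(lazy.group(0)))
```

The greedy match swallows the first :71A:SHA line and the whole :59: header line, while the lazy match returns the body of the :ABC: block rather than the :59: block.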

So you need to change your regex to:

JS: /^:59:(?<length_line>.*\n)(?<value>[\s\S]+?)\n:71A:SHA$/gm
Python (compile with re.MULTILINE): r"^:59:(?P<length_line>.*\n)(?P<value>[\s\S]+?)\n:71A:SHA$"

This matches from the :59: tag to the :71A:SHA tag. Named capture groups are used so that you get the rest of the :59: line in the group length_line and the value to be extracted in the group value.
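
As a rough sketch of how the Python variant might be used, the snippet below writes the pattern with the lazy [\s\S]+? quantifier and compiles it with re.MULTILINE so that ^ and $ anchor at line boundaries. The names PATTERN, text and match are made up for this example, and the input is assumed to use \n line endings:

```python
import re

# Anchored on the :59: tag; the lazy group stops at the first :71A:SHA line.
PATTERN = re.compile(
    r"^:59:(?P<length_line>.*\n)(?P<value>[\s\S]+?)\n:71A:SHA$",
    re.MULTILINE,  # let ^ and $ match at the start and end of each line
)

text = (
    ":ABC:Line with random data and length\n"
    "RANDOM VALUE\nTO BE\nEXTRACTED\n"
    ":71A:SHA\n"
    "\n"
    ":59:Line with random data and length\n"
    "RANDOM VALUE\nTO BE\nEXTRACTED\n"
    ":71A:SHA\n"
)

match = PATTERN.search(text)
if match:
    # Rest of the :59: line, including its trailing newline.
    print(repr(match.group("length_line")))
    # The lines between the :59: line and the :71A: line.
    print(match.group("value").splitlines())
```

Because the pattern starts with ^:59:, the :ABC: block is skipped and only the block under the :59: tag is captured. If the file may contain several :59: blocks, PATTERN.finditer(text) would return one match per block.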

You can see a step-by-step explanation of this regex if you test it here: https://regex101.com/r/lRRzdh/1
