I need to extract the text that sits between the tags :59: and :71A: in data like this:
:59:Line with random data and length
RANDOM VALUE
TO BE
EXTRACTED
:71A:SHA
The line starting with :59: can be of any length and contain arbitrary data. I need to extract the lines between the :59: line and the :71A: line, without the contents of those two lines themselves.
Edit: One way I could do this is to do a regex search using:
:59:(.*)\n
This lets me capture what is on the :59: line. I could then use the end of that line as the anchor for a second search that finds what lies between the :59: and :71A: lines:
(?<=length)[\s\S]+(?=>:71) (doesn't actually work)
CodePudding user response:
Since you haven't specified the language, I'll assume Java (but the technique will work in practically any language):
String target = file.replaceAll("(?ms).*^:59:.*?\n(.*)^:71A:.*", "$1");
See live demo.
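As a rough illustration of the "any language" remark, here is a minimal Python sketch of the same replace-with-group-1 technique. The variable names text and target and the hard-coded sample are assumptions for the demo; in practice the input would come from your file.

```python
import re

# Sample message copied from the question (assumed input for this sketch).
text = (
    ":59:Line with random data and length\n"
    "RANDOM VALUE\n"
    "TO BE\n"
    "EXTRACTED\n"
    ":71A:SHA\n"
)

# (?m) lets ^ match at the start of every line; (?s) lets . match newlines.
pattern = r"(?ms).*^:59:.*?\n(.*)^:71A:.*"

# The pattern consumes the whole input, so replacing the match with
# group 1 leaves only the lines between the two tags.
target = re.sub(pattern, r"\1", text)
print(target)
```

Note that the captured text keeps its trailing newline, and that if the pattern does not match, the input comes back unchanged, so you may want to check for a match first.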
CodePudding user response:
Your regex:
(?<=length\n)[\s\S]+(?=>71)
has a typo before 71; it should be (?=:71).
Also, this matches greedily: it takes everything after the first 'length' until the last :71 tag is found, so it will match from 'length' in the first line all the way to the final :71 tag, which I think you don't want. For example, consider this input:
:ABC:Line with random data and length
RANDOM VALUE
TO BE
EXTRACTED
:71A:SHA
:59:Line with random data and length
RANDOM VALUE
TO BE
EXTRACTED
:71A:SHA
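The greedy behaviour described above can be checked with a short Python sketch. It runs the typo-corrected lookaround regex on the two-block sample, once greedy and once lazy; the variable names are made up for the demo.

```python
import re

# The two-block sample from above: only the second block is tagged :59:.
text = (
    ":ABC:Line with random data and length\n"
    "RANDOM VALUE\nTO BE\nEXTRACTED\n"
    ":71A:SHA\n"
    ":59:Line with random data and length\n"
    "RANDOM VALUE\nTO BE\nEXTRACTED\n"
    ":71A:SHA\n"
)

# Greedy: runs from the first "length" line to just before the LAST :71 tag.
greedy = re.search(r"(?<=length\n)[\s\S]+(?=:71)", text).group()

# Lazy: stops at the first :71 tag, but still starts after the :ABC: line,
# because the lookbehind only checks for the word "length".
lazy = re.search(r"(?<=length\n)[\s\S]+?(?=:71)", text).group()

print(repr(greedy))
print(repr(lazy))
```

So making the quantifier lazy is not enough on its own; the match also has to be anchored on the :59: tag.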
So you need to change your regex to:
JS: /^:59:(?<length_line>.*\n)(?<value>[\s\S]+?)\n:71A:SHA$/gm
python (with the re.MULTILINE flag): r"^:59:(?P<length_line>.*\n)(?P<value>[\s\S]+?)\n:71A:SHA$"
This matches from the :59: tag to the :71A:SHA tag. Named capture groups are used so that you get the rest of the :59: line (the one ending in length) in the group length_line and the value to be extracted in the group value.
You can see a step-by-step explanation by testing this regex here: https://regex101.com/r/lRRzdh/1
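For completeness, a minimal sketch of how the Python version might be used, assuming the two-block sample from earlier as input. re.MULTILINE is needed so that ^ and $ match at line boundaries, mirroring the m flag in the JS version.

```python
import re

# Assumed input: the two-block sample, where only the second block is :59:.
text = (
    ":ABC:Line with random data and length\n"
    "RANDOM VALUE\nTO BE\nEXTRACTED\n"
    ":71A:SHA\n"
    ":59:Line with random data and length\n"
    "RANDOM VALUE\nTO BE\nEXTRACTED\n"
    ":71A:SHA\n"
)

pattern = re.compile(
    r"^:59:(?P<length_line>.*\n)(?P<value>[\s\S]+?)\n:71A:SHA$",
    re.MULTILINE,  # ^ and $ match at the start and end of each line
)

match = pattern.search(text)
if match:
    # Rest of the :59: line, including its trailing newline.
    print(repr(match.group("length_line")))
    # The lines between the two tags, without a trailing newline.
    print(repr(match.group("value")))
```

Because the pattern starts with ^:59:, the :ABC: block is skipped even though its first line also ends in length. Use pattern.finditer(text) instead of search if the input can contain several :59: blocks.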
