I want to use regex findall to parse an HTML page. Using : (\d{4,9})$(?<!#\*), I am able to exclude items that end in # or *, but I also want to extract the number from items that end in other characters. Below is an example of what I am trying to achieve.
input string
test: 11111###
test: 222222
test: 3333333<br>
expected output
["222222", "3333333"]
CodePudding user response:
You can use
:\s*(\d+)(?![#*\d])
See the regex demo. Details:
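: - a colon
\s* - zero or more whitespace characters
(\d+) - Group 1: one or more digits
(?![#*\d]) - a negative lookahead that fails the match if there is a #, a * or a digit immediately to the right of the current location.
The \d inside the lookahead matters: without it, the regex engine could backtrack and match a shorter run of digits (for example 1111 in 11111###), because the character following that shorter run is a digit rather than # or *.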
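
As a quick check, the pattern can be run against the sample input from the question with Python's re.findall. This is only a minimal sketch: it assumes the intended quantifier is \d+ (one or more digits), and the function name extract_numbers and the variable names are made up for illustration.

```python
import re

# Colon, optional whitespace, then a captured run of digits that is NOT
# followed by '#', '*' or another digit.
PATTERN = re.compile(r":\s*(\d+)(?![#*\d])")


def extract_numbers(text):
    """Return the captured digit runs (hypothetical helper for illustration)."""
    # With a single capturing group, findall returns a list of the group's text.
    return PATTERN.findall(text)


sample = "test: 11111###\ntest: 222222\ntest: 3333333<br>"
print(extract_numbers(sample))  # ['222222', '3333333']
```

If the 4 to 9 digit limit from the original attempt is still needed, replacing \d+ with \d{4,9} in the pattern should work the same way, since the \d in the lookahead also rejects runs longer than nine digits.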
