I want to use a regex to capture the color, animal, and country from the following HTML. The country name is sometimes preceded by a <br> tag, as with SPAIN in my example. I want to leave that <br> tag out of the match, so that only "SPAIN" is captured.
<p><span >RED</span><br><span >DOG</span>USA</p>
<p><span >GREEN</span><br><span >CAT</span><br>SPAIN</p>
<p><span >BLUE</span><br><span >MOUSE</span>FRANCE</p>
I have the following regex, but the third group still includes the <br> tag before the country:
/<p><span >(.*)<\/span><br><span >(.*)<\/span>(.*)<\/p>/
Please help.
CodePudding user response:
Try this:
<p><span >(.*)<\/span><br><span >(.*)<\/span>(?:<br>)?(.*)<\/p>
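(?:...) : non-capturing group, so the <br> is matched but not stored as a group.
? : matches the preceding group 0 or 1 times, which makes the <br> optional.
You can check the pattern on Regex101.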
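As a quick check of the pattern above, here is a minimal sketch in Python (the thread does not say which language is in use, so Python is an assumption, and the names PATTERN, html_lines, extract and results are made up for illustration). It runs the pattern against the three sample lines from the question:

```python
import re

# Pattern from the answer above. The optional non-capturing group (?:<br>)?
# consumes a <br> before the country when one is present.
PATTERN = re.compile(
    r'<p><span >(.*)</span><br><span >(.*)</span>(?:<br>)?(.*)</p>'
)

# Sample lines copied from the question.
html_lines = [
    '<p><span >RED</span><br><span >DOG</span>USA</p>',
    '<p><span >GREEN</span><br><span >CAT</span><br>SPAIN</p>',
    '<p><span >BLUE</span><br><span >MOUSE</span>FRANCE</p>',
]

def extract(line):
    """Return (color, animal, country) for one line, or None if no match."""
    match = PATTERN.search(line)
    return match.groups() if match else None

results = [extract(line) for line in html_lines]
for row in results:
    print(row)
```

The optional group is tried before the final (.*), so when a <br> is present it is consumed there and the third group starts at the country name.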
CodePudding user response:
You can try this to match only the uppercase text between > and <:
(?<=>)([[:upper:]]+)(?=<)
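To illustrate the lookaround approach, here is a small Python sketch. Python's re module does not support POSIX bracket classes such as [[:upper:]], so this version swaps in [A-Z], which is an assumption that the values are plain ASCII capitals. The names WORD, html_lines and words are hypothetical.

```python
import re

# Lookbehind (?<=>) requires a '>' just before the word and lookahead (?=<)
# requires a '<' just after it. Neither character is part of the match.
# [A-Z]+ stands in for [[:upper:]]+, which Python's re does not support.
WORD = re.compile(r'(?<=>)([A-Z]+)(?=<)')

# Sample lines copied from the question.
html_lines = [
    '<p><span >RED</span><br><span >DOG</span>USA</p>',
    '<p><span >GREEN</span><br><span >CAT</span><br>SPAIN</p>',
    '<p><span >BLUE</span><br><span >MOUSE</span>FRANCE</p>',
]

# One list of [color, animal, country] per line.
words = [WORD.findall(line) for line in html_lines]
for row in words:
    print(row)
```

This version does not depend on the exact tag layout, but it returns every run of capitals that sits directly between > and <, so it only fits when the color, animal, and country are the only such runs in each line.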
