I have the following Regex in my PHP code:
// markers for italic set *Text*
if (substr_count($line, '*')>=2)
{
$line = preg_replace('#\*{1}(.*?)\*{1}#', '<i>$1</i>', $line);
}
which works great.
However, when a $line holds a <br>, e.g.
*This is my text<br>* Some other text
Then the regex still considers the text and transforms it to:
<i>This is my text<br></i> Some other text
The goal is to leave the text untransformed when a <br> is encountered. How can I do that with a regex, for example with a so-called "negative lookahead", or how can the existing regex be changed?
Note: Strings like *This is my text*<br>Some other text<br>And again *italic*<br>END should still be considered and transformed.
Idea: Or should I explode the $line and then iterate over the results with the regex?!
CodePudding user response:
Using the match-what-you-don't-want-and-discard technique, you may use this regex in PHP (PCRE):
\*[^*]*<br>\*(*SKIP)(*F)|\*([^*]*)\*
and replace with <i>$1</i>
PHP code:
$r = preg_replace('/\*[^*]*<br>\*(*SKIP)(*F)|\*([^*]*)\*/',
    '<i>$1</i>', $input);
Explanation:
\* : Match a *
[^*]* : Match 0 or more non-* characters
<br> : Match <br>
\* : Match closing *
(*SKIP)(*F) : PCRE verbs to discard and skip this match
| : OR
\*([^*]*)\* : Match string enclosed by *s
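To see the pattern at work, here is a minimal sketch that runs it against the two sample lines from the question. The function name italicizeSkippingBr is hypothetical and only wraps the preg_replace call shown above.

```php
<?php
// Hypothetical helper wrapping the (*SKIP)(*F) pattern from this answer.
function italicizeSkippingBr(string $line): string
{
    // First alternative: "*...<br>*" is matched and then thrown away via (*SKIP)(*F).
    // Second alternative: any remaining "*...*" pair is captured into group 1.
    $pattern = '/\*[^*]*<br>\*(*SKIP)(*F)|\*([^*]*)\*/';
    return preg_replace($pattern, '<i>$1</i>', $line);
}

// Sample lines taken from the question.
echo italicizeSkippingBr('*This is my text<br>* Some other text'), PHP_EOL;
// prints: *This is my text<br>* Some other text

echo italicizeSkippingBr('*This is my text*<br>Some other text<br>And again *italic*<br>END'), PHP_EOL;
// prints: <i>This is my text</i><br>Some other text<br>And again <i>italic</i><br>END
```

Note that the skip branch only fires when <br> sits directly before the closing *. A line such as *a<br>b* would still be italicized; if that matters, widening the skip branch to \*[^*]*<br>[^*]*\* is one possible adjustment, though that variant is an assumption and not part of the answer.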
CodePudding user response:
You can replace matches of the regular expression
\*((?:(?!<br>)[^*])+)\*
with
'<i>$1</i>'
where $1 holds the text between the asterisks ($0 would hold the entire match, asterisks included).
The regular expression can be broken down as follows.
\* # match '*'
( # begin capture group 1
(?: # begin a non-capture group
(?!<br>) # negative lookahead asserts that the next four chars are not '<br>'
[^*] # match any char other than '*'
)+ # end non-capture group and execute it one or more times
) # end capture group 1
\* # match '*'
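A minimal sketch of this lookahead approach in PHP, using the sample lines from the question. It assumes the non-capture group is repeated with + and wraps it in a capture group so that only the text between the asterisks ($1), not the full match, ends up inside the <i> tags. The function name italicizeWithLookahead is hypothetical.

```php
<?php
// Hypothetical helper: italicize "*...*" only when no "<br>" occurs between the asterisks.
function italicizeWithLookahead(string $line): string
{
    // (?!<br>) is checked before every consumed character, so the run
    // between the two asterisks can never contain "<br>".
    $pattern = '/\*((?:(?!<br>)[^*])+)\*/';
    return preg_replace($pattern, '<i>$1</i>', $line);
}

// Sample lines taken from the question.
echo italicizeWithLookahead('*This is my text<br>* Some other text'), PHP_EOL;
// prints: *This is my text<br>* Some other text

echo italicizeWithLookahead('*This is my text*<br>Some other text<br>And again *italic*<br>END'), PHP_EOL;
// prints: <i>This is my text</i><br>Some other text<br>And again <i>italic</i><br>END
```

One design difference worth knowing: when a pair is rejected, this pattern lets the rejected closing * start a new match. For example, *a<br>* b *c* becomes *a<br><i> b </i>c*, so the approach suits lines where asterisks are reasonably well paired.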
