Perl regex substitution if NOT this string NOR that character

I'm using Perl to highlight errors in my browser as I scan through pages of text. At this point, I want to ensure that the text Seq is preceded by a Maltese cross and a space (✠ ), and highlight it otherwise. I also want to ignore n>Seq.

PS. If it's easier, I want to ignore > but it will always be n>. In fact, it would always be </span> - whichever is easiest to check for.

Example phrase: ✠ Seq. S. Evangélii sec. Joánnem. — In illo témpore

I'm trying to replace xySeq if xy is NOT a Maltese cross and a space ✠ , AND if xy is NOT the letter n and a greater than symbol n>.

In other words, I don’t want to substitute

✠ Seq  
n>Seq  
>Seq  
</span>Seq

but I do want to replace things like

✠Seq  
* Seq  
a✠Seq  
>aSeq

The following would work if I was just checking for single characters like ✠ or >

my $span_beg = q(<span class='bcy'>); # HTML markup for highlighting
my $span_end = q(</span>);
$phr =~ s/([^✠>]Seq)/$span_beg$1$span_end/g;

but [^✠ >]Seq naturally treats the ✠ and the space as two separate alternatives for a single character, not as a two-character sequence.

I even tried [^(✠\s)>]Seq and a variable [^$var>] but these didn’t work.

I played with (?<!✠\s)Seq but didn't know how to incorporate > or if it was even the right way to go.

I hope this is possible, thanks for all.
Guy

CodePudding user response：

If you always want to tag Seq and exactly two characters before it, a couple of look-behinds might be enough:

s{..(?<!✠\s)(?<!n>)Seq}{$span_beg$&$span_end}g;

Or, with look-ahead:

s{(?!✠\s)(?!n>)..Seq}{$span_beg$&$span_end}g;
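
For anyone who wants to try the look-behind version against the examples from the question, here is a minimal sketch. The helper name highlight_seq and the sample strings are made up for illustration, and it assumes the script is saved as UTF-8 (hence use utf8, so that ✠ counts as one character and .. really means two characters).

```perl
use strict;
use warnings;
use utf8;                               # assumption: this source file is UTF-8
binmode STDOUT, ':encoding(UTF-8)';

my $span_beg = q(<span class='bcy'>);   # HTML markup for highlighting
my $span_end = q(</span>);

# Hypothetical helper: wrap Seq plus the two characters before it,
# unless those two characters are "✠ " (cross + whitespace) or "n>".
sub highlight_seq {
    my ($phr) = @_;
    $phr =~ s{(..(?<!✠\s)(?<!n>)Seq)}{$span_beg$1$span_end}g;
    return $phr;
}

# Samples padded so that Seq always has at least two characters before it.
for my $sample ('x ✠ Seq.', '</span>Seq.', 'xa✠Seq.', 'x>aSeq.', 'xx* Seq.') {
    print highlight_seq($sample), "\n";
}
```

The first two samples should come back unchanged, while something like xa✠Seq. should become x<span class='bcy'>a✠Seq</span>. Note that a Seq with fewer than two characters in front of it (for example at the very start of the string) is never matched by this pattern.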

CodePudding user response：

This should be more efficient than performing lookaround at every position:

# Doesn't include preceding characters in the span.
s{(✠ |>)?Seq}{ $1 ? $& : "$span_beg$&$span_end" }eg

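# Includes two preceding characters in the span.
# (.> instead of a bare >, otherwise the attempt starting at the "n" of "n>Seq"
# matches through .. and gets highlighted.)
s{(?:(✠ |.>)|..)Seq}{ $1 ? $& : "$span_beg$&$span_end" }seg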
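
A small runnable sketch of the first variant (the one that leaves the preceding characters outside the span), run against the question's examples. The helper name highlight_bare_seq is hypothetical, and as above it assumes a UTF-8 source file with use utf8.

```perl
use strict;
use warnings;
use utf8;                               # assumption: this source file is UTF-8
binmode STDOUT, ':encoding(UTF-8)';

my $span_beg = q(<span class='bcy'>);   # HTML markup for highlighting
my $span_end = q(</span>);

# Hypothetical helper: if an allowed prefix ("✠ " or ">") was captured,
# put the match back untouched; otherwise wrap the bare Seq.
sub highlight_bare_seq {
    my ($phr) = @_;
    $phr =~ s{(✠ |>)?Seq}{ $1 ? $& : "$span_beg$&$span_end" }eg;
    return $phr;
}

for my $sample ('✠ Seq.', 'n>Seq.', '</span>Seq.', '✠Seq.', '* Seq.', 'a✠Seq.', '>aSeq.') {
    print highlight_bare_seq($sample), "\n";
}
```

The first three samples should print unchanged and the rest should get the span around Seq only. The idea is that the optional group consumes an allowed prefix when one is there, so the /e replacement can decide from $1 whether to highlight.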