Regular expression to match speaker1: a \n speaker1: b → speaker1: a b

Time:01-16

This was my initial attempt:

^(speaker1|speaker2): (.*?)[\n\s\r] \k<1>: 

But it doesn't work in cases like this:

speaker1: kskdjsk
speaker2: 223
speaker1: fkjfdsj

because (.*?) keeps matching until it reaches the 3rd line, where the same speaker appears again.

So I tried adding a negative lookbehind, (?<!^(speaker1|speaker2):):

^(speaker1|speaker2): (.*?(?<!^(speaker1|speaker2):))[\n\s\r] \k<1>: 

But it doesn't work.

CodePudding user response:

You can check at each character position that a speaker pattern does not start there:

\bspeaker[12]:((?:(?!\bspeaker[12]).)*)
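As a rough illustration, here is how that pattern behaves on the sample from the question. Python is an assumption here, since the thread does not name a language; the pattern itself is copied unchanged from the answer, and the name TURN is only illustrative.

```python
import re

# Pattern from the answer, unchanged. The inner (?:(?!\bspeaker[12]).)*
# consumes one character at a time, but only at positions where no
# speaker label begins.
TURN = re.compile(r"\bspeaker[12]:((?:(?!\bspeaker[12]).)*)")

text = "speaker1: kskdjsk\nspeaker2: 223\nspeaker1: fkjfdsj"

# findall returns the single capture group: the text after each label.
captured = TURN.findall(text)

# The same pattern also stops at a label that appears on the same line.
one_line = TURN.findall("speaker1: a speaker2: b")

print(captured)
print(one_line)
```

Each capture ends either at the next speaker label or at the end of the line, because `.` does not match a newline unless the corresponding flag is set.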

Having said that, if your programming environment allows for it, it will be more efficient to match only speaker[12]: and split the input text by the matches.
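A minimal sketch of the split approach, again assuming Python; the helper name merge_turns and the whitespace handling are illustrative choices, not something the thread specifies. It splits on the speaker labels, then joins consecutive turns that belong to the same speaker.

```python
import re

# Match only the label; the capturing group makes re.split keep the
# speaker name in the result list.
LABEL = re.compile(r"\b(speaker[12]):")

def merge_turns(text):
    """Join consecutive turns by the same speaker into one line (hypothetical helper)."""
    parts = LABEL.split(text)
    # parts[0] is whatever precedes the first label; after that,
    # speaker names and their text alternate.
    merged = []
    for speaker, body in zip(parts[1::2], parts[2::2]):
        body = " ".join(body.split())  # collapse newlines and extra spaces
        if merged and merged[-1][0] == speaker:
            merged[-1][1] = (merged[-1][1] + " " + body).strip()
        else:
            merged.append([speaker, body])
    return "\n".join(f"{speaker}: {body}" for speaker, body in merged)

print(merge_turns("speaker1: a\nspeaker1: b"))
print(merge_turns("speaker1: kskdjsk\nspeaker2: 223\nspeaker1: fkjfdsj"))
```

The first call merges the two speaker1 lines into one, while the second input is left as three lines because the speakers alternate.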
