I need a regular expression that matches a tab character according to the following rules (the arrow stands for a tab):
"—>text" does not match
"1.—>text" does not match
"1—>text" does not match
"A.—>text" does not match
"text—>text" match
That is, it shouldn't match a tab at the beginning of the text, or a tab that directly follows a list item marker ([A-Z] or [0-9], optionally followed by a dot). Here is my expression:
(?<!^((?:\d+|[A-Z])(?:\.)?))\t(?!\1)
How can I fix it?
CodePudding user response:
You can use
(?<!^(?:(?:\d+|[A-Z])\.?)?)\t
Details:
(?<!^(?:(?:\d+|[A-Z])\.?)?) - a negative lookbehind that fails the match if, immediately to the left of the current location, there is:
  ^ - start of string
  (?:(?:\d+|[A-Z])\.?)? - an optional sequence of:
    (?:\d+|[A-Z]) - one or more digits or an uppercase ASCII letter
    \.? - an optional .
\t - a tab char.
Note that (?:\.)? is the same as \.?.
Also, a capturing group inside a negative lookbehind makes little sense: the overall match can only proceed when the lookbehind pattern fails, so the group never holds a value by the time the \1 backreference is reached.
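To check the pattern against the five cases from the question, here is a minimal sketch in TypeScript. The question does not name a regex flavor, so this assumes an engine with variable-length lookbehind (JavaScript since ES2018, or .NET); Python's built-in re module, for instance, rejects it because it requires fixed-width lookbehind. The names tabPattern and matchingTabIndices are hypothetical, picked only for this illustration.

```typescript
// The pattern from the answer. Assumption: the engine supports
// variable-length lookbehind (JavaScript ES2018+ does).
const tabPattern = /(?<!^(?:(?:\d+|[A-Z])\.?)?)\t/g;

// Hypothetical helper: returns the index of every tab the pattern matches.
function matchingTabIndices(text: string): number[] {
  const indices: number[] = [];
  tabPattern.lastIndex = 0; // the g flag keeps state between exec calls, so reset it
  let m: RegExpExecArray | null;
  while ((m = tabPattern.exec(text)) !== null) {
    indices.push(m.index);
  }
  return indices;
}

// The five cases from the question, with real tab characters.
const samples: string[] = [
  "\ttext",     // tab at the start of the text
  "1.\ttext",   // tab after a digit and a dot
  "1\ttext",    // tab after a digit
  "A.\ttext",   // tab after an uppercase letter and a dot
  "text\ttext", // tab in the middle of ordinary text
];

for (const s of samples) {
  console.log(JSON.stringify(s), matchingTabIndices(s));
}
```

Only the last sample reports a match, at index 4; the first four give an empty list. Without the m flag, ^ means the start of the whole string, so in multi-line input only the first line is protected. If every line start should count, adding the m flag is one option to try.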

