I am trying to match part of a text that looks like this:
This is dummy text, and file added is [^file.pdf] and this is my format \\[^myfile.png]
The first [^...] is what I want to match (it is actually a file link). If the user types this format manually in the input, it is escaped, as you can see in the second \\[^...]. So I want to get the text between all the [^...] markers, but not match when the bracket is preceded by a \.
I have tried [^\\]\[.*\]$, but it is not working. I also tried (?!.*?[\\])\[.*\]; this one matches the brackets but does not exclude a bracket preceded by a backslash.
I am using Python (3.9.*). Please note that I get this text format from the API, so changing the text format is not an option.
CodePudding user response:
You used a negated character class, [^\\], which requires a char other than \ in front of your expected matches; this excludes matches at the start of the string. Another issue is the greedy dot, .*. It matches any zero or more chars other than line break chars, as many as possible, so you matched from the first [ to the last ]. You also did not specify that there must be a ^ after the [, so strings with no ^ after the [ could match too.
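To make the two problems concrete, here is a small sketch that runs the first attempted pattern against the sample text from the question. The variable names are only illustrative, and the sample is written as a raw string on the assumption that the backslashes are literal characters in the API text.

```python
import re

# First pattern tried in the question.
attempt = re.compile(r'[^\\]\[.*\]$')

# Sample text from the question, taken literally (raw string: backslashes are real chars).
sample = r"This is dummy text, and file added is [^file.pdf] and this is my format \\[^myfile.png]"

# The greedy .* runs from the first [ to the last ], so both links end up in one match.
m = attempt.search(sample)
greedy_match = m.group(0) if m else None
print(repr(greedy_match))

# [^\\] must consume a char before the [, so a link at the very start cannot match.
start_match = attempt.search("[^file.pdf]")
print(start_match)
```

The first search returns a single match that swallows both bracketed parts, and the second returns no match at all because nothing precedes the opening bracket.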
You can use
(?<!\\)(\[\^[^][]*])
See the regex demo. Details:
(?<!\\) - negative lookbehind that fails the match if there is a \ immediately to the left of the current location
\[\^ - a [^ substring
[^][]* - a negated character class that matches any zero or more chars other than [ and ]
] - a ] char.
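A minimal sketch of how the pattern above could be used from Python with the standard re module. The names LINK_RE and NAME_RE and the sample strings are illustrative assumptions. NAME_RE is a small variation, not part of the answer's pattern: it moves the capturing group inside the brackets so that only the file name is returned.

```python
import re

# Pattern from the answer: a [^...] link not preceded by a backslash.
LINK_RE = re.compile(r'(?<!\\)(\[\^[^][]*])')

# Variation (an assumption, not the original pattern): capture only the text inside [^...].
NAME_RE = re.compile(r'(?<!\\)\[\^([^][]*)]')

# Sample text from the question, taken literally (raw string: backslashes are real chars).
text = r"This is dummy text, and file added is [^file.pdf] and this is my format \\[^myfile.png]"

links = LINK_RE.findall(text)   # whole links, brackets included
names = NAME_RE.findall(text)   # just the part between [^ and ]
print(links)
print(names)

# A link at the very start of the string is still found, since a lookbehind consumes no char.
at_start = LINK_RE.findall("[^first.txt] then [^second.txt]")
print(at_start)
```

Because the lookbehind only inspects the single character to the left of the [, it rejects the escaped link without consuming anything, which is why matches at the start of the string still work.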
