Sorry to bother you. I know this topic already exists, but after a lot of tries I still can't get the result I want.
My code:
import re

string1 = 'James CameronSteven Spielberg'
string2 = 'Martin Scorsese'
string3 = 'John McQueen'
# Split where a lowercase letter is immediately followed by an uppercase letter
result1 = re.split(r"(?<=[a-zéè])(?=[A-ZÉÈÊ])", string1)  # ['James Cameron', 'Steven Spielberg']
result2 = re.split(r"(?<=[a-zéè])(?=[A-ZÉÈÊ])", string2)  # ['Martin Scorsese']
result3 = re.split(r"(?<=[a-zéè])(?=[A-ZÉÈÊ])", string3)  # ['John Mc', 'Queen']
I'm trying to add an exception to my regex (it runs in a loop, so I want to use only one regex) so that names starting with "Mc" are not split.
CodePudding user response:
You can use
(?<=[a-zéè])(?<!Mc)(?=[A-ZÉÈÊ])
Details:
(?<=[a-zéè]) - a positive lookbehind that matches a location immediately preceded by a lowercase ASCII letter (a-z), é, or è
(?<!Mc) - a negative lookbehind that fails the match if Mc is immediately to the left of the current position
(?=[A-ZÉÈÊ]) - a positive lookahead that matches a location immediately followed by an uppercase ASCII letter or É, È, or Ê.
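As a quick check, here is a minimal sketch that applies the pattern above to the three strings from the question, plus one extra string combining both cases. The names NAME_BOUNDARY and split_names are hypothetical helpers introduced only for this illustration, and the sketch assumes Python 3.7 or later, where re.split can split on zero-width matches.

```python
import re

# Zero-width split point: preceded by a lowercase letter, not preceded
# by "Mc", and followed by an uppercase letter.
# NAME_BOUNDARY and split_names are illustrative names, not from the question.
NAME_BOUNDARY = re.compile(r"(?<=[a-zéè])(?<!Mc)(?=[A-ZÉÈÊ])")


def split_names(text):
    """Split concatenated names at lowercase-to-uppercase boundaries,
    leaving names that start with "Mc" intact."""
    return NAME_BOUNDARY.split(text)


samples = [
    "James CameronSteven Spielberg",
    "Martin Scorsese",
    "John McQueen",
    "John McQueenSteven Spielberg",  # extra case: "Mc" name followed by another name
]

for sample in samples:
    print(split_names(sample))
# ['James Cameron', 'Steven Spielberg']
# ['Martin Scorsese']
# ['John McQueen']
# ['John McQueen', 'Steven Spielberg']
```

Because the pattern is compiled once, the same object can be reused inside the loop. Note that the exception covers only the exact prefix "Mc"; other prefixes such as "Mac" would need their own lookbehind.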
