In a string, I need all timecodes to be formatted as follows: [HH:MM:SS] or [HH:MM:SS.ms]. Some of them are already in brackets. They can appear anywhere: at the beginning, in the middle, or at the end of a phrase.
I'd like to wrap the ones that are not yet in brackets.
To select all of them I use:
[\[]?\d\d:\d\d:\d\d(.\d+)?[\]]?
I tried
(?!\[.+\])(.|^)(\d\d:\d\d:\d\d(.\d+)?)(.|$)(?!\[.+\])
This is almost fine, except that my selection $2 includes space characters when the timecode is not at the beginning (^) or the end ($) of the string.
How can I get rid of this selection?
CodePudding user response:
You can use
re.sub(r'\[?\b(\d{2}:\d{2}:\d{2}(?:\.\d+)?)\b]?', r'[\1]', text)
See the regex demo. Details:
\[? - an optional [ char
\b - a word boundary
(\d{2}:\d{2}:\d{2}(?:\.\d+)?) - Group 1:
    \d{2}:\d{2}:\d{2} - two digits, and then two occurrences of : and two digits
    (?:\.\d+)? - an optional sequence of . and one or more digits
\b - a word boundary
]? - an optional ] char
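As a quick check of the substitution above, here is a minimal sketch. The helper name bracket_timecodes and the sample string are made up for illustration; only the pattern and the replacement come from the answer.

```python
import re

# Pattern from the answer: optional "[", the timecode in Group 1, optional "]".
TIMECODE = re.compile(r'\[?\b(\d{2}:\d{2}:\d{2}(?:\.\d+)?)\b]?')

def bracket_timecodes(text):
    """Wrap every timecode in brackets (hypothetical helper name).

    Brackets that are already present are consumed by the optional
    \\[? and ]? parts, so they are rewritten rather than doubled.
    """
    return TIMECODE.sub(r'[\1]', text)

# Illustrative sample: timecodes at the start, middle, and end of the string.
sample = "00:01:02 intro [00:03:04.500] middle 00:05:06.75 and end 01:02:03"
print(bracket_timecodes(sample))
```

Because the word boundaries match positions rather than characters, the surrounding spaces are never part of the match, which should avoid the extra-space problem described in the question.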
To make sure you match 24-hour time format you can use a more precise pattern:
\[?\b((?:[01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9](?:\.[0-9]+)?)\b]?
See this demo.
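A small sketch of how the stricter pattern behaves, assuming the same replacement string as before. The helper name bracket_valid_timecodes and the sample string are illustrative only.

```python
import re

# Stricter pattern: hours 00-23, minutes and seconds 00-59, optional fraction.
STRICT_TIMECODE = re.compile(
    r'\[?\b((?:[01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9](?:\.[0-9]+)?)\b]?'
)

def bracket_valid_timecodes(text):
    """Bracket only values that look like a valid 24-hour time (hypothetical helper)."""
    return STRICT_TIMECODE.sub(r'[\1]', text)

# Illustrative sample: 25:00:00 is not a valid 24-hour time.
sample = "start 23:59:59.9 then 25:00:00 then [07:30:00]"
print(bracket_valid_timecodes(sample))
```

With this pattern, out-of-range values such as 25:00:00 are left untouched instead of being bracketed.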
