Home > Software design >  Regex in BigQuery regexp_instr function
Regex in BigQuery regexp_instr function

Time:01-06

I would like to identify below string via REGEXP_INSTR function on BigQuery:

2020-12-31T23:00:00.000Z;15,2021-12-31T23:00:00.000Z;271,2022-12-31T23:00:00.000Z;12,...

The input string is a list of couples (datetime/integer value) separated by "," where each couple element is separated by ";".

My first (unsuccessful) iteration was:

'^(\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}.\\d{3}Z;\\d*) ,?'

I don't succeed to properly manage the "list" with possibly uneeded "," after the last element of the list.

Could you help me constructing the regex pattern please ?

Thanks a lot for help :)

CodePudding user response:

You can match the first part, and optionally repeat the whole part preceded by a , so it does not match the last comma.

If there can not be a . at the end, you can append $ to the pattern to assert the end of the string.

Note to escape the dot \.

^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z;\d*(?:,\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z;\d*)*

Regex demo

With double escaped backslashes

^\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d{3}Z;\\d*(?:,\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d{3}Z;\\d*)*
  •  Tags:  
  • Related