Regex replacement for URI in vrl (vector.dev)

Time:02-08

I need a regex that replaces the pieces of a URI that would create a high cardinality situation.

Basically, if a segment of a URI contains any character outside a-zA-Z (other than /), replace the whole segment with a single *

Example:

$ replace("/first/12ab34/B1a234/12B3a/1234/second/A789B-89d", r'(?i)[a-z]*\d+(?i)[a-z]*',"*")

results in: "/first/**/**/**/*/second/*-*"

That's close, but I need "/first/*/*/*/*/second/*"

Multiple replaces are acceptable. Any regex masters out there willing to help? This is for VRL (vector.dev), which is written in Rust. VRL does not support look-around of any kind.
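
To see where the doubled asterisks in the attempt come from, here is a small sketch in Python. Python's re module is only a stand-in for the Rust regex engine that VRL uses (an assumption of this sketch, though the pattern uses nothing engine-specific), and the case-insensitive flag is passed once instead of being repeated inline.

```python
import re

uri = "/first/12ab34/B1a234/12B3a/1234/second/A789B-89d"

# The pattern from the question, with case-insensitivity passed as a flag.
attempt = re.compile(r"[a-z]*\d+[a-z]*", re.IGNORECASE)

# A match stops at the end of the letters that follow a digit run, so a
# segment like "12ab34" is consumed as two separate matches.
matches = attempt.findall("12ab34")
masked = attempt.sub("*", uri)

print(matches)  # ['12ab', '34']
print(masked)   # /first/**/**/**/*/second/*-*
```

Each segment that alternates digits and letters more than once is split into several matches, and the hyphen in the last segment is never matched at all, which is why it survives as "*-*".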

CodePudding user response:

For the example data, you might use

(?i)[a-z]*\d[\da-z]*(?:-[\da-z]+)*
  • (?i) Inline modifier for case insensitive
  • [a-z]* Match optional chars a-z
  • \d Match a single digit
  • [\da-z]* Match optional digits or chars a-z
  • (?:-[\da-z]+)* optionally repeat a - followed by 1 or more digits or chars a-z
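
As a quick check of the pattern above against the example from the question, here is a minimal Python sketch. Python's re module is used as a stand-in for VRL's Rust regex engine, which is an assumption of this sketch; in VRL the same pattern would go into the second argument of replace as an r'...' literal.

```python
import re

uri = "/first/12ab34/B1a234/12B3a/1234/second/A789B-89d"

# Optional letters, one digit, then any run of digits/letters, optionally
# followed by "-" plus more digits/letters, all case-insensitive.
first_answer = re.compile(r"(?i)[a-z]*\d[\da-z]*(?:-[\da-z]+)*")

masked_first = first_answer.sub("*", uri)
print(masked_first)  # /first/*/*/*/*/second/*
```

Because [\da-z]* keeps consuming after the first digit, each segment is now taken in one match, and the trailing group absorbs the "-89d" part of the last segment.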


CodePudding user response:

Use

[^/\d]*\d[^/]*


EXPLANATION

--------------------------------------------------------------------------------
  [^/\d]*                  any character except: '/', digits (0-9) (0
                           or more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  \d                       digits (0-9)
--------------------------------------------------------------------------------
  [^/]*                    any character except: '/' (0 or more times
                           (matching the most amount possible))
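
The pattern explained above can be tried out with a short Python sketch (Python's re module standing in for VRL's Rust regex engine, which is an assumption here). Note that it only masks segments that contain a digit. The second pattern, non_alpha_segment, is a hypothetical variant that is not part of the answer: it masks any segment with at least one character outside a-zA-Z, which is closer to how the question is worded.

```python
import re

uri = "/first/12ab34/B1a234/12B3a/1234/second/A789B-89d"

# Pattern from the answer: masks any segment that contains a digit.
digit_segment = re.compile(r"[^/\d]*\d[^/]*")

# Hypothetical stricter variant (not from the answer): masks any segment
# containing at least one character outside a-zA-Z.
non_alpha_segment = re.compile(r"[^/]*[^/a-zA-Z][^/]*")

masked_digit = digit_segment.sub("*", uri)
masked_strict = non_alpha_segment.sub("*", uri)
print(masked_digit)   # /first/*/*/*/*/second/*
print(masked_strict)  # /first/*/*/*/*/second/*

# A digit-free segment such as "foo-bar" only changes under the variant.
other = "/a/foo-bar/b1"
other_digit = digit_segment.sub("*", other)
other_strict = non_alpha_segment.sub("*", other)
print(other_digit)    # /a/foo-bar/*
print(other_strict)   # /a/*/*
```

Neither pattern needs look-around, so both should be usable in a single VRL replace call; which one fits depends on whether digit-free segments such as "foo-bar" also count as high cardinality.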