How to extract everything after a ] using regex in R

Time:01-19

I have a column with many rows. Each value in the column looks like this:

[Testing Data 123-INDEPENDENCE, MO] 99 *2-5PLT

I want to use str_extract to extract everything after the ], so the output should be 99 *2-5PLT.

Thanks for your help.

CodePudding user response:

This will work:

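library(stringr)

a <- "[Testing Data 123-INDEPENDENCE, MO] 99 *2-5PLT"
# (?<=\\] ) is a lookbehind: it requires "] " before the match without including it
str_extract(a, "(?<=\\] )(.*)")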

[1] "99 *2-5PLT"

Here we use a lookbehind to find the closing bracket (and the space that follows it), then match everything after it:

https://regex101.com/r/Aq9D1p/1
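Since the question is about a whole column, here is a minimal sketch of the same lookbehind applied to a data frame. The data frame `df`, the column name `raw`, and the two extra rows are made up for illustration; only the first value comes from the question.

```r
library(stringr)

# Hypothetical data frame: `df`, `raw` and the last two rows are illustrative assumptions
df <- data.frame(
  raw = c("[Testing Data 123-INDEPENDENCE, MO] 99 *2-5PLT",
          "[Another Label] ABC 42",
          "no bracket here"),
  stringsAsFactors = FALSE
)

# str_extract is vectorised, so the whole column is processed in one call.
# Rows where the pattern does not match come back as NA.
df$extracted <- str_extract(df$raw, "(?<=\\] ).*")

print(df$extracted)
```

Note that a row without "] " yields NA rather than the original string, so it may be worth checking for NA afterwards.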

Edit: you could also do something like:

a %>% str_split_fixed(., "] ", n = 2)

     [,1]                                 [,2]        
[1,] "[Testing Data 123-INDEPENDENCE, MO" "99 *2-5PLT"

CodePudding user response:

Also a base R solution:

regmatches(a, regexpr("\\[[^[]*\\]\\s+\\K.*", a, perl = TRUE))

"99 *2-5PLT"

CodePudding user response:

You can drop everything up to and including the ] and the whitespace after it.

Using sub in base R -

x <- "[Testing Data 123-INDEPENDENCE, MO] 99 *2-5PLT"
sub('.*\\]\\s+', '', x)
#[1] "99 *2-5PLT"

Similarly, with stringr::str_remove -

stringr::str_remove(x, '.*\\]\\s+')
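One thing to keep in mind with the removal approach: the `.*` is greedy, so it removes everything up to the last ] in the string, and strings with no ] are returned unchanged. A small base R sketch of both behaviours, where the second and third test strings are made up for illustration:

```r
# Only the first element is from the question; the others are illustrative assumptions
x <- c("[Testing Data 123-INDEPENDENCE, MO] 99 *2-5PLT",
       "[A] [B] tail",
       "no bracket here")

# Greedy: removes through the LAST "]" plus the whitespace after it
greedy <- sub(".*\\]\\s+", "", x)

# Lazy (needs perl = TRUE): removes only through the FIRST "]"
lazy <- sub("^.*?\\]\\s+", "", x, perl = TRUE)

print(greedy)
print(lazy)
```

If the values can contain more than one bracketed part, the lazy version is probably the one you want.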