I am trying to extract stock symbols from a body of text. These matches usually come in the following forms:
(<symbol>) => (VOO)
(<market>:<symbol>) => (NASDAQ:C)
In the sample cases shown above, I'd like to match VOO and C, skipping everything else. This regex gets me halfway there:
(?<=\()(.*?)(?=\))
With this, I match what's included within the parentheses, but the logic that ignores "noise" like NASDAQ: eludes me. I'd love to learn how to conditionally specify this pattern/logic.
Any ideas? Thanks!
CodePudding user response:
You can use
[A-Z]+(?=\))
Details:
[A-Z]+ - one or more uppercase ASCII letters
(?=\)) - a positive lookahead that matches a location immediately followed by a ) char.
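The question does not name a language, so here is a minimal sketch in Python, assuming the standard `re` module; the function name `symbols_before_paren` and the sample text are made up for illustration. It applies the lookahead pattern and also shows its main limitation: it only checks for a closing parenthesis, not an opening one.

```python
import re

# Lookahead pattern: uppercase letters immediately followed by ")".
LOOKAHEAD_RE = re.compile(r"[A-Z]+(?=\))")

def symbols_before_paren(text):
    """Return every run of uppercase ASCII letters that sits right before a ')'."""
    return LOOKAHEAD_RE.findall(text)

sample = "I hold (VOO) and (NASDAQ:C) right now."
print(symbols_before_paren(sample))

# The pattern never checks for "(", so a stray "FOO)" is matched as well.
print(symbols_before_paren("stray FOO) here"))
```

If the text can contain uppercase words followed by a closing parenthesis that are not symbols, the capturing-group variant is likely the safer choice.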
Alternatively, you can use the following to capture the values into Group 1:
\((?:[^():]*:)?([A-Z]+)\)
Details:
\( - a ( char
(?:[^():]*:)? - an optional sequence of zero or more chars other than (, ) and :, followed by a : char
([A-Z]+) - Group 1: one or more uppercase ASCII letters
\) - a ) char.
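As a rough sketch of the capturing-group approach, again assuming Python's `re` module (the helper name `extract_symbols` and the sample text are hypothetical), `findall` returns only Group 1 when the pattern has a single capturing group, so the optional market prefix is dropped automatically:

```python
import re

# "(" + optional "<market>:" prefix + uppercase symbol (Group 1) + ")".
SYMBOL_RE = re.compile(r"\((?:[^():]*:)?([A-Z]+)\)")

def extract_symbols(text):
    """Return the symbols found in (SYMBOL) or (MARKET:SYMBOL) forms."""
    # With exactly one capturing group, findall yields that group's text.
    return SYMBOL_RE.findall(text)

sample = "I hold (VOO) and (NASDAQ:C), but not (see note) or FOO)."
print(extract_symbols(sample))
```

Because both parentheses are part of the pattern, ordinary parenthesized prose such as "(see note)" and unbalanced fragments such as "FOO)" are skipped.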
