I'm trying to capture category_id for purely numeric values, and this works. I also need to capture category_name. For category_name, I need to capture up to the next space, or include spaces if the value starts with a double quote.
Sample user input string:
python c:192 c:1Stackoverflow c:"Stack Overflow2"
The desired captures are 192 for category_id and the two values below for category_name.
Expected output:
1Stackoverflow
Stack Overflow2
The category_name must contain at least one non-digit, but can be all alpha with no digits.
This regex partially works:
/c:(?<category_name>(?:")(?!\d+)[^"]+(?:")|(?!\d+)[^ ]+)/g
It doesn't capture the input 1Stackoverflow, but it does capture the quoted one. I then need to remove the quotes afterwards:
(x.groups?.[key] ?? '').replace(/^\"/, '').replace(/\"$/, '')
The (?!\d+) lookahead is an attempt to avoid clashing with category_id, but it does not appear to be working.
How can I capture category_name in both forms (one word and quote-delimited) without the quotes in the capture, and have it work with a leading digit?
CodePudding user response:
To capture all three values with both named groups in one regex, use:
c:(?:(?<category_id>\d+\b)|(?<category_name>\w+|"[^"]*"))
RegEx Breakdown:
c:: Match c:
(?:: Start non-capture group
(?<category_id>\d+\b): Named capture group category_id to match 1+ digits followed by a word boundary
|: OR
(?<category_name>\w+|"[^"]*"): Named capture group category_name to match 1+ word characters or a quoted text
): End non-capture group
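As a rough sketch of how this pattern could be applied in JavaScript, the snippet below runs it over the sample input from the question with matchAll. The helper name parseCategories is hypothetical, and the quote stripping is an extra step assumed here, since category_name still includes the quotes for the quoted form.

```javascript
// Sample input from the question.
const input = 'python c:192 c:1Stackoverflow c:"Stack Overflow2"';

// Pattern from this answer, with the g flag so matchAll can iterate over all matches.
const pattern = /c:(?:(?<category_id>\d+\b)|(?<category_name>\w+|"[^"]*"))/g;

// parseCategories is a hypothetical helper name used only for this illustration.
function parseCategories(text) {
  const ids = [];
  const names = [];
  for (const match of text.matchAll(pattern)) {
    const { category_id, category_name } = match.groups;
    if (category_id !== undefined) {
      ids.push(category_id);
    } else if (category_name !== undefined) {
      // Quoted names are captured with their quotes, so strip them here.
      names.push(category_name.replace(/^"/, '').replace(/"$/, ''));
    }
  }
  return { ids, names };
}

const result = parseCategories(input);
console.log(result);
```

For c:1Stackoverflow the category_id branch fails because there is no word boundary between 1 and S, so the engine falls through to category_name.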
CodePudding user response:
If you want to remove the quotes immediately, I would suggest using two different named groups for category_name, one for the quoted form and one for the unquoted form:
c:(?:"(?<category_name_q>[^"]+)"|(?<category_name>(?:\d*[a-zA-Z]+)))
(category_name_q contains the previously quoted matches, but without quotes)
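A minimal sketch of the two-group idea in JavaScript, assuming the quote characters sit outside category_name_q so that the group holds only the text between them. The helper name categoryNames is hypothetical; it simply returns whichever of the two groups matched.

```javascript
// Sample input from the question.
const sample = 'python c:192 c:1Stackoverflow c:"Stack Overflow2"';

// Assumed form of the pattern: the quotes are matched outside category_name_q.
const namePattern = /c:(?:"(?<category_name_q>[^"]+)"|(?<category_name>\d*[a-zA-Z]+))/g;

// categoryNames is a hypothetical helper name used only for this illustration.
function categoryNames(text) {
  // Exactly one of the two groups is defined for each match.
  return Array.from(
    text.matchAll(namePattern),
    (m) => m.groups.category_name_q ?? m.groups.category_name
  );
}

const names = categoryNames(sample);
console.log(names);
```

Note that c:192 produces no match at all here, because the unquoted branch requires at least one letter after the optional digits. The unquoted branch also stops at the first digit that follows a letter, so a value such as c:a1b would yield only a.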
