I need to capture 4 groups from:
John.7200_24.6.txt.gz
Output:
Group1: John
Group2: 7200
Group3: 24
Group4: 6
Here is my regex: ([^.|_|data|gz]+)
It captures a single group with multiple matches. How can I fix it?
CodePudding user response:
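This pattern ([^.|_|data|gz]+) can be written as ([^._datagz|]+), which uses a negated character class to match 1+ chars other than the single chars listed. Inside a character class, | is a literal character rather than alternation, so "data" and "gz" are read as individual letters, not as words.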
You are using a single capture group that matches repeatedly, which works like a split. If you want 4 separate groups, you should create 4 groups and match instead of split.
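To see why the original pattern gives one group with several matches, here is a minimal Python sketch. It assumes the quantifier on the character class is + and uses the filename from the question; the variable names are only illustrative.

```python
import re

filename = "John.7200_24.6.txt.gz"

# Both spellings exclude the same single characters: . | _ d a t g z
original = re.findall(r"([^.|_|data|gz]+)", filename)
simplified = re.findall(r"([^._datagz|]+)", filename)

print(original)    # ['John', '7200', '24', '6', 'x']
print(simplified)  # ['John', '7200', '24', '6', 'x']
```

Note the stray 'x': it is what remains of "txt" once the excluded letter t is removed, which shows that the class works on single characters and not on the words "data" or "gz".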
^(\w+)\.(\d+)_(\d+)\.(\d+)
^          Start of string
(\w+)\.    Capture 1+ word chars in group 1 and match a literal .
(\d+)_     Capture 1+ digits in group 2 and match _
(\d+)\.    Capture 1+ digits in group 3 and match a literal .
(\d+)      Capture 1+ digits in group 4
Or matching the full example string:
^(\w+)\.(\d+)_(\d+)\.(\d+)\.\w+\.gz$
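As a rough illustration, the two patterns could be tried in Python like this. The question does not name a language, so Python is an assumption, and extract_groups, prefix_pattern and full_pattern are hypothetical names chosen for the example.

```python
import re

filename = "John.7200_24.6.txt.gz"

# Pattern that only anchors at the start of the string
prefix_pattern = re.compile(r"^(\w+)\.(\d+)_(\d+)\.(\d+)")
# Pattern that must also match the trailing extension and .gz
full_pattern = re.compile(r"^(\w+)\.(\d+)_(\d+)\.(\d+)\.\w+\.gz$")


def extract_groups(pattern, text):
    """Return the captured groups as a tuple, or None if there is no match."""
    match = pattern.match(text)
    return match.groups() if match else None


print(extract_groups(prefix_pattern, filename))  # ('John', '7200', '24', '6')
print(extract_groups(full_pattern, filename))    # ('John', '7200', '24', '6')

# The full pattern rejects a name without the .gz suffix
print(extract_groups(full_pattern, "John.7200_24.6.txt"))  # None
```

The shorter pattern is enough when only the four values matter; the full pattern is the stricter choice when names that do not end in .gz should be rejected.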
