Regex needs to capture two patterns in the same group
Sample data:
mixexecutor:check_atom_exists:740 - requested to check this machine : **ET_colBackDDW_Temp**
output_of_reports/PII/36478_**ABP_BAL_liquidpressure**-**20210831-123456**-**20210831-172355**.bat.yz
Both values belong to the same column, and I need to identify the highlighted values.
Expected output:
**ET_colBackDDW_Temp** --> group 1
**ABP_BAL_liquidpressure** --> group 1, 20210831-123456 --> group 2, 20210831-172355 --> group 3
I tried the regex below. The regex does not need to take the specific words into account.
^.+:(.+)$|(?:[0-9]{4,5}_|-)((?:[a-zA-Z0-9_]*)(?:-[0-9]{1,7})?)-([0-9]{8}-[0-9]{6})-([0-9]{8}-[0-9]{6})
With the regex above, the two values are captured in different groups. I am using PySpark.
CodePudding user response:
To get the values in 1 or 3 groups using a single pattern, you might use:
^.*?([A-Z]\w*_\w+)(?:-([0-9]{8}-[0-9]{6})-([0-9]{8}-[0-9]{6}))?
The pattern matches:
- `^` Start of string
- `.*?` Match as few chars as possible
- `(` Capture group 1
  - `[A-Z]\w*_` Match A-Z, optional word chars and `_`
  - `\w+` Match 1+ word chars
- `)` Close group 1
- `(?:` Non capture group
  - `-` Match literally
  - `([0-9]{8}-[0-9]{6})` Capture group 2
  - `-` Match `-`
  - `([0-9]{8}-[0-9]{6})` Capture group 3
- `)?` Close non capture group and make it optional
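As a quick check of the pattern, here is a minimal sketch using Python's `re` module rather than PySpark. It assumes the sample data is two separate rows and that the `**` markers are only highlighting, not part of the data. The names `PATTERN`, `rows`, and `extract_groups` are made up for this illustration.

```python
import re

# Pattern from the answer; the optional non-capture group holds groups 2 and 3.
PATTERN = re.compile(
    r"^.*?([A-Z]\w*_\w+)"
    r"(?:-([0-9]{8}-[0-9]{6})-([0-9]{8}-[0-9]{6}))?"
)

# The two sample rows from the question, assumed to be separate column values
# and written without the ** highlighting.
rows = [
    "mixexecutor:check_atom_exists:740 - requested to check this machine : ET_colBackDDW_Temp",
    "output_of_reports/PII/36478_ABP_BAL_liquidpressure-20210831-123456-20210831-172355.bat.yz",
]


def extract_groups(text):
    """Return (group 1, group 2, group 3); groups that did not match are None."""
    match = PATTERN.match(text)
    return match.groups() if match else (None, None, None)


results = [extract_groups(row) for row in rows]
for groups in results:
    print(groups)
# ('ET_colBackDDW_Temp', None, None)
# ('ABP_BAL_liquidpressure', '20210831-123456', '20210831-172355')
```

In PySpark, the same pattern string could presumably be passed to `regexp_extract(column, pattern, idx)` with `idx` set to 1, 2, and 3 in turn. That function returns an empty string, not null, when the regex or the requested group does not match, so the first row would give empty strings for groups 2 and 3.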
