I am trying to write a Python script that uses a regex to match every string containing two - and one . but I also want to exclude two strings from the match: NIST-Privacy-v1.1 and NIST-CSF-v1.1.
Here is my sample data:
NIST-Privacy-v1.1
NIST-CSF-v1.1
AWS-CIS-v1.4-1.10
SOC-2-CC6.8
NIST-800-53rev5
NIST-800-53rev5-CM-3(1)
NIST-800-53rev5-AU-12(4)
SOC-2-CC6.1
NISTPrivacyFramework-v1.0
NISTPrivacyFramework-v1.0-PR.AC-P1
NIST-800-53rev5-AC-1
NIST-800-53rev5-IA-1
I started with a very simple regex that matches what I need but doesn't exclude the two strings. Can you help me with the exclusion part?
regex:
.*-.*-.*[.|\-].*
Desired output:
AWS-CIS-v1.4-1.10
SOC-2-CC6.8
NIST-800-53rev5-CM-3(1)
NIST-800-53rev5-AU-12(4)
SOC-2-CC6.1
NISTPrivacyFramework-v1.0-PR.AC-P1
NIST-800-53rev5-AC-1
NIST-800-53rev5-IA-1
CodePudding user response:
^(?!NIST-Privacy-v1\.1)(?!NIST-CSF-v1\.1).*-.*-.*[.-].*$
Output:
AWS-CIS-v1.4-1.10
SOC-2-CC6.8
NIST-800-53rev5-CM-3(1)
NIST-800-53rev5-AU-12(4)
SOC-2-CC6.1
NISTPrivacyFramework-v1.0-PR.AC-P1
NIST-800-53rev5-AC-1
NIST-800-53rev5-IA-1
Demo: https://regex101.com/r/gM9e44/1
Explanation:
^ => Given pattern must start from the beginning of the line
[.-] => "-" or "."
^(?!NIST-Privacy-v1\.1) => It must not start with "NIST-Privacy-v1.1"
^(?!NIST-Privacy-v1\.1)(?!NIST-CSF-v1\.1) => It must not start with "NIST-Privacy-v1.1" or "NIST-CSF-v1.1"
$ => Given pattern must finish at the end of the line
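Since the question asks for a Python script, here is a minimal sketch of how this pattern could be applied with the standard re module. The names PATTERN, SAMPLES and filter_ids are only illustrative and do not come from the question; the sample values are the ones posted above.

```python
import re

# Pattern from this answer: two negative lookaheads, then the original match.
PATTERN = re.compile(r"^(?!NIST-Privacy-v1\.1)(?!NIST-CSF-v1\.1).*-.*-.*[.-].*$")

# Sample data copied from the question.
SAMPLES = [
    "NIST-Privacy-v1.1",
    "NIST-CSF-v1.1",
    "AWS-CIS-v1.4-1.10",
    "SOC-2-CC6.8",
    "NIST-800-53rev5",
    "NIST-800-53rev5-CM-3(1)",
    "NIST-800-53rev5-AU-12(4)",
    "SOC-2-CC6.1",
    "NISTPrivacyFramework-v1.0",
    "NISTPrivacyFramework-v1.0-PR.AC-P1",
    "NIST-800-53rev5-AC-1",
    "NIST-800-53rev5-IA-1",
]


def filter_ids(values):
    """Return the values that match PATTERN, keeping their original order."""
    return [value for value in values if PATTERN.match(value)]


# Print one matching value per line.
for item in filter_ids(SAMPLES):
    print(item)
```

Compiling the pattern once and reusing it avoids repeating the regex string at every call site; re.match already anchors at the start of the string, so the leading ^ is redundant here but harmless.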
CodePudding user response:
You may use this regex for your job:
^(?!NIST-(?:CSF|Privacy)-v1\.1$)(?:[^-]*-){2}.*[.-].*
RegEx Breakup:
^: Start
(?!NIST-(?:CSF|Privacy)-v1\.1$): Negative lookahead to fail the match when the input is NIST-Privacy-v1.1 or NIST-CSF-v1.1
(?:[^-]*-){2}: Match 0 or more non-hyphen characters followed by a hyphen. Repeat this group 2 times
.*[.-]: Match any text followed by a dot or hyphen
.*: Match 0 or more of any text
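One practical difference from the earlier answer is the $ inside the lookahead: this pattern rejects only the two exact strings, while the earlier one rejects anything that starts with them. The sketch below illustrates that in Python, assuming the standard re module. The helper is_wanted and the value HYPOTHETICAL_ID are made up for illustration and are not part of the question's data.

```python
import re

# Earlier answer: excludes any string that *starts with* the two names.
PREFIX_EXCLUDE = re.compile(r"^(?!NIST-Privacy-v1\.1)(?!NIST-CSF-v1\.1).*-.*-.*[.-].*$")

# This answer: the $ inside the lookahead excludes only the two *exact* strings.
EXACT_EXCLUDE = re.compile(r"^(?!NIST-(?:CSF|Privacy)-v1\.1$)(?:[^-]*-){2}.*[.-].*")

# Hypothetical identifier (not in the question) that extends an excluded name.
HYPOTHETICAL_ID = "NIST-Privacy-v1.1-A.1"


def is_wanted(value, pattern=EXACT_EXCLUDE):
    """Return True when the value matches the given compiled pattern."""
    return pattern.match(value) is not None


print(is_wanted("NIST-Privacy-v1.1"))              # exact excluded string
print(is_wanted("SOC-2-CC6.8"))                    # ordinary wanted value
print(is_wanted(HYPOTHETICAL_ID))                  # kept by the exact-match lookahead
print(is_wanted(HYPOTHETICAL_ID, PREFIX_EXCLUDE))  # dropped by the prefix lookahead
```

Which behaviour is preferable depends on whether longer identifiers that begin with NIST-Privacy-v1.1 or NIST-CSF-v1.1 should be kept; the question's sample data does not contain such a case.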
