I feel like this is trivial but can't find any solution that works for me.
I have a string like this:
cn=doc_medical,ou=tged,ou=groupes,o=choregie,c=fr|cn=test,ou=test,ou=test,o=choregie,c=fr|cn=doc_confidentiel,ou=tged,ou=groupes,o=choregie,c=fr|cn=test,ou=test,ou=test,o=choregie,c=fr
I need to find the value between cn= and ,ou=tged,ou=groupes,o=choregie,c=fr. In this case I should match only doc_medical first and then doc_confidentiel.
I have this regex: (?=cn=)(.*?)(?<=,ou=tged,ou=groupes,o=choregie,c=fr) but the problem is that it obviously matches everything from the second cn= of the global string until the next ,ou=tged,ou=groupes,o=choregie,c=fr. So my second match is wrong because it contains cn=test,ou=test,ou=test,o=choregie,c=fr|cn=doc_confidentiel,ou=tged,ou=groupes,o=choregie,c=fr instead of only doc_confidentiel.
I don't know how many characters there can be between the two strings, and I can't figure out how to force the regex to start at the last cn= before the ,ou=tged,ou=groupes,o=choregie,c=fr string instead of the first cn= it encounters.
CodePudding user response:
You can use
(?<=cn=)[^,|]+(?=,ou=tged,ou=groupes,o=choregie,c=fr)
See the regex demo.
Details:
(?<=cn=) - a location immediately preceded with cn=
[^,|]+ - one or more chars other than | and ,
(?=,ou=tged,ou=groupes,o=choregie,c=fr) - a positive lookahead that requires a ,ou=tged,ou=groupes,o=choregie,c=fr string to appear immediately to the right of the current location.
See the Java demo:
import java.util.*;
import java.util.regex.*;

class Test
{
    public static void main (String[] args)
    {
        // Lookbehind for "cn=", then one or more chars that are neither ',' nor '|',
        // then a lookahead for the fixed suffix.
        String regex = "(?<=cn=)[^,|]+(?=,ou=tged,ou=groupes,o=choregie,c=fr)";
        String string = "cn=doc_medical,ou=tged,ou=groupes,o=choregie,c=fr|cn=test,ou=test,ou=test,o=choregie,c=fr|cn=doc_confidentiel,ou=tged,ou=groupes,o=choregie,c=fr|cn=test,ou=test,ou=test,o=choregie,c=fr";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(string);
        // Print every match; the lookarounds are zero-width, so group(0) is just the value.
        while (matcher.find()) {
            System.out.println(matcher.group(0));
        }
    }
}
Output:
doc_medical
doc_confidentiel
NOTE: If there can be another attribute whose name ends in cn (that is, with more chars on the left), use a word boundary: (?<=\bcn=)[^,|]+(?=,ou=tged,ou=groupes,o=choregie,c=fr). In Java, String regex = "(?<=\\bcn=)[^,|]+(?=,ou=tged,ou=groupes,o=choregie,c=fr)";.
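To illustrate the note above, here is a minimal sketch comparing the two patterns. The attribute name dcn and the values foo and bar are made up for the demonstration, and findAll is just a hypothetical helper, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryDemo {
    static final String SUFFIX = ",ou=tged,ou=groupes,o=choregie,c=fr";

    // Hypothetical helper: collects every full match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // "dcn" is an invented attribute name that merely ends in "cn".
        String input = "dcn=foo" + SUFFIX + "|cn=bar" + SUFFIX;

        // Without \b, the lookbehind also accepts the "cn=" at the end of "dcn=".
        System.out.println(findAll("(?<=cn=)[^,|]+(?=" + SUFFIX + ")", input));    // [foo, bar]

        // With \b, "cn" must start at a word boundary, so "dcn=" is rejected.
        System.out.println(findAll("(?<=\\bcn=)[^,|]+(?=" + SUFFIX + ")", input)); // [bar]
    }
}
```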
CodePudding user response:
We can use a regex replacement approach here. Note that, because the leading .* is greedy, this returns only the last matching cn value:
String input = "cn=doc_medical,ou=tged,ou=groupes,o=choregie,c=fr|cn=test,ou=test,ou=test,o=choregie,c=fr|cn=doc_confidentiel,ou=tged,ou=groupes,o=choregie,c=fr|cn=test,ou=test,ou=test,o=choregie,c=fr";
String cn = input.replaceAll(".*\\bcn=([^,]+),ou=tged,ou=groupes,o=choregie,c=fr.*", "$1");
System.out.println(cn); // doc_confidentiel
Note that in your current regex pattern, which uses lookarounds, you seemed to be confusing lookbehinds with lookaheads. But, the approach I gave above doesn't even need lookarounds.
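To make the lookaround point concrete, here is a small sketch that runs the pattern from the question next to a version with the lookbehind and lookahead swapped into their usual places. The class name and the findAll helper are only illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundOrderDemo {
    static final String SUFFIX = ",ou=tged,ou=groupes,o=choregie,c=fr";
    static final String INPUT =
        "cn=doc_medical" + SUFFIX + "|cn=test,ou=test,ou=test,o=choregie,c=fr"
        + "|cn=doc_confidentiel" + SUFFIX + "|cn=test,ou=test,ou=test,o=choregie,c=fr";

    // Illustrative helper: collects every full match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Question's pattern: the lookahead (?=cn=) does not consume "cn=", and the
        // lookbehind does not exclude the suffix, so both end up inside the match.
        String original = "(?=cn=)(.*?)(?<=" + SUFFIX + ")";

        // Swapped: look BEHIND for "cn=", look AHEAD for the suffix, and keep the
        // value from crossing ',' or '|'.
        String swapped = "(?<=cn=)[^,|]+(?=" + SUFFIX + ")";

        for (String s : findAll(original, INPUT)) {
            System.out.println("original: " + s);
        }
        for (String s : findAll(swapped, INPUT)) {
            System.out.println("swapped:  " + s);
        }
    }
}
```

The second match of the original pattern starts at cn=test and runs to the end of the doc_confidentiel entry, which is the behaviour described in the question.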
CodePudding user response:
You could use a capture group and, for example, keep the match from crossing a pipe | char:
\bcn=([^|]*),ou=tged,ou=groupes,o=choregie,c=fr\b
If the value you want is the first one right after cn=, excluding commas instead could also work:
\bcn=([^,]*),ou=tged,ou=groupes,o=choregie,c=fr\b
Explanation
\bcn= Match the word cn and then =
([^,]*) Capture group 1, match zero or more chars other than a comma
,ou=tged,ou=groupes,o=choregie,c=fr\b Match the literal string, followed by a word boundary
For example
String regex = "\\bcn=([^,]*),ou=tged,ou=groupes,o=choregie,c=fr\\b";
String string = "cn=doc_medical,ou=tged,ou=groupes,o=choregie,c=fr|cn=test,ou=test,ou=test,o=choregie,c=fr|cn=doc_confidentiel,ou=tged,ou=groupes,o=choregie,c=fr|cn=test,ou=test,ou=test,o=choregie,c=fr";
Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
Output
doc_medical
doc_confidentiel
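If the suffix is not always the same, one possible variation is to pass it in as a parameter and escape it with Pattern.quote, so that any regex metacharacters in it are treated literally. This is only a sketch: the class CnExtractor and the method extractCn are hypothetical names, and the trailing (?=\||$) assumes that entries are separated by | as in the sample string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CnExtractor {
    // Hypothetical helper: returns every cn value whose entry ends with the given suffix.
    static List<String> extractCn(String input, String suffix) {
        Pattern pattern = Pattern.compile(
            "\\bcn=([^,|]+)"            // group 1: the cn value, no ',' or '|'
            + Pattern.quote(suffix)     // the suffix, matched literally
            + "(?=\\||$)");             // assumption: the entry ends at '|' or end of input
        Matcher matcher = pattern.matcher(input);
        List<String> values = new ArrayList<>();
        while (matcher.find()) {
            values.add(matcher.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        String suffix = ",ou=tged,ou=groupes,o=choregie,c=fr";
        String input = "cn=doc_medical,ou=tged,ou=groupes,o=choregie,c=fr|cn=test,ou=test,ou=test,o=choregie,c=fr|cn=doc_confidentiel,ou=tged,ou=groupes,o=choregie,c=fr|cn=test,ou=test,ou=test,o=choregie,c=fr";
        System.out.println(extractCn(input, suffix)); // [doc_medical, doc_confidentiel]
    }
}
```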
