I have the following EDI file and need to match each block that starts at a LOC 11 segment, without starting a new match at the LOC 7 segment. Each match should contain all the segments up to the next LOC 11, so the LOC 7 segment stays inside the match instead of splitting it.
At the moment my regex is LOC[^L]*(?:L(?!OC)[^L]*)*, but it gives me 4 results because it also matches at the LOC 7 elements.
I only need the 2 results. Could you help me?
> NAD ST 14::92 Test' LOC 11 KOD23277::92' LOC 7 D77::92:Test' LIN 1
> test AP:IN'IMD F 12::272:K
> RIPPsadasdRIEM'RFF ON:EN10514492'RFF AAN:501'
> DTM 171:20220309:102'RFF AIF:500'DTM 171:20220305:102'CTA SC 12414:test,
> test'COM [email protected]:EM'
> COM ? 49-561-490-4173:TE'COM ? 49-561-490-84173:FX' QTY 83:1000:PCE'
> QTY 70:66850:PCE'DTM 51:20080101:102'
> QTY 72:0:PCE'DTM 52:20080101:102'
> QTY 194:1000:PCE'DTM 50:20220224:102'
> RFF AAU:2143276'DTM 171:20220218:102'
> QTY 194:1000:PCE'DTM 50:20220202:102'
> RFF AAU:2138944'DTM 171:20220131:102'
> QTY 194:1000:PCE'DTM 50:20220105:102'
> RFF AAU:2138943'DTM 171:20220103:102' SCC 24'
> QTY 113:1000:PCE'DTM 2:20220412:102'
> QTY 113:1000:PCE'DTM 2:20220503:102'
> QTY 113:1000:PCE'DTM 64:20220530:102'DTM 63:20220605:102'
> QTY 113:1000:PCE'DTM 64:20220620:102'DTM 63:20220626:102'
> QTY 113:1000:PCE'DTM 64:20220711:102'DTM 63:20220717:102'
> QTY 113:1000:PCE'DTM 64:20220801:102'DTM 63:20220807:102' GEI 3 37'
>
> NAD ST 14::92 test' LOC 11 KOD823226::92' LOC 7 D86::92:Test' LIN 2
> test H:IN'IMD F 12::272:K
> RIPPRIEM'RFF ON:EN10662318'RFF AAN:266'DTM 171:20220309:102'
> RFF AIF:265'DTM 171:20220305:102'CTA SC 12414:test,
> test'COM [email protected]:EM'
> COM ? 49-561-490-4173:TE'COM ? 49-561-490-84173:FX' QTY 83:200:PCE'
> QTY 70:14319:PCE'DTM 51:20100101:102'
> QTY 72:0:PCE'DTM 52:20100101:102' QTY 194:200:PCE'DTM 50:20220126:102'
> RFF AAU:2146871'DTM 171:20220121:102'
> QTY 194:200:PCE'DTM 50:20211210:102'RFF AAU:2146914'DTM 171:20211209:102' QTY 194:200:PCE'DTM 50:20211129:102'RFF AAU:2139927'DTM 171:20211124:102'SCC 24'
> QTY 113:200:PCE'DTM 2:20220503:102'
> QTY 113:200:PCE'DTM 64:20220606:102'DTM 63:20220612:102'
> QTY 113:200:PCE'DTM 64:20220718:102'DTM 63:20220724:102'
> QTY 113:200:PCE'DTM 64:20220829:102'DTM 63:20220904:102'
> QTY 113:200:PCE'DTM 64:20221010:102'DTM 63:20221016:102'
>
> UNT 142 1'UNZ 1 2756'
CodePudding user response:
You can use either of these patterns:
LOC\ 11[^L]*(?:L(?!OC\ 11)[^L]*)*
LOC\ 11[\w\W]*?(?=LOC\ 11|$)
Details:
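LOC\ 11 - the LOC 11 string
[^L]*(?:L(?!OC\ 11)[^L]*)* - any text up to the next occurrence of the LOC 11 substring (uses the unroll-the-loop principle)
[\w\W]*?(?=LOC\ 11|$) - in the second pattern, any characters, as few as possible, up to the next LOC 11 substring or the end of the string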
Although the results you get with the two patterns above are identical, the first one is much faster provided there are not too many Ls that are not followed by OC 11.
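To check that both patterns return the same two blocks, here is a minimal sketch in Python. The question does not say which regex engine is used, so Python's `re` module is an assumption, and the `edi` string is a shortened, made-up sample in the same shape as the data above rather than the real file.

```python
import re

# Shortened, hypothetical sample shaped like the EDI data in the question.
edi = (
    "NAD ST 14::92 Test' LOC 11 KOD23277::92' LOC 7 D77::92:Test' LIN 1 "
    "QTY 83:1000:PCE' "
    "NAD ST 14::92 test' LOC 11 KOD823226::92' LOC 7 D86::92:Test' LIN 2 "
    "QTY 83:200:PCE' UNT 142 1'UNZ 1 2756'"
)

# Unrolled-loop pattern: consume non-L characters, and an L only when it
# does not start another "LOC 11".
unrolled = re.compile(r"LOC\ 11[^L]*(?:L(?!OC\ 11)[^L]*)*")

# Lazy pattern: consume as little as possible up to the next "LOC 11"
# or the end of the string.
lazy = re.compile(r"LOC\ 11[\w\W]*?(?=LOC\ 11|$)")

blocks = unrolled.findall(edi)
lazy_blocks = lazy.findall(edi)

for block in blocks:
    print(repr(block))
```

Each match runs up to the next LOC 11, so the first block also contains the NAD segment that precedes the second LOC 11. If that trailing segment is unwanted, it would have to be trimmed in a separate step.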
