Notepad++ delete everything outside the " quotes with regular expression

Time:01-06

I have searched and tried a lot so far, but I think it is better to ask.

Situation:

I have a file with multiple lines (around 5000) like this:

Testtext und hier steht noch mehr!
Franz jagt in einem total verwahrlosten Taxi durch die Eifel
Hier ist eine Zeile mit einer Information, die Information ist "ABC12345"
Zu der Information gibt es eine 2te Zeile mit einer weiteren Information "Info1|Info2|Info3"
Dann kommt noch eine ueberfluessige Zeile

... and I only need the information ABC12345 and Info1|Info2|Info3.

I want to delete everything that is not between the quotes. Every information block has the same structure.

I also want to delete the lines that contain no relevant information,

so that I get:

ABC12345
Info1|Info2|Info3
ZYX9876
Info9|Info7|Info5

or the same with quotes (that's not important)

I tried searching with the regex \"(.*?)\" and that works fine: it finds everything inside the quotes.

My next step was to delete everything that is NOT \"(.*?)\" using search and replace.

But I do not understand how to negate this.

(?!(\"(.*?)\")) doesn't work.

I think this is easy for a specialist to solve, so please help me.
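
For readers who want to check this outside the editor, here is a minimal Python sketch (Python and the variable names are assumptions, not part of the question). It suggests why the negative lookahead deletes nothing: a lookahead is zero-width, so every match is an empty string and replacing it with nothing leaves the text unchanged. Collecting what the original pattern captures is one alternative to negating it.

```python
import re

# Sample lines taken from the question.
text = (
    'Testtext und hier steht noch mehr!\n'
    'Hier ist eine Zeile mit einer Information, die Information ist "ABC12345"\n'
    'Zu der Information gibt es eine 2te Zeile mit einer weiteren Information "Info1|Info2|Info3"\n'
    'Dann kommt noch eine ueberfluessige Zeile\n'
)

# A lookahead consumes no characters, so every match is empty
# and substituting it with "" leaves the text as it was.
unchanged = re.sub(r'(?!"(.*?)")', '', text)

# Instead of negating the pattern, collect what it captures.
quoted = re.findall(r'"(.*?)"', text)
print('\n'.join(quoted))
```

This only covers the extraction itself; doing the same inside a search-and-replace dialog needs a pattern that also consumes the unwanted characters.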

CodePudding user response:

  • Ctrl+H
  • Find what: ".+?"(*SKIP)(*F)|.
  • Replace with: LEAVE EMPTY
  • CHECK Regular expression
  • UNCHECK . matches newline
  • Replace All

Explanation:

".+?"           # matches something between quotes
(*SKIP)(*F)     # skip what was just matched and fail, so the quoted text is kept
|               # OR
.               # any single character (removed by the empty replacement)
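
Python's built-in `re` module does not support the `(*SKIP)(*F)` verbs, so the sketch below imitates the same idea with a plain alternation and a replacement callback: the quoted alternative is tried first and kept, and any other single character is dropped. It is an illustration under that assumption, not the editor's regex engine; `keep_quoted` and `strip_outside_quotes` are hypothetical names, and the last step (dropping the lines left empty) goes beyond what the regex in this answer does.

```python
import re

def keep_quoted(match):
    # Group 1 is set only when the quoted alternative matched.
    return match.group(1) if match.group(1) is not None else ''

def strip_outside_quotes(text):
    # The quoted alternative is tried first; "." (which does not match
    # a newline) catches every other character on the line.
    stripped = re.sub(r'(".+?")|.', keep_quoted, text)
    # Extra step not in the answer: discard lines that became empty.
    return '\n'.join(line for line in stripped.split('\n') if line)

sample = (
    'Franz jagt in einem total verwahrlosten Taxi durch die Eifel\n'
    'Hier ist eine Zeile mit einer Information, die Information ist "ABC12345"\n'
    'Dann kommt noch eine ueberfluessige Zeile\n'
)
print(strip_outside_quotes(sample))
```

Putting the "keep" alternative before the catch-all is what makes the order of alternatives matter here, much as the position of `(*SKIP)(*F)` does in the original pattern.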

CodePudding user response:

My solution for this case, using lines closer to the original files:

Some Text with the word Job
Some other with job
Job has been sent to "ABC1234"
Job owner's username "Info1|Info2"

Find: .*(\"(.*?)\").*|.*Job.*[\r]?[\n]|.*job.*[\r]?[\n]
Replace: $1

"Job" and "job" also appear in the lines I don't need. As a result I get only the two information lines, repeated X times, with no empty rows.

Thanks to everyone; all the information you provided helped me find the final solution!
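
The same replace-with-`$1` idea can be sketched in Python, where the backreference is written `\1`. The pattern below is a tidied variant rather than the exact search string (a single `|` between alternatives, and `[Jj]ob` covering both spellings); `lines`, `pattern` and `result` are illustrative names.

```python
import re

# The four sample lines from this answer.
lines = (
    'Some Text with the word Job\n'
    'Some other with job\n'
    'Job has been sent to "ABC1234"\n'
    'Job owner\'s username "Info1|Info2"\n'
)

# First alternative: a line containing quotes, reduced to the quoted part.
# Second alternative: a line mentioning Job/job, removed with its line break.
pattern = re.compile(r'.*("(.*?)").*|.*[Jj]ob.*\r?\n')

# \1 is empty when the second alternative matched, so those lines vanish.
result = pattern.sub(r'\1', lines)
print(result)
```

Because the first alternative does not consume the line break, the kept lines stay on separate rows, while the second alternative consumes it and so leaves no empty rows behind.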
