I have a list of phone numbers (25 million) that I want to use as an input file. Say I have an email/phone database and I want to extract only the phone numbers that also appear in the input file (25 million). How can I do that in EmEditor, or for large files in general?
CodePudding user response:
To extract all matched strings
Suppose you have a 25 million phone number list (file A) and a phone-email database file (file B).
- Open file B and use a regular expression to extract the phone numbers only. To do this, press Ctrl+F to bring up the Find dialog box, set the Regular Expressions option, and depending on the phone number format, enter [0-9]{3,3}-[0-9]{3,3}-[0-9]{4,4} or \([0-9]{3,3}\)[0-9]{3,3}-[0-9]{4,4} in the Find box. Click the Extract button, and save the phone-numbers-only database as a new file (file C).
- Open file A and select Tab (or any CSV format) in the CSV toolbar (or select the Edit menu - CSV - Tab separated).
- Open file C and select the same CSV format as for file A.
- Click the Join CSV button on the Sort toolbar (or select the Edit menu - CSV - Join) to bring up the Join CSV dialog box.
- Select A.txt and set the Unique Key option as CSV Document 1, and select C.txt and set the Unique Key option as CSV Document 2.
- Select Whole strings match as the Conditions, and set the Match Case option.
- Deselect A.txt from the list box, and ensure C.txt is selected.
- Click Join Now. A new document will be created with all matched strings. Save this file as file D.
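For the "any large file" part of the question, the same extract-and-join idea can be sketched outside EmEditor. The following is only a rough Python sketch, not what EmEditor does internally. It assumes file A has one phone number per line, that both files are UTF-8, and that the numbers use one of the two formats from the regular expressions above. The function names and the A.txt / B.txt / D.txt paths are placeholders made up for this example. It also assumes file A fits in memory as a set, which for 25 million numbers is likely to need a few gigabytes of RAM.

```python
import re

# Same two formats as the regular expressions above:
# 123-456-7890 or (123)456-7890.
PHONE_RE = re.compile(
    r"[0-9]{3}-[0-9]{3}-[0-9]{4}|\([0-9]{3}\)[0-9]{3}-[0-9]{4}"
)


def load_phone_list(path):
    """Read a file with one phone number per line into a set."""
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}


def matched_phones(db_lines, wanted):
    """Return the distinct phone numbers in db_lines that are also in
    wanted, in the order they are first seen."""
    seen = set()
    result = []
    for line in db_lines:
        for phone in PHONE_RE.findall(line):
            if phone in wanted and phone not in seen:
                seen.add(phone)
                result.append(phone)
    return result


def extract_matched_phones(a_path, b_path, d_path):
    """Write every phone number found in both file A and file B to file D."""
    wanted = load_phone_list(a_path)
    # File B is read line by line, so only file A has to fit in memory.
    with open(b_path, encoding="utf-8") as b, \
            open(d_path, "w", encoding="utf-8") as d:
        for phone in matched_phones(b, wanted):
            d.write(phone + "\n")
```

Calling extract_matched_phones("A.txt", "B.txt", "D.txt") would then play the role of files A, B and D above. Note that this sketch writes each matched number only once, and the comparison is an exact, case-sensitive string match, so numbers formatted differently in the two files will not match.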
To extract all matched lines
If file D is small enough, you can use Advanced Filter to filter file B with the contents of file D.
- Copy the file D contents to the Clipboard. (To do this, open file D with EmEditor, press Ctrl+A, then Ctrl+C.)
- Open file B with EmEditor, and click Advanced Filter on the Filter toolbar.
- Right-click on the list box, and paste the Clipboard contents.
- While all items in the list box are selected, make sure the Logical Disjunction (OR) to the Previous Condition option is set.
- Click the Filter button, and click the Close button if necessary.
- Click the Extract All button to extract all matched lines to a new document.
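If file D is too large for this approach, or if you want to do the same thing outside EmEditor, a small script is one option. The following Python sketch is an assumption-laden illustration rather than an equivalent of Advanced Filter: it assumes UTF-8 files, one phone number per line in the list file, and the two phone formats from the regular expressions earlier in this answer. The function names and file paths are hypothetical. Unlike a plain substring filter, it extracts whole phone numbers from each line and looks them up in a set.

```python
import re

# Assumed phone formats: 123-456-7890 or (123)456-7890.
PHONE_RE = re.compile(
    r"[0-9]{3}-[0-9]{3}-[0-9]{4}|\([0-9]{3}\)[0-9]{3}-[0-9]{4}"
)


def matching_lines(db_lines, wanted):
    """Yield each database line that contains at least one phone
    number present in the wanted set."""
    for line in db_lines:
        if any(phone in wanted for phone in PHONE_RE.findall(line)):
            yield line


def extract_matching_lines(list_path, db_path, out_path):
    """Copy the lines of db_path that mention a number from list_path."""
    with open(list_path, encoding="utf-8") as f:
        wanted = {line.strip() for line in f if line.strip()}
    # The database is streamed line by line; only the list is held in memory.
    with open(db_path, encoding="utf-8") as db, \
            open(out_path, "w", encoding="utf-8") as out:
        out.writelines(matching_lines(db, wanted))
```

For example, extract_matching_lines("D.txt", "B.txt", "matched_lines.txt") would write the matching lines of file B to a new file. Set lookups keep the cost per line roughly constant regardless of how many numbers are in the list.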

