So I'm using the below code to scrape a CSV of Business Names & Website domains (about 10,000) for "mailto:" links and trying to save those to a CSV when mailto links are found. But occasionally I run into "temporary dns lookup failur" and "connection time out" errors.
I need help figuring out how to go about having it "Skip" when the request function throws these errors (any error) and just continue down the list.
import csv
import requests
from bs4 import BeautifulSoup
import numpy as np
results = []
agency_names = ['Agency Name']
agency_websites = ['Website']
agency_emails = ['Email Address']
with open('agencies_clean.csv') as csvfile:
reader = csv.reader(csvfile) # change contents to floats
count = 0;
for row in reader: # each row is a list
if count != 0:
if row[1] != "":
print("working on " row[1] "...")
page = requests.get('http://' row[1])
soup = BeautifulSoup(page.content, "html.parser")
mailtos = soup.select('a[href^=mailto]')
if mailtos:
agency_names.append(row[0])
agency_websites.append(row[1])
agency_emails.append(mailtos[0].text)
print('Completed[x] Company:' row[0] 'Email: ' mailtos[0].text)
count=count 1
np.savetxt('scrapes/agencies_w_emails.csv', [p for p in zip(agency_names, agency_websites)], delimiter=',', fmt='%s')
CodePudding user response:
This might work. The print in except are optional.
try:
mailtos:
agency_names.append(row[0])
agency_websites.append(row[1])
agency_emails.append(mailtos[0].text)
print('Completed[x] Company:' row[0] 'Email: ' mailtos[0].text)
except:
print('email error')
continue
CodePudding user response:
You're probably looking for something similar to:
for row in reader:
try:
// your verification code here
except DNSException:
continue
I'm not sure about the exception name you should use, I think you can read it from the python interpreter's output and replace DNSException with it. A part from that, the main idea is to use continue to pass to the next element of the iteration.
A generic exceptions handler like this:
try:
// something
except:
pass
is usually not a good coding practice.
