How to make Python print the URL and its corresponding status code?

Time:02-01

I'd like Python to go through a text file and check the status code of each URL in it, then create a new text file in which each line contains a URL and its corresponding status code. How can I do that? My incomplete script is below:

import requests
with open("jack.txt") as fid:
    url_lines = set(fid)
for url in url_lines:
    response = requests.get(url)
    status_code = response.status_code

CodePudding user response:

import csv
import requests

# read the URLs, stripping the trailing newline from each line
with open("jack.txt") as fid:
    url_lines = set(line.strip() for line in fid)

# open the output file in write mode
# (rename url-status-codes.csv to whatever filename you want)
with open('url-status-codes.csv', 'w', encoding='UTF8', newline='') as f:
    # create the csv writer
    writer = csv.writer(f)
    for url in url_lines:
        response = requests.get(url)
        status_code = response.status_code
        # writes one row in the format url,status_code
        writer.writerow([url, status_code])
        # if you'd rather have just a space between them, use this line instead
        # writer.writerow([f"{url} {status_code}"])

# the with block closes the file automatically

CodePudding user response:

You're really close; one approach you could take is to save all the results in a list, open a new file and write out the contents of said list:

import requests

with open("jack.txt") as fid:
    url_lines = set(line.rstrip() for line in fid)

results = []
for url in url_lines:
    response = requests.get(url)
    status_code = response.status_code
    results.append((url, status_code))

with open("results.txt", "w") as f:
    for url, status_code in results:
        f.write(f"{url} {status_code}\n")

If jack.txt is:

https://stackoverflow.com
https://google.com
https://docs.python.org/3/library/magic

results.txt might look like:

https://google.com 200
https://stackoverflow.com 200
https://docs.python.org/3/library/magic 404

Note that the order of the lines in results.txt doesn't necessarily match the order in jack.txt. This is because url_lines is a set, and sets are unordered.
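
If the output order should follow jack.txt, one option is to deduplicate with dict.fromkeys instead of a set, since dictionaries preserve insertion order in Python 3.7+. The sketch below is only an illustration: the helper names unique_urls, fetch_status and check_urls, the 10-second timeout, and the choice to record None when a request fails are all assumptions, not part of the original script.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def unique_urls(lines: Iterable[str]) -> List[str]:
    """Strip whitespace, skip blank lines, drop duplicates in first-seen order."""
    stripped = (line.strip() for line in lines)
    # dict keys keep insertion order (Python 3.7+), unlike set elements
    return list(dict.fromkeys(url for url in stripped if url))


def fetch_status(url: str) -> Optional[int]:
    """Return the HTTP status code, or None if the request itself fails."""
    # imported here so unique_urls() can be used without requests installed
    import requests

    try:
        # the timeout value is an assumption; pick what suits your URLs
        return requests.get(url, timeout=10).status_code
    except requests.RequestException:
        # covers connection errors, timeouts, malformed URLs, etc.
        return None


def check_urls(
    urls: Iterable[str],
    fetch: Callable[[str], Optional[int]] = fetch_status,
) -> List[Tuple[str, Optional[int]]]:
    """Pair each URL with its status code, keeping the input order."""
    return [(url, fetch(url)) for url in urls]
```

To use it, pass the open file to unique_urls(fid), call check_urls on the returned list, and write the pairs out with the same loop as above. A URL whose request failed would then show up with None in place of a status code.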
