Using requests in Python to download an xls file

Time:01-23

On this page there is a link to download an xls file (under "Adjuntos", i.e. attachments): https://www.banrep.gov.co/es/emisiones-vigentes-el-dcv

The link to download the xls file is: https://www.banrep.gov.co/sites/default/files/paginas/emisiones/EMISIONES.xls

I was using this code to automatically download that file:

import os
import requests

path = os.path.abspath(os.getcwd())  # directory where the file will be saved

path = path.replace("\\", '/') + '/'

url = 'https://www.banrep.gov.co/sites/default/files/paginas/emisiones/EMISIONES.xls'

myfile = requests.get(url, verify=False)

open(path + 'EMISIONES.xls', 'wb').write(myfile.content)

This code was working well, but suddenly the downloaded file started coming out corrupted.

If I run the code, it raises this warning:

InsecureRequestWarning: Unverified HTTPS request is being made to host 'www.banrep.gov.co'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings warnings.warn(

Thanks for your help

CodePudding user response:

The error is related to how your request is built. The status code returned by the request is 403 (Forbidden). You can check it by typing:

myfile.status_code
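A 403 response still carries a body, so the code in the question would write that error page to disk under the name EMISIONES.xls, which would likely explain the "corrupted" file. A minimal sketch of a guard follows. `save_response` and `XLS_MAGIC` are hypothetical names, and the signature check assumes the server normally returns a genuine binary .xls (an OLE2 container) rather than an HTML table renamed to .xls.

```python
from pathlib import Path

# Signature of the OLE2 container used by legacy binary .xls files.
# Assumption: the real file is a binary .xls, not HTML renamed to .xls.
XLS_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def save_response(response, dest):
    """Write response.content to dest only if the download looks valid.

    `response` is anything with `status_code` and `content` attributes,
    such as the object returned by requests.get().
    """
    if response.status_code != 200:
        raise RuntimeError(f"download failed with HTTP status {response.status_code}")
    if not response.content.startswith(XLS_MAGIC):
        raise RuntimeError("response body does not look like a binary .xls file")
    dest = Path(dest)
    dest.write_bytes(response.content)
    return dest
```

With the code from the question this would be called as save_response(myfile, path + 'EMISIONES.xls'), so a failed request stops with an error and no file is written.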

I guess the security issue is related to the cookies and headers in your GET request. I suggest looking at which headers the browser sends when it requests that URL from the web page.

TIP: open your web browser's developer tools and use the Network tab to identify the headers.
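To compare against what the browser shows, it can help to look at the headers that requests itself sent and received. A small sketch, where `header_report` is a hypothetical helper and `response` is assumed to behave like the object returned by requests.get():

```python
def header_report(response):
    """Collect what was sent and what came back for one request."""
    return {
        "status": response.status_code,
        # Headers requests actually put on the wire (User-Agent, Accept, ...).
        "sent": dict(response.request.headers),
        # Headers the server answered with (Content-Type, Set-Cookie, ...).
        "received": dict(response.headers),
    }
```

Calling header_report(myfile) on the response from the question and comparing the "sent" entry with the browser's request headers should show which ones are missing or different.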

To handle the cookies, use requests.Session, so that cookies set by an earlier page on www.banrep.gov.co are stored and sent automatically with later requests:

session_ = requests.Session()
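A sketch of how the session idea could be put together: visit the landing page first, then request the file with the same session. The header values in `BROWSER_HEADERS` are hypothetical placeholders to be replaced with the ones seen in the browser, `fetch_via_session` is a made-up name, and there is no guarantee that this alone gets past the 403 if the site applies other checks.

```python
# Hypothetical browser-like headers; copy the real values from the Network tab.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Accept-Language": "es-CO,es;q=0.9,en;q=0.8",
}

LANDING_URL = "https://www.banrep.gov.co/es/emisiones-vigentes-el-dcv"
FILE_URL = "https://www.banrep.gov.co/sites/default/files/paginas/emisiones/EMISIONES.xls"


def fetch_via_session(session, landing_url=LANDING_URL, file_url=FILE_URL, timeout=30):
    """Visit the landing page so the session stores its cookies,
    then request the file with the same session and headers.

    `session` is expected to behave like requests.Session().
    """
    session.headers.update(BROWSER_HEADERS)
    # First request: lets the server set any cookies it wants on the session.
    session.get(landing_url, timeout=timeout)
    # Second request: sent with those cookies plus a Referer pointing at the page.
    response = session.get(file_url, headers={"Referer": landing_url}, timeout=timeout)
    response.raise_for_status()  # raises on 4xx/5xx responses such as 403
    return response.content
```

With the real library this would be called as fetch_via_session(requests.Session()), and the returned bytes written to EMISIONES.xls in binary mode.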

Before coding, you could test your requests using Postman or another REST API testing tool.

I hope that helps.

CodePudding user response:

pip install pyopenssl seemed to solve it for me.
