I am trying to scrape my data from a website that requires a login but I keep getting the following error:
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>MethodNotAllowed</Code><Message>The specified method is not allowed against this resource.</Message><Method>POST</Method><ResourceType>OBJECT</ResourceType><RequestId>DCVJZ8D4R3PK45M1</RequestId><HostId>PIra5vNbfC5d1TfFZ3hABXk9eIsKwtJm5bYH4Bozu4nS4InkGEILNflPPzdvT9hUpQOPaW0AZBA=</HostId></Error>
Python Script
import requests
loginurl = ("https://cbscarrickonsuir.app.vsware.ie/")
secure_url = ("https://cbscarrickonsuir.app.vsware.ie/11571471/behaviour")
payload = {"username":"REMOVED","password":"REMOVED","source":"web"}
r = requests.post(loginurl, data=payload)
print(r.text)
Had to remove username and password as this is a working website.
I don't know how to do this. I followed a 
It would be a good idea to click through the requests in the XHR section of the browser's Network tab and inspect the headers, request, and response. That shows exactly which API endpoint you should be using, which HTTP method it expects, the request body format, and the kind of response you will receive.
Edit: You will also probably need persistent sessions to scrape any data that requires you to log in first. Go through these:
- Python Requests and persistent sessions
- https://requests.kennethreitz.org/en/master/user/advanced/#session-objects
CodePudding user response:
There are two mistakes in your code.
- You send the data to the main page, but the browser sends it to https://cbscarrickonsuir.vsware.ie/tokenapiV2/login
- You send the data as form data, but the browser sends it as JSON, so you need json=payload instead of data=payload
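To make the form-versus-JSON difference concrete, here is a minimal sketch that uses only the standard library to build the two request bodies. It only approximates what requests puts on the wire for data=payload and json=payload (requests also sets the matching Content-Type header for you), and the variable names form_body and json_body are just for illustration.

```python
import json
from urllib.parse import urlencode

payload = {"username": "REMOVED", "password": "REMOVED", "source": "web"}

# Roughly the body sent by requests.post(url, data=payload),
# with Content-Type: application/x-www-form-urlencoded
form_body = urlencode(payload)

# Roughly the body sent by requests.post(url, json=payload),
# with Content-Type: application/json
json_body = json.dumps(payload)

print("form:", form_body)
print("json:", json_body)
```

A server that parses only JSON will usually not find any credentials in the first body, which is why the choice between data= and json= matters here.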
Another possible problem is that you don't use Session() to send cookies automatically. Most servers use cookies to remember that you are already logged in; if you don't send them, the server doesn't know that you are logged in.
import requests
url = "https://cbscarrickonsuir.app.vsware.ie/"
login_url = 'https://cbscarrickonsuir.vsware.ie/tokenapiV2/login'
payload = {
    "username": "none",
    "password": "[email protected]",
    "source": "web"
}
s = requests.Session()
r = s.post(login_url, json=payload)
print('status:', r.status_code)
print('--- text ---')
print(r.text)
print('----------------')
I don't have an account to log in with, but the request now returns status 401 with the message invalid_username_password:
status: 401
--- text ---
{"fieldErrors":[],"genericErrors":[{"messageKey":"invalid_username_password","metadata":null}]}
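Once the login succeeds, the same session can be reused for the page you actually want. The sketch below is only an outline under a few assumptions: fetch_protected_page is a hypothetical helper name, it assumes the server keeps the login state in cookies on the session, and it assumes the page from the question returns the data directly. Since the endpoint is called tokenapiV2, the site may instead return a token that has to be sent in a header, so check the Network tab to see what the browser does after logging in.

```python
def fetch_protected_page(session, login_url, page_url, payload):
    """Log in with a JSON body, then request page_url on the same session.

    `session` is expected to behave like requests.Session, so any cookies
    set by the login response are sent again with the second request.
    """
    login = session.post(login_url, json=payload)
    login.raise_for_status()  # raises on 4xx/5xx, e.g. the 401 shown above
    return session.get(page_url)


# Hypothetical usage (needs the requests package and real credentials):
#
#   import requests
#   with requests.Session() as s:
#       page = fetch_protected_page(
#           s,
#           "https://cbscarrickonsuir.vsware.ie/tokenapiV2/login",
#           "https://cbscarrickonsuir.app.vsware.ie/11571471/behaviour",
#           {"username": "REMOVED", "password": "REMOVED", "source": "web"},
#       )
#       print(page.status_code)
```

If the second request still comes back as not authorised, compare its headers with the ones the browser sends for the same request.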

