I'm sending a POST request to an API using scrapy.FormRequest and receiving a TextResponse object back. The body of this object looks like so:
{
"Refiners": ...
"Results": ...
}
I am only interested in the Results portion of the response as it contains HTML that I would like to parse.
As such, I am trying to create a new TextResponse object containing only the Results portion in the body, so that I can use the response.css method on it.
I tried the following and it yielded an empty response body. Any thoughts on why and how to fix this?
new_response = scrapy.http.TextResponse(response.json()["Results"])
CodePudding user response:
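You can use the HtmlResponse class, passing the body and encoding arguments to the constructor. Your attempt produced an empty body because the first positional argument of TextResponse is url, not body, so the Results string was taken as the URL and the body kept its empty default.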
from scrapy.http import HtmlResponse
new_response = HtmlResponse(url="some_url", body=response.json()["Results"], encoding="utf-8")
You can then use new_response.css(...) to select elements.
