Issue to scrape href using scrapy

Time:01-27

I am trying to scrape the href values, but I get an empty list.

import scrapy
from scrapy.http import Request


class PushpaSpider(scrapy.Spider):
    name = 'pushpa'
    start_urls = ['http://smartcatalog.emo-milano.com/it/catalogo/elenco-alfabetico/400/A']

    def parse(self, response):
        for href in response.xpath("//div[@class='exbox-name']/a/@href").extract():
            yield href
      

CodePudding user response:

Your XPath expression is wrong. It returns no results because it is a relative path (./). Change it to an absolute XPath: "//div[@class='exbox-name']/a/@href"

CodePudding user response:

OK, did you enable logging to identify the problem? For example, your parse function should yield an item, not a string. You can print the string, but yield something like an item:

yield {'url': href} 