I am new to web scraping and I'm trying to use Scrapy to scrape the release date from the following page: https://m.imdb.com/title/tt0468569/?ref_=adv_li_tt
This is the selector I am using:
//a[contains(@class,'ipc-metadata-list-item__list-content-item ipc-metadata-list-item__list-content-item--link')]/text()
It returns too many elements, and I just want the release date string.
CodePudding user response:
To be more specific and get only the text of the release date, adjust your XPath like this:
//li[contains(@data-testid,'title-details-releasedate')]//li/a/text()
It selects the <li> whose data-testid attribute contains title-details-releasedate. Because that <li> contains two <a> elements, the path narrows the match to the <a> that sits inside a nested <li>, which is the one holding the date text.
