Selenium (Python): retrieving both href and text of an anchor

The following code runs, and it shows that I can retrieve the text from a WebElement but not the href (get_attribute returns None). What am I doing wrong? The line that doesn't behave as expected is the next to last:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome(
    "/Users/bob/Documents/work/AIFA/scraper/scrape_gu/chromedriver"
)
wait = WebDriverWait(driver, 30)

driver.get("https://farmaci.agenziafarmaco.gov.it/bancadatifarmaci/cerca-farmaco")
readunderstood = driver.find_element_by_id("conf")
readunderstood.click()
accept = WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.XPATH, "/html/body/div[5]/div[3]/div/button"))
)
accept.click()
# end of the initial agreement screens and general preparation
##############################################################
SEARCH_STRING = "AB"  # we can safely assume this does not exist

find_textbox = driver.find_element_by_id("search")
find_textbox.clear()  # after the first search the old value will still be there
find_textbox.send_keys(SEARCH_STRING)
find_textbox.send_keys(Keys.ENTER)
# end of the search for a drug action
##############################################################
drugs_list = wait.until(
    EC.presence_of_all_elements_located(
        (By.XPATH, "//*[@id='ul_farm_results']/li[@style='display: list-item;']",)
    )
)
###### this is the part I don't understand
for drug in drugs_list:
    print(drug.get_attribute("href"))  # this should return a link, but returns None
    print(drug.text)  # this correctly prints 3 lines per drug

CodePudding user response：

The href attribute is not on the li elements your locator matches; it is on their child a elements, which is why get_attribute("href") returns None for each li.
To make the code work as you expect, you only have to extend the locator so it selects those a elements.
Please try this:

drugs_list = wait.until(EC.presence_of_all_elements_located((By.XPATH, "//*[@id='ul_farm_results']/li[@style='display: list-item;']/a")))

for drug in drugs_list:
    print(drug.get_attribute("href")) 
    print(drug.text)
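
If you would rather keep the original li locator (so that drug.text still gives the full three lines per drug), another option is to look up the child a inside each li. The following is only a minimal sketch: href_and_text is a hypothetical helper name, and it assumes every matched li contains at least one a element (otherwise find_element raises NoSuchElementException).

```python
def href_and_text(items, child_css="a"):
    """Return (href, text) pairs for a list of WebElement-like <li> items.

    Hypothetical helper: the href is read from the first child matching
    child_css, the text is read from the <li> itself.
    """
    pairs = []
    for item in items:
        # "css selector" is the string value of By.CSS_SELECTOR; it is spelled
        # out here only so that the sketch needs no Selenium import.
        link = item.find_element("css selector", child_css)
        pairs.append((link.get_attribute("href"), item.text))
    return pairs


# Assumed usage with the drugs_list from the question (the <li> locator):
# for href, text in href_and_text(drugs_list):
#     print(href)
#     print(text)
```

The helper only relies on find_element, get_attribute and text, so it should work with whatever list of li elements the wait returns.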

I would also advise you to use the visibility_of_element_located expected condition instead of presence_of_all_elements_located. Presence only means the elements exist in the DOM, where they may not be completely rendered yet; visibility additionally requires the element to be displayed. Note that visibility_of_element_located waits for a single element (the first match), so the full list is fetched with find_elements afterwards.
In this case your code could look like the following:

the_xpath = "//*[@id='ul_farm_results']/li[@style='display: list-item;']/a"
wait.until(EC.visibility_of_element_located((By.XPATH, the_xpath)))
drugs_list = driver.find_elements(By.XPATH, the_xpath)
for drug in drugs_list:
    print(drug.get_attribute("href")) 
    print(drug.text)
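
Because visibility_of_element_located only confirms that the first match is visible, some of the other anchors returned by find_elements could in principle still be hidden. A minimal sketch of filtering them with is_displayed() follows; visible_links is a hypothetical helper name, not part of Selenium.

```python
def visible_links(anchors):
    """Return (href, text) for each anchor that is currently displayed.

    Hypothetical helper: works on any list of WebElement-like objects that
    provide is_displayed(), get_attribute() and text.
    """
    return [
        (anchor.get_attribute("href"), anchor.text)
        for anchor in anchors
        if anchor.is_displayed()
    ]


# Assumed usage with the drugs_list from the snippet above:
# for href, text in visible_links(drugs_list):
#     print(href)
#     print(text)
```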