Finding a table row that matches the text in a cell with Python and Selenium

Time:01-22

I would like to access the row element of an HTML table in which the text of a certain cell matches my string 'Mathematik & Informatik'.

The HTML looks like this:

<table >
   <thead>
      <tr>
         <th  scope="col">
            Teilbibliothek          
         </th>
         <th  scope="col">
            Datum          
         </th>
         <th  scope="col">
            Zeitraum          
         </th>
         <th  scope="col">
         </th>
      </tr>
   </thead>
   <tbody>
      <tr >
         <td >
            Stammgelände          
         </td>
         <td >
            <span >Samstag, 22. Januar 2022</span>          
         </td>
         <td >
            09:00 – 14:30          
         </td>
         <td >
            ausgebucht          
         </td>
      </tr>
      <tr >
         <td >
            Stammgelände          
         </td>
         <td >
            <span >Samstag, 22. Januar 2022</span>          
         </td>
         <td >
            15:00 – 21:30          
         </td>
         <td >
            ausgebucht          
         </td>
      </tr>
      <tr >
         <td >
            Mathematik &amp; Informatik          
         </td>
         <td >
            <span >Samstag, 22. Januar 2022</span>          
         </td>
         <td >
            10:00 – 14:30          
         </td>
         <td >
            ausgebucht          
         </td>
      </tr>
      <tr >
         <td >
            Mathematik &amp; Informatik          
         </td>
         <td >
            <span >Samstag, 22. Januar 2022</span>          
         </td>
         <td >
            15:00 – 19:30          
         </td>
         <td >
            ausgebucht          
         </td>
      </tr>
      <tr >
         <td >
            Weihenstephan          
         </td>
         <td >
            <span >Samstag, 22. Januar 2022</span>          
         </td>
         <td >
            10:00 – 14:30          
         </td>
         <td >
            <a href="/reserve/1438527699">Zur Reservierung</a>          
         </td>
      </tr>
      <tr >
         <td >
            Weihenstephan          
         </td>
         <td >
            <span >Samstag, 22. Januar 2022</span>          
         </td>
         <td >
            15:00 – 19:30          
         </td>
         <td >
            <a href="/reserve/530262745">Zur Reservierung</a>          
         </td>
      </tr>
   </tbody>
</table>

I am using Python and Selenium and came up with the following bit of code to get the table row I want.

driver.find_elements(By.XPATH, "//table//tr/td[contains(text(),'Mathematik & Informatik')]/..")

This line returns a list with three elements. These are the two rows that match my string 'Mathematik & Informatik', but also another element that somehow has the text ' Mathematik & Informatik, Weihenstephan  8:00 – 14:3015:00 – 21:30 10:00 – 14:3015:00 – 19:30 '.

I do not understand what is wrong with my XPath (why it does not return only the two rows with the given text). Could you help me fix it, please?

Thanks for your help!

CodePudding user response:

I do not agree with @Prophet's solution: the cell text has leading and trailing whitespace, so an exact text()= comparison will not match any node. You need contains() or normalize-space().

I see only two matching nodes in the HTML that you've shared. However, you can tie the match to the cell's class like this:

//table//tr/td[contains(text(),'Mathematik & Informatik') and @class='views-field views-field-field-teilbibliothek']

Also, Selenium relies on the browser's XPath 1.0 engine and does not support XPath 2.0; if it did, we could have used ends-with().

To ignore the leading and trailing whitespace, use normalize-space():

//table//tr/td[normalize-space()='Mathematik & Informatik']/..
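To see why the surrounding whitespace matters, here is a minimal sketch that uses only Python's standard library on a trimmed copy of the table. The `normalize_space` helper is hypothetical and only approximates XPath's normalize-space() (Python's `split()` also treats some Unicode characters as whitespace); it is not part of Selenium.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the table from the question (two columns only).
TABLE = """
<table>
  <tbody>
    <tr>
      <td>
        Stammgelände
      </td>
      <td>09:00 – 14:30</td>
    </tr>
    <tr>
      <td>
        Mathematik &amp; Informatik
      </td>
      <td>10:00 – 14:30</td>
    </tr>
    <tr>
      <td>
        Mathematik &amp; Informatik
      </td>
      <td>15:00 – 19:30</td>
    </tr>
  </tbody>
</table>
"""


def normalize_space(text):
    """Roughly mimic XPath normalize-space(): trim and collapse whitespace runs."""
    return " ".join((text or "").split())


TARGET = "Mathematik & Informatik"
rows = ET.fromstring(TABLE).findall(".//tr")

# Comparing the raw text of the first cell fails: it still carries the
# newlines and indentation around the label.
exact = [r for r in rows if r.find("td").text == TARGET]

# Comparing the normalized text finds both rows.
normalized = [r for r in rows if normalize_space(r.find("td").text) == TARGET]

print(len(exact), len(normalized))
```

With this sample the raw comparison matches no row and the normalized comparison matches two, which mirrors the difference between text()='...' and normalize-space()='...' in the browser.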

First check in the browser's DOM how many nodes the expression matches:

Steps to check:

Press F12 in Chrome -> go to the Elements tab -> press CTRL+F -> paste the XPath and see whether your desired elements are highlighted, with the match counter showing e.g. 1 of 2.
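The same check can be done from Python, which also shows what the unexpected third element actually is. This is only a sketch: `describe_matches` is a hypothetical helper, and `driver` is assumed to be a Selenium WebDriver that has already loaded the page.

```python
def describe_matches(driver, xpath):
    """Return the whitespace-collapsed visible text of every element matching xpath."""
    # "xpath" is the string value behind Selenium's By.XPATH constant,
    # so this helper itself needs no Selenium import.
    elements = driver.find_elements("xpath", xpath)
    return [" ".join(el.text.split()) for el in elements]


# Usage with a real driver (assumed to exist):
# xpath = "//table//tr/td[contains(text(),'Mathematik & Informatik')]/.."
# for index, text in enumerate(describe_matches(driver, xpath)):
#     print(index, text)
```

Printing the text of each match makes it easier to spot a row that comes from a different table on the page, or a cell that merely contains the search string next to other text.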

CodePudding user response:

If you want to match cells whose text equals some value, rather than cells that merely contain it (possibly along with additional text), use an XPath expression that tests for equality.
So, instead of

driver.find_elements(By.XPATH, "//table//tr/td[contains(text(),'Mathematik & Informatik')]/..")

You can use

driver.find_elements(By.XPATH, "//table//tr/td[text()='Mathematik & Informatik']/..")

BTW, you can also locate the desired tr element directly, without stepping up from the td, as follows:

driver.find_elements(By.XPATH, "//table//tr[./td[text()='Mathematik & Informatik']]")
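
Since the cells in the posted HTML carry leading and trailing whitespace, a variant of this row-level expression that compares the normalized text may be more robust. The sketch below builds such an expression; `row_xpath` is a hypothetical helper, and restricting the test to the first column is an assumption based on the table shown in the question.

```python
def row_xpath(label, column=1):
    """Build an XPath 1.0 expression selecting rows whose cell in the given
    column equals label once surrounding whitespace is normalized."""
    # XPath 1.0 has no escape syntax inside string literals, so this sketch
    # simply rejects labels that contain a single quote.
    if "'" in label:
        raise ValueError("label must not contain a single quote")
    return f"//table//tr[td[{column}][normalize-space()='{label}']]"


# Usage with Selenium (driver is assumed to exist and be on the page):
# from selenium.webdriver.common.by import By
# rows = driver.find_elements(By.XPATH, row_xpath("Mathematik & Informatik"))
```

Pinning the comparison to one column keeps cells elsewhere in the row from producing a match.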