Trying to check if a tag exists in XML before parsing

Time:02-02

I need to check whether certain tags exist in an XML file before processing it; I'm using ElementTree in Python. I tried writing this:


import urllib.request
import xml.etree.ElementTree as ET

tgz_xml = "https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?id=PMC8300416"
response = urllib.request.urlopen(tgz_xml).read()
tree = ET.fromstring(response)


for OA in tree.findall('OA'):
  records = OA.find('records')
  if records is None:
    print('records missing')
  else:
    print('records found')

I need to check if the "records" tag exists. I don't get an error, but this doesn't print out anything. What did I do wrong? Thank you!

CodePudding user response:

ET.fromstring() returns the root element of the document, so the variable tree already points to the OA element. tree.findall('OA') searches only the direct children of that element, finds none named OA, and returns an empty list, so the loop body never runs. Remove the loop and call find('records') on the root directly:

import xml.etree.ElementTree as ET
from urllib.request import urlopen

tgz_xml = "https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi?id=PMC8300416"
with urlopen(tgz_xml) as conn:
  response = conn.read()

# fromstring() returns the root element, which here is <OA>
tree = ET.fromstring(response)

# find() looks only at the direct children of the root
records = tree.find('records')
if records is None:
  print('records missing')
else:
  print('records found')
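
To see the difference without a network call, here is a minimal offline sketch. The SAMPLE string is a made-up document that only imitates the general shape of the service's reply (the PMC0000000 id is a placeholder), and has_tag is a hypothetical helper, not part of ElementTree. It uses the './/' path prefix, which searches all descendants rather than just direct children:

```python
import xml.etree.ElementTree as ET

# Made-up sample that only imitates the shape of the reply (assumption).
SAMPLE = """<OA>
  <request id="PMC0000000"/>
  <records returned-count="1">
    <record id="PMC0000000"/>
  </records>
</OA>"""

root = ET.fromstring(SAMPLE)


def has_tag(element, tag):
  """Hypothetical helper: True if `tag` occurs anywhere below `element`."""
  return element.find(f".//{tag}") is not None


print(root.tag)                          # the root element itself is OA
print(root.findall("OA"))                # OA has no OA children: empty list
print(root.find("records") is not None)  # records is a direct child
print(has_tag(root, "record"))           # record is nested one level deeper
print(has_tag(root, "error"))            # not present in this sample
```

Note that has_tag only looks below the element it is given, so has_tag(root, "OA") would be False here for the same reason the original loop never ran.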