XML tree - extract the subtags values only if the main tag has a certain property value

Time:01-30

I am new to XML and am working on a parser. Let me first illustrate my problem:

<animals>
      <pet type="dog">
        <name>Jack</name>
        <name>Benny</name>
        <name>Will</name>
      </pet>
      <pet type="cat">
        <name>Luna</name>
        <name>Lilith</name>
        <name>Willow</name>
      </pet>
      <pet type="rabbit">
        <name>Lilly</name>
        <name>Robin</name>
      </pet>
</animals>

From this tree I would like to extract only the names of dogs (Jack, Benny, Will). I've tried:


name = file.getElementsByTagName('name')
pets = file.getElementsByTagName('pet')
for i in pets:
    if i.attributes['type'].value == "dog":
        for j in name:
            print(j.firstChild.data)

This prints every name, because the name list is collected from the whole document rather than from the matching pet only. My question is how to select just the pet tag with type="dog" and loop through it to get only the 3 names. I would like to stick with xml.dom.minidom and not use ElementTree. Thanks in advance!

CodePudding user response:

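Call getElementsByTagName on the matching pet element instead of on the whole document, so the search is limited to that element's descendants: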

from xml.dom.minidom import parseString

xml_string = '''
<animals>
      <pet type="dog">
        <name>Jack</name>
        <name>Benny</name>
        <name>Will</name>
      </pet>
      <pet type="cat">
        <name>Luna</name>
        <name>Lilith</name>
        <name>Willow</name>
      </pet>
      <pet type="rabbit">
        <name>Lilly</name>
        <name>Robin</name>
      </pet>
</animals>
'''

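# Parse the XML string into a minidom Document
root = parseString(xml_string)

for pet in root.getElementsByTagName('pet'):
    # Only handle <pet> elements whose type attribute is "dog"
    if pet.attributes['type'].value == 'dog':
        # Search within this pet element, not the whole document
        names = pet.getElementsByTagName('name')
        print([name.firstChild.data for name in names])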

Output:

['Jack', 'Benny', 'Will']
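
If the real file is less tidy than the sample, the same idea can be wrapped in a small helper. This is only a sketch under a few assumptions: the function name names_for_type and the sample string are made up for illustration, and it assumes some pet elements may lack a type attribute or contain an empty name tag. It uses getAttribute, which returns an empty string when the attribute is missing, whereas pet.attributes['type'] raises a KeyError in that case.

```python
from xml.dom.minidom import parseString


def names_for_type(xml_text, pet_type):
    """Return the text of each <name> inside <pet> elements of the given type.

    Hypothetical helper for illustration; not part of xml.dom.minidom.
    """
    doc = parseString(xml_text)
    result = []
    for pet in doc.getElementsByTagName('pet'):
        # getAttribute returns '' when the attribute is absent, so a <pet>
        # without a type attribute is skipped instead of raising KeyError
        if pet.getAttribute('type') != pet_type:
            continue
        # Search only the descendants of this <pet> element
        for name in pet.getElementsByTagName('name'):
            # An empty element such as <name/> has no child node to read
            if name.firstChild is not None:
                result.append(name.firstChild.data)
    return result


# Made-up sample with one empty <name/> and one <pet> lacking a type
sample = (
    '<animals>'
    '<pet type="dog"><name>Jack</name><name/></pet>'
    '<pet><name>Stray</name></pet>'
    '<pet type="cat"><name>Luna</name></pet>'
    '</animals>'
)

print(names_for_type(sample, 'dog'))  # ['Jack']
```

Note that getElementsByTagName searches all descendants, not just direct children, so a name tag nested deeper inside a matching pet would also be returned.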