I am new to XML and am working on a parser. Let me first illustrate my problem:
<animals>
    <pet type="dog">
        <name>Jack</name>
        <name>Benny</name>
        <name>Will</name>
    </pet>
    <pet type="cat">
        <name>Luna</name>
        <name>Lilith</name>
        <name>Willow</name>
    </pet>
    <pet type="rabbit">
        <name>Lilly</name>
        <name>Robin</name>
    </pet>
</animals>
From this tree I would like to extract only the names of dogs (Jack, Benny, Will). I've tried:
name = file.getElementsByTagName('name')
pets = file.getElementsByTagName('pet')
for i in pets:
    if (i.attributes['type'].value == "dog"):
        for j in name:
            print(j.firstChild.data)
I get every name, because the name list is built from the whole document rather than from the dog element. How can I select only the pet element whose type is "dog" and loop through it to get just those 3 names? I would like to stick to xml.dom.minidom and not use ElementTree. Thanks in advance!
CodePudding user response:
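Call getElementsByTagName('name') on the matching pet element instead of on the whole document, so that only that element's descendants are searched: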
from xml.dom.minidom import parseString
xml_string = '''
<animals>
<pet type="dog">
<name>Jack</name>
<name>Benny</name>
<name>Will</name>
</pet>
<pet type="cat">
<name>Luna</name>
<name>Lilith</name>
<name>Willow</name>
</pet>
<pet type="rabbit">
<name>Lilly</name>
<name>Robin</name>
</pet>
</animals>
'''
root = parseString(xml_string)
for pet in root.getElementsByTagName('pet'):
    if pet.attributes['type'].value == 'dog':
        # Search only inside this <pet> element, not the whole document
        names = pet.getElementsByTagName('name')
        print([name.firstChild.data for name in names])
Output:
['Jack', 'Benny', 'Will']
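
If you need this for more than one pet type, the same idea can be wrapped in a small function. This is only a sketch: the helper name names_for_pet_type and the shortened sample document are made up for illustration. It uses getAttribute instead of attributes['type'], since getAttribute returns an empty string when the attribute is missing, while indexing attributes raises a KeyError for a pet element without a type.

```python
from xml.dom.minidom import parseString


def names_for_pet_type(document, pet_type):
    """Return the text of every <name> inside <pet> elements of the given type.

    Hypothetical helper for illustration; not part of xml.dom.minidom.
    """
    names = []
    for pet in document.getElementsByTagName('pet'):
        # getAttribute returns '' when the attribute is absent, so a <pet>
        # without a type is skipped instead of raising KeyError.
        if pet.getAttribute('type') != pet_type:
            continue
        # Search only within this <pet> element.
        for name in pet.getElementsByTagName('name'):
            # An empty element such as <name/> has no text child; skip it.
            if name.firstChild is not None:
                names.append(name.firstChild.data)
    return names


# Shortened sample document (an assumption, not the full XML from the question).
sample = (
    '<animals>'
    '<pet type="dog"><name>Jack</name><name>Benny</name></pet>'
    '<pet><name>Stray</name></pet>'
    '<pet type="cat"><name>Luna</name></pet>'
    '</animals>'
)
doc = parseString(sample)
print(names_for_pet_type(doc, 'dog'))
```

With the sample above this prints ['Jack', 'Benny'], and asking for a type that does not occur returns an empty list.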
