I need to remove all id attributes from an XML document using Python. This will be part of a larger app, and the result will be the input for some later transformations.
Example input:
<body>
<r1 format="bold" id="NODE1">
<r2 title="Test" id="NODE2">
<r3 group="123" type="Operation" id="NODE3">
<rtit id="NODE4">Evaluate the temperature</rtit>
<procedure id="NODE5">
<procstep id="NODE6">
<graphelem id="NODE7">
<graphic graphicname="T123456" res_width="3.58in" scale="70" id="NODE8"/>
</graphelem>
<proct>Remove the screws. Remove the plates.</proct>
</procstep>
<procstep id="NODE9">
<graphelem id="NODE10">
<graphic graphicname="T654321" res_width="3.58in" scale="70" id="NODE11"/>
</graphelem>
<proct>Fix the thermocouple in the cover.</proct>
</procstep>
</procedure>
</r3>
</r2>
</r1>
</body>
The source files have more than 1000 lines, and more than 30 different XML tags that contain the id attribute.
The expected result is:
<body>
<r1 format="bold">
<r2 title="Test">
<r3 group="123" type="Operation">
<rtit>Evaluate the temperature</rtit>
<procedure>
<procstep>
<graphelem>
<graphic graphicname="T123456" res_width="3.58in" scale="70"/>
</graphelem>
<proct>Remove the screws. Remove the plates.</proct>
</procstep>
<procstep>
<graphelem>
<graphic graphicname="T654321" res_width="3.58in" scale="70"/>
</graphelem>
<proct>Fix the thermocouple in the cover.</proct>
</procstep>
</procedure>
</r3>
</r2>
</r1>
</body>
I've tried to use XSLT to copy everything except the id attribute, but without success.
Can anyone help me with this issue, please?
CodePudding user response:
I need to remove all id attributes from XML using Python.
Something like the following should work: loop over all elements and delete the 'id' attribute wherever it is present.
import xml.etree.ElementTree as ET
xml = '''<body><r1 format="bold" id="NODE1">
<r2 title="Test" id="NODE2">
<r3 group="123" type="Operation" id="NODE3">
<rtit id="NODE4">Evaluate the temperature</rtit>
<procedure id="NODE5">
<procstep id="NODE6">
<graphelem id="NODE7">
<graphic graphicname="T123456" res_width="3.58in" scale="70" id="NODE8"/>
</graphelem>
<proct>Remove the screws. Remove the plates.</proct>
</procstep>
<procstep id="NODE9">
<graphelem id="NODE10">
<graphic graphicname="T654321" res_width="3.58in" scale="70" id="NODE11"/>
</graphelem>
<proct>Fix the thermocouple in the cover.</proct>
</procstep>
</procedure>
</r3>
</r2>
</r1>
</body>'''
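root = ET.fromstring(xml)
# Visit every element in the tree (root included) and drop its 'id' attribute if present
for elem in root.iter():
    if 'id' in elem.attrib:
        del elem.attrib['id']
# Print the resulting XML to stdout
ET.dump(root)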
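Since the real input is a file of 1000+ lines rather than an inline string, the same loop can be wrapped in a small helper that reads one file and writes another. This is only a sketch: the names strip_attribute and strip_ids_from_file are made up for illustration, and the demo uses in-memory buffers in place of real file paths.

```python
import io
import xml.etree.ElementTree as ET


def strip_attribute(tree, name="id"):
    """Remove attribute `name` from every element in place; return how many were removed."""
    removed = 0
    for elem in tree.iter():
        # pop() with a default does nothing on elements that lack the attribute
        if elem.attrib.pop(name, None) is not None:
            removed += 1
    return removed


def strip_ids_from_file(src, dst):
    """Parse `src` (path or binary file object), drop all id attributes, write to `dst`."""
    tree = ET.parse(src)
    count = strip_attribute(tree)
    tree.write(dst, encoding="utf-8", xml_declaration=True)
    return count


# Demo with in-memory buffers; in the app these would be file paths (assumed).
source = io.BytesIO(b'<body><r1 format="bold" id="NODE1"><rtit id="NODE4">x</rtit></r1></body>')
target = io.BytesIO()
print(strip_ids_from_file(source, target), "id attributes removed")
print(target.getvalue().decode("utf-8"))
```

Two things to keep in mind: ElementTree's default parser discards XML comments, so they will not appear in the output, and a namespaced attribute such as xml:id is stored under a different key and is not touched by this loop.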
