I have the following XML file.
<dos>
<tot>
<diagram type="tot" ns="1">
<point e="-3.000000000" d="2.000000000"/>
<point e="-2.993993994" d="4.000000000"/>
<point e="-2.987987988" d="5.000000000"/>
<point e="-2.981981982" d="0.600000000"/>
<point e="-2.963963964" d="0.600000000"/>
</diagram>
</tot>
<part type="par" species="1">
<diagram ns="1" l="0" m="0">
<point e="-3.000000000" d="0.002000000"/>
<point e="-2.993993994" d="0.300000000"/>
<point e="-2.987987988" d="4.000000000"/>
<point e="-2.981981982" d="0.90000000"/>
</diagram>
<diagram ns="1" l="1" m="-1">
<point e="-3.000000000" d="0.005000000"/>
<point e="-2.993993994" d="0.040000000"/>
<point e="-2.987987988" d="0.0700000000"/>
<point e="-2.981981982" d="0.800000000"/>
</diagram>
</part>
<part type="par" species="2">
<diagram ns="1" l="0" m="0">
<point e="-3.000000000" d="2.002000000"/>
<point e="-2.993993994" d="3.300000000"/>
<point e="-2.987987988" d="1.000000000"/>
<point e="-2.981981982" d="2.90000000"/>
</diagram>
<diagram ns="1" l="1" m="-1">
<point e="-3.000000000" d="3.005000000"/>
<point e="-2.993993994" d="4.040000000"/>
<point e="-2.987987988" d="5.0700000000"/>
<point e="-2.981981982" d="2.800000000"/>
</diagram>
</part>
</dos>
I would like to get all points in each "diagram" block and preferably save them in different variables. Using the following simple code, I could extract all of these values.
import numpy as np
from lxml import etree
from xml.dom import minidom

filedoss = './PDOS_RhSi/tmp.xml'
file = minidom.parse(filedoss)
tot = file.getElementsByTagName('tot')
# Collect every <point> in the document, regardless of which <diagram> it belongs to
pointsid = file.getElementsByTagName('point')
d_id = np.zeros((len(pointsid), 2), dtype=float)
for i in range(len(pointsid)):
    d_id[i, 0] = pointsid[i].attributes['e'].value
    d_id[i, 1] = pointsid[i].attributes['d'].value
print(d_id)
which has the output of
[[-3.00000000e+00 2.00000000e+00]
 [-2.99399399e+00 4.00000000e+00]
 [-2.98798799e+00 5.00000000e+00]
 [-2.98198198e+00 6.00000000e-01]
 [-2.96396396e+00 6.00000000e-01]
 [-3.00000000e+00 2.00000000e-03]
 [-2.99399399e+00 3.00000000e-01]
 [-2.98798799e+00 4.00000000e+00]
 [-2.98198198e+00 9.00000000e-01]
 [-3.00000000e+00 5.00000000e-03]
 [-2.99399399e+00 4.00000000e-02]
 [-2.98798799e+00 7.00000000e-02]
 [-2.98198198e+00 8.00000000e-01]
 [-3.00000000e+00 2.00200000e+00]
 [-2.99399399e+00 3.30000000e+00]
 [-2.98798799e+00 1.00000000e+00]
 [-2.98198198e+00 2.90000000e+00]
 [-3.00000000e+00 3.00500000e+00]
 [-2.99399399e+00 4.04000000e+00]
 [-2.98798799e+00 5.07000000e+00]
 [-2.98198198e+00 2.80000000e+00]]
However, reading the XML file this way merges all five blocks into a single array. How can I read the file so that the points end up in five separate arrays, for example "tot", "par_species1_l0_m0", "par_species1_l1_m-1", "par_species2_l0_m0" and "par_species2_l1_m-1"?
CodePudding user response:
If I understand you correctly, this should get what you are after (or close enough):
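from lxml import etree

# xpath() and getparent() are lxml features, so parse with lxml rather than minidom
file = etree.parse(filedoss)
diags = file.xpath('//diagram')
for diag in diags:
    atrs = diag.getparent().attrib
    if len(atrs) > 0:
        # Parent is <part type="par" species="...">: build a name such as par_species1_l0_m0
        kind = atrs.values()[0]
        spec = atrs.keys()[1]
        spec_val = atrs.values()[1]
        items = diag.attrib.items()[1:]  # skip ns, keep the (l, value) and (m, value) pairs
        l = "".join(items[0])
        m = "".join(items[1])
        print(f"{kind}_{spec}{spec_val}_{l}_{m}")
    else:
        # Parent is <tot>, which has no attributes, so use its tag name
        print(diag.getparent().tag)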
Output:
tot
par_species1_l0_m0
par_species1_l1_m-1
par_species2_l0_m0
par_species2_l1_m-1
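The loop above only prints the block names. To actually keep the points of each block separately, one option is to store them in a dictionary keyed by those names. The following is a minimal sketch, not part of the original answer: it uses the standard-library xml.etree.ElementTree instead of lxml, inlines a shortened copy of the XML from the question so that it runs on its own, and the helper name read_diagrams is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XML from the question, inlined so the sketch is self-contained.
XML = """<dos>
<tot>
<diagram type="tot" ns="1">
<point e="-3.000000000" d="2.000000000"/>
<point e="-2.993993994" d="4.000000000"/>
</diagram>
</tot>
<part type="par" species="1">
<diagram ns="1" l="0" m="0">
<point e="-3.000000000" d="0.002000000"/>
</diagram>
<diagram ns="1" l="1" m="-1">
<point e="-3.000000000" d="0.005000000"/>
</diagram>
</part>
</dos>"""


def read_diagrams(root):
    """Hypothetical helper: return {name: [(e, d), ...]}, one entry per <diagram>."""
    blocks = {}
    for parent in root:  # the <tot> and <part> elements
        for diag in parent.findall("diagram"):
            if parent.tag == "part":
                # Same naming scheme as the printed output above
                name = (f"{parent.get('type')}_species{parent.get('species')}"
                        f"_l{diag.get('l')}_m{diag.get('m')}")
            else:
                name = parent.tag
            blocks[name] = [(float(p.get("e")), float(p.get("d")))
                            for p in diag.findall("point")]
    return blocks


blocks = read_diagrams(ET.fromstring(XML))
for name, points in blocks.items():
    print(name, points)
```

For the real file, ET.parse(filedoss).getroot() can be passed in place of ET.fromstring(XML). If NumPy arrays are preferred, np.array(blocks["tot"]) turns one block into an (n, 2) array with e in the first column and d in the second.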
