How to avoid overwriting keys and values in a dictionary

My program parses values from an XML file and puts them into a dictionary.

I use a for loop to iterate over all the tags in the file, along with their attributes and text.

But when a nested tag such as the <name> at [250][155] has the same tag name as an earlier one, it overwrites the entry for the top-level <name> at [4].

All of this happens inside the for loop.

I want to stop the loop from overwriting a value once it has been stored in the dictionary.

import pprint as p  # Importing pprint/pretty print for formatting dict
import xml.etree.ElementTree as ETT  # Importing xml.etree for xml parsing
import csv  # Importing csv to write in CSV file


def fetch_data():
    # Asking input from user for the file path
    xml_file_path = input(
        "Enter the path to the file. \n Note* use 'Double back slash' instead of 'single back slash': \n \t \t \t \t \t")

    # Initiating variable parser which will parse from the file
    parser = ETT.parse(xml_file_path)

    # Initiating variable root to get root element which will be useful in further queries
    root = parser.getroot()

    # Initiating variable main_d which is the dictionary in which the parsed data will be stored
    main_d = {}

    for w in root.iter():  # Making a for loop which will iter all tags out of the file
        value = w.attrib  # Initiating variable value for storing attributes where attributes are in the form of dictionaries
        value['value'] = w.text  # Hence, appending the text/value of the tag in the value dict
        if w not in main_d:
            main_d[w.tag] = value  # Writing all the keys and values in main_d
        else:
            main_d.pop(w)
    p.pprint(main_d, sort_dicts=False, width=200, depth=100)


fetch_data()

This is what the XML would look like

<?xml version="1.0" encoding="UTF-8"?>
<Data data_version="1">
    <modified_on_date>some_time</modified_on_date>
    <file_version>some version</file_version>
    <name>h</name>
    <class>Hot</class>
    <fct>
        <fc_tem di="value1" un="value2" unn="value3">some integer</fc_tem>
        <fc_str di="value1" un="value2" unn="value3">some integer</fc_str>
        <DataTable name="namee" type="0" columns="2" rows="2" version="some version">
            <name>this will be overwritten on the first one up there</name>
            <type>0</type>
        </DataTable>    
    </fct>
</Data>

This is my progress so far

For confidentiality reasons, that's all I can share.

CodePudding user response：

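First of all, thanks to @PatrickArtner; his suggestion worked.

The keys of main_d are tag names (strings), so the membership test has to use w.tag instead of w. An Element object is never equal to one of those string keys, so "w not in main_d" is always true and every repeated tag overwrites the earlier entry.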
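To see the effect in isolation, here is a minimal sketch that runs the corrected check on a trimmed-down copy of the XML from the question. The function name collect_first and the inline XML string are made up for this illustration; they are not part of the original program.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the XML from the question (illustration only)
XML = """<Data data_version="1">
    <name>h</name>
    <fct>
        <DataTable name="namee" type="0">
            <name>nested name</name>
        </DataTable>
    </fct>
</Data>"""


def collect_first(root):
    """Keep only the first element seen for each tag name (hypothetical helper)."""
    main_d = {}
    for w in root.iter():
        value = dict(w.attrib)      # copy, so the parsed tree is not modified
        value["value"] = w.text
        # The keys are tag strings, so the test has to use w.tag.
        # Testing the Element itself ("w not in main_d") is always True,
        # which is why later tags kept replacing earlier ones.
        if w.tag not in main_d:
            main_d[w.tag] = value
    return main_d


result = collect_first(ET.fromstring(XML))
print(result["name"])   # {'value': 'h'}
```

The nested <name> inside <DataTable> is skipped because the key "name" is already present, so the top-level value survives.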

The full snippet is:

# This program is to fetch/parse data(tags, attribs, text) from the XML/XMT
# file provided


# Importing required libraries

import pprint as p                                      # Importing pprint/pretty print for formatting dict
import xml.etree.ElementTree as ETT                     # Importing xml.etree for xml parsing
import csv                                              # Importing csv to write in CSV file


# Creating a method/function fetch_data() to fetch/parse data from the given XML/XMT file

def fetch_data():


    # Asking input from user for the file path
    xml_file_path = input(
        "Enter the path to the file \n \t \t :")

    # Asking input from user for the name of the csv file which will be created
    file_name = input("Enter the file name with extension you want as output \n \t \t : ")


    # Initiating variable parser which will parse from the file
    parser = ETT.parse(xml_file_path)


    # Initiating variable root to get root element which will be useful in further queries
    root = parser.getroot()


    # Initiating variable main_d which is the dictionary in which the parsed data will be stored
    main_d = {}


    for w in root.iter():                               # Making a for loop which will iter all tags out of the file
        value = w.attrib                                # Initiating variable value for storing attributes where attributes are in the form of dictionaries
        value['value'] = w.text                         # Hence, appending the text/value of the tag in the value dict
        if w.tag not in main_d:                         # Checking if the tag exists or not, this will help to avoid overwriting of tag values
            main_d[w.tag] = value                       # Writing all the keys and values in main_d
        else:
            pass

    p.pprint(main_d, sort_dicts=False, width=200)               # This is just to check the output

    with open(file_name, 'w', newline='') as file:              # Opening a file with the filename provided by the user; newline='' is recommended for csv files
        csvwriter = csv.writer(file, quoting=csv.QUOTE_ALL)     # Initiating a variable csvwriter for the file and passing the QUOTE_ALL arg.
        for x in main_d.keys():                                 # Creating a loop to write the tags
            csvwriter.writerow([x])                             # Writing each tag as a one-column row



fetch_data()
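
Note that the snippet above keeps the first occurrence of each tag and silently drops the later ones. If the nested <name> is needed as well, one possible variation (an assumption on my part, not something the original program does) is to store a list of entries per tag with dict.setdefault. The name collect_all and the inline XML are hypothetical and only serve the example.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the XML from the question (illustration only)
XML = """<Data data_version="1">
    <name>h</name>
    <fct>
        <DataTable name="namee" type="0">
            <name>nested name</name>
        </DataTable>
    </fct>
</Data>"""


def collect_all(root):
    """Map each tag name to a list with one entry per occurrence (hypothetical helper)."""
    main_d = {}
    for w in root.iter():
        value = dict(w.attrib)      # copy, so the parsed tree is not modified
        value["value"] = w.text
        # setdefault creates the list the first time a tag is seen,
        # then every occurrence is appended instead of overwritten.
        main_d.setdefault(w.tag, []).append(value)
    return main_d


all_values = collect_all(ET.fromstring(XML))
print([entry["value"] for entry in all_values["name"]])   # ['h', 'nested name']
```

With this layout nothing is lost, at the cost of every value being a list even when a tag appears only once.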