Output XML nodes into individual files

Time:01-19

I am trying to create individual files from the nodes of an XML file. No matter how I approach it, I seem to get stuck in a nested loop: either I keep rewriting each file until they all contain the same node data, or I run through all of the nodes on every loop iteration. I'm sure this should be pretty easy, but I'm getting hung up somewhere.

doc = Nokogiri::XML(open("original_copy_mod.xml"))

doc.xpath("//nodes/node").each do |item|
  item.xpath("//div[@class='meeting-date']/span/@content").each do |date|
    date = date.to_s
    split_date = date.split('T00')
    split_date = split_date[0].gsub("-", "_")
    split_date = split_date + ".pcf"

    File.open(split_date, 'w') { |f| f.write(item) }
  end
end

This is another attempt, and I don't understand why it fails to create all the pages. It only creates one page, but if I add a "puts", the count does iterate through all 101 nodes.

doc = Nokogiri::XML(open("original_copy_mod.xml"))

doc.xpath("//nodes/node").each do |item|
  date = item.xpath("//no-name/div[@class='meeting-date']/span/@content").to_s
  split_date = date.split('T00')
  split_date = split_date[0].gsub("-", "_")
  split_date = split_date + ".pcf"
  File.open(split_date, 'w') { |f| f.write(item) }
end

For further clarification, this is an example of the nodes that I'm trying to turn into pages.

<?xml version="1.0" encoding="UTF-8" ?>
<nodes>

<node>
    <no-name><div >Meeting-a</div>
        <div class="meeting-date"><span  property="dc:date" datatype="xsd:dateTime" content="a-2021-11-29T00:00:00-06:00">Monday, November 29, 2021</span></div>
    </no-name>
    <no-name><div >
        <div>
            <span><a href="/final.pdf" target="_blank"><img src="agenda-icon.svg"/></a></span>
            <span><a href="final.pdf" target="_blank">Agenda</a></span>
        </div>


        <div>
            <span><a href="https://9949973" target="_blank"><img src="webcast-icon.svg"/></a></span>
            <span><a href="https://9949973" target="_blank">11/29</a></span>
        </div>

        </div>
        <div ></div></no-name>
</node>

<node>
    <no-name><div >Meeting-b</div>
        <div class="meeting-date"><span  property="dc:date" datatype="xsd:dateTime" content="e-2021-09-10T00:00:00-05:00">Friday, September 10, 2021</span></div>
    </no-name>
    <no-name><div >
        <div>
            <span><a href="/final.pdf" target="_blank"><img src="agenda-icon.svg"/></a></span>
            <span><a href="final.pdf" target="_blank">Agenda</a></span>
        </div>


        <div>
            <span><a href="https://9949973" target="_blank"><img src="webcast-icon.svg"/></a></span>
            <span><a href="https://9949973" target="_blank">11/29</a></span>
        </div>

        </div>
        <div ></div></no-name>
</node>

<node>
    <no-name><div >Meeting-c</div>
        <div class="meeting-date"><span  property="dc:date" datatype="xsd:dateTime" content="f-2021-08-13T00:00:00-05:00">Friday, August 13, 2021</span></div>
    </no-name>
    <no-name><div >
        <div>
            <span><a href="/final.pdf" target="_blank"><img src="agenda-icon.svg"/></a></span>
            <span><a href="final.pdf" target="_blank">Agenda</a></span>
        </div>


        <div>
            <span><a href="https://9949973" target="_blank"><img src="webcast-icon.svg"/></a></span>
            <span><a href="https://9949973" target="_blank">11/29</a></span>
        </div>

        </div>
        <div ></div></no-name>
</node>

</nodes>

 

CodePudding user response:

date = item.xpath("//no-name/div[@class='meeting-date']/span/@content").to_s

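A leading // makes the XPath search from the root of the whole document, so you break out of the scope of the node you are iterating over and match the dates of every node. Remove the leading slashes and the path is evaluated relative to the current node, which preserves its scope: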

date = item.xpath("no-name/div[@class='meeting-date']/span/@content").to_s

CodePudding user response:

When you open a file with the 'w' mode, it is truncated and rewritten every time. If you need to create the file or append to it, use the 'a' mode instead. So you can try this:

File.open(split_date,'a'){ |f| f << item }
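
As a rough illustration of the two modes, the sketch below writes to a throwaway file. The mode_demo method and the demo.pcf name are made up for this example and are not part of the question's code.

```ruby
require "tmpdir"

# Hypothetical demo: returns the file contents after two "w" writes
# and after one additional "a" write.
def mode_demo
  Dir.mktmpdir do |dir|
    path = File.join(dir, "demo.pcf") # hypothetical file name

    # "w" truncates the file on every open, so only the last write survives.
    File.open(path, "w") { |f| f << "first" }
    File.open(path, "w") { |f| f << "second" }
    after_w = File.read(path)

    # "a" creates the file if needed and adds to the end of what is there.
    File.open(path, "a") { |f| f << "third" }
    after_a = File.read(path)

    [after_w, after_a]
  end
end

result = mode_demo
puts result.inspect
```

One thing to keep in mind with 'a': running the script a second time appends to the files left over from the previous run, so you may want to delete old output first.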

PS. Make sure that split_date, as the name of the file, is unique for each node, since you want a separate file per node.
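
If two nodes can share a date, one possible way to keep the names unique is to track the names already handed out and add a counter on repeats. This is only a sketch: unique_name and the used hash are hypothetical names, and the "_2" suffix scheme is an arbitrary choice.

```ruby
# Hypothetical helper: returns base + ext, adding a numeric suffix
# when the same base name was already handed out.
# `used` is a Hash.new(0) that counts how often each base was seen.
def unique_name(base, used, ext = ".pcf")
  count = (used[base] += 1)
  count == 1 ? "#{base}#{ext}" : "#{base}_#{count}#{ext}"
end

used = Hash.new(0)
names = ["2021_11_29", "2021_11_29", "2021_09_10"].map { |base| unique_name(base, used) }
puts names.inspect
```

In the loop from the question you would call it with the date-based name (without the extension) before opening the file.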
