XML help

Sun Jun 5 09:41:35 EDT 2005

I'm 4 months new to Python and 4 hours new to XML. I've been trying to
understand and use the DOM tree walk sample shown at this site:
http://www.rexx.com/~dkuhlman/pyxmlfaq.html to walk through an XML file from
which I need to extract data for subsequent plotting.

I've repeated the functions from the site that I'm using here:

import sys
from xml.dom import minidom, Node

def walkTree(node):
    # Yield this element, then recurse depth-first into its children.
    # Non-element nodes (text, comments) are skipped.
    if node.nodeType == Node.ELEMENT_NODE:
        yield node
        for child in node.childNodes:
            for n1 in walkTree(child):
                yield n1

def test(inFileName):
    outFile = sys.stdout
    doc = minidom.parse(inFileName)
    rootNode = doc.documentElement
    # my_processing handles each element in turn (not shown)
    for node in walkTree(rootNode):
        my_processing(node, outFile)

A piece of the XML file I want to process is here:

<XMLDocument>
  <!-- ******************************************************************  -->
  <!-- ******************************************************************  -->
  <!--File Name: C:\Temp\slit_coarse_isowall_velocity.xml-->
  <!-- ******************************************************************  -->
  <!-- ******************************************************************  -->
  <HEADER>
    <NAME>  Simulation Results XML Writer</NAME>
    <Version>  1.00</Version>
  </HEADER>
  <Dataset Name="Velocity" ID="1555">
    <DataType>  ELDT(Element data)</DataType>
    <DeptVar Name="Velocity" Unit="m/s"/>
    <NumberOfComponents>  1</NumberOfComponents>
    <NumberOfIndpVariables>  2</NumberOfIndpVariables>
    <IndpVar Name="Time" Unit="s"/>
    <IndpVar Name="Normalized thickness" Unit=""/>
    <Blocks>
      <NumberOfBlocks>  231</NumberOfBlocks>
      <Block Index="1">
        <IndpVar Name="Time" Value="0.065320" Unit="s"/>
        <IndpVar Name="Normalized thickness" Value="0.000000" Unit=""/>
        <NumberOfDependentVariables>  32</NumberOfDependentVariables>
        <Data>
          <ElementData ID="1">
            <DeptValues>     1.5098e+000</DeptValues>
          </ElementData>
          <ElementData ID="2">
            <DeptValues>     1.4991e+000</DeptValues>
          </ElementData>
          <ElementData ID="7">
            <DeptValues>     1.4744e+000</DeptValues>
          ......
        </Data>
      </Block>
      <Block Index="2">
        ....

As can be seen, the data is organized in blocks, each of which holds one data
point per finite element ID. The number of entries in each block varies, and
element IDs are not necessarily contiguous.

I've managed to test for specific elements and extract values. I want to
place the results in arrays with the array index equal to the element ID. So
as I walk the tree I temporarily store the IDs and DeptValues in lists. I'm OK
so far. I then intend to create an array whose size is determined by the
maximum ID value. In the sample above the array size will be 8 (indices 0
through 7), even though only three entries exist.
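
The list-to-array step described above might look like the following minimal
sketch. The function name build_array and the use of None as a filler for
IDs that have no data are assumptions made for illustration; they are not
part of the code from the site.

```python
def build_array(ids, values, fill=None):
    """Return a list indexed by element ID; IDs with no data hold `fill`."""
    if not ids:
        return []
    size = max(ids) + 1  # indices 0..max ID, so a max ID of 7 gives size 8
    result = [fill] * size
    for element_id, value in zip(ids, values):
        result[element_id] = value
    return result


# IDs and values taken from the sample block above
arr = build_array([1, 2, 7], [1.5098, 1.4991, 1.4744])
print(len(arr))  # prints 8
```

A dictionary keyed by element ID would avoid the unused slots, at the cost of
no longer being a plain positional array.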

At this point I'm stuck: I want to do this array creation and processing
when I "see" the </Block> end tag, but I can't figure out how to do that.
I'm clearly misunderstanding something about XML DOM trees and elements,
because when I print all the elements I never see an end tag for any of
them. I'm approaching this from a read-a-line-and-process point of view,
which is probably half the problem.

So how can I initiate array processing at the end of each block, prior to
reaching the next block? Of course I'm open to simpler ways too ;)
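
One possible way to handle this, sketched under assumptions: a DOM tree has
no separate end-tag nodes, because each Block element simply contains its
children. The "end of block" is therefore the point where a loop over one
block's descendants finishes. The sketch below uses getElementsByTagName from
xml.dom.minidom; the names block_arrays and SAMPLE are hypothetical, and
SAMPLE is a trimmed, well-formed stand-in for the file excerpt, with a
made-up second block.

```python
from xml.dom import minidom

# Trimmed, well-formed stand-in for the real file (assumption for the demo)
SAMPLE = """<XMLDocument>
  <Dataset Name="Velocity" ID="1555">
    <Blocks>
      <Block Index="1">
        <Data>
          <ElementData ID="1"><DeptValues>  1.5098e+000</DeptValues></ElementData>
          <ElementData ID="2"><DeptValues>  1.4991e+000</DeptValues></ElementData>
          <ElementData ID="7"><DeptValues>  1.4744e+000</DeptValues></ElementData>
        </Data>
      </Block>
      <Block Index="2">
        <Data>
          <ElementData ID="3"><DeptValues>  2.0e+000</DeptValues></ElementData>
        </Data>
      </Block>
    </Blocks>
  </Dataset>
</XMLDocument>"""


def block_arrays(doc):
    """Yield (block index, list indexed by element ID) for each Block."""
    for block in doc.getElementsByTagName("Block"):
        ids, values = [], []
        for elem in block.getElementsByTagName("ElementData"):
            dept = elem.getElementsByTagName("DeptValues")[0]
            ids.append(int(elem.getAttribute("ID")))
            values.append(float(dept.firstChild.data))
        # The inner loop has finished, so this block is complete:
        # build its array here, before moving on to the next block.
        arr = [None] * (max(ids) + 1) if ids else []
        for element_id, value in zip(ids, values):
            arr[element_id] = value
        yield int(block.getAttribute("Index")), arr


doc = minidom.parseString(SAMPLE)
for index, arr in block_arrays(doc):
    print(index, len(arr))
```

For a real file, minidom.parse(inFileName) would replace the parseString
call. Looping block by block avoids having to detect block boundaries while
walking the whole tree.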

TIA for any advice.