Parsing XML - Newbie help
rh0dium
sklass at pointcircle.com
Sun May 22 19:34:08 EDT 2005
Fredrik Lundh wrote:
> didn't you ask the same question a few days ago? did you read the
> replies to that post?
Yes, I did, but the XML was malformed. Actually it still is, but you
helped me figure out a way to correct it. Thanks!
Here is what I have so far. Now I want to find a child of a child (I
think that's how you state it?). Below is a piece of the XML which I
am trying to parse. In short, I want to figure out all of the memory
in a system. I can look at the "size" of all the "bank:?" nodes and add
them up. I am having trouble getting to the children of the "System
Memory" node.
import re
# ElementTree ships in the standard library as xml.etree (Python 2.5+)
import xml.etree.ElementTree as ET

inp = open("xml.test1")
data = inp.read()
inp.close()
# strip off bogus XML declaration
m = re.match(r"<\?xml[^>]+>", data)
if m:
    data = data[m.end():]
# Apparently ampersands are common in lshw.. Get rid of them..
data = data.replace('&ersand;', '')
# wrap notes in container element
data = "<doc>" + data + "</doc>"
tree = ET.XML(data)
for elem in tree.findall(".//node"):
    if elem.get("class") == "memory":
        if elem.findtext("description") == "System Memory":
            print("Found system memory bank")
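For reference, the preprocessing steps above (drop the declaration, wrap the fragment in a container element) could be folded into one helper. This is only a rough sketch, assuming Python 3 and the standard-library xml.etree.ElementTree; the name parse_fragment and the tiny sample string are made up for illustration and are not part of lshw or ElementTree.

```python
import re
import xml.etree.ElementTree as ET


def parse_fragment(data):
    """Parse text holding several top-level elements into one tree."""
    # Drop a leading XML declaration, if any; it may not appear inside <doc>.
    data = re.sub(r"^\s*<\?xml[^>]*\?>", "", data)
    # Wrap everything in a single root so the parser accepts sibling elements.
    return ET.XML("<doc>" + data + "</doc>")


# Hypothetical two-element fragment, standing in for the real lshw output.
sample = '<?xml version="1.0"?><node id="a"/><node id="b"/>'
root = parse_fragment(sample)
ids = [n.get("id") for n in root.findall("node")]
print(ids)
```

The wrapper is needed because a well-formed XML document has exactly one root element, while the fragment has several <node> elements side by side.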
OK, so up to here I am fine. I find two blocks of system memory (if you
want the full XML, let me know). It MUST be "System Memory" only.
Now, how do I get a list of all of the child "nodes" of this? They
are named bank:N (i.e. bank:0, bank:1, etc.; see below). Each one
of those may (or may not) have some memory stuck in it. I can
tell if there is memory because a size is given. I want to get a list of
all of the sizes. From there I can say you have sum(memory) in
len(memory) banks of total banks.
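One way the goal described above might look in code is sketched here. It assumes Python 3, the standard-library xml.etree.ElementTree, and a cut-down sample in the same shape as the lshw output; the function name system_memory_banks and the SAMPLE string are invented for the example.

```python
import xml.etree.ElementTree as ET

# Abbreviated, hypothetical sample in the same shape as the lshw output.
SAMPLE = """<doc>
  <node id="memory:0" class="memory">
    <description>System Memory</description>
    <node id="bank:0" class="memory">
      <description>DIMM DDR Synchronous [empty]</description>
    </node>
    <node id="bank:1" class="memory">
      <description>DIMM DDR Synchronous 400 MHz (2.5 ns)</description>
      <size units="bytes">1073741824</size>
    </node>
  </node>
  <node id="memory:2" class="memory">
    <description>Flash Memory</description>
    <node id="bank" class="memory">
      <size units="bytes">1048576</size>
    </node>
  </node>
</doc>"""


def system_memory_banks(tree):
    """Return (sizes, total_banks) for banks under "System Memory" nodes."""
    sizes = []
    total_banks = 0
    for elem in tree.findall(".//node"):
        if elem.get("class") != "memory":
            continue
        if elem.findtext("description") != "System Memory":
            continue
        # "node" matches direct children only, i.e. the bank:N elements.
        for bank in elem.findall("node"):
            total_banks += 1
            size = bank.findtext("size")
            if size:  # empty slots carry no <size> element
                sizes.append(int(size))
    return sizes, total_banks


sizes, total_banks = system_memory_banks(ET.XML(SAMPLE))
print("%d bytes in %d of %d banks" % (sum(sizes), len(sizes), total_banks))
```

The "Flash Memory" node is skipped by the description test, so its bank does not count toward the total.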
Here is what I tried - but I was clearly messing up..
for mem in elem.findall("./node/node"):
    if elem.get("class") == "memory":
        print("Entering Memory Class")
        if elem.findtext("size"):
            print("Found size", elem.findtext("size"))
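A likely reason this attempt finds nothing: findall paths are relative to the element they are called on, so "./node/node" looks for <node> elements two levels below elem, while the banks sit one level below it. The loop also binds mem but keeps testing elem. A small sketch of the difference, assuming Python 3 and the standard-library xml.etree.ElementTree, with a made-up two-bank element standing in for one "System Memory" node:

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a single "System Memory" element.
elem = ET.XML(
    '<node id="memory:0" class="memory">'
    '<node id="bank:0" class="memory"/>'
    '<node id="bank:1" class="memory">'
    '<size units="bytes">1073741824</size>'
    '</node>'
    '</node>'
)

# Relative to elem, "node" selects its direct children (the banks).
children = [n.get("id") for n in elem.findall("node")]
# "./node/node" looks one level deeper, inside each bank; nothing is there.
grandchildren = [n.get("id") for n in elem.findall("./node/node")]

print(children)
print(grandchildren)
```

With the shorter path, the checks inside the loop would then be made on the loop variable (mem here) rather than on elem.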
And here is the XML that goes with it:
<node id="memory:0" claimed="true" class="memory" handle="DMI:0027">
  <description>System Memory</description>
  <physid>27</physid>
  <slot>System board or motherboard</slot>
  <node id="bank:0" claimed="true" class="memory" handle="DMI:002C">
    <description>DIMM DDR Synchronous [empty]</description>
    <vendor>JEDEC ID:</vendor>
    <physid>0</physid>
    <slot>DIMM3B</slot>
  </node>
  <node id="bank:1" claimed="true" class="memory" handle="DMI:002D">
    <description>DIMM DDR Synchronous [empty]</description>
    <vendor>JEDEC ID:</vendor>
    <physid>1</physid>
    <slot>DIMM3A</slot>
  </node>
  <node id="bank:2" claimed="true" class="memory" handle="DMI:002E">
    <description>DIMM DDR Synchronous 400 MHz (2.5 ns)</description>
    <product>M3 12L2920BG0-CCC</product>
    <vendor>JEDEC ID:CE 00 00 00 00 00 00 00</vendor>
    <physid>2</physid>
    <serial>96000241</serial>
    <slot>DIMM1B</slot>
    <size units="bytes">1073741824</size>
    <width units="bits">64</width>
    <clock units="Hz">400000000</clock>
  </node>
  <node id="bank:3" claimed="true" class="memory" handle="DMI:002F">
    <description>DIMM DDR Synchronous 400 MHz (2.5 ns)</description>
    <product>M3 12L2920BG0-CCC</product>
    <vendor>JEDEC ID:CE 00 00 00 00 00 00 00</vendor>
    <physid>3</physid>
    <serial>4A000741</serial>
    <slot>DIMM1A</slot>
    <size units="bytes">1073741824</size>
    <width units="bits">64</width>
    <clock units="Hz">400000000</clock>
  </node>
</node>
<node id="memory:1" claimed="true" class="memory" handle="DMI:0028">
  <description>System Memory</description>
  <physid>28</physid>
  <slot>System board or motherboard</slot>
  <node id="bank:0" claimed="true" class="memory" handle="DMI:0030">
    <description>DIMM DDR Synchronous [empty]</description>
    <vendor>JEDEC ID:</vendor>
    <physid>0</physid>
    <slot>DIMM4B</slot>
  </node>
  <node id="bank:1" claimed="true" class="memory" handle="DMI:0031">
    <description>DIMM DDR Synchronous [empty]</description>
    <vendor>JEDEC ID:</vendor>
    <physid>1</physid>
    <slot>DIMM4A</slot>
  </node>
  <node id="bank:2" claimed="true" class="memory" handle="DMI:0032">
    <description>DIMM DDR Synchronous 400 MHz (2.5 ns)</description>
    <product>M3 12L2920BG0-CCC</product>
    <vendor>JEDEC ID:CE 00 00 00 00 00 00 00</vendor>
    <physid>2</physid>
    <serial>95000041</serial>
    <slot>DIMM2B</slot>
    <size units="bytes">1073741824</size>
    <width units="bits">64</width>
    <clock units="Hz">400000000</clock>
  </node>
  <node id="bank:3" claimed="true" class="memory" handle="DMI:0033">
    <description>DIMM DDR Synchronous 400 MHz (2.5 ns)</description>
    <product>M3 12L2920BG0-CCC</product>
    <vendor>JEDEC ID:CE 00 00 00 00 00 00 00</vendor>
    <physid>3</physid>
    <serial>58000E41</serial>
    <slot>DIMM2A</slot>
    <size units="bytes">1073741824</size>
    <width units="bits">64</width>
    <clock units="Hz">400000000</clock>
  </node>
</node>
<node id="memory:2" class="memory" handle="DMI:0029">
  <description>Flash Memory</description>
  <physid>29</physid>
  <slot>System board or motherboard</slot>
  <capacity units="bytes">1048576</capacity>
  <node id="bank" class="memory" handle="DMI:0035">
    <description>Chip FLASH Non-volatile</description>
    <physid>0</physid>
    <slot>SYSTEM ROM</slot>
    <size units="bytes">1048576</size>
    <width units="bits">4</width>
  </node>
</node>
<node id="memory:3" class="memory" handle="">
  <physid>b</physid>
</node>
<node id="memory:4" class="memory" handle="">
  <physid>c</physid>
</node>
<node id="memory:5" class="memory" handle="PCI:00:00.0">
  <description>Memory controller</description>
  <product>CK804 Memory Controller</product>
  <vendor>nVidia Corporation</vendor>
  <physid>0</physid>
  <businfo>pci@00:00.0</businfo>
  <version>a3</version>
  <width units="bits">32</width>
  <clock units="Hz">66000000</clock>
  <capabilities>
    <capability id="bus_master" >bus mastering</capability>
    <capability id="cap_list" >PCI capabilities listing</capability>
  </capabilities>
</node>
Thanks so much. PS: XML can be a real PITA when the data you throw at
it is not "correct". I had actually started working with sgmllib after
I saw a similar thread; however, I ran into the same problem (child of
a child).
Thanks again.