xml.parsers.expat loading xml into a dict and whitespace

Wed May 23 07:44:27 EDT 2007

kaens wrote:
> Now the code looks like this:
> 
> import xml.etree.ElementTree as etree
> 
> optionsXML = etree.parse("options.xml")
> options = {}
> 
> for child in optionsXML.getiterator():
>    if child.tag != optionsXML.getroot().tag:
>        options[child.tag] = child.text
> 
> for key, value in options.items():
>    print key, ":", value

Three things to add:

Importing cElementTree instead of ElementTree should speed this up
considerably. Beyond that, consider using iterparse(), which hands you each
element as soon as it has been parsed:

http://effbot.org/zone/element-iterparse.htm

*untested*:

  from xml.etree import cElementTree as etree

  iterevents = etree.iterparse("options.xml")
  options = {}

  # iterparse() yields ("end", element) pairs by default
  for event, child in iterevents:
      if child.tag != "parent":    # skip the root, assumed to be <parent>
          options[child.tag] = child.text

  for key, value in options.items():
      print key, ":", value
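
Since the subject line mentions whitespace, here is a slightly fuller sketch of
the same iterparse() loop that strips surrounding whitespace from each value
and copes with empty elements. The sample document, the load_options() helper
and its root_tag parameter are made up for illustration; only iterparse()
itself comes from the library. Treat it as a starting point under those
assumptions, not as a tested solution for your real options.xml:

```python
import io

try:
    # C accelerator, available on the Python versions this thread targets
    from xml.etree import cElementTree as etree
except ImportError:
    # newer Pythons dropped cElementTree; plain ElementTree is used instead
    from xml.etree import ElementTree as etree

# Hypothetical stand-in for options.xml; the root tag "parent" is an assumption.
SAMPLE = b"""<parent>
  <option1>  yes  </option1>
  <option2>42</option2>
  <option3/>
</parent>"""


def load_options(source, root_tag="parent"):
    """Collect {tag: stripped text} for every element except the root."""
    options = {}
    # iterparse() yields ("end", element) pairs by default, so each
    # element's text is complete by the time the loop sees it.
    for event, elem in etree.iterparse(source):
        if elem.tag != root_tag:
            # elem.text is None for empty elements such as <option3/>
            options[elem.tag] = (elem.text or "").strip()
    return options


options = load_options(io.BytesIO(SAMPLE))
for key, value in sorted(options.items()):
    print("%s : %s" % (key, value))
```

Passing a file name such as "options.xml" instead of the BytesIO object should
work the same way, since iterparse() accepts either.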

Note that this also works with lxml.etree. That said, lxml.objectify may be
closer to what you actually want:

http://codespeak.net/lxml/dev/objectify.html

*untested*:

  from lxml import etree, objectify

  # setup
  parser = etree.XMLParser(remove_blank_text=True)
  lookup = objectify.ObjectifyElementClassLookup()
  # older lxml releases call this setElementClassLookup()
  parser.set_element_class_lookup(lookup)

  # parse (parse() returns an ElementTree, so fetch the root element)
  parent = etree.parse("options.xml", parser).getroot()

  # get to work
  option1 = parent.option1
  ...

  # or, if you prefer dictionaries:
  options = vars(parent)
  for key, value in options.items():
      print key, ":", value

Have fun,

Stefan