a few more questions on XML and python
Lars von Wedel
vonWedel at lfpt.rwth-aachen.de
Fri Jan 4 03:39:06 EST 2002
Hi,
> [...] but I daresay that working code can eat
> a great deal of memory and still perform better than a dysfunctional
> collection of highly optimized fragments :)
Sure, in general, but there are a number of things about reading XML
that can be performed using a lean SAX-style implementation, e.g.
reading rather simple configuration files etc.
> If you're processing huge data sets, DOM isn't going to cut it. DOM
> builds an in-memory representation of the entire document, whereas SAX
> handles a single element at a time. But after you have a bit of
> Python and XML parsing under your belt, you can always move on to SAX
> if necessary, eh?
To simplify parsing XML with SAX, I put the following two methods into
the class I use as an element handler. They dispatch the calls of
startElement/endElement to a bunch of methods called e1_start, e1_end,
e2_start, e2_end, ... for each element type (e.g. E1, E2) occurring in
the XML file; the element name is lowercased to build the method name.
The attributes are kept on a stack, so the _end method receives the
same attributes as the matching _start method. That saves me a large
if-construct. Very simple, but I like to use it a lot.
    def startElement(self, name, attrs):
        mth_name = string.lower(name) + '_start'
        self.attr_st.append(attrs)
        if hasattr(self, mth_name):
            method = getattr(self, mth_name)
            method(attrs)
        else:
            if self.verbose:
                print 'Warning: Start of element %s skipped' % name

    def endElement(self, name):
        attrs = self.attr_st.pop()
        mth_name = string.lower(name) + '_end'
        if hasattr(self, mth_name):
            method = getattr(self, mth_name)
            method(attrs)
        else:
            if self.verbose:
                print 'Warning: End of element %s skipped' % name
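
For anyone trying this with a current Python, here is a minimal
self-contained sketch of the same dispatch idea, assuming Python 3 and
the standard xml.sax module. The class names DispatchHandler and
ConfigHandler, the server_start/server_end methods and the sample
document are hypothetical, made up only to show the mechanism; only
attr_st and verbose come from the methods above.

```python
import xml.sax


class DispatchHandler(xml.sax.ContentHandler):
    """Routes SAX events to <element>_start / <element>_end methods."""

    def __init__(self, verbose=False):
        super().__init__()
        self.verbose = verbose
        self.attr_st = []  # attributes of the elements that are currently open

    def startElement(self, name, attrs):
        self.attr_st.append(attrs)
        # Look up e.g. 'server_start' for <Server>; None if not defined.
        method = getattr(self, name.lower() + '_start', None)
        if method is not None:
            method(attrs)
        elif self.verbose:
            print('Warning: Start of element %s skipped' % name)

    def endElement(self, name):
        # Hand the end handler the attributes of the matching start tag.
        attrs = self.attr_st.pop()
        method = getattr(self, name.lower() + '_end', None)
        if method is not None:
            method(attrs)
        elif self.verbose:
            print('Warning: End of element %s skipped' % name)


class ConfigHandler(DispatchHandler):
    """Hypothetical handler that only cares about <Server> elements."""

    def __init__(self):
        super().__init__()
        self.events = []

    def server_start(self, attrs):
        self.events.append(('start', attrs.get('host')))

    def server_end(self, attrs):
        self.events.append(('end', attrs.get('host')))


handler = ConfigHandler()
xml.sax.parseString(
    b'<config><Server host="a"/><Server host="b"/></config>', handler)
print(handler.events)
```

The <config> element has no config_start/config_end method here, so it
is skipped silently unless verbose is set. Using getattr with a default
of None replaces the hasattr/getattr pair, but the behaviour is meant to
be the same.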
Lars