XML Parsing

James Kew james.kew at btinternet.com
Mon Feb 16 18:45:16 EST 2004


"Chris Herborth" <chrish at cryptocard.com> wrote in message
news:5z3Yb.3920$Cd6.177500 at news20.bellglobal.com...
>
> PyXML on Sourceforge (http://pyxml.sourceforge.net/) has faster
> DOM-producing routines.

Which are? I like PyXML, but well-documented it ain't. I tend to use PyXML's
minidom, fed by either the validating (== xmlproc) or non-validating (==
expat) parsers -- are there faster PyXML alternatives?
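
For reference, a minimal sketch of the non-validating route. It assumes the standard-library xml.dom.minidom (which parses with expat) rather than PyXML's own package layout; SAMPLE and book_titles are made-up names for illustration only.

```python
from xml.dom import minidom

# Hypothetical sample document, not taken from any real application.
SAMPLE = "<catalog><book id='1'>Dive In</book><book id='2'>Cookbook</book></catalog>"


def book_titles(xml_text):
    """Parse xml_text into a minidom tree and return (id, title) pairs."""
    doc = minidom.parseString(xml_text)  # expat does the parsing underneath
    try:
        return [
            (node.getAttribute("id"), node.firstChild.data)
            for node in doc.getElementsByTagName("book")
        ]
    finally:
        doc.unlink()  # break the tree's reference cycles once we are done


if __name__ == "__main__":
    print(book_titles(SAMPLE))
```

The DOM calls (getElementsByTagName, getAttribute, firstChild) are the standard ones, so code written this way should not care much which parser built the tree.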

> pyRXP (http://www.reportlab.org/pyrxp.html) is probably the fastest XML
> parser for Python, but it doesn't produce a DOM or have a SAX API...

And recent threads here suggest it's not fully XML-compliant either: it
looks safe only if you can work in an ASCII-only XML subset.
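
A quick way to check this kind of thing for any parser is to feed it a small non-ASCII document and see whether the text survives. The sketch below does that for the expat-backed standard-library minidom only; it says nothing about pyRXP itself, and DOC and roundtrip_text are hypothetical names.

```python
from xml.dom import minidom

# Hypothetical test document: UTF-8 bytes containing non-ASCII characters
# (e-acute and a snowman), written with escapes to keep this source ASCII.
EXPECTED = "caf\u00e9 \u2603"
DOC = (
    '<?xml version="1.0" encoding="utf-8"?><p lang="fr">' + EXPECTED + "</p>"
).encode("utf-8")


def roundtrip_text(xml_bytes):
    """Parse xml_bytes and return the text content of the root element."""
    doc = minidom.parseString(xml_bytes)
    try:
        return doc.documentElement.firstChild.data
    finally:
        doc.unlink()


if __name__ == "__main__":
    # A parser that handles the declared encoding returns the text unchanged.
    print(roundtrip_text(DOC) == EXPECTED)
```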

For raw speed, libxml2 (and its Python wrapper) seems to get a lot of
glowing reviews. It doesn't expose a standard DOM API, though, and again
documentation is a problem (lots of C-API-level documentation, but not much
on how to put it together into a working Python app).

I gave it a whirl and it certainly seemed to fly, but getting to grips with
the API and converting my existing DOM-manipulating code to it felt like too
much of a hurdle given that my app runs fast enough as it is.
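
Whether an app is "fast enough" is easy to measure before committing to a port. A rough sketch, assuming the standard-library minidom and timeit; make_doc and time_parse are hypothetical helpers, and the synthetic document is only a stand-in for real input.

```python
import timeit
from xml.dom import minidom


def make_doc(n):
    """Build a synthetic XML document with n <row> elements."""
    rows = "".join("<row n='%d'>value %d</row>" % (i, i) for i in range(n))
    return "<table>%s</table>" % rows


def time_parse(xml_text, repeat=5):
    """Return the best-of-`repeat` time in seconds for one minidom parse."""
    def parse_once():
        minidom.parseString(xml_text).unlink()

    return min(timeit.repeat(parse_once, number=1, repeat=repeat))


if __name__ == "__main__":
    for size in (100, 1000, 10000):
        print("%6d rows: %.4f s" % (size, time_parse(make_doc(size))))
```

Swapping another parser into parse_once gives a like-for-like comparison on the same input.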

James




