[XML-SIG] Re: cElementTree 0.8 (january 11, 2005)

Paul Boddie paul at boddie.org.uk
Thu Jan 13 17:37:59 CET 2005


On Wednesday 12 January 2005 20:12, Fredrik Lundh wrote:
> several people have asked for libxml2 figures, since libxml2 is known as
> the fastest parser under the sun (with the possible exception of RXP, which
> is known as quite possibly the fastest parser anywhere).
>
> here's an updated table:

[...]

>     libxml2                     16000k  0.098s
>     cElementTree 0.8             5700k  0.058s

cElementTree looks really impressive, but in the tests I've run comparing 
libxml2 and cElementTree on some of the larger test documents in the 
libxml2 distribution, libxml2 still seems faster. I've used GNU time to 
report the elapsed, system and user times, and I've also measured the 
elapsed time from within Python, but I couldn't get the memory usage. How 
should one go about getting these figures under Linux? Should I turn process 
accounting on or something like that?
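
One possible way to get a memory figure from inside the process, offered only as a minimal sketch: ask the operating system for the peak resident set size through the standard `resource` module. The helper name `peak_rss_kb` and the list allocation standing in for a parsed document are illustrative assumptions, not anything used for the figures quoted above. On Linux the value is reported in kilobytes, and kernels older than 2.6.32 leave it at zero.

```python
import resource


def peak_rss_kb():
    """Return this process's peak resident set size from getrusage().

    On Linux the unit is kilobytes; other platforms may use other units.
    (Hypothetical helper name, for illustration only.)
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


before = peak_rss_kb()
data = [0] * 1000000  # stand-in for parsing a large document
after = peak_rss_kb()
print("peak RSS grew by about %d kB" % (after - before))
```

Because this is a high-water mark for the whole process, it only ever grows, so it is best read once per run rather than once per document.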

One thing that may explain the discrepancy between the above results and the 
ones I've been getting is the unfortunate need to explicitly free each 
libxml2 document after finishing with it - I found that otherwise libxml2 
does indeed get slower after loading a few documents, and I'd imagine that 
the memory requirements start to affect my resource-challenged laptop as a 
result. Of course, this depends on how one does the tests. To reduce the 
influence of start-up time and to time a single process loading many 
documents, I looped over a number of files, parsing each one, and then 
repeated that whole loop many times.
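
The loop described above might look roughly like the following sketch. The function names, the `repeats` parameter and the `cleanup` hook are assumptions made for illustration, and the standard library's ElementTree stands in for cElementTree; this is not the script used for the tests. With the libxml2 bindings one would presumably pass `libxml2.parseFile` as `parse` and a function calling `doc.freeDoc()` as `cleanup`, which is the explicit free mentioned above.

```python
import time
import xml.etree.ElementTree as ET


def time_parses(paths, parse, cleanup=None, repeats=10):
    """Parse every file in `paths`, `repeats` times over, in one process.

    Returns the elapsed wall-clock time in seconds. If `cleanup` is given,
    it is called on each parsed document as soon as parsing has finished,
    which is where an explicit free would go.
    """
    start = time.perf_counter()
    for _ in range(repeats):
        for path in paths:
            doc = parse(path)
            if cleanup is not None:
                cleanup(doc)
    return time.perf_counter() - start


def time_elementtree(paths, repeats=10):
    # Standard-library ElementTree used as a stand-in (assumption).
    return time_parses(paths, ET.parse, repeats=repeats)
```

Timing the whole loop in one process keeps interpreter start-up out of the measurement, but it also means that anything not freed accumulates across iterations, so the `cleanup` hook matters for a fair comparison.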

Still, cElementTree looks like a very promising addition to the range of 
Python XML tools, especially given the uncomplicated installation process 
(compared to some of the other top performers, notably libxml2 and 
cDomlette).

Paul

