[XML-SIG] Performance question
Fred L. Drake, Jr.
fdrake@acm.org
Tue, 5 Nov 2002 09:23:52 -0500
Henry S. Thompson writes:
> If you want _another_ factor of 10, go to PyLTXML. The report below
> is from Python 2.2.1 on RedHat Linux 7.2 using PyXML 0.8.1 and
> PyLTXML-1.3-2.
Wow! That's fast!
> I used Fred's driver, added two new functions to test bit-level and
> tree-level access via PyLTXML.
>
> parser performance test
> 100 parses took 3.88 seconds, or 0.04 seconds/parse
> 100 parses took 0.25 seconds, or 0.00 seconds/parse
> 100 parses took 0.02 seconds, or 0.00 seconds/parse
> 100 parses took 0.03 seconds, or 0.00 seconds/parse
>
> The first measurement is the original 4DOM DOM builder, the second is
> the expatbuilder, the third is PyLTXML returning the whole tree, the
> fourth is PyLTXML returning every bit (start tag, end tag, text). I
> guess the tree is faster because it's slightly lazy wrt Python
> structures, i.e. only the root is in Python form as returned, the rest
> gets converted from the native C structs as you walk the Python tree.
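For anyone who wants to reproduce figures of this shape, here is a minimal sketch of a timing driver. It is not the driver used for the numbers above (that script isn't shown in this message); the names time_parses and report are hypothetical, and xml.dom.minidom stands in for whichever builder is being measured. It prints four decimal places so that fast parsers don't all round to "0.00 seconds/parse".

```python
import time
from xml.dom import minidom


def time_parses(parse, xml_text, count=100):
    """Call parse(xml_text) `count` times; return (total, per-parse) seconds."""
    start = time.perf_counter()
    for _ in range(count):
        parse(xml_text)
    elapsed = time.perf_counter() - start
    return elapsed, elapsed / count


def report(parse, xml_text, count=100):
    """Format one result line in the same style as the quoted output."""
    elapsed, per_parse = time_parses(parse, xml_text, count)
    return "%d parses took %.2f seconds, or %.4f seconds/parse" % (
        count, elapsed, per_parse)


if __name__ == "__main__":
    # Stand-in document and parser; swap in the builder under test.
    sample = "<doc>" + "<item>text</item>" * 100 + "</doc>"
    print(report(minidom.parseString, sample))
```

Any callable that accepts the document string can be passed as parse, so one driver can compare several builders on the same input.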
So is the resulting tree object compliant with (or at least close to)
the Python DOM API, as defined in the Python Library Reference?
http://www.python.org/doc/current/lib/module-xml.dom.html
(Lazy building of structures is fine, of course, since that's an
implementation detail.) If it doesn't support the DOM API, does it
support something with an equivalent model and functionality?
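To make the question concrete, here is a small sketch of the kind of access the Python DOM API provides, using xml.dom.minidom from the standard library. The helper name describe_children and the sample document are made up for illustration; nothing here says whether PyLTXML's tree offers these attributes.

```python
from xml.dom import Node, minidom


def describe_children(xml_text):
    """Return the root tag name and (tag, id, text) for each child element."""
    doc = minidom.parseString(xml_text)
    root = doc.documentElement
    children = []
    for child in root.childNodes:
        if child.nodeType != Node.ELEMENT_NODE:
            continue  # skip whitespace text nodes, comments, etc.
        # Collect the character data directly inside this element.
        text = "".join(n.data for n in child.childNodes
                       if n.nodeType == Node.TEXT_NODE)
        children.append((child.tagName, child.getAttribute("id"), text))
    return root.tagName, children


if __name__ == "__main__":
    sample = "<doc><item id='1'>one</item><item id='2'>two</item></doc>"
    print(describe_children(sample))
```

A tree with an "equivalent model" would need some counterpart to documentElement, childNodes, nodeType, tagName and getAttribute, even if the spellings differ.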
> Here are the additions I made to Fred's version of the script:
...
> def allBits(s):
>     f=PyLTXML.OpenString(s1,PyLTXML.NSL_read|PyLTXML.NSL_read_namespaces)
>     b=PyLTXML.GetNextBit(f)
>     while b:
>         b=PyLTXML.GetNextBit(f)
>     PyLTXML.Close(f)
>
> def itemParse(s):
>     f=PyLTXML.OpenString(s1,PyLTXML.NSL_read|PyLTXML.NSL_read_namespaces)
>     b=PyLTXML.GetNextBit(f)
>     while b.type!='start':
>         b=PyLTXML.GetNextBit(f)
>     d=PyLTXML.ItemParse(f,b.item)
>     PyLTXML.Close(f)
>     return d
Ouch! Very inscrutable code... at least to me. I must confess that
I've not had time to dig into the LTXML API (C or Python), though I've
stashed a copy of the documentation on my desk somewhere, meaning to
get to it.
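For comparison, the same two access patterns can be sketched with xml.dom.pulldom from the standard library: walk every event in the stream, or skip ahead to the first start tag and expand that element into a full subtree. This is only an analogy as I read the quoted functions, not a description of the LTXML API; the function names all_events and first_element_tree are hypothetical.

```python
from xml.dom import pulldom


def all_events(xml_text):
    """Consume every event (start tag, end tag, text, ...) and count them."""
    count = 0
    for _event, _node in pulldom.parseString(xml_text):
        count += 1
    return count


def first_element_tree(xml_text):
    """Advance to the first start tag, then build that element's subtree."""
    stream = pulldom.parseString(xml_text)
    for event, node in stream:
        if event == pulldom.START_ELEMENT:
            # Pull the rest of this element's content into a DOM subtree.
            stream.expandNode(node)
            return node
    return None


if __name__ == "__main__":
    sample = "<a><b>hi</b></a>"
    print(all_events(sample))
    print(first_element_tree(sample).toxml())
```

Here the event stream plays the role of the bit-level interface, and expandNode plays the role of the tree-level call.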
-Fred
--
Fred L. Drake, Jr. <fdrake at acm.org>
PythonLabs at Zope Corporation