[XML-SIG] Pyana 0.2.0 released

Brian Quinlan brian@sweetapp.com
Mon, 17 Dec 2001 18:41:10 -0800


Uche Ogbuji wrote:

> > PIRXX is focused on providing Xerces XML services to Python. The current
> > release of PIRXX provides SAX2 interfaces but I believe that Jürgen is
> > working on DOM support.
> >
> > So, right now, Pyana is probably your best bet for high-performance XSLT
> > processing in Python while PIRXX offers Xerces SAX2 interfaces.
>
> Are you basing this on actual benchmarks?  In particular, I'd be surprised
> if Pyana was faster overall than current CVS of 4XSLT, Since Xalan isn't,
> as I measure it.

I am basing this on the timings of largish transformations that I was
doing around 2 months ago. Since then I haven't really compared them and
I have never run any formal benchmarks.

Note that one of the big problems with timing Xalan from the command
line is that it is very slow to load, especially on Windows. I just
timed "import Pyana" on my PIV 1.7GHz and it took 0.74s. But the beauty
of using Pyana instead of something like "popen('xalan ..." is that the
load time becomes a one-time cost for the application.
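
The one-time-cost point can be illustrated with a minimal sketch. It is
written for a current Python 3 interpreter rather than the Python 2.1 of this
post, and it times a standard-library module (xml.dom.minidom) as a stand-in
because Pyana may not be installed. The helper name timed_import is
hypothetical.

```python
import importlib
import sys
import time


def timed_import(module_name):
    """Import module_name and return (module, seconds taken)."""
    start = time.perf_counter()
    module = importlib.import_module(module_name)
    return module, time.perf_counter() - start


# Stand-in for "import Pyana"; any module with a noticeable load time works.
MODULE = "xml.dom.minidom"

first_module, first_cost = timed_import(MODULE)
second_module, second_cost = timed_import(MODULE)

# The second call finds the module in sys.modules, so the process only pays
# the load cost once, however many transformations follow.
print("first import:  %0.4fs" % first_cost)
print("second import: %0.4fs" % second_cost)
print("cached in sys.modules:", MODULE in sys.modules)
print("same module object:", first_module is second_module)
```

Spawning a command-line processor for every transformation would instead pay
the process start-up and library load cost on each call.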

For fun, I just downloaded:
http://www.datapower.com/XSLTMark/download/XSLTMark_2_1_0.zip

And wrote the attached script. I did this without expending any effort
trying to understand the benchmark suite; I just test each .xml/.xsl
pair. Notice that all of the source/stylesheet documents are small so
the advantage should go to 4suite.

I don't want to get 4suite from CVS so why don't you get Pyana:
http://prdownloads.sourceforge.net/pyana/Pyana-0.2.0.win32-py2.1.exe
(very easy Windows installer for Python 2.1 [you can probably figure out
how to transform that URL for Python 2.0 ;-)])

Then you can run this script against Pyana and (with a few tweaks)
against 4suite. Here are Pyana's results on my machine:

C:\Dev\Me\Pyana\pyana\Test>python benchmark.py
time to import Pyana: 0.0785s # Cached by Windows?
time to execute axis: 0.0080s (675 bytes of output)
time to execute bottles: 0.0145s (12075 bytes of output)
time to execute brutal: 0.0156s (4191 bytes of output)
time to execute chart: 0.0110s (3837 bytes of output)
time to execute current: 0.0060s (320 bytes of output)
time to execute game: 0.0080s (457 bytes of output)
time to execute html: 0.0066s (504 bytes of output)
time to execute identity: 0.0043s (218 bytes of output)
time to execute inventory: 0.0088s (2070 bytes of output)
time to execute metric: 0.0132s (640 bytes of output)
time to execute number: 0.0074s (788 bytes of output)
time to execute oddtemplate: 0.0071s (173 bytes of output)
time to execute priority: 0.0083s (587 bytes of output)
time to execute products: 0.0085s (439 bytes of output)
time to execute queens: 0.0900s (1772 bytes of output)
time to execute tower: 0.1555s (70729 bytes of output)
time to execute trend: 0.0513s (8382 bytes of output)
time to execute union: 0.0058s (128 bytes of output)
time to execute xpath: 0.0062s (225 bytes of output)
time to execute xslbench1: 0.0204s (7011 bytes of output)
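
To run the same loop against another processor, one option is to pass the
transformation in as a callable, so that only that callable changes between
processors. The sketch below assumes Python 3. The names time_transform,
fake_transform and format_result are hypothetical, and fake_transform is a
placeholder that does no XSLT work; no real Pyana or 4Suite call is shown.

```python
import time


def time_transform(transform, xml_path, xsl_path):
    """Call transform(xml_path, xsl_path) once.

    Returns (elapsed seconds, length of the returned output).
    """
    start = time.perf_counter()
    output = transform(xml_path, xsl_path)
    elapsed = time.perf_counter() - start
    return elapsed, len(output)


def fake_transform(xml_path, xsl_path):
    # Placeholder only: a real run would call the processor under test here.
    return "<out>%s + %s</out>" % (xml_path, xsl_path)


def format_result(name, elapsed, length):
    # Same line layout as the results listed above.
    return "time to execute %s: %0.4fs (%d bytes of output)" % (
        name, elapsed, length)


elapsed, length = time_transform(fake_transform, "axis.xml", "axis.xsl")
print(format_result("axis", elapsed, length))
```

Keeping the timing code in one helper means every processor is measured the
same way, with module import time excluded from the per-test figures.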

Cheers,
Brian

Attachment: benchmark.py

import time
import os

# Find all .xsl/.xml pairs
testcase_directory = r'C:\Dev\Me\Pyana\xsltmark\testcases'
testcase_files = os.listdir(testcase_directory)
testcase_files.sort()

# In the sorted listing, a pair is detected when an .xsl file directly
# follows the .xml file of the same name.
last_name = None
tests = []
for file in testcase_files:
    name, ext = os.path.splitext(file)
    if ext == '.xsl':
        if last_name == name:
            tests.append(name)
        last_name = None
    elif ext == '.xml':
        last_name = name
    else:
        last_name = None


# Time the import separately so it is not counted against any test
print 'time to import Pyana:',
startTime = time.clock()
import Pyana
print "%0.4fs" % (time.clock() - startTime)

# Time each transformation and report the size of its output
for test in tests:
    print 'time to execute %s:' % test,
    startTime = time.clock()
    length = len(Pyana.transformToString(
        source = Pyana.URI(os.path.join(testcase_directory, test + '.xml')),
        style = Pyana.URI(os.path.join(testcase_directory, test + '.xsl'))))
    print "%0.4fs (%d bytes of output)" % (time.clock() - startTime, length)