[Python-Dev] Python Benchmarks

M.-A. Lemburg mal at egenix.com
Wed May 31 20:28:37 CEST 2006


Fredrik Lundh wrote:
> M.-A. Lemburg wrote:
> 
>> AFAIK, there were no real issues with pybench, only with the
>> fact that time.clock() (the timer used by pybench) is wall-time
>> on Windows and thus an MP3-player running in the background
>> will cause some serious noise in the measurements 
> 
> oh, please; as I mentioned back then, PyBench reported massive slowdowns 
> and huge speedups in code that wasn't touched, gave unrepeatable results 
> on most platforms, and caused us to waste quite some time investigating 
> potential regressions from 2.4 that simply didn't exist.

It would be nice if you could post examples of these results
and also the details of the platforms on which you tested.
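To make the time.clock() issue concrete, here is a minimal
sketch (plain stdlib, Python 2.x semantics; it only demonstrates
the documented platform difference and is not part of pybench):

    import time

    # time.clock() measures different things depending on platform:
    #  - Windows: high-resolution *wall-clock* time, so background
    #    processes (e.g. an MP3 player) show up in the measurements
    #  - Unix: *CPU* time of the current process, largely immune
    #    to other system load
    t_clock = time.clock()
    t_wall = time.time()
    time.sleep(1.0)   # burns wall-clock time, but almost no CPU
    print("time.clock() delta: %.3f" % (time.clock() - t_clock))
    print("time.time()  delta: %.3f" % (time.time() - t_wall))
    # On Unix the clock() delta stays near 0 (no CPU was burned);
    # on Windows it tracks the time.time() delta (about 1 second).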

> of about a dozen claimed slowdowns when comparing 2.4 to 2.5a2 on 
> several platforms, only *one* slowdown could be independently confirmed 
> with other tools.
> 
> and when we fixed that, and ended up with an implementation that was 
> *faster* than in 2.4, PyBench didn't even notice the speedup.

Which one was that?

With which parameters did you run pybench?

> the fact is that the results for individual tests in PyBench are 100% 
> unreliable.  I have no idea why.

If they are 100% unreliable, then perhaps we should simply
reverse the claims pybench makes ;-) (this would be like a
100% wrong weather report).

Seriously, I've been using and running pybench for years,
and even though tweaks to the interpreter do sometimes
result in speedups or slow-downs where you wouldn't expect
them (because the interpreter itself uses the same Python
objects being tested), the results are reproducible, and
they have often uncovered cases where an optimization in
one area results in slow-downs in other areas.

The results are often related to low-level features of the
architecture you run the code on, such as cache size, cache
line size, the number of CPU registers or FPU stack slots,
and so on.

E.g. some years ago, I played around with the ceval loop,
ordered the cases of its switch statement in different ways,
split the switch into two parts, etc. The results were
impressive - but mostly due to the switch code being better
optimized for the platform I was running on (AMD at the
time). Moving a single case sometimes had a huge effect on
the timings, mostly because the code ended up more suitably
aligned for the CPU cache.

> the accumulated result may be somewhat useful (at least after the "use 
> minimum time instead of average" changes), but I wouldn't use it for any 
> serious purpose.  at least PyStone is unusable in a well-defined way ;-)

I think that most of your experience is related to the way
timing is done on Windows.

Like I said: I'm working on better timers for use in pybench.
Hopefully, this will make your experience a better one next
time you try.
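For the record, both ideas - picking a suitable timer per
platform and taking the minimum of several runs - are easy
to sketch (illustrative only; the helper name is made up and
this is not the actual pybench implementation):

    import sys, time

    # Pick a timer that suits the platform: on Windows,
    # time.clock() is the high-resolution wall-clock timer;
    # on Unix, time.time() is the better wall-clock choice
    # (time.clock() measures CPU time there, at fairly
    # coarse resolution).
    if sys.platform == 'win32':
        default_timer = time.clock
    else:
        default_timer = time.time

    def min_timing(func, repeats=10, timer=default_timer):
        """Run func() several times and return the minimum
        elapsed time. The minimum filters out one-off
        interruptions from other processes far better than
        an average does."""
        best = None
        for _ in range(repeats):
            t0 = timer()
            func()
            elapsed = timer() - t0
            if best is None or elapsed < best:
                best = elapsed
        return best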

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 31 2006)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows, Linux, Solaris, FreeBSD for free! :::

