[Speed] Are benchmarks and libraries mutable?

Sun Sep 2 00:10:34 CEST 2012

On Sat, Sep 1, 2012 at 2:57 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:

> On Sat, 1 Sep 2012 13:21:36 -0400
> Brett Cannon <brett at python.org> wrote:
> >
> > One is moving benchmarks from PyPy over to the unladen repo on
> > hg.python.org/benchmarks. But I wanted to first make sure people don't
> > view the benchmarks as immutable (e.g. as Octane does:
> > https://developers.google.com/octane/faq). Since the benchmarks are
> > always relative between two interpreters their immutability isn't
> > critical compared to if we were to report some overall score. But it
> > also means that any changes made would throw off historical
> > comparisons. For instance, if I take PyPy's Mako benchmark (which does
> > a lot more work), should it be named mako_v2, or should we just
> > replace mako wholesale?
>
> mako_v2 sounds fine to me. Mutating benchmarks makes things confusing:
> one person may report that interpreter A is faster than interpreter B
> on a given benchmark, and another person retort that no, interpreter B
> is faster than interpreter A.
>
> Besides, if you want to have useful timelines on speed.p.o, you
> definitely need stable benchmarks.
>
> > And the second is the same question for libraries. For instance, the
> > unladen benchmarks have Django 1.1a0 as the version which is rather
> > ancient. And with 1.5 coming out with provisional Python 3 support I
> > obviously would like to update it. But the same questions as with
> > benchmarks crop up in reference to immutability.
>
> django_v2 sounds fine too :)
>

True, but having to carry around multiple copies of libraries just becomes
a pain.

>
> > (e.g. I will have to probably update the 2.7 code to use
> > io.BytesIO instead of StringIO.StringIO to be on more equal footing).
>
> I disagree. If io.BytesIO is faster than StringIO.StringIO then it's
> normal for the benchmark results to reflect that (ditto if it's slower).
>
> > If we can't find a reasonable way to handle all of this then what I
> > will do is branch the unladen benchmarks for 2.x/3.x benchmarking, and
> > then create another branch of the benchmark suite to just be for
> > Python 3.x so that we can start fresh with a new set of benchmarks
> > that will never change themselves for benchmarking Python 3 itself.
>
> Why not simply add Python 3-specific benchmarks to the mix?
> You can then create a "py3" benchmark suite in perf.py (and perhaps
> also a "py2" one).
>

To avoid historical baggage and to start from a clean slate. I don't
necessarily want to carry around Python 2 benchmarks forever. It's not a
massive concern, just a nicety.
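
For illustration only, here is a rough sketch of how versioned benchmark
names (mako_v2, django_v2) and separate "py2"/"py3" suites could coexist
in one repository. The group table and the expand_groups helper are
hypothetical names made up for this example; they are not taken from
perf.py.

```python
# Hypothetical sketch: the group table and helper below are illustrative
# assumptions, not perf.py's actual data structures.
BENCH_GROUPS = {
    "py2": ["mako", "django"],        # original benchmarks, left untouched
    "py3": ["mako_v2", "django_v2"],  # new versions under new names
    "all": ["py2", "py3"],            # a group may refer to other groups
}


def expand_groups(names, groups=BENCH_GROUPS):
    """Resolve group names into a sorted list of concrete benchmark names."""
    resolved = set()
    for name in names:
        if name in groups:
            # Recurse so that groups can be built out of other groups.
            resolved.update(expand_groups(groups[name], groups))
        else:
            resolved.add(name)
    return sorted(resolved)


if __name__ == "__main__":
    print(expand_groups(["py3"]))
```

Because an existing name such as mako never changes meaning under this
scheme, old timelines stay comparable, and a Python 2 group could later
be dropped by deleting one entry.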