From gmludo at gmail.com  Wed May 10 08:29:26 2017
From: gmludo at gmail.com (Ludovic Gasc)
Date: Wed, 10 May 2017 14:29:26 +0200
Subject: [Speed] pymicrobench: collection of CPython microbenchmarks
In-Reply-To:
References:
Message-ID:

Interesting; you might want to ping python-dev about this repository.

--
Ludovic Gasc (GMLudo)
Lead Developer Architect at ALLOcloud
https://be.linkedin.com/in/ludovicgasc

2017-03-17 3:29 GMT+01:00 Victor Stinner:
> Hi,
>
> I started to create a collection of microbenchmarks for CPython from scripts found on the bug tracker:
> https://github.com/haypo/pymicrobench
>
> I'm not sure yet how this collection will be used, but some of you may want to take a look :-)
>
> I know that some people have random microbenchmarks in a local directory. Maybe you want to share them?
>
> I don't really care about sorting or grouping them. My plan is to populate the repository first, and later see what to do with it :-)
>
> Victor

From victor.stinner at gmail.com  Wed May 10 12:16:29 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 10 May 2017 18:16:29 +0200
Subject: [Speed] pymicrobench: collection of CPython microbenchmarks
In-Reply-To:
References:
Message-ID:

FYI, since my previous email in March I removed the following microbenchmarks from pyperformance and moved them to pymicrobench:

* call_simple, call_method, call_method_slots, call_method_unknown
* logging_silent

pyperformance development version:
http://pyperformance.readthedocs.io/changelog.html#version-0-5-5

https://github.com/haypo/pymicrobench

Victor

From gmludo at gmail.com  Wed May 10 12:28:05 2017
From: gmludo at gmail.com (Ludovic Gasc)
Date: Wed, 10 May 2017 18:28:05 +0200
Subject: [Speed] pymicrobench: collection of CPython microbenchmarks
In-Reply-To:
References:
Message-ID:

Thanks for the update; I'm catching up on unread e-mails ;-)

--
Ludovic Gasc (GMLudo)
Lead Developer Architect at ALLOcloud
https://be.linkedin.com/in/ludovicgasc

2017-05-10 18:16 GMT+02:00 Victor Stinner:
> FYI, since my previous email in March I removed the following microbenchmarks from pyperformance and moved them to pymicrobench:
>
> * call_simple, call_method, call_method_slots, call_method_unknown
> * logging_silent
>
> pyperformance development version:
> http://pyperformance.readthedocs.io/changelog.html#version-0-5-5
>
> https://github.com/haypo/pymicrobench
>
> Victor

From victor.stinner at gmail.com  Mon May 29 12:49:37 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 29 May 2017 18:49:37 +0200
Subject: [Speed] performance 0.5.5 and perf 1.3 released
Message-ID:

Hi,

I just released performance 0.5.5 and perf 1.3. The main change is the removal of microbenchmarks from performance; you can now find them in a separate project:
https://github.com/haypo/pymicrobench

The perf warmup calibration is still highly unstable, so PyPy still doesn't use performance yet. I began to analyze warmups manually, but I was too lazy to finish the work:
http://haypo-notes.readthedocs.io/pypy_warmups.html

=== performance 0.5.5 ===

* On the CPython 2.x branch, ``compile`` now passes ``--enable-unicode=ucs4`` to the ``configure`` script on all platforms, except on Windows, which uses UTF-16 because of its 16-bit wchar_t.
* The ``float`` benchmark now uses ``__slots__`` on the ``Point`` class.

* Remove the following microbenchmarks. They have been moved to the `pymicrobench <https://github.com/haypo/pymicrobench>`_ project because they are too short, not representative of real applications, and are too unstable.

  - ``pybench`` microbenchmark suite
  - ``call_simple``
  - ``call_method``
  - ``call_method_unknown``
  - ``call_method_slots``
  - ``logging_silent``: values are faster than 1 ns on PyPy with 2^27 loops! (and around 0.7 us on CPython)

* Update requirements

  - Django: 1.11 => 1.11.1
  - SQLAlchemy: 1.1.9 => 1.1.10
  - certifi: 2017.1.23 => 2017.4.17
  - perf: 1.2 => 1.3
  - mercurial: 4.1.2 => 4.2
  - tornado: 4.4.3 => 4.5.1

=== perf 1.3 ===

* Add ``get_loops()`` and ``get_inner_loops()`` methods to the Run and Benchmark classes.

* Documentation: add export_csv.py and plot.py examples.

* Rewrite warmup calibration for PyPy:

  - Use Q1, Q3 and the standard deviation, rather than the mean and a check for whether the first value is an outlier.
  - Always use a sample of 10 values, rather than a sample of variable size starting with 3 values.

* Use lazy imports for most of the largest modules to reduce the number of modules imported on ``import perf``.

* Fix handling of the broken pipe error to prevent logging the error: "Exception ignored in: ... BrokenPipeError: ...".

* ``collect_metadata`` gets more metadata on FreeBSD:

  - use ``os.getloadavg()`` if ``/proc/loadavg`` is not available (e.g. FreeBSD)
  - use ``psutil.boot_time()`` if ``/proc/stat`` is not available (e.g. FreeBSD) to get the ``boot_time`` and ``uptime`` metadata

* The Runner constructor now raises an exception if more than one instance is created.

Victor

From solipsis at pitrou.net  Mon May 29 13:00:22 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 29 May 2017 19:00:22 +0200
Subject: [Speed] performance 0.5.5 and perf 1.3 released
References:
Message-ID: <20170529190022.5488bef7@fsol>

On Mon, 29 May 2017 18:49:37 +0200 Victor Stinner wrote:
> * The ``float`` benchmark now uses ``__slots__`` on the ``Point`` class.

So the benchmark numbers are not comparable with previously generated ones?

> * Remove the following microbenchmarks. They have been moved to the `pymicrobench <https://github.com/haypo/pymicrobench>`_ project because they are too short, not representative of real applications, and are too unstable.
> [...]
> - ``logging_silent``: values are faster than 1 ns on PyPy with 2^27 loops! (and around 0.7 us on CPython)

The performance of silent logging calls is actually important for all applications which have debug() calls in their critical paths. This is quite common in network and/or distributed programming where you want to allow logging many events for diagnosis of unexpected runtime issues (because many unexpected conditions can appear), but with those logs disabled by default for performance and readability reasons.

This is no more a micro-benchmark than is, say, pickling or JSON encoding; and much less so than solving the N-body problem in pure Python without Numpy...

> * Update requirements
>
> - Django: 1.11 => 1.11.1
> - SQLAlchemy: 1.1.9 => 1.1.10
> - certifi: 2017.1.23 => 2017.4.17
> - perf: 1.2 => 1.3
> - mercurial: 4.1.2 => 4.2
> - tornado: 4.4.3 => 4.5.1

Are those requirements for the benchmark runner or for the benchmarks themselves? If the latter, won't updating the requirements make benchmark numbers non-comparable with those generated by previous versions? This is something that the previous benchmark suite tried to avoid by using pinned versions of third-party libraries.

Regards

Antoine.
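
To make the numbers under discussion concrete: what ``logging_silent`` times is essentially a ``logger.debug()`` call that is filtered out by the logging level. The following is a minimal sketch of such a measurement, not the actual benchmark shipped in performance, and it assumes the perf 1.x ``Runner.bench_func()`` API described in the perf documentation; the benchmark name and logger setup are illustrative only:

    # Sketch: time a logger.debug() call that is dropped because the
    # effective level is WARNING.  Illustration only, not the real
    # logging_silent benchmark from the performance project.
    import logging
    import perf

    logger = logging.getLogger("bench")
    logger.addHandler(logging.NullHandler())
    logger.setLevel(logging.WARNING)  # DEBUG records are discarded

    def silent_debug():
        # cheap but not free: a level check plus argument packing
        logger.debug("unused message %s", 123)

    if __name__ == "__main__":
        runner = perf.Runner()
        # bench_func() adds its own function-call overhead, which is one
        # reason sub-microsecond results like this one are hard to trust
        runner.bench_func("logging_silent_sketch", silent_debug)

Timings at this scale (around 0.7 us on CPython, under 1 ns on PyPy, per the changelog above) are exactly the regime where calibration and call overhead dominate the result.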

From solipsis at pitrou.net  Mon May 29 13:10:00 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 29 May 2017 19:10:00 +0200
Subject: [Speed] performance 0.5.5 and perf 1.3 released
References: <20170529190022.5488bef7@fsol>
Message-ID: <20170529191000.09a8b7f4@fsol>

Also, to expand a bit on what I'm trying to say: like you, I have my own idea of which benchmarks are pointless and unrepresentative, but when maintaining the former benchmark suite I usually refrained from removing those benchmarks, out of prudence and respect for the people who had written them (and probably had their reasons for finding those benchmarks useful).

Regards

Antoine.

On Mon, 29 May 2017 19:00:22 +0200 Antoine Pitrou wrote:
> On Mon, 29 May 2017 18:49:37 +0200 Victor Stinner wrote:
> > * The ``float`` benchmark now uses ``__slots__`` on the ``Point`` class.
>
> So the benchmark numbers are not comparable with previously generated ones?
>
> > * Remove the following microbenchmarks. They have been moved to the `pymicrobench <https://github.com/haypo/pymicrobench>`_ project because they are too short, not representative of real applications, and are too unstable.
> > [...]
> > - ``logging_silent``: values are faster than 1 ns on PyPy with 2^27 loops! (and around 0.7 us on CPython)
>
> The performance of silent logging calls is actually important for all applications which have debug() calls in their critical paths. This is quite common in network and/or distributed programming where you want to allow logging many events for diagnosis of unexpected runtime issues (because many unexpected conditions can appear), but with those logs disabled by default for performance and readability reasons.
>
> This is no more a micro-benchmark than is, say, pickling or JSON encoding; and much less so than solving the N-body problem in pure Python without Numpy...
>
> > * Update requirements
> >
> > - Django: 1.11 => 1.11.1
> > - SQLAlchemy: 1.1.9 => 1.1.10
> > - certifi: 2017.1.23 => 2017.4.17
> > - perf: 1.2 => 1.3
> > - mercurial: 4.1.2 => 4.2
> > - tornado: 4.4.3 => 4.5.1
>
> Are those requirements for the benchmark runner or for the benchmarks themselves? If the latter, won't updating the requirements make benchmark numbers non-comparable with those generated by previous versions? This is something that the previous benchmark suite tried to avoid by using pinned versions of third-party libraries.

From victor.stinner at gmail.com  Mon May 29 16:29:44 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 29 May 2017 22:29:44 +0200
Subject: [Speed] performance 0.5.5 and perf 1.3 released
In-Reply-To: <20170529190022.5488bef7@fsol>
References: <20170529190022.5488bef7@fsol>
Message-ID:

2017-05-29 19:00 GMT+02:00 Antoine Pitrou:
> The performance of silent logging calls is actually important for all applications which have debug() calls in their critical paths.

I wasn't sure about that one. The performance of many stdlib functions is important, but my main concern is to get reproducible benchmark results and to use benchmarks which are representative of large applications.

> This is quite common in network and/or distributed programming where you want to allow logging many events for diagnosis of unexpected runtime issues (because many unexpected conditions can appear), but with those logs disabled by default for performance and readability reasons.
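
Concretely, the pattern being discussed is a debug() call that sits in a hot path but is disabled in normal operation. A minimal standard-library sketch of that pattern (an illustration for this thread, not code taken from performance or from any application mentioned here; the function and logger names are made up) looks like this:

    import logging

    logger = logging.getLogger("worker")

    def expensive_dump(packet):
        # placeholder for costly serialization done only while debugging
        return sorted(packet.items())

    def handle_packet(packet):
        # cheap when DEBUG is disabled: logging does the level check and
        # skips the %-formatting, so only the call overhead remains
        logger.debug("got packet from %s", packet.get("src"))

        # genuinely expensive diagnostics are usually guarded explicitly,
        # so their cost is only paid when debug logging is enabled
        if logger.isEnabledFor(logging.DEBUG):
            logger.debug("full payload: %r", expensive_dump(packet))

    handle_packet({"src": "10.0.0.1", "payload": b"..."})

The unguarded call is what a "silent logging" benchmark times; the guard and the lazy %-style arguments are how applications usually keep that residual cost small without deleting the calls.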

Note: I would suggest using a preprocessor or something similar to *remove* the calls if performance is critical. That is the solution we chose at a previous company working on embedded devices :-)

> This is no more a micro-benchmark than is, say, pickling or JSON encoding; and much less so than solving the N-body problem in pure Python without Numpy...

I'm working step by step. In a perfect world, I would also remove all the benchmarks you listed :-) I'm in touch with Intel, who want to add new benchmarks that are more representative of applications, like Django.

>> * Update requirements
>>
>> - Django: 1.11 => 1.11.1
>> - SQLAlchemy: 1.1.9 => 1.1.10
>> - certifi: 2017.1.23 => 2017.4.17
>> - perf: 1.2 => 1.3
>> - mercurial: 4.1.2 => 4.2
>> - tornado: 4.4.3 => 4.5.1
>
> Are those requirements for the benchmark runner or for the benchmarks themselves?

performance creates a virtual environment, installs the dependencies and runs the benchmarks inside that virtual environment.

> If the latter, won't updating the requirements make benchmark numbers non-comparable with those generated by previous versions?

Yes, they are incompatible, and "performance compare" raises an error in that case (the performance version is stored in the JSON files).

> This is something that the previous benchmark suite tried to avoid by using pinned versions of third-party libraries.

Versions (of both direct and indirect dependencies) are pinned in performance/requirements.txt to get reproducible results. Each performance version is incompatible with the previous one. There is currently no backward-compatibility guarantee.

Maybe we should provide some kind of backward compatibility after the performance 1.0 release, for example by using semantic versioning.

But I'm not sure that it's really doable. I don't think that it matters so much to provide backward compatibility. It's very easy to get an old version of performance if you want to compare a new version with old results.

It would be nice to convince PyPy developers to run performance instead of their old benchmark suite. But it seems like there is a technical issue with the number of warmups.

Victor

From solipsis at pitrou.net  Mon May 29 16:45:21 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 29 May 2017 22:45:21 +0200
Subject: [Speed] performance 0.5.5 and perf 1.3 released
References: <20170529190022.5488bef7@fsol>
Message-ID: <20170529224521.17d61e3d@fsol>

On Mon, 29 May 2017 22:29:44 +0200 Victor Stinner wrote:
>
> > This is quite common in network and/or distributed programming where you want to allow logging many events for diagnosis of unexpected runtime issues (because many unexpected conditions can appear), but with those logs disabled by default for performance and readability reasons.
>
> Note: I would suggest using a preprocessor or something similar to *remove* the calls if performance is critical. That is the solution we chose at a previous company working on embedded devices :-)

Then you lose the ability to enable debug logs on user / customer systems, unless the preprocessor is run each time the user changes the logging configuration, which adds another layer of fragility.

> But I'm not sure that it's really doable. I don't think that it matters so much to provide backward compatibility. It's very easy to get an old version of performance if you want to compare a new version with old results.

I don't know.
It means that benchmark results published on the Web are generally not comparable with each other unless they happen to be generated with the exact same version. It reduces the usefulness of the benchmark suite quite a bit IMHO.

Let's ask the question a different way: was there any necessity to update those dependencies? If yes, then fair enough. Otherwise, the compatibility breakage is gratuitous.

Regards

Antoine.

From victor.stinner at gmail.com  Mon May 29 16:46:05 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 29 May 2017 22:46:05 +0200
Subject: [Speed] performance 0.5.5 and perf 1.3 released
In-Reply-To: <20170529191000.09a8b7f4@fsol>
References: <20170529190022.5488bef7@fsol> <20170529191000.09a8b7f4@fsol>
Message-ID:

2017-05-29 19:10 GMT+02:00 Antoine Pitrou:
> Also, to expand a bit on what I'm trying to say: like you, I have my own idea of which benchmarks are pointless and unrepresentative, but when maintaining the former benchmark suite I usually refrained from removing those benchmarks, out of prudence and respect for the people who had written them (and probably had their reasons for finding those benchmarks useful).

I created a pull request to reintroduce the benchmark:
https://github.com/python/performance/pull/25

Victor

From victor.stinner at gmail.com  Mon May 29 16:57:18 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 29 May 2017 22:57:18 +0200
Subject: [Speed] performance 0.5.5 and perf 1.3 released
In-Reply-To: <20170529224521.17d61e3d@fsol>
References: <20170529190022.5488bef7@fsol> <20170529224521.17d61e3d@fsol>
Message-ID:

2017-05-29 22:45 GMT+02:00 Antoine Pitrou:
> I don't know. It means that benchmark results published on the Web are generally not comparable with each other unless they happen to be generated with the exact same version. It reduces the usefulness of the benchmark suite quite a bit IMHO.

I only know 3 websites to compare Python performance:

* speed.python.org
* speed.pypy.org
* speed.pyston.org

My goal is to convince PyPy developers to use performance. I'm not sure that pyston.org is relevant: it seems like their forked benchmark suite is modified, so I don't expect that results on pypy.org and pyston.org are comparable. I would also prefer that Pyston uses the same benchmark suite.

About speed.python.org, what was decided is to *drop* all previous results if we modify benchmarks. That's what I already did 3 times:

* 2017-03-31: old results removed, new CPython results to use Git commits instead of Mercurial.
* 2017-01: old results computed without PGO removed (unstable because of code placement), new CPython results using PGO
* 2016-11-04: old results computed with benchmarks removed, new CPython results (using LTO but not PGO) computed with the new performance benchmark suite.

To be honest, in the meanwhile, I chose to run the master branches of perf and performance while developing them. In practice, I never noticed any significant performance change on any benchmark over the last 12 months when I updated dependencies. Sadly, it seems like no significant optimization was merged into our dependencies.

> Let's ask the question a different way: was there any necessity to update those dependencies? If yes, then fair enough. Otherwise, the compatibility breakage is gratuitous.

When I started to work on benchmarks last year, I noticed that we used a Mercurial version which was 5 years old, and a Django version which was something like 3 years old.
I would like to benchmark the Mercurial and Django versions deployed in production.

Why do you want to update performance if you want a pinned version of Django? Just always use the same performance version, no?

For speed.python.org, maybe we can decide that we always use a fixed version of performance, and that we must remove all data each time we change the performance version. For my needs, maybe we could spawn a "beta" subdomain running master branches? Again, I expect no significant difference between the main website and the beta website. But we can do it if you want.

Victor

From victor.stinner at gmail.com  Mon May 29 17:01:19 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 29 May 2017 23:01:19 +0200
Subject: [Speed] performance 0.5.5 and perf 1.3 released
In-Reply-To:
References: <20170529190022.5488bef7@fsol> <20170529224521.17d61e3d@fsol>
Message-ID:

2017-05-29 22:57 GMT+02:00 Victor Stinner:
> When I started to work on benchmarks last year, I noticed that we used a Mercurial version which was 5 years old, and a Django version which was something like 3 years old. I would like to benchmark the Mercurial and Django versions deployed in production.

Hum, I sent my email before checking. At revision 9923b81a1d34, the benchmarks repository used:

lib/2to3
lib3/2to3
lib3/Chameleon-2.9.2
lib3/mako-0.3.6
lib3/Mako-0.7.3
lib/bazaar
lib/Chameleon-2.22
lib/Chameleon-2.9.2
lib/django
lib/Django-1.5
lib/Django-1.9
lib/google_appengine
lib/html5lib
lib/lockfile
lib/mako-0.3.6
lib/Mako-0.7.3
lib/mercurial  # version 1.2.1
lib/pathlib
lib/psyco
lib/rietveld
lib/spambayes
lib/spitfire
lib/tornado-3.1.1

Django 1.9 was released on 2015-12-01 and Mercurial 1.2.1 was released on 2009-03-20. So sorry, Django was kind of up to date, but Mercurial wasn't. Many libraries were also deprecated.

Victor

From brett at python.org  Mon May 29 20:55:39 2017
From: brett at python.org (Brett Cannon)
Date: Tue, 30 May 2017 00:55:39 +0000
Subject: [Speed] performance 0.5.5 and perf 1.3 released
In-Reply-To:
References: <20170529190022.5488bef7@fsol> <20170529224521.17d61e3d@fsol>
Message-ID:

I thought Pyston had been shuttered by Dropbox, so I wouldn't worry about convincing them to change speed.pyston.org.

On Mon, May 29, 2017, 13:57 Victor Stinner wrote:

> 2017-05-29 22:45 GMT+02:00 Antoine Pitrou:
> > I don't know. It means that benchmark results published on the Web are generally not comparable with each other unless they happen to be generated with the exact same version. It reduces the usefulness of the benchmark suite quite a bit IMHO.
>
> I only know 3 websites to compare Python performance:
>
> * speed.python.org
> * speed.pypy.org
> * speed.pyston.org
>
> My goal is to convince PyPy developers to use performance. I'm not sure that pyston.org is relevant: it seems like their forked benchmark suite is modified, so I don't expect that results on pypy.org and pyston.org are comparable. I would also prefer that Pyston uses the same benchmark suite.
>
> About speed.python.org, what was decided is to *drop* all previous results if we modify benchmarks. That's what I already did 3 times:
>
> * 2017-03-31: old results removed, new CPython results to use Git commits instead of Mercurial.
> * 2017-01: old results computed without PGO removed (unstable because of code placement), new CPython results using PGO
> * 2016-11-04: old results computed with benchmarks removed, new CPython results (using LTO but not PGO) computed with the new performance benchmark suite.
>
> To be honest, in the meanwhile, I chose to run the master branches of perf and performance while developing them. In practice, I never noticed any significant performance change on any benchmark over the last 12 months when I updated dependencies. Sadly, it seems like no significant optimization was merged into our dependencies.
>
> > Let's ask the question a different way: was there any necessity to update those dependencies? If yes, then fair enough. Otherwise, the compatibility breakage is gratuitous.
>
> When I started to work on benchmarks last year, I noticed that we used a Mercurial version which was 5 years old, and a Django version which was something like 3 years old. I would like to benchmark the Mercurial and Django versions deployed in production.
>
> Why do you want to update performance if you want a pinned version of Django? Just always use the same performance version, no?
>
> For speed.python.org, maybe we can decide that we always use a fixed version of performance, and that we must remove all data each time we change the performance version. For my needs, maybe we could spawn a "beta" subdomain running master branches? Again, I expect no significant difference between the main website and the beta website. But we can do it if you want.
>
> Victor