From victor.stinner at gmail.com  Mon Apr  3 08:11:23 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 3 Apr 2017 14:11:23 +0200
Subject: [Speed] Issues to run benchmarks on Python before 2015-04-01
In-Reply-To: 
References: 
Message-ID: 

2017-04-01 0:47 GMT+02:00 Victor Stinner :
> (2) 2014-04-01, 2014-07-01, 2014-10-01, 2015-01-01: "venv/bin/python
> -m pip install" fails in extract_stack() of pyparsing

I reported the issue on the pip bug tracker:
https://github.com/pypa/pip/issues/4408

Victor

From victor.stinner at gmail.com  Mon Apr  3 18:21:33 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 4 Apr 2017 00:21:33 +0200
Subject: [Speed] performance: remove "pure python" pickle benchmarks?
Message-ID: 

Hi,

Among the benchmarks slower in Python 3.7 compared to 2.7 at
https://speed.python.org/comparison/ I found pickle_pure_python and
unpickle_pure_python. These benchmarks test the pure Python
implementation of the pickle protocol.

I don't see the point of testing the pure Python implementation, since
the C accelerator (_pickle) is always used by default in Python 3. I
propose to remove this benchmark. What do you think?

Victor

From rdmurray at bitdance.com  Mon Apr  3 18:59:30 2017
From: rdmurray at bitdance.com (R. David Murray)
Date: Mon, 03 Apr 2017 18:59:30 -0400
Subject: [Speed] performance: remove "pure python" pickle benchmarks?
In-Reply-To: 
References: 
Message-ID: <20170403225931.713591310017@webabinitio.net>

On Tue, 04 Apr 2017 00:21:33 +0200, Victor Stinner wrote:
> I don't see the point of testing the pure Python implementation, since
> the C accelerator (_pickle) is always used by default in Python 3. I
> propose to remove this benchmark. What do you think?

Are these benchmarks always going to be used only for CPython? I
thought the long term goal was to support multiple implementations?

--David

From victor.stinner at gmail.com  Mon Apr  3 19:57:53 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 4 Apr 2017 01:57:53 +0200
Subject: [Speed] performance: remove "pure python" pickle benchmarks?
In-Reply-To: <20170403225931.713591310017@webabinitio.net>
References: <20170403225931.713591310017@webabinitio.net>
Message-ID: 

2017-04-04 0:59 GMT+02:00 R. David Murray :
> On Tue, 04 Apr 2017 00:21:33 +0200, Victor Stinner wrote:
>> I don't see the point of testing the pure Python implementation, since
>> the C accelerator (_pickle) is always used by default in Python 3. I
>> propose to remove this benchmark. What do you think?
>
> Are these benchmarks always going to be used only for CPython? I
> thought the long term goal was to support multiple implementations?

pyperformance is supposed to work on any Python implementation. The
pickle and unpickle benchmarks test the "pickle" module, which is
generic and should be available on all Python implementations.

The pickle_pure_python and unpickle_pure_python benchmarks explicitly
force the use of the pure Python implementation and so are likely to be
specific to CPython:
https://github.com/python/performance/blob/master/performance/benchmarks/bm_pickle.py#L275

Ah, the trick is that Python 2 requires you to import "cPickle" instead
of "pickle" to get the C implementation. Well, since we are talking
about performance, I expect that users use "cPickle" and not "pickle"
on Python 2, no?
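
For reference, forcing the pure Python implementation basically comes
down to blocking the accelerator before pickle is first imported. This
is only an illustrative sketch of the idea on Python 3, not necessarily
the exact code used in bm_pickle.py:

    import sys

    # Illustrative sketch (assumption, not the benchmark's actual code):
    # make "import _pickle" fail so that the pickle module falls back to
    # its pure Python implementation. Must run before importing pickle.
    sys.modules['_pickle'] = None

    import pickle
    # With the accelerator blocked, Pickler comes from pickle.py itself.
    assert pickle.Pickler.__module__ == 'pickle'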

Victor

From storchaka at gmail.com  Tue Apr  4 06:06:58 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 4 Apr 2017 13:06:58 +0300
Subject: [Speed] performance: remove "pure python" pickle benchmarks?
In-Reply-To: 
References: 
Message-ID: 

On 04.04.17 01:21, Victor Stinner wrote:
> Among the benchmarks slower in Python 3.7 compared to 2.7 at
> https://speed.python.org/comparison/ I found pickle_pure_python and
> unpickle_pure_python. These benchmarks test the pure Python
> implementation of the pickle protocol.
>
> I don't see the point of testing the pure Python implementation, since
> the C accelerator (_pickle) is always used by default in Python 3. I
> propose to remove this benchmark. What do you think?

I consider it a benchmark of the Python interpreter itself. And indeed,
these benchmarks show progress from 3.5 to 3.7. A slow regression in
unpickling in 3.5 may be related to adding stronger validation (but may
not be related).

Unfortunately the Python code is different in different versions. Maybe
write a single cross-version pickle implementation and use it with
these benchmarks?

From victor.stinner at gmail.com  Tue Apr  4 07:43:06 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 4 Apr 2017 13:43:06 +0200
Subject: [Speed] performance: remove "pure python" pickle benchmarks?
In-Reply-To: 
References: 
Message-ID: 

2017-04-04 12:06 GMT+02:00 Serhiy Storchaka :
> I consider it a benchmark of the Python interpreter itself.

Don't we have enough benchmarks to test the Python interpreter?

I would prefer to have more realistic use cases than "reimplement
pickle in pure Python".

The "unpickle_pure_python" name can be misleading as well to users
exploring speed.python.org data, no?

By the way, I'm open to adding new benchmarks ;-)

> Unfortunately the Python code is different in different versions. Maybe
> write a single cross-version pickle implementation and use it with
> these benchmarks?

That's another good reason to remove the benchmark :-) Other benchmarks
use pinned versions of their dependencies (see
performance/requirements.txt), so their code should be more or less the
same on all Python versions. (Expect code differences between Python 2
and Python 3, though.)

> And indeed, these benchmarks show progress from 3.5 to 3.7. A slow
> regression in unpickling in 3.5 may be related to adding stronger
> validation (but may not be related).

3.7 is faster than 3.5. Are you running benchmarks with LTO + PGO +
"perf system tune"?
https://speed.python.org/comparison/?exe=12%2BL%2Bmaster%2C12%2BL%2B3.5&ben=647%2C675&env=1&hor=true&bas=12%2BL%2B3.5&chart=normal+bars

Victor

From ncoghlan at gmail.com  Tue Apr  4 11:31:20 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 5 Apr 2017 01:31:20 +1000
Subject: [Speed] performance: remove "pure python" pickle benchmarks?
In-Reply-To: 
References: 
Message-ID: 

On 4 April 2017 at 21:43, Victor Stinner wrote:
> 2017-04-04 12:06 GMT+02:00 Serhiy Storchaka :
>> I consider it a benchmark of the Python interpreter itself.
>
> Don't we have enough benchmarks to test the Python interpreter?
>
> I would prefer to have more realistic use cases than "reimplement
> pickle in pure Python".
>
> The "unpickle_pure_python" name can be misleading as well to users
> exploring speed.python.org data, no?

The split benchmark likely made more sense in Python 2, when "import
pickle" gave you the pure Python version by default, and you had to do
"import cPickle as pickle" to get the accelerated version - you'd get
very different performance characteristics based on which import the
application used.

It makes significantly less sense now that Python 3 always uses the
accelerated version by default and only falls back to pure Python if
the accelerator module is missing for some reason.
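
Illustratively, the Python 2 era idiom was usually spelled like this (a
sketch of the common pattern, not code taken from the benchmark suite):

    # Python 2: opt in to the C accelerator explicitly.
    try:
        import cPickle as pickle
    except ImportError:
        import pickle

    # Python 3: a plain "import pickle" already uses the _pickle
    # accelerator when it is available, so no special casing is needed.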

If anything, the appropriate cross-version comparison would be between
the pure Python version in 2.7, and the accelerated version in 3.x,
since that reflects the performance change you get when you do "import
pickle".

However, that argument only applies to whether or not to include it in
the default benchmark set used to compare the overall performance
across versions and implementations - it's still valid as a
microbenchmark looking for major regressions in the speed of the pure
Python fallback.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From solipsis at pitrou.net  Tue Apr  4 11:47:09 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 4 Apr 2017 17:47:09 +0200
Subject: [Speed] performance: remove "pure python" pickle benchmarks?
References: 
Message-ID: <20170404174709.01dbd1fa@fsol>

Unfortunately, the C version of pickle lacks the extensibility of the
pure Python one, so the pure Python implementation has to be used in
some cases. One such example is the `cloudpickle` project, which
extends pickle to support many more types, such as local functions.
`cloudpickle` is often used by distributed executors to allow shipping
Python code for remote execution on a cluster. See
https://github.com/cloudpipe/cloudpickle/blob/master/cloudpickle/cloudpickle.py#L59-L70

Regards

Antoine.

On Wed, 5 Apr 2017 01:31:20 +1000
Nick Coghlan wrote:
> On 4 April 2017 at 21:43, Victor Stinner wrote:
> > 2017-04-04 12:06 GMT+02:00 Serhiy Storchaka :
> >> I consider it a benchmark of the Python interpreter itself.
> >
> > Don't we have enough benchmarks to test the Python interpreter?
> >
> > I would prefer to have more realistic use cases than "reimplement
> > pickle in pure Python".
> >
> > The "unpickle_pure_python" name can be misleading as well to users
> > exploring speed.python.org data, no?
>
> The split benchmark likely made more sense in Python 2, when "import
> pickle" gave you the pure Python version by default, and you had to do
> "import cPickle as pickle" to get the accelerated version - you'd get
> very different performance characteristics based on which import the
> application used.
>
> It makes significantly less sense now that Python 3 always uses the
> accelerated version by default and only falls back to pure Python if
> the accelerator module is missing for some reason. If anything, the
> appropriate cross-version comparison would be between the pure Python
> version in 2.7, and the accelerated version in 3.x, since that
> reflects the performance change you get when you do "import pickle".
>
> However, that argument only applies to whether or not to include it in
> the default benchmark set used to compare the overall performance
> across versions and implementations - it's still valid as a
> microbenchmark looking for major regressions in the speed of the pure
> Python fallback.
>
> Cheers,
> Nick.
>

From victor.stinner at gmail.com  Tue Apr  4 12:44:34 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 4 Apr 2017 18:44:34 +0200
Subject: [Speed] Timeline of CPython performance April, 2014 - April, 2017
Message-ID: 

Hi,

I hacked my "performance compile" command to force pip 7.1.2 on alpha
versions of Python 3.5, which worked around the pyparsing regression:
https://sourceforge.net/p/pyparsing/bugs/100/

I succeeded in running benchmarks on CPython over the period April 2014
- April 2017, with one dot per quarter.
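
("One dot per quarter" simply means one benchmarked revision per
quarter; a hypothetical helper to enumerate those dates, not the actual
tool, could look like this:

    from datetime import date

    def quarterly_dates(first_year, last_year):
        """Yield the first day of each quarter, e.g. 2014-04-01, 2014-07-01, ..."""
        for year in range(first_year, last_year + 1):
            for month in (1, 4, 7, 10):
                yield date(year, month, 1)

    # 2014-04-01 through 2017-04-01 gives 13 dots on the timeline.
    dots = [d for d in quarterly_dates(2014, 2017)
            if date(2014, 4, 1) <= d <= date(2017, 4, 1)]
)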

I started to analyze performance in depth and added notes at:

From victor.stinner at gmail.com  Tue Apr  4 12:49:16 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 4 Apr 2017 18:49:16 +0200
Subject: [Speed] Timeline of CPython performance April, 2014 - April, 2017
In-Reply-To: 
References: 
Message-ID: 

(Crap, how did I send an incomplete email? Sorry about that.)

Hi,

I hacked my "performance compile" command to force pip 7.1.2 on alpha
versions of Python 3.5, which worked around the pyparsing regression
(pyparsing is used by pip since pip 8):
https://sourceforge.net/p/pyparsing/bugs/100/

I succeeded in running benchmarks on CPython over the period April 2014
- April 2017, with one dot per quarter (so 4 dots per year).

I started to analyze performance in depth and added notes in the
performance documentation:
http://pyperformance.readthedocs.io/cpython_results_2017.html
(performance now has online documentation!)

I wrote a tool to bisect a performance change; it tries to find the
commit which made a significant performance change (slowdown or
speedup):
https://github.com/haypo/misc/blob/master/misc/find_git_revisions_by_date.py

I have now started to compute one dot per month to get a better
resolution, since my tool failed to bisect the April 2016 - July 2016
period. With a resolution of one month, I identified the
"PyMem_Malloc() now uses the fast pymalloc allocator" change, which has
a significant impact on unpickle_list.

Victor

From ncoghlan at gmail.com  Wed Apr  5 05:43:08 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 5 Apr 2017 19:43:08 +1000
Subject: [Speed] performance: remove "pure python" pickle benchmarks?
In-Reply-To: <20170404174709.01dbd1fa@fsol>
References: <20170404174709.01dbd1fa@fsol>
Message-ID: 

On 5 April 2017 at 01:47, Antoine Pitrou wrote:
>
> Unfortunately, the C version of pickle lacks the extensibility of the
> pure Python one, so the pure Python implementation has to be used in
> some cases. One such example is the `cloudpickle` project, which
> extends pickle to support many more types, such as local functions.
> `cloudpickle` is often used by distributed executors to allow shipping
> Python code for remote execution on a cluster.

Perhaps a more suitable benchmark could be formulated based on that?

That way the benchmark could be pinned to a particular version of
cloudpickle, and it would be testing a known real-world use case that
explicitly requires the pure Python version of the underlying pickle
module.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From victor.stinner at gmail.com  Thu Apr  6 05:00:39 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 6 Apr 2017 11:00:39 +0200
Subject: [Speed] CPython benchmark status, April 2017
Message-ID: 

Hi,

I'm still working on analyzing past optimizations to guide future
optimizations. I succeeded in identifying multiple significant
optimizations over the last 3 years. At least for me, some were
unexpected, like "Use the test suite for profile data", which made
pidigits 1.16x faster. Here is a report on my work of the last weeks.

I succeeded in computing benchmarks on CPython master over the period
April 2014 - April 2017: we now have a timeline covering 3 years of
CPython performance!

https://speed.python.org/timeline/

I started to take notes on significant performance changes (speedups
and slowdowns) of this timeline:
http://pyperformance.readthedocs.io/cpython_results_2017.html

To identify the change which introduced a significant performance
change, I wrote a Python script running a Git bisection: compile
CPython, run the benchmark, repeat.
https://github.com/haypo/misc/blob/master/misc/bisect_cpython_perf.py

It uses a configuration file which looks like:
---
[config]
work_dir = ~/prog/bench_python/bisect-pickle
src_dir = ~/prog/bench_python/master
old_commit = 133138a284be1985ebd9ec9014f1306b9a42
new_commit = 10427f44852b6e872034061421a8890902b8f
benchmark = ~/prog/bench_python/performance/performance/benchmarks/bm_pickle.py pickle
benchmark_opts = --inherit-environ=PYTHONPATH -p5 -v
configure_args =
---

I succeeded in identifying many significant optimizations (TODO:
validate them on the speed-python server). Examples:

* PyMem_Malloc() now uses the fast pymalloc allocator
* Add a C implementation of collections.OrderedDict
* Use the test suite for profile data
* Speedup method calls 1.2x
* Added C implementation of functools.lru_cache()
* Optimized ElementTree.iterparse(); it is now 2x faster

perf, performance, the server configuration, etc. evolve quicker than
expected, so I created a Git project to keep a copy of the JSON files:
https://github.com/haypo/performance_results

I already lost the data of my first milestone (November-December 2016),
but you have the data from the second (December 2016 - February 2017)
and third (March 2017 - today) milestones.

I'm now discussing with PyPy developers to see how performance could be
used to measure PyPy performance.

Victor

From storchaka at gmail.com  Fri Apr  7 01:22:57 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Fri, 7 Apr 2017 08:22:57 +0300
Subject: [Speed] CPython benchmark status, April 2017
In-Reply-To: 
References: 
Message-ID: 

On 06.04.17 12:00, Victor Stinner wrote:
> I succeeded in computing benchmarks on CPython master over the period
> April 2014 - April 2017: we now have a timeline covering 3 years of
> CPython performance!
>
> https://speed.python.org/timeline/

Excellent! I always wanted to see such graphs.

But can you please output the years on the time scale? And it would be
nice to add thin horizontal lines for the current performance of the
maintained Python releases.

From victor.stinner at gmail.com  Fri Apr  7 09:45:39 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 7 Apr 2017 15:45:39 +0200
Subject: [Speed] CPython benchmark status, April 2017
In-Reply-To: 
References: 
Message-ID: 

2017-04-07 7:22 GMT+02:00 Serhiy Storchaka :
>> https://speed.python.org/timeline/
>
> Excellent! I always wanted to see such graphs.

Cool :-)

> But can you please output the years on the time scale?

Ah ah, yeah, the lack of years becomes painful :-) You should look at
https://github.com/tobami/codespeed/ which is the Django app running
speed.python.org.

> And it would be nice to add thin horizontal lines for the current
> performance of the maintained Python releases.

I tried to do that, but it seems like I'm not uploading results
correctly into Codespeed. It seems like Codespeed requires different
"executable" objects to be able to compare two Python branches. It
doesn't seem possible to compare Python 2.7 and Python 3.7 if they are
different branches of the same executable?

On http://speed.pypy.org/timeline/ it is possible to compare PyPy to
CPython for example, but I'm not sure that you can compare two branches
of the same executable either.

Maybe each CPython x.y branch should use a different executable, so for
example use "cpython2.7-lto-pgo" instead of "lto-pgo" for Python 2.7?

Victor

From ncoghlan at gmail.com  Wed Apr 12 00:58:06 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Apr 2017 14:58:06 +1000
Subject: [Speed] Testing a wide Unicode build of Python 2 on speed.python.org?
Message-ID: 

speed.python.org has been updated to split out per-branch results for
easier cross version comparisons, but looking at the performance repo
suggests that the only 2.7 results currently reported are for the
default UCS2 builds.

That isn't the way Linux distros typically ship Python: we/they specify
the "--enable-unicode=ucs4" option when calling configure in order to
get correct Unicode handling. Not that long ago, `pyenv` also switched
to using wide builds for `manylinux1` wheel compatibility, and conda
has similarly used wide builds from the start for ABI compatibility
with system Python runtimes.

That means the current Python 2 benchmark results may be
unrepresentative for anyone using a typical Linux build of CPython: the
pay-off in reduced memory use and reduced data copying from Python 3's
dynamic string representation is higher relative to Python 2 wide
builds than it is relative to narrow builds, and we'd expect that to
affect at least the benchmarks that manipulate text data.

Perhaps it would make sense to benchmark two different variants of the
Python 2.7 branch, one with a wide build, and one with a narrow one?

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From victor.stinner at gmail.com  Wed Apr 12 04:52:14 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 12 Apr 2017 10:52:14 +0200
Subject: [Speed] Testing a wide Unicode build of Python 2 on speed.python.org?
In-Reply-To: 
References: 
Message-ID: 

2017-04-12 6:58 GMT+02:00 Nick Coghlan :
> speed.python.org has been updated to split out per-branch results for
> easier cross version comparisons, but looking at the performance repo
> suggests that the only 2.7 results currently reported are for the
> default UCS2 builds.

You're right, it's a bug, it wasn't deliberate. I fixed "performance
compile": it now passes --enable-unicode=ucs4 to configure if the
branch starts with "2.":
https://github.com/python/performance/commit/9af0c6e029db9d2a8475f11d5cc601d2992dc62e

I'm running benchmarks with this option. Once the results are ready, I
will remove the old 2.7 results and replace them with the new ones.

Note: Python 2.7's configure should use UCS4 by default on Linux, but
that's a different topic ;-)

Victor

From victor.stinner at gmail.com  Wed Apr 12 12:15:12 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 12 Apr 2017 18:15:12 +0200
Subject: [Speed] Testing a wide Unicode build of Python 2 on speed.python.org?
In-Reply-To: 
References: 
Message-ID: 

2017-04-12 10:52 GMT+02:00 Victor Stinner :
> I'm running benchmarks with this option. Once the results are ready, I
> will remove the old 2.7 results and replace them with the new ones.

Done. speed.python.org now uses UCS-4 on Python 2.7. Is it better now?

Previous JSON file:
https://github.com/haypo/performance_results/raw/master/2017-03-31-cpython/2017-04-03_16-11-2.7-23d6eb656ec2.json.gz

New JSON file:
https://github.com/haypo/performance_results/raw/master/2017-04-12-cpython/2017-04-10_17-27-2.7-e0cba5b45a5c.json.gz

I see small performance differences, but they don't seem to be related
to the UTF-16 => UCS-4 change; they look like random noise.
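
By the way, a quick way to double check whether a given Python 2 binary
is a narrow or a wide build is to look at sys.maxunicode directly (a
simple illustrative check):

    import sys

    # Narrow (UTF-16) builds cap code points at U+FFFF,
    # wide (UCS-4) builds go up to U+10FFFF.
    if sys.maxunicode > 0xFFFF:
        print("wide build (UCS-4)")
    else:
        print("narrow build (UTF-16)")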

Comparison between the two files:

$ python3 -m perf compare_to 2017-03-31-cpython/2017-04-03_16-11-2.7-23d6eb656ec2.json.gz 2017-04-12-cpython/2017-04-10_17-27-2.7-e0cba5b45a5c.json.gz --table -G --min-speed=5
+-----------------+-----------------------------------+-----------------------------------+
| Benchmark       | 2017-04-03_16-11-2.7-23d6eb656ec2 | 2017-04-10_17-27-2.7-e0cba5b45a5c |
+=================+===================================+===================================+
| chameleon       | 23.3 ms                           | 25.0 ms: 1.07x slower (+7%)       |
+-----------------+-----------------------------------+-----------------------------------+
| scimark_fft     | 623 ms                            | 673 ms: 1.08x slower (+8%)        |
+-----------------+-----------------------------------+-----------------------------------+
| fannkuch        | 806 ms                            | 889 ms: 1.10x slower (+10%)       |
+-----------------+-----------------------------------+-----------------------------------+
| scimark_sor     | 401 ms                            | 443 ms: 1.11x slower (+11%)       |
+-----------------+-----------------------------------+-----------------------------------+
| unpack_sequence | 138 ns                            | 154 ns: 1.12x slower (+12%)       |
+-----------------+-----------------------------------+-----------------------------------+
| regex_v8        | 53.4 ms                           | 60.0 ms: 1.12x slower (+12%)      |
+-----------------+-----------------------------------+-----------------------------------+

Not significant (61): (...)

Hopefully, perf stores information on the Unicode implementation in the
metadata. You can check the metadata using:

$ python3 -m perf metadata 2017-03-31_06-53-2.7-5aa913d72317.json.gz -b 2to3|grep python_unicode
- python_unicode: UTF-16

$ python3 -m perf metadata 2017-04-10_17-27-2.7-e0cba5b45a5c.json.gz -b 2to3|grep python_unicode
- python_unicode: UCS-4

Victor

From victor.stinner at gmail.com  Thu Apr 13 13:11:42 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 13 Apr 2017 19:11:42 +0200
Subject: [Speed] pybench, call_simple and call_method microbenchmarks removed
Message-ID: 

Hi,

I removed the pybench, call_simple and call_method microbenchmarks from
the performance project and moved them to my other project:
https://github.com/haypo/pymicrobench

The pymicrobench project is a collection of CPython microbenchmarks
used to optimize CPython and test specific micro-optimizations.

IMHO the removed microbenchmarks are not representative of CPython or
PyPy performance at all, and are only useful for CPython developers.

I also wrote a much longer description for each benchmark in the
performance documentation:
http://pyperformance.readthedocs.io/benchmarks.html

I also started to document weird CPython results:
http://pyperformance.readthedocs.io/benchmarks.html#html5lib
http://pyperformance.readthedocs.io/benchmarks.html#sympy

Victor

From ncoghlan at gmail.com  Mon Apr 17 01:19:58 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 17 Apr 2017 15:19:58 +1000
Subject: [Speed] Testing a wide Unicode build of Python 2 on speed.python.org?
In-Reply-To: 
References: 
Message-ID: 

On 13 April 2017 at 02:15, Victor Stinner wrote:
> 2017-04-12 10:52 GMT+02:00 Victor Stinner :
>> I'm running benchmarks with this option. Once the results are ready, I
>> will remove the old 2.7 results and replace them with the new ones.
>
> Done. speed.python.org now uses UCS-4 on Python 2.7. Is it better now?

Thanks!

> Previous JSON file:
> https://github.com/haypo/performance_results/raw/master/2017-03-31-cpython/2017-04-03_16-11-2.7-23d6eb656ec2.json.gz
>
> New JSON file:
> https://github.com/haypo/performance_results/raw/master/2017-04-12-cpython/2017-04-10_17-27-2.7-e0cba5b45a5c.json.gz
>
> I see small performance differences, but they don't seem to be related
> to the UTF-16 => UCS-4 change; they look like random noise.

Given that lack of divergence and the known Unicode correctness
problems in narrow builds, I guess it doesn't make much sense to invest
time in benchmarking both of them.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia