From storchaka at gmail.com Sat Jul 1 04:16:32 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 1 Jul 2017 11:16:32 +0300
Subject: [Python-Dev] Summary of Python tracker Issues
In-Reply-To: <20170630160920.087DD11A882@psf.upfronthosting.co.za>
References: <20170630160920.087DD11A882@psf.upfronthosting.co.za>
Message-ID: 

30.06.17 19:09, Python tracker пише:
> ACTIVITY SUMMARY (2017-06-23 - 2017-06-30)
> Python tracker at http://bugs.python.org/
>
> To view or respond to any of the issues listed below, click on the issue.
> Do NOT respond to this message.
>
> Issues counts and deltas:
> open 6006 (-20)

Victor closed half a hundred of his issues.

From storchaka at gmail.com Sat Jul 1 04:19:28 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 1 Jul 2017 11:19:28 +0300
Subject: [Python-Dev] Summary of Python tracker Issues
In-Reply-To: <20170623160917.C695056BD7@psf.upfronthosting.co.za>
References: <20170623160917.C695056BD7@psf.upfronthosting.co.za>
Message-ID: 

23.06.17 19:09, Python tracker пише:
>
> ACTIVITY SUMMARY (2017-06-16 - 2017-06-23)
> Python tracker at http://bugs.python.org/
>
> To view or respond to any of the issues listed below, click on the issue.
> Do NOT respond to this message.
>
> Issues counts and deltas:
> open 6026 ( -8)

Terry closed a third of a hundred outdated IDLE issues.

From victor.stinner at gmail.com Sat Jul 1 11:09:52 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 1 Jul 2017 17:09:52 +0200
Subject: [Python-Dev] Summary of Python tracker Issues
In-Reply-To: 
References: <20170630160920.087DD11A882@psf.upfronthosting.co.za>
Message-ID: 

Le 1 juil. 2017 10:18 AM, "Serhiy Storchaka" a écrit :

Victor closed half a hundred of his issues.


Let me elaborate :-) I am still learning the new GitHub workflow, and it's
common that I forget to close issues after merging a change. I had many
issues that I forgot to update since we moved to GitHub.

I also had complex bugs split into small issues, and it took time to fix all
branches and to wait for buildbots to confirm that it's really fixed.

To finish, I closed a lot of issues that were older than 2 years because the
bug was fixed in the meantime in another issue, or just because I lost track
of the issue and also lost interest, and I don't consider that the bug or
feature was worth it.

Victor
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From bhavishyagopesh at gmail.com Sat Jul 1 12:02:56 2017
From: bhavishyagopesh at gmail.com (Bhavishya)
Date: Sat, 1 Jul 2017 21:32:56 +0530
Subject: [Python-Dev] Lazy_loading to pickle/unpickle_pure?
Message-ID: 

Hi,
I added lazy_loading in pickle.py. Here are some statistics if you consider
them of any importance:

1)Unpickle_pure:
Lazy -> unpickle_pure_python: Mean +- std dev: 728 us +- 24 us

Original -> unpickle_pure_python: Mean +- std dev: 771 us +- 22 us

2)Pickle_pure:

Lazy -> pickle_pure_python: Mean +- std dev: 919 us +- 18 us

Original -> pickle_pure_python: Mean +- std dev: 1.04 ms +- 0.03 ms

Thanks
bhavishya
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From victor.stinner at gmail.com Sat Jul 1 18:27:42 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sun, 2 Jul 2017 00:27:42 +0200
Subject: [Python-Dev] Lazy_loading to pickle/unpickle_pure?
In-Reply-To: 
References: 
Message-ID: 

What is lazy loading? How does it work?

Victor

Le 1 juil.
2017 6:02 PM, "Bhavishya" a écrit :

> Hi,
> I added lazy_loading in pickle.py. Here are some statistics if you consider
> them of any importance:
>
> 1)Unpickle_pure:
> Lazy -> unpickle_pure_python: Mean +- std dev: 728 us +- 24 us
>
> Original -> unpickle_pure_python: Mean +- std dev: 771 us +- 22 us
>
> 2)Pickle_pure:
>
> Lazy -> pickle_pure_python: Mean +- std dev: 919 us +- 18 us
>
> Original -> pickle_pure_python: Mean +- std dev: 1.04 ms +- 0.03 ms
>
> Thanks
> bhavishya
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From nad at python.org Mon Jul 3 00:43:06 2017
From: nad at python.org (Ned Deily)
Date: Mon, 3 Jul 2017 00:43:06 -0400
Subject: [Python-Dev] 3.6.2 update: 3.6.2rc2 coming
Message-ID: 

As you may recall, 3.6.2rc1 was released on 2017-06-17 with an expected
release date of 2017-06-30 for 3.6.2 final. As you also may recall,
3.6.2rc1 was delayed a bit to address some security issues including
updating our version of libexpat. Shortly after 3.6.2rc1 was released, the
expat project released another version of libexpat fixing another security
problem. Also since 3.6.2rc1, fixes for several other security issues in
Python 3.6 itself have become available and it would be better to get them
out sooner rather than later. Therefore, I have decided to do a second
release candidate with select fixes cherry-picked from the 3.6 branch;
continue to merge bug fixes and doc fixes into 3.6 as usual for release in
3.6.3. Since there haven't been regressions reported so far with 3.6.2rc1,
we will plan to compress the rest of the release cycle. Expect to see
3.6.2rc2 available within the next couple of days (2017-07-04 expected)
and, assuming no new issues, 3.6.2 final about a week later (around
2017-07-11).

--Ned

--
  Ned Deily
  nad at python.org -- []

From netheril96 at gmail.com Mon Jul 3 00:52:16 2017
From: netheril96 at gmail.com (Siyuan Ren)
Date: Mon, 3 Jul 2017 12:52:16 +0800
Subject: [Python-Dev] 64 bit units in PyLong
Message-ID: 

The current PyLong implementation represents arbitrary precision integers
in units of 15 or 30 bits. I presume the purpose is to avoid overflow in
addition , subtraction and multiplication. But compilers these days offer
intrinsics that allow one to access the overflow flag, and to obtain the
result of 64 bit multiplication as a 128 bit number. Or at least on x86-64,
which is the dominant platform. Any reason why it is not done? If it is
only because no one bothers, I may be able to do it.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ma3yuki.8mamo10 at gmail.com Mon Jul 3 03:32:51 2017
From: ma3yuki.8mamo10 at gmail.com (Masayuki YAMAMOTO)
Date: Mon, 3 Jul 2017 16:32:51 +0900
Subject: [Python-Dev] Remove own implementation for thread-local storage
Message-ID: 

Hi, python-dev.

I'd propose removing code which I think out-of-date.
CPython has provided the own implementation for thread-local storage (TLS)
on Python/thread.c, it's used in the case which a platform has not supplied
native TLS. However, currently all supported platforms (NT and pthreads)
have provided native TLS and defined the Py_HAVE_NATIVE_TLS macro with
unconditional in any case.
If the code is removed, the new TLS API for PEP 539 won't have to care the
reinitialization of the thread keys managed by the interpreter (i.e.
PyThread_ReInitTLS function has been working for own implementation and will
be no longer necessary for new API). Does anyone have a reason we should
keep it?
Regards, Masayuki -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Mon Jul 3 04:05:20 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 3 Jul 2017 10:05:20 +0200 Subject: [Python-Dev] 64 bit units in PyLong In-Reply-To: References: Message-ID: 2017-07-03 6:52 GMT+02:00 Siyuan Ren : > The current PyLong implementation represents arbitrary precision integers in > units of 15 or 30 bits. I presume the purpose is to avoid overflow in > addition , subtraction and multiplication. But compilers these days offer > intrinsics that allow one to access the overflow flag, and to obtain the > result of 64 bit multiplication as a 128 bit number. The question is the performance. Is it fast? :-) You can try to write a patch and run a benchmark. See for example http://pyperformance.readthedocs.io/ for benchmarks. Victor From victor.stinner at gmail.com Mon Jul 3 04:07:06 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 3 Jul 2017 10:07:06 +0200 Subject: [Python-Dev] Remove own implementation for thread-local storage In-Reply-To: References: Message-ID: I'm in favor of removing it. I know that it confused people many times, they look at this fallback and found an issue, whereas I'm not aware of any platform using this fallback anymore. Can you please write a PR just to remove this fallback? We can merge it and then check buildbots :-) So in the worst case, we can revert it. Victor 2017-07-03 9:32 GMT+02:00 Masayuki YAMAMOTO : > Hi, python-dev. > > I'd propose removing code which I think out-of-date. > CPython has provided the own implementation for thread-local storage (TLS) > on Python/thread.c, it's used in the case which a platform has not supplied > native TLS. However, currently all supported platforms (NT and pthreads) > have provided native TLS and defined the Py_HAVE_NATIVE_TLS macro with > unconditional in any case. > If the code is removed, the new TLS API for PEP 539 won't have to care the > reinitialization of the thread keys managed by the interpreter (i.e. > PyThread_ReInitTLS function has been working for own implementation and will > be no longer necessary for new API). Does anyone have a reason we should > keep it? > > Regards, > Masayuki > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com > From solipsis at pitrou.net Mon Jul 3 04:19:29 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 3 Jul 2017 10:19:29 +0200 Subject: [Python-Dev] Remove own implementation for thread-local storage References: Message-ID: <20170703101929.65aea0b0@fsol> Also note that C11, MSVC and some Unix C compilers have built-in support for thread-local variables. Example: https://github.com/numba/numba/blob/master/numba/_random.c#L114-L119 Regards Antoine. On Mon, 3 Jul 2017 10:07:06 +0200 Victor Stinner wrote: > I'm in favor of removing it. I know that it confused people many > times, they look at this fallback and found an issue, whereas I'm not > aware of any platform using this fallback anymore. > > Can you please write a PR just to remove this fallback? We can merge > it and then check buildbots :-) So in the worst case, we can revert > it. > > Victor > > 2017-07-03 9:32 GMT+02:00 Masayuki YAMAMOTO : > > Hi, python-dev. > > > > I'd propose removing code which I think out-of-date. 
> > CPython has provided the own implementation for thread-local storage (TLS) > > on Python/thread.c, it's used in the case which a platform has not supplied > > native TLS. However, currently all supported platforms (NT and pthreads) > > have provided native TLS and defined the Py_HAVE_NATIVE_TLS macro with > > unconditional in any case. > > If the code is removed, the new TLS API for PEP 539 won't have to care the > > reinitialization of the thread keys managed by the interpreter (i.e. > > PyThread_ReInitTLS function has been working for own implementation and will > > be no longer necessary for new API). Does anyone have a reason we should > > keep it? > > > > Regards, > > Masayuki > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > > https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com > > From victor.stinner at gmail.com Mon Jul 3 06:02:13 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 3 Jul 2017 12:02:13 +0200 Subject: [Python-Dev] Remove own implementation for thread-local storage In-Reply-To: <20170703101929.65aea0b0@fsol> References: <20170703101929.65aea0b0@fsol> Message-ID: Sadly, we only require C99 yet :-/ Victor 2017-07-03 10:19 GMT+02:00 Antoine Pitrou : > > Also note that C11, MSVC and some Unix C compilers have built-in support > for thread-local variables. Example: > https://github.com/numba/numba/blob/master/numba/_random.c#L114-L119 > > Regards > > Antoine. > > > On Mon, 3 Jul 2017 10:07:06 +0200 > Victor Stinner wrote: > >> I'm in favor of removing it. I know that it confused people many >> times, they look at this fallback and found an issue, whereas I'm not >> aware of any platform using this fallback anymore. >> >> Can you please write a PR just to remove this fallback? We can merge >> it and then check buildbots :-) So in the worst case, we can revert >> it. >> >> Victor >> >> 2017-07-03 9:32 GMT+02:00 Masayuki YAMAMOTO : >> > Hi, python-dev. >> > >> > I'd propose removing code which I think out-of-date. >> > CPython has provided the own implementation for thread-local storage (TLS) >> > on Python/thread.c, it's used in the case which a platform has not supplied >> > native TLS. However, currently all supported platforms (NT and pthreads) >> > have provided native TLS and defined the Py_HAVE_NATIVE_TLS macro with >> > unconditional in any case. >> > If the code is removed, the new TLS API for PEP 539 won't have to care the >> > reinitialization of the thread keys managed by the interpreter (i.e. >> > PyThread_ReInitTLS function has been working for own implementation and will >> > be no longer necessary for new API). Does anyone have a reason we should >> > keep it? 
>> > >> > Regards, >> > Masayuki >> > >> > _______________________________________________ >> > Python-Dev mailing list >> > Python-Dev at python.org >> > https://mail.python.org/mailman/listinfo/python-dev >> > Unsubscribe: >> > https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com >> > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com From victor.stinner at gmail.com Mon Jul 3 07:03:25 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 3 Jul 2017 13:03:25 +0200 Subject: [Python-Dev] Need help to review a test_nntplib enhancement: Message-ID: Hi, Sometimes, for an unknown reason, test_nntplib fails randomly: http://bugs.python.org/issue19613 Martin Panter wrote a patch, but since I don't know how to reproduce the bug, I'm unable to test it. Moreover, I don't know nntplib nor test_nntplib, so I don't feel able to review it. Sadly, Martin doesn't feel confident enough in his patch neither, since again, he is unable to test it. As a result, the issue is stuck and the bug continues to occur sometimes on buildbots. Victor From ncoghlan at gmail.com Mon Jul 3 10:34:32 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 4 Jul 2017 00:34:32 +1000 Subject: [Python-Dev] Remove own implementation for thread-local storage In-Reply-To: References: <20170703101929.65aea0b0@fsol> Message-ID: On 3 July 2017 at 20:02, Victor Stinner wrote: > Sadly, we only require C99 yet :-/ Handling fallbacks when shiny new features are unavailable is what autoconf is for, though :) The fact thread specific storage support made it directly into C11 makes me more confident in dropping our emulation, so +1 for investigating that *before* we finalize PEP 539. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Mon Jul 3 10:38:19 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 3 Jul 2017 16:38:19 +0200 Subject: [Python-Dev] Remove own implementation for thread-local storage In-Reply-To: References: Message-ID: >> I'd propose removing code which I think out-of-date. Already done! https://github.com/python/cpython/commit/aa0aa0492c5fffe750a26d2ab13737a1a6d7d63c (and no buildbot complained). Victor From victor.stinner at gmail.com Tue Jul 4 05:55:01 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 4 Jul 2017 11:55:01 +0200 Subject: [Python-Dev] New work-in-progress bisection tool for the Python test suite (in regrtest) In-Reply-To: References: Message-ID: On Python 2, the addition of Lib/test/bisect.py caused conflict with Lib/bisect.py when running the Python test suite :-( I chose to rename Lib/test/bisect.py to Lib/test/bisectcmd.py to reduce changes caused by this new debug tool. So only in Python 2.7, you have to run: ./python -m test.bisectcmd ... instead of ./python -m tes.bisect ... See http://bugs.python.org/issue30843 for more information. Victor 2017-06-16 18:05 GMT+02:00 Victor Stinner : > Hi, > > Last weeks, I worked on a new tool to bisect failing tests because > it's painful to bisect manually reference leaks (I remove as much code > as possible until the code is small enough to be reviewable manually). 
> > See the bisect_test.py script attached to this issue: > http://bugs.python.org/issue29512 > > With the help of Louie Lu, I added new --list-cases option to "python > -m test", so you can now list all test cases and write it into a text > file: > > ./python -m test --list-cases test_os > tests > > I also added a new --matchfile option, to filter tests using a text > file which contains one pattern per line: > > ./python -m test --matchfile=tests test_os > > fnmatch is used to match test names, so "*" joker character can be > used in test names. > > > My bisection tool takes a text file with the --matchfile format (one > pattern per line) and creates a random subset of tests with half of > the tests. If tests still fail, use the subset. Otherwise, create a > new random subset. Loop until the subset contains a single test > (configurable threshold, -n command line option). > > The long term plan is to integrate the bisection feature directly into regrtest. > > > > Right now, my script is hardcoded to bisect reference leak bugs, but > it should be easy to modify it to bisect other test issues like test > creating files without removing it ("ENV_CHANGED" failure in > regrtest). > > For example, a core file is dumped when running test_subprocess on > FreeBSD buildbots: > > http://bugs.python.org/issue30448 > > But I'm unable to reproduce the issue on my FreeBSD. It would be nice > to be able to automate the bisection on the buildbot directly. > > > --list-cases and --matchfile options are now available in 2.7, 3.5, > 3.6 and master (3.7) branches. > > TODO: doctest tests are only partially supported, see: > > http://bugs.python.org/issue30683 > > Victor From ncoghlan at gmail.com Tue Jul 4 06:52:34 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 4 Jul 2017 20:52:34 +1000 Subject: [Python-Dev] New work-in-progress bisection tool for the Python test suite (in regrtest) In-Reply-To: References: Message-ID: On 4 July 2017 at 19:55, Victor Stinner wrote: > On Python 2, the addition of Lib/test/bisect.py caused conflict with > Lib/bisect.py when running the Python test suite :-( I chose to rename > Lib/test/bisect.py to Lib/test/bisectcmd.py to reduce changes caused > by this new debug tool. So only in Python 2.7, you have to run: > > ./python -m test.bisectcmd ... I know it's longer, but perhaps it would make sense to put the bisection helper under "python -m test.support.bisect" in both Python 2 & 3? Even in Python 3 "test.bisect" looks a bit like the test suite for the bisect module to me - you have to "just know" that the latter actually lives at "test.test_bisect". Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Tue Jul 4 07:03:26 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 4 Jul 2017 13:03:26 +0200 Subject: [Python-Dev] New work-in-progress bisection tool for the Python test suite (in regrtest) In-Reply-To: References: Message-ID: 2017-07-04 12:52 GMT+02:00 Nick Coghlan : > I know it's longer, but perhaps it would make sense to put the > bisection helper under "python -m test.support.bisect" in both Python > 2 & 3? For me, test.support is a toolkit to *write* tests, not really to run tests. I don't really care where my bisect tool lives. Serhiy proposed test.bisect, I like because it's short and easy to remind. 
Technically it is possible to get test.bisect on Python 2, it just requires to modify 4 .py files which import Lib/bisect.py to add "from __future__ import absolute_import": Lib/urllib2.py:import bisect Lib/mhlib.py:from bisect import bisect Lib/test/test_bisect.py:import bisect as py_bisect Lib/multiprocessing/heap.py:import bisect I modified Lib/test/test_bisect.py, but I missed these other ones in my first commit. And then I got a failure in multiprocessing. I chose the conservative approach: rename the new Lib/test/bisect.py file. Do you prefer to get test.bisect, and so modify the 4 files to add "from __future__ import absolute_import"? I didn't recall the subtle details of "relative import" in Python 2. Since I'm now used to Python 3, the Python 2 behaviour now really looks weird to me :-) Victor From lele at metapensiero.it Tue Jul 4 07:20:36 2017 From: lele at metapensiero.it (Lele Gaifax) Date: Tue, 04 Jul 2017 13:20:36 +0200 Subject: [Python-Dev] New work-in-progress bisection tool for the Python test suite (in regrtest) References: Message-ID: <87a84k5tyz.fsf@metapensiero.it> Nick Coghlan writes: > On 4 July 2017 at 19:55, Victor Stinner wrote: >> On Python 2, the addition of Lib/test/bisect.py caused conflict with >> Lib/bisect.py when running the Python test suite :-( I chose to rename >> Lib/test/bisect.py to Lib/test/bisectcmd.py to reduce changes caused >> by this new debug tool. So only in Python 2.7, you have to run: >> >> ./python -m test.bisectcmd ... > > I know it's longer, but perhaps it would make sense to put the > bisection helper under "python -m test.support.bisect" in both Python > 2 & 3? Or test.tool.bisect, similar to json.tool. ciao, lele. -- nickname: Lele Gaifax | Quando vivr? di quello che ho pensato ieri real: Emanuele Gaifas | comincer? ad aver paura di chi mi copia. lele at metapensiero.it | -- Fortunato Depero, 1929. From ncoghlan at gmail.com Tue Jul 4 07:22:08 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 4 Jul 2017 21:22:08 +1000 Subject: [Python-Dev] New work-in-progress bisection tool for the Python test suite (in regrtest) In-Reply-To: References: Message-ID: On 4 July 2017 at 21:03, Victor Stinner wrote: > 2017-07-04 12:52 GMT+02:00 Nick Coghlan : >> I know it's longer, but perhaps it would make sense to put the >> bisection helper under "python -m test.support.bisect" in both Python >> 2 & 3? > > For me, test.support is a toolkit to *write* tests, not really to run tests. > > I don't really care where my bisect tool lives. Serhiy proposed > test.bisect, I like because it's short and easy to remind. > > Technically it is possible to get test.bisect on Python 2, it just > requires to modify 4 .py files which import Lib/bisect.py to add "from > __future__ import absolute_import": > > Lib/urllib2.py:import bisect > Lib/mhlib.py:from bisect import bisect > Lib/test/test_bisect.py:import bisect as py_bisect > Lib/multiprocessing/heap.py:import bisect That doesn't sound right, as implicit relative imports in Python 2 are relative to the *package* (they're akin to writing "from . import name"). That means if test.bisect is shadowing the top level bisect module when backported, it suggests that the test.regrtest directory is ending up on sys.path for the affected test run (e.g. because the tests were run as "python Lib/test/regrtest.py" rather than via the -m switch). 
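To make the implicit-relative-import point concrete, here is a small
hypothetical sketch (the file name and layout are invented for illustration,
not code from the actual test suite): imagine a module living inside the
"test" package while a sibling Lib/test/bisect.py exists.

```
# Hypothetical Lib/test/example.py -- a module inside the "test" package,
# sitting next to a local test/bisect.py (names are illustrative only).
from __future__ import absolute_import  # PEP 328: bare imports become absolute

import bisect                        # with the future import: always Lib/bisect.py
from . import bisect as test_bisect  # explicit relative import of the sibling

# Without the future import, Python 2 would resolve the bare "import bisect"
# above as an implicit relative import and silently pick up test/bisect.py.
print(bisect.bisect_left([1, 2, 4], 3))  # stdlib bisect -> prints 2
```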
Checking test.regrtest in 2.7, it looks to me like it's missing the stanza added to Py3 that makes sure "Lib/test/" isn't present on sys.path - if you add a similar operation to Py2.7, I'd expect the test.bisect name for the command to work there as well. Cheers, Nick. P.S. As far the multiprocessing failure you saw goes, my guess would be that the 2.7 version actually is relying on implicit relative imports to find peer modules, and would need some "from . import ..." adjustments to handle "from __future__ import absolute_import" at the top of the file. However, as noted above, I don't think that's actually what's happening - I think the "Lib/test/" directory is still sometimes ending up on sys.path while running the tests in 2.7. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Tue Jul 4 08:10:14 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 4 Jul 2017 14:10:14 +0200 Subject: [Python-Dev] New work-in-progress bisection tool for the Python test suite (in regrtest) In-Reply-To: References: Message-ID: 2017-07-04 13:22 GMT+02:00 Nick Coghlan : > That means if test.bisect is shadowing the top level bisect module > when backported, it suggests that the test.regrtest directory is > ending up on sys.path for the affected test run (e.g. because the > tests were run as "python Lib/test/regrtest.py" rather than via the -m > switch). I don't think that Lib/test/ is in sys.path. It's more subtle than test. When you run "./python -m test test_bisect", Lib/test/regrtest.py imports "test.test_bisect", and so test_bisect is imported with __package__=['test']. With test_bisect.__package__=['test'], "import bisect" in Lib/test/test_bisect.py imports Lib/test/bisect.py. The question is more when Lib/multiprocessing/heap.py got Lib/test/bisect.py instead of Lib/bisect.py. I didn't dig into this issue. The Python 2 import machinery blows my mind :-) Victor From victor.stinner at gmail.com Tue Jul 4 09:15:53 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 4 Jul 2017 15:15:53 +0200 Subject: [Python-Dev] Star the CPython GitHub project if you like Python! In-Reply-To: References: Message-ID: 4 days later, we got +2,389 new stars, thank you! (8,539 => 10,928) Python moved from the 11th place to the 9th, before Elixir and Julia. Python is still behind Ruby (12,511) and PHP (12,318), but it's already much better than before! Victor 2017-06-30 15:59 GMT+02:00 Victor Stinner : > Hi, > > GitHub has a showcase page of hosted programming languages: > > https://github.com/showcases/programming-languages > > Python is only #11 with 8,539 stars, behind PHP and Ruby! > > Hey, you should "like" ("star"?) the CPython project if you like Python! > > https://github.com/python/cpython/ > Click on "Star" at the top right. > > Thank you! > Victor From victor.stinner at gmail.com Tue Jul 4 09:23:56 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 4 Jul 2017 15:23:56 +0200 Subject: [Python-Dev] Buildbot report (almost July) In-Reply-To: References: Message-ID: 2017-06-29 17:09 GMT+02:00 Victor Stinner : > Correct me if I'm wrong, but, for the first time, *all reference > leaks* have been fixed on *all branches* (2.7, 3.5, 3.6 and master), > on *Linux and Windows*! Before, we mostly focused on the master branch > (called "default" in Mercurial) on Linux. > > I also started to fix a few "memory block" leaks, most (or all?) of > them should also be fixed (on all branches, on Linux and Windows). 
The "AMD64 Windows8.1 Refleaks" now pass on 2.7, 3.5, 3.6 and master branches. I finished to backport my changes to fix false alarms in hunting reference and memory leaks. The next introduced reference leak should now send an email to the buildbot-status mailing list ;-) Victor From benhoyt at gmail.com Tue Jul 4 10:22:48 2017 From: benhoyt at gmail.com (Ben Hoyt) Date: Tue, 4 Jul 2017 10:22:48 -0400 Subject: [Python-Dev] Star the CPython GitHub project if you like Python! In-Reply-To: References: Message-ID: Nice! I also posted it on reddit.com/r/Python, where it got a bit of traction: https://www.reddit.com/r/Python/comments/6kg4w0/cpython_recently_moved_to_github_star_the_project/ -Ben On Tue, Jul 4, 2017 at 9:15 AM, Victor Stinner wrote: > 4 days later, we got +2,389 new stars, thank you! (8,539 => 10,928) > > Python moved from the 11th place to the 9th, before Elixir and Julia. > > Python is still behind Ruby (12,511) and PHP (12,318), but it's > already much better than before! > > Victor > > 2017-06-30 15:59 GMT+02:00 Victor Stinner : > > Hi, > > > > GitHub has a showcase page of hosted programming languages: > > > > https://github.com/showcases/programming-languages > > > > Python is only #11 with 8,539 stars, behind PHP and Ruby! > > > > Hey, you should "like" ("star"?) the CPython project if you like Python! > > > > https://github.com/python/cpython/ > > Click on "Star" at the top right. > > > > Thank you! > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > benhoyt%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Jul 4 10:27:34 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 5 Jul 2017 00:27:34 +1000 Subject: [Python-Dev] New work-in-progress bisection tool for the Python test suite (in regrtest) In-Reply-To: References: Message-ID: On 4 July 2017 at 22:10, Victor Stinner wrote: > 2017-07-04 13:22 GMT+02:00 Nick Coghlan : >> That means if test.bisect is shadowing the top level bisect module >> when backported, it suggests that the test.regrtest directory is >> ending up on sys.path for the affected test run (e.g. because the >> tests were run as "python Lib/test/regrtest.py" rather than via the -m >> switch). > > I don't think that Lib/test/ is in sys.path. It's more subtle than > test. When you run "./python -m test test_bisect", > Lib/test/regrtest.py imports "test.test_bisect", and so test_bisect is > imported with __package__=['test']. > > With test_bisect.__package__=['test'], "import bisect" in > Lib/test/test_bisect.py imports Lib/test/bisect.py. Right, for test_bisect specifically, the implicit relative import problem applies, and "from __future__ import absolute_import" is the relevant fix. That concern just doesn't apply to the *stdlib* modules doing a normal top-level "import bisect". > The question is more when Lib/multiprocessing/heap.py got > Lib/test/bisect.py instead of Lib/bisect.py. I didn't dig into this > issue. 
The Python 2 import machinery blows my mind :-) *This* is the case that I think is due to "Lib/test" being on sys.path when the tests are run: ``` $ ./python -i Lib/test/regrtest.py --help [snip output] >>> import sys >>> sys.path[0] '/home/ncoghlan/devel/py27/Lib/test' ``` Using test_urllib2 as the example: ``` $ touch Lib/test/bisect.py $ ./python -m test.regrtest test_urllib2 Run tests sequentially 0:00:00 [1/1] test_urllib2 1 test OK. Total duration: 800 ms Tests result: SUCCESS $ ./python Lib/test/regrtest.py test_urllib2 Run tests sequentially 0:00:00 [1/1] test_urllib2 Warning -- os.environ was modified by test_urllib2 [snip output] test test_urllib2 failed -- multiple errors occurred; run in verbose mode for details 1 test failed: test_urllib2 Total duration: 107 ms Tests result: FAILURE ``` Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Tue Jul 4 10:46:00 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 4 Jul 2017 16:46:00 +0200 Subject: [Python-Dev] New work-in-progress bisection tool for the Python test suite (in regrtest) In-Reply-To: References: Message-ID: 2017-07-04 16:27 GMT+02:00 Nick Coghlan : > That concern just doesn't apply to the *stdlib* modules doing a normal > top-level "import bisect". Hum, ok. I created a PR which removes '' and Lib/test/ from sys.path, and rename again test.bisectcmd to test.bisect. Would you mind to review it? https://github.com/python/cpython/pull/2567 Victor From chris.jerdonek at gmail.com Tue Jul 4 15:11:42 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Tue, 4 Jul 2017 12:11:42 -0700 Subject: [Python-Dev] Star the CPython GitHub project if you like Python! In-Reply-To: References: Message-ID: Great work, Victor! It seems like this would be an easy thing to mention at the beginning of conference talks and meetup presentations, and also something to ask coworkers if you work at a company that uses Python (e.g. on workplace Slack channels, etc). --Chris On Tue, Jul 4, 2017 at 6:15 AM, Victor Stinner wrote: > 4 days later, we got +2,389 new stars, thank you! (8,539 => 10,928) > > Python moved from the 11th place to the 9th, before Elixir and Julia. > > Python is still behind Ruby (12,511) and PHP (12,318), but it's > already much better than before! > > Victor > > 2017-06-30 15:59 GMT+02:00 Victor Stinner : >> Hi, >> >> GitHub has a showcase page of hosted programming languages: >> >> https://github.com/showcases/programming-languages >> >> Python is only #11 with 8,539 stars, behind PHP and Ruby! >> >> Hey, you should "like" ("star"?) the CPython project if you like Python! >> >> https://github.com/python/cpython/ >> Click on "Star" at the top right. >> >> Thank you! >> Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/chris.jerdonek%40gmail.com From tjreedy at udel.edu Tue Jul 4 16:00:09 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 4 Jul 2017 16:00:09 -0400 Subject: [Python-Dev] Star the CPython GitHub project if you like Python! In-Reply-To: References: Message-ID: On 7/4/2017 10:22 AM, Ben Hoyt wrote: > Nice! > > I also posted it on reddit.com/r/Python , > where it got a bit of traction: > https://www.reddit.com/r/Python/comments/6kg4w0/cpython_recently_moved_to_github_star_the_project/ I just posted on python-list. 
-- Terry Jan Reedy From dickinsm at gmail.com Wed Jul 5 15:05:06 2017 From: dickinsm at gmail.com (Mark Dickinson) Date: Wed, 5 Jul 2017 20:05:06 +0100 Subject: [Python-Dev] 64 bit units in PyLong In-Reply-To: References: Message-ID: On Mon, Jul 3, 2017 at 5:52 AM, Siyuan Ren wrote: > The current PyLong implementation represents arbitrary precision integers in > units of 15 or 30 bits. I presume the purpose is to avoid overflow in > addition , subtraction and multiplication. But compilers these days offer > intrinsics that allow one to access the overflow flag, and to obtain the > result of 64 bit multiplication as a 128 bit number. Or at least on x86-64, > which is the dominant platform. Any reason why it is not done? Portability matters, so any use of these intrinsics would likely also have to be accompanied by fallback code that doesn't depend on them, as well as some buildsystem complexity to figure out whether those intrinsics are supported or not. And then the Objects/longobject.c would suffer in terms of simplicity and readability, so there would have to be some clear gains to offset that. Note that the typical Python workload does not involve thousand-digit integers: what would matter would be performance of smaller integers, and it seems conceivable that 64-bit limbs would speed up those operations simply because so many more integers would become single-limb and so there would be more opportunities to take fast paths, but there would need to be benchmarks demonstrating that. Oh, and you'd have to rewrite the power algorithm, which currently depends on the size of a limb in bytes being a multiple of 5. :-) -- Mark From breamoreboy at yahoo.co.uk Wed Jul 5 15:33:23 2017 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Wed, 5 Jul 2017 20:33:23 +0100 Subject: [Python-Dev] 64 bit units in PyLong In-Reply-To: References: Message-ID: On 05/07/2017 20:05, Mark Dickinson wrote: > Oh, and you'd have to rewrite the power algorithm, which currently > depends on the size of a limb in bytes being a multiple of 5. :-) > What is a limb, as my search foo has let me down? -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence --- This email has been checked for viruses by AVG. http://www.avg.com From rosuav at gmail.com Wed Jul 5 15:39:08 2017 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 6 Jul 2017 05:39:08 +1000 Subject: [Python-Dev] 64 bit units in PyLong In-Reply-To: References: Message-ID: On Thu, Jul 6, 2017 at 5:33 AM, Mark Lawrence via Python-Dev wrote: > On 05/07/2017 20:05, Mark Dickinson wrote: > >> Oh, and you'd have to rewrite the power algorithm, which currently >> depends on the size of a limb in bytes being a multiple of 5. :-) >> > > What is a limb, as my search foo has let me down? A thing that has a bunch of digits, but fits inside a machine word. https://gmplib.org/manual/Nomenclature-and-Types.html ChrisA From greg at krypto.org Wed Jul 5 16:18:05 2017 From: greg at krypto.org (Gregory P. Smith) Date: Wed, 05 Jul 2017 20:18:05 +0000 Subject: [Python-Dev] 64 bit units in PyLong In-Reply-To: References: Message-ID: On Wed, Jul 5, 2017 at 12:05 PM Mark Dickinson wrote: > On Mon, Jul 3, 2017 at 5:52 AM, Siyuan Ren wrote: > > The current PyLong implementation represents arbitrary precision > integers in > > units of 15 or 30 bits. I presume the purpose is to avoid overflow in > > addition , subtraction and multiplication. 
But compilers these days offer > > intrinsics that allow one to access the overflow flag, and to obtain the > > result of 64 bit multiplication as a 128 bit number. Or at least on > x86-64, > > which is the dominant platform. Any reason why it is not done? > > Portability matters, so any use of these intrinsics would likely also > have to be accompanied by fallback code that doesn't depend on them, > as well as some buildsystem complexity to figure out whether those > intrinsics are supported or not. And then the Objects/longobject.c > would suffer in terms of simplicity and readability, so there would > have to be some clear gains to offset that. Note that the typical > Python workload does not involve thousand-digit integers: what would > matter would be performance of smaller integers, and it seems > conceivable that 64-bit limbs would speed up those operations simply > because so many more integers would become single-limb and so there > would be more opportunities to take fast paths, but there would need > to be benchmarks demonstrating that. > > Oh, and you'd have to rewrite the power algorithm, which currently > depends on the size of a limb in bytes being a multiple of 5. :-) > > -- > Mark > When I pushed to get us to adopt 30-bit digits instead of 15-bit digits, I hoped it could also happen on 32-bit x86 builds as the hardware has fast 32bit multiply -> 64 result support. But it made the configuration more difficult and IIRC would've increase the memory use of the PyLong type for single digit numbers on 32-bit platforms so we settled on moving to 30 bits only for 64-bit platforms (which were obviously going to become the norm). A reasonable digit size for hardware that supports 128bit results of 64bit multiplies would be 60 bits (to keep our multiple of 5 bits logic working). But I doubt you will see any notable performance gains in practical applications by doing so. Sure, numbers in the 1-4 billion range are slightly less efficient today, but those are not likely to appear in hot loops in a typical application. Microbenchmarks alone should not be used to make this decision. remember: We had native integer support in Python 1 & 2 via the old PyInt type. In Python 3 we ditched that in favor of PyLong everywhere. This was a performance hit for the sake of proper high level language simplicity. We've already regained all of that since then in other areas of the interpreter. -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From status at bugs.python.org Fri Jul 7 12:09:18 2017 From: status at bugs.python.org (Python tracker) Date: Fri, 7 Jul 2017 18:09:18 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20170707160918.4A8D056BE1@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2017-06-30 - 2017-07-07) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 6021 (+15) closed 36604 (+39) total 42625 (+54) Open issues with patches: 2344 Issues opened (48) ================== #23644: g++ module compile fails with ???_Atomic??? 
does not name a ty http://bugs.python.org/issue23644 reopened by haypo #27945: Various segfaults with dict http://bugs.python.org/issue27945 reopened by ned.deily #30819: Linking with 'ld -b' fails with 64-bit using Itanium HP compil http://bugs.python.org/issue30819 opened by Robert Boehne #30820: email.contentmanager.raw_data_manager fails to create multipar http://bugs.python.org/issue30820 opened by elenril #30821: unittest.mock.Mocks with specs aren't aware of default argumen http://bugs.python.org/issue30821 opened by Max Rothman #30822: Python implementation of datetime module is not being tested c http://bugs.python.org/issue30822 opened by musically_ut #30823: os.startfile("") craches Python 2.7, 3.4 in Windows 7 32 bit i http://bugs.python.org/issue30823 opened by mikeee #30824: Add mimetype for extension .json http://bugs.python.org/issue30824 opened by quentel #30825: csv.Sniffer does not detect lineterminator http://bugs.python.org/issue30825 opened by vmax #30826: More details in reference 'Looping through a list in Python an http://bugs.python.org/issue30826 opened by the_darklord #30828: Out of bounds write in _asyncio_Future_remove_done_callback http://bugs.python.org/issue30828 opened by Ned Williamson #30830: HTTPHandlerTest of test_logging leaks a "dangling" thread on A http://bugs.python.org/issue30830 opened by haypo #30831: Inconsistent or wrong documentation around Asynchronous Contex http://bugs.python.org/issue30831 opened by dmiyakawa #30833: UnloadUserprofile displays the error "The handle is invalid" http://bugs.python.org/issue30833 opened by d doe #30834: Warning -- files was modified by test_import, After: ['@test http://bugs.python.org/issue30834 opened by haypo #30835: AttributeError when parsing multipart email with invalid non-d http://bugs.python.org/issue30835 opened by Andrew Donnellan #30836: test_c_locale_coercion fails on AIX http://bugs.python.org/issue30836 opened by haypo #30837: Mac OS High Sierra Beta - Python Crash http://bugs.python.org/issue30837 opened by ayesjm #30839: Larger and/or configurable _MAX_LENGTH for unittest messages http://bugs.python.org/issue30839 opened by maarten-treewalker #30840: Contrary to documentation, relative imports cannot pass throug http://bugs.python.org/issue30840 opened by Malcolm Smith #30841: A shadowing variable naming emitted for Python-ast.c http://bugs.python.org/issue30841 opened by OswinC #30842: pyenv activate for bash and tcsh http://bugs.python.org/issue30842 opened by PyAcrisel #30844: selectors: Add urgent data to read event http://bugs.python.org/issue30844 opened by pklanke #30845: [3.5] test_first_completed_some_already_completed() of test_co http://bugs.python.org/issue30845 opened by haypo #30846: [3.6] test_rapid_restart() of test_multiprocessing_fork fails http://bugs.python.org/issue30846 opened by haypo #30847: asyncio: selector_events: add_urgent() for urgent data to read http://bugs.python.org/issue30847 opened by pklanke #30848: test_multiprocessing_forkserver hangs on AMD64 FreeBSD CURRENT http://bugs.python.org/issue30848 opened by haypo #30849: test_stress_delivery_dependent() of test_signal randomly fails http://bugs.python.org/issue30849 opened by haypo #30850: [2.7] bsddb3: test01_basic_replication() of test_bsddb3 fails http://bugs.python.org/issue30850 opened by haypo #30851: IDLE: configdialog -- fix tkinter Variables http://bugs.python.org/issue30851 opened by terry.reedy #30852: _PyObject_GC_UNTRACK corruption when call a lambda function wi 
http://bugs.python.org/issue30852 opened by ????????? #30853: IDLE: configdialog -- factor out Variable subclass http://bugs.python.org/issue30853 opened by terry.reedy #30856: unittest.TestResult.addSubTest should be called immediately af http://bugs.python.org/issue30856 opened by sir-sigurd #30857: test_bsddb3 hangs longer than 37 minutes on x86 Tiger 2.7 http://bugs.python.org/issue30857 opened by haypo #30858: Keyword can't be an expression? http://bugs.python.org/issue30858 opened by veky #30859: Can't install Python for Windows 3.6.1 on multiple profiles http://bugs.python.org/issue30859 opened by Joe Jacobs #30860: Consolidate stateful C globals under a single struct. http://bugs.python.org/issue30860 opened by eric.snow #30861: StreamReader does not return reamaing and ready data buffer be http://bugs.python.org/issue30861 opened by pfreixes #30863: Rewrite PyUnicode_AsWideChar() and PyUnicode_AsWideCharString( http://bugs.python.org/issue30863 opened by serhiy.storchaka #30864: Compile failure for linux socket CAN support http://bugs.python.org/issue30864 opened by Riccardo Magliocchetti #30865: python cannot import module located on a "VOLUME" directory http://bugs.python.org/issue30865 opened by apre #30866: Add _testcapi.stack_pointer() to measure the C stack consumpti http://bugs.python.org/issue30866 opened by haypo #30867: Add necessary macro that insure `HAVE_OPENSSL_VERIFY_PARAM` to http://bugs.python.org/issue30867 opened by signal1587 #30868: IDLE: Improve configuration tests with mock Save. http://bugs.python.org/issue30868 opened by terry.reedy #30869: regrtest: Add .idlerc to saved_test_environment http://bugs.python.org/issue30869 opened by louielu #30870: IDLE: configdialog/fonts: change font when select by key up/do http://bugs.python.org/issue30870 opened by louielu #30871: Add a "python info" command somewhere to dump versions of all http://bugs.python.org/issue30871 opened by haypo #30872: Update curses docs to Python 3 http://bugs.python.org/issue30872 opened by serhiy.storchaka Most recent 15 issues with no replies (15) ========================================== #30872: Update curses docs to Python 3 http://bugs.python.org/issue30872 #30870: IDLE: configdialog/fonts: change font when select by key up/do http://bugs.python.org/issue30870 #30868: IDLE: Improve configuration tests with mock Save. 
http://bugs.python.org/issue30868 #30867: Add necessary macro that insure `HAVE_OPENSSL_VERIFY_PARAM` to http://bugs.python.org/issue30867 #30864: Compile failure for linux socket CAN support http://bugs.python.org/issue30864 #30863: Rewrite PyUnicode_AsWideChar() and PyUnicode_AsWideCharString( http://bugs.python.org/issue30863 #30856: unittest.TestResult.addSubTest should be called immediately af http://bugs.python.org/issue30856 #30853: IDLE: configdialog -- factor out Variable subclass http://bugs.python.org/issue30853 #30852: _PyObject_GC_UNTRACK corruption when call a lambda function wi http://bugs.python.org/issue30852 #30851: IDLE: configdialog -- fix tkinter Variables http://bugs.python.org/issue30851 #30846: [3.6] test_rapid_restart() of test_multiprocessing_fork fails http://bugs.python.org/issue30846 #30842: pyenv activate for bash and tcsh http://bugs.python.org/issue30842 #30841: A shadowing variable naming emitted for Python-ast.c http://bugs.python.org/issue30841 #30833: UnloadUserprofile displays the error "The handle is invalid" http://bugs.python.org/issue30833 #30831: Inconsistent or wrong documentation around Asynchronous Contex http://bugs.python.org/issue30831 Most recent 15 issues waiting for review (15) ============================================= #30872: Update curses docs to Python 3 http://bugs.python.org/issue30872 #30867: Add necessary macro that insure `HAVE_OPENSSL_VERIFY_PARAM` to http://bugs.python.org/issue30867 #30863: Rewrite PyUnicode_AsWideChar() and PyUnicode_AsWideCharString( http://bugs.python.org/issue30863 #30860: Consolidate stateful C globals under a single struct. http://bugs.python.org/issue30860 #30828: Out of bounds write in _asyncio_Future_remove_done_callback http://bugs.python.org/issue30828 #30817: Abort in PyErr_PrintEx() when no memory http://bugs.python.org/issue30817 #30814: Import dotted name as alias breaks with concurrency http://bugs.python.org/issue30814 #30808: Use _Py_atomic API for concurrency-sensitive signal state http://bugs.python.org/issue30808 #30747: _Py_atomic_* not actually atomic on Windows with MSVC http://bugs.python.org/issue30747 #30714: test_ssl fails with openssl 1.1.0f http://bugs.python.org/issue30714 #30711: getaddrinfo invalid port number http://bugs.python.org/issue30711 #30710: getaddrinfo raises OverflowError http://bugs.python.org/issue30710 #30696: infinite loop in PyRun_InteractiveLoopFlags() http://bugs.python.org/issue30696 #30695: add a nomemory_allocator to the _testcapi module http://bugs.python.org/issue30695 #30693: tarfile add uses random order http://bugs.python.org/issue30693 Top 10 most discussed issues (10) ================================= #30822: Python implementation of datetime module is not being tested c http://bugs.python.org/issue30822 29 msgs #30302: Improve .__repr__ implementation for datetime.timedelta http://bugs.python.org/issue30302 14 msgs #30861: StreamReader does not return reamaing and ready data buffer be http://bugs.python.org/issue30861 12 msgs #29796: [2.7] test_weakref hangs on Python 2.7 on Windows http://bugs.python.org/issue29796 9 msgs #23644: g++ module compile fails with ???_Atomic??? 
does not name a ty http://bugs.python.org/issue23644 8 msgs #29854: Segfault when readline history is more then 2 * history size http://bugs.python.org/issue29854 8 msgs #30779: IDLE: configdialog -- factor out Changes class http://bugs.python.org/issue30779 8 msgs #30844: selectors: Add urgent data to read event http://bugs.python.org/issue30844 8 msgs #30847: asyncio: selector_events: add_urgent() for urgent data to read http://bugs.python.org/issue30847 8 msgs #30814: Import dotted name as alias breaks with concurrency http://bugs.python.org/issue30814 7 msgs Issues closed (40) ================== #6691: Support for nested classes and function for pyclbr http://bugs.python.org/issue6691 closed by terry.reedy #19325: _osx_support imports many modules http://bugs.python.org/issue19325 closed by haypo #20042: Python Launcher, Windows, fails on scripts w/ non-latin names http://bugs.python.org/issue20042 closed by terry.reedy #20669: OpenBSD: socket.recvmsg tests fail with OSError: [Errno 40] Me http://bugs.python.org/issue20669 closed by haypo #29293: Missing parameter "n" on multiprocessing.Condition.notify() http://bugs.python.org/issue29293 closed by pitrou #30259: Test somehow that generated files are up to date: run make reg http://bugs.python.org/issue30259 closed by haypo #30315: test_ftplib.TestTLS_FTPClass: "[Errno 54] Connection reset by http://bugs.python.org/issue30315 closed by haypo #30319: Change socket.close() to ignore ECONNRESET http://bugs.python.org/issue30319 closed by pitrou #30328: test_ssl.test_connect_with_context(): ConnectionResetError on http://bugs.python.org/issue30328 closed by haypo #30330: test_socket.test_idna(): socket.gaierror: [Errno 11001] getadd http://bugs.python.org/issue30330 closed by haypo #30351: [2.7] regrtest hangs on Python 2.7 (test_threading?) http://bugs.python.org/issue30351 closed by haypo #30371: test_long_lines() fails randomly on AMD64 Windows7 SP1 3.x http://bugs.python.org/issue30371 closed by haypo #30441: os.environ raises RuntimeError: dictionary changed size during http://bugs.python.org/issue30441 closed by serhiy.storchaka #30448: test_subprocess creates a core dump on FreeBSD http://bugs.python.org/issue30448 closed by haypo #30532: email.policy.SMTP.fold() mangles long headers http://bugs.python.org/issue30532 closed by r.david.murray #30543: test_timeout fails on AMD64 FreeBSD CURRENT Debug 3.x: Connect http://bugs.python.org/issue30543 closed by haypo #30623: python-nightly import numpy fails since recently http://bugs.python.org/issue30623 closed by ned.deily #30649: test_utime_current_old() of test_os fails randomy on x86 Windo http://bugs.python.org/issue30649 closed by haypo #30651: test_poplib.test_stls_context() access violation on x86 Window http://bugs.python.org/issue30651 closed by haypo #30652: test_threading_not_handled() of test_socketserver hangs random http://bugs.python.org/issue30652 closed by haypo #30703: Non-reentrant signal handler (test_multiprocessing_forkserver http://bugs.python.org/issue30703 closed by pitrou #30726: [Windows] Warnings in elementtree due to new expat http://bugs.python.org/issue30726 closed by haypo #30739: pypi ssl errors [CERTIFICATE_VERIFY_FAILED] http://bugs.python.org/issue30739 closed by ned.deily #30741: https://www.pypi-mirrors.org/ error 503 http://bugs.python.org/issue30741 closed by ned.deily #30758: regrtest hangs sometimes on the master branch (test_pydoc? 
tes http://bugs.python.org/issue30758 closed by haypo #30759: [2.7] Fix python2 -m test --list-cases test_multibytecodec_sup http://bugs.python.org/issue30759 closed by haypo #30777: IDLE: configdialog -- add docstrings and improve comments http://bugs.python.org/issue30777 closed by terry.reedy #30789: Redesign PyCodeObject.co_extras to use a single memory block, http://bugs.python.org/issue30789 closed by serhiy.storchaka #30791: tkinter.Tk() adds suffix to window class name when launching m http://bugs.python.org/issue30791 closed by serhiy.storchaka #30795: OS X failures in test_site http://bugs.python.org/issue30795 closed by ned.deily #30804: bolen-dmg-3.x build-installer.py failed http://bugs.python.org/issue30804 closed by haypo #30818: Warning -- asyncore.socket_map was modified by test_ftplib on http://bugs.python.org/issue30818 closed by haypo #30827: Tweak order of links in https://www.python.org/downloads/sourc http://bugs.python.org/issue30827 closed by berker.peksag #30829: 'Cannot serialize socket object' after ssl.wrap_socket http://bugs.python.org/issue30829 closed by pitrou #30832: Remove own implementation for thread-local storage http://bugs.python.org/issue30832 closed by haypo #30838: re \w does not match some valid Unicode characters http://bugs.python.org/issue30838 closed by davidism #30843: [2.7] Lib/test/bisect.py conflicts with Lib/bisect.py when run http://bugs.python.org/issue30843 closed by haypo #30854: Compile error on Python/ceval.c without threads http://bugs.python.org/issue30854 closed by haypo #30855: [3.5] test_tk: test_use() of test_tkinter.test_widgets randoml http://bugs.python.org/issue30855 closed by haypo #30862: parent logger should also check the level http://bugs.python.org/issue30862 closed by vinay.sajip From nad at python.org Sat Jul 8 01:22:32 2017 From: nad at python.org (Ned Deily) Date: Sat, 8 Jul 2017 01:22:32 -0400 Subject: [Python-Dev] [RELEASE] Python 3.6.2rc2 is now available for testing Message-ID: <4B22E08B-9C6C-47F6-B908-CC7676D43B77@python.org> On behalf of the Python development community and the Python 3.6 release team, I would like to announce the availability of Python 3.6.2rc2. 3.6.2rc2 is the second release candidate for Python 3.6.2, the next maintenance release of Python 3.6. 3.6.2rc2 includes fixes for three security-related issues resolved since the previous release candidate; see the change log (link below). While 3.6.2rc2 is a preview release and, thus, not intended for production environments, we encourage you to explore it and provide feedback via the Python bug tracker (https://bugs.python.org). Please see "What?s New In Python 3.6" for more information: https://docs.python.org/3.6/whatsnew/3.6.html You can find Python 3.6.2rc2 here: https://www.python.org/downloads/release/python-362rc2/ and its change log here: https://docs.python.org/3.6/whatsnew/changelog.html#python-3-6-2-release-candidate-2 3.6.2 is now planned for final release on 2017-07-17 with the next maintenance release expected to follow in about 3 months. More information about the 3.6 release schedule can be found here: https://www.python.org/dev/peps/pep-0494/ -- Ned Deily nad at python.org -- [] From bhavishyagopesh at gmail.com Sun Jul 9 10:08:09 2017 From: bhavishyagopesh at gmail.com (Bhavishya) Date: Sun, 9 Jul 2017 19:38:09 +0530 Subject: [Python-Dev] Pure pickle bechmark. 
Message-ID: 

Hello,
1) I was going through the code of *python pickle* to look for any
optimization possibilities. But the only thing that I found very alarming was
again the import time (I tried lazy imports, but it didn't help much).

I found py3 to be ~45 times slower on *initial imports* (a very raw
measure, using "time.time()") as compared to py2 on a usual example.

py3->
./python -c '
favorite_color = { "lion": "yellow", "kitty": "red" }
pickle.dump( favorite_color, open( "save.p", "wb" ) )'
0.009715557098388672 (time taken to do initial imports... measured using
time.time())

py2->
./python -c '
favorite_color = { "lion": "yellow", "kitty": "red" }
pickle.dump( favorite_color, open( "save.p", "wb" ) )'
0.000236034393311 (time taken to do initial imports... measured using
time.time())

Do you have any thoughts/ideas on improving this?


Thank You.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From python at mrabarnett.plus.com Sun Jul 9 12:12:28 2017
From: python at mrabarnett.plus.com (MRAB)
Date: Sun, 9 Jul 2017 17:12:28 +0100
Subject: [Python-Dev] Pure pickle bechmark.
In-Reply-To: 
References: 
Message-ID: <1ccb6b93-65dd-0d4e-6d4b-1426aea6b65a@mrabarnett.plus.com>

On 2017-07-09 15:08, Bhavishya wrote:
> Hello,
> 1) I was going through the code of *python pickle* to look for any
> optimization possibilities. But the only thing that I found very alarming
> was again the import time (I tried lazy imports, but it didn't help
> much).
>
> I found py3 to be ~45 times slower on *initial imports* (a very raw
> measure, using "time.time()") as compared to py2 on a usual example.
>
> py3->
> ./python -c '
> favorite_color = { "lion": "yellow", "kitty": "red" }
> pickle.dump( favorite_color, open( "save.p", "wb" ) )'
> 0.009715557098388672 (time taken to do initial imports... measured using
> time.time())
>
> py2->
> ./python -c '
> favorite_color = { "lion": "yellow", "kitty": "red" }
> pickle.dump( favorite_color, open( "save.p", "wb" ) )'
> 0.000236034393311 (time taken to do initial imports... measured using
> time.time())
>
> Do you have any thoughts/ideas on improving this?
>
Python 3 is using Unicode strings, whereas Python 2 is using bytestrings.

What you show above are very short (in time) examples (less than 1/100 of a
second), so they're not that meaningful.

If you had timed pickling a substantial object (the same object in both
cases) and it took a significant amount of time and you found a significant
slowdown, then it would be worth looking into further.

From solipsis at pitrou.net Sun Jul 9 12:58:36 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 9 Jul 2017 18:58:36 +0200
Subject: [Python-Dev] Pure pickle bechmark.
References: 
Message-ID: <20170709185836.67138939@fsol>

Hi,

On Sun, 9 Jul 2017 19:38:09 +0530
Bhavishya wrote:
> Hello,
>
> 1) I was going through the code of *python pickle* to look for any
> optimization possibilities. But the only thing that I found very alarming
> was again the import time (I tried lazy imports, but it didn't help much).
>
> I found py3 to be ~45 times slower on *initial imports* (a very raw
> measure, using "time.time()") as compared to py2 on a usual example.

Can you explain how you measured exactly?

Regards

Antoine.

From songofacandy at gmail.com Sun Jul 9 19:17:30 2017
From: songofacandy at gmail.com (INADA Naoki)
Date: Mon, 10 Jul 2017 08:17:30 +0900
Subject: [Python-Dev] Pure pickle bechmark.
In-Reply-To: 
References: 
Message-ID: 

I don't know this is relating to your case.
When I saw Victor's report [1], I researched why Python 3 is slower than Python 2 on unpickle_pure_python benchmark. [1] https://mail.python.org/pipermail/speed/2017-February/000503.html And I found Python 2 and 3 uses different version of pickle format. Current Python 3 uses "framing" format. While unpickling, `read(1)` is very performance critical. Python 2 uses `cStringIO.read` which is implemented in C. On the other hand, Python 3 uses `_Unframer.read` which is implemented in Python. Since this is not relating to "first import time", I don't know this is what you want to optimize. (Since _pickle is used for normal case, pure Python unpickle performance is not a common problem). If you want to optimize it, _Unframer uses BytesIO internally and performance critical part may be able to call BytesIO.read directly instead of _Unframer.read. Regards, INADA Naoki On Sun, Jul 9, 2017 at 11:08 PM, Bhavishya wrote: > Hello, > > 1).I was going through the code of python pickle to search any optimization > possibility.But the only thing that I found very alarming was again the > import time(I tried with lazy-import but it didn't helped much.) > > I found py3 to be ~45 times slower on initial imports(very raw > measure..using "time." ) as compared to py2 on an usual example. > > py3-> > ./python -c ' > favorite_color = { "lion": "yellow", "kitty": "red" } > pickle.dump( favorite_color, open( "save.p", "wb" ) )' > 0.009715557098388672(time taken to do initial imports...measured using > time.time() ) > > py2-> > ./python -c ' > favorite_color = { "lion": "yellow", "kitty": "red" } > pickle.dump( favorite_color, open( "save.p", "wb" ) )' > 0.000236034393311(time taken to do initial imports...measured using > time.time() ) > > Do you have any thought/ideas on improving this? > > > Thank You. From victor.stinner at gmail.com Sun Jul 9 19:36:03 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 10 Jul 2017 01:36:03 +0200 Subject: [Python-Dev] Pure pickle bechmark. In-Reply-To: References: Message-ID: Wait. Are we talking about the C accelerator or the pure Python implementation of pickle on Python 3? Victor Le 10 juil. 2017 01:19, "INADA Naoki" a ?crit : > I don't know this is relating to your case. > > When I saw Victor's report [1], I researched why Python 3 is slower than > Python 2 on unpickle_pure_python benchmark. > > [1] https://mail.python.org/pipermail/speed/2017-February/000503.html > > > And I found Python 2 and 3 uses different version of pickle format. > > Current Python 3 uses "framing" format. While unpickling, `read(1)` is > very performance critical. Python 2 uses `cStringIO.read` which is > implemented in C. > On the other hand, Python 3 uses `_Unframer.read` which is implemented > in Python. > > Since this is not relating to "first import time", I don't know this > is what you want to optimize. > (Since _pickle is used for normal case, pure Python unpickle > performance is not a common > problem). > > If you want to optimize it, _Unframer uses BytesIO internally and > performance critical > part may be able to call BytesIO.read directly instead of _Unframer.read. > > Regards, > INADA Naoki > > > On Sun, Jul 9, 2017 at 11:08 PM, Bhavishya > wrote: > > Hello, > > > > 1).I was going through the code of python pickle to search any > optimization > > possibility.But the only thing that I found very alarming was again the > > import time(I tried with lazy-import but it didn't helped much.) 
> > > > I found py3 to be ~45 times slower on initial imports(very raw > > measure..using "time." ) as compared to py2 on an usual example. > > > > py3-> > > ./python -c ' > > favorite_color = { "lion": "yellow", "kitty": "red" } > > pickle.dump( favorite_color, open( "save.p", "wb" ) )' > > 0.009715557098388672(time taken to do initial imports...measured using > > time.time() ) > > > > py2-> > > ./python -c ' > > favorite_color = { "lion": "yellow", "kitty": "red" } > > pickle.dump( favorite_color, open( "save.p", "wb" ) )' > > 0.000236034393311(time taken to do initial imports...measured using > > time.time() ) > > > > Do you have any thought/ideas on improving this? > > > > > > Thank You. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > victor.stinner%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Sun Jul 9 19:39:54 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Mon, 10 Jul 2017 08:39:54 +0900 Subject: [Python-Dev] Pure pickle bechmark. In-Reply-To: References: Message-ID: I said about pure Python implementation (unpickle_pure_python), because mail title is "Pure pickle bechmark". INADA Naoki On Mon, Jul 10, 2017 at 8:36 AM, Victor Stinner wrote: > Wait. Are we talking about the C accelerator or the pure Python > implementation of pickle on Python 3? > > Victor > > Le 10 juil. 2017 01:19, "INADA Naoki" a ?crit : >> >> I don't know this is relating to your case. >> >> When I saw Victor's report [1], I researched why Python 3 is slower than >> Python 2 on unpickle_pure_python benchmark. >> >> [1] https://mail.python.org/pipermail/speed/2017-February/000503.html >> >> >> And I found Python 2 and 3 uses different version of pickle format. >> >> Current Python 3 uses "framing" format. While unpickling, `read(1)` is >> very performance critical. Python 2 uses `cStringIO.read` which is >> implemented in C. >> On the other hand, Python 3 uses `_Unframer.read` which is implemented >> in Python. >> >> Since this is not relating to "first import time", I don't know this >> is what you want to optimize. >> (Since _pickle is used for normal case, pure Python unpickle >> performance is not a common >> problem). >> >> If you want to optimize it, _Unframer uses BytesIO internally and >> performance critical >> part may be able to call BytesIO.read directly instead of _Unframer.read. >> >> Regards, >> INADA Naoki >> >> >> On Sun, Jul 9, 2017 at 11:08 PM, Bhavishya >> wrote: >> > Hello, >> > >> > 1).I was going through the code of python pickle to search any >> > optimization >> > possibility.But the only thing that I found very alarming was again the >> > import time(I tried with lazy-import but it didn't helped much.) >> > >> > I found py3 to be ~45 times slower on initial imports(very raw >> > measure..using "time." ) as compared to py2 on an usual example. 
>> > >> > py3-> >> > ./python -c ' >> > favorite_color = { "lion": "yellow", "kitty": "red" } >> > pickle.dump( favorite_color, open( "save.p", "wb" ) )' >> > 0.009715557098388672(time taken to do initial imports...measured using >> > time.time() ) >> > >> > py2-> >> > ./python -c ' >> > favorite_color = { "lion": "yellow", "kitty": "red" } >> > pickle.dump( favorite_color, open( "save.p", "wb" ) )' >> > 0.000236034393311(time taken to do initial imports...measured using >> > time.time() ) >> > >> > Do you have any thought/ideas on improving this? >> > >> > >> > Thank You. >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com From victor.stinner at gmail.com Sun Jul 9 19:47:16 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 10 Jul 2017 01:47:16 +0200 Subject: [Python-Dev] Pure pickle bechmark. In-Reply-To: References: Message-ID: Please explain how to reproduce your benchmark. Maybe write a shell script? Victor Le 9 juil. 2017 17:49, "Bhavishya" a ?crit : > Hello, > > 1).I was going through the code of *python pickle* to search any > optimization possibility.But the only thing that I found very alarming was > again the import time(I tried with lazy-import but it didn't helped much.) > > I found py3 to be ~45 times slower on* initial imports(very raw > measure..using "time." ) *as compared to py2 on an usual example. > > py3-> > ./python -c ' > favorite_color = { "lion": "yellow", "kitty": "red" } > pickle.dump( favorite_color, open( "save.p", "wb" ) )' > 0.009715557098388672(time taken to do initial imports...measured using > *time.time()* ) > > py2-> > ./python -c ' > favorite_color = { "lion": "yellow", "kitty": "red" } > pickle.dump( favorite_color, open( "save.p", "wb" ) )' > 0.000236034393311(time taken to do initial imports...measured using > *time.time()* ) > > Do you have any thought/ideas on improving this? > > > Thank You. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > victor.stinner%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Sun Jul 9 20:35:37 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 10 Jul 2017 02:35:37 +0200 Subject: [Python-Dev] Pure pickle bechmark. In-Reply-To: References: Message-ID: (Oops, I didn't notice that we started to talk off the list, let's discuss that in python-dev please.) I don't see the point of optimizing "pickle/unpickle pure python" benchmark on Python 3. This benchmark doesn't make sense on Python 3, since I don't know anyone using the pure Python pickle. The C accelerator is now used by default. I already proposed to remove this benchmark: https://mail.python.org/pipermail/speed/2017-April/000554.html *but* Antoine Pitrou mentionned that the cloudpickle project uses it. Maybe we should try to understand what's wrong with _pickle (C module) for cloudpickle? Victor 2017-07-10 2:10 GMT+02:00 Bhavishya : > I was working on the two regressed benchmarks (i.e. pickle/unpickle > pure-python), and as it was a case with other benchmarks....that performance > is affected by import ...I thought that could be a case with pickle.py > too. 
And thus tried adding the above patch to Lib/pickle.py to measure the > initial import time. > > I haven't tried it for any practical use-case. > > > On Mon, Jul 10, 2017 at 5:27 AM, Victor Stinner > wrote: >> >> Sorry, I don't understand the direct link between the import time of 4 >> modules and the pickle module. Can you please elaborate? >> >> What are you trying to optimize? >> >> What is your use case? >> >> Victor From solipsis at pitrou.net Mon Jul 10 05:13:13 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 10 Jul 2017 11:13:13 +0200 Subject: [Python-Dev] Extendability of C vs Python pickle References: Message-ID: <20170710111313.129f67db@fsol> On Mon, 10 Jul 2017 02:35:37 +0200 Victor Stinner wrote: > > I already proposed to remove this benchmark: > https://mail.python.org/pipermail/speed/2017-April/000554.html > > *but* Antoine Pitrou mentionned that the cloudpickle project uses it. > > Maybe we should try to understand what's wrong with _pickle (C module) > for cloudpickle? That's a good question, Victor. cloudpickle uses three hooks inside pickle.py's Pickler: - the "dispatch" dictionary - overriding the "save_global" method to support saving more objects (such as closures, etc.) - overriding the "save_reduce" method; this one doesn't seem really necessary, perhaps some leftover from previous attempts _pickle.c's Pickler does seem to allow a custom "dispatch_table", but it doesn't allow overriding "save_global". Of course, if _pickle.c were improved to allow such extensions, it would suddenly allow cloudpickle to be much more performant! Regards Antoine. From artieua at gmail.com Mon Jul 10 10:37:40 2017 From: artieua at gmail.com (Artem Muterko) Date: Mon, 10 Jul 2017 17:37:40 +0300 Subject: [Python-Dev] Improve test coverage for standard library Message-ID: Good day, I've noticed that test coverage of standard library tools can be improved and wanted to ask which module is better to start with? I'm kindly asking to point to the module which would be good to start writing tests for and also easy to review for other contributors. Best regards, Artem Muterko -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Mon Jul 10 14:19:28 2017 From: brett at python.org (Brett Cannon) Date: Mon, 10 Jul 2017 18:19:28 +0000 Subject: [Python-Dev] Improve test coverage for standard library In-Reply-To: References: Message-ID: In general the answer to helping with code coverage is "whatever module motivates you to help". :) So unless a core dev has a specific module that they want to help you write tests for then it's whatever you want to work on. And for easy reference, the code coverage report can be found at https://codecov.io/gh/python/cpython (although it is somewhat inaccurate for any module imported at startup). On Mon, 10 Jul 2017 at 07:54 Artem Muterko wrote: > Good day, > > I've noticed that test coverage of standard library tools can be improved > and wanted to ask which module is better to start with? > > I'm kindly asking to point to the module which would be good to start > writing tests for and also easy to review for other contributors. > > Best regards, > Artem Muterko > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From larry at hastings.org Wed Jul 12 09:09:50 2017
From: larry at hastings.org (Larry Hastings)
Date: Wed, 12 Jul 2017 15:09:50 +0200
Subject: [Python-Dev] Should I make a 3.4.7rc1 next weekend?
Message-ID: <20ec439d-5735-1285-5380-ae22b57c50c6@hastings.org>

I'm scheduled to tag and release 3.5.4rc1 next weekend.  I've been
releasing 3.4 and 3.5 at the same time for the last year; this is
convenient for me as it halves the frequency with which I have to put on
the "release manager" hat.

There are currently no scheduled dates to release 3.4.7.  The reason being
that until very recently there was almost no work done in 3.4 since 3.4.6
was tagged.  But!  The reason for /that/ was because of a change in the
workflow: once we switched to GitHub, for branches that are in
security-fixes-only mode, only the Release Manager is allowed to accept
PRs into that branch.  It turned out there were a bunch of PRs waiting
for my approval.

After a flurry of accepted PRs, I have now accrued about ten fresh
security fixes in the 3.4 branch.  (Mostly from Victor, but also Serhiy,
and one from Barry--thanks everyone!)  There are now no outstanding
security fix PRs against 3.4.

Since I'm releasing 3.5.4rc1 next weekend, I wouldn't mind /also/
releasing 3.4.7rc1 next weekend.  That would put 3.4.7 final the same day
as 3.5.4 final: just over three weeks from now, releasing on Sunday
August 5.

I realize it's not much notice, and that's normally not how we do things
in the CPython world.  (Sorry for the short notice--it's my fault for not
adjusting to the new workflow quickly enough.)

Anyway the point of this email is to call for a vote.  Which of these
statements do you agree with:

* Larry should tag and release 3.4.7rc1 next weekend.

* Larry should schedule 3.4.7rc1 for a month from now, to give people
  time to get their work in.

In particular, Victor and Serhiy, I'm interested in your votes.  You both
get veto powers for the short notice--if either of you say "do it a month
from now" then it'll be a month from now.

Also, if anybody has security fixes you want to get in to the next
release of 3.4, but you haven't made a PR yet, please reply and describe
them.  (Please reply to list if appropriate, but if it should be kept
quiet please reply to me directly.)

Braising in my own juices at EuroPython,

//arry/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From larry at hastings.org Wed Jul 12 09:12:50 2017
From: larry at hastings.org (Larry Hastings)
Date: Wed, 12 Jul 2017 15:12:50 +0200
Subject: [Python-Dev] Reminder: 3.5.4rc1 will be tagged next Saturday, July 22 2017
Message-ID:

Just a quick reminder.  I'll be tagging 3.5.4rc1 next Saturday, July 22.
3.5.4 final will be the last release of 3.5 that accepts bugfixes; after
that, the 3.5 branch will transition to security-fixes-only mode.  If you
have bugfixes you want to ship with 3.5.4, please get them committed in
the next nine days.

Happy hacking,

//arry/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From victor.stinner at gmail.com Wed Jul 12 09:18:42 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 12 Jul 2017 15:18:42 +0200
Subject: [Python-Dev] [python-committers] Should I make a 3.4.7rc1 next weekend?
In-Reply-To: <20ec439d-5735-1285-5380-ae22b57c50c6@hastings.org>
References: <20ec439d-5735-1285-5380-ae22b57c50c6@hastings.org>
Message-ID:

I would love to have a new 3.4 release including all security fixes, sure!
It would reduce the number of known vulnerability in Python 3.4: http://python-security.readthedocs.io/vulnerabilities.html 2017-07-12 15:09 GMT+02:00 Larry Hastings : > After a flurry of accepted PRs, I have now accrued about ten fresh security > fixes in the 3.4 branch. (Mostly from Victor, but also Serhiy, and one from > Barry--thanks everyone!) There are now no outstanding security fix PRs > against 3.4. Thanks for merging them ;-) I would like to see my "[3.4] Backport CI config from master" PR merged into 3.4 to get at least a check from Travis and AppVeyor that there is no major regression on Linux and Windows: https://github.com/python/cpython/pull/2475 If I recall correctly, it would be the first time that we have a CI for a branch in security-fix only mode, no? Victor From victor.stinner at gmail.com Thu Jul 13 11:33:45 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 13 Jul 2017 17:33:45 +0200 Subject: [Python-Dev] Articles on my contributions to CPython during 2017 Q1 and Q2 Message-ID: Hi, I wrote a serie of new articles on my contributions to CPython during 2017 Q1 and Q2. "My contributions to CPython during 2017 Q1" https://haypo.github.io/contrib-cpython-2017q1.html "New Python test.bisect tool" https://haypo.github.io/python-test-bisect.html "Work on Python buildbots, 2017 Q2" https://haypo.github.io/python-buildbots-2017q2.html "My contributions to CPython during 2017 Q2 (part 1)" https://haypo.github.io/contrib-cpython-2017q2-part1.html "My contributions to CPython during 2017 Q2 (part 2)" https://haypo.github.io/contrib-cpython-2017q2-part2.html "My contributions to CPython during 2017 Q2 (part 3)" https://haypo.github.io/contrib-cpython-2017q2-part3.html Good reading ;-) Victor From benhoyt at gmail.com Fri Jul 14 07:37:45 2017 From: benhoyt at gmail.com (Ben Hoyt) Date: Fri, 14 Jul 2017 07:37:45 -0400 Subject: [Python-Dev] Articles on my contributions to CPython during 2017 Q1 and Q2 In-Reply-To: References: Message-ID: Wow, amazing work. The Stinnerbot strikes again! A lot of great optimizations and bugfixes. Speaking of optimizations, I just wrote some code which takes 12s on Python 2.7 and 5s on Python 3.5. so we're doing something right! I might post about it shortly. -Ben On Jul 13, 2017 11:34 AM, "Victor Stinner" wrote: Hi, I wrote a serie of new articles on my contributions to CPython during 2017 Q1 and Q2. "My contributions to CPython during 2017 Q1" https://haypo.github.io/contrib-cpython-2017q1.html "New Python test.bisect tool" https://haypo.github.io/python-test-bisect.html "Work on Python buildbots, 2017 Q2" https://haypo.github.io/python-buildbots-2017q2.html "My contributions to CPython during 2017 Q2 (part 1)" https://haypo.github.io/contrib-cpython-2017q2-part1.html "My contributions to CPython during 2017 Q2 (part 2)" https://haypo.github.io/contrib-cpython-2017q2-part2.html "My contributions to CPython during 2017 Q2 (part 3)" https://haypo.github.io/contrib-cpython-2017q2-part3.html Good reading ;-) Victor _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/ benhoyt%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From victor.stinner at gmail.com Fri Jul 14 09:33:05 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 14 Jul 2017 15:33:05 +0200 Subject: [Python-Dev] Articles on my contributions to CPython during 2017 Q1 and Q2 In-Reply-To: References: Message-ID: 2017-07-14 13:37 GMT+02:00 Ben Hoyt : > Wow, amazing work. The Stinnerbot strikes again! Thanks. > A lot of great optimizations and bugfixes. Speaking of optimizations, I just > wrote some code which takes 12s on Python 2.7 and 5s on Python 3.5. so we're > doing something right! I might post about it shortly. Hum, I'm curious to see which kind of code becomes so much faster on Python 3. Victor From p.f.moore at gmail.com Fri Jul 14 10:20:47 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 14 Jul 2017 15:20:47 +0100 Subject: [Python-Dev] Articles on my contributions to CPython during 2017 Q1 and Q2 In-Reply-To: References: Message-ID: On 14 July 2017 at 14:33, Victor Stinner wrote: >> A lot of great optimizations and bugfixes. Speaking of optimizations, I just >> wrote some code which takes 12s on Python 2.7 and 5s on Python 3.5. so we're >> doing something right! I might post about it shortly. > > Hum, I'm curious to see which kind of code becomes so much faster on Python 3. time.sleep(5 if sys.version_info >= (3,) else 12) :-) Paul From benhoyt at gmail.com Fri Jul 14 10:53:10 2017 From: benhoyt at gmail.com (Ben Hoyt) Date: Fri, 14 Jul 2017 10:53:10 -0400 Subject: [Python-Dev] Articles on my contributions to CPython during 2017 Q1 and Q2 In-Reply-To: References: Message-ID: Yeah, it was surprising to me too. I thought it'd be faster, but not that much. I did some quick cProfile tests, but that didn't show anything, and I think it's improvements to the bytecode interpreter and various bytecode instructions. (This particular test hammers the bytecode interpreter.) I'll post details in the next week or so. -Ben On Fri, Jul 14, 2017 at 9:33 AM, Victor Stinner wrote: > 2017-07-14 13:37 GMT+02:00 Ben Hoyt : > > Wow, amazing work. The Stinnerbot strikes again! > > Thanks. > > > A lot of great optimizations and bugfixes. Speaking of optimizations, I > just > > wrote some code which takes 12s on Python 2.7 and 5s on Python 3.5. so > we're > > doing something right! I might post about it shortly. > > Hum, I'm curious to see which kind of code becomes so much faster on > Python 3. > > Victor > -------------- next part -------------- An HTML attachment was scrubbed... URL: From status at bugs.python.org Fri Jul 14 12:09:13 2017 From: status at bugs.python.org (Python tracker) Date: Fri, 14 Jul 2017 18:09:13 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20170714160913.E6BDA56AF5@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2017-07-07 - 2017-07-14) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. 
Issues counts and deltas: open 6042 (+21) closed 36641 (+37) total 42683 (+58) Open issues with patches: 2342 Issues opened (39) ================== #26617: Assertion failed in gc with __del__ and weakref http://bugs.python.org/issue26617 reopened by serhiy.storchaka #29766: --with-lto still implied by --enable-optimizations in Python 2 http://bugs.python.org/issue29766 reopened by Arfrever #30296: Remove unnecessary tuples, lists, sets, and dicts from Lib http://bugs.python.org/issue30296 reopened by serhiy.storchaka #30758: test_pydoc hangs sometimes on 3.6 and master branches http://bugs.python.org/issue30758 reopened by haypo #30814: Import dotted name as alias breaks with concurrency http://bugs.python.org/issue30814 reopened by haypo #30874: unittest execute tests twice in some conditions http://bugs.python.org/issue30874 opened by ale5000 #30876: SystemError on importing module from unloaded package http://bugs.python.org/issue30876 opened by serhiy.storchaka #30877: possibe typo in json/scanner.py http://bugs.python.org/issue30877 opened by c-fos #30882: Built-in list disappeared from Python 2.7 intersphinx inventor http://bugs.python.org/issue30882 opened by Christoph.Deil #30883: test_urllib2net failed on s390x Debian 3.6: ftp.debian.org err http://bugs.python.org/issue30883 opened by haypo #30884: regrtest -jN --timeout=TIMEOUT should kill child process runni http://bugs.python.org/issue30884 opened by haypo #30885: test_subprocess hangs on AMD64 Windows8.1 Refleaks 3.x http://bugs.python.org/issue30885 opened by haypo #30888: import class not isinstance of the class http://bugs.python.org/issue30888 opened by ????????? #30889: distutils extra_link_args not working because it is added to c http://bugs.python.org/issue30889 opened by Liu Cailiang #30891: importlib: _find_and_load() race condition on sys.modules[name http://bugs.python.org/issue30891 opened by haypo #30892: _elementtree: assertion error if stdlib copy module is overrid http://bugs.python.org/issue30892 opened by haypo #30893: Expose importlib._bootstrap._ModuleLockManager in importlib.ma http://bugs.python.org/issue30893 opened by brett.cannon #30897: Add a is_mount() to pathlib http://bugs.python.org/issue30897 opened by cooperlees #30898: SSL cert failure running make test during Python 3.6 install http://bugs.python.org/issue30898 opened by Ben Johnston #30903: IPv4Network's hostmask attribute doesn't returns string value http://bugs.python.org/issue30903 opened by Abhijit Mamarde #30904: Python 3 logging HTTPHandler sends duplicate Host header http://bugs.python.org/issue30904 opened by lhelwerd #30905: Embedding should have public API for interactive mode http://bugs.python.org/issue30905 opened by steveire #30907: speed up comparisons to self for built-in containers http://bugs.python.org/issue30907 opened by wbolster #30908: test_os.TestSendfile.test_keywords() leaks dangling threads http://bugs.python.org/issue30908 opened by haypo #30909: ServerProxy should not make requests with malformed XML http://bugs.python.org/issue30909 opened by Alex Corcoles #30910: Add -fexception to ppc64le build http://bugs.python.org/issue30910 opened by brunoalr #30912: python 3 git master fails to find libffi and build _ctypes on http://bugs.python.org/issue30912 opened by shlomif #30915: distutils sometimes assumes wrong C compiler http://bugs.python.org/issue30915 opened by moxian #30916: Pre-build OpenSSL and Tcl/Tk for Windows http://bugs.python.org/issue30916 opened by steve.dower #30917: IDLE: Add 
idlelib.config.IdleConf unittest http://bugs.python.org/issue30917 opened by louielu #30918: Unable to launch IDLE in windows 7 http://bugs.python.org/issue30918 opened by trencyclo #30919: Shared Array Memory Allocation Regression http://bugs.python.org/issue30919 opened by dtasev #30920: Sequence Matcher from diff lib is not implementing longest com http://bugs.python.org/issue30920 opened by Syam Mohan #30923: Add -Wimplicit-fallthrough=0 to Makefile ? http://bugs.python.org/issue30923 opened by matrixise #30924: RPM build doc_files needs files separated into separate lines http://bugs.python.org/issue30924 opened by warthog9 #30925: RPM build lacks ability to include other files similar to doc_ http://bugs.python.org/issue30925 opened by warthog9 #30926: KeyError with cgitb inspecting exception in generator expressi http://bugs.python.org/issue30926 opened by jason.coombs #30928: Copy modified blurbs to idlelib/NEWS.txt http://bugs.python.org/issue30928 opened by terry.reedy #30929: AttributeErrors after import in multithreaded environment http://bugs.python.org/issue30929 opened by boytsovea Most recent 15 issues with no replies (15) ========================================== #30929: AttributeErrors after import in multithreaded environment http://bugs.python.org/issue30929 #30926: KeyError with cgitb inspecting exception in generator expressi http://bugs.python.org/issue30926 #30925: RPM build lacks ability to include other files similar to doc_ http://bugs.python.org/issue30925 #30924: RPM build doc_files needs files separated into separate lines http://bugs.python.org/issue30924 #30918: Unable to launch IDLE in windows 7 http://bugs.python.org/issue30918 #30915: distutils sometimes assumes wrong C compiler http://bugs.python.org/issue30915 #30912: python 3 git master fails to find libffi and build _ctypes on http://bugs.python.org/issue30912 #30910: Add -fexception to ppc64le build http://bugs.python.org/issue30910 #30905: Embedding should have public API for interactive mode http://bugs.python.org/issue30905 #30904: Python 3 logging HTTPHandler sends duplicate Host header http://bugs.python.org/issue30904 #30903: IPv4Network's hostmask attribute doesn't returns string value http://bugs.python.org/issue30903 #30898: SSL cert failure running make test during Python 3.6 install http://bugs.python.org/issue30898 #30889: distutils extra_link_args not working because it is added to c http://bugs.python.org/issue30889 #30874: unittest execute tests twice in some conditions http://bugs.python.org/issue30874 #30872: Update curses docs to Python 3 http://bugs.python.org/issue30872 Most recent 15 issues waiting for review (15) ============================================= #30891: importlib: _find_and_load() race condition on sys.modules[name http://bugs.python.org/issue30891 #30877: possibe typo in json/scanner.py http://bugs.python.org/issue30877 #30876: SystemError on importing module from unloaded package http://bugs.python.org/issue30876 #30872: Update curses docs to Python 3 http://bugs.python.org/issue30872 #30867: Add necessary macro that insure `HAVE_OPENSSL_VERIFY_PARAM` to http://bugs.python.org/issue30867 #30863: Rewrite PyUnicode_AsWideChar() and PyUnicode_AsWideCharString( http://bugs.python.org/issue30863 #30860: Consolidate stateful C globals under a single struct. 
http://bugs.python.org/issue30860 #30828: Out of bounds write in _asyncio_Future_remove_done_callback http://bugs.python.org/issue30828 #30817: Abort in PyErr_PrintEx() when no memory http://bugs.python.org/issue30817 #30808: Use _Py_atomic API for concurrency-sensitive signal state http://bugs.python.org/issue30808 #30747: _Py_atomic_* not actually atomic on Windows with MSVC http://bugs.python.org/issue30747 #30714: test_ssl fails with openssl 1.1.0f http://bugs.python.org/issue30714 #30711: getaddrinfo invalid port number http://bugs.python.org/issue30711 #30710: getaddrinfo raises OverflowError http://bugs.python.org/issue30710 #30696: infinite loop in PyRun_InteractiveLoopFlags() http://bugs.python.org/issue30696 Top 10 most discussed issues (10) ================================= #30870: IDLE: configdialog/fonts: change font when select by key up/do http://bugs.python.org/issue30870 19 msgs #30891: importlib: _find_and_load() race condition on sys.modules[name http://bugs.python.org/issue30891 18 msgs #30876: SystemError on importing module from unloaded package http://bugs.python.org/issue30876 13 msgs #30919: Shared Array Memory Allocation Regression http://bugs.python.org/issue30919 11 msgs #30730: [security] Injecting environment variable in subprocess on Win http://bugs.python.org/issue30730 8 msgs #27099: IDLE: turn builting extensions into regular modules http://bugs.python.org/issue27099 7 msgs #27584: New addition of vSockets to the python socket module http://bugs.python.org/issue27584 7 msgs #30907: speed up comparisons to self for built-in containers http://bugs.python.org/issue30907 7 msgs #8231: Unable to run IDLE without write-access to home directory http://bugs.python.org/issue8231 6 msgs #30171: Emit ResourceWarning in multiprocessing Queue destructor http://bugs.python.org/issue30171 6 msgs Issues closed (41) ================== #10438: list an example for calling static methods from WITHIN classes http://bugs.python.org/issue10438 closed by serhiy.storchaka #13220: print function unable while multiprocessing.Process is being r http://bugs.python.org/issue13220 closed by terry.reedy #22607: find by dichotomy the failing test http://bugs.python.org/issue22607 closed by haypo #25746: test_unittest failure in leaks searching mode http://bugs.python.org/issue25746 closed by serhiy.storchaka #29464: Specialize FASTCALL for functions with positional-only paramet http://bugs.python.org/issue29464 closed by serhiy.storchaka #29591: expat 2.2.0: Various security vulnerabilities in bundled expat http://bugs.python.org/issue29591 closed by larry #29812: test for token.py, and consistency tests for tokenize.py http://bugs.python.org/issue29812 closed by ammar2 #29854: Segfault when readline history is more then 2 * history size http://bugs.python.org/issue29854 closed by berker.peksag #30251: Windows Visual Studio solution does not have an install target http://bugs.python.org/issue30251 closed by steve.dower #30444: Add ability to change "-- more --" text in pager module http://bugs.python.org/issue30444 closed by Gautam krishna.R #30731: Use correct executable manifest for windows http://bugs.python.org/issue30731 closed by steve.dower #30779: IDLE: configdialog -- factor out Changes class http://bugs.python.org/issue30779 closed by terry.reedy #30801: shoutdown process error with python 3.4 and pyqt/PySide http://bugs.python.org/issue30801 closed by larry #30823: os.startfile("") craches Python 2.7, 3.4 in Windows 7 32 bit i http://bugs.python.org/issue30823 closed by 
terry.reedy #30837: Mac OS High Sierra Beta - Python Crash http://bugs.python.org/issue30837 closed by ned.deily #30851: IDLE: configdialog -- fix tkinter Variables http://bugs.python.org/issue30851 closed by terry.reedy #30859: Can't install Python for Windows 3.6.1 on multiple profiles http://bugs.python.org/issue30859 closed by terry.reedy #30873: `SystemError: returned NULL http://bugs.python.org/issue30873 closed by haypo #30875: round(number[, digits]) does not return value with >12 decima http://bugs.python.org/issue30875 closed by john Forgue #30878: The staticmethod doesn't properly reject keyword arguments http://bugs.python.org/issue30878 closed by serhiy.storchaka #30879: os.listdir(bytes) gives a list of bytes, but os.listdir(buffer http://bugs.python.org/issue30879 closed by serhiy.storchaka #30880: PCG random number generator http://bugs.python.org/issue30880 closed by rhettinger #30881: IDLE: add docstrings to browser.py http://bugs.python.org/issue30881 closed by terry.reedy #30886: multiprocessing.Queue.join_thread() does nothing if created an http://bugs.python.org/issue30886 closed by haypo #30887: Syntax checking confuses Try: class_instance_name as ... is us http://bugs.python.org/issue30887 closed by steven.daprano #30890: IDLE: Input method error in comment with Korean language http://bugs.python.org/issue30890 closed by terry.reedy #30894: Python 3.6.1 String Literal Error Not Going to sys.stderr http://bugs.python.org/issue30894 closed by steven.daprano #30895: Decimal arithmetic sum error http://bugs.python.org/issue30895 closed by zach.ware #30896: BytesWarning in re module when compiling certain bytes pattern http://bugs.python.org/issue30896 closed by serhiy.storchaka #30899: IDLE: Add idle config parser unittest http://bugs.python.org/issue30899 closed by terry.reedy #30900: IDLE: Fix configdialog should use wm_withdraw http://bugs.python.org/issue30900 closed by louielu #30901: "503 HTTP ERROR" on attempt to access some points of python li http://bugs.python.org/issue30901 closed by brett.cannon #30902: Python-Redmine plugin not seeing python install MacOS http://bugs.python.org/issue30902 closed by berker.peksag #30906: os.path.join misjoins paths http://bugs.python.org/issue30906 closed by paul.moore #30911: Warning in _json.c on platforms where char is unsigned http://bugs.python.org/issue30911 closed by haypo #30913: IDLE: Document tk Vars, attributes, methods by tab page http://bugs.python.org/issue30913 closed by terry.reedy #30914: test_alpn_protocols (test.test_ssl.ThreadedTests) fails on Fed http://bugs.python.org/issue30914 closed by ned.deily #30921: Process in not get killed using subprocess.call() in python th http://bugs.python.org/issue30921 closed by gvanrossum #30922: Process in not get killed using subprocess.call() in python th http://bugs.python.org/issue30922 closed by matrixise #30927: re.sub() does not work correctly on '.' pattern and \n http://bugs.python.org/issue30927 closed by mrabarnett #30930: Element wise multiplication issue http://bugs.python.org/issue30930 closed by mark.dickinson From tjreedy at udel.edu Fri Jul 14 13:59:37 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 14 Jul 2017 13:59:37 -0400 Subject: [Python-Dev] Articles on my contributions to CPython during 2017 Q1 and Q2 In-Reply-To: References: Message-ID: On 7/13/2017 11:33 AM, Victor Stinner wrote: > I wrote a serie of new articles on my contributions to CPython during > 2017 Q1 and Q2. ... 
> "Work on Python buildbots, 2017 Q2" > https://haypo.github.io/python-buildbots-2017q2.html "During this quarter, I tried to mark "easy" issues using a "[EASY]" tag in their title and the "easy" or "easy C" keyword. ... I mentored St?phane Wirtel and Louie Lu to fix issues (easy or not)." Thank you for doing this. Louie Lu has made very helpful contributions to IDLE also. -- Terry Jan Reedy From larry at hastings.org Sat Jul 15 08:26:10 2017 From: larry at hastings.org (Larry Hastings) Date: Sat, 15 Jul 2017 14:26:10 +0200 Subject: [Python-Dev] Announcing the schedule for 3.4.7 In-Reply-To: <20ec439d-5735-1285-5380-ae22b57c50c6@hastings.org> References: <20ec439d-5735-1285-5380-ae22b57c50c6@hastings.org> Message-ID: <882df9f6-fbb3-9e2b-c63a-af00028361f0@hastings.org> In reply to my proposal of a few days ago, I received two +1s and no other feedback. So I'm going to issue 3.4.7 with relatively-little notice.t Here's the schedule for 3.4.7; it mirrors the schedule for 3.5.4. Saturday, July 22, 2017 - tag 3.4.7 rc1 Sunday, July 23, 2017 - release 3.4.7 rc1 Sunday, August 6, 2017 - tag 3.4.7 final Monday, August 7, 2017 - release 3.4.7 final Cheers, //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From maxmoroz at gmail.com Sat Jul 15 03:01:36 2017 From: maxmoroz at gmail.com (Max Moroz) Date: Sat, 15 Jul 2017 00:01:36 -0700 Subject: [Python-Dev] deque implementation question In-Reply-To: References: Message-ID: What would be the disadvantage of implementing collections.deque as a circular array (rather than a doubly linked list of blocks)? My naive thinking was that a circular array would maintain the current O(1) append/pop from either side, and would improve index lookup in the middle from O(n) to O(1). What am I missing? The insertion/removal of an arbitrary item specified by a pointer would increase from constant time to linear, but since we don't have pointers this is a moot point. Of course when the circular array is full, it will need to be reallocated, but the amortized cost of that is still O(1). (Moreover, for a bounded deque, there's even an option of preallocation, which would completely eliminate reallocations.) Thanks Max -------------- next part -------------- An HTML attachment was scrubbed... URL: From nad at python.org Sat Jul 15 17:51:37 2017 From: nad at python.org (Ned Deily) Date: Sat, 15 Jul 2017 17:51:37 -0400 Subject: [Python-Dev] Python 3.3.7 release schedule and end-of-life Message-ID: <32CB89BE-A49E-4888-B8C7-4A8CAB8F15F1@python.org> Python 3.3 is fast approaching its end-of-life date, 2017-09-29. Per our release policy, that date is five years after the initial release of 3.3, 3.3.0 final on 2012-09-29. Note that 3.3 has been in security-fix only mode since the 2014-03-08 release of 3.3.5. It has been a while since we produced a 3.3.x security-fix release and, due to his commitments elsewhere, Georg has agreed for me to lead 3.3 to its well-deserved retirement. To that end, I would like to schedule its next, and hopefully final, security-fix release to coincide with the already announced 3.4.7 security-fix release. In particular, we'll plan to tag and release 3.3.7rc1 on Monday 2017-07-24 (UTC) and tag and release 3.3.7 final on Monday 2017-08-07. In the coming days, I'll be reviewing the outstanding 3.3 security issues and merging appropriate 3.3 PRs. 
Some of them have been sitting as patches for a long time so, if you have any such security issues that you think belong in 3.3, it would be very helpful if you would review such patches and turn them into 3.3 PRs. As a reminder, here are the guidelines from the devguide as to what is appropriate for a security-fix only branch: "The only changes made to a security branch are those fixing issues exploitable by attackers such as crashes, privilege escalation and, optionally, other issues such as denial of service attacks. Any other changes are not considered a security risk and thus not backported to a security branch. You should also consider fixing hard-failing tests in open security branches since it is important to be able to run the tests successfully before releasing." Note that documentation changes, other than any that might be related to a security fix, are also out of scope. Assuming no new security issues arise prior to the EOL date, 3.3.7 will likely be the final release of 3.3. And you really shouldn't be using 3.3 at all at this point; while downstream distributors are, of course, free to provide support of 3.3 to their customers, in a little over two months when EOL is reached python-dev will no longer accept any issues or make any changes available for 3.3. If you are still using 3.3, you really owe it to your applications, to your users, and to yourself to upgrade to a more recent release of Python 3, preferably 3.6! Many, many fixes, new features, and substantial performance improvements await you. https://www.python.org/dev/peps/pep-0398/ https://devguide.python.org/devcycle/#security-branches -- Ned Deily nad at python.org -- [] From francismb at email.de Sun Jul 16 11:30:43 2017 From: francismb at email.de (francismb) Date: Sun, 16 Jul 2017 17:30:43 +0200 Subject: [Python-Dev] Articles on my contributions to CPython during 2017 Q1 and Q2 In-Reply-To: References: Message-ID: <7f04b678-fe0d-2dab-f86f-7aaabbd5069e@email.de> Hi Victor, On 07/13/2017 05:33 PM, Victor Stinner wrote: > Hi, > > I wrote a serie of new articles on my contributions to CPython during > 2017 Q1 and Q2. > > "My contributions to CPython during 2017 Q1" > https://haypo.github.io/contrib-cpython-2017q1.html > > "New Python test.bisect tool" > https://haypo.github.io/python-test-bisect.html would it make sense too add a test that passes if the tool doesn't find anything ? if it fails so should the reason be already there (?) (Or to add a test step after running the test suite if some error happened (?)) Thanks ! -- francis From victor.stinner at gmail.com Sun Jul 16 12:19:10 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 16 Jul 2017 18:19:10 +0200 Subject: [Python-Dev] Articles on my contributions to CPython during 2017 Q1 and Q2 In-Reply-To: <7f04b678-fe0d-2dab-f86f-7aaabbd5069e@email.de> References: <7f04b678-fe0d-2dab-f86f-7aaabbd5069e@email.de> Message-ID: I'm not sure that I understood your suggestion. Basically, if a test file fails, you would like to automatically re-run the failing test with test.bisect to identify the failing *methods*? Yeah, it's doable, but I didn't write it :-) It's very easy to run bisect: just replace "-m test" with "-m test.bisect" in your command line, and you are done. Victor 2017-07-16 17:30 GMT+02:00 francismb : > Hi Victor, > > On 07/13/2017 05:33 PM, Victor Stinner wrote: >> Hi, >> >> I wrote a serie of new articles on my contributions to CPython during >> 2017 Q1 and Q2. 
>> >> "My contributions to CPython during 2017 Q1" >> https://haypo.github.io/contrib-cpython-2017q1.html >> >> "New Python test.bisect tool" >> https://haypo.github.io/python-test-bisect.html > would it make sense too add a test that passes if the tool doesn't find > anything ? if it fails so should the reason be already there (?) > (Or to add a test step after running the test suite if some error > happened (?)) > > Thanks ! > -- francis > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com From songofacandy at gmail.com Sun Jul 16 12:42:38 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Mon, 17 Jul 2017 01:42:38 +0900 Subject: [Python-Dev] deque implementation question In-Reply-To: References: Message-ID: I found the answer in _collectionsmodule.c /* Data for deque objects is stored in a doubly-linked list of fixed * length blocks. This assures that appends or pops never move any * other data elements besides the one being appended or popped. * * Another advantage is that it completely avoids use of realloc(), * resulting in more predictable performance. Regards, INADA Naoki On Sat, Jul 15, 2017 at 4:01 PM, Max Moroz wrote: > What would be the disadvantage of implementing collections.deque as a > circular array (rather than a doubly linked list of blocks)? My naive > thinking was that a circular array would maintain the current O(1) > append/pop from either side, and would improve index lookup in the middle > from O(n) to O(1). What am I missing? > > The insertion/removal of an arbitrary item specified by a pointer would > increase from constant time to linear, but since we don't have pointers this > is a moot point. > > Of course when the circular array is full, it will need to be reallocated, > but the amortized cost of that is still O(1). (Moreover, for a bounded > deque, there's even an option of preallocation, which would completely > eliminate reallocations.) > > Thanks > > Max > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com > From brett at python.org Sun Jul 16 17:15:00 2017 From: brett at python.org (Brett Cannon) Date: Sun, 16 Jul 2017 21:15:00 +0000 Subject: [Python-Dev] [python-committers] Python 3.3.7 release schedule and end-of-life In-Reply-To: <32CB89BE-A49E-4888-B8C7-4A8CAB8F15F1@python.org> References: <32CB89BE-A49E-4888-B8C7-4A8CAB8F15F1@python.org> Message-ID: A quick thanks from me, Ned, for stepping forward to help 3.3 pine for the fjords. On Sat, Jul 15, 2017, 14:51 Ned Deily, wrote: > Python 3.3 is fast approaching its end-of-life date, 2017-09-29. Per our > release policy, that date is five years after the initial release of 3.3, > 3.3.0 final on 2012-09-29. Note that 3.3 has been in security-fix only > mode since the 2014-03-08 release of 3.3.5. It has been a while since we > produced a 3.3.x security-fix release and, due to his commitments > elsewhere, Georg has agreed for me to lead 3.3 to its well-deserved > retirement. > > To that end, I would like to schedule its next, and hopefully final, > security-fix release to coincide with the already announced 3.4.7 > security-fix release. 
In particular, we'll plan to tag and release 3.3.7rc1 > on Monday 2017-07-24 (UTC) and tag and release 3.3.7 final on Monday > 2017-08-07. In the coming days, I'll be reviewing the outstanding 3.3 > security issues and merging appropriate 3.3 PRs. Some of them have been > sitting as patches for a long time so, if you have any such security issues > that you think belong in 3.3, it would be very helpful if you would review > such patches and turn them into 3.3 PRs. > > As a reminder, here are the guidelines from the devguide as to what is > appropriate for a security-fix only branch: > > "The only changes made to a security branch are those fixing issues > exploitable by attackers such as crashes, privilege escalation and, > optionally, other issues such as denial of service attacks. Any other > changes are not considered a security risk and thus not backported to a > security branch. You should also consider fixing hard-failing tests in open > security branches since it is important to be able to run the tests > successfully before releasing." > > Note that documentation changes, other than any that might be related to a > security fix, are also out of scope. > > Assuming no new security issues arise prior to the EOL date, 3.3.7 will > likely be the final release of 3.3. And you really shouldn't be using 3.3 > at all at this point; while downstream distributors are, of course, free to > provide support of 3.3 to their customers, in a little over two months when > EOL is reached python-dev will no longer accept any issues or make any > changes available for 3.3. If you are still using 3.3, you really owe it > to your applications, to your users, and to yourself to upgrade to a more > recent release of Python 3, preferably 3.6! Many, many fixes, new > features, and substantial performance improvements await you. > > https://www.python.org/dev/peps/pep-0398/ > https://devguide.python.org/devcycle/#security-branches > > -- > Ned Deily > nad at python.org -- [] > > _______________________________________________ > python-committers mailing list > python-committers at python.org > https://mail.python.org/mailman/listinfo/python-committers > Code of Conduct: https://www.python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sun Jul 16 20:09:39 2017 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 16 Jul 2017 19:09:39 -0500 Subject: [Python-Dev] deque implementation question In-Reply-To: References: Message-ID: [Max Moroz ] > What would be the disadvantage of implementing collections.deque as a > circular array (rather than a doubly linked list of blocks)? ... You answered it yourself ;-) > ... > Of course when the circular array is full, it will need to be reallocated, > but the amortized cost of that is still O(1). Bingo. The primary point of a deque is to make popping and pushing at both ends efficient. That's what the current implementation does: worst-case constant time per push or pop regardless of how many items are in the deque. That beats "amortized O(1)" in the small and in the large. That's why it was done this way. Some other deque methods are consequently slower than they are for lists, but who cares? For example, the only indices I've ever used with a deque are 0 and -1 (to peek at one end or the other of a deque), and the implementation makes accessing those specific indices constant-time too. 
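[Editorial note: Tim's point about the deque trade-off can be made concrete with a small timing sketch. This is an illustration added here, not code from the thread; the element count N is arbitrary.]

# deque is O(1) worst-case at both ends, while list pays O(n) to insert at
# the left; conversely, list indexing is O(1) anywhere, while deque indexing
# away from the ends is O(n) because it walks the linked blocks.
from collections import deque
from timeit import timeit

N = 10000

# Pushing on the left end.
print(timeit("d.appendleft(0)",
             setup="from collections import deque; d = deque()", number=N))
print(timeit("l.insert(0, 0)", setup="l = []", number=N))  # shifts every element

d = deque(range(N))
l = list(range(N))

# Indexing: the ends are cheap for deque, the middle is not.
print(timeit("d[0]; d[-1]", globals=globals(), number=N))
print(timeit("d[N // 2]", globals=globals(), number=N))   # middle of a deque: O(n)
print(timeit("l[N // 2]", globals=globals(), number=N))   # middle of a list: O(1)

On a typical build the appendleft and end-indexing timings stay roughly flat as N grows, while list.insert(0, ...) and deque middle indexing grow roughly linearly, which is the behavior Tim describes.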
From aleksejs at mit.edu Sun Jul 16 22:37:47 2017
From: aleksejs at mit.edu (Aleksejs Popovs)
Date: Sun, 16 Jul 2017 22:37:47 -0400
Subject: [Python-Dev] curses: error handling and the lower right margin of the screen
Message-ID: <6d09e791-6e1d-e0b9-f06e-d8809b65a01e@mit.edu>

Hello everyone,

My name is Aleksejs, and this is my first time posting here. I'm working
on a Python project (a client for Zephyr, MIT's instant messaging system)
that uses curses, and I believe I've found either a bug in the Python
curses bindings or a deficiency in their documentation.

The problem has to do with ncurses' cursor advancing behavior. The manpage
addch(3x) says:

> If the advance is at the right margin:
> * The cursor automatically wraps to the beginning of the next line.
> * At the bottom of the current scrolling region, and if scrollok is
>   enabled, the scrolling region is scrolled up one line.
> * If scrollok is not enabled, writing a character at the lower right
>   margin succeeds. However, an error is returned because it is not
>   possible to wrap to a new line

Python's window.addch(y, x, ch[, attr]) function seems to be internally
calling addch or one of the related functions, and so, if scrollok is off
and (y, x) is the lower right corner of the screen, addch returns an
error, which window.addch then detects and turns into an exception.

I'd like to argue that this is not expected behavior, and does not follow
from the documentation. The documentation for window.addch makes no
mention of the fact that the cursor is advanced at all, so there's no
reason that a user should expect window.addch(height - 1, width - 1, ch)
to fail (and the exception raised, "_curses.error: add_wch() returned
ERR", is not very helpful in understanding what the deal is). Because the
documentation doesn't say anything about the cursor, this is not even an
error in any meaningful way, as the character is successfully written to
the screen, and wrapping the code in a "try: ... except curses.error:
pass" block "fixes" the error.

The same problem affects window.addstr() if the end of the string being
painted ends up in the lower right corner.

This behavior is so unintuitive that it is not even accounted for
elsewhere in the curses bindings: the implementation of
curses.textpad.rectangle contains a line [1]

> win.addch(lry, lrx, curses.ACS_LRCORNER)

which raises an exception when trying to draw a rectangle spanning an
entire window (or in fact any rectangle touching the lower right corner).
I also wrote a little example script [2] demonstrating the problem.

I believe that this is a problem, but I am not sure how it could be
resolved. It seems that there's no way to distinguish this "error" from a
legitimate ncurses error. Perhaps the documentation ought to at least
mention this behavior, and the implementation of curses.textpad.rectangle
should check for the case where the rectangle touches the lower right
corner and use a try-except block there.

Best regards,
Aleksejs Popovs

[1] https://github.com/python/cpython/blob/3.6/Lib/curses/textpad.py#L16
[2] https://gist.github.com/popoffka/e21299967f5739d18c4fa393fa5cf20b

From nad at python.org Mon Jul 17 01:50:33 2017
From: nad at python.org (Ned Deily)
Date: Mon, 17 Jul 2017 01:50:33 -0400
Subject: [Python-Dev] [RELEASE] Python 3.6.2 is now available
Message-ID:

On behalf of the Python development community and the Python 3.6 release
team, I am happy to announce the availability of Python 3.6.2, the second
maintenance release of Python 3.6.
3.6.0 was released on 2016-12-22 to great interest and we are now providing the second set of bugfixes and documentation updates for it; the first maintenance release, 3.6.1, was released on 2017-03-31. Detailed information about the changes made in 3.6.2 can be found in the change log here: https://docs.python.org/3.6/whatsnew/changelog.html#python-3-6-2 Please see "What?s New In Python 3.6" for more information about the new features in Python 3.6: https://docs.python.org/3.6/whatsnew/3.6.html You can download Python 3.6.2 here: https://www.python.org/downloads/release/python-362/ The next maintenance release of Python 3.6 is expected to follow in about 3 months, around the end of 2017-09. More information about the 3.6 release schedule can be found here: https://www.python.org/dev/peps/pep-0494/ Enjoy! P.S. If you need to download the documentation set for 3.6.2 immediately, you can always find the release version here: https://docs.python.org/release/3.6.2/download.html The most current updated versions will appear here: https://docs.python.org/3.6/ -- Ned Deily nad at python.org -- [] From solipsis at pitrou.net Mon Jul 17 08:43:19 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 17 Jul 2017 14:43:19 +0200 Subject: [Python-Dev] Impact of Namedtuple on startup time Message-ID: <20170717144319.7fdbf64b@fsol> Hello, Cost of creating a namedtuple has been identified as a contributor to Python startup time. Not only Python core and the stdlib, but any third-party library creating namedtuple classes (there are many of them). An issue was created for this: https://bugs.python.org/issue28638 Raymond decided to close the issue because: 1) the proposed resolution makes the "_source" attribute empty (or, at least, something else than it currently is). Raymond claims the "_source" attribute is an essential feature of namedtuples. 2) optimizing startup cost is supposedly not worth the effort. To this, I will counter-argument: As for 1), a search for "namedtuple" and "_source" in a code search engine (*) brings *only* false positives of different kinds: * clones of the CPython repo * copies of the namedtuple class instantiation source code with slight tweaks (*not* reading the _source attribute of an existing namedtuple) * modules using namedtuples and also using a "_source" attribute on unrelated objects (*) https://searchcode.com/?q=namedtuple+_source As for 2), startup time is actually a very important consideration nowadays, both for small scripts *and* for interactive use with the now very wide-spread use of Jupyter Notebooks. A 1 ms. cost when importing a single module can translate into a large slowdown when your library imports (directly or indirectly) hundreds of modules, many of which may create their own namedtuple classes. Nick pointed out that one alternative is to make the C-written "struct sequence" class user-visible. My opinion is that, while better than nothing, this would complicate things by exposing two very similar primitives in the stdlib, without there being a clear choice for users. Should I use the well-known namedtuple? Should I use the new-ish "struct sequence", with similar characteristics and better performance, but worse compatibility (now I have to write fallback code for Python versions where the "struct sequence" isn't exposed)? And not to mention all third-party libraries must be migrated to the newly-exposed "struct sequence" + compatibility fallback code... 
So my take is: 1) Usage of "_source" in open source code (as per the search above) seems non-existent. 2) If the primary intent of "_source" is to show-case how to write a tuple subclass, well, why not write a recipe or tutorial somewhere? The Python stdlib is generally not a place where we reify tutorials or educational snippets as public APIs. 3) The well-known namedtuple would really benefit from a performance boost, without asking all maintainers of dependent code (that's a *ton*) to migrate to a new idiom + compatibility fallback. Regards Antoine. From solipsis at pitrou.net Mon Jul 17 08:53:12 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 17 Jul 2017 14:53:12 +0200 Subject: [Python-Dev] Impact of Namedtuple on startup time References: <20170717144319.7fdbf64b@fsol> Message-ID: <20170717145312.455c4a28@fsol> On Mon, 17 Jul 2017 14:43:19 +0200 Antoine Pitrou wrote: > Hello, > > Cost of creating a namedtuple has been identified as a contributor to > Python startup time. Imprecise wording: that's the cost of creating a namedtuple *class*, i.e. anytime someone writes `MyClass = namedtuple('MyClass', ...)`. Regards Antoine. From levkivskyi at gmail.com Mon Jul 17 09:03:26 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Mon, 17 Jul 2017 15:03:26 +0200 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: <20170717145312.455c4a28@fsol> References: <20170717144319.7fdbf64b@fsol> <20170717145312.455c4a28@fsol> Message-ID: Interesting coincidence, just two days ago I have heard that a team at one large company completely abandoned namedtuple because of the creation time problem. Concerning _source, why it is not possible to make it a property so that all the string formatting will happen on request, thus saving some time for users who doesn't need it? (Of course this will not be an actual source, but it can be made practically equivalent to the no-compile version.) -- Ivan On 17 July 2017 at 14:53, Antoine Pitrou wrote: > On Mon, 17 Jul 2017 14:43:19 +0200 > Antoine Pitrou wrote: > > Hello, > > > > Cost of creating a namedtuple has been identified as a contributor to > > Python startup time. > > Imprecise wording: that's the cost of creating a namedtuple *class*, > i.e. anytime someone writes `MyClass = namedtuple('MyClass', ...)`. > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > levkivskyi%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From isaac.morland at gmail.com Mon Jul 17 09:26:21 2017 From: isaac.morland at gmail.com (Isaac Morland) Date: Mon, 17 Jul 2017 09:26:21 -0400 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: <20170717144319.7fdbf64b@fsol> References: <20170717144319.7fdbf64b@fsol> Message-ID: On 17 July 2017 at 08:43, Antoine Pitrou wrote: > > Hello, > > Cost of creating a namedtuple has been identified as a contributor to > Python startup time. Not only Python core and the stdlib, but any > third-party library creating namedtuple classes (there are many of > them). An issue was created for this: > https://bugs.python.org/issue28638 > > Raymond decided to close the issue because: > > 1) the proposed resolution makes the "_source" attribute empty (or, at > least, something else than it currently is). 
Raymond claims the > "_source" attribute is an essential feature of namedtuples. > I think I understand well enough to say something intelligent? While actual references to _source are likely rare (certainly I?ve never used it), my understanding is that the way namedtuple works is to construct _source, and then exec it to create the class. Once that is done, there is no significant saving to be had by throwing away the constructed _source value. When namedtuple was being considered for inclusion, I actually went so far as to write a proof-of-concept version that worked by creating a class, creating attributes on it, etc. I don?t remember how far I got but the exec version is the version included in the stdlib. I come from a non-Pythonic background so use of exec still feels a bit weird to me but I absolutely love namedtuple and use it constantly. I don't know whether a polished and completed version of my idea could be faster than using exec, but I wouldn't expect a major saving ? a whole bunch of code has to run either way. -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Mon Jul 17 09:11:51 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 17 Jul 2017 15:11:51 +0200 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <20170717145312.455c4a28@fsol> Message-ID: <20170717151151.66465bdb@fsol> On Mon, 17 Jul 2017 15:03:26 +0200 Ivan Levkivskyi wrote: > Interesting coincidence, just two days ago I have heard that a team at one > large company completely abandoned namedtuple because of the creation time > problem. > > Concerning _source, why it is not possible to make it a property so that > all the string formatting will happen on request, thus saving some time for > users who doesn't need it? It was proposed in https://bugs.python.org/issue19640 but rejected. Regards Antoine. From antoine at python.org Mon Jul 17 09:31:35 2017 From: antoine at python.org (Antoine Pitrou) Date: Mon, 17 Jul 2017 15:31:35 +0200 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> Message-ID: <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Le 17/07/2017 ? 15:26, Isaac Morland a ?crit : > > I think I understand well enough to say something intelligent? > > While actual references to _source are likely rare (certainly I?ve never > used it), my understanding is that the way namedtuple works is to > construct _source, and then exec it to create the class. Once that is > done, there is no significant saving to be had by throwing away the > constructed _source value. The proposed resolution on https://bugs.python.org/issue28638 is to avoid exec() on most parts of the namedtuple class, hence speeding up the class creation. > I come from > a non-Pythonic background so use of exec still feels a bit weird to me > but I absolutely love namedtuple and use it constantly. I think for most Python programmers, it still feels a bit un-Pythonic. While exec() is part of Python, it's generally only used in fringe cases where nothing else works. Regards Antoine. 
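P.S. To make the idea more concrete, here is a deliberately simplified sketch
of the "only exec() what really needs it" approach. It is *not* the actual
patch on the issue (the name namedtuple_sketch is purely illustrative), and it
skips _make, _replace, pickling support, the zero-field case and most input
validation:

    from operator import itemgetter
    from keyword import iskeyword

    def namedtuple_sketch(typename, field_names):
        fields = tuple(field_names.replace(',', ' ').split())
        for name in (typename,) + fields:
            if not name.isidentifier() or iskeyword(name):
                raise ValueError('invalid name: %r' % name)

        # Only __new__ is generated and exec'ed, because its signature
        # must embed the field names.
        args = ', '.join(fields)
        ns = {}
        exec('def __new__(cls, %s):\n'
             '    return tuple.__new__(cls, (%s,))\n' % (args, args), ns)

        # Everything else is built with ordinary Python, no exec() needed.
        body = {
            '__slots__': (),
            '_fields': fields,
            '__new__': ns['__new__'],
            '__repr__': lambda self: '%s(%s)' % (
                typename,
                ', '.join('%s=%r' % pair for pair in zip(fields, self))),
            '_asdict': lambda self: dict(zip(fields, self)),
        }
        for index, name in enumerate(fields):
            body[name] = property(itemgetter(index),
                                  doc='Alias for field number %d' % index)
        return type(typename, (tuple,), body)

    # Point = namedtuple_sketch('Point', 'x y')
    # Point(1, 2).x   -> 1
    # Point(1, 2)     -> Point(x=1, y=2)

The _source string could then become a lazily computed property on the class
(the idea from issue 19640), built only when someone actually asks for it.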
From facundobatista at gmail.com Mon Jul 17 10:56:56 2017 From: facundobatista at gmail.com (Facundo Batista) Date: Mon, 17 Jul 2017 11:56:56 -0300 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: <20170717144319.7fdbf64b@fsol> References: <20170717144319.7fdbf64b@fsol> Message-ID: On Mon, Jul 17, 2017 at 9:43 AM, Antoine Pitrou wrote: > As for 2), startup time is actually a very important consideration > nowadays, both for small scripts *and* for interactive use with the > now very wide-spread use of Jupyter Notebooks. A 1 ms. cost when > importing a single module can translate into a large slowdown when your > library imports (directly or indirectly) hundreds of modules, many of > which may create their own namedtuple classes. My experience inside Canonical is that golang stole a lot of "codebase share" from Python, and (others and mine) talks hit two walls, mainly: one is memory consumption, and the other is startup time. So yes, startup time is important for user-faced scripts and services. Regards, -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ Twitter: @facundobatista From raymond.hettinger at gmail.com Mon Jul 17 10:59:51 2017 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Mon, 17 Jul 2017 07:59:51 -0700 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Message-ID: > On Jul 17, 2017, at 6:31 AM, Antoine Pitrou wrote: > >> I think I understand well enough to say something intelligent? >> >> While actual references to _source are likely rare (certainly I?ve never >> used it), my understanding is that the way namedtuple works is to >> construct _source, and then exec it to create the class. Once that is >> done, there is no significant saving to be had by throwing away the >> constructed _source value. There are considerable benefits to namedtuple being able to generate and match its own source. * It makes it is really easy for a user to generate the code, drop it into another another module, and customize it. * It makes the named tuple factory function completely self-documenting. * The verbose/_source option teaches you exactly what named tuple does. That makes the tool relatively easy to learn, understand, and debug. I really don't want to throw away these benefits to save a couple of milliseconds. As Nick Coghlan recently posted, "Speed isn't everything, and it certainly isn't adequate justification for breaking public APIs that have been around for years." FWIW, the template/exec implementation has had excellent benefits for maintainability making it very easy to fix and update. As other parts of Python have changed (limitations on number of arguments, what is allowed as an identifier, etc), it mostly automatically stays in sync with the rest of the language. ISTM this issue is being pressed by micro-optimizers who are being very aggressive and not responding to actual user needs (it is more an invented issue than a real one). Named tuple has been around for a long time and users have been somewhat happy with it. If someone truly cares about the exec time for a particular named tuple, the _source option makes it trivially easy to just replace the generator call with the expanded code in that particular circumstance. Raymond P.S. 
I'm fully supportive of Victor's efforts to build-out structseq to make it sufficiently expressive to do more of what collections.namedtuple() does. That is a perfectly reasonable path to optimization. We've wanted that for a long time and no one has had the spare clock cycles to make it come true. From barry at python.org Mon Jul 17 11:13:51 2017 From: barry at python.org (Barry Warsaw) Date: Mon, 17 Jul 2017 11:13:51 -0400 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Message-ID: On Jul 17, 2017, at 10:59, Raymond Hettinger wrote: > > ISTM this issue is being pressed by micro-optimizers who are being very aggressive and not responding to actual user needs (it is more an invented issue than a real one). Named tuple has been around for a long time and users have been somewhat happy with it. Regardless of whether this particular optimization is a good idea or not, start up time *is* a serious challenge in many environments for CPython in particular and the perception of Python?s applicability to many problems. I think we?re better off trying to identify and address such problems than ignoring or minimizing them. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From victor.stinner at gmail.com Mon Jul 17 11:14:07 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 17 Jul 2017 17:14:07 +0200 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> Message-ID: 2017-07-17 16:56 GMT+02:00 Facundo Batista : > My experience inside Canonical is that golang stole a lot of "codebase > share" from Python, and (others and mine) talks hit two walls, mainly: > one is memory consumption, and the other is startup time. > > So yes, startup time is important for user-faced scripts and services. Removing the _source attribute would allow to: (1) Reduce the memory consumption http://bugs.python.org/issue19640#msg213949 (2) Pyhon startup up time https://bugs.python.org/issue28638#msg280277 Victor From christian at python.org Mon Jul 17 11:19:46 2017 From: christian at python.org (Christian Heimes) Date: Mon, 17 Jul 2017 17:19:46 +0200 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: <20170717144319.7fdbf64b@fsol> References: <20170717144319.7fdbf64b@fsol> Message-ID: On 2017-07-17 14:43, Antoine Pitrou wrote: > So my take is: > > 1) Usage of "_source" in open source code (as per the search above) > seems non-existent. > > 2) If the primary intent of "_source" is to show-case how to write a > tuple subclass, well, why not write a recipe or tutorial somewhere? > The Python stdlib is generally not a place where we reify tutorials or > educational snippets as public APIs. > > 3) The well-known namedtuple would really benefit from a performance > boost, without asking all maintainers of dependent code (that's a > *ton*) to migrate to a new idiom + compatibility fallback. I have an additional take on named tuples 4) The current approach uses exec() to generate the namedtuple class on the fly. The exec() function isn't necessarily evil and the use of exec() in namedtuple is safe. However I would appreciate if Python interpreter could be started without requiring the exec() function. 
It would make it easier to harden the interpreter for embedding and system
integration use cases. It's not about sandboxing Python. My goal is to make
it harder to abuse Python. See Steve's lightning talk "Python as a security
vulnerability" at the language summit, https://lwn.net/Articles/723823/ .

Christian

From steve at holdenweb.com  Mon Jul 17 11:22:38 2017
From: steve at holdenweb.com (Steve Holden)
Date: Mon, 17 Jul 2017 16:22:38 +0100
Subject: [Python-Dev] Impact of Namedtuple on startup time
In-Reply-To:
References: <20170717144319.7fdbf64b@fsol>
	<4d161c7b-87cf-7d78-6967-07be1c584591@python.org>
Message-ID:

On Mon, Jul 17, 2017 at 3:59 PM, Raymond Hettinger <
raymond.hettinger at gmail.com> wrote:

> I really don't want to throw away these benefits to save a couple of
> milliseconds.  As Nick Coghlan recently posted, "Speed isn't everything,
> and it certainly isn't adequate justification for breaking public APIs that
> have been around for years."

My only question is "what's a variable called _source doing in the public
API?"

regards
Steve

Steve Holden
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From raymond.hettinger at gmail.com  Mon Jul 17 11:29:18 2017
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 17 Jul 2017 08:29:18 -0700
Subject: [Python-Dev] Impact of Namedtuple on startup time
In-Reply-To:
References: <20170717144319.7fdbf64b@fsol>
	<4d161c7b-87cf-7d78-6967-07be1c584591@python.org>
Message-ID: <34A86EF5-8A12-40E7-881D-1E2C5FFE7B18@gmail.com>

> On Jul 17, 2017, at 8:22 AM, Steve Holden wrote:
>
> My only question is "what's a variable called _source doing in the public API?"

The convention for named tuple has been for all the methods and attributes
to be prefixed with an underscore so that the names won't conflict with
field names in the named tuple itself. For example, we want to allow
Path=namedtuple('Path', ['source', 'destination']).

If I had it all to do over again, it might have been better to have had a
different convention like source_ with a trailing underscore, but that ship
sailed long ago :-)


Raymond

From steve at holdenweb.com  Mon Jul 17 11:34:50 2017
From: steve at holdenweb.com (Steve Holden)
Date: Mon, 17 Jul 2017 16:34:50 +0100
Subject: [Python-Dev] Impact of Namedtuple on startup time
In-Reply-To: <34A86EF5-8A12-40E7-881D-1E2C5FFE7B18@gmail.com>
References: <20170717144319.7fdbf64b@fsol>
	<4d161c7b-87cf-7d78-6967-07be1c584591@python.org>
	<34A86EF5-8A12-40E7-881D-1E2C5FFE7B18@gmail.com>
Message-ID:

Makes sense. Thanks.

S

Steve Holden

On Mon, Jul 17, 2017 at 4:29 PM, Raymond Hettinger <
raymond.hettinger at gmail.com> wrote:

>
> > On Jul 17, 2017, at 8:22 AM, Steve Holden wrote:
> >
> > My only question is "what's a variable called _source doing in the
> public API?"
>
> The convention for named tuple has been for all the methods and
> attributes to be prefixed with an underscore so that the names won't
> conflict with field names in the named tuple itself.  For example, we want
> to allow Path=namedtuple('Path', ['source', 'destination']).
>
> If I had it all to do over again, it might have been better to have had a
> different convention like source_ with a trailing underscore, but that ship
> sailed long ago :-)
>
>
> Raymond
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From guido at python.org Mon Jul 17 11:49:13 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 17 Jul 2017 08:49:13 -0700 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Message-ID: I am firmly with Antoine here. The cumulative startup time of large Python programs is a serious problem and namedtuple is one of the major contributors -- especially because it is so convenient that it is ubiquitous. The approach of generating source code and exec()ing it, is a cool demonstration of Python's expressive power, but it's always been my sense that whenever we encounter a popular idiom that uses exec() and eval(), we should augment the language (or the builtins) to avoid these calls -- that's for example how we ended up with getattr(). One of the reasons to be wary of exec()/eval() other than the usual security concerns is that in some Python implementations they have a high overhead to initialize the parser and compiler. (Even in CPython it's not that fast.) Regarding the argument that it's easier to learn what namedtuple does if the generated source is available, while I don't feel this is important, supposedly it is important to Raymond. But surely there are other approaches possible that work just as well in an educational setting while being more efficient in production use. (E.g. the approach taken by itertools, where the docs show equivalent Python code.) Concluding, I think we should move on from the original implementation and optimize the heck out of namedtuple. The original has served us well. The world is constantly changing. Python should adapt to the (happy) fact that it's being used for systems larger than any of us could imagine 15 years ago. --Guido On Mon, Jul 17, 2017 at 7:59 AM, Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > > > On Jul 17, 2017, at 6:31 AM, Antoine Pitrou wrote: > > > >> I think I understand well enough to say something intelligent? > >> > >> While actual references to _source are likely rare (certainly I?ve never > >> used it), my understanding is that the way namedtuple works is to > >> construct _source, and then exec it to create the class. Once that is > >> done, there is no significant saving to be had by throwing away the > >> constructed _source value. > > There are considerable benefits to namedtuple being able to generate and > match its own source. > > * It makes it is really easy for a user to generate the code, drop it into > another another module, and customize it. > > * It makes the named tuple factory function completely self-documenting. > > * The verbose/_source option teaches you exactly what named tuple does. > That makes the tool relatively easy to learn, understand, and debug. > > I really don't want to throw away these benefits to save a couple of > milliseconds. As Nick Coghlan recently posted, "Speed isn't everything, > and it certainly isn't adequate justification for breaking public APIs that > have been around for years." > > FWIW, the template/exec implementation has had excellent benefits for > maintainability making it very easy to fix and update. As other parts of > Python have changed (limitations on number of arguments, what is allowed as > an identifier, etc), it mostly automatically stays in sync with the rest of > the language. 
> > ISTM this issue is being pressed by micro-optimizers who are being very > aggressive and not responding to actual user needs (it is more an invented > issue than a real one). Named tuple has been around for a long time and > users have been somewhat happy with it. > > If someone truly cares about the exec time for a particular named tuple, > the _source option makes it trivially easy to just replace the generator > call with the expanded code in that particular circumstance. > > > Raymond > > > P.S. I'm fully supportive of Victor's efforts to build-out structseq to > make it sufficiently expressive to do more of what collections.namedtuple() > does. That is a perfectly reasonable path to optimization. We've wanted > that for a long time and no one has had the spare clock cycles to make it > come true. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Mon Jul 17 12:13:39 2017 From: greg at krypto.org (Gregory P. Smith) Date: Mon, 17 Jul 2017 16:13:39 +0000 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Message-ID: On Mon, Jul 17, 2017 at 8:00 AM Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > > > On Jul 17, 2017, at 6:31 AM, Antoine Pitrou wrote: > > > >> I think I understand well enough to say something intelligent? > >> > >> While actual references to _source are likely rare (certainly I?ve never > >> used it), my understanding is that the way namedtuple works is to > >> construct _source, and then exec it to create the class. Once that is > >> done, there is no significant saving to be had by throwing away the > >> constructed _source value. > > There are considerable benefits to namedtuple being able to generate and > match its own source. > > * It makes it is really easy for a user to generate the code, drop it into > another another module, and customize it. > > * It makes the named tuple factory function completely self-documenting. > > * The verbose/_source option teaches you exactly what named tuple does. > That makes the tool relatively easy to learn, understand, and debug. > > I really don't want to throw away these benefits to save a couple of > milliseconds. As Nick Coghlan recently posted, "Speed isn't everything, > and it certainly isn't adequate justification for breaking public APIs that > have been around for years." > > FWIW, the template/exec implementation has had excellent benefits for > maintainability making it very easy to fix and update. As other parts of > Python have changed (limitations on number of arguments, what is allowed as > an identifier, etc), it mostly automatically stays in sync with the rest of > the language. > > ISTM this issue is being pressed by micro-optimizers who are being very > aggressive and not responding to actual user needs (it is more an invented > issue than a real one). Named tuple has been around for a long time and > users have been somewhat happy with it. > Raymond, you keep repeating statements similar to "only a millisecond" and "aggressive micro-optimizers who don't care about user needs" in your comments on issues like this. That simply isn't true. 
These issues come up in the first place *because of* users who need fast startup. Please don't be so dismissive. The reason people care about this has been stated many times. It isn't just "a millisecond", it's 100s or 1000s of milliseconds in any application of reasonable size where namedtuples were adopted as a design pattern in various libraries. Real world use cases for startup time mattering exist: interactive command line tools are the most obvious one people keep citing. I'll toss another where Python startup time has raised eyebrows at work: unittest startup and completion time. When the bulk of a processes time is spent in startup before hitting unittest.main(), people take notice and consider it a problem. Developer productivity is reduced. The hacks individual developers come up with to try and workaround things like this are not pretty. If someone truly cares about the exec time for a particular named tuple, > the _source option makes it trivially easy to just replace the generator > call with the expanded code in that particular circumstance. > In real world applications you do not control the bulk of the code that has chosen to use namedtuple. They're scattered through 100-1000s of other transitive dependency libraries (not just the standard library), the modification of each of which faces hurdles both technical and non-technical in nature. To me the desired resolution to this is clear: Optimize the default use case of namedtuple and everybody wins. This isn't just about the stdlib's namedtuple uses being fast, those a small portion of all uses in any application where startup time matters. This is about making Python better for the world. ie: What Antoine's original write-up suggested in his #3. I get that namedtuple ._source is a public API. We may need to keep it. If so, that just means revisiting lazily generating it as a property - issue19640. -gps PS - Good call on the naming hindsight! A trailing underscore would've been nice. Oh well, too late for that. > > Raymond > > > P.S. I'm fully supportive of Victor's efforts to build-out structseq to > make it sufficiently expressive to do more of what collections.namedtuple() > does. That is a perfectly reasonable path to optimization. We've wanted > that for a long time and no one has had the spare clock cycles to make it > come true. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/greg%40krypto.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Mon Jul 17 12:25:23 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 17 Jul 2017 18:25:23 +0200 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Message-ID: 2017-07-17 18:13 GMT+02:00 Gregory P. Smith : > I get that namedtuple ._source is a public API. We may need to keep it. If > so, that just means revisiting lazily generating it as a property - > issue19640. I agree. Technically speaking, optimizing namedtuple doesn't have to mean "remove the _source attribute". I wouldn't discuss here if _source should be kept or not, but even if we rewrite the namedtuple implementation, I agree that we *can* technically keep a _source property which would create the same Python code. 
It would allow it to speedup namedtuple, reduce the memory footprint, and have a smooth deprecation policy (*if* we decide to deprecate this attribute). Victor From steve at pearwood.info Mon Jul 17 12:45:20 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 18 Jul 2017 02:45:20 +1000 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: <20170717144319.7fdbf64b@fsol> References: <20170717144319.7fdbf64b@fsol> Message-ID: <20170717164519.GY3149@ando.pearwood.info> On Mon, Jul 17, 2017 at 02:43:19PM +0200, Antoine Pitrou wrote: > > Hello, > > Cost of creating a namedtuple has been identified as a contributor to > Python startup time. Not only Python core and the stdlib, but any > third-party library creating namedtuple classes (there are many of > them). An issue was created for this: > https://bugs.python.org/issue28638 Some time ago, I needed to backport a version of namedtuple to Python 2.4, so I started with Raymond's recipe on Activestate and modified it to only exec the code needed for __new__. The rest of the class is an ordinary inner class: # a short sketch def namedtuple(...): class Inner(tuple): ... exec(source, ns) Inner.__new__ = ns['__new__'] return Inner Here's my fork of Raymond's recipe: https://code.activestate.com/recipes/578918-yet-another-namedtuple/ Out of curiosity, I took that recipe, updated it to work in Python 3, and compared it to the std lib version. Here are some representative timings: [steve at ando ~]$ python3.5 -m timeit -s "from collections import namedtuple" "K = namedtuple('K', 'a b c')" 1000 loops, best of 3: 1.02 msec per loop [steve at ando ~]$ python3.5 -m timeit -s "from nt3 import namedtuple" "K = namedtuple('K', 'a b c')" 1000 loops, best of 3: 255 usec per loop I think that proves that this approach is viable and can lead to a big speed up. I don't think that merely dropping the _source attribute will save much time. It might save a bit of memory, but in my experiements dropping it only saves about 10?s more. I think the real bottleneck is the cost of exec'ing the entire class. -- Steve From jelle.zijlstra at gmail.com Mon Jul 17 13:04:39 2017 From: jelle.zijlstra at gmail.com (Jelle Zijlstra) Date: Mon, 17 Jul 2017 10:04:39 -0700 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: <20170717164519.GY3149@ando.pearwood.info> References: <20170717144319.7fdbf64b@fsol> <20170717164519.GY3149@ando.pearwood.info> Message-ID: 2017-07-17 9:45 GMT-07:00 Steven D'Aprano : > On Mon, Jul 17, 2017 at 02:43:19PM +0200, Antoine Pitrou wrote: > > > > Hello, > > > > Cost of creating a namedtuple has been identified as a contributor to > > Python startup time. Not only Python core and the stdlib, but any > > third-party library creating namedtuple classes (there are many of > > them). An issue was created for this: > > https://bugs.python.org/issue28638 > > Some time ago, I needed to backport a version of namedtuple to Python > 2.4, so I started with Raymond's recipe on Activestate and modified it > to only exec the code needed for __new__. The rest of the class is an > ordinary inner class: > > # a short sketch > def namedtuple(...): > class Inner(tuple): > ... > exec(source, ns) > Inner.__new__ = ns['__new__'] > return Inner > > > Here's my fork of Raymond's recipe: > > https://code.activestate.com/recipes/578918-yet-another-namedtuple/ > > > Out of curiosity, I took that recipe, updated it to work in Python 3, > and compared it to the std lib version. 
Here are some representative > timings: > > [steve at ando ~]$ python3.5 -m timeit -s "from collections import > namedtuple" "K = namedtuple('K', 'a b c')" > 1000 loops, best of 3: 1.02 msec per loop > > [steve at ando ~]$ python3.5 -m timeit -s "from nt3 import namedtuple" "K = > namedtuple('K', 'a b c')" > 1000 loops, best of 3: 255 usec per loop > > > I think that proves that this approach is viable and can lead to a big > speed up. > > I have an open pull request implementing this approach: https://github.com/python/cpython/pull/2736. We can discuss the exact form the code should take there (Ivan already added some good suggestions). > I don't think that merely dropping the _source attribute will save much > time. It might save a bit of memory, but in my experiements dropping it > only saves about 10?s more. I think the real bottleneck is the cost of > exec'ing the entire class. > > > > -- > Steve > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > jelle.zijlstra%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.hettinger at gmail.com Mon Jul 17 15:42:45 2017 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Mon, 17 Jul 2017 12:42:45 -0700 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Message-ID: <9E9AD48B-155B-4357-98D8-65C088E1F02F@gmail.com> > On Jul 17, 2017, at 8:49 AM, Guido van Rossum wrote: > > The approach of generating source code and exec()ing it, is a cool demonstration of Python's expressive power, but it's always been my sense that whenever we encounter a popular idiom that uses exec() and eval(), we should augment the language (or the builtins) to avoid these calls -- that's for example how we ended up with getattr(). FYI, the proposal (from Jelle) isn't to remove exec. It is to only exec a smaller piece of code and make the rest of it static. It isn't bad idea, it just complicates the implementation (generating _source lazily) and the subsequence maintenance (which is currently really easy). > Concluding, I think we should move on from the original implementation and optimize the heck out of namedtuple. The original has served us well. The world is constantly changing. Python should adapt to the (happy) fact that it's being used for systems larger than any of us could imagine 15 years ago. Okay, then Nick and I are overruled. I'll move Jelle's patch forward. We'll also need to lazily generate _source but I don't think that will be hard. One minor grumble: I think we need to give careful cost/benefit considerations to optimizations that complicate the implementation. Over the last several years, the source for Python has grown increasingly complicated. Fewer people understand it now. It is much harder to newcomers to on-ramp. The old-timers (myself included) find that their knowledge is out of date. And complexity leads to bugs (the C optimization of random number seeding caused a major bug in the 3.6.0 release; the C optimization of the lru_cache resulted in multiple releases having a hard to find threading bugs, etc.). 
It is becoming increasingly difficult to look at code and tell whether it
is correct (I still don't fully understand the implications of the
recursive constant folding in the peephole optimizer for example). In the
case of this named tuple proposal, the complexity is manageable, but the
overall trend isn't good and I get the feeling the aggressive optimization
is causing us to forget key parts of the zen-of-python.

Cheers,

Raymond

P.S. Ironically, a lot of my consulting work comes from people who have
created something complex out of something that could have been simple.
So, in a strange way, I should be happy about these trends -- just
saying ;-)

From raymond.hettinger at gmail.com  Mon Jul 17 16:27:47 2017
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 17 Jul 2017 13:27:47 -0700
Subject: [Python-Dev] Impact of Namedtuple on startup time
In-Reply-To:
References: <20170717144319.7fdbf64b@fsol>
	<4d161c7b-87cf-7d78-6967-07be1c584591@python.org>
Message-ID: <0DC5C68F-953D-493A-BD18-6D77D2C047CA@gmail.com>

> On Jul 17, 2017, at 8:49 AM, Guido van Rossum wrote:
>
> One of the reasons to be wary of exec()/eval() other than the usual
> security concerns is that in some Python implementations they have a high
> overhead to initialize the parser and compiler. (Even in CPython it's not
> that fast.)

BTW, if getting rid of the template/exec pair is a goal, Joe Jevnik
proposed a patch a couple of years ago that completely reimplemented
namedtuple() in C. The patch was somewhat complex and hard to verify for
semantic equivalence, but we could resurrect it and clean it up. That way,
we could leave the existing namedtuple() code in place and do a subsequent
import from the C version.

This path won't be fun (whenever we have both a C version and Python
version, we get years of trying to sync-up tiny differences); however, it
will give you the fastest startup times, the fastest lookups at runtime,
and eliminate use of exec.

> On Jul 17, 2017, at 8:13 AM, Barry Warsaw wrote:
> Regardless of whether this particular optimization is a good idea or not,
> start up time *is* a serious challenge in many environments for CPython in
> particular and the perception of Python's applicability to many problems.
> I think we're better off trying to identify and address such problems than
> ignoring or minimizing them.

I agree with that sentiment but think we ought to look at places where the
payoffs would actually matter, such as minimizing the number of disk
accesses (Python performs a lot of I/O on startup). Whenever I've addressed
start-up time for my clients, named tuples were never the issue. Also, it
would have been trivially easy to replace the factory function call with
the generated code, but that never proved necessary or beneficial.

IMO, we're about to turn the named tuple code into a mess but will find
that most users, most of the time will get nearly zero benefit.

Raymond

From g.rodola at gmail.com  Mon Jul 17 16:31:21 2017
From: g.rodola at gmail.com (Giampaolo Rodola')
Date: Mon, 17 Jul 2017 22:31:21 +0200
Subject: [Python-Dev] Impact of Namedtuple on startup time
In-Reply-To:
References: <20170717144319.7fdbf64b@fsol>
	<4d161c7b-87cf-7d78-6967-07be1c584591@python.org>
Message-ID:

I completely agree. I love namedtuples but I've never been too happy about
the additional overhead vs. plain tuples (both for creation and attribute
access times), to the point that I explicitly avoid to use them in certain
circumstances (e.g. a busy loop) and only for public end-user APIs
returning multiple values.
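Just to be concrete about the kind of overhead I mean, a quick
micro-benchmark sketch (only the shape of the comparison matters here, the
exact figures vary from machine to machine):

    import timeit
    from collections import namedtuple

    Point = namedtuple('Point', 'x y')
    g = globals()

    # creation: plain tuple vs. namedtuple
    print(timeit.timeit('(a, b)', setup='a = 1; b = 2', number=10**6))
    print(timeit.timeit('Point(a, b)', setup='a = 1; b = 2',
                        globals=g, number=10**6))

    # element access: indexing vs. named attribute
    print(timeit.timeit('t[0]', setup='t = (1, 2)', number=10**6))
    print(timeit.timeit('p.x', setup='p = Point(1, 2)',
                        globals=g, number=10**6))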
To be entirely honest, I'm not even sure why they need to be forcefully declared upfront in the first place, instead of just having a first-class function (builtin?) written in C: >>> ntuple(x=1, y=0) (x=1, y=0) ...or even a literal as in: >>> (x=1, y=0) (x=1, y=0) Most of the times this is what I really want: quickly returning an anonymous tuple with named attributes and nothing else, similarly to os.times() & others. I believe that if something like this would exist we would witness a big transition from tuple() to ntuple() for all those functions returning more than 1 value. We witnessed a similar transition in many parts of the stdlib when collections.namedtuple was first introduced, but not everywhere, probably because declaring a namedtuple is more work, it's more expensive, and it still feels like you're dealing with some kind of too high-level second-class citizen with too much overhead and too many sugar in terms of API (e.g. "verbose", "rename", "module" and "_source"). If something like this were to happen I expect collections.namedtuple to be used only by those who want to subclass it in order to attach methods, whereas the rest would stick and use ntuple() pretty much everywhere (both in "private" and "public" functions). On Mon, Jul 17, 2017 at 5:49 PM, Guido van Rossum wrote: > I am firmly with Antoine here. The cumulative startup time of large Python > programs is a serious problem and namedtuple is one of the major > contributors -- especially because it is so convenient that it is > ubiquitous. The approach of generating source code and exec()ing it, is a > cool demonstration of Python's expressive power, but it's always been my > sense that whenever we encounter a popular idiom that uses exec() and > eval(), we should augment the language (or the builtins) to avoid these > calls -- that's for example how we ended up with getattr(). > > One of the reasons to be wary of exec()/eval() other than the usual > security concerns is that in some Python implementations they have a high > overhead to initialize the parser and compiler. (Even in CPython it's not > that fast.) > > Regarding the argument that it's easier to learn what namedtuple does if > the generated source is available, while I don't feel this is important, > supposedly it is important to Raymond. But surely there are other > approaches possible that work just as well in an educational setting while > being more efficient in production use. (E.g. the approach taken by > itertools, where the docs show equivalent Python code.) > > Concluding, I think we should move on from the original implementation and > optimize the heck out of namedtuple. The original has served us well. The > world is constantly changing. Python should adapt to the (happy) fact that > it's being used for systems larger than any of us could imagine 15 years > ago. > > --Guido > > On Mon, Jul 17, 2017 at 7:59 AM, Raymond Hettinger < > raymond.hettinger at gmail.com> wrote: > >> >> > On Jul 17, 2017, at 6:31 AM, Antoine Pitrou wrote: >> > >> >> I think I understand well enough to say something intelligent? >> >> >> >> While actual references to _source are likely rare (certainly I?ve >> never >> >> used it), my understanding is that the way namedtuple works is to >> >> construct _source, and then exec it to create the class. Once that is >> >> done, there is no significant saving to be had by throwing away the >> >> constructed _source value. >> >> There are considerable benefits to namedtuple being able to generate and >> match its own source. 
>> >> * It makes it is really easy for a user to generate the code, drop it >> into another another module, and customize it. >> >> * It makes the named tuple factory function completely self-documenting. >> >> * The verbose/_source option teaches you exactly what named tuple does. >> That makes the tool relatively easy to learn, understand, and debug. >> >> I really don't want to throw away these benefits to save a couple of >> milliseconds. As Nick Coghlan recently posted, "Speed isn't everything, >> and it certainly isn't adequate justification for breaking public APIs that >> have been around for years." >> >> FWIW, the template/exec implementation has had excellent benefits for >> maintainability making it very easy to fix and update. As other parts of >> Python have changed (limitations on number of arguments, what is allowed as >> an identifier, etc), it mostly automatically stays in sync with the rest of >> the language. >> >> ISTM this issue is being pressed by micro-optimizers who are being very >> aggressive and not responding to actual user needs (it is more an invented >> issue than a real one). Named tuple has been around for a long time and >> users have been somewhat happy with it. >> >> If someone truly cares about the exec time for a particular named tuple, >> the _source option makes it trivially easy to just replace the generator >> call with the expanded code in that particular circumstance. >> >> >> Raymond >> >> >> P.S. I'm fully supportive of Victor's efforts to build-out structseq to >> make it sufficiently expressive to do more of what collections.namedtuple() >> does. That is a perfectly reasonable path to optimization. We've wanted >> that for a long time and no one has had the spare clock cycles to make it >> come true. >> >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido% >> 40python.org >> > > > > -- > --Guido van Rossum (python.org/~guido) > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/g. > rodola%40gmail.com > > -- Giampaolo - http://grodola.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Mon Jul 17 16:46:48 2017 From: python at mrabarnett.plus.com (MRAB) Date: Mon, 17 Jul 2017 21:46:48 +0100 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Message-ID: <08dd8581-ac2d-059c-0b8e-e7bfb5576569@mrabarnett.plus.com> On 2017-07-17 21:31, Giampaolo Rodola' wrote: > I completely agree. I love namedtuples but I've never been too happy > about the additional overhead vs. plain tuples (both for creation and > attribute access times), to the point that I explicitly avoid to use > them in certain circumstances (e.g. a busy loop) and only for public > end-user APIs returning multiple values. > > To be entirely honest, I'm not even sure why they need to be forcefully > declared upfront in the first place, instead of just having a > first-class function (builtin?) 
written in C: > > >>> ntuple(x=1, y=0) > (x=1, y=0) > > ...or even a literal as in: > > >>> (x=1, y=0) > (x=1, y=0) > [snip] I know it's a bit early to bikeshed, but shouldn't that be: >>> (x: 1, y: 0) (x: 1, y: 0) instead if it's a display/literal? From isaac.morland at gmail.com Mon Jul 17 16:48:02 2017 From: isaac.morland at gmail.com (Isaac Morland) Date: Mon, 17 Jul 2017 16:48:02 -0400 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Message-ID: My apologies, I misunderstood what had been proposed (and rejected). So it sounds like the _source is a pre-requisite for the current exec-based implementation, but the proposal is to replace with a non-exec-based implementation, meaning _source would no longer be needed for the module to work and might be eliminated. But _source could continue to be generated lazily (and cached if thought helpful) using an @property, so even the (apparently rare) uses of _source would continue to work. This would in some sense be a DRY violation, but of a very pragmatic Pythonic sort, where we have two implementations, one for documentation and one for efficiency. How different would this be from all those modules that have both Python and C implementations? On 17 July 2017 at 09:31, Antoine Pitrou wrote: > > Le 17/07/2017 ? 15:26, Isaac Morland a ?crit : > > > > I think I understand well enough to say something intelligent? > > > > While actual references to _source are likely rare (certainly I?ve never > > used it), my understanding is that the way namedtuple works is to > > construct _source, and then exec it to create the class. Once that is > > done, there is no significant saving to be had by throwing away the > > constructed _source value. > > The proposed resolution on https://bugs.python.org/issue28638 is to > avoid exec() on most parts of the namedtuple class, hence speeding up > the class creation. > > > I come from > > a non-Pythonic background so use of exec still feels a bit weird to me > > but I absolutely love namedtuple and use it constantly. > > I think for most Python programmers, it still feels a bit un-Pythonic. > While exec() is part of Python, it's generally only used in fringe cases > where nothing else works. > > Regards > > Antoine. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > isaac.morland%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Mon Jul 17 16:57:24 2017 From: python at mrabarnett.plus.com (MRAB) Date: Mon, 17 Jul 2017 21:57:24 +0100 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: <08dd8581-ac2d-059c-0b8e-e7bfb5576569@mrabarnett.plus.com> References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> <08dd8581-ac2d-059c-0b8e-e7bfb5576569@mrabarnett.plus.com> Message-ID: <56649d1a-90ea-2807-2586-59605dba2633@mrabarnett.plus.com> On 2017-07-17 21:46, MRAB wrote: > On 2017-07-17 21:31, Giampaolo Rodola' wrote: >> I completely agree. I love namedtuples but I've never been too happy >> about the additional overhead vs. 
plain tuples (both for creation and >> attribute access times), to the point that I explicitly avoid to use >> them in certain circumstances (e.g. a busy loop) and only for public >> end-user APIs returning multiple values. >> >> To be entirely honest, I'm not even sure why they need to be forcefully >> declared upfront in the first place, instead of just having a >> first-class function (builtin?) written in C: >> >> >>> ntuple(x=1, y=0) >> (x=1, y=0) >> >> ...or even a literal as in: >> >> >>> (x=1, y=0) >> (x=1, y=0) >> > [snip] > > I know it's a bit early to bikeshed, but shouldn't that be: > > >>> (x: 1, y: 0) > (x: 1, y: 0) > > instead if it's a display/literal? > Actually, come to think of it, a dict's keys would be quoted, so there would be a slight inconsistency there... From encukou at gmail.com Mon Jul 17 17:07:11 2017 From: encukou at gmail.com (Petr Viktorin) Date: Mon, 17 Jul 2017 23:07:11 +0200 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Message-ID: <45a5391d-2760-d613-156b-6e5884d8123d@gmail.com> On 07/17/2017 10:31 PM, Giampaolo Rodola' wrote: > I completely agree. I love namedtuples but I've never been too happy > about the additional overhead vs. plain tuples (both for creation and > attribute access times), to the point that I explicitly avoid to use > them in certain circumstances (e.g. a busy loop) and only for public > end-user APIs returning multiple values. > > To be entirely honest, I'm not even sure why they need to be forcefully > declared upfront in the first place, instead of just having a > first-class function (builtin?) written in C: > > >>> ntuple(x=1, y=0) > (x=1, y=0) > > ...or even a literal as in: > > >>> (x=1, y=0) > (x=1, y=0) > > Most of the times this is what I really want: quickly returning an > anonymous tuple with named attributes and nothing else, similarly to > os.times() & others. [...] It seems that you want `types.SimpleNamespace(x=1, y=0)`. From g.rodola at gmail.com Mon Jul 17 17:09:16 2017 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Mon, 17 Jul 2017 23:09:16 +0200 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: <45a5391d-2760-d613-156b-6e5884d8123d@gmail.com> References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> <45a5391d-2760-d613-156b-6e5884d8123d@gmail.com> Message-ID: On Mon, Jul 17, 2017 at 11:07 PM, Petr Viktorin wrote: > On 07/17/2017 10:31 PM, Giampaolo Rodola' wrote: > >> I completely agree. I love namedtuples but I've never been too happy >> about the additional overhead vs. plain tuples (both for creation and >> attribute access times), to the point that I explicitly avoid to use them >> in certain circumstances (e.g. a busy loop) and only for public end-user >> APIs returning multiple values. >> >> To be entirely honest, I'm not even sure why they need to be forcefully >> declared upfront in the first place, instead of just having a first-class >> function (builtin?) written in C: >> >> >>> ntuple(x=1, y=0) >> (x=1, y=0) >> >> ...or even a literal as in: >> >> >>> (x=1, y=0) >> (x=1, y=0) >> >> Most of the times this is what I really want: quickly returning an >> anonymous tuple with named attributes and nothing else, similarly to >> os.times() & others. [...] >> > > It seems that you want `types.SimpleNamespace(x=1, y=0)`. > That doesn't support indexing (obj[0]). 
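For example (a quick interpreter session; the hypothetical ntuple() would
need to behave like the namedtuple below, not like SimpleNamespace):

>>> from types import SimpleNamespace
>>> from collections import namedtuple
>>> ns = SimpleNamespace(x=1, y=0)
>>> ns.x
1
>>> ns[0]
Traceback (most recent call last):
  ...
TypeError: 'types.SimpleNamespace' object is not subscriptable
>>> Point = namedtuple('Point', 'x y')
>>> p = Point(x=1, y=0)
>>> p.x, p[0], tuple(p)
(1, 1, (1, 0))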
-- Giampaolo - http://grodola.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Mon Jul 17 17:24:17 2017 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 17 Jul 2017 16:24:17 -0500 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Message-ID: [Giampaolo Rodola' ] > .... > To be entirely honest, I'm not even sure why they need to be forcefully > declared upfront in the first place, instead of just having a first-class > function (builtin?) written in C: > > >>> ntuple(x=1, y=0) > (x=1, y=0) > > ...or even a literal as in: > > >>> (x=1, y=0) > (x=1, y=0) How do you propose that the resulting object T know that T.x is 1. T.y is 0, and T.z doesn't make sense? Declaring a namedtuple up front allows the _class_ to know that all of its instances map attribute "x" to index 0 and attribute "y" to index 1. The instances know nothing about that on their own, and consume no more memory than a plain tuple. If your `ntuple()` returns an object implementing its own mapping, it loses a primary advantage (0 memory overhead) of namedtuples. From barry at python.org Mon Jul 17 17:26:24 2017 From: barry at python.org (Barry Warsaw) Date: Mon, 17 Jul 2017 17:26:24 -0400 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Message-ID: namedtuple is great and clever, but it?s also a bit clunky. It has a weird signature and requires a made up type name. It?s also rather unPythonic if you want to support default arguments when creating namedtuple instances. Maybe as you say, a lot of the typical use cases for namedtuples could be addressed by a better builtin, but I fear we?ll end up down the bikeshedding hole for that. -Barry > On Jul 17, 2017, at 16:31, Giampaolo Rodola' wrote: > > I completely agree. I love namedtuples but I've never been too happy about the additional overhead vs. plain tuples (both for creation and attribute access times), to the point that I explicitly avoid to use them in certain circumstances (e.g. a busy loop) and only for public end-user APIs returning multiple values. > > To be entirely honest, I'm not even sure why they need to be forcefully declared upfront in the first place, instead of just having a first-class function (builtin?) written in C: > > >>> ntuple(x=1, y=0) > (x=1, y=0) > > ...or even a literal as in: > > >>> (x=1, y=0) > (x=1, y=0) > > Most of the times this is what I really want: quickly returning an anonymous tuple with named attributes and nothing else, similarly to os.times() & others. I believe that if something like this would exist we would witness a big transition from tuple() to ntuple() for all those functions returning more than 1 value. We witnessed a similar transition in many parts of the stdlib when collections.namedtuple was first introduced, but not everywhere, probably because declaring a namedtuple is more work, it's more expensive, and it still feels like you're dealing with some kind of too high-level second-class citizen with too much overhead and too many sugar in terms of API (e.g. "verbose", "rename", "module" and "_source"). -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From brett at python.org Mon Jul 17 17:31:20 2017 From: brett at python.org (Brett Cannon) Date: Mon, 17 Jul 2017 21:31:20 +0000 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: <0DC5C68F-953D-493A-BD18-6D77D2C047CA@gmail.com> References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> <0DC5C68F-953D-493A-BD18-6D77D2C047CA@gmail.com> Message-ID: On Mon, 17 Jul 2017 at 13:28 Raymond Hettinger wrote: > > > On Jul 17, 2017, at 8:49 AM, Guido van Rossum wrote: > > > > One of the reasons to be wary of exec()/eval() other than the usual > security concerns is that in some Python implementations they have a high > overhead to initialize the parser and compiler. (Even in CPython it's not > that fast.) > > BTW, if getting rid of the template/exec pair is a goal, Joe Jevnik > proposed a patch a couple of years ago the completely reimplemented > namedtuple() in C. The patch was somewhat complex and hard to semantic > equivalence, but we could resurrect it and clean it up. That way, we > could like the existing namedtuple() code in-place and do a subsequent > import from the C-version. > > This path won't be fun (whenever we have both a C version and Python > version, we get years of trying to sync-up tiny differences); however, it > will give you take fastest startup times, the fastest lookups at runtime, > and eliminate use of exec. > I vaguely remember some years ago someone proposing a patch that used metaclasses to avoid using exec() (I think it was to benefit PyPy or one of the JIT-backed interpreters). Would that work to remove the need for exec() while keeping the code in pure Python? As for removing exec() as a goal, I'll back up Christian's point and the one Steve made at the language summit that removing the use of exec() from the critical path in Python is a laudable goal from a security perspective. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsbueno at python.org.br Mon Jul 17 17:38:38 2017 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Mon, 17 Jul 2017 18:38:38 -0300 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Message-ID: Just for sake of completeness since people are talking about a namedtuple overhaul, I have a couple implementations here - https://github.com/jsbueno/extradict/blob/master/extradict/extratuple.py If any idea there can help inspiring someone, I will be happy. js -><- On 17 July 2017 at 18:26, Barry Warsaw wrote: > namedtuple is great and clever, but it?s also a bit clunky. It has a > weird signature and requires a made up type name. It?s also rather > unPythonic if you want to support default arguments when creating > namedtuple instances. Maybe as you say, a lot of the typical use cases for > namedtuples could be addressed by a better builtin, but I fear we?ll end up > down the bikeshedding hole for that. > > -Barry > > > On Jul 17, 2017, at 16:31, Giampaolo Rodola' wrote: > > > > I completely agree. I love namedtuples but I've never been too happy > about the additional overhead vs. plain tuples (both for creation and > attribute access times), to the point that I explicitly avoid to use them > in certain circumstances (e.g. a busy loop) and only for public end-user > APIs returning multiple values. 
> > > > To be entirely honest, I'm not even sure why they need to be forcefully > declared upfront in the first place, instead of just having a first-class > function (builtin?) written in C: > > > > >>> ntuple(x=1, y=0) > > (x=1, y=0) > > > > ...or even a literal as in: > > > > >>> (x=1, y=0) > > (x=1, y=0) > > > > Most of the times this is what I really want: quickly returning an > anonymous tuple with named attributes and nothing else, similarly to > os.times() & others. I believe that if something like this would exist we > would witness a big transition from tuple() to ntuple() for all those > functions returning more than 1 value. We witnessed a similar transition in > many parts of the stdlib when collections.namedtuple was first introduced, > but not everywhere, probably because declaring a namedtuple is more work, > it's more expensive, and it still feels like you're dealing with some kind > of too high-level second-class citizen with too much overhead and too many > sugar in terms of API (e.g. "verbose", "rename", "module" and "_source"). > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > jsbueno%40python.org.br > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Mon Jul 17 18:27:21 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 18 Jul 2017 10:27:21 +1200 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Message-ID: <596D39C9.4030805@canterbury.ac.nz> Barry Warsaw wrote: > namedtuple is great and clever, but it?s also a bit clunky. It has a weird > signature and requires a made up type name. Maybe a metaclass could be used to make something like this possible: class Foo(NamedTuple, fields = 'x,y,z'): ... Then the name is explicit and you get to add methods etc. if you want. -- Greg From ethan at stoneleaf.us Mon Jul 17 19:10:16 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 17 Jul 2017 16:10:16 -0700 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Message-ID: <596D43D8.7010909@stoneleaf.us> On 07/17/2017 02:26 PM, Barry Warsaw wrote: > namedtuple is great and clever, but it?s also a bit clunky. It has a weird > signature and requires a made up type name. It?s also rather unPythonic if > you want to support default arguments when creating namedtuple instances. > Maybe as you say, a lot of the typical use cases for namedtuples could be > addressed by a better builtin, but I fear we?ll end up down the bikeshedding > hole for that. My aenum library [1] has a metaclass-based NamedTuple that allows for default arguments as well as other goodies (which would probably not make it to the stdlib since they are mostly fluff). 
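For anyone curious what the general shape of that approach looks like without digging through aenum, here is a deliberately minimal sketch -- this is *not* aenum's actual code, just an illustration that layers a metaclass over collections.namedtuple and uses the old __new__.__defaults__ trick for the default values Barry was asking about:

    from collections import namedtuple

    class NamedTupleMeta(type):
        def __new__(mcls, name, bases, ns, fields=None, defaults=()):
            if fields is None:
                # Creating the plain 'NamedTuple' base class itself.
                return super().__new__(mcls, name, bases, ns)
            nt = namedtuple(name, fields)
            # Classic recipe: right-aligned default values for __new__.
            nt.__new__.__defaults__ = tuple(defaults)
            # Put the generated namedtuple first in the bases so methods
            # defined in the class body are kept.
            return super().__new__(mcls, name, (nt,) + bases, ns)

        def __init__(cls, name, bases, ns, fields=None, defaults=()):
            super().__init__(name, bases, ns)

    class NamedTuple(metaclass=NamedTupleMeta):
        pass

    class Point(NamedTuple, fields='x y z', defaults=(0,)):
        def norm(self):
            return (self.x ** 2 + self.y ** 2 + self.z ** 2) ** 0.5

    print(Point(3, 4))         # Point(x=3, y=4, z=0)
    print(Point(3, 4).norm())  # 5.0

The real thing obviously needs more error checking, and since all the heavy lifting still happens inside collections.namedtuple it does nothing for the startup cost -- it only tidies up the declaration syntax.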
-- ~Ethan~ [1] https://pypi.python.org/pypi/aenum From ethan at stoneleaf.us Mon Jul 17 19:05:38 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 17 Jul 2017 16:05:38 -0700 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> <0DC5C68F-953D-493A-BD18-6D77D2C047CA@gmail.com> Message-ID: <596D42C2.7050901@stoneleaf.us> On 07/17/2017 02:31 PM, Brett Cannon wrote: > I vaguely remember some years ago someone proposing a patch that used metaclasses to avoid using exec() (I think it was > to benefit PyPy or one of the JIT-backed interpreters). Would that work to remove the need for exec() while keeping the > code in pure Python? The aenum library [1] uses the same techniques as Enum for a metaclass-based namedtuple. I don't expect it to be faster, but somebody could do the benchmarks and then we'd know for sure. ;) -- ~Ethan~ [1] https://pypi.python.org/pypi/aenum From ethan at stoneleaf.us Mon Jul 17 19:17:57 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 17 Jul 2017 16:17:57 -0700 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: <596D39C9.4030805@canterbury.ac.nz> References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> <596D39C9.4030805@canterbury.ac.nz> Message-ID: <596D45A5.3090001@stoneleaf.us> On 07/17/2017 03:27 PM, Greg Ewing wrote: > Barry Warsaw wrote: >> namedtuple is great and clever, but it?s also a bit clunky. It has a weird >> signature and requires a made up type name. > > Maybe a metaclass could be used to make something > like this possible: > > > class Foo(NamedTuple, fields = 'x,y,z'): > ... > > Then the name is explicit and you get to add methods > etc. if you want. From the NamedTuple tests from my aenum library [1]: LifeForm = NamedTuple('LifeForm', 'branch genus species', module=__name__) class DeathForm(NamedTuple): color = 0 rigidity = 1 odor = 2 class WhatsIt(NamedTuple): def what(self): return self[0] class ThatsIt(WhatsIt): blah = 0 bleh = 1 class Character(NamedTuple): # second argument is doc string name = 0 gender = 1, None, 'male' klass = 2, None, 'fighter' class Point(NamedTuple): x = 0, 'horizondal coordinate', 0 y = 1, 'vertical coordinate', 0 class Point(NamedTuple): x = 0, 'horizontal coordinate', 1 y = 1, 'vertical coordinate', -1 class Color(NamedTuple): r = 0, 'red component', 11 g = 1, 'green component', 29 b = 2, 'blue component', 37 Pixel1 = NamedTuple('Pixel', Point+Color, module=__name__) class Pixel2(Point, Color): "a colored dot" class Pixel3(Point): r = 2, 'red component', 11 g = 3, 'green component', 29 b = 4, 'blue component', 37 -- ~Ethan~ [1] https://pypi.python.org/pypi/aenum From g.rodola at gmail.com Mon Jul 17 19:17:24 2017 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Tue, 18 Jul 2017 01:17:24 +0200 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Message-ID: On Mon, Jul 17, 2017 at 11:24 PM, Tim Peters wrote: > [Giampaolo Rodola' ] > > .... > > To be entirely honest, I'm not even sure why they need to be forcefully > > declared upfront in the first place, instead of just having a first-class > > function (builtin?) written in C: > > > > >>> ntuple(x=1, y=0) > > (x=1, y=0) > > > > ...or even a literal as in: > > > > >>> (x=1, y=0) > > (x=1, y=0) > > How do you propose that the resulting object T know that T.x is 1. 
T.y > is 0, and T.z doesn't make sense? I'm not sure I understand your concern. That's pretty much what PyStructSequence already does. > Declaring a namedtuple up front > allows the _class_ to know that all of its instances map attribute "x" > to index 0 and attribute "y" to index 1. The instances know nothing > about that on their own Hence why I was talking about a "(lightweight) anonymous tuple with named attributes". The primary use case for namedtuples is accessing values by name (obj.x). Personally I've always considered the upfront module-level declaration only an annoyance which unnecessarily pollutes the API and adds extra overhead. I typically end up putting all namedtuples in a private module: https://github.com/giampaolo/psutil/blob/8b8da39e0c62432504fb5f67c418715aad35b291/psutil/_common.py#L156-L225 ...then import them from elsewhere and make sure they are not exposed publicly because the intermediate object returned by collections.namedtuple() is basically useless for the end-user. Also picking up a sensible name for the namedtuple is an annoyance and kinda weird. Consider this: from collections import namedtuple Coordinates = namedtuple('coordinates', ['x', 'y']) def get_coordinates(): return Coordinates(10, 20) ...vs. this: def get_coordinates(): return ntuple(x=10, y=20) ...or this: def get_coordinates(): return (x=10, y=20) If your `ntuple()` returns an object implementing its own > mapping, it loses a primary advantage (0 memory overhead) of > namedtuples. > The extra memory overhead is a price I would be happy to pay considering that collections.namedtuple is considerably slower than a plain tuple. Other than the additional overhead on startup / import time, instantiation is 4.5x slower than a plain tuple: $ python3.7 -m timeit -s "from collections import namedtuple; nt = namedtuple('xxx', ('x', 'y'))" "nt(1, 2)" 1000000 loops, best of 5: 313 nsec per loop $ python3.7 -m timeit "tuple((1, 2))" 5000000 loops, best of 5: 68.4 nsec per loop ...and name access is 2x slower than index access: $ python3.7 -m timeit -s "from collections import namedtuple; nt = namedtuple('xxx', ('x', 'y')); x = nt(1, 2)" "x.x" 5000000 loops, best of 5: 41.9 nsec per loop $ python3.7 -m timeit -s "from collections import namedtuple; nt = namedtuple('xxx', ('x', 'y')); x = nt(1, 2)" "x[0]" 10000000 loops, best of 5: 20.2 nsec per loop $ python3.7 -m timeit -s "x = (1, 2)" "x[0]" 10000000 loops, best of 5: 20.5 nsec per loop -- Giampaolo - http://grodola.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Jul 17 19:45:17 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 17 Jul 2017 16:45:17 -0700 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Message-ID: Raymond agreed to reopen the issue. Everyone who's eager to redesign namedtuple, please go to python-ideas. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alexander.belopolsky at gmail.com Mon Jul 17 19:48:05 2017 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 17 Jul 2017 19:48:05 -0400 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: <596D39C9.4030805@canterbury.ac.nz> References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> <596D39C9.4030805@canterbury.ac.nz> Message-ID: On Mon, Jul 17, 2017 at 6:27 PM, Greg Ewing wrote: > > > Maybe a metaclass could be used to make something > like this possible: > > > class Foo(NamedTuple, fields = 'x,y,z'): > ... > > If you think of it, collection.namedtuple *is* a metaclass. A simple wrapper will make it usable as such: import collections def namedtuple(name, bases, attrs, fields=()): # Override __init_subclass__ for Python 3.6 return collections.namedtuple(name, fields) class Foo(metaclass=namedtuple, fields='x,y'): pass print(Foo(1, 2)) # ---> Foo(x=1, y=2) -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Mon Jul 17 20:16:42 2017 From: barry at python.org (Barry Warsaw) Date: Mon, 17 Jul 2017 20:16:42 -0400 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: <596D39C9.4030805@canterbury.ac.nz> References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> <596D39C9.4030805@canterbury.ac.nz> Message-ID: <1046D04A-6EA5-40E8-AEB0-4B40F9648ACF@python.org> On Jul 17, 2017, at 18:27, Greg Ewing wrote: > > Barry Warsaw wrote: >> namedtuple is great and clever, but it?s also a bit clunky. It has a weird >> signature and requires a made up type name. > > Maybe a metaclass could be used to make something > like this possible: > > > class Foo(NamedTuple, fields = 'x,y,z'): > ... > > Then the name is explicit and you get to add methods > etc. if you want. Yes, I like how that reads. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From steve at pearwood.info Mon Jul 17 20:21:45 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 18 Jul 2017 10:21:45 +1000 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> <0DC5C68F-953D-493A-BD18-6D77D2C047CA@gmail.com> Message-ID: <20170718002143.GZ3149@ando.pearwood.info> On Mon, Jul 17, 2017 at 09:31:20PM +0000, Brett Cannon wrote: > As for removing exec() as a goal, I'll back up Christian's point and the > one Steve made at the language summit that removing the use of exec() from > the critical path in Python is a laudable goal from a security perspective. I'm sorry, I don't understand this point. What do you mean by "critical path"? Is the intention to remove exec from builtins? From the entire language? If not, how does its use in namedtuple introduce a security problem? -- Steve From alexander.belopolsky at gmail.com Mon Jul 17 21:14:38 2017 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 17 Jul 2017 21:14:38 -0400 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: <1046D04A-6EA5-40E8-AEB0-4B40F9648ACF@python.org> References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> <596D39C9.4030805@canterbury.ac.nz> <1046D04A-6EA5-40E8-AEB0-4B40F9648ACF@python.org> Message-ID: On Mon, Jul 17, 2017 at 8:16 PM, Barry Warsaw wrote: > .. 
> > class Foo(NamedTuple, fields = 'x,y,z'): > > ... > > > > Then the name is explicit and you get to add methods > > etc. if you want. > > Yes, I like how that reads. > > I would prefer class Foo(metaclass=namedtuple, fields = 'x,y,z'): ... which while slightly more verbose, does not lie about what namedtuple is - a factory of classes. This, however is completely orthogonal to the issue of performance. As I mentioned in my previous post, namedtuple metaclass above can be a simple function: def namedtuple(name, bases, attrs, fields=()): # Override __init_subclass__ for Python 3.6 return collections.namedtuple(name, fields) -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Mon Jul 17 21:22:48 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 18 Jul 2017 11:22:48 +1000 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Message-ID: <20170718012248.GA3149@ando.pearwood.info> On Tue, Jul 18, 2017 at 01:17:24AM +0200, Giampaolo Rodola' wrote: > The extra memory overhead is a price I would be happy to pay considering > that collections.namedtuple is considerably slower than a plain tuple. > Other than the additional overhead on startup / import time, instantiation > is 4.5x slower than a plain tuple: > > $ python3.7 -m timeit -s "from collections import namedtuple; nt = > namedtuple('xxx', ('x', 'y'))" "nt(1, 2)" > 1000000 loops, best of 5: 313 nsec per loop > > $ python3.7 -m timeit "tuple((1, 2))" > 5000000 loops, best of 5: 68.4 nsec per loop I don't think that is a fair comparision. As far as I can tell, that gets compiled to a name lookup for "tuple" which then returns its argument unchanged, the tuple itself being constant-folded at compile time. py> dis.dis("tuple((1, 2))") 1 0 LOAD_NAME 0 (tuple) 3 LOAD_CONST 2 ((1, 2)) 6 CALL_FUNCTION 1 (1 positional, 0 keyword pair) 9 RETURN_VALUE -- Steve From njs at pobox.com Mon Jul 17 22:25:27 2017 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 17 Jul 2017 19:25:27 -0700 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: <20170718002143.GZ3149@ando.pearwood.info> References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> <0DC5C68F-953D-493A-BD18-6D77D2C047CA@gmail.com> <20170718002143.GZ3149@ando.pearwood.info> Message-ID: On Jul 17, 2017 5:28 PM, "Steven D'Aprano" wrote: On Mon, Jul 17, 2017 at 09:31:20PM +0000, Brett Cannon wrote: > As for removing exec() as a goal, I'll back up Christian's point and the > one Steve made at the language summit that removing the use of exec() from > the critical path in Python is a laudable goal from a security perspective. I'm sorry, I don't understand this point. What do you mean by "critical path"? Is the intention to remove exec from builtins? From the entire language? If not, how does its use in namedtuple introduce a security problem? I think the intention is to allow users with a certain kind of security requirement to opt in to a restricted version of the language that doesn't support exec. This is difficult if the stdlib is calling exec all over the place. But nobody is suggesting to change the language in regular usage, just provide another option. -n -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ethan at stoneleaf.us Mon Jul 17 22:40:39 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 17 Jul 2017 19:40:39 -0700 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> Message-ID: <596D7527.6@stoneleaf.us> On 07/17/2017 04:45 PM, Guido van Rossum wrote: > Raymond agreed to reopen the issue. Everyone who's eager to redesign namedtuple, please go to python-ideas. Python Ideas thread started. -- ~Ethan~ From storchaka at gmail.com Tue Jul 18 01:20:27 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 18 Jul 2017 08:20:27 +0300 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: <20170717144319.7fdbf64b@fsol> References: <20170717144319.7fdbf64b@fsol> Message-ID: 17.07.17 15:43, Antoine Pitrou ????: > Cost of creating a namedtuple has been identified as a contributor to > Python startup time. Not only Python core and the stdlib, but any > third-party library creating namedtuple classes (there are many of > them). An issue was created for this: > https://bugs.python.org/issue28638 > > Raymond decided to close the issue because: > > 1) the proposed resolution makes the "_source" attribute empty (or, at > least, something else than it currently is). Raymond claims the > "_source" attribute is an essential feature of namedtuples. > > 2) optimizing startup cost is supposedly not worth the effort. The implementations of namedtuple that don't use compilation were written by different developers (including me) multiple times before issue28638. I provided my patch in issue28638 as an example, but I understand Raymond's arguments, and they look weighty to me. I don't know how much the _source attribute is used, but it is a part of public API. The drawback of these implementation is slower __new__ and __repr__ methods. This can be avoided if use compilation for creating __new__, but this makes the creation of a namedtuple class slower (but still faster than compiling full namedtuple class). The drawback of generating _source without using it to create a namedtuple class is complicating the code and possible quickly desynchronization of two implementations in future. I think that the right solution of this issue is generalizing the import machinery and allowing it to cache not just files, but arbitrary chunks of code. We already use precompiled bytecode files for exactly same goal -- speed up the startup by avoiding compilation. This solution could be used for caching other generated code, not just namedtuples. From songofacandy at gmail.com Tue Jul 18 06:04:37 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Tue, 18 Jul 2017 19:04:37 +0900 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> Message-ID: > > I think that the right solution of this issue is generalizing the import > machinery and allowing it to cache not just files, but arbitrary chunks of > code. We already use precompiled bytecode files for exactly same goal -- > speed up the startup by avoiding compilation. This solution could be used > for caching other generated code, not just namedtuples. > I thought about adding C implementation based on PyStructSequence. But I like Jelle's approach because it may improve performance on all Python implementation. It's reducing source to eval. It shares code objects for most methods. 
(refer https://github.com/python/cpython/pull/2736#issuecomment-316014866 for quick and dirty bench on PyPy) I agree that template + eval pattern is nice for readability when comparing to other meta-programming magic. And code cache machinery can improve template + eval pattern in CPython. But namedtuple is very widely used. It's loved enough to get optimized for not only CPython. So I prefer Jelle's approach to adding code cache machinery in this case. Regards, INADA Naoki From ncoghlan at gmail.com Tue Jul 18 08:13:01 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 18 Jul 2017 22:13:01 +1000 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: <9E9AD48B-155B-4357-98D8-65C088E1F02F@gmail.com> References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> <9E9AD48B-155B-4357-98D8-65C088E1F02F@gmail.com> Message-ID: On 18 July 2017 at 05:42, Raymond Hettinger wrote: > One minor grumble: I think we need to give careful cost/benefit considerations to optimizations that complicate the implementation. Over the last several years, the source for Python has grown increasingly complicated. Fewer people understand it now. It is much harder to newcomers to on-ramp. The old-timers (myself included) find that their knowledge is out of date. And complexity leads to bugs (the C optimization of random number seeding caused a major bug in the 3.6.0 release; the C optimization of the lru_cache resulted in multiple releases having a hard to find threading bugs, etc.). It is becoming increasingly difficult to look at code and tell whether it is correct (I still don't fully understand the implications of the recursive constant folding in the peephole optimizer for example). In the case of this named tuple proposal, the complexity is manageable, but the overall trend isn't good and I get the feeling the aggressive optimization is causing us to forget key par > ts of the zen-of-python. As another example of this: while trading the global import lock for per-module locks eliminated most of the old import deadlocks, it turns out that it *also* left us with some fairly messy race conditions and more fragile code (I still count that particular case as a win overall, but it definitely raises the barrier to entry for maintaining that code). Unfortunately, these are frequently cases where the benefits are immediately visible (e.g. faster benchmark results, removing longstanding limitations on user code), but the downsides can literally take years to make themselves felt (e.g. higher defect rates in the interpreter, subtle bugs in previously correct user code that are eventually traced back to interpreter changes). Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Tue Jul 18 08:28:51 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 18 Jul 2017 14:28:51 +0200 Subject: [Python-Dev] Impact of Namedtuple on startup time References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> <9E9AD48B-155B-4357-98D8-65C088E1F02F@gmail.com> Message-ID: <20170718142851.0ee9bda0@fsol> On Tue, 18 Jul 2017 22:13:01 +1000 Nick Coghlan wrote: > > As another example of this: while trading the global import lock for > per-module locks eliminated most of the old import deadlocks, it turns > out that it *also* left us with some fairly messy race conditions and > more fragile code (I still count that particular case as a win > overall, but it definitely raises the barrier to entry for maintaining > that code). > > Unfortunately, these are frequently cases where the benefits are > immediately visible (e.g. faster benchmark results, removing > longstanding limitations on user code), but the downsides can > literally take years to make themselves felt (e.g. higher defect rates > in the interpreter, subtle bugs in previously correct user code that > are eventually traced back to interpreter changes). The import deadlocks were really in the category of "subtle bugs" that only occur in certain timing conditions (especially when combined with PyImport_ImportModuleNoBlock and/or stdlib modules which can try to import stuff silently, such as the codecs module). So we traded a category of "subtle bugs" due to a core design deficiency for another category of "subtle bugs" due to an imperfect implementation, the latter being actually fixable incrementally :-) Disclaimer: I wrote the initial per-module lock implementation, which was motivated by those long-standing "subtle bugs" in multi-threaded applications. Regards Antoine. From guido at python.org Tue Jul 18 11:12:50 2017 From: guido at python.org (Guido van Rossum) Date: Tue, 18 Jul 2017 08:12:50 -0700 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> <9E9AD48B-155B-4357-98D8-65C088E1F02F@gmail.com> Message-ID: There are some weighty things being said in this subthread that shouldn't be hidden under the heading of improving NamedTuple. For continued discussion of our development philosophy let's open a new thread. (I have an opinion but I expect I'm not the only one with that opinion, so I'll let others express theirs first.) --Guido On Tue, Jul 18, 2017 at 5:13 AM, Nick Coghlan wrote: > On 18 July 2017 at 05:42, Raymond Hettinger > wrote: > > One minor grumble: I think we need to give careful cost/benefit > considerations to optimizations that complicate the implementation. Over > the last several years, the source for Python has grown increasingly > complicated. Fewer people understand it now. It is much harder to > newcomers to on-ramp. The old-timers (myself included) find that their > knowledge is out of date. And complexity leads to bugs (the C optimization > of random number seeding caused a major bug in the 3.6.0 release; the C > optimization of the lru_cache resulted in multiple releases having a hard > to find threading bugs, etc.). It is becoming increasingly difficult to > look at code and tell whether it is correct (I still don't fully understand > the implications of the recursive constant folding in the peephole > optimizer for example). 
In the case of this named tuple proposal, the > complexity is manageable, but the overall trend isn't good and I get the > feeling the aggressive optimization is causing us to forget key par > > ts of the zen-of-python. > > As another example of this: while trading the global import lock for > per-module locks eliminated most of the old import deadlocks, it turns > out that it *also* left us with some fairly messy race conditions and > more fragile code (I still count that particular case as a win > overall, but it definitely raises the barrier to entry for maintaining > that code). > > Unfortunately, these are frequently cases where the benefits are > immediately visible (e.g. faster benchmark results, removing > longstanding limitations on user code), but the downsides can > literally take years to make themselves felt (e.g. higher defect rates > in the interpreter, subtle bugs in previously correct user code that > are eventually traced back to interpreter changes). > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From benhoyt at gmail.com Tue Jul 18 12:03:36 2017 From: benhoyt at gmail.com (Ben Hoyt) Date: Tue, 18 Jul 2017 12:03:36 -0400 Subject: [Python-Dev] Program runs in 12s on Python 2.7, but 5s on Python 3.5 -- why so much difference? Message-ID: Hi folks, (Not entirely sure this is the right place for this question, but hopefully it's of interest to several folks.) A few days ago I posted a note in response to Victor Stinner's articles on his CPython contributions, noting that I wrote a program that ran in 11.7 seconds on Python 2.7, but only takes 5.1 seconds on Python 3.5 (on my 2.5 GHz macOS i7), more than 2x as fast. Obviously this is a Good Thing, but I'm curious as to why there's so much difference. The program is a pentomino puzzle solver, and it works via code generation, generating a ton of nested "if" statements, so I believe it's exercising the Python bytecode interpreter heavily. Obviously there have been some big optimizations to make this happen, but I'm curious what the main improvements are that are causing this much difference. There's a writeup about my program here, with benchmarks at the bottom: http://benhoyt.com/writings/python-pentomino/ This is the generated Python code that's being exercised: https://github.com/benhoyt/python-pentomino/blob/master/generated_solve.py For reference, on Python 3.6 it runs in 4.6 seconds (same on Python 3.7 alpha). This smallish increase from Python 3.5 to Python 3.6 was more expected to me due to the bytecode changing to wordcode in 3.6. I tried using cProfile on both Python versions, but that didn't say much, because the functions being called aren't taking the majority of the time. How does one benchmark at a lower level, or otherwise explain what's going on here? Thanks, Ben -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Tue Jul 18 12:08:08 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 18 Jul 2017 09:08:08 -0700 Subject: [Python-Dev] Design Philosophy: Performance vs Robustness/Maintainability Message-ID: <596E3268.4070705@stoneleaf.us> Raymond Hettinger: ----------------- > One minor grumble: I think we need to give careful cost/benefit considerations to > optimizations that complicate the implementation. 
Over the last several years, the > source for Python has grown increasingly complicated. Fewer people understand it > now. It is much harder to newcomers to on-ramp. The old-timers (myself included) > find that their knowledge is out of date. And complexity leads to bugs (the C > optimization of random number seeding caused a major bug in the 3.6.0 release; the > C optimization of the lru_cache resulted in multiple releases having a hard to find > threading bugs, etc.). It is becoming increasingly difficult to look at code and > tell whether it is correct (I still don't fully understand the implications of the > recursive constant folding in the peephole optimizer for example). In the case > of this named tuple proposal, the complexity is manageable, but the overall trend > isn't good and I get the feeling the aggressive optimization is causing us to > forget key parts of the zen-of-python. Nick Coughlan: ------------- > As another example of this: while trading the global import lock for > per-module locks eliminated most of the old import deadlocks, it turns > out that it *also* left us with some fairly messy race conditions and > more fragile code (I still count that particular case as a win > overall, but it definitely raises the barrier to entry for maintaining > that code). > > Unfortunately, these are frequently cases where the benefits are > immediately visible (e.g. faster benchmark results, removing > longstanding limitations on user code), but the downsides can > literally take years to make themselves felt (e.g. higher defect rates > in the interpreter, subtle bugs in previously correct user code that > are eventually traced back to interpreter changes). Barry Warsaw: ------------ > Regardless of whether [namedtuple] optimization is a good idea or not, start up > time *is* a serious challenge in many environments for CPython in particular and > the perception of Python?s applicability to many problems. I think we?re better > off trying to identify and address such problems than ignoring or minimizing them. Ethan Furman: ------------ Speed is not the only factor, and certainly shouldn't be the first concern, but once we have correct code we need to follow our own advice: find the bottlenecks and optimize them. Optimized code will never be as pretty or maintainable as simple, unoptimized code but real-world applications often require as much performance as can be obtained. [My apologies if I missed any points from the namedtuple thread.] -- ~Ethan~ From ethan at stoneleaf.us Tue Jul 18 12:08:33 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 18 Jul 2017 09:08:33 -0700 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> <9E9AD48B-155B-4357-98D8-65C088E1F02F@gmail.com> Message-ID: <596E3281.1050804@stoneleaf.us> On 07/18/2017 08:12 AM, Guido van Rossum wrote: > There are some weighty things being said in this subthread that shouldn't be hidden under the heading of improving > NamedTuple. For continued discussion of our development philosophy let's open a new thread. (I have an opinion but I > expect I'm not the only one with that opinion, so I'll let others express theirs first.) New thread created. -- ~Ethan~ From jsbueno at python.org.br Tue Jul 18 12:08:49 2017 From: jsbueno at python.org.br (Joao S. O. 
Bueno) Date: Tue, 18 Jul 2017 13:08:49 -0300 Subject: [Python-Dev] deque implementation question In-Reply-To: References: Message-ID: On 15 July 2017 at 04:01, Max Moroz wrote: > What would be the disadvantage of implementing collections.deque as a > circular array (rather than a doubly linked list of blocks)? My naive > thinking was that a circular array would maintain the current O(1) append/pop > from either side, and would improve index lookup in the middle from O(n) to > O(1). What am I missing? > > The insertion/removal of an arbitrary item specified by a pointer would > increase from constant time to linear, but since we don't have pointers > this is a moot point. > > Of course when the circular array is full, it will need to be reallocated, > but the amortized cost of that is still O(1). (Moreover, for a bounded > deque, there's even an option of preallocation, which would completely > eliminate reallocations.) > Now - since you are at it, you could possibly mine pypi for interesting efficient data structures that could cover use cases lists and deque not suffice. I am pretty sure one could find a couple, - and once we get a few that are well behaved and somewhat popular, they could be made candidates for inclusion in collections, I guess. > Thanks > > Max > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > jsbueno%40python.org.br > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Tue Jul 18 12:12:09 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 18 Jul 2017 18:12:09 +0200 Subject: [Python-Dev] Design Philosophy: Performance vs Robustness/Maintainability In-Reply-To: <596E3268.4070705@stoneleaf.us> References: <596E3268.4070705@stoneleaf.us> Message-ID: [Python-Dev] Design Philosophy: Performance vs Robustness/Maintainability 2017-07-18 18:08 GMT+02:00 Ethan Furman : > Nick Coughlan: > ------------- >> >> As another example of this: while trading the global import lock for >> per-module locks eliminated most of the old import deadlocks, (...) Minor remark: the email subject is inaccurate, this change is not related to performance. I would more say that it's about correctness. Python 3 doesn't hung on deadlock in "legit" import anymore ;-) Victor From solipsis at pitrou.net Tue Jul 18 12:16:30 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 18 Jul 2017 18:16:30 +0200 Subject: [Python-Dev] Design Philosophy: Performance vs Robustness/Maintainability References: <596E3268.4070705@stoneleaf.us> Message-ID: <20170718181630.0fe90f85@fsol> On Tue, 18 Jul 2017 09:08:08 -0700 Ethan Furman wrote: > > Nick Coughlan: > ------------- It is "Nick Coghlan" not "Coughlan". > > As another example of this: while trading the global import lock for > > per-module locks eliminated most of the old import deadlocks, it turns > > out that it *also* left us with some fairly messy race conditions and > > more fragile code (I still count that particular case as a win > > overall, but it definitely raises the barrier to entry for maintaining > > that code). > > > > Unfortunately, these are frequently cases where the benefits are > > immediately visible (e.g. faster benchmark results, removing > > longstanding limitations on user code), but the downsides can > > literally take years to make themselves felt (e.g. 
higher defect rates > > in the interpreter, subtle bugs in previously correct user code that > > are eventually traced back to interpreter changes). I'll reply here again: the original motivation for the per-module import lock was not performance but correctness. The import deadlocks were really in the category of "subtle bugs" that only occur in certain timing conditions (especially when combined with PyImport_ImportModuleNoBlock and/or stdlib modules which can try to import stuff silently, such as the codecs module). So we traded a category of "subtle bugs" due to a core design deficiency for another category of "subtle bugs" due to an imperfect implementation, the latter being actually fixable incrementally :-) Disclaimer: I wrote the initial per-module lock implementation. Regards Antoine. From victor.stinner at gmail.com Tue Jul 18 12:16:52 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 18 Jul 2017 18:16:52 +0200 Subject: [Python-Dev] Design Philosophy: Performance vs Robustness/Maintainability In-Reply-To: <596E3268.4070705@stoneleaf.us> References: <596E3268.4070705@stoneleaf.us> Message-ID: 2017-07-18 18:08 GMT+02:00 Ethan Furman : > Raymond Hettinger: > ----------------- >> And complexity leads to bugs >> (the C >> optimization of random number seeding caused a major bug in the 3.6.0 >> release Hum, I guess that Raymond is referring to http://bugs.python.org/issue29085 This regression was not by an optimization at all, but a change to harden Python: https://www.python.org/dev/peps/pep-0524/ Victor From solipsis at pitrou.net Tue Jul 18 12:18:06 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 18 Jul 2017 18:18:06 +0200 Subject: [Python-Dev] Program runs in 12s on Python 2.7, but 5s on Python 3.5 -- why so much difference? References: Message-ID: <20170718181806.14fed37f@fsol> On Tue, 18 Jul 2017 12:03:36 -0400 Ben Hoyt wrote: > Hi folks, > > (Not entirely sure this is the right place for this question, but hopefully > it's of interest to several folks.) > > A few days ago I posted a note in response to Victor Stinner's articles on > his CPython contributions, noting that I wrote a program that ran in 11.7 > seconds on Python 2.7, but only takes 5.1 seconds on Python 3.5 (on my 2.5 > GHz macOS i7), more than 2x as fast. Obviously this is a Good Thing, but > I'm curious as to why there's so much difference. > > The program is a pentomino puzzle solver, and it works via code generation, > generating a ton of nested "if" statements, so I believe it's exercising > the Python bytecode interpreter heavily. A first step would be to see if the generated bytecode has changed substantially. Otherwise, you can try to comment out parts of the function until the performance difference has been nullified. Regards Antoine. From ethan at stoneleaf.us Tue Jul 18 12:27:13 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 18 Jul 2017 09:27:13 -0700 Subject: [Python-Dev] Design Philosophy: Performance vs Robustness/Maintainability In-Reply-To: <20170718181630.0fe90f85@fsol> References: <596E3268.4070705@stoneleaf.us> <20170718181630.0fe90f85@fsol> Message-ID: <596E36E1.2020506@stoneleaf.us> On 07/18/2017 09:16 AM, Antoine Pitrou wrote: > On Tue, 18 Jul 2017 09:08:08 -0700 > Ethan Furman wrote: >> >> Nick Coughlan: >> ------------- > > It is "Nick Coghlan" not "Coughlan". Argh. Sorry, Nick, and thank you, Antoine! 
>>> As another example of this: while trading the global import lock for >>> per-module locks eliminated most of the old import deadlocks, it turns >>> out that it *also* left us with some fairly messy race conditions and >>> more fragile code (I still count that particular case as a win >>> overall, but it definitely raises the barrier to entry for maintaining >>> that code). >>> >>> Unfortunately, these are frequently cases where the benefits are >>> immediately visible (e.g. faster benchmark results, removing >>> longstanding limitations on user code), but the downsides can >>> literally take years to make themselves felt (e.g. higher defect rates >>> in the interpreter, subtle bugs in previously correct user code that >>> are eventually traced back to interpreter changes). > > I'll reply here again: the original motivation for the per-module > import lock was not performance but correctness. I meant that as an example of the dangers of increased code complexity. -- ~Ethan~ From brett at python.org Tue Jul 18 18:00:44 2017 From: brett at python.org (Brett Cannon) Date: Tue, 18 Jul 2017 22:00:44 +0000 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> <0DC5C68F-953D-493A-BD18-6D77D2C047CA@gmail.com> <20170718002143.GZ3149@ando.pearwood.info> Message-ID: On Mon, 17 Jul 2017 at 19:26 Nathaniel Smith wrote: > On Jul 17, 2017 5:28 PM, "Steven D'Aprano" wrote: > > On Mon, Jul 17, 2017 at 09:31:20PM +0000, Brett Cannon wrote: > > > As for removing exec() as a goal, I'll back up Christian's point and the > > one Steve made at the language summit that removing the use of exec() > from > > the critical path in Python is a laudable goal from a security > perspective. > > I'm sorry, I don't understand this point. What do you mean by "critical > path"? > > Is the intention to remove exec from builtins? From the entire language? > If not, how does its use in namedtuple introduce a security problem? > > > I think the intention is to allow users with a certain kind of security > requirement to opt in to a restricted version of the language that doesn't > support exec. This is difficult if the stdlib is calling exec all over the > place. But nobody is suggesting to change the language in regular usage, > just provide another option. > What Nathaniel said. :) -------------- next part -------------- An HTML attachment was scrubbed... URL: From larry at hastings.org Tue Jul 18 18:22:16 2017 From: larry at hastings.org (Larry Hastings) Date: Tue, 18 Jul 2017 15:22:16 -0700 Subject: [Python-Dev] Impact of Namedtuple on startup time In-Reply-To: References: <20170717144319.7fdbf64b@fsol> <4d161c7b-87cf-7d78-6967-07be1c584591@python.org> <0DC5C68F-953D-493A-BD18-6D77D2C047CA@gmail.com> <20170718002143.GZ3149@ando.pearwood.info> Message-ID: <484e070d-633a-35c0-7f04-756805568dad@hastings.org> On 07/17/2017 07:25 PM, Nathaniel Smith wrote: > I think the intention is to allow users with a certain kind of > security requirement to opt in to a restricted version of the language > that doesn't support exec. This is difficult if the stdlib is calling > exec all over the place. But nobody is suggesting to change the > language in regular usage, just provide another option. An anecdote about removing exec(). Back in 2012 I interviewed Kristjan Valur Jonsson, then of CCP Games, for my podcast Radio Free Python. 
He said that due to memory constraints they'd had to remove the compiler from the Playstation 3 build of Python for some game project. This meant that namedtuple didn't work, which had knock-on effects for other bits of the standard library So security concerns aren't the only reason for removing the compiler, //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Tue Jul 18 18:59:58 2017 From: brett at python.org (Brett Cannon) Date: Tue, 18 Jul 2017 22:59:58 +0000 Subject: [Python-Dev] Design Philosophy: Performance vs Robustness/Maintainability In-Reply-To: <596E3268.4070705@stoneleaf.us> References: <596E3268.4070705@stoneleaf.us> Message-ID: On Tue, 18 Jul 2017 at 09:07 Ethan Furman wrote: > Raymond Hettinger: > ----------------- > > One minor grumble: I think we need to give careful cost/benefit > considerations to > > optimizations that complicate the implementation. Over the last > several years, the > > source for Python has grown increasingly complicated. Fewer people > understand it > > now. It is much harder to newcomers to on-ramp. The old-timers (myself > included) > > find that their knowledge is out of date. And complexity leads to bugs > (the C > > optimization of random number seeding caused a major bug in the 3.6.0 > release; the > > C optimization of the lru_cache resulted in multiple releases having a > hard to find > > threading bugs, etc.). It is becoming increasingly difficult to look > at code and > > tell whether it is correct (I still don't fully understand the > implications of the > > recursive constant folding in the peephole optimizer for example). > In the case > > of this named tuple proposal, the complexity is manageable, but the > overall trend > > isn't good and I get the feeling the aggressive optimization is causing > us to > > forget key parts of the zen-of-python. > > Nick Coghlan: > ------------- > > As another example of this: while trading the global import lock for > > per-module locks eliminated most of the old import deadlocks, it turns > > out that it *also* left us with some fairly messy race conditions and > > more fragile code (I still count that particular case as a win > > overall, but it definitely raises the barrier to entry for maintaining > > that code). > > > > Unfortunately, these are frequently cases where the benefits are > > immediately visible (e.g. faster benchmark results, removing > > longstanding limitations on user code), but the downsides can > > literally take years to make themselves felt (e.g. higher defect rates > > in the interpreter, subtle bugs in previously correct user code that > > are eventually traced back to interpreter changes). > > Barry Warsaw: > ------------ > > Regardless of whether [namedtuple] optimization is a good idea or not, > start up > > time *is* a serious challenge in many environments for CPython in > particular and > > the perception of Python?s applicability to many problems. I think > we?re better > > off trying to identify and address such problems than ignoring or > minimizing them. > > Ethan Furman: > ------------ > Speed is not the only factor, and certainly shouldn't be the first > concern, but once > we have correct code we need to follow our own advice: find the > bottlenecks and optimize > them. Optimized code will never be as pretty or maintainable as simple, > unoptimized > code but real-world applications often require as much performance as can > be obtained. 
> For me it's a balance based on how critical the code is and how complicated the code will become long-term. I think between Victor and me we maybe have 1 person/week of paid work time on CPython and the rest is volunteer time, so there always has to be some consideration as to whether maintenance will become untenable long-term (this is why complex is better than complicated pretty much no matter what). In namedtuple's case, Raymond designed something that was useful with an elegant solution. Unfortunately namedtuple is a victim of its own success and became a bottleneck when it came to startup time in apps that used it extensively as well as being a sticking point for anyone who wanted to askew exec(). So now we're keeping the usefulness/API design aspect and are being pragmatic about the fact that we want to rework the elegant design to be computationally cheaper so it's no longer an obvious performance penalty at app startup for people who use it a lot. And so now the work is trying to balance the pragmatic performance aspect with the long-term maintenance aspect. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Jul 18 21:35:20 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 19 Jul 2017 11:35:20 +1000 Subject: [Python-Dev] Program runs in 12s on Python 2.7, but 5s on Python 3.5 -- why so much difference? In-Reply-To: <20170718181806.14fed37f@fsol> References: <20170718181806.14fed37f@fsol> Message-ID: On 19 July 2017 at 02:18, Antoine Pitrou wrote: > On Tue, 18 Jul 2017 12:03:36 -0400 > Ben Hoyt wrote: >> The program is a pentomino puzzle solver, and it works via code generation, >> generating a ton of nested "if" statements, so I believe it's exercising >> the Python bytecode interpreter heavily. > > A first step would be to see if the generated bytecode has changed > substantially. Scanning over them, the Python 2.7 bytecode appears to have many more JUMP_FORWARD and JUMP_ABSOLUTE opcodes than appear in the 3.6 version (I didn't dump them into a Counter instance to tally them properly though, since 2.7's dis module is missing the structured opcode iteration APIs). With the shift to wordcode, the overall size of the bytecode is also significantly *smaller*: >>> len(co.co_consts[0].co_code) # 2.7 14427 >>> len(co.co_consts[0].co_code) # 3.6 11850 However, I'm not aware of any Python profilers that currently offer opcode level profiling - the closest would probably be VMProf's JIT profiling, and that aspect of VMProf is currently PyPy specific (although could presumably be extended to CPython 3.6+ by way of the opcode evaluation hook). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From benhoyt at gmail.com Tue Jul 18 21:41:38 2017 From: benhoyt at gmail.com (Ben Hoyt) Date: Tue, 18 Jul 2017 21:41:38 -0400 Subject: [Python-Dev] Program runs in 12s on Python 2.7, but 5s on Python 3.5 -- why so much difference? In-Reply-To: References: <20170718181806.14fed37f@fsol> Message-ID: Thanks, Nick -- that's interesting. I just saw the extra JUMP_FORWARD and JUMP_ABSOLUTE instructions on my commute home (I guess those are something Python 3.x optimizes away). VERY strangely, on Windows Python 2.7 is faster! Comparing 64-bit Python 2.7.12 against Python 3.5.3 on my Windows 10 laptop: * Python 2.7.12: 4.088s * Python 3.5.3: 5.792s I'm pretty sure MSVC/Windows doesn't support computed gotos, but that doesn't explain why 3.5 is so much faster than 2.7 on Mac. I have yet to try it on Linux. 
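In case it's useful to anyone else poking at this, here's a quick (untested) way to tally the opcodes Nick mentioned -- it needs dis.get_instructions(), so Python 3.4+ only; on 2.7 you'd have to scrape dis.dis() output instead:

    import dis
    from collections import Counter

    with open("generated_solve.py") as f:
        code = compile(f.read(), "generated_solve.py", "exec")

    # Same code object Nick sized up with len(co.co_consts[0].co_code).
    solver_code = code.co_consts[0]
    counts = Counter(ins.opname for ins in dis.get_instructions(solver_code))
    print(counts.most_common(10))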
-Ben On Tue, Jul 18, 2017 at 9:35 PM, Nick Coghlan wrote: > On 19 July 2017 at 02:18, Antoine Pitrou wrote: > > On Tue, 18 Jul 2017 12:03:36 -0400 > > Ben Hoyt wrote: > >> The program is a pentomino puzzle solver, and it works via code > generation, > >> generating a ton of nested "if" statements, so I believe it's exercising > >> the Python bytecode interpreter heavily. > > > > A first step would be to see if the generated bytecode has changed > > substantially. > > Scanning over them, the Python 2.7 bytecode appears to have many more > JUMP_FORWARD and JUMP_ABSOLUTE opcodes than appear in the 3.6 version > (I didn't dump them into a Counter instance to tally them properly > though, since 2.7's dis module is missing the structured opcode > iteration APIs). > > With the shift to wordcode, the overall size of the bytecode is also > significantly *smaller*: > > >>> len(co.co_consts[0].co_code) # 2.7 > 14427 > > >>> len(co.co_consts[0].co_code) # 3.6 > 11850 > > However, I'm not aware of any Python profilers that currently offer > opcode level profiling - the closest would probably be VMProf's JIT > profiling, and that aspect of VMProf is currently PyPy specific > (although could presumably be extended to CPython 3.6+ by way of the > opcode evaluation hook). > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > benhoyt%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Jul 18 21:59:54 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 18 Jul 2017 18:59:54 -0700 Subject: [Python-Dev] Program runs in 12s on Python 2.7, but 5s on Python 3.5 -- why so much difference? In-Reply-To: References: Message-ID: I'd probably start with a regular C-level profiler, like perf or callgrind. They're not very useful for comparing two versions of code written in Python, but here the Python code is the same (modulo changes in the stdlib), and it's changes in the interpreter's C code that probably make the difference. On Tue, Jul 18, 2017 at 9:03 AM, Ben Hoyt wrote: > Hi folks, > > (Not entirely sure this is the right place for this question, but hopefully > it's of interest to several folks.) > > A few days ago I posted a note in response to Victor Stinner's articles on > his CPython contributions, noting that I wrote a program that ran in 11.7 > seconds on Python 2.7, but only takes 5.1 seconds on Python 3.5 (on my 2.5 > GHz macOS i7), more than 2x as fast. Obviously this is a Good Thing, but I'm > curious as to why there's so much difference. > > The program is a pentomino puzzle solver, and it works via code generation, > generating a ton of nested "if" statements, so I believe it's exercising the > Python bytecode interpreter heavily. Obviously there have been some big > optimizations to make this happen, but I'm curious what the main > improvements are that are causing this much difference. > > There's a writeup about my program here, with benchmarks at the bottom: > http://benhoyt.com/writings/python-pentomino/ > > This is the generated Python code that's being exercised: > https://github.com/benhoyt/python-pentomino/blob/master/generated_solve.py > > For reference, on Python 3.6 it runs in 4.6 seconds (same on Python 3.7 > alpha). 
This smallish increase from Python 3.5 to Python 3.6 was more > expected to me due to the bytecode changing to wordcode in 3.6. > > I tried using cProfile on both Python versions, but that didn't say much, > because the functions being called aren't taking the majority of the time. > How does one benchmark at a lower level, or otherwise explain what's going > on here? > > Thanks, > Ben > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/njs%40pobox.com > -- Nathaniel J. Smith -- https://vorpus.org From victor.stinner at gmail.com Wed Jul 19 08:59:52 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 19 Jul 2017 14:59:52 +0200 Subject: [Python-Dev] Python startup time Message-ID: Hi, On Twitter, Raymond Hettinger wrote: "The decision making process on Python-dev is an anti-pattern, governed by anecdotal data and ambiguity over what problem is solved." https://twitter.com/raymondh/status/887069454693158912 About "anecdotal data", I would like to discuss the Python startup time. == Python 3.7 compared to 2.7 == First of all, on speed.python.org, we have: * Python 2.7: 6.4 ms with site, 3.0 ms without site (-S) * master (3.7): 14.5 ms with site, 8.4 ms without site (-S) Python 3.7 startup time is 2.3x slower with site (default mode), or 2.8x slower without site (-S command line option). (I will skip Python 3.4, 3.5 and 3.6 which are much worse than Python 3.7...) So if a user complained about Python 2.7 startup time: be prepared for a 2x - 3x more angry user when "forced" to upgrade to Python 3! == Mercurial vs Git, Python vs C, startup time == Startup time matters a lot for Mercurial since Mercurial is compared to Git. Git and Mercurial have similar features, but Git is written in C whereas Mercurial is written in Python. Quick benchmark on the speed.python.org server: * hg version: 44.6 ms +- 0.2 ms * git --version: 974 us +- 7 us Mercurial startup time is already 45.8x slower than Git, even though the tested Mercurial runs on Python 2.7.12. Now try to sell Python 3 to Mercurial developers, with a startup time 2x - 3x slower... I tested Mercurial 3.7.3 and Git 2.7.4 on Ubuntu 16.04.1 using "python3 -m perf command -- ...". == CPython core developers don't care? no, they do care == Christian Heimes, Naoki INADA, Serhiy Storchaka, Yury Selivanov, me (Victor Stinner) and other core developers made multiple changes over the last few years to reduce the number of imports at startup, optimize importlib, etc. IMHO all these core developers are well aware of the competition between programming languages, and honestly Python startup time isn't "good". So let's compare it to other programming languages similar to Python. == PHP, Ruby, Perl == I measured the startup time of other programming languages which are similar to Python, still on the speed.python.org server using "python3 -m perf command -- ...": * perl -e ' ': 1.18 ms +- 0.01 ms * php -r ' ': 8.57 ms +- 0.05 ms * ruby -e ' ': 32.8 ms +- 0.1 ms Wow, Perl is quite good! PHP seems as good as Python 2 (but Python 3 is worse). Ruby startup time seems less optimized than other languages. Tested versions: * perl 5, version 22, subversion 1 (v5.22.1) * PHP 7.0.18-0ubuntu0.16.04.1 (cli) ( NTS ) * ruby 2.3.1p112 (2016-04-26) [x86_64-linux-gnu] == Quick Google search == I also searched for "python startup time" and "python slow startup time" on Google and found many articles.
Some examples: "Reducing the Python startup time" http://www.draketo.de/book/export/html/498 => "The python startup time always nagged me (17-30ms) and I just searched again for a way to reduce it, when I found this: The Python-Launcher caches GTK imports and forks new processes to reduce the startup time of python GUI programs." https://nelsonslog.wordpress.com/2013/04/08/python-startup-time/ => "Wow, Python startup time is worse than I thought." "How to speed up python starting up and/or reduce file search while loading libraries?" https://stackoverflow.com/questions/15474160/how-to-speed-up-python-starting-up-and-or-reduce-file-search-while-loading-libra => "The first time I log to the system and start one command it takes 6 seconds just to show a few line of help. If I immediately issue the same command again it takes 0.1s. After a couple of minutes it gets back to 6s. (proof of short-lived cache)" "How does one optimise the startup of a Python script/program?" https://www.quora.com/How-does-one-optimise-the-startup-of-a-Python-script-program => "I wrote a Python program that would be used very often (imagine 'cd' or 'ls') for very short runtimes, how would I make it start up as fast as possible?" "Python Interpreter Startup time" https://bytes.com/topic/python/answers/34469-pyhton-interpreter-startup-time "Python is very slow to start on Windows 7" https://stackoverflow.com/questions/29997274/python-is-very-slow-to-start-on-windows-7 => "Python takes 17 times longer to load on my Windows 7 machine than Ubuntu 14.04 running on a VM" => "returns in 0.614s on Windows and 0.036s on Linux" "How to make a fast command line tool in Python" (old article Python 2.5.2) https://files.bemusement.org/talks/OSDC2008-FastPython/ => "(...) some techniques Bazaar uses to start quickly, such as lazy imports." -- So please continue efforts for make Python startup even faster to beat all other programming languages, and finally convince Mercurial to upgrade ;-) Victor From phd at phdru.name Wed Jul 19 09:22:11 2017 From: phd at phdru.name (Oleg Broytman) Date: Wed, 19 Jul 2017 15:22:11 +0200 Subject: [Python-Dev] Python startup time In-Reply-To: References: Message-ID: <20170719132211.GA30782@phdru.name> On Wed, Jul 19, 2017 at 02:59:52PM +0200, Victor Stinner wrote: > "Python is very slow to start on Windows 7" > https://stackoverflow.com/questions/29997274/python-is-very-slow-to-start-on-windows-7 However hard you are going to optimize Python you cannot fix those "defenders", "guards" and "protectors". :-) This particular link can be excluded from consideration. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From ncoghlan at gmail.com Wed Jul 19 10:05:50 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 20 Jul 2017 00:05:50 +1000 Subject: [Python-Dev] Python startup time In-Reply-To: References: Message-ID: On 19 July 2017 at 22:59, Victor Stinner wrote: > == CPython core developers don't care? no, they do care == > > Christian Heimes, Naoki INADA, Serhiy Storchaka, Yury Selivanov, me > (Victor Stinner) and other core developers made multiple changes last > years to reduce the number of imports at startup, optimize impotlib, > etc. I actually also care myself, since interpreter startup time feeds directly into cost of execution when running in environments like AWS Lambda which charge by the "gigabyte second" (i.e. 
you allocate a certain amount of RAM to a particular command, and then get charged for that RAM for the amount of time it takes to run, as measured with subsecond precision - if you exceed the limits of the free tier, anything you 're losing to language runtime startup in such an environment translates almost directly to higher costs). In aggregate, shaving time off CPython startup saves *scary* amounts of collective compute time around the world - even though most runtime environments don't track that as closely in financial terms as Lambda does, we're still nudging the power & cooling requirements of data centers slightly higher than they would otherwise be. So even when the per-invocation impact of a performance improvement is small, it's worth keeping in mind that CPython gets invoked a *lot*, whether it's to respond to a web request, run a test, run a build, deploy another application, analyse some data, etc :) However, I'm also of the view that module & API maintainers *do* have the authority to set the design priorities for the parts of the standard library that they're personally responsible for, and if we'd like them to change their minds based on information we have that they don't, then reopening enhancement requests that they already closed is *not* the way to go about it (as while the issue tracker is an excellent venue for figuring out the technical details of a change, or deciding whether or not an RFE is a good idea given a common understanding of the relevant design priorities, it's almost always a *terrible* venue for resolving outright disagreements as to what the most relevant design priorities actually are). Rather, the best available way to publicly request reconsideration is the way Antoine did when he escalated the namedtuple question to python-dev: by explicitly acknowledging that there's a conflict in design priorities between core developers, and asking for a collective discussion (and potentially a determination from Guido) as to the right way forward for the project as a whole. Cheers, Nick. P.S. I'll also note that we're not *actually* limited to resolving such conflicts in public venues (even though I think that's a good default habit for us to retain): as long as we report the outcome of any mutual agreements about design priorities back to the relevant public venue (e.g. a tracker issue), there's nothing wrong with shifting our attempts to better understand each other's perspectives to private email, IRC, video chat, etc. A non-trivial number of previously vociferous arguments have been resolved amicably once the main parties involved have had a chance to discuss them in person at a conference or sprint. 
It can even make sense to reach out to other core devs for help, since it's almost always easier for someone not caught in the midst of an argument to see both sides of it, and potentially spot a core of agreement amidst various surface level disagreements :) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Wed Jul 19 10:26:56 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 19 Jul 2017 16:26:56 +0200 Subject: [Python-Dev] Python startup time In-Reply-To: <20170719132211.GA30782@phdru.name> References: <20170719132211.GA30782@phdru.name> Message-ID: 2017-07-19 15:22 GMT+02:00 Oleg Broytman : > On Wed, Jul 19, 2017 at 02:59:52PM +0200, Victor Stinner wrote: >> "Python is very slow to start on Windows 7" >> https://stackoverflow.com/questions/29997274/python-is-very-slow-to-start-on-windows-7 > > However hard you are going to optimize Python you cannot fix those > "defenders", "guards" and "protectors". :-) This particular link can be > excluded from consideration. Sorry, I didn't read carefully each link I posted. Even for me knowing what Python does at startup, it's hard to explain why 3 people have different timing: 15 ms, 75 ms and 300 ms for example. In my experience, the following things impact Python startup: * -S option: loading or not the site module * Paths in sys.path: PYTHONPATH environment variable for example * .pth files files in sys.path * Python running in a virtual environment or not * Operating system: Python loads different modules at startup depending on the OS. Naoki INADA just removed _osx_support from being imported in the site module on macOS for example. My list is likely incomplete. In the performance benchmark suite, a controlled virtual environment is created to have a known set of modules. FYI running Python is a virtual environment is slower than "system" python which runs outside a virtual environment... Victor From solipsis at pitrou.net Wed Jul 19 11:13:08 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 19 Jul 2017 17:13:08 +0200 Subject: [Python-Dev] anecdotal data References: Message-ID: <20170719171308.0d50ee6a@fsol> On Wed, 19 Jul 2017 14:59:52 +0200 Victor Stinner wrote: > Hi, > > On Twitter, Raymond Hettinger wrote: > > "The decision making process on Python-dev is an anti-pattern, > governed by anecdotal data and ambiguity over what problem is solved." > > https://twitter.com/raymondh/status/887069454693158912 > > About "anecdotal data", I would like to discuss the Python startup time. And I would like to step back and examine the general criticism of "anecdotal data". Large software and hardware companies have the resources to conduct comprehensive surveys of how people use their products. For example, Intel might have accumulated millions of traces of critical production x86 code that they want to keep running efficiently (or even keep running at all). Apple might have thousands of third-party applications which they can simulate running on a newer version of whatever OS, core library or pieces of hardware those applications rely on. Even Google may nowadays have hundreds or thousands of critical services written in Go, and they may be able to assess the effect of further changes of the Go runtime on those services (not sure they do, but they would certainly have the resources to). CPython is a comparatively small, disorganized and volunteer-based community. It doesn't have the resources or organization required to lead such studies on a regular basis. 
Chances are it will never have. So all we can rely on is 1) our
respective individual experiences in the field 2) anecdotal data.

When we rewrote the Python 3 IO stack in C, we were relying on our
intuition that high-performance IO is important, and on anecdotal data
(micro-benchmarks) that the pure Python IO stack is slow. When Tim or
Raymond tweak the lookup function for dicts, they rely on anecdotal data
delivered by a few select micro-benchmarks, and their intuition that some
use cases need to be fast (for example dicts with string keys or keys made
up of consecutive integers). We don't have any hard data that all those
optimizations are necessary for the majority of Python applications. I
don't think anybody in the world has statistically sound data about the
entire body of Python code, or even a sufficiently large and relevant
subset thereof (such as "Python code used in production for critical
services").

We aren't scientists. We are engineers and have to make do with whatever
anecdotes we are aware of (be they from our own experiences, or users'
complaints). We can't just say "yes, there seems to be a performance
issue, but I'll wait until we have non-anecdotal data that it's
important". Because that day will probably never come, and in the
meantime our users will have fled elsewhere.

Regards

Antoine.

From solipsis at pitrou.net  Wed Jul 19 11:20:56 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 19 Jul 2017 17:20:56 +0200
Subject: [Python-Dev] [OT] Twitter echo chamber (Python startup time)
References:
Message-ID: <20170719172056.19e392af@fsol>

On Wed, 19 Jul 2017 14:59:52 +0200
Victor Stinner wrote:
> Hi,
>
> On Twitter, Raymond Hettinger wrote:
>
> "The decision making process on Python-dev is an anti-pattern,
> governed by anecdotal data and ambiguity over what problem is solved."
>
> https://twitter.com/raymondh/status/887069454693158912

Kind-of OT: while I understand (and have sometimes felt myself) the
desire to vent frustration about a decision one doesn't agree with,
there should be *at least* a link to the discussion alluded to so that
readers can make up their own minds. Otherwise, it feels to me like any
disagreement here may end up chastised on Twitter by some influential
figure of authority. That's not a pleasant place to be in.

Regards

Antoine.

From gvanrossum at gmail.com  Wed Jul 19 11:57:42 2017
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 19 Jul 2017 08:57:42 -0700
Subject: [Python-Dev] anecdotal data
In-Reply-To: <20170719171308.0d50ee6a@fsol>
References: <20170719171308.0d50ee6a@fsol>
Message-ID:

Exactly. This is how Python came to be in the first place. Benchmarks are
great, but don't underestimate creativity.

On Jul 19, 2017 8:15 AM, "Antoine Pitrou" wrote:

On Wed, 19 Jul 2017 14:59:52 +0200
Victor Stinner wrote:
> Hi,
>
> On Twitter, Raymond Hettinger wrote:
>
> "The decision making process on Python-dev is an anti-pattern,
> governed by anecdotal data and ambiguity over what problem is solved."
>
> https://twitter.com/raymondh/status/887069454693158912
>
> About "anecdotal data", I would like to discuss the Python startup time.

And I would like to step back and examine the general criticism of
"anecdotal data".

Large software and hardware companies have the resources to conduct
comprehensive surveys of how people use their products. For example,
Intel might have accumulated millions of traces of critical production
x86 code that they want to keep running efficiently (or even keep
running at all).
Apple might have thousands of third-party applications which they can simulate running on a newer version of whatever OS, core library or pieces of hardware those applications rely on. Even Google may nowadays have hundreds or thousands of critical services written in Go, and they may be able to assess the effect of further changes of the Go runtime on those services (not sure they do, but they would certainly have the resources to). CPython is a comparatively small, disorganized and volunteer-based community. It doesn't have the resources or organization required to lead such studies on a regular basis. Chances are it will never have. So all we can rely on is 1) our respective individual experiences in the field 2) anecdotal data. When we rewrote the Python 3 IO stack in C, we were relying on our intuition that high-performance IO is important, and on anecdotal data (micro-benchmarks) that the pure Python IO stack is slow. When Tim or Raymond tweak the lookup function for dicts, they rely on anecdotal data delivered by a few select micro-benchmarks, and their intuition that some use cases need to be fast (for example dicts with string keys or keys made up of consecutive integers). We don't have any hard data that all those optimizations are necessary for the majority of Python applications. I don't think anybody in the world has statistically sound data about the entire body of Python code, or even a sufficiently large and relevant subset thereof (such as "Python code used in production for critical services"). We aren't scientists. We are engineers and have to make with whatever anecdotes we are aware of (be they from our own experiences, or users' complaints). We can't just say "yes, there seems be a performance issue, but I'll wait until we have non-anecdotal data that it's important". Because that day will probably never come, and in the meantime our users will have fled elsewhere. Regards Antoine. _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/ guido%40python.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From larry at hastings.org Wed Jul 19 15:15:46 2017 From: larry at hastings.org (Larry Hastings) Date: Wed, 19 Jul 2017 12:15:46 -0700 Subject: [Python-Dev] Python startup time In-Reply-To: References: Message-ID: <12e547e7-de6f-2dca-d3fe-47b63e108a8b@hastings.org> On 07/19/2017 05:59 AM, Victor Stinner wrote: > Mercurial startup time is already 45.8x slower than Git whereas tested > Mercurial runs on Python 2.7.12. Now try to sell Python 3 to Mercurial > developers, with a startup time 2x - 3x slower... When Matt Mackall spoke at the Python Language Summit some years back, I recall that he specifically complained about Python startup time. He said Python 3 "didn't solve any problems for [them]"--they'd already solved their Unicode hygiene problems--and that Python's slow startup time was already a big problem for them. Python 3 being /even slower/ to start was absolutely one of the reasons why they didn't want to upgrade. You might think "what's a few milliseconds matter". But if you run hundreds of commands in a shell script it adds up. git's speed is one of the few bright spots in its UX, and hg's comparative slowness here is a palpable disadvantage. 
> So please continue efforts for make Python startup even faster to beat > all other programming languages, and finally convince Mercurial to > upgrade ;-) I believe Mercurial is, finally, slowly porting to Python 3. https://www.mercurial-scm.org/wiki/Python3 Nevertheless, I can't really be annoyed or upset at them moving slowly to adopt Python 3, as Matt's objections were entirely legitimate. Cheers, //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From benhoyt at gmail.com Wed Jul 19 15:26:47 2017 From: benhoyt at gmail.com (Ben Hoyt) Date: Wed, 19 Jul 2017 15:26:47 -0400 Subject: [Python-Dev] Python startup time In-Reply-To: <12e547e7-de6f-2dca-d3fe-47b63e108a8b@hastings.org> References: <12e547e7-de6f-2dca-d3fe-47b63e108a8b@hastings.org> Message-ID: Yes, agreed that startup time matters for scripting. I was talking to someone on the Google Cloud SDK (CLI) team recently, and they said startup time is a big deal for them ... it's especially problematic for shell tab completion helpers, because every time you press tab the shell has to load your Python program to do the completion. Even a couple dozen milliseconds is noticeable when you're typing quickly. -Ben On Wed, Jul 19, 2017 at 3:15 PM, Larry Hastings wrote: > > > On 07/19/2017 05:59 AM, Victor Stinner wrote: > > Mercurial startup time is already 45.8x slower than Git whereas tested > Mercurial runs on Python 2.7.12. Now try to sell Python 3 to Mercurial > developers, with a startup time 2x - 3x slower... > > > When Matt Mackall spoke at the Python Language Summit some years back, I > recall that he specifically complained about Python startup time. He said > Python 3 "didn't solve any problems for [them]"--they'd already solved > their Unicode hygiene problems--and that Python's slow startup time was > already a big problem for them. Python 3 being *even slower* to start > was absolutely one of the reasons why they didn't want to upgrade. > > You might think "what's a few milliseconds matter". But if you run > hundreds of commands in a shell script it adds up. git's speed is one of > the few bright spots in its UX, and hg's comparative slowness here is a > palpable disadvantage. > > > So please continue efforts for make Python startup even faster to beat > all other programming languages, and finally convince Mercurial to > upgrade ;-) > > > I believe Mercurial is, finally, slowly porting to Python 3. > > https://www.mercurial-scm.org/wiki/Python3 > > Nevertheless, I can't really be annoyed or upset at them moving slowly to > adopt Python 3, as Matt's objections were entirely legitimate. > > > Cheers, > > > */arry* > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > benhoyt%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Wed Jul 19 16:02:09 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 19 Jul 2017 16:02:09 -0400 Subject: [Python-Dev] Python most popular progamming language on github? Message-ID: https://blog.sourced.tech/post/language_migrations/ Waren Long analyzed several years of Github data for 22 top languages (excluding browser Javascript) with respect to language use and change of use, defined a 'centrality measure' based on the stationary distribution of a markov chain model of language switching. 
Time trend: Python rose from about 2002 to 2007, stayed flat until 2013,
then has risen since.

Conclusion: The Python sky is not falling; Python3 did not kill Python.
(This is not a call for complacency.)

The measure is *not* based on lines of code. The 4 after Python (Java, C,
C++, and PHP) have more lines on Github. Well, we all know that
non-cryptic conciseness is good ;-).

Java has at least doubled since 2007. Perhaps that is mostly portable
Android devices.

This analysis was stimulated (provoked?) by Erik Bernhardsson's analysis
of Google searches related to changing language. Go won in that. It seems
that people learn and use Python without asking Google so much.

--
Terry Jan Reedy

From solipsis at pitrou.net  Wed Jul 19 16:35:16 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 19 Jul 2017 22:35:16 +0200
Subject: [Python-Dev] Python startup time
References: <12e547e7-de6f-2dca-d3fe-47b63e108a8b@hastings.org>
Message-ID: <20170719223516.51db2c00@fsol>

On Wed, 19 Jul 2017 15:26:47 -0400
Ben Hoyt wrote:
> Yes, agreed that startup time matters for scripting. I was talking to
> someone on the Google Cloud SDK (CLI) team recently, and they said startup
> time is a big deal for them ... it's especially problematic for shell tab
> completion helpers, because every time you press tab the shell has to load
> your Python program to do the completion.

And also, for the same reason, for shell prompt additions such as
git-prompt. Mercurial had to write a C client (chg) to make this usable.

Regards

Antoine.

From chris.barker at noaa.gov  Wed Jul 19 19:11:24 2017
From: chris.barker at noaa.gov (Chris Barker)
Date: Wed, 19 Jul 2017 16:11:24 -0700
Subject: [Python-Dev] Python startup time
In-Reply-To: <20170719223516.51db2c00@fsol>
References: <12e547e7-de6f-2dca-d3fe-47b63e108a8b@hastings.org>
 <20170719223516.51db2c00@fsol>
Message-ID:

As long as we are talking anecdotes:

If it could save a person's life, could you find a way to save ten seconds
off the boot time? If there were five million people using the Mac, and it
took ten seconds extra to turn it on every day, that added up to three
hundred million or so hours per year people would save, which was the
equivalent of at least one hundred lifetimes saved per year.

Steve Jobs.

(http://stevejobsdailyquote.com/2014/03/26/boot-time/)

It really does depend on how/what users are using Python for. In general,
Python has been moving more and more toward a "systems development
language" from a "scripting language". Which may make us think "scripting"
issues like startup time don't matter -- but, of course, they matter a lot
to those use cases.

-CHB

On Wed, Jul 19, 2017 at 1:35 PM, Antoine Pitrou wrote:

> On Wed, 19 Jul 2017 15:26:47 -0400
> Ben Hoyt wrote:
> > Yes, agreed that startup time matters for scripting. I was talking to
> > someone on the Google Cloud SDK (CLI) team recently, and they said
> startup
> > time is a big deal for them ... it's especially problematic for shell tab
> > completion helpers, because every time you press tab the shell has to
> load
> > your Python program to do the completion.
>
> And also, for the same reason, for shell prompt additions such as
> git-prompt. Mercurial had to write a C client (chg) to make this
> usable.
>
> Regards
>
> Antoine.
> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > chris.barker%40noaa.gov > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Wed Jul 19 21:19:25 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 20 Jul 2017 11:19:25 +1000 Subject: [Python-Dev] Python startup time In-Reply-To: References: <12e547e7-de6f-2dca-d3fe-47b63e108a8b@hastings.org> <20170719223516.51db2c00@fsol> Message-ID: <20170720011924.GH3149@ando.pearwood.info> On Wed, Jul 19, 2017 at 04:11:24PM -0700, Chris Barker wrote: > As long as we are talking anecdotes: > > If it could save a person?s life, could you find a way to save ten seconds > off the boot time? If there were five million people using the Mac, and it > took ten seconds extra to turn it on every day, that added up to three > hundred million or so hours per year people would save, which was the > equivalent of at least one hundred lifetimes saved per year. > > Steve Jobs. And about a fifth of the time they spent standing in lines waiting to buy the latest unnecessary iGadget... But seriously, that calculation is completely bogus. Not only is Steve Job's arithmetic *completely* wrong, but the whole premise is nonsense. Do the maths yourself: ten seconds per day is 3650 seconds in a year, which is slightly over an hour (3600 seconds). Multiply by five million users, that's about five million hours, not 300 million. So Jobs exaggerates the time saved by a factor of sixty. (Or maybe Jobs was warning that Macs crash sixty times a day...) But the premise is wrong too. Those hypothetical people don't turn their Macs on in sequence, each person turning their computer on only after the previous person's Mac had finished booting. They effectively boot them up in parallel but offset, spread out over a 24 hour period, so about 3472 people booting up at the same time each minute of the day. Time savings for parallel processes don't add in the way Jobs adds them, if we treat this as 1440 parallel processes (one per minute of the day) we save 1440 hours a year. But really, the only meaningful calculation is the each person saves 10 seconds per day. We can't even meaningfully say they save one hour a year: it doesn't come nicely packaged up for you all at once, so you can actually do something useful with it, nor can you save those ten seconds from one day to the next. You only get one shot at using them. What can you do with ten seconds per day? By the time you decide what to do with the extra time, it's already gone. There are good reasons for speeding up boot time, but this sort of calculation is not one of them. I think it is in particularly bad taste to exaggerate the significance of it by putting it in terms of saving lives. You want to save real lives? How about fixing the conditions in the sweatshops that make Apple phones? And installing suicide nets around the building doesn't count. 
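For the record, here is that arithmetic spelled out (a quick sketch, using
the same round numbers as the quote):

seconds_per_user_per_year = 10 * 365                # 3650 s, just over an hour
hours_per_user_per_year = seconds_per_user_per_year / 3600.0
total_person_hours = 5 * 10**6 * hours_per_user_per_year
print(round(hours_per_user_per_year, 2))        # ~1.01 hours per user per year
print(round(total_person_hours))                # ~5.07 million person-hours, not 300 million
print(round(300 * 10**6 / total_person_hours))  # Jobs' figure is too high by a factor of ~59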
-- Steve From schesis at gmail.com Wed Jul 19 22:40:47 2017 From: schesis at gmail.com (Zero Piraeus) Date: Wed, 19 Jul 2017 22:40:47 -0400 Subject: [Python-Dev] Python startup time In-Reply-To: <20170720011924.GH3149@ando.pearwood.info> References: <12e547e7-de6f-2dca-d3fe-47b63e108a8b@hastings.org> <20170719223516.51db2c00@fsol> <20170720011924.GH3149@ando.pearwood.info> Message-ID: : On 19 July 2017 at 21:19, Steven D'Aprano wrote: > But the premise is wrong too. Those hypothetical people don't turn their > Macs on in sequence, each person turning their computer on only after > the previous person's Mac had finished booting. They effectively boot > them up in parallel but offset, spread out over a 24 hour period, so > about 3472 people booting up at the same time each minute of the day. > Time savings for parallel processes don't add in the way Jobs adds them, > if we treat this as 1440 parallel processes (one per minute of the day) > we save 1440 hours a year. Ah, but the relevant unit here is person-hours, not hours: Jobs is claiming that *each* Mac user loses X% of *their* life to boot times, and then adds all those slices of life together into N lifetimes (which again, are counted in person-years, not years). It's still wrong, though: longer boot times actually increase the proportion of your life spent in meaningful activity (e.g. going to the canteen and talking to someone). -[]z. From ncoghlan at gmail.com Wed Jul 19 23:42:11 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 20 Jul 2017 13:42:11 +1000 Subject: [Python-Dev] Python most popular progamming language on github? In-Reply-To: References: Message-ID: On 20 July 2017 at 06:02, Terry Reedy wrote: > https://blog.sourced.tech/post/language_migrations/ > Waren Long analyzed several years of Github data for 22 top languages > (excluding browser Javascript) with respect to language use and change of > use, defined a 'centrality measure' based on the stationary distribution of > a markov chain model of language switching. > > Time trend: Python rose from about 2002 to 2007, stayed flat until 2013, > then has risen since. > > Conclusion: The Python sky is not falling; Python3 did not kill Python. > (This is not a call for complacency.) Folks may also be interested in this year's IEEE Spectrum language popularity analysis, which slots Python in at number 1 for the first time: http://spectrum.ieee.org/computing/software/the-2017-top-programming-languages Folks involved in the Python community (whether as educators, advocates, event organisers, developers, or otherwise) should take a lot of pride in that outcome, since making the reference interpreter available for use is only step one in the process of enabling real world adoption :) Cheers, Nick. P.S. One of the nice things about the IEEE analysis is that they list all of their data sources and the relative weight they ascribe to each one in determining their overall summary rankings, and then provide the ability to customise the weightings to come up with your own ranking. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tjreedy at udel.edu Thu Jul 20 02:20:39 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 20 Jul 2017 02:20:39 -0400 Subject: [Python-Dev] Python startup time In-Reply-To: References: Message-ID: On 7/19/2017 10:05 AM, Nick Coghlan wrote: > P.S. 
I'll also note that we're not *actually* limited to resolving
> such conflicts in public venues (even though I think that's a good
> default habit for us to retain): as long as we report the outcome of
> any mutual agreements about design priorities back to the relevant
> public venue (e.g. a tracker issue), there's nothing wrong with
> shifting our attempts to better understand each other's perspectives
> to private email, IRC, video chat, etc.

I expect and hope that there will be discussion of this issue at the core
developer sprint in September, with summary reports back here on
python-dev.

> It can even make sense to reach out to other
> core devs for help, since it's almost always easier for someone not
> caught in the midst of an argument to see both sides of it, and
> potentially spot a core of agreement amidst various surface level
> disagreements :)

I always understood the Python development process, both for core and
users, to be "Make it right; then make it faster", with the second clause
conditioned on 'while keeping it right' and maybe, especially for core
development, 'if significantly slow'. (People can rightly work on the
speed of personal code for other reasons.) I believe we pretty much agree
on the principles. The disagreement seems to be on whether a particular
case is 'significantly slow'.

I believe that the burden of proof is with those who propose a change.
The burden of proof depends on the final qualification: 'without adding
unnecessary or extreme complexity'. If there is no added complication,
the burden is slight. If not, we will likely disagree about complexity
and its tradeoff with speed.

About 'keeping it right': It has been mentioned that more complicated code
*generally* makes it harder to 'see' that the code is (basically) correct.
The second line of defense is the automated test suite. I think, for
instance, that someone interested in changing namedtuple (to a faster and
presumably more complicated implementation) should check the coverage of
the current code, with branches checked both ways. Then, bring the
coverage up to 100% if it is not already, and carefully check the tests
for possible missing cases. A small static set of test cases cannot cover
everything.

The third test of an implementation is accumulated user experience. A new
implementation starts at 0. One way to increase that is to test the
implementation with 3rd-party code. Another, I think, is through
randomized testing.

Proposal 1: Depending on our confidence in a new implementation, simulate
user experience with randomized tests, perhaps running for hours.
Example: we develop a random (unicode) identifier generator that starts
with any of the legal initial codepoints and continues with a random
number of legal follow codepoints. Then test the old and new namedtuple
with a random class name and a random number of random field names (a
rough sketch follows below, after the notes). A developer could also use
third-party packages, like hypothesis. Code and a summary could be
uploaded to bpo. A summary could even go in the code file.

Note 1: Tim Peters did something like this when developing timsort. He
provided a nice summary of test cases and time results.

Note 2: Randomized tests require that either a) randomized inputs are
verified by property or predicate, rather than by hard-coded values, or
b) inputs are generated from outputs, where either the output or the
inverse generation is randomized. Tests of sorting can use either
is_sorted(list(sorted(random_input))) or
list(sorted(random_shuffle(output))) == output.
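A rough sketch of what such a randomized namedtuple test could look like
(illustrative only: it draws ASCII-only identifiers for brevity, where a
fuller version would use all legal identifier codepoints, and the property
checks are examples rather than actual test-suite code):

import random
import string
from collections import namedtuple
from keyword import iskeyword

FIRST = string.ascii_letters
FOLLOW = string.ascii_letters + string.digits + "_"

def random_identifier(rng, max_follow=12):
    # ASCII-only for brevity; a fuller generator would also draw
    # non-ASCII identifier codepoints.
    name = rng.choice(FIRST) + "".join(
        rng.choice(FOLLOW) for _ in range(rng.randrange(max_follow)))
    return name + "_" if iskeyword(name) else name

def check_random_namedtuple(seed):
    rng = random.Random(seed)
    wanted = rng.randrange(1, 20)
    fields = []
    while len(fields) < wanted:
        name = random_identifier(rng)
        if name not in fields:
            fields.append(name)
    cls = namedtuple(random_identifier(rng), fields)
    values = [rng.random() for _ in fields]
    inst = cls(*values)
    # Verify by property/predicate rather than by hard-coded values
    # (option a in Note 2).
    assert list(inst) == values
    assert [getattr(inst, f) for f in fields] == values
    assert list(inst._asdict()) == fields

for seed in range(1000):
    check_random_namedtuple(seed)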
Proposal 2: Add randomized tests here and there in the test suite. Each randomized test x 30 buildbots x 2 runs/day x 365 days/year is about 22000 random inputs a year. Since each buildbot would be running a slightly different test, we need to act on and not ignore sporadic failures. Victor Stinner's buildbot work is making this feasible. -- Terry Jan Reedy -- Terry Jan Reedy From victor.stinner at gmail.com Thu Jul 20 04:56:43 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 20 Jul 2017 10:56:43 +0200 Subject: [Python-Dev] Python startup time In-Reply-To: References: Message-ID: Hi, I applied the patch above to count the number of times that Python is run. Running the Python test suite with "./python -m test -j0 -rW" runs Python 2,256 times. Honestly, I expected more. I'm running tests with Python compiled in debug mode. And in debug mode, Python startup time is much worse: haypo at selma$ python3 -m perf command --inherit=PYTHONPATH -v -- ./python -c pass command: Mean +- std dev: 46.4 ms +- 2.3 ms FYI I'm using gcc -O0 rather than -Og to make compilation even faster. Victor diff --git a/Lib/site.py b/Lib/site.py index 7dc1b04..4b0c167 100644 --- a/Lib/site.py +++ b/Lib/site.py @@ -540,6 +540,21 @@ def execusercustomize(): (err.__class__.__name__, err)) +def run_counter(): + import fcntl + + fd = os.open("/home/haypo/prog/python/master/run_counter", + os.O_WRONLY | os.O_CREAT | os.O_APPEND) + try: + fcntl.flock(fd, fcntl.LOCK_EX) + try: + os.write(fd, b'\x01') + finally: + fcntl.flock(fd, fcntl.LOCK_UN) + finally: + os.close(fd) + + def main(): """Add standard site-specific directories to the module search path. @@ -568,6 +583,7 @@ def main(): execsitecustomize() if ENABLE_USER_SITE: execusercustomize() + run_counter() # Prevent extending of sys.path when python was started with -S and # site is imported later. From levkivskyi at gmail.com Thu Jul 20 07:24:32 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Thu, 20 Jul 2017 13:24:32 +0200 Subject: [Python-Dev] Python startup time In-Reply-To: References: Message-ID: I agree the start-up time is important. There is something that is related. ABCMeta is currently implemented in Python. This makes it slow, creation of an ABC is 2x slower than creation of a normal class. However, ABCs are used by many medium and large size projects. Also, both abc and _collections_abc are imported at start-up (in particular importlib uses several ABCs, os also needs them for environments). Finally, all generics in typing module and user-defined generic types are ABCs (to allow interoperability with collections.abc). My idea is to re-implement ABCMeta (and ingredients it depends on, like WeakSet) in C. I didn't find such proposal on b.p.o., I have two questions: * Are there some potential problems with this idea (except that it may take some time and effort)? * Is it something worth doing as an optimization? (If answers are no and yes, then maybe I would spend part of my vacation in August on it.) -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Thu Jul 20 08:29:18 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Thu, 20 Jul 2017 21:29:18 +0900 Subject: [Python-Dev] Python startup time In-Reply-To: References: Message-ID: Hi, Ivan. First of all, Yes, please do it! On Thu, Jul 20, 2017 at 8:24 PM, Ivan Levkivskyi wrote: > I agree the start-up time is important. There is something that is related. > ABCMeta is currently implemented in Python. 
> This makes it slow, creation of an ABC is 2x slower than creation of a
> normal class.

Additionally, ABCs spread by inheritance: when people use a mix-in
provided by collections.abc, the class becomes an ABC even if it is a
concrete class. There is no documented/recommended way to inherit from an
ABC without using ABCMeta.

> However, ABCs are used by many medium and large size projects.

Many people with a background in other languages use ABCs like Java's
interfaces or abstract classes. So it may be worth having just an
"Abstract" marker, without the full ABC machinery. See
https://mail.python.org/pipermail/python-ideas/2017-July/046495.html

> Also, both abc and _collections_abc are imported at start-up (in particular
> importlib uses several ABCs, os also needs them for environments).
> Finally, all generics in typing module and user-defined generic types are
> ABCs (to allow interoperability with collections.abc).
>

Yes. Even if site.py doesn't use typing, many applications and libraries
will start using typing. And it's much slower than collections.abc.

> My idea is to re-implement ABCMeta (and ingredients it depends on, like
> WeakSet) in C.
> I didn't find such proposal on b.p.o., I have two questions:
> * Are there some potential problems with this idea (except that it may take
> some time and effort)?

WeakSet needs special care. Maybe ABCMeta can be optimized first.

Currently, ABCMeta uses three WeakSets, but their creation can be delayed
until `register` or `issubclass` is called. So even if WeakSet stays
implemented in Python, I think ABCMeta can be made much faster.

> * Is it something worth doing as an optimization?
> (If answers are no and yes, then maybe I would spend part of my vacation in
> August on it.)
>
> --
> Ivan
>

Bests,

From solipsis at pitrou.net  Thu Jul 20 08:56:37 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 20 Jul 2017 14:56:37 +0200
Subject: [Python-Dev] Python startup time
References:
Message-ID: <20170720145637.3afe5133@fsol>

On Thu, 20 Jul 2017 21:29:18 +0900
INADA Naoki wrote:
>
> WeakSet needs special care. Maybe ABCMeta can be optimized first.
>
> Currently, ABCMeta uses three WeakSets, but their creation can be delayed
> until `register` or `issubclass` is called. So even if WeakSet stays
> implemented in Python, I think ABCMeta can be made much faster.

Simple uses of WeakSet can probably be replaced with regular sets +
weakref callbacks. As long as you are not doing one of the delicate
things (such as iterate), it should be fine.

Regards

Antoine.

From stefan_ml at behnel.de  Thu Jul 20 09:32:48 2017
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 20 Jul 2017 15:32:48 +0200
Subject: [Python-Dev] Python startup time
In-Reply-To:
References:
Message-ID:

Ivan Levkivskyi wrote on 20.07.2017 at 13:24:
> I agree the start-up time is important. There is something that is related.
> ABCMeta is currently implemented in Python.
> This makes it slow, creation of an ABC is 2x slower than creation of a
> normal class.
> However, ABCs are used by many medium and large size projects.
> Also, both abc and _collections_abc are imported at start-up (in particular
> importlib uses several ABCs, os also needs them for environments).
> Finally, all generics in typing module and user-defined generic types are
> ABCs (to allow interoperability with collections.abc).
>
> My idea is to re-implement ABCMeta (and ingredients it depends on, like
> WeakSet) in C.
I know that this hasn't really been an accepted option so far (and it's
actually not an option for a few really early modules during startup), but
compiling a Python module with Cython will usually speed it up quite
noticeably (often 10-30%, sometimes more if you're lucky, e.g. [1]). And
that also applies to the startup time, simply because it's pre-compiled.

So, before considering to write an accelerator module in C that replaces
some existing Python module, and thus duplicating its entire source code
with highly increased complexity, I'd like to remind you that simply
compiling the Python module itself to C should give at least reasonable
speed-ups *without* adding to the maintenance burden, and can be done
optionally as part of the build process. We do that for Cython itself
during its installation, for example.

Stefan (Cython core developer)

[1] 3x faster URL routing by compiling a single Django module with Cython:
https://us.pycon.org/2017/schedule/presentation/693/

From ncoghlan at gmail.com  Thu Jul 20 10:02:09 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 21 Jul 2017 00:02:09 +1000
Subject: [Python-Dev] Python startup time
In-Reply-To:
References:
Message-ID:

On 20 July 2017 at 23:32, Stefan Behnel wrote:
> So, before considering to write an accelerator module in C that replaces
> some existing Python module, and thus duplicating its entire source code
> with highly increased complexity, I'd like to remind you that simply
> compiling the Python module itself to C should give at least reasonable
> speed-ups *without* adding to the maintenance burden, and can be done
> optionally as part of the build process. We do that for Cython itself
> during its installation, for example.

And if folks are concerned about the potential bootstrapping issues with
this approach, the gist is that it would have to look something like this:

Phase 0: freeze importlib
- build a CPython with only builtin and frozen module support
- use it to freeze importlib

Phase 1: traditional CPython
- build the traditional Python interpreter with no Cython accelerated modules

Phase 2: accelerated CPython
- if not otherwise available, use the traditional Python interpreter to
download & install Cython in a virtual environment
- run Cython to selectively precompile key modules (such as those
implicitly imported at startup)

Technically, phase 2 doesn't actually *change* CPython itself, since the
import system is already set up such that if an extension module and a
source module are side-by-side in the same directory, then the extension
module will take precedence.

As a result, precompiling with Cython is similar in many ways to
precompiling to bytecode, it's just that the result is native machine code
with Python C API calls, rather than CPython bytecode.

Cheers,
Nick.
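P.S. A rough way to see that precedence from a regular interpreter session
(just an illustration; the exact suffixes vary by platform and build):

import importlib.machinery as machinery
# The default path finder tries loaders in this order: extension modules
# first, then pure Python source, then plain bytecode. That ordering is
# what lets a compiled mod.<tag>.so shadow a mod.py in the same directory.
print(machinery.EXTENSION_SUFFIXES)  # e.g. ['.cpython-36m-x86_64-linux-gnu.so', '.abi3.so', '.so']
print(machinery.SOURCE_SUFFIXES)     # ['.py']
print(machinery.BYTECODE_SUFFIXES)   # ['.pyc']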
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From cesare.di.mauro at gmail.com Thu Jul 20 13:09:38 2017 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Thu, 20 Jul 2017 19:09:38 +0200 Subject: [Python-Dev] Python startup time In-Reply-To: References: <20170719132211.GA30782@phdru.name> Message-ID: 2017-07-19 16:26 GMT+02:00 Victor Stinner : > 2017-07-19 15:22 GMT+02:00 Oleg Broytman : > > On Wed, Jul 19, 2017 at 02:59:52PM +0200, Victor Stinner < > victor.stinner at gmail.com> wrote: > >> "Python is very slow to start on Windows 7" > >> https://stackoverflow.com/questions/29997274/python-is- > very-slow-to-start-on-windows-7 > > > > However hard you are going to optimize Python you cannot fix those > > "defenders", "guards" and "protectors". :-) This particular link can be > > excluded from consideration. > > Sorry, I didn't read carefully each link I posted. Even for me knowing > what Python does at startup, it's hard to explain why 3 people have > different timing: 15 ms, 75 ms and 300 ms for example. In my > experience, the following things impact Python startup: > > * -S option: loading or not the site module > * Paths in sys.path: PYTHONPATH environment variable for example > * .pth files files in sys.path > * Python running in a virtual environment or not > * Operating system: Python loads different modules at startup > depending on the OS. Naoki INADA just removed _osx_support from being > imported in the site module on macOS for example. > > My list is likely incomplete. > > In the performance benchmark suite, a controlled virtual environment > is created to have a known set of modules. FYI running Python is a > virtual environment is slower than "system" python which runs outside > a virtual environment... > > Victor > > Hi Victor, I assume that Python loads compiled (.pyc and/or .pyo) from the stdlib. That's something that also influences the startup time (compiling source vs loading pre-compiled modules). Bests, Cesare Mail priva di virus. www.avast.com <#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Thu Jul 20 13:23:28 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 20 Jul 2017 19:23:28 +0200 Subject: [Python-Dev] Python startup time In-Reply-To: References: <20170719132211.GA30782@phdru.name> Message-ID: 2017-07-20 19:09 GMT+02:00 Cesare Di Mauro : > I assume that Python loads compiled (.pyc and/or .pyo) from the stdlib. That's something that also influences the startup time (compiling source vs loading pre-compiled modules). My benchmark was "python3 -m perf command -- python3 -c pass": I don't explicitly remove .pyc files, I expect that Python uses prebuilt .pyc files from __pycache__. Victor From jimjjewett at gmail.com Thu Jul 20 13:53:52 2017 From: jimjjewett at gmail.com (Jim J. Jewett) Date: Thu, 20 Jul 2017 13:53:52 -0400 Subject: [Python-Dev] startup time repeated? why not daemon Message-ID: I agree that startup time is a problem, but I wonder if some of the pain could be mitigated by using a persistent process. 
For example, in https://mail.python.org/pipermail/python-dev/2017-July/148664.html Ben Hoyt mentions that the Google Cloud SDK (CLI) team has found it "especially problematic for shell tab completion helpers, because every time you press tab the shell has to load your Python program" Decades ago, I learned to set my editor to vi instead of emacs for similar reasons -- but there was also an emacsclient option that simply opened a new window from an already running emacs process. tab completion seems like the exactly the sort of thing that should be sent to an existing process instead of creating a new one. Is it too hard to create a daemon server? Is the communication and context switch slower than a new startup? Is the pattern just not well-enough advertised? -jJ -------------- next part -------------- An HTML attachment was scrubbed... URL: From phd at phdru.name Thu Jul 20 14:15:35 2017 From: phd at phdru.name (Oleg Broytman) Date: Thu, 20 Jul 2017 20:15:35 +0200 Subject: [Python-Dev] startup time repeated? why not daemon In-Reply-To: References: Message-ID: <20170720181535.GA5957@phdru.name> On Thu, Jul 20, 2017 at 01:53:52PM -0400, "Jim J. Jewett" wrote: > I agree that startup time is a problem, but I wonder if some of the pain > could be mitigated by using a persistent process. > > For example, in > https://mail.python.org/pipermail/python-dev/2017-July/148664.html Ben Hoyt > mentions that the Google Cloud SDK (CLI) team has found it "especially > problematic for shell tab completion helpers, because every time you press > tab the shell has to load your Python program" > > Decades ago, I learned to set my editor to vi instead of emacs for similar > reasons -- but there was also an emacsclient option that simply opened a > new window from an already running emacs process. tab completion seems > like the exactly the sort of thing that should be sent to an existing > process instead of creating a new one. > > Is it too hard to create a daemon server? > Is the communication and context switch slower than a new startup? > Is the pattern just not well-enough advertised? Just yesterday there was a link to such a daemon that caches pyGTK. Eons ago I'd been using ReadyExec: http://readyexec.sourceforge.net/ > -jJ Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From cesare.di.mauro at gmail.com Thu Jul 20 15:38:23 2017 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Thu, 20 Jul 2017 21:38:23 +0200 Subject: [Python-Dev] Python startup time In-Reply-To: References: <20170719132211.GA30782@phdru.name> Message-ID: 2017-07-20 19:23 GMT+02:00 Victor Stinner : > 2017-07-20 19:09 GMT+02:00 Cesare Di Mauro : > > I assume that Python loads compiled (.pyc and/or .pyo) from the stdlib. > That's something that also influences the startup time (compiling source vs > loading pre-compiled modules). > > My benchmark was "python3 -m perf command -- python3 -c pass": I don't > explicitly remove .pyc files, I expect that Python uses prebuilt .pyc > files from __pycache__. > > Victor > OK, that should be the best case. An idea to improve the situation might be to find an alternative structure for .pyc/pyo files, which allows to (partially) "parallelize" their loading (not execution, of course), or at least speed-up the process. Maybe a GSoC project for some student, if no core dev has time to investigate it. Cesare Mail priva di virus. 
www.avast.com <#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From p.f.moore at gmail.com  Thu Jul 20 16:47:30 2017
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 20 Jul 2017 21:47:30 +0100
Subject: [Python-Dev] startup time repeated? why not daemon
In-Reply-To:
References:
Message-ID:

On 20 July 2017 at 18:53, Jim J. Jewett wrote:
> Is it too hard to create a daemon server?
> Is the communication and context switch slower than a new startup?
> Is the pattern just not well-enough advertised?

Managing a daemon (including things like stopping it when it's been idle
for "too long") is hard to get right, and even more so when it needs to be
cross-platform. That's not always a problem, but probably is enough of the
time to make "use a daemon" a somewhat specialist solution.

Paul

From ericsnowcurrently at gmail.com  Thu Jul 20 17:16:10 2017
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 20 Jul 2017 15:16:10 -0600
Subject: [Python-Dev] startup time repeated? why not daemon
In-Reply-To:
References:
Message-ID:

On Thu, Jul 20, 2017 at 11:53 AM, Jim J. Jewett wrote:
> I agree that startup time is a problem, but I wonder if some of the pain
> could be mitigated by using a persistent process.
>
> [snip]
>
> Is it too hard to create a daemon server?
> Is the communication and context switch slower than a new startup?
> Is the pattern just not well-enough advertised?

A couple years ago I suggested the same idea (i.e. "pythond") during a
conversation with MAL at PyCon UK.  IIRC, security and complexity were
the two major obstacles.  Assuming you use fork, you must ensure that
the daemon gets into just the right state.  Otherwise you're leaking
(potentially sensitive) info into the forked processes or you're wasting
cycles/memory.

Relatedly, at PyCon this year Barry and I were talking about the idea of
bootstrapping the interpreter from a memory snapshot on disk, rather
than from scratch (thus drastically reducing the number of IO events).
From what I gather, emacs does (or did) something like this.

The key thing for both solutions is getting the CPython runtime in a
very specific state.  Any such solution needs to get as much of the
runtime ready as possible, but only as much as is common to "most"
possible "python" invocations.  Furthermore, it has to be extremely
careful about security, e.g. protecting sensitive data and not
escalating privileges.  Having a python daemon that runs as root is
probably out of the question for now, meaning each user would have to
run their own daemon, paying for startup the first time they run
"python".

Aside from security concerns, there are parts of the CPython runtime
that depend on CLI flags and environment variables during startup.
Each "python" invocation must respect those inputs, as happens now,
rather than preserving the inputs from when the daemon was started.

FWIW, the startup-related code we landed in May (at PyCon), as a
precursor for Nick Coghlan's PEP 432, improves the technical situation
somewhat by more clearly organizing startup of CPython's runtime (and
the main interpreter).  Also, as part of my (slowly progressing)
multi-core Python project, I'm currently working on consolidating the
(CPython) global runtime state into an explicit struct.  This will help
us reason better about the state of the runtime, allowing us to be more
confident about (and more able to implement) solutions for
isolating/protecting/optimizing the CPython runtime.
These efforts have been all about improving the understandability of CPython's runtime through more concise encapsulation. The overarching goals have been: reducing out maintenance burden, lowering the cost of enhancement, improving the embedding story, and even enabling better runtime portability (e.g. across threads, processes, and even hosts). There is a direct correlation there with better opportunities to improve startup time, including a python daemon. -eric From njs at pobox.com Thu Jul 20 20:19:19 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 20 Jul 2017 17:19:19 -0700 Subject: [Python-Dev] startup time repeated? why not daemon In-Reply-To: References: Message-ID: On Jul 20, 2017 14:18, "Eric Snow" wrote: On Thu, Jul 20, 2017 at 11:53 AM, Jim J. Jewett wrote: > I agree that startup time is a problem, but I wonder if some of the pain > could be mitigated by using a persistent process. > > [snip] > > Is it too hard to create a daemon server? > Is the communication and context switch slower than a new startup? > Is the pattern just not well-enough advertised? A couple years ago I suggested the same idea (i.e. "pythond") during a conversation with MAL at PyCon UK. IIRC, security and complexity were the two major obstacles. Assuming you use fork, you must ensure that the daemon gets into just the right state. Otherwise you're leaking (potentially sensitive) info into the forked processes or you're wasting cycles/memory. Relatedly, at PyCon this year Barry and I were talking about the idea of bootstrapping the interpreter from a memory snapshot on disk, rather than from scatch (thus drastically reducing the number of IO events). From what I gather, emacs does (or did) something like this. There's a fair amount of prior art for both of these. The prestart/daemon approach is apparently somewhat popular in the Java world, because the jvm is super slow to start up. E.g.: https://github.com/ninjudd/drip The interesting thing there is probably their README's comparison of their strategy to previous attempts at the same thing. (They've explicitly moved away from a persistent daemon approach.) The emacs memory dump approach is *really* challenging. They've been struggling to move away from it for years. Have you ever wondered why jemalloc is so much better than the default glibc malloc on Linux? Apparently it's because for many years it was impossible to improve glibc malloc's internal memory layout because it would break emacs. I'm not sure either of these make much sense when python startup is already in the single digit milliseconds. While it's certainly great if we can lower that further, my impression is that for any real application, startup time is overwhelmingly spent importing user packages, not in the interpreter start up itself. And this is difficult to optimize with a daemon or memory dump, because you need a full list of modules to preload and it'll differ between programs. This suggests that optimizations to finding/loading/executing modules are likely to give the biggest startup time wins. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Thu Jul 20 20:49:31 2017 From: greg at krypto.org (Gregory P. Smith) Date: Fri, 21 Jul 2017 00:49:31 +0000 Subject: [Python-Dev] startup time repeated? why not daemon In-Reply-To: References: Message-ID: On Thu, Jul 20, 2017 at 10:56 AM Jim J. 
Jewett wrote: > I agree that startup time is a problem, but I wonder if some of the pain > could be mitigated by using a persistent process. > This is one strategy that works under some situations, but not all. There are downsides to daemons: * They only work on one machine. New instances being launched in a cloud (think kubernetes jobs, app engine workers, etc) cannot benefit. * A daemon that forks off new workers can lose the benefit of hash randomization as tons of processes at once share the same seed. Mitigation for this is possible by regularly relaunching new replacement daemons but that complicates the already complicated. * Correctly launching and managing a daemon process is hard. Even once you have done so, you now have a interprocess concurrency and synchronization issues. For example, in > https://mail.python.org/pipermail/python-dev/2017-July/148664.html Ben > Hoyt mentions that the Google Cloud SDK (CLI) team has found it "especially > problematic for shell tab completion helpers, because every time you press > tab the shell has to load your Python program" > I can imagine a daemon working well in this specific example. Is it too hard to create a daemon server? > That is my take on it. Is the communication and context switch slower than a new startup? > Is the pattern just not well-enough advertised? > I have experienced good daemon processes. Bazel (a Java based build system) uses that approach. I can imagine Mercurial being able to do so as well but have no idea if they've looked into it or not. Daemons are by their nature an application specific thing. -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Jul 20 22:44:38 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 21 Jul 2017 12:44:38 +1000 Subject: [Python-Dev] Python startup time In-Reply-To: References: <20170719132211.GA30782@phdru.name> Message-ID: On 21 July 2017 at 05:38, Cesare Di Mauro wrote: > > > 2017-07-20 19:23 GMT+02:00 Victor Stinner : > >> 2017-07-20 19:09 GMT+02:00 Cesare Di Mauro : >> > I assume that Python loads compiled (.pyc and/or .pyo) from the stdlib. >> That's something that also influences the startup time (compiling source vs >> loading pre-compiled modules). >> >> My benchmark was "python3 -m perf command -- python3 -c pass": I don't >> explicitly remove .pyc files, I expect that Python uses prebuilt .pyc >> files from __pycache__. >> >> Victor >> > > OK, that should be the best case. > > An idea to improve the situation might be to find an alternative structure > for .pyc/pyo files, which allows to (partially) "parallelize" their loading > (not execution, of course), or at least speed-up the process. Maybe a GSoC > project for some student, if no core dev has time to investigate it. > Unmarshalling the code object from disk generally isn't the slow part - it's the module level execution that takes time. Using the typing module as an example, a full reload cycle takes almost 10 milliseconds: $ python3 -m perf timeit -s "import typing; from importlib import reload" "reload(typing)" ..................... 
Mean +- std dev: 9.89 ms +- 0.46 ms (Don't try timing "import typing" directly - the sys.modules cache amortises the cost down to being measured in nanoseconds, since you're effectively just measuring the speed of a dict lookup) We can separately measure the cost of unmarshalling the code object: $ python3 -m perf timeit -s "import typing; from marshal import loads; from importlib.util import cache_from_source; cache = cache_from_source(typing.__file__); data = open(cache, 'rb').read()[12:]" "loads(data)" ..................... Mean +- std dev: 286 us +- 4 us Finding the module spec: $ python3 -m perf timeit -s "from importlib.util import find_spec" "find_spec('typing')" ..................... Mean +- std dev: 69.2 us +- 2.3 us And actually running the module's code (this includes unmarshalling the code object, but *not* calculating the import spec): $ python3 -m perf timeit -s "import typing; loader_exec = typing.__spec__.loader.exec_module" "loader_exec(typing)" ..................... Mean +- std dev: 9.68 ms +- 0.43 ms Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Jul 20 22:52:11 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 21 Jul 2017 12:52:11 +1000 Subject: [Python-Dev] Python startup time In-Reply-To: References: <20170719132211.GA30782@phdru.name> Message-ID: On 21 July 2017 at 12:44, Nick Coghlan wrote: > We can separately measure the cost of unmarshalling the code object: > > $ python3 -m perf timeit -s "import typing; from marshal import loads; from > importlib.util import cache_from_source; cache = > cache_from_source(typing.__file__); data = open(cache, 'rb').read()[12:]" > "loads(data)" > ..................... > Mean +- std dev: 286 us +- 4 us Slight adjustment here, as the cost of locating the cached bytecode and reading it from disk should really be accounted for in each iteration: $ python3 -m perf timeit -s "import typing; from marshal import loads; from importlib.util import cache_from_source" "cache = cache_from_source(typing.__spec__.origin); data = open(cache, 'rb').read()[12:]; loads(data)" ..................... Mean +- std dev: 337 us +- 8 us That will have a bigger impact when loading from spinning disk or a network drive, but it's fairly negligible when loading from a local SSD or an already primed filesystem cache. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Jul 20 23:49:53 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 21 Jul 2017 13:49:53 +1000 Subject: [Python-Dev] startup time repeated? why not daemon In-Reply-To: References: Message-ID: On 21 July 2017 at 10:19, Nathaniel Smith wrote: > I'm not sure either of these make much sense when python startup is already > in the single digit milliseconds. While it's certainly great if we can lower > that further, my impression is that for any real application, startup time > is overwhelmingly spent importing user packages, not in the interpreter > start up itself. And this is difficult to optimize with a daemon or memory > dump, because you need a full list of modules to preload and it'll differ > between programs. > > This suggests that optimizations to finding/loading/executing modules are > likely to give the biggest startup time wins. 
Agreed, and this is where both lazy loading and Cython precompilation are genuinely interesting: * Cython precompilation can have a significant impact on startup time, as it replaces module level code execution at import time with a combination of Cython translation to C code at build time and Python C API calls at import time * Lazy loading can have a significant impact on startup time, as it means you don't have to pay for the cost of finding and loading modules that you don't actually end up using on that particular run We've historically resisted adopting these techniques for the standard library because they *do* make things more complicated *and* harder to debug relative to plain old eagerly imported dynamic Python code. However, if we're going to recommend them as good practices for 3rd party developers looking to optimise the startup time of their Python applications, then it makes sense for us to embrace them for the standard library as well, rather than having our first reaction be to write more hand-crafted C code. On that last point, it's also worth keeping in mind that we have a much harder time finding new C-level contributors than we do new Python-level ones, and have every reason to expect that problem to get worse over time rather than better (since writing and maintaining handcrafted C code is likely to go the way of writing and maintaining handcrafted assembly code as a skillset: while it will still be genuinely necessary in some contexts, it will also be an increasingly niche technical specialty). Starting to migrate to using Cython for our acceleration modules instead of plain C should thus prove to be a win for everyone: - Cython structurally avoids a lot of typical bugs that arise in hand-coded extensions (e.g. refcount bugs) - by design, it's much easier to mentally switch between Python & Cython than it is between Python & C - Cython accelerated modules are easier to adapt to other interpeter implementations than handcrafted C modules - keeping Python modules and their C accelerated counterparts in sync will be easier, as they'll mostly be using the same code - we'd be able to start writing C API test cases in Cython rather than in handcrafted C (which currently mostly translates to only testing them indirectly) - CPython's own test suite would naturally help test Cython compatibility with any C API updates - we'd have an inherent incentive to help enhance Cython to take advantage of new C API features The are some genuine downsides in increasing the complexity of bootstrapping CPython when all you're starting with is a VCS clone and a C compiler, but those complications are ultimately no worse than those we already have with Argument Clinic, and hence amenable to the same solution: if we need to, we can check in the generated C files in order to make bootstrapping easier. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rosuav at gmail.com Thu Jul 20 23:55:25 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 21 Jul 2017 13:55:25 +1000 Subject: [Python-Dev] startup time repeated? 
why not daemon In-Reply-To: References: Message-ID: On Fri, Jul 21, 2017 at 1:49 PM, Nick Coghlan wrote: > The are some genuine downsides in increasing the complexity of > bootstrapping CPython when all you're starting with is a VCS clone and > a C compiler, but those complications are ultimately no worse than > those we already have with Argument Clinic, and hence amenable to the > same solution: if we need to, we can check in the generated C files in > order to make bootstrapping easier. Are the generated C files perfectly identical? If you use Cython to compile the same file twice, will you always get a byte-for-byte identical file? If so, it should be safe to check them in, and then have a "make regenerate" that wipes out all Cython-generated files and rebuilds them. That followed by "git status" would immediately tell you if something failed to get checked in. ChrisA From ncoghlan at gmail.com Fri Jul 21 00:09:23 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 21 Jul 2017 14:09:23 +1000 Subject: [Python-Dev] startup time repeated? why not daemon In-Reply-To: References: Message-ID: On 21 July 2017 at 13:55, Chris Angelico wrote: > On Fri, Jul 21, 2017 at 1:49 PM, Nick Coghlan wrote: >> The are some genuine downsides in increasing the complexity of >> bootstrapping CPython when all you're starting with is a VCS clone and >> a C compiler, but those complications are ultimately no worse than >> those we already have with Argument Clinic, and hence amenable to the >> same solution: if we need to, we can check in the generated C files in >> order to make bootstrapping easier. > > Are the generated C files perfectly identical? If you use Cython to > compile the same file twice, will you always get a byte-for-byte > identical file? We that's certainly highly beneficial, we don't necessarily need it as an ironclad guarantee (it isn't true for autoconf, for example, especially if you change versions, but we still check in the autoconf output in order to avoid relying on autoconf as a build dependency). > If so, it should be safe to check them in, and then > have a "make regenerate" that wipes out all Cython-generated files and > rebuilds them. That followed by "git status" would immediately tell > you if something failed to get checked in. Yep, and we already have "make regen-all" as a target to cover various other build steps where this concern applies: regen-all: regen-opcode regen-opcode-targets regen-typeslots regen-grammar regen-ast regen-importlib Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rosuav at gmail.com Fri Jul 21 00:12:00 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 21 Jul 2017 14:12:00 +1000 Subject: [Python-Dev] startup time repeated? why not daemon In-Reply-To: References: Message-ID: On Fri, Jul 21, 2017 at 2:09 PM, Nick Coghlan wrote: > On 21 July 2017 at 13:55, Chris Angelico wrote: >> On Fri, Jul 21, 2017 at 1:49 PM, Nick Coghlan wrote: >>> The are some genuine downsides in increasing the complexity of >>> bootstrapping CPython when all you're starting with is a VCS clone and >>> a C compiler, but those complications are ultimately no worse than >>> those we already have with Argument Clinic, and hence amenable to the >>> same solution: if we need to, we can check in the generated C files in >>> order to make bootstrapping easier. >> >> Are the generated C files perfectly identical? If you use Cython to >> compile the same file twice, will you always get a byte-for-byte >> identical file? 
> > We that's certainly highly beneficial, we don't necessarily need it as > an ironclad guarantee (it isn't true for autoconf, for example, > especially if you change versions, but we still check in the autoconf > output in order to avoid relying on autoconf as a build dependency). > >> If so, it should be safe to check them in, and then >> have a "make regenerate" that wipes out all Cython-generated files and >> rebuilds them. That followed by "git status" would immediately tell >> you if something failed to get checked in. > > Yep, and we already have "make regen-all" as a target to cover various > other build steps where this concern applies: > > regen-all: regen-opcode regen-opcode-targets regen-typeslots > regen-grammar regen-ast regen-importlib Cool. (Shows how much I know about the CPython build process.) Then I'm definitely +1 on using Cython. ChrisA From chris.jerdonek at gmail.com Fri Jul 21 01:11:01 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Thu, 20 Jul 2017 22:11:01 -0700 Subject: [Python-Dev] startup time repeated? why not daemon In-Reply-To: References: Message-ID: On Thu, Jul 20, 2017 at 8:49 PM, Nick Coghlan wrote: > ... > * Lazy loading can have a significant impact on startup time, as it > means you don't have to pay for the cost of finding and loading > modules that you don't actually end up using on that particular run > > We've historically resisted adopting these techniques for the standard > library because they *do* make things more complicated *and* harder to > debug relative to plain old eagerly imported dynamic Python code. > However, if we're going to recommend them as good practices for 3rd > party developers looking to optimise the startup time of their Python > applications, then it makes sense for us to embrace them for the > standard library as well, rather than having our first reaction be to > write more hand-crafted C code. Are there any good write-ups of best practices and techniques in this area for applications (other than obvious things like avoiding unnecessary imports)? I'm thinking of things like how to structure your project, things to look for, developer tools that might help, and perhaps third-party runtime libraries? --Chris > > On that last point, it's also worth keeping in mind that we have a > much harder time finding new C-level contributors than we do new > Python-level ones, and have every reason to expect that problem to get > worse over time rather than better (since writing and maintaining > handcrafted C code is likely to go the way of writing and maintaining > handcrafted assembly code as a skillset: while it will still be > genuinely necessary in some contexts, it will also be an increasingly > niche technical specialty). > > Starting to migrate to using Cython for our acceleration modules > instead of plain C should thus prove to be a win for everyone: > > - Cython structurally avoids a lot of typical bugs that arise in > hand-coded extensions (e.g. 
refcount bugs) > - by design, it's much easier to mentally switch between Python & > Cython than it is between Python & C > - Cython accelerated modules are easier to adapt to other interpeter > implementations than handcrafted C modules > - keeping Python modules and their C accelerated counterparts in sync > will be easier, as they'll mostly be using the same code > - we'd be able to start writing C API test cases in Cython rather than > in handcrafted C (which currently mostly translates to only testing > them indirectly) > - CPython's own test suite would naturally help test Cython > compatibility with any C API updates > - we'd have an inherent incentive to help enhance Cython to take > advantage of new C API features > > The are some genuine downsides in increasing the complexity of > bootstrapping CPython when all you're starting with is a VCS clone and > a C compiler, but those complications are ultimately no worse than > those we already have with Argument Clinic, and hence amenable to the > same solution: if we need to, we can check in the generated C files in > order to make bootstrapping easier. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/chris.jerdonek%40gmail.com From cesare.di.mauro at gmail.com Fri Jul 21 01:30:06 2017 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Fri, 21 Jul 2017 07:30:06 +0200 Subject: [Python-Dev] Python startup time In-Reply-To: References: <20170719132211.GA30782@phdru.name> Message-ID: 2017-07-21 4:52 GMT+02:00 Nick Coghlan : > On 21 July 2017 at 12:44, Nick Coghlan wrote: > > We can separately measure the cost of unmarshalling the code object: > > > > $ python3 -m perf timeit -s "import typing; from marshal import loads; > from > > importlib.util import cache_from_source; cache = > > cache_from_source(typing.__file__); data = open(cache, > 'rb').read()[12:]" > > "loads(data)" > > ..................... > > Mean +- std dev: 286 us +- 4 us > > Slight adjustment here, as the cost of locating the cached bytecode > and reading it from disk should really be accounted for in each > iteration: > > $ python3 -m perf timeit -s "import typing; from marshal import loads; > from importlib.util import cache_from_source" "cache = > cache_from_source(typing.__spec__.origin); data = open(cache, > 'rb').read()[12:]; loads(data)" > ..................... > Mean +- std dev: 337 us +- 8 us > > That will have a bigger impact when loading from spinning disk or a > network drive, but it's fairly negligible when loading from a local > SSD or an already primed filesystem cache. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > Thanks for your tests, Nick. It's quite evident that the marshal code cannot improve the situation, so I regret from my proposal. I took a look at the typing module, and there are some small things that can be optimized, but it'll not change the overall situation unfortunately. Code execution can be improved. :) However, it requires a massive amount of time experimenting... Bests, Cesare Mail priva di virus. www.avast.com <#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Fri Jul 21 02:23:58 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 21 Jul 2017 16:23:58 +1000 Subject: [Python-Dev] Python startup time In-Reply-To: References: <20170719132211.GA30782@phdru.name> Message-ID: On 21 July 2017 at 15:30, Cesare Di Mauro wrote: > > > 2017-07-21 4:52 GMT+02:00 Nick Coghlan : > >> On 21 July 2017 at 12:44, Nick Coghlan wrote: >> > We can separately measure the cost of unmarshalling the code object: >> > >> > $ python3 -m perf timeit -s "import typing; from marshal import loads; >> from >> > importlib.util import cache_from_source; cache = >> > cache_from_source(typing.__file__); data = open(cache, >> 'rb').read()[12:]" >> > "loads(data)" >> > ..................... >> > Mean +- std dev: 286 us +- 4 us >> >> Slight adjustment here, as the cost of locating the cached bytecode >> and reading it from disk should really be accounted for in each >> iteration: >> >> $ python3 -m perf timeit -s "import typing; from marshal import loads; >> from importlib.util import cache_from_source" "cache = >> cache_from_source(typing.__spec__.origin); data = open(cache, >> 'rb').read()[12:]; loads(data)" >> ..................... >> Mean +- std dev: 337 us +- 8 us >> >> That will have a bigger impact when loading from spinning disk or a >> network drive, but it's fairly negligible when loading from a local >> SSD or an already primed filesystem cache. >> >> Cheers, >> Nick. >> >> -- >> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> > Thanks for your tests, Nick. It's quite evident that the marshal code > cannot improve the situation, so I regret from my proposal. > It was still a good suggestion, since it made me realise I *hadn't* actually measured the relative timings lately, so it was technically an untested assumption that module level code execution still dominated the overall import time. typing is also a particularly large & complex module, and bytecode unmarshalling represents a larger fraction of the import time for simpler modules like abc: $ python3 -m perf timeit -s "import abc; from marshal import loads; from importlib.util import cache_from_source" "cache = cache_from_source(abc.__spec__.origin); data = open(cache, 'rb').read()[12:]; loads(data)" ..................... Mean +- std dev: 45.2 us +- 1.1 us $ python3 -m perf timeit -s "import abc; loader_exec = abc.__spec__.loader.exec_module" "loader_exec(abc)" ..................... Mean +- std dev: 172 us +- 5 us $ python3 -m perf timeit -s "import abc; from importlib import reload" "reload(abc)" ..................... Mean +- std dev: 280 us +- 14 us And _weakrefset: $ python3 -m perf timeit -s "import _weakrefset; from marshal import loads; from importlib.util import cache_from_source" "cache = cache_from_source(_weakrefset.__spec__.origin); data = open(cache, 'rb').read()[12:]; loads(data)" ..................... Mean +- std dev: 57.7 us +- 1.3 us $ python3 -m perf timeit -s "import _weakrefset; loader_exec = _weakrefset.__spec__.loader.exec_module" "loader_exec(_weakrefset)" ..................... Mean +- std dev: 129 us +- 6 us $ python3 -m perf timeit -s "import _weakrefset; from importlib import reload" "reload(_weakrefset)" ..................... 
Mean +- std dev: 226 us +- 4 us The conclusion still holds (the absolute numbers here are likely still too small for the extra complexity of parallelising bytecode loading to pay off in any significant way), but it also helps us set reasonable expectations around how much of a gain we're likely to be able to get just from precompilation with Cython. That does actually raise a small microbenchmarking problem: for source and bytecode imports, we can force the import system to genuinely rerun the module or unmarshal the bytecode inside a single Python process, allowing perf to measure it independently of CPython startup. While I'm pretty sure it's possible to trick the import machinery into rerunning module level init functions even for old-style extension modules (hence allowing us to run similar tests to those above for a Cython compiled module), I don't actually remember how to do it off the top of my head. Cheers, Nick. P.S. I'll also note that in these cases where the import overhead is proportionally significant for always-imported modules, we may want to look at the benefits of freezing them (if they otherwise remain as pure Python modules), or compiling them as builtin modules (if we switch them over to Cython), in addition to looking at ways to make the modules themselves faster. Being built directly into the interpreter binary is pretty much the best case scenario for reducing import overhead. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Fri Jul 21 03:12:20 2017 From: mertz at gnosis.cx (David Mertz) Date: Fri, 21 Jul 2017 00:12:20 -0700 Subject: [Python-Dev] Python startup time In-Reply-To: References: <20170719132211.GA30782@phdru.name> Message-ID: How implausible is it to write out the actual memory image of a loaded Python process? I.e. on a specific machine, OS, Python version, etc? This can only be overhead initially, of course, but on subsequent runs it's just one memory map, which the cheapest possible operation. E.g. $ python3.7 --write-image "import typing, re, os, numpy" I imagine this creating a file like: /tmp/__python__/python37-typing-re-os-numpy.mem Then just terminating as if just that line had run, however long it takes (but snapshotting before exit). Then subsequent invocations would only restore the image to memory. Maybe: $ pyrunner --load-image python37-typing-re-os-numpy myscript.py The last line could be aliased of course. I suppose we'd need to check if relevant file exists, and if not fall back to just ignoring the '--load-image' flag and running plain old Python. This helps not at all for something like AWS Lambda where each instance is spun up fresh. But for the use-case of running many Python shell commands at an interactive shell on one machine, it seems like that could be very fast. In my hypothetical I suppose pre-loading some collection of modules in the image. Of course, the script may need to load others, and it may not use some in the image. But users could decide their typical needed modules themselves under this idea. 
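(As an aside, and assuming a Unix-only context: the closest thing the stdlib offers today to "pay the import cost once and reuse it" is probably multiprocessing's forkserver start method, which imports a preload list once in a long-lived helper process and then forks every worker from that pre-warmed state. A rough sketch, with placeholder function and module names, and obviously not a substitute for the CLI idea above:

    import multiprocessing as mp

    def work(x):
        # anything imported by the forkserver parent is already loaded here
        return x * x

    if __name__ == "__main__":
        mp.set_start_method("forkserver")
        # these modules are imported once, in the forkserver process;
        # every worker is then forked from that pre-warmed state
        mp.set_forkserver_preload(["typing", "re", "os"])
        with mp.Pool(4) as pool:
            print(pool.map(work, range(8)))

It only helps workloads that can be phrased as "fork workers from a warm parent", so it does nothing for the shell-command case.)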
On Jul 20, 2017 11:27 PM, "Nick Coghlan" wrote: > On 21 July 2017 at 15:30, Cesare Di Mauro > wrote: > >> >> >> 2017-07-21 4:52 GMT+02:00 Nick Coghlan : >> >>> On 21 July 2017 at 12:44, Nick Coghlan wrote: >>> > We can separately measure the cost of unmarshalling the code object: >>> > >>> > $ python3 -m perf timeit -s "import typing; from marshal import loads; >>> from >>> > importlib.util import cache_from_source; cache = >>> > cache_from_source(typing.__file__); data = open(cache, >>> 'rb').read()[12:]" >>> > "loads(data)" >>> > ..................... >>> > Mean +- std dev: 286 us +- 4 us >>> >>> Slight adjustment here, as the cost of locating the cached bytecode >>> and reading it from disk should really be accounted for in each >>> iteration: >>> >>> $ python3 -m perf timeit -s "import typing; from marshal import loads; >>> from importlib.util import cache_from_source" "cache = >>> cache_from_source(typing.__spec__.origin); data = open(cache, >>> 'rb').read()[12:]; loads(data)" >>> ..................... >>> Mean +- std dev: 337 us +- 8 us >>> >>> That will have a bigger impact when loading from spinning disk or a >>> network drive, but it's fairly negligible when loading from a local >>> SSD or an already primed filesystem cache. >>> >>> Cheers, >>> Nick. >>> >>> -- >>> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >>> >> Thanks for your tests, Nick. It's quite evident that the marshal code >> cannot improve the situation, so I regret from my proposal. >> > > It was still a good suggestion, since it made me realise I *hadn't* > actually measured the relative timings lately, so it was technically an > untested assumption that module level code execution still dominated the > overall import time. > > typing is also a particularly large & complex module, and bytecode > unmarshalling represents a larger fraction of the import time for simpler > modules like abc: > > $ python3 -m perf timeit -s "import abc; from marshal import loads; from > importlib.util import cache_from_source" "cache = > cache_from_source(abc.__spec__.origin); data = open(cache, > 'rb').read()[12:]; loads(data)" > ..................... > Mean +- std dev: 45.2 us +- 1.1 us > > $ python3 -m perf timeit -s "import abc; loader_exec = > abc.__spec__.loader.exec_module" "loader_exec(abc)" > ..................... > Mean +- std dev: 172 us +- 5 us > > $ python3 -m perf timeit -s "import abc; from importlib import reload" > "reload(abc)" > ..................... > Mean +- std dev: 280 us +- 14 us > > And _weakrefset: > > $ python3 -m perf timeit -s "import _weakrefset; from marshal import > loads; from importlib.util import cache_from_source" "cache = > cache_from_source(_weakrefset.__spec__.origin); data = open(cache, > 'rb').read()[12:]; loads(data)" > ..................... > Mean +- std dev: 57.7 us +- 1.3 us > > $ python3 -m perf timeit -s "import _weakrefset; loader_exec = > _weakrefset.__spec__.loader.exec_module" "loader_exec(_weakrefset)" > ..................... > Mean +- std dev: 129 us +- 6 us > > $ python3 -m perf timeit -s "import _weakrefset; from importlib import > reload" "reload(_weakrefset)" > ..................... > Mean +- std dev: 226 us +- 4 us > > The conclusion still holds (the absolute numbers here are likely still too > small for the extra complexity of parallelising bytecode loading to pay off > in any significant way), but it also helps us set reasonable expectations > around how much of a gain we're likely to be able to get just from > precompilation with Cython. 
> > That does actually raise a small microbenchmarking problem: for source and > bytecode imports, we can force the import system to genuinely rerun the > module or unmarshal the bytecode inside a single Python process, allowing > perf to measure it independently of CPython startup. While I'm pretty sure > it's possible to trick the import machinery into rerunning module level > init functions even for old-style extension modules (hence allowing us to > run similar tests to those above for a Cython compiled module), I don't > actually remember how to do it off the top of my head. > > Cheers, > Nick. > > P.S. I'll also note that in these cases where the import overhead is > proportionally significant for always-imported modules, we may want to look > at the benefits of freezing them (if they otherwise remain as pure Python > modules), or compiling them as builtin modules (if we switch them over to > Cython), in addition to looking at ways to make the modules themselves > faster. Being built directly into the interpreter binary is pretty much the > best case scenario for reducing import overhead. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > mertz%40gnosis.cx > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Fri Jul 21 04:54:32 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 21 Jul 2017 10:54:32 +0200 Subject: [Python-Dev] Python startup time References: <20170719132211.GA30782@phdru.name> Message-ID: <20170721105432.099a2a1c@fsol> On Fri, 21 Jul 2017 00:12:20 -0700 David Mertz wrote: > How implausible is it to write out the actual memory image of a loaded > Python process? I.e. on a specific machine, OS, Python version, etc? This > can only be overhead initially, of course, but on subsequent runs it's just > one memory map, which the cheapest possible operation. You can't rely on the file being remapped at the same address when you reload it. So you'd have to write a relocation routine that's able to find and fix *all* pointers inside the Python object tree and CPython's internal structures (fixing the pointers is not necessarily difficult, finding them without missing any is the difficult part). Regards Antoine. From songofacandy at gmail.com Fri Jul 21 05:28:30 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Fri, 21 Jul 2017 18:28:30 +0900 Subject: [Python-Dev] Python startup time In-Reply-To: References: <20170719132211.GA30782@phdru.name> Message-ID: On Fri, Jul 21, 2017 at 4:12 PM, David Mertz wrote: > How implausible is it to write out the actual memory image of a loaded > Python process? I.e. on a specific machine, OS, Python version, etc? This > can only be overhead initially, of course, but on subsequent runs it's just > one memory map, which the cheapest possible operation. FYI, you may be interested in very recent node.js security issue. 
https://nodejs.org/en/blog/vulnerability/july-2017-security-releases/#node-js-specific-security-flaws From victor.stinner at gmail.com Fri Jul 21 06:02:39 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 21 Jul 2017 12:02:39 +0200 Subject: [Python-Dev] Need help to fix urllib(.parse) vulnerabilities Message-ID: Hi, Recently, two security vulnerabilities were reported in the urllib module: https://bugs.python.org/issue30500 http://python-security.readthedocs.io/vuln/bpo-30500_urllib_connects_to_a_wrong_host.html#bpo-30500-urllib-connects-to-a-wrong-host => already fixed in Python 3.6.2 https://bugs.python.org/issue29606 http://python-security.readthedocs.io/vuln/urllib_ftp_protocol_stream_injection.html#urllib-ftp-protocol-stream-injection => not fixed yet I also proposed a more general protection: "Reject newline character (U+000A) in URLs in urllib.parse": http://bugs.python.org/issue30713 The problem with the urllib module is how we handle invalid URL. Right now, we return the URL unmodified if we cannot parse it. Should we raise an exception if an URL contains a newline for example? It's very hard to harden the urllib module without the backward compatibility. That's why it took 3 weeks to fix "urllib connects to a wrong host": find how to fix the vulnerability without brekaing the backward compatibility. Another proposed approach is to reject invalid data earlier or later, but not in urllib... So if you understand URLs, HTTP, etc. : please join these issues to help us to fix them! Victor From victor.stinner at gmail.com Fri Jul 21 06:45:36 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 21 Jul 2017 12:45:36 +0200 Subject: [Python-Dev] Need help to fix urllib(.parse) vulnerabilities In-Reply-To: References: Message-ID: 2017-07-21 12:02 GMT+02:00 Victor Stinner : > https://bugs.python.org/issue29606 > http://python-security.readthedocs.io/vuln/urllib_ftp_protocol_stream_injection.html#urllib-ftp-protocol-stream-injection > => not fixed yet Ok, I more concrete problem. To fix the "urllib FTP" bug, we have to find a balance between security (reject any URL looking like an attempt to counter the security protections) and backward compatibility (accept filenames containing newlines). Maybe we need to only reject an URL which contains a newline in the "host" part, but accept them in the "path" part of the URL? The question is if the code splits correctly "host" and "path" parts when the URL contains a newline. My bet is that no, it behaves badly :-) Victor From Nikolaus at rath.org Fri Jul 21 07:25:36 2017 From: Nikolaus at rath.org (Nikolaus Rath) Date: Fri, 21 Jul 2017 13:25:36 +0200 Subject: [Python-Dev] Python startup time In-Reply-To: (David Mertz's message of "Fri, 21 Jul 2017 00:12:20 -0700") References: <20170719132211.GA30782@phdru.name> Message-ID: <874lu62frj.fsf@vostro.rath.org> On Jul 21 2017, David Mertz wrote: > How implausible is it to write out the actual memory image of a loaded > Python process? That is what Emacs does, and it causes them a lot of trouble. They're trying to move away from it at the moment, but the direction is not yet clear. The keyword is "unexec", and it wrecks havoc with malloc. Best, -Nikolaus -- GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F ?Time flies like an arrow, fruit flies like a Banana.? 
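Returning to Victor's question above about whether urllib splits the "host" and "path" parts correctly when the URL contains a newline: here is a minimal sketch of the kind of check being discussed, assuming the split is done with urllib.parse.urlsplit (illustration only, with a made-up helper name, not a proposed patch):

    from urllib.parse import urlsplit

    def split_rejecting_newline_in_host(url):
        # Refuse the URL if a CR/LF ends up in the authority (host:port)
        # component after parsing, while leaving the path component alone.
        parts = urlsplit(url)
        if '\r' in parts.netloc or '\n' in parts.netloc:
            raise ValueError('newline in URL host component: %r' % (url,))
        return parts

Whether this actually catches the FTP injection depends on where urlsplit puts the newline in the first place, which is exactly the open question.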
From g.rodola at gmail.com Fri Jul 21 08:43:18 2017 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Fri, 21 Jul 2017 14:43:18 +0200 Subject: [Python-Dev] Need help to fix urllib(.parse) vulnerabilities In-Reply-To: References: Message-ID: On Fri, Jul 21, 2017 at 12:45 PM, Victor Stinner wrote: > 2017-07-21 12:02 GMT+02:00 Victor Stinner : > > https://bugs.python.org/issue29606 > > http://python-security.readthedocs.io/vuln/urllib_ > ftp_protocol_stream_injection.html#urllib-ftp-protocol-stream-injection > > => not fixed yet > > Ok, I more concrete problem. To fix the "urllib FTP" bug, we have to > find a balance between security (reject any URL looking like an > attempt to counter the security protections) and backward > compatibility (accept filenames containing newlines). > > Maybe we need to only reject an URL which contains a newline in the > "host" part, but accept them in the "path" part of the URL? The > question is if the code splits correctly "host" and "path" parts when > the URL contains a newline. My bet is that no, it behaves badly :-) > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/g. > rodola%40gmail.com > It took me a while to understand the security implications of this FTP-related bug, but I believe I got the gist of it here (I can elaborate further if it's not clear): https://github.com/python/cpython/pull/1214#issuecomment-298393169 My proposal is to fix ftplib.py and guard against malicious strings involving the *PORT command only*. This way we fix the issue *and* maintain backward compatibility by allowing users to specify "\n" in their paths and username / password pairs. Java took a different approach and disallowed "\n" completely. To my understanding fixing ftplib would automatically mean fixing urllib as well. -- Giampaolo - http://grodola.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.com Fri Jul 21 10:23:10 2017 From: random832 at fastmail.com (Random832) Date: Fri, 21 Jul 2017 10:23:10 -0400 Subject: [Python-Dev] Need help to fix urllib(.parse) vulnerabilities In-Reply-To: References: Message-ID: <1500646990.321852.1048299248.02BB02F9@webmail.messagingengine.com> On Fri, Jul 21, 2017, at 08:43, Giampaolo Rodola' wrote: > It took me a while to understand the security implications of this > FTP-related bug, but I believe I got the gist of it here (I can > elaborate further if it's not clear): > https://github.com/python/cpython/pull/1214#issuecomment-298393169 > My proposal is to fix ftplib.py and guard against malicious > strings involving the *PORT command only*. This way we fix the > issue *and* maintain backward compatibility by allowing users to > specify "\n" in their paths and username / password pairs. Java > took a different approach and disallowed "\n" completely. To my > understanding fixing ftplib would automatically mean fixing urllib > as well. What would a \n in a path mean? What commands would you send over FTP to successfully retrieve a file (or enter a username or password) containing a newline in the name? In other words, what exactly are we being backward compatible *with*? 
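For the record, the stricter Java-style variant (reject any CR/LF anywhere in an FTP command, rather than only guarding the PORT handling) is small enough to sketch as a subclass, assuming one hooks FTP.putline, which is where every command ultimately goes out on the wire:

    import ftplib

    class StrictFTP(ftplib.FTP):
        # Illustration of the blanket approach, not the proposed ftplib patch:
        # refuse to send any command line that embeds a CR or LF.
        def putline(self, line):
            if '\r' in line or '\n' in line:
                raise ValueError('illegal newline in FTP command: %r' % (line,))
            super().putline(line)

The cost is exactly the backward-compatibility question raised above: a genuine "\n" in a path or password becomes impossible to send.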
From status at bugs.python.org Fri Jul 21 12:09:12 2017 From: status at bugs.python.org (Python tracker) Date: Fri, 21 Jul 2017 18:09:12 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20170721160912.72EA711A869@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2017-07-14 - 2017-07-21) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 6058 (+16) closed 36679 (+38) total 42737 (+54) Open issues with patches: 2343 Issues opened (37) ================== #19896: Exposing "q" and "Q" to multiprocessing.sharedctypes http://bugs.python.org/issue19896 reopened by haypo #30450: Pull Windows dependencies from GitHub rather than svn.python.o http://bugs.python.org/issue30450 reopened by steve.dower #30931: Race condition in asyncore may access the wrong dispatcher http://bugs.python.org/issue30931 opened by walkhour #30934: Document how to run coverage for repository idlelib files. http://bugs.python.org/issue30934 opened by terry.reedy #30935: document the new behavior of get_event_loop() in Python 3.6 http://bugs.python.org/issue30935 opened by chris.jerdonek #30937: csv module examples miss newline='' when opening files http://bugs.python.org/issue30937 opened by Pavel #30938: pdb lacks debugger command to list and show all user-defined v http://bugs.python.org/issue30938 opened by David Rieger #30939: Sphinx 1.6.3 deprecation warning for sphinx.util.compat.Direct http://bugs.python.org/issue30939 opened by ned.deily #30940: Documentation for round() is incorrect. http://bugs.python.org/issue30940 opened by George K #30944: Python 32 bit install fails on Windows - BitDefender false pos http://bugs.python.org/issue30944 opened by Arie van Wingerden #30945: loop.create_server does not detect if the interface is IPv6 en http://bugs.python.org/issue30945 opened by cecton #30947: Update embeded copy of libexpat to 2.2.2 http://bugs.python.org/issue30947 opened by haypo #30949: Provide assertion functions in unittest.mock http://bugs.python.org/issue30949 opened by odd_bloke #30950: Convert round() to Arument Clinic http://bugs.python.org/issue30950 opened by serhiy.storchaka #30951: Documentation error in inspect module http://bugs.python.org/issue30951 opened by jalexvig #30952: include Math extension in SQlite http://bugs.python.org/issue30952 opened by Big Stone #30953: Fatal python error when jumping into except clause http://bugs.python.org/issue30953 opened by ppperry #30956: ftplib behaves oddly if socket timeout is greater than the def http://bugs.python.org/issue30956 opened by arloclarke #30959: Constructor signature is duplicated in the help of namedtuples http://bugs.python.org/issue30959 opened by serhiy.storchaka #30962: Add caching to logging.Logger.isEnabledFor() http://bugs.python.org/issue30962 opened by aviso #30963: xxlimited.c XxoObject_Check should be XxoObject_CheckExact http://bugs.python.org/issue30963 opened by Jim.Jewett #30964: Mention ensurepip in package installation docs http://bugs.python.org/issue30964 opened by ncoghlan #30966: multiprocessing.queues.SimpleQueue leaks 2 fds http://bugs.python.org/issue30966 opened by arigo #30967: Crash in PyThread_ReInitTLS() in the child process after os.fo http://bugs.python.org/issue30967 opened by Thomas Mortensson #30969: Docs should say that `x is z or x == z` is used for `x in y` i http://bugs.python.org/issue30969 opened by ztane #30971: Improve code readability of json.tool 
http://bugs.python.org/issue30971 opened by dhimmel #30972: Event loop incorrectly inherited in child processes. http://bugs.python.org/issue30972 opened by Elvis.Pranskevichus #30974: Update os.samefile docstring to match documentation http://bugs.python.org/issue30974 opened by eMPee584 #30975: multiprocessing.Condition.notify_all() blocks indefinitely if http://bugs.python.org/issue30975 opened by mickp #30977: reduce uuid.UUID() memory footprint http://bugs.python.org/issue30977 opened by wbolster #30978: str.format_map() silences exceptions in __getitem__ http://bugs.python.org/issue30978 opened by Akuli #30979: the winapi fails to run shortcuts (because considers a shortcu http://bugs.python.org/issue30979 opened by Bern??t G??bor #30980: Calling asyncore.file_wrapper.close twice may close unrelated http://bugs.python.org/issue30980 opened by Nir Soffer #30981: IDLE: Add config dialog font page tests http://bugs.python.org/issue30981 opened by terry.reedy #30982: AMD64 Windows8.1 Refleaks 3.x: compilation error, cannot open http://bugs.python.org/issue30982 opened by haypo #30983: eval frame rename in pep 0523 broke gdb's python extension http://bugs.python.org/issue30983 opened by bcap #30984: traceback.print_exc return value documentation http://bugs.python.org/issue30984 opened by Jelle Zijlstra Most recent 15 issues with no replies (15) ========================================== #30984: traceback.print_exc return value documentation http://bugs.python.org/issue30984 #30983: eval frame rename in pep 0523 broke gdb's python extension http://bugs.python.org/issue30983 #30980: Calling asyncore.file_wrapper.close twice may close unrelated http://bugs.python.org/issue30980 #30972: Event loop incorrectly inherited in child processes. http://bugs.python.org/issue30972 #30963: xxlimited.c XxoObject_Check should be XxoObject_CheckExact http://bugs.python.org/issue30963 #30952: include Math extension in SQlite http://bugs.python.org/issue30952 #30950: Convert round() to Arument Clinic http://bugs.python.org/issue30950 #30938: pdb lacks debugger command to list and show all user-defined v http://bugs.python.org/issue30938 #30937: csv module examples miss newline='' when opening files http://bugs.python.org/issue30937 #30935: document the new behavior of get_event_loop() in Python 3.6 http://bugs.python.org/issue30935 #30926: KeyError with cgitb inspecting exception in generator expressi http://bugs.python.org/issue30926 #30925: RPM build lacks ability to include other files similar to doc_ http://bugs.python.org/issue30925 #30924: RPM build doc_files needs files separated into separate lines http://bugs.python.org/issue30924 #30915: distutils sometimes assumes wrong C compiler http://bugs.python.org/issue30915 #30912: python 3 git master fails to find libffi and build _ctypes on http://bugs.python.org/issue30912 Most recent 15 issues waiting for review (15) ============================================= #30978: str.format_map() silences exceptions in __getitem__ http://bugs.python.org/issue30978 #30971: Improve code readability of json.tool http://bugs.python.org/issue30971 #30950: Convert round() to Arument Clinic http://bugs.python.org/issue30950 #30945: loop.create_server does not detect if the interface is IPv6 en http://bugs.python.org/issue30945 #30939: Sphinx 1.6.3 deprecation warning for sphinx.util.compat.Direct http://bugs.python.org/issue30939 #30937: csv module examples miss newline='' when opening files http://bugs.python.org/issue30937 #30919: Shared Array Memory Allocation 
Regression http://bugs.python.org/issue30919 #30891: importlib: _find_and_load() race condition on sys.modules[name http://bugs.python.org/issue30891 #30877: possibe typo in json/scanner.py http://bugs.python.org/issue30877 #30876: SystemError on importing module from unloaded package http://bugs.python.org/issue30876 #30872: Update curses docs to Python 3 http://bugs.python.org/issue30872 #30867: Add necessary macro that insure `HAVE_OPENSSL_VERIFY_PARAM` to http://bugs.python.org/issue30867 #30863: Rewrite PyUnicode_AsWideChar() and PyUnicode_AsWideCharString( http://bugs.python.org/issue30863 #30860: Consolidate stateful C globals under a single struct. http://bugs.python.org/issue30860 #30828: Out of bounds write in _asyncio_Future_remove_done_callback http://bugs.python.org/issue30828 Top 10 most discussed issues (10) ================================= #28638: Optimize namedtuple creation http://bugs.python.org/issue28638 33 msgs #30450: Pull Windows dependencies from GitHub rather than svn.python.o http://bugs.python.org/issue30450 29 msgs #30940: Documentation for round() is incorrect. http://bugs.python.org/issue30940 11 msgs #30822: Python implementation of datetime module is not being tested c http://bugs.python.org/issue30822 10 msgs #18558: Iterable glossary entry needs clarification http://bugs.python.org/issue18558 9 msgs #30919: Shared Array Memory Allocation Regression http://bugs.python.org/issue30919 9 msgs #30934: Document how to run coverage for repository idlelib files. http://bugs.python.org/issue30934 9 msgs #19896: Exposing "q" and "Q" to multiprocessing.sharedctypes http://bugs.python.org/issue19896 8 msgs #25988: collections.abc.Indexable http://bugs.python.org/issue25988 8 msgs #30981: IDLE: Add config dialog font page tests http://bugs.python.org/issue30981 7 msgs Issues closed (37) ================== #11230: "Full unicode import system" not in 3.2 http://bugs.python.org/issue11230 closed by serhiy.storchaka #23267: multiprocessing pool.py doesn't close inqueue and outqueue pip http://bugs.python.org/issue23267 closed by pitrou #26781: os.walk max_depth http://bugs.python.org/issue26781 closed by rhettinger #28523: Idlelib.configdialog: use 'color' insteadof 'colour' http://bugs.python.org/issue28523 closed by terry.reedy #30466: Tutorial doesn't explain the use of classes http://bugs.python.org/issue30466 closed by Mariatta #30490: Allow pass an exception to the Event.set method http://bugs.python.org/issue30490 closed by pfreixes #30585: [security][3.3] Backport smtplib fix for TLS stripping vulnera http://bugs.python.org/issue30585 closed by ned.deily #30694: Update embedded copy of expat to 2.2.1 http://bugs.python.org/issue30694 closed by ned.deily #30730: [security] Injecting environment variable in subprocess on Win http://bugs.python.org/issue30730 closed by serhiy.storchaka #30794: Add multiprocessing.Process.kill() http://bugs.python.org/issue30794 closed by pitrou #30797: ./pyconfig.h:1438:0: warning: "_GNU_SOURCE" redefined [enabled http://bugs.python.org/issue30797 closed by ned.deily #30808: Use _Py_atomic API for concurrency-sensitive signal state http://bugs.python.org/issue30808 closed by pitrou #30836: test_c_locale_coercion fails on AIX http://bugs.python.org/issue30836 closed by ncoghlan #30870: IDLE: configdialog/fonts: change font when select by key up/do http://bugs.python.org/issue30870 closed by terry.reedy #30883: test_urllib2net failed on s390x Debian 3.6: ftp.debian.org err http://bugs.python.org/issue30883 closed by berker.peksag 
#30917: IDLE: Add idlelib.config.IdleConf unittest http://bugs.python.org/issue30917 closed by terry.reedy #30920: Sequence Matcher from diff lib is not implementing longest com http://bugs.python.org/issue30920 closed by terry.reedy #30929: AttributeErrors after import in multithreaded environment http://bugs.python.org/issue30929 closed by larry #30932: Identity comparison ("is") fails for floats in Python3 but wor http://bugs.python.org/issue30932 closed by steven.daprano #30933: Python not IEEE 754 float compliant for various zero operation http://bugs.python.org/issue30933 closed by r.david.murray #30936: json module ref leaks detected by test_json http://bugs.python.org/issue30936 closed by serhiy.storchaka #30941: Missing line in example program http://bugs.python.org/issue30941 closed by r.david.murray #30942: Implement lwalk(levelwalk) function on os.py http://bugs.python.org/issue30942 closed by r.david.murray #30943: printf-style Bytes Formatting sometimes do not worked. http://bugs.python.org/issue30943 closed by r.david.murray #30946: readline module has obsolete code http://bugs.python.org/issue30946 closed by pitrou #30948: Docs for __subclasses__(): Add hint that Python imports are la http://bugs.python.org/issue30948 closed by r.david.murray #30954: backport unittest.TestCase.assertLogs to 2.7? http://bugs.python.org/issue30954 closed by r.david.murray #30955: \\N in f-string causes next { to be literal if not escaped http://bugs.python.org/issue30955 closed by ned.deily #30957: pathlib: Path and PurePath cannot be subclassed http://bugs.python.org/issue30957 closed by serhiy.storchaka #30958: Scripts folder is empty http://bugs.python.org/issue30958 closed by ned.deily #30960: Python script is failing to run http://bugs.python.org/issue30960 closed by serhiy.storchaka #30961: Py_Decref a borrowed reference in _tracemalloc http://bugs.python.org/issue30961 closed by xiang.zhang #30965: Unexpected behavior of operator "in" http://bugs.python.org/issue30965 closed by ammar2 #30968: test_get_font (idlelib.idle_test.test_config.IdleConfTest) fai http://bugs.python.org/issue30968 closed by terry.reedy #30970: return-value of filecmp.dircmp.report_* http://bugs.python.org/issue30970 closed by r.david.murray #30973: Regular expression "hangs" interpreter http://bugs.python.org/issue30973 closed by r.david.murray #30976: multiprocessing.Process.is_alive can show True for dead proces http://bugs.python.org/issue30976 closed by mickp From brett at python.org Fri Jul 21 12:52:25 2017 From: brett at python.org (Brett Cannon) Date: Fri, 21 Jul 2017 16:52:25 +0000 Subject: [Python-Dev] startup time repeated? why not daemon In-Reply-To: References: Message-ID: On Thu, 20 Jul 2017 at 22:11 Chris Jerdonek wrote: > On Thu, Jul 20, 2017 at 8:49 PM, Nick Coghlan wrote: > > ... > > * Lazy loading can have a significant impact on startup time, as it > > means you don't have to pay for the cost of finding and loading > > modules that you don't actually end up using on that particular run > It should be mentioned that I have started designing an API to make using lazy loading much easier in Python 3.7 (i.e. "calling a single function" easier), but I still have to write the tests and such before I propose a patch and it will still be mainly for apps that know what they are doing since lazy loading makes debugging import errors harder. 
> > > > We've historically resisted adopting these techniques for the standard > > library because they *do* make things more complicated *and* harder to > > debug relative to plain old eagerly imported dynamic Python code. > > However, if we're going to recommend them as good practices for 3rd > > party developers looking to optimise the startup time of their Python > > applications, then it makes sense for us to embrace them for the > > standard library as well, rather than having our first reaction be to > > write more hand-crafted C code. > > Are there any good write-ups of best practices and techniques in this > area for applications (other than obvious things like avoiding > unnecessary imports)? I'm thinking of things like how to structure > your project, things to look for, developer tools that might help, and > perhaps third-party runtime libraries? > Nothing beyond "profile your application" and "don't do stuff during import as a side-effect" that I'm aware of. -Brett > > --Chris > > > > > > > On that last point, it's also worth keeping in mind that we have a > > much harder time finding new C-level contributors than we do new > > Python-level ones, and have every reason to expect that problem to get > > worse over time rather than better (since writing and maintaining > > handcrafted C code is likely to go the way of writing and maintaining > > handcrafted assembly code as a skillset: while it will still be > > genuinely necessary in some contexts, it will also be an increasingly > > niche technical specialty). > > > > Starting to migrate to using Cython for our acceleration modules > > instead of plain C should thus prove to be a win for everyone: > > > > - Cython structurally avoids a lot of typical bugs that arise in > > hand-coded extensions (e.g. refcount bugs) > > - by design, it's much easier to mentally switch between Python & > > Cython than it is between Python & C > > - Cython accelerated modules are easier to adapt to other interpeter > > implementations than handcrafted C modules > > - keeping Python modules and their C accelerated counterparts in sync > > will be easier, as they'll mostly be using the same code > > - we'd be able to start writing C API test cases in Cython rather than > > in handcrafted C (which currently mostly translates to only testing > > them indirectly) > > - CPython's own test suite would naturally help test Cython > > compatibility with any C API updates > > - we'd have an inherent incentive to help enhance Cython to take > > advantage of new C API features > > > > The are some genuine downsides in increasing the complexity of > > bootstrapping CPython when all you're starting with is a VCS clone and > > a C compiler, but those complications are ultimately no worse than > > those we already have with Argument Clinic, and hence amenable to the > > same solution: if we need to, we can check in the generated C files in > > order to make bootstrapping easier. > > > > Cheers, > > Nick. 
> > > > -- > > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/chris.jerdonek%40gmail.com > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.hettinger at gmail.com Fri Jul 21 15:32:43 2017 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Fri, 21 Jul 2017 12:32:43 -0700 Subject: [Python-Dev] Need help to fix urllib(.parse) vulnerabilities In-Reply-To: References: Message-ID: > On Jul 21, 2017, at 3:45 AM, Victor Stinner wrote: > > Ok, I more concrete problem. To fix the "urllib FTP" bug, we have to > find a balance between security (reject any URL looking like an > attempt to counter the security protections) and backward > compatibility (accept filenames containing newlines). For this case, the balance should probably tilt more towards security than backwards compatibility. I would be very concerned about such odd URLs. That said, if backwards compatibility is going to be broken, consider giving users a temporary, clean way to opt-out of the additional projections (don't want to leave them high and dry if they happen to have a legitimate use case). Raymond From stefan_ml at behnel.de Fri Jul 21 16:43:36 2017 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 21 Jul 2017 22:43:36 +0200 Subject: [Python-Dev] Cython compiled stdlib modules - Re: Python startup time In-Reply-To: References: <20170719132211.GA30782@phdru.name> Message-ID: Nick Coghlan schrieb am 21.07.2017 um 08:23: > I'll also note that in these cases where the import overhead is > proportionally significant for always-imported modules, we may want to look > at the benefits of freezing them (if they otherwise remain as pure Python > modules), or compiling them as builtin modules (if we switch them over to > Cython), in addition to looking at ways to make the modules themselves > faster. Just for the sake of it, I gave the Cython compilation a try. I had to apply the attached hack to Lib/typing.py to get the test passing, because it uses frame call offsets in some places and Cython functions do not create frames when being called (they only create them for exception traces). I also had to disable the import of "abc" in the Cython generated module to remove the circular self dependency at startup when the "abc" module is compiled. That shouldn't have an impact on the runtime performance, though. Note that this is otherwise using the unmodified Python code, as provided in the current modules, constructing and using normal Python classes for everything, no extension types etc. Only two stdlib Python modules were compiled into shared libraries, and not statically linked into the CPython core. I used the "python_startup" benchmark in the "performance" project to measure the overall startup times of a clean non-debug non-pgo build of CPython 3.7 (rev d0969d6) against the same build with a compiled typing.py and abc.py. 
To compile these modules, I used the following command (plus the attached patch) $ cythonize -3 -X binding=True -i Lib/typing.py Lib/abc.py I modified the startup benchmark to run "python -c 'import typing'" etc. instead of just executing "pass". - stock CPython starting up and running "pass": Mean +- std dev: 14.7 ms +- 0.3 ms - stock CPython starting up and running "import abc": Mean +- std dev: 14.8 ms +- 0.3 ms - with compiled abc.py: Mean +- std dev: 14.9 ms +- 0.3 ms - stock CPython starting up and running "import typing": Mean +- std dev: 34.6 ms +- 1.0 ms - with compiled abc.py Mean +- std dev: 34.4 ms +- 0.6 ms - with compiled typing.py: Mean +- std dev: 33.5 ms +- 0.7 ms - with both compiled: Mean +- std dev: 33.1 ms +- 0.4 ms That's only a 4% improvement in the overall startup time on my machine, and about a 7% faster overall runtime of "import typing" compared to "pass". Note also that compiling abc.py leads to a slightly *increased* startup time in the "import abc" case, which might be due to the larger file size of the abc.so file compared to the abc.pyc file. This is amortised by the decreased runtime in the "import typing" case (I guess). I then ran the test suites for both modules in lack of a better post-startup runtime benchmark. The improvement for abc.py is in the order of 1-2%, but test_typing.py has many more tests and wins about 13% overall: - stock CPython executing essentially "runner.run(deepcopy(suite))" in "test_typing.py" (the deepcopy() takes about 6 ms): Mean +- std dev: 68.6 ms +- 0.8 ms - compiled abc.py and typing.py: Mean +- std dev: 60.7 ms +- 0.7 ms One more thing to note: the compiled modules are quite large. I get these file sizes: 8658 Lib/abc.py 7525 Lib/__pycache__/abc.cpython-37.pyc 369930 Lib/abc.c 122048 Lib/abc.cpython-37m-x86_64-linux-gnu.so 80290 Lib/typing.py 73921 Lib/__pycache__/typing.cpython-37.pyc 2951893 Lib/typing.c 1182632 Lib/typing.cpython-37m-x86_64-linux-gnu.so The .so files are about 16x as large as the .pyc files. The typing.so file weighs in with about 40% of the size of the stripped python binary: 2889136 python As it stands, the gain is probably not worth the increase in library file size, which also translates to a higher bottom line for the memory consumption. At least not for these two modules. Manually optimising the files would likely also reduce the .so file size in addition to giving better speedups, though, because the generated code would become less generic. Stefan -------------- next part -------------- A non-text attachment was scrubbed... Name: typing_frames.patch Type: text/x-patch Size: 1439 bytes Desc: not available URL: From brett at python.org Fri Jul 21 17:28:05 2017 From: brett at python.org (Brett Cannon) Date: Fri, 21 Jul 2017 21:28:05 +0000 Subject: [Python-Dev] Appending a link back to bugs.python.org in GitHub PRs Message-ID: Thanks to Kushal Das we now have one of the most requested features since the transition: a link in PRs back to bugs.python.org (in a more discoverable way since we have had them since Bedevere launched :) . When a pull request comes in with an issue number in the title (or one gets added), a link to bugs.python.org will be appended to the PR's body (the message you fill out when creating a PR). There's no logic to remove the link if the issue number is removed from the title, changed, or for multiple issue numbers since basically those cases are all rare and it was easier to launch without that kind of support. 
P.S.: Berker Peksag is working on providing commit emails with diffs in them which is the other most requested feature since the transition. -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Fri Jul 21 18:21:00 2017 From: barry at python.org (Barry Warsaw) Date: Fri, 21 Jul 2017 18:21:00 -0400 Subject: [Python-Dev] Python startup time In-Reply-To: <874lu62frj.fsf@vostro.rath.org> References: <20170719132211.GA30782@phdru.name> <874lu62frj.fsf@vostro.rath.org> Message-ID: <20170721182100.6da92da1@presto> On Jul 21, 2017, at 01:25 PM, Nikolaus Rath wrote: >That is what Emacs does, and it causes them a lot of trouble. They're >trying to move away from it at the moment, but the direction is not yet >clear. The keyword is "unexec", and it wrecks havoc with malloc. Emacs has been unexec'ing for as long as I can remember (which is longer than I can remember Python :). I know that it's been problematic and there have been many efforts over the years to replace it, but I think it's been a fairly successful technique in practice, at least on platforms that support it. That's another problem with the approach of course; it's not universally possible to implement. -Barry From barry at python.org Fri Jul 21 18:28:04 2017 From: barry at python.org (Barry Warsaw) Date: Fri, 21 Jul 2017 18:28:04 -0400 Subject: [Python-Dev] startup time repeated? why not daemon In-Reply-To: References: Message-ID: <20170721182804.492a8ab8@presto> On Jul 20, 2017, at 03:16 PM, Eric Snow wrote: >Relatedly, at PyCon this year Barry and I were talking about the idea of >bootstrapping the interpreter from a memory snapshot on disk, rather than >from scatch (thus drastically reducing the number of IO events). The TPI (Terrible Python Idea) I had at Pycon was some kind of (local) memcached of imported Python modules, which would theoretically allow avoiding loading the modules from the file system on start up. There would be all kinds of problems with this (i.e. putting the "terrible" in TPI), such as having to deal with module import side-effects, but perhaps those could be handled by enough APIs and engineering. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 801 bytes Desc: OpenPGP digital signature URL: From skip.montanaro at gmail.com Fri Jul 21 18:34:03 2017 From: skip.montanaro at gmail.com (Skip Montanaro) Date: Fri, 21 Jul 2017 17:34:03 -0500 Subject: [Python-Dev] Python startup time In-Reply-To: <20170721182100.6da92da1@presto> References: <20170719132211.GA30782@phdru.name> <874lu62frj.fsf@vostro.rath.org> <20170721182100.6da92da1@presto> Message-ID: Emacs has been unexec'ing for as long as I can remember (which is longer than I can remember Python :). I know that it's been problematic and there have been many efforts over the years to replace it, but I think it's been a fairly successful technique in practice, at least on platforms that support it. I've been using Emacs far longer than Python. I remember having to invoke temacs on something. Still, if I didn't know better, I could be convinced you were referring to the GIL. :-) Skip -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mertz at gnosis.cx Fri Jul 21 18:53:32 2017 From: mertz at gnosis.cx (David Mertz) Date: Fri, 21 Jul 2017 15:53:32 -0700 Subject: [Python-Dev] Python startup time In-Reply-To: <20170721182100.6da92da1@presto> References: <20170719132211.GA30782@phdru.name> <874lu62frj.fsf@vostro.rath.org> <20170721182100.6da92da1@presto> Message-ID: I would guess that Windows users don't tend to run lots of command line tools where startup time dominates, as *nix users do. On Fri, Jul 21, 2017 at 3:21 PM, Barry Warsaw wrote: > On Jul 21, 2017, at 01:25 PM, Nikolaus Rath wrote: > > >That is what Emacs does, and it causes them a lot of trouble. They're > >trying to move away from it at the moment, but the direction is not yet > >clear. The keyword is "unexec", and it wrecks havoc with malloc. > > Emacs has been unexec'ing for as long as I can remember (which is longer > than > I can remember Python :). I know that it's been problematic and there have > been many efforts over the years to replace it, but I think it's been a > fairly > successful technique in practice, at least on platforms that support it. > That's another problem with the approach of course; it's not universally > possible to implement. > > -Barry > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > mertz%40gnosis.cx > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From larry at hastings.org Fri Jul 21 22:59:08 2017 From: larry at hastings.org (Larry Hastings) Date: Fri, 21 Jul 2017 19:59:08 -0700 Subject: [Python-Dev] startup time repeated? why not daemon In-Reply-To: <20170721182804.492a8ab8@presto> References: <20170721182804.492a8ab8@presto> Message-ID: <4735c42d-3a8c-81bc-00f6-e1d2ecec5719@hastings.org> On 07/21/2017 03:28 PM, Barry Warsaw wrote: > The TPI (Terrible Python Idea) I had at Pycon was some kind of (local) > memcached of imported Python modules, which would theoretically allow avoiding > loading the modules from the file system on start up. > > There would be all kinds of problems with this (i.e. putting the "terrible" in > TPI), such as having to deal with module import side-effects, but perhaps > those could be handled by enough APIs and engineering. This would be taking a page out of PHP's book. PHP--or at least, PHP ten years ago--doesn't have the equivalent of .pyc files. If you have mod_php running inside Apache with no other extensions, it literally tokenizes each .php script every time it's invoked. To solve this performance problem, someone wrote the "Alternative PHP Cache" (or "APC"), which runs /in Apache/. (Yep, it's not usable outside Apache!) APC stores the tokenized versions of PHP scripts in something approximating their actual in-memory representation. To use something stored in the cache, you'd iterate over all the variables/functions, copy each one into your local interpreter instance, and perform fixups on all the pointers to convert them from relative to absolute addresses. 
http://php.net/manual/en/book.apc.php I note that the introduction to APC says: Warning This extension is considered unmaintained and dead. However, the source code for this extension is still available within PECL. So perhaps the PHP folks have moved on from this technique. //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.jerdonek at gmail.com Sat Jul 22 01:18:02 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Fri, 21 Jul 2017 22:18:02 -0700 Subject: [Python-Dev] startup time repeated? why not daemon In-Reply-To: References: Message-ID: On Fri, Jul 21, 2017 at 9:52 AM, Brett Cannon wrote: > On Thu, 20 Jul 2017 at 22:11 Chris Jerdonek > wrote: >> On Thu, Jul 20, 2017 at 8:49 PM, Nick Coghlan wrote: >> > ... >> > * Lazy loading can have a significant impact on startup time, as it >> > means you don't have to pay for the cost of finding and loading >> > modules that you don't actually end up using on that particular run > > It should be mentioned that I have started designing an API to make using > lazy loading much easier in Python 3.7 (i.e. "calling a single function" > easier), but I still have to write the tests and such before I propose a > patch and it will still be mainly for apps that know what they are doing > since lazy loading makes debugging import errors harder. > ... >> > However, if we're going to recommend them as good practices for 3rd >> > party developers looking to optimise the startup time of their Python >> > applications, then it makes sense for us to embrace them for the >> > standard library as well, rather than having our first reaction be to >> > write more hand-crafted C code. >> >> Are there any good write-ups of best practices and techniques in this >> area for applications (other than obvious things like avoiding >> unnecessary imports)? I'm thinking of things like how to structure >> your project, things to look for, developer tools that might help, and >> perhaps third-party runtime libraries? > > Nothing beyond "profile your application" and "don't do stuff during import > as a side-effect" that I'm aware of. One "project structure" idea of the sort I had in mind is to move frequently used functions in a module into their own module. This way the most common paths of execution don't load unneeded functions. Following this line of reasoning could lead to grouping functions in an application by when they're needed instead of by what they do, which is different from what we normally see. I don't recall seeing advice like this anywhere, so maybe the trade-offs aren't worth it. Thoughts? --Chris > > -Brett > >> >> >> --Chris >> >> >> >> > >> > On that last point, it's also worth keeping in mind that we have a >> > much harder time finding new C-level contributors than we do new >> > Python-level ones, and have every reason to expect that problem to get >> > worse over time rather than better (since writing and maintaining >> > handcrafted C code is likely to go the way of writing and maintaining >> > handcrafted assembly code as a skillset: while it will still be >> > genuinely necessary in some contexts, it will also be an increasingly >> > niche technical specialty). >> > >> > Starting to migrate to using Cython for our acceleration modules >> > instead of plain C should thus prove to be a win for everyone: >> > >> > - Cython structurally avoids a lot of typical bugs that arise in >> > hand-coded extensions (e.g. 
refcount bugs) >> > - by design, it's much easier to mentally switch between Python & >> > Cython than it is between Python & C >> > - Cython accelerated modules are easier to adapt to other interpeter >> > implementations than handcrafted C modules >> > - keeping Python modules and their C accelerated counterparts in sync >> > will be easier, as they'll mostly be using the same code >> > - we'd be able to start writing C API test cases in Cython rather than >> > in handcrafted C (which currently mostly translates to only testing >> > them indirectly) >> > - CPython's own test suite would naturally help test Cython >> > compatibility with any C API updates >> > - we'd have an inherent incentive to help enhance Cython to take >> > advantage of new C API features >> > >> > The are some genuine downsides in increasing the complexity of >> > bootstrapping CPython when all you're starting with is a VCS clone and >> > a C compiler, but those complications are ultimately no worse than >> > those we already have with Argument Clinic, and hence amenable to the >> > same solution: if we need to, we can check in the generated C files in >> > order to make bootstrapping easier. >> > >> > Cheers, >> > Nick. >> > >> > -- >> > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> > _______________________________________________ >> > Python-Dev mailing list >> > Python-Dev at python.org >> > https://mail.python.org/mailman/listinfo/python-dev >> > Unsubscribe: >> > https://mail.python.org/mailman/options/python-dev/chris.jerdonek%40gmail.com >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/brett%40python.org
From storchaka at gmail.com Sat Jul 22 02:01:57 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 22 Jul 2017 09:01:57 +0300 Subject: [Python-Dev] Need help to fix urllib(.parse) vulnerabilities In-Reply-To: References: Message-ID: 21.07.17 13:02, Victor Stinner ????: > Recently, two security vulnerabilities were reported in the urllib module: > > https://bugs.python.org/issue30500 > http://python-security.readthedocs.io/vuln/bpo-30500_urllib_connects_to_a_wrong_host.html#bpo-30500-urllib-connects-to-a-wrong-host > => already fixed in Python 3.6.2 > > https://bugs.python.org/issue29606 > http://python-security.readthedocs.io/vuln/urllib_ftp_protocol_stream_injection.html#urllib-ftp-protocol-stream-injection > => not fixed yet > > I also proposed a more general protection: "Reject newline character > (U+000A) in URLs in urllib.parse": > http://bugs.python.org/issue30713 > > The problem with the urllib module is how we handle invalid URLs. Right > now, we return the URL unmodified if we cannot parse it. Should we > raise an exception if a URL contains a newline for example? > > It's very hard to harden the urllib module without the backward > compatibility. That's why it took 3 weeks to fix "urllib connects to a > wrong host": find how to fix the vulnerability without breaking the > backward compatibility. > > Another proposed approach is to reject invalid data earlier or later, > but not in urllib... Checking a URL in urllib.parse is too early and not enough. The urllib module is general, and different protocols have different limitations. There are other ways besides urllib to pass invalid parameters to low-level protocol implementations. I think the only reliable way of fixing the vulnerability is rejecting or escaping (as specified in RFC 2640) CR and LF inside sent lines. Adding the support of RFC 2640 is a new feature and can be added only in 3.7. And this feature should be optional since not all servers support RFC 2640. https://github.com/python/cpython/pull/1214 does the right thing. The other way of hardening the Python stdlib FTP implementation is making it accept only CRLF as a line delimiter, not sole CR or LF. Additional sanity checks can be added in FTP.login() for detecting problems earlier and raising more specific errors. Every protocol (FTP, HTTP, telnet, SMTP, POP3, IMAP, etc) should be fixed separately. If a protocol allows escaping special characters, they should be escaped. Otherwise such characters should be rejected.
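As a rough illustration of the rejecting option (a sketch only, not the actual ftplib patch), the check belongs right where a command line is put on the wire:

    CRLF = '\r\n'

    def send_cmd(sock, line):
        # Refuse to send a command line with an embedded CR or LF, since
        # either character would let an attacker smuggle extra commands
        # into the control connection.
        if '\r' in line or '\n' in line:
            raise ValueError('newline characters are not allowed in FTP commands')
        sock.sendall((line + CRLF).encode('latin-1'))

An escaping variant would instead rewrite the offending characters (per RFC 2640) before sending, behind an opt-in flag.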
From p.f.moore at gmail.com Sat Jul 22 04:13:48 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 22 Jul 2017 09:13:48 +0100 Subject: [Python-Dev] Python startup time In-Reply-To: References: <20170719132211.GA30782@phdru.name> <874lu62frj.fsf@vostro.rath.org> <20170721182100.6da92da1@presto> Message-ID: On 21 July 2017 at 23:53, David Mertz wrote: > I would guess that Windows users don't tend to run lots of command line > tools where startup time dominates, as *nix users do. Well, in the sense that many Windows users don't use the command line at all, this is true. However, startup time is a definite problem for Windows users who *do* use the command line, because process creation cost is a lot higher than on Unix, so starting new commands is *already* costly, and therefore minimising additional overhead is crucial. It's a bit of a chicken and egg problem - Windows users avoid excessive command line program invocation because startup time is high, so no-one optimises startup time because Windows users don't use short-lived command line programs. But I'm seeing a trend away from that - more and more Windows tools these days seem to be comfortable spawning subprocesses. I don't know what prompted that trend. Paul From tritium-list at sdamon.com Sat Jul 22 04:37:06 2017 From: tritium-list at sdamon.com (Alex Walters) Date: Sat, 22 Jul 2017 04:37:06 -0400 Subject: [Python-Dev] Python startup time In-Reply-To: References: <20170719132211.GA30782@phdru.name> <874lu62frj.fsf@vostro.rath.org> <20170721182100.6da92da1@presto> Message-ID: <08a001d302c5$b42542b0$1c6fc810$@sdamon.com> > -----Original Message----- > From: Python-Dev [mailto:python-dev-bounces+tritium- > list=sdamon.com at python.org] On Behalf Of Paul Moore > Sent: Saturday, July 22, 2017 4:14 AM > To: David Mertz > Cc: Barry Warsaw ; Python-Dev dev at python.org> > Subject: Re: [Python-Dev] Python startup time > It's a bit of a chicken and egg problem - Windows users avoid > excessive command line program invocation because startup time is > high, so no-one optimises startup time because Windows users don't use > short-lived command line programs. But I'm seeing a trend away from > that - more and more Windows tools these days seem to be comfortable > spawning subprocesses. I don't know what prompted that trend. The programs I see that are comfortable spawning processes willy-nilly on windows are mostly .net, which has a lot of the runtime assemblies cached by the OS in the GAC - if you are spawning a second processes of yourself, or something that uses the same libraries as you, the compile step on those can be skipped. Unless you are talking about python/non-.NET programs, in which case, I have no answer. 
> Paul > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/tritium- > list%40sdamon.com
From steve.dower at python.org Sat Jul 22 10:21:07 2017 From: steve.dower at python.org (Steve Dower) Date: Sat, 22 Jul 2017 07:21:07 -0700 Subject: [Python-Dev] Python startup time In-Reply-To: <08a001d302c5$b42542b0$1c6fc810$@sdamon.com> References: <20170719132211.GA30782@phdru.name> <874lu62frj.fsf@vostro.rath.org> <20170721182100.6da92da1@presto> <08a001d302c5$b42542b0$1c6fc810$@sdamon.com> Message-ID: I believe the trend is due to languages like Python and Node.js, most of which aggressively discourage threading (more from the broader community than the core languages, but I see a lot of apps using these now), and also the higher reliability afforded by out-of-process tasks (that is, one crash doesn't kill the entire app - e.g. browser tabs). Optimizing startup time is incredibly valuable, and having tried it a few times I believe that the import system (in essence, stat calls) is the biggest culprit. The tens of ms prior to the first user import can't really go anywhere. Cheers, Steve Top-posted from my Windows phone From: Alex Walters Sent: Saturday, July 22, 2017 1:39 Cc: 'Python-Dev' Subject: Re: [Python-Dev] Python startup time > -----Original Message----- > From: Python-Dev [mailto:python-dev-bounces+tritium- > list=sdamon.com at python.org] On Behalf Of Paul Moore > Sent: Saturday, July 22, 2017 4:14 AM > To: David Mertz > Cc: Barry Warsaw ; Python-Dev dev at python.org> > Subject: Re: [Python-Dev] Python startup time > It's a bit of a chicken and egg problem - Windows users avoid > excessive command line program invocation because startup time is > high, so no-one optimises startup time because Windows users don't use > short-lived command line programs. But I'm seeing a trend away from > that - more and more Windows tools these days seem to be comfortable > spawning subprocesses. I don't know what prompted that trend. The programs I see that are comfortable spawning processes willy-nilly on windows are mostly .net, which has a lot of the runtime assemblies cached by the OS in the GAC - if you are spawning a second processes of yourself, or something that uses the same libraries as you, the compile step on those can be skipped. Unless you are talking about python/non-.NET programs, in which case, I have no answer. > Paul > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/tritium- > list%40sdamon.com _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/steve.dower%40python.org -------------- next part -------------- An HTML attachment was scrubbed...
Adding the support of RFC 2640 is a new feature and can be added only in 3.7. And this feature should be optional since not all servers support RFC 2640. https://github.com/python/cpython/pull/1214 does the right thing. In that case, I suggest to reject newlines in ftplib, and maybe add an opt-in option to escape newlines. Java just rejected newlines, no? Or does Java allows to escape them? Victor -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.rodola at gmail.com Sat Jul 22 13:10:15 2017 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Sat, 22 Jul 2017 19:10:15 +0200 Subject: [Python-Dev] Need help to fix urllib(.parse) vulnerabilities In-Reply-To: References: Message-ID: On Sat, Jul 22, 2017 at 6:38 PM, Victor Stinner wrote: > Le 22 juil. 2017 8:04 AM, "Serhiy Storchaka" a > ?crit : > > I think the only reliable way of fixing the vulnerability is rejecting or > escaping (as specified in RFC 2640) CR and LF inside sent lines. Adding the > support of RFC 2640 is a new feature and can be added only in 3.7. And this > feature should be optional since not all servers support RFC 2640. > https://github.com/python/cpython/pull/1214 does the right thing. > > > In that case, I suggest to reject newlines in ftplib, and maybe add an > opt-in option to escape newlines. > > Java just rejected newlines, no? Or does Java allows to escape them? > > Victor > > OK, let's just reject \n then and be done with it. It's a rare use case after all. Java just rejects \n for all commands and does not support escaping (aka RFC 2640). -- Giampaolo - http://grodola.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Sat Jul 22 13:17:05 2017 From: brett at python.org (Brett Cannon) Date: Sat, 22 Jul 2017 17:17:05 +0000 Subject: [Python-Dev] Python startup time In-Reply-To: References: <20170719132211.GA30782@phdru.name> <874lu62frj.fsf@vostro.rath.org> <20170721182100.6da92da1@presto> <08a001d302c5$b42542b0$1c6fc810$@sdamon.com> Message-ID: On Sat, Jul 22, 2017, 07:22 Steve Dower, wrote: > I believe the trend is due to language like Python and Node.js, most of > which aggressively discourage threading (more from the broader community > than the core languages, but I see a lot of apps using these now), and also > the higher reliability afforded by out-of-process tasks (that is, one crash > doesn?t kill the entire app ? e.g browser tabs). > > > > Optimizing startup time is incredibly valuable, and having tried it a few > times I believe that the import system (in essence, stat calls) is the > biggest culprit. The tens of ms prior to the first user import can?t really > go anywhere. > Stat calls in the import system were optimized in importlib a while back to be cached in finders so at this point you will have to remove a stat call to lower that cost or cache more which goes into breaking abstractions or designing new APIs. 
-brett > > Cheers, > > Steve > > > > Top-posted from my Windows phone > > > > *From: *Alex Walters > *Sent: *Saturday, July 22, 2017 1:39 > *Cc: *'Python-Dev' > > > *Subject: *Re: [Python-Dev] Python startup time > > > > > -----Original Message----- > > > From: Python-Dev [mailto:python-dev-bounces+tritium- > > > list=sdamon.com at python.org] On Behalf Of Paul Moore > > > Sent: Saturday, July 22, 2017 4:14 AM > > > To: David Mertz > > > Cc: Barry Warsaw ; Python-Dev > > dev at python.org> > > > Subject: Re: [Python-Dev] Python startup time > > > > > > > It's a bit of a chicken and egg problem - Windows users avoid > > > excessive command line program invocation because startup time is > > > high, so no-one optimises startup time because Windows users don't use > > > short-lived command line programs. But I'm seeing a trend away from > > > that - more and more Windows tools these days seem to be comfortable > > > spawning subprocesses. I don't know what prompted that trend. > > > > The programs I see that are comfortable spawning processes willy-nilly on > > windows are mostly .net, which has a lot of the runtime assemblies cached > by > > the OS in the GAC - if you are spawning a second processes of yourself, or > > something that uses the same libraries as you, the compile step on those > can > > be skipped. Unless you are talking about python/non-.NET programs, in > which > > case, I have no answer. > > > Paul > > > _______________________________________________ > > > Python-Dev mailing list > > > Python-Dev at python.org > > > https://mail.python.org/mailman/listinfo/python-dev > > > Unsubscribe: https://mail.python.org/mailman/options/python-dev/tritium- > > > list%40sdamon.com > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/steve.dower%40python.org > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.rodola at gmail.com Sat Jul 22 13:29:37 2017 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Sat, 22 Jul 2017 19:29:37 +0200 Subject: [Python-Dev] Need help to fix urllib(.parse) vulnerabilities In-Reply-To: References: Message-ID: On Sat, Jul 22, 2017 at 7:10 PM, Giampaolo Rodola' wrote: > > > On Sat, Jul 22, 2017 at 6:38 PM, Victor Stinner > wrote: > >> Le 22 juil. 2017 8:04 AM, "Serhiy Storchaka" a >> ?crit : >> >> I think the only reliable way of fixing the vulnerability is rejecting or >> escaping (as specified in RFC 2640) CR and LF inside sent lines. Adding the >> support of RFC 2640 is a new feature and can be added only in 3.7. And this >> feature should be optional since not all servers support RFC 2640. >> https://github.com/python/cpython/pull/1214 does the right thing. >> >> >> In that case, I suggest to reject newlines in ftplib, and maybe add an >> opt-in option to escape newlines. >> >> Java just rejected newlines, no? Or does Java allows to escape them? >> >> Victor >> >> > OK, let's just reject \n then and be done with it. It's a rare use case > after all. > Java just rejects \n for all commands and does not support escaping (aka > RFC 2640). > I've just merged the PR. 
There's the question whether to backport this to older versions, considering there's a small chance this may break some code/apps, but considering the chance is small and this a security fix I'd probably be +0.5 for backporting it (2.7 + 3.x - not sure up 'till when). -- Giampaolo - http://grodola.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Sat Jul 22 17:47:38 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Sat, 22 Jul 2017 23:47:38 +0200 Subject: [Python-Dev] Need help to fix urllib(.parse) vulnerabilities In-Reply-To: References: Message-ID: I consider that it is a security vulneraibility and so should be fixed in all supported branches including 3.3 and 3.4. If someone is blocked for a legit usecase, an old Python version can be used until we decide how to handle it. I concur with you, I don't think that anyone uses filenames containing newlines on FTP. FTP protocol is text based and uses newlines as the command separator. I expect a lot of not fun issues if someone uses such filename on legit files. Victor -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve.dower at python.org Sat Jul 22 19:35:31 2017 From: steve.dower at python.org (Steve Dower) Date: Sat, 22 Jul 2017 16:35:31 -0700 Subject: [Python-Dev] Python startup time In-Reply-To: References: <20170719132211.GA30782@phdru.name> <874lu62frj.fsf@vostro.rath.org> <20170721182100.6da92da1@presto> <08a001d302c5$b42542b0$1c6fc810$@sdamon.com> Message-ID: ?Stat calls in the import system were optimized in importlib a while back? Yes, I?m aware of that, which is why I don?t have any specific suggestions off-hand. But given the differences in file systems between Windows and other OSs, it wouldn?t surprise me if there were a more optimal approach for NTFS to amortize calls better. Perhaps not, but it is still the most expensive part of startup that we have any ability to change, so it?s worth investigating. Cheers, Steve Top-posted from my Windows phone From: Brett Cannon Sent: Saturday, July 22, 2017 10:18 To: Steve Dower; Alex Walters Cc: Python-Dev Subject: Re: [Python-Dev] Python startup time On Sat, Jul 22, 2017, 07:22 Steve Dower, wrote: I believe the trend is due to language like Python and Node.js, most of which aggressively discourage threading (more from the broader community than the core languages, but I see a lot of apps using these now), and also the higher reliability afforded by out-of-process tasks (that is, one crash doesn?t kill the entire app ? e.g browser tabs). ? Optimizing startup time is incredibly valuable, and having tried it a few times I believe that the import system (in essence, stat calls) is the biggest culprit. The tens of ms prior to the first user import can?t really go anywhere. Stat calls in the import system were optimized in importlib a while back to be cached in finders so at this point you will have to remove a stat call to lower that cost or cache more which goes into breaking abstractions or designing new APIs. -brett ? Cheers, Steve ? Top-posted from my Windows phone ? From: Alex Walters Sent: Saturday, July 22, 2017 1:39 Cc: 'Python-Dev' Subject: Re: [Python-Dev] Python startup time ? 
> -----Original Message----- > From: Python-Dev [mailto:python-dev-bounces+tritium- > list=sdamon.com at python.org] On Behalf Of Paul Moore > Sent: Saturday, July 22, 2017 4:14 AM > To: David Mertz > Cc: Barry Warsaw ; Python-Dev dev at python.org> > Subject: Re: [Python-Dev] Python startup time ? ? > It's a bit of a chicken and egg problem - Windows users avoid > excessive command line program invocation because startup time is > high, so no-one optimises startup time because Windows users don't use > short-lived command line programs. But I'm seeing a trend away from > that - more and more Windows tools these days seem to be comfortable > spawning subprocesses. I don't know what prompted that trend. ? The programs I see that are comfortable spawning processes willy-nilly on windows are mostly .net, which has a lot of the runtime assemblies cached by the OS in the GAC - if you are spawning a second processes of yourself, or something that uses the same libraries as you, the compile step on those can be skipped.? Unless you are talking about python/non-.NET programs, in which case, I have no answer. > Paul > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/tritium- > list%40sdamon.com ? _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/steve.dower%40python.org ? _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/brett%40python.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From desmoulinmichel at gmail.com Sun Jul 23 03:52:10 2017 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Sun, 23 Jul 2017 09:52:10 +0200 Subject: [Python-Dev] Python startup time In-Reply-To: References: <874lu62frj.fsf@vostro.rath.org> <20170721182100.6da92da1@presto> <08a001d302c5$b42542b0$1c6fc810$@sdamon.com> Message-ID: <75cc1c12-77cd-ef08-bc8e-69978f5bbd1b@gmail.com> > Optimizing startup time is incredibly valuable, I've been reading that from the beginning of this thread but I've been using python since the 2.4 and I never felt the burden of the startup time. I'm guessing a lot of people are like me, they just don't express them self because "better startup time can't be bad so let's not put a barrier on this". I'm not against it, but since the necessity of a faster Python in general has been a debate for years and is only finally catching up with the work of Victor Stinner, can somebody explain me the deal with start up time ? I understand where it can improve your lives. I just don't get why it's suddenly such an explosion of expectations and needs. From solipsis at pitrou.net Sun Jul 23 04:18:48 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 23 Jul 2017 10:18:48 +0200 Subject: [Python-Dev] Python startup time References: <874lu62frj.fsf@vostro.rath.org> <20170721182100.6da92da1@presto> <08a001d302c5$b42542b0$1c6fc810$@sdamon.com> Message-ID: <20170723101848.24f72130@fsol> On Sat, 22 Jul 2017 16:35:31 -0700 Steve Dower wrote: > > Yes, I?m aware of that, which is why I don?t have any specific suggestions off-hand. 
But given the differences in file systems between Windows and other OSs, it wouldn?t surprise me if there were a more optimal approach for NTFS to amortize calls better. Perhaps not, but it is still the most expensive part of startup that we have any ability to change, so it?s worth investigating. Can you expand on it being "the most expensive part of startup that we have any ability to change"? For example, how do Nick's benchmarks above fare on Windows? Regards Antoine. From victor.stinner at gmail.com Sun Jul 23 08:57:59 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 23 Jul 2017 14:57:59 +0200 Subject: [Python-Dev] startup time repeated? why not daemon In-Reply-To: References: Message-ID: We already did that. See _bootlocale for example. (Maybe also _collecctions_abc?) Victor Le 22 juil. 2017 07:20, "Chris Jerdonek" a ?crit : > On Fri, Jul 21, 2017 at 9:52 AM, Brett Cannon wrote: > > On Thu, 20 Jul 2017 at 22:11 Chris Jerdonek > > wrote: > >> On Thu, Jul 20, 2017 at 8:49 PM, Nick Coghlan > wrote: > >> > ... > >> > * Lazy loading can have a significant impact on startup time, as it > >> > means you don't have to pay for the cost of finding and loading > >> > modules that you don't actually end up using on that particular run > > > > It should be mentioned that I have started designing an API to make using > > lazy loading much easier in Python 3.7 (i.e. "calling a single function" > > easier), but I still have to write the tests and such before I propose a > > patch and it will still be mainly for apps that know what they are doing > > since lazy loading makes debugging import errors harder. > > ... > >> > However, if we're going to recommend them as good practices for 3rd > >> > party developers looking to optimise the startup time of their Python > >> > applications, then it makes sense for us to embrace them for the > >> > standard library as well, rather than having our first reaction be to > >> > write more hand-crafted C code. > >> > >> Are there any good write-ups of best practices and techniques in this > >> area for applications (other than obvious things like avoiding > >> unnecessary imports)? I'm thinking of things like how to structure > >> your project, things to look for, developer tools that might help, and > >> perhaps third-party runtime libraries? > > > > Nothing beyond "profile your application" and "don't do stuff during > import > > as a side-effect" that I'm aware of. > > One "project structure" idea of the sort I had in mind is to move > frequently used functions in a module into their own module. This way > the most common paths of execution don't load unneeded functions. > Following this line of reasoning could lead to grouping functions in > an application by when they're needed instead of by what they do, > which is different from what we normally see. I don't recall seeing > advice like this anywhere, so maybe the trade-offs aren't worth it. > Thoughts? 
> > --Chris > > > > > > -Brett > > > >> > >> > >> --Chris > >> > >> > >> > >> > > >> > On that last point, it's also worth keeping in mind that we have a > >> > much harder time finding new C-level contributors than we do new > >> > Python-level ones, and have every reason to expect that problem to get > >> > worse over time rather than better (since writing and maintaining > >> > handcrafted C code is likely to go the way of writing and maintaining > >> > handcrafted assembly code as a skillset: while it will still be > >> > genuinely necessary in some contexts, it will also be an increasingly > >> > niche technical specialty). > >> > > >> > Starting to migrate to using Cython for our acceleration modules > >> > instead of plain C should thus prove to be a win for everyone: > >> > > >> > - Cython structurally avoids a lot of typical bugs that arise in > >> > hand-coded extensions (e.g. refcount bugs) > >> > - by design, it's much easier to mentally switch between Python & > >> > Cython than it is between Python & C > >> > - Cython accelerated modules are easier to adapt to other interpeter > >> > implementations than handcrafted C modules > >> > - keeping Python modules and their C accelerated counterparts in sync > >> > will be easier, as they'll mostly be using the same code > >> > - we'd be able to start writing C API test cases in Cython rather than > >> > in handcrafted C (which currently mostly translates to only testing > >> > them indirectly) > >> > - CPython's own test suite would naturally help test Cython > >> > compatibility with any C API updates > >> > - we'd have an inherent incentive to help enhance Cython to take > >> > advantage of new C API features > >> > > >> > The are some genuine downsides in increasing the complexity of > >> > bootstrapping CPython when all you're starting with is a VCS clone and > >> > a C compiler, but those complications are ultimately no worse than > >> > those we already have with Argument Clinic, and hence amenable to the > >> > same solution: if we need to, we can check in the generated C files in > >> > order to make bootstrapping easier. > >> > > >> > Cheers, > >> > Nick. > >> > > >> > -- > >> > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > >> > _______________________________________________ > >> > Python-Dev mailing list > >> > Python-Dev at python.org > >> > https://mail.python.org/mailman/listinfo/python-dev > >> > Unsubscribe: > >> > https://mail.python.org/mailman/options/python-dev/ > chris.jerdonek%40gmail.com > >> _______________________________________________ > >> Python-Dev mailing list > >> Python-Dev at python.org > >> https://mail.python.org/mailman/listinfo/python-dev > >> Unsubscribe: > >> https://mail.python.org/mailman/options/python-dev/brett%40python.org > > On Fri, Jul 21, 2017 at 9:52 AM, Brett Cannon wrote: > > > > > > On Thu, 20 Jul 2017 at 22:11 Chris Jerdonek > > wrote: > >> > >> On Thu, Jul 20, 2017 at 8:49 PM, Nick Coghlan > wrote: > >> > ... > >> > * Lazy loading can have a significant impact on startup time, as it > >> > means you don't have to pay for the cost of finding and loading > >> > modules that you don't actually end up using on that particular run > > > > > > It should be mentioned that I have started designing an API to make using > > lazy loading much easier in Python 3.7 (i.e. 
"calling a single function" > > easier), but I still have to write the tests and such before I propose a > > patch and it will still be mainly for apps that know what they are doing > > since lazy loading makes debugging import errors harder. > > > >> > >> > > >> > We've historically resisted adopting these techniques for the standard > >> > library because they *do* make things more complicated *and* harder to > >> > debug relative to plain old eagerly imported dynamic Python code. > >> > However, if we're going to recommend them as good practices for 3rd > >> > party developers looking to optimise the startup time of their Python > >> > applications, then it makes sense for us to embrace them for the > >> > standard library as well, rather than having our first reaction be to > >> > write more hand-crafted C code. > >> > >> Are there any good write-ups of best practices and techniques in this > >> area for applications (other than obvious things like avoiding > >> unnecessary imports)? I'm thinking of things like how to structure > >> your project, things to look for, developer tools that might help, and > >> perhaps third-party runtime libraries? > > > > > > Nothing beyond "profile your application" and "don't do stuff during > import > > as a side-effect" that I'm aware of. > > > > -Brett > > > >> > >> > >> --Chris > >> > >> > >> > >> > > >> > On that last point, it's also worth keeping in mind that we have a > >> > much harder time finding new C-level contributors than we do new > >> > Python-level ones, and have every reason to expect that problem to get > >> > worse over time rather than better (since writing and maintaining > >> > handcrafted C code is likely to go the way of writing and maintaining > >> > handcrafted assembly code as a skillset: while it will still be > >> > genuinely necessary in some contexts, it will also be an increasingly > >> > niche technical specialty). > >> > > >> > Starting to migrate to using Cython for our acceleration modules > >> > instead of plain C should thus prove to be a win for everyone: > >> > > >> > - Cython structurally avoids a lot of typical bugs that arise in > >> > hand-coded extensions (e.g. refcount bugs) > >> > - by design, it's much easier to mentally switch between Python & > >> > Cython than it is between Python & C > >> > - Cython accelerated modules are easier to adapt to other interpeter > >> > implementations than handcrafted C modules > >> > - keeping Python modules and their C accelerated counterparts in sync > >> > will be easier, as they'll mostly be using the same code > >> > - we'd be able to start writing C API test cases in Cython rather than > >> > in handcrafted C (which currently mostly translates to only testing > >> > them indirectly) > >> > - CPython's own test suite would naturally help test Cython > >> > compatibility with any C API updates > >> > - we'd have an inherent incentive to help enhance Cython to take > >> > advantage of new C API features > >> > > >> > The are some genuine downsides in increasing the complexity of > >> > bootstrapping CPython when all you're starting with is a VCS clone and > >> > a C compiler, but those complications are ultimately no worse than > >> > those we already have with Argument Clinic, and hence amenable to the > >> > same solution: if we need to, we can check in the generated C files in > >> > order to make bootstrapping easier. > >> > > >> > Cheers, > >> > Nick. 
> >> > > >> > -- > >> > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > >> > _______________________________________________ > >> > Python-Dev mailing list > >> > Python-Dev at python.org > >> > https://mail.python.org/mailman/listinfo/python-dev > >> > Unsubscribe: > >> > https://mail.python.org/mailman/options/python-dev/ > chris.jerdonek%40gmail.com > >> _______________________________________________ > >> Python-Dev mailing list > >> Python-Dev at python.org > >> https://mail.python.org/mailman/listinfo/python-dev > >> Unsubscribe: > >> https://mail.python.org/mailman/options/python-dev/brett%40python.org > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > victor.stinner%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Sun Jul 23 13:36:56 2017 From: brett at python.org (Brett Cannon) Date: Sun, 23 Jul 2017 17:36:56 +0000 Subject: [Python-Dev] Python startup time In-Reply-To: <75cc1c12-77cd-ef08-bc8e-69978f5bbd1b@gmail.com> References: <874lu62frj.fsf@vostro.rath.org> <20170721182100.6da92da1@presto> <08a001d302c5$b42542b0$1c6fc810$@sdamon.com> <75cc1c12-77cd-ef08-bc8e-69978f5bbd1b@gmail.com> Message-ID: On Sun, Jul 23, 2017, 00:53 Michel Desmoulin, wrote: > > > > Optimizing startup time is incredibly valuable, > > I've been reading that from the beginning of this thread but I've been > using python since the 2.4 and I never felt the burden of the startup time. > > I'm guessing a lot of people are like me, they just don't express them > self because "better startup time can't be bad so let's not put a > barrier on this". > > I'm not against it, but since the necessity of a faster Python in > general has been a debate for years and is only finally catching up with > the work of Victor Stinner, can somebody explain me the deal with start > up time ? > > I understand where it can improve your lives. I just don't get why it's > suddenly such an explosion of expectations and needs. > It's actually always been something we have tried to improve, it just comes in waves. For instance we occasionally re-examine what modules get pulled in during startup. Importlib was optimized to help with startup. This just happens to be the latest round of trying to improve the situation. As for why we care, every command-line app wants to at least appear faster if not be faster because just getting to the point of being able to e.g. print a version number is dominated by Python and app start-up. And this is not guessing; I work with a team that puts out a command line app and one of the biggest complaints they get is the startup time. -brett _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From desmoulinmichel at gmail.com Sun Jul 23 13:52:12 2017 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Sun, 23 Jul 2017 19:52:12 +0200 Subject: [Python-Dev] Python startup time In-Reply-To: References: <874lu62frj.fsf@vostro.rath.org> <20170721182100.6da92da1@presto> <08a001d302c5$b42542b0$1c6fc810$@sdamon.com> <75cc1c12-77cd-ef08-bc8e-69978f5bbd1b@gmail.com> Message-ID: Le 23/07/2017 ? 
19:36, Brett Cannon a écrit : > > > > On Sun, Jul 23, 2017, 00:53 Michel Desmoulin, > > wrote: > > > > > > > > Optimizing startup time is incredibly valuable, > > > > I've been reading that from the beginning of this thread but I've been > > using python since the 2.4 and I never felt the burden of the > > startup time. > > > > I'm guessing a lot of people are like me, they just don't express them > > self because "better startup time can't be bad so let's not put a > > barrier on this". > > > > I'm not against it, but since the necessity of a faster Python in > > general has been a debate for years and is only finally catching up with > > the work of Victor Stinner, can somebody explain me the deal with start > > up time ? > > > > I understand where it can improve your lives. I just don't get why it's > > suddenly such an explosion of expectations and needs. > > > > > > It's actually always been something we have tried to improve, it just > > comes in waves. For instance we occasionally re-examine what modules get > > pulled in during startup. Importlib was optimized to help with startup. > > This just happens to be the latest round of trying to improve the situation. > > > > As for why we care, every command-line app wants to at least appear > > faster if not be faster because just getting to the point of being able > > to e.g. print a version number is dominated by Python and app start-up. Fair enough. > And this is not guessing; I work with a team that puts out a command > line app and one of the biggest complaints they get is the startup time. This I don't get. When I run any command line utility in python (grin, ffind, pyped, django-admin.py...), they execute in a split second. I can't even SEE the difference between: python3 -c "import os; [print(x) for x in os.listdir('.')]" and ls . I'm having a hard time understanding how the Python VM startup time can be perceived as a barrier here. I can understand if you have an application firing Python 1000 times a second, like a CGI service or some kind of code exec service. But scripting? Now I can imagine that a given Python program can be slow to start up, because it imports a lot of things. But not the VM itself. > > -brett > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org >
From brett at python.org Sun Jul 23 16:28:32 2017 From: brett at python.org (Brett Cannon) Date: Sun, 23 Jul 2017 20:28:32 +0000 Subject: [Python-Dev] Python startup time In-Reply-To: References: <874lu62frj.fsf@vostro.rath.org> <20170721182100.6da92da1@presto> <08a001d302c5$b42542b0$1c6fc810$@sdamon.com> <75cc1c12-77cd-ef08-bc8e-69978f5bbd1b@gmail.com> Message-ID: On Sun, Jul 23, 2017, 10:52 Michel Desmoulin, wrote: > > > Le 23/07/2017 à 19:36, Brett Cannon a écrit : > > > > > > On Sun, Jul 23, 2017, 00:53 Michel Desmoulin, > > wrote: > > > > > > > > > Optimizing startup time is incredibly valuable, > > > > I've been reading that from the beginning of this thread but I've > been > > using python since the 2.4 and I never felt the burden of the > > startup time. > > > > I'm guessing a lot of people are like me, they just don't express > them > > self because "better startup time can't be bad so let's not put a > > barrier on this".
> > > > I'm not against it, but since the necessity of a faster Python in > > general has been a debate for years and is only finally catching up > with > > the work of Victor Stinner, can somebody explain me the deal with > start > > up time ? > > > > I understand where it can improve your lives. I just don't get why > it's > > suddenly such an explosion of expectations and needs. > > > > > > It's actually always been something we have tried to improve, it just > > comes in waves. For instance we occasionally re-examine what modules get > > pulled in during startup. Importlib was optimized to help with startup. > > This just happens to be the latest round of trying to improve the > situation. > > > > As for why we care, every command-line app wants to at least appear > > faster if not be faster because just getting to the point of being able > > to e.g. print a version number is dominated by Python and app start-up. > > > Fair enought. > > > And this is not guessing; I work with a team that puts out a command > > line app and one of the biggest complaints they get is the startup time. > > This I don't get. When I run any command line utility in python (grin, > ffind, pyped, django-admin.py...), the execute in a split second. > > I can't even SEE the different between: > > python3 -c "import os; [print(x) for x in os.listdir('.')]" > > and > > ls . > > I'm having a hard time understanding how the Python VM startup time can > be perceived as a barriere here. I can understand if you have an > application firing Python 1000 times a second, like a CGI service or > some kind of code exec service. But scripting ? > So you're viewing it from a single OS and single machine perspective. Stuff varies so much that you can't compare something like this based on a single experience. I also said "appear" on purpose. ? Some people just compare Python against other languages based on benchmarks like startup when choosing a language so part of this is optics. This also applies when people compare Python 2 to 3. > Now I can imagine that a given Python program can be slow to start up, > because it imports a lot of things. But not the VM itself. > There's also the fact that some things we might do to speed up Python's own startup will propagate to user code and so have a bigger effect, e.g. making namedtuple cheaper reaches into user code that uses namedtuple. IOW based on experience this is worth the time to look into. > > > > > -brett > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > > > https://mail.python.org/mailman/options/python-dev/brett%40python.org > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.jerdonek at gmail.com Sun Jul 23 19:48:23 2017 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sun, 23 Jul 2017 16:48:23 -0700 Subject: [Python-Dev] startup time repeated? why not daemon In-Reply-To: References: Message-ID: On Sun, Jul 23, 2017 at 5:57 AM, Victor Stinner wrote: > We already did that. See _bootlocale for example. (Maybe also > _collecctions_abc?) I was asking more in the context of recommended practices for third-party developers, as Nick mentioned earlier, because it's not a strategy I've ever seen mentioned (and common practice is to group only by functionality). It's good to know re: locale and collections though. 
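To make the grouping idea concrete, a made-up layout (illustrative only, not from any real project):

    # mytool/cli.py -- imported on every run; keeps the common path cheap
    def main(argv):
        if argv and argv[0] == 'report':
            # Deferred import: the heavy dependencies are only paid for
            # when this rarely-used command is actually requested.
            from mytool import reporting
            return reporting.run(argv[1:])
        print('nothing to do')

    # mytool/reporting.py -- rarely used, free to import expensive modules
    import json, decimal   # stand-ins for genuinely heavy imports

    def run(args):
        return json.dumps({'args': list(args)})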
Incidentally, from the issue thread it doesn't look like _bootlocale was motivated primarily by startup time, but _collections_abc was: locale: http://bugs.python.org/issue9548 collections.abc: http://bugs.python.org/issue19218 --Chris > > Victor > > Le 22 juil. 2017 07:20, "Chris Jerdonek" a ?crit > : >> >> On Fri, Jul 21, 2017 at 9:52 AM, Brett Cannon wrote: >> > On Thu, 20 Jul 2017 at 22:11 Chris Jerdonek >> > wrote: >> >> On Thu, Jul 20, 2017 at 8:49 PM, Nick Coghlan >> >> wrote: >> >> > ... >> >> > * Lazy loading can have a significant impact on startup time, as it >> >> > means you don't have to pay for the cost of finding and loading >> >> > modules that you don't actually end up using on that particular run >> > >> > It should be mentioned that I have started designing an API to make >> > using >> > lazy loading much easier in Python 3.7 (i.e. "calling a single function" >> > easier), but I still have to write the tests and such before I propose a >> > patch and it will still be mainly for apps that know what they are doing >> > since lazy loading makes debugging import errors harder. >> > ... >> >> > However, if we're going to recommend them as good practices for 3rd >> >> > party developers looking to optimise the startup time of their Python >> >> > applications, then it makes sense for us to embrace them for the >> >> > standard library as well, rather than having our first reaction be to >> >> > write more hand-crafted C code. >> >> >> >> Are there any good write-ups of best practices and techniques in this >> >> area for applications (other than obvious things like avoiding >> >> unnecessary imports)? I'm thinking of things like how to structure >> >> your project, things to look for, developer tools that might help, and >> >> perhaps third-party runtime libraries? >> > >> > Nothing beyond "profile your application" and "don't do stuff during >> > import >> > as a side-effect" that I'm aware of. >> >> One "project structure" idea of the sort I had in mind is to move >> frequently used functions in a module into their own module. This way >> the most common paths of execution don't load unneeded functions. >> Following this line of reasoning could lead to grouping functions in >> an application by when they're needed instead of by what they do, >> which is different from what we normally see. I don't recall seeing >> advice like this anywhere, so maybe the trade-offs aren't worth it. >> Thoughts? >> >> --Chris >> >> >> > >> > -Brett >> > >> >> >> >> >> >> --Chris >> >> >> >> >> >> >> >> > >> >> > On that last point, it's also worth keeping in mind that we have a >> >> > much harder time finding new C-level contributors than we do new >> >> > Python-level ones, and have every reason to expect that problem to >> >> > get >> >> > worse over time rather than better (since writing and maintaining >> >> > handcrafted C code is likely to go the way of writing and maintaining >> >> > handcrafted assembly code as a skillset: while it will still be >> >> > genuinely necessary in some contexts, it will also be an increasingly >> >> > niche technical specialty). >> >> > >> >> > Starting to migrate to using Cython for our acceleration modules >> >> > instead of plain C should thus prove to be a win for everyone: >> >> > >> >> > - Cython structurally avoids a lot of typical bugs that arise in >> >> > hand-coded extensions (e.g. 
refcount bugs) >> >> > - by design, it's much easier to mentally switch between Python & >> >> > Cython than it is between Python & C >> >> > - Cython accelerated modules are easier to adapt to other interpeter >> >> > implementations than handcrafted C modules >> >> > - keeping Python modules and their C accelerated counterparts in sync >> >> > will be easier, as they'll mostly be using the same code >> >> > - we'd be able to start writing C API test cases in Cython rather >> >> > than >> >> > in handcrafted C (which currently mostly translates to only testing >> >> > them indirectly) >> >> > - CPython's own test suite would naturally help test Cython >> >> > compatibility with any C API updates >> >> > - we'd have an inherent incentive to help enhance Cython to take >> >> > advantage of new C API features >> >> > >> >> > The are some genuine downsides in increasing the complexity of >> >> > bootstrapping CPython when all you're starting with is a VCS clone >> >> > and >> >> > a C compiler, but those complications are ultimately no worse than >> >> > those we already have with Argument Clinic, and hence amenable to the >> >> > same solution: if we need to, we can check in the generated C files >> >> > in >> >> > order to make bootstrapping easier. >> >> > >> >> > Cheers, >> >> > Nick. >> >> > >> >> > -- >> >> > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> >> > _______________________________________________ >> >> > Python-Dev mailing list >> >> > Python-Dev at python.org >> >> > https://mail.python.org/mailman/listinfo/python-dev >> >> > Unsubscribe: >> >> > >> >> > https://mail.python.org/mailman/options/python-dev/chris.jerdonek%40gmail.com >> >> _______________________________________________ >> >> Python-Dev mailing list >> >> Python-Dev at python.org >> >> https://mail.python.org/mailman/listinfo/python-dev >> >> Unsubscribe: >> >> https://mail.python.org/mailman/options/python-dev/brett%40python.org >> >> On Fri, Jul 21, 2017 at 9:52 AM, Brett Cannon wrote: >> > >> > >> > On Thu, 20 Jul 2017 at 22:11 Chris Jerdonek >> > wrote: >> >> >> >> On Thu, Jul 20, 2017 at 8:49 PM, Nick Coghlan >> >> wrote: >> >> > ... >> >> > * Lazy loading can have a significant impact on startup time, as it >> >> > means you don't have to pay for the cost of finding and loading >> >> > modules that you don't actually end up using on that particular run >> > >> > >> > It should be mentioned that I have started designing an API to make >> > using >> > lazy loading much easier in Python 3.7 (i.e. "calling a single function" >> > easier), but I still have to write the tests and such before I propose a >> > patch and it will still be mainly for apps that know what they are doing >> > since lazy loading makes debugging import errors harder. >> > >> >> >> >> > >> >> > We've historically resisted adopting these techniques for the >> >> > standard >> >> > library because they *do* make things more complicated *and* harder >> >> > to >> >> > debug relative to plain old eagerly imported dynamic Python code. >> >> > However, if we're going to recommend them as good practices for 3rd >> >> > party developers looking to optimise the startup time of their Python >> >> > applications, then it makes sense for us to embrace them for the >> >> > standard library as well, rather than having our first reaction be to >> >> > write more hand-crafted C code. 
>> >> >> >> Are there any good write-ups of best practices and techniques in this >> >> area for applications (other than obvious things like avoiding >> >> unnecessary imports)? I'm thinking of things like how to structure >> >> your project, things to look for, developer tools that might help, and >> >> perhaps third-party runtime libraries? >> > >> > >> > Nothing beyond "profile your application" and "don't do stuff during >> > import >> > as a side-effect" that I'm aware of. >> > >> > -Brett >> > >> >> >> >> >> >> --Chris >> >> >> >> >> >> >> >> > >> >> > On that last point, it's also worth keeping in mind that we have a >> >> > much harder time finding new C-level contributors than we do new >> >> > Python-level ones, and have every reason to expect that problem to >> >> > get >> >> > worse over time rather than better (since writing and maintaining >> >> > handcrafted C code is likely to go the way of writing and maintaining >> >> > handcrafted assembly code as a skillset: while it will still be >> >> > genuinely necessary in some contexts, it will also be an increasingly >> >> > niche technical specialty). >> >> > >> >> > Starting to migrate to using Cython for our acceleration modules >> >> > instead of plain C should thus prove to be a win for everyone: >> >> > >> >> > - Cython structurally avoids a lot of typical bugs that arise in >> >> > hand-coded extensions (e.g. refcount bugs) >> >> > - by design, it's much easier to mentally switch between Python & >> >> > Cython than it is between Python & C >> >> > - Cython accelerated modules are easier to adapt to other interpeter >> >> > implementations than handcrafted C modules >> >> > - keeping Python modules and their C accelerated counterparts in sync >> >> > will be easier, as they'll mostly be using the same code >> >> > - we'd be able to start writing C API test cases in Cython rather >> >> > than >> >> > in handcrafted C (which currently mostly translates to only testing >> >> > them indirectly) >> >> > - CPython's own test suite would naturally help test Cython >> >> > compatibility with any C API updates >> >> > - we'd have an inherent incentive to help enhance Cython to take >> >> > advantage of new C API features >> >> > >> >> > The are some genuine downsides in increasing the complexity of >> >> > bootstrapping CPython when all you're starting with is a VCS clone >> >> > and >> >> > a C compiler, but those complications are ultimately no worse than >> >> > those we already have with Argument Clinic, and hence amenable to the >> >> > same solution: if we need to, we can check in the generated C files >> >> > in >> >> > order to make bootstrapping easier. >> >> > >> >> > Cheers, >> >> > Nick. 
>> >> > >> >> > -- >> >> > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> >> > _______________________________________________ >> >> > Python-Dev mailing list >> >> > Python-Dev at python.org >> >> > https://mail.python.org/mailman/listinfo/python-dev >> >> > Unsubscribe: >> >> > >> >> > https://mail.python.org/mailman/options/python-dev/chris.jerdonek%40gmail.com >> >> _______________________________________________ >> >> Python-Dev mailing list >> >> Python-Dev at python.org >> >> https://mail.python.org/mailman/listinfo/python-dev >> >> Unsubscribe: >> >> https://mail.python.org/mailman/options/python-dev/brett%40python.org >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com From ncoghlan at gmail.com Sun Jul 23 23:59:30 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 24 Jul 2017 13:59:30 +1000 Subject: [Python-Dev] Python startup time In-Reply-To: References: <20170719132211.GA30782@phdru.name> <874lu62frj.fsf@vostro.rath.org> <20170721182100.6da92da1@presto> <08a001d302c5$b42542b0$1c6fc810$@sdamon.com> Message-ID: On 23 July 2017 at 09:35, Steve Dower wrote: > Yes, I?m aware of that, which is why I don?t have any specific suggestions > off-hand. But given the differences in file systems between Windows and > other OSs, it wouldn?t surprise me if there were a more optimal approach for > NTFS to amortize calls better. Perhaps not, but it is still the most > expensive part of startup that we have any ability to change, so it?s worth > investigating. That does remind me of a capability we haven''t played with a lot recently: $ python3 -m site sys.path = [ '/home/ncoghlan', '/usr/lib64/python36.zip', '/usr/lib64/python3.6', '/usr/lib64/python3.6/lib-dynload', '/home/ncoghlan/.local/lib/python3.6/site-packages', '/usr/lib64/python3.6/site-packages', '/usr/lib/python3.6/site-packages', ] USER_BASE: '/home/ncoghlan/.local' (exists) USER_SITE: '/home/ncoghlan/.local/lib/python3.6/site-packages' (exists) ENABLE_USER_SITE: True The interpreter puts a zip file ahead of the regular unpacked standard library on sys.path because at one point in time that was a useful optimisation technique for reducing import costs on application startup. It was a potentially big win with the old "multiple stat calls" import implementation, but I'm not aware of any more recent benchmarks relative to the current listdir-caching based import implementation. So I think some interesting experiments to try measuring might be: - pushing the "always imported" modules into a dedicated zip archive - having the interpreter pre-seed sys.modules with the contents of that dedicated archive - freezing those modules and building them into the interpreter that way - compiling the standalone top-level modules with Cython, and loading them as extension modules - compiling in the Cython generated modules as builtins (not currently an option for packages & submodules due to [1]) The nice thing about those kinds of approaches is that they're all fairly general purpose, and relate primarily to how the Python interpreter is put together, rather than how the individual modules are written in the first place. 
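As a rough illustration of the first of those experiments, building such an archive needs nothing beyond the zipfile module (the module list and archive name below are only placeholders; PyZipFile stores the compiled .pyc for each source file it is given):

import zipfile

# Placeholder list - in practice it would be generated from whatever is
# already in sys.modules at startup (see the P.S. below).
ALWAYS_IMPORTED = ["abc", "codecs", "io", "os", "posixpath", "stat", "genericpath"]

def build_startup_archive(target="startup_modules.zip"):
    with zipfile.PyZipFile(target, "w") as archive:
        for name in ALWAYS_IMPORTED:
            module = __import__(name)
            archive.writepy(module.__file__)  # adds the compiled .pyc at the top level
    return target

The resulting file could then be prepended to sys.path the same way the existing python36.zip entry is, to see whether it actually buys anything with the current import system.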
(I'm not volunteering to run those experiments, though - just pointing out some of the technical options we have available to us that don't involve adding more handcrafted C extension modules to CPython) [1] https://bugs.python.org/issue1644818 Cheers, NIck. P.S. Checking the current list of source modules implicitly loaded at startup, I get: >>> import sys >>> sorted(k for k, m in sys.modules.items() if m.__spec__ is not None and type(m.__spec__.loader).__name__ == "SourceFileLoader") ['_collections_abc', '_sitebuiltins', '_weakrefset', 'abc', 'codecs', 'encodings', 'encodings.aliases', 'encodings.latin_1', 'encodings.utf_8', 'genericpath', 'io', 'os', 'os.path', 'posixpath', 'rlcompleter', 'site', 'stat'] -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From larry at hastings.org Mon Jul 24 03:00:07 2017 From: larry at hastings.org (Larry Hastings) Date: Mon, 24 Jul 2017 00:00:07 -0700 Subject: [Python-Dev] Python 3.5.4rc1 and 3.4.7rc1 slipping by a day, to July 24 2017 Message-ID: Release engineering for 3.5.4rc1 and 3.4.7rc1 took a lot longer than expected, because this is the first release using "blurb", and it turned out there was a lot of work left to do and a couple dark corners yet to stumble over. 3.5.4rc1 and 3.4.7rc1 will be released Monday, July 24, 2017. The release dates for 3.5.4 final and 3.4.7 final are not expected to change. Sorry about that, //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Mon Jul 24 05:04:52 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 24 Jul 2017 11:04:52 +0200 Subject: [Python-Dev] Is Windows XP still supported on Python 2.7? Message-ID: Hi, We have a Windows XP buildbot for Python 2.7, run by David Bolen: http://buildbot.python.org/all/builders/x86%20Windows%20XP%202.7/ test_bsddb3 fails randomly on this buildbot: http://bugs.python.org/issue30778 But Windows XP clearly reached its end-of-life, Microsoft doesn't support it anymore. So my question is if it makes sense to spend time on it? We have a rule for new x.y.0 released, but not if a Microsoft Windows support expires during the lifetime of a Python release (2.7 here): https://www.python.org/dev/peps/pep-0011/#microsoft-windows Firefox made great efforts to support Windows XP last years, but they decided to drop support last March with Firefox 52, last release supporting XP and Visa: https://support.mozilla.org/en-US/kb/end-support-windows-xp-and-vista Victor From tritium-list at sdamon.com Mon Jul 24 05:38:07 2017 From: tritium-list at sdamon.com (Alex Walters) Date: Mon, 24 Jul 2017 05:38:07 -0400 Subject: [Python-Dev] Is Windows XP still supported on Python 2.7? In-Reply-To: References: Message-ID: <09ff01d30460$8ed79160$ac86b420$@sdamon.com> The promise that PEP-11 is making is that as long as a python was released while Microsoft still supported that OS, and that python is still supported, there will still be a python that works for you. So, yes, Windows XP is long since unsupported by Microsoft, but a disturbing number of people still run it. (I think the NHS in the UK still runs embedded windows XP, just to name a big one). Yes, it's a support burden, but it's on the support burden version of python anyways. 2.7 is a very slow moving branch so it shouldn't be THAT big of a pain for the last 2 years of python 2 support. 
> -----Original Message----- > From: Python-Dev [mailto:python-dev-bounces+tritium- > list=sdamon.com at python.org] On Behalf Of Victor Stinner > Sent: Monday, July 24, 2017 5:05 AM > To: Python Dev ; David Bolen > > Subject: [Python-Dev] Is Windows XP still supported on Python 2.7? > > Hi, > > We have a Windows XP buildbot for Python 2.7, run by David Bolen: > http://buildbot.python.org/all/builders/x86%20Windows%20XP%202.7/ > > test_bsddb3 fails randomly on this buildbot: > http://bugs.python.org/issue30778 > > But Windows XP clearly reached its end-of-life, Microsoft doesn't > support it anymore. So my question is if it makes sense to spend time > on it? > > We have a rule for new x.y.0 released, but not if a Microsoft Windows > support expires during the lifetime of a Python release (2.7 here): > https://www.python.org/dev/peps/pep-0011/#microsoft-windows > > Firefox made great efforts to support Windows XP last years, but they > decided to drop support last March with Firefox 52, last release > supporting XP and Visa: > https://support.mozilla.org/en-US/kb/end-support-windows-xp-and-vista > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/tritium- > list%40sdamon.com From victor.stinner at gmail.com Mon Jul 24 05:56:44 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 24 Jul 2017 11:56:44 +0200 Subject: [Python-Dev] Is Windows XP still supported on Python 2.7? In-Reply-To: <09ff01d30460$8ed79160$ac86b420$@sdamon.com> References: <09ff01d30460$8ed79160$ac86b420$@sdamon.com> Message-ID: 2017-07-24 11:38 GMT+02:00 Alex Walters : > The promise that PEP-11 is making is that as long as a python was released > while Microsoft still supported that OS, and that python is still supported, > there will still be a python that works for you. So, yes, Windows XP is > long since unsupported by Microsoft, but a disturbing number of people still > run it. (I think the NHS in the UK still runs embedded windows XP, just to > name a big one). Yes, it's a support burden, but it's on the support burden > version of python anyways. 2.7 is a very slow moving branch so it shouldn't > be THAT big of a pain for the last 2 years of python 2 support. Python 2.7.13 which was released at 2016-12-17 still supports Windows XP. It's not like you cannot install on Windows XP anymore. The question is who will fix bugs specific to Windows XP or Visa until 2020. Having a working Windows XP is probably required to debug and fix such bugs. I don't want to install Windows XP on my network. The last time I ran XP was something like 4 years ago, and I was reading an article saying that malwares are able to attack XP even during the installation time. So you cannot be sure that installed Windows is safe of malwares or not... 
Victor From ncoghlan at gmail.com Mon Jul 24 10:25:04 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 25 Jul 2017 00:25:04 +1000 Subject: [Python-Dev] Cython compiled stdlib modules - Re: Python startup time In-Reply-To: References: <20170719132211.GA30782@phdru.name> Message-ID: On 22 July 2017 at 06:43, Stefan Behnel wrote: > Nick Coghlan schrieb am 21.07.2017 um 08:23: >> I'll also note that in these cases where the import overhead is >> proportionally significant for always-imported modules, we may want to look >> at the benefits of freezing them (if they otherwise remain as pure Python >> modules), or compiling them as builtin modules (if we switch them over to >> Cython), in addition to looking at ways to make the modules themselves >> faster. > > Just for the sake of it, I gave the Cython compilation a try. I had to > apply the attached hack to Lib/typing.py to get the test passing, because > it uses frame call offsets in some places and Cython functions do not > create frames when being called (they only create them for exception > traces). I also had to disable the import of "abc" in the Cython generated > module to remove the circular self dependency at startup when the "abc" > module is compiled. That shouldn't have an impact on the runtime > performance, though. [snip] > As it stands, the gain is probably not worth the increase in library file > size, which also translates to a higher bottom line for the memory > consumption. At least not for these two modules. Manually optimising the > files would likely also reduce the .so file size in addition to giving > better speedups, though, because the generated code would become less generic. Thanks for trying the experiment! I agree with your conclusion that the file size impact likely rules it out as a general technique. Selective freezing may still be interesting though, since that at least avoids the import path searches and merges the disk read into the initial loading of the executable. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From katherinebobrovnik at gmail.com Mon Jul 24 04:04:48 2017 From: katherinebobrovnik at gmail.com (Katherine Bobrovnik) Date: Mon, 24 Jul 2017 11:04:48 +0300 Subject: [Python-Dev] dictionaries in Dataframe column Message-ID: Hello guys! I've stuck at this: I have pandas Dataframe with a lot of columns. One column contains dictionaries with emoji like {'count': 1, 'name': 'fire'}. My goal is to sort rows of this Dataframe by the number from 'count'. Like first row will be with {'count': 49, 'name': '+1'}, the last - {'count': 1, 'name': 'fire'}, for instance. Help, please :) Best regards, Kateryna From phd at phdru.name Mon Jul 24 11:11:21 2017 From: phd at phdru.name (Oleg Broytman) Date: Mon, 24 Jul 2017 17:11:21 +0200 Subject: [Python-Dev] dictionaries in Dataframe column In-Reply-To: References: Message-ID: <20170724151121.GA30536@phdru.name> Hello. This mailing list is to work on developing Python (adding new features to Python itself and fixing bugs); if you're having problems learning, understanding or using Python, please find another forum. Probably python-list/comp.lang.python mailing list/news group is the best place; there are Python developers who participate in it; you may get a faster, and probably more complete, answer there. See http://www.python.org/community/ for other lists/news groups/fora. Thank you for understanding. On Mon, Jul 24, 2017 at 11:04:48AM +0300, Katherine Bobrovnik wrote: > Hello guys! 
> I've stuck at this: > I have pandas Dataframe with a lot of columns. One column contains > dictionaries with emoji like {'count': 1, 'name': 'fire'}. My goal is > to sort rows of this Dataframe by the number from 'count'. Like first > row will be with {'count': 49, 'name': '+1'}, the last - {'count': > 1, 'name': 'fire'}, for instance. Help, please :) > > Best regards, > Kateryna > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/phd%40phdru.name Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From kiuhnm03 at yahoo.it Mon Jul 24 11:35:03 2017 From: kiuhnm03 at yahoo.it (Kiuhnm) Date: Mon, 24 Jul 2017 17:35:03 +0200 Subject: [Python-Dev] for...else Message-ID: Hello, I think that the expression "for...else" or "while...else" is completely counter-intuitive. Wouldn't it be possible to make it clearer? Maybe something like break in for i in range(n): ... if cond: break else: ... I'm not an English native speaker so I don't know whether "break in" is acceptable English in this context or can only mean "to get into a building by force". Kiuhnm From steve at pearwood.info Mon Jul 24 12:14:48 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 25 Jul 2017 02:14:48 +1000 Subject: [Python-Dev] for...else In-Reply-To: References: Message-ID: <20170724161447.GQ3149@ando.pearwood.info> Hello Kiuhnm, and welcome. On Mon, Jul 24, 2017 at 05:35:03PM +0200, Kiuhnm via Python-Dev wrote: > Hello, > > I think that the expression "for...else" or "while...else" is completely > counter-intuitive. You may be right -- this has been discussed many, many times before. In my personal opinion, the best (and only accurate!) phrase would have been: for item in sequence: # block then: # block If you look at the byte-code generated by a for...else statement, you see that the "else" block is unconditionally executed after the for loop completes, unless something causes a jump outside of the entire statement: return, break, or raise. So it is more like: - run the loop; - *then* run the following block rather than: - run the loop; - otherwise ("else") run the following block. Others disagree and would prefer other keywords. But regardless, backwards compatibility means that we must keep "for...else", so I'm afraid that discussing alternatives is *almost certainly* a waste of time. > Wouldn't it be possible to make it clearer? Maybe > something like At this point, no, it is not practical to change the syntax used. Maybe when Python 3.0 was first introduced, but that ship has long sailed. It is very, very unlikely that the syntax for this will ever change, but if it does, it probably won't be until something in the distant future like Python 5. But not Python 4: Guido has already ruled that Python 4 will not include major backwards-incompatible changes. Going from 3 to 4 will not be as disruptive as going from 2 to 3. So depending on how you look at it: discussing alternative syntax to for...else is either ten years too late or ten years too early. 
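In the meantime, a tiny runnable example may help make the actual behaviour concrete (there is nothing special about the values used here):

def describe(needle, haystack):
    for index, item in enumerate(haystack):
        if item == needle:
            print(needle, "found at index", index)
            break
    else:
        # Only reached if the loop ran to completion without hitting `break`.
        print(needle, "not found")

describe(3, [1, 2, 3])   # prints: 3 found at index 2
describe(9, [1, 2, 3])   # prints: 9 not found

Read the `else` as "if the loop was not broken out of", and the construct matches what the byte-code actually does.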
-- Steve From benhoyt at gmail.com Mon Jul 24 12:23:49 2017 From: benhoyt at gmail.com (Ben Hoyt) Date: Mon, 24 Jul 2017 12:23:49 -0400 Subject: [Python-Dev] for...else In-Reply-To: <20170724161447.GQ3149@ando.pearwood.info> References: <20170724161447.GQ3149@ando.pearwood.info> Message-ID: This is more of a python-ideas discussion, and Steven's answer is good. I'll just add one thing. Maybe it's obvious to others, but I've liked for...else since I found a kind of mnemonic to help me remember when the "else" part happens: I think of it not as "for ... else" but as "break ... else" -- saying it this way makes it clear to me that the break goes with the else. "If this condition inside the loop is true, break. ... *else* if we didn't break, do this other thing after the loop." -Ben On Mon, Jul 24, 2017 at 12:14 PM, Steven D'Aprano wrote: > Hello Kiuhnm, and welcome. > > On Mon, Jul 24, 2017 at 05:35:03PM +0200, Kiuhnm via Python-Dev wrote: > > Hello, > > > > I think that the expression "for...else" or "while...else" is completely > > counter-intuitive. > > > You may be right -- this has been discussed many, many times before. In > my personal opinion, the best (and only accurate!) phrase would have > been: > > for item in sequence: > # block > then: > # block > > If you look at the byte-code generated by a for...else statement, you > see that the "else" block is unconditionally executed after the for loop > completes, unless something causes a jump outside of the entire > statement: return, break, or raise. So it is more like: > > - run the loop; > - *then* run the following block > > rather than: > > - run the loop; > - otherwise ("else") run the following block. > > Others disagree and would prefer other keywords. But regardless, > backwards compatibility means that we must keep "for...else", so I'm > afraid that discussing alternatives is *almost certainly* a waste of > time. > > > > Wouldn't it be possible to make it clearer? Maybe > > something like > > At this point, no, it is not practical to change the syntax used. Maybe > when Python 3.0 was first introduced, but that ship has long sailed. It > is very, very unlikely that the syntax for this will ever change, but if > it does, it probably won't be until something in the distant future like > Python 5. > > But not Python 4: Guido has already ruled that Python 4 will not include > major backwards-incompatible changes. Going from 3 to 4 will not be as > disruptive as going from 2 to 3. > > So depending on how you look at it: discussing alternative syntax to > for...else is either ten years too late or ten years too early. > > > > -- > Steve > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > benhoyt%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Mon Jul 24 12:36:44 2017 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 24 Jul 2017 12:36:44 -0400 Subject: [Python-Dev] for...else In-Reply-To: References: <20170724161447.GQ3149@ando.pearwood.info> Message-ID: On Mon, Jul 24, 2017 at 12:23 PM, Ben Hoyt wrote: > .. I found a kind of mnemonic to help me remember when the > "else" part happens: I think of it not as "for ... else" but as "break ... > else" -- saying it this way makes it clear to me that the break goes with > the else. 
"If this condition inside the loop is true, break. ... *else* if > we didn't break, do this other thing after the loop." Note that since break itself is typically guarded by an "if" as in for i in x: ... if cond(i): break ... else: ... you can match the "else" above to the "if" inside the loop. From tjreedy at udel.edu Mon Jul 24 12:48:16 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 24 Jul 2017 12:48:16 -0400 Subject: [Python-Dev] Is Windows XP still supported on Python 2.7? In-Reply-To: References: Message-ID: On 7/24/2017 5:04 AM, Victor Stinner wrote: > We have a Windows XP buildbot for Python 2.7, run by David Bolen: > http://buildbot.python.org/all/builders/x86%20Windows%20XP%202.7/ > > test_bsddb3 fails randomly on this buildbot: > http://bugs.python.org/issue30778 If that turns out to be an unfixable intermittent failure of two particular functions, then it becomes expected. To keep buildbots green, skip the one that crashes and turn the failure of the other into a skip. > But Windows XP clearly reached its end-of-life, Microsoft doesn't > support it anymore. So my question is if it makes sense to spend time > on it? When an entity *sells* 'support', that means an active effort to fix. For free, volunteer-built Python, 'support' for system Z means that we (continue to) allow system Z specific code, new patches, and a system Z buildbot. If we make an installer, it installs on system Z. In that minimal sense, xp is still generally supported, although experts for specific modules may refuse to review and merge patches for 'their' modules. But that exception typically has nothing to do with xp in particular. But it could. When XP became unsupported in 3.5, xp-specific code was removed, the xp buildbot was removed, we started rejecting xp-specific patches, and our windows installer refused to install on XP. Are you are proposing this for 2.7? > We have a rule for new x.y.0 released, but not if a Microsoft Windows > support expires during the lifetime of a Python release (2.7 here): > https://www.python.org/dev/peps/pep-0011/#microsoft-windows > Firefox made great efforts to support Windows XP last years, but they > decided to drop support last March with Firefox 52, last release > supporting XP and Visa: > https://support.mozilla.org/en-US/kb/end-support-windows-xp-and-vista The link says that they continue with security patches until next September, at which point they will review installation numbers. The extra 5 years of support for 2.7, above the originally intended 5 years, is mainly for security fixes, although some people continue with routine non-security bugfixes. -- Terry Jan Reedy From zachary.ware+pydev at gmail.com Mon Jul 24 13:05:47 2017 From: zachary.ware+pydev at gmail.com (Zachary Ware) Date: Mon, 24 Jul 2017 12:05:47 -0500 Subject: [Python-Dev] Is Windows XP still supported on Python 2.7? In-Reply-To: References: Message-ID: On Mon, Jul 24, 2017 at 11:48 AM, Terry Reedy wrote: > On 7/24/2017 5:04 AM, Victor Stinner wrote: >> We have a Windows XP buildbot for Python 2.7, run by David Bolen: >> http://buildbot.python.org/all/builders/x86%20Windows%20XP%202.7/ >> >> test_bsddb3 fails randomly on this buildbot: >> http://bugs.python.org/issue30778 > > > If that turns out to be an unfixable intermittent failure of two particular > functions, then it becomes expected. To keep buildbots green, skip the one > that crashes and turn the failure of the other into a skip. We are committed to support back to Windows 2000 in Python 2.7. 
In general, that just means "don't commit something that makes the platform unsupportable and accept a patch if somebody fixes something on that platform". In this case, considering that it's a test of a 2.x-only module on an out-of-vendor-support OS, skipping the tests (possibly even the entirety of test_bsddb3) on XP sounds just fine to me. -- Zach From victor.stinner at gmail.com Mon Jul 24 13:08:59 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 24 Jul 2017 19:08:59 +0200 Subject: [Python-Dev] Is Windows XP still supported on Python 2.7? In-Reply-To: References: Message-ID: 2017-07-24 19:05 GMT+02:00 Zachary Ware : > In this case, considering that it's a test of a > 2.x-only module on an out-of-vendor-support OS, skipping the tests > (possibly even the entirety of test_bsddb3) on XP sounds just fine to > me. Oh ok. Since Terry and you agree on that, I will skip the test on Windows XP. Victor From isaac.morland at gmail.com Mon Jul 24 13:12:45 2017 From: isaac.morland at gmail.com (Isaac Morland) Date: Mon, 24 Jul 2017 13:12:45 -0400 Subject: [Python-Dev] for...else In-Reply-To: References: Message-ID: The way I remember it is to observe that the following are *almost* exactly the same thing: if C: T else: E while C: T else: E The *only* differences are: 1) where execution jumps if it reaches the end of the T: in the "while", it jumps back to the while itself, resulting in the condition being rechecked, whereas in the "if" execution skips over the "else" to whatever follows; and 2) in the "while", inside the T "break" and "continue" relate to this control structure rather than to a containing loop. (At least I don't think I'm missing any other differences!) Seen this way, the meaning of the "else" is easy to understand and to remember. And the for loop else is like the while loop else. On 24 July 2017 at 11:35, Kiuhnm via Python-Dev wrote: > Hello, > > I think that the expression "for...else" or "while...else" is completely > counter-intuitive. Wouldn't it be possible to make it clearer? Maybe > something like > > break in for i in range(n): > ... > if cond: > break > else: > ... > > I'm not an English native speaker so I don't know whether "break in" is > acceptable English in this context or can only mean "to get into a building > by force". > > Kiuhnm > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/isaac. > morland%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.xihong.wang at intel.com Mon Jul 24 13:49:44 2017 From: peter.xihong.wang at intel.com (Wang, Peter Xihong) Date: Mon, 24 Jul 2017 17:49:44 +0000 Subject: [Python-Dev] Program runs in 12s on Python 2.7, but 5s on Python 3.5 -- why so much difference? In-Reply-To: References: Message-ID: <371EBC7881C7844EAAF5556BFF21BCCC830D98CF@ORSMSX105.amr.corp.intel.com> Hi Ben, Out of curiosity with a quick experiment, I ran your pentomino.py with 2.7.12 PGO+LTO build (Ubuntu OS 16.04.2 LTS default at /usr/bin/python), and compared with 3.7.0 alpha1 PGO+LTO (which I built a while ago), on my SkyLake processor based desktop, and 2.7 outperforms 3.7 by 3.5%. On your 2.5 GHz i7 system, I'd recommend making sure the 2 Python binaries you are comparing are in equal footings (compiled with same optimization PGO+LTO). 
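One quick way to sanity-check that from inside the interpreter is to dump the recorded build configuration (a rough check only -- the variable names below come from the Unix Makefile, so some may be empty or absent on other builds):

import sysconfig

# Inspect these for optimization-related settings (e.g. --enable-optimizations
# for PGO on 3.6+, or -flto for LTO); get_config_var() returns None if a
# variable is not recorded for this build.
for name in ("CONFIG_ARGS", "PY_CFLAGS", "PY_LDFLAGS"):
    print(name, "=", sysconfig.get_config_var(name))

Running that under each of the binaries being compared makes it easier to tell whether an apparent 2-vs-3 difference is really just a build-flags difference.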
Thanks, Peter -----Original Message----- From: Python-Dev [mailto:python-dev-bounces+peter.xihong.wang=intel.com at python.org] On Behalf Of Nathaniel Smith Sent: Tuesday, July 18, 2017 7:00 PM To: Ben Hoyt Cc: Python-Dev Subject: Re: [Python-Dev] Program runs in 12s on Python 2.7, but 5s on Python 3.5 -- why so much difference? I'd probably start with a regular C-level profiler, like perf or callgrind. They're not very useful for comparing two versions of code written in Python, but here the Python code is the same (modulo changes in the stdlib), and it's changes in the interpreter's C code that probably make the difference. On Tue, Jul 18, 2017 at 9:03 AM, Ben Hoyt wrote: > Hi folks, > > (Not entirely sure this is the right place for this question, but > hopefully it's of interest to several folks.) > > A few days ago I posted a note in response to Victor Stinner's > articles on his CPython contributions, noting that I wrote a program > that ran in 11.7 seconds on Python 2.7, but only takes 5.1 seconds on > Python 3.5 (on my 2.5 GHz macOS i7), more than 2x as fast. Obviously > this is a Good Thing, but I'm curious as to why there's so much difference. > > The program is a pentomino puzzle solver, and it works via code > generation, generating a ton of nested "if" statements, so I believe > it's exercising the Python bytecode interpreter heavily. Obviously > there have been some big optimizations to make this happen, but I'm > curious what the main improvements are that are causing this much difference. > > There's a writeup about my program here, with benchmarks at the bottom: > http://benhoyt.com/writings/python-pentomino/ > > This is the generated Python code that's being exercised: > https://github.com/benhoyt/python-pentomino/blob/master/generated_solv > e.py > > For reference, on Python 3.6 it runs in 4.6 seconds (same on Python > 3.7 alpha). This smallish increase from Python 3.5 to Python 3.6 was > more expected to me due to the bytecode changing to wordcode in 3.6. > > I tried using cProfile on both Python versions, but that didn't say > much, because the functions being called aren't taking the majority of the time. > How does one benchmark at a lower level, or otherwise explain what's > going on here? > > Thanks, > Ben > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/njs%40pobox.com > -- Nathaniel J. Smith -- https://vorpus.org _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/peter.xihong.wang%40intel.com From benhoyt at gmail.com Mon Jul 24 15:35:22 2017 From: benhoyt at gmail.com (Ben Hoyt) Date: Mon, 24 Jul 2017 15:35:22 -0400 Subject: [Python-Dev] Program runs in 12s on Python 2.7, but 5s on Python 3.5 -- why so much difference? In-Reply-To: <371EBC7881C7844EAAF5556BFF21BCCC830D98CF@ORSMSX105.amr.corp.intel.com> References: <371EBC7881C7844EAAF5556BFF21BCCC830D98CF@ORSMSX105.amr.corp.intel.com> Message-ID: Thanks for testing. Oddly, I just tested it in Linux (Ubuntu), and get the same results as you -- Python 2.7.13 outperforms 3 (3.5.3 in my case) by a few percent. And even under a Virtualbox VM it takes 3.4 and 3.6 seconds, compared to ~5s on the host macOS operating system. Very odd. 
I guess that means Virtualbox is very good, and that clang/LLVM is not as good at optimizing the Python VM as gcc is. I can't find anything majorly different about my macOS Python 2 and 3 builds. Both look like they have PGO turned on (from sysconfig.get_config_vars()). Both have HAVE_COMPUTED_GOTOS=1 but USE_COMPUTED_GOTOS=0 for some reason. My Python 2 version is the macOS system version (/usr/local/bin/python2), whereas my Python3 version is from "brew install", so that's probably the difference, though still doesn't explain exactly why. -Ben On Mon, Jul 24, 2017 at 1:49 PM, Wang, Peter Xihong < peter.xihong.wang at intel.com> wrote: > Hi Ben, > > Out of curiosity with a quick experiment, I ran your pentomino.py with > 2.7.12 PGO+LTO build (Ubuntu OS 16.04.2 LTS default at /usr/bin/python), > and compared with 3.7.0 alpha1 PGO+LTO (which I built a while ago), on my > SkyLake processor based desktop, and 2.7 outperforms 3.7 by 3.5%. > On your 2.5 GHz i7 system, I'd recommend making sure the 2 Python binaries > you are comparing are in equal footings (compiled with same optimization > PGO+LTO). > > Thanks, > > Peter > > > > -----Original Message----- > From: Python-Dev [mailto:python-dev-bounces+peter.xihong.wang=intel.com@ > python.org] On Behalf Of Nathaniel Smith > Sent: Tuesday, July 18, 2017 7:00 PM > To: Ben Hoyt > Cc: Python-Dev > Subject: Re: [Python-Dev] Program runs in 12s on Python 2.7, but 5s on > Python 3.5 -- why so much difference? > > I'd probably start with a regular C-level profiler, like perf or > callgrind. They're not very useful for comparing two versions of code > written in Python, but here the Python code is the same (modulo changes in > the stdlib), and it's changes in the interpreter's C code that probably > make the difference. > > On Tue, Jul 18, 2017 at 9:03 AM, Ben Hoyt wrote: > > Hi folks, > > > > (Not entirely sure this is the right place for this question, but > > hopefully it's of interest to several folks.) > > > > A few days ago I posted a note in response to Victor Stinner's > > articles on his CPython contributions, noting that I wrote a program > > that ran in 11.7 seconds on Python 2.7, but only takes 5.1 seconds on > > Python 3.5 (on my 2.5 GHz macOS i7), more than 2x as fast. Obviously > > this is a Good Thing, but I'm curious as to why there's so much > difference. > > > > The program is a pentomino puzzle solver, and it works via code > > generation, generating a ton of nested "if" statements, so I believe > > it's exercising the Python bytecode interpreter heavily. Obviously > > there have been some big optimizations to make this happen, but I'm > > curious what the main improvements are that are causing this much > difference. > > > > There's a writeup about my program here, with benchmarks at the bottom: > > http://benhoyt.com/writings/python-pentomino/ > > > > This is the generated Python code that's being exercised: > > https://github.com/benhoyt/python-pentomino/blob/master/generated_solv > > e.py > > > > For reference, on Python 3.6 it runs in 4.6 seconds (same on Python > > 3.7 alpha). This smallish increase from Python 3.5 to Python 3.6 was > > more expected to me due to the bytecode changing to wordcode in 3.6. > > > > I tried using cProfile on both Python versions, but that didn't say > > much, because the functions being called aren't taking the majority of > the time. > > How does one benchmark at a lower level, or otherwise explain what's > > going on here? 
> > > > Thanks, > > Ben > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > > https://mail.python.org/mailman/options/python-dev/njs%40pobox.com > > > > > > -- > Nathaniel J. Smith -- https://vorpus.org ______________________________ > _________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > peter.xihong.wang%40intel.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.xihong.wang at intel.com Mon Jul 24 16:47:37 2017 From: peter.xihong.wang at intel.com (Wang, Peter Xihong) Date: Mon, 24 Jul 2017 20:47:37 +0000 Subject: [Python-Dev] Program runs in 12s on Python 2.7, but 5s on Python 3.5 -- why so much difference? In-Reply-To: References: <371EBC7881C7844EAAF5556BFF21BCCC830D98CF@ORSMSX105.amr.corp.intel.com> Message-ID: <371EBC7881C7844EAAF5556BFF21BCCC830D9970@ORSMSX105.amr.corp.intel.com> I believe we have evaluated clang vs gcc before (long time ago), and gcc won at that time. PGO might have overshadowed impact from computed goto, and thus the latter may no longer be needed. When the performance difference is as large as 50%, there could be various options to nail down the root cause, including bytecode analysis. However, coming down to 3.6 sec vs 3.4 sec, a delta of ~5%, it could be hard to find out. Internally we use sampling based performance tools for micro-architecture level analysis. Or generic Linux based and open source tool ?perf? is very good to use. You could also do a disassembly analysis/comparison of the object files such as the main loop, ceval.o, looking at the efficiency of the generated codes (which gives generic info regarding to Python2 and 3, but may not tell you the run time behavior with respect your specific app, pentomino.py). Hope that helps. Peter From: Ben Hoyt [mailto:benhoyt at gmail.com] Sent: Monday, July 24, 2017 12:35 PM To: Wang, Peter Xihong Cc: Nathaniel Smith ; Python-Dev Subject: Re: [Python-Dev] Program runs in 12s on Python 2.7, but 5s on Python 3.5 -- why so much difference? Thanks for testing. Oddly, I just tested it in Linux (Ubuntu), and get the same results as you -- Python 2.7.13 outperforms 3 (3.5.3 in my case) by a few percent. And even under a Virtualbox VM it takes 3.4 and 3.6 seconds, compared to ~5s on the host macOS operating system. Very odd. I guess that means Virtualbox is very good, and that clang/LLVM is not as good at optimizing the Python VM as gcc is. I can't find anything majorly different about my macOS Python 2 and 3 builds. Both look like they have PGO turned on (from sysconfig.get_config_vars()). Both have HAVE_COMPUTED_GOTOS=1 but USE_COMPUTED_GOTOS=0 for some reason. My Python 2 version is the macOS system version (/usr/local/bin/python2), whereas my Python3 version is from "brew install", so that's probably the difference, though still doesn't explain exactly why. -Ben On Mon, Jul 24, 2017 at 1:49 PM, Wang, Peter Xihong > wrote: Hi Ben, Out of curiosity with a quick experiment, I ran your pentomino.py with 2.7.12 PGO+LTO build (Ubuntu OS 16.04.2 LTS default at /usr/bin/python), and compared with 3.7.0 alpha1 PGO+LTO (which I built a while ago), on my SkyLake processor based desktop, and 2.7 outperforms 3.7 by 3.5%. 
On your 2.5 GHz i7 system, I'd recommend making sure the 2 Python binaries you are comparing are in equal footings (compiled with same optimization PGO+LTO). Thanks, Peter -----Original Message----- From: Python-Dev [mailto:python-dev-bounces+peter.xihong.wang=intel.com at python.org] On Behalf Of Nathaniel Smith Sent: Tuesday, July 18, 2017 7:00 PM To: Ben Hoyt > Cc: Python-Dev > Subject: Re: [Python-Dev] Program runs in 12s on Python 2.7, but 5s on Python 3.5 -- why so much difference? I'd probably start with a regular C-level profiler, like perf or callgrind. They're not very useful for comparing two versions of code written in Python, but here the Python code is the same (modulo changes in the stdlib), and it's changes in the interpreter's C code that probably make the difference. On Tue, Jul 18, 2017 at 9:03 AM, Ben Hoyt > wrote: > Hi folks, > > (Not entirely sure this is the right place for this question, but > hopefully it's of interest to several folks.) > > A few days ago I posted a note in response to Victor Stinner's > articles on his CPython contributions, noting that I wrote a program > that ran in 11.7 seconds on Python 2.7, but only takes 5.1 seconds on > Python 3.5 (on my 2.5 GHz macOS i7), more than 2x as fast. Obviously > this is a Good Thing, but I'm curious as to why there's so much difference. > > The program is a pentomino puzzle solver, and it works via code > generation, generating a ton of nested "if" statements, so I believe > it's exercising the Python bytecode interpreter heavily. Obviously > there have been some big optimizations to make this happen, but I'm > curious what the main improvements are that are causing this much difference. > > There's a writeup about my program here, with benchmarks at the bottom: > http://benhoyt.com/writings/python-pentomino/ > > This is the generated Python code that's being exercised: > https://github.com/benhoyt/python-pentomino/blob/master/generated_solv > e.py > > For reference, on Python 3.6 it runs in 4.6 seconds (same on Python > 3.7 alpha). This smallish increase from Python 3.5 to Python 3.6 was > more expected to me due to the bytecode changing to wordcode in 3.6. > > I tried using cProfile on both Python versions, but that didn't say > much, because the functions being called aren't taking the majority of the time. > How does one benchmark at a lower level, or otherwise explain what's > going on here? > > Thanks, > Ben > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/njs%40pobox.com > -- Nathaniel J. Smith -- https://vorpus.org _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/peter.xihong.wang%40intel.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Mon Jul 24 17:03:42 2017 From: greg at krypto.org (Gregory P. Smith) Date: Mon, 24 Jul 2017 21:03:42 +0000 Subject: [Python-Dev] Program runs in 12s on Python 2.7, but 5s on Python 3.5 -- why so much difference? 
In-Reply-To: <371EBC7881C7844EAAF5556BFF21BCCC830D9970@ORSMSX105.amr.corp.intel.com> References: <371EBC7881C7844EAAF5556BFF21BCCC830D98CF@ORSMSX105.amr.corp.intel.com> <371EBC7881C7844EAAF5556BFF21BCCC830D9970@ORSMSX105.amr.corp.intel.com> Message-ID: On Mon, Jul 24, 2017 at 1:49 PM Wang, Peter Xihong < peter.xihong.wang at intel.com> wrote: > I believe we have evaluated clang vs gcc before (long time ago), and gcc > won at that time. > > > > PGO might have overshadowed impact from computed goto, and thus the latter > may no longer be needed. > Computed goto is still needed. PGO does not magically replace it. A PGO build with computed goto is faster than one without computed goto. ... as tested on gcc 4.9 a couple years ago. I doubt that has changed or changes between compilers; PGO and computed goto are different types of optimizations. -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From mariatta.wijaya at gmail.com Mon Jul 24 18:20:07 2017 From: mariatta.wijaya at gmail.com (Mariatta Wijaya) Date: Mon, 24 Jul 2017 15:20:07 -0700 Subject: [Python-Dev] Appending a link back to bugs.python.org in GitHub PRs In-Reply-To: References: Message-ID: Thanks for working on this, Kushal and Brett. Works great! Mariatta Wijaya On Fri, Jul 21, 2017 at 2:28 PM, Brett Cannon wrote: > Thanks to Kushal Das we now have one of the most requested features since > the transition: a link in PRs back to bugs.python.org (in a more > discoverable way since we have had them since Bedevere launched :) . When a > pull request comes in with an issue number in the title (or one gets > added), a link to bugs.python.org will be appended to the PR's body (the > message you fill out when creating a PR). There's no logic to remove the > link if the issue number is removed from the title, changed, or for > multiple issue numbers since basically those cases are all rare and it was > easier to launch without that kind of support. > > P.S.: Berker Peksag is working on providing commit emails with diffs in > them which is the other most requested feature since the transition. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > mariatta.wijaya%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Jul 25 01:51:10 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 25 Jul 2017 15:51:10 +1000 Subject: [Python-Dev] for...else In-Reply-To: References: <20170724161447.GQ3149@ando.pearwood.info> Message-ID: On 25 July 2017 at 02:23, Ben Hoyt wrote: > This is more of a python-ideas discussion, and Steven's answer is good. > > I'll just add one thing. Maybe it's obvious to others, but I've liked > for...else since I found a kind of mnemonic to help me remember when the > "else" part happens: I think of it not as "for ... else" but as "break ... > else" -- saying it this way makes it clear to me that the break goes with > the else. "If this condition inside the loop is true, break. ... *else* if > we didn't break, do this other thing after the loop." 
For folks looking for a more in-depth explanation of the "if-break-else" approach to thinking about this construct: http://python-notes.curiousefficiency.org/en/latest/python_concepts/break_else.html That article also has a note explaining that we're unlikely to ever change this: http://python-notes.curiousefficiency.org/en/latest/python_concepts/break_else.html#but-couldn-t-python-be-different Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From larry at hastings.org Tue Jul 25 04:37:02 2017 From: larry at hastings.org (Larry Hastings) Date: Tue, 25 Jul 2017 01:37:02 -0700 Subject: [Python-Dev] RELEASED] Python 3.4.7rc1 and Python 3.5.4rc1 are now available Message-ID: On behalf of the Python development community and the Python 3.4 and Python 3.5 release teams, I'm relieved to announce the availability of Python 3.4.7rc1 and Python 3.5.4rc1. Python 3.4 is now in "security fixes only" mode. This is the final stage of support for Python 3.4. Python 3.4 now only receives security fixes, not bug fixes, and Python 3.4 releases are source code only--no more official binary installers will be produced. Python 3.5.4 will be the final 3.5 release in "bug fix" mode. After 3.5.4 is released, Python 3.5 will also move into "security fixes mode". Both these releases are "release candidates". They should not be considered the final releases, although the final releases should contain only minor differences. Python users are encouraged to test with these releases and report any problems they encounter. You can find Python 3.4.7rc1 here: https://www.python.org/downloads/release/python-347rc1/ And you can find Python 3.5.4rc1 here: https://www.python.org/downloads/release/python-354rc1/ Python 3.4.7 final and Python 3.5.4 final are both scheduled for release on August 6th, 2017. Happy Pythoning, //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Tue Jul 25 06:11:22 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 25 Jul 2017 12:11:22 +0200 Subject: [Python-Dev] [python-committers] RELEASED] Python 3.4.7rc1 and Python 3.5.4rc1 are now available In-Reply-To: References: Message-ID: 2017-07-25 10:37 GMT+02:00 Larry Hastings : > On behalf of the Python development community and the Python 3.4 and Python > 3.5 release teams, I'm relieved to announce the availability of Python > 3.4.7rc1 and Python 3.5.4rc1. I checked for known security vulnerabilities: except of the issue #29606, all known vulnerabilties are fixed in these versions. It's ok, the issue #29606 is not fixed in master yet: https://bugs.python.org/issue29606 Victor From berker.peksag at gmail.com Tue Jul 25 06:46:27 2017 From: berker.peksag at gmail.com (=?UTF-8?Q?Berker_Peksa=C4=9F?=) Date: Tue, 25 Jul 2017 13:46:27 +0300 Subject: [Python-Dev] Appending a link back to bugs.python.org in GitHub PRs In-Reply-To: References: Message-ID: On Sat, Jul 22, 2017 at 12:28 AM, Brett Cannon wrote: > P.S.: Berker Peksag is working on providing commit emails with diffs in them > which is the other most requested feature since the transition. I forgot to give a status update on this. I deployed it on Heroku last week. 
You can see an example email at https://mail.python.org/pipermail/python-checkins/2017-July/151296.html --Berker From rob.cliffe at btinternet.com Tue Jul 25 06:28:38 2017 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Tue, 25 Jul 2017 11:28:38 +0100 Subject: [Python-Dev] for...else In-Reply-To: References: <20170724161447.GQ3149@ando.pearwood.info> Message-ID: <711e812f-f84d-dc12-1402-3c9c6a0d672b@btinternet.com> On 25/07/2017 06:51, Nick Coghlan wrote: > On 25 July 2017 at 02:23, Ben Hoyt wrote: >> This is more of a python-ideas discussion, and Steven's answer is good. >> >> I'll just add one thing. Maybe it's obvious to others, but I've liked >> for...else since I found a kind of mnemonic to help me remember when the >> "else" part happens: I think of it not as "for ... else" but as "break ... >> else" -- saying it this way makes it clear to me that the break goes with >> the else. "If this condition inside the loop is true, break. ... *else* if >> we didn't break, do this other thing after the loop." > For folks looking for a more in-depth explanation of the > "if-break-else" approach to thinking about this construct: > http://python-notes.curiousefficiency.org/en/latest/python_concepts/break_else.html A helpful explanation. But that it is necessary at all underlines that (IMHO) this use of 'else' is unnatural and hard to understand. I always have to think twice about it, whether reading it or using it myself. Therefore I would have preferred a more obvious keyword such as 'ifnobreak' (others may think of something better). But as has been stated, it's not going to change. Rob Cliffe > > That article also has a note explaining that we're unlikely to ever > change this: http://python-notes.curiousefficiency.org/en/latest/python_concepts/break_else.html#but-couldn-t-python-be-different > > Cheers, > Nick. > From benhoyt at gmail.com Tue Jul 25 10:30:01 2017 From: benhoyt at gmail.com (Ben Hoyt) Date: Tue, 25 Jul 2017 10:30:01 -0400 Subject: [Python-Dev] Appending a link back to bugs.python.org in GitHub PRs In-Reply-To: References: Message-ID: With the linking back and forth, I'm curious why there wasn't a switch to use GitHub's issue tracker when we switched to GitHub. I'm sure there was previous discussion about this and good reasons not to, but couldn't find those quickly (PEP 512, Google search, etc) -- can someone point me in the right direction? -Ben On Fri, Jul 21, 2017 at 5:28 PM, Brett Cannon wrote: > Thanks to Kushal Das we now have one of the most requested features since > the transition: a link in PRs back to bugs.python.org (in a more > discoverable way since we have had them since Bedevere launched :) . When a > pull request comes in with an issue number in the title (or one gets > added), a link to bugs.python.org will be appended to the PR's body (the > message you fill out when creating a PR). There's no logic to remove the > link if the issue number is removed from the title, changed, or for > multiple issue numbers since basically those cases are all rare and it was > easier to launch without that kind of support. > > P.S.: Berker Peksag is working on providing commit emails with diffs in > them which is the other most requested feature since the transition. 
> > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > benhoyt%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Tue Jul 25 12:21:05 2017 From: brett at python.org (Brett Cannon) Date: Tue, 25 Jul 2017 16:21:05 +0000 Subject: [Python-Dev] Appending a link back to bugs.python.org in GitHub PRs In-Reply-To: References: Message-ID: On Tue, 25 Jul 2017 at 07:30 Ben Hoyt wrote: > With the linking back and forth, I'm curious why there wasn't a switch to > use GitHub's issue tracker when we switched to GitHub. I'm sure there was > previous discussion about this and good reasons not to, but couldn't find > those quickly (PEP 512, Google search, etc) -- can someone point me in the > right direction? -Ben > Basically there was push-back on the idea and I only had enough time and patience for one major infrastructure change that was somewhat controversial and not for two. -Brett > > On Fri, Jul 21, 2017 at 5:28 PM, Brett Cannon wrote: > >> Thanks to Kushal Das we now have one of the most requested features since >> the transition: a link in PRs back to bugs.python.org (in a more >> discoverable way since we have had them since Bedevere launched :) . When a >> pull request comes in with an issue number in the title (or one gets >> added), a link to bugs.python.org will be appended to the PR's body (the >> message you fill out when creating a PR). There's no logic to remove the >> link if the issue number is removed from the title, changed, or for >> multiple issue numbers since basically those cases are all rare and it was >> easier to launch without that kind of support. >> >> P.S.: Berker Peksag is working on providing commit emails with diffs in >> them which is the other most requested feature since the transition. >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> > Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/benhoyt%40gmail.com >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeanpierreda at gmail.com Wed Jul 26 00:59:10 2017 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Tue, 25 Jul 2017 21:59:10 -0700 Subject: [Python-Dev] Am I allowed to use C++-style // comments? Message-ID: https://www.python.org/dev/peps/pep-0007/ says two things: > Python versions greater than or equal to 3.6 use C89 with several select C99 features: > [...] > C++-style line comments and also: > Never use C++ style // one-line comments. Which is it? -- Devin From benjamin at python.org Wed Jul 26 01:00:51 2017 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 25 Jul 2017 22:00:51 -0700 Subject: [Python-Dev] Am I allowed to use C++-style // comments? In-Reply-To: References: Message-ID: <1501045251.2835685.1052807344.0F995199@webmail.messagingengine.com> On Tue, Jul 25, 2017, at 21:59, Devin Jeanpierre wrote: > https://www.python.org/dev/peps/pep-0007/ says two things: > > > Python versions greater than or equal to 3.6 use C89 with several select C99 features: > > [...] > > C++-style line comments This section overrides further edicts in the document for Python 3.6+. > > and also: > > > Never use C++ style // one-line comments. > > Which is it? 
From jeanpierreda at gmail.com Wed Jul 26 01:04:12 2017 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Tue, 25 Jul 2017 22:04:12 -0700 Subject: [Python-Dev] Am I allowed to use C++-style // comments? In-Reply-To: References: Message-ID: I actually realized right after I sent this that I am writing C++, so maybe it's a moot point. (Still trying to figure out how to use C for this, but it's an optional extension module only exposed for testing, so maybe it really doesn't matter.) Context is https://github.com/python/cpython/pull/2878 -- Devin On Tue, Jul 25, 2017 at 9:59 PM, Devin Jeanpierre wrote: > https://www.python.org/dev/peps/pep-0007/ says two things: > >> Python versions greater than or equal to 3.6 use C89 with several select C99 features: >> [...] >> C++-style line comments > > and also: > >> Never use C++ style // one-line comments. > > Which is it? > > -- Devin From ncoghlan at gmail.com Wed Jul 26 11:35:43 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 27 Jul 2017 01:35:43 +1000 Subject: [Python-Dev] Appending a link back to bugs.python.org in GitHub PRs In-Reply-To: References: Message-ID: On 26 July 2017 at 02:21, Brett Cannon wrote: > > > On Tue, 25 Jul 2017 at 07:30 Ben Hoyt wrote: >> >> With the linking back and forth, I'm curious why there wasn't a switch to >> use GitHub's issue tracker when we switched to GitHub. I'm sure there was >> previous discussion about this and good reasons not to, but couldn't find >> those quickly (PEP 512, Google search, etc) -- can someone point me in the >> right direction? -Ben > > Basically there was push-back on the idea and I only had enough time and > patience for one major infrastructure change that was somewhat controversial > and not for two. And unlike repository management and review management, we don't have any major process bottlenecks specifically related to bugs.python.org, and Github's issue tracker is merely "good enough if you don't otherwise have an issue tracker" rather than being exemplary the way their repository and review management are. So given the volume of incoming references to the current issue URLs, the potential for increased lock-in to a proprietary service provider with non-public finances, the difficulty of actually doing such a migration, and the questionable practical benefits, "Integrate Roundup with GitHub" was the default winner over doing a second data migration. The idea of moving tracker development *itself* to GitHub (and hence getting to dispense with the metatracker in favour of a GitHub repo with issues enabled) *has* been raised, and may be worth considering, but that would be up to the folks that actually do the bulk of the work on tracker maintenance (Berker, Ezio, Maciej, etc) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From k7hoven at gmail.com Wed Jul 26 18:55:34 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Thu, 27 Jul 2017 01:55:34 +0300 Subject: [Python-Dev] for...else In-Reply-To: <20170724161447.GQ3149@ando.pearwood.info> References: <20170724161447.GQ3149@ando.pearwood.info> Message-ID: On Mon, Jul 24, 2017 at 7:14 PM, Steven D'Aprano wrote: > Hello Kiuhnm, and welcome. > > On Mon, Jul 24, 2017 at 05:35:03PM +0200, Kiuhnm via Python-Dev wrote: > > Hello, > > > > I think that the expression "for...else" or "while...else" is completely > > counter-intuitive. > > > You may be right -- this has been discussed many, many times before. In > my personal opinion, the best (and only accurate!) 
phrase would have > been: > > for item in sequence: > # block > then: > # block > > IMO, for item in sequence: # block nobreak: # or perhaps `if not break:` # block would be clearer (if the syntax is necessary at all). [...] > > > > Wouldn't it be possible to make it clearer? Maybe > > something like > > At this point, no, it is not practical to change the syntax used. Maybe > when Python 3.0 was first introduced, but that ship has long sailed. It > is very, very unlikely that the syntax for this will ever change, but if > it does, it probably won't be until something in the distant future like > Python 5. > I don't have a strong opinion on this particular case, but if something like this is changed in Python 5, I think the decision should be made much earlier (now?) so that the old else syntax could be discouraged (and new syntax potentially already introduced). The same thing would apply to many other "possibly in Python 5" changes, where there is no reason to expect that the situation is somehow different years later. -- Koos > > But not Python 4: Guido has already ruled that Python 4 will not include > major backwards-incompatible changes. Going from 3 to 4 will not be as > disruptive as going from 2 to 3. > > [...] -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed...
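For anyone skimming the thread, this is the construct under discussion, written with the syntax Python actually has today; the else block runs only when the loop finishes without hitting break:

    def find(needle, haystack):
        for item in haystack:
            if item == needle:
                print("found it")
                break
        else:
            # Runs only if the loop was not ended by `break`.
            print("not found")

    find(3, [1, 2, 3])   # prints "found it"
    find(9, [1, 2, 3])   # prints "not found"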
URL: From python at mrabarnett.plus.com Wed Jul 26 20:46:23 2017 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 27 Jul 2017 01:46:23 +0100 Subject: [Python-Dev] for...else In-Reply-To: References: <20170724161447.GQ3149@ando.pearwood.info> <815112c0-68f0-341d-ea6d-6db979a14261@mrabarnett.plus.com> Message-ID: <70501486-23dc-2c91-ad48-c400ca979f2b@mrabarnett.plus.com> On 2017-07-27 01:07, Koos Zevenhoven wrote: > On Jul 27, 2017 02:38, "MRAB" > wrote: > > On 2017-07-26 23:55, Koos Zevenhoven wrote: > > > ?IMO, > > for item in sequence: > # block > nobreak: # or perhaps `if not break:` > # block > > would be clearer (if the syntax is necessary at all). > > > You couldn't have "if not break:" because that would look like the > start of an 'if' statement. > > > Do you mean as an implementation issue or for human readability? > I suppose you _could_ use "if not break:", but as 'if' normally indicates the start of an 'if' statement, you would get complaints about it! :-) Maybe it would be clearer if it was "elif not break:". :-) > "nobreak" would introduce a new keyword, but "not break" wouldn't. > > > Sure :) > From tjreedy at udel.edu Wed Jul 26 22:01:01 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 26 Jul 2017 22:01:01 -0400 Subject: [Python-Dev] for...else In-Reply-To: References: <20170724161447.GQ3149@ando.pearwood.info> <815112c0-68f0-341d-ea6d-6db979a14261@mrabarnett.plus.com> Message-ID: This discussion belongs on python-list (where is it mostly a repeat). -- Terry Jan Reedy From python-dev at mgmiller.net Wed Jul 26 22:34:51 2017 From: python-dev at mgmiller.net (Mike Miller) Date: Wed, 26 Jul 2017 19:34:51 -0700 Subject: [Python-Dev] for...else In-Reply-To: <815112c0-68f0-341d-ea6d-6db979a14261@mrabarnett.plus.com> References: <20170724161447.GQ3149@ando.pearwood.info> <815112c0-68f0-341d-ea6d-6db979a14261@mrabarnett.plus.com> Message-ID: <7bb5440a-aea9-9986-5f77-5aa2bff10c6a@mgmiller.net> On 2017-07-26 16:36, MRAB wrote: > "nobreak" would introduce a new keyword, but "not break" wouldn't. Whenever I've used the for-else, I've put a # no-break right next to it, to remind myself as much as anyone else. for...: not break: is the best alternative I've yet seen, congrats. Perhaps in Python 5 it can be enabled, with for-else: used instead for empty iterables, as that's what I expected the first few dozen times. -Mike From jmatejek at suse.cz Thu Jul 27 09:48:34 2017 From: jmatejek at suse.cz (jan matejek) Date: Thu, 27 Jul 2017 15:48:34 +0200 Subject: [Python-Dev] Non-stable pyc results on python 3.6 Message-ID: <3b3357c2-1583-4dc4-06c6-a68f82ca301f@suse.cz> hello, we're seeing strange problems when trying to do reproducible builds of some python 3.6 modules. Namely, from one build to another, there will be something like the following difference in the compiled object: 00004e40 da 07 5f 5f 61 6c 6c 5f 5f da 0a 5f 5f 61 75 74 |..__all__..__aut| -00004e50 68 6f 72 5f 5f da 07 64 65 63 69 6d 61 6c 72 0c |hor__..decimalr.| +00004e50 68 6f 72 5f 5f 5a 07 64 65 63 69 6d 61 6c 72 0c |hor__Z.decimalr.| 00004e60 00 00 00 72 43 00 00 00 72 08 00 00 00 72 41 00 |...rC...r....rA.| This specific one is in the top-level co_names segment and the 0x5a vs 0xda byte is TYPE_SHORT_ASCII_INTERNED, with FLAG_REF set or unset. I'm also seeing off-by-one differences in reference ids, i.e., the number appearing after TYPE_REF. 
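A side note on the two byte values quoted in the hexdump above: marshal's type code for a short interned ASCII string is the character 'Z' (0x5a), and FLAG_REF is the high bit (0x80), so the 0x5a/0xda pair differs only in whether the string was flagged for back-referencing:

    >>> hex(ord('Z'))
    '0x5a'
    >>> hex(ord('Z') | 0x80)   # the same type code with the FLAG_REF bit set
    '0xda'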
Not in all cases, but it seems that when a "part" is affected, all references in that "part" are changed (for some value of "part"; all the knowledge of pycs I have was gained from about an hour of reading marshal.c). So that seems to imply that there's a reference that is sometimes included and sometimes not? This is most often found in __init__.py. Often this affects optimized pycs, but we can see it in un-optimized as well. The issue is rare -- 99% of all pycs are stable -- but when it occurs, it's easy to replicate it in the same place. This also happens on different machines, so that seems to rule out hardware memory errors :) The pycs in question are generated by normal "setup.py build" -> "setup.py install". It happens on Python 3.6 but not on Python 2.7. I'm not sure about Python 3.5 because we don't currently use it. It doesn't seem to depend on hash seed - the instability is observed even with PYTHONHASHSEED set to zero. What seems to fix it, however, is running the build on disorderfs, which ensures that the filesystem entries are in the same order. Any ideas why something like this would happen and why would it be correlated with filesystem ordering? thanks m. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From ncoghlan at gmail.com Thu Jul 27 10:54:19 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 28 Jul 2017 00:54:19 +1000 Subject: [Python-Dev] Non-stable pyc results on python 3.6 In-Reply-To: <3b3357c2-1583-4dc4-06c6-a68f82ca301f@suse.cz> References: <3b3357c2-1583-4dc4-06c6-a68f82ca301f@suse.cz> Message-ID: On 27 July 2017 at 23:48, jan matejek wrote: > This is most often found in __init__.py. Often this affects optimized pycs, but we can see it in > un-optimized as well. > The issue is rare -- 99% of all pycs are stable -- but when it occurs, it's easy to replicate it in > the same place. This also happens on different machines, so that seems to rule out hardware memory > errors :) The marshal implementation received some significant optimisations in 3.5 [1] and a new marshal format in 3.4 [2], so if you're able to check the behaviour in 3.4 and 3.5 that would be helpful: if the problem occurs in 3.5, but *not* in 3.4, the hashtable based optimisations would be the place to start looking, while if 3.4 misbehaves as well, then there may be some general inconsistency to resolve in how the module decides which instances to mark with `FLAG_REF`. The fact that disorderfs makes a difference does make me a little suspicious, as there's an early exit from the FLAG_REF setting code related to objects having exactly one live reference. Courtesy of string interning and other immutable object caches, order of code compilation can affect how many live references there are to strings and other constants, and hence there may be cases where marshal *won't* flag an immutable object if *only* that particular code object has been compiled, but *will* flag it if some other code object has been created first. That check has been there since version 3 of the marshal format was defined, so if it *is* the culprit, then you'll see this misbehaviour with 3.4 as well. Cheers, Nick. 
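A rough way to probe this kind of instability, independent of the refcount theory above, is to compile the same module in two freshly started interpreters and compare everything after the 12-byte pyc header (magic number, source mtime and size on 3.6). The sketch below is only an illustration and not part of the original report; the module name is a placeholder, and because it compiles a single file at a time it may well not reproduce a problem that depends on build ordering:

    import subprocess
    import sys
    import tempfile
    from pathlib import Path

    SOURCE = "mymodule.py"   # placeholder for the module whose pyc looks unstable

    def compile_once(source, out_dir):
        # Use a separate interpreter so this script's own state cannot leak in.
        pyc = str(Path(out_dir) / "out.pyc")
        cmd = "import py_compile; py_compile.compile(%r, %r)" % (source, pyc)
        subprocess.run([sys.executable, "-c", cmd], check=True)
        return Path(pyc).read_bytes()

    with tempfile.TemporaryDirectory() as d1, tempfile.TemporaryDirectory() as d2:
        a = compile_once(SOURCE, d1)
        b = compile_once(SOURCE, d2)

    # Offset 12 is where the marshalled code object starts in a 3.6 pyc.
    print("payloads identical" if a[12:] == b[12:] else "payloads differ")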
[1] https://docs.python.org/3/whatsnew/3.5.html#optimizations [2] https://github.com/python/cpython/commit/d7009c69136a3809282804f460902ab42e9972f6 -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Jul 27 11:08:53 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 28 Jul 2017 01:08:53 +1000 Subject: [Python-Dev] Non-stable pyc results on python 3.6 In-Reply-To: References: <3b3357c2-1583-4dc4-06c6-a68f82ca301f@suse.cz> Message-ID: On 28 Jul 2017 00:54, "Nick Coghlan" wrote: The fact that disorderfs makes a difference does make me a little suspicious, as there's an early exit from the FLAG_REF setting code related to objects having exactly one live reference. Courtesy of string interning and other immutable object caches, order of code compilation can affect how many live references there are to strings and other constants, and hence there may be cases where marshal *won't* flag an immutable object if *only* that particular code object has been compiled, but *will* flag it if some other code object has been created first. That check has been there since version 3 of the marshal format was defined, so if it *is* the culprit, then you'll see this misbehaviour with 3.4 as well. It occurs to me that it would likely be easier for you to just test that theory directly: the check that now seems suspicious to me is the one that calls "Py_REFCNT" (and is the only reference to that API in the file), so if you comment that out, and the problem goes away, it is most likely the cause of the currently variable behaviour. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Thu Jul 27 15:19:08 2017 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 27 Jul 2017 20:19:08 +0100 Subject: [Python-Dev] for...else In-Reply-To: <7bb5440a-aea9-9986-5f77-5aa2bff10c6a@mgmiller.net> References: <20170724161447.GQ3149@ando.pearwood.info> <815112c0-68f0-341d-ea6d-6db979a14261@mrabarnett.plus.com> <7bb5440a-aea9-9986-5f77-5aa2bff10c6a@mgmiller.net> Message-ID: <52163635-e530-cf54-a1e1-53d7bc8af091@mrabarnett.plus.com> On 2017-07-27 03:34, Mike Miller wrote: > > > On 2017-07-26 16:36, MRAB wrote: >> "nobreak" would introduce a new keyword, but "not break" wouldn't. > > Whenever I've used the for-else, I've put a # no-break right next to it, to > remind myself as much as anyone else. > > for...: not break: is the best alternative I've yet seen, congrats. Perhaps in > Python 5 it can be enabled, with for-else: used instead for empty iterables, as > that's what I expected the first few dozen times. > For empty iterables, how about "elif None:"? :-) From eric.lafontaine1 at gmail.com Thu Jul 27 20:27:04 2017 From: eric.lafontaine1 at gmail.com (Eric Lafontaine) Date: Thu, 27 Jul 2017 20:27:04 -0400 Subject: [Python-Dev] for...else In-Reply-To: <52163635-e530-cf54-a1e1-53d7bc8af091@mrabarnett.plus.com> References: <20170724161447.GQ3149@ando.pearwood.info> <815112c0-68f0-341d-ea6d-6db979a14261@mrabarnett.plus.com> <7bb5440a-aea9-9986-5f77-5aa2bff10c6a@mgmiller.net> <52163635-e530-cf54-a1e1-53d7bc8af091@mrabarnett.plus.com> Message-ID: funny ; this made me think of this talk; https://youtu.be/OSGv2VnC0go?t=1013 ?ric Lafontaine | Membre du Projet VUE, Groupe Contr?le G?nie ?lectrique, 54?me promotion UdeS | ?tudiant en maitrise TI ? l'ETS VAS OPS chez Bell Mobility ? 
We want to offer an alternative means of transport by presenting an electric vehicle designed specifically for urban travel. 2017-07-27 15:19 GMT-04:00 MRAB : > On 2017-07-27 03:34, Mike Miller wrote: > >> >> >> On 2017-07-26 16:36, MRAB wrote: >> >>> "nobreak" would introduce a new keyword, but "not break" wouldn't. >>> >> >> Whenever I've used the for-else, I've put a # no-break right next to it, >> to >> remind myself as much as anyone else. >> >> for...: not break: is the best alternative I've yet seen, congrats. >> Perhaps in >> Python 5 it can be enabled, with for-else: used instead for empty >> iterables, as >> that's what I expected the first few dozen times. >> >> For empty iterables, how about "elif None:"? :-) > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/eric.lafontaine1%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From desmoulinmichel at gmail.com Fri Jul 28 05:17:24 2017 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Fri, 28 Jul 2017 11:17:24 +0200 Subject: [Python-Dev] for...else In-Reply-To: <52163635-e530-cf54-a1e1-53d7bc8af091@mrabarnett.plus.com> References: <20170724161447.GQ3149@ando.pearwood.info> <815112c0-68f0-341d-ea6d-6db979a14261@mrabarnett.plus.com> <7bb5440a-aea9-9986-5f77-5aa2bff10c6a@mgmiller.net> <52163635-e530-cf54-a1e1-53d7bc8af091@mrabarnett.plus.com> Message-ID: <9ede1c70-74fb-8f71-9666-d3512a42c7a2@gmail.com> elif break and elif None: I'd like that very much. It's weird to bend the semantics of break and None, but it's in such a dark corner of Python anyway that it doesn't bother me. On 27/07/2017 at 21:19, MRAB wrote: > On 2017-07-27 03:34, Mike Miller wrote: >> >> >> On 2017-07-26 16:36, MRAB wrote: >>> "nobreak" would introduce a new keyword, but "not break" wouldn't. >> >> Whenever I've used the for-else, I've put a # no-break right next to >> it, to >> remind myself as much as anyone else. >> >> for...: not break: is the best alternative I've yet seen, congrats. >> Perhaps in >> Python 5 it can be enabled, with for-else: used instead for empty >> iterables, as >> that's what I expected the first few dozen times. >> > For empty iterables, how about "elif None:"? :-) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/desmoulinmichel%40gmail.com > From status at bugs.python.org Fri Jul 28 12:09:23 2017 From: status at bugs.python.org (Python tracker) Date: Fri, 28 Jul 2017 18:09:23 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20170728160923.391DC5640B@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2017-07-21 - 2017-07-28) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 6088 (+30) closed 36736 (+57) total 42824 (+87) Open issues with patches: 2337 Issues opened (72) ================== #26732: multiprocessing sentinel resource leak http://bugs.python.org/issue26732 reopened by haypo #28734: argparse: successive parsing wipes out nargs=?
values http://bugs.python.org/issue28734 reopened by paul.j3 #29512: regrtest refleak: implement bisection feature http://bugs.python.org/issue29512 reopened by haypo #30188: test_nntplib: random EOFError in setUpClass() http://bugs.python.org/issue30188 reopened by haypo #30778: test_bsddb3 crash on x86 Windows XP 2.7 http://bugs.python.org/issue30778 reopened by haypo #30985: Set closing variable in asyncore at close http://bugs.python.org/issue30985 opened by walkhour #30986: Add --include-py argument to Tools/msi/make_zip.py http://bugs.python.org/issue30986 opened by Segev Finer #30987: Support for ISO-TP protocol in SocketCAN http://bugs.python.org/issue30987 opened by Pier-Yves Lessard #30988: Exception parsing invalid email address headers starting or en http://bugs.python.org/issue30988 opened by timb07 #30989: Sort only when needed in TimedRotatingFileHandler's getFilesTo http://bugs.python.org/issue30989 opened by Lovesh Harchandani #30990: Calls to C functions using `.__call__` don't get sent to profi http://bugs.python.org/issue30990 opened by ppperry #30991: test_ctypes ,test_dbm and test_ssl fail on arm64 (aarch64) ar http://bugs.python.org/issue30991 opened by jonny789 #30992: Invalid PGP Key Prevents Archive Validation http://bugs.python.org/issue30992 opened by cwprogram #30995: Support logging.getLogger(style='{') http://bugs.python.org/issue30995 opened by mitar #30996: add coroutine AbstractEventLoop.sock_close http://bugs.python.org/issue30996 opened by cfy #30997: TestCase.subTest and expectedFailure http://bugs.python.org/issue30997 opened by ronaldoussoren #30999: statistics module: add "key" keyword argument to median, mode, http://bugs.python.org/issue30999 opened by gerion #31000: Test failure in resource module on ZFS http://bugs.python.org/issue31000 opened by larry #31001: IDLE: Add tests for configdialog highlight tab http://bugs.python.org/issue31001 opened by terry.reedy #31002: IDLE: Add tests for configdialog keys tab http://bugs.python.org/issue31002 opened by terry.reedy #31004: IDLE, configdialog: Factor out FontTab class from ConfigDialog http://bugs.python.org/issue31004 opened by terry.reedy #31005: caught and stored exception creates a reference cycle outside http://bugs.python.org/issue31005 opened by vojtechfried #31006: typing.NamedTuple should add annotations to its constructor (_ http://bugs.python.org/issue31006 opened by Antony.Lee #31007: ERROR: test_pipe_handle (test.test_asyncio.test_windows_utils. 
http://bugs.python.org/issue31007 opened by haypo #31008: FAIL: test_wait_for_handle (test.test_asyncio.test_windows_eve http://bugs.python.org/issue31008 opened by haypo #31010: test_socketserver.test_ForkingTCPServer(): threading_cleanup() http://bugs.python.org/issue31010 opened by haypo #31011: Users (except from the one who installed) not able to see pyt http://bugs.python.org/issue31011 opened by Debarshi.Goswami #31012: suggestion: allow termination argument in argparse to be speci http://bugs.python.org/issue31012 opened by Leon Avery #31013: gcc7 throws warning when pymem.h development header is used http://bugs.python.org/issue31013 opened by Gabriel Somlo #31014: webbrowser._synthesize uses outdated calling signature for web http://bugs.python.org/issue31014 opened by jmsdvl #31015: PyErr_WriteUnraisable should be more verbose in Python 2.7 http://bugs.python.org/issue31015 opened by christian.aguilera at foundry.com #31016: [Regression] sphinx shows an EOF error when using python2.7 fr http://bugs.python.org/issue31016 opened by doko #31020: Add support for custom compressor in tarfile http://bugs.python.org/issue31020 opened by insomniacslk #31021: Clarify programming faq. http://bugs.python.org/issue31021 opened by terry.reedy #31022: ERROR: testRegularFile (test.test_socket.SendfileUsingSendTest http://bugs.python.org/issue31022 opened by haypo #31024: typing.Tuple is class but is defined as data inside https://do http://bugs.python.org/issue31024 opened by Bern??t G??bor #31026: test_dbm fails when run directly http://bugs.python.org/issue31026 opened by serhiy.storchaka #31027: test_listcomps fails when run directly http://bugs.python.org/issue31027 opened by serhiy.storchaka #31028: test_pydoc fails when run directly http://bugs.python.org/issue31028 opened by serhiy.storchaka #31029: test_tokenize fails when run directly http://bugs.python.org/issue31029 opened by serhiy.storchaka #31030: sys.executable can be not normalized http://bugs.python.org/issue31030 opened by serhiy.storchaka #31031: Unify duplicate bits_in_digit and bit_length http://bugs.python.org/issue31031 opened by niklasf #31033: Add argument to .cancel() of Task and Future http://bugs.python.org/issue31033 opened by socketpair #31035: Document order of firing callbacks added with Future.add_done_ http://bugs.python.org/issue31035 opened by socketpair #31036: building the python docs requires the blurb module http://bugs.python.org/issue31036 opened by doko #31038: test_runpy causes running all Python tests when run directly http://bugs.python.org/issue31038 opened by serhiy.storchaka #31039: Python on android must use ashmem instead of shmem http://bugs.python.org/issue31039 opened by Alex Davies #31040: mimetypes.add_type should complain when you give it an undotte http://bugs.python.org/issue31040 opened by odd_bloke #31041: test_handle_called_with_mp_queue() of test_logging: threading_ http://bugs.python.org/issue31041 opened by haypo #31042: Inconsistency in documentation of operator.index http://bugs.python.org/issue31042 opened by madphysicist #31045: Add a language switch to the Python documentation http://bugs.python.org/issue31045 opened by mdk #31046: ensurepip does not honour the value of $(prefix) http://bugs.python.org/issue31046 opened by xdegaye #31047: Windows: os.path.isabs(os.path.abspath(" ")) == False http://bugs.python.org/issue31047 opened by lazka #31048: ResourceWarning in test_asyncio.test_events.ProactorEventLoopT http://bugs.python.org/issue31048 opened by Segev Finer #31050: 
IDLE, configdialog: Factor out GenTab class from ConfigDialog http://bugs.python.org/issue31050 opened by terry.reedy #31051: IDLE, configdialog, General tab: re-arrange, test user entries http://bugs.python.org/issue31051 opened by terry.reedy #31053: Unnecessary argument in command example http://bugs.python.org/issue31053 opened by cocoatomo #31054: Python 2.7.8 Release does not update the system Path variable http://bugs.python.org/issue31054 opened by Ekrem Saban #31055: All Sphinx generated pages could have a new "Edit This Page" l http://bugs.python.org/issue31055 opened by Paul Hammant #31056: Import Module Not Working According To Documentation Python 3. http://bugs.python.org/issue31056 opened by MrJman006 #31057: pydoc for tempfile.TemporaryDirectory should say it returns th http://bugs.python.org/issue31057 opened by Thomas Thurman #31059: asyncio.StreamReader.read hangs if n<0 http://bugs.python.org/issue31059 opened by s_kostyuk #31061: asyncio segfault when using threadpool and "_asyncio" native m http://bugs.python.org/issue31061 opened by thehesiod #31062: socket.makefile does not handle line buffering http://bugs.python.org/issue31062 opened by kchen #31064: test_ossaudiodev fails under padsp (Linux PulseAudio OSS emula http://bugs.python.org/issue31064 opened by ncoghlan #31065: Documentation for Popen.poll is unclear http://bugs.python.org/issue31065 opened by mark.dickinson #31066: FAIL: test_last_modified (test.test_httpservers.SimpleHTTPServ http://bugs.python.org/issue31066 opened by haypo #31067: test_subprocess.test_leak_fast_process_del_killed() fails rand http://bugs.python.org/issue31067 opened by haypo #31068: test_ttk_guionly hangs on AMD64 Windows8.1 Refleaks 2.7 http://bugs.python.org/issue31068 opened by haypo #31069: test_multiprocessing_spawn leaked a process on AMD64 Windows8. http://bugs.python.org/issue31069 opened by haypo #31070: test_threaded_import: KeyError ignored in _get_module_lock. References: <20170724161447.GQ3149@ando.pearwood.info> <815112c0-68f0-341d-ea6d-6db979a14261@mrabarnett.plus.com> <7bb5440a-aea9-9986-5f77-5aa2bff10c6a@mgmiller.net> <52163635-e530-cf54-a1e1-53d7bc8af091@mrabarnett.plus.com> <9ede1c70-74fb-8f71-9666-d3512a42c7a2@gmail.com> Message-ID: <2a41ff6d-6b3f-cfee-4508-af72e4b8a36e@mrabarnett.plus.com> On 2017-07-28 10:17, Michel Desmoulin wrote: > elif break and elif None: I'd like that very much. It's weird a break > the semantic of break and None, but it's in such a dark corner of Python > anyway I don't bother. > Surely it would not be "elif break", but "elif not break"? > Le 27/07/2017 ? 21:19, MRAB a ?crit : >> On 2017-07-27 03:34, Mike Miller wrote: >>> >>> >>> On 2017-07-26 16:36, MRAB wrote: >>>> "nobreak" would introduce a new keyword, but "not break" wouldn't. >>> >>> Whenever I've used the for-else, I've put a # no-break right next to >>> it, to >>> remind myself as much as anyone else. >>> >>> for...: not break: is the best alternative I've yet seen, congrats. >>> Perhaps in >>> Python 5 it can be enabled, with for-else: used instead for empty >>> iterables, as >>> that's what I expected the first few dozen times. >>> >> For empty iterables, how about "elif None:"? 
:-) From rob.cliffe at btinternet.com Fri Jul 28 18:11:25 2017 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Fri, 28 Jul 2017 23:11:25 +0100 Subject: [Python-Dev] for...else In-Reply-To: <2a41ff6d-6b3f-cfee-4508-af72e4b8a36e@mrabarnett.plus.com> References: <20170724161447.GQ3149@ando.pearwood.info> <815112c0-68f0-341d-ea6d-6db979a14261@mrabarnett.plus.com> <7bb5440a-aea9-9986-5f77-5aa2bff10c6a@mgmiller.net> <52163635-e530-cf54-a1e1-53d7bc8af091@mrabarnett.plus.com> <9ede1c70-74fb-8f71-9666-d3512a42c7a2@gmail.com> <2a41ff6d-6b3f-cfee-4508-af72e4b8a36e@mrabarnett.plus.com> Message-ID: <9c236fb3-3aaa-605b-a8c3-3ec9e53b5e3b@btinternet.com> On 28/07/2017 20:57, MRAB wrote: > On 2017-07-28 10:17, Michel Desmoulin wrote: >> elif break and elif None: I'd like that very much. It's weird a break >> the semantic of break and None, but it's in such a dark corner of Python >> anyway I don't bother. >> > Surely it would not be "elif break", but "elif not break"? To me, anything beginning with "else" or "elif" suggests an alternative branch, not an additional one (YMMV): if condition: do_something else: do_something_completely_different Therefore I would find "if not break" or even "and if not break" more intuitive. Best wishes Rob Cliffe > >> Le 27/07/2017 ? 21:19, MRAB a ?crit : >>> On 2017-07-27 03:34, Mike Miller wrote: >>>> >>>> >>>> On 2017-07-26 16:36, MRAB wrote: >>>>> "nobreak" would introduce a new keyword, but "not break" wouldn't. >>>> >>>> Whenever I've used the for-else, I've put a # no-break right next to >>>> it, to >>>> remind myself as much as anyone else. >>>> >>>> for...: not break: is the best alternative I've yet seen, congrats. >>>> Perhaps in >>>> Python 5 it can be enabled, with for-else: used instead for empty >>>> iterables, as >>>> that's what I expected the first few dozen times. >>>> >>> For empty iterables, how about "elif None:"? :-) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/rob.cliffe%40btinternet.com > > > --- > This email has been checked for viruses by AVG. > http://www.avg.com From larry at hastings.org Sat Jul 29 20:48:37 2017 From: larry at hastings.org (Larry Hastings) Date: Sat, 29 Jul 2017 17:48:37 -0700 Subject: [Python-Dev] for...else In-Reply-To: <9c236fb3-3aaa-605b-a8c3-3ec9e53b5e3b@btinternet.com> References: <20170724161447.GQ3149@ando.pearwood.info> <815112c0-68f0-341d-ea6d-6db979a14261@mrabarnett.plus.com> <7bb5440a-aea9-9986-5f77-5aa2bff10c6a@mgmiller.net> <52163635-e530-cf54-a1e1-53d7bc8af091@mrabarnett.plus.com> <9ede1c70-74fb-8f71-9666-d3512a42c7a2@gmail.com> <2a41ff6d-6b3f-cfee-4508-af72e4b8a36e@mrabarnett.plus.com> <9c236fb3-3aaa-605b-a8c3-3ec9e53b5e3b@btinternet.com> Message-ID: <9d0b6b88-33dd-2f2e-93b9-a224ce1844bc@hastings.org> As previously requested: please take this discussion to python-ideas. If you reply, remove python-dev from the To: and Cc: lists, and add python-ideas instead. This speculative discussion was never appropriate for python-dev. //arry/ On 07/28/2017 03:11 PM, Rob Cliffe wrote: > > > On 28/07/2017 20:57, MRAB wrote: >> On 2017-07-28 10:17, Michel Desmoulin wrote: >>> elif break and elif None: I'd like that very much. It's weird a break >>> the semantic of break and None, but it's in such a dark corner of >>> Python >>> anyway I don't bother. >>> >> Surely it would not be "elif break", but "elif not break"? 
> To me, anything beginning with "else" or "elif" suggests an > alternative branch, not an additional one (YMMV): > if condition: > do_something > else: > do_something_completely_different > > Therefore I would find "if not break" or even "and if not break" more > intuitive. > Best wishes > Rob Cliffe > >> >>> Le 27/07/2017 ? 21:19, MRAB a ?crit : >>>> On 2017-07-27 03:34, Mike Miller wrote: >>>>> >>>>> >>>>> On 2017-07-26 16:36, MRAB wrote: >>>>>> "nobreak" would introduce a new keyword, but "not break" wouldn't. >>>>> >>>>> Whenever I've used the for-else, I've put a # no-break right next to >>>>> it, to >>>>> remind myself as much as anyone else. >>>>> >>>>> for...: not break: is the best alternative I've yet seen, >>>>> congrats. Perhaps in >>>>> Python 5 it can be enabled, with for-else: used instead for empty >>>>> iterables, as >>>>> that's what I expected the first few dozen times. >>>>> >>>> For empty iterables, how about "elif None:"? :-) >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/rob.cliffe%40btinternet.com >> >> >> --- >> This email has been checked for viruses by AVG. >> http://www.avg.com > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/larry%40hastings.org -------------- next part -------------- An HTML attachment was scrubbed... URL: