From ma3yuki.8mamo10 at gmail.com  Fri Sep  1 04:03:04 2017
From: ma3yuki.8mamo10 at gmail.com (Masayuki YAMAMOTO)
Date: Fri, 1 Sep 2017 17:03:04 +0900
Subject: [Python-Dev] PEP 539 (second round): A new C API for Thread-Local Storage in CPython
In-Reply-To:
References:
Message-ID:

2017-08-31 19:40 GMT+09:00 Erik Bray :
> [...]
> the changes is nice.  I just have a few minor changes to suggest
> (typos and such) that I'll make in a pull request.
>

Steve Dower pointed out a review comment that avoids the use of bool
in header declarations [*]. I'd change the bool in the
PyThread_tss_is_created declaration to int. Erik, would you update the
specification in the current PEP PR? (I have also commented on the PR.)

Thanks,
Masayuki

[*]: https://github.com/python/cpython/pull/1362#pullrequestreview-59884901
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From victor.stinner at gmail.com  Fri Sep  1 08:57:25 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 1 Sep 2017 14:57:25 +0200
Subject: [Python-Dev] HTTPS on bugs.python.org
Message-ID:

Hi,

When I go to http://bugs.python.org/ Firefox warns me that the login
form on the left (user, password) sends data in clear text (HTTP).

Ok, I switch manually to HTTPS: I add an "s" to the "http://" in the URL.

I log in.

I go to an issue using HTTPS, like https://bugs.python.org/issue31250

I modify an issue using the form and click on [Submit Changes] (or
just press Enter): I'm back to HTTP. Truncated URL:

http://bugs.python.org/issue31250?@ok_message=msg%20301099%20created%...

Hum, again I switch manually to HTTPS by modifying the URL:

https://bugs.python.org/issue31250?@ok_message=msg%20301099%20created%...

I click on the "clear this message" link: oops, I'm back to the HTTP world...

http://bugs.python.org/issue31250

So, would it be possible to enforce HTTPS on the bug tracker?

The best would be to always generate HTTPS URLs and *maybe* redirect
HTTP to HTTPS.

Sorry, I don't know what the best practices are. For example, should
we use HTTPS-only cookies?

Victor

From songofacandy at gmail.com  Fri Sep  1 09:15:29 2017
From: songofacandy at gmail.com (INADA Naoki)
Date: Fri, 1 Sep 2017 22:15:29 +0900
Subject: [Python-Dev] HTTPS on bugs.python.org
In-Reply-To:
References:
Message-ID:

FYI, there is an issue report for it.
http://psf.upfronthosting.co.za/roundup/meta/issue463

INADA Naoki

On Fri, Sep 1, 2017 at 9:57 PM, Victor Stinner wrote:
> Hi,
>
> When I go to http://bugs.python.org/ Firefox warns me that the form on
> the left to login (user, password) sends data in clear text (HTTP).
>
> Ok, I switch manually to HTTPS: add "s" in "http://" of the URL.
>
> I log in.
>
> I go to an issue using HTTPS like https://bugs.python.org/issue31250
>
> I modify an issue using the form and click on [Submit Changes] (or
> just press Enter): I'm back to HTTP. Truncated URL:
>
> http://bugs.python.org/issue31250?@ok_message=msg%20301099%20created%...
>
> Hum, again I switch manually to HTTPS by modifying the URL:
>
> https://bugs.python.org/issue31250?@ok_message=msg%20301099%20created%...
>
> I click on the "clear this message" link: oops, I'm back to the HTTP world...
>
> http://bugs.python.org/issue31250
>
> So, would it be possible to enforce HTTPS on the bug tracker?
>
> The best would be to always generate HTTPS urls and *maybe* redirect
> HTTP to HTTPS.
>
> Sorry, I don't know what are the best practices. For example, should
> we use HTTPS only cookies?
>
> Victor
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com

From solipsis at pitrou.net  Fri Sep  1 09:29:58 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 1 Sep 2017 15:29:58 +0200
Subject: [Python-Dev] HTTPS on bugs.python.org
References:
Message-ID: <20170901152958.6e246b94@fsol>

On Fri, 1 Sep 2017 22:15:29 +0900
INADA Naoki wrote:
> FYI, there is issue report for it.
> http://psf.upfronthosting.co.za/roundup/meta/issue463
> INADA Naoki

That issue is about making the tracker HTTPS-only, but fixing
internal links to point to the HTTPS site would already go a long way,
even without switching off HTTP access.

Regards

Antoine.

From solipsis at pitrou.net  Fri Sep  1 09:36:30 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 1 Sep 2017 15:36:30 +0200
Subject: [Python-Dev] HTTPS on bugs.python.org
References: <20170901152958.6e246b94@fsol>
Message-ID: <20170901153630.3bea59c5@fsol>

And by the way the problem goes away if you use the "HTTPS Everywhere"
plugin for Firefox.

Regards

Antoine.

On Fri, 1 Sep 2017 15:29:58 +0200
Antoine Pitrou wrote:
> On Fri, 1 Sep 2017 22:15:29 +0900
> INADA Naoki wrote:
> > FYI, there is issue report for it.
> > http://psf.upfronthosting.co.za/roundup/meta/issue463
> > INADA Naoki
>
> That issue is about making the tracker HTTPS-only, but fixing
> internal links to point to the HTTPS site would already go a long way,
> even without switching off HTTP access.
>
> Regards
>
> Antoine.
>
>

From mariatta.wijaya at gmail.com  Fri Sep  1 09:35:12 2017
From: mariatta.wijaya at gmail.com (Mariatta Wijaya)
Date: Fri, 1 Sep 2017 06:35:12 -0700
Subject: [Python-Dev] HTTPS on bugs.python.org
In-Reply-To: <20170901152958.6e246b94@fsol>
References: <20170901152958.6e246b94@fsol>
Message-ID:

I also would like the links from bug tracker emails to be in https
instead of http.



On Sep 1, 2017 6:31 AM, "Antoine Pitrou" wrote:

> On Fri, 1 Sep 2017 22:15:29 +0900
> INADA Naoki wrote:
> > FYI, there is issue report for it.
> > http://psf.upfronthosting.co.za/roundup/meta/issue463
> > INADA Naoki
>
> That issue is about making the tracker HTTPS-only, but fixing
> internal links to point to the HTTPS site would already go a long way,
> even without switching off HTTP access.
>
> Regards
>
> Antoine.
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/
> mariatta.wijaya%40gmail.com
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From wes.turner at gmail.com  Fri Sep  1 10:08:32 2017
From: wes.turner at gmail.com (Wes Turner)
Date: Fri, 1 Sep 2017 09:08:32 -0500
Subject: [Python-Dev] HTTPS on bugs.python.org
In-Reply-To:
References: <20170901152958.6e246b94@fsol>
Message-ID:


## HTTP STS
- Wikipedia: https://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security
- Docs: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Strict-Transport-Security

- https://https.cio.gov/hsts/

## letsencrypt
"A free, automated, and open certificate authority."
- Wikipedia: https://en.wikipedia.org/wiki/Let%27s_Encrypt - Homepage: https://letsencrypt.org/ - Src: https://github.com/letsencrypt - Docs: https://letsencrypt.readthedocs.io/en/latest/ - Docs: https://letsencrypt.readthedocs.io/en/latest/using.html#getting-certificates-and-choosing-plugins - Docs: https://letsencrypt.readthedocs.io/en/latest/using.html#third-party-plugins ### ACME Protocol - Wikipedia: https://en.wikipedia.org/wiki/Automated_Certificate_Management_Environment On Fri, Sep 1, 2017 at 8:35 AM, Mariatta Wijaya wrote: > I also would like the links from bug tracker emails be in https instead of > http. > > > > On Sep 1, 2017 6:31 AM, "Antoine Pitrou" wrote: > >> On Fri, 1 Sep 2017 22:15:29 +0900 >> INADA Naoki wrote: >> > FYI, there is issue report for it. >> > http://psf.upfronthosting.co.za/roundup/meta/issue463 >> > INADA Naoki >> >> That issue is about making the tracker HTTPS-only, but fixing >> internal links to point to the HTTPS site would already go a long way, >> even without switching off HTTP access. >> >> Regards >> >> Antoine. >> >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/mariatta. >> wijaya%40gmail.com >> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > wes.turner%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Fri Sep 1 10:12:23 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 1 Sep 2017 10:12:23 -0400 Subject: [Python-Dev] HTTPS on bugs.python.org In-Reply-To: <20170901153630.3bea59c5@fsol> References: <20170901152958.6e246b94@fsol> <20170901153630.3bea59c5@fsol> Message-ID: On 9/1/2017 9:36 AM, Antoine Pitrou wrote: > > And by the way the problem goes away if you use the "HTTPS Everywhere" > plugin for Firefox. Firefox has both 'extension' and 'plugin' add-ons. "HTTPS Everywhere" is found under 'extensions'. Works great. > On Fri, 1 Sep 2017 15:29:58 +0200 > Antoine Pitrou wrote: >> On Fri, 1 Sep 2017 22:15:29 +0900 >> INADA Naoki wrote: >>> FYI, there is issue report for it. >>> http://psf.upfronthosting.co.za/roundup/meta/issue463 >>> INADA Naoki >> >> That issue is about making the tracker HTTPS-only, but fixing >> internal links to point to the HTTPS site would already go a long way, >> even without switching off HTTP access. -- Terry Jan Reedy From wes.turner at gmail.com Fri Sep 1 10:20:03 2017 From: wes.turner at gmail.com (Wes Turner) Date: Fri, 1 Sep 2017 09:20:03 -0500 Subject: [Python-Dev] HTTPS on bugs.python.org In-Reply-To: References: <20170901152958.6e246b94@fsol> Message-ID: Here's e.g. Jupyter Notebook w/ letsencrypt in a Makefile: https://github.com/jupyter/docker-stacks/blob/master/examples/make-deploy/letsencrypt.makefile ... https://github.com/jupyter/docker-stacks On Fri, Sep 1, 2017 at 9:08 AM, Wes Turner wrote: > > ## HTTP STS > - Wikipedia: https://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security > - Docs: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ > Strict-Transport-Security > > - https://https.cio.gov/hsts/ > > ## letsencrypt > "A free, automated, and open certificate authority." 
>
> - Wikipedia: https://en.wikipedia.org/wiki/Let%27s_Encrypt
> - Homepage: https://letsencrypt.org/
> - Src: https://github.com/letsencrypt
> - Docs: https://letsencrypt.readthedocs.io/en/latest/
> - Docs: https://letsencrypt.readthedocs.io/en/latest/using.html#getting-
> certificates-and-choosing-plugins
> - Docs: https://letsencrypt.readthedocs.io/en/latest/
> using.html#third-party-plugins
>
> ### ACME Protocol
> - Wikipedia: https://en.wikipedia.org/wiki/Automated_
> Certificate_Management_Environment
>
>
>
>
> On Fri, Sep 1, 2017 at 8:35 AM, Mariatta Wijaya
> wrote:
>
>> I also would like the links from bug tracker emails be in https instead
>> of http.
>>
>>
>>
>> On Sep 1, 2017 6:31 AM, "Antoine Pitrou" wrote:
>>
>>> On Fri, 1 Sep 2017 22:15:29 +0900
>>> INADA Naoki wrote:
>>> > FYI, there is issue report for it.
>>> > http://psf.upfronthosting.co.za/roundup/meta/issue463
>>> > INADA Naoki
>>>
>>> That issue is about making the tracker HTTPS-only, but fixing
>>> internal links to point to the HTTPS site would already go a long way,
>>> even without switching off HTTP access.
>>>
>>> Regards
>>>
>>> Antoine.
>>>
>>>
>>> _______________________________________________
>>> Python-Dev mailing list
>>> Python-Dev at python.org
>>> https://mail.python.org/mailman/listinfo/python-dev
>>> Unsubscribe: https://mail.python.org/mailma
>>> n/options/python-dev/mariatta.wijaya%40gmail.com
>>>
>>
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe: https://mail.python.org/mailman/options/python-dev/wes.
>> turner%40gmail.com
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From victor.stinner at gmail.com  Fri Sep  1 10:32:03 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 1 Sep 2017 16:32:03 +0200
Subject: [Python-Dev] HTTPS on bugs.python.org
In-Reply-To: <20170901153630.3bea59c5@fsol>
References: <20170901152958.6e246b94@fsol>
	<20170901153630.3bea59c5@fsol>
Message-ID:

2017-09-01 15:36 GMT+02:00 Antoine Pitrou :
> And by the way the problem goes away if you use the "HTTPS Everywhere"
> plugin for Firefox.

I do have the "HTTPS Everywhere" Firefox plugin, version 2017.8.31 (so
it seems very recent), but it is displayed as "obsolete" ("obsolète"
in French).

I'm using Firefox 55 on Fedora 26.

It seems like the plugin has to be updated to use the new
WebExtensions API.

https://www.eff.org/https-everywhere
https://github.com/EFForg/https-everywhere/issues/7389

"No. HTTPS Everywhere has already been migrated to WebExtensions.
We're unable to switch HTTPSE on Firefox over to WebExtensions until
Tor Browser rebases to FF 52 ESR, as I already stated: #7389
(comment)"

"Currently the main blocker to WebExtensions deployment on Firefox is
a secure signing mechanism for the self-hosted version. See #9958
(comment)"

In short, it doesn't work :-)

Victor

From antoine at python.org  Fri Sep  1 10:34:26 2017
From: antoine at python.org (Antoine Pitrou)
Date: Fri, 1 Sep 2017 16:34:26 +0200
Subject: [Python-Dev] HTTPS on bugs.python.org
In-Reply-To:
References:
Message-ID:

On 01/09/2017 16:32, Victor Stinner wrote:
>
> In short, it doesn't work :-)

I'm using Firefox 55 on Ubuntu 16.04 and it works here.  You may be
misunderstanding what happens :-)

Regards

Antoine.
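
To make the server-side fix concrete: the "always redirect to HTTPS"
behaviour discussed in this thread can be expressed as a small WSGI
middleware. This is an illustrative sketch only, not Roundup's actual
code, and the names are hypothetical:

    def force_https(app):
        def middleware(environ, start_response):
            if environ.get('wsgi.url_scheme') != 'https':
                # Rebuild the requested URL with the https scheme
                # and redirect the client to it.
                host = environ.get('HTTP_HOST', 'bugs.python.org')
                url = 'https://' + host + environ.get('PATH_INFO', '/')
                if environ.get('QUERY_STRING'):
                    url += '?' + environ['QUERY_STRING']
                start_response('301 Moved Permanently',
                               [('Location', url)])
                return [b'']

            def start_response_with_hsts(status, headers, exc_info=None):
                # HSTS tells browsers to use HTTPS for all future
                # requests to this host.
                headers.append(('Strict-Transport-Security',
                                'max-age=31536000'))
                return start_response(status, headers, exc_info)

            return app(environ, start_response_with_hsts)

        return middleware

HTTPS-only cookies are the complementary measure: setting the "secure"
flag on the session cookie tells browsers never to send it over plain
HTTP.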
From victor.stinner at gmail.com Fri Sep 1 11:03:59 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 1 Sep 2017 17:03:59 +0200 Subject: [Python-Dev] HTTPS on bugs.python.org In-Reply-To: References: <20170901152958.6e246b94@fsol> <20170901153630.3bea59c5@fsol> Message-ID: 2017-09-01 16:34 GMT+02:00 Antoine Pitrou : > I'm using Firefox 55 on Ubuntu 16.04 and it works here. You may be > misunderstading what happens :-) Maybe I misunderstood you when you wrote: > And by the way the problem goes away if you use the "HTTPS Everywhere" > plugin for Firefox. Try for example this page: https://bugs.python.org/issue31234?@ok_message=msg%20301118%20created For me, the "clear this message" link is HTTP, not HTTPS: http://bugs.python.org/issue31234 Victor From solipsis at pitrou.net Fri Sep 1 11:27:59 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 1 Sep 2017 17:27:59 +0200 Subject: [Python-Dev] HTTPS on bugs.python.org References: <20170901152958.6e246b94@fsol> <20170901153630.3bea59c5@fsol> Message-ID: <20170901172759.4048be7f@fsol> On Fri, 1 Sep 2017 17:03:59 +0200 Victor Stinner wrote: > > > And by the way the problem goes away if you use the "HTTPS Everywhere" > > plugin for Firefox. > > Try for example this page: > > https://bugs.python.org/issue31234?@ok_message=msg%20301118%20created > > For me, the "clear this message" link is HTTP, not HTTPS: > > http://bugs.python.org/issue31234 Sure, but if you click on this link, it will go to the HTTPS version nevertheless. Regards Antoine. From phd at phdru.name Fri Sep 1 11:31:00 2017 From: phd at phdru.name (Oleg Broytman) Date: Fri, 1 Sep 2017 17:31:00 +0200 Subject: [Python-Dev] HTTPS on bugs.python.org In-Reply-To: <20170901172759.4048be7f@fsol> References: <20170901152958.6e246b94@fsol> <20170901153630.3bea59c5@fsol> <20170901172759.4048be7f@fsol> Message-ID: <20170901153100.GA15325@phdru.name> On Fri, Sep 01, 2017 at 05:27:59PM +0200, Antoine Pitrou wrote: > On Fri, 1 Sep 2017 17:03:59 +0200 > Victor Stinner wrote: > > > > > And by the way the problem goes away if you use the "HTTPS Everywhere" > > > plugin for Firefox. > > > > Try for example this page: > > > > https://bugs.python.org/issue31234?@ok_message=msg%20301118%20created > > > > For me, the "clear this message" link is HTTP, not HTTPS: > > > > http://bugs.python.org/issue31234 > > Sure, but if you click on this link, it will go to the HTTPS version > nevertheless. It doesn't for me. :-( FFox 55.0.1, HTTPS Everywhere 2017.8.15. > Regards > > Antoine. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From status at bugs.python.org Fri Sep 1 12:09:20 2017 From: status at bugs.python.org (Python tracker) Date: Fri, 1 Sep 2017 18:09:20 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20170901160920.612DE11A892@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2017-08-25 - 2017-09-01) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. 
Issues counts and deltas: open 6166 (+17) closed 36910 (+30) total 43076 (+47) Open issues with patches: 2339 Issues opened (35) ================== #30581: os.cpu_count() returns wrong number of processors on system wi http://bugs.python.org/issue30581 reopened by haypo #30776: regrtest: change -R/--huntrleaks rule to decide if a test leak http://bugs.python.org/issue30776 reopened by pitrou #31250: test_asyncio leaks dangling threads http://bugs.python.org/issue31250 reopened by haypo #31281: fileinput inplace does not work with pathlib.Path http://bugs.python.org/issue31281 opened by zmwangx #31282: C APIs called without GIL in PyOS_Readline http://bugs.python.org/issue31282 opened by xiang.zhang #31284: IDLE: Make GUI test teardown less fragile http://bugs.python.org/issue31284 opened by csabella #31285: a SystemError and an assertion failure in warnings.warn_explic http://bugs.python.org/issue31285 opened by Oren Milman #31288: IDLE tests: don't modify tkinter.messagebox. http://bugs.python.org/issue31288 opened by terry.reedy #31289: File paths in exception traceback resolve symlinks http://bugs.python.org/issue31289 opened by Paul Pinterits #31290: segfault on missing library symbol http://bugs.python.org/issue31290 opened by immortalplants #31292: `python setup.py check --restructuredtext` fails when a includ http://bugs.python.org/issue31292 opened by flying sheep #31293: crashes in multiply_float_timedelta() and in truedivide_timede http://bugs.python.org/issue31293 opened by Oren Milman #31294: ZeroMQSocketListener and ZeroMQSocketHandler examples in the L http://bugs.python.org/issue31294 opened by pablogsal #31296: support pty.fork and os.forkpty actions in posix subprocess mo http://bugs.python.org/issue31296 opened by gregory.p.smith #31297: Unpickleable ModuleImportError in unittest patch not backporte http://bugs.python.org/issue31297 opened by Rachel Tobin #31298: Error when calling numpy.astype http://bugs.python.org/issue31298 opened by droth #31299: Add "ignore_modules" option to TracebackException.format() http://bugs.python.org/issue31299 opened by ncoghlan #31301: Python 2.7 SIGSEGV http://bugs.python.org/issue31301 opened by cody #31302: smtplib on linux fails to log in correctly http://bugs.python.org/issue31302 opened by murphdasurf #31304: Update doc for starmap_async error_back kwarg http://bugs.python.org/issue31304 opened by tamas #31305: 'pydoc -w import' report "no Python documentation found for 'i http://bugs.python.org/issue31305 opened by limuyuan #31306: IDLE, configdialog, General tab: validate user entries http://bugs.python.org/issue31306 opened by terry.reedy #31307: ConfigParser.read silently fails if filenames argument is a by http://bugs.python.org/issue31307 opened by vxgmichel #31308: forkserver process isn't re-launched if it died http://bugs.python.org/issue31308 opened by pitrou #31310: semaphore tracker isn't protected against crashes http://bugs.python.org/issue31310 opened by pitrou #31311: a SystemError and a crash in PyCData_setstate() when __dict__ http://bugs.python.org/issue31311 opened by Oren Milman #31313: Feature Add support of os.chflags() on Linux platform http://bugs.python.org/issue31313 opened by socketpair #31314: email throws exception with oversized header input http://bugs.python.org/issue31314 opened by doko #31315: assertion failure in imp.create_dynamic(), when spec.name is n http://bugs.python.org/issue31315 opened by Oren Milman #31319: Rename idlelib to just idle http://bugs.python.org/issue31319 opened by 
rhettinger #31320: test_ssl logs a traceback http://bugs.python.org/issue31320 opened by haypo #31321: traceback.clear_frames() doesn't clear *all* frames http://bugs.python.org/issue31321 opened by haypo #31323: test_ssl: reference cycle between ThreadedEchoServer and its C http://bugs.python.org/issue31323 opened by haypo #31324: support._match_test() used by test.bisect is very inefficient http://bugs.python.org/issue31324 opened by haypo #31325: req_rate is a namedtuple type rather than instance http://bugs.python.org/issue31325 opened by gvx Most recent 15 issues with no replies (15) ========================================== #31325: req_rate is a namedtuple type rather than instance http://bugs.python.org/issue31325 #31323: test_ssl: reference cycle between ThreadedEchoServer and its C http://bugs.python.org/issue31323 #31314: email throws exception with oversized header input http://bugs.python.org/issue31314 #31310: semaphore tracker isn't protected against crashes http://bugs.python.org/issue31310 #31308: forkserver process isn't re-launched if it died http://bugs.python.org/issue31308 #31307: ConfigParser.read silently fails if filenames argument is a by http://bugs.python.org/issue31307 #31306: IDLE, configdialog, General tab: validate user entries http://bugs.python.org/issue31306 #31305: 'pydoc -w import' report "no Python documentation found for 'i http://bugs.python.org/issue31305 #31304: Update doc for starmap_async error_back kwarg http://bugs.python.org/issue31304 #31301: Python 2.7 SIGSEGV http://bugs.python.org/issue31301 #31299: Add "ignore_modules" option to TracebackException.format() http://bugs.python.org/issue31299 #31298: Error when calling numpy.astype http://bugs.python.org/issue31298 #31296: support pty.fork and os.forkpty actions in posix subprocess mo http://bugs.python.org/issue31296 #31294: ZeroMQSocketListener and ZeroMQSocketHandler examples in the L http://bugs.python.org/issue31294 #31290: segfault on missing library symbol http://bugs.python.org/issue31290 Most recent 15 issues waiting for review (15) ============================================= #31310: semaphore tracker isn't protected against crashes http://bugs.python.org/issue31310 #31308: forkserver process isn't re-launched if it died http://bugs.python.org/issue31308 #31270: Simplify documentation of itertools.zip_longest http://bugs.python.org/issue31270 #31185: Miscellaneous errors in asyncio speedup module http://bugs.python.org/issue31185 #31184: Fix data descriptor detection in inspect.getattr_static http://bugs.python.org/issue31184 #31179: Speed-up dict.copy() up to 5.5 times. 
http://bugs.python.org/issue31179 #31178: [EASY] subprocess: TypeError: can't concat str to bytes, in _e http://bugs.python.org/issue31178 #31175: Exception while extracting file from ZIP with non-matching fil http://bugs.python.org/issue31175 #31151: socketserver.ForkingMixIn.server_close() leaks zombie processe http://bugs.python.org/issue31151 #31120: [2.7] Python 64 bit _ssl compile fails due missing buildinf_am http://bugs.python.org/issue31120 #31113: Stack overflow with large program http://bugs.python.org/issue31113 #31106: os.posix_fallocate() generate exception with errno 0 http://bugs.python.org/issue31106 #31065: Documentation for Popen.poll is unclear http://bugs.python.org/issue31065 #31051: IDLE, configdialog, General tab: re-arrange, test user entries http://bugs.python.org/issue31051 #31046: ensurepip does not honour the value of $(prefix) http://bugs.python.org/issue31046 Top 10 most discussed issues (10) ================================= #10746: ctypes c_long & c_bool have incorrect PEP-3118 type codes http://bugs.python.org/issue10746 5 msgs #30776: regrtest: change -R/--huntrleaks rule to decide if a test leak http://bugs.python.org/issue30776 5 msgs #31284: IDLE: Make GUI test teardown less fragile http://bugs.python.org/issue31284 5 msgs #31051: IDLE, configdialog, General tab: re-arrange, test user entries http://bugs.python.org/issue31051 4 msgs #31250: test_asyncio leaks dangling threads http://bugs.python.org/issue31250 4 msgs #31271: an assertion failure in io.TextIOWrapper.write http://bugs.python.org/issue31271 4 msgs #31282: C APIs called without GIL in PyOS_Readline http://bugs.python.org/issue31282 4 msgs #31293: crashes in multiply_float_timedelta() and in truedivide_timede http://bugs.python.org/issue31293 4 msgs #31313: Feature Add support of os.chflags() on Linux platform http://bugs.python.org/issue31313 4 msgs #31315: assertion failure in imp.create_dynamic(), when spec.name is n http://bugs.python.org/issue31315 4 msgs Issues closed (30) ================== #5001: Remove assertion-based checking in multiprocessing http://bugs.python.org/issue5001 closed by pitrou #23835: configparser does not convert defaults to strings http://bugs.python.org/issue23835 closed by lukasz.langa #28261: wrong error messages when using PyArg_ParseTuple to parse norm http://bugs.python.org/issue28261 closed by serhiy.storchaka #29741: BytesIO methods don't accept integer types, while StringIO cou http://bugs.python.org/issue29741 closed by steve.dower #30617: IDLE: Add docstrings and unittests to outwin.py http://bugs.python.org/issue30617 closed by terry.reedy #30781: IDLE: configdialog -- switch to ttk widgets. 
http://bugs.python.org/issue30781  closed by terry.reedy

#31108: add __contains__ for list_iterator (and others) for better per
http://bugs.python.org/issue31108  closed by serhiy.storchaka

#31191: Fix grammar in threading.Barrier docs
http://bugs.python.org/issue31191  closed by Mariatta

#31209: MappingProxyType can not be pickled
http://bugs.python.org/issue31209  closed by Alex Hayes

#31217: test_code leaked [1, 1, 1] memory blocks on x86 Gentoo Refleak
http://bugs.python.org/issue31217  closed by haypo

#31237: test_gdb disables 25% of tests in optimized builds
http://bugs.python.org/issue31237  closed by lukasz.langa

#31243: checks whether PyArg_ParseTuple returned a negative int
http://bugs.python.org/issue31243  closed by serhiy.storchaka

#31249: test_concurrent_futures leaks dangling threads
http://bugs.python.org/issue31249  closed by haypo

#31272: typing module conflicts with __slots__-classes
http://bugs.python.org/issue31272  closed by levkivskyi

#31275: Check fall-through in _codecs_iso2022.c
http://bugs.python.org/issue31275  closed by skrah

#31279: Squash new gcc warning (-Wstringop-overflow)
http://bugs.python.org/issue31279  closed by skrah

#31280: Namespace packages in directories added to path aren't importa
http://bugs.python.org/issue31280  closed by j1m

#31283: Inconsistent behaviours with explicit and implicit inheritance
http://bugs.python.org/issue31283  closed by r.david.murray

#31286: import in finally results in SystemError
http://bugs.python.org/issue31286  closed by serhiy.storchaka

#31287: IDLE configdialog tests: don't modify tkinter.messagebox.
http://bugs.python.org/issue31287  closed by terry.reedy

#31291: zipimport.zipimporter.get_data() crashes when path.replace() r
http://bugs.python.org/issue31291  closed by brett.cannon

#31295: typo in __hash__ docs
http://bugs.python.org/issue31295  closed by r.david.murray

#31300: Traceback prints different code than the running module
http://bugs.python.org/issue31300  closed by r.david.murray

#31303: xml.etree.ElementTree fails to parse a document (regression)
http://bugs.python.org/issue31303  closed by serhiy.storchaka

#31309: Tkinter root window does not close if used with matplotlib.pyp
http://bugs.python.org/issue31309  closed by terry.reedy

#31312: Build differences caused by the time stamps
http://bugs.python.org/issue31312  closed by r.david.murray

#31316: Frequent *** stack smashing detected *** with Python 3.6.2/mei
http://bugs.python.org/issue31316  closed by pitrou

#31317: Memory leak in dict with shared keys
http://bugs.python.org/issue31317  closed by haypo

#31318: On Windows importlib.util.find_spec("re") results in Attribute
http://bugs.python.org/issue31318  closed by steve.dower

#31322: SimpleNamespace deep copy
http://bugs.python.org/issue31322  closed by Pritish Patil

From songofacandy at gmail.com  Fri Sep  1 12:24:03 2017
From: songofacandy at gmail.com (INADA Naoki)
Date: Sat, 2 Sep 2017 01:24:03 +0900
Subject: [Python-Dev] HTTPS on bugs.python.org
In-Reply-To: <20170901152958.6e246b94@fsol>
References: <20170901152958.6e246b94@fsol>
Message-ID:

You're right. It should be a bpo configuration issue.

https://hg.python.org/tracker/roundup/file/bugs.python.org/roundup/cgi/client.py#l303
https://hg.python.org/tracker/python-dev/file/tip/config.ini.template#l118

I can't read the real config file used for bpo.
But maybe tracker.web is set to 'http://bugs.python.org/' instead of
'https://bugs.python.org/'.

INADA Naoki

On Fri, Sep 1, 2017 at 10:29 PM, Antoine Pitrou wrote:
> On Fri, 1 Sep 2017 22:15:29 +0900
> INADA Naoki wrote:
>> FYI, there is issue report for it.
>> http://psf.upfronthosting.co.za/roundup/meta/issue463
>> INADA Naoki
>
> That issue is about making the tracker HTTPS-only, but fixing
> internal links to point to the HTTPS site would already go a long way,
> even without switching off HTTP access.
>
> Regards
>
> Antoine.
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com

From solipsis at pitrou.net  Fri Sep  1 13:06:57 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 1 Sep 2017 19:06:57 +0200
Subject: [Python-Dev] HTTPS on bugs.python.org
References: <20170901152958.6e246b94@fsol>
	<20170901153630.3bea59c5@fsol>
	<20170901172759.4048be7f@fsol>
	<20170901153100.GA15325@phdru.name>
Message-ID: <20170901190657.3d9ab481@fsol>

On Fri, 1 Sep 2017 17:31:00 +0200
Oleg Broytman wrote:
> On Fri, Sep 01, 2017 at 05:27:59PM +0200, Antoine Pitrou wrote:
> > On Fri, 1 Sep 2017 17:03:59 +0200
> > Victor Stinner wrote:
> > >
> > > > And by the way the problem goes away if you use the "HTTPS Everywhere"
> > > > plugin for Firefox.
> > >
> > > Try for example this page:
> > >
> > > https://bugs.python.org/issue31234?@ok_message=msg%20301118%20created
> > >
> > > For me, the "clear this message" link is HTTP, not HTTPS:
> > >
> > > http://bugs.python.org/issue31234
> >
> > Sure, but if you click on this link, it will go to the HTTPS version
> > nevertheless.
>
> It doesn't for me. :-( FFox 55.0.1, HTTPS Everywhere 2017.8.15.

That's surprising.  It's definitely part of the standard rules (enabled
by default):
https://www.eff.org/https-everywhere/atlas/domains/python.org.html

Perhaps you tweaked your configuration?

Regards

Antoine.

From victor.stinner at gmail.com  Fri Sep  1 13:16:39 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 1 Sep 2017 19:16:39 +0200
Subject: [Python-Dev] HTTPS on bugs.python.org
In-Reply-To: <20170901190657.3d9ab481@fsol>
References: <20170901152958.6e246b94@fsol>
	<20170901153630.3bea59c5@fsol>
	<20170901172759.4048be7f@fsol>
	<20170901153100.GA15325@phdru.name>
	<20170901190657.3d9ab481@fsol>
Message-ID:

2017-09-01 19:06 GMT+02:00 Antoine Pitrou :
> That's surprising. It's definitely part of the standard rules (enabled
> by default):
> https://www.eff.org/https-everywhere/atlas/domains/python.org.html

Maybe the plugin is also broken, like my setup. Maybe it's related to
Firefox's recent major "multiprocess" change?
Victor From phd at phdru.name Fri Sep 1 13:33:51 2017 From: phd at phdru.name (Oleg Broytman) Date: Fri, 1 Sep 2017 19:33:51 +0200 Subject: [Python-Dev] HTTPS on bugs.python.org In-Reply-To: <20170901190657.3d9ab481@fsol> References: <20170901152958.6e246b94@fsol> <20170901153630.3bea59c5@fsol> <20170901172759.4048be7f@fsol> <20170901153100.GA15325@phdru.name> <20170901190657.3d9ab481@fsol> Message-ID: <20170901173351.GA21481@phdru.name> On Fri, Sep 01, 2017 at 07:06:57PM +0200, Antoine Pitrou wrote: > On Fri, 1 Sep 2017 17:31:00 +0200 > Oleg Broytman wrote: > > > On Fri, Sep 01, 2017 at 05:27:59PM +0200, Antoine Pitrou wrote: > > > On Fri, 1 Sep 2017 17:03:59 +0200 > > > Victor Stinner wrote: > > > > > > > > > And by the way the problem goes away if you use the "HTTPS Everywhere" > > > > > plugin for Firefox. > > > > > > > > Try for example this page: > > > > > > > > https://bugs.python.org/issue31234?@ok_message=msg%20301118%20created > > > > > > > > For me, the "clear this message" link is HTTP, not HTTPS: > > > > > > > > http://bugs.python.org/issue31234 > > > > > > Sure, but if you click on this link, it will go to the HTTPS version > > > nevertheless. > > > > It doesn't for me. :-( FFox 55.0.1, HTTPS Everywhere 2017.8.15. > > That's surprising. It's definitely part of the standard rules (enabled > by default): > https://www.eff.org/https-everywhere/atlas/domains/python.org.html > > Perhaps you tweaked your configuration? Not for HTTPS Everywhere. > Regards > > Antoine. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From tjreedy at udel.edu Fri Sep 1 14:55:40 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 1 Sep 2017 14:55:40 -0400 Subject: [Python-Dev] HTTPS on bugs.python.org In-Reply-To: <20170901153100.GA15325@phdru.name> References: <20170901152958.6e246b94@fsol> <20170901153630.3bea59c5@fsol> <20170901172759.4048be7f@fsol> <20170901153100.GA15325@phdru.name> Message-ID: On 9/1/2017 11:31 AM, Oleg Broytman wrote: > On Fri, Sep 01, 2017 at 05:27:59PM +0200, Antoine Pitrou wrote: >> On Fri, 1 Sep 2017 17:03:59 +0200 >> Victor Stinner wrote: >>> >>>> And by the way the problem goes away if you use the "HTTPS Everywhere" >>>> plugin for Firefox. >>> >>> Try for example this page: >>> >>> https://bugs.python.org/issue31234?@ok_message=msg%20301118%20created >>> >>> For me, the "clear this message" link is HTTP, not HTTPS: >>> >>> http://bugs.python.org/issue31234 >> >> Sure, but if you click on this link, it will go to the HTTPS version >> nevertheless. > > It doesn't for me. :-( FFox 55.0.1, HTTPS Everywhere 2017.8.15. Is fetches https: for me: 55.0.3, 2017.8.31, updated yesterday. >> Regards >> >> Antoine. > > Oleg. > -- Terry Jan Reedy From phd at phdru.name Fri Sep 1 15:35:14 2017 From: phd at phdru.name (Oleg Broytman) Date: Fri, 1 Sep 2017 21:35:14 +0200 Subject: [Python-Dev] HTTPS on bugs.python.org In-Reply-To: References: <20170901152958.6e246b94@fsol> <20170901153630.3bea59c5@fsol> <20170901172759.4048be7f@fsol> <20170901153100.GA15325@phdru.name> Message-ID: <20170901193514.GA32251@phdru.name> On Fri, Sep 01, 2017 at 02:55:40PM -0400, Terry Reedy wrote: > On 9/1/2017 11:31 AM, Oleg Broytman wrote: > >On Fri, Sep 01, 2017 at 05:27:59PM +0200, Antoine Pitrou wrote: > >>On Fri, 1 Sep 2017 17:03:59 +0200 > >>Victor Stinner wrote: > >>> > >>>>And by the way the problem goes away if you use the "HTTPS Everywhere" > >>>>plugin for Firefox. 
> >>> > >>>Try for example this page: > >>> > >>>https://bugs.python.org/issue31234?@ok_message=msg%20301118%20created > >>> > >>>For me, the "clear this message" link is HTTP, not HTTPS: > >>> > >>>http://bugs.python.org/issue31234 > >> > >>Sure, but if you click on this link, it will go to the HTTPS version > >>nevertheless. > > > > It doesn't for me. :-( FFox 55.0.1, HTTPS Everywhere 2017.8.15. > > Is fetches https: for me: 55.0.3, 2017.8.31, updated yesterday. I upgraded Fox and the extension. http://bugs.python.org now is redirected to https:// Thanks! > >>Regards > >> > >>Antoine. > > > >Oleg. > -- > Terry Jan Reedy Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From chris.barker at noaa.gov Fri Sep 1 16:05:12 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 1 Sep 2017 13:05:12 -0700 Subject: [Python-Dev] Inplace operations for PyLong objects In-Reply-To: <20170901021259.0bb7d002@fsol> References: <20170901021259.0bb7d002@fsol> Message-ID: On Thu, Aug 31, 2017 at 5:12 PM, Antoine Pitrou wrote: > I'm skeptical there are some programs out there that are limited by the > speed of PyLong inplace additions. > indeed, but that could be said about any number of operations. My question is -- how can the interpreter know if it can alter what is supposed to be an immutable in-place? If it's used only internally to a function, the it would be safe, but how to know that? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From jelle.zijlstra at gmail.com Fri Sep 1 16:19:39 2017 From: jelle.zijlstra at gmail.com (Jelle Zijlstra) Date: Fri, 1 Sep 2017 13:19:39 -0700 Subject: [Python-Dev] Inplace operations for PyLong objects In-Reply-To: References: <20170901021259.0bb7d002@fsol> Message-ID: 2017-09-01 13:05 GMT-07:00 Chris Barker : > On Thu, Aug 31, 2017 at 5:12 PM, Antoine Pitrou > wrote: > >> I'm skeptical there are some programs out there that are limited by the >> speed of PyLong inplace additions. >> > > indeed, but that could be said about any number of operations. > > My question is -- how can the interpreter know if it can alter what is > supposed to be an immutable in-place? If it's used only internally to a > function, the it would be safe, but how to know that? > I believe Catalin's implementation checks if the object's refcount is 1. If that is the case, it is safe to mutate it. > > -CHB > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > jelle.zijlstra%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jjevnik at quantopian.com Fri Sep 1 16:35:41 2017 From: jjevnik at quantopian.com (Joe Jevnik) Date: Fri, 1 Sep 2017 16:35:41 -0400 Subject: [Python-Dev] Inplace operations for PyLong objects In-Reply-To: References: <20170901021259.0bb7d002@fsol> Message-ID: Is it true that checking for refcount == 1 is enough? What if a user wrote: args = (compute_integer(), 5) # give away args to someone int.__iadd__(*args) here `args[0]` still has refcount=1 because only `args` owns this integer. On Fri, Sep 1, 2017 at 4:19 PM, Jelle Zijlstra wrote: > > > 2017-09-01 13:05 GMT-07:00 Chris Barker : > >> On Thu, Aug 31, 2017 at 5:12 PM, Antoine Pitrou >> wrote: >> >>> I'm skeptical there are some programs out there that are limited by the >>> speed of PyLong inplace additions. >>> >> >> indeed, but that could be said about any number of operations. >> >> My question is -- how can the interpreter know if it can alter what is >> supposed to be an immutable in-place? If it's used only internally to a >> function, the it would be safe, but how to know that? >> > I believe Catalin's implementation checks if the object's refcount is 1. > If that is the case, it is safe to mutate it. > >> >> -CHB >> >> >> >> -- >> >> Christopher Barker, Ph.D. >> Oceanographer >> >> Emergency Response Division >> NOAA/NOS/OR&R (206) 526-6959 voice >> 7600 Sand Point Way NE (206) 526-6329 fax >> Seattle, WA 98115 (206) 526-6317 main reception >> >> Chris.Barker at noaa.gov >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/jelle. >> zijlstra%40gmail.com >> >> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > joe%40quantopian.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From catalin.gabriel.manciu at intel.com Fri Sep 1 16:49:44 2017 From: catalin.gabriel.manciu at intel.com (Manciu, Catalin Gabriel) Date: Fri, 1 Sep 2017 20:49:44 +0000 Subject: [Python-Dev] Inplace operations for PyLong objects In-Reply-To: References: <20170901021259.0bb7d002@fsol> Message-ID: > My question is -- how can the interpreter know if it can alter what is > supposed to be an immutable in-place? If it's used only internally to a > function, the it would be safe, but how to know that? > -CHB You can just check the reference count of your object, it's a member of the PyObject structure which every CPython object contains: ob_refcnt. This will indicate if your object is referenced by other Python variables or by Python containers such as lists, tuples, dicts, sets etc. From rosuav at gmail.com Fri Sep 1 17:12:13 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 2 Sep 2017 07:12:13 +1000 Subject: [Python-Dev] Inplace operations for PyLong objects In-Reply-To: References: <20170901021259.0bb7d002@fsol> Message-ID: On Sat, Sep 2, 2017 at 6:35 AM, Joe Jevnik via Python-Dev wrote: > Is it true that checking for refcount == 1 is enough? What if a user wrote: > > args = (compute_integer(), 5) > # give away args to someone > int.__iadd__(*args) > > here `args[0]` still has refcount=1 because only `args` owns this integer. 
This particular example is safe, because the arguments get passed individually - so 'args' has one reference, plus there's one more for the actual function call (what would be 'self' if it were implemented in Python). There may be other examples that are more dangerous, but my suspicion is that the nature of += with anything other than a simple name will defeat the optimization, since the owning collection will retain a reference until __iadd__ returns. ChrisA From yselivanov.ml at gmail.com Fri Sep 1 19:02:16 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Fri, 1 Sep 2017 19:02:16 -0400 Subject: [Python-Dev] PEP 550 V5 Message-ID: Hi, Below is the fifth iteration of the PEP. The summary of changes is in the "Version History" section, but I'll list them here too: * Coroutines have no logical context by default (a revert to the V3 semantics). Read about the motivation in the `Coroutines not leaking context changes by default`_ section. The `High-Level Specification`_ section was also updated (specifically Generators and Coroutines subsections). * All APIs have been placed to the ``contextvars`` module, and the factory functions were changed to class constructors (``ContextVar``, ``ExecutionContext``, and ``LogicalContext``). Thanks to Nick for the idea. * ``ContextVar.lookup()`` got renamed back to ``ContextVar.get()`` and gained the ``topmost`` and ``default`` keyword arguments. Added ``ContextVar.delete()``. * Fixed ``ContextVar.get()`` cache bug (thanks Nathaniel!). * New `Rejected Ideas`_, `Should "yield from" leak context changes?`_, `Alternative Designs for ContextVar API`_, `Setting and restoring context variables`_, and `Context manager as the interface for modifications`_ sections. Thanks! PEP: 550 Title: Execution Context Version: $Revision$ Last-Modified: $Date$ Author: Yury Selivanov , Elvis Pranskevichus Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 11-Aug-2017 Python-Version: 3.7 Post-History: 11-Aug-2017, 15-Aug-2017, 18-Aug-2017, 25-Aug-2017, 01-Sep-2017 Abstract ======== This PEP adds a new generic mechanism of ensuring consistent access to non-local state in the context of out-of-order execution, such as in Python generators and coroutines. Thread-local storage, such as ``threading.local()``, is inadequate for programs that execute concurrently in the same OS thread. This PEP proposes a solution to this problem. Rationale ========= Prior to the advent of asynchronous programming in Python, programs used OS threads to achieve concurrency. The need for thread-specific state was solved by ``threading.local()`` and its C-API equivalent, ``PyThreadState_GetDict()``. A few examples of where Thread-local storage (TLS) is commonly relied upon: * Context managers like decimal contexts, ``numpy.errstate``, and ``warnings.catch_warnings``. * Request-related data, such as security tokens and request data in web applications, language context for ``gettext`` etc. * Profiling, tracing, and logging in large code bases. Unfortunately, TLS does not work well for programs which execute concurrently in a single thread. A Python generator is the simplest example of a concurrent program. 
Consider the following::

    def fractions(precision, x, y):
        with decimal.localcontext() as ctx:
            ctx.prec = precision
            yield Decimal(x) / Decimal(y)
            yield Decimal(x) / Decimal(y ** 2)

    g1 = fractions(precision=2, x=1, y=3)
    g2 = fractions(precision=6, x=2, y=3)

    items = list(zip(g1, g2))

The intuitively expected value of ``items`` is::

    [(Decimal('0.33'), Decimal('0.666667')),
     (Decimal('0.11'), Decimal('0.222222'))]

Rather surprisingly, the actual result is::

    [(Decimal('0.33'), Decimal('0.666667')),
     (Decimal('0.111111'), Decimal('0.222222'))]

This is because implicit Decimal context is stored as a thread-local,
so concurrent iteration of the ``fractions()`` generator would
corrupt the state.  For Decimal, specifically, the only current
workaround is to use explicit context method calls for all arithmetic
operations [28]_.  Arguably, this defeats the usefulness of overloaded
operators and makes even simple formulas hard to read and write.

Coroutines are another class of Python code where TLS unreliability
is a significant issue.  The inadequacy of TLS in asynchronous code
has led to the proliferation of ad-hoc solutions, which are limited
in scope and do not support all required use cases.

The current status quo is that any library (including the standard
library) that relies on TLS is likely to be broken when used in
asynchronous code or with generators (see [3]_ as an example issue).

Some languages that support coroutines or generators recommend
passing the context manually as an argument to every function; see
[1]_ for an example.  This approach, however, has limited use for
Python, where there is a large ecosystem that was built to work with
a TLS-like context.  Furthermore, libraries like ``decimal`` or
``numpy`` rely on context implicitly in overloaded operator
implementations.

The .NET runtime, which has support for async/await, has a generic
solution for this problem, called ``ExecutionContext`` (see [2]_).


Goals
=====

The goal of this PEP is to provide a more reliable
``threading.local()`` alternative, which:

* provides the mechanism and the API to fix non-local state issues
  with coroutines and generators;

* implements TLS-like semantics for synchronous code, so that
  users like ``decimal`` and ``numpy`` can switch to the new
  mechanism with minimal risk of breaking backwards compatibility;

* has no or negligible performance impact on the existing code or
  the code that will be using the new mechanism, including
  C extensions.


High-Level Specification
========================

The full specification of this PEP is broken down into three parts:

* High-Level Specification (this section): the description of the
  overall solution.  We show how it applies to generators and
  coroutines in user code, without delving into implementation
  details.

* Detailed Specification: the complete description of new concepts,
  APIs, and related changes to the standard library.

* Implementation Details: the description and analysis of data
  structures and algorithms used to implement this PEP, as well as
  the necessary changes to CPython.

For the purpose of this section, we define *execution context* as an
opaque container of non-local state that allows consistent access to
its contents in the concurrent execution environment.

A *context variable* is an object representing a value in the
execution context.  A call to ``contextvars.ContextVar(name)``
creates a new context variable object.
A context variable object has three methods:

* ``get()``: returns the value of the variable in the current
  execution context;

* ``set(value)``: sets the value of the variable in the current
  execution context;

* ``delete()``: can be used for restoring variable state; its
  purpose and semantics are explained in
  `Setting and restoring context variables`_.


Regular Single-threaded Code
----------------------------

In regular, single-threaded code that doesn't involve generators or
coroutines, context variables behave like globals::

    var = contextvars.ContextVar('var')

    def sub():
        assert var.get() == 'main'
        var.set('sub')

    def main():
        var.set('main')
        sub()
        assert var.get() == 'sub'


Multithreaded Code
------------------

In multithreaded code, context variables behave like thread locals::

    var = contextvars.ContextVar('var')

    def sub():
        assert var.get() is None  # The execution context is empty
                                  # for each new thread.
        var.set('sub')

    def main():
        var.set('main')

        thread = threading.Thread(target=sub)
        thread.start()
        thread.join()

        assert var.get() == 'main'


Generators
----------

Unlike regular function calls, generators can cooperatively yield
their control of execution to the caller.  Furthermore, a generator
does not control *where* the execution would continue after it
yields.  It may be resumed from an arbitrary code location.

For these reasons, the least surprising behaviour of generators is
as follows:

* changes to context variables are always local and are not visible
  in the outer context, but are visible to the code called by the
  generator;

* once set in the generator, the context variable is guaranteed not
  to change between iterations;

* changes to context variables in outer context (where the generator
  is being iterated) are visible to the generator, unless these
  variables were also modified inside the generator.

Let's review::

    var1 = contextvars.ContextVar('var1')
    var2 = contextvars.ContextVar('var2')

    def gen():
        var1.set('gen')
        assert var1.get() == 'gen'
        assert var2.get() == 'main'
        yield 1

        # Modification to var1 in main() is shielded by
        # gen()'s local modification.
        assert var1.get() == 'gen'

        # But modifications to var2 are visible
        assert var2.get() == 'main modified'
        yield 2

    def main():
        g = gen()

        var1.set('main')
        var2.set('main')

        next(g)

        # Modification of var1 in gen() is not visible.
        assert var1.get() == 'main'

        var1.set('main modified')
        var2.set('main modified')

        next(g)

Now, let's revisit the decimal precision example from the
`Rationale`_ section, and see how the execution context can improve
the situation::

    import decimal

    # create a new context var
    decimal_ctx = contextvars.ContextVar('decimal context')

    # Pre-PEP 550 Decimal relies on TLS for its context.
    # For illustration purposes, we monkey-patch the decimal
    # context functions to use the execution context.
    # A real working fix would need to properly update the
    # C implementation as well.

    def patched_setcontext(context):
        decimal_ctx.set(context)

    def patched_getcontext():
        ctx = decimal_ctx.get()
        if ctx is None:
            ctx = decimal.Context()
            decimal_ctx.set(ctx)
        return ctx

    decimal.setcontext = patched_setcontext
    decimal.getcontext = patched_getcontext

    def fractions(precision, x, y):
        with decimal.localcontext() as ctx:
            ctx.prec = precision
            yield Decimal(x) / Decimal(y)
            yield Decimal(x) / Decimal(y ** 2)

    g1 = fractions(precision=2, x=1, y=3)
    g2 = fractions(precision=6, x=2, y=3)

    items = list(zip(g1, g2))

The value of ``items`` is::

    [(Decimal('0.33'), Decimal('0.666667')),
     (Decimal('0.11'), Decimal('0.222222'))]

which matches the expected result.
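
The ``set()``/``delete()`` pair described at the beginning of this
section can also be used for explicit, manual state management.  The
following is a minimal sketch that assumes only the proposed
``ContextVar`` methods; using ``None`` as the "not set" marker is a
simplification (it conflates an unset variable with one explicitly
set to ``None``), and `Setting and restoring context variables`_
specifies restoration semantics in full::

    var = contextvars.ContextVar('var')

    def run_with_value(value, func):
        # Save whatever value ``var`` has in the topmost logical
        # context (None if it is not set there).
        saved = var.get(topmost=True)
        var.set(value)
        try:
            return func()
        finally:
            if saved is None:
                # ``var`` was not set in the topmost LC before our
                # set() call; remove the modification entirely.
                var.delete()
            else:
                var.set(saved)
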
Coroutines and Asynchronous Tasks
---------------------------------

Like generators, coroutines can yield and regain control.  The major
difference from generators is that coroutines do not yield to the
immediate caller.  Instead, the entire coroutine call stack
(coroutines chained by ``await``) switches to another coroutine call
stack.  In this regard, ``await``-ing on a coroutine is conceptually
similar to a regular function call, and a coroutine chain (or a
"task", e.g. an ``asyncio.Task``) is conceptually similar to a
thread.

From this similarity we conclude that context variables in coroutines
should behave like "task locals":

* changes to context variables in a coroutine are visible to the
  coroutine that awaits on it;

* changes to context variables made in the caller prior to awaiting
  are visible to the awaited coroutine;

* changes to context variables made in one task are not visible in
  other tasks;

* tasks spawned by other tasks inherit the execution context from the
  parent task, but any changes to context variables made in the
  parent task *after* the child task was spawned are *not* visible.

The last point shows behaviour that is different from OS threads.
OS threads do not inherit the execution context by default.
There are two reasons for this: *common usage intent* and backwards
compatibility.

The main reason for why tasks inherit the context, and threads do
not, is the common usage intent.  Tasks are often used for relatively
short-running operations which are logically tied to the code that
spawned the tasks (like running a coroutine with a timeout in
asyncio).  OS threads, on the other hand, are normally used for
long-running, logically separate code.

With respect to backwards compatibility, we want the execution
context to behave like ``threading.local()``.  This is so that
libraries can start using the execution context in place of TLS
with a lesser risk of breaking compatibility with existing code.

Let's review a few examples to illustrate the semantics we have
just defined.

Context variable propagation in a single task::

    import asyncio

    var = contextvars.ContextVar('var')

    async def main():
        var.set('main')
        await sub()
        # The effect of sub() is visible.
        assert var.get() == 'sub'

    async def sub():
        assert var.get() == 'main'
        var.set('sub')
        assert var.get() == 'sub'

    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())

Context variable propagation between tasks::

    import asyncio

    var = contextvars.ContextVar('var')

    async def main():
        var.set('main')
        loop.create_task(sub())  # schedules asynchronous execution
                                 # of sub().
        assert var.get() == 'main'
        var.set('main changed')

    async def sub():
        # Sleeping will make sub() run after
        # `var` is modified in main().
        await asyncio.sleep(1)

        # The value of "var" is inherited from main(), but any
        # changes to "var" made in main() after the task
        # was created are *not* visible.
        assert var.get() == 'main'

        # This change is local to sub() and will not be visible
        # to other tasks, including main().
        var.set('sub')

    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())

As shown above, changes to the execution context are local to the
task, and tasks get a snapshot of the execution context at the point
of creation.

There is one narrow edge case when this can lead to surprising
behaviour.
Consider the following example where we modify the context variable
in a nested coroutine::

    async def sub(var_value):
        await asyncio.sleep(1)
        var.set(var_value)

    async def main():
        var.set('main')

        # waiting for sub() directly
        await sub('sub-1')
        # var change is visible
        assert var.get() == 'sub-1'

        # waiting for sub() with a timeout;
        await asyncio.wait_for(sub('sub-2'), timeout=2)
        # wait_for() creates an implicit task, which isolates
        # context changes, which means that the below assertion
        # will fail.
        assert var.get() == 'sub-2'  # AssertionError!

However, relying on context changes leaking to the caller is
ultimately a bad pattern.  For this reason, the behaviour shown in
the above example is not considered a major issue and can be
addressed with proper documentation.


Detailed Specification
======================

Conceptually, an *execution context* (EC) is a stack of logical
contexts.  There is always exactly one active EC per Python thread.

A *logical context* (LC) is a mapping of context variables to their
values in that particular LC.

A *context variable* is an object representing a value in the
execution context.  A new context variable object is created by
calling ``contextvars.ContextVar(name: str)``.  The value of the
required ``name`` argument is not used by the EC machinery, but may
be used for debugging and introspection.

The context variable object has the following methods and attributes:

* ``name``: the value passed to ``ContextVar()``.

* ``get(*, topmost=False, default=None)``: if *topmost* is ``False``
  (the default), traverses the execution context top-to-bottom until
  the variable value is found; if *topmost* is ``True``, returns the
  value of the variable in the topmost logical context.  If the
  variable value was not found, returns the value of *default*.

* ``set(value)``: sets the value of the variable in the topmost
  logical context.

* ``delete()``: removes the variable from the topmost logical
  context.  Useful when restoring the logical context to the state
  prior to the ``set()`` call, for example, in a context manager;
  see `Setting and restoring context variables`_ for more
  information.


Generators
----------

When created, each generator object has an empty logical context
object stored in its ``__logical_context__`` attribute.
Generators
----------

When created, each generator object has an empty logical context
object stored in its ``__logical_context__`` attribute.  This logical
context is pushed onto the execution context at the beginning of each
generator iteration and popped at the end::

    var1 = contextvars.ContextVar('var1')
    var2 = contextvars.ContextVar('var2')

    def gen():
        var1.set('var1-gen')
        var2.set('var2-gen')

        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'})
        # ]
        n = nested_gen()  # nested_gen_LC is created
        next(n)
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'})
        # ]

        var1.set('var1-gen-mod')
        var2.set('var2-gen-mod')
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen-mod', var2: 'var2-gen-mod'})
        # ]
        next(n)

    def nested_gen():
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'}),
        #     nested_gen_LC()
        # ]
        assert var1.get() == 'var1-gen'
        assert var2.get() == 'var2-gen'

        var1.set('var1-nested-gen')
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'}),
        #     nested_gen_LC({var1: 'var1-nested-gen'})
        # ]
        yield

        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen-mod', var2: 'var2-gen-mod'}),
        #     nested_gen_LC({var1: 'var1-nested-gen'})
        # ]
        assert var1.get() == 'var1-nested-gen'
        assert var2.get() == 'var2-gen-mod'

        yield

    # EC = [outer_LC()]

    g = gen()  # gen_LC is created for the generator object `g`
    list(g)

    # EC = [outer_LC()]

The snippet above shows the state of the execution context stack
throughout the generator lifespan.


contextlib.contextmanager
-------------------------

The ``contextlib.contextmanager()`` decorator can be used to turn
a generator into a context manager.  A context manager that
temporarily modifies the value of a context variable could be defined
like this::

    var = contextvars.ContextVar('var')

    @contextlib.contextmanager
    def var_context(value):
        original_value = var.get()

        try:
            var.set(value)
            yield
        finally:
            var.set(original_value)

Unfortunately, this would not work straight away, as the modification
to the ``var`` variable is contained to the ``var_context()``
generator, and therefore will not be visible inside the ``with``
block::

    def func():
        # EC = [{}, {}]

        with var_context(10):
            # EC becomes [{}, {}, {var: 10}] in the
            # *var_context()* generator,
            # but here the EC is still [{}, {}]

            assert var.get() == 10  # AssertionError!

The way to fix this is to set the generator's
``__logical_context__`` attribute to ``None``.  This will cause
the generator to avoid modifying the execution context stack.

We modify the ``contextlib.contextmanager()`` decorator to
set ``genobj.__logical_context__`` to ``None`` to produce
well-behaved context managers::

    def func():
        # EC = [{}, {}]

        with var_context(10):
            # EC = [{}, {var: 10}]
            assert var.get() == 10

        # EC becomes [{}, {var: None}]


Enumerating context vars
------------------------

The ``ExecutionContext.vars()`` method returns a list of
``ContextVar`` objects that have values in the execution context.
This method is mostly useful for introspection and logging.


Coroutines
----------

In CPython, coroutines share the implementation with generators.
The difference is that in coroutines ``__logical_context__`` defaults
to ``None``.  This affects both the ``async def`` coroutines and the
old-style generator-based coroutines (generators decorated with
``@types.coroutine``).
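The practical consequence is easy to demonstrate.  In the following
sketch (assuming the semantics above), a generator isolates its
change, while a manually driven coroutine leaks it to the caller::

    var = contextvars.ContextVar('var')

    def g():
        var.set('gen')
        yield

    async def c():
        var.set('coro')

    var.set('outer')

    # The generator's set() lands in its own logical context.
    list(g())
    assert var.get() == 'outer'

    # The coroutine has no logical context of its own, so its
    # set() is visible to the code that runs it.
    coro = c()
    try:
        coro.send(None)  # drive the coroutine to completion
    except StopIteration:
        pass
    assert var.get() == 'coro'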
Asynchronous Generators
-----------------------

The execution context semantics in asynchronous generators do not
differ from those of regular generators.


asyncio
-------

``asyncio`` uses ``Loop.call_soon``, ``Loop.call_later``,
and ``Loop.call_at`` to schedule the asynchronous execution of a
function.  ``asyncio.Task`` uses ``call_soon()`` to run the wrapped
coroutine.

We modify ``Loop.call_{at,later,soon}`` to accept the new
optional *execution_context* keyword argument, which defaults to
the copy of the current execution context::

    def call_soon(self, callback, *args, execution_context=None):
        if execution_context is None:
            execution_context = contextvars.get_execution_context()

        # ... some time later
        contextvars.run_with_execution_context(
            execution_context, callback, args)

The ``contextvars.get_execution_context()`` function returns a
shallow copy of the current execution context.  By shallow copy here
we mean such a new execution context that:

* lookups in the copy provide the same results as in the original
  execution context, and

* any changes in the original execution context do not affect the
  copy, and

* any changes to the copy do not affect the original execution
  context.

Either of the following satisfies the copy requirements:

* a new stack with shallow copies of logical contexts;

* a new stack with one squashed logical context.

The ``contextvars.run_with_execution_context(ec, func, *args,
**kwargs)`` function runs ``func(*args, **kwargs)`` with *ec* as the
execution context.  The function performs the following steps:

1. Set *ec* as the current execution context stack in the current
   thread.

2. Push an empty logical context onto the stack.

3. Run ``func(*args, **kwargs)``.

4. Pop the logical context from the stack.

5. Restore the original execution context stack.

6. Return or raise the ``func()`` result.

These steps ensure that *ec* cannot be modified by *func*,
which makes ``run_with_execution_context()`` idempotent.

``asyncio.Task`` is modified as follows::

    class Task:
        def __init__(self, coro):
            ...
            # Get the current execution context snapshot.
            self._exec_context = contextvars.get_execution_context()

            # Create an empty Logical Context that will be
            # used by coroutines run in the task.
            coro.__logical_context__ = contextvars.LogicalContext()

            self._loop.call_soon(
                self._step,
                execution_context=self._exec_context)

        def _step(self, exc=None):
            ...
            self._loop.call_soon(
                self._step,
                execution_context=self._exec_context)
            ...


Generators Transformed into Iterators
-------------------------------------

Any Python generator can be represented as an equivalent iterator.
Compilers like Cython rely on this axiom.  With respect to the
execution context, such an iterator should behave the same way as
the generator it represents.

This means that there needs to be a Python API to create new logical
contexts and run code with a given logical context.

The ``contextvars.LogicalContext()`` function creates a new empty
logical context.

The ``contextvars.run_with_logical_context(lc, func, *args,
**kwargs)`` function can be used to run functions in the specified
logical context.  The *lc* can be modified as a result of the call.

The ``contextvars.run_with_logical_context()`` function performs the
following steps:

1. Push *lc* onto the current execution context stack.

2. Run ``func(*args, **kwargs)``.

3. Pop *lc* from the execution context stack.

4. Return or raise the ``func()`` result.

By using ``LogicalContext()`` and ``run_with_logical_context()``, we
can replicate the generator behaviour like this::

    class Generator:

        def __init__(self):
            self.logical_context = contextvars.LogicalContext()

        def __iter__(self):
            return self

        def __next__(self):
            return contextvars.run_with_logical_context(
                self.logical_context, self._next_impl)

        def _next_impl(self):
            # Actual __next__ implementation.
            ...
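In isolation, the two primitives behave as in this minimal sketch
(assuming the API above)::

    var = contextvars.ContextVar('var')

    def work():
        var.set('inner')

    lc = contextvars.LogicalContext()
    var.set('outer')

    # work() runs with `lc` pushed on top of the execution context
    # stack, so its set() lands in `lc` ...
    contextvars.run_with_logical_context(lc, work)

    # ... and is therefore not visible in the calling context.
    assert var.get() == 'outer'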
Let's see how this pattern can be applied to an example generator:: # create a new context variable var = contextvars.ContextVar('var') def gen_series(n): var.set(10) for i in range(1, n): yield var.get() * i # gen_series is equivalent to the following iterator: class CompiledGenSeries: # This class is what the `gen_series()` generator can # be transformed to by a compiler like Cython. def __init__(self, n): # Create a new empty logical context, # like the generators do. self.logical_context = contextvars.LogicalContext() # Initialize the generator in its LC. # Otherwise `var.set(10)` in the `_init` method # would leak. contextvars.run_with_logical_context( self.logical_context, self._init, n) def _init(self, n): self.i = 1 self.n = n var.set(10) def __iter__(self): return self def __next__(self): # Run the actual implementation of __next__ in our LC. return contextvars.run_with_logical_context( self.logical_context, self._next_impl) def _next_impl(self): if self.i == self.n: raise StopIteration result = var.get() * self.i self.i += 1 return result For hand-written iterators such approach to context management is normally not necessary, and it is easier to set and restore context variables directly in ``__next__``:: class MyIterator: # ... def __next__(self): old_val = var.get() try: var.set(new_val) # ... finally: var.set(old_val) Implementation ============== Execution context is implemented as an immutable linked list of logical contexts, where each logical context is an immutable weak key mapping. A pointer to the currently active execution context is stored in the OS thread state:: +-----------------+ | | ec | PyThreadState +-------------+ | | | +-----------------+ | | ec_node ec_node ec_node v +------+------+ +------+------+ +------+------+ | NULL | lc |<----| prev | lc |<----| prev | lc | +------+--+---+ +------+--+---+ +------+--+---+ | | | LC v LC v LC v +-------------+ +-------------+ +-------------+ | var1: obj1 | | EMPTY | | var1: obj4 | | var2: obj2 | +-------------+ +-------------+ | var3: obj3 | +-------------+ The choice of the immutable list of immutable mappings as a fundamental data structure is motivated by the need to efficiently implement ``contextvars.get_execution_context()``, which is to be frequently used by asynchronous tasks and callbacks. When the EC is immutable, ``get_execution_context()`` can simply copy the current execution context *by reference*:: def get_execution_context(self): return PyThreadState_Get().ec Let's review all possible context modification scenarios: * The ``ContextVariable.set()`` method is called:: def ContextVar_set(self, val): # See a more complete set() definition # in the `Context Variables` section. 
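        # Logical contexts are immutable: top_lc.set() below returns
        # a *new* LC, and a new ec_node is created to reference it,
        # leaving any previously captured EC snapshots untouched.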
        tstate = PyThreadState_Get()
        top_ec_node = tstate.ec
        top_lc = top_ec_node.lc
        new_top_lc = top_lc.set(self, val)
        tstate.ec = ec_node(
            prev=top_ec_node.prev, lc=new_top_lc)

* The ``contextvars.run_with_logical_context()`` is called, in which
  case the passed logical context object is appended to the
  execution context::

    def run_with_logical_context(lc, func, *args, **kwargs):
        tstate = PyThreadState_Get()

        old_top_ec_node = tstate.ec
        new_top_ec_node = ec_node(prev=old_top_ec_node, lc=lc)

        try:
            tstate.ec = new_top_ec_node
            return func(*args, **kwargs)
        finally:
            tstate.ec = old_top_ec_node

* The ``contextvars.run_with_execution_context()`` is called, in
  which case the current execution context is set to the passed
  execution context with a new empty logical context appended to
  it::

    def run_with_execution_context(ec, func, *args, **kwargs):
        tstate = PyThreadState_Get()

        old_top_ec_node = tstate.ec
        new_lc = contextvars.LogicalContext()
        new_top_ec_node = ec_node(prev=ec, lc=new_lc)

        try:
            tstate.ec = new_top_ec_node
            return func(*args, **kwargs)
        finally:
            tstate.ec = old_top_ec_node

* ``genobj.send()``, ``genobj.throw()``, or ``genobj.close()`` is
  called on a ``genobj`` generator, in which case the logical context
  recorded in ``genobj`` is pushed onto the stack::

    PyGen_New(PyGenObject *gen):
        if (gen.gi_code.co_flags &
                (CO_COROUTINE | CO_ITERABLE_COROUTINE)):
            # gen is an 'async def' coroutine, or a generator
            # decorated with @types.coroutine.
            gen.__logical_context__ = None
        else:
            # Non-coroutine generator
            gen.__logical_context__ = contextvars.LogicalContext()

    gen_send(PyGenObject *gen, ...):
        tstate = PyThreadState_Get()

        if gen.__logical_context__ is not None:
            old_top_ec_node = tstate.ec
            new_top_ec_node = ec_node(
                prev=old_top_ec_node,
                lc=gen.__logical_context__)

            try:
                tstate.ec = new_top_ec_node
                return _gen_send_impl(gen, ...)
            finally:
                gen.__logical_context__ = tstate.ec.lc
                tstate.ec = old_top_ec_node
        else:
            return _gen_send_impl(gen, ...)

* Coroutines and asynchronous generators share the implementation
  with generators, and the above changes apply to them as well.

In certain scenarios the EC may need to be squashed to limit the
size of the chain.  For example, consider the following corner case::

    async def repeat(coro, delay):
        await coro()
        await asyncio.sleep(delay)
        loop.create_task(repeat(coro, delay))

    async def ping():
        print('ping')

    loop = asyncio.get_event_loop()
    loop.create_task(repeat(ping, 1))
    loop.run_forever()

In the above code, the EC chain will grow as long as ``repeat()`` is
called.  Each new task will call
``contextvars.run_with_execution_context()``, which will append a new
logical context to the chain.  To prevent unbounded growth,
``contextvars.get_execution_context()`` checks if the chain is longer
than a predetermined maximum, and if it is, squashes the chain into a
single LC (note that the loop variable below must not shadow the
``ec_node`` constructor)::

    def get_execution_context():
        tstate = PyThreadState_Get()

        if tstate.ec_len > EC_LEN_MAX:
            squashed_lc = contextvars.LogicalContext()

            node = tstate.ec
            while node:
                # The LC.merge() method does not replace
                # existing keys.
                squashed_lc = squashed_lc.merge(node.lc)
                node = node.prev

            return ec_node(prev=NULL, lc=squashed_lc)
        else:
            return tstate.ec


Logical Context
---------------

Logical context is an immutable weak key mapping which has the
following properties with respect to garbage collection:

* ``ContextVar`` objects are strongly-referenced only from the
  application code, not from any of the execution context machinery
  or values they point to.
This means that there are no reference cycles that could extend their lifespan longer than necessary, or prevent their collection by the GC. * Values put in the execution context are guaranteed to be kept alive while there is a ``ContextVar`` key referencing them in the thread. * If a ``ContextVar`` is garbage collected, all of its values will be removed from all contexts, allowing them to be GCed if needed. * If an OS thread has ended its execution, its thread state will be cleaned up along with its execution context, cleaning up all values bound to all context variables in the thread. As discussed earlier, we need ``contextvars.get_execution_context()`` to be consistently fast regardless of the size of the execution context, so logical context is necessarily an immutable mapping. Choosing ``dict`` for the underlying implementation is suboptimal, because ``LC.set()`` will cause ``dict.copy()``, which is an O(N) operation, where *N* is the number of items in the LC. ``get_execution_context()``, when squashing the EC, is an O(M) operation, where *M* is the total number of context variable values in the EC. So, instead of ``dict``, we choose Hash Array Mapped Trie (HAMT) as the underlying implementation of logical contexts. (Scala and Clojure use HAMT to implement high performance immutable collections [5]_, [6]_.) With HAMT ``.set()`` becomes an O(log N) operation, and ``get_execution_context()`` squashing is more efficient on average due to structural sharing in HAMT. See `Appendix: HAMT Performance Analysis`_ for a more elaborate analysis of HAMT performance compared to ``dict``. Context Variables ----------------- The ``ContextVar.get()`` and ``ContextVar.set()`` methods are implemented as follows (in pseudo-code):: class ContextVar: def get(self, *, default=None, topmost=False): tstate = PyThreadState_Get() ec_node = tstate.ec while ec_node: if self in ec_node.lc: return ec_node.lc[self] if topmost: break ec_node = ec_node.prev return default def set(self, value): tstate = PyThreadState_Get() top_ec_node = tstate.ec if top_ec_node is not None: top_lc = top_ec_node.lc new_top_lc = top_lc.set(self, value) tstate.ec = ec_node( prev=top_ec_node.prev, lc=new_top_lc) else: # First ContextVar.set() in this OS thread. top_lc = contextvars.LogicalContext() new_top_lc = top_lc.set(self, value) tstate.ec = ec_node( prev=NULL, lc=new_top_lc) def delete(self): tstate = PyThreadState_Get() top_ec_node = tstate.ec if top_ec_node is None: raise LookupError top_lc = top_ec_node.lc if self not in top_lc: raise LookupError new_top_lc = top_lc.delete(self) tstate.ec = ec_node( prev=top_ec_node.prev, lc=new_top_lc) For efficient access in performance-sensitive code paths, such as in ``numpy`` and ``decimal``, we cache lookups in ``ContextVar.get()``, making it an O(1) operation when the cache is hit. The cache key is composed from the following: * The new ``uint64_t PyThreadState->unique_id``, which is a globally unique thread state identifier. It is computed from the new ``uint64_t PyInterpreterState->ts_counter``, which is incremented whenever a new thread state is created. * The new ``uint64_t PyThreadState->stack_version``, which is a thread-specific counter, which is incremented whenever a non-empty logical context is pushed onto the stack or popped from the stack. * The ``uint64_t ContextVar->version`` counter, which is incremented whenever the context variable value is changed in any logical context in any OS thread. 
The cache is then implemented as follows::

    class ContextVar:

        def set(self, value):
            ...  # implementation
            self.version += 1

        def get(self, *, default=None, topmost=False):
            if topmost:
                return self._get_uncached(
                    default=default, topmost=topmost)

            tstate = PyThreadState_Get()
            if (self.last_tstate_id == tstate.unique_id and
                    self.last_stack_version == tstate.stack_version and
                    self.last_version == self.version):
                return self.last_value

            value = self._get_uncached(default=default)

            self.last_value = value  # borrowed ref
            self.last_tstate_id = tstate.unique_id
            self.last_stack_version = tstate.stack_version
            self.last_version = self.version

            return value

Note that ``last_value`` is a borrowed reference.  We assume that
if the version checks are fine, the value object will be alive.
This allows the values of context variables to be properly garbage
collected.

This generic caching approach is similar to what the current C
implementation of ``decimal`` does to cache the current decimal
context, and has similar performance characteristics.


Performance Considerations
==========================

Tests of the reference implementation based on the prior
revisions of this PEP have shown 1-2% slowdown on generator
microbenchmarks and no noticeable difference in macrobenchmarks.

The performance of non-generator and non-async code is not
affected by this PEP.


Summary of the New APIs
=======================

Python
------

The following new Python APIs are introduced by this PEP:

1. The new ``contextvars.ContextVar(name: str='...')`` class,
   instances of which have the following:

   * the read-only ``.name`` attribute,
   * the ``.get()`` method, which returns the value of the variable
     in the current execution context;
   * the ``.set()`` method, which sets the value of the variable in
     the current logical context;
   * the ``.delete()`` method, which removes the value of the
     variable from the current logical context.

2. The new ``contextvars.ExecutionContext()`` class, which represents
   an execution context.

3. The new ``contextvars.LogicalContext()`` class, which represents
   a logical context.

4. The new ``contextvars.get_execution_context()`` function, which
   returns an ``ExecutionContext`` instance representing a copy of
   the current execution context.

5. The ``contextvars.run_with_execution_context(ec: ExecutionContext,
   func, *args, **kwargs)`` function, which runs *func* with the
   provided execution context.

6. The ``contextvars.run_with_logical_context(lc: LogicalContext,
   func, *args, **kwargs)`` function, which runs *func* with the
   provided logical context on top of the current execution context.


C API
-----

1. ``PyContextVar * PyContext_NewVar(char *desc)``: create a
   ``PyContextVar`` object.

2. ``PyObject * PyContext_GetValue(PyContextVar *, int topmost)``:
   return the value of the variable in the current execution context.

3. ``int PyContext_SetValue(PyContextVar *, PyObject *)``: set
   the value of the variable in the current logical context.

4. ``int PyContext_DelValue(PyContextVar *)``: delete the value of
   the variable from the current logical context.

5. ``PyLogicalContext * PyLogicalContext_New()``: create a new empty
   ``PyLogicalContext``.

6. ``PyExecutionContext * PyExecutionContext_New()``: create a new
   empty ``PyExecutionContext``.

7. ``PyExecutionContext * PyExecutionContext_Get()``: return the
   current execution context.

8.
``int PyContext_SetCurrent( PyExecutionContext *, PyLogicalContext *)``: set the passed EC object as the current execution context for the active thread state, and/or set the passed LC object as the current logical context. Design Considerations ===================== Should "yield from" leak context changes? ----------------------------------------- No. It may be argued that ``yield from`` is semantically equivalent to calling a function, and should leak context changes. However, it is not possible to satisfy the following at the same time: * ``next(gen)`` *does not* leak context changes made in ``gen``, and * ``yield from gen`` *leaks* context changes made in ``gen``. The reason is that ``yield from`` can be used with a partially iterated generator, which already has local context changes:: var = contextvars.ContextVar('var') def gen(): for i in range(10): var.set('gen') yield i def outer_gen(): var.set('outer_gen') g = gen() yield next(g) # Changes not visible during partial iteration, # the goal of this PEP: assert var.get() == 'outer_gen' yield from g assert var.get() == 'outer_gen' # or 'gen'? Another example would be refactoring of an explicit ``for..in yield`` construct to a ``yield from`` expression. Consider the following code:: def outer_gen(): var.set('outer_gen') for i in gen(): yield i assert var.get() == 'outer_gen' which we want to refactor to use ``yield from``:: def outer_gen(): var.set('outer_gen') yield from gen() assert var.get() == 'outer_gen' # or 'gen'? The above examples illustrate that it is unsafe to refactor generator code using ``yield from`` when it can leak context changes. Thus, the only well-defined and consistent behaviour is to **always** isolate context changes in generators, regardless of how they are being iterated. Should ``PyThreadState_GetDict()`` use the execution context? ------------------------------------------------------------- No. ``PyThreadState_GetDict`` is based on TLS, and changing its semantics will break backwards compatibility. PEP 521 ------- :pep:`521` proposes an alternative solution to the problem, which extends the context manager protocol with two new methods: ``__suspend__()`` and ``__resume__()``. Similarly, the asynchronous context manager protocol is also extended with ``__asuspend__()`` and ``__aresume__()``. This allows implementing context managers that manage non-local state, which behave correctly in generators and coroutines. For example, consider the following context manager, which uses execution state:: class Context: def __init__(self): self.var = contextvars.ContextVar('var') def __enter__(self): self.old_x = self.var.get() self.var.set('something') def __exit__(self, *err): self.var.set(self.old_x) An equivalent implementation with PEP 521:: local = threading.local() class Context: def __enter__(self): self.old_x = getattr(local, 'x', None) local.x = 'something' def __suspend__(self): local.x = self.old_x def __resume__(self): local.x = 'something' def __exit__(self, *err): local.x = self.old_x The downside of this approach is the addition of significant new complexity to the context manager protocol and the interpreter implementation. This approach is also likely to negatively impact the performance of generators and coroutines. Additionally, the solution in :pep:`521` is limited to context managers, and does not provide any mechanism to propagate state in asynchronous tasks and callbacks. Can Execution Context be implemented without modifying CPython? 
---------------------------------------------------------------

No.

It is true that the concept of "task-locals" can be implemented
for coroutines in libraries (see, for example, [29]_ and [30]_).
On the other hand, generators are managed by the Python interpreter
directly, and so their context must also be managed by the
interpreter.

Furthermore, execution context cannot be implemented in a third-party
module at all, otherwise the standard library, including ``decimal``,
would not be able to rely on it.


Should we update sys.displayhook and other APIs to use EC?
----------------------------------------------------------

APIs like redirecting stdout by overwriting ``sys.stdout``, or
specifying new exception display hooks by overwriting the
``sys.displayhook`` function affect the whole Python process
**by design**.  Their users assume that the effect of changing them
will be visible across OS threads.  Therefore we cannot just make
these APIs use the new Execution Context.

That said, we think it is possible to design new APIs that will be
context aware, but that is outside of the scope of this PEP.


Greenlets
---------

Greenlet is an alternative implementation of cooperative
scheduling for Python.  Although the greenlet package is not part of
CPython, popular frameworks like gevent rely on it, and it is
important that greenlet can be modified to support execution
contexts.

Conceptually, the behaviour of greenlets is very similar to that of
generators, which means that similar changes around greenlet entry
and exit can be done to add support for execution context.  This
PEP provides the necessary C APIs to do that.


Context manager as the interface for modifications
--------------------------------------------------

This PEP concentrates on the low-level mechanics and the minimal
API that enables fundamental operations with execution context.

For developer convenience, a high-level context manager interface
may be added to the ``contextvars`` module.  For example::

    with contextvars.set_var(var, 'foo'):
        # ...


Setting and restoring context variables
---------------------------------------

The ``ContextVar.delete()`` method removes the context variable from
the topmost logical context.

If the variable is not found in the topmost logical context, a
``LookupError`` is raised, similarly to ``del var`` raising
``NameError`` when ``var`` is not in scope.

This method is useful when there is a (rare) need to correctly
restore the state of a logical context, such as when a nested
generator wants to modify the logical context *temporarily*::

    var = contextvars.ContextVar('var')

    def gen():
        with some_var_context_manager('gen'):
            # EC = [{var: 'main'}, {var: 'gen'}]
            assert var.get() == 'gen'
            yield

        # EC = [{var: 'main modified'}, {}]
        assert var.get() == 'main modified'
        yield

    def main():
        var.set('main')
        g = gen()
        next(g)
        var.set('main modified')
        next(g)

The above example would work correctly only if there is a way to
delete ``var`` from the logical context in ``gen()``.  Setting it
to a "previous value" in ``__exit__()`` would mask changes made
in ``main()`` between the iterations.


Alternative Designs for ContextVar API
--------------------------------------

Logical Context with stacked values
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

By the design presented in this PEP, logical context is a simple
``LC({ContextVar: value, ...})`` mapping.  An alternative
representation is to store a stack of values for each context
variable: ``LC({ContextVar: [val1, val2, ...], ...})``.
The ``ContextVar`` methods would then be:

* ``get(*, default=None)`` -- traverses the stack
  of logical contexts, and returns the top value from the
  first non-empty logical context;

* ``push(val)`` -- pushes *val* onto the stack of values in the
  current logical context;

* ``pop()`` -- pops the top value from the stack of values in the
  current logical context.

Compared to the single-value design with the ``set()`` and
``delete()`` methods, the stack-based approach allows for a simpler
implementation of the set/restore pattern.  However, the mental
burden of this approach is considered to be higher, since there
would be *two* stacks to consider: a stack of LCs and a stack of
values in each LC.

(This idea was suggested by Nathaniel Smith.)


ContextVar "set/reset"
^^^^^^^^^^^^^^^^^^^^^^

Yet another approach is to return a special object from
``ContextVar.set()``, which would represent the modification of
the context variable in the current logical context::

    var = contextvars.ContextVar('var')

    def foo():
        mod = var.set('spam')

        # ... perform work

        mod.reset()  # Reset the value of var to the original value
                     # or remove it from the context.

The critical flaw in this approach is that it becomes possible to
pass context var "modification objects" into code running in a
different execution context, which leads to undefined side effects.


Backwards Compatibility
=======================

This proposal preserves 100% backwards compatibility.


Rejected Ideas
==============

Replication of threading.local() interface
------------------------------------------

Choosing the ``threading.local()``-like interface for context
variables was considered and rejected for the following reasons:

* A survey of the standard library and Django has shown that the
  vast majority of ``threading.local()`` uses involve a single
  attribute, which indicates that the namespace approach is not as
  helpful in the field.

* Using ``__getattr__()`` instead of ``.get()`` for value lookup
  does not provide any way to specify the depth of the lookup
  (i.e. search only the top logical context).

* Single-value ``ContextVar`` is easier to reason about in terms
  of visibility.  Suppose ``ContextVar()`` is a namespace, and
  consider the following::

      ns = contextvars.ContextVar('ns')

      def gen():
          ns.a = 2
          yield
          assert ns.b == 'bar'  # ??

      def main():
          ns.a = 1
          ns.b = 'foo'
          g = gen()
          next(g)
          # should not see the ns.a modification in gen()
          assert ns.a == 1
          # but should gen() see the ns.b modification made here?
          ns.b = 'bar'
          next(g)

  The above example demonstrates that reasoning about the visibility
  of different attributes of the same context var is not trivial.

* Single-value ``ContextVar`` allows straightforward implementation
  of the lookup cache;

* Single-value ``ContextVar`` interface allows the C-API to be
  simple and essentially the same as the Python API.

See also the mailing list discussion: [26]_, [27]_.


Coroutines not leaking context changes by default
-------------------------------------------------

In V4 (`Version History`_) of this PEP, coroutines were considered to
behave exactly like generators with respect to the execution context:
changes in awaited coroutines were not visible in the outer
coroutine.

This idea was rejected on the grounds that it breaks the semantic
similarity of the task and thread models, and, more specifically,
makes it impossible to reliably implement asynchronous context
managers that modify context vars, since ``__aenter__`` is a
coroutine.
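To make the latter point concrete, here is a sketch of an
asynchronous context manager that only works because ``await`` does
not isolate context changes; the ``Timeout`` class and the
``deadline`` variable are illustrative and not part of this
proposal::

    deadline = contextvars.ContextVar('deadline')

    class Timeout:

        def __init__(self, when):
            self.when = when

        async def __aenter__(self):
            # If coroutines had their own logical contexts, this
            # set() would be confined to __aenter__() and invisible
            # to the body of the `async with` block.
            deadline.set(self.when)

        async def __aexit__(self, *exc):
            deadline.delete()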
Appendix: HAMT Performance Analysis
===================================

.. figure:: pep-0550-hamt_vs_dict-v2.png
   :align: center
   :width: 100%

   Figure 1.  Benchmark code can be found here: [9]_.

The above chart demonstrates that:

* HAMT displays near O(1) performance for all benchmarked
  dictionary sizes.

* ``dict.copy()`` becomes very slow around 100 items.

.. figure:: pep-0550-lookup_hamt.png
   :align: center
   :width: 100%

   Figure 2.  Benchmark code can be found here: [10]_.

Figure 2 compares the lookup costs of ``dict`` versus a HAMT-based
immutable mapping.  HAMT lookup time is 30-40% slower than Python
dict lookups on average, which is a very good result, considering
that the latter is very well optimized.

There is research [8]_ showing that there are further possible
improvements to the performance of HAMT.

The reference implementation of HAMT for CPython can be found
here: [7]_.


Acknowledgments
===============

Thanks to Victor Petrovykh for countless discussions around the
topic and PEP proofreading and edits.

Thanks to Nathaniel Smith for proposing the ``ContextVar`` design
[17]_ [18]_, for pushing the PEP towards a more complete design, and
coming up with the idea of having a stack of contexts in the thread
state.

Thanks to Nick Coghlan for numerous suggestions and ideas on the
mailing list, and for coming up with a case that caused the complete
rewrite of the initial PEP version [19]_.


Version History
===============

1. Initial revision, posted on 11-Aug-2017 [20]_.

2. V2 posted on 15-Aug-2017 [21]_.

   The fundamental limitation that caused a complete redesign of the
   first version was that it was not possible to implement an
   iterator that would interact with the EC in the same way as
   generators (see [19]_.)

   Version 2 was a complete rewrite, introducing new terminology
   (Local Context, Execution Context, Context Item) and new APIs.

3. V3 posted on 18-Aug-2017 [22]_.  Updates:

   * Local Context was renamed to Logical Context.  The term "local"
     was ambiguous and conflicted with local name scopes.

   * Context Item was renamed to Context Key, see the thread with
     Nick Coghlan, Stefan Krah, and Yury Selivanov [23]_ for details.

   * Context Item get cache design was adjusted, per Nathaniel
     Smith's idea in [25]_.

   * Coroutines are created without a Logical Context; ceval loop
     no longer needs to special case the ``await`` expression
     (proposed by Nick Coghlan in [24]_.)

4. V4 posted on 25-Aug-2017 [31]_.

   * The specification section has been completely rewritten.

   * Coroutines now have their own Logical Context.  This means
     there is no difference between coroutines, generators, and
     asynchronous generators w.r.t. interaction with the Execution
     Context.

   * Context Key renamed to Context Var.

   * Removed the distinction between generators and coroutines with
     respect to logical context isolation.

5. V5 posted on 01-Sep-2017: the current version.

   * Coroutines have no logical context by default (a revert to the
     V3 semantics).  Read about the motivation in the
     `Coroutines not leaking context changes by default`_ section.
     The `High-Level Specification`_ section was also updated
     (specifically Generators and Coroutines subsections).

   * All APIs have been placed to the ``contextvars`` module, and
     the factory functions were changed to class constructors
     (``ContextVar``, ``ExecutionContext``, and ``LogicalContext``).
     Thanks to Nick for the idea [33]_.

   * ``ContextVar.lookup()`` got renamed back to ``ContextVar.get()``
     and gained the ``topmost`` and ``default`` keyword arguments.
     Added ``ContextVar.delete()``.  See Guido's comment in [32]_.

   * Fixed ``ContextVar.get()`` cache bug (thanks Nathaniel!).
* New `Rejected Ideas`_, `Should "yield from" leak context changes?`_, `Alternative Designs for ContextVar API`_, `Setting and restoring context variables`_, and `Context manager as the interface for modifications`_ sections. References ========== .. [1] https://blog.golang.org/context .. [2] https://msdn.microsoft.com/en-us/library/system.threading.executioncontext.aspx .. [3] https://github.com/numpy/numpy/issues/9444 .. [4] http://bugs.python.org/issue31179 .. [5] https://en.wikipedia.org/wiki/Hash_array_mapped_trie .. [6] http://blog.higher-order.net/2010/08/16/assoc-and-clojures-persistenthashmap-part-ii.html .. [7] https://github.com/1st1/cpython/tree/hamt .. [8] https://michael.steindorfer.name/publications/oopsla15.pdf .. [9] https://gist.github.com/1st1/9004813d5576c96529527d44c5457dcd .. [10] https://gist.github.com/1st1/dbe27f2e14c30cce6f0b5fddfc8c437e .. [11] https://github.com/1st1/cpython/tree/pep550 .. [12] https://www.python.org/dev/peps/pep-0492/#async-await .. [13] https://github.com/MagicStack/uvloop/blob/master/examples/bench/echoserver.py .. [14] https://github.com/MagicStack/pgbench .. [15] https://github.com/python/performance .. [16] https://gist.github.com/1st1/6b7a614643f91ead3edf37c4451a6b4c .. [17] https://mail.python.org/pipermail/python-ideas/2017-August/046752.html .. [18] https://mail.python.org/pipermail/python-ideas/2017-August/046772.html .. [19] https://mail.python.org/pipermail/python-ideas/2017-August/046775.html .. [20] https://github.com/python/peps/blob/e8a06c9a790f39451d9e99e203b13b3ad73a1d01/pep-0550.rst .. [21] https://github.com/python/peps/blob/e3aa3b2b4e4e9967d28a10827eed1e9e5960c175/pep-0550.rst .. [22] https://github.com/python/peps/blob/287ed87bb475a7da657f950b353c71c1248f67e7/pep-0550.rst .. [23] https://mail.python.org/pipermail/python-ideas/2017-August/046801.html .. [24] https://mail.python.org/pipermail/python-ideas/2017-August/046790.html .. [25] https://mail.python.org/pipermail/python-ideas/2017-August/046786.html .. [26] https://mail.python.org/pipermail/python-ideas/2017-August/046888.html .. [27] https://mail.python.org/pipermail/python-ideas/2017-August/046889.html .. [28] https://docs.python.org/3/library/decimal.html#decimal.Context.abs .. [29] https://curio.readthedocs.io/en/latest/reference.html#task-local-storage .. [30] https://docs.atlassian.com/aiolocals/latest/usage.html .. [31] https://github.com/python/peps/blob/1b8728ded7cde9df0f9a24268574907fafec6d5e/pep-0550.rst .. [32] https://mail.python.org/pipermail/python-dev/2017-August/149020.html .. [33] https://mail.python.org/pipermail/python-dev/2017-August/149043.html Copyright ========= This document has been placed in the public domain. From greg.ewing at canterbury.ac.nz Fri Sep 1 21:10:23 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 02 Sep 2017 13:10:23 +1200 Subject: [Python-Dev] Inplace operations for PyLong objects In-Reply-To: References: <20170901021259.0bb7d002@fsol> Message-ID: <59AA04FF.10907@canterbury.ac.nz> Chris Angelico wrote: > This particular example is safe, because the arguments get passed > individually - so 'args' has one reference, plus there's one more for > the actual function call However, that's also true when you use the += operator, so if the optimisation is to trigger at all in any useful case, the refcount threshold needs to be set higher than 1. Some experiments I did suggest that if you set it high enough for x += y to trigger it, then it will also be triggered in Joe's case. 
BTW, isn't there already a similar optimisation somewhere for concatenating strings? Does it still exist? How does it avoid this issue? -- Greg From jjevnik at quantopian.com Fri Sep 1 21:16:37 2017 From: jjevnik at quantopian.com (Joe Jevnik) Date: Fri, 1 Sep 2017 21:16:37 -0400 Subject: [Python-Dev] Inplace operations for PyLong objects In-Reply-To: <59AA04FF.10907@canterbury.ac.nz> References: <20170901021259.0bb7d002@fsol> <59AA04FF.10907@canterbury.ac.nz> Message-ID: The string concat optimization happens in the interpreter dispatch for INPLACE_ADD On Fri, Sep 1, 2017 at 9:10 PM, Greg Ewing wrote: > Chris Angelico wrote: > >> This particular example is safe, because the arguments get passed >> individually - so 'args' has one reference, plus there's one more for >> the actual function call >> > > However, that's also true when you use the += operator, > so if the optimisation is to trigger at all in any useful > case, the refcount threshold needs to be set higher than > 1. > > Some experiments I did suggest that if you set it high > enough for x += y to trigger it, then it will also be > triggered in Joe's case. > > BTW, isn't there already a similar optimisation somewhere > for concatenating strings? Does it still exist? How does > it avoid this issue? > > -- > Greg > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/joe% > 40quantopian.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Fri Sep 1 23:29:28 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 1 Sep 2017 21:29:28 -0600 Subject: [Python-Dev] PEP 550 V5 In-Reply-To: References: Message-ID: Nice working staying on top of this! Keeping up with discussion is arguably much harder than actually writing the PEP. :) I have some comments in-line below. -eric On Fri, Sep 1, 2017 at 5:02 PM, Yury Selivanov wrote: > [snip] > > Abstract > ======== > > [snip] > > Rationale > ========= > > [snip] > > Goals > ===== > I still think that the Abstract, Rationale, and Goals sections should be clear that a major component of this proposal is lookup via chained contexts. Without such clarity it may not be apparent that chained lookup is not strictly necessary to achieve the stated goals (i.e. an async-compatible TLS replacement). This matters because the chaining introduces extra non-trivial complexity. > The goal of this PEP is to provide a more reliable > ``threading.local()`` alternative, which: > > [snip] > > * implements TLS-like semantics for synchronous code, so that > users like ``decimal`` and ``numpy`` can switch to the new > mechanism with minimal risk of breaking backwards compatibility; > > [snip] > > Replication of threading.local() interface > ------------------------------------------ > > Choosing the ``threading.local()``-like interface for context > variables was considered and rejected for the following reasons: > > * A survery of the standard library and Django has shown that the > vast majority of ``threading.local()`` uses involve a single > attribute, which indicates that the namespace approach is not > as helpful in the field. > > * Using ``__getattr__()`` instead of ``.get()`` for value lookup > does not provide any way to specify the depth of the lookup > (i.e. search only the top logical context). 
> > * Single-value ``ContextVar`` is easier to reason about in terms > of visibility. Suppose ``ContextVar()`` is a namespace, > and the consider the following:: > > ns = contextvars.ContextVar('ns') > > def gen(): > ns.a = 2 > yield > assert ns.b == 'bar' # ?? > > def main(): > ns.a = 1 > ns.b = 'foo' > g = gen() > next(g) > # should not see the ns.a modification in gen() > assert ns.a == 1 > # but should gen() see the ns.b modification made here? > ns.b = 'bar' > yield > > The above example demonstrates that reasoning about the visibility > of different attributes of the same context var is not trivial. > > * Single-value ``ContextVar`` allows straightforward implementation > of the lookup cache; > > * Single-value ``ContextVar`` interface allows the C-API to be > simple and essentially the same as the Python API. > > See also the mailing list discussion: [26]_, [27]_. > > [snip] On the one hand the first three sections imply that the PEP is intended as a replacement for the current TLS mechanism; threading.local. On the other hand, the PEP (and related discussion) clearly says that the feature works differently than threading.local and hence is not a (drop-in) replacement. I'd prefer it to be a drop-in replacement but recognize I'm on the losing side of that argument. :P Regardless, having a consistent message in the PEP would help folks looking to switch over. Speaking of which, I have plans for the near-to-middle future that involve making use of the PEP 550 functionality in a way that is quite similar to decimal. However, it sounds like the implementation of such (namespace) contexts under PEP 550 is much more complex than it is with threading.local (where subclassing made it easy). It would be helpful to have some direction in the PEP on how to port to PEP 550 from threading.local. It would be even better if the PEP included the addition of a contextlib.Context or contextvars.Context class (or NamespaceContext or ContextNamespace or ...). :) However, I recognize that may be out of scope for this PEP. From guido at python.org Sat Sep 2 01:08:50 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 1 Sep 2017 22:08:50 -0700 Subject: [Python-Dev] PEP 550 V5 In-Reply-To: References: Message-ID: On Fri, Sep 1, 2017 at 8:29 PM, Eric Snow wrote: > Nice working staying on top of this! Keeping up with discussion is > arguably much harder than actually writing the PEP. :) I have some > comments in-line below. > > -eric > > > On Fri, Sep 1, 2017 at 5:02 PM, Yury Selivanov > wrote: > > [snip] > > > > Abstract > > ======== > > > > [snip] > > > > Rationale > > ========= > > > > [snip] > > > > Goals > > ===== > > > > I still think that the Abstract, Rationale, and Goals sections should > be clear that a major component of this proposal is lookup via chained > contexts. Without such clarity it may not be apparent that chained > lookup is not strictly necessary to achieve the stated goals (i.e. an > async-compatible TLS replacement). This matters because the chaining > introduces extra non-trivial complexity. > In defense of the PEP, the Goals section clearly states it aims to provide an alternative, not a plug-in API equivalent. There's also been some discussion (I hope it wasn't off-list) on how to implement threading.local on top of this proposal. That said, I agree it would be nice if the chained lookup was mentioned in the abstract too, because it's a pretty essential part of the proposal (it determines the semantics of ContextVar). 
(OTOH the HAMT implementation is less essential, there's another way to do the chained lookup.) > > [snip] > > On the one hand the first three sections imply that the PEP is > intended as a replacement for the current TLS mechanism; > threading.local. On the other hand, the PEP (and related discussion) > clearly says that the feature works differently than threading.local > and hence is not a (drop-in) replacement. I'd prefer it to be a > drop-in replacement but recognize I'm on the losing side of that > argument. :P Regardless, having a consistent message in the PEP would > help folks looking to switch over. > > Speaking of which, I have plans for the near-to-middle future that > involve making use of the PEP 550 functionality in a way that is quite > similar to decimal. However, it sounds like the implementation of > such (namespace) contexts under PEP 550 is much more complex than it > is with threading.local (where subclassing made it easy). It would be > helpful to have some direction in the PEP on how to port to PEP 550 > from threading.local. It would be even better if the PEP included the > addition of a contextlib.Context or contextvars.Context class (or > NamespaceContext or ContextNamespace or ...). :) However, I recognize > that may be out of scope for this PEP. > If you look at what decimal does, it only ever stores a single value in its threading.local instance -- __decimal_context__. (And part of the reason is that the C version has to work with a different API for TLS, which really does only store a single value per key.) -- --Guido van Rossum (python.org/~guido ) -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Sat Sep 2 13:38:05 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Sat, 2 Sep 2017 10:38:05 -0700 Subject: [Python-Dev] Inplace operations for PyLong objects In-Reply-To: References: <20170901021259.0bb7d002@fsol> <59AA04FF.10907@canterbury.ac.nz> Message-ID: On Fri, Sep 1, 2017 at 6:16 PM, Joe Jevnik via Python-Dev < python-dev at python.org> wrote: > The string concat optimization happens in the interpreter dispatch for > INPLACE_ADD > In that case, isn't there a new string being build, all in one line of python? that is, the string the optimization is being created at that point, so the interpreter can know there isn't anything referencing it anywhere else. -CHB > On Fri, Sep 1, 2017 at 9:10 PM, Greg Ewing > wrote: > >> Chris Angelico wrote: >> >>> This particular example is safe, because the arguments get passed >>> individually - so 'args' has one reference, plus there's one more for >>> the actual function call >>> >> >> However, that's also true when you use the += operator, >> so if the optimisation is to trigger at all in any useful >> case, the refcount threshold needs to be set higher than >> 1. >> >> Some experiments I did suggest that if you set it high >> enough for x += y to trigger it, then it will also be >> triggered in Joe's case. >> >> BTW, isn't there already a similar optimisation somewhere >> for concatenating strings? Does it still exist? How does >> it avoid this issue? 
>> >> -- >> Greg >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/joe%40qua >> ntopian.com >> > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > chris.barker%40noaa.gov > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sat Sep 2 13:50:39 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 2 Sep 2017 19:50:39 +0200 Subject: [Python-Dev] Inplace operations for PyLong objects References: <20170901021259.0bb7d002@fsol> <59AA04FF.10907@canterbury.ac.nz> Message-ID: <20170902195039.6f0a841e@fsol> On Sat, 2 Sep 2017 10:38:05 -0700 Chris Barker wrote: > On Fri, Sep 1, 2017 at 6:16 PM, Joe Jevnik via Python-Dev < > python-dev at python.org> wrote: > > > The string concat optimization happens in the interpreter dispatch for > > INPLACE_ADD > > > > In that case, isn't there a new string being build, all in one line of > python? that is, the string the optimization is being created at that > point, so the interpreter can know there isn't anything referencing it > anywhere else. You can go to the source and check yourself: https://github.com/python/cpython/blob/master/Python/ceval.c#L5226-L5277 Regards Antoine. From yselivanov.ml at gmail.com Sun Sep 3 07:09:19 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Sun, 3 Sep 2017 07:09:19 -0400 Subject: [Python-Dev] PEP 550 V5 In-Reply-To: References: Message-ID: Hi Eric, On Fri, Sep 1, 2017 at 11:29 PM, Eric Snow wrote: > Nice working staying on top of this! Keeping up with discussion is > arguably much harder than actually writing the PEP. :) I have some > comments in-line below. Thanks! > > -eric > > > On Fri, Sep 1, 2017 at 5:02 PM, Yury Selivanov wrote: >> [snip] >> >> Abstract >> ======== >> >> [snip] >> >> Rationale >> ========= >> >> [snip] >> >> Goals >> ===== >> > > I still think that the Abstract, Rationale, and Goals sections should > be clear that a major component of this proposal is lookup via chained > contexts. Without such clarity it may not be apparent that chained > lookup is not strictly necessary to achieve the stated goals (i.e. an > async-compatible TLS replacement). This matters because the chaining > introduces extra non-trivial complexity. Let me walk you through the history of PEP 550 to explain how we arrived to having "chained lookups". If we didn't have generators in Python, the first version of PEP 550 ([1]) would have been a perfect solution. First let's recap the how all versions of PEP 550 reason about generators: * generators can be paused/resumed at any time by code they cannot control; * thus any context changes in a generator should be isolated, and visible only to the generator and code it runs. PEP 550 v1 had excellent performance characteristics, a specification that was easy to fully understand, and a relatively straightforward implementation. It also had *no* chained lookups, as there was only one thread-specific collection of context values. 
PEP 550 states that v1 was rejected because of the following reason: "The fundamental limitation that caused a complete redesign of the first version was that it was not possible to implement an iterator that would interact with the EC in the same way as generators". I believe it's no longer true: I think that we could use the same trick [2] we use in the current version of PEP 550. But that doesn't really matter. The fundamental problem of PEP 550 v1 was that generators could not see EC updates *while being iterated*. Let's look at the following: def gen(): yield some_decimal_calculation() yield some_decimal_calculation() yield some_decimal_calculation() with decimal_context_0: g = gen() with decimal_context_1: next(g) with decimal_context_2: next(g) In the above example, with PEP 550 v1, generator 'g' would snapshot the execution context at the point where it was created. So all calls to "some_decimal_calculation()" would be made with "decimal_context_0". We could easily change the semantics to snapshot the EC at the point of the first iteration, but that would only make "some_decimal_calculation()" to be always called with "decimal_context_1". In this email [3], Nathaniel explained that there are use cases where seeing updates in the surrounding Execution Context matters in some situations. One case is his idea to implement special timeouts handling in his async framework Trio. Another one, is backwards compatibility: currently, if you use a "threading.local()", you should be able to see updates in it while the generator is being iterated. With PEP 550 v1, you couldn't. To fix this problem, in later versions of PEP 550, we transformed the EC into a stack of Logical Contexts. Every generator has its own LC, which contains changes to the EC local to that generator. Having a stack naturally made it a requirement to have "chained" lookups. That's how we fixed the above example! Therefore, while the current version of PEP 550 is more complex than v1, it has a *proper* support of generators. It provides strong guarantees w.r.t. context isolation, and at the same time maintains their ability to see the full outer context while being iterated. [..] > Speaking of which, I have plans for the near-to-middle future that > involve making use of the PEP 550 functionality in a way that is quite > similar to decimal. However, it sounds like the implementation of > such (namespace) contexts under PEP 550 is much more complex than it > is with threading.local (where subclassing made it easy). It would be > helpful to have some direction in the PEP on how to port to PEP 550 > from threading.local. It would be even better if the PEP included the > addition of a contextlib.Context or contextvars.Context class (or > NamespaceContext or ContextNamespace or ...). :) However, I recognize > that may be out of scope for this PEP. Currently we are focused on refining the fundamental low-level APIs and data structures, and the scope we cover in the PEP is already huge. Adding high-level APIs is much easier, so yeah, I'd discuss them if/after the PEP is accepted. Thanks! 
Yury [1] https://github.com/python/peps/blob/e8a06c9a790f39451d9e99e203b13b3ad73a1d01/pep-0550.rst [2] https://www.python.org/dev/peps/pep-0550/#generators-transformed-into-iterators [3] https://mail.python.org/pipermail/python-ideas/2017-August/046736.html From g.rodola at gmail.com Mon Sep 4 06:47:23 2017 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Mon, 4 Sep 2017 18:47:23 +0800 Subject: [Python-Dev] pythonhosted.org doc upload no longer works Message-ID: I know pythonhosted.org was deprecated long ago in favor of readthedocs but I kept postponing it and my doc for psutil is still hosted on https://pythonhosted.org/psutil/. http://pythonhosted.org/ web page still recommends this: <> I took a look at: https://pypi.python.org/pypi?:action=pkg_edit&name=psutil ...but the form to upload the doc in zip format is gone. The only available option is the "destroy documentation" button. I would like to at least upload a static HTML page redirecting to the new doc which I will soon move on readthedocs. Is that possible? Thanks in advance. -- Giampaolo - http://grodola.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From phd at phdru.name Mon Sep 4 07:10:42 2017 From: phd at phdru.name (Oleg Broytman) Date: Mon, 4 Sep 2017 13:10:42 +0200 Subject: [Python-Dev] pythonhosted.org doc upload no longer works In-Reply-To: References: Message-ID: <20170904111042.GA28995@phdru.name> Hi! On Mon, Sep 04, 2017 at 06:47:23PM +0800, Giampaolo Rodola' wrote: > I know pythonhosted.org was deprecated long ago in favor of readthedocs but > I kept postponing it and my doc for psutil is still hosted on > https://pythonhosted.org/psutil/. > > http://pythonhosted.org/ web page still recommends this: > > < http://pypi.python.org/pypi?%3Aaction=pkg_edit&name=yourpackage), and fill > out the form at the bottom of the page.>> > > I took a look at: > https://pypi.python.org/pypi?:action=pkg_edit&name=psutil > ...but the form to upload the doc in zip format is gone. > > The only available option is the "destroy documentation" button. What is worse, it doesn't work, at least for me. > I would like to at least upload a static HTML page redirecting to the new > doc which I will soon move on readthedocs. Is that possible? I'm also interested in redirecting or at least removing outdated docs. > Thanks in advance. > > -- > Giampaolo - http://grodola.blogspot.com Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From benjamin at python.org Mon Sep 4 16:48:53 2017 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 04 Sep 2017 13:48:53 -0700 Subject: [Python-Dev] removing IRIX support Message-ID: <1504558133.543459.1094999488.1CDC94D7@webmail.messagingengine.com> Since IRIX was EOLed in 2013, I propose support for it be removed in Python 3.7. I will add it to PEP 11. From guido at python.org Mon Sep 4 17:01:41 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 4 Sep 2017 14:01:41 -0700 Subject: [Python-Dev] removing IRIX support In-Reply-To: <1504558133.543459.1094999488.1CDC94D7@webmail.messagingengine.com> References: <1504558133.543459.1094999488.1CDC94D7@webmail.messagingengine.com> Message-ID: Wow. IRIX. It was probably Python's first UNIX. It does sound like it's time to stop supporting... +1 On Mon, Sep 4, 2017 at 1:48 PM, Benjamin Peterson wrote: > Since IRIX was EOLed in 2013, I propose support for it be removed in > Python 3.7. I will add it to PEP 11. 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From a.tber10 at gmail.com  Mon Sep  4 17:05:16 2017
From: a.tber10 at gmail.com (TBER Abdelmalek)
Date: Mon, 4 Sep 2017 22:05:16 +0100
Subject: [Python-Dev] Complex numbers
Message-ID: 

This module implements the cplx class (complex numbers) independently of the built-in class.
The main goal of this module is to propose some improvements to complex numbers in Python and to deal with them from a mathematical approach.
Also, the cplx class intentionally doesn't support the built-in class, as the idea was to give an alternative to it.
With the hope that I managed to succeed, here is the module:

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
# Module for complex numbers (doesn't support the built-in complex class)
# Originally contributed by TBER Abdelmalek

import re, math
from fractions import Fraction as Q

if __name__ == '__main__' :
    import os, Complex
    help(Complex)
    os.system("pause")

_supported = (int, float, Q, str, tuple, list)   # Supported types for instantiation and operations for cplx class

class cplx :
    """This class implements complex numbers in the form 'a+bi' with a : the real part, and b : the imaginary part.
    a and b are real numbers : [integers, floating numbers, or fractions (from the Fraction class of the fractions module, referred to here by Q)].
    Construction of complex numbers can be done in two major ways :
    - cplx([real part[, imaginary part]]) :
      Two real number arguments, both optional, so that cplx() = 0
      For fractions use the Fraction class : cplx(Q(n,d),Q(n',d'))
      for details about Fraction, see help(Fraction).
    - cplx('valid form of a complex number in a STRING'):
      The advantage of this way is the ease of fractions use with '/'.
      Examples :
      cplx('2i-1/4')
      cplx('6 - 1/5 * i')   # spaces don't bother at all
      cplx('7i')
    """

    global _supported

    def __init__(self, a=0, b=None):
        if isinstance(a, str) and b is None:   # construction from string
            if a == '' :
                self.re = 0
                self.im = 0
            elif cplx.verify(a):
                part_one = ''
                part_two = ''
                switch = False
                first = True
                for c in a :
                    if c in ('+', '-') and not first :
                        switch = True
                        part_two += c
                    elif not switch :
                        if c.isalnum() or c in ('+', '-', '/') :
                            part_one += c
                        elif c in (',', '.') :
                            part_one += '.'
                    else :
                        if c.isalnum() or c == '/' :
                            part_two += c
                        elif c in (',', '.') :
                            part_two += '.'
                    first = False
                if 'i' in part_two :
                    part_two = part_two[:len(part_two)-1]
                    if '.' in part_one :
                        self.re = float(part_one)
                    elif '/' in part_one :
                        self.re = Q(part_one)
                    else :
                        self.re = int(part_one)
                    if '.' in part_two :
                        self.im = float(part_two)
                    elif '/' in part_two :
                        self.im = Q(part_two)
                    elif part_two == '+' :
                        self.im = 1
                    elif part_two == '-' :
                        self.im = -1
                    else :
                        self.im = int(part_two)
                elif 'i' in part_one :
                    part_one = part_one[:len(part_one)-1]
                    if part_two == '' :
                        self.re = 0
                    elif '.' in part_two :
                        self.re = float(part_two)
                    elif '/' in part_two :
                        self.re = Q(part_two)
                    else :
                        self.re = int(part_two)
                    if '.' in part_one :
                        self.im = float(part_one)
                    elif '/' in part_one :
                        self.im = Q(part_one)
                    elif part_one == '' or part_one == '+' :
                        self.im = 1
                    elif part_one == '-' :
                        self.im = -1
                    else :
                        self.im = int(part_one)
                else :
                    if '.' in part_one :
                        self.re = float(part_one)
                    elif '/' in part_one :
                        self.re = Q(part_one)
                    else :
                        self.re = int(part_one)
                    self.im = 0
            else :
                raise ValueError("The form of complex numbers should be such as : 'a+bi' with a, b integers, floating numbers or fractions.")
        elif isinstance(a, (tuple,list)) and len(a) == 2 and b is None:   # construction from 2-tuples or list
            self.re = a[0]
            self.im = a[1]
        elif isinstance(a, cplx) and b is None :   # construction from cplx(complex)
            self.re = a.re
            self.im = a.im
        elif isinstance(a, (int, float, Q)) :   # construction from integers, fractions and floating numbers
            self.re = a
            if b is None :
                self.im = 0
            elif isinstance(b, (int, float, Q)) :
                self.im = b
            else :
                raise TypeError("Imaginary part should be an integer, floating number or fraction.")
        else :
            raise TypeError("Invalid arguments! For details see help(cplx).")

    def __setattr__(self, n, v):
        if n not in ('re', 'im') :
            raise AttributeError("Invalid attribute.")
        if isinstance(v, (int, float, Q)) :
            object.__setattr__(self, n, v)
        else :
            raise TypeError("Illegal assignment.")

    def __delattr__(self, n):
        raise AttributeError("cplx instances are characterized by 're' and 'im' attributes, deleting them is impossible.")

    def __repr__(self):
        """Returns repr(self) in an elegant way."""
        chain = ''
        if isinstance(self.re, Q) and self.re._numerator != 0 :
            if self.re._denominator != 1 :
                if self.re._numerator < 0 :
                    chain += '-'
                chain += '({}/{})'.format(abs(self.re._numerator), self.re._denominator)
            else :
                chain += '{}'.format(int(self.re))
        elif self.re != 0 :
            chain += '{}'.format(self.re)
        if self.re != 0 and self.im > 0 :
            chain += '+'
        elif self.im < 0 :
            chain += '-'
        if isinstance(self.im, Q) and self.im._numerator != 0 :
            if self.im._denominator != 1 :
                chain += '({}/{})'.format(abs(self.im._numerator), self.im._denominator)
            elif abs(self.im) != 1 :
                chain += '{}'.format(abs(int(self.im)))
        elif self.im != 0 and abs(self.im) != 1 :
            chain += '{}'.format(abs(self.im))
        if self.im != 0 :
            chain += 'i'
        if chain == '' :
            chain = '0'
        return chain

    def __str__(self):
        """Returns str(self)"""
        return repr(self)

    def __int__(self):
        """Returns int(real part)"""
        return int(self.re)

    def __float__(self):
        """Returns float(real part)"""
        return float(self.re)

    def __bool__(self):
        """Returns self != 0"""
        return self != 0

    def __pos__(self):
        """+self"""
        return self

    def __neg__(self):
        """-self"""
        return cplx(-self.re, -self.im)

    def __abs__(self):
        """Returns the absolute value if self is a real number.
        Returns its modulus if it is complex."""
        if self.im == 0 :
            return cplx(abs(self.re))
        else :
            return self.modul()

    # Comparison block
    def __eq__(self, value):
        """self == value"""
        if isinstance(value, (_supported, cplx)) :
            value = cplx(value)
            return self.re == value.re and self.im == value.im
        else :
            raise TypeError("This type : {} is not supported for this operation.".format(type(value)))

    def __ne__(self, value):
        """self != value"""
        return not self == value

    def __gt__(self, value):
        """self > value"""
        if isinstance(value, (_supported, cplx)) :
            value = cplx(value)
            if self.im == 0 and value.im == 0 :
                return self.re > value.re
            else :
                raise ValueError("Only real numbers are to be compared.")
        else :
            raise TypeError("This type : {} is not supported for this operation.".format(type(value)))

    def __ge__(self, value):
        """self >= value"""
        return self > value or self == value

    def __lt__(self, value):
        """self < value"""
        return not self >= value

    def __le__(self, value):
        """self <= value"""
        return not self > value

    # Operation block
    def __add__(self, value):
        """self + value"""
        if isinstance(value, (_supported, cplx)) :
            value = cplx(value)
            return cplx(self.re + value.re, self.im + value.im)
        else :
            raise TypeError("This type : {} is not supported for this operation.".format(type(value)))

    def __radd__(self, value):
        """value + self"""
        return self + value

    def __sub__(self, value):
        """self - value"""
        return self + (-value)

    def __rsub__(self, value):
        """value - self"""
        return -self + value

    def __mul__(self, value):
        """self * value"""
        if isinstance(value, (_supported, cplx)) :
            value = cplx(value)
            return cplx(self.re*value.re - self.im*value.im, self.re*value.im + self.im*value.re)
        else :
            raise TypeError("This type : {} is not supported for this operation.".format(type(value)))

    def __rmul__(self, value):
        """value * self"""
        return self * value

    def __pow__(self, value):
        """self**value"""
        if self.im == 0 :
            return cplx(self.re**value)
        elif isinstance(value, (_supported, cplx)) :
            value = cplx(value)
            if value == int(value):
                if value == 0 :
                    return 1
                z = self
                x = 1
                while x < abs(value) :
                    z *= self
                    x += 1
                if value > 0 :
                    return z
                else :
                    return 1/z
            else :
                return math.e**(value * (math.log(self.modul()) + self.arg()*cplx('i')))
        else :
            raise TypeError("This type : {} is not supported for this operation.".format(type(value)))

    def __rpow__(self, value):
        """value**self"""
        if self.im == 0 :
            return value**self.re
        elif value == 0 :
            return 1 if self == 0 else 0
        elif isinstance(value, (int, float, Q)) :
            return (value**self.re)*(math.cos(self.im*math.log(value)) + math.sin(self.im*math.log(value))*cplx('i'))
        else :
            raise TypeError("This type : {} is not supported for this operation.".format(type(value)))

    def __truediv__(self, value):
        """self/value : real and imaginary parts are left as fractions.
        Use the c_float converter method to have floating numbers instead of fractions."""
        if value == 0 :
            raise ZeroDivisionError
        elif isinstance(value, (_supported, cplx)) :
            value = cplx(value)
            return cplx(Q(self.re*value.re + self.im*value.im, value.re**2 + value.im**2), Q(self.im*value.re - self.re*value.im, value.re**2 + value.im**2))
        else :
            raise TypeError("This type : {} is not supported for this operation.".format(type(value)))

    def __rtruediv__(self, value):
        """value/self"""
        if isinstance(value, (_supported, cplx)) :
            return cplx(value)/self
        else :
            raise TypeError("This type : {} is not supported for this operation.".format(type(value)))

    def __floordiv__(self, value):
        """self//value"""
        return (self/value).c_int()

    def __rfloordiv__(self, value):
        """value//self"""
        return (value/self).c_int()

    def __mod__(self, value):
        """self % value"""
        return self - (self//value)*value

    def __rmod__(self, value):
        """value % self"""
        return value - (value//self)*self

    def __divmod__(self, value):
        """divmod(self, value)"""
        return (self//value, self % value)

    def __rdivmod__(self, value):
        """divmod(value, self)"""
        return (value//self, value % self)

    # Converting methods for complex
    def c_int(self):
        """Converts real and imaginary parts into integers (returns a new object)"""
        return cplx(int(self.re), int(self.im))

    def c_float(self):
        """Converts real and imaginary parts into floating numbers (returns a new object)"""
        return cplx(float(self.re), float(self.im))

    def c_fract(self):
        """Converts real and imaginary parts into fractions (returns a new object)"""
        return cplx(Q.from_float(float(self.re)), Q.from_float(float(self.im)))

    # Useful methods
    def coord(self):
        """Returns the coordinates of a complex number in a tuple. a+bi -> (a,b)"""
        return (self.re, self.im)

    def conjugate(self):
        """Returns the conjugate of a complex number. a+bi -> a-bi"""
        return cplx(self.re, -self.im)

    def switch(self):
        """Returns a complex number with real part and imaginary part switched. a+bi -> b+ai"""
        return cplx(self.im, self.re)

    def modul(self):
        """Returns the modulus of the complex number"""
        return math.sqrt(self.re**2 + self.im**2)

    def arg(self):
        """Returns the argument of the complex number in radians"""
        if self != 0 :
            teta = math.acos(abs(self.re)/abs(self))
            if self.re >= 0 and self.im > 0 :
                angle = teta
            elif self.re > 0 and self.im <= 0 :
                angle = -teta
            elif self.re < 0 and self.im >= 0 :
                angle = math.pi - teta
            elif self.re <= 0 and self.im < 0 :
                angle = -math.pi + teta
        else :
            raise ValueError("0 doesn't have an argument.")
        return angle

    def deg_arg(self):
        """Returns the argument of the complex number in degrees"""
        return math.degrees(self.arg())

    def polar(self):
        """Returns the trigonometric form of a complex number -> str"""
        return "{}(cos({}) + sin({})i)".format(self.modul(), self.arg(), self.arg())

    def factorise(self, value=1):
        """Returns a factorisation of a complex number by a value given (similar to divmod) -> str
        Example : (2i).factorise(i+1)
        >>> 2i = (i+1)(i+1)+0"""
        chain = "{} = ({})({})".format(self, value, self//value)
        if str(self%value)[0] == '-' :
            chain += "{}".format(self%value)
        else :
            chain += "+{}".format(self%value)
        return chain

    @classmethod
    def verify(cls, phrase) :
        """verify('a+bi') -> bool
        The argument must be provided in a string to verify whether it's a valid complex number or not.
        a and b can be integers, fractions(n/d), or floating numbers(with . or ,)"""
        if isinstance(phrase, str) :
            phrase = phrase.replace(" ", "")
            model_re = r"[+-]?[0-9]+([,./][0-9]+)?([+-]([0-9]+([,./][0-9]+)?[*]?)?i)?"
            # a real part and an optional imaginary part
            model_im = r"[+-]?([0-9]+([,./][0-9]+)?[*]?)?i([+-][0-9]+([,./][0-9]+)?)?"
            # an imaginary part and an optional real part
            return re.fullmatch(model_re, phrase) is not None or re.fullmatch(model_im, phrase) is not None
        else :
            raise TypeError("The complex number must be given in a string.")

#---------------------------------------#
i = cplx(0,1)   # Most basic complex number
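# Illustrative usage (a sketch added for clarity, not part of the original tests):
#   z = cplx('1/2+3i')       # fractions parse directly from the string form
#   z + i                    # -> (1/2)+4i
#   z.conjugate()            # -> (1/2)-3i
#   abs(cplx(3, 4))          # -> 5.0 (modulus of a non-real number)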
#---------------------------------------#

From mertz at gnosis.cx  Mon Sep  4 17:35:29 2017
From: mertz at gnosis.cx (David Mertz)
Date: Mon, 4 Sep 2017 14:35:29 -0700
Subject: [Python-Dev] Complex numbers
In-Reply-To: 
References: 
Message-ID: 

This sounds like something worthwhile to put on GitHub and PyPI. But it doesn't seem like it has much to do with developing CPython itself, the purpose of this list.

On Sep 4, 2017 2:09 PM, "TBER Abdelmalek" wrote:

> This module implements the cplx class (complex numbers) independently of the
> built-in class.
> The main goal of this module is to propose some improvements to complex
> numbers in Python and to deal with them from a mathematical approach.
> Also, the cplx class intentionally doesn't support the built-in class, as the
> idea was to give an alternative to it.
> With the hope that I managed to succeed, here is the module:
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/mertz%40gnosis.cx
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From a.tber10 at gmail.com  Mon Sep  4 18:30:20 2017
From: a.tber10 at gmail.com (TBER Abdelmalek)
Date: Mon, 4 Sep 2017 23:30:20 +0100
Subject: [Python-Dev] Complex numbers
In-Reply-To: 
References: 
Message-ID: 

Thanks for your feedback.

On 4 Sep 2017 at 22:35, "David Mertz" wrote:

> This sounds like something worthwhile to put on GitHub and PyPI. But it
> doesn't seem like it has much to do with developing CPython itself, the
> purpose of this list.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From songofacandy at gmail.com  Tue Sep  5 04:38:06 2017
From: songofacandy at gmail.com (INADA Naoki)
Date: Tue, 5 Sep 2017 17:38:06 +0900
Subject: [Python-Dev] [RFC] Removing pure Python implementation of OrderedDict
Message-ID: 

Hi, all.

Currently, deque and defaultdict have only a C implementation. Python implementations should provide _collections.deque and _collections.defaultdict.

Like that, how about removing the OrderedDict pure Python implementation from the stdlib and requiring it from implementations?

## Pros

### Thread safety

AFAIK, there is no thread safety guarantee in OrderedDict. I haven't looked carefully, but some methods seem thread unsafe.

I'm considering adding `OrderedDict.lru_get(key, default=None)`, which atomically looks up a key and moves it to the end if found. It can be used when functools.lru_cache can't be used. (see http://bugs.python.org/issue28193 ) And thread safety is very important for such a method.
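The intended semantics would be roughly this (a sketch only; `lru_get` is a proposed name, not an existing OrderedDict method, and the real version would do these two steps atomically):

    def lru_get(od, key, default=None):
        # look up the key and, if present, mark it as most recently used
        try:
            value = od[key]
        except KeyError:
            return default
        od.move_to_end(key)   # in this sketch, lookup and move are NOT atomic
        return value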
Anyway, thread safety will improve the usability of OrderedDict significantly, like defaultdict.

### Less maintenance cost for test_ordered_dict

I'm sending a pull request for removing the doubly linked list from the C OrderedDict, for faster creation and iteration, and to reduce memory usage by half. See https://bugs.python.org/issue31265

While implementing it, I noticed that the current test_ordered_dict has some tests about implementation details without the @cpython_only decorator. For example:

https://github.com/python/cpython/blob/fcd97d44382df520e39de477a5ec99dd89c3fda8/Lib/test/test_ordered_dict.py#L522-L530

While the current test expects KeyError, my pull request (and PyPy's OrderedDict) doesn't raise KeyError, because inconsistency between the doubly linked list and the dict never happens.

PyPy changed the test:
https://bitbucket.org/pypy/pypy/src/ac3e33369ba0aa14278a6a3e8f2add09590d358c/lib-python/3/test/test_ordered_dict.py?at=py3.6&fileviewer=file-view-default#test_ordered_dict.py-525:542

My pull request has the same change:
https://github.com/methane/cpython/pull/9/files#diff-23c28e7fa52967682669d9059dc977ed

Maintaining compatibility with odd behaviors of the pure Python implementation is not a constructive job, not only for us, but also for other Python implementations.

### `import collections` a bit faster

The pure Python implementation has 5 classes (OrderedDict, _Link, _OrderedDictKeysView, _OrderedDictItemsView, and _OrderedDictValuesView). Three of them inherit from ABCs. So it makes importing collections a bit slower.

## Cons

* All Python 3.7 implementations should provide _collections.OrderedDict. PyPy has it already, but I don't know about MicroPython.

Regards,

INADA Naoki

From storchaka at gmail.com  Tue Sep  5 07:48:32 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 5 Sep 2017 14:48:32 +0300
Subject: [Python-Dev] [RFC] Removing pure Python implementation of OrderedDict
In-Reply-To: 
References: 
Message-ID: 

05.09.17 11:38, INADA Naoki writes:
> ## Cons
>
> * All Python 3.7 implementations should provide _collections.OrderedDict.
>   PyPy has it already, but I don't know about MicroPython.

The current C implementation of OrderedDict is not safe regarding the use of mutating dict methods (or the dict C API) like dict.__setitem__ or PyDict_SetItem. Using them can cause hangs or segfaults. See issue24726 and issue25410. I hope your implementation will solve these issues, but there may be others. While the C implementation is not yet mature enough, we should allow users that encounter one of these issues to use the pure Python implementation, which is free from hangs and segfaults.

From hodgestar+pythondev at gmail.com  Tue Sep  5 08:58:16 2017
From: hodgestar+pythondev at gmail.com (Simon Cross)
Date: Tue, 5 Sep 2017 14:58:16 +0200
Subject: [Python-Dev] [RFC] Removing pure Python implementation of OrderedDict
In-Reply-To: 
References: 
Message-ID: 

I thought the decision a few years ago was that all modules that have a C library for performance reasons should also have a Python version? Did this decision change at some point? (just curious).
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From songofacandy at gmail.com  Tue Sep  5 09:02:19 2017
From: songofacandy at gmail.com (INADA Naoki)
Date: Tue, 5 Sep 2017 22:02:19 +0900
Subject: [Python-Dev] To reduce Python "application" startup time
Message-ID: 

Hi,

While I can't attend the sprint, I saw the etherpad and found that Neil Schemenauer and Eric Snow will work on startup time.

I want to share my current knowledge about startup time.

For bare (e.g. `python -c pass`) startup time, I'm waiting for the C implementation of ABC.

But application startup time is more important. And we can improve it by optimizing the importing of common stdlib modules.

The current `python -v` is not useful for optimizing imports. So I use this patch to profile import time.
https://gist.github.com/methane/e688bb31a23bcc437defcea4b815b1eb

With this profile, I tried to optimize `python -c 'import asyncio'`, logging and http.client.

https://gist.github.com/methane/1ab97181e74a33592314c7619bf34233#file-0-optimize-import-patch

With this small patch:

logging: 14.9ms -> 12.9ms
asyncio: 62.1ms -> 58.2ms
http.client: 43.8ms -> 36.1ms

I haven't created a pull request yet. (Can I create one without an issue, as a trivial patch?)

I'm very busy these days, maybe until December. But I hope this report helps people working on optimizing startup time.

Regards,

INADA Naoki

From songofacandy at gmail.com  Tue Sep  5 09:05:54 2017
From: songofacandy at gmail.com (INADA Naoki)
Date: Tue, 5 Sep 2017 22:05:54 +0900
Subject: [Python-Dev] [RFC] Removing pure Python implementation of OrderedDict
In-Reply-To: 
References: 
Message-ID: 

On Tue, Sep 5, 2017 at 9:58 PM, Simon Cross wrote:
> I thought the decision a few years ago was that all modules that have a C
> library for performance reasons should also have a Python version? Did this
> decision change at some point? (just curious).

But in this case, it's not only for optimization, but also for better behavior. The pure Python version is not thread safe, even for `od[k] = v`. It's very difficult to write a thread-safe complex container in pure Python.

Regards,

INADA Naoki

From songofacandy at gmail.com  Tue Sep  5 09:19:18 2017
From: songofacandy at gmail.com (INADA Naoki)
Date: Tue, 5 Sep 2017 22:19:18 +0900
Subject: [Python-Dev] [RFC] Removing pure Python implementation of OrderedDict
In-Reply-To: 
References: 
Message-ID: 

On Tue, Sep 5, 2017 at 8:48 PM, Serhiy Storchaka wrote:
> 05.09.17 11:38, INADA Naoki writes:
>>
>> ## Cons
>>
>> * All Python 3.7 implementations should provide _collections.OrderedDict.
>>   PyPy has it already, but I don't know about MicroPython.
>
> The current C implementation of OrderedDict is not safe regarding the use of mutating
> dict methods (or the dict C API) like dict.__setitem__ or PyDict_SetItem. Using
> them can cause hangs or segfaults. See issue24726 and issue25410. I hope
> your implementation will solve these issues, but there may be others.

Thanks for sharing these issues.

My pull request is still at an early stage and may have some bugs. But I think it will solve many of these issues, because most odd behaviors come from the difficulty of keeping the linked list and the dict consistent.

> While the C implementation is not yet mature enough, we should allow users that
> encounter one of these issues to use the pure Python implementation, which is
> free from hangs and segfaults.

I see. Maybe we can move the pure Python OrderedDict to a collections._odict module, and collections/__init__.py can be like this:

from _collections import OrderedDict
# Pure Python implementation can be used when the C implementation causes segfaults.
# from collections._odict import OrderedDict
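Or, as a sketch of the usual accelerator-module idiom (just an illustration, not a final proposal):

try:
    from _collections import OrderedDict
except ImportError:
    from collections._odict import OrderedDict   # pure Python fallback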
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com

From solipsis at pitrou.net  Tue Sep  5 09:22:48 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 5 Sep 2017 15:22:48 +0200
Subject: [Python-Dev] [RFC] Removing pure Python implementation of OrderedDict
References: 
Message-ID: <20170905152248.3b31e8bc@fsol>

On Tue, 5 Sep 2017 17:38:06 +0900
INADA Naoki wrote:
>
> Like that, how about removing the OrderedDict pure Python implementation
> from the stdlib and requiring it from implementations?

I don't like this. The C version of OrderedDict is probably very hard to read, while the Python version helps understand the semantics. Also, when devising a new API, it's much easier to first try it out on the Python version.

> I'm considering adding `OrderedDict.lru_get(key, default=None)`, which
> atomically looks up a key and moves it to the end if found.

The Python version needn't have the same atomicity requirements as the C version.

Regards

Antoine.

From guido at python.org  Tue Sep  5 10:40:14 2017
From: guido at python.org (Guido van Rossum)
Date: Tue, 5 Sep 2017 07:40:14 -0700
Subject: [Python-Dev] [RFC] Removing pure Python implementation of OrderedDict
In-Reply-To: 
References: 
Message-ID: 

On Tue, Sep 5, 2017 at 5:58 AM, Simon Cross wrote:

> I thought the decision a few years ago was that all modules that have a C
> library for performance reasons should also have a Python version? Did this
> decision change at some point? (just curious).

It was never meant as a hard rule. IIRC the actual rule is more that *if* you have both a C and a Python version they need to supply the same API, *and* the naming should be e.g. _pickle (C) and pickle (Python), *and* the Python version should automatically replace itself with the C version when the latter exists (e.g. by putting an import * at the end with a try/except around it).

There are tons of modules that only have a C version and there's no need to change this -- and I'm fine with occasionally a module removing the Python version when the C version is deemed sufficiently stable and compatible. (In some cases a move in the opposite direction may also be reasonable. No two cases are exactly alike.)

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From levkivskyi at gmail.com  Tue Sep  5 12:08:14 2017
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Tue, 5 Sep 2017 18:08:14 +0200
Subject: [Python-Dev] To reduce Python "application" startup time
In-Reply-To: 
References: 
Message-ID: 

On 5 September 2017 at 15:02, INADA Naoki wrote:
> Hi,
>
> [...]
>
> For bare (e.g. `python -c pass`) startup time, I'm waiting for the C
> implementation of ABC.

Hi,

I am not sure I will be able to finish it this week; also this depends on fixing interactions with ABC caches in ``typing`` first (as I mentioned on b.p.o., currently ``typing`` "aggressively" uses private ABC API).

--
Ivan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From solipsis at pitrou.net  Tue Sep  5 12:36:51 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 5 Sep 2017 18:36:51 +0200
Subject: [Python-Dev] Compiling without multithreading support -- still useful?
Message-ID: <20170905183651.6934a16d@fsol>

Hello,

It's 2017 and we are still allowing people to compile CPython without threads support. It adds some complication in several places (including delicate parts of our internal C code) without a clear benefit. Do people still need this?

Regards

Antoine.

From victor.stinner at gmail.com  Tue Sep  5 12:42:04 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 5 Sep 2017 18:42:04 +0200
Subject: [Python-Dev] Compiling without multithreading support -- still useful?
In-Reply-To: <20170905183651.6934a16d@fsol>
References: <20170905183651.6934a16d@fsol>
Message-ID: 

I proposed to drop the --without-threads option multiple times. I worked on tiny and cheap embedded devices and we used Python *with* threads for concurrency. Many Python features require threads, like asyncio and multiprocessing. Also subprocess.communicate() on Windows, no?

I'm strongly in favor of dropping this option from Python 3.7. It would remove a lot of code!

Victor

2017-09-05 18:36 GMT+02:00 Antoine Pitrou :
>
> Hello,
>
> It's 2017 and we are still allowing people to compile CPython without
> threads support. It adds some complication in several places
> (including delicate parts of our internal C code) without a clear
> benefit. Do people still need this?
>
> Regards
>
> Antoine.
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com

From christian at python.org  Tue Sep  5 13:08:21 2017
From: christian at python.org (Christian Heimes)
Date: Tue, 5 Sep 2017 10:08:21 -0700
Subject: [Python-Dev] Compiling without multithreading support -- still useful?
In-Reply-To: 
References: <20170905183651.6934a16d@fsol>
Message-ID: 

On 2017-09-05 09:42, Victor Stinner wrote:
> I proposed to drop the --without-threads option multiple times. I
> worked on tiny and cheap embedded devices and we used Python *with*
> threads for concurrency. Many Python features require threads, like
> asyncio and multiprocessing. Also subprocess.communicate() on Windows,
> no?
>
> I'm strongly in favor of dropping this option from Python 3.7. It
> would remove a lot of code!

+1

These days, tiny embedded devices can make use of MicroPython.

Christian
From ericsnowcurrently at gmail.com  Tue Sep  5 13:13:59 2017
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 5 Sep 2017 10:13:59 -0700
Subject: [Python-Dev] [RFC] Removing pure Python implementation of OrderedDict
In-Reply-To: 
References: 
Message-ID: 

On Tue, Sep 5, 2017 at 1:38 AM, INADA Naoki wrote:
> Like that, how about removing the OrderedDict pure Python implementation
> from the stdlib and requiring it from implementations?

-1

Like Antoine, I consider the pure Python implementation to be valuable. Furthermore, the pure Python implementation is the reference, so its behavior is idiomatic.

> ### Thread safety
>
> AFAIK, there is no thread safety guarantee in OrderedDict.
> I haven't looked carefully, but some methods seem thread unsafe.

What isn't thread-safe? I know that Raymond has a good understanding of this area. For instance, he was very clear about re-entrancy concerns when I was working on the C implementation. I recommend getting feedback from him on this. FWIW, I don't recall any bugs related to thread-safety in OrderedDict, even though it's been around a while.

> [snip]
>
> ### Less maintenance cost for test_ordered_dict
>
> [snip]

I don't find this to be a strong argument. If there are concerns with the reference behavior then those should be addressed rather than used to justify removing the implementation.

> ### `import collections` a bit faster
>
> [snip]

This is not a strong argument. The difference is not significant enough to warrant removing the reference implementation.

So, again, I'm against removing the pure Python implementation of OrderedDict. I highly recommend making sure you have Raymond's cooperation before making changes in the collections module.

-eric

From zachary.ware+pydev at gmail.com  Tue Sep  5 13:27:20 2017
From: zachary.ware+pydev at gmail.com (Zachary Ware)
Date: Tue, 5 Sep 2017 12:27:20 -0500
Subject: [Python-Dev] [RFC] Removing pure Python implementation of OrderedDict
In-Reply-To: 
References: 
Message-ID: 

On Tue, Sep 5, 2017 at 12:13 PM, Eric Snow wrote:
> On Tue, Sep 5, 2017 at 1:38 AM, INADA Naoki wrote:
>> Like that, how about removing the OrderedDict pure Python implementation
>> from the stdlib and requiring it from implementations?
>
> -1
>
> Like Antoine, I consider the pure Python implementation to be
> valuable. Furthermore, the pure Python implementation is the
> reference, so its behavior is idiomatic.

-1 as well, I'm in favor of keeping Python implementations of things implemented in C, no matter the order in which they are added.

>> ### Less maintenance cost for test_ordered_dict
>>
>> [snip]
>
> I don't find this to be a strong argument. If there are concerns with
> the reference behavior then those should be addressed rather than used
> to justify removing the implementation.

I agree here as well. We have plenty of things implemented in both Python and C, and have helpers in test.support to make it (relatively, at least) easy to test both. PEP 399 is also relevant here, I think.

--
Zach

From tjreedy at udel.edu  Tue Sep  5 13:52:28 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 5 Sep 2017 13:52:28 -0400
Subject: [Python-Dev] To reduce Python "application" startup time
In-Reply-To: 
References: 
Message-ID: 

On 9/5/2017 9:02 AM, INADA Naoki wrote:
> But application startup time is more important. And we can improve
> it by optimizing the importing of common stdlib modules.
>
> The current `python -v` is not useful for optimizing imports.
> So I use this patch to profile import time.
> https://gist.github.com/methane/e688bb31a23bcc437defcea4b815b1eb
>
> With this profile, I tried to optimize `python -c 'import asyncio'`, logging
> and http.client.
>
> https://gist.github.com/methane/1ab97181e74a33592314c7619bf34233#file-0-optimize-import-patch
>
> With this small patch:
>
> logging: 14.9ms -> 12.9ms
> asyncio: 62.1ms -> 58.2ms
> http.client: 43.8ms -> 36.1ms
>
> I haven't created a pull request yet.
> (Can I create one without an issue, as a trivial patch?)

Trivial, no-issue PRs are meant for things like typo fixes that need no discussion or record. Moving imports in violation of the PEP 8 rule, "Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants", is not trivial. Doing so voluntarily for speed, as opposed to doing so necessarily to avoid circular import errors, is controversial.

-- 
Terry Jan Reedy

From tjreedy at udel.edu  Tue Sep  5 14:00:55 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 5 Sep 2017 14:00:55 -0400
Subject: [Python-Dev] [RFC] Removing pure Python implementation of OrderedDict
In-Reply-To: 
References: 
Message-ID: 

On 9/5/2017 10:40 AM, Guido van Rossum wrote:
> On Tue, Sep 5, 2017 at 5:58 AM, Simon Cross wrote:
>
>     I thought the decision a few years ago was that all modules that
>     have a C library for performance reasons should also have a Python
>     version? Did this decision change at some point? (just curious).
>
> It was never meant as a hard rule. IIRC the actual rule is more that
> *if* you have both a C and a Python version they need to supply the same
> API, *and* the naming should be e.g. _pickle (C) and pickle (Python),
> *and* the Python version should automatically replace itself with the C
> version when the latter exists (e.g. by putting an import * at the end
> with a try/except around it).

*And* both versions should be tested in the test file ;-).

-- 
Terry Jan Reedy

From jelle.zijlstra at gmail.com  Tue Sep  5 14:13:58 2017
From: jelle.zijlstra at gmail.com (Jelle Zijlstra)
Date: Tue, 5 Sep 2017 11:13:58 -0700
Subject: [Python-Dev] To reduce Python "application" startup time
In-Reply-To: 
References: 
Message-ID: 

2017-09-05 6:02 GMT-07:00 INADA Naoki :

> Hi,
>
> While I can't attend the sprint, I saw the etherpad and found that
> Neil Schemenauer and Eric Snow will work on startup time.
>
> I want to share my current knowledge about startup time.
>
> For bare (e.g. `python -c pass`) startup time, I'm waiting for the C
> implementation of ABC.
>
> But application startup time is more important. And we can improve
> it by optimizing the importing of common stdlib modules.
>
> The current `python -v` is not useful for optimizing imports.
> So I use this patch to profile import time.
> https://gist.github.com/methane/e688bb31a23bcc437defcea4b815b1eb
>
> With this profile, I tried to optimize `python -c 'import asyncio'`, logging
> and http.client.
>
> https://gist.github.com/methane/1ab97181e74a33592314c7619bf34233#file-0-optimize-import-patch
>

This patch moves a few imports inside functions. I wonder whether that kind of change actually helps with real applications -- doesn't any real application end up importing the socket module anyway at some point?

> With this small patch:
>
> logging: 14.9ms -> 12.9ms
> asyncio: 62.1ms -> 58.2ms
> http.client: 43.8ms -> 36.1ms
>
> I haven't created a pull request yet.
> (Can I create one without an issue, as a trivial patch?)
>
> I'm very busy these days, maybe until December.
> But I hope this report helps people working on optimizing startup time.
>
> Regards,
>
> INADA Naoki
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/jelle.zijlstra%40gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From songofacandy at gmail.com  Tue Sep  5 14:20:43 2017
From: songofacandy at gmail.com (INADA Naoki)
Date: Wed, 6 Sep 2017 03:20:43 +0900
Subject: [Python-Dev] [RFC] Removing pure Python implementation of OrderedDict
In-Reply-To: 
References: 
Message-ID: 

First of all, I saw enough -1s, so I gave up on removing it. The following reply is just a technical topic.

On Wed, Sep 6, 2017 at 2:13 AM, Eric Snow wrote:
[snip]
>
> Like Antoine, I consider the pure Python implementation to be
> valuable. Furthermore, the pure Python implementation is the
> reference, so its behavior is idiomatic.

What is *idiomatic*? For example, in this test case:

https://github.com/python/cpython/blob/564a2c68add64ebf2e558a54f5697513b19293cb/Lib/test/test_ordered_dict.py#L575-L582

    def test_dict_delitem(self):
        OrderedDict = self.OrderedDict
        od = OrderedDict()
        od['spam'] = 1
        od['ham'] = 2
        dict.__delitem__(od, 'spam')
        with self.assertRaises(KeyError):
            repr(od)

Since dict.__delitem__ is the same as OrderedDict.__delitem__ in the PyPy implementation, repr(od) doesn't raise KeyError there:

>>>> import collections
>>>> od = collections.OrderedDict()
>>>> od['spam'] = 1
>>>> od['ham'] = 2
>>>> dict.__delitem__(od, 'spam')
>>>> repr(od)
"OrderedDict([('ham', 2)])"

Do you think this behavior is not idiomatic? I feel both are OK and there is no single idiomatic behavior. The tested behavior seems to be just an implementation detail. At least, PyPy modified this test as I wrote in a previous mail. It makes sense to me.

>> ### Thread safety
>>
>> AFAIK, there is no thread safety guarantee in OrderedDict.
>> I haven't looked carefully, but some methods seem thread unsafe.
>
> What isn't thread-safe? I know that Raymond has a good understanding
> of this area. For instance, he was very clear about re-entrancy
> concerns when I was working on the C implementation. I recommend
> getting feedback from him on this. FWIW, I don't recall any bugs
> related to thread-safety in OrderedDict, even though it's been around
> a while.

For example, when one thread does `od[k1] = v1` and another thread does `od[k2] = v2`, the result should be equivalent to one of `od[k1] = v1; od[k2] = v2` or `od[k2] = v2; od[k1] = v1`. And the internal data structure should stay consistent.

But see the current pure Python implementation:

    def __setitem__(self, key, value,
                    dict_setitem=dict.__setitem__, proxy=_proxy, Link=_Link):
        'od.__setitem__(i, y) <==> od[i]=y'
        # Setting a new item creates a new link at the end of the linked list,
        # and the inherited dictionary is updated with the new key/value pair.
        if key not in self:
            self.__map[key] = link = Link()
            root = self.__root
            last = root.prev
            link.prev, link.next, link.key = last, root, key
            last.next = link
            root.prev = proxy(link)
        dict_setitem(self, key, value)

If the thread is switched right after `root = self.__root`, either k1 or k2 will disappear from the linked list. Worse, if a thread switch happens right after `last = root.prev`, the linked list will be broken.

Fixing such issues is not an easy job. See http://bugs.python.org/issue14976 for example.

On the other hand, thanks to the GIL, the C implementation (not only mine, but also yours) doesn't have such issues.
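To illustrate, here is a hypothetical interleaving of two threads T1 and T2 inserting different keys (the thread names and timing are illustrative only):

    # T1: root = self.__root;  last = root.prev
    # T2: root = self.__root;  last = root.prev       # same 'last' as T1!
    # T1: last.next = link1;   root.prev = proxy(link1)
    # T2: last.next = link2;   root.prev = proxy(link2)
    # result: link1 is no longer reachable from the linked list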
Sometimes, writing a reentrant and thread-safe container in C is easier than in pure Python.

Regards,

P.S. Since it's over 3 a.m. in Japan, I'll read and reply to the other mails tomorrow. Sorry for the delay.

From rdmurray at bitdance.com  Tue Sep  5 14:46:50 2017
From: rdmurray at bitdance.com (R. David Murray)
Date: Tue, 05 Sep 2017 14:46:50 -0400
Subject: [Python-Dev] Version and Last-Modified headers are no longer required in PEPs.
Message-ID: <20170905184650.92F321B10079@webabinitio.net>

The Version and Last-Modified headers required by PEP1 used to be maintained by the version control system, but this is not true now that we've switched to git. We are therefore deprecating these headers and have removed them from PEP1. The PEP generation script now considers them to be optional.

--David

From solipsis at pitrou.net  Tue Sep  5 15:20:24 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 5 Sep 2017 21:20:24 +0200
Subject: [Python-Dev] [RFC] Removing pure Python implementation of OrderedDict
References: 
Message-ID: <20170905212024.7401434a@fsol>

On Wed, 6 Sep 2017 03:20:43 +0900
INADA Naoki wrote:
> What is *idiomatic*? For example, in this test case:
>
> https://github.com/python/cpython/blob/564a2c68add64ebf2e558a54f5697513b19293cb/Lib/test/test_ordered_dict.py#L575-L582
>
>     def test_dict_delitem(self):
>         OrderedDict = self.OrderedDict
>         od = OrderedDict()
>         od['spam'] = 1
>         od['ham'] = 2
>         dict.__delitem__(od, 'spam')
>         with self.assertRaises(KeyError):
>             repr(od)
>
> Since dict.__delitem__ is the same as OrderedDict.__delitem__ in the PyPy
> implementation, repr(od) doesn't raise KeyError there.

Since this test is testing an implementation detail, it can safely be moved to a test class specific to the pure Python implementation.

> For example, when one thread does `od[k1] = v1` and another thread does
> `od[k2] = v2`, the result should be equivalent to one of `od[k1] = v1;
> od[k2] = v2` or `od[k2] = v2; od[k1] = v1`. And the internal data structure
> should stay consistent.

I agree the pure Python OrderedDict is not thread-safe. But who said it should be? And, actually, are you sure the C implementation is? GIL switches can happen at many points. A simple Py_DECREF, or perhaps even a tuple allocation can trigger a GC run that may release the GIL at some point in a finalizer function.

> Sometimes, writing a reentrant and thread-safe container in C is easier than
> in pure Python.

The C and Python versions needn't make exactly the same guarantees.

Regards

Antoine.

From njs at pobox.com  Tue Sep  5 16:11:50 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 5 Sep 2017 13:11:50 -0700
Subject: [Python-Dev] To reduce Python "application" startup time
In-Reply-To: 
References: 
Message-ID: 

On Tue, Sep 5, 2017 at 11:13 AM, Jelle Zijlstra wrote:
>
> 2017-09-05 6:02 GMT-07:00 INADA Naoki :
>> With this profile, I tried to optimize `python -c 'import asyncio'`, logging
>> and http.client.
>>
>> https://gist.github.com/methane/1ab97181e74a33592314c7619bf34233#file-0-optimize-import-patch
>>
> This patch moves a few imports inside functions. I wonder whether that kind
> of change actually helps with real applications -- doesn't any real application
> end up importing the socket module anyway at some point?

I don't know if this particular change is worthwhile, but one place where startup slowness is particularly noticed is with commands like 'foo.py --help' or 'foo.py --generate-completions' (the latter called implicitly by hitting <TAB> in some shell), which typically do lots of imports that end up not being used.
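For example, the kind of change in INADA's patch looks roughly like this (a schematic sketch, not the actual diff):

    # before: module-level import, paid by every command, including 'foo.py --help'
    # import socket
    #
    # def create_connection(host, port):
    #     return socket.create_connection((host, port))

    def create_connection(host, port):
        # after: deferred until a connection is actually needed
        import socket
        return socket.create_connection((host, port))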
-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From tjreedy at udel.edu  Tue Sep  5 16:30:21 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 5 Sep 2017 16:30:21 -0400
Subject: [Python-Dev] [RFC] Removing pure Python implementation of OrderedDict
In-Reply-To: 
References: 
Message-ID: 

On 9/5/2017 2:20 PM, INADA Naoki wrote:
> On Wed, Sep 6, 2017 at 2:13 AM, Eric Snow wrote:
>> Like Antoine, I consider the pure Python implementation to be
>> valuable. Furthermore, the pure Python implementation is the
>> reference, so its behavior is idiomatic.

To this native English speaker, 'behavior is idiomatic' is not idiomatic ;-). The word 'idiomatic' refers to mode of expression, not mechanical behavior. Perhaps what Eric is saying is that the behavior of the reference implementation is by definition correct. Yet we have learned over the years to be more careful about what cpython behaviors constitute 'python' and which are incidental implementations. 'Improving the test suite' sometimes means segregating the two.

> What is *idiomatic*?

Expression that seems natural to a native speaker of a language, dialect, or subculture. The question you address is which behaviors of a Python OrderedDict are part of the definition and which are implementation-specific, and whether the tests need improvement.

-- 
Terry Jan Reedy

From larry at hastings.org  Tue Sep  5 17:39:36 2017
From: larry at hastings.org (Larry Hastings)
Date: Tue, 5 Sep 2017 14:39:36 -0700
Subject: [Python-Dev] Workflow change reminder: The Blurb Has Landed
Message-ID: <04b97978-cc81-ed50-e3a6-5f96c9ea54ea@hastings.org>

Yesterday I "blurbified" the 2.7, 3.6, and master branches in CPython. It's finally official: all* branches have now been "blurbified". This means that Misc/NEWS is no longer present in any of CPython's active branches. Instead, Misc/NEWS has been broken up into a zillion little files in the new Misc/NEWS.d directory tree. (Misc/NEWS can be reconstituted consistently from this directory tree.) This banishes forever the annoying problem of Misc/NEWS merge conflicts when you attempt to merge a pull request, when release managers create a release, etc.

We announced "blurb" a month or two back, and you should already be creating your news entries in Misc/NEWS.d. But just in case, here's a quick refresher.

The core-workflow team has created a tool called "blurb" that makes it easy to create a news entry for a commit. It's easy to install and use. You can install it with:

    python3 -m pip install blurb

It requires Python 3.5 or newer. (Sorry, it can't get backported to 3.4 without fixing a bug in "re"... and 3.4 is closed for bugfixes.)

To create a news entry, just run blurb from anywhere inside your CPython repo. On POSIX systems just make sure "blurb" is on your path, then simply run "blurb". On Windows the recommended method is "python3 -m blurb".

When you run blurb, it'll guide you through creating a news entry. It'll pop open an editor window on a template. Just modify the template as needed, write your news entry at the bottom, and save and exit. blurb will write out your news entry as a uniquely-named file in Misc/NEWS.d--and even stage it for you in "git"!

If for some reason you want to create news entries by hand, or if you just want more information, please see the Dev Guide:

    https://devguide.python.org/committing/#what-s-new-and-news-entries

and the documentation for blurb on PyPI:

    https://pypi.org/project/blurb/

The migration to Misc/NEWS.d and blurb has a few other implications:
1. Existing pending PRs or patches with Misc/NEWS entries will need to be updated to generate the entry with blurb. (Again, there really shouldn't /be/ any by this point.)

2. If you want to read a branch's NEWS entries in a convenient format, simply use "blurb merge" to produce a temporary NEWS file. (But don't check it in!) See the blurb documentation.

3. As part of the release process, release managers now use blurb to merge all the individual NEWS.d/next files into a single NEWS.d file for that version. They'll also build a traditional monolithic Misc/NEWS included in the release's source tarball (but this won't get checked in).

4. If you're building 3.x docs from a cpython git repo, you'll now need to have blurb installed. You can use the "make venv" recipe in Doc/Makefile to create a python3 venv with blurb and sphinx.

Happy blurbing!

/arry

* All except 3.3. Ned Deily has taken over as release manager of 3.3, and he thinks we shouldn't bother with blurbifying that branch--it'll reach end-of-life in mere weeks, and we're only going to make one more release of it, and it gets very little work done these days.

p.s. Thanks to Ned Deily who originally wrote this email, which I hacked up and sent.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From larry at hastings.org  Tue Sep  5 18:03:18 2017
From: larry at hastings.org (Larry Hastings)
Date: Tue, 5 Sep 2017 15:03:18 -0700
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
Message-ID: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>

I've written a PEP proposing a language change:

    https://www.python.org/dev/peps/pep-0549/

The TL;DR summary: add support for property objects to modules. I've already posted a prototype.

How's that sound?

//arry/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From njs at pobox.com  Tue Sep  5 18:52:31 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 5 Sep 2017 15:52:31 -0700
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>
Message-ID: 

On Tue, Sep 5, 2017 at 3:03 PM, Larry Hastings wrote:
>
> I've written a PEP proposing a language change:
>
>     https://www.python.org/dev/peps/pep-0549/
>
> The TL;DR summary: add support for property objects to modules. I've
> already posted a prototype.

Interesting idea! It's definitely less arcane than the __class__ assignment support that was added in 3.5. I guess the question is whether to add another language feature here, or to provide better documentation/helpers for the existing feature.

If anyone's curious what the __class__ trick looks like in practice, here's some simple deprecation machinery:
https://github.com/njsmith/trio/blob/ee8d909e34a2b28d55b5c6137707e8861eee3234/trio/_deprecate.py#L102-L138

And here's what it looks like in use:
https://github.com/njsmith/trio/blob/ee8d909e34a2b28d55b5c6137707e8861eee3234/trio/__init__.py#L91-L115

Advantages of PEP 549:
- easier to explain and use

Advantages of the __class__ trick:
- faster (no need to add an extra step to the normal attribute lookup fast path); only those who need the feature pay for it

- exposes the full power of Python's class model. Notice that the above code overrides __getattr__ but not __dir__, so the attributes are accessible via direct lookup but not listed in dir(mod).
This is on purpose, for two reasons: (a) tab completion shouldn't be suggesting deprecated attributes, (b) when I did expose them in __dir__, I had trouble with test runners that iterated through dir(mod) looking for tests, and ended up spewing tons of spurious deprecation warnings. (This is especially bad when you have a policy of running your tests with DeprecationWarnings converted to errors.) I don't think there's any way to do this with PEP 549.

- already supported in CPython 3.5+ and PyPy3, and with a bit of care can be faked on all older CPython releases (including, crucially, CPython 2). PEP 549 OTOH AFAICT will only be usable for packages that have 3.7 as their minimum supported version.

I don't imagine that I would actually use PEP 549 any time in the foreseeable future, due to the inability to override __dir__ and the minimum version requirement.

If you only need to support CPython 3.5+ and PyPy3 5.9+, then you can effectively get PEP 549's functionality at the cost of 3 lines of code and a block indent:

    import sys, types

    class _MyModuleType(types.ModuleType):
        @property
        def ...

        @property
        def ...

    sys.modules[__name__].__class__ = _MyModuleType

It's definitely true though that they're not the most obvious lines of code :-)
Suppose we have these primitives: push_local_context() pop_local_context() Now introducing a temporary decimal context looks like: push_local_context() decimal.localcontextvar.new(decimal.getcontext().copy()) decimal.localcontextvar.prec = 5 do_some_calculations() pop_local_context() Since calls (either normal or generator) no longer automatically result in a new local context, we can easily factor this out into a context manager: class LocalDecimalContext(): def __enter__(self): push_local_context() ctx = decimal.getcontext().copy() decimal.localcontextvar.new(ctx) return ctx def __exit__(self): pop_local_context() Usage: with LocalDecimalContext() as ctx: ctx.prec = 5 do_some_calculations() -- Greg > > But if I use "set()" in "CM.__enter__", presumably, it will traverse > the stack of LCs to the very bottom and set "var=42" in in it. Right? > > If so, how can fix the example in PEP 550 Rationale: > https://www.python.org/dev/peps/pep-0550/#rationale where we zip() the > "fractions()" generator? > > With current PEP 550 semantics that's trivial: > https://www.python.org/dev/peps/pep-0550/#generators > > Yury From rdmurray at bitdance.com Tue Sep 5 20:11:00 2017 From: rdmurray at bitdance.com (R. David Murray) Date: Tue, 05 Sep 2017 20:11:00 -0400 Subject: [Python-Dev] PEP 548: More Flexible Loop Control Message-ID: <20170906001100.5F9DB1B10001@webabinitio.net> I've written a PEP proposing a small enhancement to the Python loop control statements. Short version: here's what feels to me like a Pythonic way to spell "repeat until": while: break if The PEP goes into some detail on why this feels like a readability improvement in the more general case, with examples taken from the standard library: https://www.python.org/dev/peps/pep-0548/ Unlike Larry, I don't have a prototype, and in fact if this idea meets with approval I'll be looking for a volunteer to do the actual implementation. --David PS: this came to me in a dream on Sunday night, and the more I explored the idea the better I liked it. I have no idea what I was dreaming about that resulted in this being the thing left in my mind when I woke up :) From yselivanov.ml at gmail.com Tue Sep 5 20:31:13 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Tue, 5 Sep 2017 17:31:13 -0700 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: <59AF3A75.2020406@canterbury.ac.nz> References: <20170828111903.GA3032@bytereef.org> <20170828155215.GA3763@bytereef.org> <20170828173327.GA2671@bytereef.org> <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz> <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz> <59AF3A75.2020406@canterbury.ac.nz> Message-ID: On Tue, Sep 5, 2017 at 4:59 PM, Greg Ewing wrote: > Yury Selivanov wrote: >> >> Question: how to write a context manager with contextvar.new? >> >> var = new_context_var() >> >> class CM: >> >> def __enter__(self): >> var.new(42) >> >> with CM(): >> print(var.get() or 'None') >> >> My understanding that the above code will print "None", because >> "var.new()" makes 42 visible only to callees of __enter__. > > > If you tie the introduction of a new scope for context vars > to generators, as PEP 550 currently does, then this isn't a > problem. > > But I'm trying to avoid doing that. The basic issue is that, > ever since yield-from, "generator" and "task" are not > synonymous. > > When you use a generator to implement an iterator, you > probably want it to behave as a distinct task with its > own local context. 
> But a generator used with yield-from
> isn't a task of its own, it's just part of another task,
> and there is nothing built into Python that lets you
> tell the difference automatically.

Greg, have you seen this new section:
https://www.python.org/dev/peps/pep-0550/#should-yield-from-leak-context-changes ?

It has a couple of examples that illustrate some issues with the "But a generator used with yield-from isn't a task of its own, it's just part of another task" reasoning.

In principle, we can modify PEP 550 to make 'yield from' transparent to context changes. The interpreter can just reset g.__logical_context__ to None whenever 'g' is being 'yield-frommed'. The key issue is that there are a couple of edge cases when having this semantics is problematic. The bottom line is that it's easier to reason about context when it's guaranteed that context changes are always isolated in generators no matter what. I think this semantics actually makes refactoring easier.

Please take a look at the linked section.

> So I'm now thinking that the introduction of a new local
> context should also be explicit.
>
> Suppose we have these primitives:
>
>     push_local_context()
>     pop_local_context()
>
> Now introducing a temporary decimal context looks like:
>
>     push_local_context()
>     decimal.localcontextvar.new(decimal.getcontext().copy())
>     decimal.localcontextvar.prec = 5
>     do_some_calculations()
>     pop_local_context()
>
> Since calls (either normal or generator) no longer automatically
> result in a new local context, we can easily factor this out into
> a context manager:
>
>     class LocalDecimalContext():
>
>         def __enter__(self):
>             push_local_context()
>             ctx = decimal.getcontext().copy()
>             decimal.localcontextvar.new(ctx)
>             return ctx
>
>         def __exit__(self, *exc):
>             pop_local_context()
>
> Usage:
>
>     with LocalDecimalContext() as ctx:
>         ctx.prec = 5
>         do_some_calculations()

This will have some performance implications and make the API way more complex. But I'm not convinced yet that real-life code needs the semantics you want.

This will work with the current PEP 550 design:

    def g():
        with DecimalContext() as ctx:
            ctx.prec = 5
            yield from do_some_calculations()  # will run with the correct ctx

The only thing that won't work is this:

    def do_some_calculations():
        ctx = DecimalContext()
        ctx.prec = 10
        decimal.setcontext(ctx)
        yield

    def g():
        yield from do_some_calculations()
        # Context changes in do_some_calculations() will not leak to g()

In the above example, do_some_calculations() deliberately tries to leak context changes (by not using a context manager). And I consider it a feature that PEP 550 does not allow generators to leak state. If you write code that uses 'with' statements consistently, you will never even know that context changes are isolated in generators.

Yury

From mariatta.wijaya at gmail.com  Tue Sep  5 21:10:48 2017
From: mariatta.wijaya at gmail.com (Mariatta Wijaya)
Date: Tue, 5 Sep 2017 18:10:48 -0700
Subject: [Python-Dev] Cherry picker bot deployed in CPython repo
Message-ID: 

Hi,

The cherry picker bot has just been deployed to the CPython repo, codenamed miss-islington.

miss-islington made the very first backport PR for CPython and became a first-time GitHub contributor: https://github.com/python/cpython/pull/3369

GitHub repo: https://github.com/python/miss-islington

What is this?
==========

As part of our workflow, quite often changes made on the master branch need to be backported to the earlier versions.
(for example: from master to 3.6 and 2.7) Previously the backport has to be done manually by either a core developer or the original PR author. With the bot, the backport PR is created automatically after the PR has been merged. A core developer will need to review the backport PR. The issue was tracked in https://github.com/python/core-workflow/issues/8 How it works ========== 1. If a PR needs to be backported to one of the maintenance branches, a core developer should apply the "needs backport to X.Y" label. Do this **before** you merge the PR. 2. Merge the PR 3. miss-islington will leave a comment on the PR, saying it is working on backporting the PR. 4. If there's no merge conflict, the PR should be created momentarily. 5. Review the backport PR created by miss-islington and merge it when you're ready. Merge Conflicts / Problems? ====================== In case of merge conflicts, or if a backport PR was not created within 2 minutes, it likely failed and you should do the backport manually. Manual backport can be done using cherry_picker: https://pypi.org/project/cherry-picker/ Older merged PRs not yet backported? ============================== At the moment, those need to be backported manually. Don't want PR to be backported by a bot? ================================ My recommendation is to apply the "needs backport to X.Y" **after** the PR has been merged. The label is still useful to remind ourselves that this PR still needs backporting. Who is Miss Islington? ================= I found out from Wikipedia that Miss Islington is the name of the witch in Monty Python and The Holy Grail. miss-islington has not signed the CLA! ============================= A core dev can ignore the warning and merge the PR anyway. Thanks! Mariatta Wijaya -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Tue Sep 5 21:14:12 2017 From: barry at python.org (Barry Warsaw) Date: Tue, 5 Sep 2017 18:14:12 -0700 Subject: [Python-Dev] PEP 553: Built-in debug() Message-ID: I?ve written a PEP proposing the addition of a new built-in function called debug(). Adding this to your code would invoke a debugger through the hook function sys.debughook(). Like the existing sys.displayhook() and sys.excepthook(), you can change sys.debughook() to point to the debugger of your choice. By default it invokes pdb.set_trace(). With this PEP instead of: foo() import pdb; pdb.set_trace() bar() you can write: foo() debug() bar() and you would drop into the debugger after foo() but before bar(). More rationale and details are provided in the PEP: https://www.python.org/dev/peps/pep-0553/ Unlike David, but like Larry, I have a prototype implementation: https://github.com/python/cpython/pull/3355 Cheers, -Barry P.S. This came to me in a nightmare on Sunday night, and the more I explored the idea the more it frightened me. I know exactly what I was dreaming about and the only way to make it all go away was to write this thing up. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 273 bytes Desc: Message signed with OpenPGP URL: From alex.gaynor at gmail.com Tue Sep 5 21:15:06 2017 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Tue, 5 Sep 2017 21:15:06 -0400 Subject: [Python-Dev] [python-committers] Cherry picker bot deployed in CPython repo In-Reply-To: References: Message-ID: This is a great UX win for our development process. Thanks for making this happen! 
Alex On Tue, Sep 5, 2017 at 9:10 PM, Mariatta Wijaya wrote: > Hi, > > The cherry picker bot has just been deployed to CPython repo, codenamed > miss-islington. > > miss-islington made the very first backport PR for CPython and became a > first time GitHub contributor: https://github.com/python/cpython/pull/3369 > > > GitHub repo: https://github.com/python/miss-islington > > What is this? > ========== > > As part of our workflow, quite often changes made on the master branch > need to be backported to the earlier versions. (for example: from master to > 3.6 and 2.7) > > Previously the backport has to be done manually by either a core developer > or the original PR author. > > With the bot, the backport PR is created automatically after the PR has > been merged. A core developer will need to review the backport PR. > > The issue was tracked in https://github.com/python/core-workflow/issues/8 > > How it works > ========== > > 1. If a PR needs to be backported to one of the maintenance branches, a > core developer should apply the "needs backport to X.Y" label. Do this > **before** you merge the PR. > > 2. Merge the PR > > 3. miss-islington will leave a comment on the PR, saying it is working on > backporting the PR. > > 4. If there's no merge conflict, the PR should be created momentarily. > > 5. Review the backport PR created by miss-islington and merge it when > you're ready. > > Merge Conflicts / Problems? > ====================== > > In case of merge conflicts, or if a backport PR was not created within 2 > minutes, it likely failed and you should do the backport manually. > > Manual backport can be done using cherry_picker: https://pypi. > org/project/cherry-picker/ > > Older merged PRs not yet backported? > ============================== > > At the moment, those need to be backported manually. > > Don't want PR to be backported by a bot? > ================================ > > My recommendation is to apply the "needs backport to X.Y" **after** the PR > has been merged. The label is still useful to remind ourselves that this PR > still needs backporting. > > Who is Miss Islington? > ================= > > I found out from Wikipedia that Miss Islington is the name of the witch in > Monty Python and The Holy Grail. > > miss-islington has not signed the CLA! > ============================= > > A core dev can ignore the warning and merge the PR anyway. > > Thanks! > > > Mariatta Wijaya > > _______________________________________________ > python-committers mailing list > python-committers at python.org > https://mail.python.org/mailman/listinfo/python-committers > Code of Conduct: https://www.python.org/psf/codeofconduct/ > > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero GPG Key fingerprint: D1B3 ADC0 E023 8CA6 -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Tue Sep 5 21:25:40 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Wed, 6 Sep 2017 10:25:40 +0900 Subject: [Python-Dev] [python-committers] Cherry picker bot deployed in CPython repo In-Reply-To: References: Message-ID: Congrats! INADA Naoki On Wed, Sep 6, 2017 at 10:10 AM, Mariatta Wijaya wrote: > Hi, > > The cherry picker bot has just been deployed to CPython repo, codenamed > miss-islington. 
> > miss-islington made the very first backport PR for CPython and became a > first time GitHub contributor: https://github.com/python/cpython/pull/3369 > > GitHub repo: https://github.com/python/miss-islington > > What is this? > ========== > > As part of our workflow, quite often changes made on the master branch need > to be backported to the earlier versions. (for example: from master to 3.6 > and 2.7) > > Previously the backport has to be done manually by either a core developer > or the original PR author. > > With the bot, the backport PR is created automatically after the PR has been > merged. A core developer will need to review the backport PR. > > The issue was tracked in https://github.com/python/core-workflow/issues/8 > > How it works > ========== > > 1. If a PR needs to be backported to one of the maintenance branches, a core > developer should apply the "needs backport to X.Y" label. Do this **before** > you merge the PR. > > 2. Merge the PR > > 3. miss-islington will leave a comment on the PR, saying it is working on > backporting the PR. > > 4. If there's no merge conflict, the PR should be created momentarily. > > 5. Review the backport PR created by miss-islington and merge it when you're > ready. > > Merge Conflicts / Problems? > ====================== > > In case of merge conflicts, or if a backport PR was not created within 2 > minutes, it likely failed and you should do the backport manually. > > Manual backport can be done using cherry_picker: > https://pypi.org/project/cherry-picker/ > > Older merged PRs not yet backported? > ============================== > > At the moment, those need to be backported manually. > > Don't want PR to be backported by a bot? > ================================ > > My recommendation is to apply the "needs backport to X.Y" **after** the PR > has been merged. The label is still useful to remind ourselves that this PR > still needs backporting. > > Who is Miss Islington? > ================= > > I found out from Wikipedia that Miss Islington is the name of the witch in > Monty Python and The Holy Grail. > > miss-islington has not signed the CLA! > ============================= > > A core dev can ignore the warning and merge the PR anyway. > > Thanks! > > > Mariatta Wijaya > > _______________________________________________ > python-committers mailing list > python-committers at python.org > https://mail.python.org/mailman/listinfo/python-committers > Code of Conduct: https://www.python.org/psf/codeofconduct/ > From songofacandy at gmail.com Tue Sep 5 22:26:52 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Wed, 6 Sep 2017 11:26:52 +0900 Subject: [Python-Dev] [RFC] Removing pure Python implementation of OrderedDict In-Reply-To: <20170905212024.7401434a@fsol> References: <20170905212024.7401434a@fsol> Message-ID: > >> For example, when one thread do `od[k1] = v1` and another thread do >> `od[k2] = v2`, >> result should be equivalent to one of `od[k1] = v1; od[k2] = v2;` or >> `od[k2] = v2; od[k1] = v1`. And internal data structure should be consistent. > > I agree the pure Python OrderedDict is not thread-safe. But who said > it should? No one. I meant just let's say "it should" from Python 3.7. > And, actually, are you sure the C implementation is? GIL switches can > happen at many points. A simple Py_DECREF, or perhaps even a tuple > allocation can trigger a GC run that may release the GIL at some point > in a finalizer function. > I know such difficulity already. 
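To make the guarantee I have in mind concrete, here is a rough sketch
(an illustration I just made up, not code from CPython): two threads
doing one plain __setitem__ each, with str keys, must never corrupt the
dict, even though the resulting order may be either one:

    import threading
    from collections import OrderedDict

    od = OrderedDict()

    def setter(key, value):
        od[key] = value  # a single plain __setitem__ per thread

    t1 = threading.Thread(target=setter, args=('k1', 1))
    t2 = threading.Thread(target=setter, args=('k2', 2))
    t1.start(); t2.start()
    t1.join(); t2.join()

    # Either insertion order is acceptable; a broken internal
    # linked list is not.
    assert sorted(od) == ['k1', 'k2']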
I thought simple (when key is str or int) __setitem__ doesn't break internal state of current C implementation of OrderedDict. And I'll carefully review my code too. >> Sometime, writing reentrant and threadsafe container in C is easier than >> in Pure Python. > > The C and Python versions needn't make exactly the same guarantees. > Yes, of course. But what we can recommend for library developers (including stdlib)? Currently, dict ordering is implementation detail of CPython and PyPy. We don't recommend to rely on the behavior. Like that, should we say "atomic & threadsafe __setitem__ for simple key is implementation detail of CPython and PyPy. We recommend using mutex when using OrderedDict from multiple thread."? That was my point about thread safety. Regards, INADA Naoki From g.rodola at gmail.com Tue Sep 5 22:31:59 2017 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Wed, 6 Sep 2017 10:31:59 +0800 Subject: [Python-Dev] PEP 553: Built-in debug() In-Reply-To: References: Message-ID: > It's a lot to type (27 characters). True. Personally I have a shortcut in my IDE (Sublime) so I when I type "pdb" -> TAB it auto completes it. > Python linters (e.g. flake8 [1]) complain about this line because it contains two statements. Breaking the idiom up into two lines further complicates the use of the debugger, I see this more as an advantage. After all a pdb line is "temporary" and you never want to commit it. When you do it is by accident so you want it to be more noticeable. I can configure my IDE to use flake8 and highlight the pdb line for me so that I can immediately see it because it's not PEP-8 compliant. I can have a GIT commit hook which checks there's no "pdb.set_trace()" in the files I'm committing: https://github.com/giampaolo/psutil/blob/master/.git-pre-commit#L93-L96 Somehow I think debug() would make this a bit harder as it's more likely a "debug()" line will pass unnoticed. For this reason I would give a -1 to this proposal. > It ties debugging directly to the choice of pdb. There might be other debugging options, say if you're using an IDE or some other development environment. Personally I would find it helpful if there was a hook to choose the default debugger to use on "pdb.set_trace()" via .pdbrc or PYTHONDEBUGGER environment variable or something. I tried (unsuccessfully) to run ipdb on "pdb.set_trace()", I gave up and ended up emulating auto completion and commands history with this: https://github.com/giampaolo/sysconf/blob/master/home/.pdbrc.py On Wed, Sep 6, 2017 at 9:14 AM, Barry Warsaw wrote: > I?ve written a PEP proposing the addition of a new built-in function > called debug(). Adding this to your code would invoke a debugger through > the hook function sys.debughook(). > > Like the existing sys.displayhook() and sys.excepthook(), you can change > sys.debughook() to point to the debugger of your choice. By default it > invokes pdb.set_trace(). > > With this PEP instead of: > > foo() > import pdb; pdb.set_trace() > bar() > > you can write: > > foo() > debug() > bar() > > and you would drop into the debugger after foo() but before bar(). More > rationale and details are provided in the PEP: > > https://www.python.org/dev/peps/pep-0553/ > > Unlike David, but like Larry, I have a prototype implementation: > > https://github.com/python/cpython/pull/3355 > > Cheers, > -Barry > > P.S. This came to me in a nightmare on Sunday night, and the more I > explored the idea the more it frightened me. 
I know exactly what I was > dreaming about and the only way to make it all go away was to write this > thing up. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/g. > rodola%40gmail.com > > -- Giampaolo - http://grodola.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Sep 5 22:58:46 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 5 Sep 2017 19:58:46 -0700 Subject: [Python-Dev] PEP 553: Built-in debug() In-Reply-To: References: Message-ID: On Tue, Sep 5, 2017 at 6:14 PM, Barry Warsaw wrote: > I?ve written a PEP proposing the addition of a new built-in function called debug(). Adding this to your code would invoke a debugger through the hook function sys.debughook(). The 'import pdb; pdb.set_trace()' dance is *extremely* obscure, so replacing it with something more friendly seems like a great idea. Maybe breakpoint() would be a better description of what set_trace() actually does? This would also avoid confusion with IPython's very useful debug magic: https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-debug and which might also be worth stealing for the builtin REPL. (Personally I use it way more often than set_trace().) Example: In [1]: def f(): ...: x = 1 ...: raise RuntimeError ...: In [2]: f() --------------------------------------------------------------------------- RuntimeError Traceback (most recent call last) in () ----> 1 f() in f() 1 def f(): 2 x = 1 ----> 3 raise RuntimeError RuntimeError: In [3]: debug > (3)f() 1 def f(): 2 x = 1 ----> 3 raise RuntimeError ipdb> p x 1 -n -- Nathaniel J. Smith -- https://vorpus.org From steve at pearwood.info Tue Sep 5 22:57:15 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 6 Sep 2017 12:57:15 +1000 Subject: [Python-Dev] PEP 553: Built-in debug() In-Reply-To: References: Message-ID: <20170906025715.GG13110@ando.pearwood.info> On Tue, Sep 05, 2017 at 06:14:12PM -0700, Barry Warsaw wrote: > I?ve written a PEP proposing the addition of a new built-in function > called debug(). Adding this to your code would invoke a debugger > through the hook function sys.debughook(). [...] > P.S. This came to me in a nightmare on Sunday night, and the more I > explored the idea the more it frightened me. I know exactly what I > was dreaming about and the only way to make it all go away was to > write this thing up. Sorry, are we to interpret this as you asking that the PEP be rejected? I can't tell whether you are being poetic and actually think the PEP is a good idea, or whether you have written it to have it rejected and prevent anyone else ever implementing this? -- Steve From benjamin at python.org Tue Sep 5 23:06:26 2017 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 05 Sep 2017 20:06:26 -0700 Subject: [Python-Dev] PEP 553: Built-in debug() In-Reply-To: <20170906025715.GG13110@ando.pearwood.info> References: <20170906025715.GG13110@ando.pearwood.info> Message-ID: <1504667186.2203533.1096578744.6E33CEB6@webmail.messagingengine.com> On Tue, Sep 5, 2017, at 19:57, Steven D'Aprano wrote: > Sorry, are we to interpret this as you asking that the PEP be rejected? > I can't tell whether you are being poetic and actually think the PEP is > a good idea, or whether you have written it to have it rejected and > prevent anyone else ever implementing this? 
It is a playful reference to the postscript on the earlier https://mail.python.org/pipermail/python-dev/2017-September/149174.html From guido at python.org Tue Sep 5 23:15:46 2017 From: guido at python.org (Guido van Rossum) Date: Tue, 5 Sep 2017 20:15:46 -0700 Subject: [Python-Dev] PEP 553: Built-in debug() In-Reply-To: References: Message-ID: Yeah, I like the idea, but I don't like the debug() name -- IIRC there's a helper named debug() in some codebase I know of that prints its arguments under certain circumstances. On Tue, Sep 5, 2017 at 7:58 PM, Nathaniel Smith wrote: > On Tue, Sep 5, 2017 at 6:14 PM, Barry Warsaw wrote: > > I?ve written a PEP proposing the addition of a new built-in function > called debug(). Adding this to your code would invoke a debugger through > the hook function sys.debughook(). > > The 'import pdb; pdb.set_trace()' dance is *extremely* obscure, so > replacing it with something more friendly seems like a great idea. > > Maybe breakpoint() would be a better description of what set_trace() > actually does? This would also avoid confusion with IPython's very > useful debug magic: > https://ipython.readthedocs.io/en/stable/interactive/ > magics.html#magic-debug > and which might also be worth stealing for the builtin REPL. > (Personally I use it way more often than set_trace().) > > Example: > > In [1]: def f(): > ...: x = 1 > ...: raise RuntimeError > ...: > > In [2]: f() > ------------------------------------------------------------ > --------------- > RuntimeError Traceback (most recent call last) > in () > ----> 1 f() > > in f() > 1 def f(): > 2 x = 1 > ----> 3 raise RuntimeError > > RuntimeError: > > In [3]: debug > > (3)f() > 1 def f(): > 2 x = 1 > ----> 3 raise RuntimeError > > ipdb> p x > 1 > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From asottile at umich.edu Wed Sep 6 00:10:28 2017 From: asottile at umich.edu (Anthony Sottile) Date: Tue, 5 Sep 2017 21:10:28 -0700 Subject: [Python-Dev] Looking for review on small argparse patch Message-ID: As argparse seems to not have a current maintainer, I'm hoping a volunteer could review this small patch to the argparse module. I posted the PR nearly a month ago and have not received any feedback yet: https://github.com/python/cpython/pull/3027 bpo link: https://bugs.python.org/issue26510 Thanks in advance, Anthony From songofacandy at gmail.com Wed Sep 6 00:17:52 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Wed, 6 Sep 2017 13:17:52 +0900 Subject: [Python-Dev] To reduce Python "application" startup time In-Reply-To: References: Message-ID: >> >> I haven't created pull request yet. >> (Can I create without issue, as trivial patch?) > > > Trivial, no-issue PRs are meant for things like typo fixes that need no > discussion or record. > > Moving imports in violation of the PEP 8 rule, "Imports are always put at > the top of the file, just after any module comments and docstrings, and > before module globals and constants", is not trivial. Doing so voluntarily > for speed, as opposed to doing so necessarily to avoid circular import > errors, is controversial. > > -- > Terry Jan Reedy > Make sense. 
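For reference, the kind of change in the patch is just deferring a heavy
import into the function that needs it, roughly like this (illustrative
names, not the actual stdlib diff):

    # Before: the socket module (and its enum setup) is paid for at
    # import time by every user of this module.
    import socket

    def make_connection(host, port):
        return socket.create_connection((host, port))

    # After: the cost is deferred until the first call; the module is
    # cached in sys.modules after that, so later calls stay cheap.
    def make_connection(host, port):
        import socket
        return socket.create_connection((host, port))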
I'll create issues for each module if it seems really worth enough. Thanks, From songofacandy at gmail.com Wed Sep 6 00:20:34 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Wed, 6 Sep 2017 13:20:34 +0900 Subject: [Python-Dev] To reduce Python "application" startup time In-Reply-To: References: Message-ID: >> With this profile, I tried optimize `python -c 'import asyncio'`, logging >> and http.client. >> >> >> https://gist.github.com/methane/1ab97181e74a33592314c7619bf34233#file-0-optimize-import-patch >> > This patch moves a few imports inside functions. I wonder whether that kind > of change actually helps with real applications?doesn't any real application > end up importing the socket module anyway at some point? > Ah, I'm sorry. It doesn't importing asyncio, logging and http.client faster. I saw pkg_resources. While it's not stdlib, it is imported very often. And it uses email.parser, but doesn't require socket or random. Since socket module creates some enums, removing it reduces few milliseconds. Regards, From songofacandy at gmail.com Wed Sep 6 00:30:23 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Wed, 6 Sep 2017 13:30:23 +0900 Subject: [Python-Dev] To reduce Python "application" startup time In-Reply-To: References: Message-ID: >> This patch moves a few imports inside functions. I wonder whether that kind >> of change actually helps with real applications?doesn't any real application >> end up importing the socket module anyway at some point? > > I don't know if this particular change is worthwhile, but one place > where startup slowness is particularly noticed is with commands like > 'foo.py --help' or 'foo.py --generate-completions' (the latter called > implicitly by hitting in some shell), which typically do lots of > imports that end up not being used. > Yes. And There are more worse scenario. 1. Jinja2 supports asyncio. So it imports asyncio. 2. asyncio imports concurrent.futures, for compatibility with Future class. 3. concurrent.futures package does `from concurrent.futures.process import ProcessPoolExecutor` 4. concurrent.futures.process package imports multiprocessing. So when I use Jinja2 but not asyncio or multiprocessing, I need to import large dependency tree. I want to make `import asyncio` dependency tree smaller. FYI, current version of Jinja2 has very large regex which took more than 100ms when import time. It is fixed in master branch. So if you try to see Jinja2, please use master branch. Regrads, From rosuav at gmail.com Wed Sep 6 00:34:14 2017 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 6 Sep 2017 14:34:14 +1000 Subject: [Python-Dev] To reduce Python "application" startup time In-Reply-To: References: Message-ID: On Wed, Sep 6, 2017 at 2:30 PM, INADA Naoki wrote: >>> This patch moves a few imports inside functions. I wonder whether that kind >>> of change actually helps with real applications?doesn't any real application >>> end up importing the socket module anyway at some point? >> >> I don't know if this particular change is worthwhile, but one place >> where startup slowness is particularly noticed is with commands like >> 'foo.py --help' or 'foo.py --generate-completions' (the latter called >> implicitly by hitting in some shell), which typically do lots of >> imports that end up not being used. >> > > Yes. And There are more worse scenario. > > 1. Jinja2 supports asyncio. So it imports asyncio. > 2. asyncio imports concurrent.futures, for compatibility with Future class. > 3. 
concurrent.futures package does > `from concurrent.futures.process import ProcessPoolExecutor` > 4. concurrent.futures.process package imports multiprocessing. > > So when I use Jinja2 but not asyncio or multiprocessing, I need to import > large dependency tree. > I want to make `import asyncio` dependency tree smaller. > > FYI, current version of Jinja2 has very large regex which took more than 100ms > when import time. It is fixed in master branch. So if you try to see > Jinja2, please > use master branch. How significant is application startup time to something that uses Jinja2? Are there short-lived programs that use it? Python startup time matters enormously to command-line tools like Mercurial, but far less to something that's designed to start up and then keep running (eg a web app, which is where Jinja is most used). ChrisA From rosuav at gmail.com Wed Sep 6 01:05:51 2017 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 6 Sep 2017 15:05:51 +1000 Subject: [Python-Dev] PEP 548: More Flexible Loop Control In-Reply-To: <20170906001100.5F9DB1B10001@webabinitio.net> References: <20170906001100.5F9DB1B10001@webabinitio.net> Message-ID: On Wed, Sep 6, 2017 at 10:11 AM, R. David Murray wrote: > I've written a PEP proposing a small enhancement to the Python loop > control statements. Short version: here's what feels to me like a > Pythonic way to spell "repeat until": > > while: > > break if > > The PEP goes into some detail on why this feels like a readability > improvement in the more general case, with examples taken from > the standard library: > > https://www.python.org/dev/peps/pep-0548/ Is "break if" legal in loops that have their own conditions as well, or only in a bare "while:" loop? For instance, is this valid? while not found_the_thing_we_want: data = sock.read() break if not data process(data) Or this, which uses the condition purely as a descriptor: while "moar socket data": data = sock.read() break if not data process(data) Also - shouldn't this be being discussed first on python-ideas? ChrisA From storchaka at gmail.com Wed Sep 6 02:42:40 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 6 Sep 2017 09:42:40 +0300 Subject: [Python-Dev] PEP 548: More Flexible Loop Control In-Reply-To: <20170906001100.5F9DB1B10001@webabinitio.net> References: <20170906001100.5F9DB1B10001@webabinitio.net> Message-ID: 06.09.17 03:11, R. David Murray ????: > I've written a PEP proposing a small enhancement to the Python loop > control statements. Short version: here's what feels to me like a > Pythonic way to spell "repeat until": > > while: > > break if > > The PEP goes into some detail on why this feels like a readability > improvement in the more general case, with examples taken from > the standard library: > > https://www.python.org/dev/peps/pep-0548/ > > Unlike Larry, I don't have a prototype, and in fact if this idea > meets with approval I'll be looking for a volunteer to do the actual > implementation. > > --David > > PS: this came to me in a dream on Sunday night, and the more I explored > the idea the better I liked it. I have no idea what I was dreaming about > that resulted in this being the thing left in my mind when I woke up :) This looks rather like Perl way than Python way. "There should be one-- and preferably only one --obvious way to do it." This proposing saves just a one line of the code. But it makes "break" and "continue" statement less visually distinguishable as it is seen in your example from uuid.py. 
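To show what I mean with a generic loop (placeholder code, not the
actual uuid.py fragment):

    # Current syntax: the exit point gets a suite of its own and is
    # easy to see when scanning the left margin.
    while True:
        block = read_block()
        if not block:
            break
        handle(block)

    # Proposed syntax (not valid Python today): the exit hides at the
    # end of a line that is shaped like any other statement.
    #
    # while:
    #     block = read_block()
    #     break if not block
    #     handle(block)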
If we allow "break if" and "continue if", why not allow "return if"? Or an
arbitrary statement before "if"? This adds PHP-like inconsistency to the
language.

The current idiom is also easier to modify. If you add a second condition,
it may be that you need to execute different code before "break".

    while True:
        if not <c1>:
            break
        if not <c2>:
            break

It is easy to modify the code with the current syntax, but the code with
the proposed syntax would have to be totally rewritten.

Your example from sre_parse.py demonstrates this. Please note that the
pre-exit code is slightly different. In the first case self.error() is
called with one argument, and in the second case it is called with two
arguments. Your rewritten code is not equivalent to the existing one.

Another concern is that the current code is highly optimized for the
common cases. Your rewritten code checks the condition "c is None" two
times in the common case.

I'm -1 for this proposition.

From greg.ewing at canterbury.ac.nz Wed Sep 6 03:07:12 2017
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 06 Sep 2017 19:07:12 +1200
Subject: [Python-Dev] PEP 550 v4
In-Reply-To: References: <20170828111903.GA3032@bytereef.org>
 <20170828155215.GA3763@bytereef.org> <20170828173327.GA2671@bytereef.org>
 <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz>
 <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz>
 <59AF3A75.2020406@canterbury.ac.nz>
Message-ID: <59AF9EA0.9020608@canterbury.ac.nz>

Yury Selivanov wrote:
> Greg, have you seen this new section:
> https://www.python.org/dev/peps/pep-0550/#should-yield-from-leak-context-changes

That section seems to be addressing the idea of a generator
behaving differently depending on whether you use yield-from
on it. I never suggested that, and I'm still not suggesting it.

> The bottomline is that it's easier to
> reason about context when it's guaranteed that context changes are
> always isolated in generators no matter what.

I don't see a lot of value in trying to automagically isolate
changes to global state *only* in generators.

Under PEP 550, if you want to e.g. change the decimal context
temporarily in a non-generator function, you're still going to
have to protect those changes using a with-statement or something
equivalent. I don't see why the same thing shouldn't apply to
generators. It seems to me that it will be *more* confusing to
give generators this magical ability to avoid with-statements.

> This will have some performance implications and make the API way more
> complex.

I can't see how it would have any significant effect on
performance. The implementation would be very similar to
what's currently described in the PEP. You'll have to
elaborate on how you think it would be less efficient.

As for complexity, push_local_context() and pop_local_context()
would be considered low-level primitives that you wouldn't
often use directly. Most of the time they would be hidden
inside context managers. You could even have a context
manager just for applying them:

    with new_local_context():
        # go nuts with context vars here

> But I'm not convinced yet that real-life code needs the
> semantics you want.

And I'm not convinced that it needs as much magic as
you want.

> If you write code that uses 'with' statements consistently, you will
> never even know that context changes are isolated in generators.
But if you write code that uses context managers consistently, and those context managers know about and handle local contexts properly, generators don't *need* to isolate their context automatically. -- Greg From songofacandy at gmail.com Wed Sep 6 03:42:11 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Wed, 6 Sep 2017 16:42:11 +0900 Subject: [Python-Dev] To reduce Python "application" startup time In-Reply-To: References: Message-ID: > How significant is application startup time to something that uses > Jinja2? Are there short-lived programs that use it? Python startup > time matters enormously to command-line tools like Mercurial, but far > less to something that's designed to start up and then keep running > (eg a web app, which is where Jinja is most used). Since Jinja2 is very popular template engine, it is used by CLI tools like ansible. Additionally, faster startup time (and smaller memory footprint) is good for even Web applications. For example, CGI is still comfortable tool sometimes. Another example is GAE/Python. Anyway, I think researching import tree of popular library is good startline about optimizing startup time. For example, modules like ast and tokenize are imported often than I thought. Jinja2 is one of libraries I often use. I'm checking other libraries like requests. Thanks, INADA Naoki From levkivskyi at gmail.com Wed Sep 6 04:22:04 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Wed, 6 Sep 2017 10:22:04 +0200 Subject: [Python-Dev] PEP 548: More Flexible Loop Control In-Reply-To: References: <20170906001100.5F9DB1B10001@webabinitio.net> Message-ID: On 6 September 2017 at 08:42, Serhiy Storchaka wrote: > 06.09.17 03:11, R. David Murray ????: > >> I've written a PEP proposing a small enhancement to the Python loop >> control statements. Short version: here's what feels to me like a >> Pythonic way to spell "repeat until": >> >> while: >> >> break if >> >> > I'm -1 for this proposition. > > > I also think this is not worth it. This will save few lines of code but introduces some ambiguity and makes syntax more complex. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From levkivskyi at gmail.com Wed Sep 6 04:49:16 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Wed, 6 Sep 2017 10:49:16 +0200 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: <59AF9EA0.9020608@canterbury.ac.nz> References: <20170828111903.GA3032@bytereef.org> <20170828155215.GA3763@bytereef.org> <20170828173327.GA2671@bytereef.org> <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz> <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz> <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz> Message-ID: Another comment from bystander point of view: it looks like the discussions of API design and implementation are a bit entangled here. This is much better in the current version of the PEP, but still there is a _feelling_ that some design decisions are influenced by the implementation strategy. As I currently see the "philosophy" at large is like this: there are different level of coupling between concurrently executing code: * processes: practically not coupled, designed to be long running * threads: more tightly coupled, designed to be less long-lived, context is managed by threading.local, which is not inherited on "forking" * tasks: tightly coupled, designed to be short-lived, context will be managed by PEP 550, context is inherited on "forking" This seems right to me. 
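A quick stdlib-only illustration of the middle level -- state stored in
threading.local is not inherited when a new thread is started, while
under PEP 550 a task would start from a copy of its parent's context:

    import threading

    tl = threading.local()
    tl.value = 42  # set in the main thread

    def child():
        # A new thread starts with a fresh threading.local namespace,
        # so nothing is inherited from the parent thread.
        print(getattr(tl, 'value', 'not inherited'))

    t = threading.Thread(target=child)
    t.start()
    t.join()  # prints 'not inherited'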
Normal generators fall out from this "scheme", and it looks like their behavior is determined by the fact that coroutines are implemented as generators. What I think miht help is to add few more motivational examples to the design section of the PEP. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Sep 6 05:13:54 2017 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 6 Sep 2017 02:13:54 -0700 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: References: <20170828111903.GA3032@bytereef.org> <20170828155215.GA3763@bytereef.org> <20170828173327.GA2671@bytereef.org> <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz> <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz> <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz> Message-ID: On Wed, Sep 6, 2017 at 1:49 AM, Ivan Levkivskyi wrote: > Another comment from bystander point of view: it looks like the discussions > of API design and implementation are a bit entangled here. > This is much better in the current version of the PEP, but still there is a > _feelling_ that some design decisions are influenced by the implementation > strategy. > > As I currently see the "philosophy" at large is like this: > there are different level of coupling between concurrently executing code: > * processes: practically not coupled, designed to be long running > * threads: more tightly coupled, designed to be less long-lived, context is > managed by threading.local, which is not inherited on "forking" > * tasks: tightly coupled, designed to be short-lived, context will be > managed by PEP 550, context is inherited on "forking" > > This seems right to me. > > Normal generators fall out from this "scheme", and it looks like their > behavior is determined by the fact that coroutines are implemented as > generators. What I think miht help is to add few more motivational examples > to the design section of the PEP. Literally the first motivating example at the beginning of the PEP ('def fractions ...') involves only generators, not coroutines, and only works correctly if generators get special handling. (In fact, I'd be curious to see how Greg's {push,pop}_local_storage could handle this case.) The implementation strategy changed radically between v1 and v2 because of considerations around generator (not coroutine) semantics. I'm not sure what more it can do to dispel these feelings :-). -n -- Nathaniel J. Smith -- https://vorpus.org From k7hoven at gmail.com Wed Sep 6 06:11:32 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Wed, 6 Sep 2017 13:11:32 +0300 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: References: <20170828111903.GA3032@bytereef.org> <20170828155215.GA3763@bytereef.org> <20170828173327.GA2671@bytereef.org> <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz> <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz> <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz> Message-ID: On Wed, Sep 6, 2017 at 12:13 PM, Nathaniel Smith wrote: > On Wed, Sep 6, 2017 at 1:49 AM, Ivan Levkivskyi > wrote: > > Another comment from bystander point of view: it looks like the > discussions > > of API design and implementation are a bit entangled here. > > This is much better in the current version of the PEP, but still there > is a > > _feelling_ that some design decisions are influenced by the > implementation > > strategy. 
> > > > As I currently see the "philosophy" at large is like this: > > there are different level of coupling between concurrently executing > code: > > * processes: practically not coupled, designed to be long running > > * threads: more tightly coupled, designed to be less long-lived, context > is > > managed by threading.local, which is not inherited on "forking" > > * tasks: tightly coupled, designed to be short-lived, context will be > > managed by PEP 550, context is inherited on "forking" > > > > This seems right to me. > > > > Normal generators fall out from this "scheme", and it looks like their > > behavior is determined by the fact that coroutines are implemented as > > generators. What I think miht help is to add few more motivational > examples > > to the design section of the PEP. > > Literally the first motivating example at the beginning of the PEP > ('def fractions ...') involves only generators, not coroutines, and > only works correctly if generators get special handling. (In fact, I'd > be curious to see how Greg's {push,pop}_local_storage could handle > this case.) The implementation strategy changed radically between v1 > and v2 because of considerations around generator (not coroutine) > semantics. I'm not sure what more it can do to dispel these feelings > :-). > > ?Just to mention that this is now closely related to the discussion on my proposal on python-ideas. BTW, that proposal is now submitted as PEP 555 on the peps repo. ??Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Wed Sep 6 06:14:59 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 6 Sep 2017 12:14:59 +0200 Subject: [Python-Dev] Consolidate stateful runtime globals Message-ID: <20170906121459.215159bc@fsol> Hello, I'm a bit concerned about https://github.com/python/cpython/commit/76d5abc8684bac4f2fc7cccfe2cd940923357351 My main gripe is that makes writing C code more tedious. Simple C global variables such as "_once_registry" are now spelled "_PyRuntime.warnings.once_registry". The most egregious example seems to be "_PyRuntime.ceval.gil.locked" (used to be simply "gil_locked"). Granted, C is more verbose than Python, but it doesn't have to become that verbose. I don't know about you, but when code becomes annoying to type, I tend to try and take shortcuts. Regards Antoine. From solipsis at pitrou.net Wed Sep 6 06:09:05 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 6 Sep 2017 12:09:05 +0200 Subject: [Python-Dev] [RFC] Removing pure Python implementation of OrderedDict In-Reply-To: References: <20170905212024.7401434a@fsol> Message-ID: <20170906120905.0115d946@fsol> On Wed, 6 Sep 2017 11:26:52 +0900 INADA Naoki wrote: > > Like that, should we say "atomic & threadsafe __setitem__ for simple > key is implementation detail of CPython and PyPy. We recommend > using mutex when using OrderedDict from multiple thread."? I think you may be overstating the importance of making OrderedDict thread-safe. It's quite rare to be able to rely on the thread safety of a single structure, since most often your state is more complex than that and you have to use a lock anyway. The statu quo is that only experts rely on the thread-safety of list and dict, and they should be ready to reconsider if some day the guarantees change. Regards Antoine. 
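P.S. A sketch of what I mean by "your state is more complex than that":
even a fully thread-safe __setitem__ would not make a read-modify-write
sequence atomic, so the lock is needed regardless:

    import threading
    from collections import OrderedDict

    lock = threading.Lock()
    stats = OrderedDict()

    def record(key):
        # get() followed by __setitem__ is two operations; without the
        # lock another thread could run in between and lose an update.
        with lock:
            stats[key] = stats.get(key, 0) + 1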
From p.f.moore at gmail.com Wed Sep 6 06:40:33 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 6 Sep 2017 11:40:33 +0100 Subject: [Python-Dev] [RFC] Removing pure Python implementation of OrderedDict In-Reply-To: <20170906120905.0115d946@fsol> References: <20170905212024.7401434a@fsol> <20170906120905.0115d946@fsol> Message-ID: On 6 September 2017 at 11:09, Antoine Pitrou wrote: > On Wed, 6 Sep 2017 11:26:52 +0900 > INADA Naoki wrote: >> >> Like that, should we say "atomic & threadsafe __setitem__ for simple >> key is implementation detail of CPython and PyPy. We recommend >> using mutex when using OrderedDict from multiple thread."? > > I think you may be overstating the importance of making OrderedDict > thread-safe. It's quite rare to be able to rely on the thread safety > of a single structure, since most often your state is more complex than > that and you have to use a lock anyway. > > The statu quo is that only experts rely on the thread-safety of list > and dict, and they should be ready to reconsider if some day the > guarantees change. Agreed. I wasn't even aware that list and dict were guaranteed threadsafe (in the language reference). And even if they are, there's going to be a lot of provisos that mean in practice you need to know what you're doing to rely on that. Simple example: mydict[the_value()] += 1 isn't thread safe, no matter how thread safe dictionaries are. I don't have a strong opinion on making OrderedDict "guaranteed thread safe" according to the language definition. But I'm pretty certain that whether we do or not will make very little practical difference to the vast majority of Python users. Paul From songofacandy at gmail.com Wed Sep 6 06:49:56 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Wed, 6 Sep 2017 19:49:56 +0900 Subject: [Python-Dev] [RFC] Removing pure Python implementation of OrderedDict In-Reply-To: References: <20170905212024.7401434a@fsol> <20170906120905.0115d946@fsol> Message-ID: OK, I stop worring about thread safety and other implementation detail behavior on edge cases. Thanks, INADA Naoki On Wed, Sep 6, 2017 at 7:40 PM, Paul Moore wrote: > On 6 September 2017 at 11:09, Antoine Pitrou wrote: >> On Wed, 6 Sep 2017 11:26:52 +0900 >> INADA Naoki wrote: >>> >>> Like that, should we say "atomic & threadsafe __setitem__ for simple >>> key is implementation detail of CPython and PyPy. We recommend >>> using mutex when using OrderedDict from multiple thread."? >> >> I think you may be overstating the importance of making OrderedDict >> thread-safe. It's quite rare to be able to rely on the thread safety >> of a single structure, since most often your state is more complex than >> that and you have to use a lock anyway. >> >> The statu quo is that only experts rely on the thread-safety of list >> and dict, and they should be ready to reconsider if some day the >> guarantees change. > > Agreed. I wasn't even aware that list and dict were guaranteed > threadsafe (in the language reference). And even if they are, there's > going to be a lot of provisos that mean in practice you need to know > what you're doing to rely on that. Simple example: > > mydict[the_value()] += 1 > > isn't thread safe, no matter how thread safe dictionaries are. > > I don't have a strong opinion on making OrderedDict "guaranteed thread > safe" according to the language definition. But I'm pretty certain > that whether we do or not will make very little practical difference > to the vast majority of Python users. 
> > Paul > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com From benhoyt at gmail.com Wed Sep 6 08:05:46 2017 From: benhoyt at gmail.com (Ben Hoyt) Date: Wed, 6 Sep 2017 08:05:46 -0400 Subject: [Python-Dev] PEP 548: More Flexible Loop Control In-Reply-To: References: <20170906001100.5F9DB1B10001@webabinitio.net> Message-ID: I think Serhiy's response is excellent and agree with it. My gut reaction is "this looks like Perl" (and not in a good way), but more specifically it makes the control flow almost invisible. So I'm definitely -1 on this. The current while True ... break idiom is not pretty, but it's also very clear and obvious, and the control flow is immediately visible. One thing I do like about the proposal is the bare "while:", and I think that syntax is obvious and might be worth keeping (separately to the rest of the proposal). A bare "while:" (meaning "while True:") seems somehow less insulting to the intelligence, is still clear, and has precedent in Go's bare "for { ... }". -Ben On Wed, Sep 6, 2017 at 2:42 AM, Serhiy Storchaka wrote: > 06.09.17 03:11, R. David Murray ????: > >> I've written a PEP proposing a small enhancement to the Python loop >> control statements. Short version: here's what feels to me like a >> Pythonic way to spell "repeat until": >> >> while: >> >> break if >> >> The PEP goes into some detail on why this feels like a readability >> improvement in the more general case, with examples taken from >> the standard library: >> >> https://www.python.org/dev/peps/pep-0548/ >> >> Unlike Larry, I don't have a prototype, and in fact if this idea >> meets with approval I'll be looking for a volunteer to do the actual >> implementation. >> >> --David >> >> PS: this came to me in a dream on Sunday night, and the more I explored >> the idea the better I liked it. I have no idea what I was dreaming about >> that resulted in this being the thing left in my mind when I woke up :) >> > > This looks rather like Perl way than Python way. > > "There should be one-- and preferably only one --obvious way to do it." > > This proposing saves just a one line of the code. But it makes "break" and > "continue" statement less visually distinguishable as it is seen in your > example from uuid.py. > > If allow "break if" and "continue if", why not allow "return if"? Or > arbitrary statement before "if"? This adds PHP-like inconsistency in the > language. > > Current idiom is easier for modification. If you add the second condition, > it may be that you need to execute different code before "break". > > while True: > > if not : > > break > > if not : > > break > > It is easy to modify the code with the current syntax, but the code with > the proposed syntax should be totally rewritten. > > Your example from sre_parse.py demonstrates this. Please note that > pre-exit code is slightly different. In the first case self.error() is > called with one argument, and in the second case it is called with two > arguments. Your rewritten code is not equivalent to the existing one. > > Other concern is that the current code is highly optimized for common > cases. Your rewritten code checks the condition "c is None" two times in > common case. > > I'm -1 for this proposition. 
> If we allow "break if" and "continue if", why not allow "return if"? Or an
> arbitrary statement before "if"? This adds PHP-like inconsistency to the
> language.
>
> The current idiom is also easier to modify. If you add a second condition,
> it may be that you need to execute different code before "break".
>
>     while True:
>         if not <c1>:
>             break
>         if not <c2>:
>             break
>
> It is easy to modify the code with the current syntax, but the code with
> the proposed syntax would have to be totally rewritten.
>
> Your example from sre_parse.py demonstrates this. Please note that the
> pre-exit code is slightly different. In the first case self.error() is
> called with one argument, and in the second case it is called with two
> arguments. Your rewritten code is not equivalent to the existing one.
>
> Another concern is that the current code is highly optimized for the
> common cases. Your rewritten code checks the condition "c is None" two
> times in the common case.
>
> I'm -1 for this proposition.
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/benhoyt%40gmail.com

From levkivskyi at gmail.com Wed Sep 6 08:58:18 2017
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Wed, 6 Sep 2017 14:58:18 +0200
Subject: [Python-Dev] PEP 550 v4
In-Reply-To: References: <20170828111903.GA3032@bytereef.org>
 <20170828155215.GA3763@bytereef.org> <20170828173327.GA2671@bytereef.org>
 <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz>
 <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz>
 <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz>
Message-ID:

On 6 September 2017 at 11:13, Nathaniel Smith wrote:

> On Wed, Sep 6, 2017 at 1:49 AM, Ivan Levkivskyi wrote:
> > Normal generators fall out from this "scheme", and it looks like their
> > behavior is determined by the fact that coroutines are implemented as
> > generators. What I think might help is to add a few more motivational
> > examples to the design section of the PEP.
>
> Literally the first motivating example at the beginning of the PEP
> ('def fractions ...') involves only generators, not coroutines.
Originally I was thinking of a keyword like "break here", but once (after
discussion with a few folks at the sprint) I settled on a built-in function,
I was looking for something concise that most directly reflected the intent.
Plus I knew I wanted to mirror the sys.*hooks, so again I looked for
something short. debug() was the best I could come up with!

breakpoint() could work, although would the hooks then be
sys.breakpointhook() and sys.__breakpointhook__? Too bad we can't just use
break() :).

Guido, is that helper you're thinking of implemented as a built-in? If you
have a suggestion, it would short-circuit the inevitable bikeshedding.

> This would also avoid confusion with IPython's very
> useful debug magic:
> https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-debug
> and which might also be worth stealing for the builtin REPL.
> (Personally I use it way more often than set_trace().)

Interesting. I'm not an IPython user. Do you think its %debug magic would
benefit from PEP 553?

(Aside: improving/expanding the stdlib debugger is something else I'd like
to work on, but this is completely independent of PEP 553.)

Cheers,
-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 273 bytes
Desc: Message signed with OpenPGP
URL: 

From wes.turner at gmail.com  Wed Sep  6 10:12:00 2017
From: wes.turner at gmail.com (Wes Turner)
Date: Wed, 6 Sep 2017 09:12:00 -0500
Subject: [Python-Dev] To reduce Python "application" startup time
In-Reply-To: 
References: 
Message-ID: 

On Wednesday, September 6, 2017, INADA Naoki wrote:

> > How significant is application startup time to something that uses
> > Jinja2? Are there short-lived programs that use it? Python startup
> > time matters enormously to command-line tools like Mercurial, but far
> > less to something that's designed to start up and then keep running
> > (eg a web app, which is where Jinja is most used).
>
> Since Jinja2 is a very popular template engine, it is used by CLI tools
> like ansible.

SaltStack uses Jinja2. It really is a good idea to regularly restart the
minion processes. Celery can also cycle through worker processes, IIRC.

> Additionally, faster startup time (and smaller memory footprint) is good
> even for Web applications.
> For example, CGI is still a comfortable tool sometimes.
> Another example is GAE/Python.

Short-lived processes are sometimes preferable from a security standpoint.
Python is currently less viable for CGI use than other scripting languages
due to startup time. Resource leaks (e.g. memory, file handles, database
references; valgrind) do not last w/ short-lived CGI processes. If there's
ASLR, that's also harder.

Scale-up operations with e.g. IaaS platforms like Kubernetes and PaaS
platforms like AppScale all incur Python startup time on a regular basis.

> Anyway, I think researching the import trees of popular libraries is a
> good starting point for optimizing startup time.
> For example, modules like ast and tokenize are imported more often than I
> thought.
>
> Jinja2 is one of the libraries I often use. I'm checking other libraries
> like requests.
> Thanks,
>
> INADA Naoki
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/wes.turner%40gmail.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
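One cheap mitigation for CLI startup cost is to defer a heavy import into
the function that needs it, so programs that never hit that code path never
pay for it at startup. A minimal sketch (illustrative only; render_template
is a made-up helper, and Jinja2 just stands in for any expensive dependency):

    def render_template(template_source, **context):
        # Deferred import: the cost of importing jinja2 is paid on the
        # first render, not when the program starts.
        import jinja2
        return jinja2.Template(template_source).render(**context)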
From barry at python.org  Wed Sep  6 10:42:51 2017
From: barry at python.org (Barry Warsaw)
Date: Wed, 6 Sep 2017 07:42:51 -0700
Subject: [Python-Dev] PEP 553: Built-in debug()
In-Reply-To: 
References: 
Message-ID: <98F44C60-2DBB-4771-9544-B9DA7210353C@python.org>

On Sep 5, 2017, at 19:31, Giampaolo Rodola' wrote:
>
> True. Personally I have a shortcut in my IDE (Sublime) so that when I type "pdb" -> TAB it auto-completes it.
>
> Somehow I think debug() would make this a bit harder, as it's more likely a "debug()" line will pass unnoticed.
> For this reason I would give a -1 to this proposal.

I think if your linter or editor can take note of the pdb idiom, it can also do so for the debug() built-in.

> Personally I would find it helpful if there was a hook to choose the default debugger to use on "pdb.set_trace()" via .pdbrc or a PYTHONDEBUGGER environment variable or something.
> I tried (unsuccessfully) to run ipdb on "pdb.set_trace()"; I gave up and ended up emulating auto completion and command history with this:
> https://github.com/giampaolo/sysconf/blob/master/home/.pdbrc.py

I don't think that's a good idea. pdb is a thing, and that thing is the standard library debugger. I don't think "pdb" should be the term we use to describe a generic Python debugger interface. That to me is one of the advantages of PEP 553; it separates the act of invoking the debugger from the actual debugger so invoked.

Cheers,
-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 273 bytes
Desc: Message signed with OpenPGP
URL: 

From guido at python.org  Wed Sep  6 10:36:15 2017
From: guido at python.org (Guido van Rossum)
Date: Wed, 6 Sep 2017 07:36:15 -0700
Subject: [Python-Dev] [RFC] Removing pure Python implementation of OrderedDict
In-Reply-To: 
References: <20170905212024.7401434a@fsol> <20170906120905.0115d946@fsol>
Message-ID: 

On Wed, Sep 6, 2017 at 3:49 AM, INADA Naoki wrote:

> OK, I stop worrying about thread safety and other implementation
> detail behavior on edge cases.

That sounds like overreacting. At the risk of stating the obvious: I want
the data structure itself to maintain its integrity even under edge cases
of multi-threading. However, that's different from promising that all (or
certain) operations will be atomic in all cases. (Plus, for dicts and sets
and other data structures that compare items, you can't have atomicity if
those comparisons call back into Python -- so it's extra important that
even when that happens the data structure's *integrity* is still
maintained.)

IMO, in edge cases, it's okay to not do an operation, do it twice, get the
wrong answer, or raise an exception, as long as the data structure's
internal constraints are still satisfied (e.g. no dangling pointers, no
inconsistent indexes, that sort of stuff.)

--
--Guido van Rossum (python.org/~guido)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From yselivanov.ml at gmail.com  Wed Sep  6 10:42:52 2017
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 6 Sep 2017 07:42:52 -0700
Subject: [Python-Dev] PEP 550 v4
In-Reply-To: 
References: <20170828111903.GA3032@bytereef.org>
 <20170828155215.GA3763@bytereef.org> <20170828173327.GA2671@bytereef.org>
 <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz>
 <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz>
 <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz>
Message-ID: 

On Wed, Sep 6, 2017 at 5:58 AM, Ivan Levkivskyi wrote:
> On 6 September 2017 at 11:13, Nathaniel Smith wrote:
>>
>> On Wed, Sep 6, 2017 at 1:49 AM, Ivan Levkivskyi
>> wrote:
>> > Normal generators fall out from this "scheme", and it looks like their
>> > behavior is determined by the fact that coroutines are implemented as
>> > generators. What I think might help is to add a few more motivational
>> > examples
>> > to the design section of the PEP.
>>
>> Literally the first motivating example at the beginning of the PEP
>> ('def fractions ...') involves only generators, not coroutines.
>
>
> And this is probably what confuses people. As I understand it, the
> tasks/coroutines are among the primary motivations for the PEP,
> but they appear somewhere later. There are four potential ways to see the
> PEP:
>
> 1) Generators are broken*, and therefore coroutines are broken; we want to
> fix the latter, therefore we fix the former.
> 2) Coroutines are broken; we want to fix them, and let's also fix generators
> while we are at it.
> 3) Generators are broken; we want to fix them, and let's also fix coroutines
> while we are at it.
> 4) Generators and coroutines are broken in similar ways; let us fix them as
> consistently as we can.

Ivan, generators and coroutines are fundamentally different objects (even
though they share the implementation). The only thing they have in common
is that they both allow for out-of-order execution of code in the same OS
thread.

The PEP explains the semantic difference of EC in the High-level
Specification in detail, literally on the 2nd page of the PEP. I don't see
any benefit in reshuffling the rationale section.

Yury

From guido at python.org  Wed Sep  6 10:46:00 2017
From: guido at python.org (Guido van Rossum)
Date: Wed, 6 Sep 2017 07:46:00 -0700
Subject: [Python-Dev] PEP 553: Built-in debug()
In-Reply-To: <7DDCE154-7596-414F-90C4-A2AE0618C728@python.org>
References: <7DDCE154-7596-414F-90C4-A2AE0618C728@python.org>
Message-ID: 

IIRC they indeed insinuate debug() into the builtins. My suggestion is also breakpoint().

On Wed, Sep 6, 2017 at 7:39 AM, Barry Warsaw wrote:

> On Sep 5, 2017, at 20:15, Guido van Rossum wrote:
> >
> > Yeah, I like the idea, but I don't like the debug() name -- IIRC there's
> a helper named debug() in some codebase I know of that prints its arguments
> under certain circumstances.
> >
> > On Tue, Sep 5, 2017 at 7:58 PM, Nathaniel Smith wrote:
> >
> > Maybe breakpoint() would be a better description of what set_trace()
> > actually does?
>
> Originally I was thinking of a keyword like "break here", but once (after
> discussion with a few folks at the sprint) I settled on a built-in
> function, I was looking for something concise that most directly reflected
> the intent. Plus I knew I wanted to mirror the sys.*hooks, so again I
> looked for something short. debug() was the best I could come up with!
>
> breakpoint() could work, although would the hooks then be
> sys.breakpointhook() and sys.__breakpointhook__?
> Too bad we can't just use break() :).
>
> Guido, is that helper you're thinking of implemented as a built-in? If
> you have a suggestion, it would short-circuit the inevitable bikeshedding.
>
> > This would also avoid confusion with IPython's very
> > useful debug magic:
> > https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-debug
> > and which might also be worth stealing for the builtin REPL.
> > (Personally I use it way more often than set_trace().)
>
> Interesting. I'm not an IPython user. Do you think its %debug magic
> would benefit from PEP 553?
>
> (Aside: improving/expanding the stdlib debugger is something else I'd like
> to work on, but this is completely independent of PEP 553.)
>
> Cheers,
> -Barry
>
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org

--
--Guido van Rossum (python.org/~guido)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From brian at python.org  Wed Sep  6 11:22:48 2017
From: brian at python.org (Brian Curtin)
Date: Wed, 6 Sep 2017 11:22:48 -0400
Subject: [Python-Dev] PEP 553: Built-in debug()
In-Reply-To: 
References: <7DDCE154-7596-414F-90C4-A2AE0618C728@python.org>
Message-ID: 

On Wed, Sep 6, 2017 at 10:46 AM, Guido van Rossum wrote:

> IIRC they indeed insinuate debug() into the builtins. My suggestion is
> also breakpoint().

I'm also a bigger fan of the `breakpoint` name. `debug` as a name is
already pretty widely used, plus breakpoint is more specific in naming
what's actually going to happen. sys.breakpoint_hook() and
sys.__breakpoint_hook__ seem reasonable.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
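A rough pure-Python model of the split being discussed -- a builtin that
only dispatches, and a swappable hook behind it (the names here are
stand-ins for the proposed builtin and sys hooks, not the final design):

    import pdb

    def _default_breakpointhook():
        # Default behavior: enter the stdlib debugger. (pdb.set_trace()
        # stops in this frame; a real implementation would compensate
        # for the extra level of indirection.)
        pdb.set_trace()

    # stand-ins for sys.breakpointhook / sys.__breakpointhook__
    breakpointhook = _default_breakpointhook
    __breakpointhook__ = _default_breakpointhook

    def breakpoint():
        # What a builtin breakpoint() could do: defer to whatever hook
        # is currently installed, so an IDE can swap in its own debugger.
        breakpointhook()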
From chris.barker at noaa.gov  Wed Sep  6 11:55:17 2017
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Wed, 6 Sep 2017 08:55:17 -0700
Subject: [Python-Dev] To reduce Python "application" startup time
In-Reply-To: 
References: 
Message-ID: <-3796574119850315591@unknownmsgid>

> Anyway, I think researching the import trees of popular libraries is a
> good starting point for optimizing startup time.

I agree -- in this case, you've identified that asyncio is expensive --
good to know.

In the jinja2 case, does it always need asyncio?

PEP 8 aside, I think it often makes sense for expensive optional imports
to be done only if needed. Perhaps a patch to jinja2 is in order.

CHB

> For example, modules like ast and tokenize are imported more often than I
> thought.
>
> Jinja2 is one of the libraries I often use. I'm checking other libraries
> like requests.
> Thanks,
>
> INADA Naoki
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/wes.turner%40gmail.com

_______________________________________________
Python-Dev mailing list
Python-Dev at python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/chris.barker%40noaa.gov

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From guido at python.org  Wed Sep  6 11:26:12 2017
From: guido at python.org (Guido van Rossum)
Date: Wed, 6 Sep 2017 08:26:12 -0700
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To: 
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>
Message-ID: 

So we've seen a real use case for __class__ assignment: deprecating things
on access. That use case could also be solved if modules natively supported
defining __getattr__ (with the same "only used if attribute not found
otherwise" semantics as it has on classes), but it couldn't be solved using
@property (or at least it would be quite hacky).

Is there a real use case for @property? Otherwise, if we're going to mess
with modules' getattro, it makes more sense to add __getattr__, which would
have made Nathaniel's use case somewhat simpler. (Except for the __dir__
thing -- what else might we need?)

On Tue, Sep 5, 2017 at 3:52 PM, Nathaniel Smith wrote:

> On Tue, Sep 5, 2017 at 3:03 PM, Larry Hastings wrote:
> >
> > I've written a PEP proposing a language change:
> >
> > https://www.python.org/dev/peps/pep-0549/
> >
> > The TL;DR summary: add support for property objects to modules. I've
> > already posted a prototype.
>
> Interesting idea! It's definitely less arcane than the __class__
> assignment support that was added in 3.5. I guess the question is
> whether to add another language feature here, or to provide better
> documentation/helpers for the existing feature.
> If anyone's curious what the __class__ trick looks like in practice,
> here's some simple deprecation machinery:
> https://github.com/njsmith/trio/blob/ee8d909e34a2b28d55b5c6137707e8861eee3234/trio/_deprecate.py#L102-L138
>
> And here's what it looks like in use:
> https://github.com/njsmith/trio/blob/ee8d909e34a2b28d55b5c6137707e8861eee3234/trio/__init__.py#L91-L115
>
> Advantages of PEP 549:
> - easier to explain and use
>
> Advantages of the __class__ trick:
> - faster (no need to add an extra step to the normal attribute lookup
> fast path); only those who need the feature pay for it
>
> - exposes the full power of Python's class model. Notice that the
> above code overrides __getattr__ but not __dir__, so the attributes
> are accessible via direct lookup but not listed in dir(mod). This is
> on purpose, for two reasons: (a) tab completion shouldn't be
> suggesting deprecated attributes, (b) when I did expose them in
> __dir__, I had trouble with test runners that iterated through
> dir(mod) looking for tests, and ended up spewing tons of spurious
> deprecation warnings. (This is especially bad when you have a policy
> of running your tests with DeprecationWarnings converted to errors.) I
> don't think there's any way to do this with PEP 549.
>
> - already supported in CPython 3.5+ and PyPy3, and with a bit of care
> can be faked on all older CPython releases (including, crucially,
> CPython 2). PEP 549 OTOH AFAICT will only be usable for packages that
> have 3.7 as their minimum supported version.
>
> I don't imagine that I would actually use PEP 549 any time in the
> foreseeable future, due to the inability to override __dir__ and the
> minimum version requirement. If you only need to support CPython 3.5+
> and PyPy3 5.9+, then you can effectively get PEP 549's functionality
> at the cost of 3 lines of code and a block indent:
>
> import sys, types
> class _MyModuleType(types.ModuleType):
>     @property
>     def ...
>
>     @property
>     def ...
> sys.modules[__name__].__class__ = _MyModuleType
>
> It's definitely true though that they're not the most obvious lines of
> code :-)
>
> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org

--
--Guido van Rossum (python.org/~guido)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From levkivskyi at gmail.com  Wed Sep  6 12:15:58 2017
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Wed, 6 Sep 2017 18:15:58 +0200
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To: 
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>
Message-ID: 

On 6 September 2017 at 17:26, Guido van Rossum wrote:

> Is there a real use case for @property? Otherwise, if we're going to mess
> with modules' getattro, it makes more sense to add __getattr__, which would
> have made Nathaniel's use case somewhat simpler. (Except for the __dir__
> thing -- what else might we need?)

One additional (IMO quite strong) argument in favor of module level
__getattr__ is that this is already used by PEP 484 for stub files and is
supported by mypy, see https://github.com/python/mypy/pull/3647

--
Ivan

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
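For concreteness, a sketch of what a module-level __getattr__ could look
like under that idea (hypothetical semantics mirroring the class rule that
the hook is only consulted when normal lookup fails; the deprecation table
here is made up):

    # lib.py -- module-level __getattr__ under the proposed semantics
    import warnings

    new_api = object()
    _renamed = {"old_api": "new_api"}

    def __getattr__(name):
        # Only reached when normal module attribute lookup fails.
        if name in _renamed:
            warnings.warn("%s is deprecated, use %s" % (name, _renamed[name]),
                          DeprecationWarning, stacklevel=2)
            return globals()[_renamed[name]]
        raise AttributeError("module %r has no attribute %r" % (__name__, name))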
From rdmurray at bitdance.com  Wed Sep  6 12:34:00 2017
From: rdmurray at bitdance.com (R. David Murray)
Date: Wed, 06 Sep 2017 12:34:00 -0400
Subject: [Python-Dev] PEP 548: More Flexible Loop Control
In-Reply-To: 
References: <20170906001100.5F9DB1B10001@webabinitio.net>
Message-ID: <20170906163401.2EDA01B10001@webabinitio.net>

On Wed, 06 Sep 2017 15:05:51 +1000, Chris Angelico wrote:
> On Wed, Sep 6, 2017 at 10:11 AM, R. David Murray wrote:
> > I've written a PEP proposing a small enhancement to the Python loop
> > control statements. Short version: here's what feels to me like a
> > Pythonic way to spell "repeat until":
> >
> > while:
> >
> > break if
> >
> > The PEP goes into some detail on why this feels like a readability
> > improvement in the more general case, with examples taken from
> > the standard library:
> >
> > https://www.python.org/dev/peps/pep-0548/
>
> Is "break if" legal in loops that have their own conditions as well,
> or only in a bare "while:" loop? For instance, is this valid?
>
> while not found_the_thing_we_want:
>     data = sock.read()
>     break if not data
>     process(data)

Yes.

> Or this, which uses the condition purely as a descriptor:
>
> while "moar socket data":
>     data = sock.read()
>     break if not data
>     process(data)

Yes.

> Also - shouldn't this be being discussed first on python-ideas?

Yep, you are absolutely right. Someone has told me I also missed a related
discussion on python-ideas in my search for prior discussions. (I haven't
looked for it yet...)

I'll blame jet lag :)

--David

From benjamin at python.org  Wed Sep  6 12:42:29 2017
From: benjamin at python.org (Benjamin Peterson)
Date: Wed, 06 Sep 2017 09:42:29 -0700
Subject: [Python-Dev] Consolidate stateful runtime globals
In-Reply-To: <20170906121459.215159bc@fsol>
References: <20170906121459.215159bc@fsol>
Message-ID: <1504716149.433314.1097304088.73AD5975@webmail.messagingengine.com>

On Wed, Sep 6, 2017, at 03:14, Antoine Pitrou wrote:
>
> Hello,
>
> I'm a bit concerned about
> https://github.com/python/cpython/commit/76d5abc8684bac4f2fc7cccfe2cd940923357351
>
> My main gripe is that makes writing C code more tedious. Simple C
> global variables such as "_once_registry" are now spelled
> "_PyRuntime.warnings.once_registry". The most egregious example seems
> to be "_PyRuntime.ceval.gil.locked" (used to be simply "gil_locked").
>
> Granted, C is more verbose than Python, but it doesn't have to become
> that verbose. I don't know about you, but when code becomes annoying
> to type, I tend to try and take shortcuts.

How often are you actually typing the names of runtime globals, though?
If you are using globals, perhaps the typing time will allow you to
fully consider the gravity of the situation.

From barry at python.org  Wed Sep  6 12:44:38 2017
From: barry at python.org (Barry Warsaw)
Date: Wed, 6 Sep 2017 09:44:38 -0700
Subject: [Python-Dev] PEP 553: Built-in debug()
In-Reply-To: 
References: <7DDCE154-7596-414F-90C4-A2AE0618C728@python.org>
Message-ID: <3AA673BE-9B32-4ACB-AB4D-940A006C0A4D@python.org>

On Sep 6, 2017, at 07:46, Guido van Rossum wrote:
>
> IIRC they indeed insinuate debug() into the builtins. My suggestion is also breakpoint().

breakpoint() it is then! I'll rename the sys hooks too, but keep the naming
scheme of the existing sys hooks.

Cheers,
-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 273 bytes
Desc: Message signed with OpenPGP
URL: 

From guido at python.org  Wed Sep  6 12:45:37 2017
From: guido at python.org (Guido van Rossum)
Date: Wed, 6 Sep 2017 09:45:37 -0700
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To: 
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>
Message-ID: 

On Wed, Sep 6, 2017 at 9:15 AM, Ivan Levkivskyi wrote:

> On 6 September 2017 at 17:26, Guido van Rossum wrote:
>
>> Is there a real use case for @property? Otherwise, if we're going to mess
>> with modules' getattro, it makes more sense to add __getattr__, which would
>> have made Nathaniel's use case somewhat simpler. (Except for the __dir__
>> thing -- what else might we need?)
>
> One additional (IMO quite strong) argument in favor of module level
> __getattr__ is that this is already used by PEP 484 for stub files and is
> supported by mypy, see https://github.com/python/mypy/pull/3647

So we're looking for a competing PEP here. Shouldn't be long, just
summarize the discussion about use cases and generality here.

--
--Guido van Rossum (python.org/~guido)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From njs at pobox.com  Wed Sep  6 13:00:23 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 6 Sep 2017 10:00:23 -0700
Subject: [Python-Dev] PEP 553: Built-in debug()
In-Reply-To: <7DDCE154-7596-414F-90C4-A2AE0618C728@python.org>
References: <7DDCE154-7596-414F-90C4-A2AE0618C728@python.org>
Message-ID: 

On Wed, Sep 6, 2017 at 7:39 AM, Barry Warsaw wrote:
>> On Tue, Sep 5, 2017 at 7:58 PM, Nathaniel Smith wrote:
>> This would also avoid confusion with IPython's very
>> useful debug magic:
>> https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-debug
>> and which might also be worth stealing for the builtin REPL.
>> (Personally I use it way more often than set_trace().)
>
> Interesting. I'm not an IPython user. Do you think its %debug magic would benefit from PEP 553?

Not in particular. But if you're working on making debugger entry more
discoverable/human-friendly, then providing a friendlier alias for the
pdb.pm() semantics might be useful too?

Actually, if you look at the pdb docs, the 3 ways of entering the
debugger that merit demonstrations at the top of the manual page are:

pdb.run("...code...") # "I want to debug this code"
pdb.set_trace() # "break here"
pdb.pm() # "wtf just happened?"

The set_trace() name is particularly opaque, but if we're talking
about adding a friendly debugger abstraction layer then I'd at least
think about whether to make it cover all three of these.

-n

--
Nathaniel J. Smith -- https://vorpus.org
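Those three entry points in runnable form (post-mortem shown via
pdb.post_mortem() on the current traceback, since pdb.pm() relies on
sys.last_traceback, which only the interactive interpreter sets):

    import pdb
    import sys

    def buggy():
        return 1 / 0

    pdb.run("buggy()")      # "I want to debug this code" from the top

    def checkpoint():
        pdb.set_trace()     # "break here"

    try:
        buggy()
    except ZeroDivisionError:
        pdb.post_mortem(sys.exc_info()[2])   # "wtf just happened?"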
From k7hoven at gmail.com  Wed Sep  6 11:07:36 2017
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Wed, 6 Sep 2017 18:07:36 +0300
Subject: [Python-Dev] PEP 550 v4
In-Reply-To: <59AF9EA0.9020608@canterbury.ac.nz>
References: <20170828111903.GA3032@bytereef.org>
 <20170828155215.GA3763@bytereef.org> <20170828173327.GA2671@bytereef.org>
 <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz>
 <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz>
 <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz>
Message-ID: 

On Wed, Sep 6, 2017 at 10:07 AM, Greg Ewing wrote:

> Yury Selivanov wrote:
>
>> Greg, have you seen this new section:
>> https://www.python.org/dev/peps/pep-0550/#should-yield-from-leak-context-changes
>
> That section seems to be addressing the idea of a generator
> behaving differently depending on whether you use yield-from
> on it.

Regarding this, I think yield from should have the same semantics as
iterating over the generator with next/send, and PEP 555 has no issues
with this.

> I never suggested that, and I'm still not suggesting it.
>
>> The bottomline is that it's easier to
>> reason about context when it's guaranteed that context changes are
>> always isolated in generators no matter what.
>
> I don't see a lot of value in trying to automagically
> isolate changes to global state *only* in generators.
>
> Under PEP 550, if you want to e.g. change the decimal
> context temporarily in a non-generator function, you're
> still going to have to protect those changes using a
> with-statement or something equivalent. I don't see
> why the same thing shouldn't apply to generators.
>
> It seems to me that it will be *more* confusing to give
> generators this magical ability to avoid with-statements.

Exactly. To state it clearly: PEP 555 does not have this issue.

-- Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From solipsis at pitrou.net  Wed Sep  6 13:08:01 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 6 Sep 2017 19:08:01 +0200
Subject: [Python-Dev] Consolidate stateful runtime globals
In-Reply-To: <1504716149.433314.1097304088.73AD5975@webmail.messagingengine.com>
References: <20170906121459.215159bc@fsol>
 <1504716149.433314.1097304088.73AD5975@webmail.messagingengine.com>
Message-ID: <20170906190801.5121e323@fsol>

On Wed, 06 Sep 2017 09:42:29 -0700
Benjamin Peterson wrote:
> On Wed, Sep 6, 2017, at 03:14, Antoine Pitrou wrote:
> >
> > Hello,
> >
> > I'm a bit concerned about
> > https://github.com/python/cpython/commit/76d5abc8684bac4f2fc7cccfe2cd940923357351
> >
> > My main gripe is that makes writing C code more tedious. Simple C
> > global variables such as "_once_registry" are now spelled
> > "_PyRuntime.warnings.once_registry". The most egregious example seems
> > to be "_PyRuntime.ceval.gil.locked" (used to be simply "gil_locked").
> >
> > Granted, C is more verbose than Python, but it doesn't have to become
> > that verbose. I don't know about you, but when code becomes annoying
> > to type, I tend to try and take shortcuts.
>
> How often are you actually typing the names of runtime globals, though?

Not very often, but if I want to experiment with some low-level
implementation details, it is nice to avoid the hassle.

There's also a readability argument: with very long names, expressions
can become less easy to parse.
> If you are using globals, perhaps the typing time will allow you to
> fully consider the gravity of the situation.

Right, I needed to be reminded of how perilous the use of C globals is.
Perhaps I should contact the PSRT the next time I contemplate using a C
global.

Regards

Antoine.

From fabiofz at gmail.com  Wed Sep  6 13:14:47 2017
From: fabiofz at gmail.com (Fabio Zadrozny)
Date: Wed, 6 Sep 2017 14:14:47 -0300
Subject: [Python-Dev] PEP 553: Built-in debug()
In-Reply-To: <3AA673BE-9B32-4ACB-AB4D-940A006C0A4D@python.org>
References: <7DDCE154-7596-414F-90C4-A2AE0618C728@python.org>
 <3AA673BE-9B32-4ACB-AB4D-940A006C0A4D@python.org>
Message-ID: 

Hi Barry,

I think it's a nice idea.

Related to the name, on Windows C++ there's "DebugBreak":
https://msdn.microsoft.com/en-us/library/windows/desktop/ms679297(v=vs.85).aspx,
which I think is a better name (so, it'd be debug_break for Python -- I
think it's better than plain breakpoint(), and wouldn't clash as debug()).

For the PyDev.Debugger (https://github.com/fabioz/PyDev.Debugger), which is
the one used by PyDev & PyCharm, I think it would also work. For instance,
for adding the debugger in PyDev, there's a template completion that'll add
the debugger to the PYTHONPATH and start the remote debugger (same as
pdb.set_trace()):

i.e.: the 'pydevd' template expands to something like:

import sys;sys.path.append(r'path/to/ide/shipped_debugger/pysrc')
import pydevd;pydevd.settrace()

I think I could change the hook on a custom sitecustomize (there's already
one in place in PyDev) so that debug_break() would actually read some env
var to do that work (and provide some utility for users to pre-setup it
when not launching from inside the IDE).

Still, there may be other settings that the user needs to pass to
settrace() when doing a remote debug session -- i.e.: things such as the
host, port to connect, etc -- see:
https://github.com/fabioz/PyDev.Debugger/blob/master/pydevd.py#L1121,
so, maybe the debug_break() method should accept keyword arguments to pass
along to support other backends?

Cheers,

Fabio

On Wed, Sep 6, 2017 at 1:44 PM, Barry Warsaw wrote:

> On Sep 6, 2017, at 07:46, Guido van Rossum wrote:
> >
> > IIRC they indeed insinuate debug() into the builtins. My suggestion is
> also breakpoint().
>
> breakpoint() it is then! I'll rename the sys hooks too, but keep the
> naming scheme of the existing sys hooks.
>
> Cheers,
> -Barry
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/fabiofz%40gmail.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From solipsis at pitrou.net  Wed Sep  6 13:15:47 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 6 Sep 2017 19:15:47 +0200
Subject: [Python-Dev] Compiling without multithreading support -- still useful?
References: <20170905183651.6934a16d@fsol>
Message-ID: <20170906191547.2e454a23@fsol>

I made an experimental PR to remove support for threads-less builds:
https://github.com/python/cpython/pull/3385

The net effect is to remove almost 2000 lines of code (including many
#ifdef sections in C code).

Regards

Antoine.

On Tue, 5 Sep 2017 18:36:51 +0200
Antoine Pitrou wrote:
> Hello,
>
> It's 2017 and we are still allowing people to compile CPython without
> threads support.
It adds some complication in several places > (including delicate parts of our internal C code) without a clear > benefit. Do people still need this? > > Regards > > Antoine. > > From guido at python.org Wed Sep 6 13:16:35 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 6 Sep 2017 10:16:35 -0700 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: References: <20170828111903.GA3032@bytereef.org> <20170828155215.GA3763@bytereef.org> <20170828173327.GA2671@bytereef.org> <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz> <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz> <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz> Message-ID: On Wed, Sep 6, 2017 at 8:07 AM, Koos Zevenhoven wrote: > I think yield from should have the same semantics as iterating over the > generator with next/send, and PEP 555 has no issues with this. > I think the onus is on you and Greg to show a realistic example that shows why this is necessary. So far all the argumentation about this has been of the form "if you have code that currently does this (example using foo) and you refactor it in using yield from (example using bar), and if you were relying on context propagation back out of calls, then it should still propagate out." This feels like a very abstract argument. I have a feeling that context state propagating out of a call is used relatively rarely -- it must work for cases where you refactor something that changes context inline into a utility function (e.g. decimal.setcontext()), but I just can't think of a realistic example where coroutines (either of the yield-from variety or of the async/def form) would be used for such a utility function. A utility function that sets context state but also makes a network call just sounds like asking for trouble! -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Wed Sep 6 13:17:06 2017 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 06 Sep 2017 10:17:06 -0700 Subject: [Python-Dev] Consolidate stateful runtime globals In-Reply-To: <20170906190801.5121e323@fsol> References: <20170906121459.215159bc@fsol> <1504716149.433314.1097304088.73AD5975@webmail.messagingengine.com> <20170906190801.5121e323@fsol> Message-ID: <1504718226.444156.1097348696.6F266B12@webmail.messagingengine.com> On Wed, Sep 6, 2017, at 10:08, Antoine Pitrou wrote: > On Wed, 06 Sep 2017 09:42:29 -0700 > Benjamin Peterson wrote: > > On Wed, Sep 6, 2017, at 03:14, Antoine Pitrou wrote: > > > > > > Hello, > > > > > > I'm a bit concerned about > > > https://github.com/python/cpython/commit/76d5abc8684bac4f2fc7cccfe2cd940923357351 > > > > > > My main gripe is that makes writing C code more tedious. Simple C > > > global variables such as "_once_registry" are now spelled > > > "_PyRuntime.warnings.once_registry". The most egregious example seems > > > to be "_PyRuntime.ceval.gil.locked" (used to be simply "gil_locked"). > > > > > > Granted, C is more verbose than Python, but it doesn't have to become > > > that verbose. I don't know about you, but when code becomes annoying > > > to type, I tend to try and take shortcuts. > > > > How often are you actually typing the names of runtime globals, though? > > Not very often, but if I want to experiment with some low-level > implementation details, it is nice to avoid the hassle. 
It seems like this could be remediated with some inline functions or macros, which would also help safely encapsulate state. > > There's also a readability argument: with very long names, expressions > can become less easy to parse. > > > If you are using a globals, perhaps the typing time will allow you to > > fully consider the gravity of the situation. > > Right, I needed to be reminded of how perilous the use of C globals is. > Perhaps I should contact the PSRT the next time I contemplate using a C > global. It's not just you but future readers. From guido at python.org Wed Sep 6 13:19:48 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 6 Sep 2017 10:19:48 -0700 Subject: [Python-Dev] PEP 553: Built-in debug() In-Reply-To: References: <7DDCE154-7596-414F-90C4-A2AE0618C728@python.org> Message-ID: 99% of the time I use a debugger I use pdb.set_trace(). The pm() stuff is typically useful for debugging small, simple programs only -- complex programs likely hide the exception somewhere (after logging it) so there's nothing for pdb.pm() to look at. I think Barry is wisely focusing on just the ability to quickly and programmatically insert a breakpoint. On Wed, Sep 6, 2017 at 10:00 AM, Nathaniel Smith wrote: > On Wed, Sep 6, 2017 at 7:39 AM, Barry Warsaw wrote: > >> On Tue, Sep 5, 2017 at 7:58 PM, Nathaniel Smith wrote: > >> This would also avoid confusion with IPython's very > >> useful debug magic: > >> https://ipython.readthedocs.io/en/stable/interactive/ > magics.html#magic-debug > >> and which might also be worth stealing for the builtin REPL. > >> (Personally I use it way more often than set_trace().) > > > > Interesting. I?m not an IPython user. Do you think its %debug magic > would benefit from PEP 553? > > Not in particular. But if you're working on making debugger entry more > discoverable/human-friendly, then providing a friendlier alias for the > pdb.pm() semantics might be useful too? > > Actually, if you look at the pdb docs, the 3 ways of entering the > debugger that merit demonstrations at the top of the manual page are: > > pdb.run("...code...") # "I want to debug this code" > pdb.set_trace() # "break here" > pdb.pm() # "wtf just happened?" > > The set_trace() name is particularly opaque, but if we're talking > about adding a friendly debugger abstraction layer then I'd at least > think about whether to make it cover all three of these. > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From cody.piersall at gmail.com Wed Sep 6 13:13:52 2017 From: cody.piersall at gmail.com (Cody Piersall) Date: Wed, 6 Sep 2017 12:13:52 -0500 Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties) In-Reply-To: References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org> Message-ID: On Wed, Sep 6, 2017 at 10:26 AM, Guido van Rossum wrote: > > So we've seen a real use case for __class__ assignment: deprecating things on access. That use case could also be solved if modules natively supported defining __getattr__ (with the same "only used if attribute not found otherwise" semantics as it has on classes), but it couldn't be solved using @property (or at least it would be quite hacky). 
> > Is there a real use case for @property? Otherwise, if we're going to mess with module's getattro, it makes more sense to add __getattr__, which would have made Nathaniel's use case somewhat simpler. (Except for the __dir__ thing -- what else might we need?) > -- > --Guido van Rossum (python.org/~guido) I think a more natural way for the __dir__ problem would be to update module_dir() in moduleobject.c to check if __all__ is defined and then just return that list if it is defined. I think that would be a friendlier default for __dir__ anyway. Cody From yselivanov.ml at gmail.com Wed Sep 6 13:22:29 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Wed, 6 Sep 2017 10:22:29 -0700 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: References: <20170828111903.GA3032@bytereef.org> <20170828155215.GA3763@bytereef.org> <20170828173327.GA2671@bytereef.org> <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz> <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz> <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz> Message-ID: On Wed, Sep 6, 2017 at 8:07 AM, Koos Zevenhoven wrote: > On Wed, Sep 6, 2017 at 10:07 AM, Greg Ewing > wrote: >> >> Yury Selivanov wrote: >>> >>> Greg, have you seen this new section: >>> >>> https://www.python.org/dev/peps/pep-0550/#should-yield-from-leak-context-changes >> >> >> That section seems to be addressing the idea of a generator >> behaving differently depending on whether you use yield-from >> on it. > > > Regarding this, I think yield from should have the same semantics as > iterating over the generator with next/send, and PEP 555 has no issues with > this. > >> >> >> I never suggested that, and I'm still not suggesting it. >> >>> The bottomline is that it's easier to >>> reason about context when it's guaranteed that context changes are >>> always isolated in generators no matter what. >> >> >> I don't see a lot of value in trying to automagically >> isolate changes to global state *only* in generators. >> >> Under PEP 550, if you want to e.g. change the decimal >> context temporarily in a non-generator function, you're >> still going to have to protect those changes using a >> with-statement or something equivalent. I don't see >> why the same thing shouldn't apply to generators. >> >> It seems to me that it will be *more* confusing to give >> generators this magical ability to avoid with-statements. >> > > Exactly. To state it clearly: PEP 555 does not have this issue. It would be great if you or Greg could show a couple of real-world examples showing the "issue" (with the current PEP 550 APIs/semantics). PEP 550 treats coroutines and generators as objects that support out of order execution. OS threads are similar to them in some ways. I find it questionable to try to enforce context management rules we have for regular functions to generators/coroutines. I don't really understand the "refactoring" argument you and Greg are talking about all the time. PEP 555 still doesn't clearly explain how exactly it is different from PEP 550. Because 555 was posted *after* 550, I think that it's PEP 555 that should have that comparison. 
Yury From guido at python.org Wed Sep 6 12:43:53 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 6 Sep 2017 09:43:53 -0700 Subject: [Python-Dev] PEP 548: More Flexible Loop Control In-Reply-To: <20170906163401.2EDA01B10001@webabinitio.net> References: <20170906001100.5F9DB1B10001@webabinitio.net> <20170906163401.2EDA01B10001@webabinitio.net> Message-ID: I'm actually not in favor of this. It's another way to do the same thing. Sorry to rain on your dream! On Wed, Sep 6, 2017 at 9:34 AM, R. David Murray wrote: > On Wed, 06 Sep 2017 15:05:51 +1000, Chris Angelico > wrote: > > On Wed, Sep 6, 2017 at 10:11 AM, R. David Murray > wrote: > > > I've written a PEP proposing a small enhancement to the Python loop > > > control statements. Short version: here's what feels to me like a > > > Pythonic way to spell "repeat until": > > > > > > while: > > > > > > break if > > > > > > The PEP goes into some detail on why this feels like a readability > > > improvement in the more general case, with examples taken from > > > the standard library: > > > > > > https://www.python.org/dev/peps/pep-0548/ > > > > Is "break if" legal in loops that have their own conditions as well, > > or only in a bare "while:" loop? For instance, is this valid? > > > > while not found_the_thing_we_want: > > data = sock.read() > > break if not data > > process(data) > > Yes. > > > Or this, which uses the condition purely as a descriptor: > > > > while "moar socket data": > > data = sock.read() > > break if not data > > process(data) > > Yes. > > > Also - shouldn't this be being discussed first on python-ideas? > > Yep, you are absolutely right. Someone has told me I also missed > a related discussion on python-ideas in my searching for prior > discussions. (I haven't looked for it yet...) > > I'll blame jet lag :) > > --David > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at ethanhs.me Wed Sep 6 13:50:11 2017 From: ethan at ethanhs.me (Ethan Smith) Date: Wed, 6 Sep 2017 10:50:11 -0700 Subject: [Python-Dev] Compiling without multithreading support -- still useful? In-Reply-To: <20170906191547.2e454a23@fsol> References: <20170905183651.6934a16d@fsol> <20170906191547.2e454a23@fsol> Message-ID: I think this is useful as it can make porting easier. I am using it in my attempts to cross compile CPython to WebAssembly (since WebAssembly in its MVP does not support threading). On Wed, Sep 6, 2017 at 10:15 AM, Antoine Pitrou wrote: > > I made an experimental PR to remove support for threads-less builds: > https://github.com/python/cpython/pull/3385 > > The next effect is to remove almost 2000 lines of code (including many > #ifdef sections in C code). > > Regards > > Antoine. > > > On Tue, 5 Sep 2017 18:36:51 +0200 > Antoine Pitrou wrote: > > Hello, > > > > It's 2017 and we are still allowing people to compile CPython without > > threads support. It adds some complication in several places > > (including delicate parts of our internal C code) without a clear > > benefit. Do people still need this? > > > > Regards > > > > Antoine. 
> > > >
> > > > _______________________________________________
> > Python-Dev mailing list
> > Python-Dev at python.org
> > https://mail.python.org/mailman/listinfo/python-dev
> > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ethan%40ethanhs.me
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: 

From solipsis at pitrou.net  Wed Sep  6 14:13:53 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 6 Sep 2017 20:13:53 +0200
Subject: [Python-Dev] Compiling without multithreading support -- still useful?
In-Reply-To: 
References: <20170905183651.6934a16d@fsol> <20170906191547.2e454a23@fsol>
Message-ID: <20170906201353.65da3276@fsol>

On Wed, 6 Sep 2017 10:50:11 -0700
Ethan Smith wrote:
> I think this is useful as it can make porting easier. I am using it in my
> attempts to cross compile CPython to WebAssembly (since WebAssembly in its
> MVP does not support threading).

The problem is that the burden of maintenance falls on us (core CPython
developers), while we and probably 99.99% of our userbase have
absolutely no use for the "functionality".

Perhaps there's a simpler, cruder way to "support" threads-less
platforms. For example a Python/thread_nothreads.h where
PyThread_start_new_thread() would always fail (and with trivial
implementations of locks and TLS keys). But I'm not sure how much it
would help those porting attempts.

Regards

Antoine.

From ethan at ethanhs.me  Wed Sep  6 14:23:29 2017
From: ethan at ethanhs.me (Ethan Smith)
Date: Wed, 6 Sep 2017 11:23:29 -0700
Subject: [Python-Dev] Compiling without multithreading support -- still useful?
In-Reply-To: <20170906201353.65da3276@fsol>
References: <20170905183651.6934a16d@fsol> <20170906191547.2e454a23@fsol>
 <20170906201353.65da3276@fsol>
Message-ID: 

Certainly, I understand it can be burdensome. I suppose I can use the 3.6
branch for the initial port, so it shouldn't be an issue.

On Wed, Sep 6, 2017 at 11:13 AM, Antoine Pitrou wrote:

> On Wed, 6 Sep 2017 10:50:11 -0700
> Ethan Smith wrote:
> > I think this is useful as it can make porting easier. I am using it in my
> > attempts to cross compile CPython to WebAssembly (since WebAssembly in
> its
> > MVP does not support threading).
>
> The problem is that the burden of maintenance falls on us (core CPython
> developers), while we and probably 99.99% of our userbase have
> absolutely no use for the "functionality".
>
> Perhaps there's a simpler, cruder way to "support" threads-less
> platforms. For example a Python/thread_nothreads.h where
> PyThread_start_new_thread() would always fail (and with trivial
> implementations of locks and TLS keys). But I'm not sure how much it
> would help those porting attempts.
>
> Regards
>
> Antoine.
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/ethan%40ethanhs.me

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From greg at krypto.org  Wed Sep  6 16:06:04 2017
From: greg at krypto.org (Gregory P. Smith)
Date: Wed, 06 Sep 2017 20:06:04 +0000
Subject: [Python-Dev] Compiling without multithreading support -- still useful?
In-Reply-To: 
References: <20170905183651.6934a16d@fsol> <20170906191547.2e454a23@fsol>
 <20170906201353.65da3276@fsol>
Message-ID: 

My take on platforms without thread support is that they should provide
their own fake/green/virtual threading APIs.
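The stdlib already ships one crude version of that idea: dummy_threading,
a drop-in, single-threaded stand-in for threading. A minimal sketch of the
documented fallback pattern:

    try:
        import threading
    except ImportError:
        # Single-threaded stand-in with the same API; "threads" simply
        # run to completion when start() is called.
        import dummy_threading as threading

    t = threading.Thread(target=lambda: print("running"))
    t.start()
    t.join()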
I don't know how practical that thought actually is for things like web assembly but I'm with Antoine here. The maintenance burden for --without-threads builds is a pain I'd love to avoid. -G On Wed, Sep 6, 2017 at 11:49 AM Ethan Smith wrote: > Certainly, I understand it can be burdensome. I suppose I can use 3.6 > branch for the initial port, so it shouldn't be an issue. > > On Wed, Sep 6, 2017 at 11:13 AM, Antoine Pitrou > wrote: > >> On Wed, 6 Sep 2017 10:50:11 -0700 >> Ethan Smith wrote: >> > I think this is useful as it can make porting easier. I am using it in >> my >> > attempts to cross compile CPython to WebAssembly (since WebAssembly in >> its >> > MVP does not support threading). >> >> The problem is that the burden of maintenance falls on us (core CPython >> developers), while none of us and probably 99.99% of our userbase have >> absolutely no use for the "functionality". >> >> Perhaps there's a simpler, cruder way to "support" threads-less >> platforms. For example a Python/thread_nothreads.h where >> PyThread_start_new_thread() would always fail (and with trivial >> implementations of locks and TLS keys). But I'm not sure how much it >> would help those porting attempts. >> > >> Regards >> >> Antoine. >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> > Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/ethan%40ethanhs.me >> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/greg%40krypto.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Wed Sep 6 16:18:52 2017 From: greg at krypto.org (Gregory P. Smith) Date: Wed, 06 Sep 2017 20:18:52 +0000 Subject: [Python-Dev] Consolidate stateful runtime globals In-Reply-To: <1504718226.444156.1097348696.6F266B12@webmail.messagingengine.com> References: <20170906121459.215159bc@fsol> <1504716149.433314.1097304088.73AD5975@webmail.messagingengine.com> <20170906190801.5121e323@fsol> <1504718226.444156.1097348696.6F266B12@webmail.messagingengine.com> Message-ID: On Wed, Sep 6, 2017 at 10:26 AM Benjamin Peterson wrote: > > On Wed, Sep 6, 2017, at 10:08, Antoine Pitrou wrote: > > On Wed, 06 Sep 2017 09:42:29 -0700 > > Benjamin Peterson wrote: > > > On Wed, Sep 6, 2017, at 03:14, Antoine Pitrou wrote: > > > > > > > > Hello, > > > > > > > > I'm a bit concerned about > > > > > https://github.com/python/cpython/commit/76d5abc8684bac4f2fc7cccfe2cd940923357351 > > > > > > > > My main gripe is that makes writing C code more tedious. Simple C > > > > global variables such as "_once_registry" are now spelled > > > > "_PyRuntime.warnings.once_registry". The most egregious example > seems > > > > to be "_PyRuntime.ceval.gil.locked" (used to be simply "gil_locked"). > > > > > > > > Granted, C is more verbose than Python, but it doesn't have to become > > > > that verbose. I don't know about you, but when code becomes annoying > > > > to type, I tend to try and take shortcuts. > > > > > > How often are you actually typing the names of runtime globals, though? > > > > Not very often, but if I want to experiment with some low-level > > implementation details, it is nice to avoid the hassle. 
> It seems like this could be remediated with some inline functions or
> macros, which would also help safely encapsulate state.
>
> > There's also a readability argument: with very long names, expressions
> > can become less easy to parse.
>
> > > If you are using globals, perhaps the typing time will allow you to
> > > fully consider the gravity of the situation.
> >
> > Right, I needed to be reminded of how perilous the use of C globals is.
> > Perhaps I should contact the PSRT the next time I contemplate using a C
> > global.
>
> It's not just you but future readers.

I'm not concerned about moving things into a state structure rather than
wildly scattered globals declared all over the place. It is good code
hygiene. It ultimately moves us closer (much more work to be done) to being
able to actually have multiple independent interpreters within the same
process (including potentially even of different Python versions).

For commonly typed things that get annoying,

#define _Py_grail _PyRuntime.ceval.holy.grail

within the .c source file that does a lot of grail flinging seems fine to me.

-gps

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From berker.peksag at gmail.com  Wed Sep  6 16:19:08 2017
From: berker.peksag at gmail.com (Berker Peksağ)
Date: Wed, 6 Sep 2017 23:19:08 +0300
Subject: [Python-Dev] Compiling without multithreading support -- still useful?
In-Reply-To: 
References: <20170905183651.6934a16d@fsol>
Message-ID: 

On Tue, Sep 5, 2017 at 7:42 PM, Victor Stinner wrote:
> I'm strongly in favor of dropping this option from Python 3.7. It
> would remove a lot of code!

+1

Do we still have buildbots for testing the --without-threads option?

--Berker

From barry at python.org  Wed Sep  6 16:33:12 2017
From: barry at python.org (Barry Warsaw)
Date: Wed, 6 Sep 2017 13:33:12 -0700
Subject: [Python-Dev] PEP 553: Built-in debug()
In-Reply-To: 
References: <7DDCE154-7596-414F-90C4-A2AE0618C728@python.org>
Message-ID: <7107C96F-DB98-4CBB-B44B-FE82FC9C48A0@python.org>

On Sep 6, 2017, at 10:19, Guido van Rossum wrote:
>
> 99% of the time I use a debugger I use pdb.set_trace(). The pm() stuff is typically useful for debugging small, simple programs only -- complex programs likely hide the exception somewhere (after logging it) so there's nothing for pdb.pm() to look at. I think Barry is wisely focusing on just the ability to quickly and programmatically insert a breakpoint.

Thanks Guido, that's my thinking exactly. pdb isn't going away of course,
so those less common use cases are still always available.

Cheers,
-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 273 bytes
Desc: Message signed with OpenPGP
URL: 

From solipsis at pitrou.net  Wed Sep  6 16:39:43 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 6 Sep 2017 22:39:43 +0200
Subject: [Python-Dev] Consolidate stateful runtime globals
In-Reply-To: 
References: <20170906121459.215159bc@fsol>
 <1504716149.433314.1097304088.73AD5975@webmail.messagingengine.com>
 <20170906190801.5121e323@fsol>
 <1504718226.444156.1097348696.6F266B12@webmail.messagingengine.com>
Message-ID: <20170906223943.27d9eaf9@fsol>

On Wed, 06 Sep 2017 20:18:52 +0000
"Gregory P. Smith" wrote:
>
> For commonly typed things that get annoying,
>
> #define _Py_grail _PyRuntime.ceval.holy.grail
>
> within the .c source file that does a lot of grail flinging seems fine to
> me.

That sounds fine to me too. Thank you!

Regards

Antoine.
From k7hoven at gmail.com  Wed Sep  6 16:39:56 2017
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Wed, 6 Sep 2017 23:39:56 +0300
Subject: [Python-Dev] PEP 550 v4
In-Reply-To: 
References: <20170828111903.GA3032@bytereef.org>
 <20170828155215.GA3763@bytereef.org> <20170828173327.GA2671@bytereef.org>
 <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz>
 <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz>
 <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz>
Message-ID: 

On Wed, Sep 6, 2017 at 8:16 PM, Guido van Rossum wrote:

> On Wed, Sep 6, 2017 at 8:07 AM, Koos Zevenhoven wrote:
>
>> I think yield from should have the same semantics as iterating over the
>> generator with next/send, and PEP 555 has no issues with this.
>
> I think the onus is on you and Greg to show a realistic example that shows
> why this is necessary.

Well, regarding this part, it's just that things like

    for obj in gen:
        yield obj

often get modernized into

    yield from gen

And realistic examples of that include pretty much any normal use of
yield from.

> So far all the argumentation about this has been of the form "if you have
> code that currently does this (example using foo) and you refactor it in
> using yield from (example using bar), and if you were relying on context
> propagation back out of calls, then it should still propagate out."

So here's a realistic example, with the semantics of PEP 550 applied to a
decimal.setcontext() kind of thing, but it could be anything using
var.set(value):

    def process_data_buffers(buffers):
        setcontext(default_context)
        for buf in buffers:
            for data in buf:
                if data.tag == "NEW_PRECISION":
                    setcontext(context_based_on(data))
                else:
                    yield compute(data)

Code smells? Yes, but maybe you often see much worse things, so let's say
it's fine.

But then, if you refactor it into a subgenerator like this:

    def process_data_buffer(buffer):
        for data in buffer:
            if data.tag == "NEW_PRECISION":
                setcontext(context_based_on(data))
            else:
                yield compute(data)

    def process_data_buffers(buffers):
        setcontext(default_context)
        for buf in buffers:
            yield from process_data_buffer(buf)

Now, if setcontext uses PEP 550 semantics, the refactoring broke the code,
because a generator introduces a scope barrier by adding a LogicalContext
on the stack, and setcontext is only local to the process_data_buffer
subroutine. But the programmer is puzzled, because with regular functions
it had worked just fine in a similar situation before they learned about
generators:

    def process_data_buffer(buffer, output):
        for data in buffer:
            if data.tag == "NEW_PRECISION":
                setcontext(context_based_on(data))
            else:
                output.append(compute(data))

    def process_data_buffers(buffers):
        output = []
        setcontext(default_context)
        for buf in buffers:
            process_data_buffer(buf, output)
        return output

In fact, this code had another problem, namely that the context state is
leaked out of process_data_buffers, because PEP 550 leaks context state out
of functions, but not out of generators. But we can easily imagine that the
unit tests for process_data_buffers *do* pass.

But let's look at a user of the functionality:

    def get_total():
        return sum(process_data_buffers(get_buffers()))

    setcontext(somecontext)
    value = get_total() * compute_factor()

Now the code is broken, because setcontext(somecontext) has no effect,
because get_total() leaks out another context. Not to mention that our data
buffer source now has control over the behavior of compute_factor().
From v+python at g.nevcal.com Wed Sep 6 16:30:29 2017
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Wed, 6 Sep 2017 13:30:29 -0700
Subject: [Python-Dev] Consolidate stateful runtime globals

On 9/6/2017 1:18 PM, Gregory P. Smith wrote:
> I'm not concerned about moving things into a state structure rather than
> wildly scattered globals declared all over the place. It is good code
> hygiene. It ultimately moves us closer (much more work to be done) to
> being able to actually have multiple independent interpreters within the
> same process (including potentially even of different Python versions).
>
> For commonly typed things that get annoying,
>
>     #define _Py_grail _PyRuntime.ceval.holy.grail
>
> within the .c source file that does a lot of grail flinging seems fine
> to me.
>
> -gps

You just need a PEP 550 (or 555) to use instead of C globals. But why would
you ever want multiple Python versions in one process? Sounds like a debug
headache in the making. Name collisions would abound for libraries and
functions even if globals were cured!

From k7hoven at gmail.com Wed Sep 6 16:57:41 2017
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Wed, 6 Sep 2017 23:57:41 +0300
Subject: [Python-Dev] PEP 550 v4

On Wed, Sep 6, 2017 at 8:22 PM, Yury Selivanov wrote:
[...]
> PEP 550 treats coroutines and generators as objects that support out
> of order execution.

"Out of order"? More like interleaved.

> PEP 555 still doesn't clearly explain how exactly it is different from
> PEP 550. Because 555 was posted *after* 550, I think that it's PEP
> 555 that should have that comparison.

555 was *posted* as a PEP after 550, yes. And yes, there could be a
comparison, especially now that PEP 550 semantics seem to have converged, so
PEP 555 does not have to adapt the comparison to PEP 550 changes.

-- Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +
From k7hoven at gmail.com Wed Sep 6 17:19:58 2017
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Thu, 7 Sep 2017 00:19:58 +0300
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)

On Wed, Sep 6, 2017 at 1:52 AM, Nathaniel Smith wrote:
[...]
> import sys, types
> class _MyModuleType(types.ModuleType):
>     @property
>     def ...
>
>     @property
>     def ...
> sys.modules[__name__].__class__ = _MyModuleType
>
> It's definitely true though that they're not the most obvious lines of
> code :-)

It would kind of be in line with the present behavior if you could simply
write something like this in the module:

    class __class__(types.ModuleType):
        @property
        def hello(self):
            return "hello"

        def __dir__(self):
            return ["hello"]

assuming it would be equivalent to setting __class__ afterwards.

--Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +

From mariatta.wijaya at gmail.com Wed Sep 6 17:59:52 2017
From: mariatta.wijaya at gmail.com (Mariatta Wijaya)
Date: Wed, 6 Sep 2017 14:59:52 -0700
Subject: [Python-Dev] Cherry picker bot deployed in CPython repo

Update:

Older merged PRs not yet backported?
==============================

A core developer can re-apply the `needs backport ..` label to trigger the
backport. Meaning, remove the existing label, then add it back. If there was
no label and now you want it to be backported, adding the label will also
trigger the backport.

Don't want a PR to be backported by a bot?
================================

Close the backport PR made by Miss Islington and make your own backport PR.

Thanks!

Mariatta Wijaya

On Tue, Sep 5, 2017 at 6:10 PM, Mariatta Wijaya wrote:
> Hi,
>
> The cherry picker bot has just been deployed to CPython repo, codenamed
> miss-islington.
>
> miss-islington made the very first backport PR for CPython and became a
> first time GitHub contributor: https://github.com/python/cpython/pull/3369
>
> GitHub repo: https://github.com/python/miss-islington
>
> What is this?
> ==========
>
> As part of our workflow, quite often changes made on the master branch
> need to be backported to the earlier versions (for example: from master to
> 3.6 and 2.7).
>
> Previously the backport had to be done manually by either a core developer
> or the original PR author.
>
> With the bot, the backport PR is created automatically after the PR has
> been merged. A core developer will need to review the backport PR.
>
> The issue was tracked in https://github.com/python/core-workflow/issues/8
>
> How it works
> ==========
>
> 1. If a PR needs to be backported to one of the maintenance branches, a
> core developer should apply the "needs backport to X.Y" label. Do this
> **before** you merge the PR.
>
> 2. Merge the PR.
>
> 3. miss-islington will leave a comment on the PR, saying it is working on
> backporting the PR.
>
> 4. If there's no merge conflict, the PR should be created momentarily.
>
> 5. Review the backport PR created by miss-islington and merge it when
> you're ready.
>
> Merge Conflicts / Problems?
> ======================
>
> In case of merge conflicts, or if a backport PR was not created within 2
> minutes, it likely failed and you should do the backport manually.
>
> Manual backport can be done using cherry_picker:
> https://pypi.org/project/cherry-picker/
>
> Older merged PRs not yet backported?
> ==============================
>
> At the moment, those need to be backported manually.
>
> Don't want a PR to be backported by a bot?
> ================================
>
> My recommendation is to apply the "needs backport to X.Y" label **after**
> the PR has been merged. The label is still useful to remind ourselves that
> this PR still needs backporting.
>
> Who is Miss Islington?
> =================
>
> I found out from Wikipedia that Miss Islington is the name of the witch in
> Monty Python and The Holy Grail.
>
> miss-islington has not signed the CLA!
> =============================
>
> A core dev can ignore the warning and merge the PR anyway.
>
> Thanks!
>
> Mariatta Wijaya

From tjreedy at udel.edu Wed Sep 6 17:59:52 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 6 Sep 2017 17:59:52 -0400
Subject: [Python-Dev] PEP 553: Built-in debug()

On 9/6/2017 10:42 AM, Barry Warsaw wrote:
> I don't think that's a good idea. pdb is a thing, and that thing is the
> standard library debugger. I don't think "pdb" should be the term we use
> to describe a generic Python debugger interface. That to me is one of the
> advantages of PEP 553; it separates the act of invoking the debugger from
> the actual debugger so invoked.

I tried inserting import pdb; pdb.set_trace() into a simple file and running
it from an IDLE editor. It works fine, using IDLE's shell as the console.
(The only glitch is that (Pdb) q raises bdb.BdbQuit, which causes Shell to
restart.) So I expect the proposed breakpoint() to work similarly.

IDLE has a gui-based debugger based on its Idb subclass of bdb.Bdb. I took a
first look into whether IDLE could set its debugger as the breakpoint()
handler and concluded maybe, with some work, if some problems are solved.

Currently, the debugger is started in response to a menu selection in the
IDLE process while the python process is idle. One reason for the 'idle'
requirement is that when code is exec-uting, the loop that reads commands,
executes them, and sends responses is blocked on the exec call. The IDLE
process sets up its debugger window, its ends of the rpc channels, and
commands the python process to set up Idb and the other ends of the
channels. The challenge would be to initiate setup from the server process
and deal with the blocked loop.

-- Terry Jan Reedy

From ronaldoussoren at mac.com Wed Sep 6 17:13:39 2017
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Wed, 06 Sep 2017 23:13:39 +0200
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)

> On 6 Sep 2017, at 00:03, Larry Hastings wrote:
>
> I've written a PEP proposing a language change:
> https://www.python.org/dev/peps/pep-0549/
> The TL;DR summary: add support for property objects to modules. I've
> already posted a prototype.
>
> How's that sound?

To be honest this sounds like a fairly crude hack. Updating the __class__ of
a module object feels dirty, but at least you get normal behavior w.r.t.
properties. Why is there no mechanism to add new descriptors that can work
in this context?

BTW, the interaction with import is interesting:
module properties only work as naive users expect when accessing them as
attributes of the module object. In particular, importing the name using
"from module import prop" would only call the property getter once, and that
may not be the intended behavior.

Ronald
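Ronald's last point can be made concrete. Here is a self-contained sketch
(the module object and its "counter" property are invented for illustration)
built on the __class__-assignment approach quoted earlier in the thread; the
last two lines show why "from mod import counter" surprises:

    import types

    mod = types.ModuleType("mod")  # stand-in for a real module object

    class _Mod(types.ModuleType):
        _calls = 0

        @property
        def counter(self):
            type(self)._calls += 1
            return type(self)._calls

    mod.__class__ = _Mod  # same trick as sys.modules[__name__].__class__ = ...

    snapshot = mod.counter           # what "from mod import counter" would bind
    print(snapshot)                  # 1 -- a frozen value
    print(mod.counter, mod.counter)  # 2 3 -- the getter runs on every access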
From barry at python.org Wed Sep 6 18:24:20 2017
From: barry at python.org (Barry Warsaw)
Date: Wed, 6 Sep 2017 15:24:20 -0700
Subject: [Python-Dev] PEP 553: Built-in debug()
Message-ID: <4AEEEC14-2FF3-4A48-962D-1293B8B0A8FC@python.org>

On Sep 6, 2017, at 10:14, Fabio Zadrozny wrote:
>
> I think it's a nice idea.

Great!

> Related to the name, on Windows C++ there's "DebugBreak":
> https://msdn.microsoft.com/en-us/library/windows/desktop/ms679297(v=vs.85).aspx,
> which I think is a better name (so, it'd be debug_break for Python -- I
> think it's better than plain breakpoint(), and wouldn't clash the way
> debug() would).

It's important to understand that a new built-in is technically never going
to clash with existing code, regardless of what it's called. Given Python's
name resolution rules, if your code uses the same name, it'll just shadow
the built-in. That's one big reason why the PEP proposed a built-in rather
than, say, a keyword.

That said, while I like the more succinct `debug()` name, Guido prefers
`breakpoint()` and that works fine for me.

> I think I could change the hook on a custom sitecustomize (there's already
> one in place in PyDev) so that the debug_break() would actually read some
> env var to do that work (and provide some utility for users to pre-setup
> it when not launching from inside the IDE).

I had meant to add an open issue about the idea of adding an environment
variable, such as $PYTHONBREAKPOINTHOOK, which could be set to the callable
to bind to sys.breakpointhook(). I've now added that to PEP 553, and I think
that handles the use case you outline above.

> Still, there may be other settings that the user needs to pass to
> settrace() when doing a remote debug session -- i.e.: things such as the
> host, port to connect, etc -- see:
> https://github.com/fabioz/PyDev.Debugger/blob/master/pydevd.py#L1121, so,
> maybe the debug_break() method should accept keyword arguments to pass
> along to support other backends?

Possibly, and there's an open issue about that, but I'm skeptical about it
for your use case. Will a user setting a breakpoint know what to pass there?

Cheers,
-Barry

From barry at python.org Wed Sep 6 18:45:22 2017
From: barry at python.org (Barry Warsaw)
Date: Wed, 6 Sep 2017 15:45:22 -0700
Subject: [Python-Dev] PEP 553: Built-in debug()
Message-ID: <03B5AA26-0AE0-4EFB-9300-B71F9ECD3AE8@python.org>

On Sep 6, 2017, at 14:59, Terry Reedy wrote:
>
> Currently, the debugger is started in response to a menu selection in the
> IDLE process while the python process is idle. One reason for the 'idle'
> requirement is that when code is exec-uting, the loop that reads commands,
> executes them, and sends responses is blocked on the exec call. The IDLE
> process sets up its debugger window, its ends of the rpc channels, and
> commands the python process to set up Idb and the other ends of the
> channels. The challenge would be to initiate setup from the server process
> and deal with the blocked loop.

Would the environment variable idea in the latest version of the PEP help
you here?

Cheers,
-Barry

From guido at python.org Wed Sep 6 18:50:02 2017
From: guido at python.org (Guido van Rossum)
Date: Wed, 6 Sep 2017 15:50:02 -0700
Subject: [Python-Dev] PEP 550 v4

On Wed, Sep 6, 2017 at 1:39 PM, Koos Zevenhoven wrote:
> Well, regarding this part, it's just that things like
>
>     for obj in gen:
>         yield obj
>
> often get modernized into
>
>     yield from gen

I know that that's the pattern, but everybody just shows the same foo/bar
example.

> And realistic examples of that include pretty much any normal use of
> yield from.

There aren't actually any "normal" uses of yield from. The vast majority of
uses of yield from are in coroutines written using yield from.

> So here's a realistic example, with the semantics of PEP 550 applied to a
> decimal.setcontext() kind of thing, but it could be anything using
> var.set(value):
>
> [...]
>
> Now this was of course a completely fictional example, and hopefully I
> didn't introduce any bugs or syntax errors other than the ones I
> described. I haven't seen code like this anywhere, but somehow we caught
> the problems anyway.

Yeah, so my claim is that this is simply a non-problem, and you've pretty
much just proved that by failing to come up with pointers to actual code
that would suffer from this. Clearly you're not aware of any such code.

--
--Guido van Rossum (python.org/~guido)
From victor.stinner at gmail.com Wed Sep 6 19:22:12 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 7 Sep 2017 01:22:12 +0200
Subject: [Python-Dev] Compiling without multithreading support -- still useful?

2017-09-06 22:19 GMT+02:00 Berker Peksağ:
> Do we still have buildbots for testing the --without-threads option?

We had such a buildbot once, but it's gone. I just removed its unused class
from the buildbot configuration:
https://github.com/python/buildmaster-config/commit/091f52aa05a8977966796ba3ef4b8257bef1c0e9

Victor

From rdmurray at bitdance.com Wed Sep 6 19:22:35 2017
From: rdmurray at bitdance.com (R. David Murray)
Date: Wed, 06 Sep 2017 19:22:35 -0400
Subject: [Python-Dev] PEP 548: More Flexible Loop Control
Message-ID: <20170906232235.84C441B10001@webabinitio.net>

On Wed, 06 Sep 2017 09:43:53 -0700, Guido van Rossum wrote:
> I'm actually not in favor of this. It's another way to do the same thing.
> Sorry to rain on your dream!

So it goes :) I learned things by going through the process, so it wasn't
wasted time for me even if (or because) I made several mistakes.
Sorry for wasting anyone else's time :(

--David

From greg.ewing at canterbury.ac.nz Wed Sep 6 19:27:16 2017
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 07 Sep 2017 11:27:16 +1200
Subject: [Python-Dev] PEP 550 v4
Message-ID: <59B08454.3060005@canterbury.ac.nz>

Ivan Levkivskyi wrote:
> Normal generators fall out from this "scheme", and it looks like their
> behavior is determined by the fact that coroutines are implemented as
> generators.

This is what I disagree with. Generators don't implement coroutines, they
implement *parts* of coroutines.

We want "task local storage" that behaves analogously to thread local
storage. But PEP 550 as it stands doesn't give us that; it gives something
more like "function local storage" for certain kinds of function.

-- Greg

From barry at python.org Wed Sep 6 19:42:37 2017
From: barry at python.org (Barry Warsaw)
Date: Wed, 6 Sep 2017 16:42:37 -0700
Subject: [Python-Dev] To reduce Python "application" startup time
Message-ID: <94D7D735-A961-42F4-8770-34132BEF069B@python.org>

On Sep 6, 2017, at 00:42, INADA Naoki wrote:
> Additionally, faster startup time (and smaller memory footprint) is good
> even for web applications. For example, CGI is still a comfortable tool
> sometimes. Another example is GAE/Python.
>
> Anyway, I think researching the import tree of popular libraries is a good
> starting point for optimizing startup time. For example, modules like ast
> and tokenize are imported more often than I thought.

Improving start up time may indeed help long running processes, but start up
costs will generally be amortized across the lifetime of the process, so it
isn't as noticeable. However, startup time *is* a real issue for command
line tools.

I'm not sure however whether burying imports inside functions (as a kind of
poor man's lazy import) is ultimately going to be satisfying. First, it's
not natural, it generally violates coding standards (e.g. PEP 8), and can
make linters complain. Second, I think you'll end up chasing imports all
over the stdlib and third party modules in any sufficiently complicated
application. Third, I'm not sure that the gains you'll get won't just be
overwhelmed by lots of other things going on, such as pkg_resources entry
point processing, pth file processing, site.py effects, command line
processing libraries such as click, and implicitly added distribution
exception hooks (e.g. Ubuntu's apport).

Many of these can't be blamed on Python itself, but all can contribute
significantly to Python's apparent start up time. It's definitely worth
investigating the details of Python import, and a few of us at the core
sprint have looked at those numbers and thrown around ideas for improvement,
but we'll need to look at the effects up and down the stack to improve the
start up performance for the average Python application.

Cheers,
-Barry
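For anyone who wants to see where a tool's startup time goes without
patching the interpreter, a crude but runnable approximation of the
import-profiling idea discussed in this thread is to time the imports by
hand (the module names below are arbitrary examples):

    import time

    for name in ("decimal", "asyncio", "email.mime.text"):
        t0 = time.perf_counter()
        __import__(name)
        print(f"{name:20s} {(time.perf_counter() - t0) * 1000:6.1f} ms")

Repeated runs in the same process will show near-zero times, since
sys.modules caches the result; measure in a fresh interpreter.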
From yselivanov.ml at gmail.com Wed Sep 6 19:53:06 2017
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 6 Sep 2017 16:53:06 -0700
Subject: [Python-Dev] PEP 550 v4

On Wed, Sep 6, 2017 at 1:39 PM, Koos Zevenhoven wrote:
[..]
> Now this was of course a completely fictional example, and hopefully I
> didn't introduce any bugs or syntax errors other than the ones I
> described. I haven't seen code like this anywhere, but somehow we caught
> the problems anyway.

Thank you for the example, Koos. FWIW I agree it is a "completely fictional
example".

There are two ways we could easily adapt PEP 550 to follow your semantics:

1. Set gen.__logical_context__ to None when it is being 'yield from'-ed.

2. Merge gen.__logical_context__ with the outer LC when the generator is
iterated to the end.

But I still really dislike the examples you and Greg show to us. They are
not typical or real-world examples; they are showcases of ways to abuse
contexts.

I still think that giving Python programmers one strong rule, "context
mutation is always isolated in generators", makes it easier to reason about
the EC and write maintainable code.

Yury

From fperez.net at gmail.com Wed Sep 6 19:55:20 2017
From: fperez.net at gmail.com (Fernando Perez)
Date: Wed, 6 Sep 2017 16:55:20 -0700
Subject: [Python-Dev] PEP 553: Built-in debug()

If I may suggest a small API tweak, I think it would be useful if
breakpoint() accepted an optional header argument. In IPython, the
equivalent for non-postmortem debugging is IPython.embed, which can be given
a header. This is useful to provide the user with some information about
perhaps where the breakpoint is coming from, relevant data they might want
to look at, etc:

```
from IPython import embed

def f(x=10):
    y = x + 2
    embed(header="in f")
    return y

x = 20
print(f(x))
embed(header="Top level")
```

I understand in most cases these are meant to be deleted right after usage
and the author is likely to have a text editor open next to the terminal
where they're debugging. But still, I've found myself putting multiple such
calls in a piece of code to look at what's going on in different parts of
the execution stack, and it can be handy to have a bit of information to get
your bearings.

Just a thought...

Best

f
From greg.ewing at canterbury.ac.nz Wed Sep 6 20:00:39 2017
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 07 Sep 2017 12:00:39 +1200
Subject: [Python-Dev] PEP 550 v4
Message-ID: <59B08C27.2040706@canterbury.ac.nz>

Nathaniel Smith wrote:
> Literally the first motivating example at the beginning of the PEP
> ('def fractions ...') involves only generators, not coroutines, and
> only works correctly if generators get special handling. (In fact, I'd
> be curious to see how Greg's {push,pop}_local_storage could handle
> this case.)

I've given a decimal-based example, but it was a bit scattered. Here's a
summary and application to the fractions example.

I'm going to assume that the decimal module has been modified to keep the
current context in a context var, and that getcontext() and setcontext()
access that context var.

The decimal.localcontext context manager is also redefined as:

    class localcontext():

        def __enter__(self):
            push_local_context()
            ctx = getcontext().copy()
            setcontext(ctx)
            return ctx

        def __exit__(self, *exc):
            pop_local_context()

Now we can write the fractions generator as:

    def fractions(precision, x, y):
        with decimal.localcontext() as ctx:
            ctx.prec = precision
            yield Decimal(x) / Decimal(y)
            yield Decimal(x) / Decimal(y ** 2)

You may notice that this is exactly the same as what you would write today
for the same task...

-- Greg
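For comparison with Greg's sketch, the version one writes today is runnable
as-is; the context manager restores the decimal context when the with block
exits. Note, though, that while the generator is suspended at a yield, the
modified context is still visible to the caller, which is the interleaving
problem PEP 550's motivating example describes:

    import decimal
    from decimal import Decimal

    def fractions(precision, x, y):
        with decimal.localcontext() as ctx:
            ctx.prec = precision
            yield Decimal(x) / Decimal(y)
            yield Decimal(x) / Decimal(y ** 2)

    print(list(fractions(3, 1, 3)))  # [Decimal('0.333'), Decimal('0.111')]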
From greg.ewing at canterbury.ac.nz Wed Sep 6 20:06:36 2017
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 07 Sep 2017 12:06:36 +1200
Subject: [Python-Dev] PEP 550 v4
Message-ID: <59B08D8C.5030309@canterbury.ac.nz>

Nathaniel Smith wrote:
> The implementation strategy changed radically between v1
> and v2 because of considerations around generator (not coroutine)
> semantics. I'm not sure what more it can do to dispel these feelings
> :-).

I can't say the changes have dispelled any feelings on my part.

The implementation suggested in the PEP seems very complicated and messy.
There are garbage collection issues, which it proposes using weak references
to mitigate. There is also apparently some issue with long chains building
up and having to be periodically collapsed. None of this inspires confidence
that we have the basic design right.

My approach wouldn't have any of those problems. The implementation would be
a lot simpler.

-- Greg

From yselivanov.ml at gmail.com Wed Sep 6 20:13:07 2017
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 6 Sep 2017 17:13:07 -0700
Subject: [Python-Dev] PEP 550 v4
In-Reply-To: <59B08454.3060005@canterbury.ac.nz>

On Wed, Sep 6, 2017 at 4:27 PM, Greg Ewing wrote:
> This is what I disagree with. Generators don't implement coroutines, they
> implement *parts* of coroutines.
>
> We want "task local storage" that behaves analogously to thread local
> storage. But PEP 550 as it stands doesn't give us that; it gives something
> more like "function local storage" for certain kinds of function.

The PEP gives you a Task Local Storage, where Task is:

1. your single-threaded code
2. a generator
3. an async task

If you correctly use context managers, PEP 550 works intuitively and
similarly to how one would think threading.local() should work. The only
example you (and Koos) can come up with is this:

    def generator():
        set_decimal_context()
        yield

    next(generator())       # decimal context is not set
    # or
    yield from generator()  # decimal context is still not set

I consider that the above is a feature.

Yury

From yselivanov.ml at gmail.com Wed Sep 6 20:08:31 2017
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 6 Sep 2017 17:08:31 -0700
Subject: [Python-Dev] PEP 550 v4
In-Reply-To: <59B08C27.2040706@canterbury.ac.nz>

On Wed, Sep 6, 2017 at 5:00 PM, Greg Ewing wrote:
> [... the localcontext() redefinition quoted above ...]

1. So essentially this means that we will have one "local context" per
context manager storing one value.

2. If somebody makes a mistake and calls "push_local_context" without a
corresponding "pop_local_context", you will have an unbounded growth of LCs
(happens in Koos's proposal too, btw).

3. Users will need to know way more to correctly use the mechanism.

So far, neither you nor Koos has given us a realistic example which
illustrates why we should suffer the implications of (1), (2), and (3).

Yury
From yselivanov.ml at gmail.com Wed Sep 6 20:17:40 2017
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 6 Sep 2017 17:17:40 -0700
Subject: [Python-Dev] PEP 550 v4
In-Reply-To: <59B08D8C.5030309@canterbury.ac.nz>

On Wed, Sep 6, 2017 at 5:06 PM, Greg Ewing wrote:
> The implementation suggested in the PEP seems very complicated and messy.
> There are garbage collection issues, which it proposes using weak
> references to mitigate.

"Messy" and "complicated" don't sound like valuable feedback :(

There are no "garbage collection issues", sorry. The issue that we use weak
references for is the same issue why threading.local() uses them:

    def foo():
        var = ContextVar()
        var.set(1)

    for _ in range(10**6):
        foo()

If 'var' were strongly referenced, we would have a bunch of them.

> There is also apparently some issue with long chains building up and
> having to be periodically collapsed. None of this inspires confidence
> that we have the basic design right.
>
> My approach wouldn't have any of those problems. The implementation would
> be a lot simpler.

Cool.

Yury

From barry at python.org Wed Sep 6 20:20:17 2017
From: barry at python.org (Barry Warsaw)
Date: Wed, 6 Sep 2017 17:20:17 -0700
Subject: [Python-Dev] PEP 553: Built-in debug()
Message-ID: <21D29DF6-B2DB-4A6C-8290-13DB8D7B21B4@python.org>

On Sep 6, 2017, at 16:55, Fernando Perez wrote:
>
> If I may suggest a small API tweak, I think it would be useful if
> breakpoint() accepted an optional header argument. [...]
>
> I understand in most cases these are meant to be deleted right after usage
> and the author is likely to have a text editor open next to the terminal
> where they're debugging. But still, I've found myself putting multiple
> such calls in a piece of code to look at what's going on in different
> parts of the execution stack, and it can be handy to have a bit of
> information to get your bearings.
>
> Just a thought...

Thanks Fernando, this is exactly the kind of feedback from other debuggers
that I'm looking for. It certainly sounds like a handy feature; I've found
myself wanting something like that from pdb from time to time. The PEP has
an open issue regarding breakpoint() taking *args and **kws, which would
just be passed through the call stack. It sounds like you'd be in favor of
that enhancement.

Cheers,
-Barry
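As a sketch of what that pass-through could enable, here is a custom hook in
the spirit of Fernando's request, assuming breakpoint() forwards its
arguments to sys.breakpointhook() (still an open issue in the PEP; the hook
below is illustrative, not part of it):

    import pdb
    import sys

    def hook_with_header(*args, header=None, **kws):
        if header is not None:
            print(header)
        # Note: pdb stops here, inside the hook; a production hook would
        # arrange to break in the caller's frame instead.
        pdb.set_trace()

    sys.breakpointhook = hook_with_header
    # ... then, in user code:  breakpoint(header="in f")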
From nad at python.org Wed Sep 6 19:52:21 2017
From: nad at python.org (Ned Deily)
Date: Wed, 6 Sep 2017 16:52:21 -0700
Subject: [Python-Dev] Python 3.3.7rc1 now available prior to Python 3.3 end-of-life
Message-ID: <3B07469D-32E6-4C3B-90CD-14EA0A94575D@python.org>

On behalf of the Python development community and the Python 3.3 release
team, I would like to announce the availability of Python 3.3.7rc1, the
release candidate of Python 3.3.7. It is a security-fix, source-only
release.

Python 3.3.0 was released 5 years ago on 2012-09-29 and has been in
security-fix-only mode since 2014-03-08. Per project policy, all support for
the 3.3 series of releases ends on 2017-09-29, five years after the initial
release. Therefore, Python 3.3.7 is expected to be the final release of any
kind for the 3.3 series. After 2017-09-29, **we will no longer accept bug
reports nor provide fixes of any kind for Python 3.3.x**; of course,
third-party distributors of Python 3.3.x may choose to offer their own
extended support.

Because 3.3.x has long been in security-fix mode, 3.3.7 may no longer build
correctly on all current operating system releases and some tests may fail.
If you are still using Python 3.3.x, we **strongly** encourage you to
upgrade to a more recent, fully supported version of Python 3; see
https://www.python.org/downloads/. If you are still using your own build of
Python 3.3.x, please report any critical issues with 3.3.7rc1 to the Python
bug tracker prior to 2017-09-18, the expected release date for Python 3.3.7
final. Even better, use the time to upgrade to Python 3.6.x!

Thank you to everyone who has contributed to the success of Python 3.3.x
over the past years!

You can find Python 3.3.7rc1 here:

https://www.python.org/downloads/release/python-337rc1/

--
  Ned Deily
  nad at python.org -- []

From victor.stinner at gmail.com Wed Sep 6 21:16:00 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 7 Sep 2017 03:16:00 +0200
Subject: [Python-Dev] New C API not leaking implementation details: a usable stable ABI

Hi,

I am currently at the CPython sprint 2017 at Facebook. We are discussing my
idea of writing a new C API for CPython hiding implementation details and
replacing macros with function calls. I wrote a short blog post to explain
the issue with the current API, the link between the API and the ABI, and to
give examples of optimizations which become possible with a "usable" stable
ABI:

https://haypo.github.io/new-python-c-api.html

I am sorry, I'm too busy to write a proper PEP. But here is the link to my
old PEP draft written in June. I didn't update it yet. But multiple people
are asking me for the PEP draft, so here you have it!

https://github.com/haypo/misc/blob/master/python/pep_c_api.rst

See also the thread on python-ideas last June, where I first proposed my
draft:

[Python-ideas] PEP: Hide implementation details in the C API
https://mail.python.org/pipermail/python-ideas/2017-July/046399.html

This is not a request for comments :-) I will write a proper PEP for that.

Victor
From elprans at gmail.com Wed Sep 6 22:00:27 2017
From: elprans at gmail.com (Elvis Pranskevichus)
Date: Wed, 06 Sep 2017 22:00:27 -0400
Subject: [Python-Dev] PEP 550 v4
In-Reply-To: <59B08D8C.5030309@canterbury.ac.nz>
Message-ID: <1604676.LaHrQX02Lt@klinga.prans.org>

On Wednesday, September 6, 2017 8:06:36 PM EDT Greg Ewing wrote:
> Nathaniel Smith wrote:
> > The implementation strategy changed radically between v1 and v2 because
> > of considerations around generator (not coroutine) semantics. I'm not
> > sure what more it can do to dispel these feelings :-).
>
> I can't say the changes have dispelled any feelings on my part.
>
> The implementation suggested in the PEP seems very complicated and messy.
> There are garbage collection issues, which it proposes using weak
> references to mitigate. There is also apparently some issue with long
> chains building up and having to be periodically collapsed. None of this
> inspires confidence that we have the basic design right.
>
> My approach wouldn't have any of those problems. The implementation would
> be a lot simpler.

I might have missed something, but your claim doesn't make any sense to me.
All you've proposed is to replace the implicit and guaranteed
push_lc()/pop_lc() around each generator with explicit LC stack management.
You *still* need to retain and switch the current stack on every generator
send() and throw(). Everything else written out in PEP 550 stays relevant as
well.

As for the "long chains building up", your approach is actually much worse.
The absence of a guaranteed context fence around generators would mean that
contextvar context managers will *have* to push LCs whether really needed or
not. Consider the following (naive) way of computing the N-th Fibonacci
number:

    def fib(n):
        with decimal.localcontext():
            if n == 0:
                return 0
            elif n == 1:
                return 1
            else:
                return fib(n - 1) + fib(n - 2)

Your proposal can cause the LC stack to grow incessantly even in simple
cases, and will affect code that doesn't even use generators.

A great deal of effort was put into PEP 550, and the matter discussed is far
from trivial. What you see as "complicated and messy" is actually the result
of us carefully considering the solutions to real-world problems, and then
the implications of those solutions (including the worst-case scenarios).

Elvis
From songofacandy at gmail.com Wed Sep 6 23:49:17 2017
From: songofacandy at gmail.com (INADA Naoki)
Date: Thu, 7 Sep 2017 12:49:17 +0900
Subject: [Python-Dev] To reduce Python "application" startup time

> I'm not sure however whether burying imports inside functions (as a kind
> of poor man's lazy import) is ultimately going to be satisfying. First,
> it's not natural, it generally violates coding standards (e.g. PEP 8),
> and can make linters complain.

Of course, I tried to move imports only when (1) the module is used by only
one or two of many functions in the module, (2) it's relatively heavy, and
(3) it's rarely imported from other modules.

> Second, I think you'll end up chasing imports all over the stdlib and
> third party modules in any sufficiently complicated application.

Agreed. I won't spend much time on optimizing by moving imports from the top
of a module into inner functions in the stdlib. I think my import-profiler
patch can be polished and committed to Python to help library maintainers
measure import time easily. (Maybe, `python -X import-profile`)

> Third, I'm not sure that the gains you'll get won't just be overwhelmed by
> lots of other things going on, such as pkg_resources entry point
> processing, pth file processing, site.py effects, command line processing
> libraries such as click, and implicitly added distribution exception
> hooks (e.g. Ubuntu's apport).

Yes. I noticed some of them while profiling imports. For example, an
old-style namespace package imports the types module for types.ModuleType.
The types module imports functools, and functools imports collections. So
some efforts in CPython (avoiding importing collections and functools from
site) are not worth much when at least one old-style namespace package is
installed.

> Many of these can't be blamed on Python itself, but all can contribute
> significantly to Python's apparent start up time. It's definitely worth
> investigating the details of Python import, and a few of us at the core
> sprint have looked at those numbers and thrown around ideas for
> improvement, but we'll need to look at the effects up and down the stack
> to improve the start up performance for the average Python application.

Yes. I totally agree with you. That's why I use import-profile.patch for
some 3rd party libraries.
Currently, I have these ideas to optimize application startup time:

* Faster, or lazier, compiling of regular expressions. (pkg_resources
  imports pyparsing, which has a lot of regexes)
* More usable lazy import. (which could be solved by "PEP 549: Instance
  Properties (aka: module properties)")
* Optimize enum creation.
* Faster namedtuple. (There is a pull request already)
* Faster ABCs.
* Breaking up large import trees in the stdlib. (PEP 549 may help this too)

Regards,

INADA Naoki

From nas at arctrix.com Thu Sep 7 01:30:41 2017
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 7 Sep 2017 05:30:41 +0000 (UTC)
Subject: [Python-Dev] To reduce Python "application" startup time

INADA Naoki wrote:
> Current `python -v` is not useful to optimize import.
> So I use this patch to profile import time.
> https://gist.github.com/methane/e688bb31a23bcc437defcea4b815b1eb

I have implemented DTrace probes that do almost the same thing. Your patch
is better in that it does not require an OS with DTrace or SystemTap. The
DTrace probes are better in that they can be part of the standard Python
build.

https://github.com/nascheme/cpython/tree/dtrace-module-import

DTrace script:
https://gist.github.com/nascheme/c1cece36a3369926ee93cecc3d024179

Pretty printer for script output (very minimal):
https://gist.github.com/nascheme/0bff5c49bb6b518f5ce23a9aea27f14b

From tjreedy at udel.edu Thu Sep 7 02:10:03 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 7 Sep 2017 02:10:03 -0400
Subject: [Python-Dev] PEP 553: Built-in debug()

On 9/6/2017 6:24 PM, Barry Warsaw wrote:
> [...]
> I had meant to add an open issue about the idea of adding an environment
> variable, such as $PYTHONBREAKPOINTHOOK, which could be set to the
> callable to bind to sys.breakpointhook().

Environment variables are set to strings, not objects. It is not clear how
you intend to handle the conversion.

-- Terry Jan Reedy
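One plausible answer to Terry's question, strictly a sketch: treat the
environment variable as a dotted path and import the callable lazily. Both
the variable name and the resolution rule here are assumptions, not settled
PEP text:

    import importlib
    import os
    import pdb

    def _hook_from_env():
        spec = os.environ.get("PYTHONBREAKPOINTHOOK", "")
        if not spec:
            return pdb.set_trace  # the PEP's default hook
        modname, _, attr = spec.rpartition(".")
        module = importlib.import_module(modname or "builtins")
        return getattr(module, attr)

    def breakpointhook(*args, **kws):
        # e.g. PYTHONBREAKPOINTHOOK=IPython.embed
        return _hook_from_env()(*args, **kws)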
From greg.ewing at canterbury.ac.nz Thu Sep 7 02:26:03 2017
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 07 Sep 2017 18:26:03 +1200
Subject: [Python-Dev] PEP 550 v4
Message-ID: <59B0E67B.8050708@canterbury.ac.nz>

Guido van Rossum wrote:
> This feels like a very abstract argument. I have a feeling that context
> state propagating out of a call is used relatively rarely -- it must work
> for cases where you refactor something that changes context inline into a
> utility function (e.g. decimal.setcontext()), but I just can't think of a
> realistic example where coroutines (either of the yield-from variety or
> of the async/def form) would be used for such a utility function.

Yury has already found one himself, the __aenter__ and __aexit__ methods of
an async context manager.

> A utility function that sets context state but also makes a network call
> just sounds like asking for trouble!

I'm coming from the other direction. It seems to me that it's not very
useful to allow with-statements to be skipped in certain very restricted
circumstances.

The only situation in which you will be able to take advantage of this is if
the context change is being made in a generator or coroutine, and it is to
apply to the whole body of that generator or coroutine.

If you're in an ordinary function, you'll still have to use a context
manager. If you only want the change to apply to part of the body, you'll
still have to use a context manager.

It would be simpler to just tell people to always use a context manager,
wouldn't it?

-- Greg

From greg.ewing at canterbury.ac.nz Thu Sep 7 02:39:48 2017
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 07 Sep 2017 18:39:48 +1200
Subject: [Python-Dev] PEP 550 v4
Message-ID: <59B0E9B4.8010807@canterbury.ac.nz>

Yury Selivanov wrote:
> It would be great if you or Greg could show a couple of real-world
> examples showing the "issue" (with the current PEP 550
> APIs/semantics).

Here's one way that refactoring could trip you up. Start with this:

    async def foo():
        calculate_something()
        # in a coroutine, so we can be lazy and not use a cm
        ctx = decimal.getcontext().copy()
        ctx.prec = 5
        decimal.setcontext(ctx)
        calculate_something_else()

And factor part of it out (into an *ordinary* function!):

    async def foo():
        calculate_something()
        calculate_something_else_with_5_digits()

    def calculate_something_else_with_5_digits():
        ctx = decimal.getcontext().copy()
        ctx.prec = 5
        decimal.setcontext(ctx)
        calculate_something_else()

Now we add some more calculation to the end of foo():

    async def foo():
        calculate_something()
        calculate_something_else_with_5_digits()
        calculate_more_stuff()

Here we didn't intend calculate_more_stuff() to be done with prec=5, but we
forgot that calculate_something_else_with_5_digits() changes the precision
and *doesn't restore it*, because we didn't add a context manager to it.

If we hadn't been lazy and had used a context manager in the first place,
that wouldn't have happened.
Summary: I think that skipping context managers in some circumstances is a
bad habit that shouldn't be encouraged.

-- Greg

From yselivanov.ml at gmail.com Thu Sep 7 02:41:43 2017
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 6 Sep 2017 23:41:43 -0700
Subject: [Python-Dev] PEP 550 v4
In-Reply-To: <59B0E67B.8050708@canterbury.ac.nz>

On Wed, Sep 6, 2017 at 11:26 PM, Greg Ewing wrote:
> Yury has already found one himself, the __aenter__ and __aexit__ methods
> of an async context manager.

__aenter__ is not a generator and there's no 'yield from' there. Coroutines
(within an async task) leak state just like regular functions (within a
thread). Your argument is to allow generators to leak context changes
(right?). AFAIK we don't use generators to implement __enter__ or
__aenter__ (generators decorated with @types.coroutine or
@asyncio.coroutine are coroutines, according to PEP 492). So this is
irrelevant.

> I'm coming from the other direction. It seems to me that it's not very
> useful to allow with-statements to be skipped in certain very restricted
> circumstances.

Can you clarify what you mean by "with-statements to be skipped"? This
language is not used in PEP 550 or in the Python documentation. I honestly
don't understand what it means.

> The only situation in which you will be able to take advantage of this is
> if the context change is being made in a generator or coroutine, and it
> is to apply to the whole body of that generator or coroutine.
>
> If you're in an ordinary function, you'll still have to use a context
> manager. If you only want the change to apply to part of the body, you'll
> still have to use a context manager.
>
> It would be simpler to just tell people to always use a context manager,
> wouldn't it?

Yes, PEP 550 wants people to always use context managers! They will work as
you expect them to for coroutines, generators, and regular functions.

At this point I suspect you have some wrong idea about some specification
detail of PEP 550. I understand what Koos is talking about, but I really
don't follow you. Using the "with-statements to be skipped" language is very
confusing and doesn't help me understand you.

Yury
From greg.ewing at canterbury.ac.nz Thu Sep 7 02:55:38 2017
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 07 Sep 2017 18:55:38 +1200
Subject: [Python-Dev] PEP 550 v4
Message-ID: <59B0ED6A.10100@canterbury.ac.nz>

Guido van Rossum wrote:
> Yeah, so my claim is that this is simply a non-problem, and you've pretty
> much just proved that by failing to come up with pointers to actual code
> that would suffer from this. Clearly you're not aware of any such code.

In response I'd ask Yury to come up with examples of real code that would
benefit significantly from being able to make context changes without
wrapping them in a with statement.

-- Greg
-- 
Greg

From yselivanov.ml at gmail.com  Thu Sep  7 02:57:34 2017
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 6 Sep 2017 23:57:34 -0700
Subject: [Python-Dev] PEP 550 v4
In-Reply-To: <59B0E9B4.8010807@canterbury.ac.nz>
References: <20170828173327.GA2671@bytereef.org>
	<59A497C1.1050005@canterbury.ac.nz>
	<59A49F80.8070808@canterbury.ac.nz>
	<59A5FA89.1000001@canterbury.ac.nz>
	<59A6B5CF.4070902@canterbury.ac.nz>
	<59AF3A75.2020406@canterbury.ac.nz>
	<59AF9EA0.9020608@canterbury.ac.nz>
	<59B0E9B4.8010807@canterbury.ac.nz>
Message-ID:

On Wed, Sep 6, 2017 at 11:39 PM, Greg Ewing wrote:
> Yury Selivanov wrote:
>>
>> It would be great if you or Greg could show a couple of real-world
>> examples showing the "issue" (with the current PEP 550
>> APIs/semantics).
>
> Here's one way that refactoring could trip you up.
> Start with this:
>
>     async def foo():
>         calculate_something()
>         #in a coroutine, so we can be lazy and not use a cm

Where exactly does PEP 550 encourage users to be "lazy and not use a
cm"? PEP 550 provides a mechanism for implementing context managers!
What is this example supposed to show?

>         ctx = decimal.getcontext().copy()
>         ctx.prec = 5
>         decimal.setcontext(ctx)
>         calculate_something_else()
>
> And factor part of it out (into an *ordinary* function!)
>
>     async def foo():
>         calculate_something()
>         calculate_something_else_with_5_digits()
>
>     def calculate_something_else_with_5_digits():
>         ctx = decimal.getcontext().copy()
>         ctx.prec = 5
>         decimal.setcontext(ctx)
>         calculate_something_else()
>
> Now we add some more calculation to the end of foo():
>
>     async def foo():
>         calculate_something()
>         calculate_something_else_with_5_digits()
>         calculate_more_stuff()
>
> Here we didn't intend calculate_more_stuff() to be done
> with prec=5, but we forgot that calculate_something_else_
> with_5_digits() changes the precision and *doesn't restore
> it* because we didn't add a context manager to it.
>
> If we hadn't been lazy and had used a context manager in the
> first place, that wouldn't have happened.

How is PEP 550 at fault when somebody is lazy and doesn't use a
context manager?

PEP 550 has a hard requirement to make it possible for decimal/other
libraries to start using its APIs and stay backwards compatible, so it
allows the `decimal.setcontext(ctx)` function to be implemented. We
are fixing things here.

When you are designing a new library/API, you can use CMs and only
CMs. It's up to you as a library author; PEP 550 does not limit you.
And when you use CMs, there are no "problems" with 'yield from' or
anything else in PEP 550.

> Summary: I think that skipping context managers in some
> circumstances is a bad habit that shouldn't be encouraged.

PEP 550 does not encourage coding without context managers. It does,
in fact, solve the problem of reliably storing context to make writing
context managers possible. To reiterate: it provides a mechanism to
set a variable within the current logical thread, like storing a
current request in an async HTTP handler. Or to implement
`decimal.setcontext`. But you are free to use it to only implement
context managers in your library.
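For instance, a decimal-style module could sit on top of the proposed
API roughly like this. This is only a sketch: the names follow the
PEP's draft API, and the assumption that an unset variable's .get()
returns None is mine, not the PEP's:

    import contextlib
    import contextvars          # the module proposed by the PEP (draft)
    from decimal import Context

    _decimal_ctx = contextvars.ContextVar('decimal.Context')

    def setcontext(ctx):
        # The backwards-compatible setter the PEP wants to keep possible.
        _decimal_ctx.set(ctx)

    def getcontext():
        ctx = _decimal_ctx.get()   # assumption: None when never set
        if ctx is None:
            ctx = Context()
            _decimal_ctx.set(ctx)
        return ctx

    @contextlib.contextmanager
    def localcontext(ctx=None):
        # ...and the context-manager style built on the same mechanism.
        saved = getcontext()
        setcontext((ctx if ctx is not None else saved).copy())
        try:
            yield getcontext()
        finally:
            setcontext(saved)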
Yury

From tjreedy at udel.edu  Thu Sep  7 02:59:55 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 7 Sep 2017 02:59:55 -0400
Subject: [Python-Dev] PEP 553: Built-in debug()
In-Reply-To: <03B5AA26-0AE0-4EFB-9300-B71F9ECD3AE8@python.org>
References: <98F44C60-2DBB-4771-9544-B9DA7210353C@python.org>
	<03B5AA26-0AE0-4EFB-9300-B71F9ECD3AE8@python.org>
Message-ID:

On 9/6/2017 6:45 PM, Barry Warsaw wrote:
> On Sep 6, 2017, at 14:59, Terry Reedy wrote:
>>
>> Currently, the debugger is started in response to a menu selection in
>> the IDLE process while the python process is idle. One reason for the
>> 'idle' requirement is that when code is exec-uting, the loop that
>> reads commands, executes them, and sends responses is blocked on the
>> exec call. The IDLE process sets up its debugger window, its ends of
>> the rpc channels, and commands the python process to set up Idb and
>> the other ends of the channels. The challenge would be to initiate
>> setup from the server process and deal with the blocked loop.
>
> Would the environment variable idea in the latest version of the PEP
> help you here?

That seems to be a mechanism for users to convey which function to
call. It does not simplify the function call chosen.

pdb works because a) it is designed to be imported into and run in a
user module and b) it uses the existing stdin and stdout streams to
interact with the user. Idb, on the other hand, is designed to be
imported in idlelib.run, which is (almost) invisible to user code, and
to use a special channel to the GUI window.

Having said that, I see a solution: before running user code, set
things up as now, except do not start tracing and do not show the
debugger window, but do set a run.idb_ready flag. Then
breakpoint_handler() can check the flag and start tracing either
through idb or pdb.

What makes breakpoint() less than super useful in IDLE is that IDLE
already allows setting of persistent breakpoints on a line by
right-clicking. Since they are not stored in the file, there is
nothing to erase when one runs the code in Python directly.

-- 
Terry Jan Reedy

From greg.ewing at canterbury.ac.nz  Thu Sep  7 03:00:43 2017
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 07 Sep 2017 19:00:43 +1200
Subject: [Python-Dev] PEP 550 v4
In-Reply-To:
References: <59A497C1.1050005@canterbury.ac.nz>
	<59A49F80.8070808@canterbury.ac.nz>
	<59A5FA89.1000001@canterbury.ac.nz>
	<59A6B5CF.4070902@canterbury.ac.nz>
	<59AF3A75.2020406@canterbury.ac.nz>
	<59AF9EA0.9020608@canterbury.ac.nz>
Message-ID: <59B0EE9B.6070104@canterbury.ac.nz>

Yury Selivanov wrote:
> I still think that giving Python programmers one strong rule: "context
> mutation is always isolated in generators" makes it easier to reason
> about the EC and write maintainable code.

Whereas I think it makes code *harder* to reason about, because to
take advantage of it you need to be acutely aware of whether the code
you're working on is in a generator/coroutine or not.

It seems simpler to me to have one rule for all kinds of functions:
If you're making a temporary change to contextual state, always
encapsulate it in a with statement.
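In other words, the familiar idiom that today's decimal module already
supports:

    import decimal

    def to_five_places(x, y):
        with decimal.localcontext() as ctx:  # temporary, visibly scoped
            ctx.prec = 5
            result = decimal.Decimal(x) / decimal.Decimal(y)
        # the previous precision is back in effect here
        return result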
-- Greg From yselivanov.ml at gmail.com Thu Sep 7 03:10:14 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Thu, 7 Sep 2017 00:10:14 -0700 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: <59B0ED6A.10100@canterbury.ac.nz> References: <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz> <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz> <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz> <59B0ED6A.10100@canterbury.ac.nz> Message-ID: On Wed, Sep 6, 2017 at 11:55 PM, Greg Ewing wrote: > Guido van Rossum wrote: >> >> Yeah, so my claim this is simply a non-problem, and you've pretty much >> just proved that by failing to come up with pointers to actual code that >> would suffer from this. Clearly you're not aware of any such code. > > > In response I'd ask Yuri to come up with examples of real > code that would benefit significantly from being able to > make context changes without wrapping them in a with > statement. A real-code example: make it possible to implement decimal.setcontext() on top of PEP 550 semantics. I still feel that there's some huge misunderstanding in the discussion: PEP 550 does not promote "not using context managers". It simply implements a low-level mechanism to make it possible to implement context managers for generators/coroutines/etc. Whether this API is used to write context managers or not is completely irrelevant to the discussion. How does threading.local() promote or demote using of context managers? The answer: it doesn't. Same answer is for PEP 550, which is a similar mechanism. Yury From greg.ewing at canterbury.ac.nz Thu Sep 7 03:39:01 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 07 Sep 2017 19:39:01 +1200 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: References: <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz> <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz> <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz> <59B08C27.2040706@canterbury.ac.nz> Message-ID: <59B0F795.8070207@canterbury.ac.nz> Yury Selivanov wrote: > 1. So essentially this means that we will have one "local context" per > context manager storing one value. I can't see that being a major problem. Context vars will (I hope!) be very rare things, and needing to change a bunch of them in one function ought to be rarer still. But if you do, it would be easy to provide a context manager whose sole effect is to introduce a new context: with new_local_context(): cvar1.set(something) cvar2.set(otherthing) ... > 2. If somebody makes a mistake and calls "push_local_context" without > a corresponding "pop_local_context" You wouldn't normally call them directly, they would be encapsulated in carefully-written context managers. If you do use them, you're taking responsibility for using them correctly. If it would make you feel happier, they could be named _push_local_context and _pop_local_context to emphasise that they're not intended for everyday use. > 3. Users will need to know way more to correctly use the mechanism. Most users will simply be using already-provided context managers, which they're *already used to doing*. So they won't have to know anything more than they already do. See my last decimal example, which required *no change* to existing correct user code. > So far, both you and Koos can't give us a realistic example which > illustrates why we should suffer the implications of (1), (2), and > (3). 
And you haven't given a realistic example that convinces me your proposed with-statement-elimination feature would be of significant benefit. -- Greg From greg.ewing at canterbury.ac.nz Thu Sep 7 03:46:16 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 07 Sep 2017 19:46:16 +1200 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: References: <20170828173327.GA2671@bytereef.org> <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz> <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz> <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz> <59B08454.3060005@canterbury.ac.nz> Message-ID: <59B0F948.8040600@canterbury.ac.nz> Yury Selivanov wrote: > The PEP gives you a Task Local Storage, where Task is: > > 1. your single-threaded code > 2. a generator > 3. an async task > > If you correctly use context managers, PEP 550 works intuitively and > similar to how one would think that threading.local() should work. My version works *more* similarly to thread-local storage, IMO. Currently, if you change the decimal context without using a with-statement or something equivalent, you *don't* expect the change to be confined to the current function or sub-generator or async sub-task. All I'm asking for is one consistent rule: If you want a context change encapsulated, use a with-statement. If you don't, don't. Not only is this rule simpler than yours, it's the *same* rule that we have now, so there is less for users to learn. -- Greg From greg.ewing at canterbury.ac.nz Thu Sep 7 03:54:15 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 07 Sep 2017 19:54:15 +1200 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: References: <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz> <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz> <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz> <59B08D8C.5030309@canterbury.ac.nz> Message-ID: <59B0FB27.2090304@canterbury.ac.nz> Yury Selivanov wrote: > def foo(): > var = ContextVar() > var.set(1) > > for _ in range(10**6): foo() > > If 'var' is strongly referenced, we would have a bunch of them. Erk. This is not how I envisaged context vars would be used. What I thought you would do is this: my_context_var = ContextVar() def foo(): my_context_var.set(1) This problem would also not arise if context vars simply had names instead of being magic key objects: def foo(): contextvars.set("mymodule.myvar", 1) That's another thing I think would be an improvement, but it's orthogonal to what we're talking about here and would be best discussed separately. -- Greg From solipsis at pitrou.net Thu Sep 7 06:24:15 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 7 Sep 2017 12:24:15 +0200 Subject: [Python-Dev] Compiling without multithreading support -- still useful? References: <20170905183651.6934a16d@fsol> Message-ID: <20170907122415.44932336@fsol> I've proposed a PEP 11 update in this PR: https://github.com/python/peps/pull/394 Regards Antoine. On Tue, 5 Sep 2017 18:36:51 +0200 Antoine Pitrou wrote: > Hello, > > It's 2017 and we are still allowing people to compile CPython without > threads support. It adds some complication in several places > (including delicate parts of our internal C code) without a clear > benefit. Do people still need this? > > Regards > > Antoine. 
> > From greg.ewing at canterbury.ac.nz Thu Sep 7 06:37:58 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 07 Sep 2017 22:37:58 +1200 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: References: <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz> <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz> <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz> <59B0E67B.8050708@canterbury.ac.nz> Message-ID: <59B12186.6030707@canterbury.ac.nz> Yury Selivanov wrote: > I understand what Koos is > talking about, but I really don't follow you. Using the > "with-statements to be skipped" language is very confusing and doesn't > help to understand you. If I understand correctly, instead of using a context manager, your fractions example could be written like this: def fractions(precision, x, y): ctx = decimal.getcontext().copy() decimal.setcontext(ctx) ctx.prec = precision yield MyDecimal(x) / MyDecimal(y) yield MyDecimal(x) / MyDecimal(y ** 2) and it would work without leaking changes to the decimal context, despite the fact that it doesn't use a context manager or do anything else to explicitly put back the old context. Am I right about that? This is what I mean by "skipping context managers" -- that it's possible in some situations to get by without using a context manager, by taking advantage of the implicit local context push that happens whenever a generator is started up. Now, there are two possibilities: 1) You take advantage of this, and don't use context managers in some or all of the places where you don't need to. You seem to agree that this would be a bad idea. 2) You ignore it and always use a context manager, in which case it's not strictly necessary for the implicit context push to occur, since the relevant context managers can take care of it. So there doesn't seem to be any great advantage to the automatic context push, and it has some disadvantages, such as yield-from not quite working as expected in some situations. Also, it seems that every generator is going to incur the overhead of allocating a logical_context even when it doesn't actually change any context vars, which most generators won't. -- Greg From k7hoven at gmail.com Thu Sep 7 07:04:54 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Thu, 7 Sep 2017 14:04:54 +0300 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: <59B0FB27.2090304@canterbury.ac.nz> References: <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz> <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz> <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz> <59B08D8C.5030309@canterbury.ac.nz> <59B0FB27.2090304@canterbury.ac.nz> Message-ID: On Thu, Sep 7, 2017 at 10:54 AM, Greg Ewing wrote: > Yury Selivanov wrote: > >> def foo(): >> var = ContextVar() >> var.set(1) >> >> for _ in range(10**6): foo() >> >> If 'var' is strongly referenced, we would have a bunch of them. >> > > Erk. This is not how I envisaged context vars would be > used. What I thought you would do is this: > > my_context_var = ContextVar() > > def foo(): > my_context_var.set(1) > > This problem would also not arise if context vars > simply had names instead of being magic key objects: > > def foo(): > contextvars.set("mymodule.myvar", 1) > > That's another thing I think would be an improvement, > but it's orthogonal to what we're talking about here > and would be best discussed separately. 
>
>
There are lots of things in this discussion that I should have
commented on, but here's one related to this. PEP 555 does not have
the resource-management issue described above and needs no additional
tricks to achieve that:

    # using PEP 555
    def foo():
        var = contextvars.Var()
        with var.assign(1):
            # do something [*]

    for _ in range(10**6):
        foo()

Every time foo is called, a new context variable is created, but
that's perfectly fine, and lightweight. As soon as the context manager
exits, there are no references to the Assignment object returned by
var.assign(1), and as soon as foo() returns, there are no references
to var, so everything should get cleaned up nicely.

And regarding string keys, they have pros and cons, and they can be
added easily, so let's not go there now.

-- Koos

[*] (nit-picking) without closures that would keep the var reference
alive

-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From greg.ewing at canterbury.ac.nz  Thu Sep  7 07:19:20 2017
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 07 Sep 2017 23:19:20 +1200
Subject: [Python-Dev] PEP 550 v4
In-Reply-To:
References: <59A497C1.1050005@canterbury.ac.nz>
	<59A49F80.8070808@canterbury.ac.nz>
	<59A5FA89.1000001@canterbury.ac.nz>
	<59A6B5CF.4070902@canterbury.ac.nz>
	<59AF3A75.2020406@canterbury.ac.nz>
	<59AF9EA0.9020608@canterbury.ac.nz>
	<59B0E67B.8050708@canterbury.ac.nz>
Message-ID: <59B12B38.6000200@canterbury.ac.nz>

There is one thing I misunderstood. Since generators and coroutines
are almost exactly the same underneath, I had thought that the
automatic logical_context creation for generators was also going to
apply to coroutines, but from reading the PEP again it seems that's
not the case. Somehow I missed that the first time. Sorry about that.

So, context vars do behave like "task-local storage" for asyncio
Tasks, which is good.

The only issue is whether a generator should be considered an "ad-hoc
task" for this purpose. I can see your reasons for thinking that it
should be.

I can also understand your thinking that the yield-from issue is such
an obscure corner case that it's not worth worrying about, especially
since there is a workaround available (setting _logical_context_ to
None) if needed.

I'm not sure how I feel about that now. I agree that it's an obscure
case, but the workaround seems even more obscure, and is unlikely to
be found by anyone who isn't closely familiar with the inner workings.

I think I'd be happier if there were a higher-level way of applying
this workaround, such as a decorator (a sketch of one follows at the
end of this message):

    @subgenerator
    def g():
        ...

Then the docs could say "If you want a generator to *not* have its
own task-local storage, wrap it with @subgenerator."

By the way, I think "Task Local Storage" would be a much better title
for this PEP. It instantly conveys the basic idea in a way that
"Execution Context" totally fails to do. It might also serve as a
source of better terminology for parts of the implementation, such as
TaskLocalStorage and TaskLocalStorageStack instead of logical_context
and execution_context. I found the latter terms almost devoid of
useful meaning when trying to understand the implementation.
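Such a decorator could be a thin wrapper. This is only a sketch: it
assumes the workaround attribute mentioned above (spelled
__logical_context__ here, the exact name is per the PEP draft) is
settable on generator objects:

    import functools

    def subgenerator(genfunc):
        """Opt a generator out of its own task-local storage (sketch)."""
        @functools.wraps(genfunc)
        def wrapper(*args, **kwargs):
            gen = genfunc(*args, **kwargs)
            # Assumed per the PEP's workaround: setting this to None
            # makes the generator share its caller's context.
            gen.__logical_context__ = None
            return gen
        return wrapper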
-- Greg From greg.ewing at canterbury.ac.nz Thu Sep 7 07:39:39 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 07 Sep 2017 23:39:39 +1200 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: <59B12B38.6000200@canterbury.ac.nz> References: <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz> <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz> <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz> <59B0E67B.8050708@canterbury.ac.nz> <59B12B38.6000200@canterbury.ac.nz> Message-ID: <59B12FFB.60204@canterbury.ac.nz> There are a couple of things in the PEP I'm confused about: 1) Under "Generators" it says: once set in the generator, the context variable is guaranteed not to change between iterations; This suggests that you're not allowed to set() a given context variable more than once in a given generator, but some of the examples seem to contradict that. So I'm not sure what this is trying to say. 2) I don't understand why the logical_contexts have to be immutable. If every task or generator that wants its own task-local storage has its own logical_context instance, why can't it be updated in-place? -- Greg From ethan at stoneleaf.us Thu Sep 7 08:50:44 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Sep 2017 05:50:44 -0700 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: <59B12FFB.60204@canterbury.ac.nz> References: <59A49F80.8070808@canterbury.ac.nz> <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz> <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz> <59B0E67B.8050708@canterbury.ac.nz> <59B12B38.6000200@canterbury.ac.nz> <59B12FFB.60204@canterbury.ac.nz> Message-ID: <59B140A4.8010308@stoneleaf.us> On 09/07/2017 04:39 AM, Greg Ewing wrote: > 1) Under "Generators" it says: > > once set in the generator, the context variable is guaranteed > not to change between iterations; > > This suggests that you're not allowed to set() a given > context variable more than once in a given generator, > but some of the examples seem to contradict that. So I'm > not sure what this is trying to say. 
I believe I can answer this part: the guarantee is that - the context variable will not be changed while the yield is in effect -- or, said another way, while the generator is suspended; - the context variable will not be changed by subgenerators - the context variable /may/ be changed by normal functions/class methods (since calling them would be part of the iteration) -- ~Ethan~ From ethan at stoneleaf.us Thu Sep 7 09:05:58 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Sep 2017 06:05:58 -0700 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: <59B12186.6030707@canterbury.ac.nz> References: <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz> <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz> <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz> <59B0E67B.8050708@canterbury.ac.nz> <59B12186.6030707@canterbury.ac.nz> Message-ID: <59B14436.8040705@stoneleaf.us> On 09/07/2017 03:37 AM, Greg Ewing wrote: > If I understand correctly, instead of using a context > manager, your fractions example could be written like > this: > > def fractions(precision, x, y): > ctx = decimal.getcontext().copy() > decimal.setcontext(ctx) > ctx.prec = precision > yield MyDecimal(x) / MyDecimal(y) > yield MyDecimal(x) / MyDecimal(y ** 2) > > and it would work without leaking changes to the decimal > context, despite the fact that it doesn't use a context > manager or do anything else to explicitly put back the > old context. The disagreement seems to be whether a LogicalContext should be created implicitly vs explicitly (or opt-out vs opt-in). As a user trying to track down a decimal context change not propagating, I would not suspect the above code of automatically creating a LogicalContext and isolating the change, whereas Greg's context manager version is abundantly clear. The implicit vs explicit argument comes down, I think, to resource management: some resources in Python are automatically managed (memory), and some are not (files) -- which type should LCs be? -- ~Ethan~ From senthil at uthcode.com Thu Sep 7 09:07:51 2017 From: senthil at uthcode.com (Senthil Kumaran) Date: Thu, 7 Sep 2017 06:07:51 -0700 Subject: [Python-Dev] [python-committers] Cherry picker bot deployed in CPython repo In-Reply-To: References: Message-ID: On Tue, Sep 5, 2017 at 6:10 PM, Mariatta Wijaya wrote: > Hi, > > The cherry picker bot has just been deployed to CPython repo, codenamed > miss-islington. > > miss-islington made the very first backport PR for CPython and became a > first time GitHub contributor: https://github.com/python/cpython/pull/3369 > > > Thanks Mariatta. This is an awesome contribution. It's going to extremely helpful. -- Senthil -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Thu Sep 7 09:25:22 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Sep 2017 06:25:22 -0700 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: References: <59A497C1.1050005@canterbury.ac.nz> <59A49F80.8070808@canterbury.ac.nz> <59A5FA89.1000001@canterbury.ac.nz> <59A6B5CF.4070902@canterbury.ac.nz> <59AF3A75.2020406@canterbury.ac.nz> <59AF9EA0.9020608@canterbury.ac.nz> <59B0E9B4.8010807@canterbury.ac.nz> Message-ID: <59B148C2.3050707@stoneleaf.us> On 09/06/2017 11:57 PM, Yury Selivanov wrote: > On Wed, Sep 6, 2017 at 11:39 PM, Greg Ewing wrote: >> Here's one way that refactoring could trip you up. 
>> Start with this: >> >> async def foo(): >> calculate_something() >> #in a coroutine, so we can be lazy and not use a cm > > Where exactly does PEP 550 encourage users to be "lazy and not use a > cm"? PEP 550 provides a mechanism for implementing context managers! > What is this example supposed to show? That using a CM is not required, and tracking down a bug caused by not using a CM can be difficult. > How is PEP 550 is at fault of somebody being lazy and not using a > context manager? Because PEP 550 makes a CM unnecessary in the simple (common?) case, hiding the need for a CM in not-so-simple cases. For comparison: in Python 3 we are now warned about files that have been left open (because explicitly closing files was unnecessary in CPython due to an implementation detail) -- the solution? make files context managers whose __exit__ closes the file. > PEP 550 has a hard requirement to make it possible for decimal/other > libraries to start using its APIs and stay backwards compatible, so it > allows `decimal.setcontext(ctx)` function to be implemented. We are > fixing things here. I appreciate that the scientific and number-crunching communities have been a major driver of enhancements for Python (such as rich comparisons and, more recently, matrix operators), but I don't think an enhancement for them that makes life more difficult for the rest is a net win. -- ~Ethan~ From elprans at gmail.com Thu Sep 7 10:01:12 2017 From: elprans at gmail.com (Elvis Pranskevichus) Date: Thu, 07 Sep 2017 10:01:12 -0400 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: <59B0FB27.2090304@canterbury.ac.nz> References: <59B0FB27.2090304@canterbury.ac.nz> Message-ID: <1950180.T6HDsZXD2Z@hammer.magicstack.net> On Thursday, September 7, 2017 3:54:15 AM EDT Greg Ewing wrote: > This problem would also not arise if context vars > simply had names instead of being magic key objects: > > def foo(): > contextvars.set("mymodule.myvar", 1) > > That's another thing I think would be an improvement, > but it's orthogonal to what we're talking about here > and would be best discussed separately. On the contrary, using simple names (PEP 550 V1 was actually doing that) is a regression. It opens up namespace clashing issues. Imagine you have a variable named "foo", and then some library you import also decides to use the name "foo", what then? That's one of the reasons why we do `local = threading.local()` instead of `threading.set_local("foo", 1)`. Elvis From ethan at stoneleaf.us Thu Sep 7 10:06:14 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 07 Sep 2017 07:06:14 -0700 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: <1679203.vpm3Kbeti1@hammer.magicstack.net> References: <59B12186.6030707@canterbury.ac.nz> <59B14436.8040705@stoneleaf.us> <1679203.vpm3Kbeti1@hammer.magicstack.net> Message-ID: <59B15256.8030209@stoneleaf.us> On 09/07/2017 06:41 AM, Elvis Pranskevichus wrote: > On Thursday, September 7, 2017 9:05:58 AM EDT Ethan Furman wrote: >> The disagreement seems to be whether a LogicalContext should be >> created implicitly vs explicitly (or opt-out vs opt-in). As a user >> trying to track down a decimal context change not propagating, I >> would not suspect the above code of automatically creating a >> LogicalContext and isolating the change, whereas Greg's context >> manager version is abundantly clear. >> >> The implicit vs explicit argument comes down, I think, to resource >> management: some resources in Python are automatically managed >> (memory), and some are not (files) -- which type should LCs be? 
> > You are confusing resource management with the isolation mechanism. PEP > 550 contextvars are analogous to threading.local(), which the PEP makes > very clear from the outset. I might be, and I wouldn't be surprised. :) On the other hand, one can look at isolation as being a resource. > threading.local(), the isolation mechanism, is *implicit*. I don't think so. You don't get threading.local() unless you call it -- that makes it explicit. > decimal.localcontext() is an *explicit* resource manager that relies on > threading.local() magic. PEP 550 simply provides a threading.local() > alternative that works in tasks and generators. That's it! The concern is *how* PEP 550 provides it: - explicitly, like threading.local(): has to be set up manually, preferably with a context manager - implicitly: it just happens under certain conditions -- ~Ethan~ From k7hoven at gmail.com Thu Sep 7 10:04:15 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Thu, 7 Sep 2017 17:04:15 +0300 Subject: [Python-Dev] Consolidate stateful runtime globals In-Reply-To: <1504718226.444156.1097348696.6F266B12@webmail.messagingengine.com> References: <20170906121459.215159bc@fsol> <1504716149.433314.1097304088.73AD5975@webmail.messagingengine.com> <20170906190801.5121e323@fsol> <1504718226.444156.1097348696.6F266B12@webmail.messagingengine.com> Message-ID: On Wed, Sep 6, 2017 at 8:17 PM, Benjamin Peterson wrote: > On Wed, Sep 6, 2017, at 10:08, Antoine Pitrou wrote: > > On Wed, 06 Sep 2017 09:42:29 -0700 > > Benjamin Peterson wrote: > > > On Wed, Sep 6, 2017, at 03:14, Antoine Pitrou wrote: > > > > > > > > Hello, > > > > > > > > I'm a bit concerned about > > > > https://github.com/python/cpython/commit/76d5abc8684bac4f2fc > 7cccfe2cd940923357351 > > > > > > > > My main gripe is that makes writing C code more tedious. Simple C > > > > global variables such as "_once_registry" are now spelled > > > > "_PyRuntime.warnings.once_registry". The most egregious example > seems > > > > to be "_PyRuntime.ceval.gil.locked" (used to be simply "gil_locked"). > > > > > > > > Granted, C is more verbose than Python, but it doesn't have to become > > > > that verbose. I don't know about you, but when code becomes annoying > > > > to type, I tend to try and take shortcuts. > > > > > > How often are you actually typing the names of runtime globals, though? > > > > Not very often, but if I want to experiment with some low-level > > implementation details, it is nice to avoid the hassle. > > It seems like this could be remediated with some inline functions or > macros, which would also help safely encapsulate state. > > > > > There's also a readability argument: with very long names, expressions > > can become less easy to parse. > > > > > If you are using a globals, perhaps the typing time will allow you to > > > fully consider the gravity of the situation. > > > > Right, I needed to be reminded of how perilous the use of C globals is. > > Perhaps I should contact the PSRT the next time I contemplate using a C > > global. > > It's not just you but future readers. Great. Related to this, there is also discussion on dangers of globals and other widely-scoped variables in the Rationale section of PEP 555 (Context-local variables), for anyone interested. But if you read the draft I posted on python-ideas last Monday, you've already seen it. ??Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From elprans at gmail.com Thu Sep 7 09:41:10 2017 From: elprans at gmail.com (Elvis Pranskevichus) Date: Thu, 07 Sep 2017 09:41:10 -0400 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: <59B14436.8040705@stoneleaf.us> References: <59B12186.6030707@canterbury.ac.nz> <59B14436.8040705@stoneleaf.us> Message-ID: <1679203.vpm3Kbeti1@hammer.magicstack.net> On Thursday, September 7, 2017 9:05:58 AM EDT Ethan Furman wrote: > The disagreement seems to be whether a LogicalContext should be > created implicitly vs explicitly (or opt-out vs opt-in). As a user > trying to track down a decimal context change not propagating, I > would not suspect the above code of automatically creating a > LogicalContext and isolating the change, whereas Greg's context > manager version is abundantly clear. > > The implicit vs explicit argument comes down, I think, to resource > management: some resources in Python are automatically managed > (memory), and some are not (files) -- which type should LCs be? You are confusing resource management with the isolation mechanism. PEP 550 contextvars are analogous to threading.local(), which the PEP makes very clear from the outset. threading.local(), the isolation mechanism, is *implicit*. decimal.localcontext() is an *explicit* resource manager that relies on threading.local() magic. PEP 550 simply provides a threading.local() alternative that works in tasks and generators. That's it! Elvis From elprans at gmail.com Thu Sep 7 10:13:46 2017 From: elprans at gmail.com (Elvis Pranskevichus) Date: Thu, 07 Sep 2017 10:13:46 -0400 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: <59B12186.6030707@canterbury.ac.nz> References: <59B12186.6030707@canterbury.ac.nz> Message-ID: <6221644.YDppWAoE6Y@hammer.magicstack.net> On Thursday, September 7, 2017 6:37:58 AM EDT Greg Ewing wrote: > 2) You ignore it and always use a context manager, in > which case it's not strictly necessary for the implicit > context push to occur, since the relevant context managers > can take care of it. > > So there doesn't seem to be any great advantage to the > automatic context push, and it has some disadvantages, > such as yield-from not quite working as expected in > some situations. The advantage is that context managers don't need to *always* allocate and push an LC. [1] > Also, it seems that every generator is going to incur > the overhead of allocating a logical_context even when > it doesn't actually change any context vars, which most > generators won't. By default, generators reference an empty LogicalContext object that is allocated once (like the None object). We can do that because LCs are immutable. Elvis [1] https://mail.python.org/pipermail/python-dev/2017-September/ 149265.html From elprans at gmail.com Thu Sep 7 10:17:15 2017 From: elprans at gmail.com (Elvis Pranskevichus) Date: Thu, 07 Sep 2017 10:17:15 -0400 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: <59B15256.8030209@stoneleaf.us> References: <1679203.vpm3Kbeti1@hammer.magicstack.net> <59B15256.8030209@stoneleaf.us> Message-ID: <1579189.ruT1Uy8ScG@hammer.magicstack.net> On Thursday, September 7, 2017 10:06:14 AM EDT Ethan Furman wrote: > I might be, and I wouldn't be surprised. :) On the other hand, one > can look at isolation as being a resource. > > threading.local(), the isolation mechanism, is *implicit*. > > I don't think so. You don't get threading.local() unless you call it > -- that makes it explicit. > > decimal.localcontext() is an *explicit* resource manager that > > relies on threading.local() magic. 
PEP 550 simply provides a
> > threading.local() alternative that works in tasks and generators.
> > That's it!
>
> The concern is *how* PEP 550 provides it:
>
> - explicitly, like threading.local(): has to be set up manually,
>   preferably with a context manager
>
> - implicitly: it just happens under certain conditions
>

You literally replace threading.local() with contextvars.ContextVar():

    import threading

    _decimal_context = threading.local()

    def set_decimal_context(ctx):
        _decimal_context.context = ctx

Becomes:

    import contextvars

    _decimal_context = contextvars.ContextVar('decimal.Context')

    def set_decimal_context(ctx):
        _decimal_context.set(ctx)

                                                 Elvis

From guido at python.org  Thu Sep  7 10:32:09 2017
From: guido at python.org (Guido van Rossum)
Date: Thu, 7 Sep 2017 07:32:09 -0700
Subject: [Python-Dev] PEP 550 discussion time out
Message-ID:

I am declaring a time-out for discussion of PEP 550 and its
competitor, PEP 555. There has been some good discussion but also some
mud-slinging (in various directions), and I've been frankly
overwhelmed by the exchanges, despite having spent much of Tuesday
talking to Yury about it and lying awake thinking about it for two
nights since then. I need to sit down and think more about the problem
we're trying to solve, what the constraints are (and why), and what
the minimal solution would be.

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guido at python.org  Thu Sep  7 10:34:24 2017
From: guido at python.org (Guido van Rossum)
Date: Thu, 7 Sep 2017 07:34:24 -0700
Subject: [Python-Dev] PEP 550 v4
In-Reply-To: <1579189.ruT1Uy8ScG@hammer.magicstack.net>
References: <1679203.vpm3Kbeti1@hammer.magicstack.net>
	<59B15256.8030209@stoneleaf.us>
	<1579189.ruT1Uy8ScG@hammer.magicstack.net>
Message-ID:

I wrote it in a new thread, but I also want to write it here -- I need
a time-out in this discussion so I can think about it more.

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stefan at bytereef.org  Thu Sep  7 10:42:21 2017
From: stefan at bytereef.org (Stefan Krah)
Date: Thu, 7 Sep 2017 16:42:21 +0200
Subject: [Python-Dev] PEP 550 v4
In-Reply-To: <1679203.vpm3Kbeti1@hammer.magicstack.net>
References: <59B12186.6030707@canterbury.ac.nz>
	<59B14436.8040705@stoneleaf.us>
	<1679203.vpm3Kbeti1@hammer.magicstack.net>
Message-ID: <20170907144221.GA2612@bytereef.org>

On Thu, Sep 07, 2017 at 09:41:10AM -0400, Elvis Pranskevichus wrote:
> threading.local(), the isolation mechanism, is *implicit*.
> decimal.localcontext() is an *explicit* resource manager that relies on
> threading.local() magic. PEP 550 simply provides a threading.local()
> alternative that works in tasks and generators. That's it!

If only there were a name that would make it explicit, like
TaskLocalStorage. ;)

Seriously, the problem with 'context' is that it is:

a) A predefined set of state values like in the Decimal (I think also
the OpenSSL) context. But such a context is put inside another context
(the ExecutionContext).

b) A theoretical concept from typed lambda calculus (in the context
'gamma' the variable 'v' has type 't'). But this concept would be
associated with lexical scope and would extend to functions (not only
tasks and generators).

c) ``man 3 setcontext``. A replacement for setjmp/longjmp. Somewhat
related in that it could be used to implement coroutines.

d) The .NET flowery language. I still do not fully understand what the
.NET ExecutionContext and its 2881 implicit flow rules are.

...

Stefan Krah

From guido at python.org  Thu Sep  7 10:46:49 2017
From: guido at python.org (Guido van Rossum)
Date: Thu, 7 Sep 2017 07:46:49 -0700
Subject: [Python-Dev] PEP 553: Built-in debug()
In-Reply-To:
References: <98F44C60-2DBB-4771-9544-B9DA7210353C@python.org>
	<03B5AA26-0AE0-4EFB-9300-B71F9ECD3AE8@python.org>
Message-ID:

On Wed, Sep 6, 2017 at 11:59 PM, Terry Reedy wrote:

> On 9/6/2017 6:45 PM, Barry Warsaw wrote:
>
>> On Sep 6, 2017, at 14:59, Terry Reedy wrote:
>>
>>> Currently, the debugger is started in response to a menu selection in
>>> the IDLE process while the python process is idle.
>>> One reason for the 'idle' requirement is that when code is
>>> exec-uting, the loop that reads commands, executes them, and sends
>>> responses is blocked on the exec call. The IDLE process sets up its
>>> debugger window, its ends of the rpc channels, and commands the
>>> python process to set up Idb and the other ends of the channels. The
>>> challenge would be to initiate setup from the server process and
>>> deal with the blocked loop.
>>
>> Would the environment variable idea in the latest version of the PEP
>> help you here?
>
> That seems to be a mechanism for users to convey which function to
> call. It does not simplify the function call chosen.
>
> pdb works because a) it is designed to be imported into and run in a
> user module and b) it uses the existing stdin and stdout streams to
> interact with the user. Idb, on the other hand, is designed to be
> imported in idlelib.run, which is (almost) invisible to user code, and
> to use a special channel to the GUI window.
>
> Having said that, I see a solution: before running user code, set
> things up as now, except do not start tracing and do not show the
> debugger window, but do set a run.idb_ready flag. Then
> breakpoint_handler() can check the flag and start tracing either
> through idb or pdb.

Without following all this (or the PEP, yet) super exactly, does this
mean you are satisfied with what the PEP currently proposes or would
you like changes? It's a little unclear from what you wrote.

> What makes breakpoint() less than super useful in IDLE is that IDLE
> already allows setting of persistent breakpoints on a line by
> right-clicking. Since they are not stored in the file, there is
> nothing to erase when one runs the code in Python directly.

I think that's true for any IDE that has existing integrated debug
capabilities. However for every IDE I would hope that calling
breakpoint() will *also* break into the IDE's debugger. My use case is
that sometimes I have a need for a *conditional* breakpoint where it's
just much simpler to decide whether to break or not in Python code
than it would be to use the IDE's conditional breakpoint facilities.
(Plus there's something about old dogs and new tricks, but I've
forgotten exactly how that one goes. :-)

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From barry at python.org  Thu Sep  7 12:50:19 2017
From: barry at python.org (Barry Warsaw)
Date: Thu, 7 Sep 2017 09:50:19 -0700
Subject: [Python-Dev] PEP 553: Built-in debug()
In-Reply-To:
References: <7DDCE154-7596-414F-90C4-A2AE0618C728@python.org>
	<3AA673BE-9B32-4ACB-AB4D-940A006C0A4D@python.org>
	<4AEEEC14-2FF3-4A48-962D-1293B8B0A8FC@python.org>
Message-ID:

On Sep 6, 2017, at 23:10, Terry Reedy wrote:
>
> Environmental variables are set to strings, not objects. It is not
> clear how you intend to handle the conversion.

The environment variable names a module import path. Without quibbling
about the details of the syntax (because honestly, I'm not convinced
it's a useful feature), it would work roughly like this (a sketch of
the resolution logic follows the list):

* The default value is equivalent to PYTHONBREAKPOINTHOOK=pdb.set_trace
* breakpoint() splits the value on the rightmost dot
* modules on the LHS are imported, then the RHS is getattr'd out of that
* That's the callable breakpoint() calls
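In code, that resolution would be roughly (a sketch of the behaviour
described above, not the PEP's final specification):

    import importlib

    def resolve_breakpoint_hook(value):
        # value comes from PYTHONBREAKPOINTHOOK, e.g. 'pdb.set_trace'
        modname, _, funcname = value.rpartition('.')
        module = importlib.import_module(modname)
        return getattr(module, funcname)

    # resolve_breakpoint_hook('pdb.set_trace') -> pdb's set_trace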
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 273 bytes
Desc: Message signed with OpenPGP
URL:

From barry at python.org  Thu Sep  7 12:52:20 2017
From: barry at python.org (Barry Warsaw)
Date: Thu, 7 Sep 2017 09:52:20 -0700
Subject: [Python-Dev] PEP 553: Built-in debug()
In-Reply-To:
References: <98F44C60-2DBB-4771-9544-B9DA7210353C@python.org>
	<03B5AA26-0AE0-4EFB-9300-B71F9ECD3AE8@python.org>
Message-ID: <1F68BA08-AC94-41A6-AFD2-4439DA5D3371@python.org>

On Sep 7, 2017, at 07:46, Guido van Rossum wrote:
> Without following all this (or the PEP, yet) super exactly, does this
> mean you are satisfied with what the PEP currently proposes or would
> you like changes? It's a little unclear from what you wrote.

I'm also unsure whether Terry is good with the existing PEP or
suggesting changes.

> I think that's true for any IDE that has existing integrated debug
> capabilities. However for every IDE I would hope that calling
> breakpoint() will *also* break into the IDE's debugger. My use case is
> that sometimes I have a need for a *conditional* breakpoint where it's
> just much simpler to decide whether to break or not in Python code
> than it would be to use the IDE's conditional breakpoint facilities.

That certainly aligns with my own experience and expectations. I guess
I'm a fellow old dog. :)

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 273 bytes
Desc: Message signed with OpenPGP
URL:

From christian at python.org  Thu Sep  7 13:00:52 2017
From: christian at python.org (Christian Heimes)
Date: Thu, 7 Sep 2017 10:00:52 -0700
Subject: [Python-Dev] PEP 553: Built-in debug()
In-Reply-To:
References: <7DDCE154-7596-414F-90C4-A2AE0618C728@python.org>
	<3AA673BE-9B32-4ACB-AB4D-940A006C0A4D@python.org>
	<4AEEEC14-2FF3-4A48-962D-1293B8B0A8FC@python.org>
Message-ID:

On 2017-09-07 09:50, Barry Warsaw wrote:
> On Sep 6, 2017, at 23:10, Terry Reedy wrote:
>>
>> Environmental variables are set to strings, not objects. It is not
>> clear how you intend to handle the conversion.
>
> The environment variable names a module import path. Without quibbling
> about the details of the syntax (because honestly, I'm not convinced
> it's a useful feature), it would work roughly like:
>
> * The default value is equivalent to PYTHONBREAKPOINTHOOK=pdb.set_trace
> * breakpoint() splits the value on the rightmost dot
> * modules on the LHS are imported, then the RHS is getattr'd out of that
> * That's the callable breakpoint() calls

Setuptools' entry points [1] use a colon between the import path and
the function, e.g. "pdb:set_trace" would import pdb and then execute
set_trace. The approach can be augmented to allow calling a class
method, too.

So

    "package.module:myclass.classfunc"

would do:

    from package.module import myclass
    myclass.classfunc

Regards,
Christian

[1] https://setuptools.readthedocs.io/en/latest/setuptools.html#automatic-script-creation

From solipsis at pitrou.net  Thu Sep  7 13:16:37 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 7 Sep 2017 19:16:37 +0200
Subject: [Python-Dev] Compiling without multithreading support -- still useful?
References: <20170905183651.6934a16d@fsol>
Message-ID: <20170907191637.3cbc38ef@fsol>

This is now in git master after being merged by Victor in
https://github.com/python/cpython/pull/3385.

Regards

Antoine.

On Tue, 5 Sep 2017 18:36:51 +0200
Antoine Pitrou wrote:
> Hello,
>
> It's 2017 and we are still allowing people to compile CPython without
> threads support.
It adds some complication in several places > (including delicate parts of our internal C code) without a clear > benefit. Do people still need this? > > Regards > > Antoine. > > From barry at python.org Thu Sep 7 13:19:52 2017 From: barry at python.org (Barry Warsaw) Date: Thu, 7 Sep 2017 10:19:52 -0700 Subject: [Python-Dev] PEP 553: Built-in debug() In-Reply-To: References: <7DDCE154-7596-414F-90C4-A2AE0618C728@python.org> <3AA673BE-9B32-4ACB-AB4D-940A006C0A4D@python.org> <4AEEEC14-2FF3-4A48-962D-1293B8B0A8FC@python.org> Message-ID: On Sep 7, 2017, at 10:00, Christian Heimes wrote: > Setuptools' entry points [1] use colon between import and function, e.g. > "pdb:set_trace" would import pdb and then execute set_trace. The > approach can be augmented to allow calling a class method, too. > > So > > "package.module:myclass.classfunc" > > would do : > > from package.module import myclass > myclass.classfunc Yep, that?s how it's described in the PEP 553 open issue. I just didn?t include that complication in my response. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 273 bytes Desc: Message signed with OpenPGP URL: From neil at python.ca Thu Sep 7 13:30:12 2017 From: neil at python.ca (Neil Schemenauer) Date: Thu, 7 Sep 2017 11:30:12 -0600 Subject: [Python-Dev] Memory bitmaps for the Python cyclic garbage collector Message-ID: <20170907173012.vmvhx5lt2ajghqfy@python.ca> Python objects that participate in cyclic GC (things like lists, dicts, sets but not strings, ints and floats) have extra memory overhead. I think it is possible to mostly eliminate this overhead. Also, while the GC is running, this GC state is mutated, which destroys copy-on-write optimizations. This change would mostly fix that issue. All objects that participate in cyclic GC have the Py_TPFLAGS_HAVE_GC bit set in their type. That causes an extra chunk of memory to be allocated *before* the ob_refcnt struct member. This is the PyGC_Head struct. The whole object looks like this in memory (PyObject pointer is at arrow): union __gc_head *gc_next; union __gc_head *gc_prev; Py_ssize_t gc_refs; --> Py_ssize_t ob_refcnt struct _typeobject *ob_type; [rest of PyObject members] So, 24 bytes of overhead on a 64-bit machine. The smallest Python object that can have a pointer to another object (e.g. a single PyObject * member) is 48 bytes. Removing PyGC_Head would cut the size of these objects in half. Carl Shaprio questioned me today on why we use a double linked-list and not the memory bitmap. I think the answer is that there is no good reason. We use a double linked list only due to historical constraints that are no longer present. Long ago, Python objects could be allocated using the system malloc or other memory allocators. Since we could not control the memory location, bitmaps would be inefficient. Today, we allocate all Python objects via our own function. Python objects under a certain size are allocated using our own malloc, obmalloc, and are stored in memory blocks known "arenas". The PyGC_Head struct performs three functions. First, it allows the GC to find all Python objects that will be checked for cycles (i.e. follow the linked list). Second, it stores a single bit of information to let the GC know if it is safe to traverse the object, set with PyObject_GC_Track(). Finally, it has a scratch area to compute the effective reference count while tracing refs (gc_refs). 
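To build intuition for the alternative sketched next, here is a toy
Python model of per-arena bitmap tracking. This is purely
illustrative; the real thing would be C code inside the allocator:

    class ToyArena:
        """Fixed-size object slots plus one 'tracked' bit per slot."""

        def __init__(self, nslots):
            self.slots = [None] * nslots
            self.tracked = 0                 # bitmap, one bit per slot

        def gc_track(self, i):               # PyObject_GC_Track analogue
            self.tracked |= 1 << i

        def gc_untrack(self, i):             # PyObject_GC_Untrack analogue
            self.tracked &= ~(1 << i)

        def tracked_objects(self):
            """What the collector crawls instead of a linked list."""
            bits, i = self.tracked, 0
            while bits:
                if bits & 1:
                    yield self.slots[i]
                bits >>= 1
                i += 1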
Here is a sketch of how we can remove the PyGC_Head struct for small objects (say less than 512 bytes). Large objects or objects created by a different memory allocator will still have the PyGC_Head overhead. * Have memory arenas that contain only objects with the Py_TPFLAGS_HAVE_GC flag. Objects like ints, strings, etc will be in different arenas, not have bitmaps, not be looked at by the cyclic GC. * For those arenas, add a memory bitmap. The bitmap is a bit array that has a bit for each fixed size object in the arena. The memory used by the bitmap is a fraction of what is needed by PyGC_Head. E.g. an arena that holds up to 1024 objects of 48 bytes in size would have a bitmap of 1024 bits. * The bits will be set and cleared by PyObject_GC_Track/Untrack() * We also need an array of Py_ssize_t to take over the job of gc_refs. That could be allocated only when GC is working and it only needs to be the size of the number of true bits in the bitmap. Or, it could be allocated when the arena is allocated and be sized for the full arena. * Objects that are too large would still get the PyGC_Head struct allocated "in front" of the PyObject. Because they are big, the overhead is not so bad. * The GC process would work nearly the same as it does now. Rather than only traversing the linked list, we would also have to crawl over the GC object arenas, check blocks of memory that have the tracked bit set. There are a lot of smaller details to work out but I see no reason why the idea should not work. It should significantly reduce memory usage. Also, because the bitmap and gc_refs are contiguous in memory, locality will be improved. ?ukasz Langa has mentioned that the current GC causes issues with copy-on-write memory in big applications. This change should solve that issue. To implement, I think the easiest path is to create new malloc to be used by small GC objects, e.g. gcmalloc.c. It would be similar to obmalloc but have the features needed to keep track of the bitmap. obmalloc has some quirks that makes it hard to use for this purpose. Once the idea is proven, gcmalloc could be merged or made to be a variation of obmalloc. Or, maybe just optimized and remain separate. obmalloc is complicated and highly optimized. So, adding additional functionality to it will be challenging. I believe this change would be ABI compatible. From nas at arctrix.com Thu Sep 7 14:08:41 2017 From: nas at arctrix.com (Neil Schemenauer) Date: Thu, 7 Sep 2017 18:08:41 +0000 (UTC) Subject: [Python-Dev] Consolidate stateful runtime globals References: <20170906121459.215159bc@fsol> Message-ID: Is there any issue with unit-at-a-time optimization? I would imagine that a static global would allow optimizations that are not safe for a exported global (not sure the C term for it). I suspect it doesn't matter and I support the idea in general. Global variables in extension modules kills the idea of a mark-and-sweap or some other GC mechanism. That's probably not going to happen but identifying all of the global state seems like a step forward. From nas at arctrix.com Thu Sep 7 14:14:12 2017 From: nas at arctrix.com (Neil Schemenauer) Date: Thu, 7 Sep 2017 18:14:12 +0000 (UTC) Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties) References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org> Message-ID: Larry Hastings wrote: > The TL;DR summary: add support for property objects to modules. > I've already posted a prototype. I posted an idea to python-ideas about lazy module loading. 
If the lazy loading idea works, having properties would allow modules to continue to be "lazy safe" but to easily do init logic when needed, e.g. getting of the property. There should be a very clean way to do that, IMHO. Using __class__ is not clean and it would be unfortunate to have the __class__ song-and-dance in a bunch of modules. Using property() seems more Pythonic. From solipsis at pitrou.net Thu Sep 7 14:22:43 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 7 Sep 2017 20:22:43 +0200 Subject: [Python-Dev] Memory bitmaps for the Python cyclic garbage collector References: <20170907173012.vmvhx5lt2ajghqfy@python.ca> Message-ID: <20170907202243.37ff917d@fsol> On Thu, 7 Sep 2017 11:30:12 -0600 Neil Schemenauer wrote: > > * The GC process would work nearly the same as it does now. Rather than > only traversing the linked list, we would also have to crawl over the > GC object arenas, check blocks of memory that have the tracked bit > set. Small note: the linked lists are also used for distinguishing between generations. In other words, there is not one linked list but three of them (one per generation). This means you probably need two bits per object, not one. Other than that, it's an interesting proposal. I'm looking forward to the concrete results (performance, and maintenance overhead) :-) Regards Antoine. From barry at python.org Thu Sep 7 14:43:41 2017 From: barry at python.org (Barry Warsaw) Date: Thu, 07 Sep 2017 11:43:41 -0700 Subject: [Python-Dev] PEP 553 V2 - builtin breakpoint() (was Re: PEP 553: Built-in debug()) Message-ID: Thanks for all the great feedback folks! Here then is PEP 553 version 2. The notable changes are: * Change the name of the built-in from debug() to breakpoint() * Modify the signature to be breakpoint(*args, **kws) https://www.python.org/dev/peps/pep-0553/ Included below for convenience. Cheers, -Barry PEP: 553 Title: Built-in breakpoint() Author: Barry Warsaw Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 2017-09-05 Python-Version: 3.7 Post-History: 2017-09-05, 2017-09-07 Abstract ======== This PEP proposes adding a new built-in function called ``breakpoint()`` which enters a Python debugger at the point of the call. Additionally, two new names are added to the ``sys`` module to make the debugger pluggable. Rationale ========= Python has long had a great debugger in its standard library called ``pdb``. Setting a break point is commonly written like this:: foo() import pdb; pdb.set_trace() bar() Thus after executing ``foo()`` and before executing ``bar()``, Python will enter the debugger. However this idiom has several disadvantages. * It's a lot to type (27 characters). * It's easy to typo. The PEP author often mistypes this line, e.g. omitting the semicolon, or typing a dot instead of an underscore. * It ties debugging directly to the choice of pdb. There might be other debugging options, say if you're using an IDE or some other development environment. * Python linters (e.g. flake8 [1]_) complain about this line because it contains two statements. Breaking the idiom up into two lines further complicates the use of the debugger, These problems can be solved by modeling a solution based on prior art in other languages, and utilizing a convention that already exists in Python. Proposal ======== The JavaScript language provides a ``debugger`` statement [2]_ which enters the debugger at the point where the statement appears. 
This PEP proposes a new built-in function called ``breakpoint()`` which enters a Python debugger at the call site. Thus the example above would be written like so::

    foo()
    breakpoint()
    bar()

Further, this PEP proposes two new name bindings for the ``sys`` module, called ``sys.breakpointhook()`` and ``sys.__breakpointhook__``. By default, ``sys.breakpointhook()`` implements the actual importing and entry into ``pdb.set_trace()``, and it can be set to a different function to change the debugger that ``breakpoint()`` enters. ``sys.__breakpointhook__`` then stashes the default value of ``sys.breakpointhook()`` to make it easy to reset. This exactly models the existing ``sys.displayhook()`` / ``sys.__displayhook__`` and ``sys.excepthook()`` / ``sys.__excepthook__`` hooks [3]_.

The signature of the built-in is ``breakpoint(*args, **kws)``. The positional and keyword arguments are passed straight through to ``sys.breakpointhook()`` and the signatures must match or a ``TypeError`` will be raised. The return from ``sys.breakpointhook()`` is passed back up to, and returned from, ``breakpoint()``. Since ``sys.breakpointhook()`` by default calls ``pdb.set_trace()``, it accepts no arguments by default.

Open issues
===========

Confirmation from other debugger vendors
----------------------------------------

We want to get confirmation from at least one alternative debugger implementation (e.g. PyCharm) that the hooks provided in this PEP will be useful to them.

Breakpoint bytecode
-------------------

Related, there has been an idea to add a bytecode that calls ``sys.breakpointhook()``. Whether built-in ``breakpoint()`` emits this bytecode (or gets peephole optimized to the bytecode) is an open issue. The bytecode is useful for debuggers that actively modify bytecode streams to trampoline into their own debugger. Having a "breakpoint" bytecode might allow them to avoid bytecode modification in order to invoke this trampoline. *NOTE*: It probably makes sense to split this idea into a separate PEP.

Environment variable
--------------------

Should we add an environment variable so that ``sys.breakpointhook()`` can be set outside of the Python invocation? E.g.::

    $ export PYTHONBREAKPOINTHOOK=my.debugger:Debugger

This would allow execution environments such as IDEs, which run Python code inside them, to set an internal breakpoint hook before any Python code executes.

Call a fancier object by default
--------------------------------

Some folks want to be able to use other ``pdb`` interfaces such as ``pdb.pm()``. Although this is a less commonly used API, it could be supported by binding ``sys.breakpointhook`` to an object that implements ``__call__()``. Calling this object would call ``pdb.set_trace()``, but the object could expose other methods, such as ``pdb.pm()``, making invocation of it as handy as ``breakpoint.pm()``.

Implementation
==============

A pull request exists with the proposed implementation [4]_.

Rejected alternatives
=====================

A new keyword
-------------

Originally, the author considered a new keyword, or an extension to an existing keyword such as ``break here``. This is rejected on several fronts.

* A brand new keyword would require a ``__future__`` to enable it since almost any new keyword could conflict with existing code. This negates the ease with which you can enter the debugger.
* An extended keyword such as ``break here``, while more readable and not requiring a ``__future__``, would tie the keyword extension to this new feature, preventing more useful extensions such as those proposed in PEP 548.

* A new keyword would require a modified grammar and likely a new bytecode. Each of these makes the implementation more complex. A new built-in breaks no existing code (since any existing module global would just shadow the built-in) and is quite easy to implement.

sys.breakpoint()
----------------

Why not ``sys.breakpoint()``? Requiring an import to invoke the debugger is explicitly rejected because ``sys`` is not imported in every module. That just requires more typing and would lead to::

    import sys; sys.breakpoint()

which inherits several of the problems this PEP aims to solve.

Version History
===============

* 2017-09-07

  * ``debug()`` renamed to ``breakpoint()``
  * Signature changed to ``breakpoint(*args, **kws)`` which is passed straight through to ``sys.breakpointhook()``.

References
==========

.. [1] http://flake8.readthedocs.io/en/latest/
.. [2] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/debugger
.. [3] https://docs.python.org/3/library/sys.html#sys.displayhook
.. [4] https://github.com/python/cpython/pull/3355

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From fperez.net at gmail.com Thu Sep 7 15:09:09 2017 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 7 Sep 2017 12:09:09 -0700 Subject: [Python-Dev] PEP 553: Built-in debug() References: <20170906025715.GG13110@ando.pearwood.info> <1504667186.2203533.1096578744.6E33CEB6@webmail.messagingengine.com> <21D29DF6-B2DB-4A6C-8290-13DB8D7B21B4@python.org> Message-ID:

On 2017-09-07 00:20:17 +0000, Barry Warsaw said:

> Thanks Fernando, this is exactly the kind of feedback from other
> debuggers that I'm looking for. It certainly sounds like a handy
> feature; I've found myself wanting something like that from pdb from
> time to time.

Glad it's useful, thanks for the PEP!

> The PEP has an open issue regarding breakpoint() taking *args and
> **kws, which would just be passed through the call stack. It sounds
> like you'd be in favor of that enhancement.

If you go with the `(*a, **k)` pass-through API, would you have a special keyword-only arg called 'header' or similar? That seems like a decent compromise to support the feature with the builtin while allowing other implementations to offer more features. In any case, +1 to a pass-through API, as long as the built-in supports some kind of mechanism to help the user get their bearings with "you're here" type messages.

Cheers f

From benjamin at python.org Thu Sep 7 16:39:21 2017 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 07 Sep 2017 13:39:21 -0700 Subject: [Python-Dev] PEP 552: deterministic pycs Message-ID: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com>

Hello, I've written a short PEP about an import extension to allow pycs to be more deterministic by optionally replacing the timestamp with a hash of the source file: https://www.python.org/dev/peps/pep-0552/

Thanks for reading, Benjamin

P.S. I came up with the idea for this PEP while awake.
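The core mechanics are simple: instead of comparing an mtime/size pair stored in the pyc header against the source file's metadata, the loader compares a source hash stored in the header against a hash of the current source. A minimal sketch of that check, assuming a hypothetical header layout (4-byte magic, 4-byte flags, 8-byte source hash) and using blake2b purely as a stand-in for whatever hash function the PEP ultimately specifies::

    import hashlib

    def pyc_is_valid(source_bytes, pyc_bytes):
        # Hypothetical layout: 4-byte magic, 4-byte flags, 8-byte source hash.
        stored_hash = pyc_bytes[8:16]
        # blake2b is only a stand-in; the PEP specifies its own hash function.
        current_hash = hashlib.blake2b(source_bytes, digest_size=8).digest()
        return stored_hash == current_hash

The point is that validity then depends only on file contents, so identical sources always validate against identical pycs, regardless of what the filesystem says about dates.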
From tjreedy at udel.edu Thu Sep 7 16:52:10 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 7 Sep 2017 16:52:10 -0400 Subject: [Python-Dev] PEP 553: Built-in debug() In-Reply-To: References: <7DDCE154-7596-414F-90C4-A2AE0618C728@python.org> <3AA673BE-9B32-4ACB-AB4D-940A006C0A4D@python.org> <4AEEEC14-2FF3-4A48-962D-1293B8B0A8FC@python.org> Message-ID:

On 9/7/2017 12:50 PM, Barry Warsaw wrote:
> On Sep 6, 2017, at 23:10, Terry Reedy wrote:
>>
>> Environmental variables are set to strings, not objects. It is not clear how you intend to handle the conversion.
>
> The environment variable names a module import path. Without quibbling about the details of the syntax (because honestly, I'm not convinced it's a useful feature), it would work roughly like:
>
> * The default value is equivalent to PYTHONBREAKPOINTHOOK=pdb.set_trace
> * breakpoint() splits the value on the rightmost dot
> * modules on the LHS are imported, then the RHS is getattr'd out of that
> * That's the callable breakpoint() calls

Environmental variables tend to be a pain on Windows and nigh unusable by beginners. Leaving that aside, I see these problems with trying to use one for IDLE's *current* debugger.

pdb is universal, in the sense of working with any python run with actual or simulated stdin and stdout. IDLE's idb is specific to working with IDLE. So one could not set an EV to 'idlelib.idb.start' and leave it while switching between IDLE and console.

pdb runs in one process, with communication between debugger and text ui handled by call and return. It can be started on the fly. IDLE runs code in a separate process. The debugger has to run in the user process. IDLE currently runs the GUI in the IDLE process. So a complicated communication process has to be set up with rpc proxies and adaptors, and this is best done *before* code runs. The on-the-fly function should not need an import and it might be better to not have one.

pdb.set_trace is a public and stable interface. IDLE's is private and likely to be initially unstable. I can imagine that the function that I would want to bind to sys.__breakpoint__ would be a bound method.

Thinking about the above prompted me to rethink IDLE's debugger and consider rewriting it as an IDLE-independent gui debugger. I will say more in response to Guido and you.

-- Terry Jan Reedy

From solipsis at pitrou.net Thu Sep 7 17:00:43 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 7 Sep 2017 23:00:43 +0200 Subject: [Python-Dev] PEP 552: deterministic pycs References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> Message-ID: <20170907230043.54d8f3f7@fsol>

On Thu, 07 Sep 2017 13:39:21 -0700 Benjamin Peterson wrote:
> Hello,
> I've written a short PEP about an import extension to allow pycs to be
> more deterministic by optionally replacing the timestamp with a hash of
> the source file: https://www.python.org/dev/peps/pep-0552/

Why isn't https://github.com/python/cpython/pull/296 a good enough solution to this problem? It has a simple implementation, and requires neither maintaining two different pyc formats nor reading the entire source file to check whether the pyc file is up to date.

Regards Antoine.
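(To restate Barry's environment-variable lookup from the breakpoint() thread above as code: a hedged sketch, with a hypothetical helper name and error handling omitted::)

    import importlib
    import os

    def _resolve_breakpointhook():
        # Hypothetical helper; the default is equivalent to
        # PYTHONBREAKPOINTHOOK=pdb.set_trace.
        value = os.environ.get("PYTHONBREAKPOINTHOOK", "pdb.set_trace")
        modname, _, attr = value.rpartition(".")
        module = importlib.import_module(modname)  # import the LHS
        return getattr(module, attr)               # getattr the RHS out of it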
From fred at fdrake.net Thu Sep 7 17:04:05 2017 From: fred at fdrake.net (Fred Drake) Date: Thu, 7 Sep 2017 17:04:05 -0400 Subject: [Python-Dev] PEP 553: Built-in debug() In-Reply-To: References: <7DDCE154-7596-414F-90C4-A2AE0618C728@python.org> <3AA673BE-9B32-4ACB-AB4D-940A006C0A4D@python.org> <4AEEEC14-2FF3-4A48-962D-1293B8B0A8FC@python.org> Message-ID: On Thu, Sep 7, 2017 at 4:52 PM, Terry Reedy wrote: > Environmental variables tend to be a pain on Windows and nigh unusable by > beginners. Leaving that aside, I see these problems with trying to use one > for IDLE's *current* debugger. > > pdb is universal, in the sense of working with any python run with actual or > simulated stdin and stdout. IDLE's idb is specific to working with IDLE. So > one could not set an EV to 'idlelib.idb.start' and leave it while switching > between IDLE and console. Would it work for IDLE to set the environment variable for the child process? The user certainly should not need to be involved in that. That doesn't address the issue of setting up the communications channel before breakpoint() is called, but allows the breakpointhook that gets used to work with whatever has been arranged. -Fred -- Fred L. Drake, Jr. "A storm broke loose in my mind." --Albert Einstein From benjamin at python.org Thu Sep 7 17:08:58 2017 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 07 Sep 2017 14:08:58 -0700 Subject: [Python-Dev] PEP 552: deterministic pycs In-Reply-To: <20170907230043.54d8f3f7@fsol> References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> <20170907230043.54d8f3f7@fsol> Message-ID: <1504818538.3015679.1098853504.131C78A8@webmail.messagingengine.com> On Thu, Sep 7, 2017, at 14:00, Antoine Pitrou wrote: > On Thu, 07 Sep 2017 13:39:21 -0700 > Benjamin Peterson wrote: > > Hello, > > I've written a short PEP about an import extension to allow pycs to be > > more deterministic by optional replacing the timestamp with a hash of > > the source file: https://www.python.org/dev/peps/pep-0552/ > > Why isn't https://github.com/python/cpython/pull/296 a good enough > solution to this problem? It has a simple implementation, and requires > neither maintaining two different pyc formats nor reading the entire > source file to check whether the pyc file is up to date. The main objection to that model is that it requires modifying source timestamps, which isn't possible for builds on read-only source trees. This proposal also allows reproducible builds even if the files are being modified in an edit-run-tests cycle. From barry at python.org Thu Sep 7 17:12:28 2017 From: barry at python.org (Barry Warsaw) Date: Thu, 7 Sep 2017 14:12:28 -0700 Subject: [Python-Dev] PEP 553: Built-in debug() In-Reply-To: References: <7DDCE154-7596-414F-90C4-A2AE0618C728@python.org> <3AA673BE-9B32-4ACB-AB4D-940A006C0A4D@python.org> <4AEEEC14-2FF3-4A48-962D-1293B8B0A8FC@python.org> Message-ID: <0A89DDC6-9D4D-41A5-8AA6-6B098A775336@python.org> On Sep 7, 2017, at 14:04, Fred Drake wrote: > > On Thu, Sep 7, 2017 at 4:52 PM, Terry Reedy wrote: >> Environmental variables tend to be a pain on Windows and nigh unusable by >> beginners. Leaving that aside, I see these problems with trying to use one >> for IDLE's *current* debugger. >> >> pdb is universal, in the sense of working with any python run with actual or >> simulated stdin and stdout. IDLE's idb is specific to working with IDLE. So >> one could not set an EV to 'idlelib.idb.start' and leave it while switching >> between IDLE and console. 
> Would it work for IDLE to set the environment variable for the child process?

That's exactly how I envision the environment variable would be used. If the process being debugged is run in an environment set up by the IDE, this would be the way for the IDE to communicate to the subprocess under debug how it should behave in order to communicate properly with the debugger.

> The user certainly should not need to be involved in that.

Right.

> That doesn't address the issue of setting up the communications
> channel before breakpoint() is called, but allows the breakpointhook
> that gets used to work with whatever has been arranged.

Right again! I think setting up the communication channel is outside the scope of this PEP.

Cheers, -Barry

-------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 273 bytes Desc: Message signed with OpenPGP URL:

From freddyrietdijk at fridh.nl Thu Sep 7 17:19:08 2017 From: freddyrietdijk at fridh.nl (Freddy Rietdijk) Date: Thu, 7 Sep 2017 23:19:08 +0200 Subject: [Python-Dev] PEP 552: deterministic pycs In-Reply-To: <1504818538.3015679.1098853504.131C78A8@webmail.messagingengine.com> References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> <20170907230043.54d8f3f7@fsol> <1504818538.3015679.1098853504.131C78A8@webmail.messagingengine.com> Message-ID:

> The main objection to that model is that it requires modifying source timestamps, which isn't possible for builds on read-only source trees.

Why not set the source timestamps of the source trees to, say, 1 first? That's what is done with the Nix package manager. The Python interpreter is patched (mostly similar to the referenced PR) and checks whether SOURCE_DATE_EPOCH is set, and if so, sets the mtime to 1.

On Thu, Sep 7, 2017 at 11:08 PM, Benjamin Peterson wrote:
>
>
> On Thu, Sep 7, 2017, at 14:00, Antoine Pitrou wrote:
> > On Thu, 07 Sep 2017 13:39:21 -0700
> > Benjamin Peterson wrote:
> > > Hello,
> > > I've written a short PEP about an import extension to allow pycs to be
> > > more deterministic by optionally replacing the timestamp with a hash of
> > > the source file: https://www.python.org/dev/peps/pep-0552/
> >
> > Why isn't https://github.com/python/cpython/pull/296 a good enough
> > solution to this problem? It has a simple implementation, and requires
> > neither maintaining two different pyc formats nor reading the entire
> > source file to check whether the pyc file is up to date.
>
> The main objection to that model is that it requires modifying source
> timestamps, which isn't possible for builds on read-only source trees.
> This proposal also allows reproducible builds even if the files are
> being modified in an edit-run-tests cycle.
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/
> freddyrietdijk%40fridh.nl
>

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From guido at python.org Thu Sep 7 17:19:35 2017 From: guido at python.org (Guido van Rossum) Date: Thu, 7 Sep 2017 14:19:35 -0700 Subject: [Python-Dev] PEP 552: deterministic pycs In-Reply-To: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> Message-ID:

Nice one.

It would be nice to specify the various APIs needed as well.
Why do you keep the mtime-based format as an option? (Maybe because it's faster? Did you measure it?)

On Thu, Sep 7, 2017 at 1:39 PM, Benjamin Peterson wrote:
> Hello,
> I've written a short PEP about an import extension to allow pycs to be
> more deterministic by optionally replacing the timestamp with a hash of
> the source file: https://www.python.org/dev/peps/pep-0552/
>
> Thanks for reading,
> Benjamin
>
> P.S. I came up with the idea for this PEP while awake.
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/
> guido%40python.org
>

-- --Guido van Rossum (python.org/~guido)

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From solipsis at pitrou.net Thu Sep 7 17:21:27 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 7 Sep 2017 23:21:27 +0200 Subject: [Python-Dev] PEP 552: deterministic pycs In-Reply-To: <1504818538.3015679.1098853504.131C78A8@webmail.messagingengine.com> References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> <20170907230043.54d8f3f7@fsol> <1504818538.3015679.1098853504.131C78A8@webmail.messagingengine.com> Message-ID: <20170907232127.2c154a42@fsol>

On Thu, 07 Sep 2017 14:08:58 -0700 Benjamin Peterson wrote:
> On Thu, Sep 7, 2017, at 14:00, Antoine Pitrou wrote:
> > On Thu, 07 Sep 2017 13:39:21 -0700
> > Benjamin Peterson wrote:
> > > Hello,
> > > I've written a short PEP about an import extension to allow pycs to be
> > > more deterministic by optionally replacing the timestamp with a hash of
> > > the source file: https://www.python.org/dev/peps/pep-0552/
> >
> > Why isn't https://github.com/python/cpython/pull/296 a good enough
> > solution to this problem? It has a simple implementation, and requires
> > neither maintaining two different pyc formats nor reading the entire
> > source file to check whether the pyc file is up to date.
>
> The main objection to that model is that it requires modifying source
> timestamps, which isn't possible for builds on read-only source trees.

Not sure how common that situation is (certainly the source tree wasn't read-only when you checked it out or untar'ed it), but isn't it easily circumvented by copying the source tree before building?

> This proposal also allows reproducible builds even if the files are
> being modified in an edit-run-tests cycle.

I don't follow you here. Could you elaborate?

Thanks Antoine.

From barry at python.org Thu Sep 7 17:22:14 2017 From: barry at python.org (Barry Warsaw) Date: Thu, 7 Sep 2017 14:22:14 -0700 Subject: [Python-Dev] PEP 553: Built-in debug() In-Reply-To: References: <7DDCE154-7596-414F-90C4-A2AE0618C728@python.org> <3AA673BE-9B32-4ACB-AB4D-940A006C0A4D@python.org> <4AEEEC14-2FF3-4A48-962D-1293B8B0A8FC@python.org> Message-ID: <6000E6E8-A664-4B41-9279-5BE50BA0DB67@python.org>

On Sep 7, 2017, at 13:52, Terry Reedy wrote:
> pdb.set_trace is a public and stable interface. IDLE's is private and likely to be initially unstable. I can imagine that the function that I would want to bind to sys.__breakpoint__ would be a bound method.

To be pedantic, you're not supposed to touch sys.__breakpointhook__, although like sys.__excepthook__ and sys.__displayhook__ they are not enforced to be read-only.

Cheers, -Barry

-------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 273 bytes Desc: Message signed with OpenPGP URL:

From barry at python.org Thu Sep 7 17:25:25 2017 From: barry at python.org (Barry Warsaw) Date: Thu, 7 Sep 2017 14:25:25 -0700 Subject: [Python-Dev] PEP 553: Built-in debug() In-Reply-To: References: <20170906025715.GG13110@ando.pearwood.info> <1504667186.2203533.1096578744.6E33CEB6@webmail.messagingengine.com> <21D29DF6-B2DB-4A6C-8290-13DB8D7B21B4@python.org> Message-ID: <0DA417C8-2E57-45B9-B6C4-44A3B05CEC47@python.org>

On Sep 7, 2017, at 12:09, Fernando Perez wrote:
>> The PEP has an open issue regarding breakpoint() taking *args and **kws, which would just be passed through the call stack. It sounds like you'd be in favor of that enhancement.
>
> If you go with the `(*a, **k)` pass-through API, would you have a special keyword-only arg called 'header' or similar? That seems like a decent compromise to support the feature with the builtin while allowing other implementations to offer more features. In any case, +1 to a pass-through API, as long as the built-in supports some kind of mechanism to help the user get their bearings with "you're here" type messages.

I don't think I want to specify what goes in *args or **kws, I just want to pass them straight through. The user will have to have some understanding of what debugger they are using and what arguments their breakpoint hook allows.

I'll see what it takes to add `header` to pdb.set_trace(), but I'll do that as a separate PR (i.e. not as part of this PEP).

Cheers, -Barry

-------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 273 bytes Desc: Message signed with OpenPGP URL:

From benjamin at python.org Thu Sep 7 17:25:54 2017 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 07 Sep 2017 14:25:54 -0700 Subject: [Python-Dev] PEP 552: deterministic pycs In-Reply-To: References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> <20170907230043.54d8f3f7@fsol> <1504818538.3015679.1098853504.131C78A8@webmail.messagingengine.com> Message-ID: <1504819554.3019994.1098872248.5E3A51F4@webmail.messagingengine.com>

On Thu, Sep 7, 2017, at 14:19, Freddy Rietdijk wrote:
> > The main objection to that model is that it requires modifying source
> timestamps, which isn't possible for builds on read-only source trees.
>
> Why not set the source timestamps of the source trees to, say, 1 first?

If the source tree is read-only (because you don't want your build system to modify source files on principle), you cannot do that.
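(For a tree that *is* writable, the Nix-style normalization described above can be emulated today without patching the interpreter; a sketch, with "src" as a placeholder path::)

    import compileall
    import os

    # Clamp every source mtime to a fixed value, then byte-compile; the
    # timestamps embedded in the resulting pycs are then reproducible.
    for dirpath, dirnames, filenames in os.walk("src"):
        for name in filenames:
            if name.endswith(".py"):
                os.utime(os.path.join(dirpath, name), (1, 1))

    compileall.compile_dir("src", quiet=1)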
From benjamin at python.org Thu Sep 7 17:32:19 2017 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 07 Sep 2017 14:32:19 -0700 Subject: [Python-Dev] PEP 552: deterministic pycs In-Reply-To: <20170907232127.2c154a42@fsol> References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> <20170907230043.54d8f3f7@fsol> <1504818538.3015679.1098853504.131C78A8@webmail.messagingengine.com> <20170907232127.2c154a42@fsol> Message-ID: <1504819939.3021001.1098873416.7BB4A8BD@webmail.messagingengine.com>

On Thu, Sep 7, 2017, at 14:21, Antoine Pitrou wrote:
> On Thu, 07 Sep 2017 14:08:58 -0700
> Benjamin Peterson wrote:
> > On Thu, Sep 7, 2017, at 14:00, Antoine Pitrou wrote:
> > > On Thu, 07 Sep 2017 13:39:21 -0700
> > > Benjamin Peterson wrote:
> > > > Hello,
> > > > I've written a short PEP about an import extension to allow pycs to be
> > > > more deterministic by optionally replacing the timestamp with a hash of
> > > > the source file: https://www.python.org/dev/peps/pep-0552/
> > >
> > > Why isn't https://github.com/python/cpython/pull/296 a good enough
> > > solution to this problem? It has a simple implementation, and requires
> > > neither maintaining two different pyc formats nor reading the entire
> > > source file to check whether the pyc file is up to date.
> >
> > The main objection to that model is that it requires modifying source
> > timestamps, which isn't possible for builds on read-only source trees.
>
> Not sure how common that situation is (certainly the source tree wasn't
> read-only when you checked it out or untar'ed it), but isn't it easily
> circumvented by copying the source tree before building?

Well, yes, in these kinds of "batch" build situations, copying is probably fine. However, I want to be able to have pyc determinism even when developing. Copying the entire source every time I change something isn't nice.

> > This proposal also allows reproducible builds even if the files are
> > being modified in an edit-run-tests cycle.
>
> I don't follow you here. Could you elaborate?

If you require source timestamps to be fixed and deterministic, Python won't notice when a file is modified. The larger point is that while the SOURCE_DATE_EPOCH patch will likely work for Linux distributions, I'm interested in being able to have deterministic pycs in "normal" Python development workflows.

From benjamin at python.org Thu Sep 7 17:40:33 2017 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 07 Sep 2017 14:40:33 -0700 Subject: [Python-Dev] PEP 552: deterministic pycs In-Reply-To: References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> Message-ID: <1504820433.3022749.1098877456.3FF76AA6@webmail.messagingengine.com>

On Thu, Sep 7, 2017, at 14:19, Guido van Rossum wrote:
> Nice one.
>
> It would be nice to specify the various APIs needed as well.

The compileall and py_compile ones?

>
> Why do you keep the mtime-based format as an option? (Maybe because it's
> faster? Did you measure it?)

I haven't actually measured anything, but stat'ing a file will definitely be faster than reading it completely and hashing it. I suppose if the speed difference between timestamp-based and hash-based pycs turned out to be small we could feel good about dropping the timestamp format completely. However, that difference might be hard to determine definitively as I expect the speed hit will vary widely based on system parameters such as disk speed and page cache size.
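That difference is easy to measure on any particular system; a quick-and-dirty comparison (the file path is an arbitrary choice, and md5 is just a stand-in hash)::

    import hashlib
    import os
    import timeit

    path = "Lib/decimal.py"  # any reasonably large source file

    def check_mtime():
        st = os.stat(path)
        return st.st_mtime_ns, st.st_size

    def check_hash():
        with open(path, "rb") as f:
            return hashlib.md5(f.read()).digest()

    print("stat:", timeit.timeit(check_mtime, number=10000))
    print("hash:", timeit.timeit(check_hash, number=10000))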
My goal in this PEP was to preserve the current pyc invalidation behavior, which works well today for many use cases, as the default. The hash-based pycs are reserved for distribution and other power use cases. From guido at python.org Thu Sep 7 17:43:16 2017 From: guido at python.org (Guido van Rossum) Date: Thu, 7 Sep 2017 14:43:16 -0700 Subject: [Python-Dev] PEP 552: deterministic pycs In-Reply-To: <1504820433.3022749.1098877456.3FF76AA6@webmail.messagingengine.com> References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> <1504820433.3022749.1098877456.3FF76AA6@webmail.messagingengine.com> Message-ID: On Thu, Sep 7, 2017 at 2:40 PM, Benjamin Peterson wrote: > > > On Thu, Sep 7, 2017, at 14:19, Guido van Rossum wrote: > > Nice one. > > > > It would be nice to specify the various APIs needed as well. > > The compileall and py_compile ones? > Yes, and the SipHash mod to specify the key you mentioned. > > > Why do you keep the mtime-based format as an option? (Maybe because it's > > faster? Did you measure it?) > > I haven't actually measured anything, but stating a file will definitely > be faster than reading it completely and hashing it. I suppose if the > speed difference between timestamp-based and hash-based pycs turned out > to be small we could feel good about dropping the timestamp format > completely. However, that difference might be hard to determine > definitely as I expect the speed hit will vary widely based on system > parameters such as disk speed and page cache size. > > My goal in this PEP was to preserve the current pyc invalidation > behavior, which works well today for many use cases, as the default. The > hash-based pycs are reserved for distribution and other power use cases. > OK, maybe you can clarify that a bit in the PEP. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu Sep 7 17:54:17 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 7 Sep 2017 23:54:17 +0200 Subject: [Python-Dev] PEP 552: deterministic pycs References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> <20170907230043.54d8f3f7@fsol> <1504818538.3015679.1098853504.131C78A8@webmail.messagingengine.com> <20170907232127.2c154a42@fsol> <1504819939.3021001.1098873416.7BB4A8BD@webmail.messagingengine.com> Message-ID: <20170907235417.4530f98e@fsol> On Thu, 07 Sep 2017 14:32:19 -0700 Benjamin Peterson wrote: > > > > Not sure how common that situation is (certainly the source tree wasn't > > read-only when you checked it out or untar'ed it), but isn't it easily > > circumvented by copying the source tree before building? > > Well, yes, in these kind of "batch" build situations, copying is > probably fine. However, I want to be able to have pyc determinism even > when developing. Copying the entire source every time I change something > isn't a nice. Hmm... Are you developing from a read-only source tree? > The larger point is that while the SOURCE_EPOCH patch will likely work > for Linux distributions, I'm interested in being able to have > deterministic pycs in "normal" Python development workflows. That's an interesting idea, but is there a concrete motivation or is it platonical? After all, if you're changing something in the source tree it's expected that the overall "signature" of the build will be modified too. Regards Antoine. 
From solipsis at pitrou.net Thu Sep 7 17:56:59 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 7 Sep 2017 23:56:59 +0200 Subject: [Python-Dev] PEP 552: deterministic pycs References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> <1504820433.3022749.1098877456.3FF76AA6@webmail.messagingengine.com> Message-ID: <20170907235659.170f222d@fsol>

On Thu, 07 Sep 2017 14:40:33 -0700 Benjamin Peterson wrote:
> On Thu, Sep 7, 2017, at 14:19, Guido van Rossum wrote:
> > Nice one.
> >
> > It would be nice to specify the various APIs needed as well.
>
> The compileall and py_compile ones?
>
> >
> > Why do you keep the mtime-based format as an option? (Maybe because it's
> > faster? Did you measure it?)
>
> I haven't actually measured anything, but stat'ing a file will definitely
> be faster than reading it completely and hashing it. I suppose if the
> speed difference between timestamp-based and hash-based pycs turned out
> to be small we could feel good about dropping the timestamp format
> completely. However, that difference might be hard to determine
> definitively as I expect the speed hit will vary widely based on system
> parameters such as disk speed and page cache size.

Also, while some/many of us have fast development machines with performant SSDs, Python can be used in situations where "disk" I/O is still slow (imagine a Raspberry Pi system or similar, grinding through an SD card or USB key to load .py and .pyc files).

Regards Antoine.

From benjamin at python.org Thu Sep 7 18:44:09 2017 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 07 Sep 2017 15:44:09 -0700 Subject: [Python-Dev] PEP 552: deterministic pycs In-Reply-To: <20170907235417.4530f98e@fsol> References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> <20170907230043.54d8f3f7@fsol> <1504818538.3015679.1098853504.131C78A8@webmail.messagingengine.com> <20170907232127.2c154a42@fsol> <1504819939.3021001.1098873416.7BB4A8BD@webmail.messagingengine.com> <20170907235417.4530f98e@fsol> Message-ID: <1504824249.3041251.1098934064.19F07787@webmail.messagingengine.com>

On Thu, Sep 7, 2017, at 14:54, Antoine Pitrou wrote:
> On Thu, 07 Sep 2017 14:32:19 -0700
> Benjamin Peterson wrote:
> > >
> > > Not sure how common that situation is (certainly the source tree wasn't
> > > read-only when you checked it out or untar'ed it), but isn't it easily
> > > circumvented by copying the source tree before building?
> >
> > Well, yes, in these kinds of "batch" build situations, copying is
> > probably fine. However, I want to be able to have pyc determinism even
> > when developing. Copying the entire source every time I change something
> > isn't nice.
>
> Hmm... Are you developing from a read-only source tree?

No, but the build system is building from one (at least conceptually).

>
> > The larger point is that while the SOURCE_DATE_EPOCH patch will likely work
> > for Linux distributions, I'm interested in being able to have
> > deterministic pycs in "normal" Python development workflows.
>
> That's an interesting idea, but is there a concrete motivation or is it
> platonical? After all, if you're changing something in the source tree
> it's expected that the overall "signature" of the build will be
> modified too.

Yes, I have used Bazel to build pycs. Having pycs be deterministic allows interesting build system optimizations like Bazel distributed caching to work well for Python.
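The caching problem is concrete: byte-compiling identical source under two different mtimes yields byte-different pycs today, because the header embeds the timestamp. A self-contained demonstration::

    import os
    import py_compile

    with open("demo.py", "w") as f:
        f.write("x = 1\n")

    py_compile.compile("demo.py", cfile="demo1.pyc")
    os.utime("demo.py", (2, 2))  # same bytes, different mtime
    py_compile.compile("demo.py", cfile="demo2.pyc")

    with open("demo1.pyc", "rb") as a, open("demo2.pyc", "rb") as b:
        print(a.read() == b.read())  # False: the headers differ

A content-addressed build cache (Bazel-style) therefore misses on every touched file, even when nothing but the mtime changed.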
From benjamin at python.org Thu Sep 7 18:46:42 2017 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 07 Sep 2017 15:46:42 -0700 Subject: [Python-Dev] PEP 552: deterministic pycs In-Reply-To: References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> <1504820433.3022749.1098877456.3FF76AA6@webmail.messagingengine.com> Message-ID: <1504824402.3041582.1098937744.55112449@webmail.messagingengine.com> On Thu, Sep 7, 2017, at 14:43, Guido van Rossum wrote: > On Thu, Sep 7, 2017 at 2:40 PM, Benjamin Peterson > wrote: > > > > > > > On Thu, Sep 7, 2017, at 14:19, Guido van Rossum wrote: > > > Nice one. > > > > > > It would be nice to specify the various APIs needed as well. > > > > The compileall and py_compile ones? > > > > Yes, and the SipHash mod to specify the key you mentioned. Done. > > > > > > Why do you keep the mtime-based format as an option? (Maybe because it's > > > faster? Did you measure it?) > > > > I haven't actually measured anything, but stating a file will definitely > > be faster than reading it completely and hashing it. I suppose if the > > speed difference between timestamp-based and hash-based pycs turned out > > to be small we could feel good about dropping the timestamp format > > completely. However, that difference might be hard to determine > > definitely as I expect the speed hit will vary widely based on system > > parameters such as disk speed and page cache size. > > > > My goal in this PEP was to preserve the current pyc invalidation > > behavior, which works well today for many use cases, as the default. The > > hash-based pycs are reserved for distribution and other power use cases. > > > > OK, maybe you can clarify that a bit in the PEP. I've added a paragraph to the Rationale section. From larry at hastings.org Thu Sep 7 18:49:32 2017 From: larry at hastings.org (Larry Hastings) Date: Thu, 7 Sep 2017 15:49:32 -0700 Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties) In-Reply-To: References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org> Message-ID: On 09/06/2017 09:45 AM, Guido van Rossum wrote: > So we're looking for a competing PEP here. Shouldn't be long, just > summarize the discussion about use cases and generality here. I don't think it's necessarily a /competing/ PEP; in my opinion, they serve slightly different use cases. After all, property (and __getattribute__) were added long after __getattr__; if __getattr__ was a reasonable solution for property's use cases, why did we bother adding property? One guiding principle I use when writing Python: if I need to provide an API, but there's conceptually only one of the thing, build it directly into a module as opposed to writing a class and making users use an instance. (For example: the random module, although these days it provides both.) So I see the situation as symmetric between modules and classes. What is the use case for property / __getattr__ / __getattribute__ on a module? The same as the use case for property / __getattr__ / __getattribute__ on a class. Excluding Lib/test, there are 375 uses of "@property" in the stdlib in trunk, 60 uses of __getattr__, and 34 of __getattribute__. Of course, property is used once per member, whereas each instance of __getattr__ and __getattribute__ could be used for arbitrarily-many members. On the other hand, it's also possible that some uses of __getattr__ are legacy uses, and if property had been available it would have used that instead. 
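(For readers new to the thread, the ``__class__`` song-and-dance Neil mentioned earlier, which is today's way to get a property onto a module and the workaround PEP 549 aims to replace, looks roughly like this; the ``answer`` property is a made-up example::)

    import sys
    import types

    class _ThisModule(types.ModuleType):
        @property
        def answer(self):
            # Init logic can run lazily here, on first attribute access.
            return 42

    # Swap the module's class so the property is consulted on lookup.
    sys.modules[__name__].__class__ = _ThisModule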
Anyway I assert that property is easily the most popular of these three techniques. TBH I forgot the specific use case that inspired this--it's on a project I haven't touched in a while, in favor of devoting time to the Gilectomy. But I can cite at least one place in the standard library that would have been better if it'd been implemented as a module property: os.stat_float_times().

//arry/

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From larry at hastings.org Thu Sep 7 19:00:04 2017 From: larry at hastings.org (Larry Hastings) Date: Thu, 7 Sep 2017 16:00:04 -0700 Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties) In-Reply-To: References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org> Message-ID: <46c34db7-fc8f-52ae-4349-758895b85784@hastings.org>

On 09/07/2017 03:49 PM, Larry Hastings wrote:
> Excluding Lib/test, there are 375 uses of "@property" in the stdlib in
> trunk, 60 uses of __getattr__, and 34 of __getattribute__.

I spent a minute looking at the output and realized there were a bunch of hits inside pydoc_data/topics.py, aka documentation. And then I realized I was only interested in definitions, not docs or calls or other hackery. Removing pydoc_data/topics.py and changing the search terms results in:

"@property": 375 hits
"def __getattr__": 28 hits
"def __getattribute__(": 2 hits

Unless the average use of __getattr__ serves in excess of 13 members each, property is the most popular technique of the three.

//arry/

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From barry at python.org Thu Sep 7 19:00:43 2017 From: barry at python.org (Barry Warsaw) Date: Thu, 7 Sep 2017 16:00:43 -0700 Subject: [Python-Dev] PEP 553: Built-in debug() In-Reply-To: <0DA417C8-2E57-45B9-B6C4-44A3B05CEC47@python.org> References: <20170906025715.GG13110@ando.pearwood.info> <1504667186.2203533.1096578744.6E33CEB6@webmail.messagingengine.com> <21D29DF6-B2DB-4A6C-8290-13DB8D7B21B4@python.org> <0DA417C8-2E57-45B9-B6C4-44A3B05CEC47@python.org> Message-ID:

On Sep 7, 2017, at 14:25, Barry Warsaw wrote:
>
> I'll see what it takes to add `header` to pdb.set_trace(), but I'll do that as a separate PR (i.e. not as part of this PEP).

Turns out to be pretty easy.

https://bugs.python.org/issue31389
https://github.com/python/cpython/pull/3438

Cheers, -Barry

-------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 273 bytes Desc: Message signed with OpenPGP URL:

From tjreedy at udel.edu Thu Sep 7 19:19:52 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 7 Sep 2017 19:19:52 -0400 Subject: [Python-Dev] PEP 553: Built-in debug() In-Reply-To: <1F68BA08-AC94-41A6-AFD2-4439DA5D3371@python.org> References: <98F44C60-2DBB-4771-9544-B9DA7210353C@python.org> <03B5AA26-0AE0-4EFB-9300-B71F9ECD3AE8@python.org> <1F68BA08-AC94-41A6-AFD2-4439DA5D3371@python.org> Message-ID:

On 9/7/2017 12:52 PM, Barry Warsaw wrote:
> On Sep 7, 2017, at 07:46, Guido van Rossum wrote:
>
>> Without following all this (or the PEP, yet) super exactly, does this mean you are satisfied with what the PEP currently proposes or would you like changes? It's a little unclear from what you wrote.
>
> I'm also unsure whether Terry is good with the existing PEP or suggesting changes.

It seems to me that simplifying 'import pdb; pdb.set_trace()' to 'breakpoint()' is not, in itself, justification for a new builtin. Rather, the justification is the possibility of invoking dbs other than pdb.
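Under the PEP's model, invoking a debugger other than pdb is just a matter of rebinding the hook; a sketch against the draft API, with the hook body being a made-up stand-in for a real debugger entry point::

    import sys

    def my_debugger_hook(*args, header=None, **kws):
        # Stand-in for an IDE- or GUI-debugger entry point; under the
        # proposal, breakpoint() forwards its arguments here unchanged.
        if header is not None:
            print(header)
        ...

    sys.breakpointhook = my_debugger_hook

Every breakpoint() call in user code then lands in the alternate debugger, with no edits to the user code itself.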
So I have been examining the feasibility of doing that, with IDLE's Idb + rpc + Debugger widget as a test case and example. My conclusion: possible, but not as graceful as I would like.

In response to Barry's idea of PYTHONBREAKPOINTHOOK (default pdb.set_trace, and is 'HOOK' really needed), I examined further how IDLE's debugger is different and how the differences cause problems. Then a thought occurred to me: how about rewriting it as a pure tkinter GUI debugger, much more like pdb. Call the module either gdb or tkdb. PYTHONBREAKPOINT=[package.].gdb.set_trace would then make more sense.

As with pdb, the UI would run in the same process as the debugger and user code. No rpc setup needed. No interference between the debug gui and the rest of IDLE (as sometimes happens now). No need to pickle objects to display the bindings of global and local names. The reasons for separating user code from IDLE mostly do not apply to the debugger.

I would make it a single window with 2 panes, much like turtledemo, but with the turtle screen replaced with the GUI display. (Try IDLE to see what that would look like now, or pull PR 2494 first to see it with one proposed patch.) Gdb should not be limited to running from IDLE but could be launched even when running code from the console. After an initial version, the entry point could accept a list of breakpoint lines and a color mapping for syntax coloring.

I think breakpoint() should have a db= parameter so one can select a debugger in one removable line. The sys interface is more useful for IDEs to change the default, possibly with other args (like breakpoints and colors) bound to the callable. Breakpoint() should pass on other args. In particular, people who invoke gdb from within a tkinter program should be able to pass in the root widget, to be the master of the debug window. This has to be done from within the user program, and the addition of breakpoint allows that, and makes running the gui with user code more feasible.

---

A somewhat separate point: the name breakpoint() is slightly misleading, which has consequences if it is (improperly) called more than once. While breakpoint() acts as a breakpoint, what it does (at least in the default pdb case) is *initialize* and start a *new* debugger, possibly after an import. Re-importing a module is no big deal. Replacing an existing debugger with a *new* one, and tossing away all defined aliases and breakpoints and Bdb's internal caches, is. It is likely not what most people would want or expect. I think it more likely that people will call breakpoint() multiple times than they would, for instance, call pdb.set_trace() multiple times. With a gui debugger, having one window go and another appear might be additionally annoying. If the first window is not immediately GCed, having two windows would be confusing. Perhaps breakpoint() could be made a no-op after the first call.

>> I think that's true for any IDE that has existing integrated debug capabilities. However, for every IDE I would hope that calling breakpoint() will *also* break into the IDE's debugger. My use case is that sometimes I have a need for a *conditional* breakpoint where it's just much simpler to decide whether to break or not in Python code than it would be to use the IDE's conditional breakpoint facilities.
>
> That certainly aligns with my own experience and expectations. I guess I'm a fellow old dog.
:)

-- Terry Jan Reedy

From greg.ewing at canterbury.ac.nz Thu Sep 7 19:52:57 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 08 Sep 2017 11:52:57 +1200 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: <6221644.YDppWAoE6Y@hammer.magicstack.net> References: <59B12186.6030707@canterbury.ac.nz> <6221644.YDppWAoE6Y@hammer.magicstack.net> Message-ID: <59B1DBD9.7040008@canterbury.ac.nz>

Elvis Pranskevichus wrote:
> By default, generators reference an empty LogicalContext object that is
> allocated once (like the None object). We can do that because LCs are
> immutable.

Ah, I see. That wasn't clear from the implementation, where

    gen.__logical_context__ = contextvars.LogicalContext()

looks like it's creating a new one.

However, there's another thing: it looks like every time a generator is resumed/suspended, an execution context node is created/discarded.

-- Greg

From greg at krypto.org Thu Sep 7 19:58:26 2017 From: greg at krypto.org (Gregory P. Smith) Date: Thu, 07 Sep 2017 23:58:26 +0000 Subject: [Python-Dev] PEP 552: deterministic pycs In-Reply-To: <1504824402.3041582.1098937744.55112449@webmail.messagingengine.com> References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> <1504820433.3022749.1098877456.3FF76AA6@webmail.messagingengine.com> <1504824402.3041582.1098937744.55112449@webmail.messagingengine.com> Message-ID:

+1 on this PEP.

The TL;DR summary of this PEP: The pyc date+length metadata check was a convenient hack. It still works well for many people and use cases; it isn't going away. PEP 552 proposes a new alternate hack that relies on file contents instead of OS and filesystem date metadata. Assumption: The hash function is significantly faster than re-parsing the source. (guaranteed to be true)

Questions:

Input from OS package distributors would be interesting. Would they use this? Which way would it impact their startup time (loading the .py file vs just statting it. does that even matter? source files are often eventually loaded for linecache use in tracebacks anyways)?

Would they benefit from a pyc that can contain _both_ timestamp+length, and the source_hash? If both were present, I assume that only one would be checked at startup. I'm not sure what would make the decision of what to check. One fails, check the other? I personally do not have a use for this case, so I'd omit the complexity without a demonstrated need.

Something to also state in the PEP: This is intentionally not a "secure" hash. Security is explicitly a non-goal.

Rationale behind my support: We use a superset of Bazel at Google (unsurprising) and have had to jump through a lot of messy hoops to deal with timestamp metadata winding up in output files vs deterministic builds. What Benjamin describes here sounds exactly like what we would want. It allows deterministic builds in distributed build and cached operation systems where timestamps are never going to be guaranteed. It allows the check to work on filesystems which do not preserve timestamps. Also importantly, it allows the check to be disabled via the check_source bit. Today we use a modified importer at work that skips checking timestamps anyways, as the way we ship applications means the entire set of dependencies present is already guaranteed at build time to be correct, and being modified at runtime is not possible or not a concern. This PEP would avoid the need for an extra importer or modified interpreter logic to make this happen.
-G On Thu, Sep 7, 2017 at 3:47 PM Benjamin Peterson wrote: > > > On Thu, Sep 7, 2017, at 14:43, Guido van Rossum wrote: > > On Thu, Sep 7, 2017 at 2:40 PM, Benjamin Peterson > > wrote: > > > > > > > > > > > On Thu, Sep 7, 2017, at 14:19, Guido van Rossum wrote: > > > > Nice one. > > > > > > > > It would be nice to specify the various APIs needed as well. > > > > > > The compileall and py_compile ones? > > > > > > > Yes, and the SipHash mod to specify the key you mentioned. > > Done. > > > > > > > > > > Why do you keep the mtime-based format as an option? (Maybe because > it's > > > > faster? Did you measure it?) > > > > > > I haven't actually measured anything, but stating a file will > definitely > > > be faster than reading it completely and hashing it. I suppose if the > > > speed difference between timestamp-based and hash-based pycs turned out > > > to be small we could feel good about dropping the timestamp format > > > completely. However, that difference might be hard to determine > > > definitely as I expect the speed hit will vary widely based on system > > > parameters such as disk speed and page cache size. > > > > > > My goal in this PEP was to preserve the current pyc invalidation > > > behavior, which works well today for many use cases, as the default. > The > > > hash-based pycs are reserved for distribution and other power use > cases. > > > > > > > OK, maybe you can clarify that a bit in the PEP. > > I've added a paragraph to the Rationale section. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/greg%40krypto.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Sep 7 20:06:23 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 7 Sep 2017 17:06:23 -0700 Subject: [Python-Dev] PEP 550 v4 In-Reply-To: <59B15256.8030209@stoneleaf.us> References: <59B12186.6030707@canterbury.ac.nz> <59B14436.8040705@stoneleaf.us> <1679203.vpm3Kbeti1@hammer.magicstack.net> <59B15256.8030209@stoneleaf.us> Message-ID: On 7 September 2017 at 07:06, Ethan Furman wrote: > The concern is *how* PEP 550 provides it: > > - explicitly, like threading.local(): has to be set up manually, > preferably with a context manager > > - implicitly: it just happens under certain conditions A recurring point of confusion with the threading.local() analogy seems to be that there are actually *two* pieces to that analogy: * threading.local() <-> contextvars.ContextVar * PyThreadState_GetDict() <-> LogicalContext (See https://github.com/python/cpython/blob/a6a4dc816d68df04a7d592e0b6af8c7ecc4d4344/Python/pystate.c#L584 for the definition of the PyThreadState_GetDict) For most practical purposes as a *user* of thread locals, the involvement of PyThreadState and the state dict is a completely hidden implementation detail. However, every time you create a new thread, you're implicitly getting a new Python thread state, and hence a new thread state dict, and hence a new set of thread local values. Similarly, as a *user* of context variables, you'll generally be able to ignore the manipulation of the execution context going on behind the scenes - you'll just get, set, and delete individual context variables without worrying too much about exactly where and how they're stored. 
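In code, the user-facing half of that analogy looks like this (the second part uses the draft names from the PEP thread, which may still change, so it is shown as comments)::

    import threading

    tls = threading.local()
    tls.request_id = 42   # implicitly stored in the current thread's state dict

    # The PEP 550 counterpart, per the draft API discussed in this thread:
    #     var = contextvars.ContextVar('request_id')
    #     var.set(42)     # implicitly stored in the current LogicalContext
    #     var.get()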
PEP 550 itself doesn't have that luxury, though, since in addition to defining how users will access and update these values, it *also* needs to define how the interpreter will implicitly manage the execution context for threads and generators and how event loops (including asyncio as the reference implementation) are going to be expected to manage the execution context explicitly when scheduling coroutines.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From v+python at g.nevcal.com Thu Sep 7 19:58:32 2017 From: v+python at g.nevcal.com (Glenn Linderman) Date: Thu, 7 Sep 2017 16:58:32 -0700 Subject: [Python-Dev] PEP 553: Built-in debug() In-Reply-To: References: <98F44C60-2DBB-4771-9544-B9DA7210353C@python.org> <03B5AA26-0AE0-4EFB-9300-B71F9ECD3AE8@python.org> <1F68BA08-AC94-41A6-AFD2-4439DA5D3371@python.org> Message-ID:

On 9/7/2017 4:19 PM, Terry Reedy wrote:
> A somewhat separate point: the name breakpoint() is slightly
> misleading, which has consequences if it is (improperly) called more
> than once. While breakpoint() acts as a breakpoint, what it does (at
> least in the default pdb case) is *initialize* and start a *new*
> debugger, possibly after an import.

So maybe the original debug() call should be renamed debugger() [but with the extra optional parameters discussed], and an additional breakpoint() call could be added that would be much more like hitting a breakpoint in the already initialized debugger.

There seem to be two directions to go here: if breakpoint is called without a prior call to debugger, it should (1) call debugger implicitly with defaults, or (2) be a noop. I prefer the latter, but either is workable. Or, one could add all those parameters to every breakpoint() call, which makes (1) more workable, but the code more wordy.

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From fperez.net at gmail.com Thu Sep 7 21:03:25 2017 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 7 Sep 2017 18:03:25 -0700 Subject: [Python-Dev] PEP 553: Built-in debug() References: <20170906025715.GG13110@ando.pearwood.info> <1504667186.2203533.1096578744.6E33CEB6@webmail.messagingengine.com> <21D29DF6-B2DB-4A6C-8290-13DB8D7B21B4@python.org> <0DA417C8-2E57-45B9-B6C4-44A3B05CEC47@python.org> Message-ID:

On 2017-09-07 23:00:43 +0000, Barry Warsaw said:
> On Sep 7, 2017, at 14:25, Barry Warsaw wrote:
>>
>> I'll see what it takes to add `header` to pdb.set_trace(), but I'll do
>> that as a separate PR (i.e. not as part of this PEP).
>
> Turns out to be pretty easy.
>
> https://bugs.python.org/issue31389
> https://github.com/python/cpython/pull/3438

Ah, perfect! I've subscribed to the PR on github and can pitch in there further if my input is of any use.

Thanks again, f

From ncoghlan at gmail.com Thu Sep 7 21:12:20 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 7 Sep 2017 18:12:20 -0700 Subject: [Python-Dev] PEP 553 V2 - builtin breakpoint() (was Re: PEP 553: Built-in debug()) In-Reply-To: References: Message-ID:

On 7 September 2017 at 11:43, Barry Warsaw wrote:
> Environment variable
> --------------------
>
> Should we add an environment variable so that ``sys.breakpointhook()``
> can be
> set outside of the Python invocation? E.g.::
>
> $ export PYTHONBREAKPOINTHOOK=my.debugger:Debugger
>
> This would allow execution environments such as IDEs, which run Python code
> inside them, to set an internal breakpoint hook before any Python code
> executes.
Related to this is the suggestion that we make the default sys.breakpointhook() a no-op, so that accidentally checking in calls to breakpoint() won't hang CI systems.

Then folks that wanted to use the functionality would set "PYTHONBREAKPOINTHOOK=pdb:set_trace".

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Thu Sep 7 21:47:20 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 7 Sep 2017 18:47:20 -0700 Subject: [Python-Dev] PEP 552: deterministic pycs In-Reply-To: References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> <1504820433.3022749.1098877456.3FF76AA6@webmail.messagingengine.com> <1504824402.3041582.1098937744.55112449@webmail.messagingengine.com> Message-ID:

On 7 September 2017 at 16:58, Gregory P. Smith wrote:
> +1 on this PEP.
>
> The TL;DR summary of this PEP:
> The pyc date+length metadata check was a convenient hack. It still works
> well for many people and use cases; it isn't going away.
> PEP 552 proposes a new alternate hack that relies on file contents instead
> of OS and filesystem date metadata.
> Assumption: The hash function is significantly faster than re-parsing
> the source. (guaranteed to be true)
>
> Questions:
>
> Input from OS package distributors would be interesting. Would they use
> this? Which way would it impact their startup time (loading the .py file vs
> just statting it. does that even matter? source files are often eventually
> loaded for linecache use in tracebacks anyways)?

Christian and I asked some of our security folks for their personal wishlists recently, and one of the items that came up was "The recompile is based on a timestamp. How do you know the pyc file on disk really is related to the .py file that is human-readable? Can it be based on a hash or something like that?"

This is a restating of the reproducible build use case: for a given version of Python, a given source file should always give the same source hash and marshaled code object, and once it does, it's easier to do an independent compilation from the source file and check you get the same answer.

While you can implement that for timestamp-based formats by adjusting input file metadata (and that's exactly what distros do with SOURCE_DATE_EPOCH), it's still pretty annoying, and not particularly build-cache friendly, since the same file in different source artifacts may produce different build outputs.

> Would they benefit from a pyc that can contain _both_ timestamp+length, and
> the source_hash? If both were present, I assume that only one would be
> checked at startup. I'm not sure what would make the decision of what to
> check. One fails, check the other? I personally do not have a use for this
> case, so I'd omit the complexity without a demonstrated need.

I don't see any way we'd benefit from having both items present. However, I do wonder whether we could encode *all* the mode settings into the magic number, such that we did something like reserving the top 3 bits for format flags:

* number & 0x1FFF -> the traditional magic number
* number & 0x8000 -> timestamp or hash?
* number & 0x4000 -> checked or not?
* number & 0x2000 -> reserved for future format changes

By default we'd still produce the checked-timestamp format, but managed build systems (including Linux distros) could opt in to the unchecked-hash format.

> Something to also state in the PEP:
>
> This is intentionally not a "secure" hash. Security is explicitly a
> non-goal.
I don't think it's so much that security is a non-goal, as that the
(admittedly minor) security improvement comes from making it easier to
reproduce the expected machine-readable output from a given
human-readable input, rather than from the nature of the hashing
function used.

> Rationale behind my support:

+1 from me as well, for the reasons Greg gives (while Fedora doesn't
currently do any per-file build artifact caching, I hope we will in the
future, and output formats based on input artifact hashes will make that
much easier than formats based on input timestamps).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From benjamin at python.org  Thu Sep  7 21:54:54 2017
From: benjamin at python.org (Benjamin Peterson)
Date: Thu, 07 Sep 2017 18:54:54 -0700
Subject: [Python-Dev] PEP 552: deterministic pycs
In-Reply-To: 
References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com>
 <1504820433.3022749.1098877456.3FF76AA6@webmail.messagingengine.com>
 <1504824402.3041582.1098937744.55112449@webmail.messagingengine.com>
Message-ID: <1504835694.3092547.1099070504.18188646@webmail.messagingengine.com>

On Thu, Sep 7, 2017, at 16:58, Gregory P. Smith wrote:
> +1 on this PEP.

Thanks!

> Questions:
>
> Input from OS package distributors would be interesting. Would they use
> this? Which way would it impact their startup time (loading the .py file
> vs just statting it. does that even matter? source files are often
> eventually loaded for linecache use in tracebacks anyways)?

I anticipate distributors will use the mode where the pyc is simply
trusted and the source file isn't hashed. That would make the I/O
overhead identical to today.

> Would they benefit from a pyc that can contain _both_ timestamp+length,
> and the source_hash? if both were present, I assume that only one would
> be checked at startup. i'm not sure what would make the decision of what
> to check. one fails, check the other? i personally do not have a use for
> this case so i'd omit the complexity without a demonstrated need.

Yeah, it could act as a multi-tiered cache key. I agree with your
conclusion to pass for now.

> Something to also state in the PEP:
>
> This is intentionally not a "secure" hash. Security is explicitly a
> non-goal.

Added a sentence.

From barry at python.org  Thu Sep  7 22:13:06 2017
From: barry at python.org (Barry Warsaw)
Date: Thu, 7 Sep 2017 19:13:06 -0700
Subject: [Python-Dev] PEP 552: deterministic pycs
In-Reply-To: 
References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com>
 <1504820433.3022749.1098877456.3FF76AA6@webmail.messagingengine.com>
 <1504824402.3041582.1098937744.55112449@webmail.messagingengine.com>
Message-ID: <5CCA8DB4-492E-4AA1-89ED-CAA23551690C@python.org>

On Sep 7, 2017, at 16:58, Gregory P. Smith wrote:

> Input from OS package distributors would be interesting. Would they use
> this?

I suspect it won't be that interesting to the Debian ecosystem, since we
generate pyc files on package install.  We do that because we can support
multiple versions of Python installed simultaneously and we don't know
which versions are installed on the target machine.  I suppose our stdlib
package could ship pycs, but we don't.

Reproducible builds may still be interesting in other situations though,
such as CI machines, but then SOURCE_DATE_EPOCH is probably good enough.

-Barry
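To make the flag-bit scheme Nick sketches above concrete, a minimal
reader could look like the sketch below.  The bit assignments follow his
suggestion in this thread only; they are not part of PEP 552 as drafted,
nor of any CPython release, and the helper name is made up for
illustration:

    import struct

    # Flag bits as suggested by Nick above (hypothetical, illustrative only).
    FLAG_HASH_BASED = 0x8000  # source-hash-based rather than timestamp-based
    FLAG_UNCHECKED = 0x4000   # pyc is trusted; source is not re-validated
    FLAG_RESERVED = 0x2000    # reserved for future format changes

    def split_magic(header: bytes):
        """Split the 16-bit magic field into (flags, traditional magic)."""
        field, = struct.unpack('<H', header[:2])
        flags = field & (FLAG_HASH_BASED | FLAG_UNCHECKED | FLAG_RESERVED)
        magic = field & 0x1FFF
        return flags, magic

A loader built this way could branch on the returned flags before
deciding whether to stat the source, hash it, or skip validation
entirely.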
From barry at python.org  Thu Sep  7 22:17:29 2017
From: barry at python.org (Barry Warsaw)
Date: Thu, 7 Sep 2017 19:17:29 -0700
Subject: [Python-Dev] PEP 553 V2 - builtin breakpoint() (was Re: PEP 553:
 Built-in debug())
In-Reply-To: 
References: 
Message-ID: <7183CD79-9A3A-489D-96B8-3D570D8A3170@python.org>

On Sep 7, 2017, at 18:12, Nick Coghlan wrote:
>
> Related to this is the suggestion that we make the default
> sys.breakpointhook() a no-op, so that accidentally checking in calls
> to breakpoint() won't hang CI systems.
>
> Then folks that wanted to use the functionality would set
> "PYTHONBREAKPOINTHOOK=pdb:set_trace"

I'd rather do it the other way 'round because I want it to Just Work
for the average developer, and maintainers of CI or production systems
should be able to fairly easily tweak their environments to noop
breakpoint().  Although maybe we want a shortcut for that, e.g.
PYTHONBREAKPOINTHOOK=0 or some such.

(Note, I'm still not sure it's worth supporting the environment
variable, but I am interested in collecting the feedback on it.)

-Barry

From ncoghlan at gmail.com  Thu Sep  7 22:34:58 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 7 Sep 2017 19:34:58 -0700
Subject: [Python-Dev] PEP 553 V2 - builtin breakpoint() (was Re: PEP 553:
 Built-in debug())
In-Reply-To: <7183CD79-9A3A-489D-96B8-3D570D8A3170@python.org>
References: 
 <7183CD79-9A3A-489D-96B8-3D570D8A3170@python.org>
Message-ID: 

On 7 September 2017 at 19:17, Barry Warsaw wrote:
> On Sep 7, 2017, at 18:12, Nick Coghlan wrote:
>>
>> Related to this is the suggestion that we make the default
>> sys.breakpointhook() a no-op, so that accidentally checking in calls
>> to breakpoint() won't hang CI systems.
>>
>> Then folks that wanted to use the functionality would set
>> "PYTHONBREAKPOINTHOOK=pdb:set_trace"
>
> I'd rather do it the other way 'round because I want it to Just Work
> for the average developer, and maintainers of CI or production systems
> should be able to fairly easily tweak their environments to noop
> breakpoint().  Although maybe we want a shortcut for that, e.g.
> PYTHONBREAKPOINTHOOK=0 or some such.

Now that you put it that way, it occurs to me that CI environments
could set "PYTHONBREAKPOINTHOOK=sys:exit" to make breakpoint() an
immediate failure rather than halting the CI run waiting for input
that will never arrive.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From barry at python.org  Thu Sep  7 22:45:44 2017
From: barry at python.org (Barry Warsaw)
Date: Thu, 7 Sep 2017 19:45:44 -0700
Subject: [Python-Dev] PEP 553: Built-in debug()
In-Reply-To: 
References: <98F44C60-2DBB-4771-9544-B9DA7210353C@python.org>
 <03B5AA26-0AE0-4EFB-9300-B71F9ECD3AE8@python.org>
 <1F68BA08-AC94-41A6-AFD2-4439DA5D3371@python.org>
Message-ID: <1782A936-98F9-45C4-B3E6-9EC4C3E51D7A@python.org>

On Sep 7, 2017, at 16:19, Terry Reedy wrote:

> I think breakpoint() should have a db= parameter so one can select a
> debugger in one removable line.  The sys interface is more useful for
> IDEs to change the default, possibly with other args (like breakpoints
> and colors) bound to the callable.

I'm skeptical about that.
I think any particular user is going to overwhelmingly use the same
debugger, so having to repeat themselves every time they want to enter
the debugger is going to get tedious fast.  I know it would annoy *me*
if I had to tell it to use pdb every time I wrote `breakpoint()`, and
I'm almost never going to use anything else.

I'm also not sure what useful semantics for `db` would be.  E.g. what
specifically would you set `db` to in order to invoke idle or tkdb
("gdb" would be an unfortunate name I think, given the popular existing
GNU Debugger ;).  I don't even know what useful thing I'd set `db` to
in order to mean "invoke pdb".  Please don't say "well, pdb will still
be the default so you don't have to set it to anything" because that
does kind of miss the point.

I also want to keep breakpoint() as generic as possible.  I think doing
so allows it to be a nice API into whatever interesting and
sophisticated implementations folks can think up underneath it.  So
*if* you came up with a cool thing that interpreted `db` in some
special way, there should be a very low barrier to providing that to
your users, e.g.:

* pip install myfancymetadebugger
* put a snippet which sets sys.breakpointhook in your $PYTHONSTARTUP file
* profit!

That second item could be replaced with

export PYTHONBREAKPOINTHOOK=myfancymetadebugger.invoke

(hmm, does that mean the envar is getting more interesting?)

> Breakpoint() should pass on other args.

I strongly believe it should pass through *all* args.

> A somewhat separate point: the name breakpoint() is slightly
> misleading, which has consequences if it is (improperly) called more
> than once.  While breakpoint() acts as a breakpoint, what it does (at
> least in the default pdb case) is *initialize* and start a *new*
> debugger, possibly after an import.  Re-importing a module is no big
> deal.  Replacing an existing debugger with a *new* one, and tossing
> away all defined aliases and breakpoints and Bdb's internal caches,
> is.  It is likely not what most people would want or expect.

I think it more likely that people will call breakpoint() multiple
times than they would, for instance, call pdb.set_trace() multiple
times.  Multiple calls to pdb.set_trace() is fairly common in practice
today, so I'm not terribly concerned about it.  There's nothing
fundamentally different with multiple calls to breakpoint() today.

If we care, we can provide a more efficient/different API and make that
the default.  The machinery in PEP 553 can easily support that, but
doing it is outside the scope of the PEP.

> With a gui debugger, having one window go and another appear might be
> additionally annoying.  If the first window is not immediately GCed,
> having two windows would be confusing.  Perhaps breakpoint() could be
> made a no-op after the first call.

Your sys.breakpointhook could easily implement that, with a much better
user experience than what built-in breakpoint() could do anyway.

Cheers,
-Barry
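Barry's closing point is easy to demonstrate.  A minimal sketch,
assuming the ``sys.breakpointhook`` machinery PEP 553 proposes (which
does not exist yet at this point in the discussion):

    import pdb
    import sys

    def breakpointhook_once(*args, **kwargs):
        """Enter pdb on the first breakpoint() call; no-op afterwards."""
        # Swap in a do-nothing hook before starting the debugger, so any
        # later breakpoint() calls are silently ignored.
        sys.breakpointhook = lambda *a, **kw: None
        return pdb.set_trace(*args, **kwargs)

    sys.breakpointhook = breakpointhook_once

A GUI debugger could take the same approach to keep its single window
alive rather than spawning a new one on every call.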
From barry at python.org  Thu Sep  7 22:47:28 2017
From: barry at python.org (Barry Warsaw)
Date: Thu, 7 Sep 2017 19:47:28 -0700
Subject: [Python-Dev] PEP 553 V2 - builtin breakpoint() (was Re: PEP 553:
 Built-in debug())
In-Reply-To: 
References: <7183CD79-9A3A-489D-96B8-3D570D8A3170@python.org>
Message-ID: 

On Sep 7, 2017, at 19:34, Nick Coghlan wrote:

> Now that you put it that way, it occurs to me that CI environments
> could set "PYTHONBREAKPOINTHOOK=sys:exit" to make breakpoint() an
> immediate failure rather than halting the CI run waiting for input
> that will never arrive.

You better watch out Nick.  You're starting to sway me on adding the
environment variable.

-Barry

From barry at python.org  Thu Sep  7 22:48:10 2017
From: barry at python.org (Barry Warsaw)
Date: Thu, 7 Sep 2017 19:48:10 -0700
Subject: [Python-Dev] PEP 553: Built-in debug()
In-Reply-To: 
References: <20170906025715.GG13110@ando.pearwood.info>
 <1504667186.2203533.1096578744.6E33CEB6@webmail.messagingengine.com>
 <21D29DF6-B2DB-4A6C-8290-13DB8D7B21B4@python.org>
 <0DA417C8-2E57-45B9-B6C4-44A3B05CEC47@python.org>
Message-ID: 

On Sep 7, 2017, at 18:03, Fernando Perez wrote:

> Ah, perfect! I've subscribed to the PR on github and can pitch in there
> further if my input is of any use.

Awesome, thanks!

-Barry

From apetresc at gmail.com  Thu Sep  7 23:02:52 2017
From: apetresc at gmail.com (Adrian Petrescu)
Date: Thu, 7 Sep 2017 23:02:52 -0400
Subject: [Python-Dev] PEP 553 V2 - builtin breakpoint() (was Re: PEP 553:
 Built-in debug())
In-Reply-To: 
References: <7183CD79-9A3A-489D-96B8-3D570D8A3170@python.org>
Message-ID: 

Would that not be a security concern, if you can get Python to execute
arbitrary code just by setting an environment variable?

On Thu, Sep 7, 2017 at 10:47 PM, Barry Warsaw wrote:

> On Sep 7, 2017, at 19:34, Nick Coghlan wrote:
>
> > Now that you put it that way, it occurs to me that CI environments
> > could set "PYTHONBREAKPOINTHOOK=sys:exit" to make breakpoint() an
> > immediate failure rather than halting the CI run waiting for input
> > that will never arrive.
>
> You better watch out Nick.  You're starting to sway me on adding the
> environment variable.
>
> -Barry

From songofacandy at gmail.com  Thu Sep  7 23:13:33 2017
From: songofacandy at gmail.com (INADA Naoki)
Date: Fri, 8 Sep 2017 12:13:33 +0900
Subject: [Python-Dev] HTTPS on bugs.python.org
In-Reply-To: 
References: 
Message-ID: 

Fixed.  Thanks to infra team.
http://psf.upfronthosting.co.za/roundup/meta/issue638

INADA Naoki

On Fri, Sep 1, 2017 at 9:57 PM, Victor Stinner wrote:
> Hi,
>
> When I go to http://bugs.python.org/ Firefox warns me that the form on
> the left to login (user, password) sends data in clear text (HTTP).
>
> Ok, I switch manually to HTTPS: add "s" in "http://" of the URL.
>
> I log in.
>
> I go to an issue using HTTPS like https://bugs.python.org/issue31250
>
> I modify an issue using the form and click on [Submit Changes] (or
> just press Enter): I'm back to HTTP. Truncated URL:
>
> http://bugs.python.org/issue31250?@ok_message=msg%20301099%20created%...
>
> Hum, again I switch manually to HTTPS by modifying the URL:
>
> https://bugs.python.org/issue31250?@ok_message=msg%20301099%20created%...
>
> I click on the "clear this message" link: oops, I'm back to the HTTP
> world...
>
> http://bugs.python.org/issue31250
>
> So, would it be possible to enforce HTTPS on the bug tracker?
>
> The best would be to always generate HTTPS urls and *maybe* redirect
> HTTP to HTTPS.
>
> Sorry, I don't know what are the best practices. For example, should
> we use HTTPS only cookies?
>
> Victor

From guido at python.org  Fri Sep  8 00:14:35 2017
From: guido at python.org (Guido van Rossum)
Date: Thu, 7 Sep 2017 21:14:35 -0700
Subject: [Python-Dev] PEP 548: More Flexible Loop Control
In-Reply-To: <20170906232235.84C441B10001@webabinitio.net>
References: <20170906001100.5F9DB1B10001@webabinitio.net>
 <20170906163401.2EDA01B10001@webabinitio.net>
 <20170906232235.84C441B10001@webabinitio.net>
Message-ID: 

No worries. We all learned stuff!

On Wed, Sep 6, 2017 at 4:22 PM, R. David Murray wrote:

> On Wed, 06 Sep 2017 09:43:53 -0700, Guido van Rossum wrote:
> > I'm actually not in favor of this. It's another way to do the same
> > thing. Sorry to rain on your dream!
>
> So it goes :)  I learned things by going through the process, so it
> wasn't wasted time for me even if (or because) I made several mistakes.
> Sorry for wasting anyone else's time :(
>
> --David

--
--Guido van Rossum (python.org/~guido)

From songofacandy at gmail.com  Fri Sep  8 02:46:12 2017
From: songofacandy at gmail.com (INADA Naoki)
Date: Fri, 8 Sep 2017 15:46:12 +0900
Subject: [Python-Dev] Memory bitmaps for the Python cyclic garbage collector
In-Reply-To: <20170907173012.vmvhx5lt2ajghqfy@python.ca>
References: <20170907173012.vmvhx5lt2ajghqfy@python.ca>
Message-ID: 

Big +1.  I love the idea.

str (especially docstrings), dict, and tuples are major memory eaters in
Python.  This may reduce tuple memory usage massively.

INADA Naoki

On Fri, Sep 8, 2017 at 2:30 AM, Neil Schemenauer wrote:
> Python objects that participate in cyclic GC (things like lists, dicts,
> sets but not strings, ints and floats) have extra memory overhead.  I
> think it is possible to mostly eliminate this overhead.  Also, while
> the GC is running, this GC state is mutated, which destroys
> copy-on-write optimizations.  This change would mostly fix that
> issue.
>
> All objects that participate in cyclic GC have the Py_TPFLAGS_HAVE_GC
> bit set in their type.  That causes an extra chunk of memory to be
> allocated *before* the ob_refcnt struct member.  This is the PyGC_Head
> struct.
>
> The whole object looks like this in memory (PyObject pointer is at
> arrow):
>
>     union __gc_head *gc_next;
>     union __gc_head *gc_prev;
>     Py_ssize_t gc_refs;
> -->
>     Py_ssize_t ob_refcnt
>     struct _typeobject *ob_type;
>     [rest of PyObject members]
>
>
> So, 24 bytes of overhead on a 64-bit machine.  The smallest Python
> object that can have a pointer to another object (e.g. a single
> PyObject * member) is 48 bytes.  Removing PyGC_Head would cut the size
> of these objects in half.
>
> Carl Shapiro questioned me today on why we use a doubly linked list and
> not the memory bitmap.  I think the answer is that there is no good
> reason.  We use a doubly linked list only due to historical constraints
> that are no longer present.
>
> Long ago, Python objects could be allocated using the system malloc or
> other memory allocators.  Since we could not control the memory
> location, bitmaps would be inefficient.  Today, we allocate all Python
> objects via our own function.  Python objects under a certain size are
> allocated using our own malloc, obmalloc, and are stored in memory
> blocks known as "arenas".
>
> The PyGC_Head struct performs three functions.  First, it allows the GC
> to find all Python objects that will be checked for cycles (i.e. follow
> the linked list).  Second, it stores a single bit of information to let
> the GC know if it is safe to traverse the object, set with
> PyObject_GC_Track().  Finally, it has a scratch area to compute the
> effective reference count while tracing refs (gc_refs).
>
> Here is a sketch of how we can remove the PyGC_Head struct for small
> objects (say less than 512 bytes).  Large objects or objects created by
> a different memory allocator will still have the PyGC_Head overhead.
>
> * Have memory arenas that contain only objects with the
>   Py_TPFLAGS_HAVE_GC flag.  Objects like ints, strings, etc will be
>   in different arenas, not have bitmaps, not be looked at by the
>   cyclic GC.
>
> * For those arenas, add a memory bitmap.  The bitmap is a bit array that
>   has a bit for each fixed size object in the arena.  The memory used by
>   the bitmap is a fraction of what is needed by PyGC_Head.  E.g. an
>   arena that holds up to 1024 objects of 48 bytes in size would have a
>   bitmap of 1024 bits.
>
> * The bits will be set and cleared by PyObject_GC_Track/Untrack()
>
> * We also need an array of Py_ssize_t to take over the job of gc_refs.
>   That could be allocated only when GC is working and it only needs to
>   be the size of the number of true bits in the bitmap.  Or, it could be
>   allocated when the arena is allocated and be sized for the full arena.
>
> * Objects that are too large would still get the PyGC_Head struct
>   allocated "in front" of the PyObject.  Because they are big, the
>   overhead is not so bad.
>
> * The GC process would work nearly the same as it does now.  Rather than
>   only traversing the linked list, we would also have to crawl over the
>   GC object arenas, check blocks of memory that have the tracked bit
>   set.
>
> There are a lot of smaller details to work out but I see no reason
> why the idea should not work.  It should significantly reduce memory
> usage.  Also, because the bitmap and gc_refs are contiguous in
> memory, locality will be improved.  Łukasz Langa has mentioned that
> the current GC causes issues with copy-on-write memory in big
> applications.  This change should solve that issue.
>
> To implement, I think the easiest path is to create a new malloc to be
> used by small GC objects, e.g. gcmalloc.c.
> It would be similar to
> obmalloc but have the features needed to keep track of the bitmap.
> obmalloc has some quirks that make it hard to use for this purpose.
> Once the idea is proven, gcmalloc could be merged or made to be a
> variation of obmalloc.  Or, maybe just optimized and remain
> separate.  obmalloc is complicated and highly optimized.  So, adding
> additional functionality to it will be challenging.
>
> I believe this change would be ABI compatible.

From ma3yuki.8mamo10 at gmail.com  Fri Sep  8 03:30:44 2017
From: ma3yuki.8mamo10 at gmail.com (Masayuki YAMAMOTO)
Date: Fri, 8 Sep 2017 16:30:44 +0900
Subject: [Python-Dev] PEP 539 v3: A new C API for Thread-Local Storage in
 CPython
Message-ID: 

Hi folks,

I submit the third draft of PEP 539 for final review.  Thank you for all
the advice and the help!

Summary of technical changes:

* Don't use `bool` types in the implementation, per discussion at
  python/cpython#1362 [1]
* Update for removal of --without-threads

Best regards,
Masayuki

[1]: https://github.com/python/cpython/pull/1362#discussion_r136357583

First round:
https://mail.python.org/pipermail/python-ideas/2016-December/043983.html
Second round:
https://mail.python.org/pipermail/python-dev/2017-August/149091.html
Discussion for the issue: https://bugs.python.org/issue25658
HTML version for PEP 539 draft: https://www.python.org/dev/peps/pep-0539/
Diff from first round to version 3:
https://gist.github.com/ma8ma/624f9e4435ebdb26230130b11ce12d20/revisions
And the pull-request for reference implementation (work in progress):
https://github.com/python/cpython/pull/1362

========================================
PEP: 539
Title: A New C-API for Thread-Local Storage in CPython
Version: $Revision$
Last-Modified: $Date$
Author: Erik M. Bray, Masayuki Yamamoto
BDFL-Delegate: Nick Coghlan
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 20-Dec-2016
Post-History: 16-Dec-2016

Abstract
========

The proposal is to add a new Thread Local Storage (TLS) API to CPython
which would supersede use of the existing TLS API within the CPython
interpreter, while deprecating the existing API.  The new API is named
the "Thread Specific Storage (TSS) API" (see `Rationale for Proposed
Solution`_ for the origin of the name).

Because the existing TLS API is only used internally (it is not
mentioned in the documentation, and the header that defines it,
``pythread.h``, is not included in ``Python.h`` either directly or
indirectly), this proposal probably only affects CPython, but might also
affect other interpreter implementations (PyPy?) that implement parts of
the CPython API.

This is motivated primarily by the fact that the old API uses ``int`` to
represent TLS keys across all platforms, which is neither
POSIX-compliant, nor portable in any practical sense [1]_.

.. note::

   Throughout this document the acronym "TLS" refers to Thread Local
   Storage and should not be confused with "Transport Layer Security"
   protocols.
Specification
=============

The current API for TLS used inside the CPython interpreter consists of
6 functions::

   PyAPI_FUNC(int) PyThread_create_key(void)
   PyAPI_FUNC(void) PyThread_delete_key(int key)
   PyAPI_FUNC(int) PyThread_set_key_value(int key, void *value)
   PyAPI_FUNC(void *) PyThread_get_key_value(int key)
   PyAPI_FUNC(void) PyThread_delete_key_value(int key)
   PyAPI_FUNC(void) PyThread_ReInitTLS(void)

These would be superseded by a new set of analogous functions::

   PyAPI_FUNC(int) PyThread_tss_create(Py_tss_t *key)
   PyAPI_FUNC(void) PyThread_tss_delete(Py_tss_t *key)
   PyAPI_FUNC(int) PyThread_tss_set(Py_tss_t *key, void *value)
   PyAPI_FUNC(void *) PyThread_tss_get(Py_tss_t *key)

The specification also adds a few new features:

* A new type ``Py_tss_t``--an opaque type the definition of which may
  depend on the underlying TLS implementation.  It is defined::

     typedef struct {
         int _is_initialized;
         NATIVE_TSS_KEY_T _key;
     } Py_tss_t;

  where ``NATIVE_TSS_KEY_T`` is a macro whose value depends on the
  underlying native TLS implementation (e.g. ``pthread_key_t``).

* A constant default value for ``Py_tss_t`` variables,
  ``Py_tss_NEEDS_INIT``.

* Three new functions::

     PyAPI_FUNC(Py_tss_t *) PyThread_tss_alloc(void)
     PyAPI_FUNC(void) PyThread_tss_free(Py_tss_t *key)
     PyAPI_FUNC(int) PyThread_tss_is_created(Py_tss_t *key)

  The first two are needed for dynamic (de-)allocation of a
  ``Py_tss_t``, particularly in extension modules built with
  ``Py_LIMITED_API``, where static allocation of this type is not
  possible due to its implementation being opaque at build time.  A
  value returned by ``PyThread_tss_alloc`` is in the same state as a
  value initialized with ``Py_tss_NEEDS_INIT``, or ``NULL`` in the case
  of dynamic allocation failure.  The behavior of ``PyThread_tss_free``
  involves calling ``PyThread_tss_delete`` preventively, or is a no-op
  if the value pointed to by the ``key`` argument is ``NULL``.
  ``PyThread_tss_is_created`` returns non-zero if the given ``Py_tss_t``
  has been initialized (i.e. by ``PyThread_tss_create``).

The new TSS API does not provide functions which correspond to
``PyThread_delete_key_value`` and ``PyThread_ReInitTLS``, because these
functions were needed only for CPython's now defunct built-in TLS
implementation; that is, the existing behavior of these functions is
treated as follows: ``PyThread_delete_key_value(key)`` is equivalent to
``PyThread_set_key_value(key, NULL)``, and ``PyThread_ReInitTLS()`` is a
no-op [8]_.

The new ``PyThread_tss_`` functions are almost exactly analogous to
their original counterparts with a few minor differences:  Whereas
``PyThread_create_key`` takes no arguments and returns a TLS key as an
``int``, ``PyThread_tss_create`` takes a ``Py_tss_t*`` as an argument
and returns an ``int`` status code.  The behavior of
``PyThread_tss_create`` is undefined if the value pointed to by the
``key`` argument is not initialized by ``Py_tss_NEEDS_INIT``.  The
returned status code is zero on success and non-zero on failure.  The
meanings of non-zero status codes are not otherwise defined by this
specification.

Similarly the other ``PyThread_tss_`` functions are passed a
``Py_tss_t*`` whereas previously the key was passed by value.  This
change is necessary, as being an opaque type, the ``Py_tss_t`` type
could hypothetically be almost any size.  This is especially necessary
for extension modules built with ``Py_LIMITED_API``, where the size of
the type is not known.
Except for ``PyThread_tss_free``, the behaviors of the ``PyThread_tss_``
functions are undefined if the value pointed to by the ``key`` argument
is ``NULL``.

Moreover, because of the use of ``Py_tss_t`` instead of ``int``, there
are behaviors in the new API which differ from the existing API with
regard to key creation and deletion.  ``PyThread_tss_create`` can be
called repeatedly on the same key--calling it on an already initialized
key is a no-op and immediately returns success.  Similarly for calling
``PyThread_tss_delete`` with an uninitialized key.  The behavior of
``PyThread_tss_delete`` is defined to change the key's initialization
state to "uninitialized"--this allows, for example, statically allocated
keys to be reset to a sensible state when restarting the CPython
interpreter without terminating the process (e.g. embedding Python in an
application) [12]_.

The old ``PyThread_*_key*`` functions will be marked as deprecated in
the documentation, but will not generate runtime deprecation warnings.

Additionally, on platforms where ``sizeof(pthread_key_t) !=
sizeof(int)``, ``PyThread_create_key`` will return immediately with a
failure status, and the other TLS functions will all be no-ops on such
platforms.

Comparison of API Specification
-------------------------------

================= ============================= =============================
API               Thread Local Storage (TLS)    Thread Specific Storage (TSS)
================= ============================= =============================
Version           Existing                      New
Key Type          ``int``                       ``Py_tss_t`` (opaque type)
Handle            Native Key cast to ``int``    concealed in internal field
Function Argument ``int``                       ``Py_tss_t *``
Features          - create key                  - create key
                  - delete key                  - delete key
                  - set value                   - set value
                  - get value                   - get value
                  - delete value                - (set ``NULL`` instead) [8]_
                  - reinitialize keys (after    - (unnecessary) [8]_
                    fork)                       - dynamically (de-)allocate
                                                  key
                                                - check key's initialization
                                                  state
Default Value     (``-1`` as key creation       ``Py_tss_NEEDS_INIT``
                  failure)
Requirement       native threads                native threads
                  (since CPython 3.7 [9]_)
Restriction       No support for platforms      Unable to statically allocate
                  where native TLS key is       keys when ``Py_LIMITED_API``
                  defined in a way that cannot  is defined.
                  be safely cast to ``int``.
================= ============================= =============================

Example
-------

With the proposed changes, a TSS key is initialized like::

    static Py_tss_t tss_key = Py_tss_NEEDS_INIT;
    if (PyThread_tss_create(&tss_key)) {
        /* ... handle key creation failure ... */
    }

The initialization state of the key can then be checked like::

    assert(PyThread_tss_is_created(&tss_key));

The rest of the API is used analogously to the old API::

    int the_value = 1;
    if (PyThread_tss_get(&tss_key) == NULL) {
        PyThread_tss_set(&tss_key, (void *)&the_value);
        assert(PyThread_tss_get(&tss_key) != NULL);
    }
    /* ... once done with the key ... */
    PyThread_tss_delete(&tss_key);
    assert(!PyThread_tss_is_created(&tss_key));

When ``Py_LIMITED_API`` is defined, a TSS key must be dynamically
allocated::

    static Py_tss_t *ptr_key = PyThread_tss_alloc();
    if (ptr_key == NULL) {
        /* ... handle key allocation failure ... */
    }
    assert(!PyThread_tss_is_created(ptr_key));
    /* ... once done with the key ... */
    PyThread_tss_free(ptr_key);
    ptr_key = NULL;

Platform Support Changes
========================

A new "Native Thread Implementation" section will be added to PEP 11
that states:

* As of CPython 3.7, all platforms are required to provide a native
  thread implementation (such as pthreads or Windows) to implement the
  TSS API.  Any TSS API problems that occur in an implementation without
  native threads will be closed as "won't fix".

Motivation
==========

The primary problem at issue here is the type of the keys (``int``) used
for TLS values, as defined by the original PyThread TLS API.

The original TLS API was added to Python by GvR back in 1997, and at the
time the key used to represent a TLS value was an ``int``, and so it has
been to the time of writing.  This used CPython's own TLS implementation
which long remained unused, largely unchanged, in Python/thread.c.
Support for implementation of the API on top of native thread
implementations (pthreads and Windows) was added much later, and the
built-in implementation has been deemed no longer necessary and has
since been removed [9]_.

The problem with the choice of ``int`` to represent a TLS key is that
while it was fine for CPython's own TLS implementation, and happens to
be compatible with Windows (which uses ``DWORD`` for the analogous
data), it is not compatible with the POSIX standard for the pthreads
API, which defines ``pthread_key_t`` as an opaque type not further
defined by the standard (as with ``Py_tss_t`` described above) [14]_.
This leaves it up to the underlying implementation how a
``pthread_key_t`` value is used to look up thread-specific data.

This has not generally been a problem for Python's API, as it just
happens that on Linux ``pthread_key_t`` is defined as an ``unsigned
int``, and so is fully compatible with Python's TLS
API--``pthread_key_t``'s created by ``pthread_key_create`` can be freely
cast to ``int`` and back (well, not exactly, even this has some
limitations as pointed out by issue #22206).

However, as issue #25658 points out, there are at least some platforms
(namely Cygwin, CloudABI, but likely others as well) which have
otherwise modern and POSIX-compliant pthreads implementations, but are
not compatible with Python's API because their ``pthread_key_t`` is
defined in a way that cannot be safely cast to ``int``.  In fact, the
possibility of running into this problem was raised by MvL at the time
pthreads TLS was added [2]_.

It could be argued that PEP-11 makes specific requirements for
supporting a new, not otherwise officially-supported platform (such as
CloudABI), and that the status of Cygwin support is currently dubious.
However, this creates a very high barrier to supporting platforms that
are otherwise Linux- and/or POSIX-compatible and where CPython might
otherwise "just work" except for this one hurdle.  CPython itself
imposes this implementation barrier by way of an API that is not
compatible with POSIX (and in fact makes invalid assumptions about
pthreads).

Rationale for Proposed Solution
===============================

The use of an opaque type (``Py_tss_t``) to key TLS values allows the
API to be compatible with all present (POSIX and Windows) and future
(C11?) native TLS implementations supported by CPython, as it allows the
definition of ``Py_tss_t`` to depend on the underlying implementation.

Since the existing TLS API has been available in *the limited API* [13]_
for some platforms (e.g. Linux), CPython makes an effort to provide the
new TSS API at that level likewise.
Note, however, that the ``Py_tss_t`` definition becomes an opaque struct
when ``Py_LIMITED_API`` is defined, because exposing
``NATIVE_TSS_KEY_T`` as part of the limited API would prevent us from
switching native thread implementation without rebuilding extension
modules.

A new API must be introduced, rather than changing the function
signatures of the current API, in order to maintain backwards
compatibility.  The new API also more clearly groups together these
related functions under a single name prefix, ``PyThread_tss_``.  The
"tss" in the name stands for "thread-specific storage", and was
influenced by the naming and design of the "tss" API that is part of the
C11 threads API [15]_.  However, this is in no way meant to imply
compatibility with or support for the C11 threads API, or signal any
future intention of supporting C11--it's just the influence for the
naming and design.

The inclusion of the special default value ``Py_tss_NEEDS_INIT`` is
required by the fact that not all native TLS implementations define a
sentinel value for uninitialized TLS keys.  For example, on Windows a
TLS key is represented by a ``DWORD`` (``unsigned int``) and its value
must be treated as opaque [3]_.  So there is no unsigned integer value
that can be safely used to represent an uninitialized TLS key on
Windows.  Likewise, POSIX does not specify a sentinel for an
uninitialized ``pthread_key_t``, instead relying on the ``pthread_once``
interface to ensure that a given TLS key is initialized only once
per-process.  Therefore, the ``Py_tss_t`` type contains an explicit
``._is_initialized`` field that can indicate the key's initialization
state independent of the underlying implementation.

Changing ``PyThread_create_key`` to immediately return a failure status
on systems using pthreads where ``sizeof(int) != sizeof(pthread_key_t)``
is intended as a sanity check:  Currently, ``PyThread_create_key`` may
report initial success on such systems, but attempts to use the returned
key are likely to fail.  Although in practice this failure occurs
earlier in the interpreter initialization, it's better to fail
immediately at the source of the problem (``PyThread_create_key``)
rather than sometime later when use of an invalid key is attempted.  In
other words, this indicates clearly that the old API is not supported on
platforms where it cannot be used reliably, and that no effort will be
made to add such support.

Rejected Ideas
==============

* Do nothing: The status quo is fine because it works on Linux, and
  platforms wishing to be supported by CPython should follow the
  requirements of PEP-11.  As explained above, while this would be a
  fair argument if CPython were being asked to make changes to support
  particular quirks or features of a specific platform, in this case it
  is a quirk of CPython that prevents it from being used to its full
  potential on otherwise POSIX-compliant platforms.  The fact that the
  current implementation happens to work on Linux is a happy accident,
  and there's no guarantee that this will never change.

* Affected platforms should just configure Python ``--without-threads``:
  this is no longer an option as the ``--without-threads`` option has
  been removed for Python 3.7 [16]_.

* Affected platforms should use CPython's built-in TLS implementation
  instead of a native TLS implementation: This is a more acceptable
  alternative to the previous idea, and in fact there had been a patch
  to do just that [4]_.
  However, the built-in implementation being "slower and clunkier" in
  general than native implementations still needlessly hobbles
  performance on affected platforms.  At least one other module
  (``tracemalloc``) is also broken if Python is built without a native
  TLS implementation.  This idea also cannot be adopted because the
  built-in implementation has since been removed.

* Keep the existing API, but work around the issue by providing a
  mapping from ``pthread_key_t`` values to ``int`` values.  A couple of
  attempts were made at this ([5]_, [6]_), but this injects needless
  complexity and overhead into performance-critical code on platforms
  that are not currently affected by this issue (such as Linux).  Even
  if use of this workaround were made conditional on platform
  compatibility, it introduces platform-specific code to maintain, and
  still has the problem of the previous rejected ideas of needlessly
  hobbling performance on affected platforms.

Implementation
==============

An initial version of a patch [7]_ is available on the bug tracker for
this issue.  Since the migration to GitHub, its development has
continued in the ``pep539-tss-api`` feature branch [10]_ in Masayuki
Yamamoto's fork of the CPython repository on GitHub.  A work-in-progress
PR is available at [11]_.

This reference implementation covers not only the new API implementation
features, but also the client code updates needed to replace the
existing TLS API with the new TSS API.

Copyright
=========

This document has been placed in the public domain.

References and Footnotes
========================

.. [1] http://bugs.python.org/issue25658
.. [2] https://bugs.python.org/msg116292
.. [3] https://msdn.microsoft.com/en-us/library/windows/desktop/ms686801(v=vs.85).aspx
.. [4] http://bugs.python.org/file45548/configure-pthread_key_t.patch
.. [5] http://bugs.python.org/file44269/issue25658-1.patch
.. [6] http://bugs.python.org/file44303/key-constant-time.diff
.. [7] http://bugs.python.org/file46379/pythread-tss-3.patch
.. [8] https://bugs.python.org/msg298342
.. [9] http://bugs.python.org/issue30832
.. [10] https://github.com/python/cpython/compare/master...ma8ma:pep539-tss-api
.. [11] https://github.com/python/cpython/pull/1362
.. [12] https://docs.python.org/3/c-api/init.html#c.Py_FinalizeEx
.. [13] It is also called the "stable ABI"
   (https://www.python.org/dev/peps/pep-0384/)
.. [14] http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_key_create.html
.. [15] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf#page=404
.. [16] https://bugs.python.org/issue31370

From solipsis at pitrou.net  Fri Sep  8 06:04:52 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 8 Sep 2017 12:04:52 +0200
Subject: [Python-Dev] PEP 552: deterministic pycs
References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com>
 <1504820433.3022749.1098877456.3FF76AA6@webmail.messagingengine.com>
 <1504824402.3041582.1098937744.55112449@webmail.messagingengine.com>
Message-ID: <20170908120452.22b6723b@fsol>

On Thu, 7 Sep 2017 18:47:20 -0700
Nick Coghlan wrote:
> However, I do wonder whether we could encode *all* the mode settings
> into the magic number, such that we did something like reserving the
> top 3 bits for format flags:
>
> * number & 0x1FFF -> the traditional magic number
> * number & 0x8000 -> timestamp or hash?
> * number & 0x4000 -> checked or not?
> * number & 0x2000 -> reserved for future format changes

I'd rather a single magic number and a separate bitfield that tells
what the header encodes exactly.  We don't *have* to fight for a tiny
size reduction of pyc files.

Regards

Antoine.

From solipsis at pitrou.net  Fri Sep  8 06:38:29 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 8 Sep 2017 12:38:29 +0200
Subject: [Python-Dev] PEP 552: single magic number
References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com>
 <1504820433.3022749.1098877456.3FF76AA6@webmail.messagingengine.com>
 <1504824402.3041582.1098937744.55112449@webmail.messagingengine.com>
 <20170908120452.22b6723b@fsol>
Message-ID: <20170908123829.6c8b9874@fsol>

On Fri, 8 Sep 2017 12:04:52 +0200
Antoine Pitrou wrote:
> On Thu, 7 Sep 2017 18:47:20 -0700
> Nick Coghlan wrote:
> > However, I do wonder whether we could encode *all* the mode settings
> > into the magic number, such that we did something like reserving the
> > top 3 bits for format flags:
> >
> > * number & 0x1FFF -> the traditional magic number
> > * number & 0x8000 -> timestamp or hash?
> > * number & 0x4000 -> checked or not?
> > * number & 0x2000 -> reserved for future format changes
>
> I'd rather a single magic number and a separate bitfield that tells
> what the header encodes exactly.  We don't *have* to fight for a tiny
> size reduction of pyc files.

Let me expand a bit on this.  Currently, the format is:

- bytes 0..3: magic number
- bytes 4..7: source file timestamp
- bytes 8..11: source file size
- bytes 12+: pyc file body (marshal format)

What I'm proposing is:

- bytes 0..3: magic number
- bytes 4..7: header options (bitfield)
- bytes 8..15: header contents
  Depending on header options:
  - bytes 8..11: source file timestamp
  - bytes 12..15: source file size
  or:
  - bytes 8..15: 64-bit source file hash
- bytes 16+: pyc file body (marshal format)

This way, we keep a single magic number, a single header size, and
there's only a per-build variation in the middle of the header.

Of course, there are other possible ways to encode information.  For
example, the header could be a sequence of Type-Length-Value triplets,
perhaps prefixed with header size or body offset for easy seeking.

My whole point here is that we can easily avoid the annoyance of dual
magic numbers and encodings which must be maintained in parallel.

Regards

Antoine.

From steve at holdenweb.com  Fri Sep  8 08:08:20 2017
From: steve at holdenweb.com (steve at holdenweb.com)
Date: Fri, 8 Sep 2017 05:08:20 -0700
Subject: [Python-Dev] PEP 553 V2 - builtin breakpoint() (was Re: PEP 553:
 Built-in debug())
In-Reply-To: 
References: 
Message-ID: 

It would appear to be more secure to define specific meanings for
particular values of the environment variable.
-S
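Pulling together the values floated in this sub-thread (Barry's "=0"
shortcut, Nick's "module:callable" spellings such as "pdb:set_trace" and
"sys:exit"), a rough sketch of how fixed meanings could be given to the
variable, along the lines Steve suggests -- entirely hypothetical, since
the PEP has not committed to the environment variable at this point:

    import importlib
    import os

    def resolve_breakpointhook():
        """Map a PYTHONBREAKPOINTHOOK value to a callable (hypothetical)."""
        value = os.environ.get('PYTHONBREAKPOINTHOOK', 'pdb:set_trace')
        if value == '0':
            # Barry's proposed shortcut: disable breakpoint() entirely.
            return lambda *args, **kwargs: None
        # Nick's "module:callable" form, e.g. "sys:exit" on CI systems.
        modname, _, funcname = value.partition(':')
        module = importlib.import_module(modname)
        return getattr(module, funcname)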
kz/zXoxzehnpbxEUC5+azR46laTrEyRdX9R5UhvY2EA8kuL2hKGQg5OXdaMJ dqtGLyGSa+XTOAvCD0DalJA6T4tDHjOlyUI2GaGs53it3dLAT4sUeBZMATGW vbR+FJbM/dQVvT39ogtzDilL5je2dGZUQMdqwyfPL3HkDmfkr7ESEJEjQD+I /0fzH5jGm0q67bI2B54ddLCy7/60ViFPDYDi57gMEq5lXx6uBQgQMh49vgfO 9ywulUyH2ueFijaTJulNQPpn0fhMTm/20SLI2qHYtLI/g/veMDva/UPuolr1 kQk0BRaLmR8audc4SrGWObGDgVfEmqqHdzJu97ZX6lEKtCcLh/n1u0SbmyEX SF/iStR5tnURIGrykEM+lkeJVmc0Yuh67DCiJC14Widnfy2f5FbcfyRsUs/Y 5WsqgXXroxDWXy/892paz8LmswX08y/sdxGKzr1GmMq8Ib5CixjmFuDhaKpp WhOOGXLkTeGK+W2ARm/xq2YLcmpTBm48ABwRHfTYm6WbrVMXjlWbqwhr3mWT 2g== =0xrN -----END PGP PUBLIC KEY BLOCK----- From ncoghlan at gmail.com Fri Sep 8 10:37:03 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 8 Sep 2017 07:37:03 -0700 Subject: [Python-Dev] PEP 539 v3: A new C API for Thread-Local Storage in CPython In-Reply-To: References: Message-ID: On 8 September 2017 at 00:30, Masayuki YAMAMOTO wrote: > Hi folks, > > I submit PEP 539 third draft for the finish. Thank you for all the advice > and the help! Thank you Erik & Yamamoto-san for all of your work on this PEP! The updates look good, so I'm happy to say as BDFL-Delegate that this proposal is now accepted :) Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Fri Sep 8 10:40:53 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 8 Sep 2017 07:40:53 -0700 Subject: [Python-Dev] PEP 552: single magic number In-Reply-To: <20170908123829.6c8b9874@fsol> References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> <1504820433.3022749.1098877456.3FF76AA6@webmail.messagingengine.com> <1504824402.3041582.1098937744.55112449@webmail.messagingengine.com> <20170908120452.22b6723b@fsol> <20170908123829.6c8b9874@fsol> Message-ID: I also like having the header fixed-size, so it might be possible to rewrite headers (e.g. to flip the source bit) without moving the rest of the file. On Fri, Sep 8, 2017 at 3:38 AM, Antoine Pitrou wrote: > On Fri, 8 Sep 2017 12:04:52 +0200 > Antoine Pitrou wrote: > > On Thu, 7 Sep 2017 18:47:20 -0700 > > Nick Coghlan wrote: > > > However, I do wonder whether we could encode *all* the mode settings > > > into the magic number, such that we did something like reserving the > > > top 3 bits for format flags: > > > > > > * number & 0x1FFF -> the traditional magic number > > > * number & 0x8000 -> timestamp or hash? > > > * number & 0x4000 -> checked or not? > > > * number & 0x2000 -> reserved for future format changes > > > > I'd rather a single magic number and a separate bitfield that tells > > what the header encodes exactly. We don't *have* to fight for a tiny > > size reduction of pyc files. > > Let me expand a bit on this. Currently, the format is: > > - bytes 0..3: magic number > - bytes 4..7: source file timestamp > - bytes 8..11: source file size > - bytes 12+: pyc file body (marshal format) > > What I'm proposing is: > > - bytes 0..3: magic number > - bytes 4..7: header options (bitfield) > - bytes 8..15: header contents > Depending on header options: > - bytes 8..11: source file timestamp > - bytes 12..15: source file size > or: > - bytes 8..15: 64-bit source file hash > - bytes 16+: pyc file body (marshal format) > > This way, we keep a single magic number, a single header size, and > there's only a per-build variation in the middle of the header. > > > Of course, there are possible ways to encode information. 
> example, the header could be a sequence of Type-Length-Value triplets,
> perhaps prefixed with header size or body offset for easy seeking.
>
> My whole point here is that we can easily avoid the annoyance of dual
> magic numbers and encodings which must be maintained in parallel.
>
> Regards
>
> Antoine.

--
--Guido van Rossum (python.org/~guido)

From ncoghlan at gmail.com  Fri Sep  8 10:49:46 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 8 Sep 2017 07:49:46 -0700
Subject: [Python-Dev] PEP 552: deterministic pycs
In-Reply-To: <20170908120452.22b6723b@fsol>
References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com>
 <1504820433.3022749.1098877456.3FF76AA6@webmail.messagingengine.com>
 <1504824402.3041582.1098937744.55112449@webmail.messagingengine.com>
 <20170908120452.22b6723b@fsol>
Message-ID: 

On 8 September 2017 at 03:04, Antoine Pitrou wrote:
> On Thu, 7 Sep 2017 18:47:20 -0700
> Nick Coghlan wrote:
>> However, I do wonder whether we could encode *all* the mode settings
>> into the magic number, such that we did something like reserving the
>> top 3 bits for format flags:
>>
>> * number & 0x1FFF -> the traditional magic number
>> * number & 0x8000 -> timestamp or hash?
>> * number & 0x4000 -> checked or not?
>> * number & 0x2000 -> reserved for future format changes
>
> I'd rather a single magic number and a separate bitfield that tells
> what the header encodes exactly.  We don't *have* to fight for a tiny
> size reduction of pyc files.

One of Benjamin's goals was for the existing timestamp-based pyc
format to remain completely unchanged, so we need some kind of marker
in the magic number to indicate whether the file is using the new
format or not.

I'd also be fine with using a single bit for that, such that the only
bitmasking needed was:

* number & 0x8000 -> legacy format or new format?
* number & 0x7FFF -> the magic number itself

And any further flags would go in a separate field.

That's essentially what PEP 552 already suggests; the only adjustment
is the idea of specifically using the high order bit in the magic
number field to indicate the pyc format in use, rather than leaving
the explanation of how the two magic numbers will differ unspecified.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com  Fri Sep  8 10:53:40 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 8 Sep 2017 07:53:40 -0700
Subject: [Python-Dev] PEP 553 V2 - builtin breakpoint() (was Re: PEP 553:
 Built-in debug())
In-Reply-To: 
References: <7183CD79-9A3A-489D-96B8-3D570D8A3170@python.org>
Message-ID: 

On 7 September 2017 at 20:02, Adrian Petrescu wrote:
> Would that not be a security concern, if you can get Python to execute
> arbitrary code just by setting an environment variable?

Not really, as once someone has write access to your process
environment, you've already lost (they can mess with PYTHONIOENCODING,
PYTHONPATH, LD_PRELOAD, OpenSSL certificate verification settings, and
more).

Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From solipsis at pitrou.net  Fri Sep  8 10:53:30 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 8 Sep 2017 16:53:30 +0200
Subject: [Python-Dev] Pygments in PEPs?
Message-ID: <20170908165330.61e1a8bf@fsol>

Hi,

Is it possible to install pygments on the machine which builds and
renders PEPs?  It would allow syntax highlighting in PEPs, making
example code more readable.

Regards

Antoine.

From solipsis at pitrou.net  Fri Sep  8 10:55:00 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 8 Sep 2017 16:55:00 +0200
Subject: [Python-Dev] PEP 552: deterministic pycs
In-Reply-To: 
References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com>
 <1504820433.3022749.1098877456.3FF76AA6@webmail.messagingengine.com>
 <1504824402.3041582.1098937744.55112449@webmail.messagingengine.com>
 <20170908120452.22b6723b@fsol>
Message-ID: <20170908165500.3828c370@fsol>

On Fri, 8 Sep 2017 07:49:46 -0700
Nick Coghlan wrote:
> On 8 September 2017 at 03:04, Antoine Pitrou wrote:
> > On Thu, 7 Sep 2017 18:47:20 -0700
> > Nick Coghlan wrote:
> >> However, I do wonder whether we could encode *all* the mode settings
> >> into the magic number, such that we did something like reserving the
> >> top 3 bits for format flags:
> >>
> >> * number & 0x1FFF -> the traditional magic number
> >> * number & 0x8000 -> timestamp or hash?
> >> * number & 0x4000 -> checked or not?
> >> * number & 0x2000 -> reserved for future format changes
> >
> > I'd rather a single magic number and a separate bitfield that tells
> > what the header encodes exactly.  We don't *have* to fight for a tiny
> > size reduction of pyc files.
>
> One of Benjamin's goals was for the existing timestamp-based pyc
> format to remain completely unchanged, so we need some kind of marker
> in the magic number to indicate whether the file is using the new
> format or not.

I don't think that's a useful goal, as long as we bump the magic
number.  Note the header format was already changed in the past when we
added a "size" field beside the "timestamp" field, to resolve
collisions due to timestamp granularity.

Regards

Antoine.

From eric at trueblade.com  Fri Sep  8 10:57:09 2017
From: eric at trueblade.com (Eric V. Smith)
Date: Fri, 8 Sep 2017 07:57:09 -0700
Subject: [Python-Dev] PEP 557: Data Classes
Message-ID: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com>

I've written a PEP for what might be thought of as "mutable namedtuples
with defaults, but not inheriting tuple's behavior" (a mouthful, but it
sounded simpler when I first thought of it). It's heavily influenced by
the attrs project. It uses PEP 526 type annotations to define fields.
From the overview section:

@dataclass
class InventoryItem:
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand

Will automatically add these methods:

def __init__(self, name: str, unit_price: float, quantity_on_hand: int
= 0) -> None:
    self.name = name
    self.unit_price = unit_price
    self.quantity_on_hand = quantity_on_hand
def __repr__(self):
    return
f'InventoryItem(name={self.name!r},unit_price={self.unit_price!r},quantity_on_hand={self.quantity_on_hand!r})'
def __eq__(self, other):
    if other.__class__ is self.__class__:
        return (self.name, self.unit_price, self.quantity_on_hand) ==
(other.name, other.unit_price, other.quantity_on_hand)
    return NotImplemented
def __ne__(self, other):
    if other.__class__ is self.__class__:
        return (self.name, self.unit_price, self.quantity_on_hand) !=
(other.name, other.unit_price, other.quantity_on_hand)
    return NotImplemented
def __lt__(self, other):
    if other.__class__ is self.__class__:
        return (self.name, self.unit_price, self.quantity_on_hand) <
(other.name, other.unit_price, other.quantity_on_hand)
    return NotImplemented
def __le__(self, other):
    if other.__class__ is self.__class__:
        return (self.name, self.unit_price, self.quantity_on_hand) <=
(other.name, other.unit_price, other.quantity_on_hand)
    return NotImplemented
def __gt__(self, other):
    if other.__class__ is self.__class__:
        return (self.name, self.unit_price, self.quantity_on_hand) >
(other.name, other.unit_price, other.quantity_on_hand)
    return NotImplemented
def __ge__(self, other):
    if other.__class__ is self.__class__:
        return (self.name, self.unit_price, self.quantity_on_hand) >=
(other.name, other.unit_price, other.quantity_on_hand)
    return NotImplemented

Data Classes saves you from writing and maintaining these functions.

The PEP is largely complete, but could use some filling out in places.
Comments welcome!

Eric.

P.S. I wrote this PEP when I was in my happy place.

From levkivskyi at gmail.com  Fri Sep  8 10:59:57 2017
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Fri, 8 Sep 2017 16:59:57 +0200
Subject: [Python-Dev] Pygments in PEPs?
In-Reply-To: <20170908165330.61e1a8bf@fsol>
References: <20170908165330.61e1a8bf@fsol>
Message-ID: 

I already made a PR that would highlight code in PEPs half a year ago,
but it was rejected with the reason that they will be moved to RtD soon.

--
Ivan

8 Sep 2017 16:56, "Antoine Pitrou" wrote:

>
> Hi,
>
> Is it possible to install pygments on the machine which builds and
> renders PEPs?  It would allow syntax highlighting in PEPs, making
> example code more readable.
>
> Regards
>
> Antoine.

From solipsis at pitrou.net  Fri Sep  8 11:05:23 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 8 Sep 2017 17:05:23 +0200
Subject: [Python-Dev] PEP 556: Threaded garbage collection
Message-ID: <20170908170523.73bdb61e@fsol>

Hello,

I've written a PEP by which you can tell the GC to run in a dedicated
thread.  The goal is to solve reentrancy issues with finalizers:

https://www.python.org/dev/peps/pep-0556/

Regards

Antoine.

PS: I did not come up with the idea for this PEP while other people
were having nightmares.
Any nightmares involved in this PEP are fictional, and any resemblance to actual nightmares is purely coincidental. No nightmares were harmed while writing this PEP. From eric at trueblade.com Fri Sep 8 11:01:34 2017 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 8 Sep 2017 08:01:34 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> Message-ID: <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> Oops, I forgot the link. It should show up shortly at https://www.python.org/dev/peps/pep-0557/. Eric. On 9/8/17 7:57 AM, Eric V. Smith wrote: > I've written a PEP for what might be thought of as "mutable namedtuples > with defaults, but not inheriting tuple's behavior" (a mouthful, but it > sounded simpler when I first thought of it). It's heavily influenced by > the attrs project. It uses PEP 526 type annotations to define fields. > From the overview section: > > @dataclass > class InventoryItem: > name: str > unit_price: float > quantity_on_hand: int = 0 > > def total_cost(self) -> float: > return self.unit_price * self.quantity_on_hand > > Will automatically add these methods: > > def __init__(self, name: str, unit_price: float, quantity_on_hand: int > = 0) -> None: > self.name = name > self.unit_price = unit_price > self.quantity_on_hand = quantity_on_hand > def __repr__(self): > return > f'InventoryItem(name={self.name!r},unit_price={self.unit_price!r},quantity_on_hand={self.quantity_on_hand!r})' > > def __eq__(self, other): > if other.__class__ is self.__class__: > return (self.name, self.unit_price, self.quantity_on_hand) == > (other.name, other.unit_price, other.quantity_on_hand) > return NotImplemented > def __ne__(self, other): > if other.__class__ is self.__class__: > return (self.name, self.unit_price, self.quantity_on_hand) != > (other.name, other.unit_price, other.quantity_on_hand) > return NotImplemented > def __lt__(self, other): > if other.__class__ is self.__class__: > return (self.name, self.unit_price, self.quantity_on_hand) < > (other.name, other.unit_price, other.quantity_on_hand) > return NotImplemented > def __le__(self, other): > if other.__class__ is self.__class__: > return (self.name, self.unit_price, self.quantity_on_hand) <= > (other.name, other.unit_price, other.quantity_on_hand) > return NotImplemented > def __gt__(self, other): > if other.__class__ is self.__class__: > return (self.name, self.unit_price, self.quantity_on_hand) > > (other.name, other.unit_price, other.quantity_on_hand) > return NotImplemented > def __ge__(self, other): > if other.__class__ is self.__class__: > return (self.name, self.unit_price, self.quantity_on_hand) >= > (other.name, other.unit_price, other.quantity_on_hand) > return NotImplemented > > Data Classes saves you from writing and maintaining these functions. > > The PEP is largely complete, but could use some filling out in places. > Comments welcome! > > Eric. > > P.S. I wrote this PEP when I was in my happy place. > From neil at python.ca Fri Sep 8 11:50:12 2017 From: neil at python.ca (Neil Schemenauer) Date: Fri, 8 Sep 2017 09:50:12 -0600 Subject: [Python-Dev] Lazy initialization of module global state Message-ID: <20170908155012.bqotiqug4uthnsjz@python.ca> This is an idea that came out of the lazy module loading (via AST analysis), posted to python-ideas. The essential idea is to split the marshal data stored in the .pyc into smaller pieces and only load the parts as they are accessed. E.g. 
use a __getattr__ hook on the module to unmarshal+exec the code. I have a very early prototype: https://github.com/warsaw/lazyimport/blob/master/lazy_compile.py Would work like a "compile_all.py" tool. It writes standard .pyc files right now. It is not there yet but I should use the AST analysis, like the lazy module load stuff, to determine if things have potential side-effects on module import. Those things will get loaded eagerly, like they do now. Initially I was thinking of class definitions and functions but now I realize any global state could get this treatment. E.g. if you have a large dictionary global, don't unmarshal it until someone accesses the module attribute. This should be pretty safe to do and should give a significant benefit in startup time and memory usage. From p.f.moore at gmail.com Fri Sep 8 11:52:26 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 8 Sep 2017 16:52:26 +0100 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> Message-ID: On 8 September 2017 at 15:57, Eric V. Smith wrote: > I've written a PEP for what might be thought of as "mutable namedtuples with > defaults, but not inheriting tuple's behavior" (a mouthful, but it sounded > simpler when I first thought of it). It's heavily influenced by the attrs > project. It uses PEP 526 type annotations to define fields. [...] > The PEP is largely complete, but could use some filling out in places. > Comments welcome! > > Eric. > > P.S. I wrote this PEP when I was in my happy place. Looks good! One minor point - apparently in your happy place, C and Python have the same syntax :-) """ field's may optionally specify a default value, using normal Python syntax: @dataclass class C: int a # 'a' has no default value int b = 0 # assign a default value for 'b' """ From jcgoble3 at gmail.com Fri Sep 8 11:57:40 2017 From: jcgoble3 at gmail.com (Jonathan Goble) Date: Fri, 08 Sep 2017 15:57:40 +0000 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> Message-ID: Interesting. I note that this under "Specification": """ field's may optionally specify a default value, using normal Python syntax: @dataclass class C: int a # 'a' has no default value int b = 0 # assign a default value for 'b' """ ...does not look like "normal Python syntax". On Fri, Sep 8, 2017 at 11:44 AM Eric V. Smith wrote: > Oops, I forgot the link. It should show up shortly at > https://www.python.org/dev/peps/pep-0557/. > > Eric. > > On 9/8/17 7:57 AM, Eric V. Smith wrote: > > I've written a PEP for what might be thought of as "mutable namedtuples > > with defaults, but not inheriting tuple's behavior" (a mouthful, but it > > sounded simpler when I first thought of it). It's heavily influenced by > > the attrs project. It uses PEP 526 type annotations to define fields. 
> > From the overview section: > > > > @dataclass > > class InventoryItem: > > name: str > > unit_price: float > > quantity_on_hand: int = 0 > > > > def total_cost(self) -> float: > > return self.unit_price * self.quantity_on_hand > > > > Will automatically add these methods: > > > > def __init__(self, name: str, unit_price: float, quantity_on_hand: int > > = 0) -> None: > > self.name = name > > self.unit_price = unit_price > > self.quantity_on_hand = quantity_on_hand > > def __repr__(self): > > return > > f'InventoryItem(name={self.name > !r},unit_price={self.unit_price!r},quantity_on_hand={self.quantity_on_hand!r})' > > > > def __eq__(self, other): > > if other.__class__ is self.__class__: > > return (self.name, self.unit_price, self.quantity_on_hand) == > > (other.name, other.unit_price, other.quantity_on_hand) > > return NotImplemented > > def __ne__(self, other): > > if other.__class__ is self.__class__: > > return (self.name, self.unit_price, self.quantity_on_hand) != > > (other.name, other.unit_price, other.quantity_on_hand) > > return NotImplemented > > def __lt__(self, other): > > if other.__class__ is self.__class__: > > return (self.name, self.unit_price, self.quantity_on_hand) < > > (other.name, other.unit_price, other.quantity_on_hand) > > return NotImplemented > > def __le__(self, other): > > if other.__class__ is self.__class__: > > return (self.name, self.unit_price, self.quantity_on_hand) <= > > (other.name, other.unit_price, other.quantity_on_hand) > > return NotImplemented > > def __gt__(self, other): > > if other.__class__ is self.__class__: > > return (self.name, self.unit_price, self.quantity_on_hand) > > > (other.name, other.unit_price, other.quantity_on_hand) > > return NotImplemented > > def __ge__(self, other): > > if other.__class__ is self.__class__: > > return (self.name, self.unit_price, self.quantity_on_hand) >= > > (other.name, other.unit_price, other.quantity_on_hand) > > return NotImplemented > > > > Data Classes saves you from writing and maintaining these functions. > > > > The PEP is largely complete, but could use some filling out in places. > > Comments welcome! > > > > Eric. > > > > P.S. I wrote this PEP when I was in my happy place. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/jcgoble3%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ma3yuki.8mamo10 at gmail.com Fri Sep 8 11:58:37 2017 From: ma3yuki.8mamo10 at gmail.com (Masayuki YAMAMOTO) Date: Sat, 9 Sep 2017 00:58:37 +0900 Subject: [Python-Dev] PEP 539 v3: A new C API for Thread-Local Storage in CPython In-Reply-To: References: Message-ID: Awesome! Thank you for accepting the PEP :) The only thing missing is a reference implementation, I'm going to complete. BTW, one of TSS keys is going to move to the runtime struct by bpo-30860 "Consolidate stateful C globals under a single struct". Although that's PR was merged, the issue status is still open yet. Is there a chance for rollback? Masayuki 2017-09-08 23:37 GMT+09:00 Nick Coghlan : > On 8 September 2017 at 00:30, Masayuki YAMAMOTO > wrote: > > Hi folks, > > > > I submit PEP 539 third draft for the finish. Thank you for all the advice > > and the help! > > Thank you Erik & Yamamoto-san for all of your work on this PEP! 
> > The updates look good, so I'm happy to say as BDFL-Delegate that this > proposal is now accepted :) > > Regards, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -------------- next part -------------- An HTML attachment was scrubbed... URL: From turnbull.stephen.fw at u.tsukuba.ac.jp Fri Sep 8 11:36:56 2017 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Sat, 9 Sep 2017 00:36:56 +0900 Subject: [Python-Dev] PEP 553: Built-in debug() In-Reply-To: References: Message-ID: <22962.47384.836700.537296@turnbull.sk.tsukuba.ac.jp> Barry Warsaw writes: > and you would drop into the debugger after foo() but before bar(). > More rationale and details are provided in the PEP: > > https://www.python.org/dev/peps/pep-0553/ > > Unlike David, but like Larry, I have a prototype implementation: > > https://github.com/python/cpython/pull/3355 > > P.S. This came to me in a nightmare on Sunday night, and the more I > explored the idea the more it frightened me. I know exactly what I > was dreaming about and the only way to make it all go away was to > write this thing up. What's even scarier are the Qabalistic implications of the reversal of the PEP # and the PR #! Steve From eric at trueblade.com Fri Sep 8 12:07:42 2017 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 8 Sep 2017 09:07:42 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> Message-ID: <4BBCD739-7597-4425-90E4-FFDE6AA5378D@trueblade.com> Hah! Thanks for the catch. There is no C in my happy place. -- Eric. > On Sep 8, 2017, at 8:52 AM, Paul Moore wrote: > >> On 8 September 2017 at 15:57, Eric V. Smith wrote: >> I've written a PEP for what might be thought of as "mutable namedtuples with >> defaults, but not inheriting tuple's behavior" (a mouthful, but it sounded >> simpler when I first thought of it). It's heavily influenced by the attrs >> project. It uses PEP 526 type annotations to define fields. > [...] >> The PEP is largely complete, but could use some filling out in places. >> Comments welcome! >> >> Eric. >> >> P.S. I wrote this PEP when I was in my happy place. > > Looks good! One minor point - apparently in your happy place, C and > Python have the same syntax :-) > > """ > field's may optionally specify a default value, using normal Python syntax: > > @dataclass > class C: > int a # 'a' has no default value > int b = 0 # assign a default value for 'b' > """ From status at bugs.python.org Fri Sep 8 12:09:23 2017 From: status at bugs.python.org (Python tracker) Date: Fri, 8 Sep 2017 18:09:23 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20170908160923.A8FD411A866@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2017-09-01 - 2017-09-08) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 6149 (-17) closed 36997 (+87) total 43146 (+70) Open issues with patches: 2321 Issues opened (45) ================== #31069: test_multiprocessing_spawn and test_multiprocessing_forkserver https://bugs.python.org/issue31069 reopened by haypo #31327: bug in dateutil\tz\tz.py https://bugs.python.org/issue31327 opened by jerrykramskoy #31329: Add idlelib module entry to doc https://bugs.python.org/issue31329 opened by terry.reedy #31331: IDLE: Move prompts with input. 
https://bugs.python.org/issue31331 opened by terry.reedy #31332: Building modules by Clang with Microsoft CodeGen https://bugs.python.org/issue31332 opened by tats.u. #31333: Implement ABCMeta in C https://bugs.python.org/issue31333 opened by levkivskyi #31334: select.poll.poll fails on BSDs with arbitrary negative timeout https://bugs.python.org/issue31334 opened by Riccardo Coccioli #31336: Speed up _PyType_Lookup() for class creation https://bugs.python.org/issue31336 opened by scoder #31338: Use abort() for code we never expect to hit https://bugs.python.org/issue31338 opened by barry #31342: test.bisect module causes tests to fail https://bugs.python.org/issue31342 opened by nascheme #31345: Backport docstring improvements to the C version of OrderedDic https://bugs.python.org/issue31345 opened by rhettinger #31346: Prefer PROTOCOL_TLS_CLIENT/SERVER https://bugs.python.org/issue31346 opened by christian.heimes #31348: webbrowser BROWSER with MacOSXOSAScript type https://bugs.python.org/issue31348 opened by michael-lazar #31349: Embedded initialization ignores Py_SetProgramName() https://bugs.python.org/issue31349 opened by chrullrich #31351: ensurepip discards pip's return code which leads to broken ven https://bugs.python.org/issue31351 opened by Igor Filatov #31353: Implement PEP 553 - built-in debug() https://bugs.python.org/issue31353 opened by barry #31354: Fixing a bug related to LTO only build https://bugs.python.org/issue31354 opened by octavian.soldea #31355: Remove Travis CI macOS job: rely on buildbots https://bugs.python.org/issue31355 opened by haypo #31356: Add context manager to temporarily disable GC https://bugs.python.org/issue31356 opened by rhettinger #31357: Expose `worker_target` and `workitem_cls` as arugments to cust https://bugs.python.org/issue31357 opened by justdoit0823 #31359: `configure` script incorrectly detects symbols as available on https://bugs.python.org/issue31359 opened by Maxime Belanger #31361: Update feedparser.py to prevent theano compiling fail in pytho https://bugs.python.org/issue31361 opened by Wei-Shun Lo #31363: __PYVENV_LAUNCHER__ breaks calling another venv's interpreter https://bugs.python.org/issue31363 opened by Ilya.Kulakov #31366: Missing terminator option when using readline with socketserve https://bugs.python.org/issue31366 opened by Thomas Feldmann #31368: RWF_NONBLOCK https://bugs.python.org/issue31368 opened by YoSTEALTH #31369: re.RegexFlag is not included in __all__ https://bugs.python.org/issue31369 opened by PJB3005 #31371: Remove deprecated tkinter.tix module in 3.7 https://bugs.python.org/issue31371 opened by ned.deily #31372: Add SSLSocket.get_verify_result() https://bugs.python.org/issue31372 opened by christian.heimes #31374: expat: warning: "_POSIX_C_SOURCE" redefined https://bugs.python.org/issue31374 opened by christian.heimes #31375: Add the interpreters module to stdlib (PEP 554). https://bugs.python.org/issue31375 opened by eric.snow #31376: test_multiprocessing_spawn randomly hangs AMD64 FreeBSD 10.x S https://bugs.python.org/issue31376 opened by haypo #31377: remove *_INTERNED opcodes from marshal https://bugs.python.org/issue31377 opened by benjamin.peterson #31378: Missing documentation for sqlite3.OperationalError https://bugs.python.org/issue31378 opened by Leonardo Taglialegne #31380: test_undecodable_filename() in Lib/test/test_httpservers.py br https://bugs.python.org/issue31380 opened by howarthjw #31381: Unable to read the project file "pythoncore.vcxproj". 
https://bugs.python.org/issue31381 opened by denis-osipov #31382: CGI upload error when file ~< 10kb https://bugs.python.org/issue31382 opened by mschaming #31386: Make return types of wrap_bio and wrap_socket customizable https://bugs.python.org/issue31386 opened by christian.heimes #31387: asyncio should make it easy to enable cooperative SIGINT handl https://bugs.python.org/issue31387 opened by ncoghlan #31388: Provide a way to defer SIGINT handling in the current thread https://bugs.python.org/issue31388 opened by ncoghlan #31389: Give pdb.set_trace() an optional `header` keyword argument https://bugs.python.org/issue31389 opened by barry #31390: pydoc.Helper.keywords missing async and await https://bugs.python.org/issue31390 opened by rsw #31391: Forward-port test_xpickle from 2.7 to 3.x https://bugs.python.org/issue31391 opened by zach.ware #31392: Upgrade installers to OpenSSL 1.1.0f https://bugs.python.org/issue31392 opened by steve.dower #31394: Ellipsis_token.type != token.ELLIPSIS https://bugs.python.org/issue31394 opened by Aivar.Annamaa #31395: Docs Downloads are 404s https://bugs.python.org/issue31395 opened by n8henrie Most recent 15 issues with no replies (15) ========================================== #31395: Docs Downloads are 404s https://bugs.python.org/issue31395 #31392: Upgrade installers to OpenSSL 1.1.0f https://bugs.python.org/issue31392 #31391: Forward-port test_xpickle from 2.7 to 3.x https://bugs.python.org/issue31391 #31390: pydoc.Helper.keywords missing async and await https://bugs.python.org/issue31390 #31386: Make return types of wrap_bio and wrap_socket customizable https://bugs.python.org/issue31386 #31382: CGI upload error when file ~< 10kb https://bugs.python.org/issue31382 #31380: test_undecodable_filename() in Lib/test/test_httpservers.py br https://bugs.python.org/issue31380 #31376: test_multiprocessing_spawn randomly hangs AMD64 FreeBSD 10.x S https://bugs.python.org/issue31376 #31375: Add the interpreters module to stdlib (PEP 554). https://bugs.python.org/issue31375 #31374: expat: warning: "_POSIX_C_SOURCE" redefined https://bugs.python.org/issue31374 #31372: Add SSLSocket.get_verify_result() https://bugs.python.org/issue31372 #31368: RWF_NONBLOCK https://bugs.python.org/issue31368 #31366: Missing terminator option when using readline with socketserve https://bugs.python.org/issue31366 #31363: __PYVENV_LAUNCHER__ breaks calling another venv's interpreter https://bugs.python.org/issue31363 #31357: Expose `worker_target` and `workitem_cls` as arugments to cust https://bugs.python.org/issue31357 Most recent 15 issues waiting for review (15) ============================================= #31392: Upgrade installers to OpenSSL 1.1.0f https://bugs.python.org/issue31392 #31389: Give pdb.set_trace() an optional `header` keyword argument https://bugs.python.org/issue31389 #31386: Make return types of wrap_bio and wrap_socket customizable https://bugs.python.org/issue31386 #31381: Unable to read the project file "pythoncore.vcxproj". 
https://bugs.python.org/issue31381 #31346: Prefer PROTOCOL_TLS_CLIENT/SERVER https://bugs.python.org/issue31346 #31336: Speed up _PyType_Lookup() for class creation https://bugs.python.org/issue31336 #31325: req_rate is a namedtuple type rather than instance https://bugs.python.org/issue31325 #31315: assertion failure in imp.create_dynamic(), when spec.name is n https://bugs.python.org/issue31315 #31310: semaphore tracker isn't protected against crashes https://bugs.python.org/issue31310 #31308: forkserver process isn't re-launched if it died https://bugs.python.org/issue31308 #31294: ZeroMQSocketListener and ZeroMQSocketHandler examples in the L https://bugs.python.org/issue31294 #31184: Fix data descriptor detection in inspect.getattr_static https://bugs.python.org/issue31184 #31179: Speed-up dict.copy() up to 5.5 times. https://bugs.python.org/issue31179 #31175: Exception while extracting file from ZIP with non-matching fil https://bugs.python.org/issue31175 #31151: socketserver.ForkingMixIn.server_close() leaks zombie processe https://bugs.python.org/issue31151 Top 10 most discussed issues (10) ================================= #29988: with statements are not ensuring that __exit__ is called if __ https://bugs.python.org/issue29988 26 msgs #31338: Use abort() for code we never expect to hit https://bugs.python.org/issue31338 18 msgs #14976: queue.Queue() is not reentrant, so signals and GC can cause de https://bugs.python.org/issue14976 14 msgs #31336: Speed up _PyType_Lookup() for class creation https://bugs.python.org/issue31336 14 msgs #17852: Built-in module _io can lose data from buffered files at exit https://bugs.python.org/issue17852 7 msgs #30403: PEP 547: Running extension modules using -m switch https://bugs.python.org/issue30403 7 msgs #31355: Remove Travis CI macOS job: rely on buildbots https://bugs.python.org/issue31355 7 msgs #30860: Consolidate stateful C globals under a single struct. 
https://bugs.python.org/issue30860 6 msgs #31333: Implement ABCMeta in C https://bugs.python.org/issue31333 6 msgs #31377: remove *_INTERNED opcodes from marshal https://bugs.python.org/issue31377 6 msgs Issues closed (82) ================== #10746: ctypes c_long & c_bool have incorrect PEP-3118 type codes https://bugs.python.org/issue10746 closed by pitrou #11943: Add TLS-SRP (RFC 5054) support to ssl, _ssl, http, and urllib https://bugs.python.org/issue11943 closed by christian.heimes #12197: non-blocking SSL write fails if a partial write was issued https://bugs.python.org/issue12197 closed by christian.heimes #12633: sys.modules doc entry should reflect restrictions https://bugs.python.org/issue12633 closed by eric.snow #14191: argparse doesn't allow optionals within positionals https://bugs.python.org/issue14191 closed by r.david.murray #15464: ssl: add set_msg_callback function https://bugs.python.org/issue15464 closed by christian.heimes #16988: argparse: PARSER option for nargs not documented https://bugs.python.org/issue16988 closed by r.david.murray #19084: No way to use TLS-PSK from python ssl https://bugs.python.org/issue19084 closed by christian.heimes #20994: Disable TLS Compression https://bugs.python.org/issue20994 closed by christian.heimes #21649: Mention "Recommendations for Secure Use of TLS and DTLS" https://bugs.python.org/issue21649 closed by Mariatta #21818: cookielib documentation references Cookie module, not cookieli https://bugs.python.org/issue21818 closed by berker.peksag #22536: subprocess should include filename in FileNotFoundError except https://bugs.python.org/issue22536 closed by gregory.p.smith #22635: subprocess.getstatusoutput changed behavior in 3.4 (maybe 3.3. https://bugs.python.org/issue22635 closed by gregory.p.smith #23863: Fix EINTR Socket Module issues in 2.7 https://bugs.python.org/issue23863 closed by haypo #24516: SSL create_default_socket purpose insufficiently documented https://bugs.python.org/issue24516 closed by christian.heimes #25674: test_ssl (test_algorithms) failures on bolen-ubuntu slaves: sh https://bugs.python.org/issue25674 closed by christian.heimes #26612: test_ssl: use context manager (with) to fix ResourceWarning https://bugs.python.org/issue26612 closed by haypo #26904: Difflib quick_ratio() could use Counter() https://bugs.python.org/issue26904 closed by mscuthbert #27144: concurrent.futures.as_completed() memory inefficiency https://bugs.python.org/issue27144 closed by pitrou #27340: bytes-like objects with socket.sendall(), SSL, and http.client https://bugs.python.org/issue27340 closed by christian.heimes #27448: Race condition in subprocess.Popen which causes a huge memory https://bugs.python.org/issue27448 closed by gregory.p.smith #27584: New addition of vSockets to the python socket module https://bugs.python.org/issue27584 closed by christian.heimes #27629: Cannot create ssl.SSLSocket without existing socket https://bugs.python.org/issue27629 closed by christian.heimes #27768: ssl: get CPU cap flags for AESNI and PCLMULQDQ https://bugs.python.org/issue27768 closed by christian.heimes #28085: SSL: Add client and server protocols for SSLContext https://bugs.python.org/issue28085 closed by christian.heimes #28191: Support RFC4985 SRVName in SAN name https://bugs.python.org/issue28191 closed by christian.heimes #28196: ssl.match_hostname() should check for SRV-ID and URI-ID https://bugs.python.org/issue28196 closed by christian.heimes #28411: Eliminate PyInterpreterState.modules. 
https://bugs.python.org/issue28411 closed by eric.snow #28588: Memory leak in OpenSSL thread state https://bugs.python.org/issue28588 closed by christian.heimes #28958: Python should return comperhansive error when SSLContext canno https://bugs.python.org/issue28958 closed by christian.heimes #29136: Add OP_NO_TLSv1_3 https://bugs.python.org/issue29136 closed by christian.heimes #29212: Python 3.6 logging thread name regression with concurrent.futu https://bugs.python.org/issue29212 closed by gregory.p.smith #29588: importing ssl can fail with NameError: name 'PROTOCOL_TLS' is https://bugs.python.org/issue29588 closed by christian.heimes #29627: configparser.ConfigParser.read() has undocumented/unexpected b https://bugs.python.org/issue29627 closed by lukasz.langa #29781: SSLObject.version returns incorrect value before handshake. https://bugs.python.org/issue29781 closed by christian.heimes #29824: Hostname validation in SSL match_hostname() https://bugs.python.org/issue29824 closed by christian.heimes #30096: Update examples in abc documentation to use abc.ABC https://bugs.python.org/issue30096 closed by Mariatta #30102: improve performance of libSSL usage on hashing https://bugs.python.org/issue30102 closed by christian.heimes #30502: Fix buffer handling of OBJ_obj2txt https://bugs.python.org/issue30502 closed by christian.heimes #30622: Fix NPN guard for OpenSSL 1.1 https://bugs.python.org/issue30622 closed by christian.heimes #30640: NULL + 1 in _PyFunction_FastCallDict(), PyEval_EvalCodeEx() https://bugs.python.org/issue30640 closed by haypo #30662: fix OrderedDict.__init__ docstring to reflect PEP 468 https://bugs.python.org/issue30662 closed by Mariatta #30737: Update devguide link to the new URL https://bugs.python.org/issue30737 closed by Mariatta #30828: Out of bounds write in _asyncio_Future_remove_done_callback https://bugs.python.org/issue30828 closed by serhiy.storchaka #30867: Add macro `HAVE_OPENSSL_VERIFY_PARAM` to avoid invalid functio https://bugs.python.org/issue30867 closed by christian.heimes #30912: python 3 git master fails to find libffi and build _ctypes on https://bugs.python.org/issue30912 closed by zach.ware #31075: Collections - ChainMap - Documentation example wrong order lin https://bugs.python.org/issue31075 closed by rhettinger #31144: add initializer to concurrent.futures.ProcessPoolExecutor https://bugs.python.org/issue31144 closed by pitrou #31172: Py_Main() is totally broken on Visual Studio 2017 https://bugs.python.org/issue31172 closed by steve.dower #31178: [EASY] subprocess: TypeError: can't concat str to bytes, in _e https://bugs.python.org/issue31178 closed by gregory.p.smith #31185: Miscellaneous errors in asyncio speedup module https://bugs.python.org/issue31185 closed by serhiy.storchaka #31252: Operator.itemgetter documentation should include dictionary ke https://bugs.python.org/issue31252 closed by rhettinger #31270: Simplify documentation of itertools.zip_longest https://bugs.python.org/issue31270 closed by rhettinger #31281: fileinput inplace does not work with pathlib.Path https://bugs.python.org/issue31281 closed by eric.smith #31301: Python 2.7 SIGSEGV https://bugs.python.org/issue31301 closed by serhiy.storchaka #31326: concurrent.futures: ProcessPoolExecutor.shutdown(wait=True) sh https://bugs.python.org/issue31326 closed by haypo #31328: _sha3 is missing from Setup.dist https://bugs.python.org/issue31328 closed by benjamin.peterson #31330: argparse.RawTextHelpFormatter does not maintain lines separate 
https://bugs.python.org/issue31330 closed by r.david.murray #31335: str.format should support "{:r}" for any type. https://bugs.python.org/issue31335 closed by al45tair #31337: Small opportunity for NULL dereference in compile.c https://bugs.python.org/issue31337 closed by barry #31339: [2.7] time.asctime() crash with year > 9999 using musl C libra https://bugs.python.org/issue31339 closed by haypo #31340: Use VS 2017 compiler for build https://bugs.python.org/issue31340 closed by steve.dower #31341: remove IRIX support code https://bugs.python.org/issue31341 closed by benjamin.peterson #31343: Include major(), minor(), makedev() from sysmacros https://bugs.python.org/issue31343 closed by christian.heimes #31344: f_trace_opcodes frame attribute to switch to per-opcode tracin https://bugs.python.org/issue31344 closed by ncoghlan #31347: possible undefined behavior in _PyObject_FastCall_Prepend https://bugs.python.org/issue31347 closed by benjamin.peterson #31350: asyncio: Optimize get_event_loop and _get_running_loop https://bugs.python.org/issue31350 closed by Mariatta #31352: Travis CI coverage job fails on Python 3.6 https://bugs.python.org/issue31352 closed by haypo #31358: Pull zlib out to cpython-source-deps https://bugs.python.org/issue31358 closed by zach.ware #31360: argparse mutually_exclusive_group under add_argument_group fai https://bugs.python.org/issue31360 closed by martin.panter #31362: "async" and "await" are not keyword https://bugs.python.org/issue31362 closed by Conor Cal #31364: Possible problem with PR #3377 https://bugs.python.org/issue31364 closed by denis-osipov #31365: Multiplication issue with 16.1 https://bugs.python.org/issue31365 closed by martin.panter #31367: [[]]*int gives x of the same list that cannot be appended to o https://bugs.python.org/issue31367 closed by zach.ware #31370: Remove support for threads-less builds https://bugs.python.org/issue31370 closed by haypo #31373: demoting floating float values to unrepresentable types is und https://bugs.python.org/issue31373 closed by benjamin.peterson #31379: run_profile_task in Makefile should add $(RUNSHARED) https://bugs.python.org/issue31379 closed by xiang.zhang #31383: Issue with _posixsubprocess when importing subprocess32 https://bugs.python.org/issue31383 closed by gregory.p.smith #31384: marshal: remove "current_filename" optimization https://bugs.python.org/issue31384 closed by benjamin.peterson #31385: `import as` does not work when module has same same as parent https://bugs.python.org/issue31385 closed by ncoghlan #31393: Fix the use of PyUnicode_READY() https://bugs.python.org/issue31393 closed by serhiy.storchaka #1198569: string.Template not flexible enough to subclass (regexes) https://bugs.python.org/issue1198569 closed by barry From eric at trueblade.com Fri Sep 8 12:24:16 2017 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 8 Sep 2017 09:24:16 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <4BBCD739-7597-4425-90E4-FFDE6AA5378D@trueblade.com> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <4BBCD739-7597-4425-90E4-FFDE6AA5378D@trueblade.com> Message-ID: <208fd6c1-b948-39a3-c8b1-a23e64013e48@trueblade.com> On 9/8/17 9:07 AM, Eric V. Smith wrote: > Hah! Thanks for the catch. I've pushed a fix to the PEP repo, it should show up online shortly. Eric. > > There is no C in my happy place. > > -- > Eric. > >> On Sep 8, 2017, at 8:52 AM, Paul Moore wrote: >> >>> On 8 September 2017 at 15:57, Eric V. 
Smith wrote: >>> I've written a PEP for what might be thought of as "mutable namedtuples with >>> defaults, but not inheriting tuple's behavior" (a mouthful, but it sounded >>> simpler when I first thought of it). It's heavily influenced by the attrs >>> project. It uses PEP 526 type annotations to define fields. >> [...] >>> The PEP is largely complete, but could use some filling out in places. >>> Comments welcome! >>> >>> Eric. >>> >>> P.S. I wrote this PEP when I was in my happy place. >> >> Looks good! One minor point - apparently in your happy place, C and >> Python have the same syntax :-) >> >> """ >> field's may optionally specify a default value, using normal Python syntax: >> >> @dataclass >> class C: >> int a # 'a' has no default value >> int b = 0 # assign a default value for 'b' >> """ > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/eric%2Ba-python-dev%40trueblade.com > From ncoghlan at gmail.com Fri Sep 8 12:43:54 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 8 Sep 2017 09:43:54 -0700 Subject: [Python-Dev] PEP 552: deterministic pycs In-Reply-To: <20170908165500.3828c370@fsol> References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> <1504820433.3022749.1098877456.3FF76AA6@webmail.messagingengine.com> <1504824402.3041582.1098937744.55112449@webmail.messagingengine.com> <20170908120452.22b6723b@fsol> <20170908165500.3828c370@fsol> Message-ID: On 8 September 2017 at 07:55, Antoine Pitrou wrote: > On Fri, 8 Sep 2017 07:49:46 -0700 > Nick Coghlan wrote: >> > I'd rather a single magic number and a separate bitfield that tells >> > what the header encodes exactly. We don't *have* to fight for a tiny >> > size reduction of pyc files. >> >> One of Benjamin's goals was for the existing timestamp-based pyc >> format to remain completely unchanged, so we need some kind of marker >> in the magic number to indicate whether the file is using the new >> format or nor. > > I don't think that's a useful goal, as long as we bump the magic number. Yeah, we (me, Benjamin, Greg) discussed that here, and we agree - there isn't actually any benefit to keeping the timestamp based pyc's using the same layout, since the magic number is already going to change anyway. Given that, I think your suggested 16 byte header layout would be a good one: 4 byte magic number, 4 bytes reserved for format flags, 8 bytes with an interpretation that depends on the format flags. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri Sep 8 13:03:33 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 8 Sep 2017 10:03:33 -0700 Subject: [Python-Dev] PEP 556: Threaded garbage collection In-Reply-To: <20170908170523.73bdb61e@fsol> References: <20170908170523.73bdb61e@fsol> Message-ID: On 8 September 2017 at 08:05, Antoine Pitrou wrote: > > Hello, > > I've written a PEP by which you can tell the GC to run in a dedicated > thread. The goal is to solve reentrancy issues with finalizers: > https://www.python.org/dev/peps/pep-0556/ +1 from me for the general concept. (Minor naming idea: "inline" may be a clearer name for the current behaviour). One point that seems worth noting: even with cyclic GC moved out to a separate thread, __del__ methods and weakref callbacks triggered by a refcount going to zero will still be called inline in the current thread. 
Changing that would require a tweak to the semantics of Py_DECREF where if the GC was in threaded mode, instead of finalizing the object immediately, it would instead be placed on a FIFO queue where the GC thread would pick it up and then actually delete it. Cheers, Nick. > PS: I did not come up with the idea for this PEP while other people > were having nightmares. I dunno, debugging finalizer re-entrancy problems seems pretty nightmarish to me ;) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri Sep 8 13:07:10 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 8 Sep 2017 10:07:10 -0700 Subject: [Python-Dev] Lazy initialization of module global state In-Reply-To: <20170908155012.bqotiqug4uthnsjz@python.ca> References: <20170908155012.bqotiqug4uthnsjz@python.ca> Message-ID: On 8 September 2017 at 08:50, Neil Schemenauer wrote: > This should be pretty safe to do and should give a significant > benefit in startup time and memory usage. And given the extended pyc header being proposed in PEP 552, we'd be able to include flags in the header to indicate that the pyc file was compiled with this alternative behaviour. While that wouldn't be needed for import, it would be useful for anyone trying to confirm that the pyc file actually matches the corresponding source file. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Fri Sep 8 13:08:39 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 8 Sep 2017 19:08:39 +0200 Subject: [Python-Dev] PEP 556: Threaded garbage collection In-Reply-To: References: <20170908170523.73bdb61e@fsol> Message-ID: <20170908190839.2d523417@fsol> On Fri, 8 Sep 2017 10:03:33 -0700 Nick Coghlan wrote: > On 8 September 2017 at 08:05, Antoine Pitrou wrote: > > > > Hello, > > > > I've written a PEP by which you can tell the GC to run in a dedicated > > thread. The goal is to solve reentrancy issues with finalizers: > > https://www.python.org/dev/peps/pep-0556/ > > +1 from me for the general concept. (Minor naming idea: "inline" may > be a clearer name for the current behaviour). > > One point that seems worth noting: even with cyclic GC moved out to a > separate thread, __del__ methods and weakref callbacks triggered by a > refcount going to zero will still be called inline in the current > thread. You're right, that bears mentioning. It's much less of a problem, of course, since such calls happen at deterministic places. Regards Antoine. From ncoghlan at gmail.com Fri Sep 8 13:09:37 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 8 Sep 2017 10:09:37 -0700 Subject: [Python-Dev] PEP 539 v3: A new C API for Thread-Local Storage in CPython In-Reply-To: References: Message-ID: On 8 September 2017 at 08:58, Masayuki YAMAMOTO wrote: > Awesome! Thank you for accepting the PEP :) > The only thing missing is a reference implementation, I'm going to complete. > > BTW, one of TSS keys is going to move to the runtime struct by bpo-30860 > "Consolidate stateful C globals under a single struct". Although that's PR > was merged, the issue status is still open yet. Is there a chance for > rollback? No, we genuinely want to consolidate that state into a single shared location. However, the struct definition can be adjusted as needed as part of the PEP 539 implementation (and we'll get Eric Snow to be one of the PR reviewers). Cheers, Nick. 
-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Fri Sep  8 13:37:12 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 8 Sep 2017 10:37:12 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com>
Message-ID: 

On 8 September 2017 at 07:57, Eric V. Smith wrote:
> I've written a PEP for what might be thought of as "mutable namedtuples with
> defaults, but not inheriting tuple's behavior" (a mouthful, but it sounded
> simpler when I first thought of it). It's heavily influenced by the attrs
> project. It uses PEP 526 type annotations to define fields. From the
> overview section:
>
> @dataclass
> class InventoryItem:
>     name: str
>     unit_price: float
>     quantity_on_hand: int = 0
>
>     def total_cost(self) -> float:
>         return self.unit_price * self.quantity_on_hand

Very nice!

> def __eq__(self, other):
>     if other.__class__ is self.__class__:
>         return (self.name, self.unit_price, self.quantity_on_hand) ==
>                (other.name, other.unit_price, other.quantity_on_hand)
>     return NotImplemented

My one technical question about the PEP relates to the use of an exact
type check in the comparison methods, rather than "isinstance(other,
self.__class__)".

I think I agree with that decision, but it isn't immediately obvious
that the class identity is considered part of the instance value for a
data class, so if you do:

    @dataclass
    class BaseItem:
        value: Any

    class DerivedItem(BaseItem):
        pass

Then instances of DerivedItem *won't* be considered equivalent to
instances of BaseItem, and they also won't be orderable relative to
each other, even though "DerivedItem" doesn't actually add any new
data fields.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From benjamin at python.org  Fri Sep  8 13:52:03 2017
From: benjamin at python.org (Benjamin Peterson)
Date: Fri, 08 Sep 2017 10:52:03 -0700
Subject: [Python-Dev] PEP 552: deterministic pycs
In-Reply-To: 
References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com>
 <1504820433.3022749.1098877456.3FF76AA6@webmail.messagingengine.com>
 <1504824402.3041582.1098937744.55112449@webmail.messagingengine.com>
 <20170908120452.22b6723b@fsol>
 <20170908165500.3828c370@fsol>
Message-ID: <1504893123.182750.1099870936.4E48255D@webmail.messagingengine.com>

Thank you all for the feedback. I've now updated the PEP to specify a
4-word pyc header with a bit field in every case.

On Fri, Sep 8, 2017, at 09:43, Nick Coghlan wrote:
> On 8 September 2017 at 07:55, Antoine Pitrou wrote:
> > On Fri, 8 Sep 2017 07:49:46 -0700
> > Nick Coghlan wrote:
> >> > I'd rather a single magic number and a separate bitfield that tells
> >> > what the header encodes exactly. We don't *have* to fight for a tiny
> >> > size reduction of pyc files.
> >>
> >> One of Benjamin's goals was for the existing timestamp-based pyc
> >> format to remain completely unchanged, so we need some kind of marker
> >> in the magic number to indicate whether the file is using the new
> >> format or nor.
> >
> > I don't think that's a useful goal, as long as we bump the magic number.
>
> Yeah, we (me, Benjamin, Greg) discussed that here, and we agree -
> there isn't actually any benefit to keeping the timestamp based pyc's
> using the same layout, since the magic number is already going to
> change anyway.
> > Given that, I think your suggested 16 byte header layout would be a > good one: 4 byte magic number, 4 bytes reserved for format flags, 8 > bytes with an interpretation that depends on the format flags. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/benjamin%40python.org From steve at pearwood.info Fri Sep 8 14:38:06 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 9 Sep 2017 04:38:06 +1000 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> Message-ID: <20170908183806.GN13110@ando.pearwood.info> On Fri, Sep 08, 2017 at 10:37:12AM -0700, Nick Coghlan wrote: > > def __eq__(self, other): > > if other.__class__ is self.__class__: > > return (self.name, self.unit_price, self.quantity_on_hand) == > > (other.name, other.unit_price, other.quantity_on_hand) > > return NotImplemented > > My one technical question about the PEP relates to the use of an exact > type check in the comparison methods, rather than "isinstance(other, > self.__class__)". I haven't read the whole PEP in close detail, but that method stood out for me too. Only, unlike Nick, I don't think I agree with the decision. I'm also not convinced that we should be adding ordered comparisons (__lt__ __gt__ etc) by default, if these DataClasses are considered more like structs/records than tuples. The closest existing equivalent to a struct in the std lib (apart from namedtuple) is, I think, SimpleNamespace, and they are unorderable. -- Steve From ethan at stoneleaf.us Fri Sep 8 15:01:01 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 08 Sep 2017 12:01:01 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <20170908183806.GN13110@ando.pearwood.info> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <20170908183806.GN13110@ando.pearwood.info> Message-ID: <59B2E8ED.9070903@stoneleaf.us> On 09/08/2017 11:38 AM, Steven D'Aprano wrote: > On Fri, Sep 08, 2017 at 10:37:12AM -0700, Nick Coghlan wrote: > >>> def __eq__(self, other): >>> if other.__class__ is self.__class__: >>> return (self.name, self.unit_price, self.quantity_on_hand) == >>> (other.name, other.unit_price, other.quantity_on_hand) >>> return NotImplemented >> >> My one technical question about the PEP relates to the use of an exact >> type check in the comparison methods, rather than "isinstance(other, >> self.__class__)". > > I haven't read the whole PEP in close detail, but that method stood out > for me too. Only, unlike Nick, I don't think I agree with the decision. > > I'm also not convinced that we should be adding ordered comparisons > (__lt__ __gt__ etc) by default, if these DataClasses are considered > more like structs/records than tuples. The closest existing equivalent > to a struct in the std lib (apart from namedtuple) is, I think, > SimpleNamespace, and they are unorderable. I'll split the difference. ;) I agree with D'Aprano that an isinstance check should be used, but I disagree with him about the rich comparison methods -- please include them. I think unordered should only be the default when order is impossible, extremely difficult, or nonsensical. 
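To make the difference concrete, here is a rough hand-written sketch of
the two behaviors (not the PEP's generated code; Base/Sub are made-up
names, just an illustration):

    class Base:
        def __init__(self, value):
            self.value = value
        def __eq__(self, other):
            # exact-type check, as the PEP currently specifies
            if other.__class__ is self.__class__:
                return self.value == other.value
            return NotImplemented

    class Sub(Base):
        pass

    Base(1) == Base(1)  # True either way
    Base(1) == Sub(1)   # False with the exact-type check (both sides
                        # return NotImplemented, so Python falls back to
                        # identity); True if isinstance(other,
                        # self.__class__) were used instead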
-- 
~Ethan~

From benjamin at python.org  Fri Sep  8 15:04:10 2017
From: benjamin at python.org (Benjamin Peterson)
Date: Fri, 08 Sep 2017 12:04:10 -0700
Subject: [Python-Dev] PEP 556: Threaded garbage collection
In-Reply-To: <20170908170523.73bdb61e@fsol>
References: <20170908170523.73bdb61e@fsol>
Message-ID: <1504897450.203763.1099937864.1DA2E661@webmail.messagingengine.com>

I like it overall.

- I was wondering what happens during interpreter shutdown. I see you
have that listed as an open issue. How about simply shutting down the
finalization thread and not guaranteeing that finalizers are actually
ever run à la Java?

- Why not run all (Python) finalizers on the thread and not just ones
from cycles?

On Fri, Sep 8, 2017, at 08:05, Antoine Pitrou wrote:
> 
> Hello,
> 
> I've written a PEP by which you can tell the GC to run in a dedicated
> thread. The goal is to solve reentrancy issues with finalizers:
> https://www.python.org/dev/peps/pep-0556/
> 
> Regards
> 
> Antoine.
> 
> PS: I did not come up with the idea for this PEP while other people
> were having nightmares. Any nightmares involved in this PEP are
> fictional, and any resemblance to actual nightmares is purely
> coincidental. No nightmares were harmed while writing this PEP.
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/benjamin%40python.org

From solipsis at pitrou.net  Fri Sep  8 15:13:46 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 8 Sep 2017 21:13:46 +0200
Subject: [Python-Dev] PEP 556: Threaded garbage collection
References: <20170908170523.73bdb61e@fsol>
 <1504897450.203763.1099937864.1DA2E661@webmail.messagingengine.com>
Message-ID: <20170908211346.3f404a4b@fsol>

On Fri, 08 Sep 2017 12:04:10 -0700
Benjamin Peterson wrote:
> I like it overall.
> 
> - I was wondering what happens during interpreter shutdown. I see you
> have that listed as an open issue. How about simply shutting down the
> finalization thread and not guaranteeing that finalizers are actually
> ever run à la Java?

I don't know. People generally have expectations towards stuff being
finalized properly (especially when talking about files etc.).
Once the first implementation is devised, we will know more about
what's workable (perhaps we'll have to move _PyGC_Fini earlier in the
shutdown sequence? perhaps we'll want to switch back to serial mode
when shutting down?).

> - Why not run all (Python) finalizers on the thread and not just ones
> from cycles?

Because a lot of code probably expects them to be run as soon as the
last visible ref disappears.

Regards

Antoine.

From larry at hastings.org  Fri Sep  8 15:24:32 2017
From: larry at hastings.org (Larry Hastings)
Date: Fri, 8 Sep 2017 12:24:32 -0700
Subject: [Python-Dev] PEP 556: Threaded garbage collection
In-Reply-To: <1504897450.203763.1099937864.1DA2E661@webmail.messagingengine.com>
References: <20170908170523.73bdb61e@fsol>
 <1504897450.203763.1099937864.1DA2E661@webmail.messagingengine.com>
Message-ID: <0d24600d-b4c9-9c0c-d1b8-1e7fb1edf60a@hastings.org>

On 09/08/2017 12:04 PM, Benjamin Peterson wrote:
> - Why not run all (Python) finalizers on the thread and not just ones
> from cycles?

Two reasons:

1. Because some code relies on the finalizer being called on the thread
   where the last reference is dropped. This is usually the same
   thread where the object was created.
   Some irritating third-party
   libraries make demands on callers, e.g. "you can only interact with
   / destroy X objects on your 'main thread'". This is often true of
   windowing / GUI libraries. (For example, I believe this was true of
   Direct3D, at least as of D3D8; it was also often true of Win32 USER
   objects.)

2. Because there's so much work there. In the Gilectomy prototype, I
   originally called all finalizers on the "reference count manager
   commit thread", the thread that also committed increfs and decrefs.
   The thread immediately fell behind on its queue and never caught
   up. I changed the Gilectomy so objects needing finalization are
   passed back to the thread where the last decref happened, for
   finalization on that thread; this was pleasingly self-balancing.

Note that I turned off cyclic GC on the Gilectomy prototype a long time
ago and haven't revisited it since. My very, very long-term plan for GC
is to stop the world and run it from one thread. With the current
system, that means all those finalizers would be run on the thread
chosen to run the GC.


//arry/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From benjamin at python.org  Fri Sep  8 15:30:52 2017
From: benjamin at python.org (Benjamin Peterson)
Date: Fri, 08 Sep 2017 12:30:52 -0700
Subject: [Python-Dev] PEP 556: Threaded garbage collection
In-Reply-To: <0d24600d-b4c9-9c0c-d1b8-1e7fb1edf60a@hastings.org>
References: <20170908170523.73bdb61e@fsol>
 <1504897450.203763.1099937864.1DA2E661@webmail.messagingengine.com>
 <0d24600d-b4c9-9c0c-d1b8-1e7fb1edf60a@hastings.org>
Message-ID: <1504899052.210148.1099965304.44F16992@webmail.messagingengine.com>

On Fri, Sep 8, 2017, at 12:24, Larry Hastings wrote:
> 
> On 09/08/2017 12:04 PM, Benjamin Peterson wrote:
> > - Why not run all (Python) finalizers on the thread and not just ones
> > from cycles?
> 
> Two reasons:
> 
> 1. Because some code relies on the finalizer being called on the thread
>    where the last reference is dropped. This is usually the same
>    thread where the object was created. Some irritating third-party
>    libraries make demands on callers, e.g. "you can only interact with
>    / destroy X objects on your 'main thread'". This is often true of
>    windowing / GUI libraries. (For example, I believe this was true of
>    Direct3D, at least as of D3D8; it was also often true of Win32 USER
>    objects.)

Is the requirement that the construction and destruction be literally on
the same thread or merely non-concurrent? The GIL would provide the
latter.

> 2. Because there's so much work there. In the Gilectomy prototype, I
>    originally called all finalizers on the "reference count manager
>    commit thread", the thread that also committed increfs and decrefs.
>    The thread immediately fell behind on its queue and never caught
>    up. I changed the Gilectomy so objects needing finalization are
>    passed back to the thread where the last decref happened, for
>    finalization on that thread; this was pleasingly self-balancing.

I'm only suggesting Python-level __del__ methods be run on the separate
thread, not general deallocation work. I would think those would be few
and far between.
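For illustration only, a toy Python-level sketch of the shape I have in
mind (all names here are made up, and a real implementation would live
in the C deallocation path, not in pure Python):

    import queue
    import threading

    _pending_del = queue.Queue()

    def _del_worker():
        # Dedicated thread: runs __del__ methods one at a time, so they
        # never re-enter whatever code dropped the last reference.
        while True:
            obj = _pending_del.get()
            try:
                obj.__del__()
            except Exception:
                pass  # don't propagate __del__ errors (CPython would
                      # report them as unraisable instead)

    threading.Thread(target=_del_worker, daemon=True, name="finalizer").start()

    # NB: the queue holds a strong reference, briefly resurrecting the
    # object; handling that is exactly the kind of detail the C-level
    # version would have to get right.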
From benjamin at python.org Fri Sep 8 15:33:01 2017 From: benjamin at python.org (Benjamin Peterson) Date: Fri, 08 Sep 2017 12:33:01 -0700 Subject: [Python-Dev] PEP 556: Threaded garbage collection In-Reply-To: <20170908211346.3f404a4b@fsol> References: <20170908170523.73bdb61e@fsol> <1504897450.203763.1099937864.1DA2E661@webmail.messagingengine.com> <20170908211346.3f404a4b@fsol> Message-ID: <1504899181.210364.1099968424.2DECCAB6@webmail.messagingengine.com> On Fri, Sep 8, 2017, at 12:13, Antoine Pitrou wrote: > On Fri, 08 Sep 2017 12:04:10 -0700 > Benjamin Peterson wrote: > > I like it overall. > > > > - I was wondering what happens during interpreter shutdown. I see you > > have that listed as a open issue. How about simply shutting down the > > finalization thread and not guaranteeing that finalizers are actually > > ever run ? la Java? > > I don't know. People generally have expectations towards stuff being > finalized properly (especially when talking about files etc.). > Once the first implementation is devised, we will know more about > what's workable (perhaps we'll have to move _PyGC_Fini earlier in the > shutdown sequence? perhaps we'll want to switch back to serial mode > when shutting down?). Okay, I'm curious to know what ends up happening here then. > > > - Why not run all (Python) finalizers on the thread and not just ones > > from cycles? > > Because a lot of code probably expects them to be run as soon as the > last visible ref disappears. But this assumption is broken on PyPy and sometimes already by CPython, so I don't feel very bad moving away from it. From njs at pobox.com Fri Sep 8 15:40:34 2017 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 8 Sep 2017 12:40:34 -0700 Subject: [Python-Dev] PEP 556: Threaded garbage collection In-Reply-To: <20170908211346.3f404a4b@fsol> References: <20170908170523.73bdb61e@fsol> <1504897450.203763.1099937864.1DA2E661@webmail.messagingengine.com> <20170908211346.3f404a4b@fsol> Message-ID: On Fri, Sep 8, 2017 at 12:13 PM, Antoine Pitrou wrote: > On Fri, 08 Sep 2017 12:04:10 -0700 > Benjamin Peterson wrote: >> I like it overall. >> >> - I was wondering what happens during interpreter shutdown. I see you >> have that listed as a open issue. How about simply shutting down the >> finalization thread and not guaranteeing that finalizers are actually >> ever run ? la Java? > > I don't know. People generally have expectations towards stuff being > finalized properly (especially when talking about files etc.). > Once the first implementation is devised, we will know more about > what's workable (perhaps we'll have to move _PyGC_Fini earlier in the > shutdown sequence? perhaps we'll want to switch back to serial mode > when shutting down?). PyPy just abandons everything when shutting down, instead of running finalizers. See the last paragraph of : http://doc.pypy.org/en/latest/cpython_differences.html#differences-related-to-garbage-collection-strategies So that might be a useful source of experience. On another note, I'm going to be that annoying person who suggests massively extending the scope of your proposal. Feel free to throw things at me or whatever. Would it make sense to also move signal handlers to run in this thread? Those are the other major source of nasty re-entrancy problems. -n -- Nathaniel J. 
Smith -- https://vorpus.org From larry at hastings.org Fri Sep 8 15:44:26 2017 From: larry at hastings.org (Larry Hastings) Date: Fri, 8 Sep 2017 12:44:26 -0700 Subject: [Python-Dev] PEP 549 v2: now titled Instance Descriptors Message-ID: <30a24dc0-edcb-61f3-d89e-6fc5bab3c8b6@hastings.org> I've updated PEP 549 with a new title--"Instance Descriptors" is a better name than "Instance Properties"--and to clarify my rationale for the PEP. I've also updated the prototype with code cleanups and a new type: "collections.abc.InstanceDescriptor", a base class that allows user classes to be instance descriptors. //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From larry at hastings.org Fri Sep 8 15:48:20 2017 From: larry at hastings.org (Larry Hastings) Date: Fri, 8 Sep 2017 12:48:20 -0700 Subject: [Python-Dev] PEP 556: Threaded garbage collection In-Reply-To: <1504899052.210148.1099965304.44F16992@webmail.messagingengine.com> References: <20170908170523.73bdb61e@fsol> <1504897450.203763.1099937864.1DA2E661@webmail.messagingengine.com> <0d24600d-b4c9-9c0c-d1b8-1e7fb1edf60a@hastings.org> <1504899052.210148.1099965304.44F16992@webmail.messagingengine.com> Message-ID: On 09/08/2017 12:30 PM, Benjamin Peterson wrote: > > On Fri, Sep 8, 2017, at 12:24, Larry Hastings wrote: >> >> On 09/08/2017 12:04 PM, Benjamin Peterson wrote: >>> - Why not run all (Python) finalizers on the thread and not just ones >>> from cycles? >> Two reasons: >> >> 1. Because some code relies on the finalizer being called on the thread >> where the last reference is dropped. This is usually the same >> thread where the object was created. Some irritating third-party >> libraries make demands on callers, e.g. "you can only interact with >> / destroy X objects on your 'main thread'". This is often true of >> windowing / GUI libraries. (For example, I believe this was true of >> Direct3D, at least as of D3D8; it was also often true of Win32 USER >> objects.) > Is the requirement that the construction and destruction be literally on > the same thread or merely non-concurrent? The GIL would provide the > latter. Literally the same thread. My theory was that these clowntown external libraries are hiding important details in thread local storage, but I don't actually know. //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Fri Sep 8 15:49:08 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 8 Sep 2017 21:49:08 +0200 Subject: [Python-Dev] PEP 556: Threaded garbage collection In-Reply-To: References: <20170908170523.73bdb61e@fsol> <1504897450.203763.1099937864.1DA2E661@webmail.messagingengine.com> <20170908211346.3f404a4b@fsol> Message-ID: <20170908214908.2da9f53d@fsol> On Fri, 8 Sep 2017 12:40:34 -0700 Nathaniel Smith wrote: > > PyPy just abandons everything when shutting down, instead of running > finalizers. See the last paragraph of : > http://doc.pypy.org/en/latest/cpython_differences.html#differences-related-to-garbage-collection-strategies > > So that might be a useful source of experience. CPython can be embedded in applications, though, and that is why we try to be a bit more thorough during the interpreter cleanup phase. > Would it make sense to also move signal handlers to run in this > thread? Those are the other major source of nasty re-entrancy > problems. 
See the "Non-goals" section in the PEP, they are already mentioned there :-) Note I don't think signal handlers are a major source of reentrancy problems, rather minor, since usually you don't try to do much in a signal handler. Signal handling is mostly a relic of 70s Unix design and it has less and less relevance in today's world, apart from the trivial task of telling a process to shut down. Regards Antoine. From greg at krypto.org Fri Sep 8 15:57:10 2017 From: greg at krypto.org (Gregory P. Smith) Date: Fri, 08 Sep 2017 19:57:10 +0000 Subject: [Python-Dev] PEP 556: Threaded garbage collection In-Reply-To: <20170908214908.2da9f53d@fsol> References: <20170908170523.73bdb61e@fsol> <1504897450.203763.1099937864.1DA2E661@webmail.messagingengine.com> <20170908211346.3f404a4b@fsol> <20170908214908.2da9f53d@fsol> Message-ID: On Fri, Sep 8, 2017 at 12:52 PM Antoine Pitrou wrote: > On Fri, 8 Sep 2017 12:40:34 -0700 > Nathaniel Smith wrote: > > > > PyPy just abandons everything when shutting down, instead of running > > finalizers. See the last paragraph of : > > > http://doc.pypy.org/en/latest/cpython_differences.html#differences-related-to-garbage-collection-strategies > > > > So that might be a useful source of experience. > > CPython can be embedded in applications, though, and that is why we try > to be a bit more thorough during the interpreter cleanup phase. > Indeed. My gut feeling is that proposing to not run finalizers on interpreter shutdown is a non-starter and would get the pep rejected. We've previously guaranteed that they were run unless the process dies via an unhandled signal or calls os._exit() in CPython. -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Fri Sep 8 16:00:35 2017 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 8 Sep 2017 15:00:35 -0500 Subject: [Python-Dev] Memory bitmaps for the Python cyclic garbage collector In-Reply-To: <20170907173012.vmvhx5lt2ajghqfy@python.ca> References: <20170907173012.vmvhx5lt2ajghqfy@python.ca> Message-ID: [Neil Schemenauer ] > Python objects that participate in cyclic GC (things like lists, dicts, > sets but not strings, ints and floats) have extra memory overhead. I > think it is possible to mostly eliminate this overhead. Also, while > the GC is running, this GC state is mutated, which destroys > copy-on-write optimizations. This change would mostly fix that > issue. > > All objects that participate in cyclic GC have the Py_TPFLAGS_HAVE_GC > bit set in their type. That causes an extra chunk of memory to be > allocated *before* the ob_refcnt struct member. This is the PyGC_Head > struct. > > The whole object looks like this in memory (PyObject pointer is at > arrow): > > union __gc_head *gc_next; > union __gc_head *gc_prev; > Py_ssize_t gc_refs; > --> > Py_ssize_t ob_refcnt > struct _typeobject *ob_type; > [rest of PyObject members] > > > So, 24 bytes of overhead on a 64-bit machine. The smallest Python > object that can have a pointer to another object (e.g. a single PyObject > * member) is 48 bytes. Removing PyGC_Head would cut the size of these > objects in half. > > Carl Shaprio questioned me today on why we use a double linked-list and > not the memory bitmap. I think the answer is that there is no good > reason. We use a double linked list only due to historical constraints > that are no longer present. 
Since you wrote this code to begin with, it will come back to you ;-)
that the real purpose of the doubly-linked lists is to _partition_ (not
just find) the tracked objects. Between collections, they're
partitioned by generation, and within a collection equivalence classes
are first merged (up through the oldest generation to be scanned in
this run), and then temporarily partitioned internally in various ways
(based on things like whether objects turn out to be reachable from
outside, and whether they have finalizers). The linked list
representation makes all the required operations cheap: iteration,
merging classes, moving an object from one class to another, removing
an object entirely _while_ iterating over its equivalence class.

Don't know whether all that can be done efficiently with a bitmap
representation instead.

> Long ago, Python objects could be allocated using the system malloc or
> other memory allocators. Since we could not control the memory
> location, bitmaps would be inefficient. Today, we allocate all Python
> objects via our own function. Python objects under a certain size are
> allocated using our own malloc, obmalloc, and are stored in memory
> blocks known as "arenas".
>
> The PyGC_Head struct performs three functions. First, it allows the GC
> to find all Python objects that will be checked for cycles (i.e. follow
> the linked list).

As above, the set of tracked objects is partitioned into more than one
linked list.

> Second, it stores a single bit of information to let
> the GC know if it is safe to traverse the object, set with
> PyObject_GC_Track().

?  An object is "tracked" if and only if it appears in _some_
doubly-linked list. There is no bit set (or cleared) for this.
Untracking an object removes it entirely from whichever linked list
it's in (leaving it in no linked lists), and tracking an object
consists of adding it to the "generation 0" linked list. Unless the
code has changed a whole lot recently.

For clarity, the top N-1 bits of gc_refs (which you cover next) are
also set to a special _PyGC_REFS_UNTRACKED constant when an object is
untracked:

    """
    /* True if the object is currently tracked by the GC. */
    #define _PyObject_GC_IS_TRACKED(o) \
        (_PyGC_REFS(o) != _PyGC_REFS_UNTRACKED)
    """

But I believe it could just as well check to see whether the object's
gc_next is NULL.

> Finally, it has a scratch area to compute the
> effective reference count while tracing refs (gc_refs).

As above, the top N-1 bits of that are also used between collections to
record whether an object is tracked. The least significant bit of
gc_refs now (not back when you or I were mucking with this code)
records whether the object has a finalizer that has already been run,
and that state needs to be preserved across gc runs. So that's another
bit that would need to be stored somewhere else.

> Here is a sketch of how we can remove the PyGC_Head struct for small
> objects (say less than 512 bytes). Large objects or objects created by
> a different memory allocator will still have the PyGC_Head overhead.
>
> * Have memory arenas that contain only objects with the
>   Py_TPFLAGS_HAVE_GC flag. Objects like ints, strings, etc will be
>   in different arenas, not have bitmaps, not be looked at by the
>   cyclic GC.
>
> * For those arenas, add a memory bitmap. The bitmap is a bit array that
>   has a bit for each fixed size object in the arena. The memory used by
>   the bitmap is a fraction of what is needed by PyGC_Head.
>   E.g. an arena that holds up to 1024 objects of 48 bytes in size
>   would have a bitmap of 1024 bits.

If it's based on obmalloc code, an arena (256KB) can hold objects of
all kinds of different sizes simultaneously - the sizes are the same
only within a relatively small (4KB) "pool". I suspect this gets
complicated.

> * The bits will be set and cleared by PyObject_GC_Track/Untrack()
>
> * We also need an array of Py_ssize_t to take over the job of gc_refs.
>   That could be allocated only when GC is working and it only needs to
>   be the size of the number of true bits in the bitmap. Or, it could be
>   allocated when the arena is allocated and be sized for the full arena.
>
> * Objects that are too large would still get the PyGC_Head struct
>   allocated "in front" of the PyObject. Because they are big, the
>   overhead is not so bad.
>
> * The GC process would work nearly the same as it does now. Rather than
>   only traversing the linked list, we would also have to crawl over the
>   GC object arenas, check blocks of memory that have the tracked bit
>   set.

Well, as checks are performed now, objects are also moved around among
various lists. Without the lists, some other machinery would be needed
to represent the fast-changing partitioning.

> There are a lot of smaller details to work out but I see no reason
> why the idea should not work. It should significantly reduce memory
> usage. Also, because the bitmap and gc_refs are contiguous in
> memory, locality will be improved. Łukasz Langa has mentioned that
> the current GC causes issues with copy-on-write memory in big
> applications. This change should solve that issue.

Better yet, convince Unixy systems to get rid of the abominable
`fork()` ;-) There's a related problem when fork() is used
intentionally to share read-only data: memory use balloons as child
processes access (but don't mutate!) the shared data. In that case,
it's because Python _does_ mutate the objects' refcount members under
the covers, and so the OS ends up making fresh copies of the memory
anyway. Heh.

> To implement, I think the easiest path is to create a new malloc to be
> used by small GC objects, e.g. gcmalloc.c. It would be similar to
> obmalloc but have the features needed to keep track of the bitmap.
> obmalloc has some quirks that make it hard to use for this purpose.
> Once the idea is proven, gcmalloc could be merged or made to be a
> variation of obmalloc. Or, maybe just optimized and remain
> separate. obmalloc is complicated and highly optimized. So, adding
> additional functionality to it will be challenging.

The hardest thing obmalloc needs to do isn't allocating memory - while
very detailed, that's straightforward. The hard part is for a free()
or realloc() function to figure out efficiently where the heck its
argument came from. All it's given is a raw memory address. Especially
since:

> I believe this change would be ABI compatible.

that can't change. free()/realloc() functions need to figure out
_which_ lowest-level allocation system "owns" the address passed to
it. That's excruciating now with only two possible answers ("it's
either the system malloc or Python's obmalloc"). Not much code is
required - "excruciating" means the logic is exceedingly delicate, and
enrages memory debuggers because it may branch based on reading
uninitialized memory. It's probably not going to get less painful if a
third possible answer appears ;-)

All that said, ya, it would be wonderful to cut the gc space overhead
for small objects!
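For concreteness, a quick back-of-the-envelope on the numbers quoted
above (my arithmetic, using the 24-byte header and the hypothetical
1024-object arena from Neil's sketch):

    # Rough per-arena accounting for 1024 tracked 48-byte objects
    # (64-bit build).
    N = 1024
    head_bytes = N * 24     # PyGC_Head headers today: 24576 bytes
    bitmap_bytes = N // 8   # one "tracked" bit per object: 128 bytes
    gc_refs_bytes = N * 8   # scratch Py_ssize_t array: 8192 bytes,
                            # and only needed while a collection runs

    print(head_bytes, bitmap_bytes, gc_refs_bytes)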
From ethan at stoneleaf.us  Fri Sep  8 16:45:13 2017
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 08 Sep 2017 13:45:13 -0700
Subject: [Python-Dev] PEP 549 v2: now titled Instance Descriptors
In-Reply-To: <30a24dc0-edcb-61f3-d89e-6fc5bab3c8b6@hastings.org>
References: <30a24dc0-edcb-61f3-d89e-6fc5bab3c8b6@hastings.org>
Message-ID: <59B30159.6000800@stoneleaf.us>

On 09/08/2017 12:44 PM, Larry Hastings wrote:

> I've updated PEP 549 with a new title--"Instance Descriptors" is a better name than "Instance Properties"--and to
> clarify my rationale for the PEP.  I've also updated the prototype with code cleanups and a new type:
> "collections.abc.InstanceDescriptor", a base class that allows user classes to be instance descriptors.

I like the new title, I'm +0 on the PEP itself, and I have one
correction for the PEP: we've had the ability to simulate module
properties for ages:

    Python 2.7.6 (default, Oct 26 2016, 20:32:47)
    [GCC 4.8.4] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    --> import module_class
    --> module_class.hello
    'hello'
    --> module_class.hello = 'hola'
    --> module_class.hello
    'hola'

And the code:

    class ModuleClass(object):
        @property
        def hello(self):
            try:
                return self._greeting
            except AttributeError:
                return 'hello'
        @hello.setter
        def hello(self, value):
            self._greeting = value

    import sys
    sys.modules[__name__] = ModuleClass()

I will admit I don't see what reassigning the __class__ attribute on a
module did for us.

--
~Ethan~

From larry at hastings.org  Fri Sep  8 16:51:54 2017
From: larry at hastings.org (Larry Hastings)
Date: Fri, 8 Sep 2017 13:51:54 -0700
Subject: [Python-Dev] PEP 549 v2: now titled Instance Descriptors
In-Reply-To: <59B30159.6000800@stoneleaf.us>
References: <30a24dc0-edcb-61f3-d89e-6fc5bab3c8b6@hastings.org>
 <59B30159.6000800@stoneleaf.us>
Message-ID: <2b08a986-c093-ddaa-40a5-4f0d123c71c5@hastings.org>

Assigning the module's __class__ means you can otherwise initialize
your module instance normally--all your class needs to do is declare
your properties or magic methods. If you overwrite the sys.modules
entry for your module with a new instance of a custom subclass, all
your module initialization needs to be done within--or propagated
to--the new class or instance.

//arry/

On 09/08/2017 01:45 PM, Ethan Furman wrote:
> On 09/08/2017 12:44 PM, Larry Hastings wrote:
>
>> I've updated PEP 549 with a new title--"Instance Descriptors" is a
>> better name than "Instance Properties"--and to
>> clarify my rationale for the PEP. I've also updated the prototype
>> with code cleanups and a new type:
>> "collections.abc.InstanceDescriptor", a base class that allows user
>> classes to be instance descriptors.
>
> I like the new title, I'm +0 on the PEP itself, and I have one
> correction for the PEP: we've had the ability to simulate module
> properties for ages:
>
>     Python 2.7.6 (default, Oct 26 2016, 20:32:47)
>     [GCC 4.8.4] on linux2
>     Type "help", "copyright", "credits" or "license" for more
>     information.
>     --> import module_class
>     --> module_class.hello
>     'hello'
>     --> module_class.hello = 'hola'
>     --> module_class.hello
>     'hola'
>
> And the code:
>
>     class ModuleClass(object):
>         @property
>         def hello(self):
>             try:
>                 return self._greeting
>             except AttributeError:
>                 return 'hello'
>         @hello.setter
>         def hello(self, value):
>             self._greeting = value
>
>     import sys
>     sys.modules[__name__] = ModuleClass()
>
> I will admit I don't see what reassigning the __class__ attribute on a
> module did for us.
> > -- > ~Ethan~ > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/larry%40hastings.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Sep 8 16:54:22 2017 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 8 Sep 2017 13:54:22 -0700 Subject: [Python-Dev] PEP 549 v2: now titled Instance Descriptors In-Reply-To: <59B30159.6000800@stoneleaf.us> References: <30a24dc0-edcb-61f3-d89e-6fc5bab3c8b6@hastings.org> <59B30159.6000800@stoneleaf.us> Message-ID: On Fri, Sep 8, 2017 at 1:45 PM, Ethan Furman wrote: > On 09/08/2017 12:44 PM, Larry Hastings wrote: > >> I've updated PEP 549 with a new title--"Instance Descriptors" is a better >> name than "Instance Properties"--and to >> clarify my rationale for the PEP. I've also updated the prototype with >> code cleanups and a new type: >> "collections.abc.InstanceDescriptor", a base class that allows user >> classes to be instance descriptors. > > > I like the new title, I'm +0 on the PEP itself, and I have one correction > for the PEP: we've had the ability to simulate module properties for ages: > > Python 2.7.6 (default, Oct 26 2016, 20:32:47) > [GCC 4.8.4] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > --> import module_class > --> module_class.hello > 'hello' > --> module_class.hello = 'hola' > --> module_class.hello > 'hola' > > And the code: > > class ModuleClass(object): > @property > def hello(self): > try: > return self._greeting > except AttributeError: > return 'hello' > @hello.setter > def hello(self, value): > self._greeting = value > > import sys > sys.modules[__name__] = ModuleClass() > > I will admit I don't see what reassigning the __class__ attribute on a > module did for us. If you have an existing package that doesn't replace itself in sys.modules, then it's difficult and risky to switch to that form -- don't think of toy examples, think of django/__init__.py or numpy/__init__.py. You have to rewrite the whole export logic, and you have to figure out what to do with things like submodules that import from the parent module before the swaparoo happens, you can get skew issues between the original module namespace and the replacement class namespace, etc. The advantage of the __class__ assignment trick (as compared to what we had before) is that it lets you easily and safely retrofit this kind of magic onto existing packages. -n -- Nathaniel J. Smith -- https://vorpus.org From ethan at stoneleaf.us Fri Sep 8 17:26:47 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 08 Sep 2017 14:26:47 -0700 Subject: [Python-Dev] PEP 549 v2: now titled Instance Descriptors In-Reply-To: <59B30159.6000800@stoneleaf.us> References: <30a24dc0-edcb-61f3-d89e-6fc5bab3c8b6@hastings.org> <59B30159.6000800@stoneleaf.us> Message-ID: <59B30B17.80504@stoneleaf.us> On 09/08/2017 01:45 PM, Ethan Furman wrote: > I will admit I don't see what reassigning the __class__ attribute on a module did for us. Nathaniel Smith wrote: --------------------- > If you have an existing package that doesn't replace itself in > sys.modules, then it's difficult and risky to switch to that form -- > don't think of toy examples, think of django/__init__.py or > numpy/__init__.py. 
You have to rewrite the whole export logic, and you
> have to figure out what to do with things like submodules that import
> from the parent module before the swaparoo happens, you can get skew
> issues between the original module namespace and the replacement class
> namespace, etc. The advantage of the __class__ assignment trick (as
> compared to what we had before) is that it lets you easily and safely
> retrofit this kind of magic onto existing packages.

Ah, that makes sense.

Larry Hastings wrote:
---------------------
> Assigning the module's __class__ means you can otherwise initialize your module instance
> normally--all your class needs to do is declare your properties or magic methods.  If you
> overwrite the sys.modules entry for your module with a new instance of a custom subclass,
> all your module initialization needs to be done within--or propagated to--the new class
> or instance.

So with the module level __class__ we have to use two steps to get what
we want, and PEP 549 takes it back to one step. Hmm. In theory it
sounds cool; on the other hand, the existing __class__ assignment
method means all the magic is in one place in the module, not scattered
around -- but that's a style issue.

Thanks for the info!

--
~Ethan~

From christian at python.org  Fri Sep  8 17:45:36 2017
From: christian at python.org (Christian Heimes)
Date: Fri, 8 Sep 2017 14:45:36 -0700
Subject: [Python-Dev] make multissltests
Message-ID:

For your information, you can now automatically compile and check the
ssl module with multiple versions of OpenSSL and LibreSSL. The
multissltest script downloads the tar.gz archives, compiles the source
and installs headers + shared lib into a local directory. The first run
takes rather a long time because OpenSSL does not support parallel
builds. Be prepared to wait an hour. After the script has downloaded
and installed OpenSSL/LibreSSL, it recompiles the ssl and hashlib
modules with every version and runs the SSL-related test suite.

make multissltests     compiles and runs all tests
make multisslcompile   only compiles

See

    ./python Tools/ssl/multissltests.py --help

for additional options.

For now the script covers:

* OpenSSL 0.9.8zc
* OpenSSL 0.9.8zh
* OpenSSL 1.0.1u
* OpenSSL 1.0.2
* OpenSSL 1.0.2l
* OpenSSL 1.1.0f
* LibreSSL 2.3.10
* LibreSSL 2.4.5
* LibreSSL 2.5.3
* LibreSSL 2.5.5

Christian

From guido at python.org  Fri Sep  8 18:08:20 2017
From: guido at python.org (Guido van Rossum)
Date: Fri, 8 Sep 2017 15:08:20 -0700
Subject: [Python-Dev] Pygments in PEPs?
In-Reply-To:
References: <20170908165330.61e1a8bf@fsol>
Message-ID:

For some unknown value of "soon". :-(

On Fri, Sep 8, 2017 at 7:59 AM, Ivan Levkivskyi wrote:

> I already made a PR that would highlight code in PEPs half a year ago,
> but it was rejected with the reason that they will be moved to RtD soon.
>
> --
> Ivan
>
> On 8 Sep 2017 at 16:56, "Antoine Pitrou" wrote:
>>
>> Hi,
>>
>> Is it possible to install pygments on the machine which builds and
>> renders PEPs? It would allow syntax highlighting in PEPs, making
>> example code more readable.
>>
>> Regards
>>
>> Antoine.
>>
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe: https://mail.python.org/mailman/options/python-dev/levkivskyi%40gmail.com
>>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From solipsis at pitrou.net  Fri Sep  8 18:14:33 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 9 Sep 2017 00:14:33 +0200
Subject: [Python-Dev] Pygments in PEPs?
In-Reply-To:
References: <20170908165330.61e1a8bf@fsol>
Message-ID: <20170909001433.258d2d04@fsol>

On Fri, 8 Sep 2017 16:59:57 +0200
Ivan Levkivskyi wrote:

> I already made a PR that would highlight code in PEPs half a year ago,
> but it was rejected with the reason that they will be moved to RtD soon.

Perhaps we can revive that PR? Browsing through old peps PRs, I can see
that other people were already bitten by the lack of pygments.

(more generally, it would be nice to build the PEPs with Sphinx so as
to get the whole gamut of useful markup that comes with it, but I guess
that's much more involved than a simple "pip install pygments" inside
the build environment)

Regards

Antoine.

From mariatta.wijaya at gmail.com  Fri Sep  8 18:20:10 2017
From: mariatta.wijaya at gmail.com (Mariatta Wijaya)
Date: Fri, 8 Sep 2017 15:20:10 -0700
Subject: [Python-Dev] Pygments in PEPs?
In-Reply-To:
References: <20170908165330.61e1a8bf@fsol>
Message-ID:

> For some unknown value of "soon". :-(

Well, as soon as these tasks are done:
https://github.com/python/peps/projects/1

Mariatta Wijaya
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From python-dev at mgmiller.net  Fri Sep  8 18:20:46 2017
From: python-dev at mgmiller.net (Mike Miller)
Date: Fri, 8 Sep 2017 15:20:46 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com>
Message-ID:

On 2017-09-08 07:57, Eric V. Smith wrote:
> I've written a PEP for ...

Apologies for the following list of dumb questions and bikesheds:

- 'Classes can be thought of as "mutable namedtuples with defaults".'

  - A C/C++ (struct)ure sounds like a simpler description that many more
    would understand.

- dataclass name:

  - class, redundant
  - data, good but very common
  - struct, used?
  - Record? (best I could come up with)

- Source needs blanks between functions, hard to read.

- Are types required? Maybe an example or two with Any?

- Intro discounts inheritance and metaclasses as "potentially interfering",
  but unclear why that would be the case.
  Inheritance is easy to override, metaclasses not sure?

- Perhaps mention ORMs/metaclass approach, as prior art:

  https://docs.djangoproject.com/en/dev/topics/db/models/

- Perhaps mention Kivy Properties, as prior art:

  https://kivy.org/docs/api-kivy.properties.html

- For mutable default values:

      @dataclass
      class C:
          x: list  # = field(default_factory=list)

  Could it detect list as a mutable class "type" and set it as a factory
  automatically?

  The PEP/bug #3 mentions using copy, but that's not exactly what I'm
  asking above.
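For reference, the explicit spelling of the mutable-default case under
the draft API would be something like the following (a small sketch;
the module and names follow the PEP draft and are still provisional):

    from dataclasses import dataclass, field

    @dataclass
    class C:
        x: list = field(default_factory=list)   # a fresh list per instance

    a, b = C(), C()
    a.x.append(1)
    assert a.x == [1] and b.x == []   # no shared mutable default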
From levkivskyi at gmail.com  Fri Sep  8 18:30:17 2017
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Sat, 9 Sep 2017 00:30:17 +0200
Subject: [Python-Dev] Pygments in PEPs?
In-Reply-To: <20170909001433.258d2d04@fsol>
References: <20170908165330.61e1a8bf@fsol> <20170909001433.258d2d04@fsol>
Message-ID:

On 9 September 2017 at 00:14, Antoine Pitrou wrote:

> On Fri, 8 Sep 2017 16:59:57 +0200
> Ivan Levkivskyi wrote:
>
> > I already made a PR that would highlight code in PEPs half a year
> > ago, but it was rejected with the reason that they will be moved to
> > RtD soon.
>
> Perhaps we can revive that PR?

Just for the reference, here is the PR:
https://github.com/python/pythondotorg/pull/1063

It is indeed quite simple. Unfortunately, right now (and for the next
one to two weeks) I will not be able to work on this PR if any
additional changes are needed. But I will be happy if someone continues
with this (temporary) solution.

--
Ivan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From eric at trueblade.com  Fri Sep  8 18:40:27 2017
From: eric at trueblade.com (Eric V. Smith)
Date: Fri, 8 Sep 2017 15:40:27 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To:
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com>
Message-ID: <0da1ba37-ff98-5394-ba55-f59855048ff3@trueblade.com>

On 9/8/17 3:20 PM, Mike Miller wrote:
>
> On 2017-09-08 07:57, Eric V. Smith wrote:
>> I've written a PEP for ...
>
> Apologies for the following list of dumb questions and bikesheds:
>
> - 'Classes can be thought of as "mutable namedtuples with defaults".'
>
>   - A C/C++ (struct)ure sounds like a simpler description that many more
>     would understand.

Yes, other people have pointed out that this might not be the best
"elevator pitch" example. I'm thinking about it.

> - dataclass name:
>
>   - class, redundant
>   - data, good but very common
>   - struct, used?
>   - Record? (best I could come up with)

There were a bunch of discussions on this. We're delaying the name
bikeshedding for later (and maybe never).

> - Source needs blanks between functions, hard to read.

It's supposed to be hard to read! You're just supposed to think "am I
glad I don't have to read or write that". But I'll look at it.

> - Are types required?

Annotations are required, the typing module is not.

> Maybe an example or two with Any?

I'd rather leave it like it is: typing is referenced only once, for
ClassVar.

> - Intro discounts inheritance and metaclasses as "potentially interfering",
>   but unclear why that would be the case.
>   Inheritance is easy to override, metaclasses not sure?

I don't really want to get into the history of why people don't like
inheritance, single and multi. Or how metaclass magic can make life
difficult. I just want to point out that Data Classes don't interfere
at all.

> - Perhaps mention ORMs/metaclass approach, as prior art:
>
>   https://docs.djangoproject.com/en/dev/topics/db/models/
>
> - Perhaps mention Kivy Properties, as prior art:
>
>   https://kivy.org/docs/api-kivy.properties.html

Those are all good. Thanks.

> - For mutable default values:
>
>       @dataclass
>       class C:
>           x: list  # = field(default_factory=list)
>
>   Could it detect list as a mutable class "type" and set it as a factory
>   automatically?
>
>   The PEP/bug #3 mentions using copy, but that's not exactly what I'm
>   asking above.

The problem is: how do you know what's a mutable type? There's no
general way to know. The behavior in the PEP is just meant to stop the
worst of it.
I guess we could have an option that says: call the type to create a
new, empty instance.

    @dataclass
    class C:
        x: list = field(default_type_is_factory=True)

Thanks for the critical reading and your comments. I'm going to push a
new version early next week, when I get back from traveling.

Eric.

From guido at python.org  Fri Sep  8 19:00:36 2017
From: guido at python.org (Guido van Rossum)
Date: Fri, 8 Sep 2017 16:00:36 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <0da1ba37-ff98-5394-ba55-f59855048ff3@trueblade.com>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com>
 <0da1ba37-ff98-5394-ba55-f59855048ff3@trueblade.com>
Message-ID:

I think it would be useful to write 1-2 sentences about the problem with
inheritance -- in that case you pretty much have to use a metaclass, and
the use of a metaclass makes life harder for people who want to use their
own metaclass (since metaclasses don't combine without some manual
intervention).

On Fri, Sep 8, 2017 at 3:40 PM, Eric V. Smith wrote:

> On 9/8/17 3:20 PM, Mike Miller wrote:
>
>> On 2017-09-08 07:57, Eric V. Smith wrote:
>>
>>> I've written a PEP for ...
>>
>> Apologies for the following list of dumb questions and bikesheds:
>>
>> - 'Classes can be thought of as "mutable namedtuples with defaults".'
>>
>>   - A C/C++ (struct)ure sounds like a simpler description that many more
>>     would understand.
>
> Yes, other people have pointed out that this might not be the best
> "elevator pitch" example. I'm thinking about it.
>
>> - dataclass name:
>>
>>   - class, redundant
>>   - data, good but very common
>>   - struct, used?
>>   - Record? (best I could come up with)
>
> There were a bunch of discussions on this. We're delaying the name
> bikeshedding for later (and maybe never).
>
>> - Source needs blanks between functions, hard to read.
>
> It's supposed to be hard to read! You're just supposed to think "am I glad
> I don't have to read or write that". But I'll look at it.
>
>> - Are types required?
>
> Annotations are required, the typing module is not.
>
>> Maybe an example or two with Any?
>
> I'd rather leave it like it is: typing is referenced only once, for
> ClassVar.
>
>> - Intro discounts inheritance and metaclasses as "potentially interfering",
>>   but unclear why that would be the case.
>>   Inheritance is easy to override, metaclasses not sure?
>
> I don't really want to get into the history of why people don't like
> inheritance, single and multi. Or how metaclass magic can make life
> difficult. I just want to point out that Data Classes don't interfere at
> all.
>
>> - Perhaps mention ORMs/metaclass approach, as prior art:
>>
>>   https://docs.djangoproject.com/en/dev/topics/db/models/
>>
>> - Perhaps mention Kivy Properties, as prior art:
>>
>>   https://kivy.org/docs/api-kivy.properties.html
>
> Those are all good. Thanks.
>
>> - For mutable default values:
>>
>>       @dataclass
>>       class C:
>>           x: list  # = field(default_factory=list)
>>
>>   Could it detect list as a mutable class "type" and set it as a factory
>>   automatically?
>>
>>   The PEP/bug #3 mentions using copy, but that's not exactly what I'm
>>   asking above.
>
> The problem is: how do you know what's a mutable type? There's no general
> way to know. The behavior in the PEP is just meant to stop the worst of it.
>
> I guess we could have an option that says: call the type to create a new,
> empty instance.
>
>     @dataclass
>     class C:
>         x: list = field(default_type_is_factory=True)
>
> Thanks for the critical reading and your comments. I'm going to push a new
> version early next week, when I get back from traveling.
>
> Eric.
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ericsnowcurrently at gmail.com  Fri Sep  8 19:04:27 2017
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Fri, 8 Sep 2017 16:04:27 -0700
Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module)
Message-ID:

I've updated the PEP in response to some excellent feedback. Thanks to
all who helped. Most notably, I've added a basic object passing
mechanism. I've included the PEP below. Please let me know what you
think. Thanks!

-eric

****************************************************

PEP: 554
Title: Multiple Interpreters in the Stdlib
Author: Eric Snow
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2017-09-05
Python-Version: 3.7
Post-History:

Abstract
========

This proposal introduces the stdlib ``interpreters`` module. It
exposes the basic functionality of subinterpreters that already exists
in the C-API. Each subinterpreter runs with its own state (see
``Interpreter Isolation`` below). The module will be "provisional", as
described by PEP 411.

Rationale
=========

Running code in multiple interpreters provides a useful level of
isolation within the same process. This can be leveraged in a number
of ways. Furthermore, subinterpreters provide a well-defined framework
in which such isolation may be extended.

CPython has supported subinterpreters, with increasing levels of
support, since version 1.5. While the feature has the potential to be
a powerful tool, subinterpreters have suffered from neglect because
they are not available directly from Python. Exposing the existing
functionality in the stdlib will help reverse the situation.

This proposal is focused on enabling the fundamental capability of
multiple isolated interpreters in the same Python process. This is a
new area for Python so there is relative uncertainty about the best
tools to provide as companions to subinterpreters. Thus we minimize
the functionality we add in the proposal as much as possible.

Concerns
--------

* "subinterpreters are not worth the trouble"

Some have argued that subinterpreters do not add sufficient benefit
to justify making them an official part of Python. Adding features
to the language (or stdlib) has a cost in increasing the size of the
language. So it must pay for itself. In this case, subinterpreters
provide a novel concurrency model focused on isolated threads of
execution. Furthermore, they present an opportunity for changes in
CPython that will allow simultaneous use of multiple CPU cores
(currently prevented by the GIL).

Alternatives to subinterpreters include threading, async, and
multiprocessing. Threading is limited by the GIL and async isn't
the right solution for every problem (nor for every person).
Multiprocessing is likewise valuable in some but not all situations.
Direct IPC (rather than via the multiprocessing module) provides
similar benefits but with the same caveat.

Notably, subinterpreters are not intended as a replacement for any of
the above. Certainly they overlap in some areas, but the benefits of
subinterpreters include isolation and (potentially) performance.
In particular, subinterpreters provide a direct route to an alternate concurrency model (e.g. CSP) which has found success elsewhere and will appeal to some Python users. That is the core value that the ``interpreters`` module will provide. * "stdlib support for subinterpreters adds extra burden on C extension authors" In the ``Interpreter Isolation`` section below we identify ways in which isolation in CPython's subinterpreters is incomplete. Most notable is extension modules that use C globals to store internal state. PEP 3121 and PEP 489 provide a solution for most of the problem, but one still remains. [petr-c-ext]_ Until that is resolved, C extension authors will face extra difficulty to support subinterpreters. Consequently, projects that publish extension modules may face an increased maintenance burden as their users start using subinterpreters, where their modules may break. This situation is limited to modules that use C globals (or use libraries that use C globals) to store internal state. Ultimately this comes down to a question of how often it will be a problem in practice: how many projects would be affected, how often their users will be affected, what the additional maintenance burden will be for projects, and what the overall benefit of subinterpreters is to offset those costs. The position of this PEP is that the actual extra maintenance burden will be small and well below the threshold at which subinterpreters are worth it. Proposal ======== The ``interpreters`` module will be added to the stdlib. It will provide a high-level interface to subinterpreters and wrap the low-level ``_interpreters`` module. The proposed API is inspired by the ``threading`` module. The module provides the following functions: ``list()``:: Return a list of all existing interpreters. ``get_current()``:: Return the currently running interpreter. ``get_main()``:: Return the main interpreter. ``create()``:: Initialize a new Python interpreter and return it. The interpreter will be created in the current thread and will remain idle until something is run in it. The module also provides the following classes: ``Interpreter(id)``:: id: The interpreter's ID (read-only). is_running(): Return whether or not the interpreter is currently executing code. destroy(): Finalize and destroy the interpreter. If called on a running interpreter then raise RuntimeError in the interpreter that called ``destroy()``. run(code): Run the provided Python code in the interpreter, in the current OS thread. If the interpreter is already running then raise RuntimeError in the interpreter that called ``run()``. The current interpreter (which called ``run()``) will block until the subinterpreter finishes running the requested code. Any uncaught exception in that code will bubble up to the current interpreter. Supported code: source text. get_fifo(name): Return the FIFO object with the given name that is associated with this interpreter. If no such FIFO exists then raise KeyError. The FIFO will be either a "FIFOReader" or a "FIFOWriter", depending on which "add_*_fifo()" was called. list_fifos(): Return a list of all fifos associated with the interpreter. add_recv_fifo(name=None): Create a new FIFO, associate the two ends with the involved interpreters, and return the side associated with the interpreter in which "add_recv_fifo()" was called. A FIFOReader gets tied to this interpreter. A FIFOWriter gets tied to the interpreter that called "add_recv_fifo()". The FIFO's name is set to the provided value. 
If no name is provided then a dynamically generated one is used. If a
FIFO with the given name is already associated with this interpreter
(or with the one in which "add_recv_fifo()" was called) then raise
KeyError.

   add_send_fifo(name=None):

      Create a new FIFO, associate the two ends with the involved
      interpreters, and return the side associated with the interpreter
      in which "add_send_fifo()" was called. A FIFOWriter gets tied to
      this interpreter. A FIFOReader gets tied to the interpreter that
      called "add_send_fifo()".

      The FIFO's name is set to the provided value. If no name is
      provided then a dynamically generated one is used. If a FIFO
      with the given name is already associated with this interpreter
      (or with the one in which "add_send_fifo()" was called) then raise
      KeyError.

   remove_fifo(name):

      Drop the association between the named FIFO and this interpreter.
      If the named FIFO is not found then raise KeyError.

``FIFOReader(name)``::

   The receiving end of a FIFO. An interpreter may use this to receive
   objects from another interpreter. At first only bytes and None will
   be supported.

   name:

      The FIFO's name.

   __next__():

      Return the next bytes object from the pipe. If none have been
      pushed on then block.

   pop(*, block=True):

      Return the next bytes object from the pipe. If none have been
      pushed on and "block" is True (the default) then block. Otherwise
      return None.

``FIFOWriter(name)``::

   The sending end of a FIFO. An interpreter may use this to send
   objects to another interpreter. At first only bytes and None will
   be supported.

   name:

      The FIFO's name.

   push(object, *, block=True):

      Add the object to the FIFO. If "block" is true then block until
      the object is popped off. If the FIFO does not support the
      object's type then TypeError is raised.

About FIFOs
-----------

Subinterpreters are inherently isolated (with caveats explained below),
in contrast to threads. This enables a different concurrency model
than currently exists in Python. CSP (Communicating Sequential
Processes), upon which Go's concurrency is based, is one example of
this model.

A key component of this approach to concurrency is message passing. So
providing a message/object passing mechanism alongside ``Interpreter``
is a fundamental requirement. This proposal includes a basic mechanism
upon which more complex machinery may be built. That basic mechanism
draws inspiration from pipes, queues, and CSP's channels.

The key challenge here is that sharing objects between interpreters
faces complexity due in part to CPython's current memory model.
Furthermore, in this class of concurrency, the ideal is that objects
only exist in one interpreter at a time. However, this is not
practical for Python so we initially constrain supported objects to
``bytes`` and ``None``. There are a number of strategies we may pursue
in the future to expand supported objects and object sharing
strategies.

Note that the complexity of object sharing increases as
subinterpreters become more isolated, e.g. after GIL removal. So the
mechanism for message passing needs to be carefully considered.
Keeping the API minimal and initially restricting the supported types
helps us avoid further exposing any underlying complexity to Python
users.

Deferred Functionality
======================

In the interest of keeping this proposal minimal, the following
functionality has been left out for future consideration. Note that
this is not a judgement against any of these capabilities, but rather
a deferment. That said, each is arguably valid.
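Before detailing the deferred items, here is an illustrative
(non-normative) sketch of how the API specified above might be used;
since ``run()`` blocks the calling thread, the subinterpreter is driven
from a helper thread::

    import threading
    import interpreters

    interp = interpreters.create()
    # The caller gets the FIFOWriter; "interp" gets the FIFOReader.
    writer = interp.add_recv_fifo('data')

    code = ("import interpreters\n"
            "cur = interpreters.get_current()\n"
            "msg = cur.get_fifo('data').pop()\n")

    t = threading.Thread(target=interp.run, args=(code,))
    t.start()
    writer.push(b'spam')   # blocks until the subinterpreter pops it
    t.join()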
Interpreter.call()
------------------

It would be convenient to run existing functions in subinterpreters
directly. ``Interpreter.run()`` could be adjusted to support this or
a ``call()`` method could be added::

   Interpreter.call(f, *args, **kwargs)

This suffers from the same problem as sharing objects between
interpreters via queues. The minimal solution (running a source
string) is sufficient for us to get the feature out where it can be
explored.

timeout arg to pop() and push()
-------------------------------

Typically functions that have a ``block`` argument also have a
``timeout`` argument. We can add it later if needed.

Interpreter Isolation
=====================

CPython's interpreters are intended to be strictly isolated from each
other. Each interpreter has its own copy of all modules, classes,
functions, and variables. The same applies to state in C, including in
extension modules. The CPython C-API docs explain more. [c-api]_

However, there are ways in which interpreters share some state. First
of all, some process-global state remains shared, like file
descriptors. There are no plans to change this.

Second, some isolation is faulty due to bugs or implementations that
did not take subinterpreters into account. This includes things like
at-exit handlers and extension modules that rely on C globals. In
these cases bugs should be opened (some are already).

Finally, some potential isolation is missing due to the current design
of CPython. This includes the GIL and memory management. Improvements
are currently going on to address gaps in this area.

Provisional Status
==================

The new ``interpreters`` module will be added with "provisional"
status (see PEP 411). This allows Python users to experiment with the
feature and provide feedback while still allowing us to adjust to that
feedback. The module will be provisional in Python 3.7 and we will
make a decision before the 3.8 release whether to keep it provisional,
graduate it, or remove it.

References
==========

.. [c-api]
   https://docs.python.org/3/c-api/init.html#bugs-and-caveats

.. [petr-c-ext]
   https://mail.python.org/pipermail/import-sig/2016-June/001062.html
   https://mail.python.org/pipermail/python-ideas/2016-April/039748.html

Copyright
=========

This document has been placed in the public domain.

From levkivskyi at gmail.com  Fri Sep  8 19:19:20 2017
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Sat, 9 Sep 2017 01:19:20 +0200
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To:
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com>
 <0da1ba37-ff98-5394-ba55-f59855048ff3@trueblade.com>
Message-ID:

On 9 September 2017 at 01:00, Guido van Rossum wrote:

> I think it would be useful to write 1-2 sentences about the problem with
> inheritance -- in that case you pretty much have to use a metaclass,

It is not the case now. I think __init_subclass__ has almost the same
possibilities as a decorator; it just updates an already created class
and can add some methods to it. This is a more subtle question; these
two, for example, would be equivalent:

    from dataclass import Data, Frozen

    class Point(Frozen, Data):
        x: int
        y: int

and

    from dataclass import dataclass

    @dataclass(frozen=True)
    class Point:
        x: int
        y: int

But the problem with the inheritance-based pattern is that it cannot
support automatic addition of __slots__. Also I think a decorator will
be easier to maintain. But on the other hand I think the
inheritance-based scheme is a bit more readable.
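For concreteness, a minimal sketch of the inheritance-based variant via
__init_subclass__ (illustrative only: the Data/Frozen names above are
hypothetical, and this handles neither defaults, nor repr/eq, nor
freezing):

    class Data:
        def __init_subclass__(cls, **kwargs):
            super().__init_subclass__(**kwargs)
            names = list(cls.__dict__.get('__annotations__', {}))
            def __init__(self, *args):
                if len(args) != len(names):
                    raise TypeError('expected %d arguments' % len(names))
                for name, value in zip(names, args):
                    setattr(self, name, value)
            cls.__init__ = __init__

    class Point(Data):
        x: int
        y: int

    p = Point(1, 2)
    assert (p.x, p.y) == (1, 2)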
--
Ivan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stefan at bytereef.org  Fri Sep  8 19:28:42 2017
From: stefan at bytereef.org (Stefan Krah)
Date: Sat, 9 Sep 2017 01:28:42 +0200
Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module)
In-Reply-To:
References:
Message-ID: <20170908232841.GA30417@bytereef.org>

On Fri, Sep 08, 2017 at 04:04:27PM -0700, Eric Snow wrote:
> * "stdlib support for subinterpreters adds extra burden
> on C extension authors"
>
> In the ``Interpreter Isolation`` section below we identify ways in
> which isolation in CPython's subinterpreters is incomplete. Most
> notable is extension modules that use C globals to store internal
> state. PEP 3121 and PEP 489 provide a solution for most of the
> problem, but one still remains. [petr-c-ext]_ Until that is resolved,
> C extension authors will face extra difficulty to support
> subinterpreters.

It's a bit of a hassle, and the enormous slowdown in some of the
existing solutions is really a no go [1].

In the case of _decimal, the tls-context is already subinterpreter safe
and reasonably fast due to caching.

The most promising model to me is to put *all* globals in a tls
structure and cache the whole structure. Extrapolating from my
experiences with the context, this might have a slowdown of "only" 4%.

Still, the argument "who uses subinterpreters?" of course still remains.

Stefan Krah

[1] I'm referring to the slowdown of heaptypes + module state.
Has anyone ever considered addressing that by moving the refcounts out of the objects and keeping them somewhere else? -- Greg From tim.peters at gmail.com Fri Sep 8 20:13:29 2017 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 8 Sep 2017 19:13:29 -0500 Subject: [Python-Dev] Memory bitmaps for the Python cyclic garbage collector In-Reply-To: <59B32CA0.1050409@canterbury.ac.nz> References: <20170907173012.vmvhx5lt2ajghqfy@python.ca> <59B32CA0.1050409@canterbury.ac.nz> Message-ID: [Tim] >> In that case, it's because Python >> _does_ mutate the objects' refcount members under the covers, and so >> the OS ends up making fresh copies of the memory anyway. [Greg Ewing ] > Has anyone ever considered addressing that by moving the > refcounts out of the objects and keeping them somewhere > else? Not that I know of. I know Larry Hastings was considering doing it as part of his experiments with removing the GIL, but that had nothing to do with reducing cross-process copy-on-write surprises (it had to do with "batching" refcount operations to eliminate a need for fine-grained locking). As-is, I'd say it's "a feature" that the refcount is part of the object header. Ref count manipulations are very frequent, and as part of the object header a refcount tends to show up in cache lines "for free" as a side effect of accessing the object's type pointer. From desmoulinmichel at gmail.com Fri Sep 8 23:54:51 2017 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Sat, 9 Sep 2017 05:54:51 +0200 Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module) In-Reply-To: <20170908232841.GA30417@bytereef.org> References: <20170908232841.GA30417@bytereef.org> Message-ID: Le 09/09/2017 ? 01:28, Stefan Krah a ?crit : > On Fri, Sep 08, 2017 at 04:04:27PM -0700, Eric Snow wrote: >> * "stdlib support for subinterpreters adds extra burden >> on C extension authors" >> >> In the ``Interpreter Isolation`` section below we identify ways in >> which isolation in CPython's subinterpreters is incomplete. Most >> notable is extension modules that use C globals to store internal >> state. PEP 3121 and PEP 489 provide a solution for most of the >> problem, but one still remains. [petr-c-ext]_ Until that is resolved, >> C extension authors will face extra difficulty to support >> subinterpreters. > > It's a bit of a hassle, and the enormous slowdown in some of the existing > solutions is really a no go [1]. > > In the case of _decimal, the tls-context is already subinterpreter safe > and reasonable fast due to caching. > > > The most promising model to me is to put *all* globals in a tls structure > and cache the whole structure. Extrapolating from my experiences with the > context, this might have a slowdown of "only" 4%. > > > Still, the argument "who uses subinterpreters?" of course still remains. For now, nobody. But if we expose it and web frameworks manage to create workers as fast as multiprocessing and as cheap as threading, you will find a lot of people starting to want to use it. We can't know until we got the toy to play with. > > > > Stefan Krah > > > [1] I'm referring to the slowdown of heaptypes + module state. 
> > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/desmoulinmichel%40gmail.com > From tjreedy at udel.edu Sat Sep 9 00:40:42 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 9 Sep 2017 00:40:42 -0400 Subject: [Python-Dev] PEP 553: Built-in debug() In-Reply-To: <1782A936-98F9-45C4-B3E6-9EC4C3E51D7A@python.org> References: <98F44C60-2DBB-4771-9544-B9DA7210353C@python.org> <03B5AA26-0AE0-4EFB-9300-B71F9ECD3AE8@python.org> <1F68BA08-AC94-41A6-AFD2-4439DA5D3371@python.org> <1782A936-98F9-45C4-B3E6-9EC4C3E51D7A@python.org> Message-ID: On 9/7/2017 10:45 PM, Barry Warsaw wrote: > On Sep 7, 2017, at 16:19, Terry Reedy wrote: > >> I think breakpoint() should have a db= parameter so one can select a debugger in one removable line. The sys interface is more useful for IDEs to change the default, possible with other args (like breakpoints and colors) bound to the callable. > > I?m skeptical about that. I think any particular user is going to overwhelmingly use the same debugger, so having to repeat themselves every time they want to enter the debugger is going to get tedious fast. I know it would annoy *me* if I had to tell it to use pdb every time I wrote `breakpoint()`, and I?m almost never going to use anything else. OK > I?m also not sure what useful semantics for `db` would be. E.g. what specifically would you set `db` to in order to invoke idle or tkdb (?gdb? would be an unfortunate name I think, given the popular existing GNU Debugger ;). I will stick with tkdb until there is a shed to paint. >> A somewhat separate point: the name breakpoint() is slightly misleading, which has consequences if it is (improperly) called more than once. While breakpoint() acts as a breakpoint, what it does (at least in the default pdb case) is *initialize* and start a *new* debugger, possibly after an import. Re-importing a module is no big deal. Replacing an existing debugger with a *new* one, and tossing away all defined aliases and breakpoints and Bdb's internal caches, is. It is likely not what most people would want or expect. I think it more likely that people will call breakpoint() multiple times than they would, for instance, call pdb.set_trace() multiple times. > > Multiple calls to pdb.set_trace() is fairly common in practice today, so I?m not terribly concerned about it. I am slightly surprised, but I guess if one does not set anything, one would not lose anything. >> With a gui debugger, having one window go and another appear might be additionally annoying. If the first window is not immediately GCed, having two windows would be confusing. Perhaps breakpoint() could be made a no-op after the first call. > > Your sys.breakpointhook could easily implement that, with a much better user experience than what built-in breakpoint() could do anyway. Now that I know multiple breakpoint()s are likely, I definitely would. -- Terry Jan Reedy From njs at pobox.com Sat Sep 9 01:14:06 2017 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 8 Sep 2017 22:14:06 -0700 Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module) In-Reply-To: References: <20170908232841.GA30417@bytereef.org> Message-ID: On Fri, Sep 8, 2017 at 8:54 PM, Michel Desmoulin wrote: > > Le 09/09/2017 ? 01:28, Stefan Krah a ?crit : >> Still, the argument "who uses subinterpreters?" of course still remains. > > For now, nobody. 
But if we expose it and web frameworks manage to create > workers as fast as multiprocessing and as cheap as threading, you will > find a lot of people starting to want to use it. To temper expectations a bit here, it sounds like the first version might be more like: as slow as threading (no multicore), as expensive as multiprocessing (no shared memory), and -- on Unix -- slower to start than either of them (no fork). -n -- Nathaniel J. Smith -- https://vorpus.org From p.f.moore at gmail.com Sat Sep 9 04:04:59 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 9 Sep 2017 09:04:59 +0100 Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module) In-Reply-To: References: Message-ID: On 9 September 2017 at 00:04, Eric Snow wrote: > add_recv_fifo(name=None): > > Create a new FIFO, associate the two ends with the involved > interpreters, and return the side associated with the interpreter > in which "add_recv_fifo()" was called. A FIFOReader gets tied to > this interpreter. A FIFOWriter gets tied to the interpreter that > called "add_recv_fifo()". > > The FIFO's name is set to the provided value. If no name is > provided then a dynamically generated one is used. If a FIFO > with the given name is already associated with this interpreter > (or with the one in which "add_recv_fifo()" was called) then raise > KeyError. > > add_send_fifo(name=None): > > Create a new FIFO, associate the two ends with the involved > interpreters, and return the side associated with the interpreter > in which "add_recv_fifo()" was called. A FIFOWriter gets tied to > this interpreter. A FIFOReader gets tied to the interpreter that > called "add_recv_fifo()". > > The FIFO's name is set to the provided value. If no name is > provided then a dynamically generated one is used. If a FIFO > with the given name is already associated with this interpreter > (or with the one in which "add_send_fifo()" was called) then raise > KeyError. Personally, I *always* read these names backwards - from the POV of the caller. So when I see "add_send_fifo", I then expect to be able to send stuff on the returned FIFO (i.e., I get a writer back). But that's not how it works. I may be alone in this - I've had similar problems in the past with how people name pipes, for example - but I thought I'd raise it while there's still a possibility that it's not just me and the names can be changed. Paul From solipsis at pitrou.net Sat Sep 9 07:45:13 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 9 Sep 2017 13:45:13 +0200 Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module) References: Message-ID: <20170909134513.4ba776bc@fsol> On Sat, 9 Sep 2017 09:04:59 +0100 Paul Moore wrote: > On 9 September 2017 at 00:04, Eric Snow wrote: > > add_recv_fifo(name=None): > > > > Create a new FIFO, associate the two ends with the involved > > interpreters, and return the side associated with the interpreter > > in which "add_recv_fifo()" was called. A FIFOReader gets tied to > > this interpreter. A FIFOWriter gets tied to the interpreter that > > called "add_recv_fifo()". > > > > The FIFO's name is set to the provided value. If no name is > > provided then a dynamically generated one is used. If a FIFO > > with the given name is already associated with this interpreter > > (or with the one in which "add_recv_fifo()" was called) then raise > > KeyError. 
> > > > add_send_fifo(name=None): > > > > Create a new FIFO, associate the two ends with the involved > > interpreters, and return the side associated with the interpreter > > in which "add_recv_fifo()" was called. A FIFOWriter gets tied to > > this interpreter. A FIFOReader gets tied to the interpreter that > > called "add_recv_fifo()". > > > > The FIFO's name is set to the provided value. If no name is > > provided then a dynamically generated one is used. If a FIFO > > with the given name is already associated with this interpreter > > (or with the one in which "add_send_fifo()" was called) then raise > > KeyError. > > Personally, I *always* read these names backwards - from the POV of > the caller. So when I see "add_send_fifo", I then expect to be able to > send stuff on the returned FIFO (i.e., I get a writer back). But > that's not how it works. Hmm, it's even worse for me, as it's not obvious by the quoted excerpt how those APIs are supposed to be used exactly. How does the other interpreter get the FIFO "tied" to it? Is it `get_current().get_fifo(name)`? Something else? How does it get to learn the *name*? Why are FIFOs unidirectional? Does it really make them significantly cheaper? With bidirectional pipes, the whole recv/send confusion would be avoided. As a point of comparison, for default for `multiprocessing.Pipe()` is to create a bidirectional pipe (*), yet I'm sure a multiprocessing Pipe has more overhead than an in-process FIFO. (*) https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Pipe Regards Antoine. From ma3yuki.8mamo10 at gmail.com Sat Sep 9 07:49:05 2017 From: ma3yuki.8mamo10 at gmail.com (Masayuki YAMAMOTO) Date: Sat, 9 Sep 2017 20:49:05 +0900 Subject: [Python-Dev] PEP 539 v3: A new C API for Thread-Local Storage in CPython In-Reply-To: References: Message-ID: 2017-09-09 2:09 GMT+09:00 Nick Coghlan : > [...] > No, we genuinely want to consolidate that state into a single shared > location. However, the struct definition can be adjusted as needed as > part of the PEP 539 implementation (and we'll get Eric Snow to be one > of the PR reviewers). > I see. I do merge the field of the runtime structure with the replacement code as is. Thanks, Masayuki -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sat Sep 9 08:05:10 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 9 Sep 2017 14:05:10 +0200 Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module) References: Message-ID: <20170909140510.1220f57e@fsol> On Fri, 8 Sep 2017 16:04:27 -0700 Eric Snow wrote: > > The module provides the following functions: > > ``list()``:: > > Return a list of all existing interpreters. It's called ``enumerate()`` in the threading module. Not sure there's a point in choosing a different name here. > The module also provides the following classes: > > ``Interpreter(id)``:: > run(code): > > Run the provided Python code in the interpreter, in the current > OS thread. If the interpreter is already running then raise > RuntimeError in the interpreter that called ``run()``. > > The current interpreter (which called ``run()``) will block > until the subinterpreter finishes running the requested code. Any > uncaught exception in that code will bubble up to the current > interpreter. Why does it block? How is concurrency supposed to be achieved in that model? It would be more flexible if run(code) returned an object that can later be waited on. Something like... 
a Future :-) And why guarantee that it executes in the "current OS thread"? I would say you don't want to specify where it executes exactly, as it opens the door for more sophisticated implementations (such as automatic assignment of subinterpreters inside a pool of threads). > get_fifo(name): > > Return the FIFO object with the given name that is associated > with this interpreter. If no such FIFO exists then raise > KeyError. The FIFO will be either a "FIFOReader" or a > "FIFOWriter", depending on which "add_*_fifo()" was called. > > list_fifos(): > > Return a list of all fifos associated with the interpreter. If fifos are uniquely named, why not return a name->fifo mapping? > ``FIFOReader(name)``:: > [...] I don't think the method naming choice is very adequate here. The API model for the FIFO objects can either be a (threading or multiprocessing) Queue or a multiprocessing Pipe. - if a Queue, then it should have a get() / put() pair of methods - if a Pipe, then it should have a recv() / send() pair of methods Now, since Queues are multi-producer multi-consumer, while Pipes are single-producer single-consumer (they aren't "synchronized"), the better analogy seems to the multiprocessing Pipe here, so I would vote for recv() / send(). But, in any case, definitely not a pop() / push() pair. Has any thought been given to how FIFOs could integrate with async code driven by an event loop (e.g. asyncio)? I think the model of executing several asyncio (or Tornado) applications each in their own subinterpreter may prove quite interesting to reconcile multi-core concurrency with ease of programming. That would require the FIFOs to be able to synchronize on something an event loop can wait on (probably a file descriptor?). Regards Antoine. From erik.m.bray at gmail.com Sat Sep 9 09:03:15 2017 From: erik.m.bray at gmail.com (Erik Bray) Date: Sat, 9 Sep 2017 15:03:15 +0200 Subject: [Python-Dev] PEP 539 v3: A new C API for Thread-Local Storage in CPython In-Reply-To: References: Message-ID: On Fri, Sep 8, 2017 at 4:37 PM, Nick Coghlan wrote: > On 8 September 2017 at 00:30, Masayuki YAMAMOTO > wrote: >> Hi folks, >> >> I submit PEP 539 third draft for the finish. Thank you for all the advice >> and the help! > > Thank you Erik & Yamamoto-san for all of your work on this PEP! > > The updates look good, so I'm happy to say as BDFL-Delegate that this > proposal is now accepted :) Thanks Nick! It's great to have this issue resolved in a carefully-considered, well-documented manner. Best, Erik From sf at fermigier.com Sat Sep 9 06:02:45 2017 From: sf at fermigier.com (=?UTF-8?Q?St=C3=A9fane_Fermigier?=) Date: Sat, 9 Sep 2017 12:02:45 +0200 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> Message-ID: Hi, first post here. My two cents: Here's a list of "prior arts" that I have collected over the years, besides attrs, that address similar needs (and often, much more): - https://github.com/bluedynamics/plumber - https://github.com/ionelmc/python-fields - https://github.com/frasertweedale/elk - https://github.com/kuujo/yuppy Regarding the name, 'dataclass', I agree that it can be a bit misleading (my first idea of a "dataclass" would be a class with only data and no behaviour, e.g. a 'struct', a 'record', a DTO, 'anemic' class, etc.). Scala has 'case classes' with some similarities ( https://docs.scala-lang.org/tour/case-classes.html). Regards, S. On Fri, Sep 8, 2017 at 4:57 PM, Eric V. 
Smith wrote: > I've written a PEP for what might be thought of as "mutable namedtuples > with defaults, but not inheriting tuple's behavior" (a mouthful, but it > sounded simpler when I first thought of it). It's heavily influenced by the > attrs project. It uses PEP 526 type annotations to define fields. From the > overview section: > > @dataclass > class InventoryItem: > name: str > unit_price: float > quantity_on_hand: int = 0 > > def total_cost(self) -> float: > return self.unit_price * self.quantity_on_hand > > Will automatically add these methods: > > def __init__(self, name: str, unit_price: float, quantity_on_hand: int = > 0) -> None: > self.name = name > self.unit_price = unit_price > self.quantity_on_hand = quantity_on_hand > def __repr__(self): > return f'InventoryItem(name={self.name!r},unit_price={self.unit_pri > ce!r},quantity_on_hand={self.quantity_on_hand!r})' > def __eq__(self, other): > if other.__class__ is self.__class__: > return (self.name, self.unit_price, self.quantity_on_hand) == ( > other.name, other.unit_price, other.quantity_on_hand) > return NotImplemented > def __ne__(self, other): > if other.__class__ is self.__class__: > return (self.name, self.unit_price, self.quantity_on_hand) != ( > other.name, other.unit_price, other.quantity_on_hand) > return NotImplemented > def __lt__(self, other): > if other.__class__ is self.__class__: > return (self.name, self.unit_price, self.quantity_on_hand) < ( > other.name, other.unit_price, other.quantity_on_hand) > return NotImplemented > def __le__(self, other): > if other.__class__ is self.__class__: > return (self.name, self.unit_price, self.quantity_on_hand) <= ( > other.name, other.unit_price, other.quantity_on_hand) > return NotImplemented > def __gt__(self, other): > if other.__class__ is self.__class__: > return (self.name, self.unit_price, self.quantity_on_hand) > ( > other.name, other.unit_price, other.quantity_on_hand) > return NotImplemented > def __ge__(self, other): > if other.__class__ is self.__class__: > return (self.name, self.unit_price, self.quantity_on_hand) >= ( > other.name, other.unit_price, other.quantity_on_hand) > return NotImplemented > > Data Classes saves you from writing and maintaining these functions. > > The PEP is largely complete, but could use some filling out in places. > Comments welcome! > > Eric. > > P.S. I wrote this PEP when I was in my happy place. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/sfermigie > r%2Blists%40gmail.com > -- Stefane Fermigier - http://fermigier.com/ - http://twitter.com/sfermigier - http://linkedin.com/in/sfermigier Founder & CEO, Abilian - Enterprise Social Software - http://www.abilian.com/ Chairman, Free&OSS Group / Systematic Cluster - http://www.gt-logiciel-libre.org/ Co-Chairman, National Council for Free & Open Source Software (CNLL) - http://cnll.fr/ Founder & Organiser, PyData Paris - http://pydata.fr/ --- ?You never change things by ?ghting the existing reality. To change something, build a new model that makes the existing model obsolete.? ? R. Buckminster Fuller -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Sat Sep 9 11:12:14 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 9 Sep 2017 08:12:14 -0700 Subject: [Python-Dev] PEP 556: Threaded garbage collection In-Reply-To: References: <20170908170523.73bdb61e@fsol> <1504897450.203763.1099937864.1DA2E661@webmail.messagingengine.com> <20170908211346.3f404a4b@fsol> Message-ID: On 8 September 2017 at 12:40, Nathaniel Smith wrote: > Would it make sense to also move signal handlers to run in this > thread? Those are the other major source of nasty re-entrancy > problems. Python level signal handlers are already only run in the main thread, so applications that want to ensure signals don't run at arbitrary points in their code are already free to push all their application logic into a subthread and have the main thread be purely a signal handling thread. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From gjcarneiro at gmail.com Sat Sep 9 11:41:34 2017 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Sat, 9 Sep 2017 16:41:34 +0100 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> Message-ID: Hi, it is not clear whether anything is done to total_cost: def total_cost(self) -> float: Does this become a property automatically, or is it still a method call? To that end, some examples of *using* a data class, not just defining one, would be helpful. If it remains a normal method, why put it in this example at all? Makes little sense... Otherwise I really like this idea, thanks! On 8 September 2017 at 15:57, Eric V. Smith wrote: > I've written a PEP for what might be thought of as "mutable namedtuples > with defaults, but not inheriting tuple's behavior" (a mouthful, but it > sounded simpler when I first thought of it). It's heavily influenced by the > attrs project. It uses PEP 526 type annotations to define fields. 
From the > overview section: > > @dataclass > class InventoryItem: > name: str > unit_price: float > quantity_on_hand: int = 0 > > def total_cost(self) -> float: > return self.unit_price * self.quantity_on_hand > > Will automatically add these methods: > > def __init__(self, name: str, unit_price: float, quantity_on_hand: int = > 0) -> None: > self.name = name > self.unit_price = unit_price > self.quantity_on_hand = quantity_on_hand > def __repr__(self): > return f'InventoryItem(name={self.name!r},unit_price={self.unit_pri > ce!r},quantity_on_hand={self.quantity_on_hand!r})' > def __eq__(self, other): > if other.__class__ is self.__class__: > return (self.name, self.unit_price, self.quantity_on_hand) == ( > other.name, other.unit_price, other.quantity_on_hand) > return NotImplemented > def __ne__(self, other): > if other.__class__ is self.__class__: > return (self.name, self.unit_price, self.quantity_on_hand) != ( > other.name, other.unit_price, other.quantity_on_hand) > return NotImplemented > def __lt__(self, other): > if other.__class__ is self.__class__: > return (self.name, self.unit_price, self.quantity_on_hand) < ( > other.name, other.unit_price, other.quantity_on_hand) > return NotImplemented > def __le__(self, other): > if other.__class__ is self.__class__: > return (self.name, self.unit_price, self.quantity_on_hand) <= ( > other.name, other.unit_price, other.quantity_on_hand) > return NotImplemented > def __gt__(self, other): > if other.__class__ is self.__class__: > return (self.name, self.unit_price, self.quantity_on_hand) > ( > other.name, other.unit_price, other.quantity_on_hand) > return NotImplemented > def __ge__(self, other): > if other.__class__ is self.__class__: > return (self.name, self.unit_price, self.quantity_on_hand) >= ( > other.name, other.unit_price, other.quantity_on_hand) > return NotImplemented > > Data Classes saves you from writing and maintaining these functions. > > The PEP is largely complete, but could use some filling out in places. > Comments welcome! > > Eric. > > P.S. I wrote this PEP when I was in my happy place. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/gjcarneir > o%40gmail.com > -- Gustavo J. A. M. Carneiro Gambit Research "The universe is always one step beyond logic." -- Frank Herbert -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Sep 9 11:58:26 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 9 Sep 2017 08:58:26 -0700 Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module) In-Reply-To: References: Message-ID: On 9 September 2017 at 01:04, Paul Moore wrote: > On 9 September 2017 at 00:04, Eric Snow wrote: >> add_recv_fifo(name=None): >> >> Create a new FIFO, associate the two ends with the involved >> interpreters, and return the side associated with the interpreter >> in which "add_recv_fifo()" was called. A FIFOReader gets tied to >> this interpreter. A FIFOWriter gets tied to the interpreter that >> called "add_recv_fifo()". >> >> The FIFO's name is set to the provided value. If no name is >> provided then a dynamically generated one is used. If a FIFO >> with the given name is already associated with this interpreter >> (or with the one in which "add_recv_fifo()" was called) then raise >> KeyError. 
>>
>> add_send_fifo(name=None):
>>
>>     Create a new FIFO, associate the two ends with the involved
>>     interpreters, and return the side associated with the interpreter
>>     in which "add_recv_fifo()" was called.  A FIFOWriter gets tied to
>>     this interpreter.  A FIFOReader gets tied to the interpreter that
>>     called "add_recv_fifo()".
>>
>>     The FIFO's name is set to the provided value.  If no name is
>>     provided then a dynamically generated one is used.  If a FIFO
>>     with the given name is already associated with this interpreter
>>     (or with the one in which "add_send_fifo()" was called) then raise
>>     KeyError.
>
> Personally, I *always* read these names backwards - from the POV of
> the caller. So when I see "add_send_fifo", I then expect to be able to
> send stuff on the returned FIFO (i.e., I get a writer back). But
> that's not how it works.

I had the same problem with the current names: as a message sender, I
expect to request a send queue, as a message receiver, I expect to
request a receive queue. Having to request the opposite of what I want
to do next seems backwards:

    send_fifo = other_interpreter.add_recv_fifo(__name__)  # Wut?
    send_fifo.push(b"Hello!")

I think it would be much clearer if the API looked like this for an
inline subinterpreter invocation:

    # Sending messages
    other_interpreter = interpreters.create()
    send_fifo = other_interpreter.get_send_fifo(__name__)
    send_fifo.push(b"Hello!")
    send_fifo.push(b"World!")
    send_fifo.push(None)

    # Receiving messages
    receiver_code = textwrap.dedent(f"""
        import interpreters
        recv_fifo = interpreters.get_current().get_recv_fifo({__name__})
        while True:
            msg = recv_fifo.pop()
            if msg is None:
                break
            print(msg)
    """)
    other_interpreter.run(receiver_code)

To enable concurrent communication between the sender & receiver, you'd
currently need to move the "other_interpreter.run()" call out to a
separate thread, but I think that's fine for now.

The rules for get_recv_fifo() and get_send_fifo() would be:

- named fifos are created as needed and always stored on the
  *receiving* interpreter
- you can only call get_recv_fifo() on the currently running
  interpreter: other interpreters can't access your receive fifos
- get_recv_fifo() never fails
- the receiving interpreter can have as many references to a receive
  FIFO as it likes
- get_send_fifo() can fail if the named fifo already exists on the
  receiving interpreter and the designated sender is an interpreter
  other than the one calling get_send_fifo()
- interpreters are free to use the named FIFO mechanism to send
  messages to themselves

To immediately realise some level of efficiency benefits from the
shared memory space between the main interpreter and subinterpreters, I
also think these low level FIFOs should be defined as accepting any
object that supports the PEP 3118 buffer protocol, and emitting
memoryview() objects on the receiving end, rather than being bytes-in,
bytes-out.

Such a memoryview based primitive can then potentially be expanded in
the future to support additional more-efficient-than-multiprocessing
serialisation based protocols, where the sending interpreter serialises
into a region of memory, shares that region with the subinterpreter via
a FIFO memoryview, and then the subinterpreter deserialises directly
from the sending interpreter's memory region, without the serialised
form ever needing to be streamed or copied anywhere (which is much
harder to avoid when coordinating across multiple operating system
processes).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
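The bytes-in, memoryview-out contract Nick describes can be illustrated
without any subinterpreter machinery at all. The sketch below fakes the
proposed FIFO with a plain queue.Queue; the BufferFIFO class and its
push()/pop() names are assumptions made purely for illustration and are
not part of the PEP 554 draft API:

    import queue

    class BufferFIFO:
        """Toy stand-in for the proposed FIFO: accepts any object that
        supports the PEP 3118 buffer protocol, hands out memoryviews."""

        def __init__(self):
            self._queue = queue.Queue()

        def push(self, obj):
            # memoryview() raises TypeError for non-buffer objects,
            # which enforces the "buffer protocol in" half of the
            # contract.
            self._queue.put(memoryview(obj))

        def pop(self):
            # "memoryview out": in a real cross-interpreter FIFO this
            # view would expose the sender's memory without copying it.
            return self._queue.get()

    fifo = BufferFIFO()
    fifo.push(b"Hello!")
    fifo.push(bytearray(b"World!"))
    print(bytes(fifo.pop()))  # b'Hello!'
    print(bytes(fifo.pop()))  # b'World!'

Nathaniel's question in the next message, about who releases the
underlying buffer and when, is exactly the part this toy version
glosses over.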
From njs at pobox.com  Sat Sep  9 13:54:53 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Sat, 9 Sep 2017 10:54:53 -0700
Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module)
In-Reply-To: 
References: 
Message-ID: 

On Sep 9, 2017 9:07 AM, "Nick Coghlan" wrote:

To immediately realise some level of efficiency benefits from the
shared memory space between the main interpreter and subinterpreters, I
also think these low level FIFOs should be defined as accepting any
object that supports the PEP 3118 buffer protocol, and emitting
memoryview() objects on the receiving end, rather than being bytes-in,
bytes-out.

Is your idea that this memoryview would refer directly to the sending
interpreter's memory (as opposed to a copy into some receiver-owned
buffer)? If so, then how do the two subinterpreters coordinate the
buffer release when the memoryview is closed?

-n

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From njs at pobox.com  Sat Sep  9 14:04:32 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Sat, 9 Sep 2017 11:04:32 -0700
Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module)
In-Reply-To: 
References: 
Message-ID: 

On Sep 8, 2017 4:06 PM, "Eric Snow" wrote:

run(code):

    Run the provided Python code in the interpreter, in the current
    OS thread.  If the interpreter is already running then raise
    RuntimeError in the interpreter that called ``run()``.

    The current interpreter (which called ``run()``) will block
    until the subinterpreter finishes running the requested code.  Any
    uncaught exception in that code will bubble up to the current
    interpreter.

This phrase "bubble up" here is doing a lot of work :-). Can you
elaborate on what you mean? The text now makes it seem like the
exception will just pass from one interpreter into another, but that
seems impossible -- it'd mean sharing not just arbitrary user defined
exception classes but full frame objects...

-n

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From barry at python.org  Sat Sep  9 14:46:30 2017
From: barry at python.org (Barry Warsaw)
Date: Sat, 9 Sep 2017 11:46:30 -0700
Subject: [Python-Dev] PEP 559 - built-in noop()
Message-ID: 

I couldn't resist one more PEP from the Core sprint.  I won't reveal
where or how this one came to me.

-Barry

PEP: 559
Title: Built-in noop()
Author: Barry Warsaw
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2017-09-08
Python-Version: 3.7
Post-History: 2017-09-09


Abstract
========

This PEP proposes adding a new built-in function called ``noop()``
which does nothing but return ``None``.


Rationale
=========

It is trivial to implement a no-op function in Python.  It's so easy in
fact that many people do it many times over and over again.  It would
be useful in many cases to have a common built-in function that does
nothing.

One use case would be for PEP 553, where you could set the breakpoint
environment variable to the following in order to effectively disable
it::

    $ setenv PYTHONBREAKPOINT=noop


Implementation
==============

The Python equivalent of the ``noop()`` function is exactly::

    def noop(*args, **kws):
        return None

The C built-in implementation is available as a pull request.


Rejected alternatives
=====================

``noop()`` returns something
----------------------------

YAGNI.

This is rejected because it complicates the semantics.  For example, if
you always return both ``*args`` and ``**kws``, what do you return when
none of those are given?  Returning a tuple of ``((), {})`` is kind of
ugly, but provides consistency.  But you might also want to just return
``None`` since that's also conceptually what the function was passed.

Or, what if you pass in exactly one positional argument, e.g.
``noop(7)``.  Do you return ``7`` or ``((7,), {})``?  And so on.

The author claims that you won't ever need the return value of
``noop()`` so it will always return ``None``.

Coghlan's Dialogs (edited for formatting):

    My counterargument to this would be ``map(noop, iterable)``,
    ``sorted(iterable, key=noop)``, etc.  (``filter``, ``max``, and
    ``min`` all accept callables that accept a single argument, as do
    many of the itertools operations).

    Making ``noop()`` a useful default function in those cases just
    needs the definition to be::

        def noop(*args, **kwds):
            return args[0] if args else None

    The counterargument to the counterargument is that using ``None``
    as the default in all these cases is going to be faster, since it
    lets the algorithm skip the callback entirely, rather than calling
    it and having it do nothing useful.


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 273 bytes
Desc: Message signed with OpenPGP
URL: 

From python at mrabarnett.plus.com  Sat Sep  9 14:58:47 2017
From: python at mrabarnett.plus.com (MRAB)
Date: Sat, 9 Sep 2017 19:58:47 +0100
Subject: [Python-Dev] PEP 559 - built-in noop()
In-Reply-To: 
References: 
Message-ID: <154082ae-46b3-db5e-d2c8-5a764a0198af@mrabarnett.plus.com>

On 2017-09-09 19:46, Barry Warsaw wrote:
> I couldn't resist one more PEP from the Core sprint.  I won't reveal
> where or how this one came to me.
>
> -Barry
>
> PEP: 559
> Title: Built-in noop()
> Author: Barry Warsaw
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 2017-09-08
> Python-Version: 3.7
> Post-History: 2017-09-09
>
>
> Abstract
> ========
>
> This PEP proposes adding a new built-in function called ``noop()``
> which does nothing but return ``None``.
>
[snip]

I'd prefer the more traditional "nop". :-)

From benhoyt at gmail.com  Sat Sep  9 15:04:18 2017
From: benhoyt at gmail.com (Ben Hoyt)
Date: Sat, 9 Sep 2017 15:04:18 -0400
Subject: [Python-Dev] PEP 559 - built-in noop()
In-Reply-To: 
References: 
Message-ID: 

I don't think the rationale justifies an entire builtin. You could just
use "PYTHONBREAKPOINT=int" to disable, or support "PYTHONBREAKPOINT=0"
as I think someone else suggested.

I personally can't remember the last time I needed a noop() function.
I've more often needed an identity() function, and even that has been
proposed several times but shot down, mainly because different people
have different needs. (Often "None" provided for function arguments
means identity anyway, like for sorted() and filter().)

If anything, I think it should be functools.noop() and
PYTHONBREAKPOINT=functools.noop

I think we need more justification and examples of where this would be
helpful to justify a new builtin.

-Ben

On Sat, Sep 9, 2017 at 2:46 PM, Barry Warsaw wrote:
> I couldn't resist one more PEP from the Core sprint.  I won't reveal where
> or how this one came to me.
> > -Barry > > PEP: 559 > Title: Built-in noop() > Author: Barry Warsaw > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 2017-09-08 > Python-Version: 3.7 > Post-History: 2017-09-09 > > > Abstract > ======== > > This PEP proposes adding a new built-in function called ``noop()`` which > does > nothing but return ``None``. > > > Rationale > ========= > > It is trivial to implement a no-op function in Python. It's so easy in > fact > that many people do it many times over and over again. It would be useful > in > many cases to have a common built-in function that does nothing. > > One use case would be for PEP 553, where you could set the breakpoint > environment variable to the following in order to effectively disable it:: > > $ setenv PYTHONBREAKPOINT=noop > > > Implementation > ============== > > The Python equivalent of the ``noop()`` function is exactly:: > > def noop(*args, **kws): > return None > > The C built-in implementation is available as a pull request. > > > Rejected alternatives > ===================== > > ``noop()`` returns something > ---------------------------- > > YAGNI. > > This is rejected because it complicates the semantics. For example, if you > always return both ``*args`` and ``**kws``, what do you return when none of > those are given? Returning a tuple of ``((), {})`` is kind of ugly, but > provides consistency. But you might also want to just return ``None`` > since > that's also conceptually what the function was passed. > > Or, what if you pass in exactly one positional argument, e.g. > ``noop(7)``. Do > you return ``7`` or ``((7,), {})``? And so on. > > The author claims that you won't ever need the return value of ``noop()`` > so > it will always return ``None``. > > Coghlin's Dialogs (edited for formatting): > > My counterargument to this would be ``map(noop, iterable)``, > ``sorted(iterable, key=noop)``, etc. (``filter``, ``max``, and > ``min`` all accept callables that accept a single argument, as do > many of the itertools operations). > > Making ``noop()`` a useful default function in those cases just > needs the definition to be:: > > def noop(*args, **kwds): > return args[0] if args else None > > The counterargument to the counterargument is that using ``None`` > as the default in all these cases is going to be faster, since it > lets the algorithm skip the callback entirely, rather than calling > it and having it do nothing useful. > > > Copyright > ========= > > This document has been placed in the public domain. > > > .. > Local Variables: > mode: indented-text > indent-tabs-mode: nil > sentence-end-double-space: t > fill-column: 70 > coding: utf-8 > End: > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > benhoyt%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Sat Sep 9 15:54:33 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Sat, 9 Sep 2017 12:54:33 -0700 Subject: [Python-Dev] PEP 559 - built-in noop() In-Reply-To: References: Message-ID: I always wanted this feature (no kidding). Would it be possible to add support for the context manager? with noop(): ... Maybe noop can be an instance of: class Noop: def __enter__(self, *args, **kw): return self def __exit__(self, *args): pass def __call__(self, *args, **kw): return self Victor Le 9 sept. 
2017 11:48 AM, "Barry Warsaw" a ?crit : > I couldn?t resist one more PEP from the Core sprint. I won?t reveal where > or how this one came to me. > > -Barry > > PEP: 559 > Title: Built-in noop() > Author: Barry Warsaw > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 2017-09-08 > Python-Version: 3.7 > Post-History: 2017-09-09 > > > Abstract > ======== > > This PEP proposes adding a new built-in function called ``noop()`` which > does > nothing but return ``None``. > > > Rationale > ========= > > It is trivial to implement a no-op function in Python. It's so easy in > fact > that many people do it many times over and over again. It would be useful > in > many cases to have a common built-in function that does nothing. > > One use case would be for PEP 553, where you could set the breakpoint > environment variable to the following in order to effectively disable it:: > > $ setenv PYTHONBREAKPOINT=noop > > > Implementation > ============== > > The Python equivalent of the ``noop()`` function is exactly:: > > def noop(*args, **kws): > return None > > The C built-in implementation is available as a pull request. > > > Rejected alternatives > ===================== > > ``noop()`` returns something > ---------------------------- > > YAGNI. > > This is rejected because it complicates the semantics. For example, if you > always return both ``*args`` and ``**kws``, what do you return when none of > those are given? Returning a tuple of ``((), {})`` is kind of ugly, but > provides consistency. But you might also want to just return ``None`` > since > that's also conceptually what the function was passed. > > Or, what if you pass in exactly one positional argument, e.g. > ``noop(7)``. Do > you return ``7`` or ``((7,), {})``? And so on. > > The author claims that you won't ever need the return value of ``noop()`` > so > it will always return ``None``. > > Coghlin's Dialogs (edited for formatting): > > My counterargument to this would be ``map(noop, iterable)``, > ``sorted(iterable, key=noop)``, etc. (``filter``, ``max``, and > ``min`` all accept callables that accept a single argument, as do > many of the itertools operations). > > Making ``noop()`` a useful default function in those cases just > needs the definition to be:: > > def noop(*args, **kwds): > return args[0] if args else None > > The counterargument to the counterargument is that using ``None`` > as the default in all these cases is going to be faster, since it > lets the algorithm skip the callback entirely, rather than calling > it and having it do nothing useful. > > > Copyright > ========= > > This document has been placed in the public domain. > > > .. > Local Variables: > mode: indented-text > indent-tabs-mode: nil > sentence-end-double-space: t > fill-column: 70 > coding: utf-8 > End: > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > victor.stinner%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From storchaka at gmail.com  Sat Sep  9 16:26:44 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 9 Sep 2017 23:26:44 +0300
Subject: [Python-Dev] PEP 559 - built-in noop()
In-Reply-To: 
References: 
Message-ID: 

On 09.09.17 21:46, Barry Warsaw wrote:
> One use case would be for PEP 553, where you could set the breakpoint
> environment variable to the following in order to effectively disable
> it::
>
>     $ setenv PYTHONBREAKPOINT=noop

Are there other use cases? PEP 553 still is not approved, and you could
use other syntax for disabling breakpoint(), e.g. setting
PYTHONBREAKPOINT to an empty value.

It looks to me that in all other cases it can be replaced with
`lambda *args, **kwds: None` (actually the expression can be even
simpler in concrete cases). I can't remember any case when I needed a
noop() function (unlike an identity() function or null context
manager).

From solipsis at pitrou.net  Sat Sep  9 16:35:34 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 9 Sep 2017 22:35:34 +0200
Subject: [Python-Dev] PEP 559 - built-in noop()
References: 
Message-ID: <20170909223534.22a5ad76@fsol>

On Sat, 9 Sep 2017 11:46:30 -0700 Barry Warsaw wrote:
>
> Rationale
> =========
>
> It is trivial to implement a no-op function in Python.  It's so easy
> in fact that many people do it many times over and over again.  It
> would be useful in many cases to have a common built-in function that
> does nothing.

You forgot to mention the advantage of battle-testing the noop()
function on our buildbot fleet!  You don't want to compromise your
application code with a weak noop() function.

> The C built-in implementation is available as a pull request.

Does it guarantee reentrancy?

Regards

Antoine.

From victor.stinner at gmail.com  Sat Sep  9 17:33:18 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 9 Sep 2017 14:33:18 -0700
Subject: [Python-Dev] PEP 559 - built-in noop()
In-Reply-To: 
References: 
Message-ID: 

I was able to find a real keyboard, so here is a more complete code:
---
class Noop:
    def __call__(self, *args, **kw):
        return self
    def __enter__(self, *args, **kw):
        return self
    def __exit__(self, *args):
        return
    def __repr__(self):
        return 'nope'
---

Example:
---
noop = Noop()
print(noop)
print(noop())
with noop() as nope:
    print(nope)
with noop as well:
    print(well)
---

Output:
---
nope
nope
nope
nope
---

IMHO the real question is if we need a Noop.nope() method?

Victor

2017-09-09 12:54 GMT-07:00 Victor Stinner :
> I always wanted this feature (no kidding).
>
> Would it be possible to add support for the context manager?
>
> with noop(): ...
>
> Maybe noop can be an instance of:
>
> class Noop:
>     def __enter__(self, *args, **kw): return self
>     def __exit__(self, *args): pass
>     def __call__(self, *args, **kw): return self
>
> Victor
>
> On Sep 9, 2017 at 11:48 AM, "Barry Warsaw" wrote:
>>
>> I couldn't resist one more PEP from the Core sprint.  I won't reveal
>> where or how this one came to me.
>>
>> -Barry
>>
>> PEP: 559
>> Title: Built-in noop()
>> Author: Barry Warsaw
>> Status: Draft
>> Type: Standards Track
>> Content-Type: text/x-rst
>> Created: 2017-09-08
>> Python-Version: 3.7
>> Post-History: 2017-09-09
>>
>>
>> Abstract
>> ========
>>
>> This PEP proposes adding a new built-in function called ``noop()``
>> which does nothing but return ``None``.
>>
>>
>> Rationale
>> =========
>>
>> It is trivial to implement a no-op function in Python.  It's so easy
>> in fact that many people do it many times over and over again.
It would be useful >> in >> many cases to have a common built-in function that does nothing. >> >> One use case would be for PEP 553, where you could set the breakpoint >> environment variable to the following in order to effectively disable it:: >> >> $ setenv PYTHONBREAKPOINT=noop >> >> >> Implementation >> ============== >> >> The Python equivalent of the ``noop()`` function is exactly:: >> >> def noop(*args, **kws): >> return None >> >> The C built-in implementation is available as a pull request. >> >> >> Rejected alternatives >> ===================== >> >> ``noop()`` returns something >> ---------------------------- >> >> YAGNI. >> >> This is rejected because it complicates the semantics. For example, if >> you >> always return both ``*args`` and ``**kws``, what do you return when none >> of >> those are given? Returning a tuple of ``((), {})`` is kind of ugly, but >> provides consistency. But you might also want to just return ``None`` >> since >> that's also conceptually what the function was passed. >> >> Or, what if you pass in exactly one positional argument, e.g. ``noop(7)``. >> Do >> you return ``7`` or ``((7,), {})``? And so on. >> >> The author claims that you won't ever need the return value of ``noop()`` >> so >> it will always return ``None``. >> >> Coghlin's Dialogs (edited for formatting): >> >> My counterargument to this would be ``map(noop, iterable)``, >> ``sorted(iterable, key=noop)``, etc. (``filter``, ``max``, and >> ``min`` all accept callables that accept a single argument, as do >> many of the itertools operations). >> >> Making ``noop()`` a useful default function in those cases just >> needs the definition to be:: >> >> def noop(*args, **kwds): >> return args[0] if args else None >> >> The counterargument to the counterargument is that using ``None`` >> as the default in all these cases is going to be faster, since it >> lets the algorithm skip the callback entirely, rather than calling >> it and having it do nothing useful. >> >> >> Copyright >> ========= >> >> This document has been placed in the public domain. >> >> >> .. >> Local Variables: >> mode: indented-text >> indent-tabs-mode: nil >> sentence-end-double-space: t >> fill-column: 70 >> coding: utf-8 >> End: >> >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com >> > From phd at phdru.name Sat Sep 9 17:38:46 2017 From: phd at phdru.name (Oleg Broytman) Date: Sat, 9 Sep 2017 23:38:46 +0200 Subject: [Python-Dev] PEP 559 - built-in noop() In-Reply-To: References: Message-ID: <20170909213846.GA4588@phdru.name> On Sat, Sep 09, 2017 at 02:33:18PM -0700, Victor Stinner wrote: > I was able to find a real keyboard, so here is a more complete code: > --- > class Noop: > def __call__(self, *args, **kw): > return self > def __enter__(self, *args, **kw): > return self > def __exit__(self, *args): > return > def __repr__(self): > return 'nope' > --- > > Example: > --- > noop = Noop() > print(noop) > print(noop()) > with noop() as nope: > print(nope) > with noop as well: > print(well) > --- > > Output: > --- > nope > nope > nope > nope > --- > > IHMO the real question is if we need a Noop.nope() method? Yep. It must return self so one can chain as many calls as she wants. > Victor Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. 
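Victor's Noop class and Oleg's chaining requirement combine naturally
into a single object. A minimal sketch of what the thread is converging
on -- the class shape and the __getattr__ chaining here are
illustrative assumptions, not anything agreed in the discussion:

    class Noop:
        """Chainable no-op: every operation returns the same instance."""
        def __call__(self, *args, **kwargs):
            return self          # noop()(1)(2) keeps working
        def __getattr__(self, name):
            return self          # noop.nope().whatever also works
        def __enter__(self):
            return self
        def __exit__(self, *exc_info):
            return False         # don't swallow real exceptions
        def __repr__(self):
            return 'nope'

    noop = Noop()
    with noop as n:
        print(n.nope().still.nothing())  # prints: nope

Returning self from every hook keeps arbitrary call and attribute
chains alive, while __exit__ returning False ensures exceptions raised
inside the with block still propagate.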
From k7hoven at gmail.com  Sat Sep  9 17:43:27 2017
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Sun, 10 Sep 2017 00:43:27 +0300
Subject: [Python-Dev] PEP 559 - built-in noop()
In-Reply-To: 
References: 
Message-ID: 

On Sat, Sep 9, 2017 at 10:54 PM, Victor Stinner wrote:
> I always wanted this feature (no kidding).
>
> Would it be possible to add support for the context manager?
>
> with noop(): ...
>
> Maybe noop can be an instance of:
>
> class Noop:
>     def __enter__(self, *args, **kw): return self
>     def __exit__(self, *args): pass
>     def __call__(self, *args, **kw): return self
>

This worries me. Clearly, assuming a None-coercing noop, we must have:

    noop(foo) is None
    noop[foo] is None
    noop * foo is None
    foo * noop is None
    noop + foo is None
    foo + noop is None
    noop - noop is None
    ...
    noop / 0 is None
    ...
    (noop == None) is None

which can all sort of be implicitly extrapolated to be in the PEP. But
how are you planning to implement:

    (noop is None) is None
    (obj in noop) is None
    (noop in obj) is None

or

    (None or noop) is None
    (None and noop) is None

and finally:

    foo(noop) is None

?

Sooner or later, someone will need all these features, and the PEP does
not seem to address the issue in any way.

-- Koos

> Victor
>
> On Sep 9, 2017 at 11:48 AM, "Barry Warsaw" wrote:
>
>> I couldn't resist one more PEP from the Core sprint.  I won't reveal
>> where or how this one came to me.
>>
>> -Barry
>>
>> PEP: 559
>> Title: Built-in noop()
>> Author: Barry Warsaw
>> Status: Draft
>> Type: Standards Track
>> Content-Type: text/x-rst
>> Created: 2017-09-08
>> Python-Version: 3.7
>> Post-History: 2017-09-09
>>
>>
>> Abstract
>> ========
>>
>> This PEP proposes adding a new built-in function called ``noop()``
>> which does nothing but return ``None``.
>>
>>
>> Rationale
>> =========
>>
>> It is trivial to implement a no-op function in Python.  It's so easy
>> in fact that many people do it many times over and over again.  It
>> would be useful in many cases to have a common built-in function
>> that does nothing.
>>
>> One use case would be for PEP 553, where you could set the breakpoint
>> environment variable to the following in order to effectively
>> disable it::
>>
>>     $ setenv PYTHONBREAKPOINT=noop
>>
>>
>> Implementation
>> ==============
>>
>> The Python equivalent of the ``noop()`` function is exactly::
>>
>>     def noop(*args, **kws):
>>         return None
>>
>> The C built-in implementation is available as a pull request.
>>
>>
>> Rejected alternatives
>> =====================
>>
>> ``noop()`` returns something
>> ----------------------------
>>
>> YAGNI.
>>
>> This is rejected because it complicates the semantics.  For example,
>> if you always return both ``*args`` and ``**kws``, what do you return
>> when none of those are given?  Returning a tuple of ``((), {})`` is
>> kind of ugly, but provides consistency.  But you might also want to
>> just return ``None`` since that's also conceptually what the function
>> was passed.
>>
>> Or, what if you pass in exactly one positional argument, e.g.
>> ``noop(7)``.  Do you return ``7`` or ``((7,), {})``?  And so on.
>>
>> The author claims that you won't ever need the return value of
>> ``noop()`` so it will always return ``None``.
>>
>> Coghlan's Dialogs (edited for formatting):
>>
>>     My counterargument to this would be ``map(noop, iterable)``,
>>     ``sorted(iterable, key=noop)``, etc.  (``filter``, ``max``, and
>>     ``min`` all accept callables that accept a single argument, as do
>>     many of the itertools operations).
>> >> Making ``noop()`` a useful default function in those cases just >> needs the definition to be:: >> >> def noop(*args, **kwds): >> return args[0] if args else None >> >> The counterargument to the counterargument is that using ``None`` >> as the default in all these cases is going to be faster, since it >> lets the algorithm skip the callback entirely, rather than calling >> it and having it do nothing useful. >> >> >> Copyright >> ========= >> >> This document has been placed in the public domain. >> >> >> .. >> Local Variables: >> mode: indented-text >> indent-tabs-mode: nil >> sentence-end-double-space: t >> fill-column: 70 >> coding: utf-8 >> End: >> >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/victor. >> stinner%40gmail.com >> >> > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > k7hoven%40gmail.com > > -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Sat Sep 9 17:45:21 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Sat, 9 Sep 2017 14:45:21 -0700 Subject: [Python-Dev] PEP 559 - built-in noop() In-Reply-To: References: Message-ID: Previous discussion: https://bugs.python.org/issue10049 Issue closed as rejected. Victor 2017-09-09 14:33 GMT-07:00 Victor Stinner : > I was able to find a real keyboard, so here is a more complete code: > --- > class Noop: > def __call__(self, *args, **kw): > return self > def __enter__(self, *args, **kw): > return self > def __exit__(self, *args): > return > def __repr__(self): > return 'nope' > --- > > Example: > --- > noop = Noop() > print(noop) > print(noop()) > with noop() as nope: > print(nope) > with noop as well: > print(well) > --- > > Output: > --- > nope > nope > nope > nope > --- > > IHMO the real question is if we need a Noop.nope() method? > > Victor > > 2017-09-09 12:54 GMT-07:00 Victor Stinner : >> I always wanted this feature (no kidding). >> >> Would it be possible to add support for the context manager? >> >> with noop(): ... >> >> Maybe noop can be an instance of: >> >> class Noop: >> def __enter__(self, *args, **kw): return self >> def __exit__(self, *args): pass >> def __call__(self, *args, **kw): return self >> >> Victor >> >> Le 9 sept. 2017 11:48 AM, "Barry Warsaw" a ?crit : >>> >>> I couldn?t resist one more PEP from the Core sprint. I won?t reveal where >>> or how this one came to me. >>> >>> -Barry >>> >>> PEP: 559 >>> Title: Built-in noop() >>> Author: Barry Warsaw >>> Status: Draft >>> Type: Standards Track >>> Content-Type: text/x-rst >>> Created: 2017-09-08 >>> Python-Version: 3.7 >>> Post-History: 2017-09-09 >>> >>> >>> Abstract >>> ======== >>> >>> This PEP proposes adding a new built-in function called ``noop()`` which >>> does >>> nothing but return ``None``. >>> >>> >>> Rationale >>> ========= >>> >>> It is trivial to implement a no-op function in Python. It's so easy in >>> fact >>> that many people do it many times over and over again. It would be useful >>> in >>> many cases to have a common built-in function that does nothing. 
>>> >>> One use case would be for PEP 553, where you could set the breakpoint >>> environment variable to the following in order to effectively disable it:: >>> >>> $ setenv PYTHONBREAKPOINT=noop >>> >>> >>> Implementation >>> ============== >>> >>> The Python equivalent of the ``noop()`` function is exactly:: >>> >>> def noop(*args, **kws): >>> return None >>> >>> The C built-in implementation is available as a pull request. >>> >>> >>> Rejected alternatives >>> ===================== >>> >>> ``noop()`` returns something >>> ---------------------------- >>> >>> YAGNI. >>> >>> This is rejected because it complicates the semantics. For example, if >>> you >>> always return both ``*args`` and ``**kws``, what do you return when none >>> of >>> those are given? Returning a tuple of ``((), {})`` is kind of ugly, but >>> provides consistency. But you might also want to just return ``None`` >>> since >>> that's also conceptually what the function was passed. >>> >>> Or, what if you pass in exactly one positional argument, e.g. ``noop(7)``. >>> Do >>> you return ``7`` or ``((7,), {})``? And so on. >>> >>> The author claims that you won't ever need the return value of ``noop()`` >>> so >>> it will always return ``None``. >>> >>> Coghlin's Dialogs (edited for formatting): >>> >>> My counterargument to this would be ``map(noop, iterable)``, >>> ``sorted(iterable, key=noop)``, etc. (``filter``, ``max``, and >>> ``min`` all accept callables that accept a single argument, as do >>> many of the itertools operations). >>> >>> Making ``noop()`` a useful default function in those cases just >>> needs the definition to be:: >>> >>> def noop(*args, **kwds): >>> return args[0] if args else None >>> >>> The counterargument to the counterargument is that using ``None`` >>> as the default in all these cases is going to be faster, since it >>> lets the algorithm skip the callback entirely, rather than calling >>> it and having it do nothing useful. >>> >>> >>> Copyright >>> ========= >>> >>> This document has been placed in the public domain. >>> >>> >>> .. >>> Local Variables: >>> mode: indented-text >>> indent-tabs-mode: nil >>> sentence-end-double-space: t >>> fill-column: 70 >>> coding: utf-8 >>> End: >>> >>> >>> _______________________________________________ >>> Python-Dev mailing list >>> Python-Dev at python.org >>> https://mail.python.org/mailman/listinfo/python-dev >>> Unsubscribe: >>> https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com >>> >> From guido at python.org Sat Sep 9 18:12:16 2017 From: guido at python.org (Guido van Rossum) Date: Sat, 9 Sep 2017 15:12:16 -0700 Subject: [Python-Dev] PEP 559 - built-in noop() In-Reply-To: References: Message-ID: I can't tell whether this was meant seriously, but I don't think it's worth it. People can easily write their own dummy function and give it any damn semantics they want. Let's reject the PEP. On Sat, Sep 9, 2017 at 11:46 AM, Barry Warsaw wrote: > I couldn?t resist one more PEP from the Core sprint. I won?t reveal where > or how this one came to me. > > -Barry > > PEP: 559 > Title: Built-in noop() > Author: Barry Warsaw > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 2017-09-08 > Python-Version: 3.7 > Post-History: 2017-09-09 > > > Abstract > ======== > > This PEP proposes adding a new built-in function called ``noop()`` which > does > nothing but return ``None``. > > > Rationale > ========= > > It is trivial to implement a no-op function in Python. 
It's so easy in > fact > that many people do it many times over and over again. It would be useful > in > many cases to have a common built-in function that does nothing. > > One use case would be for PEP 553, where you could set the breakpoint > environment variable to the following in order to effectively disable it:: > > $ setenv PYTHONBREAKPOINT=noop > > > Implementation > ============== > > The Python equivalent of the ``noop()`` function is exactly:: > > def noop(*args, **kws): > return None > > The C built-in implementation is available as a pull request. > > > Rejected alternatives > ===================== > > ``noop()`` returns something > ---------------------------- > > YAGNI. > > This is rejected because it complicates the semantics. For example, if you > always return both ``*args`` and ``**kws``, what do you return when none of > those are given? Returning a tuple of ``((), {})`` is kind of ugly, but > provides consistency. But you might also want to just return ``None`` > since > that's also conceptually what the function was passed. > > Or, what if you pass in exactly one positional argument, e.g. > ``noop(7)``. Do > you return ``7`` or ``((7,), {})``? And so on. > > The author claims that you won't ever need the return value of ``noop()`` > so > it will always return ``None``. > > Coghlin's Dialogs (edited for formatting): > > My counterargument to this would be ``map(noop, iterable)``, > ``sorted(iterable, key=noop)``, etc. (``filter``, ``max``, and > ``min`` all accept callables that accept a single argument, as do > many of the itertools operations). > > Making ``noop()`` a useful default function in those cases just > needs the definition to be:: > > def noop(*args, **kwds): > return args[0] if args else None > > The counterargument to the counterargument is that using ``None`` > as the default in all these cases is going to be faster, since it > lets the algorithm skip the callback entirely, rather than calling > it and having it do nothing useful. > > > Copyright > ========= > > This document has been placed in the public domain. > > > .. > Local Variables: > mode: indented-text > indent-tabs-mode: nil > sentence-end-double-space: t > fill-column: 70 > coding: utf-8 > End: > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Sat Sep 9 21:00:53 2017 From: eric at trueblade.com (Eric V. Smith) Date: Sat, 9 Sep 2017 21:00:53 -0400 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> Message-ID: <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com> On 9/9/2017 11:41 AM, Gustavo Carneiro wrote: > Hi, it is not clear whether anything is done to total_cost: > > ? ? def total_cost(self) -> float: > > Does this become a property automatically, or is it still a method > call?? To that end, some examples of *using* a data class, not just > defining one, would be helpful. > > If it remains a normal method, why put it in this example at all? Makes Nothing is done with total_cost, it's still a method. It's meant to show that you can use methods in a Data Class. Maybe I should add a method that has a parameter, or at least explain why that method is present in the example. 
I'm not sure how I'd write an example showing you can do everything you
can with an undecorated class. I think some text explanation would be
better.

Eric.

From skip.montanaro at gmail.com  Sun Sep 10 05:42:49 2017
From: skip.montanaro at gmail.com (Skip Montanaro)
Date: Sun, 10 Sep 2017 04:42:49 -0500
Subject: [Python-Dev] Fwd: Python programming language vulnerabilities
In-Reply-To: <9573BEFB-B5EB-4814-8C8D-AF18C7DB8BD3@maurya.on.ca>
References: <9573BEFB-B5EB-4814-8C8D-AF18C7DB8BD3@maurya.on.ca>
Message-ID: 

This popped up on python-list. It actually seems to me like it might be
interesting to the core developers. Apologies if I've missed my guess.

Skip

---------- Forwarded message ----------
From: Stephen Michell 
Date: Fri, Sep 8, 2017 at 12:34 PM
Subject: Python programming language vulnerabilities
To: python-list at python.org

I chair ISO/IEC/JTC1/SC22/WG23 Programming Language Vulnerabilities. We
publish an international technical report, ISO IEC TR 24772 Guide to
avoiding programming language vulnerabilities through language
selection and use. Annex D in this document addresses vulnerabilities
in Python. This document is freely available from ISO and IEC.

We are updating this technical report, adding a few vulnerabilities and
updating language applicability as programming languages evolve. We are
also subdividing the document by making the language-specific annexes
each their own technical report.

For the Python Part, the major portions are written, but we have about
6 potential vulnerabilities left to complete. We need help in finishing
the Python TR. We are looking for a few Python experts that have
experience in implementing Python language systems, or experts in
implementing significant systems in Python (for technical level,
persons that provide technical supervision to implementers, or that
write and maintain organizational Python coding standards).

If you are interested in helping, please reply to this posting.

Thank you

Stephen Michell
Convenor, ISO/IEC/JTC 1/SC 22/WG 23 Programming Language Vulnerabilities
-- 
https://mail.python.org/mailman/listinfo/python-list
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From desmoulinmichel at gmail.com  Sun Sep 10 09:51:48 2017
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Sun, 10 Sep 2017 15:51:48 +0200
Subject: [Python-Dev] PEP 559 - built-in noop()
In-Reply-To: 
References: 
Message-ID: <8838a41b-f96d-6153-6aa0-3b1d94fe028e@gmail.com>

Don't we already have the mock module for that? A mock works as a noop,
will be ok with being used as a context manager and allow chaining...

Either way, what would a noop function really give you compared to
lambda *a, **b: None? A bit shorter to write. Maybe faster to run. But
do you use it so much that it needs to be included? It's not rocket
science. And as a built-in, no less.

If it's coded in C to gain perfs, then:

- let's put it in functools, not in built-in. I often wish for partial
to be built-in, but it's not. Honestly the functools and itertools
modules should be autoimported (I always do in my PYTHONSTARTUP). But
no pony for us, let's not clutter the global namespace.

- provide it with its little sister, the identity function. They almost
always go hand in hand and doing the whole work of this PEP is a nice
opportunity. sorted/max/min/sort and most validation callbacks have an
identity function as default parameter. We would have then
functools.noop and functools.identity.

- provide a pure Python backport, as sketched below.
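A pure Python version of the pair Michel describes is tiny. Placing
these in functools is his suggestion, not an accepted API, so the
sketch below is illustrative only:

    # Hypothetical pure Python backport of the suggested
    # functools.noop and functools.identity.

    def noop(*args, **kwargs):
        """Accept any arguments, do nothing, return None."""
        return None

    def identity(x):
        """Return the single argument unchanged."""
        return x

    # The kind of use Michel mentions: identity as an explicit key
    # callback where None would otherwise be passed.
    assert sorted([3, 1, 2], key=identity) == [1, 2, 3]
    assert noop("anything", ignored=True) is None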
Alternatively, just rewrite part of the mock module in C. You'll get a fast noop with a lot of features, and as a bonus it would speed up a lot of unit tests around here.

From desmoulinmichel at gmail.com  Sun Sep 10 10:00:37 2017
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Sun, 10 Sep 2017 16:00:37 +0200
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com>
Message-ID:

The reaction is overwhelmingly positive everywhere: hacker news, reddit, twitter. People have been expecting something like that for a long time.

3 questions:

- is providing validation/conversion hooks completely out of the question or still open for debate? I know it's to keep the implementation simple, but having a few callbacks run in the __init__ in a for loop is not that much complexity. You don't have to provide validators, but having a validators parameter on field() would be a huge time saver. Just a list of callables called when the value is first set, potentially raising an exception that you don't even need to process in any way. It returns the value converted, and voilà. We all do that every day manually.

- I read Guido talking about some base class as an alternative to the decorator version, but don't see it in the PEP. Is it still considered?

- any chance it becomes a built-in later? When classes were improved in Python 2, the object built-in was added. Imagine if we had had to import it every time... Or maybe just plug it into object, like @object.dataclass.

From eric at trueblade.com  Sun Sep 10 12:36:10 2017
From: eric at trueblade.com (Eric V. Smith)
Date: Sun, 10 Sep 2017 12:36:10 -0400
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To:
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com>
Message-ID:

On 9/10/2017 10:00 AM, Michel Desmoulin wrote:
> The reaction is overwhelmingly positive everywhere: hacker news, reddit, twitter.

Do you have a pointer to the Hacker News discussion? I missed it.

> People have been expecting something like that for a long time.

Me, too!

> 3 questions:
>
> - is providing validation/conversion hooks completely out of the question or still open for debate? I know it's to keep the implementation simple, but having a few callbacks run in the __init__ in a for loop is not that much complexity. You don't have to provide validators, but having a validators parameter on field() would be a huge time saver. Just a list of callables called when the value is first set, potentially raising an exception that you don't even need to process in any way. It returns the value converted, and voilà. We all do that every day manually.

I don't particularly want to add validation specifically. I want to make it possible to add validation yourself, or via a library.

What I think I'll do is add a metadata parameter to fields(), defaulting to None. Then you could write a post-init hook that does whatever single- and multi-field validations you want (or whatever else you want to do). Although this plays poorly with "frozen" classes: it's always something! I'll think about it.

To make this most useful, I need to get the post-init hook to take an optional parameter so you can get data to it. I don't have a good way to do this, yet. Suggestions welcomed.
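One way this could fit together, purely as a sketch (the metadata argument on field() and the __post_init__ hook spelling are both unsettled proposals at this point, not final API):

    from dataclasses import dataclass, field, fields

    def non_negative(value):
        if value < 0:
            raise ValueError(f'expected a non-negative value, got {value!r}')

    @dataclass
    class InventoryItem:
        name: str
        unit_price: float = field(default=0.0, metadata={'validators': [non_negative]})

        def __post_init__(self):
            # Run any validators stashed in each field's metadata.
            for f in fields(self):
                for check in (f.metadata or {}).get('validators', ()):
                    check(getattr(self, f.name))

A third-party library could supply the __post_init__ body, leaving the data class author to declare only the validators.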
Although if the post-init hook takes a param that you can pass in at object creation time, I guess there's really no need for a per-field metadata parameter: you could use the field name as a key to look up whatever you wanted to know about the field.

> - I read Guido talking about some base class as an alternative to the decorator version, but don't see it in the PEP. Is it still considered?

I'm going to put some words in explaining why I don't want to use base classes (I don't think it buys you anything). Do you have a reason for preferring base classes?

> - any chance it becomes a built-in later? When classes were improved in Python 2, the object built-in was added. Imagine if we had had to import it every time... Or maybe just plug it into object, like @object.dataclass.

Because of the choice of using module-level functions so as to not introduce conflicts in the object's namespace, it would be difficult to make this a builtin.

Although now that I think about it, maybe what are currently module-level functions should instead be methods on the "dataclass" decorator itself:

    @dataclass
    class C:
        i: int = dataclass.field(default=1, init=False)
        j: str

    c = C('hello')

    dataclass.asdict(c)
    {'i': 1, 'j': 'hello'}

Then, "dataclass" would be the only name the module exports, making it easier to someday be a builtin. I'm not sure it's important enough for this to be a builtin, but this would make it easier. Thoughts? I'm usually not a fan of having attributes on a function: it's why itertools.chain.from_iterable() is hard to find.

Eric.

From barry at python.org  Sun Sep 10 13:21:43 2017
From: barry at python.org (Barry Warsaw)
Date: Sun, 10 Sep 2017 10:21:43 -0700
Subject: [Python-Dev] PEP 559 - built-in noop()
In-Reply-To:
References:
Message-ID: <67580E0D-0FAC-4B4D-95BC-C1624EEA7627@python.org>

On Sep 9, 2017, at 15:12, Guido van Rossum wrote:
>
> I can't tell whether this was meant seriously, but I don't think it's worth it. People can easily write their own dummy function and give it any damn semantics they want. Let's reject the PEP.

Alrighty then! (Yes, it was serious, but I claim post-sprint euphoria/delirium).

-Barry

From guido at python.org  Sun Sep 10 14:49:52 2017
From: guido at python.org (Guido van Rossum)
Date: Sun, 10 Sep 2017 11:49:52 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To:
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com>
Message-ID:

On Sun, Sep 10, 2017 at 9:36 AM, Eric V. Smith wrote:
> On 9/10/2017 10:00 AM, Michel Desmoulin wrote:
>
>> - any chance it becomes a built-in later? When classes were improved in Python 2, the object built-in was added. Imagine if we had had to import it every time... Or maybe just plug it into object, like @object.dataclass.
>
> Because of the choice of using module-level functions so as to not introduce conflicts in the object's namespace, it would be difficult to make this a builtin.

The temptation to make everything a builtin should be resisted.
> Although now that I think about it, maybe what are currently module-level functions should instead be methods on the "dataclass" decorator itself:
>
>     @dataclass
>     class C:
>         i: int = dataclass.field(default=1, init=False)
>         j: str
>
>     c = C('hello')
>
>     dataclass.asdict(c)
>     {'i': 1, 'j': 'hello'}
>
> Then, "dataclass" would be the only name the module exports, making it easier to someday be a builtin. I'm not sure it's important enough for this to be a builtin, but this would make it easier. Thoughts? I'm usually not a fan of having attributes on a function: it's why itertools.chain.from_iterable() is hard to find.

Let's not do that. It would be better to design the module so that people can write `from dataclasses import *` and they will only get things that are clearly part of dataclasses (I guess dataclass, field, asdict, and a few more like that). That way people who really want this to look like a builtin can just violate PEP 8.

--
--Guido van Rossum (python.org/~guido)

From barry at python.org  Sun Sep 10 15:06:09 2017
From: barry at python.org (Barry Warsaw)
Date: Sun, 10 Sep 2017 12:06:09 -0700
Subject: [Python-Dev] breakpoint() and $PYTHONBREAKPOINT
Message-ID:

For PEP 553, I think it's a good idea to support the environment variable $PYTHONBREAKPOINT[*] but I'm stuck on a design question, so I'd like to get some feedback.

Should $PYTHONBREAKPOINT be consulted in breakpoint() or in sys.breakpointhook()?

If we support it in breakpoint() then it means if $PYTHONBREAKPOINT is set, Python will never call sys.breakpointhook() so programmatic override of that will be ignored.

If we support it in sys.breakpointhook(), then if you programmatically override that, $PYTHONBREAKPOINT will be ignored even if set. Of course, any code that overrides sys.breakpointhook() could also consult $PYTHONBREAKPOINT if they want.

So how do we want it to work? It seems like explicit code should override the environment variable, which should override the default, meaning that $PYTHONBREAKPOINT should be consulted only in breakpoint(). And if you set sys.breakpointhook() with something else, then oh well, Python just ignores $PYTHONBREAKPOINT.

Feedback welcome.
-Barry

[*] Probably not $PYTHONBREAKPOINTHOOK although if we choose to implement that in sys.breakpointhook(), it might still make sense.

From guido at python.org  Sun Sep 10 15:16:05 2017
From: guido at python.org (Guido van Rossum)
Date: Sun, 10 Sep 2017 12:16:05 -0700
Subject: [Python-Dev] breakpoint() and $PYTHONBREAKPOINT
In-Reply-To:
References:
Message-ID:

I think programmatic overrides should be able to decide for themselves if they want to honor PYTHONBREAKPOINT or not, since I can imagine use cases that go both ways. So it should be checked in sys.breakpointhook().

On Sun, Sep 10, 2017 at 12:06 PM, Barry Warsaw wrote:
> For PEP 553, I think it's a good idea to support the environment variable $PYTHONBREAKPOINT[*] but I'm stuck on a design question, so I'd like to get some feedback.
>
> Should $PYTHONBREAKPOINT be consulted in breakpoint() or in sys.breakpointhook()?
>
> If we support it in breakpoint() then it means if $PYTHONBREAKPOINT is set, Python will never call sys.breakpointhook() so programmatic override of that will be ignored.
>
> If we support it in sys.breakpointhook(), then if you programmatically override that, $PYTHONBREAKPOINT will be ignored even if set. Of course, any code that overrides sys.breakpointhook() could also consult $PYTHONBREAKPOINT if they want.
>
> So how do we want it to work? It seems like explicit code should override the environment variable, which should override the default, meaning that $PYTHONBREAKPOINT should be consulted only in breakpoint(). And if you set sys.breakpointhook() with something else, then oh well, Python just ignores $PYTHONBREAKPOINT.
>
> Feedback welcome.
> -Barry
>
> [*] Probably not $PYTHONBREAKPOINTHOOK although if we choose to implement that in sys.breakpointhook(), it might still make sense.

--
--Guido van Rossum (python.org/~guido)

From njs at pobox.com  Sun Sep 10 16:46:14 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 10 Sep 2017 13:46:14 -0700
Subject: [Python-Dev] breakpoint() and $PYTHONBREAKPOINT
In-Reply-To:
References:
Message-ID:

On Sun, Sep 10, 2017 at 12:06 PM, Barry Warsaw wrote:
> For PEP 553, I think it's a good idea to support the environment variable $PYTHONBREAKPOINT[*] but I'm stuck on a design question, so I'd like to get some feedback.
>
> Should $PYTHONBREAKPOINT be consulted in breakpoint() or in sys.breakpointhook()?

Wouldn't the usual pattern be to check $PYTHONBREAKPOINT once at startup, and if it's set use it to initialize sys.breakpointhook()? Compare to, say, $PYTHONPATH.

-n

--
Nathaniel J. Smith -- https://vorpus.org

From python-ideas at mgmiller.net  Sun Sep 10 17:05:45 2017
From: python-ideas at mgmiller.net (Mike Miller)
Date: Sun, 10 Sep 2017 14:05:45 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <0da1ba37-ff98-5394-ba55-f59855048ff3@trueblade.com>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <0da1ba37-ff98-5394-ba55-f59855048ff3@trueblade.com>
Message-ID:

Thanks for the info.

On 2017-09-08 15:40, Eric V. Smith wrote:
>> - Are types required?
>
> Annotations are required, the typing module is not.
>
>> Maybe an example or two with Any?
>
> I'd rather leave it like it is: typing is referenced only once, for ClassVar.

Guess I really meant "object" or "type" above, not "typing.Any."

For a decade or two, my use of structs/records in Python has been dynamically-typed and that hasn't been an issue. As the problem this PEP is solving is orthogonal to typing improvements, it feels a bit like typing is being coupled and pushed with it, whether wanted or not.

Not that I dislike typing mind you (appreciated on projects with large teams), it's just as mentioned not related to the problem of class definition verbosity or lacking functionality.

Cheers,
-Mike

From levkivskyi at gmail.com  Sun Sep 10 17:23:49 2017
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Sun, 10 Sep 2017 23:23:49 +0200
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To:
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <0da1ba37-ff98-5394-ba55-f59855048ff3@trueblade.com>
Message-ID:

On 10 September 2017 at 23:05, Mike Miller wrote:
> [...]
> As the problem this PEP is solving is orthogonal to typing improvements

This is not the case, static support for dataclasses is an important point of motivation. It is hard to support static typing for many third party packages like attrs, since they use a lot of "magic".

--
Ivan

From k7hoven at gmail.com  Sun Sep 10 17:29:11 2017
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Mon, 11 Sep 2017 00:29:11 +0300
Subject: [Python-Dev] PEP 559 - built-in noop()
In-Reply-To: <67580E0D-0FAC-4B4D-95BC-C1624EEA7627@python.org>
References: <67580E0D-0FAC-4B4D-95BC-C1624EEA7627@python.org>
Message-ID:

On Sun, Sep 10, 2017 at 8:21 PM, Barry Warsaw wrote:
> On Sep 9, 2017, at 15:12, Guido van Rossum wrote:
> >
> > I can't tell whether this was meant seriously, but I don't think it's worth it. People can easily write their own dummy function and give it any damn semantics they want. Let's reject the PEP.
>
> Alrighty then! (Yes, it was serious, but I claim post-sprint euphoria/delirium).

Just for future reference, here's a slightly more serious comment:

I think the "pass" statement wasn't mentioned yet, but clearly noop() would be duplication of functionality. So maybe the closest thing without duplication would be to make "pass" an expression which evaluates to a no-op function, but which the compiler could perhaps optimize away if it's a statement by itself, or is a builtin.

-- Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +

From rosuav at gmail.com  Sun Sep 10 17:39:07 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 11 Sep 2017 07:39:07 +1000
Subject: [Python-Dev] PEP 559 - built-in noop()
In-Reply-To:
References: <67580E0D-0FAC-4B4D-95BC-C1624EEA7627@python.org>
Message-ID:

On Mon, Sep 11, 2017 at 7:29 AM, Koos Zevenhoven wrote:
> On Sun, Sep 10, 2017 at 8:21 PM, Barry Warsaw wrote:
>>
>> On Sep 9, 2017, at 15:12, Guido van Rossum wrote:
>> >
>> > > Just for future reference, here's a slightly more serious comment: > > I think the "pass" statement wasn't mentioned yet, but clearly noop() would > be duplication of functionality. So maybe the closest thing without > duplication would be to make "pass" an expression which evaluates to a no-op > function, but which the compiler could perhaps optimize away if it's a > statement by itself, or is a builtin. As a language change, definitely not. But I like this idea for PYTHONBREAKPOINT. You set it to the name of a function, or to "pass" if you want nothing to be done. It's a special case that can't possibly conflict with normal usage. ChrisA From python-dev at mgmiller.net Sun Sep 10 17:48:50 2017 From: python-dev at mgmiller.net (Mike Miller) Date: Sun, 10 Sep 2017 14:48:50 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <0da1ba37-ff98-5394-ba55-f59855048ff3@trueblade.com> Message-ID: On 2017-09-10 14:23, Ivan Levkivskyi wrote: > This is not the case, static support for dataclasses is an import point of > motivation. I've needed this functionality a decade before types became cool again. ;-) > It is hard to support static typing for many third party packages like attrs, > since they use a lot of "magic". Believe that attrs has its own validators. As mentioned, nothing against static typing, would simply like an example without, to show that it is not required. -Mike From eric at trueblade.com Sun Sep 10 19:54:48 2017 From: eric at trueblade.com (Eric V. Smith) Date: Sun, 10 Sep 2017 19:54:48 -0400 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> Message-ID: <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com> On 9/8/2017 11:01 AM, Eric V. Smith wrote: > Oops, I forgot the link. It should show up shortly at > https://www.python.org/dev/peps/pep-0557/. And now I've pushed a version that works with Python 3.6 to PyPI at https://pypi.python.org/pypi/dataclasses It implements the PEP as it currently stands. I'll be making some tweaks in the coming weeks. Feedback is welcomed. The repo is at https://github.com/ericvsmith/dataclasses Eric. From barry at python.org Sun Sep 10 21:03:58 2017 From: barry at python.org (Barry Warsaw) Date: Sun, 10 Sep 2017 18:03:58 -0700 Subject: [Python-Dev] PEP 559 - built-in noop() In-Reply-To: References: <67580E0D-0FAC-4B4D-95BC-C1624EEA7627@python.org> Message-ID: On Sep 10, 2017, at 14:39, Chris Angelico wrote: > > As a language change, definitely not. But I like this idea for > PYTHONBREAKPOINT. You set it to the name of a function, or to "pass" > if you want nothing to be done. It's a special case that can't > possibly conflict with normal usage. I have working code for `PYTHONBREAKPOINT=0` and `PYTHONBREAKPOINT= ` (i.e. the empty string as given by getenv()) meaning ?disable?. I don?t think we also need `PYTHONBREAKPOINT=pass`. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
From rosuav at gmail.com  Sun Sep 10 21:06:21 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 11 Sep 2017 11:06:21 +1000
Subject: [Python-Dev] PEP 559 - built-in noop()
In-Reply-To:
References: <67580E0D-0FAC-4B4D-95BC-C1624EEA7627@python.org>
Message-ID:

On Mon, Sep 11, 2017 at 11:03 AM, Barry Warsaw wrote:
> On Sep 10, 2017, at 14:39, Chris Angelico wrote:
>>
>> As a language change, definitely not. But I like this idea for PYTHONBREAKPOINT. You set it to the name of a function, or to "pass" if you want nothing to be done. It's a special case that can't possibly conflict with normal usage.
>
> I have working code for `PYTHONBREAKPOINT=0` and `PYTHONBREAKPOINT= ` (i.e. the empty string as given by getenv()) meaning "disable". I don't think we also need `PYTHONBREAKPOINT=pass`.

Blank is the other one that I would stumble to. Works for me.

ChrisA

From eric at trueblade.com  Sun Sep 10 21:20:31 2017
From: eric at trueblade.com (Eric V. Smith)
Date: Sun, 10 Sep 2017 21:20:31 -0400
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <132c15ba-ceb6-4731-2708-5b43ed863e7e@mgmiller.net>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <0da1ba37-ff98-5394-ba55-f59855048ff3@trueblade.com> <132c15ba-ceb6-4731-2708-5b43ed863e7e@mgmiller.net>
Message-ID:

> On Sep 10, 2017, at 5:13 PM, Mike Miller wrote:
>
> Thanks for the info.
>
> On 2017-09-08 15:40, Eric V. Smith wrote:
>>> - Are types required?
>> Annotations are required, the typing module is not.
>>> Maybe an example or two with Any?
>> I'd rather leave it like it is: typing is referenced only once, for ClassVar.
>
> Guess I really meant "object" or "type" above, not "typing.Any."
>
> For a decade or two, my use of structs/records in Python has been dynamically-typed and that hasn't been an issue. As the problem this PEP is solving is orthogonal to typing improvements, it feels a bit like typing is being coupled and pushed with it, whether wanted or not.
>
> Not that I dislike typing mind you (appreciated on projects with large teams), it's just as mentioned not related to the problem of class definition verbosity or lacking functionality.

Using type annotations buys two things. One is the concise syntax: it's how the decorator finds the fields. In the simple case, it's what lets you define a data class without using the call to fields(), which is analogous to attrs's attr.ib(). The other thing it buys you is compatibility with type checkers. As Ivan said, the design was very careful to be compatible with type checkers.

So other than requiring some type as an annotation, there's no dependency added for typing in the general sense. You can use a type of object if you want, or just lie to it and say None (though I'm not recommending that!). It's completely ignored at runtime (except for the ClassVar case, as the PEP states).

I'll think about adding some more language to the PEP about it.

Eric.

From steve at pearwood.info  Sun Sep 10 21:52:56 2017
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 11 Sep 2017 11:52:56 +1000
Subject: [Python-Dev] PEP 559 - built-in noop()
In-Reply-To:
References: <67580E0D-0FAC-4B4D-95BC-C1624EEA7627@python.org>
Message-ID: <20170911015255.GP13110@ando.pearwood.info>

On Mon, Sep 11, 2017 at 07:39:07AM +1000, Chris Angelico wrote:
[...]
> As a language change, definitely not. But I like this idea for PYTHONBREAKPOINT.
> You set it to the name of a function, or to "pass" if you want nothing to be done. It's a special case that can't possibly conflict with normal usage.

I disagree -- it's a confusion of concepts. "pass" is a do-nothing statement, not a value, so you can't set something to pass. Expect a lot of StackOverflow questions asking why this doesn't work:

    sys.breakpoint = pass

In fact, in one sense pass is not even a statement. It has no runtime effect, it isn't compiled into any bytecode. It is a purely syntactic feature to satisfy the parser.

Of course env variables are actually strings, so we can choose "pass" to mean "no break point" if we wanted. But I think there are already two perfectly good candidates for that usage which don't mix the concepts of statements and values, the empty string, and None:

    setenv PYTHONBREAKPOINT=""
    setenv PYTHONBREAKPOINT=None

--
Steve

From raymond.hettinger at gmail.com  Sun Sep 10 22:27:22 2017
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sun, 10 Sep 2017 19:27:22 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com>
Message-ID:

> On Sep 10, 2017, at 4:54 PM, Eric V. Smith wrote:
>
> And now I've pushed a version that works with Python 3.6 to PyPI at https://pypi.python.org/pypi/dataclasses
>
> It implements the PEP as it currently stands. I'll be making some tweaks in the coming weeks. Feedback is welcomed.
>
> The repo is at https://github.com/ericvsmith/dataclasses

+1
Overall, this looks very well thought out.
Nice work!

Once you get agreement on the functionality, name bike-shedding will likely be next. In a way, all classes are data classes so that name doesn't tell me much. Instead, it would be nice to have something suggestive of what it actually does which is automatically adding boilerplate methods to a general purpose class. Perhaps, @boilerplate or @autoinit or some such.

Raymond

From njs at pobox.com  Sun Sep 10 23:08:58 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 10 Sep 2017 20:08:58 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com>
Message-ID:

Hi Eric,

A few quick comments:

Why do you even have a hash= argument on individual fields? For the whole class, I can imagine you might want to explicitly mark a whole class as unhashable, but it seems like the only thing you can do with the field-level hash= argument is to create a class where the __hash__ and __eq__ take different fields into account, and why would you ever want that?

Though honestly I can see a reasonable argument for removing the class-level hash= option too. And even if you keep it you might want to error on some truly nonsensical options like defining __hash__ without __eq__. (Also watch out that Python's usual rule about defining __eq__ blocking the inheritance of __hash__ does not kick in if __eq__ is added after the class is created.)

I've sometimes wished that attrs let me control whether it generated equality methods (eq/ne/hash) separately from ordering methods (lt/gt/...). Maybe the cmp= argument should take an enum with options none/equality-only/full?

The "why not attrs" section kind of reads like "because it's too popular and useful"?
-n

On Sep 8, 2017 08:44, "Eric V. Smith" wrote:

Oops, I forgot the link. It should show up shortly at https://www.python.org/dev/peps/pep-0557/.

Eric.

On 9/8/17 7:57 AM, Eric V. Smith wrote:
> I've written a PEP for what might be thought of as "mutable namedtuples with defaults, but not inheriting tuple's behavior" (a mouthful, but it sounded simpler when I first thought of it). It's heavily influenced by the attrs project. It uses PEP 526 type annotations to define fields. From the overview section:
>
>     @dataclass
>     class InventoryItem:
>         name: str
>         unit_price: float
>         quantity_on_hand: int = 0
>
>         def total_cost(self) -> float:
>             return self.unit_price * self.quantity_on_hand
>
> Will automatically add these methods:
>
>     def __init__(self, name: str, unit_price: float, quantity_on_hand: int = 0) -> None:
>         self.name = name
>         self.unit_price = unit_price
>         self.quantity_on_hand = quantity_on_hand
>     def __repr__(self):
>         return f'InventoryItem(name={self.name!r},unit_price={self.unit_price!r},quantity_on_hand={self.quantity_on_hand!r})'
>     def __eq__(self, other):
>         if other.__class__ is self.__class__:
>             return (self.name, self.unit_price, self.quantity_on_hand) == (other.name, other.unit_price, other.quantity_on_hand)
>         return NotImplemented
>     def __ne__(self, other):
>         if other.__class__ is self.__class__:
>             return (self.name, self.unit_price, self.quantity_on_hand) != (other.name, other.unit_price, other.quantity_on_hand)
>         return NotImplemented
>     def __lt__(self, other):
>         if other.__class__ is self.__class__:
>             return (self.name, self.unit_price, self.quantity_on_hand) < (other.name, other.unit_price, other.quantity_on_hand)
>         return NotImplemented
>     def __le__(self, other):
>         if other.__class__ is self.__class__:
>             return (self.name, self.unit_price, self.quantity_on_hand) <= (other.name, other.unit_price, other.quantity_on_hand)
>         return NotImplemented
>     def __gt__(self, other):
>         if other.__class__ is self.__class__:
>             return (self.name, self.unit_price, self.quantity_on_hand) > (other.name, other.unit_price, other.quantity_on_hand)
>         return NotImplemented
>     def __ge__(self, other):
>         if other.__class__ is self.__class__:
>             return (self.name, self.unit_price, self.quantity_on_hand) >= (other.name, other.unit_price, other.quantity_on_hand)
>         return NotImplemented
>
> Data Classes saves you from writing and maintaining these functions.
>
> The PEP is largely complete, but could use some filling out in places. Comments welcome!
>
> Eric.
>
> P.S. I wrote this PEP when I was in my happy place.

From larry at hastings.org  Sun Sep 10 23:42:11 2017
From: larry at hastings.org (Larry Hastings)
Date: Sun, 10 Sep 2017 20:42:11 -0700
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To:
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>
Message-ID: <69ac8d22-ad51-ac1d-ca7b-9b451f72b0c0@hastings.org>

On 09/06/2017 08:26 AM, Guido van Rossum wrote:
> So we've seen a real use case for __class__ assignment: deprecating things on access.
> That use case could also be solved if modules natively supported defining __getattr__ (with the same "only used if attribute not found otherwise" semantics as it has on classes), but it couldn't be solved using @property (or at least it would be quite hacky).

I guess it's a matter of perspective. I solved this problem using @property, and I don't think it's particularly hacky. (See my implementation at the bottom of this email). The worst thing it does is look up the current module via sys.modules[__name__]... which Nathaniel's code also must do.

My example is a lot simpler than Nathaniel's code but it's just a proof-of-concept. Nathaniel's code does much more. In particular, I didn't override __dir__ to hide the deprecated attributes; doing that would mean assigning to __class__ just as Nathaniel does. (If you're actually curious to see it I could add that to the proof-of-concept.)

IMO my PEP strikes the right balance. @property is far more popular than __getattr__ or __getattribute__; 19 times out of 20, people use @property. Meanwhile, people for whom @property is insufficient (like Nathaniel) can already assign to module.__class__ and override __getattr__, __getattribute__, __del__, or any other existing magic method. In other words: the commonplace is easy, and the unusual is possible. Perfect!

//arry

-----

test_deprecated.py:

    import depmod

    print(depmod.depr1)  # throws an exception

depmod.py:

    # module with deprecated properties
    import sys

    _deprecated_properties = (
        ("depr1", 33),
        ("depr2", 44),
    )

    __module__ = sys.modules[__name__]

    def set_deprecated_property(name, value):
        @property
        def prop(self):
            raise RuntimeError(f"property '{name}' is deprecated")
            return value
        setattr(__module__, name, prop)

    for name, value in _deprecated_properties:
        set_deprecated_property(name, value)

From larry at hastings.org  Sun Sep 10 23:52:12 2017
From: larry at hastings.org (Larry Hastings)
Date: Sun, 10 Sep 2017 20:52:12 -0700
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To:
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>
Message-ID: <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org>

On 09/06/2017 02:13 PM, Ronald Oussoren wrote:
> To be honest this sounds like a fairly crude hack. Updating the __class__ of a module object feels dirty, but at least you get normal behavior w.r.t. properties.

Okay. Obviously I disagree, I think it's reasonable. But I'll assume you're -1.

> Why is there no mechanism to add new descriptors that can work in this context?

I've updated the prototype to add one. I added it as "collections.abc.InstanceDescriptor"; that's a base class you can inherit from, and then your descriptor will work in a module. Bikeshedding the name is fine.

> BTW. The interaction with import is interesting... Module properties only work as naive users expect when accessing them as attributes of the module object, in particular importing the name using "from module import prop" would only call the property getter once and that may not be the intended behavior.

I spent a fair amount of time thinking about this. The short answer is: we /could/ fix it. We /could/ make it so that "from x import y", when x.y is an instance descriptor, ensures that y honors the descriptor protocol when referenced.
We'd have to do it for three contexts:

  * global scope (aka module scope)
  * class scope
  * function scope

The first two are pretty similar; create a proxy object that retains the module instance so it remains "bound" to that. The third one is trickier; it'd mean a new bytecode (LOAD_FAST_DESCRIPTOR), but it wouldn't slow down people who didn't use it.

Anyway, long story short, I think this would be worse than simply having "from x import y" only calling the getter once. As the Zen says: special cases aren't special enough to break the rules.

//arry/

From ncoghlan at gmail.com  Mon Sep 11 01:47:55 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 11 Sep 2017 15:47:55 +1000
Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module)
In-Reply-To:
References:
Message-ID:

On 10 September 2017 at 03:54, Nathaniel Smith wrote:
> On Sep 9, 2017 9:07 AM, "Nick Coghlan" wrote:
>
> To immediately realise some level of efficiency benefits from the shared memory space between the main interpreter and subinterpreters, I also think these low level FIFOs should be defined as accepting any object that supports the PEP 3118 buffer protocol, and emitting memoryview() objects on the receiving end, rather than being bytes-in, bytes-out.
>
> Is your idea that this memoryview would refer directly to the sending interpreter's memory (as opposed to a copy into some receiver-owned buffer)?

There are two possibilities that I think can be made to work.

Option one would be a memoryview subclass that:

1. Also stores a reference to the source interpreter
2. Temporarily switches back to that interpreter when acquiring or releasing a buffer view

The receiving interpreter would then also be able to acquire a suitable reference to the sending interpreter for the message in order to establish bidirectional communications.

Option two would be the same general idea, but with a regular memoryview placed in front of the subinterpreter aware variant (although it's less clear how we'd establish bidirectional comms channels in that case).

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Mon Sep 11 01:51:50 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 11 Sep 2017 15:51:50 +1000
Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module)
In-Reply-To:
References:
Message-ID:

On 10 September 2017 at 04:04, Nathaniel Smith wrote:
> On Sep 8, 2017 4:06 PM, "Eric Snow" wrote:
>
> run(code):
>
>     Run the provided Python code in the interpreter, in the current OS thread. If the interpreter is already running then raise RuntimeError in the interpreter that called ``run()``.
>
>     The current interpreter (which called ``run()``) will block until the subinterpreter finishes running the requested code. Any uncaught exception in that code will bubble up to the current interpreter.
>
> This phrase "bubble up" here is doing a lot of work :-). Can you elaborate on what you mean? The text now makes it seem like the exception will just pass from one interpreter into another, but that seems impossible -- it'd mean sharing not just arbitrary user defined exception classes but full frame objects...
Indeed, I think this will need to be something more akin to https://docs.python.org/3/library/subprocess.html#subprocess.CalledProcessError, where the child interpreter is able to pass back encoded text data (perhaps including a full rendered traceback), but the exception itself won't be able to be propagated.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From k7hoven at gmail.com  Mon Sep 11 02:36:11 2017
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Mon, 11 Sep 2017 09:36:11 +0300
Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module)
In-Reply-To:
References:
Message-ID:

On Mon, Sep 11, 2017 at 8:51 AM, Nick Coghlan wrote:
> On 10 September 2017 at 04:04, Nathaniel Smith wrote:
> > On Sep 8, 2017 4:06 PM, "Eric Snow" wrote:
> >
> > run(code):
> >
> >     Run the provided Python code in the interpreter, in the current OS thread. If the interpreter is already running then raise RuntimeError in the interpreter that called ``run()``.
> >
> >     The current interpreter (which called ``run()``) will block until the subinterpreter finishes running the requested code. Any uncaught exception in that code will bubble up to the current interpreter.
> >
> > This phrase "bubble up" here is doing a lot of work :-). Can you elaborate on what you mean? The text now makes it seem like the exception will just pass from one interpreter into another, but that seems impossible -- it'd mean sharing not just arbitrary user defined exception classes but full frame objects...
>
> Indeed, I think this will need to be something more akin to https://docs.python.org/3/library/subprocess.html#subprocess.CalledProcessError, where the child interpreter is able to pass back encoded text data (perhaps including a full rendered traceback), but the exception itself won't be able to be propagated.

It would be helpful if at least the exception type could somehow be preserved / restored / mimicked. Otherwise you need if-elif statements in your try-excepts and other annoying stuff.

-- Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +

From desmoulinmichel at gmail.com  Mon Sep 11 02:38:06 2017
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Mon, 11 Sep 2017 08:38:06 +0200
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To:
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com>
Message-ID:

Le 10/09/2017 à 18:36, Eric V.
Smith a écrit :
> On 9/10/2017 10:00 AM, Michel Desmoulin wrote:
>> The reaction is overwhelmingly positive everywhere: hacker news, reddit, twitter.
>
> Do you have a pointer to the Hacker News discussion? I missed it.

Err... I may have been over enthusiastic and created the hacker news thread in my mind.

>
>> People have been expecting something like that for a long time.
>
> Me, too!
>
>> 3 questions:
>>
>> - is providing validation/conversion hooks completely out of the question or still open for debate? I know it's to keep the implementation simple, but having a few callbacks run in the __init__ in a for loop is not that much complexity. You don't have to provide validators, but having a validators parameter on field() would be a huge time saver. Just a list of callables called when the value is first set, potentially raising an exception that you don't even need to process in any way. It returns the value converted, and voilà. We all do that every day manually.
>
> I don't particularly want to add validation specifically. I want to make it possible to add validation yourself, or via a library.
>
> What I think I'll do is add a metadata parameter to fields(), defaulting to None. Then you could write a post-init hook that does whatever single- and multi-field validations you want (or whatever else you want to do). Although this plays poorly with "frozen" classes: it's always something! I'll think about it.
>
> To make this most useful, I need to get the post-init hook to take an optional parameter so you can get data to it. I don't have a good way to do this, yet. Suggestions welcomed.

It doesn't really allow you to do anything you couldn't do as easily in __init__.

Alternatively, you could have an "on_set" hook for field(), that just takes the field value, and returns the field value. By default, it's an identity function and is always called (minus implementation optimizations):

    from functools import reduce

    self.foo = reduce((lambda data, next: next(*data)), on_set_hooks, (field, val))

And people can do whatever they want: default values, factories, transformers, converters/casters, validation, logging...

>
> Although if the post-init hook takes a param that you can pass in at object creation time, I guess there's really no need for a per-field metadata parameter: you could use the field name as a key to look up whatever you wanted to know about the field.
>
>> - I read Guido talking about some base class as an alternative to the decorator version, but don't see it in the PEP. Is it still considered?
>
> I'm going to put some words in explaining why I don't want to use base classes (I don't think it buys you anything). Do you have a reason for preferring base classes?

Not preferring, but having it as an alternative. Mainly for 2 reasons:

1 - data classes allow one to type in classes very quickly, let's harvest the benefit from that.

Typing a decorator in a shell is much less comfortable than using inheritance. Same thing about IDEs: all current ones have snippets with auto-switch to the class parents on tab. All in all, if you are doing exploratory programming, and thus disposable code, which data classes are fantastic for, inheritance will keep you in the flow.

2 - it will help sell the data classes

I train a lot of people in Python each year. I never have to explain classes to people with any kind of programming background. I _always_ have to explain decorators. People are not used to them, and even kind of fear them for quite some time.
Inheritance, however, is familiar, and will not only push people to use data classes more, but will also let them make fewer mistakes: they know the dangers of parent ordering, but not those of decorator ordering.

>
>> - any chance it becomes a built-in later? When classes were improved in Python 2, the object built-in was added. Imagine if we had had to import it every time... Or maybe just plug it into object, like @object.dataclass.
>
> Because of the choice of using module-level functions so as to not introduce conflicts in the object's namespace, it would be difficult to make this a builtin.
>
> Although now that I think about it, maybe what are currently module-level functions should instead be methods on the "dataclass" decorator itself:
>
>     @dataclass
>     class C:
>         i: int = dataclass.field(default=1, init=False)
>         j: str
>
>     c = C('hello')
>
>     dataclass.asdict(c)
>     {'i': 1, 'j': 'hello'}
>
> Then, "dataclass" would be the only name the module exports, making it easier to someday be a builtin. I'm not sure it's important enough for this to be a builtin, but this would make it easier. Thoughts? I'm usually not a fan of having attributes on a function: it's why itertools.chain.from_iterable() is hard to find.
>
> Eric.

From ncoghlan at gmail.com  Mon Sep 11 03:32:38 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 11 Sep 2017 17:32:38 +1000
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To:
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com>
Message-ID:

On 11 September 2017 at 12:27, Raymond Hettinger wrote:
> Once you get agreement on the functionality, name bike-shedding will likely be next. In a way, all classes are data classes so that name doesn't tell me much. Instead, it would be nice to have something suggestive of what it actually does which is automatically adding boilerplate methods to a general purpose class. Perhaps, @boilerplate or @autoinit or some such.

"data class" is essentially short for "declarative data class" or "data-centric class": as a class author the decorator allows you to focus on declaring the data fields, and *not* on procedurally defining how those fields are initialised (and compared, and displayed, and hashed, ...) the way you do with a traditional imperative class definition.

When I changed the name of contextlib.ignored to the more cryptic contextlib.suppress, I made the mistake of letting the folks that knew how the context manager worked dictate the name, rather than allowing it to keep the name that described what it was for. I think the same will apply here: we'll get a better name if we focus on describing the problem the capability solves in the simplest possible terms than we will if we choose something that more accurately describes how it is implemented.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From eric at trueblade.com  Mon Sep 11 08:26:33 2017
From: eric at trueblade.com (Eric V. Smith)
Date: Mon, 11 Sep 2017 08:26:33 -0400
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To:
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com>
Message-ID: <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com>

On 9/10/17 10:27 PM, Raymond Hettinger wrote:
>
>> On Sep 10, 2017, at 4:54 PM, Eric V.
Smith wrote: >> >> And now I've pushed a version that works with Python 3.6 to PyPI at https://pypi.python.org/pypi/dataclasses >> >> It implements the PEP as it currently stands. I'll be making some tweaks in the coming weeks. Feedback is welcomed. >> >> The repo is at https://github.com/ericvsmith/dataclasses > > +1 > Overall, this looks very well thought out. > Nice work! Thank you. > Once you get agreement on the functionality, name bike-shedding will likely be next. In a way, all classes are data classes so that name doesn't tell me much. Instead, it would be nice to have something suggestive of what it actually does which is automatically adding boilerplate methods to a general purpose class. Perhaps, @boilerplate or @autoinit or some such. There was some discussion on naming at https://github.com/ericvsmith/dataclasses/issues/12. In that issue, Guido said use Data Classes (the concept), dataclasses (the module) and dataclass (the decorator). I think if someone came up with an awesomely better name, we could all be convinced to use it. But so far, nothing's better. Eric. From eric at trueblade.com Mon Sep 11 08:32:12 2017 From: eric at trueblade.com (Eric V. Smith) Date: Mon, 11 Sep 2017 08:32:12 -0400 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> Message-ID: <0e981fde-672a-9fcf-5003-e23ab0f8f473@trueblade.com> On 9/10/17 11:08 PM, Nathaniel Smith wrote: > Hi Eric, > > A few quick comments: > > Why do you even have a hash= argument on individual fields? For the > whole class, I can imagine you might want to explicitly mark a whole > class as unhashable, but it seems like the only thing you can do with > the field-level hash= argument is to create a class where the __hash__ > and __eq__ take different fields into account, and why would you ever > want that? The use case is that you have a cache, or something similar, that doesn't affect the object identity. > Though honestly I can see a reasonable argument for removing the > class-level hash= option too. And even if you keep it you might want to > error on some truly nonsensical options like defining __hash__ without > __eq__. (Also watch out that Python's usual rule about defining __eq__ > blocking the inheritance of __hash__ does not kick in if __eq__ is added > after the class is created.) > > I've sometimes wished that attrs let me control whether it generated > equality methods (eq/ne/hash) separately from ordering methods > (lt/gt/...). Maybe the cmp= argument should take an enum with options > none/equality-only/full? Yeah, I've thought about this, too. But I don't have any use case in mind, and if it hasn't come up with attrs, then I'm reluctant to break new ground here. > The "why not attrs" section kind of reads like "because it's too popular > and useful"? I'll add some words to that section, probably focused on typing compatibility. My general feeling is that attrs has some great design decisions, but goes a little too far (e.g., conversions, validations). As with most things we add, I'm trying to be as minimalist as possible, while still being widely useful and allowing 3rd party extensions and future features. Eric. From eric at trueblade.com Mon Sep 11 09:51:11 2017 From: eric at trueblade.com (Eric V. 
Smith) Date: Mon, 11 Sep 2017 09:51:11 -0400
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <1214949192.4401.1505137425687@office.mailbox.org>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <0e981fde-672a-9fcf-5003-e23ab0f8f473@trueblade.com> <1214949192.4401.1505137425687@office.mailbox.org>
Message-ID: <608246a4-1891-c886-8952-e3b600ac063b@trueblade.com>

On 9/11/17 9:43 AM, tds333 at mailbox.org wrote:
> Hi Eric,
>
> I have one question not addressed yet.
> The implementation is based on "__annotations__" where the type is specified. But "__annotations__" is not always filled. An interpreter version with special optimization could remove all __annotations__ for performance reasons. (Discussed in other threads)
>
> In this case the dataclass does not work or will there be a fallback?
>
> I know it is a little bit hypothetical because an interpreter with this optimization is not there yet. I am only looking a bit into the future. Asking this because type annotations are stated as completely optional for Python. And this use case will break this assumption.

Yes, if there are no __annotations__, then Data Classes would break. typing.NamedTuple has the same issue. We discussed it a little bit last week, but I don't think we came to any conclusions.

Since @dataclass ignores the value of the annotation (except for typing.ClassVar), it would continue to work if the type was present, but maybe mapped to None or similar.

Eric.

From tds333 at mailbox.org  Mon Sep 11 09:43:45 2017
From: tds333 at mailbox.org (tds333 at mailbox.org)
Date: Mon, 11 Sep 2017 15:43:45 +0200 (CEST)
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <0e981fde-672a-9fcf-5003-e23ab0f8f473@trueblade.com>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <0e981fde-672a-9fcf-5003-e23ab0f8f473@trueblade.com>
Message-ID: <1214949192.4401.1505137425687@office.mailbox.org>

Hi Eric,

I have one question not addressed yet.
The implementation is based on "__annotations__" where the type is specified. But "__annotations__" is not always filled. An interpreter version with special optimization could remove all __annotations__ for performance reasons. (Discussed in other threads)

In this case the dataclass does not work or will there be a fallback?

I know it is a little bit hypothetical because an interpreter with this optimization is not there yet. I am only looking a bit into the future. Asking this because type annotations are stated as completely optional for Python. And this use case will break this assumption.

Personally I am a heavy user of attrs and happy to have a dataclass in the std lib.

Regards

Wolfgang

From victor.stinner at gmail.com  Mon Sep 11 10:56:21 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 11 Sep 2017 16:56:21 +0200
Subject: [Python-Dev] Fwd: Python programming language vulnerabilities
In-Reply-To:
References: <9573BEFB-B5EB-4814-8C8D-AF18C7DB8BD3@maurya.on.ca>
Message-ID:

2017-09-10 11:42 GMT+02:00 Skip Montanaro :
> I chair ISO/IEC/JTC1/SC22/WG23 Programming Language Vulnerabilities. We publish an international technical report, ISO IEC TR 24772 Guide to avoiding programming language vulnerabilities through language selection and use. Annex D in this document addresses vulnerabilities in Python. This document is freely available from ISO and IEC.

Does someone know where these documents can be found?
(URL) I can link them from my http://python-security.readthedocs.io/ documentation.

Victor

From guido at python.org  Mon Sep 11 11:44:36 2017
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Sep 2017 08:44:36 -0700
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To: <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org>
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org> <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org>
Message-ID:

On Sun, Sep 10, 2017 at 8:52 PM, Larry Hastings wrote:
>
> On 09/06/2017 02:13 PM, Ronald Oussoren wrote:
>
> To be honest this sounds like a fairly crude hack. Updating the __class__ of a module object feels dirty, but at least you get normal behavior w.r.t. properties.
>
> Okay. Obviously I disagree, I think it's reasonable. But I'll assume you're -1.

I'm still pondering this. I am far from decided about whether it's better to just add @property support, or whether we should add __getattr__ instead (it can do everything @property can do, plus other things, but the @property support would be a little clumsier), or everything that a class can do with attributes (e.g. __getattribute__, __dir__, __setattr__), or leave things alone (__class__ assignment can solve everything). I worry that in the end @property isn't general enough and the major use cases end up still having to use __class__ assignment, and then we'd have a fairly useless feature that we can't withdraw, ever.

> Why is there no mechanism to add new descriptors that can work in this context?
>
> I've updated the prototype to add one. I added it as "collections.abc.InstanceDescriptor"; that's a base class you can inherit from, and then your descriptor will work in a module. Bikeshedding the name is fine.

I don't understand the question, or the answer. (And finding the prototype is taking longer than writing this email.)

> BTW. The interaction with import is interesting... Module properties only work as naive users expect when accessing them as attributes of the module object, in particular importing the name using "from module import prop" would only call the property getter once and that may not be the intended behavior.
>
> I spent a fair amount of time thinking about this. The short answer is: we *could* fix it. We *could* make it so that "from x import y", when x.y is an instance descriptor, ensures that y honors the descriptor protocol when referenced. We'd have to do it for three contexts:
>
> - global scope (aka module scope)
> - class scope
> - function scope
>
> The first two are pretty similar; create a proxy object that retains the module instance so it remains "bound" to that. The third one is trickier; it'd mean a new bytecode (LOAD_FAST_DESCRIPTOR), but it wouldn't slow down people who didn't use it.
>
> Anyway, long story short, I think this would be worse than simply having "from x import y" only calling the getter once. As the Zen says: special cases aren't special enough to break the rules.

I agree -- let's please not fix this, it's way too complicated for something that started out as a nice localized tweak. When we write "from x import y" we already voluntarily give up seeing any later assignments to x.y. If this prevents a certain use case for lazy import, then so be it -- I don't think lazy import can ever be made so transparent that everything will work lazily without understanding the difference between lazy and eager import.
(People already have to understand many subtleties of import, e.g. the difference between import and loading, and how circular imports work.) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Sep 11 12:00:41 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Sep 2017 09:00:41 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <608246a4-1891-c886-8952-e3b600ac063b@trueblade.com> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <0e981fde-672a-9fcf-5003-e23ab0f8f473@trueblade.com> <1214949192.4401.1505137425687@office.mailbox.org> <608246a4-1891-c886-8952-e3b600ac063b@trueblade.com> Message-ID: On Mon, Sep 11, 2017 at 6:51 AM, Eric V. Smith wrote: > On 9/11/17 9:43 AM, tds333 at mailbox.org wrote: > >> Hi Eric, >> >> I have on question not addressed yet. >> The implementation is based on "__annotations__" where the type is >> specified. >> But "__annotations__" is not always filled. An interpreter version with >> special optimization >> could remove all __annotations__ for performance reasons. (Discussed in >> other threads) >> >> In this case the dataclass does not work or will there be a fallback? >> >> I know it is a little bit hypothetical because an interpreter with this >> optimization is >> not there yet. I am looking only in the future a bit. >> Asking this because type annotations are stated as completely optional >> for Python. >> And this use case will break this assumption. >> > > Yes, if there are no __annotations__, then Data Classes would break. > typing.NamedTuple has the same issue. We discussed it a little bit last > week, but I don't think we came to any conclusions. > > Since @dataclass ignores the value of the annotation (except for > typing.ClassVar), it would continue to work if the type was present, buy > maybe mapped to None or similar. > Let's not worry about a future where there's no __annotations__. Type annotations will gradually become more mainstream. You won't have to use them, but some newer features of the language will be inaccessible if you don't. This has already started with the pattern based on inheriting from typing.NamedTuple and using field annotations. Dataclasses are simply another example. (That said, I strongly oppose *runtime type checking* based on annotations. It's by and large a mistaken idea. But this has nothing to do with that.) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Sep 11 12:34:53 2017 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 11 Sep 2017 09:34:53 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <0e981fde-672a-9fcf-5003-e23ab0f8f473@trueblade.com> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <0e981fde-672a-9fcf-5003-e23ab0f8f473@trueblade.com> Message-ID: On Mon, Sep 11, 2017 at 5:32 AM, Eric V. Smith wrote: > On 9/10/17 11:08 PM, Nathaniel Smith wrote: >> >> Hi Eric, >> >> A few quick comments: >> >> Why do you even have a hash= argument on individual fields? 
>> For the whole class, I can imagine you might want to explicitly mark a
>> whole class as unhashable, but it seems like the only thing you can do
>> with the field-level hash= argument is to create a class where the
>> __hash__ and __eq__ take different fields into account, and why would
>> you ever want that?
>
> The use case is that you have a cache, or something similar, that doesn't
> affect the object identity.

But wouldn't this just be field(cmp=False), no need to fiddle with hash=?

>> Though honestly I can see a reasonable argument for removing the
>> class-level hash= option too. And even if you keep it you might want to
>> error on some truly nonsensical options like defining __hash__ without
>> __eq__. (Also watch out that Python's usual rule about defining __eq__
>> blocking the inheritance of __hash__ does not kick in if __eq__ is added
>> after the class is created.)
>>
>> I've sometimes wished that attrs let me control whether it generated
>> equality methods (eq/ne/hash) separately from ordering methods
>> (lt/gt/...). Maybe the cmp= argument should take an enum with options
>> none/equality-only/full?
>
> Yeah, I've thought about this, too. But I don't have any use case in mind,
> and if it hasn't come up with attrs, then I'm reluctant to break new ground
> here.

https://github.com/python-attrs/attrs/issues/170

From that thread, it feels like part of the problem is that it's
awkward to encode this using only boolean arguments, but they're sort
of stuck with that for backcompat with how they originally defined
cmp=, hence my suggestion to consider making it an enum from the start
:-).

>> The "why not attrs" section kind of reads like "because it's too popular
>> and useful"?
>
> I'll add some words to that section, probably focused on typing
> compatibility. My general feeling is that attrs has some great design
> decisions, but goes a little too far (e.g., conversions, validations). As
> with most things we add, I'm trying to be as minimalist as possible, while
> still being widely useful and allowing 3rd party extensions and future
> features.

If the question is "given that we're going to add something to the
stdlib, why shouldn't that thing be attrs?" then I guess it's
sufficient to say "because the attrs developers didn't want it". But I
think the PEP should also address the question "why are we adding
something to the stdlib, instead of just recommending people install
attrs".

-n

From chris.barker at noaa.gov Mon Sep 11 13:00:15 2017
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 11 Sep 2017 10:00:15 -0700
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To:
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>
Message-ID:

On Thu, Sep 7, 2017 at 3:49 PM, Larry Hastings wrote:

> But I can cite at least one place in the standard library that would have
> been better if it'd been implemented as a module property:
> os.stat_float_times().

I wish there were a property feature available almost every time I encounter
a "get*" method in the stdlib (or anywhere):

There are a heck of a lot in the os module:

In [4]: [s for s in dir(os) if s.startswith('get')]
Out[4]:
['get_blocking',
 'get_exec_path',
 'get_inheritable',
 'get_terminal_size',
 'getcwd',
 'getcwdb',
 'getegid',
 'getenv',
 'getenvb',
 'geteuid',
 'getgid',
 'getgrouplist',
 'getgroups',
 'getloadavg',
 'getlogin',
 'getpgid',
 'getpgrp',
 'getpid',
 'getppid',
 'getpriority',
 'getsid',
 'getuid']

Many of those may be good use-cases for getters, but still...
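(For illustration, here's a minimal sketch of the __class__-assignment
workaround that already works today -- the module and attribute names are
made up:

    # in mymodule.py
    import sys
    import types

    class _Module(types.ModuleType):
        @property
        def config_vars(self):
            # recomputed on every access; no getter function needed
            return {'prefix': sys.prefix}

    # swapping the module's class makes the property live
    sys.modules[__name__].__class__ = _Module

After that, "import mymodule; mymodule.config_vars" goes through the
property getter.)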
And just yesterday I was getting annoyed by some in sysconfig:

In [6]: [s for s in dir(sysconfig) if s.startswith('get')]
Out[6]:
['get_config_h_filename',
 'get_config_var',
 'get_config_vars',
 'get_makefile_filename',
 'get_path',
 'get_path_names',
 'get_paths',
 'get_platform',
 'get_python_version',
 'get_scheme_names']

modules serve the very useful function of a global singleton -- if we think
properties are a good idea for classes, then they are a good idea for
modules...

-CHB

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From victor.stinner at gmail.com Mon Sep 11 13:20:44 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 11 Sep 2017 19:20:44 +0200
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To:
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>
Message-ID:

2017-09-11 19:00 GMT+02:00 Chris Barker :
> I wish there were a property feature available almost every time I encounter
> a "get*" method in the stdlib (or anywhere):
>
> There are a heck of a lot in the os module:
>
> In [4]: [s for s in dir(os) if s.startswith('get')]
> Out[4]:
>
> ['get_blocking',

This one is not a good example: it takes a parameter. You cannot
convert it to a property.

> 'getcwd',

> And just yesterday I was getting annoyed by some in sysconfig:
>
> In [6]: [s for s in dir(sysconfig) if s.startswith('get')]
> Out[6]:
>
> ['get_config_h_filename',
> 'get_config_var',
> 'get_config_vars',
> 'get_makefile_filename',
> 'get_path',
> 'get_path_names',
> 'get_paths',
> 'get_platform',
> 'get_python_version',
> 'get_scheme_names']

When designing an API, when I have to choose between property and
function/method, I prefer function/method over a property when the
code is slow, especially at the first call.

I prefer to "warn" users that a call (like the first one, which fills a
cache) can be slow.

Well, it's not a strong rule, sometimes I use a property even if the
first call has to fill a cache :-)

Here the sysconfig has to build an internal cache at the first call
... if I recall correctly.
If we do get properties on modules, then there is plenty of room to discuss
the best API for a given functionality. And chances are, we won't gratuitously
re-factor the standard library.

But your point is well taken, and makes my point in a way -- if we had
properties, then there would be a getter function only if there was a good
reason for it. As it stands, I have no idea if calling, for instance,
sysconfig.get_config_vars is an expensive operation.

-CHB

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From barry at python.org Mon Sep 11 18:00:05 2017
From: barry at python.org (Barry Warsaw)
Date: Mon, 11 Sep 2017 15:00:05 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To:
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com>
Message-ID: <3318395E-FE76-4BB1-A693-DC0229A7A34C@python.org>

On Sep 10, 2017, at 20:08, Nathaniel Smith wrote:
>
> I've sometimes wished that attrs let me control whether it generated
> equality methods (eq/ne/hash) separately from ordering methods (lt/gt/...).
> Maybe the cmp= argument should take an enum with options
> none/equality-only/full?

I have had use cases where I needed equality comparisons but not ordered
comparisons, so I'm in favor of the option to split them. (atm, I can't
bring up a specific case, but it's not uncommon.)

Given that you only want to support the three states that Nathaniel
describes, I think an enum makes the most sense, and it certainly would
read well. I.e. there's no sense in supporting the ordered comparisons and
not equality, so that's not a state that needs to be represented.

I'd make one other suggestion here: please let's not call the keyword
`cmp`. That's reminiscent of Python 2's `cmp` built-in, which of course
doesn't exist in Python 3. Using `cmp` is just an unnecessarily obfuscating
abbreviation. I'd suggest just `compare` with an enum like so:

    class Compare(enum.Enum):
        none = 1
        unordered = 2
        ordered = 3

One thing I can't avoid is DRY:

    from dataclasses import Compare, dataclass

    @dataclass(compare=Compare.unordered)
    class Foo:
        # ...

Maybe exposing the enum items in the module namespace?

-----dataclasses/__init__.py-----
from enum import Enum

class Compare(Enum):
    none = 1
    unordered = 2
    ordered = 3

none = Compare.none
unordered = Compare.unordered
ordered = Compare.ordered
-----dataclasses/__init__.py-----

    from dataclasses import dataclass, unordered

    @dataclass(compare=unordered)
    class Foo:
        # ...
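For concreteness, a rough sketch of how the decorator could branch on such
a flag (entirely hypothetical -- the hand-rolled field logic here stands in
for whatever the real implementation would generate):

    from enum import Enum

    class Compare(Enum):
        none = 1
        unordered = 2
        ordered = 3

    def dataclass(*, compare=Compare.unordered):
        def wrap(cls):
            names = list(cls.__annotations__)  # fields, in definition order
            def astuple(self):
                return tuple(getattr(self, n) for n in names)
            if compare is not Compare.none:    # enum members compare with `is`
                cls.__eq__ = lambda a, b: astuple(a) == astuple(b)
            if compare is Compare.ordered:
                cls.__lt__ = lambda a, b: astuple(a) < astuple(b)
            return cls
        return wrap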
Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 273 bytes
Desc: Message signed with OpenPGP
URL:

From ethan at stoneleaf.us Mon Sep 11 18:16:57 2017
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 11 Sep 2017 15:16:57 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <3318395E-FE76-4BB1-A693-DC0229A7A34C@python.org>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3318395E-FE76-4BB1-A693-DC0229A7A34C@python.org>
Message-ID: <59B70B59.3030900@stoneleaf.us>

On 09/11/2017 03:00 PM, Barry Warsaw wrote:
> On Sep 10, 2017, at 20:08, Nathaniel Smith wrote:
>>
>> I've sometimes wished that attrs let me control whether it generated
>> equality methods (eq/ne/hash) separately from ordering methods (lt/gt/...).
>> Maybe the cmp= argument should take an enum with options
>> none/equality-only/full?
>
> I have had use cases where I needed equality comparisons but not ordered
> comparisons, so I'm in favor of the option to split them. (atm, I can't
> bring up a specific case, but it's not uncommon.)
>
> Given that you only want to support the three states that Nathaniel
> describes, I think an enum makes the most sense, and it certainly would
> read well. I.e. there's no sense in supporting the ordered comparisons and
> not equality, so that's not a state that needs to be represented.
>
> I'd make one other suggestion here: please let's not call the keyword
> `cmp`. That's reminiscent of Python 2's `cmp` built-in, which of course
> doesn't exist in Python 3. Using `cmp` is just an unnecessarily obfuscating
> abbreviation. I'd suggest just `compare` with an enum like so:
>
> class Compare(enum.Enum):
>     none = 1
>     unordered = 2
>     ordered = 3

I like the enum idea (surprise! ;) but I would suggest "equal" or
"equivalent" instead of "unordered"; better to say what they are rather
than what they are not.

--
~Ethan~

From guido at python.org Mon Sep 11 18:28:02 2017
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Sep 2017 15:28:02 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <59B70B59.3030900@stoneleaf.us>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3318395E-FE76-4BB1-A693-DC0229A7A34C@python.org> <59B70B59.3030900@stoneleaf.us>
Message-ID:

Oddly I don't like the enum (flag names get too long that way), but I do
agree with everything else Barry said (it should be a trivalue flag and
please don't name it cmp).

On Mon, Sep 11, 2017 at 3:16 PM, Ethan Furman wrote:

> On 09/11/2017 03:00 PM, Barry Warsaw wrote:
>
>> On Sep 10, 2017, at 20:08, Nathaniel Smith wrote:
>>
>>> I've sometimes wished that attrs let me control whether it generated
>>> equality methods (eq/ne/hash) separately from ordering methods (lt/gt/...).
>>> Maybe the cmp= argument should take an enum with options
>>> none/equality-only/full?
>>
>> I have had use cases where I needed equality comparisons but not ordered
>> comparisons, so I'm in favor of the option to split them. (atm, I can't
>> bring up a specific case, but it's not uncommon.)
>>
>> Given that you only want to support the three states that Nathaniel
>> describes, I think an enum makes the most sense, and it certainly would
>> read well. I.e. there's no sense in supporting the ordered comparisons and
>> not equality, so that's not a state that needs to be represented.
>>
>> I'd make one other suggestion here: please let's not call the keyword
>> `cmp`. That's reminiscent of Python 2's `cmp` built-in, which of course
>> doesn't exist in Python 3. Using `cmp` is just an unnecessarily
>> obfuscating abbreviation. I'd suggest just `compare` with an enum like so:
>>
>> class Compare(enum.Enum):
>>     none = 1
>>     unordered = 2
>>     ordered = 3
>
> I like the enum idea (surprise! ;) but I would suggest "equal" or
> "equivalent" instead of "unordered"; better to say what they are
> rather than what they are not.
>
> --
> ~Ethan~
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From barry at python.org Mon Sep 11 18:39:20 2017
From: barry at python.org (Barry Warsaw)
Date: Mon, 11 Sep 2017 15:39:20 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <59B70B59.3030900@stoneleaf.us>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3318395E-FE76-4BB1-A693-DC0229A7A34C@python.org> <59B70B59.3030900@stoneleaf.us>
Message-ID:

On Sep 11, 2017, at 15:16, Ethan Furman wrote:
>>
>> class Compare(enum.Enum):
>>     none = 1
>>     unordered = 2
>>     ordered = 3
>
> I like the enum idea (surprise! ;) but I would suggest "equal" or
> "equivalent" instead of "unordered"; better to say what they are rather
> than what they are not.

The language reference calls them "equality" comparisons, so maybe that's
the term we should use.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 273 bytes
Desc: Message signed with OpenPGP
URL:

From python-dev at mgmiller.net Mon Sep 11 18:53:16 2017
From: python-dev at mgmiller.net (Mike Miller)
Date: Mon, 11 Sep 2017 15:53:16 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To:
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com>
Message-ID: <526feace-34af-a0e2-2933-fbd99b997fa4@mgmiller.net>

On 2017-09-10 23:38, Michel Desmoulin wrote:
>> I'm going to put some words in explaining why I don't want to use base
>> classes (I don't think it buys you anything). Do you have a reason for
>> preferring base classes?
>
> Not preferring, but having it as an alternative. Mainly for 2 reasons:
>
> 1 - data classes allow one to type in classes very quickly, let's
> harvest the benefit from that.
>
> Typing a decorator in a shell is much less comfortable than using
> inheritance. Same thing about IDEs: all current ones have snippets with
> auto-switch to the class parents on tab.
>
> All in all, if you are doing exploratory programming, and thus
> disposable code, which data classes are fantastic for, inheritance will
> keep you in the flow.
>
> 2 - it will help sell the data classes
>
> I train a lot of people in Python each year. I never have to explain
> classes to people with any kind of programming background. I _always_
> have to explain decorators.
>
> People are not used to them, and even kind of fear them for quite some time.
>
> Inheritance however, is familiar, and will not only push people to use
> data classes more, but also will let them make fewer mistakes: they know
> the danger of parent ordering, but not the ones of decorator ordering.

I also feel inheritance is more intuitive (and easily entered), but missed
the reason it wasn't chosen. ;)

-Mike

From python-ideas at mgmiller.net Mon Sep 11 18:47:16 2017
From: python-ideas at mgmiller.net (Mike Miller)
Date: Mon, 11 Sep 2017 15:47:16 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To:
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <0da1ba37-ff98-5394-ba55-f59855048ff3@trueblade.com> <132c15ba-ceb6-4731-2708-5b43ed863e7e@mgmiller.net>
Message-ID:

On 2017-09-10 18:20, Eric V. Smith wrote:
> I'll think about adding some more language to the PEP about it.

Agreed, was only looking for mention or example that a specific type was
not required. Perhaps even dusting off the word "annotation" to describe
them.

From python-dev at mgmiller.net Mon Sep 11 19:51:52 2017
From: python-dev at mgmiller.net (Mike Miller)
Date: Mon, 11 Sep 2017 16:51:52 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com> <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com>
Message-ID: <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net>

On 2017-09-11 05:26, Eric V. Smith wrote:
> On 9/10/17 10:27 PM, Raymond Hettinger wrote:

I've typically used these types of objects as records. When in an
irreverent mood I've called them bags. The short name is helpful as they
get used all over the place. I'll add Nick's "declarative" as it describes
the problem well from another angle:

- record
- bag
- declarative

Anyone like these? I find them more intuitive than the existing name.

Also, considering their uses, it might make sense to put them in the
collections module.

-Mike

From barry at python.org Mon Sep 11 20:00:41 2017
From: barry at python.org (Barry Warsaw)
Date: Mon, 11 Sep 2017 17:00:41 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com> <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com> <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net>
Message-ID: <77F0B396-FA1E-46DB-A287-DB233575C43B@python.org>

On Sep 11, 2017, at 16:51, Mike Miller wrote:
>
> On 2017-09-11 05:26, Eric V. Smith wrote:
>> On 9/10/17 10:27 PM, Raymond Hettinger wrote:
>
> I've typically used these types of objects as records. When in an
> irreverent mood I've called them bags.

I've used "bag" too, but I'm not sure that's descriptive enough for the
stdlib.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 273 bytes
Desc: Message signed with OpenPGP
URL:

From python-dev at mgmiller.net Mon Sep 11 20:08:26 2017
From: python-dev at mgmiller.net (Mike Miller)
Date: Mon, 11 Sep 2017 17:08:26 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <77F0B396-FA1E-46DB-A287-DB233575C43B@python.org>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com> <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com> <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net> <77F0B396-FA1E-46DB-A287-DB233575C43B@python.org>
Message-ID: <4d1cc199-4a2d-426a-efb5-4bb219b18541@mgmiller.net>

On 2017-09-11 17:00, Barry Warsaw wrote:
>
> I've used "bag" too, but I'm not sure that's descriptive enough for the
> stdlib.

Well, "the Pythons" were irreverent, were they not? ;)

-Mike

From barry at python.org Mon Sep 11 20:26:12 2017
From: barry at python.org (Barry Warsaw)
Date: Mon, 11 Sep 2017 17:26:12 -0700
Subject: [Python-Dev] breakpoint() and $PYTHONBREAKPOINT
In-Reply-To:
References:
Message-ID: <4BFB28D1-1511-4E18-AECD-796472613643@python.org>

On Sep 10, 2017, at 12:16, Guido van Rossum wrote:
>
> I think programmatic overrides should be able to decide for themselves if
> they want to honor PYTHONBREAKPOINT or not, since I can imagine use cases
> that go both ways. So it should be checked in sys.breakpointhook().

Thanks Guido, I'll implement it this way in my PR and update the PEP.

Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL:

From barry at python.org Mon Sep 11 20:27:13 2017
From: barry at python.org (Barry Warsaw)
Date: Mon, 11 Sep 2017 17:27:13 -0700
Subject: [Python-Dev] breakpoint() and $PYTHONBREAKPOINT
In-Reply-To:
References:
Message-ID: <4F91F6FA-016C-4039-8765-32C806AB041D@python.org>

On Sep 10, 2017, at 13:46, Nathaniel Smith wrote:
>
> On Sun, Sep 10, 2017 at 12:06 PM, Barry Warsaw wrote:
>> For PEP 553, I think it's a good idea to support the environment variable
>> $PYTHONBREAKPOINT[*] but I'm stuck on a design question, so I'd like to
>> get some feedback.
>>
>> Should $PYTHONBREAKPOINT be consulted in breakpoint() or in
>> sys.breakpointhook()?
>
> Wouldn't the usual pattern be to check $PYTHONBREAKPOINT once at
> startup, and if it's set use it to initialize sys.breakpointhook()?
> Compare to, say, $PYTHONPATH.

Perhaps, but what would be the visible effects of that? I.e. what would
that buy you?

Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL:

From larry at hastings.org Mon Sep 11 21:04:33 2017
From: larry at hastings.org (Larry Hastings)
Date: Mon, 11 Sep 2017 18:04:33 -0700
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To:
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org> <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org>
Message-ID:

On 09/11/2017 08:44 AM, Guido van Rossum wrote:
> I worry that in the end @property isn't general enough and the major
> use cases end up still having to use __class__ assignment, and then
> we'd have a fairly useless feature that we can't withdraw, ever.
What can I say--I don't have that worry ;-)

As previously mentioned in this thread, I counted up uses of property,
__getattr__, and __getattribute__ in 3.7/Lib. I grepped for the following
strings, ignored pydoc_data/topics.py, and got these totals:

    "@property"            375 hits
    "def __getattr__"       28 hits
    "def __getattribute__("  2 hits

@property seems pretty popular.

>> Why is there no mechanism to add new descriptors that can work in
>> this context?
>
> I've updated the prototype to add one. I added it as
> "collections.abc.InstanceDescriptor"; that's a base class you can
> inherit from, and then your descriptor will work in a module.
> Bikeshedding the name is fine.
>
> I don't understand the question, or the answer. (And finding the
> prototype is taking longer than writing this email.)

Ronald was basically asking: what about user classes? The first revision
of the prototype didn't provide a way to write your own instance
descriptors. The only thing that could be an instance descriptor was
property. So, I updated the prototype and added
collections.abc.InstanceDescriptor, a base class user classes can inherit
from that lets them be instance descriptors.

The prototype is linked to from the PEP; for your convenience here's a
link:

https://github.com/larryhastings/cpython/tree/module-properties

//arry/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From njs at pobox.com Mon Sep 11 21:15:06 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 11 Sep 2017 18:15:06 -0700
Subject: [Python-Dev] breakpoint() and $PYTHONBREAKPOINT
In-Reply-To: <4F91F6FA-016C-4039-8765-32C806AB041D@python.org>
References: <4F91F6FA-016C-4039-8765-32C806AB041D@python.org>
Message-ID:

On Mon, Sep 11, 2017 at 5:27 PM, Barry Warsaw wrote:
> On Sep 10, 2017, at 13:46, Nathaniel Smith wrote:
>>
>> On Sun, Sep 10, 2017 at 12:06 PM, Barry Warsaw wrote:
>>> For PEP 553, I think it's a good idea to support the environment variable
>>> $PYTHONBREAKPOINT[*] but I'm stuck on a design question, so I'd like to
>>> get some feedback.
>>>
>>> Should $PYTHONBREAKPOINT be consulted in breakpoint() or in
>>> sys.breakpointhook()?
>>
>> Wouldn't the usual pattern be to check $PYTHONBREAKPOINT once at
>> startup, and if it's set use it to initialize sys.breakpointhook()?
>> Compare to, say, $PYTHONPATH.
>
> Perhaps, but what would be the visible effects of that? I.e. what would
> that buy you?

Why is this case special enough to break the rules?

Compared to checking it on each call to sys.breakpointhook(), I guess
the two user-visible differences in behavior would be:

- whether mutating os.environ["PYTHONBREAKPOINT"] inside the process
affects future calls. I would find it quite surprising if it did;
generally when we mutate envvars like os.environ["PYTHONPATH"] it's a
way to set things up for future child processes and doesn't affect our
process.

- whether the import happens immediately at startup, or is delayed
until the first call to breakpoint(). If it's imported once at
startup, then this adds overhead to programs that set PYTHONBREAKPOINT
but don't use it, and if the envvar is set to some nonsense then you
get an error immediately instead of at the first call to breakpoint().

These both seem like fairly minor differences to me, but maybe saving
that 30 ms or whatever of startup time is an important enough
optimization to justify the special case?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From eric at trueblade.com Mon Sep 11 21:27:41 2017
From: eric at trueblade.com (Eric V. Smith)
Date: Mon, 11 Sep 2017 21:27:41 -0400
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To:
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <0e981fde-672a-9fcf-5003-e23ab0f8f473@trueblade.com>
Message-ID:

On 9/11/2017 12:34 PM, Nathaniel Smith wrote:
> On Mon, Sep 11, 2017 at 5:32 AM, Eric V. Smith wrote:
>> On 9/10/17 11:08 PM, Nathaniel Smith wrote:
>>> Why do you even have a hash= argument on individual fields? For the
>>> whole class, I can imagine you might want to explicitly mark a whole
>>> class as unhashable, but it seems like the only thing you can do with
>>> the field-level hash= argument is to create a class where the __hash__
>>> and __eq__ take different fields into account, and why would you ever
>>> want that?
>>
>> The use case is that you have a cache, or something similar, that doesn't
>> affect the object identity.
>
> But wouldn't this just be field(cmp=False), no need to fiddle with hash=?

Ah, true. You're right, I can't see any good use for setting hash on a
field that isn't already controlled by cmp. I think field level hash can
go.

>>> Though honestly I can see a reasonable argument for removing the
>>> class-level hash= option too. And even if you keep it you might want to
>>> error on some truly nonsensical options like defining __hash__ without
>>> __eq__. (Also watch out that Python's usual rule about defining __eq__
>>> blocking the inheritance of __hash__ does not kick in if __eq__ is added
>>> after the class is created.)

At the class level, I think it makes more sense. But I'll write up some
motivating examples.

>>> I've sometimes wished that attrs let me control whether it generated
>>> equality methods (eq/ne/hash) separately from ordering methods
>>> (lt/gt/...). Maybe the cmp= argument should take an enum with options
>>> none/equality-only/full?
>>
>> Yeah, I've thought about this, too. But I don't have any use case in mind,
>> and if it hasn't come up with attrs, then I'm reluctant to break new ground
>> here.
>
> https://github.com/python-attrs/attrs/issues/170
>
> From that thread, it feels like part of the problem is that it's
> awkward to encode this using only boolean arguments, but they're sort
> of stuck with that for backcompat with how they originally defined
> cmp=, hence my suggestion to consider making it an enum from the start
> :-).

I'll respond to other emails about this, probably tomorrow.

Eric.

>>> The "why not attrs" section kind of reads like "because it's too popular
>>> and useful"?
>>
>> I'll add some words to that section, probably focused on typing
>> compatibility. My general feeling is that attrs has some great design
>> decisions, but goes a little too far (e.g., conversions, validations). As
>> with most things we add, I'm trying to be as minimalist as possible, while
>> still being widely useful and allowing 3rd party extensions and future
>> features.
>
> If the question is "given that we're going to add something to the
> stdlib, why shouldn't that thing be attrs?" then I guess it's
> sufficient to say "because the attrs developers didn't want it". But I
> think the PEP should also address the question "why are we adding
> something to the stdlib, instead of just recommending people install
> attrs".
>
> -n
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/eric%2Ba-python-dev%40trueblade.com

From eric at trueblade.com Mon Sep 11 21:36:21 2017
From: eric at trueblade.com (Eric V. Smith)
Date: Mon, 11 Sep 2017 21:36:21 -0400
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To:
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3318395E-FE76-4BB1-A693-DC0229A7A34C@python.org> <59B70B59.3030900@stoneleaf.us>
Message-ID:

On 9/11/2017 6:28 PM, Guido van Rossum wrote:
> Oddly I don't like the enum (flag names get too long that way), but I do
> agree with everything else Barry said (it should be a trivalue flag and
> please don't name it cmp).

So if we don't do enums, I think the choices are ints, strs, or maybe
True/False/None. Do you have a preference here?

If int or str, I assume we'd want module-level constants.

I like the name compare=, and 3 values makes sense: None, Equality,
Ordered.

Eric.

>
> On Mon, Sep 11, 2017 at 3:16 PM, Ethan Furman wrote:
>
>     On 09/11/2017 03:00 PM, Barry Warsaw wrote:
>
>         On Sep 10, 2017, at 20:08, Nathaniel Smith wrote:
>
>             I've sometimes wished that attrs let me control whether it
>             generated equality methods (eq/ne/hash) separately from
>             ordering methods (lt/gt/...). Maybe the cmp= argument should
>             take an enum with options none/equality-only/full?
>
>         I have had use cases where I needed equality comparisons but not
>         ordered comparisons, so I'm in favor of the option to split
>         them. (atm, I can't bring up a specific case, but it's not
>         uncommon.)
>
>         Given that you only want to support the three states that
>         Nathaniel describes, I think an enum makes the most sense, and
>         it certainly would read well. I.e. there's no sense in
>         supporting the ordered comparisons and not equality, so that's
>         not a state that needs to be represented.
>
>         I'd make one other suggestion here: please let's not call the
>         keyword `cmp`. That's reminiscent of Python 2's `cmp` built-in,
>         which of course doesn't exist in Python 3. Using `cmp` is just
>         an unnecessarily obfuscating abbreviation. I'd suggest just
>         `compare` with an enum like so:
>
>         class Compare(enum.Enum):
>             none = 1
>             unordered = 2
>             ordered = 3
>
>     I like the enum idea (surprise! ;) but I would suggest "equal" or
>     "equivalent" instead of "unordered"; better to say what they are
>     rather than what they are not.
>
>     --
>     ~Ethan~
>
>     _______________________________________________
>     Python-Dev mailing list
>     Python-Dev at python.org
>     https://mail.python.org/mailman/listinfo/python-dev
>
>     Unsubscribe:
>     https://mail.python.org/mailman/options/python-dev/guido%40python.org
>
> -- 
> --Guido van Rossum (python.org/~guido)

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From barry at python.org Mon Sep 11 21:45:36 2017
From: barry at python.org (Barry Warsaw)
Date: Mon, 11 Sep 2017 18:45:36 -0700
Subject: [Python-Dev] breakpoint() and $PYTHONBREAKPOINT
In-Reply-To:
References: <4F91F6FA-016C-4039-8765-32C806AB041D@python.org>
Message-ID: <40802A23-6B87-42A5-9724-3E70F4127D1A@python.org>

On Sep 11, 2017, at 18:15, Nathaniel Smith wrote:

> Compared to checking it on each call to sys.breakpointhook(), I guess
> the two user-visible differences in behavior would be:
>
> - whether mutating os.environ["PYTHONBREAKPOINT"] inside the process
> affects future calls. I would find it quite surprising if it did;

Maybe, but the effect would be essentially the same as setting
sys.breakpointhook during the execution of your program.

> - whether the import happens immediately at startup, or is delayed
> until the first call to breakpoint().

I definitely think it should be delayed until first use. You might never
hit the breakpoint() in which case you wouldn't want to pay any penalty
for even checking the environment variable. And once you *do* hit the
breakpoint(), you aren't caring about performance then anyway.

Both points could be addressed by caching the import after the first
lookup, but even there I don't think it's worth it. I'd invoke KISS and
just look it up anew each time. That way there's no global/static state
to worry about, and the feature is as flexible as it can be.

> If it's imported once at
> startup, then this adds overhead to programs that set PYTHONBREAKPOINT
> but don't use it, and if the envvar is set to some nonsense then you
> get an error immediately instead of at the first call to breakpoint().

That's one case I can see being useful; report an error immediately upon
startup rather than when you hit breakpoint(). But even there, I'm not
sure. What if you've got some kind of dynamic debugger discovery thing
going on, such that the callable isn't available until its first use?

> These both seem like fairly minor differences to me, but maybe saving
> that 30 ms or whatever of startup time is an important enough
> optimization to justify the special case?

I'm probably too steeped in the implementation, but it feels to me like
not just loading the envar callable on demand makes reasoning about and
using this more complicated. I think for most uses though, it won't
really matter.
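Concretely, I'm picturing the default hook doing roughly this on every
call (a pure-Python sketch only; the real thing would need error
handling, and the no-dot-means-builtins rule here is just one plausible
choice):

    import importlib
    import os
    import pdb

    def breakpointhook():
        hookname = os.environ.get('PYTHONBREAKPOINT')
        if not hookname:          # unset or empty: fall back to pdb
            return pdb.set_trace()
        modname, dot, funcname = hookname.rpartition('.')
        if not dot:               # bare name: look it up in builtins
            modname, funcname = 'builtins', hookname
        hook = getattr(importlib.import_module(modname), funcname)
        return hook()

Nothing is cached, so changing the environment variable mid-run just
works.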
Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL:

From barry at python.org Mon Sep 11 21:49:03 2017
From: barry at python.org (Barry Warsaw)
Date: Mon, 11 Sep 2017 18:49:03 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To:
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3318395E-FE76-4BB1-A693-DC0229A7A34C@python.org> <59B70B59.3030900@stoneleaf.us>
Message-ID:

On Sep 11, 2017, at 18:36, Eric V. Smith wrote:

> So if we don't do enums, I think the choices are ints, strs, or maybe
> True/False/None. Do you have a preference here?
>
> If int or str, I assume we'd want module-level constants.
>
> I like the name compare=, and 3 values makes sense: None, Equality,
> Ordered.

+1 for the name, the 3 values, and making them module constants. After
that, I don't think it really matters what their implementation is. User
code will look the same either way. One minor nice effect of using an enum
is that the dataclass function can use `is` instead of `==` to compare
keyword argument values.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL:

From ethan at stoneleaf.us Mon Sep 11 21:54:21 2017
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 11 Sep 2017 18:54:21 -0700
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To:
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>
Message-ID: <59B73E4D.7000808@stoneleaf.us>

On 09/06/2017 08:26 AM, Guido van Rossum wrote:

> So we've seen a real use case for __class__ assignment: deprecating things
> on access. That use case could also be solved if modules natively supported
> defining __getattr__ (with the same "only used if attribute not found
> otherwise" semantics as it has on classes), but it couldn't be solved using
> @property (or at least it would be quite hacky).
>
> Is there a real use case for @property? Otherwise, if we're going to mess
> with module's getattro, it makes more sense to add __getattr__, which would
> have made Nathaniel's use case somewhat simpler. (Except for the __dir__
> thing -- what else might we need?)

Doesn't assigning a module's __class__ make it so we can already write
properties, descriptors, __getattr__, etc.? Are these other things so
common that we need /more/ syntax for them?

--
~Ethan~

From stephen.michell at maurya.on.ca Mon Sep 11 21:58:39 2017
From: stephen.michell at maurya.on.ca (Stephen Michell)
Date: Mon, 11 Sep 2017 21:58:39 -0400
Subject: [Python-Dev] Python vulnerabilities
Message-ID: <5F0506F2-231E-4972-B14F-4D393DAB46E3@maurya.on.ca>

I am new to this list. Skip suggested that I join.

I convene ISO/IEC/JTC1/SC22/WG23 Programming Languages Working Group. We
produce a suite of international technical reports that document
vulnerabilities in programming that can lead to serious safety and security
breaches. We published TR 24772 "Guidance to avoiding programming language
vulnerabilities through language selection and use" in 2010 and again in
2013. Edition one was a language independent look at such vulnerabilities.
Edition two added new vulnerabilities plus language specific annexes for
Ada, C, Python, PHP, Ruby, and Spark. For this round, we have split the
document into parts and are publishing the language specific parts
separately.

We have added a few new vulnerabilities, mostly associated with concurrency
and object orientation for this iteration. We target the team lead that
guides and writes coding standards for an organization, as opposed to the
general programmer.

We plan to ballot and publish in 2018 TR 24772-1, the language independent
Part, as well as -2 Ada, -3 C, -4 Python and -8 Fortran.

Our Python Part needs completion to address the new vulnerabilities
documented. We want to do justice to all languages that we work with. We
need experts to help us complete the document, and then to review it. I
have had initial conversations with one expert. We hope for a bit more if
possible.

If interested, please contact me as listed below. Our document list is at
www.open-std.org/JTC1/sc22/wg23.

Thank you.

Stephen Michell
Maurya Software
stephen dot michell at maurya dot on dot ca
Phone: 1-613-299-9047
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From njs at pobox.com Mon Sep 11 22:13:23 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 11 Sep 2017 19:13:23 -0700
Subject: [Python-Dev] breakpoint() and $PYTHONBREAKPOINT
In-Reply-To: <40802A23-6B87-42A5-9724-3E70F4127D1A@python.org>
References: <4F91F6FA-016C-4039-8765-32C806AB041D@python.org> <40802A23-6B87-42A5-9724-3E70F4127D1A@python.org>
Message-ID:

On Mon, Sep 11, 2017 at 6:45 PM, Barry Warsaw wrote:
> On Sep 11, 2017, at 18:15, Nathaniel Smith wrote:
>
>> Compared to checking it on each call to sys.breakpointhook(), I guess
>> the two user-visible differences in behavior would be:
>>
>> - whether mutating os.environ["PYTHONBREAKPOINT"] inside the process
>> affects future calls. I would find it quite surprising if it did;
>
> Maybe, but the effect would be essentially the same as setting
> sys.breakpointhook during the execution of your program.
>
>> - whether the import happens immediately at startup, or is delayed
>> until the first call to breakpoint().
>
> I definitely think it should be delayed until first use. You might never
> hit the breakpoint() in which case you wouldn't want to pay any penalty
> for even checking the environment variable. And once you *do* hit the
> breakpoint(), you aren't caring about performance then anyway.
>
> Both points could be addressed by caching the import after the first
> lookup, but even there I don't think it's worth it. I'd invoke KISS and
> just look it up anew each time. That way there's no global/static state
> to worry about, and the feature is as flexible as it can be.
>
>> If it's imported once at
>> startup, then this adds overhead to programs that set PYTHONBREAKPOINT
>> but don't use it, and if the envvar is set to some nonsense then you
>> get an error immediately instead of at the first call to breakpoint().
>
> That's one case I can see being useful; report an error immediately upon
> startup rather than when you hit breakpoint(). But even there, I'm not
> sure. What if you've got some kind of dynamic debugger discovery thing
> going on, such that the callable isn't available until its first use?

I don't think that's a big deal? Just do the discovery logic from inside
the callable itself, instead of using some kind of magical attribute
lookup hook.

>> These both seem like fairly minor differences to me, but maybe saving
>> that 30 ms or whatever of startup time is an important enough
>> optimization to justify the special case?
>
> I'm probably too steeped in the implementation, but it feels to me like
> not just loading the envar callable on demand makes reasoning about and
> using this more complicated. I think for most uses though, it won't
> really matter.

I don't mind breaking from convention if you have a good reason, I just
like it for PEPs to actually write down that reasoning :-)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From guido at python.org Mon Sep 11 22:16:39 2017
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Sep 2017 19:16:39 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To:
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3318395E-FE76-4BB1-A693-DC0229A7A34C@python.org> <59B70B59.3030900@stoneleaf.us>
Message-ID:

Or we could just have two arguments, eq= and order=, and
some rule so that you only need to specify one or the other but not both.
(E.g. order=True implies eq=True.) That seems better than needing new
constants just for this flag.

On Mon, Sep 11, 2017 at 6:49 PM, Barry Warsaw wrote:

> On Sep 11, 2017, at 18:36, Eric V. Smith wrote:
> > So if we don't do enums, I think the choices are ints, strs, or maybe
> True/False/None. Do you have a preference here?
> >
> > If int or str, I assume we'd want module-level constants.
> >
> > I like the name compare=, and 3 values makes sense: None, Equality,
> Ordered.
>
> +1 for the name, the 3 values, and making them module constants. After
> that, I don't think it really matters what their implementation is. User
> code will look the same either way. One minor nice effect of using an enum
> is that the dataclass function can use `is` instead of `==` to compare
> keyword argument values.
>
> -Barry
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/
> guido%40python.org

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guido at python.org Mon Sep 11 22:22:15 2017
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Sep 2017 19:22:15 -0700
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To:
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org> <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org>
Message-ID:

On Mon, Sep 11, 2017 at 6:04 PM, Larry Hastings wrote:

>
> On 09/11/2017 08:44 AM, Guido van Rossum wrote:
>
> I worry that in the end @property isn't general enough and the major use
> cases end up still having to use __class__ assignment, and then we'd have a
> fairly useless feature that we can't withdraw, ever.
>
> What can I say--I don't have that worry ;-)
>
> As previously mentioned in this thread, I counted up uses of property,
> __getattr__, and __getattribute__ in 3.7/Lib. I grepped for the following
> strings, ignored pydoc_data/topics.py, and got these totals:
>
> "@property"            375 hits
> "def __getattr__"       28 hits
> "def __getattribute__("  2 hits
>
> @property seems pretty popular.

I saw that the first time but it doesn't address my worry. (I don't feel
like explaining the worry again though, I feel I've done all I can.)

>> Why is there no mechanism to add new descriptors that can work in this
>> context?
>
> I've updated the prototype to add one.
> I added it as
> "collections.abc.InstanceDescriptor"; that's a base class you can
> inherit from, and then your descriptor will work in a module.
> Bikeshedding the name is fine.
>
> I don't understand the question, or the answer. (And finding the prototype
> is taking longer than writing this email.)
>
> Ronald was basically asking: what about user classes? The first revision
> of the prototype didn't provide a way to write your own instance
> descriptors. The only thing that could be an instance descriptor was
> property. So, I updated the prototype and added
> collections.abc.InstanceDescriptor, a base class user classes can inherit
> from that lets them be instance descriptors.

I still don't follow. How does one use InstanceDescriptor? Is this still
about modules, or is it a generalization of properties for non-module
instances? (If it is specific to modules, why doesn't it have "module" in
its name? Or if it's not, why is it in this discussion?)

> The prototype is linked to from the PEP; for your convenience here's a
> link:
>
> https://github.com/larryhastings/cpython/tree/module-properties

I found that link in the PEP, but it's just a branch of a fork of cpython.
It would be easier to review the prototype as a PR to upstream cpython.

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From aaronchall at yahoo.com Mon Sep 11 22:17:45 2017
From: aaronchall at yahoo.com (Aaron Hall)
Date: Tue, 12 Sep 2017 02:17:45 +0000 (UTC)
Subject: [Python-Dev] Aaron Hall, Introduction and pull-request bump
References: <133215058.29095.1505182665743.ref@mail.yahoo.com>
Message-ID: <133215058.29095.1505182665743@mail.yahoo.com>

Hi there,

I have had the privilege of getting to know some core developers at PyCon
in Portland, and I've met some others through my volunteering at PyGotham
and the NYC Python meetup group (resisting urge to name-drop). In spite of
not joining the mailing list until about a week ago at Brett Cannon's
suggestion, I have managed to submit 3 bug-report fixes, with 1 closed by
pull-request, 1 nearly closed with the pull-request accepted, and 1 with
the pull request not yet accepted (looks like there's a conflict in
misc/NEWS now, not sure how to use Github to resolve it either). That last
pull request is here: bpo-26103 resolve Data descriptor contradiction by
aaronchall - Pull Request #1959 - python/cpython

I also had a "MutableNamedTuple" implementation I put on
codereview.stackexchange.com
(https://codereview.stackexchange.com/q/173045/23451) and Ashwini Chaudhury
gave me some excellent pointers, and inspired by the recent discussion on
"Data Classes" I now have a half-baked implementation that leverages
__slots__ with a Mapping, for example (from a unittest, hence the
indentation):

        class MNTDemo(MutableNamedTuple):
            "Demo"
            __slots__ = {
                'arg0': arg(),
                'arg1': arg(typing.List),
                'kwarg': arg(default=0),  # must be positional if passed *args
                'args': arg(stars=Stars.ARGS),
                # everything after *args must be keyword argument
                'kwarg1': arg(list, []),
                'kwarg2': None,
                'kwargs': arg(typing.Any, stars=Stars.KWARGS)
                }

        mnt = MNTDemo(1, 2, 3, 'stararg1',
                      kwarg1=1, kwarg2=2, kwarg3=3)

Maybe there's a nicer way to do it, but arg is a function returning a
namedtuple (attrs: type, default, stars) with some defaults. It allows the
__init__ to have required positional arguments and required named
arguments, pretty much as flexible as a regular function signature, but
communicated via __slots__. And thanks to the slots, they're every bit as
performant as a namedtuple, I presume (I gave the __slots__ talk at PyCon).

Anyways, to wrap up my introduction, I'm a moderator on StackOverflow
(https://stackoverflow.com/users/541136/aaron-hall?tab=profile) and nearly
about to catch Raymond in reputation points, mostly due to Python, I'm
currently teaching Python at NYU (Introduction to Programming), and I'm an
all-around Python evangelist (as much as I know how to be.) I owe a lot to
the Python community, especially the meetup community in NYC, but also the
virtual community (Ned Batchelder in IRC comes to mind). Thank you for
everything, I'm looking for my chance to give back!

Cheers,

Aaron Hall
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From barry at python.org Mon Sep 11 23:21:30 2017
From: barry at python.org (Barry Warsaw)
Date: Mon, 11 Sep 2017 20:21:30 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To:
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3318395E-FE76-4BB1-A693-DC0229A7A34C@python.org> <59B70B59.3030900@stoneleaf.us>
Message-ID: <4952AA14-AC51-46D7-8AB1-D6829B683E67@python.org>

On Sep 11, 2017, at 19:16, Guido van Rossum wrote:
>
> Or we could just have two arguments, eq= and order=, and some rule so that
> you only need to specify one or the other but not both. (E.g. order=True
> implies eq=True.) That seems better than needing new constants just for
> this flag.

You'd have to disallow the combination `order=True, eq=False` then, right?
Or would you ignore eq for any value of order=True? Seems like a clumsier
API than a single tri-value parameter. Do the module constants bother you
that much?

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL:

From guido at python.org Tue Sep 12 00:47:21 2017
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Sep 2017 21:47:21 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <4952AA14-AC51-46D7-8AB1-D6829B683E67@python.org>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3318395E-FE76-4BB1-A693-DC0229A7A34C@python.org> <59B70B59.3030900@stoneleaf.us> <4952AA14-AC51-46D7-8AB1-D6829B683E67@python.org>
Message-ID:

On Mon, Sep 11, 2017 at 8:21 PM, Barry Warsaw wrote:

> On Sep 11, 2017, at 19:16, Guido van Rossum wrote:
> >
> > Or we could just have two arguments, eq= and order=, and
> some rule so that you only need to specify one or the other but not both.
> (E.g. order=True implies eq=True.) That seems better than needing new
> constants just for this flag.
>
> You'd have to disallow the combination `order=True, eq=False` then,
> right? Or would you ignore eq for any value of order=True? Seems like a
> clumsier API than a single tri-value parameter. Do the module constants
> bother you that much?

Yes they do. You may have to import them, or you have to prefix them with
the module name -- whereas keyword args and True/False require neither.
We could disallow order=True, eq=False. Or we could have the default be to
generate __eq__, __ne__ and __hash__, and a flag to prevent these (since
equality by object identity is probably less popular than equality by
elementwise comparison).

Perhaps:

    order: bool = False
    eq: bool = True

and disallowing order=True, eq=False.

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From hodgestar+pythondev at gmail.com Tue Sep 12 03:20:46 2017
From: hodgestar+pythondev at gmail.com (Simon Cross)
Date: Tue, 12 Sep 2017 09:20:46 +0200
Subject: [Python-Dev] Aaron Hall, Introduction and pull-request bump
In-Reply-To: <133215058.29095.1505182665743@mail.yahoo.com>
References: <133215058.29095.1505182665743.ref@mail.yahoo.com> <133215058.29095.1505182665743@mail.yahoo.com>
Message-ID:

Hello! And welcome aboard this mailing list ship. :)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From larry at hastings.org Tue Sep 12 03:38:47 2017
From: larry at hastings.org (Larry Hastings)
Date: Tue, 12 Sep 2017 00:38:47 -0700
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To:
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org> <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org>
Message-ID: <0580a3c6-f262-acbb-93a3-87807ac48242@hastings.org>

On 09/11/2017 07:22 PM, Guido van Rossum wrote:
> I still don't follow. How does one use InstanceDescriptor?

If you write a class that inherits from InstanceDescriptor and supports
the descriptor protocol, module objects will call the descriptor protocol
functions when that object is accessed inside a module instance. Example:
if this is "module.py":

    import collections.abc

    class CustomInstanceProperty(collections.abc.InstanceDescriptor):
        def __get__(self, instance, owner):
            print("CustomInstanceProperty.__get__ called!")
            return 44

    cip = CustomInstanceProperty()

If you then "import module" and reference "module.cip", it'll print
"CustomInstanceProperty.__get__ called!" and you'll get the value 44.
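In other words, the interactive session looks like this:

    >>> import module
    >>> module.cip
    CustomInstanceProperty.__get__ called!
    44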
URL: 

From levkivskyi at gmail.com  Tue Sep 12 04:43:54 2017
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Tue, 12 Sep 2017 10:43:54 +0200
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To: 
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>
 <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org>
Message-ID: 

@Larry
> "@property" 375 hits
> "def __getattr__" 28 hits

I don't think it is fair to compare occurrences of __getattr__ vs
occurrences of @property, since in the first case one would use a single
__getattr__ per class, while in the second case @property is required for
every attribute.

@Guido
> I'm still pondering this. I am far from decided about whether it's
> better to ...

The decision is up to you, I just wanted to bring an idea that is both
simpler and more efficient.

--
Ivan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From victor.stinner at gmail.com  Tue Sep 12 05:49:31 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 12 Sep 2017 11:49:31 +0200
Subject: [Python-Dev] PEP 490 "Chain exceptions at C level" rejected
Message-ID: 

Hi,

In March 2015, I proposed PEP 490 to implicitly chain C exceptions by
default:
https://www.python.org/dev/peps/pep-0490/

The discussion on python-dev was quickly in favor of keeping the status
quo: don't chain C exceptions by default, but only do that explicitly
where it makes sense. I had forgotten this PEP, but I just rejected it to
close the old discussion :-)

Note: The PEP is not yet rejected on python.org, it will be done at the
next cron job run.

Victor

From ncoghlan at gmail.com  Tue Sep 12 06:28:24 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 12 Sep 2017 20:28:24 +1000
Subject: [Python-Dev] breakpoint() and $PYTHONBREAKPOINT
In-Reply-To: 
References: <4F91F6FA-016C-4039-8765-32C806AB041D@python.org>
Message-ID: 

On 12 September 2017 at 11:15, Nathaniel Smith wrote:
> On Mon, Sep 11, 2017 at 5:27 PM, Barry Warsaw wrote:
>> On Sep 10, 2017, at 13:46, Nathaniel Smith wrote:
>>>
>>> On Sun, Sep 10, 2017 at 12:06 PM, Barry Warsaw wrote:
>>>> For PEP 553, I think it's a good idea to support the environment
>>>> variable $PYTHONBREAKPOINT[*] but I'm stuck on a design question, so
>>>> I'd like to get some feedback.
>>>>
>>>> Should $PYTHONBREAKPOINT be consulted in breakpoint() or in
>>>> sys.breakpointhook()?
>>>
>>> Wouldn't the usual pattern be to check $PYTHONBREAKPOINT once at
>>> startup, and if it's set use it to initialize sys.breakpointhook()?
>>> Compare to, say, $PYTHONPATH.
>>
>> Perhaps, but what would be the visible effects of that? I.e. what would
>> that buy you?
>
> Why is this case special enough to break the rules?
>
> Compared to checking it on each call to sys.breakpointhook(), I guess
> the two user-visible differences in behavior would be:
>
> - whether mutating os.environ["PYTHONBREAKPOINT"] inside the process
> affects future calls. I would find it quite surprising if it did;
> generally when we mutate envvars like os.environ["PYTHONPATH"] it's a
> way to set things up for future child processes and doesn't affect our
> process.

I think this case is closer to PYTHONINSPECT than it is to PYTHONPATH,
and we *do* delay checking PYTHONINSPECT until the application is
shutting down specifically so programs can set the env var during
execution.

> - whether the import happens immediately at startup, or is delayed
> until the first call to breakpoint().
If it's imported once at > startup, then this adds overhead to programs that set PYTHONBREAKPOINT > but don't use it, and if the envvar is set to some nonsense then you > get an error immediately instead of at the first call to breakpoint(). > These both seem like fairly minor differences to me, but maybe saving > that 30 ms or whatever of startup time is an important enough > optimization to justify the special case? This is independent of whether the check happens in the builtin or in the default hook, and I agree with Barry that it should be queried lazily regardless of whether it's the builtin or the default hook that checks it. I think it would be reasonable for the default hook to cache the result of PYTHONBREAKPOINT on the first lookup and then not check it again on future calls, though. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From chris.barker at noaa.gov Tue Sep 12 12:01:00 2017 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Tue, 12 Sep 2017 09:01:00 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com> <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com> <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net> Message-ID: <-4829494977384524515@unknownmsgid> - record +1 This really does match well with the record concept in databases, and most people are familiar with that. Though it will. E a touch confusing until (if ever) most of the database and cab traders, etc start using them. It also matches pretty well with numpy "record arrays": https://docs.scipy.org/doc/numpy-1.13.0/user/basics.rec.html - bag I also like this -- not many folks will have a ore-conceived notion of what it is. - declarative Yeach-- that's an adjective (at least on most common use) -- and a programming term that means something else. Also, considering their uses, it might make sense to put them in the collections module. Yup. -CHB -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Sep 12 12:04:21 2017 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Tue, 12 Sep 2017 09:04:21 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <-4829494977384524515@unknownmsgid> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com> <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com> <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net> <-4829494977384524515@unknownmsgid> Message-ID: <-2538734101852001480@unknownmsgid> > On Sep 12, 2017, at 9:01 AM, Chris Barker - NOAA Federal wrote: > This really does match well with the record concept in databases, and most people are familiar with that. Though it will. E a touch confusing until (if ever) most of the database and cab traders, etc start using them. I REALLY need to stop quickly posting from my phone... ... Though it will be a touch confusing until (if ever) most of the database and csv readers etc. start using them. 
-CHB From larry at hastings.org Tue Sep 12 15:18:23 2017 From: larry at hastings.org (Larry Hastings) Date: Tue, 12 Sep 2017 12:18:23 -0700 Subject: [Python-Dev] PEP 490 "Chain exceptions at C level" rejected In-Reply-To: References: Message-ID: <8da73f91-d0d7-a267-83a0-6b2606253c78@hastings.org> On 09/12/2017 02:49 AM, Victor Stinner wrote: > Note: The PEP is not yet rejected on python.org, it will be done at > the next cron job run. My understanding is that the docs are built once a day via cron job, but the PEPs are built every time the repo changes thanks to Travis CI. So it should be only a matter of minutes between you pushing a change to a PEP and the change showing up live on the web. //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nad at python.org Tue Sep 12 15:58:31 2017 From: nad at python.org (Ned Deily) Date: Tue, 12 Sep 2017 12:58:31 -0700 Subject: [Python-Dev] PEP 490 "Chain exceptions at C level" rejected In-Reply-To: <8da73f91-d0d7-a267-83a0-6b2606253c78@hastings.org> References: <8da73f91-d0d7-a267-83a0-6b2606253c78@hastings.org> Message-ID: <37B00BAA-5DCA-4260-A973-732C8EF61C41@python.org> On Sep 12, 2017, at 12:18, Larry Hastings wrote: > On 09/12/2017 02:49 AM, Victor Stinner wrote: >> Note: The PEP is not yet rejected on python.org, it will be done at >> the next cron job run. > My understanding is that the docs are built once a day via cron job, but the PEPs are built every time the repo changes thanks to Travis CI. So it should be only a matter of minutes between you pushing a change to a PEP and the change showing up live on the web. I believe peps on python.org are updated every 15 minutes. The html version of docs for maintenance and feature branches (currently, master->3.7, 3.6, 2.7) on docs.python.org are updated every 3 hours and, in theory, the downloadable versions are updated once a day. But take that with a grain of salt (nudge nudge, wink wink). -- Ned Deily nad at python.org -- [] From ericsnowcurrently at gmail.com Tue Sep 12 16:59:43 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 12 Sep 2017 13:59:43 -0700 Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module) In-Reply-To: <20170908232841.GA30417@bytereef.org> References: <20170908232841.GA30417@bytereef.org> Message-ID: On Fri, Sep 8, 2017 at 4:28 PM, Stefan Krah wrote: > The most promising model to me is to put *all* globals in a tls structure > and cache the whole structure. Extrapolating from my experiences with the > context, this might have a slowdown of "only" 4%. Yeah, this is actually something I've been exploring and was one of the motivations for consolidating the C globals. -eric From ericsnowcurrently at gmail.com Tue Sep 12 17:06:48 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 12 Sep 2017 14:06:48 -0700 Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module) In-Reply-To: References: Message-ID: On Sat, Sep 9, 2017 at 1:04 AM, Paul Moore wrote: > On 9 September 2017 at 00:04, Eric Snow wrote: >> add_recv_fifo(name=None): >> add_send_fifo(name=None): > > Personally, I *always* read these names backwards - from the POV of > the caller. So when I see "add_send_fifo", I then expect to be able to > send stuff on the returned FIFO (i.e., I get a writer back). But > that's not how it works. 
> I may be alone in this - I've had similar problems in the past with
> how people name pipes, for example - but I thought I'd raise it while
> there's still a possibility that it's not just me and the names can be
> changed.

Yeah, those names are bugging me too. I'm stewing over possible
alternatives. It's the one piece of the PEP that I want to address
before I re-post to the list.

-eric

From ericsnowcurrently at gmail.com  Tue Sep 12 17:04:51 2017
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 12 Sep 2017 14:04:51 -0700
Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module)
In-Reply-To: 
References: <20170908232841.GA30417@bytereef.org>
Message-ID: 

On Fri, Sep 8, 2017 at 8:54 PM, Michel Desmoulin wrote:
> On 09/09/2017 at 01:28, Stefan Krah wrote:
>> Still, the argument "who uses subinterpreters?" of course still
>> remains.
>
> For now, nobody. But if we expose it and web frameworks manage to create
> workers as fast as multiprocessing and as cheap as threading, you will
> find a lot of people starting to want to use it.

Note that subinterpreters share the GIL currently, and the objects you
can pass between interpreters will be quite restricted initially.
However, the ultimate goal is to stop sharing the GIL between
interpreters and to broaden the types that can be passed between
interpreters. Regardless, the isolation inherent to subinterpreters
provides benefits immediately.

-eric

From ericsnowcurrently at gmail.com  Tue Sep 12 17:22:29 2017
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 12 Sep 2017 14:22:29 -0700
Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module)
In-Reply-To: <20170909134513.4ba776bc@fsol>
References: <20170909134513.4ba776bc@fsol>
Message-ID: 

On Sat, Sep 9, 2017 at 4:45 AM, Antoine Pitrou wrote:
> How does the other interpreter get the FIFO "tied" to it?
> Is it `get_current().get_fifo(name)`? Something else?

Yep, that's it. I've added some examples to the PEP to illustrate. That
said, I'm re-working the method names since they confuse me too. :)

> How does it get to learn the *name*?

The main way is that the programmer hard-codes it. I'd expect close
proximity between the code that adds the FIFO and the code that gets run
in the subinterpreter.

> Why are FIFOs unidirectional? Does it really make them significantly
> cheaper? With bidirectional pipes, the whole recv/send confusion would
> be avoided.
>
> As a point of comparison, the default for `multiprocessing.Pipe()` is
> to create a bidirectional pipe (*), yet I'm sure a multiprocessing Pipe
> has more overhead than an in-process FIFO.

The choice wasn't about cost or performance. In part it's a result of my
experience with Go's channels. Bidirectional channels are problematic in
certain situations, including when it relates to which goroutine is
responsible for managing the channel's lifetime. Also, bidirectional
channels require both the reader and the writer (of the code) to keep
track of the two roles that the channel plays. It's easy to forget about
one or the other, and accidentally do the wrong thing. Unidirectional
channels resolve all those problems.

Granted, the FIFOs from the PEP aren't the same as Go's channels. They
are more rudimentary and they lack the support provided by a static
compiler (e.g. deadlock detection), but my intention is that they provide
all the functionality we need as a building block.
-eric From ericsnowcurrently at gmail.com Tue Sep 12 17:43:43 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 12 Sep 2017 14:43:43 -0700 Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module) In-Reply-To: <20170909140510.1220f57e@fsol> References: <20170909140510.1220f57e@fsol> Message-ID: On Sat, Sep 9, 2017 at 5:05 AM, Antoine Pitrou wrote: > On Fri, 8 Sep 2017 16:04:27 -0700, Eric Snow wrote: >> ``list()``:: > > It's called ``enumerate()`` in the threading module. Not sure there's > a point in choosing a different name here. Yeah, in the first version of the PEP it was called "enumerate()". I changed it to "list()" at Raymond's recommendation. The reasoning is that it's less confusing to most people that way. TBH, I'd rather leave it "list()", but could be swayed. Perhaps it would be enough for the PEP to not mention any relationship to "threading"? >> The current interpreter (which called ``run()``) will block >> until the subinterpreter finishes running the requested code. Any >> uncaught exception in that code will bubble up to the current >> interpreter. > > Why does it block? How is concurrency supposed to be achieved in that > model? It would be more flexible if run(code) returned an object that > can later be waited on. Something like... a Future :-) I expect this is more a problem with my description than with the feature. :) I've already re-written this bit to be more clear. It's not that the thread blocks. It's more like a function call, where the current frame is paused while the call is executed. Then it returns to the calling frame. Likewise the interpreter in the current thread gets swapped out with the target interpreter, where the code gets run, and then the original interpreter gets swapped back in. This is how you do it in the C-API and it made sense (to me) to do it the same way in Python. > > And why guarantee that it executes in the "current OS thread"? > I would say you don't want to specify where it executes exactly, as it > opens the door for more sophisticated implementations (such as > automatic assignment of subinterpreters inside a pool of threads). Again, I had explained this poorly in the PEP. The key thing here is that subinterpreters don't do anything special relative to threading. If you want to call "Interpreter.run()" in a thread then you stick it in a "threading.Thread". If you want to auto-assign to a pool of threads then you treat it like any other function you would auto-assign to a pool of threads. >> get_fifo(name): >> list_fifos(): > > If fifos are uniquely named, why not return a name->fifo mapping? I suppose we could. Then we could get rid of "get_fifo()" too. I'm still mulling over the right API for the FIFO parts of the PEP. > >> ``FIFOReader(name)``:: >> [...] > > I don't think the method naming choice is very adequate here. The API > model for the FIFO objects can either be a (threading or > multiprocessing) Queue or a multiprocessing Pipe. > > - if a Queue, then it should have a get() / put() pair of methods > - if a Pipe, then it should have a recv() / send() pair of methods > > Now, since Queues are multi-producer multi-consumer, while Pipes are > single-producer single-consumer (they aren't "synchronized"), the > better analogy seems to the multiprocessing Pipe here, so I would vote > for recv() / send(). > > But, in any case, definitely not a pop() / push() pair. Thanks for pointing that out. Prior art, FTW! I'll factor that in. 
> Has any thought been given to how FIFOs could integrate with async code > driven by an event loop (e.g. asyncio)? I think the model of executing > several asyncio (or Tornado) applications each in their own > subinterpreter may prove quite interesting to reconcile multi-core > concurrency with ease of programming. That would require the FIFOs to > be able to synchronize on something an event loop can wait on (probably > a file descriptor?). Personally I've given pretty much no thought to the relationship with async. TBH, the FIFO parts of the PEP were added only recently and haven't fully baked yet. I'd be interested more feedback on async relative to PEP, not just with the FIFO bits; my experience with async is pretty limited thus far. -eric From ericsnowcurrently at gmail.com Tue Sep 12 17:48:37 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 12 Sep 2017 14:48:37 -0700 Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module) In-Reply-To: References: Message-ID: On Sat, Sep 9, 2017 at 11:04 AM, Nathaniel Smith wrote: > This phrase "bubble up" here is doing a lot of work :-). Can you elaborate > on what you mean? The text now makes it seem like the exception will just > pass from one interpreter into another, but that seems impossible I've updated the PEP to clarify. > it'd mean sharing not just arbitrary user defined exception classes but full > frame objects... My current implementation does the right thing currently. However, it would probably struggle under a more tightly controlled inter-interpreter boundary. I'll take a look and also update the PEP. -eric From mdcb808 at gmail.com Tue Sep 12 19:43:16 2017 From: mdcb808 at gmail.com (Matthieu Bec) Date: Tue, 12 Sep 2017 16:43:16 -0700 Subject: [Python-Dev] parallelizing Message-ID: There are times when you deal with completely independent input/output 'pipes' - where parallelizing would really help speed things up. Can't there be a way to capture that idiom and multi thread it in the language itself? Example: loop: read an XML produce a JSON like Regards, MB From ncoghlan at gmail.com Tue Sep 12 22:09:25 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 13 Sep 2017 12:09:25 +1000 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <-4829494977384524515@unknownmsgid> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com> <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com> <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net> <-4829494977384524515@unknownmsgid> Message-ID: On 13 September 2017 at 02:01, Chris Barker - NOAA Federal wrote: > This really does match well with the record concept in databases, and most > people are familiar with that. No, most people aren't familiar with that - they only become familiar with it *after* they've learned to program and learned what a database is. > Though it will. E a touch confusing until (if > ever) most of the database and cab traders, etc start using them. Aside from the potential confusion with other technical uses of "record", a further problem with "record" is that it's ambiguous as to whether its referring to the noun (wreck-ord) or the verb (ree-cord). Even if folks correctly interpret it as a noun, there's still plenty of opportunities for folks to guess incorrectly about what it means based on the other conventional English uses of the word (e.g. a "personal record" will consist of multiple "records" in the database sense). 
So in this case, the vagueness of "data class" is considered a feature - since it doesn't inherently mean *anything*, folks are more likely to realise that they need to look up "Python data class", and if I search for that in a private window, the first Google hit is https://stackoverflow.com/questions/3357581/using-python-class-as-a-data-container and the second is Eric's PEP. > Also, considering their uses, it might make sense to put them in the > collections module. Data classes are things you're likely to put *in* a collection, rather than really being collections themselves (they're only collections in the same sense that all Python classes are collections of attributes, and that's not the way the collections module uses the term). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Tue Sep 12 22:48:43 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 13 Sep 2017 12:48:43 +1000 Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module) In-Reply-To: References: <20170909140510.1220f57e@fsol> Message-ID: On 13 September 2017 at 07:43, Eric Snow wrote: > On Sat, Sep 9, 2017 at 5:05 AM, Antoine Pitrou wrote: >> On Fri, 8 Sep 2017 16:04:27 -0700, Eric Snow wrote: >>> ``list()``:: >> >> It's called ``enumerate()`` in the threading module. Not sure there's >> a point in choosing a different name here. > > Yeah, in the first version of the PEP it was called "enumerate()". I > changed it to "list()" at Raymond's recommendation. The reasoning is > that it's less confusing to most people that way. TBH, I'd rather > leave it "list()", but could be swayed. Perhaps it would be enough > for the PEP to not mention any relationship to "threading"? I think you should just say that you're not using interpreters.enumerate() because that conflicts with the enumerate builtin. However, replacing that with a *different* naming conflict with an even more commonly used builtin wouldn't be a good idea either :) I also think the "list interpreters" question is more subtle than you think, as there are two potential sets of interpreters to consider: 1. Getting a listing of all interpreters in the process (creating Interpreter objects in the current interpreter for each of them) 2. Getting a listing of all other interpreters created by calling interpreters.create_interpreter() in *this* interpreter Since it's not clear yet whether or not we're actually going to track the metadata needed to report the latter, I'd suggest only offering "interpreters.list_all()" initially, and requiring folks to do their own tracking if they want a record of the interpreters their own code has created. >>> The current interpreter (which called ``run()``) will block >>> until the subinterpreter finishes running the requested code. Any >>> uncaught exception in that code will bubble up to the current >>> interpreter. >> >> Why does it block? How is concurrency supposed to be achieved in that >> model? It would be more flexible if run(code) returned an object that >> can later be waited on. Something like... a Future :-) > > I expect this is more a problem with my description than with the > feature. :) I've already re-written this bit to be more clear. It's > not that the thread blocks. It's more like a function call, where the > current frame is paused while the call is executed. Then it returns > to the calling frame. 
Likewise the interpreter in the current thread > gets swapped out with the target interpreter, where the code gets run, > and then the original interpreter gets swapped back in. This is how > you do it in the C-API and it made sense (to me) to do it the same way > in Python. I'm not sure if it would create more confusion than it would resolve, but I'm wondering if it may be worth including an example where you delegate the subinterpreter call to another thread via concurrent.futures.ThreadExecutor. >>> get_fifo(name): >>> list_fifos(): >> >> If fifos are uniquely named, why not return a name->fifo mapping? > > I suppose we could. Then we could get rid of "get_fifo()" too. I'm > still mulling over the right API for the FIFO parts of the PEP. The FIFO parts of the proposal are feeling a lot like the Mailbox objects in TI-RTOS to me, just with a more user-friendly API since we're dealing with Python rather than C: http://software-dl.ti.com/dsps/dsps_public_sw/sdo_sb/targetcontent/sysbios/6_35_01_29/exports/bios_6_35_01_29/docs/cdoc/ti/sysbios/knl/Mailbox.html >>> ``FIFOReader(name)``:: >>> [...] >> >> I don't think the method naming choice is very adequate here. The API >> model for the FIFO objects can either be a (threading or >> multiprocessing) Queue or a multiprocessing Pipe. >> >> - if a Queue, then it should have a get() / put() pair of methods >> - if a Pipe, then it should have a recv() / send() pair of methods >> >> Now, since Queues are multi-producer multi-consumer, while Pipes are >> single-producer single-consumer (they aren't "synchronized"), the >> better analogy seems to the multiprocessing Pipe here, so I would vote >> for recv() / send(). >> >> But, in any case, definitely not a pop() / push() pair. > > Thanks for pointing that out. Prior art, FTW! I'll factor that in. +1 from me for treating this as a split Pipe model, where the SendPipe and RecvPipe are distinct objects (since they'll exist in different interpreters). >> Has any thought been given to how FIFOs could integrate with async code >> driven by an event loop (e.g. asyncio)? I think the model of executing >> several asyncio (or Tornado) applications each in their own >> subinterpreter may prove quite interesting to reconcile multi-core >> concurrency with ease of programming. That would require the FIFOs to >> be able to synchronize on something an event loop can wait on (probably >> a file descriptor?). > > Personally I've given pretty much no thought to the relationship with > async. TBH, the FIFO parts of the PEP were added only recently and > haven't fully baked yet. I'd be interested more feedback on async > relative to PEP, not just with the FIFO bits; my experience with async > is pretty limited thus far. The way TI-RTOS handles this is to allow event objects to be passed in that are used to report a "Mailbox not empty" event (when a new message is queued) and a "Mailbox not full" event (when the Mailbox was previously full, and a message is read). Code can then either block on the mailbox itself (using Mailbox_pend), or else wait on the underlying events directly. It seems to me that a similar model would work here (and wouldn't necessarily need to be included in the initial version of the PEP): a SendPipe could accept an optional Event object that it calls set() on when the pipe ceases to be full, and a RecvPipe could do the same for when it ceases to be empty. 
It would then be up to the calling code to decide whether to pass in a regular synchronous threading.Event() object (in which case it could probably just make blocking calls to the send()/recv() methods instead), or else to pass in asyncio.Event() objects (which a coroutine can wait on via the Event.wait() coroutine) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Wed Sep 13 00:05:20 2017 From: guido at python.org (Guido van Rossum) Date: Tue, 12 Sep 2017 21:05:20 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com> <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com> <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net> <-4829494977384524515@unknownmsgid> Message-ID: On Tue, Sep 12, 2017 at 7:09 PM, Nick Coghlan wrote: > So in this case, the vagueness of "data class" is considered a feature > - since it doesn't inherently mean *anything*, folks are more likely > to realise that they need to look up "Python data class", and if I > search for that in a private window, the first Google hit is > https://stackoverflow.com/questions/3357581/using-python-class-as-a-data- > container > and the second is Eric's PEP. > I think "data classes" is a fine moniker for this concept. It's ironic that some people dislike "data classes" because these are regular classes, not just for data, while others are proposing alternative names that emphasize the data container aspect. So "data classes" splits the difference, by referring to both data and classes. Let's bikeshed about something else. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Sep 13 00:33:52 2017 From: guido at python.org (Guido van Rossum) Date: Tue, 12 Sep 2017 21:33:52 -0700 Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module) In-Reply-To: References: <20170909140510.1220f57e@fsol> Message-ID: On Tue, Sep 12, 2017 at 2:43 PM, Eric Snow wrote: > Yeah, in the first version of the PEP it was called "enumerate()". I > changed it to "list()" at Raymond's recommendation. The reasoning is > that it's less confusing to most people that way. TBH, I'd rather > leave it "list()", but could be swayed. Perhaps it would be enough > for the PEP to not mention any relationship to "threading"? > Really, you're going to change it from a name conflict with a builtin function to a name conflict with a builtin type? -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Wed Sep 13 04:44:23 2017 From: eric at trueblade.com (Eric V. Smith) Date: Wed, 13 Sep 2017 04:44:23 -0400 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> Message-ID: <693756ab-fac3-a711-3300-d399f035b2bb@trueblade.com> On 9/10/2017 11:08 PM, Nathaniel Smith wrote: > Hi Eric, > > A few quick comments: > > Why do you even have a hash= argument on individual fields? 
For the > whole class, I can imagine you might want to explicitly mark a whole > class as unhashable, but it seems like the only thing you can do with > the field-level hash= argument is to create a class where the __hash__ > and __eq__ take different fields into account, and why would you ever > want that? See the discussion at https://github.com/ericvsmith/dataclasses/issues/44 for why we're keeping the field-level hash function. Quick version, from Guido: "There's a legitimate reason for having a field in the eq but not in the hash. After all hash is always followed by an eq check and it is totally legit for the eq check to say two objects are not equal even though their hashes are equal." This would be in the case where a field should be used for equality testing, but computing its hash is expensive. Eric. From larry at hastings.org Wed Sep 13 05:43:12 2017 From: larry at hastings.org (Larry Hastings) Date: Wed, 13 Sep 2017 02:43:12 -0700 Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties) In-Reply-To: <0580a3c6-f262-acbb-93a3-87807ac48242@hastings.org> References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org> <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org> <0580a3c6-f262-acbb-93a3-87807ac48242@hastings.org> Message-ID: <38dd16c8-432d-c9d5-b267-ddfd88064098@hastings.org> On 09/12/2017 12:38 AM, Larry Hastings wrote: > On 09/11/2017 07:22 PM, Guido van Rossum wrote: >> >> The prototype is linked to from the PEP; for your convenience >> here's a link: >> >> https://github.com/larryhastings/cpython/tree/module-properties >> >> >> I found that link in the PEP, but it's just a branch of a fork of >> cpython. It would be easier to review the prototype as a PR to >> upstream cpython. > > Okay, I'll make a PR tomorrow and post a reply here (and update the PEP). PR is here: https://github.com/python/cpython/pull/3534 Cheers, //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nad at python.org Wed Sep 13 12:35:31 2017 From: nad at python.org (Ned Deily) Date: Wed, 13 Sep 2017 09:35:31 -0700 Subject: [Python-Dev] Reminder: snapshots and releases coming up in the next several days Message-ID: <7DB7FBD1-0574-4C4B-8170-EE8FCD0E2002@python.org> Lots of releases coming up soon! 2017-09-16: - Python 2.7.14 final release 2017-09-18 1200 UTC cutoff: - Python 3.6.3 rc1 - Python 3.3.7 final release (following by retirement) If you know of any issues that you feel need to be addressed in these releases, please make sure there is an open issue on the bug tracker set to "release blocker". Also on 2017-09-18: - Python 3.7.0 alpha 1 This will be the first preview snapshot of 3.7.0. It's still very early in the development cycle for 3.7. The main purpose of early alpha releases is to exercise the build and release process and to make it easier for Windows and Macs users to help test. Many new features and build changes are yet to appear in subsequent releases prior to feature code cutoff on 2018-01-29 at 3.7.0b1. Thanks for your help! --Ned -- Ned Deily nad at python.org -- [] From fred at fdrake.net Wed Sep 13 12:57:29 2017 From: fred at fdrake.net (Fred Drake) Date: Wed, 13 Sep 2017 12:57:29 -0400 Subject: [Python-Dev] Reminder: snapshots and releases coming up in the next several days In-Reply-To: <7DB7FBD1-0574-4C4B-8170-EE8FCD0E2002@python.org> References: <7DB7FBD1-0574-4C4B-8170-EE8FCD0E2002@python.org> Message-ID: On Wed, Sep 13, 2017 at 12:35 PM, Ned Deily wrote: > Lots of releases coming up soon! 
There's a "Python Release Schedule" calendar on Google Calendar that used to be maintained, but that appears to have been dropped, though I found it useful. Is there any sort of calendar feed available with this schedule that's currently maintained? -Fred -- Fred L. Drake, Jr. "A storm broke loose in my mind." --Albert Einstein From nad at python.org Wed Sep 13 13:35:33 2017 From: nad at python.org (Ned Deily) Date: Wed, 13 Sep 2017 10:35:33 -0700 Subject: [Python-Dev] Reminder: snapshots and releases coming up in the next several days In-Reply-To: References: <7DB7FBD1-0574-4C4B-8170-EE8FCD0E2002@python.org> Message-ID: <0CEC8EE4-6A47-493C-AD88-EA778053A812@python.org> I know the issue of the Google Calendar feed has come up before and we should either maintain it or remove it from the web site. I don't know who has or had the "keys" to it. I'm traveling the next couple of days but I'll look into it afterwards. If anyone has info about it, please contact me directly. AFAIK, there is no other release calendar feed currently. -- Ned Deily nad at python.org -- [] > On Sep 13, 2017, at 09:57, Fred Drake wrote: > >> On Wed, Sep 13, 2017 at 12:35 PM, Ned Deily wrote: >> Lots of releases coming up soon! > > There's a "Python Release Schedule" calendar on Google Calendar that > used to be maintained, but that appears to have been dropped, though I > found it useful. > > Is there any sort of calendar feed available with this schedule that's > currently maintained? > > > -Fred > > -- > Fred L. Drake, Jr. > "A storm broke loose in my mind." --Albert Einstein From chris.barker at noaa.gov Wed Sep 13 13:49:43 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 13 Sep 2017 10:49:43 -0700 Subject: [Python-Dev] parallelizing In-Reply-To: References: Message-ID: This really isn't the place to ask this kind of question. If you want to know how to do something with python, try python-users , stack overflow, etc. If you have an idea about a new feature you think python could have, then the python-ideas list is the place for that. But if you want anyone to take it seriously, it should be a better formed idea before you post there. But: On Tue, Sep 12, 2017 at 4:43 PM, Matthieu Bec wrote: > There are times when you deal with completely independent input/output > 'pipes' - where parallelizing would really help speed things up. > > Can't there be a way to capture that idiom and multi thread it in the > language itself? > > Example: > > loop: > > read an XML > > produce a JSON like > Regular old threading works fine for this: import time import random import threading def process(infile, outfile): "fake function to simulate a process that takes a random amount of time" time.sleep(random.random()) print("processing: {} to make {}".format(infile, outfile)) for i in range(10): threading.Thread(target=process, args=("file%i.xml" % i, "file%i.xml" % i)).start() It gets complicated if you need to pass information back and forth, or worry about race conditions, or manage a queue, or .... But just running a nice self-contained thread safe function in another thread is pretty straightforward. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Wed Sep 13 14:26:52 2017 From: srkunze at mail.de (Sven R. 
Kunze) Date: Wed, 13 Sep 2017 20:26:52 +0200 Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties) In-Reply-To: <38dd16c8-432d-c9d5-b267-ddfd88064098@hastings.org> References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org> <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org> <0580a3c6-f262-acbb-93a3-87807ac48242@hastings.org> <38dd16c8-432d-c9d5-b267-ddfd88064098@hastings.org> Message-ID: Why not adding both? Properties do have their uses as does __getattr__. Cheers, Sven On 13.09.2017 11:43, Larry Hastings wrote: > > On 09/12/2017 12:38 AM, Larry Hastings wrote: >> On 09/11/2017 07:22 PM, Guido van Rossum wrote: >>> >>> The prototype is linked to from the PEP; for your convenience >>> here's a link: >>> >>> https://github.com/larryhastings/cpython/tree/module-properties >>> >>> >>> I found that link in the PEP, but it's just a branch of a fork of >>> cpython. It would be easier to review the prototype as a PR to >>> upstream cpython. >> >> Okay, I'll make a PR tomorrow and post a reply here (and update the PEP). > > PR is here: > > https://github.com/python/cpython/pull/3534 > > > Cheers, > > > //arry/ > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/srkunze%40mail.de -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Sep 13 14:49:07 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Sep 2017 11:49:07 -0700 Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties) In-Reply-To: References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org> <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org> <0580a3c6-f262-acbb-93a3-87807ac48242@hastings.org> <38dd16c8-432d-c9d5-b267-ddfd88064098@hastings.org> Message-ID: > Why not adding both? Properties do have their uses as does __getattr__. In that case I would just add __getattr__ to module.c, and add a recipe or perhaps a utility module that implements a __getattr__ you can put into your module if you want @property support. That way you can have both but you only need a little bit of code in module.c to check for __getattr__ and call it when you'd otherwise raise AttributeError. -- --Guido van Rossum (python.org/~guido ) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Sep 13 15:15:06 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Sep 2017 12:15:06 -0700 Subject: [Python-Dev] A reminder for PEP owners Message-ID: I know there's a lot of excitement around lots of new ideas. And the 3.7 feature freeze is looming (January is in a few months). But someone has to review and accept all those PEPs, and I can't do it all by myself. If you want your proposal to be taken seriously, you need to include a summary of the discussion on the mailing list (including objections, even if you disagree!) in your PEP, e.g. as an extended design rationale or under the Rejected Ideas heading. If you don't do this you risk having to repeat yourself -- also you risk having your PEP rejected, because at this point there's no way I am going to read all the discussions. -- --Guido van Rossum (python.org/~guido ) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mdcb808 at gmail.com Wed Sep 13 15:11:29 2017 From: mdcb808 at gmail.com (Matthieu Bec) Date: Wed, 13 Sep 2017 12:11:29 -0700 Subject: [Python-Dev] parallelizing In-Reply-To: References: Message-ID: <299aa326-62f9-492e-c67f-a73b4766cf35@gmail.com> Thank you, I'll take your advice. Regarding your example, I think it gives the illusion to work because sleep() is GIL aware under the hood. I don't think it works for process() that mainly runs bytecode, because of the GIL. Sorry if I wrongly thought that was a language level discussion. Regards, Matthieu On 9/13/17 10:49 AM, Chris Barker wrote: > This really isn't the place to ask this kind of question. > > If you want to know how to do something with python, try python-users > , stack overflow, etc. > > If you have an idea about a new feature you think python could have, > then the python-ideas list is the place for that. But if you want > anyone to take it seriously, it should be a better formed idea before > you post there. > > But: > > On Tue, Sep 12, 2017 at 4:43 PM, Matthieu Bec > wrote: > > There are times when you deal with completely independent > input/output 'pipes' - where parallelizing would really help speed > things up. > > Can't there be a way to capture that idiom and multi thread it in > the language itself? > > Example: > > loop: > > read an XML > > produce a JSON like > > > Regular old threading works fine for this: > > import time > import random > import threading > > > def process(infile, outfile): > "fake function to simulate a process that takes a random amount of > time" > time.sleep(random.random()) > print("processing: {} to make {}".format(infile, outfile)) > > > for i in range(10): > threading.Thread(target=process, args=("file%i.xml" % i, > "file%i.xml" % i)).start() > > > It gets complicated if you need to pass information back and forth, or > worry about race conditions, or manage a queue, or .... > > But just running a nice self-contained thread safe function in another > thread is pretty straightforward. > > -CHB > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Sep 13 17:00:13 2017 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 13 Sep 2017 14:00:13 -0700 Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties) In-Reply-To: References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org> <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org> <0580a3c6-f262-acbb-93a3-87807ac48242@hastings.org> <38dd16c8-432d-c9d5-b267-ddfd88064098@hastings.org> Message-ID: On Wed, Sep 13, 2017 at 11:49 AM, Guido van Rossum wrote: > > Why not adding both? Properties do have their uses as does __getattr__. > > In that case I would just add __getattr__ to module.c, and add a recipe or > perhaps a utility module that implements a __getattr__ you can put into your > module if you want @property support. That way you can have both but you > only need a little bit of code in module.c to check for __getattr__ and call > it when you'd otherwise raise AttributeError. Unfortunately I don't think this works. If there's a @property object present in the module's instance dict, then __getattribute__ will return it directly instead of calling __getattr__. 
(I guess for full property emulation you'd also need to override __setattr__ and __dir__, but I don't know how important that is.) We could consider letting modules overload __getattribute__ instead of __getattr__, but I don't think this is viable either -- a key feature of __getattr__ is that it doesn't add overhead to normal lookups. If you implement deprecation warnings by overloading __getattribute__, then it makes all your users slower, even the ones who never touch the deprecated attributes. __getattr__ is much better than __getattribute__ for this purpose. Alternatively we can have a recipe that implements @property support using __class__ assignment and overriding __getattribute__/__setattr__/__dir__, so instead of 'from module_helper.property_emulation import __getattr__' it'd be 'from module_helper import enable_property_emulation; enable_property_emulation(__name__)'. Still has the slowdown problem but it would work. -n -- Nathaniel J. Smith -- https://vorpus.org From skip.montanaro at gmail.com Wed Sep 13 17:03:21 2017 From: skip.montanaro at gmail.com (Skip Montanaro) Date: Wed, 13 Sep 2017 16:03:21 -0500 Subject: [Python-Dev] A reminder for PEP owners In-Reply-To: References: Message-ID: > But someone has to > review and accept all those PEPs, and I can't do it all by myself. An alternate definition for BDFL is "Benevolent Delegator For Life." :-) Skip From guido at python.org Wed Sep 13 17:17:08 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Sep 2017 14:17:08 -0700 Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties) In-Reply-To: References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org> <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org> <0580a3c6-f262-acbb-93a3-87807ac48242@hastings.org> <38dd16c8-432d-c9d5-b267-ddfd88064098@hastings.org> Message-ID: On Wed, Sep 13, 2017 at 2:00 PM, Nathaniel Smith wrote: > On Wed, Sep 13, 2017 at 11:49 AM, Guido van Rossum > wrote: > > > Why not adding both? Properties do have their uses as does > __getattr__. > > > > In that case I would just add __getattr__ to module.c, and add a recipe > or > > perhaps a utility module that implements a __getattr__ you can put into > your > > module if you want @property support. That way you can have both but you > > only need a little bit of code in module.c to check for __getattr__ and > call > > it when you'd otherwise raise AttributeError. > > Unfortunately I don't think this works. If there's a @property object > present in the module's instance dict, then __getattribute__ will > return it directly instead of calling __getattr__. > Hm, that's a good point. One would have to introduce some kind of convention where you can write properties with a leading _: @property def _foo(): return 42 and then a utility __getattr__ like this: def __getattr__(name): g = globals() name = '_' + name if name in g: return g[name]() raise AttributeError(...) > (I guess for full property emulation you'd also need to override > __setattr__ and __dir__, but I don't know how important that is.) > At that point maybe __class__ assignment is better. > We could consider letting modules overload __getattribute__ instead of > __getattr__, but I don't think this is viable either -- a key feature > of __getattr__ is that it doesn't add overhead to normal lookups. If > you implement deprecation warnings by overloading __getattribute__, > then it makes all your users slower, even the ones who never touch the > deprecated attributes. 
__getattr__ is much better than > __getattribute__ for this purpose. > Agreed. Alternatively we can have a recipe that implements @property support > using __class__ assignment and overriding > __getattribute__/__setattr__/__dir__, so instead of 'from > module_helper.property_emulation import __getattr__' it'd be 'from > module_helper import enable_property_emulation; > enable_property_emulation(__name__)'. Still has the slowdown problem > but it would work. > The emulation could do something less drastic than __class__ assignment -- it could look for globals that are properties, move them into some other dict (e.g. __properties__), and install a __getattr__ that looks things up in that dict and calls them. def __getattr__(name): if name in __properties__: return __properties__[name]() raise AttributeError(...) Still, proposals for sys.py notwithstanding, I'm worried that all of this is a solution looking for a problem. Yes, it's a cute hack. But is it art? -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From levkivskyi at gmail.com Wed Sep 13 18:01:21 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Thu, 14 Sep 2017 00:01:21 +0200 Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties) In-Reply-To: References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org> <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org> <0580a3c6-f262-acbb-93a3-87807ac48242@hastings.org> <38dd16c8-432d-c9d5-b267-ddfd88064098@hastings.org> Message-ID: @Guido > One would have to introduce some kind of convention > where you can write properties with a leading _ One doesn't even need the @property decorator in this case. For example: def __getattr__(name): g = globals() name = '_' + name if name in g: return g[name]() raise AttributeError(...) def _attr(): "do something" return actual_object One can even write a decorator that would change the name automatically (but this will require a call at the end of module to unbind original names). The space for play is infinite. The question is what exactly is needed. I would say that already basic customisation (like __getattr__) will be enough. IIUC, this will be actually a bit faster than setting __class__ since only "customized" attributes will be affected. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Sep 13 19:13:57 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Sep 2017 16:13:57 -0700 Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties) In-Reply-To: References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org> <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org> <0580a3c6-f262-acbb-93a3-87807ac48242@hastings.org> <38dd16c8-432d-c9d5-b267-ddfd88064098@hastings.org> Message-ID: On Wed, Sep 13, 2017 at 3:01 PM, Ivan Levkivskyi wrote: > @Guido > > One would have to introduce some kind of convention > > where you can write properties with a leading _ > > One doesn't even need the @property decorator in this case. > For example: > > def __getattr__(name): > g = globals() > name = '_' + name > if name in g: > return g[name]() > raise AttributeError(...) > > def _attr(): > "do something" > return actual_object > > One can even write a decorator that would change the name automatically > (but this will require a call at the end of module to unbind original > names). > The space for play is infinite. The question is what exactly is needed. 
> I would say that already basic customisation (like __getattr__) will be > enough. > IIUC, this will be actually a bit faster than setting __class__ since only > "customized" > attributes will be affected. > That last sentence is a key observation. Do we even know whether there are (non-toy) things that you can do *in principle* with __class__ assignment but which are too slow *in practice* to bother? And if yes, is __getattr__ fast enough? @property? IMO we're still looking for applications. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Sep 13 20:07:07 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 13 Sep 2017 17:07:07 -0700 Subject: [Python-Dev] parallelizing In-Reply-To: <299aa326-62f9-492e-c67f-a73b4766cf35@gmail.com> References: <299aa326-62f9-492e-c67f-a73b4766cf35@gmail.com> Message-ID: On Wed, Sep 13, 2017 at 12:11 PM, Matthieu Bec wrote: > Regarding your example, I think it gives the illusion to work because > sleep() is GIL aware under the hood. > > It'll work for anything -- it just may not buy you any performance. I don't know off the top of my head if file I/O captures the GIL -- for your example of file parsing. > I don't think it works for process() that mainly runs bytecode, because of > the GIL. > If you are trying to get around the GIL that that is a totally different question. But the easy way is to use multiprocessing instead: import time import random import multiprocessing def process(infile, outfile): "fake function to simulate a process that takes a random amount of time" time.sleep(random.random()) print("processing: {} to make {}".format(infile, outfile)) for i in range(10): multiprocessing.Process(target=process, args=("file%i.xml" % i, "file%i.xml" % i)).start() More overhead creating the processes, but no more GIL issues. Sorry if I wrongly thought that was a language level discussion. > > This list is for discussion of the development of the cPython interpreter. So this kind of discussion doesn't belong here unless/until it gets to the point of actually implementing something. If you have an idea as to how to improve Python, then python-ideas is the place for that discussion. But "there should be a way to run threads without the GIL" isn't a well-enough formed idea to get far there.... If you want to discuss further, let's take this offline. > Can't there be a way to capture that idiom and multi thread it in the language itself? > >> Example: >> >> loop: >> >> read an XML >> >> produce a JSON like >> > note about this -- code like this would be using all sorts of shared modules. The code in those modules is going to be touched by all the threads. There is no way the python interpreter can know which python objects are used by what how --- the GIL is there for good (and complex) reasons, not an easy task to avoid it. It's all using the same interpreter. Also -- it's not easy to know what code may work OK with the GIL. intensive computation is bad. But Python is a poor choice for that anyway. And code that does a lot in C -- numpy, text processing, etc. may not hold the GIL. And I/O So for your example of parsing XML and writing JSON -- it may well do a lot of work without holding the GIL. No way to know but to profile it. -CHB -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Wed Sep 13 21:44:31 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Wed, 13 Sep 2017 18:44:31 -0700 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) Message-ID: I've updated PEP 554 in response to feedback. (thanks all!) There are a few unresolved points (some of them added to the Open Questions section), but the current PEP has changed enough that I wanted to get it out there first. Notably changed: * the API relative to object passing has changed somewhat drastically (hopefully simpler and easier to understand), replacing "FIFO" with "channel" * added an examples section * added an open questions section * added a rejected ideas section * added more items to the deferred functionality section * the rationale section has moved down below the examples Please let me know what you think. I'm especially interested in feedback about the channels. Thanks! -eric ++++++++++++++++++++++++++++++++++++++++++++++++ PEP: 554 Title: Multiple Interpreters in the Stdlib Author: Eric Snow Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 2017-09-05 Python-Version: 3.7 Post-History: Abstract ======== CPython has supported subinterpreters, with increasing levels of support, since version 1.5. The feature has been available via the C-API. [c-api]_ Subinterpreters operate in `relative isolation from one another `_, which provides the basis for an `alternative concurrency model `_. This proposal introduces the stdlib ``interpreters`` module. The module will be `provisional `_. It exposes the basic functionality of subinterpreters already provided by the C-API. Proposal ======== The ``interpreters`` module will be added to the stdlib. It will provide a high-level interface to subinterpreters and wrap the low-level ``_interpreters`` module. The proposed API is inspired by the ``threading`` module. See the `Examples`_ section for concrete usage and use cases. API for interpreters -------------------- The module provides the following functions: ``list_all()``:: Return a list of all existing interpreters. ``get_current()``:: Return the currently running interpreter. ``create()``:: Initialize a new Python interpreter and return it. The interpreter will be created in the current thread and will remain idle until something is run in it. The interpreter may be used in any thread and will run in whichever thread calls ``interp.run()``. The module also provides the following class: ``Interpreter(id)``:: id: The interpreter's ID (read-only). is_running(): Return whether or not the interpreter is currently executing code. Calling this on the current interpreter will always return True. destroy(): Finalize and destroy the interpreter. This may not be called on an already running interpreter. Doing so results in a RuntimeError. run(source_str, /, **shared): Run the provided Python source code in the interpreter. Any keyword arguments are added to the interpreter's execution namespace. If any of the values are not supported for sharing between interpreters then RuntimeError gets raised. Currently only channels (see "create_channel()" below) are supported. This may not be called on an already running interpreter. Doing so results in a RuntimeError. 
A "run()" call is quite similar to any other function call. Once it completes, the code that called "run()" continues executing (in the original interpreter). Likewise, if there is any uncaught exception, it propagates into the code where "run()" was called. The big difference is that "run()" executes the code in an entirely different interpreter, with entirely separate state. The state of the current interpreter in the current OS thread is swapped out with the state of the target interpreter (the one that will execute the code). When the target finishes executing, the original interpreter gets swapped back in and its execution resumes. So calling "run()" will effectively cause the current Python thread to pause. Sometimes you won't want that pause, in which case you should make the "run()" call in another thread. To do so, add a function that calls "run()" and then run that function in a normal "threading.Thread". Note that interpreter's state is never reset, neither before "run()" executes the code nor after. Thus the interpreter state is preserved between calls to "run()". This includes "sys.modules", the "builtins" module, and the internal state of C extension modules. Also note that "run()" executes in the namespace of the "__main__" module, just like scripts, the REPL, "-m", and "-c". Just as the interpreter's state is not ever reset, the "__main__" module is never reset. You can imagine concatenating the code from each "run()" call into one long script. This is the same as how the REPL operates. Supported code: source text. API for sharing data -------------------- The mechanism for passing objects between interpreters is through channels. A channel is a simplex FIFO similar to a pipe. The main difference is that channels can be associated with zero or more interpreters on either end. Unlike queues, which are also many-to-many, channels have no buffer. ``create_channel()``:: Create a new channel and return (recv, send), the RecvChannel and SendChannel corresponding to the ends of the channel. The channel is not closed and destroyed (i.e. garbage-collected) until the number of associated interpreters returns to 0. An interpreter gets associated with a channel by calling its "send()" or "recv()" method. That association gets dropped by calling "close()" on the channel. Both ends of the channel are supported "shared" objects (i.e. may be safely shared by different interpreters. Thus they may be passed as keyword arguments to "Interpreter.run()". ``list_all_channels()``:: Return a list of all open (RecvChannel, SendChannel) pairs. ``RecvChannel(id)``:: The receiving end of a channel. An interpreter may use this to receive objects from another interpreter. At first only bytes will be supported. id: The channel's unique ID. interpreters: The list of associated interpreters (those that have called the "recv()" method). __next__(): Return the next object from the channel. If none have been sent then wait until the next send. recv(): Return the next object from the channel. If none have been sent then wait until the next send. If the channel has been closed then EOFError is raised. recv_nowait(default=None): Return the next object from the channel. If none have been sent then return the default. If the channel has been closed then EOFError is raised. close(): No longer associate the current interpreter with the channel (on the receiving end). This is a noop if the interpreter isn't already associated. 
      Once an interpreter is no longer associated with the channel,
      subsequent (or current) send() and recv() calls from that
      interpreter will raise EOFError. Once the number of associated
      interpreters on both ends drops to 0, the channel is actually
      marked as closed. The Python runtime will garbage collect all
      closed channels.

      Note that "close()" is automatically called for the current
      interpreter when the channel is no longer used in it. This
      operation is idempotent. Return True if the current interpreter
      was still associated with the receiving end of the channel and
      False otherwise.

``SendChannel(id)``::

   The sending end of a channel. An interpreter may use this to send
   objects to another interpreter. At first only bytes will be
   supported.

   id:

      The channel's unique ID.

   interpreters:

      The list of associated interpreters (those that have called
      the "send()" method).

   send(obj):

      Send the object to the receiving end of the channel. Wait until
      the object is received. If the channel does not support the
      object then TypeError is raised. Currently only bytes are
      supported. If the channel has been closed then EOFError is
      raised.

   send_nowait(obj):

      Send the object to the receiving end of the channel. If the
      object is received then return True. Otherwise return False.
      If the channel does not support the object then TypeError is
      raised. If the channel has been closed then EOFError is raised.

   close():

      No longer associate the current interpreter with the channel
      (on the sending end). This is a no-op if the interpreter isn't
      already associated. Once an interpreter is no longer associated
      with the channel, subsequent (or current) send() and recv()
      calls from that interpreter will raise EOFError. Once the
      number of associated interpreters on both ends drops to 0, the
      channel is actually marked as closed. The Python runtime will
      garbage collect all closed channels.

      Note that "close()" is automatically called for the current
      interpreter when the channel is no longer used in it. This
      operation is idempotent. Return True if the current interpreter
      was still associated with the sending end of the channel and
      False otherwise.
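To make the blocking semantics concrete, here is a minimal sketch
(illustrative only, mirroring the examples below): nothing has been
sent yet, so ``recv_nowait()`` returns its default, while ``send()``
only completes once a ``recv()`` on the other end takes the object::

   interp = interpreters.create()
   r, s = interpreters.create_channel()

   # Nothing has been sent, so the non-blocking variant returns
   # the provided default instead of waiting.
   assert r.recv_nowait(default=None) is None

   def run():
       interp.run("""if True:
           reader.recv()  # blocks until the send() below
           """, reader=r)

   t = threading.Thread(target=run)
   t.start()
   s.send(b'spam')  # waits until the recv() above receives it
   t.join()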
Examples
========

Run isolated code
-----------------

::

   interp = interpreters.create()
   print('before')
   interp.run('print("during")')
   print('after')

Run in a thread
---------------

::

   interp = interpreters.create()
   def run():
       interp.run('print("during")')
   t = threading.Thread(target=run)
   print('before')
   t.start()
   print('after')

Pre-populate an interpreter
---------------------------

::

   interp = interpreters.create()
   interp.run("""if True:
       import some_lib
       import an_expensive_module
       some_lib.set_up()
       """)
   wait_for_request()
   interp.run("""if True:
       some_lib.handle_request()
       """)

Handling an exception
---------------------

::

   interp = interpreters.create()
   try:
       interp.run("""if True:
           raise KeyError
           """)
   except KeyError:
       print("got the error from the subinterpreter")

Synchronize using a channel
---------------------------

::

   interp = interpreters.create()
   r, s = interpreters.create_channel()
   def run():
       interp.run("""if True:
           reader.recv()
           print("during")
           reader.close()
           """, reader=r)
   t = threading.Thread(target=run)
   print('before')
   t.start()
   print('after')
   s.send(b'')
   s.close()

Sharing a file descriptor
-------------------------

::

   interp = interpreters.create()
   r1, s1 = interpreters.create_channel()
   r2, s2 = interpreters.create_channel()
   def run():
       interp.run("""if True:
           import os
           fd = int.from_bytes(
                   reader.recv(), 'big')
           for line in os.fdopen(fd):
               print(line)
           writer.send(b'')
           """, reader=r1, writer=s2)
   t = threading.Thread(target=run)
   t.start()
   with open('spamspamspam') as infile:
       fd = infile.fileno().to_bytes(1, 'big')
       s1.send(fd)
       r2.recv()

Passing objects via pickle
--------------------------

::

   interp = interpreters.create()
   r, s = interpreters.create_channel()
   interp.run("""if True:
       import pickle
       """, reader=r)
   def run():
       interp.run("""if True:
           data = reader.recv()
           while data:
               obj = pickle.loads(data)
               do_something(obj)
               data = reader.recv()
           reader.close()
           """, reader=r)
   t = threading.Thread(target=run)
   t.start()
   for obj in input:
       data = pickle.dumps(obj)
       s.send(data)
   s.send(b'')

Rationale
=========

Running code in multiple interpreters provides a useful level of
isolation within the same process. This can be leveraged in a number
of ways. Furthermore, subinterpreters provide a well-defined framework
in which such isolation may be extended.

CPython has supported subinterpreters, with increasing levels of
support, since version 1.5. While the feature has the potential to be
a powerful tool, subinterpreters have suffered from neglect because
they are not available directly from Python. Exposing the existing
functionality in the stdlib will help reverse the situation.

This proposal is focused on enabling the fundamental capability of
multiple isolated interpreters in the same Python process. This is a
new area for Python so there is relative uncertainty about the best
tools to provide as companions to subinterpreters. Thus we minimize
the functionality we add in the proposal as much as possible.

Concerns
--------

* "subinterpreters are not worth the trouble"

Some have argued that subinterpreters do not add sufficient benefit
to justify making them an official part of Python. Adding features to
the language (or stdlib) has a cost in increasing the size of the
language. So it must pay for itself. In this case, subinterpreters
provide a novel concurrency model focused on isolated threads of
execution. Furthermore, they present an opportunity for changes in
CPython that will allow simultaneous use of multiple CPU cores
(currently prevented by the GIL).
Alternatives to subinterpreters include threading, async, and multiprocessing. Threading is limited by the GIL and async isn't the right solution for every problem (nor for every person). Multiprocessing is likewise valuable in some but not all situations. Direct IPC (rather than via the multiprocessing module) provides similar benefits but with the same caveat. Notably, subinterpreters are not intended as a replacement for any of the above. Certainly they overlap in some areas, but the benefits of subinterpreters include isolation and (potentially) performance. In particular, subinterpreters provide a direct route to an alternate concurrency model (e.g. CSP) which has found success elsewhere and will appeal to some Python users. That is the core value that the ``interpreters`` module will provide. * "stdlib support for subinterpreters adds extra burden on C extension authors" In the `Interpreter Isolation`_ section below we identify ways in which isolation in CPython's subinterpreters is incomplete. Most notable is extension modules that use C globals to store internal state. PEP 3121 and PEP 489 provide a solution for most of the problem, but one still remains. [petr-c-ext]_ Until that is resolved, C extension authors will face extra difficulty to support subinterpreters. Consequently, projects that publish extension modules may face an increased maintenance burden as their users start using subinterpreters, where their modules may break. This situation is limited to modules that use C globals (or use libraries that use C globals) to store internal state. Ultimately this comes down to a question of how often it will be a problem in practice: how many projects would be affected, how often their users will be affected, what the additional maintenance burden will be for projects, and what the overall benefit of subinterpreters is to offset those costs. The position of this PEP is that the actual extra maintenance burden will be small and well below the threshold at which subinterpreters are worth it. About Subinterpreters ===================== Shared data ----------- Subinterpreters are inherently isolated (with caveats explained below), in contrast to threads. This enables `a different concurrency model `_ than is currently readily available in Python. `Communicating Sequential Processes`_ (CSP) is the prime example. A key component of this approach to concurrency is message passing. So providing a message/object passing mechanism alongside ``Interpreter`` is a fundamental requirement. This proposal includes a basic mechanism upon which more complex machinery may be built. That basic mechanism draws inspiration from pipes, queues, and CSP's channels. [fifo]_ The key challenge here is that sharing objects between interpreters faces complexity due in part to CPython's current memory model. Furthermore, in this class of concurrency, the ideal is that objects only exist in one interpreter at a time. However, this is not practical for Python so we initially constrain supported objects to ``bytes``. There are a number of strategies we may pursue in the future to expand supported objects and object sharing strategies. Note that the complexity of object sharing increases as subinterpreters become more isolated, e.g. after GIL removal. So the mechanism for message passing needs to be carefully considered. Keeping the API minimal and initially restricting the supported types helps us avoid further exposing any underlying complexity to Python users. 
To make this work, the mutable shared state will be managed by the Python runtime, not by any of the interpreters. Initially we will support only one type of objects for shared state: the channels provided by ``create_channel()``. Channels, in turn, will carefully manage passing objects between interpreters. Interpreter Isolation --------------------- CPython's interpreters are intended to be strictly isolated from each other. Each interpreter has its own copy of all modules, classes, functions, and variables. The same applies to state in C, including in extension modules. The CPython C-API docs explain more. [caveats]_ However, there are ways in which interpreters share some state. First of all, some process-global state remains shared: * file descriptors * builtin types (e.g. dict, bytes) * singletons (e.g. None) * underlying static module data (e.g. functions) for builtin/extension/frozen modules There are no plans to change this. Second, some isolation is faulty due to bugs or implementations that did not take subinterpreters into account. This includes things like extension modules that rely on C globals. [cryptography]_ In these cases bugs should be opened (some are already): * readline module hook functions (http://bugs.python.org/issue4202) * memory leaks on re-init (http://bugs.python.org/issue21387) Finally, some potential isolation is missing due to the current design of CPython. Improvements are currently going on to address gaps in this area: * interpreters share the GIL * interpreters share memory management (e.g. allocators, gc) * GC is not run per-interpreter [global-gc]_ * at-exit handlers are not run per-interpreter [global-atexit]_ * extensions using the ``PyGILState_*`` API are incompatible [gilstate]_ Concurrency ----------- Concurrency is a challenging area of software development. Decades of research and practice have led to a wide variety of concurrency models, each with different goals. Most center on correctness and usability. One class of concurrency models focuses on isolated threads of execution that interoperate through some message passing scheme. A notable example is `Communicating Sequential Processes`_ (CSP), upon which Go's concurrency is based. The isolation inherent to subinterpreters makes them well-suited to this approach. Existing Usage -------------- Subinterpreters are not a widely used feature. In fact, the only documented case of wide-spread usage is `mod_wsgi `_. On the one hand, this case provides confidence that existing subinterpreter support is relatively stable. On the other hand, there isn't much of a sample size from which to judge the utility of the feature. Provisional Status ================== The new ``interpreters`` module will be added with "provisional" status (see PEP 411). This allows Python users to experiment with the feature and provide feedback while still allowing us to adjust to that feedback. The module will be provisional in Python 3.7 and we will make a decision before the 3.8 release whether to keep it provisional, graduate it, or remove it. Alternate Python Implementations ================================ TBD Open Questions ============== Leaking exceptions across interpreters -------------------------------------- As currently proposed, uncaught exceptions from ``run()`` propagate to the frame that called it. However, this means that exception objects are leaking across the inter-interpreter boundary. Likewise, the frames in the traceback potentially leak. 
While that might not be a problem currently, it would be a problem once
interpreters get better isolation relative to memory management (which
is necessary to stop sharing the GIL between interpreters). So the
semantics of how the exceptions propagate need to be resolved.

Initial support for buffers in channels
---------------------------------------

An alternative to support for bytes in channels is support for
read-only buffers (the PEP 3118 kind). Then ``recv()`` would return
a memoryview to expose the buffer in a zero-copy way. This is similar
to what ``multiprocessing.Connection`` supports. [mp-conn]_

Switching to such an approach would help resolve questions of how
passing bytes through channels will work once we isolate memory
management in interpreters.

Deferred Functionality
======================

In the interest of keeping this proposal minimal, the following
functionality has been left out for future consideration. Note that
this is not a judgement against any of said capability, but rather
a deferment. That said, each is arguably valid.

Interpreter.call()
------------------

It would be convenient to run existing functions in subinterpreters
directly. ``Interpreter.run()`` could be adjusted to support this or
a ``call()`` method could be added::

   Interpreter.call(f, *args, **kwargs)

This suffers from the same problem as sharing objects between
interpreters via channels. The minimal solution (running a source
string) is sufficient for us to get the feature out where it can be
explored.

timeout arg to send() and recv()
--------------------------------

Typically functions that can block also take a ``timeout`` argument.
We can add it later if needed.

get_main()
----------

CPython has a concept of a "main" interpreter. This is the initial
interpreter created during CPython's runtime initialization. It may
be useful to identify the main interpreter. For instance, the main
interpreter should not be destroyed. However, for the basic
functionality of a high-level API a ``get_main()`` function is not
necessary. Furthermore, there is no requirement that a Python
implementation have a concept of a main interpreter. So until there's
a clear need we'll leave ``get_main()`` out.

Interpreter.run_in_thread()
---------------------------

This method would make a ``run()`` call for you in a thread. Doing
this using only ``threading.Thread`` and ``run()`` is relatively
trivial so we've left it out.

Synchronization Primitives
--------------------------

The ``threading`` module provides a number of synchronization
primitives for coordinating concurrent operations. This is especially
necessary due to the shared-state nature of threading. In contrast,
subinterpreters do not share state. Data sharing is restricted to
channels, which do away with the need for explicit synchronization.
If any sort of opt-in shared state support is added to subinterpreters
in the future, that same effort can introduce synchronization
primitives to meet that need.

CSP Library
-----------

A ``csp`` module would not be a large step away from the functionality
provided by this PEP. However, adding such a module is outside the
minimalist goals of this proposal.

Syntactic Support
-----------------

The ``Go`` language provides a concurrency model based on CSP, so
it's similar to the concurrency model that subinterpreters support.
``Go`` provides syntactic support, as well as several builtin
concurrency primitives, to make concurrency a first-class feature.
Conceivably, similar syntactic (and builtin) support could be added
to Python using subinterpreters. However, that is *way* outside the
scope of this PEP!

Multiprocessing
---------------

The ``multiprocessing`` module could support subinterpreters in the
same way it supports threads and processes. In fact, the module's
maintainer, Davin Potts, has indicated this is a reasonable feature
request. However, it is outside the narrow scope of this PEP.

C-extension opt-in/opt-out
--------------------------

By using the ``PyModuleDef_Slot`` introduced by PEP 489, we could
easily add a mechanism by which C-extension modules could opt out of
support for subinterpreters. Then the import machinery, when operating
in a subinterpreter, would need to check the module for support. It
would raise an ImportError if unsupported.

Alternately we could support opting in to subinterpreter support.
However, that would probably exclude many more modules (unnecessarily)
than the opt-out approach. The scope of adding the ModuleDef slot and
fixing up the import machinery is non-trivial, but could be worth it.
It all depends on how many extension modules break under
subinterpreters. Given the relatively few cases we know of through
mod_wsgi, we can leave this for later.

Poisoning channels
------------------

CSP has the concept of poisoning a channel. Once a channel has been
poisoned, any ``send()`` or ``recv()`` call on it will raise a special
exception, effectively ending execution in the interpreter that tried
to use the poisoned channel.

This could be accomplished by adding a ``poison()`` method to both
ends of the channel. The ``close()`` method could work if it had a
``force`` option to force the channel closed. Regardless, these
semantics are relatively specialized and can wait.

Sending channels over channels
------------------------------

Some advanced usage of subinterpreters could take advantage of the
ability to send channels over channels, in addition to bytes. Given
that channels will already be multi-interpreter safe, supporting them
in ``RecvChannel.recv()`` wouldn't be a big change. However, this can
wait until the basic functionality has been ironed out.

Resetting __main__
------------------

As proposed, every call to ``Interpreter.run()`` will execute in the
namespace of the interpreter's existing ``__main__`` module. This
means that data persists there between ``run()`` calls. Sometimes
this isn't desirable and you want to execute in a fresh ``__main__``.
Also, you don't necessarily want to leak objects there that you
aren't using any more.

Solutions include:

* a ``create()`` arg to indicate resetting ``__main__`` after each
  ``run`` call
* an ``Interpreter.reset_main`` flag to support opting in or out
  after the fact
* an ``Interpreter.reset_main()`` method to opt in when desired

This isn't a critical feature initially. It can wait until later
if desirable.

Support passing ints in channels
--------------------------------

Passing ints around should be fine and ultimately is probably
desirable. However, we can get by with serializing them as bytes for
now. The goal is a minimal API for the sake of basic functionality
at first.

File descriptors and sockets in channels
----------------------------------------

Given that file descriptors and sockets are process-global resources,
support for passing them through channels is a reasonable idea. They
would be a good candidate for the first effort at expanding the types
that channels support. They aren't strictly necessary for the initial
API.
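In the meantime, both of the preceding two items can be approximated
by serializing to bytes on one side and reconstructing on the other,
much like the "Sharing a file descriptor" example above does for an
fd. A minimal sketch for ints, using the ``r, s`` channel-end naming
convention from the examples (the fixed width here is arbitrary)::

   WIDTH = 8  # bytes; arbitrary for this sketch

   # sending side
   s.send((12345).to_bytes(WIDTH, 'big'))

   # receiving side, in the other interpreter
   value = int.from_bytes(r.recv(), 'big')

(Negative values would additionally need ``signed=True`` on both
sides.)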
Rejected Ideas
==============

Explicit channel association
----------------------------

Interpreters are implicitly associated with channels upon ``recv()``
and ``send()`` calls. They are de-associated with ``close()`` calls.
The alternative would be explicit methods. It would be either
``add_channel()`` and ``remove_channel()`` methods on ``Interpreter``
objects or something similar on channel objects.

In practice, this level of management shouldn't be necessary for
users. So adding more explicit support would only add clutter to the
API.

Use pipes instead of channels
-----------------------------

A pipe would be a simplex FIFO between exactly two interpreters. For
most use cases this would be sufficient. It could potentially simplify
the implementation as well. However, it isn't a big step to supporting
a many-to-many simplex FIFO via channels. Also, with pipes the API
ends up being slightly more complicated, requiring naming the pipes.

Use queues instead of channels
------------------------------

The main difference between queues and channels is that queues support
buffering. This would complicate the blocking semantics of ``recv()``
and ``send()``. Also, queues can be built on top of channels.

"enumerate"
-----------

The ``list_all()`` function provides the list of all interpreters.
In the threading module, which partly inspired the proposed API, the
function is called ``enumerate()``. The name is different here to
avoid confusing Python users that are not already familiar with the
threading API. For them "enumerate" is rather unclear, whereas
"list_all" is clear.

References
==========

.. [c-api]
   https://docs.python.org/3/c-api/init.html#sub-interpreter-support

.. _Communicating Sequential Processes:

.. [CSP]
   https://en.wikipedia.org/wiki/Communicating_sequential_processes
   https://github.com/futurecore/python-csp

.. [fifo]
   https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Pipe
   https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Queue
   https://docs.python.org/3/library/queue.html#module-queue
   http://stackless.readthedocs.io/en/2.7-slp/library/stackless/channels.html
   https://golang.org/doc/effective_go.html#sharing
   http://www.jtolds.com/writing/2016/03/go-channels-are-bad-and-you-should-feel-bad/

.. [caveats]
   https://docs.python.org/3/c-api/init.html#bugs-and-caveats

.. [petr-c-ext]
   https://mail.python.org/pipermail/import-sig/2016-June/001062.html
   https://mail.python.org/pipermail/python-ideas/2016-April/039748.html

.. [cryptography]
   https://github.com/pyca/cryptography/issues/2299

.. [global-gc]
   http://bugs.python.org/issue24554

.. [gilstate]
   https://bugs.python.org/issue10915
   http://bugs.python.org/issue15751

.. [global-atexit]
   https://bugs.python.org/issue6531

.. [mp-conn]
   https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Connection

Copyright
=========

This document has been placed in the public domain.

From barry at python.org  Wed Sep 13 22:12:46 2017
From: barry at python.org (Barry Warsaw)
Date: Wed, 13 Sep 2017 19:12:46 -0700
Subject: [Python-Dev] PEP 553, v3
Message-ID: <5B417D78-C985-4B02-8D74-4ED1F6A04062@python.org>

Here's an update to PEP 553, which makes $PYTHONBREAKPOINT a first
class feature. I've also updated PR #3355 with the implementation to
match.
Cheers,
-Barry

PEP: 553
Title: Built-in breakpoint()
Author: Barry Warsaw
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2017-09-05
Python-Version: 3.7
Post-History: 2017-09-05, 2017-09-07, 2017-09-13

Abstract
========

This PEP proposes adding a new built-in function called
``breakpoint()`` which enters a Python debugger at the point of the
call. Additionally, two new names are added to the ``sys`` module to
make the debugger pluggable.

Rationale
=========

Python has long had a great debugger in its standard library called
``pdb``. Setting a break point is commonly written like this::

   foo()
   import pdb; pdb.set_trace()
   bar()

Thus after executing ``foo()`` and before executing ``bar()``, Python
will enter the debugger. However this idiom has several disadvantages.

* It's a lot to type (27 characters).

* It's easy to typo. The PEP author often mistypes this line, e.g.
  omitting the semicolon, or typing a dot instead of an underscore.

* It ties debugging directly to the choice of pdb. There might be
  other debugging options, say if you're using an IDE or some other
  development environment.

* Python linters (e.g. flake8 [linters]_) complain about this line
  because it contains two statements. Breaking the idiom up into two
  lines further complicates the use of the debugger.

These problems can be solved by modeling a solution based on prior art
in other languages, and utilizing a convention that already exists in
Python.

Proposal
========

The JavaScript language provides a ``debugger`` statement [java]_
which enters the debugger at the point where the statement appears.

This PEP proposes a new built-in function called ``breakpoint()``
which enters a Python debugger at the call site. Thus the example
above would be written like so::

   foo()
   breakpoint()
   bar()

Further, this PEP proposes two new name bindings for the ``sys``
module, called ``sys.breakpointhook()`` and ``sys.__breakpointhook__``.
By default, ``sys.breakpointhook()`` implements the actual importing
and entry into ``pdb.set_trace()``, and it can be set to a different
function to change the debugger that ``breakpoint()`` enters.

``sys.__breakpointhook__`` then stashes the default value of
``sys.breakpointhook()`` to make it easy to reset. This exactly models
the existing ``sys.displayhook()`` / ``sys.__displayhook__`` and
``sys.excepthook()`` / ``sys.__excepthook__`` [hooks]_.

The signature of the built-in is ``breakpoint(*args, **kws)``. The
positional and keyword arguments are passed straight through to
``sys.breakpointhook()`` and the signatures must match or a
``TypeError`` will be raised. The return from ``sys.breakpointhook()``
is passed back up to, and returned from ``breakpoint()``. Since
``sys.breakpointhook()`` by default calls ``pdb.set_trace()``, it
accepts no arguments by default.

Environment variable
====================

The default implementation of ``sys.breakpointhook()`` consults a new
environment variable called ``PYTHONBREAKPOINT``. This environment
variable can have various values:

* ``PYTHONBREAKPOINT=0`` disables debugging. Specifically, with this
  value ``sys.breakpointhook()`` returns ``None`` immediately.

* ``PYTHONBREAKPOINT= `` (i.e. the empty string). This is the same as
  not setting the environment variable at all, in which case
  ``pdb.set_trace()`` is run as usual.

* ``PYTHONBREAKPOINT=some.importable.callable``. In this case,
  ``sys.breakpointhook()`` imports the ``some.importable`` module and
  gets the ``callable`` object from the resulting module, which it
  then calls.
  The value may be a string with no dots, in which case it names a
  built-in callable, e.g. ``PYTHONBREAKPOINT=int``. (Guido has
  expressed the preference for normal Python dotted-paths, not
  setuptools-style entry point syntax [syntax]_.)

This environment variable allows external processes to control how
breakpoints are handled. Some use cases include:

* Completely disabling all accidental ``breakpoint()`` calls pushed
  to production. This could be accomplished by setting
  ``PYTHONBREAKPOINT=0`` in the execution environment. Another
  suggestion by reviewers of the PEP was to set
  ``PYTHONBREAKPOINT=sys.exit`` in this case.

* IDE integration with specialized debuggers for embedded execution.
  The IDE would run the program in its debugging environment with
  ``PYTHONBREAKPOINT`` set to its internal debugging hook.

``PYTHONBREAKPOINT`` is re-interpreted every time
``sys.breakpointhook()`` is reached. This allows a process to change
its value during the execution of a program and have ``breakpoint()``
respond to those changes. It is not considered a performance critical
section since entering a debugger by definition stops execution. (Of
note, the implementation fast-tracks the ``PYTHONBREAKPOINT=0`` case.)

Overriding ``sys.breakpointhook`` defeats the default consultation of
``PYTHONBREAKPOINT``. It is up to the overriding code to consult
``PYTHONBREAKPOINT`` if it wants.

If access to the ``PYTHONBREAKPOINT`` callable fails in any way (e.g.
the import fails, or the resulting module does not contain the
callable), a ``RuntimeWarning`` is issued, and no breakpoint function
is called.

Open issues
===========

Confirmation from other debugger vendors
----------------------------------------

We want to get confirmation from at least one alternative debugger
implementation (e.g. PyCharm) that the hooks provided in this PEP will
be useful to them.

Evaluation of $PYTHONBREAKPOINT
-------------------------------

There has been some mailing list discussion around this topic. The
basic behavior as described above does not appear to be controversial.
Guido has expressed a preference for ``$PYTHONBREAKPOINT`` consultation
happening in the default implementation of ``sys.breakpointhook``
[envar]_.

The one point of discussion relates to whether the value of
``$PYTHONBREAKPOINT`` should be loaded on interpreter start, and
whether its value should be cached the first time it's accessed.

It is the PEP author's opinion that the environment variable need only
be looked up at the time of use. It is also the author's opinion that
the value of the environment variable can be accessed each time
``sys.breakpointhook`` is run, to allow for maximum functional
flexibility.

Because this feature enters the debugger, any performance improvements
for caching will be negligible and do not outweigh the flexibility.
Further, because the special case of ``PYTHONBREAKPOINT=0`` is
fast-tracked, the no-op code path is quite fast, and should be in the
noise given the function calls of ``breakpoint()`` ->
``sys.breakpointhook()``.

Breakpoint bytecode
-------------------

Related, there has been an idea to add a bytecode that calls
``sys.breakpointhook()``. Whether built-in ``breakpoint()`` emits this
bytecode (or gets peephole optimized to the bytecode) is an open
issue. The bytecode is useful for debuggers that actively modify
bytecode streams to trampoline into their own debugger. Having a
"breakpoint" bytecode might allow them to avoid bytecode modification
in order to invoke this trampoline.
*NOTE*: It probably makes sense to split this idea into a separate PEP.

Call a fancier object by default
--------------------------------

Some folks want to be able to use other ``pdb`` interfaces such as
``pdb.pm()``. Although this is a less commonly used API, it could be
supported by binding ``sys.breakpointhook`` to an object that
implements ``__call__()``. Calling this object would call
``pdb.set_trace()``, but the object could expose other methods, such
as ``pdb.pm()``, making invocation of it as handy as
``breakpoint.pm()``.

Implementation
==============

A pull request exists with the proposed implementation [impl]_.

Rejected alternatives
=====================

A new keyword
-------------

Originally, the author considered a new keyword, or an extension to an
existing keyword such as ``break here``. This is rejected on several
fronts.

* A brand new keyword would require a ``__future__`` to enable it
  since almost any new keyword could conflict with existing code.
  This negates the ease with which you can enter the debugger.

* An extended keyword such as ``break here``, while more readable and
  not requiring a ``__future__``, would tie the keyword extension to
  this new feature, preventing more useful extensions such as those
  proposed in PEP 548.

* A new keyword would require a modified grammar and likely a new
  bytecode. Each of these makes the implementation more complex. A
  new built-in breaks no existing code (since any existing module
  global would just shadow the built-in) and is quite easy to
  implement.

sys.breakpoint()
----------------

Why not ``sys.breakpoint()``? Requiring an import to invoke the
debugger is explicitly rejected because ``sys`` is not imported in
every module. That just requires more typing and would lead to::

   import sys; sys.breakpoint()

which inherits several of the problems this PEP aims to solve.

Version History
===============

* 2017-09-13

  * The ``PYTHONBREAKPOINT`` environment variable is made a first
    class feature.

* 2017-09-07

  * ``debug()`` renamed to ``breakpoint()``
  * Signature changed to ``breakpoint(*args, **kws)`` which is passed
    straight through to ``sys.breakpointhook()``.

References
==========

.. [linters]
   http://flake8.readthedocs.io/en/latest/

.. [java]
   https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/debugger

.. [hooks]
   https://docs.python.org/3/library/sys.html#sys.displayhook

.. [syntax]
   http://setuptools.readthedocs.io/en/latest/setuptools.html?highlight=console#automatic-script-creation

.. [impl]
   https://github.com/python/cpython/pull/3355

.. [envar]
   https://mail.python.org/pipermail/python-dev/2017-September/149447.html

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL:

From tjreedy at udel.edu  Wed Sep 13 23:34:11 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 13 Sep 2017 23:34:11 -0400
Subject: [Python-Dev] PEP 553, v3
In-Reply-To: <5B417D78-C985-4B02-8D74-4ED1F6A04062@python.org>
References: <5B417D78-C985-4B02-8D74-4ED1F6A04062@python.org>
Message-ID:

On 9/13/2017 10:12 PM, Barry Warsaw wrote:
> Here's an update to PEP 553, which makes $PYTHONBREAKPOINT a first
> class feature. I've also updated PR #3355 with the implementation to
> match.

Looks pretty good to me.
Reading the PR eliminated my remaining uncertainties. -- Terry Jan Reedy From ncoghlan at gmail.com Wed Sep 13 23:56:55 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 14 Sep 2017 13:56:55 +1000 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: Message-ID: On 14 September 2017 at 11:44, Eric Snow wrote: > I've updated PEP 554 in response to feedback. (thanks all!) There > are a few unresolved points (some of them added to the Open Questions > section), but the current PEP has changed enough that I wanted to get > it out there first. > > Notably changed: > > * the API relative to object passing has changed somewhat drastically > (hopefully simpler and easier to understand), replacing "FIFO" with > "channel" > * added an examples section > * added an open questions section > * added a rejected ideas section > * added more items to the deferred functionality section > * the rationale section has moved down below the examples > > Please let me know what you think. I'm especially interested in > feedback about the channels. Thanks! I like the new pipe-like channels API more than the previous named FIFO approach :) > send(obj): > > Send the object to the receiving end of the channel. Wait until > the object is received. If the channel does not support the > object then TypeError is raised. Currently only bytes are > supported. If the channel has been closed then EOFError is > raised. I still expect any form of object sharing to hinder your per-interpreter GIL efforts, so restricting the initial implementation to memoryview-only seems more future-proof to me. > Pre-populate an interpreter > --------------------------- > > :: > > interp = interpreters.create() > interp.run("""if True: > import some_lib > import an_expensive_module > some_lib.set_up() > """) > wait_for_request() > interp.run("""if True: > some_lib.handle_request() > """) I find the "if True:"'s sprinkled through the examples distracting, so I'd prefer either: 1. Using textwrap.dedent; or 2. Assigning the code to a module level attribute :: interp = interpreters.create() setup_code = """\ import some_lib import an_expensive_module some_lib.set_up() """ interp.run(setup_code) wait_for_request() handler_code = """\ some_lib.handle_request() """ interp.run(handler_code) > Handling an exception > --------------------- > > :: > > interp = interpreters.create() > try: > interp.run("""if True: > raise KeyError > """) > except KeyError: > print("got the error from the subinterpreter") As with the message passing through channels, I think you'll really want to minimise any kind of implicit object sharing that may interfere with future efforts to make the GIL truly an *interpreter* lock, rather than the global process lock that it is currently. One possible way to approach that would be to make the low level run() API a more Go-style API rather than a Python-style one, and have it return a (result, err) 2-tuple. "err.raise()" would then translate the foreign interpreter's exception into a local interpreter exception, but the *traceback* for that exception would be entirely within the current interpreter. > About Subinterpreters > ===================== > > Shared data > ----------- > > Subinterpreters are inherently isolated (with caveats explained below), > in contrast to threads. This enables `a different concurrency model > `_ than is currently readily available in Python. > `Communicating Sequential Processes`_ (CSP) is the prime example. 
> > A key component of this approach to concurrency is message passing. So > providing a message/object passing mechanism alongside ``Interpreter`` > is a fundamental requirement. This proposal includes a basic mechanism > upon which more complex machinery may be built. That basic mechanism > draws inspiration from pipes, queues, and CSP's channels. [fifo]_ > > The key challenge here is that sharing objects between interpreters > faces complexity due in part to CPython's current memory model. > Furthermore, in this class of concurrency, the ideal is that objects > only exist in one interpreter at a time. However, this is not practical > for Python so we initially constrain supported objects to ``bytes``. > There are a number of strategies we may pursue in the future to expand > supported objects and object sharing strategies. > > Note that the complexity of object sharing increases as subinterpreters > become more isolated, e.g. after GIL removal. So the mechanism for > message passing needs to be carefully considered. Keeping the API > minimal and initially restricting the supported types helps us avoid > further exposing any underlying complexity to Python users. > > To make this work, the mutable shared state will be managed by the > Python runtime, not by any of the interpreters. Initially we will > support only one type of objects for shared state: the channels provided > by ``create_channel()``. Channels, in turn, will carefully manage > passing objects between interpreters. Interpreters themselves will also need to be shared objects, as: - they all have access to "interpreters.list_all()" - when we do "interpreters.create_interpreter()", the calling interpreter gets a reference to itself via "interpreters.get_current()" (These shared objects are what I suspect you may end up needing a process global read/write lock to manage, by the way - I think it would be great if you can figure out a way to avoid that, it's just not entirely clear to me what that might look like. I do think you're on the right track by prohibiting the destruction of an interpreter that's currently running, and the destruction of channels that are currently still associated with an interpreter) > Interpreter Isolation > --------------------- > This sections is a really nice addition :) > Existing Usage > -------------- > > Subinterpreters are not a widely used feature. In fact, the only > documented case of wide-spread usage is > `mod_wsgi `_. On the one > hand, this case provides confidence that existing subinterpreter support > is relatively stable. On the other hand, there isn't much of a sample > size from which to judge the utility of the feature. Nathaniel pointed out that JEP embeds CPython subinterpreters inside the JVM similar to the way that mod_wsgi embeds them inside Apache httpd: https://github.com/ninia/jep/wiki/How-Jep-Works > Open Questions > ============== > > Leaking exceptions across interpreters > -------------------------------------- > > As currently proposed, uncaught exceptions from ``run()`` propagate > to the frame that called it. However, this means that exception > objects are leaking across the inter-interpreter boundary. Likewise, > the frames in the traceback potentially leak. > > While that might not be a problem currently, it would be a problem once > interpreters get better isolation relative to memory management (which > is necessary to stop sharing the GIL between interpreters). So the > semantics of how the exceptions propagate needs to be resolved. 
As noted above, I think you *really* want to avoid leaking exceptions in the initial implementation. A non-exception-based error signaling mechanism would be one way to do that, similar to how the low-level subprocess APIs actually report the return code, which higher level APIs then turn into an exception. resp.raise_for_status() does something similar for HTTP responses in the requests API. > Initial support for buffers in channels > --------------------------------------- > > An alternative to support for bytes in channels in support for > read-only buffers (the PEP 3119 kind). Then ``recv()`` would return > a memoryview to expose the buffer in a zero-copy way. This is similar > to what ``multiprocessing.Connection`` supports. [mp-conn] > > Switching to such an approach would help resolve questions of how > passing bytes through channels will work once we isolate memory > management in interpreters. Exactly :) > Reseting __main__ > ----------------- > > As proposed, every call to ``Interpreter.run()`` will execute in the > namespace of the interpreter's existing ``__main__`` module. This means > that data persists there between ``run()`` calls. Sometimes this isn't > desireable and you want to execute in a fresh ``__main__``. Also, > you don't necessarily want to leak objects there that you aren't using > any more. > > Solutions include: > > * a ``create()`` arg to indicate resetting ``__main__`` after each > ``run`` call > * an ``Interpreter.reset_main`` flag to support opting in or out > after the fact > * an ``Interpreter.reset_main()`` method to opt in when desired > > This isn't a critical feature initially. It can wait until later > if desirable. I was going to note that you can already do this: interp.run("globals().clear()") However, that turns out to clear *too* much, since it also clobbers all the __dunder__ attributes that the interpreter needs in a code execution environment. Either way, if you added this, I think it would make more sense as an "importlib.util.reset_globals()" operation, rather than have it be something specific to subinterpreters. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From yselivanov.ml at gmail.com Thu Sep 14 01:07:33 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Thu, 14 Sep 2017 01:07:33 -0400 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: Message-ID: On Wed, Sep 13, 2017 at 11:56 PM, Nick Coghlan wrote: [..] >> send(obj): >> >> Send the object to the receiving end of the channel. Wait until >> the object is received. If the channel does not support the >> object then TypeError is raised. Currently only bytes are >> supported. If the channel has been closed then EOFError is >> raised. > > I still expect any form of object sharing to hinder your > per-interpreter GIL efforts, so restricting the initial implementation > to memoryview-only seems more future-proof to me. +1. Working with memoryviews is as convenient as with bytes. Yury From njs at pobox.com Thu Sep 14 01:27:19 2017 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 13 Sep 2017 22:27:19 -0700 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: Message-ID: On Sep 13, 2017 9:01 PM, "Nick Coghlan" wrote: On 14 September 2017 at 11:44, Eric Snow wrote: > send(obj): > > Send the object to the receiving end of the channel. Wait until > the object is received. If the channel does not support the > object then TypeError is raised. Currently only bytes are > supported. 
If the channel has been closed then EOFError is > raised. I still expect any form of object sharing to hinder your per-interpreter GIL efforts, so restricting the initial implementation to memoryview-only seems more future-proof to me. I don't get it. With bytes, you can either share objects or copy them and the user can't tell the difference, so you can change your mind later if you want. But memoryviews require some kind of cross-interpreter strong reference to keep the underlying buffer object alive. So if you want to minimize object sharing, surely bytes are more future-proof. > Handling an exception > --------------------- > > :: > > interp = interpreters.create() > try: > interp.run("""if True: > raise KeyError > """) > except KeyError: > print("got the error from the subinterpreter") As with the message passing through channels, I think you'll really want to minimise any kind of implicit object sharing that may interfere with future efforts to make the GIL truly an *interpreter* lock, rather than the global process lock that it is currently. One possible way to approach that would be to make the low level run() API a more Go-style API rather than a Python-style one, and have it return a (result, err) 2-tuple. "err.raise()" would then translate the foreign interpreter's exception into a local interpreter exception, but the *traceback* for that exception would be entirely within the current interpreter. It would also be reasonable to simply not return any value/exception from run() at all, or maybe just a bool for whether there was an unhandled exception. Any high level API is going to be injecting code on both sides of the interpreter boundary anyway, so it can do whatever exception and traceback translation it wants to. > Reseting __main__ > ----------------- > > As proposed, every call to ``Interpreter.run()`` will execute in the > namespace of the interpreter's existing ``__main__`` module. This means > that data persists there between ``run()`` calls. Sometimes this isn't > desireable and you want to execute in a fresh ``__main__``. Also, > you don't necessarily want to leak objects there that you aren't using > any more. > > Solutions include: > > * a ``create()`` arg to indicate resetting ``__main__`` after each > ``run`` call > * an ``Interpreter.reset_main`` flag to support opting in or out > after the fact > * an ``Interpreter.reset_main()`` method to opt in when desired > > This isn't a critical feature initially. It can wait until later > if desirable. I was going to note that you can already do this: interp.run("globals().clear()") However, that turns out to clear *too* much, since it also clobbers all the __dunder__ attributes that the interpreter needs in a code execution environment. Either way, if you added this, I think it would make more sense as an "importlib.util.reset_globals()" operation, rather than have it be something specific to subinterpreters. This is another point where the API could reasonably say that if you want clean namespaces then you should do that yourself (e.g. by setting up your own globals dict and using it to execute any post-bootstrap code). -n -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From python-dev at mgmiller.net  Thu Sep 14 12:56:34 2017
From: python-dev at mgmiller.net (Mike Miller)
Date: Thu, 14 Sep 2017 09:56:34 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To:
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com>
 <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com>
 <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com>
 <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com>
 <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net>
 <-4829494977384524515@unknownmsgid>
Message-ID:

On 2017-09-12 19:09, Nick Coghlan wrote:
> On 13 September 2017 at 02:01, Chris Barker - NOAA Federal
> wrote:
>> This really does match well with the record concept in databases, and most
>> people are familiar with that.
>
> No, most people aren't familiar with that - they only become familiar
> with it *after* they've learned to program and learned what a database
> is.

Pretty sure he was talking about programmers, and they are introduced
to the concept early. Structs, objects with fields, random access
files, databases, etc. Lay-folks are familiar with "keeping records"
as you mention, but they are not the primary customer it seems.

Record is the most common name for this ubiquitous concept.

> whether its referring to the noun (wreck-ord) or the verb (ree-cord).

This can be grasped from context quickly, and due to the mentioned
ubiquity it is not likely to be a problem in the real world. "Am I
going to ree-cord this class?"

>> Also, considering their uses, it might make sense to put them in the
>> collections module.
>
> Data classes are things you're likely to put *in* a collection, rather
> than really being collections themselves (they're only collections in
> the same sense that all Python classes are collections of attributes,
> and that's not the way the collections module uses the term).

Yes, a collection of attributes, not significantly different from the
namedtuple (that began this thread) or the various dictionaries
implemented there already. The criteria don't appear to be very
strict; should they be?

(Also, they could be put into a submodule and imported into it to
maintain modularity. Where it lands though isn't so important, just
that collections is relatively likely to be imported already on
medium-sized projects, and I think the definition fits: collections ==
"bags of stuff".)

Cheers,
-Mike

From python-dev at mgmiller.net  Thu Sep 14 13:24:52 2017
From: python-dev at mgmiller.net (Mike Miller)
Date: Thu, 14 Sep 2017 10:24:52 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To:
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com>
 <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com>
 <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com>
 <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com>
 <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net>
 <-4829494977384524515@unknownmsgid>
Message-ID:

On 2017-09-12 21:05, Guido van Rossum wrote:
> It's ironic that some people dislike "data classes" because these are regular
> classes, not just for data, while others are proposing alternative names that
> emphasize the data container aspect. So "data classes" splits the difference, by
> referring to both data and classes.

True that these data-classes will be a superset of a traditional
record. But, we already have objects and inheritance for those use
cases. The data-class is meant to be used primarily like a record, so
why not name it that way? Almost everything is extensible in Python;
that shouldn't prevent focused names, should it?
> Let's bikeshed about something else. An elegant name can make the difference between another obscure module thrown in the stdlib to be never seen again and one that gets used every day. Which is more intuitive? from collections import record from dataclass import dataclass Would the language be as nice if "object" was named an "instanceclass?" Or perhaps the "requests" module could have been named "httpcall." Much of the reluctance to use the attrs module is about its weird naming. Due to the fact that this is a simple, potentially ubiquitous enhancement an elegant name is important. "For humans," or something, haha. -Mike From stefan at bytereef.org Thu Sep 14 13:45:28 2017 From: stefan at bytereef.org (Stefan Krah) Date: Thu, 14 Sep 2017 19:45:28 +0200 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com> <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com> <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net> <-4829494977384524515@unknownmsgid> Message-ID: <20170914174527.GA4858@bytereef.org> On Thu, Sep 14, 2017 at 10:24:52AM -0700, Mike Miller wrote: > An elegant name can make the difference between another obscure > module thrown in the stdlib to be never seen again and one that gets > used every day. Which is more intuitive? > > from collections import record I'd expect something like a C struct or an ML record. > from dataclass import dataclass This is more intuitive, since the PEP example also has attached methods like total_cost(). I don't think this is really common for records. Stefan Krah From python-dev at mgmiller.net Thu Sep 14 14:06:15 2017 From: python-dev at mgmiller.net (Mike Miller) Date: Thu, 14 Sep 2017 11:06:15 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <20170914174527.GA4858@bytereef.org> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com> <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com> <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net> <-4829494977384524515@unknownmsgid> <20170914174527.GA4858@bytereef.org> Message-ID: <002b98c9-e753-51f1-bec8-c174d71f954d@mgmiller.net> On 2017-09-14 10:45, Stefan Krah wrote: > I'd expect something like a C struct or an ML record. Struct is taken, and your second example is record. > from dataclass import dataclass > > This is more intuitive, since the PEP example also has attached methods > like total_cost(). I don't think this is really common for records. Every class can be extended, does that mean they can't be given appropriate names? (Not to mention dataclass is hardly intuitive for something that can have methods added.) -Mike From barry at python.org Thu Sep 14 14:07:37 2017 From: barry at python.org (Barry Warsaw) Date: Thu, 14 Sep 2017 11:07:37 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com> <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com> <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net> <-4829494977384524515@unknownmsgid> Message-ID: <70E96D68-7EDF-41FF-AAF2-A6E0DD858CBB@python.org> On Sep 14, 2017, at 09:56, Mike Miller wrote: > Record is the most common name for this ubiquitous concept. 
Mind if we call them Eric Classes to keep it clear?  Because if its name
is not Eric Classes, it will cause a little confusion.

g'day-bruce-ly y'rs,
-Barry

From stefan at bytereef.org  Thu Sep 14 14:12:48 2017
From: stefan at bytereef.org (Stefan Krah)
Date: Thu, 14 Sep 2017 20:12:48 +0200
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <002b98c9-e753-51f1-bec8-c174d71f954d@mgmiller.net>
References: <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com>
 <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com>
 <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net>
 <-4829494977384524515@unknownmsgid>
 <20170914174527.GA4858@bytereef.org>
 <002b98c9-e753-51f1-bec8-c174d71f954d@mgmiller.net>
Message-ID: <20170914181248.GA5142@bytereef.org>

On Thu, Sep 14, 2017 at 11:06:15AM -0700, Mike Miller wrote:
> On 2017-09-14 10:45, Stefan Krah wrote:
> >I'd expect something like a C struct or an ML record.
>
> Struct is taken, and your second example is record.

*If* the name were collections.record, I'd expect collections.record to
be something like a C struct or an ML record.  I'm NOT proposing
"record".

> > from dataclass import dataclass
> >
> >This is more intuitive, since the PEP example also has attached methods
> >like total_cost(). I don't think this is really common for records.
>
> Every class can be extended; does that mean they can't be given
> appropriate names?

A class is not a record.  This brief conversation has already convinced
me that "record" is a bad name for the proposed construct.

Stefan Krah

From levkivskyi at gmail.com  Thu Sep 14 15:08:34 2017
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Thu, 14 Sep 2017 21:08:34 +0200
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To: 
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>
 <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org>
 <0580a3c6-f262-acbb-93a3-87807ac48242@hastings.org>
 <38dd16c8-432d-c9d5-b267-ddfd88064098@hastings.org>
Message-ID: 

On 14 September 2017 at 01:13, Guido van Rossum wrote:
>
> That last sentence is a key observation. Do we even know whether there are
> (non-toy) things that you can do *in principle* with __class__ assignment
> but which are too slow *in practice* to bother? And if yes, is __getattr__
> fast enough? @property?
>

I myself have never implemented deprecation warnings management nor lazy
loading, so it is hard to say if __class__ assignment is fast enough.  For
me it is more a combination of three factors:

* modest performance improvement
* some people might find __getattr__ clearer than __class__ assignment
* this would be consistent with how stubs work

> IMO we're still looking for applications.
>

How about this

    def allow_forward_references(*allowed):
        caller_globals = sys._getframe().__globals__
        def typing_getattr(name):
            if name in allowed:
                return name
            raise AttributeError(...)
        caller_globals.__getattr__ = typing_getattr

    from typing_extensions import allow_forward_references
    allow_forward_references('Vertex', 'Edge')

    T = TypeVar('T', bound=Edge)

    class Vertex(List[Edge]):
        def copy(self: T) -> T:
            ...

    class Edge:
        ends: Tuple[Vertex, Vertex]
        ...

Look mum, no quotes! :-)

--
Ivan
From levkivskyi at gmail.com  Thu Sep 14 15:11:05 2017
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Thu, 14 Sep 2017 21:11:05 +0200
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To: 
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>
 <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org>
 <0580a3c6-f262-acbb-93a3-87807ac48242@hastings.org>
 <38dd16c8-432d-c9d5-b267-ddfd88064098@hastings.org>
Message-ID: 

(Sorry for the obvious mistakes in the example in my previous e-mail.)

On 14 September 2017 at 21:08, Ivan Levkivskyi wrote:
> On 14 September 2017 at 01:13, Guido van Rossum wrote:
>
>> That last sentence is a key observation. Do we even know whether there
>> are (non-toy) things that you can do *in principle* with __class__
>> assignment but which are too slow *in practice* to bother? And if yes, is
>> __getattr__ fast enough? @property?
>>
>
> I myself have never implemented deprecation warnings management nor lazy
> loading, so it is hard to say if __class__ assignment is fast enough. For
> me it is more a combination of three factors:
>
> * modest performance improvement
> * some people might find __getattr__ clearer than __class__ assignment
> * this would be consistent with how stubs work
>
>> IMO we're still looking for applications.
>>
>
> How about this
>
>     def allow_forward_references(*allowed):
>         caller_globals = sys._getframe().__globals__
>         def typing_getattr(name):
>             if name in allowed:
>                 return name
>             raise AttributeError(...)
>         caller_globals.__getattr__ = typing_getattr
>
>     from typing_extensions import allow_forward_references
>     allow_forward_references('Vertex', 'Edge')
>
>     T = TypeVar('T', bound=Edge)
>
>     class Vertex(List[Edge]):
>         def copy(self: T) -> T:
>             ...
>
>     class Edge:
>         ends: Tuple[Vertex, Vertex]
>         ...
>
> Look mum, no quotes! :-)
>
> --
> Ivan
>

From ethan at stoneleaf.us  Thu Sep 14 16:07:55 2017
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 14 Sep 2017 13:07:55 -0700
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To: 
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>
 <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org>
 <0580a3c6-f262-acbb-93a3-87807ac48242@hastings.org>
 <38dd16c8-432d-c9d5-b267-ddfd88064098@hastings.org>
Message-ID: <59BAE19B.9090800@stoneleaf.us>

On 09/14/2017 12:08 PM, Ivan Levkivskyi wrote:
> On 14 September 2017 at 01:13, Guido van Rossum wrote:
>> That last sentence is a key observation. Do we even know whether there are (non-toy) things that you can do *in
>> principle* with __class__ assignment but which are too slow *in practice* to bother? And if yes, is __getattr__ fast
>> enough? @property?
>
> I myself have never implemented deprecation warnings management nor lazy loading,
> so it is hard to say if __class__ assignment is fast enough.  For me it is more a combination
> of three factors:
>
> * modest performance improvement
> * some people might find __getattr__ clearer than __class__ assignment
> * this would be consistent with how stubs work
>
> IMO we're still looking for applications.
>
>
> How about this
>
>     def allow_forward_references(*allowed):
>         caller_globals = sys._getframe().__globals__
>         def typing_getattr(name):
>             if name in allowed:
>                 return name
>             raise AttributeError(...)
>         caller_globals.__getattr__ = typing_getattr
>
>     from typing_extensions import allow_forward_references
>     allow_forward_references('Vertex', 'Edge')
>
>     T = TypeVar('T', bound=Edge)
>
>     class Vertex(List[Edge]):
>         def copy(self: T) -> T:
>             ...
>
>     class Edge:
>         ends: Tuple[Vertex, Vertex]
>         ...
>
> Look mum, no quotes! :-)

For comparison's sake, what would the above look like using __class__
assignment?  And what is the performance difference?

--
~Ethan~

From levkivskyi at gmail.com  Thu Sep 14 16:21:38 2017
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Thu, 14 Sep 2017 22:21:38 +0200
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To: <59BAE19B.9090800@stoneleaf.us>
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>
 <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org>
 <0580a3c6-f262-acbb-93a3-87807ac48242@hastings.org>
 <38dd16c8-432d-c9d5-b267-ddfd88064098@hastings.org>
 <59BAE19B.9090800@stoneleaf.us>
Message-ID: 

On 14 September 2017 at 22:07, Ethan Furman wrote:
>
> For comparison's sake, what would the above look like using __class__
> assignment?  And what is the performance difference?
>

Actually I tried but I can't implement this without module __getattr__
so that one can just write:

    from typing_extensions import allow_forward_references
    allow_forward_references('Vertex', 'Edge')

Maybe I am missing something, but either it is impossible in principle or
highly non-trivial.

--
Ivan

From levkivskyi at gmail.com  Thu Sep 14 16:40:38 2017
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Thu, 14 Sep 2017 22:40:38 +0200
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To: 
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>
 <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org>
 <0580a3c6-f262-acbb-93a3-87807ac48242@hastings.org>
 <38dd16c8-432d-c9d5-b267-ddfd88064098@hastings.org>
 <59BAE19B.9090800@stoneleaf.us>
Message-ID: 

On 14 September 2017 at 22:21, Ivan Levkivskyi wrote:
> On 14 September 2017 at 22:07, Ethan Furman wrote:
>
>> For comparison's sake, what would the above look like using __class__
>> assignment?  And what is the performance difference?
>>
>
> Actually I tried but I can't implement this without module __getattr__
> so that one can just write:
>
>     from typing_extensions import allow_forward_references
>     allow_forward_references('Vertex', 'Edge')
>
> Maybe I am missing something, but either it is impossible in principle or
> highly non-trivial.
>

Actually my version does not work either on my branch :-(  But maybe this
is a problem with my implementation.

--
Ivan

From gvanrossum at gmail.com  Thu Sep 14 16:52:54 2017
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu, 14 Sep 2017 13:52:54 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <20170914181248.GA5142@bytereef.org>
References: <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com>
 <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com>
 <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net>
 <-4829494977384524515@unknownmsgid>
 <20170914174527.GA4858@bytereef.org>
 <002b98c9-e753-51f1-bec8-c174d71f954d@mgmiller.net>
 <20170914181248.GA5142@bytereef.org>
Message-ID: 

Let's all please take a time out from the naming discussion.
On Sep 14, 2017 11:15 AM, "Stefan Krah" wrote:

> On Thu, Sep 14, 2017 at 11:06:15AM -0700, Mike Miller wrote:
> > On 2017-09-14 10:45, Stefan Krah wrote:
> > >I'd expect something like a C struct or an ML record.
> >
> > Struct is taken, and your second example is record.
>
> *If* the name were collections.record, I'd expect collections.record to
> be something like a C struct or an ML record.  I'm NOT proposing
> "record".
>
> > > from dataclass import dataclass
> > >
> > >This is more intuitive, since the PEP example also has attached methods
> > >like total_cost(). I don't think this is really common for records.
> >
> > Every class can be extended; does that mean they can't be given
> > appropriate names?
>
> A class is not a record.  This brief conversation has already convinced
> me that "record" is a bad name for the proposed construct.
>
>
> Stefan Krah
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org
>

From levkivskyi at gmail.com  Thu Sep 14 17:02:00 2017
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Thu, 14 Sep 2017 23:02:00 +0200
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To: <59BAE19B.9090800@stoneleaf.us>
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>
 <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org>
 <0580a3c6-f262-acbb-93a3-87807ac48242@hastings.org>
 <38dd16c8-432d-c9d5-b267-ddfd88064098@hastings.org>
 <59BAE19B.9090800@stoneleaf.us>
Message-ID: 

On 14 September 2017 at 22:07, Ethan Furman wrote:
> For comparison's sake, what would the above look like using __class__
> assignment?  And what is the performance difference?
>

FWIW I found a different solution:

    # file mod.py

    from typing_extensions import allow_forward_references
    allow_forward_references()
    from mod import Vertex, Edge  # the import is from this same module.

It works both with __class__ assignment and with __getattr__

--
Ivan

From levkivskyi at gmail.com  Thu Sep 14 17:04:55 2017
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Thu, 14 Sep 2017 23:04:55 +0200
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To: 
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org>
 <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org>
 <0580a3c6-f262-acbb-93a3-87807ac48242@hastings.org>
 <38dd16c8-432d-c9d5-b267-ddfd88064098@hastings.org>
 <59BAE19B.9090800@stoneleaf.us>
Message-ID: 

On 14 September 2017 at 23:02, Ivan Levkivskyi wrote:
> On 14 September 2017 at 22:07, Ethan Furman wrote:
>
>> For comparison's sake, what would the above look like using __class__
>> assignment?  And what is the performance difference?
>>
>
> FWIW I found a different solution:
>
>     # file mod.py
>
>     from typing_extensions import allow_forward_references
>     allow_forward_references()
>     from mod import Vertex, Edge  # the import is from this same module.
>
> It works both with __class__ assignment and with __getattr__
>
> --
> Ivan
>

Anyway, I don't think we should take this seriously; the way forward is
PEP 563.  We should have a clear separation between runtime context and
type context.  In the latter, forward references are OK, but in the
former, they are quite weird.
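For completeness, a self-contained sketch of the __getattr__ variant of
the trick (a sketch only, assuming PEP 562-style module-level
__getattr__ semantics; the frame/f_globals machinery of
allow_forward_references() is folded directly into the module here):

    # mod.py -- illustrative sketch only
    from typing import List, Tuple, TypeVar

    _FORWARD = ('Vertex', 'Edge')

    def __getattr__(name):
        # Hand allowed-but-not-yet-defined names back as strings;
        # typing already treats strings as forward references.
        if name in _FORWARD:
            return name
        raise AttributeError(name)

    from mod import Vertex, Edge  # self-import goes through __getattr__

    T = TypeVar('T', bound=Edge)  # bound is the string 'Edge' here

    class Vertex(List[Edge]):     # i.e. List['Edge']
        def copy(self: T) -> T:
            ...

    class Edge:
        ends: Tuple[Vertex, Vertex]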
--
Ivan

From ncoghlan at gmail.com  Thu Sep 14 20:44:40 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 15 Sep 2017 10:44:40 +1000
Subject: [Python-Dev] PEP 554 v3 (new interpreters module)
In-Reply-To: 
References: 
Message-ID: 

On 14 September 2017 at 15:27, Nathaniel Smith wrote:
> On Sep 13, 2017 9:01 PM, "Nick Coghlan" wrote:
>
> On 14 September 2017 at 11:44, Eric Snow wrote:
>> send(obj):
>>
>>     Send the object to the receiving end of the channel.  Wait until
>>     the object is received.  If the channel does not support the
>>     object then TypeError is raised.  Currently only bytes are
>>     supported.  If the channel has been closed then EOFError is
>>     raised.
>
> I still expect any form of object sharing to hinder your
> per-interpreter GIL efforts, so restricting the initial implementation
> to memoryview-only seems more future-proof to me.
>
>
> I don't get it. With bytes, you can either share objects or copy them and
> the user can't tell the difference, so you can change your mind later if you
> want.
> But memoryviews require some kind of cross-interpreter strong
> reference to keep the underlying buffer object alive. So if you want to
> minimize object sharing, surely bytes are more future-proof.

Not really, because the only way to ensure object separation (i.e. no
refcounted objects accessible from multiple interpreters at once) with
a bytes-based API would be to either:

1. Always copy (eliminating most of the low-overhead communication
benefits that subinterpreters may offer over multiple processes)
2. Make the bytes implementation more complicated by allowing multiple
bytes objects to share the same underlying storage while presenting as
distinct objects in different interpreters
3. Make the output on the receiving side not actually a bytes object,
but instead a view onto memory owned by another object in a different
interpreter (a "memory view", one might say)

And yes, using memory views for this does mean defining either a
subclass or a mediating object that not only keeps the originating
object alive until the receiving memoryview is closed, but also
retains a reference to the originating interpreter so that it can
switch to it when it needs to manipulate the source object's refcount
or call one of the buffer methods.

Yury and I are fine with that, since it means that either the sender
*or* the receiver can decide to copy the data (e.g. by calling
bytes(obj) before sending, or bytes(view) after receiving), and in the
meantime, the object holding the cross-interpreter view knows that it
needs to switch interpreters (and hence acquire the sending
interpreter's GIL) before doing anything with the source object.

The reason we're OK with this is that it means that only reading a new
message from a channel (i.e. creating a cross-interpreter view) or
discarding a previously read message (i.e. closing a cross-interpreter
view) will be synchronisation points where the receiving interpreter
necessarily needs to acquire the sending interpreter's GIL.

By contrast, if we allow an actual bytes object to be shared, then
either every INCREF or DECREF on that bytes object becomes a
synchronisation point, or else we end up needing some kind of
secondary per-interpreter refcount where the interpreter doesn't drop
its shared reference to the original object in its source interpreter
until the internal refcount in the borrowing interpreter drops to
zero.
>> Handling an exception >> --------------------- > It would also be reasonable to simply not return any value/exception from > run() at all, or maybe just a bool for whether there was an unhandled > exception. Any high level API is going to be injecting code on both sides of > the interpreter boundary anyway, so it can do whatever exception and > traceback translation it wants to. So any more detailed response would *have* to come back as a channel message? That sounds like a reasonable option to me, too, especially since module level code doesn't have a return value as such - you can really only say "it raised an exception (and this was the exception it raised)" or "it reached the end of the code without raising an exception". Given that, I think subprocess.run() (with check=False) is the right API precedent here: https://docs.python.org/3/library/subprocess.html#subprocess.run That always returns subprocess.CompletedProcess, and then you can call "cp.check_returncode()" to get it to raise subprocess.CalledProcessError for non-zero return codes. For interpreter.run(), we could keep the initial RunResult *really* simple and only report back: * source: the source code passed to run() * shared: the keyword args passed to run() (name chosen to match functools.partial) * completed: completed execution without raising an exception? (True if yes, False otherwise) Whether or not to report more details for a raised exception, and provide some mechanism to reraise it in the calling interpreter could then be deferred until later. The subprocess.run() comparison does make me wonder whether this might be a more future-proof signature for Interpreter.run() though: def run(source_str, /, *, channels=None): ... That way channels can be a namespace *specifically* for passing in channels, and can be reported as such on RunResult. If we decide to allow arbitrary shared objects in the future, or add flag options like "reraise=True" to reraise exceptions from the subinterpreter in the current interpreter, we'd have that ability, rather than having the entire potential keyword namespace taken up for passing shared objects. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Sep 14 20:55:49 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 15 Sep 2017 10:55:49 +1000 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com> <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com> <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net> <-4829494977384524515@unknownmsgid> Message-ID: On 15 September 2017 at 02:56, Mike Miller wrote: > > On 2017-09-12 19:09, Nick Coghlan wrote: >> >> On 13 September 2017 at 02:01, Chris Barker - NOAA Federal >> wrote: >>> >>> This really does match well with the record concept in databases, and >>> most >>> people are familiar with that. >> >> >> No, most people aren't familiar with that - they only become familiar >> with it *after* they've learned to program and learned what a database >> is. > > > Pretty sure he was talking about programmers, and they are introduced to the > concept early. Structs, objects with fields, random access files, > databases, etc. Lay-folks are familiar with "keeping records" as you > mention, but they are not the primary customer it seems. 
Python is an incredibly common first programming language, so we need to keep folks with *zero* knowledge of programming jargon firmly in mind when designing new features. That isn't always the most important consideration, but it's always *a* consideration. And, as Stefan notes in his reply, we also need to keep *misleading* inferences in mind when we consider repurposing existing jargon for a new use case - what seems like an obviously intuitive connection based on our own individual experiences with a term may turn out to be extremely counterintuitive for someone with a different experience of the same term. In such cases, it can make sense to look for new *semantically neutral* terminology as the official glossary entry and API naming scheme, and rely on documentation to indicate that this is a realisation of a feature that goes by other names in other contexts. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From njs at pobox.com Thu Sep 14 22:04:16 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 14 Sep 2017 19:04:16 -0700 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: Message-ID: On Thu, Sep 14, 2017 at 5:44 PM, Nick Coghlan wrote: > On 14 September 2017 at 15:27, Nathaniel Smith wrote: >> I don't get it. With bytes, you can either share objects or copy them and >> the user can't tell the difference, so you can change your mind later if you >> want. >> But memoryviews require some kind of cross-interpreter strong >> reference to keep the underlying buffer object alive. So if you want to >> minimize object sharing, surely bytes are more future-proof. > > Not really, because the only way to ensure object separation (i.e no > refcounted objects accessible from multiple interpreters at once) with > a bytes-based API would be to either: > > 1. Always copy (eliminating most of the low overhead communications > benefits that subinterpreters may offer over multiple processes) > 2. Make the bytes implementation more complicated by allowing multiple > bytes objects to share the same underlying storage while presenting as > distinct objects in different interpreters > 3. Make the output on the receiving side not actually a bytes object, > but instead a view onto memory owned by another object in a different > interpreter (a "memory view", one might say) > > And yes, using memory views for this does mean defining either a > subclass or a mediating object that not only keeps the originating > object alive until the receiving memoryview is closed, but also > retains a reference to the originating interpreter so that it can > switch to it when it needs to manipulate the source object's refcount > or call one of the buffer methods. > > Yury and I are fine with that, since it means that either the sender > *or* the receiver can decide to copy the data (e.g. by calling > bytes(obj) before sending, or bytes(view) after receiving), and in the > meantime, the object holding the cross-interpreter view knows that it > needs to switch interpreters (and hence acquire the sending > interpreter's GIL) before doing anything with the source object. > > The reason we're OK with this is that it means that only reading a new > message from a channel (i.e creating a cross-interpreter view) or > discarding a previously read message (i.e. closing a cross-interpreter > view) will be synchronisation points where the receiving interpreter > necessarily needs to acquire the sending interpreter's GIL. 
> > By contrast, if we allow an actual bytes object to be shared, then > either every INCREF or DECREF on that bytes object becomes a > synchronisation point, or else we end up needing some kind of > secondary per-interpreter refcount where the interpreter doesn't drop > its shared reference to the original object in its source interpreter > until the internal refcount in the borrowing interpreter drops to > zero. Ah, that makes more sense. I am nervous that allowing arbitrary memoryviews gives a *little* more power than we need or want. I like that the current API can reasonably be emulated using subprocesses -- it opens up the door for backports, compatibility support on language implementations that don't support subinterpreters, direct benchmark comparisons between the two implementation strategies, etc. But if we allow arbitrary memoryviews, then this requires that you can take (a) an arbitrary object, not specified ahead of time, and (b) provide two read-write views on it in separate interpreters such that modifications made in one are immediately visible in the other. Subprocesses can do one or the other -- they can copy arbitrary data, and if you warn them ahead of time when you allocate the buffer, they can do real zero-copy shared memory. But the combination is really difficult. It'd be one thing if this were like a key feature that gave subinterpreters an advantage over subprocesses, but it seems really unlikely to me that a library won't know ahead of time when it's filling in a buffer to be transferred, and if anything it seems like we'd rather not expose read-write shared mappings in any case. It's extremely non-trivial to do right [1]. tl;dr: let's not rule out a useful implementation strategy based on a feature we don't actually need. One alternative would be your option (3) -- you can put bytes in and get memoryviews out, and since bytes objects are immutable it's OK. [1] https://en.wikipedia.org/wiki/Memory_model_(programming) >>> Handling an exception >>> --------------------- >> It would also be reasonable to simply not return any value/exception from >> run() at all, or maybe just a bool for whether there was an unhandled >> exception. Any high level API is going to be injecting code on both sides of >> the interpreter boundary anyway, so it can do whatever exception and >> traceback translation it wants to. > > So any more detailed response would *have* to come back as a channel message? > > That sounds like a reasonable option to me, too, especially since > module level code doesn't have a return value as such - you can really > only say "it raised an exception (and this was the exception it > raised)" or "it reached the end of the code without raising an > exception". > > Given that, I think subprocess.run() (with check=False) is the right > API precedent here: > https://docs.python.org/3/library/subprocess.html#subprocess.run > > That always returns subprocess.CompletedProcess, and then you can call > "cp.check_returncode()" to get it to raise > subprocess.CalledProcessError for non-zero return codes. > > For interpreter.run(), we could keep the initial RunResult *really* > simple and only report back: > > * source: the source code passed to run() > * shared: the keyword args passed to run() (name chosen to match > functools.partial) > * completed: completed execution without raising an exception? 
(True > if yes, False otherwise) > > Whether or not to report more details for a raised exception, and > provide some mechanism to reraise it in the calling interpreter could > then be deferred until later. > > The subprocess.run() comparison does make me wonder whether this might > be a more future-proof signature for Interpreter.run() though: > > def run(source_str, /, *, channels=None): > ... > > That way channels can be a namespace *specifically* for passing in > channels, and can be reported as such on RunResult. If we decide to > allow arbitrary shared objects in the future, or add flag options like > "reraise=True" to reraise exceptions from the subinterpreter in the > current interpreter, we'd have that ability, rather than having the > entire potential keyword namespace taken up for passing shared > objects. Would channels be a dict, or...? -n -- Nathaniel J. Smith -- https://vorpus.org From ncoghlan at gmail.com Fri Sep 15 00:24:03 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 15 Sep 2017 14:24:03 +1000 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: Message-ID: On 15 September 2017 at 12:04, Nathaniel Smith wrote: > On Thu, Sep 14, 2017 at 5:44 PM, Nick Coghlan wrote: >> The reason we're OK with this is that it means that only reading a new >> message from a channel (i.e creating a cross-interpreter view) or >> discarding a previously read message (i.e. closing a cross-interpreter >> view) will be synchronisation points where the receiving interpreter >> necessarily needs to acquire the sending interpreter's GIL. >> >> By contrast, if we allow an actual bytes object to be shared, then >> either every INCREF or DECREF on that bytes object becomes a >> synchronisation point, or else we end up needing some kind of >> secondary per-interpreter refcount where the interpreter doesn't drop >> its shared reference to the original object in its source interpreter >> until the internal refcount in the borrowing interpreter drops to >> zero. > > Ah, that makes more sense. > > I am nervous that allowing arbitrary memoryviews gives a *little* more > power than we need or want. I like that the current API can reasonably > be emulated using subprocesses -- it opens up the door for backports, > compatibility support on language implementations that don't support > subinterpreters, direct benchmark comparisons between the two > implementation strategies, etc. But if we allow arbitrary memoryviews, > then this requires that you can take (a) an arbitrary object, not > specified ahead of time, and (b) provide two read-write views on it in > separate interpreters such that modifications made in one are > immediately visible in the other. Subprocesses can do one or the other > -- they can copy arbitrary data, and if you warn them ahead of time > when you allocate the buffer, they can do real zero-copy shared > memory. But the combination is really difficult. One constraint we'd want to impose is that the memory view in the receiving interpreter should always be read-only - while we don't currently expose the ability to request that at the Python layer, memoryviews *do* support the creation of read-only views at the C API layer (which then gets reported to Python code via the "view.readonly" attribute). While that change alone is enough to preserve the simplex nature of the channel, it wouldn't be enough to prevent the *sender* from mutating the buffer contents and having that change be visible in the recipient. 
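(To make the Python-visible half of that concrete: a memoryview already
reports whether its underlying buffer is writable; what's missing, as
noted above, is a way to *request* a read-only view of a writable buffer
from pure Python:

    >>> memoryview(b"data").readonly             # view of immutable bytes
    True
    >>> memoryview(bytearray(b"data")).readonly  # view of a mutable buffer
    False

so the read-only guarantee on the receiving side would have to come via
the C layer's Py_buffer readonly flag for now.)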
In that regard it may make sense to maintain both restrictions initially
(as you suggested below): only accept bytes on the sending side (to
prevent mutation by the sender), and expose that as a read-only memory
view on the receiving side (to allow for zero-copy data sharing without
allowing mutation by the receiver).

> It'd be one thing if this were like a key feature that gave
> subinterpreters an advantage over subprocesses, but it seems really
> unlikely to me that a library won't know ahead of time when it's
> filling in a buffer to be transferred, and if anything it seems like
> we'd rather not expose read-write shared mappings in any case. It's
> extremely non-trivial to do right [1].
>
> tl;dr: let's not rule out a useful implementation strategy based on a
> feature we don't actually need.

Yeah, the description Eric currently has in the PEP is a summary of a
much longer suggestion Yury, Neil Schemenauer and I put together while
waiting for our flights following the core dev sprint, and the full
version had some of these additional constraints on it (most notably the
"read-only in the receiving interpreter" one).

> One alternative would be your option (3) -- you can put bytes in and
> get memoryviews out, and since bytes objects are immutable it's OK.

Indeed, I think that will be a sensible starting point.

However, I genuinely want to allow for zero-copy sharing of NumPy arrays
eventually, as that's where I think this idea gets most interesting: the
potential to allow for multiple parallel read operations on a given
NumPy array *in Python* (rather than Cython or C) without running afoul
of the GIL, and without needing to mess about with the complexities of
operating-system-level IPC.

>>>> Handling an exception
>> That way channels can be a namespace *specifically* for passing in
>> channels, and can be reported as such on RunResult. If we decide to
>> allow arbitrary shared objects in the future, or add flag options like
>> "reraise=True" to reraise exceptions from the subinterpreter in the
>> current interpreter, we'd have that ability, rather than having the
>> entire potential keyword namespace taken up for passing shared
>> objects.
>
> Would channels be a dict, or...?

Yeah, it would be a direct replacement for the way the current draft is
proposing to use the keywords dict - it would just be a separate
dictionary instead.

It does occur to me that if we wanted to align with the way the `runpy`
module spells that concept, we'd call the option `init_globals`, but I'm
thinking it will be better to only allow channels to be passed through
directly, and require that everything else be sent through a channel.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com  Fri Sep 15 01:35:37 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 15 Sep 2017 15:35:37 +1000
Subject: [Python-Dev] PEP 554 v3 (new interpreters module)
In-Reply-To: 
References: 
Message-ID: 

On 14 September 2017 at 11:44, Eric Snow wrote:
> About Subinterpreters
> =====================
>
> Shared data
> -----------
[snip]
> To make this work, the mutable shared state will be managed by the
> Python runtime, not by any of the interpreters.  Initially we will
> support only one type of objects for shared state: the channels provided
> by ``create_channel()``.  Channels, in turn, will carefully manage
> passing objects between interpreters.
Something I think you may want to explicitly call out as *not* being
shared are the thread objects in threading.enumerate(), as the way that
works in the current implementation makes sense, but isn't particularly
obvious (what I have below comes from experimenting with your branch at
https://github.com/python/cpython/pull/1748).

Specifically, what happens is that the operating system thread underlying
the existing interpreter thread that calls interp.run() gets borrowed as
the operating system thread underlying the MainThread object in the
called interpreter.

That MainThread object then gets preserved in the interpreter's
interpreter state, but the mapping to an underlying OS thread will change
freely based on who's calling into it.

From outside an interpreter, you *can't* request to run code in
subthreads directly - you'll always run your given code in the main
thread, and it will be up to that to dispatch requests to subthreads.

Beyond the thread lending that happens when you call interp.run() (where
one of your threads gets borrowed as the other interpreter's main
thread), each interpreter otherwise maintains a completely disjoint set
of thread objects that it is solely responsible for.

This also clarifies for me what it means for an interpreter to be a
"main" interpreter: it's the interpreter whose main thread actually
corresponds to the main thread of the overall operating system process,
rather than being temporarily borrowed from another interpreter.

We're going to have to put some thought into how we want that to
interact with the signal handling logic - right now, I believe *any*
main thread will consider it its responsibility to process signals
delivered to the runtime (and embedding applications avoid the potential
problems arising from that by simply not installing the CPython signal
handlers in the first place), and we probably want to change that
condition to be "the main thread in the main interpreter".

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From desmoulinmichel at gmail.com  Fri Sep 15 08:08:45 2017
From: desmoulinmichel at gmail.com (Michel Desmoulin)
Date: Fri, 15 Sep 2017 14:08:45 +0200
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: 
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com>
 <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com>
 <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com>
 <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com>
 <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net>
 <-4829494977384524515@unknownmsgid>
Message-ID: <2276a2c5-39f5-4639-ee01-77ace1a80fe9@gmail.com>
From status at bugs.python.org Fri Sep 15 12:09:26 2017 From: status at bugs.python.org (Python tracker) Date: Fri, 15 Sep 2017 18:09:26 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20170915160926.1534611A86E@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2017-09-08 - 2017-09-15) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 6171 (+22) closed 37065 (+68) total 43236 (+90) Open issues with patches: 2349 Issues opened (59) ================== #28411: Eliminate PyInterpreterState.modules. https://bugs.python.org/issue28411 reopened by eric.snow #31370: Remove support for threads-less builds https://bugs.python.org/issue31370 reopened by Arfrever #31398: TypeError: gdbm key must be string, not unicode https://bugs.python.org/issue31398 opened by sam-s #31399: Let OpenSSL verify hostname and IP address https://bugs.python.org/issue31399 opened by christian.heimes #31404: undefined behavior and crashes in case of a bad sys.modules https://bugs.python.org/issue31404 opened by Oren Milman #31405: shutil.which doesn't find files without PATHEXT extension on W https://bugs.python.org/issue31405 opened by rmccampbell7 #31410: int.__repr__() is slower than repr() https://bugs.python.org/issue31410 opened by serhiy.storchaka #31412: wave.open does not accept PathLike objects https://bugs.python.org/issue31412 opened by mscuthbert #31414: IDLE: Entry tests should delete before insert. https://bugs.python.org/issue31414 opened by terry.reedy #31415: Add -X option to show import time https://bugs.python.org/issue31415 opened by inada.naoki #31419: Can not install python3.6.2 due to Error 0x80070643: Failed to https://bugs.python.org/issue31419 opened by fernado #31422: tkinter.messagebox and tkinter.filedialog don't show default b https://bugs.python.org/issue31422 opened by jcrmatos #31423: Error while building PDF documentation https://bugs.python.org/issue31423 opened by mdk #31424: test_socket hangs on x86 Gentoo Installed with X 3.x https://bugs.python.org/issue31424 opened by haypo #31425: Expose AF_QIPCRTR in socket module https://bugs.python.org/issue31425 opened by Bjorn Andersson #31426: [3.5] crash in gen_traverse(): gi_frame.ob_type=NULL, called b https://bugs.python.org/issue31426 opened by iwienand #31427: Proposed addition to Windows FAQ https://bugs.python.org/issue31427 opened by Stephan Houben #31429: TLS cipher suite compile time option for downstream https://bugs.python.org/issue31429 opened by christian.heimes #31430: [Windows][2.7] Python 2.7 compilation fails on mt.exe crashing https://bugs.python.org/issue31430 opened by haypo #31431: SSL: check_hostname should imply CERT_REQUIRED https://bugs.python.org/issue31431 opened by christian.heimes #31432: Documention for CERT_OPTIONAL is misleading https://bugs.python.org/issue31432 opened by christian.heimes #31436: test_socket.SendfileUsingSendfileTest.testWithTimeoutTriggered https://bugs.python.org/issue31436 opened by bmoyles #31440: wrong default module search path in help message https://bugs.python.org/issue31440 opened by xiang.zhang #31441: Descriptor example in documentation is confusing, possibly wro https://bugs.python.org/issue31441 opened by Benjamin Wohlwend #31442: assertion failures on Windows in Python/traceback.c in case of https://bugs.python.org/issue31442 opened by Oren Milman #31443: Possibly out of date C extension documentation 
https://bugs.python.org/issue31443 opened by Romuald #31445: Index out of range in get of message.EmailMessage.get() https://bugs.python.org/issue31445 opened by Michala #31446: _winapi.CreateProcess (used by subprocess) is not thread-safe https://bugs.python.org/issue31446 opened by evan_ #31447: proc communicate not exiting on python subprocess timeout usin https://bugs.python.org/issue31447 opened by Leonardo Francalanci #31449: Potential DoS Attack when Parsing Email with Huge Number of MI https://bugs.python.org/issue31449 opened by ckossmann #31450: Subprocess exceptions re-raised in parent process do not have https://bugs.python.org/issue31450 opened by msekletar #31451: PYTHONHOME is not absolutized https://bugs.python.org/issue31451 opened by xiang.zhang #31452: asyncio.gather does not cancel tasks if one fails https://bugs.python.org/issue31452 opened by Andrew Lytvyn #31453: Debian Sid/Buster: Cannot enable TLS 1.0/1.1 with PROTOCOL_TLS https://bugs.python.org/issue31453 opened by adrianv #31454: Include "import as" in tutorial https://bugs.python.org/issue31454 opened by svenyonson #31455: ElementTree.XMLParser() mishandles exceptions https://bugs.python.org/issue31455 opened by scoder #31456: SimpleCookie fails to parse any cookie if an entry has whitesp https://bugs.python.org/issue31456 opened by Adam Davis #31458: Broken link to Misc/NEWS in What's New page https://bugs.python.org/issue31458 opened by Mariatta #31459: IDLE: Remane Class Browser as Module Browser https://bugs.python.org/issue31459 opened by terry.reedy #31460: IDLE: Revise ModuleBrowser API https://bugs.python.org/issue31460 opened by terry.reedy #31461: IDLE: Enhance class browser https://bugs.python.org/issue31461 opened by terry.reedy #31463: test_multiprocessing_fork hangs test_subprocess https://bugs.python.org/issue31463 opened by benjamin.peterson #31465: Allow _PyType_Lookup() to raise exceptions https://bugs.python.org/issue31465 opened by scoder #31466: No easy way to change float formatting when subclassing encode https://bugs.python.org/issue31466 opened by qpeter #31468: os.path.samefile fails on windows network drive https://bugs.python.org/issue31468 opened by Daniel Sauer #31470: Py_Initialize documentation wrong https://bugs.python.org/issue31470 opened by Jim.Jewett #31472: "Emulating callable objects" documentation misleading https://bugs.python.org/issue31472 opened by nmarrow #31473: PyMem_Raw* API in debug mode are not thread safe https://bugs.python.org/issue31473 opened by xiang.zhang #31474: [2.7] Fix -Wnonnull and -Wint-in-bool-context warnings https://bugs.python.org/issue31474 opened by christian.heimes #31475: Bug in argparse - not supporting utf8 https://bugs.python.org/issue31475 opened by Ali Razmjoo #31476: "Open Module..." 
Not Working in IDLE for Standard Library https://bugs.python.org/issue31476 opened by Zero #31477: IDLE 'strip trailing whitespace' changes the value of multilin https://bugs.python.org/issue31477 opened by bup #31478: assertion failure in random.seed() in case the seed argument h https://bugs.python.org/issue31478 opened by Oren Milman #31479: Always reset the signal alarm in tests https://bugs.python.org/issue31479 opened by haypo #31481: telnetlib fails to read continous large output from Windows Te https://bugs.python.org/issue31481 opened by nisar009 #31482: random.seed() doesn't work with bytes and version=1 https://bugs.python.org/issue31482 opened by serhiy.storchaka #31483: ButtonPress event not firing until button release Python 3.6.1 https://bugs.python.org/issue31483 opened by HobbyScript #31484: Cache single-character strings outside of the Latin1 range https://bugs.python.org/issue31484 opened by serhiy.storchaka #31485: Tkinter widget.unbind(sequence, funcid) unbind all bindings https://bugs.python.org/issue31485 opened by j-4321-i Most recent 15 issues with no replies (15) ========================================== #31484: Cache single-character strings outside of the Latin1 range https://bugs.python.org/issue31484 #31482: random.seed() doesn't work with bytes and version=1 https://bugs.python.org/issue31482 #31481: telnetlib fails to read continous large output from Windows Te https://bugs.python.org/issue31481 #31477: IDLE 'strip trailing whitespace' changes the value of multilin https://bugs.python.org/issue31477 #31476: "Open Module..." Not Working in IDLE for Standard Library https://bugs.python.org/issue31476 #31474: [2.7] Fix -Wnonnull and -Wint-in-bool-context warnings https://bugs.python.org/issue31474 #31473: PyMem_Raw* API in debug mode are not thread safe https://bugs.python.org/issue31473 #31472: "Emulating callable objects" documentation misleading https://bugs.python.org/issue31472 #31470: Py_Initialize documentation wrong https://bugs.python.org/issue31470 #31468: os.path.samefile fails on windows network drive https://bugs.python.org/issue31468 #31466: No easy way to change float formatting when subclassing encode https://bugs.python.org/issue31466 #31463: test_multiprocessing_fork hangs test_subprocess https://bugs.python.org/issue31463 #31461: IDLE: Enhance class browser https://bugs.python.org/issue31461 #31460: IDLE: Revise ModuleBrowser API https://bugs.python.org/issue31460 #31459: IDLE: Remane Class Browser as Module Browser https://bugs.python.org/issue31459 Most recent 15 issues waiting for review (15) ============================================= #31484: Cache single-character strings outside of the Latin1 range https://bugs.python.org/issue31484 #31479: Always reset the signal alarm in tests https://bugs.python.org/issue31479 #31478: assertion failure in random.seed() in case the seed argument h https://bugs.python.org/issue31478 #31474: [2.7] Fix -Wnonnull and -Wint-in-bool-context warnings https://bugs.python.org/issue31474 #31466: No easy way to change float formatting when subclassing encode https://bugs.python.org/issue31466 #31458: Broken link to Misc/NEWS in What's New page https://bugs.python.org/issue31458 #31455: ElementTree.XMLParser() mishandles exceptions https://bugs.python.org/issue31455 #31452: asyncio.gather does not cancel tasks if one fails https://bugs.python.org/issue31452 #31432: Documention for CERT_OPTIONAL is misleading https://bugs.python.org/issue31432 #31431: SSL: check_hostname should imply CERT_REQUIRED 
https://bugs.python.org/issue31431

#31429: TLS cipher suite compile time option for downstream
https://bugs.python.org/issue31429

#31415: Add -X option to show import time
https://bugs.python.org/issue31415

#31414: IDLE: Entry tests should delete before insert.
https://bugs.python.org/issue31414

#31412: wave.open does not accept PathLike objects
https://bugs.python.org/issue31412

#31410: int.__repr__() is slower than repr()
https://bugs.python.org/issue31410

Top 10 most discussed issues (10)
=================================

#31234: Make support.threading_cleanup() stricter
https://bugs.python.org/issue31234  31 msgs

#31265: Remove doubly-linked list from C OrderedDict
https://bugs.python.org/issue31265  13 msgs

#31453: Debian Sid/Buster: Cannot enable TLS 1.0/1.1 with PROTOCOL_TLS
https://bugs.python.org/issue31453  12 msgs

#31336: Speed up _PyType_Lookup() for class creation
https://bugs.python.org/issue31336  11 msgs

#31447: proc communicate not exiting on python subprocess timeout usin
https://bugs.python.org/issue31447  11 msgs

#31370: Remove support for threads-less builds
https://bugs.python.org/issue31370  8 msgs

#29505: Submit the re, json, & csv modules to oss-fuzz testing
https://bugs.python.org/issue29505  7 msgs

#31454: Include "import as" in tutorial
https://bugs.python.org/issue31454  7 msgs

#31233: socketserver.ThreadingMixIn leaks running threads after server
https://bugs.python.org/issue31233  6 msgs

#31404: undefined behavior and crashes in case of a bad sys.modules
https://bugs.python.org/issue31404  6 msgs

Issues closed (65)
==================

#15427: Describe use of args parameter of argparse.ArgumentParser.pars
https://bugs.python.org/issue15427  closed by Mariatta

#17085: test_socket crashes the whole test suite
https://bugs.python.org/issue17085  closed by haypo

#19985: Not so correct error message when initializing Struct with ill
https://bugs.python.org/issue19985  closed by xiang.zhang

#23508: Document & unittest the subprocess.getstatusoutput() status va
https://bugs.python.org/issue23508  closed by gregory.p.smith

#23669: test_socket.NonBlockingTCPTests failing due to race condition
https://bugs.python.org/issue23669  closed by haypo

#25684: ttk.OptionMenu radiobuttons aren't unique between two instance
https://bugs.python.org/issue25684  closed by Mariatta

#27099: IDLE: turn built-in extensions into regular modules
https://bugs.python.org/issue27099  closed by terry.reedy

#27354: SSLContext.load_verify_locations cannot handle paths on Window
https://bugs.python.org/issue27354  closed by christian.heimes

#28182: Expose OpenSSL verification results in SSLError
https://bugs.python.org/issue28182  closed by christian.heimes

#29631: Error "importlib.h, importlib_external.h updated. You will n
https://bugs.python.org/issue29631  closed by zach.ware

#29718: Fixed compile on cygwin.
https://bugs.python.org/issue29718  closed by Decorater

#29767: build python failed on test_socket due to unused_port is actua
https://bugs.python.org/issue29767  closed by haypo

#30246: fix several error messages in struct
https://bugs.python.org/issue30246  closed by xiang.zhang

#30313: Tests of Python 2.7 VS9.0 buildbots must be run with -uall -rw
https://bugs.python.org/issue30313  closed by haypo

#30389: distutils._msvccompiler cannot find VS 2017
https://bugs.python.org/issue30389  closed by steve.dower

#30830: test_logging leaks dangling threads on FreeBSD
https://bugs.python.org/issue30830  closed by haypo

#30860: Consolidate stateful C globals under a single struct.
https://bugs.python.org/issue30860 closed by eric.snow #30918: Unable to launch IDLE in windows 7 https://bugs.python.org/issue30918 closed by terry.reedy #31022: ERROR: testRegularFile (test.test_socket.SendfileUsingSendTest https://bugs.python.org/issue31022 closed by haypo #31128: Allow pydoc to run with an arbitrary hostname https://bugs.python.org/issue31128 closed by merwok #31164: test_functools: test_recursive_pickle() stack overflow on x86 https://bugs.python.org/issue31164 closed by haypo #31258: [2.7] Enhance support reap_children() and threading_cleanup() https://bugs.python.org/issue31258 closed by haypo #31259: [3.6] Enhance support reap_children() and threading_cleanup() https://bugs.python.org/issue31259 closed by haypo #31260: [2.7] Enhance PC/VS9.0/ project to produce python.bat, as PCbu https://bugs.python.org/issue31260 closed by zach.ware #31294: ZeroMQSocketListener and ZeroMQSocketHandler examples in the L https://bugs.python.org/issue31294 closed by christian.heimes #31320: test_ssl logs a traceback https://bugs.python.org/issue31320 closed by haypo #31323: test_ssl: reference cycle between ThreadedEchoServer and its C https://bugs.python.org/issue31323 closed by haypo #31338: Use abort() for code we never expect to hit https://bugs.python.org/issue31338 closed by barry #31345: Backport docstring improvements to the C version of OrderedDic https://bugs.python.org/issue31345 closed by Mariatta #31354: Fixing a bug related to LTO only build https://bugs.python.org/issue31354 closed by gregory.p.smith #31361: Update feedparser.py to prevent theano compiling fail in pytho https://bugs.python.org/issue31361 closed by serhiy.storchaka #31395: Docs Downloads are 404s https://bugs.python.org/issue31395 closed by ned.deily #31396: Configure and make will fail with enabling --with-pydebug https://bugs.python.org/issue31396 closed by christian.heimes #31397: does not work import in Python 3.6.2 https://bugs.python.org/issue31397 closed by r.david.murray #31400: SSL module returns incorrect error codes on Windows https://bugs.python.org/issue31400 closed by christian.heimes #31401: Dynamic compilation that uses function in comprehension fails https://bugs.python.org/issue31401 closed by r.david.murray #31402: Python won't load on Acer Aspire https://bugs.python.org/issue31402 closed by terry.reedy #31403: Remove WITHOUT_THREADS from _decimal https://bugs.python.org/issue31403 closed by skrah #31406: crashes when comparing between a Decimal object and a bad Rati https://bugs.python.org/issue31406 closed by skrah #31407: --without-pymalloc broken https://bugs.python.org/issue31407 closed by eric.snow #31408: Leak in typeobject.c https://bugs.python.org/issue31408 closed by skrah #31409: Implement PEP 559 - built-in noop() https://bugs.python.org/issue31409 closed by barry #31411: SystemError raised by warn_explicit() in case warnings.oncereg https://bugs.python.org/issue31411 closed by serhiy.storchaka #31413: Support importing anything in ._pth files. 
https://bugs.python.org/issue31413  closed by steve.dower

#31416: assertion failures in warn_explicit() in case of a bad warning
https://bugs.python.org/issue31416  closed by serhiy.storchaka

#31417: Use the enumerate function where appropriate
https://bugs.python.org/issue31417  closed by serhiy.storchaka

#31418: assertion failure in PyErr_WriteUnraisable() in case of an exc
https://bugs.python.org/issue31418  closed by serhiy.storchaka

#31420: Reference leaks introduced by bpo-30860
https://bugs.python.org/issue31420  closed by eric.snow

#31421: IDLE doc: add section on developing tkinter apps.
https://bugs.python.org/issue31421  closed by terry.reedy

#31428: ElementTree.Element.__deepcopy__() raises a SystemError in cas
https://bugs.python.org/issue31428  closed by serhiy.storchaka

#31433: Impossible download python 3.6.2 and 2.7 documentation (html)
https://bugs.python.org/issue31433  closed by berker.peksag

#31434: [3.6] Python/random.c:128:17: warning: implicit declaration of
https://bugs.python.org/issue31434  closed by haypo

#31435: Addition giving incorrect result in certain circumstances
https://bugs.python.org/issue31435  closed by skrah

#31437: test_socket.SendfileUsingSendfileTest.testWithTimeoutTriggered
https://bugs.python.org/issue31437  closed by bmoyles

#31438: IDLE 3 crashes and quits when caret character typed
https://bugs.python.org/issue31438  closed by karakoyun

#31439: WindowsError: [Error 2] The system cannot find the file specif
https://bugs.python.org/issue31439  closed by zach.ware

#31444: ResourceWarning in Python/traceback.c in case of a bad io.Text
https://bugs.python.org/issue31444  closed by haypo

#31448: test_poplib started to log a lot of ResourceWarning warnings o
https://bugs.python.org/issue31448  closed by haypo

#31457: LoggerAdapter objects cannot be nested
https://bugs.python.org/issue31457  closed by lukasz.langa

#31462: Remove trailing whitespaces
https://bugs.python.org/issue31462  closed by serhiy.storchaka

#31464: Remove trailing spaces from Python-ast.h
https://bugs.python.org/issue31464  closed by serhiy.storchaka

#31467: cElementTree behaves differently compared to ElementTree
https://bugs.python.org/issue31467  closed by poojariravi

#31469: Running inside docker container from non-root user
https://bugs.python.org/issue31469  closed by r.david.murray

#31471: assertion failure in subprocess.Popen() in case the env arg ha
https://bugs.python.org/issue31471  closed by serhiy.storchaka

#31480: IDLE: disable ZzDummy, revise tests to match
https://bugs.python.org/issue31480  closed by terry.reedy

From victor.stinner at gmail.com  Fri Sep 15 16:28:58 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 15 Sep 2017 22:28:58 +0200
Subject: [Python-Dev] SK-CSIRT identified malicious software libraries in
 the official Python package repository, PyPI
Message-ID: 

Hi,

Last week, the National Security Authority of Slovakia contacted the
Python Security Response Team (PSRT) to report that the Python Package
Index (PyPI) was hosting malicious packages.  Installing these packages
sends user data to an HTTP server, but also installs the expected
module, so the attack was not obvious.

Advisory: http://www.nbu.gov.sk/skcsirt-sa-20170909-pypi/

Kudos to them for reporting the issue!

It's not a compromise of the PyPI server nor of a third-party project,
but the "typo squatting" issue, which has been known since at least June
2016 (for PyPI).  The issue is not specific to Python; npmjs.com and
rubygems.org are vulnerable to the same issue.
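(Detecting such near-miss names is mechanically cheap; for example, with
difflib from the stdlib and a hypothetical whitelist of known names:

    import difflib

    legit = ['urllib3', 'requests', 'setuptools']  # hypothetical sample
    print(difflib.get_close_matches('urlib3', legit))  # -> ['urllib3']
    print(difflib.get_close_matches('pandas', legit))  # -> []

The hard part is deciding what to do when a new upload matches.)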
For example, a malicious package used the names "urllib" (no "3") and
"urlib3" (one "l") instead of "urllib3" (two "l"s). These packages were
downloaded by users, so the attack was effective.

More information on typo squatting and Python package security:
https://python-security.readthedocs.io/packages.html#pypi-typo-squatting

The PSRT contacted the PyPI administrators and all identified packages
were taken down, only 1h10 after the PSRT received the email from the
National Security Authority of Slovakia!

The typo squatting issue is known and has been discussed, but no
solution has been found yet. See for example this warehouse issue:
https://github.com/pypa/warehouse/issues/2151

It seems like the consensus is that pip is not responsible for
detecting malicious code; it's more the responsibility of PyPI.

The problem is to decide how to detect malicious code and/or prevent
typo squatting on PyPI.

The issue was discussed privately on the PSRT list last week. The
National Security Authority of Slovakia just published their advisory,
and a public discussion started on reddit:
https://news.ycombinator.com/item?id=15256121

I consider that it's now time to find a solution on the public
python-dev mailing list.

Let's try to find a solution!

Can we learn something from the Update Framework (TUF)?

How do Javascript, Ruby, Perl and other programming languages deal
with these security issues in their package managers?

See also my other notes on Python security and the list of known
CPython vulnerabilities:
https://python-security.readthedocs.io/

Victor

From victor.stinner at gmail.com  Fri Sep 15 17:08:00 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 15 Sep 2017 23:08:00 +0200
Subject: [Python-Dev] SK-CSIRT identified malicious software libraries
 in the official Python package repository, PyPI
In-Reply-To: 
References: 
Message-ID: 

Benjamin Bach and Hanno Böck are running
https://www.pytosquatting.org/ and registered many projects like
https://pypi.python.org/pypi/urllib2

"In June 2016, Typosquatting programming language package managers
stated that urllib2 had ~4,000 downloads in 2 weeks. The package name
is now squatted by us (the good guys). We take these findings
seriously."

It seems like we need a way to prevent a project that was removed
because it contained malicious code from being recreated
automatically.

pytosquatting.org projects contain a download file: a tarball with a
setup.py file. This setup.py raises an exception, but also sends an
HTTP request, a "pingback", to their server.

Thank you for reserving names of the standard library. But I'm not
sure about the HTTP "pingback" part. It can run on CI systems, in
restricted environments, etc. Why not just reserve the name without
providing any download file? With no download file, the user will
likely understand his/her error, no?

Note: I don't think that Benjamin Bach and Hanno Böck are related to
the PSRT nor to the PyPI administrators.

Victor

2017-09-15 22:28 GMT+02:00 Victor Stinner :
> Hi,
>
> Last week, the National Security Authority of Slovakia contacted the
> Python Security Response Team (PSRT) to report that the Python Package
> Index (PyPI) was hosting malicious packages. Installing these packages
> sent user data to an HTTP server, but also installed the expected
> module, so the attack was not easy to notice.
>
> Advisory: http://www.nbu.gov.sk/skcsirt-sa-20170909-pypi/
>
> Kudos to them for reporting the issue!
>
> It's not a compromise of the PyPI server nor of a third-party project,
> but the "typo squatting" issue, which has been known since at least
> June 2016 (for PyPI). The issue is not specific to Python: npmjs.com
> and rubygems.org are vulnerable to the same issue.
>
> For example, a malicious package used the names "urllib" (no "3") and
> "urlib3" (one "l") instead of "urllib3" (two "l"s). These packages were
> downloaded by users, so the attack was effective.
>
> More information on typo squatting and Python package security:
> https://python-security.readthedocs.io/packages.html#pypi-typo-squatting
>
> The PSRT contacted the PyPI administrators and all identified packages
> were taken down, only 1h10 after the PSRT received the email from the
> National Security Authority of Slovakia!
>
> The typo squatting issue is known and has been discussed, but no
> solution has been found yet. See for example this warehouse issue:
> https://github.com/pypa/warehouse/issues/2151
>
> It seems like the consensus is that pip is not responsible for
> detecting malicious code; it's more the responsibility of PyPI.
>
> The problem is to decide how to detect malicious code and/or prevent
> typo squatting on PyPI.
>
> The issue was discussed privately on the PSRT list last week. The
> National Security Authority of Slovakia just published their advisory,
> and a public discussion started on reddit:
> https://news.ycombinator.com/item?id=15256121
>
> I consider that it's now time to find a solution on the public
> python-dev mailing list.
>
> Let's try to find a solution!
>
> Can we learn something from the Update Framework (TUF)?
>
> How do Javascript, Ruby, Perl and other programming languages deal
> with these security issues in their package managers?
>
> See also my other notes on Python security and the list of known
> CPython vulnerabilities:
> https://python-security.readthedocs.io/
>
> Victor

From victor.stinner at gmail.com  Fri Sep 15 17:16:34 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 15 Sep 2017 23:16:34 +0200
Subject: [Python-Dev] SK-CSIRT identified malicious software libraries
 in the official Python package repository, PyPI
In-Reply-To: 
References: 
Message-ID: 

An idea against typo squatting would be to compute the Levenshtein
distance to the package names of the standard library and the 100 most
popular PyPI packages, and to require contacting a moderation team if
a new name is too close to an existing package. The moderation team
would review the request, but also watch the package for one month to
check that everything seems fine.

It requires having a list of all package names of the standard library,
and maintaining an up-to-date list of popular PyPI package names. It
also requires setting up a mailing list, tooling to report the error
message to users, and then giving moderators the right to create the
package.

I'm not sure that it's easy to implement.
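Here is a rough sketch of the kind of check I have in mind (a plain
Levenshtein implementation; the protected name list is a placeholder,
a real deployment would use the full stdlib module list and an up to
date list of the most downloaded projects):

---
def levenshtein(a, b):
    # Classic dynamic programming edit distance, O(len(a) * len(b))
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]
        for j, cb in enumerate(b, 1):
            current.append(min(current[j - 1] + 1,             # insertion
                               previous[j] + 1,                # deletion
                               previous[j - 1] + (ca != cb)))  # substitution
        previous = current
    return previous[-1]

# Placeholder list: a real check would use the stdlib and the most
# popular PyPI projects
PROTECTED_NAMES = {"urllib3", "requests", "numpy", "setuptools"}

def needs_moderation(new_name, max_distance=2):
    # Return the protected name which is "too close", or None.
    # Distance 0 (exact duplicate) is already rejected by PyPI itself.
    new_name = new_name.lower()
    for name in PROTECTED_NAMES:
        if 0 < levenshtein(new_name, name) <= max_distance:
            return name
    return None
---

With max_distance=2, both "urlib3" and "urllib" are at distance 1 from
"urllib3", so both registrations would be sent to moderation.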
Victor

From python-dev at mgmiller.net  Fri Sep 15 19:21:42 2017
From: python-dev at mgmiller.net (Mike Miller)
Date: Fri, 15 Sep 2017 16:21:42 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <2276a2c5-39f5-4639-ee01-77ace1a80fe9@gmail.com>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com>
 <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com>
 <3e7b6c3a-0a70-6942-f4a5-b66a7b3864ef@trueblade.com>
 <2e3d5b22-61f9-df2e-2534-3448a59412ac@trueblade.com>
 <748a60b0-bc12-a0bd-ffb2-3978aa43a2ec@mgmiller.net>
 <-4829494977384524515@unknownmsgid>
 <2276a2c5-39f5-4639-ee01-77ace1a80fe9@gmail.com>
Message-ID: 

On 2017-09-15 05:08, Michel Desmoulin wrote:
> Because given how convenient it is, it will most probably become the
> default way to write classes in Python. Not just for records.

Yes, it would have been great if this was how the original object
worked and the current barebones object was a base(object) or something
like that. Too late however.

Another option was "bag", which is more generic and brief, and might
seem to fit better, but the discussion went towards record.

-Mike

From jwilk at jwilk.net  Fri Sep 15 17:38:28 2017
From: jwilk at jwilk.net (Jakub Wilk)
Date: Fri, 15 Sep 2017 23:38:28 +0200
Subject: [Python-Dev] SK-CSIRT identified malicious software libraries
 in the official Python package repository, PyPI
In-Reply-To: 
References: 
Message-ID: <20170915213828.gdm3lwgqqbviiclq@jwilk.net>

* Victor Stinner , 2017-09-15, 23:08:
> Why not just reserve the name without providing any download file?

Is it possible at the moment? I tried "python setup.py register", but
all I got was:

Server response (410): Project pre-registration is no longer required
or supported, so continue directly to uploading files.

-- 
Jakub Wilk

From larry at hastings.org  Sat Sep 16 05:39:25 2017
From: larry at hastings.org (Larry Hastings)
Date: Sat, 16 Sep 2017 02:39:25 -0700
Subject: [Python-Dev] Why aren't decorators just expressions?
Message-ID: <1b232526-ebc4-5d34-76c9-2eaaca6de69a@hastings.org>

The syntax for decorators in the grammar is quite specific:

    decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE

Decorators can be either a dotted name, or a dotted name followed by
arguments. This disallows:

    @mapping['async']     # looking something up in a mapping
    @func(1, 2, 3).async  # getting the attribute of the /result/ of
                          # a function call

Obviously these restrictions haven't been too burdensome, as I've
labored under them for 13+ years and didn't even notice until just now.
But it set me to wondering. If it were me, perpetually lazy sod that I
am, I'd have written the grammar as

    decorator: '@' expr NEWLINE

and been done with it.

So why don't decorators allow arbitrary expressions? The PEP discusses
the syntax for decorators, but that whole debate only concerned itself
with where the decorator goes relative to "def", and what funny
punctuation it might use. It never says "decorators shouldn't permit
arbitrary expressions because--". Nor is there any info on the wiki
page with the extensive summary of alternative syntax proposals.

Anybody remember?

I'm not proposing that we allow arbitrary expressions as decorators...
well, I'm not doing that /yet/ at least. But like I said, the syntax
has been this way for 13 years and I don't recall anybody complaining.

//arry

p.s. I experimentally tried modifying the grammar for decorators to
allow arbitrary expressions. The syntax was fine but it crashed in
ast_for_decorator(). When I updated that to match, it worked fine!
The diff is short and mainly consists of deleted code.

From storchaka at gmail.com  Sat Sep 16 07:22:17 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 16 Sep 2017 14:22:17 +0300
Subject: [Python-Dev] Why aren't decorators just expressions?
In-Reply-To: <1b232526-ebc4-5d34-76c9-2eaaca6de69a@hastings.org>
References: <1b232526-ebc4-5d34-76c9-2eaaca6de69a@hastings.org>
Message-ID: 

16.09.17 12:39, Larry Hastings writes:
> So why don't decorators allow arbitrary expressions? The PEP discusses
> the syntax for decorators, but that whole debate only concerned itself
> with where the decorator goes relative to "def", and what funny
> punctuation it might use. It never says "decorators shouldn't permit
> arbitrary expressions because--". Nor is there any info on the wiki
> page with the extensive summary of alternative syntax proposals.
>
> Anybody remember?

https://mail.python.org/pipermail/python-dev/2004-August/046711.html

> I'm not proposing that we allow arbitrary expressions as decorators...
> well, I'm not doing that /yet/ at least. But like I said, the syntax
> has been this way for 13 years and I don't recall anybody complaining.

This may be an argument for not changing the syntax.

Actually I remember somebody raised this question a year or two ago,
but I don't remember the details.

From ncoghlan at gmail.com  Sat Sep 16 07:53:38 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 16 Sep 2017 21:53:38 +1000
Subject: [Python-Dev] SK-CSIRT identified malicious software libraries
 in the official Python package repository, PyPI
In-Reply-To: 
References: 
Message-ID: 

On 16 September 2017 at 07:08, Victor Stinner wrote:
> Benjamin Bach and Hanno Böck are running
> https://www.pytosquatting.org/ and registered many projects like
> https://pypi.python.org/pypi/urllib2
>
> "In June 2016, Typosquatting programming language package managers
> stated that urllib2 had ~4,000 downloads in 2 weeks. The package name
> is now squatted by us (the good guys). We take these findings
> seriously."
>
> It seems like we need a way to prevent a project that was removed
> because it contained malicious code from being recreated
> automatically.

The PyPI admins have the ability to blacklist project names (with one
of the simplest mechanisms being for admins to typosquat them
pre-emptively - while, as Jakub notes, that currently requires
uploading an actual sdist, it doesn't need to be an installable one),
so that isn't the hard part of the problem.

The hard parts are the ones that potentially require people to scale:

1. Noticing typosquatting in the first place
2. Evaluating and responding to notifications of malicious
   namesquatting in a timely manner
3. Handling requests to use names that have been reserved by the admins

The first part can already be handled in a distributed fashion -
bandersnatch will give anyone a full mirror of PyPI in a few hours
(depending on their available network bandwidth), and going to
https://pypi.org/simple/ will give you a listing of all the registered
project names as an HTML file.

The second part is tricky, since there aren't currently any consistent
public markers for "benign" blacklisting.

For example, if you go to https://pypi.org/simple/pkg-resources/ (the
normalised name for "pkg_resources") you'll get a response, since
that's a registered name with no uploads (IIRC, it's reserved by
Donald).
However, if you go to https://pypi.org/project/pkg-resources/ you get
a 404 page, rather than a notification that the name has been reserved
by the PyPI admins.

So that would probably be a useful enhancement - having "reserved by
the PyPI admins" as an explicit state, so 3rd party researchers can
more readily flag such cases as being as safe as we can reasonably
make them from a security perspective.

For the "review for malice" aspect, notifying the PyPI admins of all
close names isn't a particularly viable option as the default
behaviour, since they're either volunteers, or folks with partial time
grants from their employers. However, something that I think does have
the potential to scale reasonably well is to instead notify the
maintainers of projects with similar names:
https://github.com/pypa/warehouse/issues/2268

That way the initial review is distributed to the maintainers of the
targeted packages, and then only the cases that those maintainers
consider to be potentially malicious would need to be escalated to the
PyPI admins.

The final aspect is one of the items considered in PEP 541:
https://www.python.org/dev/peps/pep-0541/#invalid-projects

Getting PEP 541 accepted will likely make it easier to start scaling
the ticket review team for PyPI (with both paid and volunteer
efforts), and it's currently with Donald for review as an initial
policy that we can start with, and then iterate on over time.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com  Sat Sep 16 08:21:03 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 16 Sep 2017 22:21:03 +1000
Subject: [Python-Dev] Why aren't decorators just expressions?
In-Reply-To: 
References: <1b232526-ebc4-5d34-76c9-2eaaca6de69a@hastings.org>
Message-ID: 

On 16 September 2017 at 21:22, Serhiy Storchaka wrote:
> Actually I remember somebody raised this question a year or two ago,
> but I don't remember the details.

Aye, I remember that as well, but apparently the thread title for the
discussion was sufficiently unrelated to the eventual topic that I
can't find it with Google.

Anyway, I think the outcome of that particular thread was something
like:

1. The most reasonable-sounding enhancement request is the one to
   allow subscripts: "@deco_group[deco_id]"
2. If we're going to relax the restrictions at all, it probably makes
   sense to just allow arbitrary expressions
3. You can already use arbitrary expressions in practice by defining
   "def deco(x): return x", and then writing
   "@deco(whatever_expression_you_want)"
4. Nobody was motivated enough to actually implement the change and
   then ask Guido if he'd changed his mind on the topic since 2004

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From srkunze at mail.de  Sat Sep 16 10:44:34 2017
From: srkunze at mail.de (Sven R. Kunze)
Date: Sat, 16 Sep 2017 16:44:34 +0200
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: 
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com>
 <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com>
Message-ID: <61662340-c713-3ca2-db3c-50e0480f9eb6@mail.de>

Thanks for the PEP! :)

I like the naming. ;) Though, I would like to add to Michel's argument
in favor of a base class.

On 11.09.2017 08:38, Michel Desmoulin wrote:
>>> - I read Guido talking about some base class as an alternative to
>>> the generator version, but don't see it in the PEP. Is it still
>>> considered?
>> I'm going to put some words in explaining why I don't want to use
>> base classes (I don't think it buys you anything).
>> Do you have a reason for preferring base classes?
>
> Not preferring, but having it as an alternative. Mainly for 2 reasons:
>
> 1 - data classes allow one to type in classes very quickly, let's
> harvest the benefit from that.
>
> Typing a decorator in a shell is much less comfortable than using
> inheritance. Same thing about IDEs: all current ones have snippets
> with auto-switch to the class parents on tab.
>
> All in all, if you are doing exploratory programming, and thus
> disposable code, which data classes are fantastic for, inheritance
> will keep you in the flow.
>
> 2 - it will help sell the data classes
>
> I train a lot of people in Python each year. I never have to explain
> classes to people with any kind of programming background. I _always_
> have to explain decorators.
>
> People are not used to them, and even kind of fear them for quite some
> time.
>
> Inheritance, however, is familiar, and will not only push people to
> use data classes more, but also will let them make fewer mistakes:
> they know the danger of parent ordering, but not the one of decorator
> ordering.

3) - the order of base classes can be arranged appropriately

In our day-to-day work, we use mixins and cooperative multiple
inheritance a lot. So, having dataclasses as a base class or a mixin
would be great! :)

Combined with 1) and 2), I am much more in favor of having dataclasses
as a base class/mixin than as a decorator. What are the benefits of the
decorator? Maybe both are possible?

Cheers,
Sven

PS: @Michel good observation 1). Typing decorators in a shell is
annoying.

From barry at python.org  Sat Sep 16 12:42:55 2017
From: barry at python.org (Barry Warsaw)
Date: Sat, 16 Sep 2017 09:42:55 -0700
Subject: [Python-Dev] Why aren't decorators just expressions?
In-Reply-To: <1b232526-ebc4-5d34-76c9-2eaaca6de69a@hastings.org>
References: <1b232526-ebc4-5d34-76c9-2eaaca6de69a@hastings.org>
Message-ID: <03325C3F-3DE0-45D0-A6FC-59E78917BA96@python.org>

On Sep 16, 2017, at 02:39, Larry Hastings wrote:

> I'm not proposing that we allow arbitrary expressions as decorators...
> well, I'm not doing that yet at least. But like I said, the syntax has
> been this way for 13 years and I don't recall anybody complaining.

Indeed, I can't remember a single time where I've needed that, let
alone actually realized the restriction existed. But now that you
mention it, I do remember discussions in favor of the more restricted
syntax when the feature was originally being debated. I don't remember
the reasons though - it well could have been an abundance of caution
over how far to take the new syntax (and understanding of course that
it's easier to relax than restrict).

-Barry

From steve at holdenweb.com  Sat Sep 16 14:14:02 2017
From: steve at holdenweb.com (Steve Holden)
Date: Sat, 16 Sep 2017 19:14:02 +0100
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <61662340-c713-3ca2-db3c-50e0480f9eb6@mail.de>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com>
 <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com>
 <61662340-c713-3ca2-db3c-50e0480f9eb6@mail.de>
Message-ID: 

I make this suggestion with trepidation, given that Guido called a halt
on the Great Naming Debate, but it seems that a short, neutral name
with data connotations not previously a part of many popular subsystems
is required.
I therefore propose "row", which is sufficiently neutral to avoid most current opposition and yet a common field-oriented mechanism for accessing units of retrieved data by name. regards Steve? Steve Holden On Sat, Sep 16, 2017 at 3:44 PM, Sven R. Kunze wrote: > Thanks for the PEP! :) > > I like the naming. ;) Though, I would like to add to Michel's argument in > favor of a base class. > > > On 11.09.2017 08:38, Michel Desmoulin wrote: > >> - I read Guido talking about some base class as alternative to the >>>> generator version, but don't see it in the PEP. Is it still considered ? >>>> >>> I'm going to put some words in explaining why I don't want to use base >>> classes (I don't think it buys you anything). Do you have a reason for >>> preferring base classes? >>> >> Not preferring, but having it as an alternative. Mainly for 2 reasons: >> >> 1 - data classes allow one to type in classes very quickly, let's >> harvest the benefit from that. >> >> Typing a decorator in a shell is much less comfortable than using >> inheritance. Same thing about IDE: all current ones have snippet with >> auto-switch to the class parents on tab. >> >> All in all, if you are doing exploratory programming, and thus >> disposable code, which data classes are fantastic for, inheritance will >> keep you in the flow. >> >> 2 - it will help sell the data classes >> >> I train a lot of people to Python each year. I never have to explain >> classes to people with any kind of programming background. I _always_ >> have to explain decorators. >> >> People are not used to it, and even kind fear it for quite some time. >> >> Inheritance however, is familiar, and will not only push people to use >> data classes more, but also will let them do less mistakes: they know >> the danger of parent ordering, but not the ones of decorators ordering. >> > > 3) - the order of base classes can arranged appropriately > > In our day-to-day work, we use mixins and cooperative multiple inheritance > a lot. > So, having dataclasses as a base class or a mixin would be great! :) > > Combined with 1) and 2), I am much in favor of having dataclasses as base > class/mixin than as a decorator. > What are the benefits of the decorator? Maybe both is possible? > > Cheers, > Sven > > > PS: @Michel good observation 1). Typing decorators in shell is annoying. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/steve% > 40holdenweb.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Sat Sep 16 14:23:04 2017 From: mertz at gnosis.cx (David Mertz) Date: Sat, 16 Sep 2017 11:23:04 -0700 Subject: [Python-Dev] Why aren't decorators just expressions? In-Reply-To: <03325C3F-3DE0-45D0-A6FC-59E78917BA96@python.org> References: <1b232526-ebc4-5d34-76c9-2eaaca6de69a@hastings.org> <03325C3F-3DE0-45D0-A6FC-59E78917BA96@python.org> Message-ID: I always realized the restriction was there, and once in a while mention it in teaching. But I've NEVER had an actual desire to use anything other that a simple decorator or a "decorator factory" (which I realize is a decorator in the grammar, but it's worth teaching how to parameterize custom ones, which is a factory). I've used barry_as_FLUFL more often, actually... Albeit always joking around for students, not in production covfefe. 
On Sep 16, 2017 9:46 AM, "Barry Warsaw" wrote:

On Sep 16, 2017, at 02:39, Larry Hastings wrote:
> I'm not proposing that we allow arbitrary expressions as decorators...
> well, I'm not doing that yet at least. But like I said, the syntax has
> been this way for 13 years and I don't recall anybody complaining.

Indeed, I can't remember a single time where I've needed that, let
alone actually realized the restriction existed. But now that you
mention it, I do remember discussions in favor of the more restricted
syntax when the feature was originally being debated. I don't remember
the reasons though - it well could have been an abundance of caution
over how far to take the new syntax (and understanding of course that
it's easier to relax than restrict).

-Barry

From skip.montanaro at gmail.com  Sat Sep 16 14:28:17 2017
From: skip.montanaro at gmail.com (Skip Montanaro)
Date: Sat, 16 Sep 2017 13:28:17 -0500
Subject: [Python-Dev] Why aren't decorators just expressions?
In-Reply-To: <03325C3F-3DE0-45D0-A6FC-59E78917BA96@python.org>
References: <1b232526-ebc4-5d34-76c9-2eaaca6de69a@hastings.org>
 <03325C3F-3DE0-45D0-A6FC-59E78917BA96@python.org>
Message-ID: 

> Indeed, I can't remember a single time where I've needed that, let
> alone actually realized the restriction existed.

Likewise. I suspect the use of a function sort of just fell out from
the pre-decorator usage. Things like staticmethod() and classmethod()
were used well before decorators were available. (According to the
linked 2.7 doc references, both were introduced in 2.2, with decorator
tweaks coming along in 2.4.) In fact, those two functions might have
been the most significant driving force in the development of
decorators, which, I believe, were always just thought of as nothing
more than syntactic sugar for existing common use cases.

Skip

From gvanrossum at gmail.com  Sat Sep 16 15:59:26 2017
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat, 16 Sep 2017 12:59:26 -0700
Subject: [Python-Dev] Why aren't decorators just expressions?
In-Reply-To: 
References: <1b232526-ebc4-5d34-76c9-2eaaca6de69a@hastings.org>
 <03325C3F-3DE0-45D0-A6FC-59E78917BA96@python.org>
Message-ID: 

It was and is all very intentional. I don't want to encourage line
noise, which the at-sign already resembles. But namespacing and some
form of parametrization (i.e. calls) are essential. So that's what we
got.

On Sep 16, 2017 11:30 AM, "Skip Montanaro" wrote:

> > Indeed, I can't remember a single time where I've needed that, let
> > alone actually realized the restriction existed.
>
> Likewise. I suspect the use of a function sort of just fell out from
> the pre-decorator usage. Things like staticmethod() and classmethod()
> were used well before decorators were available. (According to the
> linked 2.7 doc references, both were introduced in 2.2, with decorator
> tweaks coming along in 2.4.) In fact, those two functions might have
> been the most significant driving force in the development of
> decorators, which, I believe, were always just thought of as nothing
> more than syntactic sugar for existing common use cases.
>
> Skip

From benjamin at python.org  Sat Sep 16 18:05:32 2017
From: benjamin at python.org (Benjamin Peterson)
Date: Sat, 16 Sep 2017 15:05:32 -0700
Subject: [Python-Dev] [RELEASE] Python 2.7.14
Message-ID: <1505599532.1460496.1108442448.264843EC@webmail.messagingengine.com>

I'm happy to announce the immediate availability of Python 2.7.14, yet
another bug fix release in the Python 2.7 series. 2.7.14 includes 9
months of conservative bug fixes from the 3.x branch.

Downloads of source code and binaries are at:

    https://www.python.org/downloads/release/python-2714/

Bugs may be reported at https://bugs.python.org/

Warmly,
Benjamin
2.7 release manager
(on behalf of all of 2.7's contributors)

From solipsis at pitrou.net  Sun Sep 17 16:21:51 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 17 Sep 2017 22:21:51 +0200
Subject: [Python-Dev] PEP 554 v2 (new "interpreters" module)
In-Reply-To: 
References: <20170909140510.1220f57e@fsol>
Message-ID: <20170917222151.0868146a@fsol>

On Tue, 12 Sep 2017 14:43:43 -0700
Eric Snow wrote:
> On Sat, Sep 9, 2017 at 5:05 AM, Antoine Pitrou wrote:
> > On Fri, 8 Sep 2017 16:04:27 -0700, Eric Snow wrote:
> >> ``list()``::
> >
> > It's called ``enumerate()`` in the threading module. Not sure there's
> > a point in choosing a different name here.
>
> Yeah, in the first version of the PEP it was called "enumerate()". I
> changed it to "list()" at Raymond's recommendation. The reasoning is
> that it's less confusing to most people that way. TBH, I'd rather
> leave it "list()", but could be swayed. Perhaps it would be enough
> for the PEP to not mention any relationship to "threading"?

Yeah, given the other clarifications you did, I agree not mentioning
threading at all would be better, since the PEP itself does not strive
to provide new concurrency primitives.

As for the naming, let's make it both unconfusing and explicit. How
about three functions: `all_interpreters()`, `running_interpreters()`
and `idle_interpreters()`, for example?

> > And why guarantee that it executes in the "current OS thread"?
> > I would say you don't want to specify where it executes exactly, as
> > it opens the door for more sophisticated implementations (such as
> > automatic assignment of subinterpreters inside a pool of threads).
>
> Again, I had explained this poorly in the PEP. The key thing here is
> that subinterpreters don't do anything special relative to threading.
> If you want to call "Interpreter.run()" in a thread then you stick it
> in a "threading.Thread". If you want to auto-assign to a pool of
> threads then you treat it like any other function you would
> auto-assign to a pool of threads.

Makes sense, thanks. Perhaps you want to mention that example quickly
in the PEP? (or perhaps not)

> > Has any thought been given to how FIFOs could integrate with async
> > code driven by an event loop (e.g. asyncio)? I think the model of
> > executing several asyncio (or Tornado) applications each in their
> > own subinterpreter may prove quite interesting to reconcile
> > multi-core concurrency with ease of programming.
> > That would require the FIFOs to be able to synchronize on something
> > an event loop can wait on (probably a file descriptor?).
>
> Personally I've given pretty much no thought to the relationship with
> async. TBH, the FIFO parts of the PEP were added only recently and
> haven't fully baked yet. I'd be interested in more feedback on async
> relative to the PEP, not just the FIFO bits; my experience with async
> is pretty limited thus far.

I think the FIFO is the primary point where interaction with the event
loop would prove useful, since that's how you would communicate and
synchronize with other interpreters.

Actually, even waiting on an interpreter's state (e.g. running / idle)
could use a "hidden" FIFO sending the right messages (for example the
strings "running" and "idle", for a crude proof-of-concept).

Regards

Antoine.

From victor.stinner at gmail.com  Mon Sep 18 05:31:12 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 18 Sep 2017 11:31:12 +0200
Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__
Message-ID: 

Hi,

Python 3 added a __traceback__ attribute to exception objects. I guess
that it was added to be able to get the original traceback when an
exception is re-raised. Artificial example (real code is more complex,
with subfunctions and conditional code):

    try:
        ...
    except Exception as exc:
        ...
        raise exc  # keep the original traceback

The problem is that Exception.__traceback__ creates reference cycles.
If you store an exception in a local variable, suddenly all local
variables of all frames (of the current traceback) become part of a
giant reference cycle. Sometimes all these variables are kept alive
very long (until you exit Python?) -- at least much longer than the
"expected" lifetime (local variables are normally destroyed
"magically" when you exit the function).

The reference cycle is:

1) frame -> local variable -> variable which stores the exception
2) exception -> traceback -> frame

exception -> ... -> frame -> ... -> same exception

Breaking the reference cycle manually is complex. First, you must be
aware of the reference cycle! Second, you have to identify which
functions of your large application create the reference cycle: this
task is long and painful even with good tooling. Finally, you have to
explicitly clear variables or attributes to break the reference cycle
manually.

asyncio.Future.set_exception() keeps an exception object and its
traceback object alive: asyncio creates reference cycles *by design*.
Enjoy! asyncio tries hard to reduce the consequences of reference
cycles, or even to break cycles, using hacks like "self = None" in
methods... Setting self to None is really surprising and requires a
comment explaining the hack.

In recent years, I fixed many reference cycles in various parts of the
Python 3 standard library. Sometimes, it takes years to become aware
of a reference cycle and finally fix it.
For example, recently, I worked on fixing all "dangling threads"
leaked by tests of the Python test suite, and I found and fixed many
reference cycles which had probably existed since Python 3 was created
(forked from Python 2):

* socket.create_connection(): commit
  acb9fa79fa6453c2bbe3ccfc9cad2837feb90093, bpo-31234
* concurrent.futures.ThreadPoolExecutor: commit
  bc61315377056fe362b744d9c44e17cd3178ce54, bpo-31249
* pydoc: commit 4cab2cd0c05fcda5fcb128c9eb230253fff88c21, bpo-31238
* xmlrpc.server: commit 84524454d0ba77d299741c47bd0f5841ac3f66ce,
  bpo-31247

Other examples:

* test_ssl: commit 868710158910fa38e285ce0e6d50026e1d0b2a8c, bpo-31323
* test_threading: commit 3d284c081fc3042036adfe1bf2ce92c34d743b0b,
  bpo-31234

Another example of a recent change fixing a reference cycle, by
Antoine Pitrou:

* multiprocessing: commit 79d37ae979a65ada0b2ac820279ccc3b1cd41ba6,
  bpo-30775

For socket.create_connection(), I discovered the reference cycle
because a test started to log a warning about a dangling thread. The
warning was introduced indirectly by a change which modified the
support.HOST value from '127.0.0.1' to 'localhost'... It's hard to see
the link between the support.HOST value and a reference cycle. Full
story: https://bugs.python.org/issue29639#msg302087

Again, it's just yet another random example of a very tricky reference
cycle bug caused by Exception.__traceback__.

Ideally, CPython 3.x should never create reference cycles. Removing
Exception.__traceback__ is the obvious "fix" for the issue. But I
expect that slowly, a lot of code started to rely on the attribute,
maybe even for good reasons :-)

A more practical solution would be to log a warning. Maybe the garbage
collector could emit a warning if it detects an exception that is part
of a reference cycle? Or maybe detect frames?

If the GC cannot do it, maybe we could use a debug thread (enabled
manually) which checks periodically, using gc.get_objects(), whether
an exception is part of a reference cycle: check if an exception
remains alive longer than X seconds. I had the same idea for asyncio,
to detect reference cycles or tasks that are never "awaited", but I
never implemented it.

Victor

From solipsis at pitrou.net  Mon Sep 18 05:48:24 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 18 Sep 2017 11:48:24 +0200
Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__
References: 
Message-ID: <20170918114824.741c17e4@fsol>

On Mon, 18 Sep 2017 11:31:12 +0200
Victor Stinner wrote:
>
> Ideally, CPython 3.x should never create reference cycles. Removing
> Exception.__traceback__ is the obvious "fix" for the issue. But I
> expect that slowly, a lot of code started to rely on the attribute,
> maybe even for good reasons :-)

The real issue is not reference cycles, but the fact that a traceback
frame keeps all locals alive. When the frame's execution is finished,
that information is practically almost always useless, but is kept for
the rare cases of debugging (and perhaps dubious features such as some
of py.test's magic). So we're constantly paying the price of this
minor, albeit probably useful (*), debugging-related feature, with
annoying object lifetime issues that are sometimes quite difficult to
track down and solve.

(*) (I've hardly ever used pdb so I don't know if people often examine
the locals of finished frames, but I suppose some do)

Perhaps it would be useful to have a way to tell the interpreter to
automatically clear all locals on frames when they have finished
executing.
It could be a thread-local setting (or, better, a PEP 550 context
variable setting).

Regards

Antoine.

From victor.stinner at gmail.com  Mon Sep 18 06:02:03 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 18 Sep 2017 12:02:03 +0200
Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__
In-Reply-To: 
References: 
Message-ID: 

Hum, my email is long. Maybe a short example is simpler to understand:

---
import gc

class VerboseDestructor:
    # Imagine an object using many limited resources like memory,
    # sockets and/or files
    def __del__(self):
        print("VerboseDestructor")

def create_ref_cycle():
    obj = VerboseDestructor()
    # Create a reference cycle using an exception
    # to keep the frame alive
    try:
        raise ValueError
    except ValueError as exc:
        err = exc

print("create_ref_cycle")
create_ref_cycle()
print("gc.collect")
gc.collect()
print("exit")
---

Output:

---
create_ref_cycle
gc.collect
VerboseDestructor
exit
---

The create_ref_cycle() function creates a reference cycle which causes
the "obj" variable to be kept alive after create_ref_cycle() exits.

Let's say that the "leaked" object uses a lot of memory. The bug can
now be called a memory leak.

Let's say that the leaked object uses a file. Now the risk is that the
file cannot be removed anymore on Windows (you cannot remove an open
file), or that data is not properly flushed to disk, or that the same
file is re-opened by a different function in a different thread and so
you might corrupt data, etc. It's not hard to imagine the various
issues caused by an object kept alive longer than expected.

Well, you *can* modify your code to add a VerboseDestructor.close()
method and explicitly release resources as soon as possible in
create_ref_cycle(). Maybe by adding support for the context manager
protocol to VerboseDestructor and using "with obj:" in
create_ref_cycle(). But this assumes that you are already aware of the
reference cycle. Moreover, Python 3 haters can complain that the code
worked perfectly fine in Python 2...

Victor

From ncoghlan at gmail.com  Mon Sep 18 06:07:42 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 18 Sep 2017 20:07:42 +1000
Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__
In-Reply-To: <20170918114824.741c17e4@fsol>
References: 
 <20170918114824.741c17e4@fsol>
Message-ID: 

On 18 September 2017 at 19:48, Antoine Pitrou wrote:
> On Mon, 18 Sep 2017 11:31:12 +0200
> Victor Stinner wrote:
>>
>> Ideally, CPython 3.x should never create reference cycles. Removing
>> Exception.__traceback__ is the obvious "fix" for the issue. But I
>> expect that slowly, a lot of code started to rely on the attribute,
>> maybe even for good reasons :-)
>
> The real issue is not reference cycles, but the fact that a traceback
> frame keeps all locals alive. When the frame's execution is finished,
> that information is practically almost always useless, but is kept for
> the rare cases of debugging (and perhaps dubious features such as
> some of py.test's magic).

As another use case, IPython tracebacks will include details of
referenced local variables, and
https://docs.python.org/3/library/traceback.html#tracebackexception-objects
offers the ability to readily capture the repr() of locals referenced
from a traceback for the same reason.

> So we're constantly paying the price of this
> > (*) (I've hardly ever used pdb so I don't know if people often examine > the locals of finished frames, but I suppose some do) > > Perhaps it would be useful to have a way to tell the interpreter to > automatically clear all locals on frames when they have finished > executing. It could be a thread-local setting (or, better, a PEP 550 > context variable setting). I wonder if it might be reasonable to have tracebacks only hold a weak reference to their frame objects when "__debug__ == False". Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Mon Sep 18 06:18:21 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 18 Sep 2017 12:18:21 +0200 Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__ In-Reply-To: References: <20170918114824.741c17e4@fsol> Message-ID: 2017-09-18 12:07 GMT+02:00 Nick Coghlan : > I wonder if it might be reasonable to have tracebacks only hold a weak > reference to their frame objects when "__debug__ == False". Please don't change the Python behaviour depending on __debug__, or it will be a nightmare to debug it :-( ("Why does it work on my computer?") Victor From ncoghlan at gmail.com Mon Sep 18 06:35:02 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 18 Sep 2017 20:35:02 +1000 Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__ In-Reply-To: References: <20170918114824.741c17e4@fsol> Message-ID: On 18 September 2017 at 20:18, Victor Stinner wrote: > 2017-09-18 12:07 GMT+02:00 Nick Coghlan : >> I wonder if it might be reasonable to have tracebacks only hold a weak >> reference to their frame objects when "__debug__ == False". > > Please don't change the Python behaviour depending on __debug__, or it > will be a nightmare to debug it :-( ("Why does it work on my > computer?") Yeah, that's always a challenge :) Rather than being thread local or context local state, whether or not to keep the full frames in the traceback could be a yet another setting on the *exception* object, whereby we tweaked the logic that drops the reference at the end of an except clause as follows: 1. If the exception ref count == 1, just drop the reference and let the traceback die with it 2. If the exception ref count > 1, check for a new "__keep_locals__" attribute 3. If it's true, just drop the reference and otherwise leave the exception alone 4. If it's false, recursively clear all the already terminated frames in __traceback__, __context__, and __cause__ For now, we'd default to "__keep_locals__ = True", but we'd attempt to figure out some way to migrate to the default being "__keep_locals__ = False". Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Mon Sep 18 06:46:36 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 18 Sep 2017 12:46:36 +0200 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) References: Message-ID: <20170918124636.7ee8de0b@fsol> Hi, First my high-level opinion about the PEP: the CSP model can probably be already implemented using Queues. To me, the interesting promise of subinterpreters is if they allow to remove the GIL while sharing memory for big objects (such as Numpy arrays). This means the PEP should probably focus on potential concurrency improvements rather than try to faithfully follow the CSP model. 
Other than that, a bunch of detailed comments follow:

On Wed, 13 Sep 2017 18:44:31 -0700
Eric Snow wrote:
>
> API for interpreters
> --------------------
>
> The module provides the following functions:
>
> ``list_all()``::
>
>    Return a list of all existing interpreters.

See my naming proposal in the previous thread.

>    run(source_str, /, **shared):
>
>       Run the provided Python source code in the interpreter. Any
>       keyword arguments are added to the interpreter's execution
>       namespace.

"Execution namespace" specifically means the __main__ module in the
target interpreter, right?

>       If any of the values are not supported for sharing
>       between interpreters then RuntimeError gets raised. Currently
>       only channels (see "create_channel()" below) are supported.
>
>       This may not be called on an already running interpreter.
>       Doing so results in a RuntimeError.

I would distinguish between the two error cases: RuntimeError for
calling run() on an already running interpreter, ValueError for values
which are not supported for sharing.

>       Likewise, if there is any uncaught exception, it propagates
>       into the code where "run()" was called.

That makes it a bit harder to differentiate from errors raised by
run() itself (see above), though how much of an annoyance this is
remains unclear. The more contentious implication, though, is that it
forces the interpreter to support migration of arbitrary objects from
one interpreter to another (since a traceback keeps all local
variables alive).

> API for sharing data
> --------------------
>
> The mechanism for passing objects between interpreters is through
> channels. A channel is a simplex FIFO similar to a pipe. The main
> difference is that channels can be associated with zero or more
> interpreters on either end.

So it seems channels have become more complicated now? Is it important
to support multi-producer multi-consumer channels?

> Unlike queues, which are also many-to-many, channels have no buffer.

How does it work? Does send() block until someone else calls recv()?
That does not sound like a good idea to me. I don't think it's a
coincidence that the most varied kinds of I/O (from socket or file IO
to threading Queues to multiprocessing Pipes) have non-blocking
send(). send() blocking until someone else calls recv() is not only
bad for performance, it also increases the likelihood of deadlocks.

>    recv_nowait(default=None):
>
>       Return the next object from the channel. If none have been sent
>       then return the default. If the channel has been closed
>       then EOFError is raised.
>
>    close():
>
>       No longer associate the current interpreter with the channel
>       (on the receiving end). This is a noop if the interpreter isn't
>       already associated. Once an interpreter is no longer associated
>       with the channel, subsequent (or current) send() and recv()
>       calls from that interpreter will raise EOFError.

EOFError normally means the *other* (sending) side has closed the
channel (but it becomes complicated with a multi-producer
multi-consumer setup...). When *this* side has closed the channel, we
should raise ValueError.

> The Python runtime will garbage collect all closed channels. Note
> that "close()" is automatically called when it is no longer used in
> the current interpreter.

"No longer used" meaning it loses all references in this interpreter?

>    send(obj):
>
>       Send the object to the receiving end of the channel. Wait until
>       the object is received. If the channel does not support the
>       object then TypeError is raised. Currently only bytes are
>       supported.
>       If the channel has been closed then EOFError is raised.

Similar remark as above (EOFError vs. ValueError). More generally,
send() raising EOFError sounds unheard of.

A sidenote: context manager support (__enter__ / __exit__) on channels
would sound more useful to me than iteration support.

> Initial support for buffers in channels
> ---------------------------------------
>
> An alternative to support for bytes in channels is support for
> read-only buffers (the PEP 3119 kind).

Probably you mean PEP 3118.

> Then ``recv()`` would return
> a memoryview to expose the buffer in a zero-copy way.

It will probably not do much if you can only pass buffers and not
structured objects, because unserializing (e.g. unpickling) from a
buffer will still copy memory around. To pass a Numpy array, for
example, you not only need to pass its contents but also its metadata
(its value type -- named "dtype" --, its shape and strides). This may
be serialized as simple tuples of atomic types (str, int, bytes, other
tuples), but you want to include a memoryview of the data area
somewhere in those tuples.

(and, of course, at some point, this will feel like reinventing
pickle :-) -- but pickle has no mechanism to avoid memory copies, so
it can't readily be reused here, otherwise you're just reinventing
multiprocessing...)

> timeout arg to pop() and push()
> -------------------------------

pop() and push() don't exist anymore :-)

> Synchronization Primitives
> --------------------------
>
> The ``threading`` module provides a number of synchronization
> primitives for coordinating concurrent operations. This is especially
> necessary due to the shared-state nature of threading. In contrast,
> subinterpreters do not share state. Data sharing is restricted to
> channels, which do away with the need for explicit synchronization.

I think this rationale confuses Python-level data sharing with
process-level data sharing. The main point of subinterpreters
(compared to multiprocessing) is that they live in the same OS
process. So it's really not true that you can't share a low-level
synchronization primitive (say a semaphore) between subinterpreters.

(also see multiprocessing/synchronize.py, which implements all
synchronization primitives using basic low-level semaphores)

> Solutions include:
>
> * a ``create()`` arg to indicate resetting ``__main__`` after each
>   ``run`` call
> * an ``Interpreter.reset_main`` flag to support opting in or out
>   after the fact
> * an ``Interpreter.reset_main()`` method to opt in when desired

This would all be a false promise. Persistent state lives in other
places than __main__ (for example the loaded modules and their
respective configurations - think logging or decimal).

> Use queues instead of channels
> ------------------------------
>
> The main difference between queues and channels is that queues
> support buffering. This would complicate the blocking semantics of
> ``recv()`` and ``send()``. Also, queues can be built on top of
> channels.

But buffering with background threads in pure Python will be orders of
magnitude slower than optimized buffering in a custom low-level
implementation. It would be a pity if a subinterpreters Queue ended up
as slow as a multiprocessing Queue.

Regards

Antoine.
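P.S. To make the Numpy point concrete, here is a sketch of the kind of
(metadata, buffer) pair a channel would have to carry -- assuming a
C-contiguous array, and with made-up helper names:

    import numpy as np

    def describe(arr):
        # Metadata travels as atomic types, the data area as a
        # zero-copy buffer
        assert arr.flags["C_CONTIGUOUS"]
        return str(arr.dtype), arr.shape, memoryview(arr).cast("B")

    def rebuild(dtype, shape, buf):
        # Zero-copy on the receiving side as well
        return np.frombuffer(buf, dtype=dtype).reshape(shape)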
From solipsis at pitrou.net  Mon Sep 18 06:50:31 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 18 Sep 2017 12:50:31 +0200
Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__
In-Reply-To: 
References: <20170918114824.741c17e4@fsol>
Message-ID: <20170918125031.0476e545@fsol>

On Mon, 18 Sep 2017 20:07:42 +1000
Nick Coghlan wrote:
> On 18 September 2017 at 19:48, Antoine Pitrou wrote:
> > On Mon, 18 Sep 2017 11:31:12 +0200
> > Victor Stinner wrote:
> >>
> >> Ideally, CPython 3.x should never create reference cycles. Removing
> >> Exception.__traceback__ is the obvious "fix" for the issue. But I
> >> expect that slowly, a lot of code started to rely on the attribute,
> >> maybe even for good reasons :-)
> >
> > The real issue is not reference cycles, but the fact that a traceback
> > frame keeps all locals alive. When the frame's execution is finished,
> > that information is practically almost always useless, but is kept for
> > the rare cases of debugging (and perhaps dubious features such as
> > some of py.test's magic).
>
> As another use case, IPython tracebacks will include details of
> referenced local variables, and
> https://docs.python.org/3/library/traceback.html#tracebackexception-objects
> offers the ability to readily capture the repr() of locals referenced
> from a traceback for the same reason.

True... but interactive use has different concerns than production use
(hence my proposal of a setting to change frame cleanup behaviour).

Besides, IPython also stores the history of displayed values, which is
another cause of keeping objects alive :-)

Regards

Antoine.

From solipsis at pitrou.net  Mon Sep 18 06:52:17 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 18 Sep 2017 12:52:17 +0200
Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__
In-Reply-To: 
References: <20170918114824.741c17e4@fsol>
Message-ID: <20170918125217.391ed882@fsol>

On Mon, 18 Sep 2017 20:35:02 +1000
Nick Coghlan wrote:
> On 18 September 2017 at 20:18, Victor Stinner wrote:
> > 2017-09-18 12:07 GMT+02:00 Nick Coghlan :
> >> I wonder if it might be reasonable to have tracebacks only hold a weak
> >> reference to their frame objects when "__debug__ == False".
> >
> > Please don't change the Python behaviour depending on __debug__, or it
> > will be a nightmare to debug it :-( ("Why does it work on my
> > computer?")
>
> Yeah, that's always a challenge :)
>
> Rather than being thread-local or context-local state, whether or not
> to keep the full frames in the traceback could be yet another setting
> on the *exception* object, whereby we tweaked the logic that drops the
> reference at the end of an except clause as follows:

That doesn't solve the problem, since the issue is that exceptions can
be raised (and then silenced) in many places, and you don't want such
exception-raising code (such as socket.create_connection) to start
having to set an option on the exceptions it raises.

Regards

Antoine.
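P.S. Schematically, the pattern in question -- roughly what
socket.create_connection() does (IIRC Victor's fix clears "err" in a
finally block):

    import socket

    def connect(host, port):
        err = None
        for res in socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM):
            af, socktype, proto, canonname, sa = res
            try:
                sock = socket.socket(af, socktype, proto)
                sock.connect(sa)
                return sock
            except OSError as exc:
                err = exc  # exc.__traceback__ -> frame -> err: a cycle
        if err is not None:
            raise err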
From ncoghlan at gmail.com  Mon Sep 18 07:40:40 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 18 Sep 2017 21:40:40 +1000
Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__
In-Reply-To: <20170918125217.391ed882@fsol>
References: <20170918114824.741c17e4@fsol>
 <20170918125217.391ed882@fsol>
Message-ID: 

On 18 September 2017 at 20:52, Antoine Pitrou wrote:
> On Mon, 18 Sep 2017 20:35:02 +1000
> Nick Coghlan wrote:
>> Rather than being thread-local or context-local state, whether or not
>> to keep the full frames in the traceback could be yet another setting
>> on the *exception* object, whereby we tweaked the logic that drops the
>> reference at the end of an except clause as follows:
>
> That doesn't solve the problem, since the issue is that exceptions can
> be raised (and then silenced) in many places, and you don't want such
> exception-raising code (such as socket.create_connection) to start
> having to set an option on the exceptions it raises.

Bleh, in trying to explain why my proposal would be sufficient to
break the problematic cycles, I realised I was wrong: if we restrict
the frame clearing to already terminated frames (as would be necessary
to avoid breaking any still executing functions), then that means we
won't clear the frame running the exception handler, and that's the
frame that creates the problematic cyclic reference.

However, I still think it makes more sense to focus on the semantics
of preserving an exception beyond the life of the stack being unwound,
rather than on the semantics of raising the exception in the first
place.

In the usual case, the traceback does keep the whole stack alive while
the stack is being unwound, but then the exception gets thrown away at
the end when sys.exc_info() gets reset back to (None, None, None), and
then all the frames still get cleaned up fairly promptly (this is also
the case in Python 2).

We only get problems when one of the exception handlers in the stack
grabs an additional reference to either the traceback or the exception
and hence creates a persistent cyclic reference from one of the frames
back to itself.

The difference in Python 3 is that saving the exception is reasonably
common, while explicitly saving the traceback is relatively rare, so
"exc.__traceback__" is keeping tracebacks alive that would otherwise
have been cleaned up more deterministically.

Putting the problem that way gave me an idea, though: what if, when
the interpreter was setting "sys.exc_info()" back to (None, None,
None) (or otherwise dropping an exception instance from being the
"currently active exception") it automatically set exc.__traceback__
to None?

That way, if you wanted the *traceback* (rather than just the
exception) to live beyond the stack being unwound, you'd have to
preserve the entire sys.exc_info() triple (or at least save
"exc.__traceback__" separately from "exc").

By doing that, we'd have the opportunity to encourage folks that are
considering preserving the entire traceback to extract a
TracebackException instead, and save themselves from some potentially
nasty reference management issues:
https://docs.python.org/3/library/traceback.html#tracebackexception-objects

Cheers,
Nick.
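P.S. For reference, the pattern I mean -- capturing a fully rendered
traceback without keeping any frame (and hence any local variable)
alive:

    import traceback

    try:
        1 / 0
    except ZeroDivisionError as exc:
        # Extracts the stack summary eagerly; no frame objects are kept
        tb = traceback.TracebackException.from_exception(exc)

    print("".join(tb.format()))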
--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From vadmium+py at gmail.com  Mon Sep 18 08:56:40 2017
From: vadmium+py at gmail.com (Martin Panter)
Date: Mon, 18 Sep 2017 12:56:40 +0000
Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__
In-Reply-To: 
References: 
Message-ID: 

On 18 September 2017 at 09:31, Victor Stinner wrote:
> In recent years, I fixed many reference cycles in various parts of the
> Python 3 standard library. Sometimes, it takes years to become aware
> of the reference cycle and finally fix it.
>
> For example, recently, I worked on fixing all "dangling threads"
> leaked by tests of the Python test suite, and I found and fixed many
> reference cycles which probably existed since Python 3 was created
> (forked from Python 2)

My instinct is to suggest making the "dangling threads" test tolerate
Thread objects that are no longer running. Last time I looked, it tests
a list of weak references to Thread objects. But maybe it should check
"Thread.is_alive" or use "threading.enumerate".

> . . .
>
> Ideally, CPython 3.x should never create reference cycles. Removing
> Exception.__traceback__ is the obvious "fix" for the issue. But I
> expect that slowly, a lot of code started to rely on the attribute,
> maybe even for good reasons :-)
>
> A more practical solution would be to log a warning. Maybe the garbage
> collector can emit a warning if it detects an exception part of a
> reference cycle? Or maybe detect frames?

The "gc" module can do some of this already. In the past (when __del__
used to prevent garbage collection) I used DEBUG_SAVEALL and graphed the
result to figure out where code was creating reference cycles. Or maybe
you could turn on DEBUG_COLLECTABLE and filter the objects printed out,
or play with the new "gc.callbacks" API.

> If the GC cannot do it, maybe we might use a debug thread (enabled
> manually) which checks manually if an exception is part of a reference
> cycle using gc.get_objects(): check if an exception remains alive
> longer than X seconds? I had the same idea for asyncio, to detect
> reference cycles or if a task is never "awaited", but I never
> implemented the idea.

From guido at python.org  Mon Sep 18 10:52:27 2017
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Sep 2017 07:52:27 -0700
Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__
In-Reply-To: <20170918114824.741c17e4@fsol>
References: <20170918114824.741c17e4@fsol>
Message-ID: 

On Mon, Sep 18, 2017 at 2:48 AM, Antoine Pitrou wrote:

> On Mon, 18 Sep 2017 11:31:12 +0200
> Victor Stinner wrote:
> >
> > Ideally, CPython 3.x should never create reference cycles. Removing
> > Exception.__traceback__ is the obvious "fix" for the issue. But I
> > expect that slowly, a lot of code started to rely on the attribute,
> > maybe even for good reasons :-)
>
> The real issue is not reference cycles, but the fact that a traceback
> frame keeps all locals alive. When the frame's execution is finished,
> that information is practically almost always useless, but is kept for
> the rare cases of debugging (and perhaps dubious features such as
> some of py.test's magic). So we're constantly paying the price of this
> minor, albeit probably useful (*), debugging-related feature with
> annoying object lifetime issues that are sometimes quite difficult to
> track down and solve.
>
> (*) (I've hardly ever used pdb so I don't know if people often examine
> the locals of finished frames, but I suppose some do)

Yes, that's what `pdb.pm()` is for.
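A minimal sketch of that post-mortem use (pdb.pm() itself reads
sys.last_traceback at the interactive prompt; pdb.post_mortem() is the
scriptable equivalent; the function here is made up):

    import pdb
    import sys

    def divide(a, b):
        scale = 10           # a local worth inspecting after the crash
        return scale * a / b

    try:
        divide(1, 0)
    except ZeroDivisionError:
        # Drop into the dead frame: 'scale' is still there, precisely
        # because the traceback kept the frame and its locals alive.
        pdb.post_mortem(sys.exc_info()[2])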
> Perhaps it would be useful to have a way to tell the interpreter to
> automatically clear all locals on frames when they have finished
> executing. It could be a thread-local setting (or, better, a PEP 550
> context variable setting).

In Python 2 the traceback was not part of the exception object because
there was (originally) no cycle GC. In Python 3 we changed the awkward
interface to something more useful, because we could depend on GC. Why
are we now trying to roll back this feature? We should just improve GC.
(Or perhaps you shouldn't be raising so many exceptions. :-)

--
--Guido van Rossum (python.org/~guido)

From antoine at python.org  Mon Sep 18 10:56:22 2017
From: antoine at python.org (Antoine Pitrou)
Date: Mon, 18 Sep 2017 16:56:22 +0200
Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__
In-Reply-To: 
References: <20170918114824.741c17e4@fsol>
Message-ID: <7fbed308-1eec-6dce-f701-405682e3d297@python.org>

On 18/09/2017 at 16:52, Guido van Rossum wrote:
>
> In Python 2 the traceback was not part of the exception object because
> there was (originally) no cycle GC. In Python 3 we changed the awkward
> interface to something more useful, because we could depend on GC. Why
> are we now trying to roll back this feature? We should just improve GC.
> (Or perhaps you shouldn't be raising so many exceptions. :-)

Improving the GC is obviously a good thing, but what heuristic would
you have in mind that may solve the issue at hand?

Regards

Antoine.

From merwok at netwok.org  Mon Sep 18 11:33:48 2017
From: merwok at netwok.org (Éric Araujo)
Date: Mon, 18 Sep 2017 11:33:48 -0400
Subject: [Python-Dev] Why aren't decorators just expressions?
In-Reply-To: 
References: <1b232526-ebc4-5d34-76c9-2eaaca6de69a@hastings.org>
Message-ID: 

Hello,

On 2017-09-16 at 07:22, Serhiy Storchaka wrote:
> On 16.09.17 12:39, Larry Hastings wrote:
>> So why don't decorators allow arbitrary expressions? [...]
>
> Actually I remember somebody raised this question a year or two ago,
> but don't remember details.

The discussion I remember was https://bugs.python.org/issue19660

Guido said: "Nobody has posted a real use case. All the examples are
toys. What are the real use cases that are blocked by this? Readability
counts!"

Cheers

From njs at pobox.com  Mon Sep 18 12:42:45 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 18 Sep 2017 09:42:45 -0700
Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__
In-Reply-To: <7fbed308-1eec-6dce-f701-405682e3d297@python.org>
References: <20170918114824.741c17e4@fsol> <7fbed308-1eec-6dce-f701-405682e3d297@python.org>
Message-ID: 

On Sep 18, 2017 07:58, "Antoine Pitrou" wrote:

On 18/09/2017 at 16:52, Guido van Rossum wrote:
>
> In Python 2 the traceback was not part of the exception object because
> there was (originally) no cycle GC. In Python 3 we changed the awkward
> interface to something more useful, because we could depend on GC. Why
> are we now trying to roll back this feature? We should just improve GC.
> (Or perhaps you shouldn't be raising so many exceptions. :-)

Improving the GC is obviously a good thing, but what heuristic would you
have in mind that may solve the issue at hand?


I read the whole thread and I'm not sure what the issue at hand is :-).
Obviously it's nice when the refcount system is able to implicitly clean
things up in a prompt and deterministic way, but there are already tools
to handle the cases where it doesn't (ResourceWarning, context managers,
...), and the more we encourage people to implicitly rely on refcounting,
the harder it is to optimize the interpreter or use alternative language
implementations.

Why are reference cycles a problem that needs solving?

Actually, the bit that I find most confusing about Victor's story is, how
can a traceback frame keep a thread alive? Is the problem that "dangling
thread" warnings are being triggered by threads that are finished and
dead but their Thread objects are still allocated? Because if so then
that seems like a bug in the warnings mechanism; there's no harm in a
dead Thread hanging around until collected, and Victor may have wasted a
day debugging an issue that wasn't a problem in the first place...

-n

From solipsis at pitrou.net  Mon Sep 18 12:50:40 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 18 Sep 2017 18:50:40 +0200
Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__
References: <20170918114824.741c17e4@fsol> <7fbed308-1eec-6dce-f701-405682e3d297@python.org>
Message-ID: <20170918185040.4a38bc62@fsol>

On Mon, 18 Sep 2017 09:42:45 -0700
Nathaniel Smith wrote:
>
> Obviously it's nice when the refcount system is able to implicitly clean
> things up in a prompt and deterministic way, but there are already tools to
> handle the cases where it doesn't (ResourceWarning, context managers, ...),
> and the more we encourage people to implicitly rely on refcounting, [...]

The thing is, we don't need to encourage them. Having objects disposed
of when the last visible reference vanishes is a pretty common
expectation people have when using CPython.

> Why are reference cycles a problem that needs solving?

Because sometimes they are holding up costly resources in memory when
people don't expect them to. Such as large Numpy arrays :-)

And, no, there is no obvious way for users to fix this. gc.collect()
is much too costly to be invoked on a regular basis.

> Because if so then that seems
> like a bug in the warnings mechanism; there's no harm in a dead Thread
> hanging around until collected, and Victor may have wasted a day debugging
> an issue that wasn't a problem in the first place...

Yes, I think Victor is going a bit overboard with the so-called
"dangling thread" issue. But the underlying issue (that heavyweight
resources can be inadvertently held in memory just because some
unrelated exception was caught and silenced along the way) is a real
one.

Regards

Antoine.
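A minimal sketch of the lifetime difference Antoine describes (a plain
class with a big buffer stands in for a large Numpy array):

    import gc
    import weakref

    class BigBuffer:
        def __init__(self):
            self.data = bytearray(10**8)   # ~100 MB stand-in for a Numpy array

    buf = BigBuffer()
    ref = weakref.ref(buf)
    del buf
    print(ref() is None)   # True: refcounting freed it immediately

    buf = BigBuffer()
    buf.cycle = buf        # now the object is part of a reference cycle
    ref = weakref.ref(buf)
    del buf
    print(ref() is None)   # False: still alive, waiting for the cyclic GC
    gc.collect()
    print(ref() is None)   # True: reclaimed only by the (costly) full pass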
From njs at pobox.com  Mon Sep 18 13:53:26 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 18 Sep 2017 10:53:26 -0700
Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__
In-Reply-To: <20170918185040.4a38bc62@fsol>
References: <20170918114824.741c17e4@fsol> <7fbed308-1eec-6dce-f701-405682e3d297@python.org> <20170918185040.4a38bc62@fsol>
Message-ID: 

On Mon, Sep 18, 2017 at 9:50 AM, Antoine Pitrou wrote:
> On Mon, 18 Sep 2017 09:42:45 -0700
> Nathaniel Smith wrote:
>>
>> Obviously it's nice when the refcount system is able to implicitly clean
>> things up in a prompt and deterministic way, but there are already tools to
>> handle the cases where it doesn't (ResourceWarning, context managers, ...),
>> and the more we encourage people to implicitly rely on refcounting, [...]
>
> The thing is, we don't need to encourage them. Having objects disposed
> of when the last visible reference vanishes is a pretty common
> expectation people have when using CPython.
>
>> Why are reference cycles a problem that needs solving?
>
> Because sometimes they are holding up costly resources in memory when
> people don't expect them to. Such as large Numpy arrays :-)

Do we have any reason to believe that this is actually happening on a
regular basis though? If it is then it might make sense to look at the
cycle collection heuristics; IIRC they're based on a fairly naive count
of how many allocations have been made, without regard to their size.

> And, no, there is no obvious way for users to fix this. gc.collect()
> is much too costly to be invoked on a regular basis.
>
>> Because if so then that seems
>> like a bug in the warnings mechanism; there's no harm in a dead Thread
>> hanging around until collected, and Victor may have wasted a day debugging
>> an issue that wasn't a problem in the first place...
>
> Yes, I think Victor is going a bit overboard with the so-called
> "dangling thread" issue. But the underlying issue (that heavyweight
> resources can be inadvertently held in memory just because some
> unrelated exception was caught and silenced along the way) is a real
> one.

Simply catching and silencing exceptions doesn't create any loops --
if you do

    try:
        raise ValueError
    except ValueError as exc:
        raise exc

then there's no loop, because the 'exc' local gets cleared as soon as
you exit the except: block. The issue that Victor ran into with
socket.create_connection is a special case where that function saves
off the caught exception to use later. If someone wanted to replace
socket.create_connection's 'raise err' with

    try:
        raise err
    finally:
        del err

then I guess that would be pretty harmless...

-n

--
Nathaniel J. Smith -- https://vorpus.org

From antoine at python.org  Mon Sep 18 13:59:51 2017
From: antoine at python.org (Antoine Pitrou)
Date: Mon, 18 Sep 2017 19:59:51 +0200
Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__
In-Reply-To: 
References: <20170918114824.741c17e4@fsol> <7fbed308-1eec-6dce-f701-405682e3d297@python.org> <20170918185040.4a38bc62@fsol>
Message-ID: <8ac58695-dcf0-156c-2967-862749808dcd@python.org>

On 18/09/2017 at 19:53, Nathaniel Smith wrote:
>>
>>> Why are reference cycles a problem that needs solving?
>>
>> Because sometimes they are holding up costly resources in memory when
>> people don't expect them to. Such as large Numpy arrays :-)
>
> Do we have any reason to believe that this is actually happening on a
> regular basis though?

Define "regular" :-)  We did get some reports on dask/distributed about it.
> If it is then it might make sense to look at the cycle collection
> heuristics; IIRC they're based on a fairly naive count of how many
> allocations have been made, without regard to their size.

Yes... But just because a lot of memory has been allocated isn't a good
enough heuristic to launch a GC collection. What if that memory is gonna
stay allocated for a long time? Then you're frequently launching GC runs
for no tangible result except more CPU consumption and frequent pauses.

Perhaps we could special-case tracebacks somehow, flag when a traceback
remains alive after the implicit "del" clause at the end of an "except"
block, then maintain some kind of linked list of the flagged tracebacks
and launch specialized GC runs to find cycles across that collection.
That sounds quite involved, though.

> The issue that Victor ran into with
> socket.create_connection is a special case where that function saves
> off the caught exception to use later.

That's true... but you can find such special cases in an assorted bunch
of library functions (stdlib or third-party), not just
socket.create_connection(). Fixing them one by one is always possible,
but it's a never-ending battle.

Regards

Antoine.

From ethan at stoneleaf.us  Mon Sep 18 14:37:09 2017
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 18 Sep 2017 11:37:09 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: 
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3318395E-FE76-4BB1-A693-DC0229A7A34C@python.org> <59B70B59.3030900@stoneleaf.us>
Message-ID: <59C01255.6020106@stoneleaf.us>

On 09/11/2017 03:28 PM, Guido van Rossum wrote:

> Oddly I don't like the enum (flag names get too long that way), but I
> do agree with everything else Barry said (it should be a trivalue flag
> and please don't name it cmp).

Hmmm, named constants are one of the motivating factors for having an
Enum type. It's easy to keep the name a reasonable length, however:
export them into the module-level namespace. re is an excellent example;
the existing flags were moved into an IntFlag enum, and then (for
backwards compatibility) aliased back to the module level:

    class RegexFlag(enum.IntFlag):
        ASCII = sre_compile.SRE_FLAG_ASCII # assume ascii "locale"
        IGNORECASE = sre_compile.SRE_FLAG_IGNORECASE # ignore case
        LOCALE = sre_compile.SRE_FLAG_LOCALE # assume current 8-bit locale
        UNICODE = sre_compile.SRE_FLAG_UNICODE # assume unicode "locale"
        MULTILINE = sre_compile.SRE_FLAG_MULTILINE # make anchors look for newline
        DOTALL = sre_compile.SRE_FLAG_DOTALL # make dot match newline
        VERBOSE = sre_compile.SRE_FLAG_VERBOSE # ignore whitespace and comments
        A = ASCII
        I = IGNORECASE
        L = LOCALE
        U = UNICODE
        M = MULTILINE
        S = DOTALL
        X = VERBOSE
        # sre extensions (experimental, don't rely on these)
        TEMPLATE = sre_compile.SRE_FLAG_TEMPLATE # disable backtracking
        T = TEMPLATE
        DEBUG = sre_compile.SRE_FLAG_DEBUG # dump pattern after compilation

    globals().update(RegexFlag.__members__)

So we can still do re.I instead of re.RegexFlag.I.

Likewise, if we had:

    class Compare(enum.Enum):
        NONE = 'each instance is an island'
        EQUAL = 'instances can be equal to each other'
        ORDERED = 'instances can be ordered and/or equal'
    globals().update(Compare.__members__)

then we can still use, for example, EQUAL, but get the more informative
repr and str when we need to.
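For instance, with the Compare definition above (output as in a Python
3.6 interactive session):

    >>> EQUAL
    <Compare.EQUAL: 'instances can be equal to each other'>
    >>> str(EQUAL)
    'Compare.EQUAL'
    >>> EQUAL is Compare.EQUAL
    True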
--
~Ethan~

From gzlist at googlemail.com  Mon Sep 18 14:55:02 2017
From: gzlist at googlemail.com (Martin (gzlist))
Date: Mon, 18 Sep 2017 19:55:02 +0100
Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__
In-Reply-To: 
References: 
Message-ID: 

Thanks for working on this and writing up the details, Victor.

For those confused about why this matters: routinely having every object
in your application participating in one (or more) giant reference
cycles makes reasoning about and fixing resource issues very difficult.

For instance, a while back I made some changes in Bazaar so that each
TestCase instance was dereferenced after being run, which mattered when
running ~25000 tests, to keep peak memory usage sane. Also there were
various other object lifetime issues, and tracking down which specific
test failed to join a thread or close a file on Windows was much simpler
after making sure cleanup actually happened at the time a test was
deleted.

Keeping more stuff alive for longer periods also makes peak memory
requirements higher, and the gc has to do more work each time.

On 18/09/2017, Victor Stinner wrote:
>
> Ideally, CPython 3.x should never create reference cycles. Removing
> Exception.__traceback__ is the obvious "fix" for the issue. But I
> expect that slowly, a lot of code started to rely on the attribute,
> maybe even for good reasons :-)
>
> A more practical solution would be to log a warning. Maybe the garbage
> collector can emit a warning if it detects an exception part of a
> reference cycle? Or maybe detect frames?

Logging a warning is unlikely to be practical. I had an optional test
flag that complained about reference cycles, and it produced a lot of
output.

Martin

From njs at pobox.com  Mon Sep 18 16:25:47 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 18 Sep 2017 13:25:47 -0700
Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__
In-Reply-To: <8ac58695-dcf0-156c-2967-862749808dcd@python.org>
References: <20170918114824.741c17e4@fsol> <7fbed308-1eec-6dce-f701-405682e3d297@python.org> <20170918185040.4a38bc62@fsol> <8ac58695-dcf0-156c-2967-862749808dcd@python.org>
Message-ID: 

On Mon, Sep 18, 2017 at 10:59 AM, Antoine Pitrou wrote:
> On 18/09/2017 at 19:53, Nathaniel Smith wrote:
>>>
>>>> Why are reference cycles a problem that needs solving?
>>>
>>> Because sometimes they are holding up costly resources in memory when
>>> people don't expect them to. Such as large Numpy arrays :-)
>>
>> Do we have any reason to believe that this is actually happening on a
>> regular basis though?
>
> Define "regular" :-) We did get some reports on dask/distributed about it.

Caused by uncollected cycles involving tracebacks? I looked here:
https://github.com/dask/distributed/issues?utf8=%E2%9C%93&q=is%3Aissue%20memory%20leak
and saw some issues with cycles causing delayed collection (e.g. #956)
or the classic memory leak problem of explicitly holding onto data you
don't need any more (e.g. #1209, bpo-29861), but nothing involving
traceback cycles. It was just a quick skim though.
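For concreteness, those naive counts are visible (and tunable) through
the gc module's documented API:

    import gc

    print(gc.get_threshold())   # (700, 10, 10) -- the CPython defaults
    # threshold0: net new container allocations before a generation-0 pass;
    # threshold1/2: how many younger-generation passes trigger an older one.
    gc.set_threshold(7000, 10, 10)  # e.g. make young collections 10x rarer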
Our current heuristic is just, run a small collection after every 700
allocations, run a larger collection after 10 smaller collections.

> What if that memory is
> gonna stay allocated for a long time? Then you're frequently launching
> GC runs for no tangible result except more CPU consumption and frequent
> pauses.

Every heuristic has problematic cases, that's why we call it a heuristic
:-). But somehow every other GC language manages to do well enough
without refcounting... I think they mostly have more sophisticated
heuristics than CPython, though. Off the top of my head, I know PyPy's
heuristic involves the ratio of the size of nursery objects versus the
size of the heap, and JVMs do much cleverer things like auto-tuning
nursery size to make empirical pause times match some target.

> Perhaps we could special-case tracebacks somehow, flag when a traceback
> remains alive after the implicit "del" clause at the end of an "except"
> block, then maintain some kind of linked list of the flagged tracebacks
> and launch specialized GC runs to find cycles across that collection.
> That sounds quite involved, though.

We already keep a list of recently allocated objects and have a
specialized GC that runs across just that collection. That's what
generational GC is :-).

-n

--
Nathaniel J. Smith -- https://vorpus.org

From solipsis at pitrou.net  Mon Sep 18 16:54:03 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 18 Sep 2017 22:54:03 +0200
Subject: [Python-Dev] Evil reference cycles caused Exception.__traceback__
References: <20170918114824.741c17e4@fsol> <7fbed308-1eec-6dce-f701-405682e3d297@python.org> <20170918185040.4a38bc62@fsol> <8ac58695-dcf0-156c-2967-862749808dcd@python.org>
Message-ID: <20170918225403.1d8d2cb0@fsol>

On Mon, 18 Sep 2017 13:25:47 -0700
Nathaniel Smith wrote:
> >> If it is then it might make sense to look at the cycle collection
> >> heuristics; IIRC they're based on a fairly naive count of how many
> >> allocations have been made, without regard to their size.
> >
> > Yes... But just because a lot of memory has been allocated isn't a good
> > enough heuristic to launch a GC collection.
>
> I'm not an expert on GC at all, but intuitively it sure seems like
> allocation size might be a useful piece of information to feed into a
> heuristic. Our current heuristic is just, run a small collection after
> every 700 allocations, run a larger collection after 10 smaller
> collections.

Ok, so there are two useful things a heuristic may try to guess:
1. how much CPU time a (full) collection will cost
2. how much memory it might help release

While #1 is quite realistically predictable (because a full collection
is basically O(number of allocated objects)), #2 is entirely
unpredictable and you can really only make out an upper bound
(trivially, the GC cannot release any more memory than has been
allocated :-)).

Our current heuristic is geared towards estimating #1, so that the
proportional time taken by GC runs in any long-running program is
roughly bounded.

> Every heuristic has problematic cases, that's why we call it a
> heuristic :-). But somehow every other GC language manages to do
> well enough without refcounting...

I don't know, define "well enough"? :-)  For example, Java is well-known
for requiring manual tuning of GC parameters for heavy workloads (or at
least it was).

I would be curious how well Java deals when someone does numerical
crunching on large chunks of data, then throws that data away, for
example.
Apparently R also uses a garbage collector, and ironically you can find
places on the Internet where people recommend you call gc() to avoid
thrashing memory.
https://stackoverflow.com/questions/8813753/what-is-the-difference-between-gc-and-rm

On CPython at least, once you lose the last reference to your
multi-gigabyte Numpy array, memory is returned immediately...

> I think they mostly have more
> sophisticated heuristics than CPython, though.

I'm sure some of them do. They also tend to have many more man-hours
available than we do to optimize the hell out of their GC.

> > Perhaps we could special-case tracebacks somehow, flag when a traceback
> > remains alive after the implicit "del" clause at the end of an "except"
> > block, then maintain some kind of linked list of the flagged tracebacks
> > and launch specialized GC runs to find cycles across that collection.
> > That sounds quite involved, though.
>
> We already keep a list of recently allocated objects and have a
> specialized GC that runs across just that collection. That's what
> generational GC is :-).

Well, you can cross your fingers that the traceback is still in the
youngest generations when it finally becomes collectable. But if the
traceback stays alive for too long, it ends up in the oldest generation
and a long time can pass before it gets collected.

Regards

Antoine.

From v+python at g.nevcal.com  Mon Sep 18 16:50:30 2017
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Mon, 18 Sep 2017 13:50:30 -0700
Subject: [Python-Dev] PEP 557: Data Classes
In-Reply-To: <59C01255.6020106@stoneleaf.us>
References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <9e686aa5-a984-7337-41e3-067727ac5209@trueblade.com> <3318395E-FE76-4BB1-A693-DC0229A7A34C@python.org> <59B70B59.3030900@stoneleaf.us> <59C01255.6020106@stoneleaf.us>
Message-ID: <7c88e69b-a997-68a4-96c7-abef5bf2cf1d@g.nevcal.com>

On 9/18/2017 11:37 AM, Ethan Furman wrote:
> On 09/11/2017 03:28 PM, Guido van Rossum wrote:
>
>> Oddly I don't like the enum (flag names get too long that way), but I
>> do agree with everything else Barry said (it
>> should be a trivalue flag and please don't name it cmp).
>
> Hmmm, named constants are one of the motivating factors for having an
> Enum type. It's easy to keep the name a reasonable length, however:
> export them into the module-level namespace. re is an excellent
> example; the existing flags were moved into an IntFlag enum, and then
> (for backwards compatibility) aliased back to the module level:

snip

> Likewise, if we had:
>
>     class Compare(enum.Enum):
>         NONE = 'each instance is an island'
>         EQUAL = 'instances can be equal to each other'
>         ORDERED = 'instances can be ordered and/or equal'
>     globals().update(Compare.__members__)
>
> then we can still use, for example, EQUAL, but get the more
> informative repr and str when we need to.

I tend to like the Enum also. But maybe this is a case where there
should be multiple different decorators:

@dataclass
@equality_dataclass
@ordered_dataclass

From jsbueno at python.org.br  Tue Sep 19 02:57:43 2017
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Tue, 19 Sep 2017 09:57:43 +0300
Subject: [Python-Dev] Why aren't decorators just expressions?
In-Reply-To: 
References: <1b232526-ebc4-5d34-76c9-2eaaca6de69a@hastings.org>
Message-ID: 

I myself found out about this restriction once I ran into it - so that
was one time the restriction bit me back.
But I can't remember whether that was intended for production code or
just for toying around.

Anyway, a simple "nop" function can allow any arbitrary expression to be
used as a decorator, at the expense of a little more line noise. And the
restriction prevents the line noise from emerging as "cool" in some
framework that might take inspiration from trends ongoing in other
languages.

    nop = lambda f: f

    @nop(MyClass(things)[dimension1, dimension2[subdimension5]] +
         CombinableAutoMixinDecoFactory())
    def my_func():
        pass

On 18 September 2017 at 18:33, Éric Araujo wrote:
> Hello,
>
> On 2017-09-16 at 07:22, Serhiy Storchaka wrote:
>> On 16.09.17 12:39, Larry Hastings wrote:
>>> So why don't decorators allow arbitrary expressions? [...]
>>
>> Actually I remember somebody raised this question a year or two ago,
>> but don't remember details.
>
> The discussion I remember was https://bugs.python.org/issue19660
>
> Guido said: "Nobody has posted a real use case. All the examples are
> toys. What are the real use cases that are blocked by this? Readability
> counts!"
>
> Cheers

From nad at python.org  Tue Sep 19 17:45:53 2017
From: nad at python.org (Ned Deily)
Date: Tue, 19 Sep 2017 17:45:53 -0400
Subject: [Python-Dev] [RELEASE] Python 3.6.3rc1 and 3.7.0a1 are now available for testing and more
Message-ID: <15E3769E-F938-44F6-AD38-EAAE976A49D2@python.org>

The Python build factories have been busy the last several weeks
preparing our fall lineup of releases. Today we are happy to announce
three additions: 3.6.3rc1, 3.7.0a1, and 3.3.7 final, which join last
weekend's 2.7.14 and last month's 3.5.4 bug-fix releases and 3.4.7
security-fix update (https://www.python.org/downloads/).

1. Python 3.6.3rc1 is the first release candidate for Python 3.6.3, the
next maintenance release of Python 3.6. While 3.6.3rc1 is a preview
release and, thus, not intended for production environments, we
encourage you to explore it and provide feedback via the Python bug
tracker (https://bugs.python.org). 3.6.3 is planned for final release
on 2017-10-02 with the next maintenance release expected to follow in
about 3 months. You can find Python 3.6.3rc1 and more information here:
https://www.python.org/downloads/release/python-363rc1/

2. Python 3.7.0a1 is the first of four planned alpha releases of Python
3.7, the next feature release of Python. During the alpha phase, Python
3.7 remains under heavy development: additional features will be added
and existing features may be modified or deleted. Please keep in mind
that this is a preview release and its use is not recommended for
production environments. The next preview release, 3.7.0a2, is planned
for 2017-10-16. You can find Python 3.7.0a1 and more information here:
https://www.python.org/downloads/release/python-370a1/

3. Python 3.3.7 is also now available. It is a security-fix source-only
release and is expected to be the final release of any kind for Python
3.3.x before it reaches end-of-life status on 2017-09-29, five years
after its initial release. Because 3.3.x has long been in security-fix
mode, 3.3.7 may no longer build correctly on all current operating
system releases and some tests may fail. If you are still using Python
3.3.x, we **strongly** encourage you to upgrade now to a more recent,
fully supported version of Python 3.
You can find Python 3.3.7 here:
https://www.python.org/downloads/release/python-337/

--
  Ned Deily
  nad at python.org -- []

From barry at python.org  Tue Sep 19 20:45:43 2017
From: barry at python.org (Barry Warsaw)
Date: Tue, 19 Sep 2017 20:45:43 -0400
Subject: [Python-Dev] PEP 553, v3
In-Reply-To: <5B417D78-C985-4B02-8D74-4ED1F6A04062@python.org>
References: <5B417D78-C985-4B02-8D74-4ED1F6A04062@python.org>
Message-ID: 

Barry Warsaw wrote:
> Here's an update to PEP 553, which makes $PYTHONBREAKPOINT a
> first-class feature. I've also updated PR #3355 with the
> implementation to match.

I've been in contact with Elizaveta Shashkova from JetBrains. She gave a
great talk at PyCon 2017 which influenced my thinking about the feature
that led to PEP 553.

https://www.youtube.com/watch?v=NdObDUbLjdg

She wasn't able to post a follow-up directly, but here is her response,
pasted here with her permission.

"At the beginning of 2017 we implemented a frame evaluation debugger in
the PyCharm IDE - a debugger based on the frame evaluation API (added to
Python 3.6 by PEP 523). This implementation improved the debugger's
performance significantly.

The main idea of this debugger is to use a custom frame evaluation
function to implement breakpoints, instead of the standard tracing
function. In fact, before entering a new frame we modify the code object
generated for this frame and insert breakpoints right into the code
object.

At the moment we're doing quite complicated things: we define our own
local `breakpoint()` function (which leads the debugger to a suspended
state), create a wrapper for it, and insert the wrapper's code into the
user's code. In order to be able to call this function, we add it to the
frame's globals dict. It's a bit tricky and it still has some problems.

So if PEP 553 is accepted, it will help us to make this part of the
debugger much simpler: we will be able to specify our breakpoint
function for the whole program and call the built-in breakpoint()
function instead of all these manipulations with wrappers and the
frame's globals. It will make our frame evaluation debugger less
complicated and more stable."

Cheers,
-Barry
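A sketch of the mechanism PEP 553 proposes (this reflects the proposed
3.7 behaviour, not anything available in released Pythons at the time of
this thread): breakpoint() simply calls sys.breakpointhook(), which
defaults to pdb.set_trace(), can be replaced by a debugger's own entry
point, and can be disabled with PYTHONBREAKPOINT=0. The hook below is
made up for illustration:

    import sys

    def my_hook(*args, **kwargs):
        # stand-in for a custom debugger entry point such as PyCharm's
        print("breakpoint hit in", sys._getframe(1).f_code.co_name)

    sys.breakpointhook = my_hook

    def compute(x):
        breakpoint()   # with the default hook this would enter pdb here
        return x * 2

    compute(21)        # prints: breakpoint hit in compute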
From stefan at bytereef.org  Wed Sep 20 14:01:56 2017
From: stefan at bytereef.org (Stefan Krah)
Date: Wed, 20 Sep 2017 20:01:56 +0200
Subject: [Python-Dev] Does Cygwin still have broken slot initialization?
Message-ID: <20170920180156.GA7973@bytereef.org>

Hi,

The docs have this rule for slot initialization for the benefit of
Cygwin:
https://github.com/python/cpython/commit/db6a569de7ae595ada53b618fce6bbbd1c98d350

Synopsis
--------

    - PyType_GenericNew, /* tp_new */
    + noddy_NoddyType.tp_new = PyType_GenericNew;
    + if (PyType_Ready(&noddy_NoddyType) < 0)
    +     return;

This is absolutely not required by C99 (and probably never was).
'PyType_GenericNew' is an address constant, and MSVC supports it just
fine -- at least since VS 2008.

Does anyone know if Cygwin still misbehaves? I would like to get rid of
this arcane rule.

https://bugs.python.org/issue31443

Stefan Krah

From brett at python.org  Wed Sep 20 17:33:37 2017
From: brett at python.org (Brett Cannon)
Date: Wed, 20 Sep 2017 21:33:37 +0000
Subject: [Python-Dev] pythonhosted.org doc upload no longer works
In-Reply-To: 
References: 
Message-ID: 

This is actually a question for distutils-sig since they manage PyPI.

On Mon, 4 Sep 2017 at 03:50 Giampaolo Rodola' wrote:

> I know pythonhosted.org was deprecated long ago in favor of readthedocs
> but I kept postponing it and my doc for psutil is still hosted on
> https://pythonhosted.org/psutil/.
>
> http://pythonhosted.org/ web page still recommends this:
>
> < http://pypi.python.org/pypi?%3Aaction=pkg_edit&name=yourpackage), and
> fill out the form at the bottom of the page.>>
>
> I took a look at:
> https://pypi.python.org/pypi?:action=pkg_edit&name=psutil
> ...but the form to upload the doc in zip format is gone.
>
> The only available option is the "destroy documentation" button.
> I would like to at least upload a static HTML page redirecting to the
> new doc, which I will soon move to readthedocs. Is that possible?
>
> Thanks in advance.
>
> --
> Giampaolo - http://grodola.blogspot.com

From brett at python.org  Wed Sep 20 18:37:51 2017
From: brett at python.org (Brett Cannon)
Date: Wed, 20 Sep 2017 22:37:51 +0000
Subject: [Python-Dev] PEP 552: deterministic pycs
In-Reply-To: <1504893123.182750.1099870936.4E48255D@webmail.messagingengine.com>
References: <1504816761.1994315.1098826272.2B182328@webmail.messagingengine.com> <1504820433.3022749.1098877456.3FF76AA6@webmail.messagingengine.com> <1504824402.3041582.1098937744.55112449@webmail.messagingengine.com> <20170908120452.22b6723b@fsol> <20170908165500.3828c370@fsol> <1504893123.182750.1099870936.4E48255D@webmail.messagingengine.com>
Message-ID: 

On Fri, 8 Sep 2017 at 10:53 Benjamin Peterson wrote:

> Thank you all for the feedback. I've now updated the PEP to specify a
> 4-word pyc header with a bit field in every case.
>
> On Fri, Sep 8, 2017, at 09:43, Nick Coghlan wrote:
> > On 8 September 2017 at 07:55, Antoine Pitrou wrote:
> > > On Fri, 8 Sep 2017 07:49:46 -0700
> > > Nick Coghlan wrote:
> > >> > I'd rather a single magic number and a separate bitfield that tells
> > >> > what the header encodes exactly. We don't *have* to fight for a
> > >> > tiny size reduction of pyc files.
> > >>
> > >> One of Benjamin's goals was for the existing timestamp-based pyc
> > >> format to remain completely unchanged, so we need some kind of marker
> > >> in the magic number to indicate whether the file is using the new
> > >> format or not.
> > >
> > > I don't think that's a useful goal, as long as we bump the magic
> > > number.
> >
> > Yeah, we (me, Benjamin, Greg) discussed that here, and we agree -
> > there isn't actually any benefit to keeping the timestamp-based pycs
> > using the same layout, since the magic number is already going to
> > change anyway.
> >
> > Given that, I think your suggested 16-byte header layout would be a
> > good one: 4-byte magic number, 4 bytes reserved for format flags, 8
> > bytes with an interpretation that depends on the format flags.

+1 from me!
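For illustration, a sketch of that 16-byte layout (the concrete values
here are made up; the PEP defines the real magic number and flag
semantics):

    import struct

    MAGIC_NUMBER = b"\x42\x0d\x0d\x0a"   # 4 bytes: magic (value illustrative)
    flags = 0b01                         # 4 bytes: format-flags bit field
    tail = struct.pack("<Q", 0)          # 8 bytes: meaning depends on flags
                                         # (e.g. mtime+size, or a source hash)

    header = MAGIC_NUMBER + struct.pack("<I", flags) + tail
    assert len(header) == 16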
From brett at python.org  Wed Sep 20 17:44:41 2017
From: brett at python.org (Brett Cannon)
Date: Wed, 20 Sep 2017 21:44:41 +0000
Subject: [Python-Dev] PEP 549: Instance Properties (aka: module properties)
In-Reply-To: 
References: <1346e629-11d6-c814-6cf5-cba5c8ae5895@hastings.org> <68c9f5c5-e8d9-723f-c258-1e6727da3008@hastings.org> <0580a3c6-f262-acbb-93a3-87807ac48242@hastings.org> <38dd16c8-432d-c9d5-b267-ddfd88064098@hastings.org>
Message-ID: 

On Thu, 14 Sep 2017 at 12:09 Ivan Levkivskyi wrote:

> On 14 September 2017 at 01:13, Guido van Rossum wrote:
>>
>> That last sentence is a key observation. Do we even know whether there
>> are (non-toy) things that you can do *in principle* with __class__
>> assignment but which are too slow *in practice* to bother? And if yes,
>> is __getattr__ fast enough? @property?
>
> I myself have never implemented deprecation warnings management nor
> lazy loading, so it is hard to say if __class__ assignment is fast
> enough.

For lazy loading it is, but that's because after the lazy load,
module.__class__ is set back to its original value, so any performance
penalty is paid only once upon triggering __getattribute__ on the lazy
module. IOW no one has complained and I know plenty of places using the
lazy loading mechanism in importlib at scale. :)

-Brett

> For me it is more a combination of three factors:
>
> * modest performance improvement
> * some people might find __getattr__ clearer than __class__ assignment
> * this would be consistent with how stubs work
>
>> IMO we're still looking for applications.
>
> How about this
>
> def allow_forward_references(*allowed):
>     caller_globals = sys._getframe(1).f_globals
>     def typing_getattr(name):
>         if name in allowed:
>             return name
>         raise AttributeError(...)
>     caller_globals.__getattr__ = typing_getattr
>
> from typing_extensions import allow_forward_references
> allow_forward_references('Vertex', 'Edge')
>
> T = TypeVar('T', bound=Edge)
>
> class Vertex(List[Edge]):
>     def copy(self: T) -> T:
>         ...
>
> class Edge:
>     ends: Tuple[Vertex, Vertex]
>     ...
>
> Look mum, no quotes! :-)
>
> --
> Ivan

From brett at python.org  Wed Sep 20 18:50:28 2017
From: brett at python.org (Brett Cannon)
Date: Wed, 20 Sep 2017 22:50:28 +0000
Subject: [Python-Dev] To reduce Python "application" startup time
In-Reply-To: 
References: <94D7D735-A961-42F4-8770-34132BEF069B@python.org>
Message-ID: 

I should mention that I have a prototype design for improving
importlib's lazy loading to be easier to turn on and use. See
https://notebooks.azure.com/Brett/libraries/di2Btqj7zSI/html/Lazy%20importing.ipynb
for my current notes. Part of it includes an explicit lazy_import()
function which would remove the need to hide imports in functions just
to delay their importation.

On Wed, 6 Sep 2017 at 20:50 INADA Naoki wrote:

> > I'm not sure however whether burying imports inside functions (as a
> > kind of poor man's lazy import) is ultimately going to be satisfying.
> > First, it's not natural, it generally violates coding standards
> > (e.g. PEP 8), and can make linters complain.
>
> Of course, I tried to move imports only when (1) it's only used in one
> or two of many functions in the module, (2) it's relatively heavy, and
> (3) it's rarely imported from other modules.
>
> > Second, I think you'll end up chasing imports all over the stdlib and
> > third party modules in any sufficiently complicated application.
>
> Agreed. I won't spend much time on optimizing by moving imports from
> the top level into inner functions in the stdlib.
>
> I think my import-profiler patch can be polished and committed to
> Python to help library maintainers measure import time easily.
> (Maybe, `python -X import-profile`)
>
> > Third, I'm not sure that the gains you'll get won't just be
> > overwhelmed by lots of other things going on, such as pkg_resources
> > entry point processing, pth file processing, site.py effects, command
> > line processing libraries such as click, and implicitly added
> > distribution exception hooks (e.g. Ubuntu's apport).
>
> Yes. I noticed some of them while profiling imports.
> For example, an old-style namespace package imports the types module
> for types.ModuleType. The types module imports functools, and functools
> imports collections. So some efforts in CPython (avoiding the import of
> collections and functools from site) are not worth much when at least
> one old-style namespace package is installed.
>
> > Many of these can't be blamed on Python itself, but all can contribute
> > significantly to Python's apparent start up time. It's definitely
> > worth investigating the details of Python import, and a few of us at
> > the core sprint have looked at those numbers and thrown around ideas
> > for improvement, but we'll need to look at the effects up and down the
> > stack to improve the start up performance for the average Python
> > application.
>
> Yes. I totally agree with you. That's why I use import-profile.patch
> for some 3rd party libraries.
>
> Currently, I have these ideas to optimize application startup time:
>
> * Faster, or lazy, compiling of regular expressions. (pkg_resources
>   imports pyparsing, which has a lot of regexes)
> * More usable lazy import. (which could be helped by "PEP 549: Instance
>   Properties (aka: module properties)")
> * Optimize enum creation.
> * Faster namedtuple (there is a pull request already)
> * Faster ABC
> * Breaking up large import trees in the stdlib. (PEP 549 may help this
>   too)
>
> Regards,
>
> INADA Naoki
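A sketch of the "more usable lazy import" item using what importlib
already ships (this mirrors the documented importlib recipe; whether
Brett's prototype helper looks exactly like this is an assumption):

    import importlib.util
    import sys

    def lazy_import(name):
        spec = importlib.util.find_spec(name)
        loader = importlib.util.LazyLoader(spec.loader)
        spec.loader = loader
        module = importlib.util.module_from_spec(spec)
        sys.modules[name] = module
        loader.exec_module(module)  # defers real loading to first attribute access
        return module

    json = lazy_import("json")     # nothing expensive has happened yet
    print(json.dumps({"a": 1}))    # the actual import happens here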
From lifebalance at gmail.com  Thu Sep 21 01:42:27 2017
From: lifebalance at gmail.com (ashok bakthavathsalam)
Date: Thu, 21 Sep 2017 11:12:27 +0530
Subject: [Python-Dev] Please accept my PR on issue bpo-31504
Message-ID: 

Dear Sir / Madam,

Against the issue https://bugs.python.org/issue31504 (related to an
earlier issue, https://bugs.python.org/issue24243), please accept my
pull request:

https://github.com/python/cpython/pull/3642

Thanks and regards,
Ashok Bakthavathsalam

From barry at python.org  Thu Sep 21 11:30:57 2017
From: barry at python.org (Barry Warsaw)
Date: Thu, 21 Sep 2017 11:30:57 -0400
Subject: [Python-Dev] New security-announce@python.org mailing list
Message-ID: 

I'm happy to announce the availability of a new mailing list, with the
mission of providing security announcements to the Python community from
the Python Security Response Team (PSRT):

security-announce at python.org

You can sign up in the usual Mailman way:

https://mail.python.org/mailman/listinfo/security-announce

This joins our suite of security-related forums. As always, if you
believe you've found a security issue in Python, you should contact the
PSRT directly and securely via:

security at python.org

For more information on how you can contact us, see:

https://www.python.org/news/security/

We also have a public security-focused discussion mailing list that you
can subscribe and contribute to.

security-sig at python.org
https://mail.python.org/mailman/listinfo/security-sig

Please don't report security vulnerabilities here, since this is a
publicly archived mailing list. We welcome you to collaborate here to
help make Python and its ecosystem even more secure than it already is.

Once a security vulnerability is identified and fixed, it becomes public
knowledge. Generally, these are captured in a ReadTheDocs site for
posterity:

https://python-security.readthedocs.io/

This new security-announce mailing list fills a void -- one-way
communication about security-related matters from the PSRT back to the
community. This is an area that we've not done a great job at, frankly,
and this new announcement list is intended to improve that situation.
The PSRT will use this low-traffic, high-value forum as the primary way
to communicate security issues of high importance back to the wider
Python community. All follow-ups to postings to this list are redirected
to the security-sig mailing list.

Cheers,
-Barry (on behalf of the PSRT)

From victor.stinner at gmail.com  Fri Sep 22 09:19:13 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 22 Sep 2017 15:19:13 +0200
Subject: [Python-Dev] SK-CSIRT identified malicious software libraries in the official Python package repository, PyPI
In-Reply-To: 
References: 
Message-ID: 

Hi,

FYI I just sent a public advisory for the PyPI typo-squatting issue to
the new security-announce list:

[Security-announce] Typo squatting and malicious packages on PyPI
https://mail.python.org/pipermail/security-announce/2017-September/000000.html

Please subscribe to this newly created mailing list to stay tuned!
https://mail.python.org/mailman/listinfo/security-announce

Victor

2017-09-15 22:28 GMT+02:00 Victor Stinner :
> Hi,
>
> Last week, the National Security Authority of Slovakia contacted the
> Python Security Response Team (PSRT) to report that the Python Package
> Index (PyPI) was hosting malicious packages. Installing these packages
> sends user data to an HTTP server, but also installs the expected
> module, so it was easy to notice the attack.
>
> Advisory: http://www.nbu.gov.sk/skcsirt-sa-20170909-pypi/
>
> Kudos to them for reporting the issue!
>
>
> It's not a compromise of the PyPI server nor of a third-party project,
> but the "typo squatting" issue, which has been known since at least
> June 2016 (for PyPI).
> The issue is not specific to Python: npmjs.com and
> rubygems.org are vulnerable to the same issue.
>
> For example, a malicious package used the names "urllib" (no 3) and
> "urlib3" (1 L) instead of "urllib3" (2 L). These packages were
> downloaded by users, so the attack was effective.
>
> More information on typo squatting and Python package security:
> https://python-security.readthedocs.io/packages.html#pypi-typo-squatting
>
> The PSRT contacted PyPI administrators and all identified packages
> were taken down, only 1h10 after the PSRT received the email from the
> National Security Authority of Slovakia!
>
>
> The typo-squatting issue is known and discussed, but no solution has
> been found yet. See for example this warehouse issue:
> https://github.com/pypa/warehouse/issues/2151
>
> It seems like the consensus is that pip is not responsible for
> detecting malicious code; it's more the responsibility of PyPI.
>
> The problem is to decide how to detect malicious code and/or prevent
> typo squatting on PyPI.
>
>
> The issue has been discussed privately on the PSRT list last week. The
> National Security Authority of Slovakia just published their advisory,
> and a public discussion started on Hacker News:
> https://news.ycombinator.com/item?id=15256121
>
> I consider that it's now time to find a solution on the public
> python-dev mailing list.
>
>
> Let's try to find a solution!
>
> Can we learn something from The Update Framework (TUF)?
>
> How do JavaScript, Ruby, Perl and other programming languages deal
> with these security issues in their package managers?
>
>
> See also my other notes on Python security and the list of known
> CPython vulnerabilities:
> https://python-security.readthedocs.io/
>
> Victor

From status at bugs.python.org  Fri Sep 22 12:09:17 2017
From: status at bugs.python.org (Python tracker)
Date: Fri, 22 Sep 2017 18:09:17 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20170922160917.4BCAB11A87F@psf.upfronthosting.co.za>

ACTIVITY SUMMARY (2017-09-15 - 2017-09-22)
Python tracker at https://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issue counts and deltas:
  open    6196 (+25)
  closed 37107 (+42)
  total  43303 (+67)

Open issues with patches: 2376

Issues opened (56)
==================

#13487: inspect.getmodule fails when module imports change sys.modules
https://bugs.python.org/issue13487 reopened by merwok

#31486: calling a _json.Encoder object raises a SystemError in case ob
https://bugs.python.org/issue31486 opened by Oren Milman

#31488: IDLE: Update feature classes when options are changed.
https://bugs.python.org/issue31488 opened by terry.reedy

#31489: Signal delivered to a subprocess triggers parent's handler
https://bugs.python.org/issue31489 opened by Ilya.Kulakov

#31490: assertion failure in ctypes in case an _anonymous_ attr appear
https://bugs.python.org/issue31490 opened by Oren Milman

#31491: Add is_closing() to asyncio.StreamWriter.
https://bugs.python.org/issue31491 opened by aymeric.augustin #31492: assertion failures in case a module has a bad __name__ attribu https://bugs.python.org/issue31492 opened by Oren Milman #31493: IDLE cond context: fix code update and font update timers https://bugs.python.org/issue31493 opened by terry.reedy #31494: Valgrind suppression file https://bugs.python.org/issue31494 opened by Aaron Michaux #31495: Wrong offset with IndentationError ("expected an indented bloc https://bugs.python.org/issue31495 opened by blueyed #31496: IDLE: test_configdialog failed https://bugs.python.org/issue31496 opened by serhiy.storchaka #31500: IDLE: Tiny font on HiDPI display https://bugs.python.org/issue31500 opened by serhiy.storchaka #31502: IDLE: Config dialog again deletes custom themes and keysets. https://bugs.python.org/issue31502 opened by terry.reedy #31503: Enhance dir(module) to be informed by __all__ by updating modu https://bugs.python.org/issue31503 opened by codypiersall #31504: Documentation for return value for string.rindex is missing wh https://bugs.python.org/issue31504 opened by kgashok #31505: assertion failure in json, in case _json.make_encoder() receiv https://bugs.python.org/issue31505 opened by Oren Milman #31506: Improve the error message logic for object_new & object_init https://bugs.python.org/issue31506 opened by ncoghlan #31507: email.utils.parseaddr has no docstring https://bugs.python.org/issue31507 opened by mark.dickinson #31508: Running test_ttk_guionly logs "test_widgets.py:1562: UserWarni https://bugs.python.org/issue31508 opened by haypo #31509: test_subprocess hangs randomly on AMD64 Windows10 3.x https://bugs.python.org/issue31509 opened by haypo #31510: test_many_processes() of test_multiprocessing_spawn failed on https://bugs.python.org/issue31510 opened by haypo #31511: test_normalization: test.support.open_urlresource() doesn't ha https://bugs.python.org/issue31511 opened by haypo #31512: Add non-elevated symlink support for dev mode Windows 10 https://bugs.python.org/issue31512 opened by vidartf #31516: current_thread() becomes "dummy" thread during shutdown https://bugs.python.org/issue31516 opened by pitrou #31517: MainThread association logic is fragile https://bugs.python.org/issue31517 opened by pitrou #31518: ftplib, urllib2, poplib, httplib, urllib2_localnet use ssl.PRO https://bugs.python.org/issue31518 opened by doko #31520: ResourceWarning: unclosed w https://bugs.python.org/issue31520 opened by haypo #31521: segfault in PyBytes_AsString https://bugs.python.org/issue31521 opened by Tim Smith #31522: _mboxMMDF.get_string() fails to pass param to get_bytes() https://bugs.python.org/issue31522 opened by bpoaugust #31523: Windows build file fixes https://bugs.python.org/issue31523 opened by steve.dower #31524: mailbox._mboxMMDF.get_message throws away From envelope https://bugs.python.org/issue31524 opened by bpoaugust #31526: Allow setting timestamp in gzip-compressed tarfiles https://bugs.python.org/issue31526 opened by randombit #31527: Report details of excess arguments in object_new & object_init https://bugs.python.org/issue31527 opened by ncoghlan #31528: Let ConfigParser parse systemd units https://bugs.python.org/issue31528 opened by johnlinp #31529: IDLE: Add docstrings and tests for editor.py reload functions https://bugs.python.org/issue31529 opened by csabella #31530: Python 2.7 readahead feature of file objects is not thread saf https://bugs.python.org/issue31530 opened by haypo #31531: crash and SystemError in case of a bad 
zipimport._zip_director https://bugs.python.org/issue31531 opened by Oren Milman #31534: python 3.6.2 installation failed 0x80070002 error https://bugs.python.org/issue31534 opened by roie #31535: configparser unable to write comment with a upper cas letter https://bugs.python.org/issue31535 opened by philippewagnieres at hispeed.ch #31536: `make regen-all` triggers wholesale rebuild https://bugs.python.org/issue31536 opened by pitrou #31537: Bug in readline module documentation example https://bugs.python.org/issue31537 opened by bazwal #31538: mailbox does not treat external factories the same https://bugs.python.org/issue31538 opened by bpoaugust #31539: asyncio.sleep may sleep less time then it should https://bugs.python.org/issue31539 opened by germn #31540: Adding context in concurrent.futures.ProcessPoolExecutor https://bugs.python.org/issue31540 opened by tomMoral #31541: Mock called_with does not ensure self/cls argument is used https://bugs.python.org/issue31541 opened by jonathan.huot #31542: pth files in site-packages of venvs are executed twice https://bugs.python.org/issue31542 opened by Antony.Lee #31543: Optimize wrapper descriptors using FASTCALL https://bugs.python.org/issue31543 opened by haypo #31544: gettext.Catalog title is not flagged as a class https://bugs.python.org/issue31544 opened by linkid #31545: Fixing documentation for timedelta. https://bugs.python.org/issue31545 opened by musically_ut #31546: PyOS_InputHook is not called when waiting for input() in Windo https://bugs.python.org/issue31546 opened by yhlam #31547: IDLE: Save definitions added to user keysets https://bugs.python.org/issue31547 opened by terry.reedy #31548: test_os fails on Windows if current directory contains spaces https://bugs.python.org/issue31548 opened by serhiy.storchaka #31549: test_strptime and test_time fail on non-English Windows https://bugs.python.org/issue31549 opened by serhiy.storchaka #31550: Inconsistent error message for TypeError with subscripting https://bugs.python.org/issue31550 opened by Anthony Sottile #31551: test_distutils fails if current directory contains spaces https://bugs.python.org/issue31551 opened by serhiy.storchaka #31552: IDLE: Convert browswers to use ttk.Treeview https://bugs.python.org/issue31552 opened by terry.reedy Most recent 15 issues with no replies (15) ========================================== #31551: test_distutils fails if current directory contains spaces https://bugs.python.org/issue31551 #31549: test_strptime and test_time fail on non-English Windows https://bugs.python.org/issue31549 #31548: test_os fails on Windows if current directory contains spaces https://bugs.python.org/issue31548 #31547: IDLE: Save definitions added to user keysets https://bugs.python.org/issue31547 #31545: Fixing documentation for timedelta. 
https://bugs.python.org/issue31545 #31542: pth files in site-packages of venvs are executed twice https://bugs.python.org/issue31542 #31541: Mock called_with does not ensure self/cls argument is used https://bugs.python.org/issue31541 #31540: Adding context in concurrent.futures.ProcessPoolExecutor https://bugs.python.org/issue31540 #31537: Bug in readline module documentation example https://bugs.python.org/issue31537 #31535: configparser unable to write comment with a upper cas letter https://bugs.python.org/issue31535 #31534: python 3.6.2 installation failed 0x80070002 error https://bugs.python.org/issue31534 #31531: crash and SystemError in case of a bad zipimport._zip_director https://bugs.python.org/issue31531 #31528: Let ConfigParser parse systemd units https://bugs.python.org/issue31528 #31526: Allow setting timestamp in gzip-compressed tarfiles https://bugs.python.org/issue31526 #31520: ResourceWarning: unclosed w https://bugs.python.org/issue31520 Most recent 15 issues waiting for review (15) ============================================= #31550: Inconsistent error message for TypeError with subscripting https://bugs.python.org/issue31550 #31545: Fixing documentation for timedelta. https://bugs.python.org/issue31545 #31543: Optimize wrapper descriptors using FASTCALL https://bugs.python.org/issue31543 #31536: `make regen-all` triggers wholesale rebuild https://bugs.python.org/issue31536 #31530: Python 2.7 readahead feature of file objects is not thread saf https://bugs.python.org/issue31530 #31529: IDLE: Add docstrings and tests for editor.py reload functions https://bugs.python.org/issue31529 #31518: ftplib, urllib2, poplib, httplib, urllib2_localnet use ssl.PRO https://bugs.python.org/issue31518 #31516: current_thread() becomes "dummy" thread during shutdown https://bugs.python.org/issue31516 #31512: Add non-elevated symlink support for dev mode Windows 10 https://bugs.python.org/issue31512 #31508: Running test_ttk_guionly logs "test_widgets.py:1562: UserWarni https://bugs.python.org/issue31508 #31507: email.utils.parseaddr has no docstring https://bugs.python.org/issue31507 #31506: Improve the error message logic for object_new & object_init https://bugs.python.org/issue31506 #31505: assertion failure in json, in case _json.make_encoder() receiv https://bugs.python.org/issue31505 #31504: Documentation for return value for string.rindex is missing wh https://bugs.python.org/issue31504 #31502: IDLE: Config dialog again deletes custom themes and keysets. 
https://bugs.python.org/issue31502 Top 10 most discussed issues (10) ================================= #31500: IDLE: Tiny font on HiDPI display https://bugs.python.org/issue31500 19 msgs #31443: Possibly out of date C extension documentation https://bugs.python.org/issue31443 14 msgs #31504: Documentation for return value for string.rindex is missing wh https://bugs.python.org/issue31504 13 msgs #31530: Python 2.7 readahead feature of file objects is not thread saf https://bugs.python.org/issue31530 13 msgs #31484: Cache single-character strings outside of the Latin1 range https://bugs.python.org/issue31484 12 msgs #31539: asyncio.sleep may sleep less time then it should https://bugs.python.org/issue31539 12 msgs #31506: Improve the error message logic for object_new & object_init https://bugs.python.org/issue31506 11 msgs #31536: `make regen-all` triggers wholesale rebuild https://bugs.python.org/issue31536 8 msgs #31517: MainThread association logic is fragile https://bugs.python.org/issue31517 7 msgs #31496: IDLE: test_configdialog failed https://bugs.python.org/issue31496 6 msgs Issues closed (38) ================== #13224: Change str(x) to return only the qualname for some types https://bugs.python.org/issue13224 closed by merwok #15472: Itertools doc summary table misdocuments some arguments https://bugs.python.org/issue15472 closed by merwok #26510: [argparse] Add required argument to add_subparsers https://bugs.python.org/issue26510 closed by merwok #27541: Repr of collection's subclasses https://bugs.python.org/issue27541 closed by serhiy.storchaka #29916: No explicit documentation for PyGetSetDef and getter and sette https://bugs.python.org/issue29916 closed by serhiy.storchaka #29966: typing.get_type_hints doesn't really work for classes with For https://bugs.python.org/issue29966 closed by levkivskyi #30228: Open a file in text mode requires too many syscalls https://bugs.python.org/issue30228 closed by haypo #30381: test_smtpnet.test_connect_using_sslcontext_verified() randomly https://bugs.python.org/issue30381 closed by haypo #30648: test_logout() of test_imaplib.RemoteIMAP_STARTTLSTest failed r https://bugs.python.org/issue30648 closed by haypo #30658: Buildbot: don't sent email notification for custom builders https://bugs.python.org/issue30658 closed by haypo #30848: test_multiprocessing_forkserver hangs on AMD64 FreeBSD CURRENT https://bugs.python.org/issue30848 closed by haypo #30866: Add _testcapi.stack_pointer() to measure the C stack consumpti https://bugs.python.org/issue30866 closed by haypo #31008: FAIL: test_wait_for_handle (test.test_asyncio.test_windows_eve https://bugs.python.org/issue31008 closed by haypo #31016: [Regression] sphinx shows an EOF error when using python2.7 fr https://bugs.python.org/issue31016 closed by benjamin.peterson #31293: crashes in multiply_float_timedelta() and in truedivide_timede https://bugs.python.org/issue31293 closed by serhiy.storchaka #31315: assertion failure in imp.create_dynamic(), when spec.name is n https://bugs.python.org/issue31315 closed by serhiy.storchaka #31346: Prefer PROTOCOL_TLS_CLIENT/SERVER https://bugs.python.org/issue31346 closed by christian.heimes #31376: test_multiprocessing_spawn randomly hangs AMD64 FreeBSD 10.x S https://bugs.python.org/issue31376 closed by haypo #31381: Unable to read the project file "pythoncore.vcxproj". 
https://bugs.python.org/issue31381 closed by denis-osipov #31386: Make return types of wrap_bio and wrap_socket customizable https://bugs.python.org/issue31386 closed by christian.heimes #31404: undefined behavior and crashes in case of a bad sys.modules https://bugs.python.org/issue31404 closed by eric.snow #31410: int.__repr__() is slower than repr() https://bugs.python.org/issue31410 closed by serhiy.storchaka #31431: SSL: check_hostname should imply CERT_REQUIRED https://bugs.python.org/issue31431 closed by christian.heimes #31477: IDLE 'strip trailing whitespace' changes multiline strings https://bugs.python.org/issue31477 closed by terry.reedy #31479: Always reset the signal alarm in tests https://bugs.python.org/issue31479 closed by haypo #31482: random.seed() doesn't work with bytes and version=1 https://bugs.python.org/issue31482 closed by rhettinger #31487: Improve f-strings documentation wrt format specifiers https://bugs.python.org/issue31487 closed by Mariatta #31497: Add _PyType_Name() https://bugs.python.org/issue31497 closed by serhiy.storchaka #31498: Default values for zero in time.strftime() https://bugs.python.org/issue31498 closed by denis-osipov #31499: ElementTree crash while cleaning up ParseError https://bugs.python.org/issue31499 closed by haypo #31501: Operator precedence description for arithmetic operators https://bugs.python.org/issue31501 closed by rhettinger #31513: Document structure of memo dictionary to enable more advanced https://bugs.python.org/issue31513 closed by r.david.murray #31514: There is a problem with the setdefault type conversion in the https://bugs.python.org/issue31514 closed by steven.daprano #31515: Doc for os.makedirs is inconsistent with actual behaviour https://bugs.python.org/issue31515 closed by fpom #31519: Negative number raised to even power is negative (-1 ** even = https://bugs.python.org/issue31519 closed by r.david.murray #31525: require sqlite3_prepare_v2 https://bugs.python.org/issue31525 closed by benjamin.peterson #31532: Py_GetPath, Py_SetPath memory corruption due to mixed PyMem_Ne https://bugs.python.org/issue31532 closed by benjamin.peterson #31533: Dead link in SSLContext.set_ciphers documentation https://bugs.python.org/issue31533 closed by Mariatta From ericsnowcurrently at gmail.com Fri Sep 22 21:09:01 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 22 Sep 2017 19:09:01 -0600 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: <20170918124636.7ee8de0b@fsol> References: <20170918124636.7ee8de0b@fsol> Message-ID: Thanks for the feedback, Antoine. Sorry for the delay; it's been a busy week for me. I just pushed an updated PEP to the repo. Once I've sorted out the question of passing bytes through channels I plan on posting the PEP to the list again for another round of discussion. In the meantime, I've replied below in-line. -eric On Mon, Sep 18, 2017 at 4:46 AM, Antoine Pitrou wrote: > First my high-level opinion about the PEP: the CSP model can probably > be already implemented using Queues. To me, the interesting promise of > subinterpreters is if they allow to remove the GIL while sharing memory > for big objects (such as Numpy arrays). This means the PEP should > probably focus on potential concurrency improvements rather than try to > faithfully follow the CSP model. Please elaborate. I'm interested in understanding what you mean here. Do you have some subinterpreter-based concurrency improvements in mind? What aspect of CSP is the PEP following too faithfully? 
>> ``list_all()``::
>>
>>    Return a list of all existing interpreters.
>
> See my naming proposal in the previous thread.

Sorry, your previous comment slipped through the cracks.  You suggested:

    As for the naming, let's make it both unconfusing and explicit?
    How about three functions: `all_interpreters()`, `running_interpreters()`
    and `idle_interpreters()`, for example?

As to "all_interpreters()", I suppose it's the difference between
"interpreters.all_interpreters()" and "interpreters.list_all()".  To me
the latter looks better.

As to "running_interpreters()" and "idle_interpreters()", I'm not sure
what the benefit would be.  You can compose either list manually with
a simple comprehension:

    [interp for interp in interpreters.list_all() if interp.is_running()]
    [interp for interp in interpreters.list_all() if not interp.is_running()]

>>    run(source_str, /, **shared):
>>
>>       Run the provided Python source code in the interpreter.  Any
>>       keyword arguments are added to the interpreter's execution
>>       namespace.

> "Execution namespace" specifically means the __main__ module in the
> target interpreter, right?

Right.  It's explained in more detail a little further down and
elsewhere in the PEP.  I've updated the PEP to explicitly mention
__main__ here too.

>>       If any of the values are not supported for sharing
>>       between interpreters then RuntimeError gets raised.  Currently
>>       only channels (see "create_channel()" below) are supported.
>>
>>       This may not be called on an already running interpreter.  Doing
>>       so results in a RuntimeError.

> I would distinguish between both error cases: RuntimeError for calling
> run() on an already running interpreter, ValueError for values which
> are not supported for sharing.

Good point.

>> Likewise, if there is any uncaught
>> exception, it propagates into the code where "run()" was called.

> That makes it a bit harder to differentiate from errors raised by run()
> itself (see above), though how much of an annoyance this is remains
> unclear.  The more contentious implication, though, is that it forces the
> interpreter to support migration of arbitrary objects from one
> interpreter to another (since a traceback keeps all local variables
> alive).

Yeah, the proposal to propagate exceptions out of the subinterpreter is
still rather weak.  I've added some notes to the PEP about this open
issue.

>> The mechanism for passing objects between interpreters is through
>> channels.  A channel is a simplex FIFO similar to a pipe.  The main
>> difference is that channels can be associated with zero or more
>> interpreters on either end.

> So it seems channels have become more complicated now?  Is it important
> to support multi-producer multi-consumer channels?

To me it made the API simpler.  The change did introduce the "close()"
method, which I suppose could be confusing.  However, I'm sure that in
practice it won't be.

In contrast, the FIFO/pipe-based API that I had before required passing
names around, required more calls, required managing the
channel/interpreter relationship more carefully, and made it hard to
follow that relationship.

>> Unlike queues, which are also many-to-many,
>> channels have no buffer.

> How does it work?  Does send() block until someone else calls recv()?
> That does not sound like a good idea to me.

Correct: "send()" blocks until the other end receives (if ever).
Likewise "recv()" blocks until the other end sends.  This specific
behavior is probably the main thing I borrowed from CSP.  It is *the*
synchronization mechanism.
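To illustrate the rendezvous semantics, here is a rough sketch (the
"interpreters" module does not exist yet, the names are provisional,
and passing a channel end through run()'s keyword arguments is just my
reading of the current draft, so treat this as pseudocode):

    import threading
    import interpreters  # proposed module; not yet real

    interp = interpreters.create()
    # The draft has create_channel() return the two ends of a simplex FIFO.
    recv_end, send_end = interpreters.create_channel()

    def worker():
        # "chan" is bound in the subinterpreter's __main__ namespace.
        interp.run("data = chan.recv()  # blocks until the other side sends",
                   chan=recv_end)

    t = threading.Thread(target=worker)
    t.start()
    send_end.send(b'ping')  # blocks until the subinterpreter receives
    t.join()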
Given the isolated nature of subinterpreters, I consider using this
concept from CSP to be a good fit.

> I don't think it's a
> coincidence that the most varied kinds of I/O (from socket or file IO
> to threading Queues to multiprocessing Pipes) have non-blocking send().

Interestingly, you can set sockets to blocking mode, in which case
send() will block until there is room in the kernel buffer.  Likewise,
queue.Queue.put() supports blocking, in addition to providing a
put_nowait() method.

Note that the PEP provides "recv_nowait()" and "send_nowait()" (names
inspired by queue.Queue), allowing for a non-blocking send.  It's just
not the default.

I deliberated for a little while on which one to make the default.  In
the end I went with blocking-by-default to stick to the CSP model.
However, I want to do what's most practical for users.  I can imagine
folks at first not expecting blocking send by default.  However, it
otherwise isn't clear yet which one is better for interpreter channels.
I'll add an "open question" about switching to non-blocking-by-default
for send().

> send() blocking until someone else calls recv() is not only bad for
> performance,

What is the performance problem?

> it also increases the likelihood of deadlocks.

How much of a problem will deadlocks be in practice?  (FWIW, CSP
provides rigorous guarantees about deadlock detection (which Go
leverages), though I'm not sure how much benefit that can offer such a
dynamic language as Python.)

Regardless, I'll make sure the PEP discusses deadlocks.

> EOFError normally means the *other* (sending) side has closed the
> channel (but it becomes complicated with a multi-producer multi-consumer
> setup...).  When *this* side has closed the channel, we should raise
> ValueError.

I've fixed this in the PEP.

>> The Python runtime
>> will garbage collect all closed channels.  Note that "close()" is
>> automatically called when it is no longer used in the current
>> interpreter.

> "No longer used" meaning it loses all references in this interpreter?

Correct.  I've clarified this in the PEP.

> Similar remark as above (EOFError vs. ValueError).
> More generally, send() raising EOFError sounds unheard of.

Hmm.  I've fixed this in the PEP, but perhaps using EOFError here (and
even for recv()) isn't right.  I was drawing inspiration from pipes,
but certainly the semantics aren't exactly the same.  So it may make
sense to use something else less I/O-related, like a new exception type
in the "interpreters" module.  I'll make a note in the PEP about this.

> A sidenote: context manager support (__enter__ / __exit__) on channels
> would sound more useful to me than iteration support.

Yeah, I can see that.  FWIW, I've dropped __next__() from the PEP.
I've also added a note about adding context manager support.

>> An alternative to support for bytes in channels is support for
>> read-only buffers (the PEP 3119 kind).

> Probably you mean PEP 3118.

Yep. :)

>> Then ``recv()`` would return
>> a memoryview to expose the buffer in a zero-copy way.

> It will probably not do much if you can only pass buffers and not
> structured objects, because unserializing (e.g. unpickling) from a
> buffer will still copy memory around.
>
> To pass a Numpy array, for example, you not only need to pass its
> contents but also its metadata (its value type -- named "dtype" --, its
> shape and strides).  This may be serialized as simple tuples of atomic
> types (str, int, bytes, other tuples), but you want to include a
> memoryview of the data area somewhere in those tuples.
> (and, of course, at some point, this will feel like reinventing
> pickle :)) but pickle has no mechanism to avoid memory copies, so it
> can't readily be reused here -- otherwise you're just reinventing
> multiprocessing...)

I'm still working through all the passing-buffers-through-channels
feedback, so I'll defer on a reply for now. :)

>> timeout arg to pop() and push()
>> -------------------------------

> pop() and push() don't exist anymore :-)

Fixed! :)

>> Synchronization Primitives
>> --------------------------
>>
>> The ``threading`` module provides a number of synchronization primitives
>> for coordinating concurrent operations.  This is especially necessary
>> due to the shared-state nature of threading.  In contrast,
>> subinterpreters do not share state.  Data sharing is restricted to
>> channels, which do away with the need for explicit synchronization.

> I think this rationale confuses Python-level data sharing with
> process-level data sharing.  The main point of subinterpreters
> (compared to multiprocessing) is that they live in the same OS
> process.  So it's really not true that you can't share a low-level
> synchronization primitive (say a semaphore) between subinterpreters.

I'm not sure I understand your concern here.  Perhaps I used the word
"sharing" too ambiguously?  By "sharing" I mean that the two actors
have read access to something that at least one of them can modify.  If
they both only have read-only access then it's effectively the same as
if they are not sharing.

While I can imagine the *possibility* (some day) of an opt-in mechanism
to share objects (r/rw or rw/rw), that is definitely not a part of this
PEP.  I expect that in reality we will only ever pass immutable data
between interpreters.  So I'm unclear on what need there might be for
any synchronization primitives other than what is inherent to channels.

>> * a ``create()`` arg to indicate resetting ``__main__`` after each
>>   ``run`` call
>> * an ``Interpreter.reset_main`` flag to support opting in or out
>>   after the fact
>> * an ``Interpreter.reset_main()`` method to opt in when desired

> This would all be a false promise.  Persistent state lives in other
> places than __main__ (for example the loaded modules and their
> respective configurations - think logging or decimal).

I've added a bit more explanation to the PEP to clarify this point.

>> The main difference between queues and channels is that queues support
>> buffering.  This would complicate the blocking semantics of ``recv()``
>> and ``send()``.  Also, queues can be built on top of channels.

> But buffering with background threads in pure Python will be orders
> of magnitude slower than optimized buffering in a custom low-level
> implementation.  It would be a pity if a subinterpreters Queue ended
> up as slow as a multiprocessing Queue.

I agree.  I'm entirely open to supporting other object-passing types,
including adding low-level implementations.  I've added a note to the
PEP to that effect.  However, I wanted to start off with the most basic
object-passing type, and I felt that channels provide the simplest
solution.  My goal is to get a basic API landed in 3.7 and then build
on it from there for 3.8.

That said, in the interest of enabling extra utility in the near term,
I expect that we will be able to design the PyInterpreterState changes
(few as they are) in such a way that a C-extension could implement an
efficient multi-interpreter Queue type that would run under 3.7.
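As a rough illustration of "queues can be built on top of channels" in
pure Python, a feeder thread can provide the buffering; "send" below
stands in for a channel end's blocking send() method, which doesn't
exist yet:

    import collections
    import threading

    class BufferedQueue:
        """Unbounded put() layered over a blocking channel send."""

        def __init__(self, send):
            self._send = send                    # blocking channel send()
            self._buf = collections.deque()
            self._nonempty = threading.Condition()
            threading.Thread(target=self._feed, daemon=True).start()

        def put(self, item):
            # Never blocks: the item waits in the buffer for the feeder.
            with self._nonempty:
                self._buf.append(item)
                self._nonempty.notify()

        def _feed(self):
            while True:
                with self._nonempty:
                    while not self._buf:
                        self._nonempty.wait()
                    item = self._buf.popleft()
                self._send(item)   # may block until a receiver is ready

This is exactly the pure-Python buffering approach that Antoine warns
will be orders of magnitude slower than an optimized low-level
implementation.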
Actually, would that be strictly necessary if you can interact with
channels without the GIL in the C-API?  Regardless, I'll make a note in
the PEP about the relationship between the C-API and implementing an
efficient multi-interpreter Queue.  I suppose that means I need to add
C-API changes to the PEP (which I had wanted to avoid).

From solipsis at pitrou.net  Sat Sep 23 05:45:45 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 23 Sep 2017 11:45:45 +0200
Subject: [Python-Dev] PEP 554 v3 (new interpreters module)
In-Reply-To:
References: <20170918124636.7ee8de0b@fsol>
Message-ID: <20170923114545.115b901b@fsol>

Hi Eric,

On Fri, 22 Sep 2017 19:09:01 -0600
Eric Snow wrote:
>
> Please elaborate.  I'm interested in understanding what you mean here.
> Do you have some subinterpreter-based concurrency improvements in
> mind?  What aspect of CSP is the PEP following too faithfully?

See below the discussion of blocking send()s :-)

> As to "running_interpreters()" and "idle_interpreters()", I'm not sure
> what the benefit would be.  You can compose either list manually with
> a simple comprehension:
>
>     [interp for interp in interpreters.list_all() if interp.is_running()]
>     [interp for interp in interpreters.list_all() if not interp.is_running()]

There is an inherent race condition in doing that, at least if
interpreters are running in multiple threads (which I assume is going
to be the overwhelmingly dominant usage model).  That is why I'm
proposing all three variants.

> > I don't think it's a
> > coincidence that the most varied kinds of I/O (from socket or file IO
> > to threading Queues to multiprocessing Pipes) have non-blocking send().
>
> Interestingly, you can set sockets to blocking mode, in which case
> send() will block until there is room in the kernel buffer.

Yes, but there *is* a kernel buffer.  Which is the whole point of my
comment: most similar primitives have internal buffering to prevent the
user-facing send() API from blocking in the common case.

> Likewise,
> queue.Queue.put() supports blocking, in addition to providing a
> put_nowait() method.

queue.Queue.put() never blocks in the usual case (*), which is of an
unbounded queue.  Only bounded queues (created with an explicit
non-zero max_size parameter) can block in Queue.put().

(*) and therefore also never deadlocks :-)

> Note that the PEP provides "recv_nowait()" and "send_nowait()" (names
> inspired by queue.Queue), allowing for a non-blocking send.

True, but it's not the same thing at all.  In the objects I mentioned,
send() mostly doesn't block and doesn't fail either.  In your model,
send_nowait() will routinely fail with an error if a recipient isn't
immediately available to recv the data.

> > send() blocking until someone else calls recv() is not only bad for
> > performance,
>
> What is the performance problem?

Intuitively, there must be some kind of context switch (interpreter
switch?) at each send() call to let the other end receive the data,
since you don't have any internal buffering.

Also, suddenly an interpreter's ability to exploit CPU time is
dependent on another interpreter's ability to consume data in a timely
manner (what if the other interpreter is e.g. stuck on some disk I/O?).
IMHO it would be better not to have such coupling.

> > it also increases the likelihood of deadlocks.
>
> How much of a problem will deadlocks be in practice?
I expect more often than expected, in complex systems :-)  For example,
you could have a recv() loop that also from time to time send()s some
data on another queue, depending on what is received.  But if that
send()'s recipient also has the same structure (a recv() loop which
send()s from time to time), then it's easy to imagine the two getting
into a deadlock.

> (FWIW, CSP
> provides rigorous guarantees about deadlock detection (which Go
> leverages), though I'm not sure how much benefit that can offer such a
> dynamic language as Python.)

Hmm... deadlock detection is one thing, but when detected you must
still solve those deadlock issues, right?

> I'm not sure I understand your concern here.  Perhaps I used the word
> "sharing" too ambiguously?  By "sharing" I mean that the two actors
> have read access to something that at least one of them can modify.
> If they both only have read-only access then it's effectively the same
> as if they are not sharing.

Right.  What I mean is that you *can* share very simple "data" in the
form of synchronization primitives.  You may want to synchronize your
interpreters even if they don't share user-visible memory areas.  The
point of synchronization is not only to avoid memory corruption but
also to regulate and orchestrate processing amongst multiple workers
(for example processes or interpreters).  For example, a semaphore is
an easy way to implement "I want no more than N workers to do this
thing at the same time" ("this thing" can be something such as disk
I/O).

Regards

Antoine.

From python at mrabarnett.plus.com  Sat Sep 23 10:32:58 2017
From: python at mrabarnett.plus.com (MRAB)
Date: Sat, 23 Sep 2017 15:32:58 +0100
Subject: [Python-Dev] PEP 554 v3 (new interpreters module)
In-Reply-To: <20170923114545.115b901b@fsol>
References: <20170918124636.7ee8de0b@fsol> <20170923114545.115b901b@fsol>
Message-ID: <617b130b-7782-75ac-5b6d-b1ad8fcc610f@mrabarnett.plus.com>

On 2017-09-23 10:45, Antoine Pitrou wrote:
>
> Hi Eric,
>
> On Fri, 22 Sep 2017 19:09:01 -0600
> Eric Snow wrote:
>>
>> Please elaborate.  I'm interested in understanding what you mean here.
>> Do you have some subinterpreter-based concurrency improvements in
>> mind?  What aspect of CSP is the PEP following too faithfully?
>
> See below the discussion of blocking send()s :-)
>
>> As to "running_interpreters()" and "idle_interpreters()", I'm not sure
>> what the benefit would be.  You can compose either list manually with
>> a simple comprehension:
>>
>> [interp for interp in interpreters.list_all() if interp.is_running()]
>> [interp for interp in interpreters.list_all() if not interp.is_running()]
>
> There is an inherent race condition in doing that, at least if
> interpreters are running in multiple threads (which I assume is going
> to be the overwhelmingly dominant usage model).  That is why I'm
> proposing all three variants.
>
An alternative to 3 variants would be:

interpreters.list_all(running=True)
interpreters.list_all(running=False)
interpreters.list_all(running=None)

[snip]

From storchaka at gmail.com  Sat Sep 23 18:39:26 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sun, 24 Sep 2017 01:39:26 +0300
Subject: [Python-Dev] To reduce Python "application" startup time
In-Reply-To:
References:
Message-ID:

On 05.09.17 16:02, INADA Naoki wrote:
> While I can't attend the sprint, I saw the etherpad and found that
> Neil Schemenauer and Eric Snow will work on startup time.
>
> I want to share my current knowledge about startup time.
>
> For bare (e.g.
`python -c pass`) startup time, I'm waiting for the C
> implementation of ABC.
>
> But application startup time is more important, and we can improve
> it by optimizing imports of commonly used stdlib modules.
>
> Current `python -v` is not useful for optimizing imports,
> so I use this patch to profile import time.
> https://gist.github.com/methane/e688bb31a23bcc437defcea4b815b1eb
>
> With this profile, I tried optimizing `python -c 'import asyncio'`, logging
> and http.client.
>
> https://gist.github.com/methane/1ab97181e74a33592314c7619bf34233#file-0-optimize-import-patch
>
> With this small patch:
>
> logging: 14.9ms -> 12.9ms
> asyncio: 62.1ms -> 58.2ms
> http.client: 43.8ms -> 36.1ms

See also https://bugs.python.org/issue30152 which optimizes the import
time of argparse using a similar technique.  I think these patches
overlap.

From francismb at email.de  Mon Sep 25 13:51:29 2017
From: francismb at email.de (francismb)
Date: Mon, 25 Sep 2017 19:51:29 +0200
Subject: [Python-Dev] Evil reference cycles caused by Exception.__traceback__
In-Reply-To: <20170918225403.1d8d2cb0@fsol>
References: <20170918114824.741c17e4@fsol>
	<7fbed308-1eec-6dce-f701-405682e3d297@python.org>
	<20170918185040.4a38bc62@fsol>
	<8ac58695-dcf0-156c-2967-862749808dcd@python.org>
	<20170918225403.1d8d2cb0@fsol>
Message-ID: <15a98a1f-c370-db9d-b476-be40bf51a2e8@email.de>

Hi,
just curious on this,

On 09/18/2017 10:54 PM, Antoine Pitrou wrote:
>> I'm not an expert on GC at all, but intuitively it sure seems like
>> allocation size might be a useful piece of information to feed into a
>> heuristic. Our current heuristic is just, run a small collection after
>> every 700 allocations, run a larger collection after 10 smaller
>> collections.
> Ok, so there are two useful things a heuristic may try to guess:
> 1. how much CPU time a (full) collection will cost
> 2. how much memory it might help release
>
> While #1 is quite realistically predictable (because a full collection
> is basically O(number of allocated objects)), #2 is entirely
> unpredictable and you can really only make out an upper bound
> (trivially, the GC cannot release any more memory than has been
> allocated :-)).

Isn't the allocation size useful enough for #2?  (The upper bound seems
logical as long as the allocation is done by the same "entity", or the
size is communicated back to it.)

Thanks in advance!
--francis

From njs at pobox.com  Mon Sep 25 20:42:02 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 25 Sep 2017 17:42:02 -0700
Subject: [Python-Dev] PEP 554 v3 (new interpreters module)
In-Reply-To: <20170923114545.115b901b@fsol>
References: <20170918124636.7ee8de0b@fsol> <20170923114545.115b901b@fsol>
Message-ID:

On Sat, Sep 23, 2017 at 2:45 AM, Antoine Pitrou wrote:
>> As to "running_interpreters()" and "idle_interpreters()", I'm not sure
>> what the benefit would be.  You can compose either list manually with
>> a simple comprehension:
>>
>> [interp for interp in interpreters.list_all() if interp.is_running()]
>> [interp for interp in interpreters.list_all() if not interp.is_running()]
>
> There is an inherent race condition in doing that, at least if
> interpreters are running in multiple threads (which I assume is going
> to be the overwhelmingly dominant usage model).  That is why I'm
> proposing all three variants.
There's a race condition no matter what the API looks like -- having a
dedicated running_interpreters() lets you guarantee that the returned
list describes the set of interpreters that were running at some
moment in time, but you don't know when that moment was and by the
time you get the list, it's already out-of-date.  So this doesn't seem
very useful.

OTOH if we think that invariants like this are useful, we might also
want to guarantee that calling running_interpreters() and
idle_interpreters() gives two lists such that each interpreter appears
in exactly one of them, but that's impossible with this API; it'd
require a single function that returns both lists.

What problem are you trying to solve?

>> Likewise,
>> queue.Queue.put() supports blocking, in addition to providing a
>> put_nowait() method.
>
> queue.Queue.put() never blocks in the usual case (*), which is of an
> unbounded queue.  Only bounded queues (created with an explicit
> non-zero max_size parameter) can block in Queue.put().
>
> (*) and therefore also never deadlocks :-)

Unbounded queues also introduce unbounded latency and memory usage in
realistic situations.  (E.g. a producer/consumer setup where the
producer runs faster than the consumer.)  There's a reason why sockets
always have bounded buffers -- it's sometimes painful, but the pain is
intrinsic to building distributed systems, and unbounded buffers just
paper over it.

>> > send() blocking until someone else calls recv() is not only bad for
>> > performance,
>>
>> What is the performance problem?
>
> Intuitively, there must be some kind of context switch (interpreter
> switch?) at each send() call to let the other end receive the data,
> since you don't have any internal buffering.

Technically you just need the other end to wake up at some time in
between any two calls to send(), and if there's no GIL then this
doesn't necessarily require a context switch.

> Also, suddenly an interpreter's ability to exploit CPU time is
> dependent on another interpreter's ability to consume data in a timely
> manner (what if the other interpreter is e.g. stuck on some disk I/O?).
> IMHO it would be better not to have such coupling.

A small buffer probably is useful in some cases, yeah -- basically
enough to smooth out scheduler jitter.

>> > it also increases the likelihood of deadlocks.
>>
>> How much of a problem will deadlocks be in practice?
>
> I expect more often than expected, in complex systems :-)  For example,
> you could have a recv() loop that also from time to time send()s some
> data on another queue, depending on what is received.  But if that
> send()'s recipient also has the same structure (a recv() loop which
> send()s from time to time), then it's easy to imagine the two getting
> into a deadlock.

You kind of want to be able to create deadlocks, since the alternative
is processes that can't coordinate and end up stuck in livelocks or
with unbounded memory use etc.

>> I'm not sure I understand your concern here.  Perhaps I used the word
>> "sharing" too ambiguously?  By "sharing" I mean that the two actors
>> have read access to something that at least one of them can modify.
>> If they both only have read-only access then it's effectively the same
>> as if they are not sharing.
>
> Right.  What I mean is that you *can* share very simple "data" in the
> form of synchronization primitives.  You may want to synchronize your
> interpreters even if they don't share user-visible memory areas.
> The point of synchronization is not only to avoid memory corruption
> but also to regulate and orchestrate processing amongst multiple
> workers (for example processes or interpreters).  For example, a
> semaphore is an easy way to implement "I want no more than N workers
> to do this thing at the same time" ("this thing" can be something
> such as disk I/O).

It's fairly reasonable to implement a mutex using a CSP-style
unbuffered channel (send = acquire, receive = release).  And the same
trick turns a channel with a fixed-size buffer into a bounded
semaphore.  It won't be as efficient as a modern specialized mutex
implementation, of course, but it's workable.

Unfortunately while technically you can construct a buffered channel
out of an unbuffered channel, the construction's pretty unreasonable
(it needs two dedicated threads per channel).

-n

--
Nathaniel J. Smith -- https://vorpus.org

From solipsis at pitrou.net  Tue Sep 26 03:04:45 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 26 Sep 2017 09:04:45 +0200
Subject: [Python-Dev] PEP 554 v3 (new interpreters module)
In-Reply-To:
References: <20170918124636.7ee8de0b@fsol> <20170923114545.115b901b@fsol>
Message-ID: <20170926090445.6b57ddd1@fsol>

On Mon, 25 Sep 2017 17:42:02 -0700
Nathaniel Smith wrote:
> On Sat, Sep 23, 2017 at 2:45 AM, Antoine Pitrou wrote:
> >> As to "running_interpreters()" and "idle_interpreters()", I'm not sure
> >> what the benefit would be.  You can compose either list manually with
> >> a simple comprehension:
> >>
> >> [interp for interp in interpreters.list_all() if interp.is_running()]
> >> [interp for interp in interpreters.list_all() if not interp.is_running()]
> >
> > There is an inherent race condition in doing that, at least if
> > interpreters are running in multiple threads (which I assume is going
> > to be the overwhelmingly dominant usage model).  That is why I'm
> > proposing all three variants.
>
> There's a race condition no matter what the API looks like -- having a
> dedicated running_interpreters() lets you guarantee that the returned
> list describes the set of interpreters that were running at some
> moment in time, but you don't know when that moment was and by the
> time you get the list, it's already out-of-date.

Hmm, you're right of course.

> >> Likewise,
> >> queue.Queue.put() supports blocking, in addition to providing a
> >> put_nowait() method.
> >
> > queue.Queue.put() never blocks in the usual case (*), which is of an
> > unbounded queue.  Only bounded queues (created with an explicit
> > non-zero max_size parameter) can block in Queue.put().
> >
> > (*) and therefore also never deadlocks :-)
>
> Unbounded queues also introduce unbounded latency and memory usage in
> realistic situations.

This doesn't seem to pose much of a problem in common use cases,
though.  How many Python programs have you seen switch from an
unbounded to a bounded Queue to solve this problem?

Conversely, choosing a buffer size is tricky.  How do you know up front
which amount you need?  Is a fixed buffer size even ok or do you want
it to fluctuate based on the current conditions?

And regardless, my point was that a buffer is desirable.  That send()
may block when the buffer is full doesn't change that it won't block in
the common case.

> There's a reason why sockets
> always have bounded buffers -- it's sometimes painful, but the pain is
> intrinsic to building distributed systems, and unbounded buffers just
> paper over it.
Papering over a problem is sometimes the right answer actually :-)  For
example, most Python programs assume memory is unbounded...

If I'm using a queue or channel to push events to a logging system,
should I really block at every send() call?  Most probably I'd rather
run ahead instead.

> > Also, suddenly an interpreter's ability to exploit CPU time is
> > dependent on another interpreter's ability to consume data in a timely
> > manner (what if the other interpreter is e.g. stuck on some disk I/O?).
> > IMHO it would be better not to have such coupling.
>
> A small buffer probably is useful in some cases, yeah -- basically
> enough to smooth out scheduler jitter.

That's not about scheduler jitter, but catering for activities which
occur at inherently different speeds or rhythms.  Requiring things to
run in lockstep removes a lot of flexibility and makes it harder to
exploit CPU resources fully.

> > I expect more often than expected, in complex systems :-)  For example,
> > you could have a recv() loop that also from time to time send()s some
> > data on another queue, depending on what is received.  But if that
> > send()'s recipient also has the same structure (a recv() loop which
> > send()s from time to time), then it's easy to imagine the two getting
> > into a deadlock.
>
> You kind of want to be able to create deadlocks, since the alternative
> is processes that can't coordinate and end up stuck in livelocks or
> with unbounded memory use etc.

I am not advocating we make it *impossible* to create deadlocks; just
saying we should not make them more *likely* than they need to be.

> >> I'm not sure I understand your concern here.  Perhaps I used the word
> >> "sharing" too ambiguously?  By "sharing" I mean that the two actors
> >> have read access to something that at least one of them can modify.
> >> If they both only have read-only access then it's effectively the same
> >> as if they are not sharing.
> >
> > Right.  What I mean is that you *can* share very simple "data" in the
> > form of synchronization primitives.  You may want to synchronize your
> > interpreters even if they don't share user-visible memory areas.  The
> > point of synchronization is not only to avoid memory corruption but
> > also to regulate and orchestrate processing amongst multiple workers
> > (for example processes or interpreters).  For example, a semaphore is
> > an easy way to implement "I want no more than N workers to do this
> > thing at the same time" ("this thing" can be something such as disk
> > I/O).
>
> It's fairly reasonable to implement a mutex using a CSP-style
> unbuffered channel (send = acquire, receive = release).  And the same
> trick turns a channel with a fixed-size buffer into a bounded
> semaphore.  It won't be as efficient as a modern specialized mutex
> implementation, of course, but it's workable.

We are drifting away from the point I was trying to make here.  I was
pointing out that the claim that nothing can be shared is a lie.
If it's possible to share a small datum (a synchronized counter aka
semaphore) between processes, certainly there's no technical reason
that should prevent it between interpreters.

By the way, I do think efficiency is a concern here.  Otherwise
subinterpreters don't even have a point (just use multiprocessing).

> Unfortunately while technically you can construct a buffered channel
> out of an unbuffered channel, the construction's pretty unreasonable
> (it needs two dedicated threads per channel).

And the reverse is quite cumbersome as well.
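(As an aside, Nathaniel's mutex construction made concrete might look
roughly like the following, using the fixed-size-buffer variant with a
one-slot buffer; queue.Queue stands in for the channel, since
interpreter channels don't exist yet:

    import queue

    class ChannelMutex:
        def __init__(self):
            self._chan = queue.Queue(maxsize=1)  # one-slot "channel"

        def acquire(self):
            self._chan.put(None)     # blocks while another holder fills the slot

        def release(self):
            self._chan.get_nowait()  # empty the slot for the next acquirer

With maxsize=N the same pattern gives a bounded semaphore.)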
So we should favour the
construct that's more convenient for users, or provide both.

Regards

Antoine.

From francismb at email.de  Tue Sep 26 08:56:22 2017
From: francismb at email.de (francismb)
Date: Tue, 26 Sep 2017 14:56:22 +0200
Subject: [Python-Dev] PEP 554 v3 (new interpreters module) - channel type
In-Reply-To:
References: <20170918124636.7ee8de0b@fsol>
Message-ID: <9e85db1a-629f-0e3a-b69c-de90b3a7569b@email.de>

Hi Eric,

>> To make this work, the mutable shared state will be managed by the
>> Python runtime, not by any of the interpreters.  Initially we will
>> support only one type of object for shared state: the channels
>> provided by create_channel().  Channels, in turn, will carefully
>> manage passing objects between interpreters. [0]

Would it make sense to make the default channel type explicit,
something like ``create_channel(bytes)``?

Thanks in advance,
--francis

[0] https://www.python.org/dev/peps/pep-0554/

From walter at livinglogic.de  Tue Sep 26 10:30:10 2017
From: walter at livinglogic.de (Walter Dörwald)
Date: Tue, 26 Sep 2017 16:30:10 +0200
Subject: [Python-Dev] PEP 554 v3 (new interpreters module)
In-Reply-To:
References: <20170918124636.7ee8de0b@fsol>
Message-ID:

On 23 Sep 2017, at 3:09, Eric Snow wrote:

> [...]
>>> ``list_all()``::
>>>
>>>    Return a list of all existing interpreters.
>>
>> See my naming proposal in the previous thread.
>
> Sorry, your previous comment slipped through the cracks.  You
> suggested:
>
>     As for the naming, let's make it both unconfusing and explicit?
>     How about three functions: `all_interpreters()`,
>     `running_interpreters()`
>     and `idle_interpreters()`, for example?
>
> As to "all_interpreters()", I suppose it's the difference between
> "interpreters.all_interpreters()" and "interpreters.list_all()".  To
> me the latter looks better.

But in most cases when Python returns a container (list/dict/iterator)
of things, the name of the function/method is the name of the things,
not the name of the container, e.g. we have sys.modules, dict.keys,
dict.values etc.  Or if the collection of things itself has a name, it
is that name, e.g. os.environ, sys.path, etc.

It's a little bit unfortunate that the name of the module would be the
same as the name of the function, but IMHO interpreters() would be
better than list().

> As to "running_interpreters()" and "idle_interpreters()", I'm not sure
> what the benefit would be.  You can compose either list manually with
> a simple comprehension:
>
>     [interp for interp in interpreters.list_all() if
> interp.is_running()]
>     [interp for interp in interpreters.list_all() if not
> interp.is_running()]

Servus,
   Walter

From guido at python.org  Tue Sep 26 16:47:26 2017
From: guido at python.org (Guido van Rossum)
Date: Tue, 26 Sep 2017 13:47:26 -0700
Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc
	files)
Message-ID:

I've read the current version of PEP 552 over and I think everything
looks good for acceptance.  I believe there are no outstanding
objections (or they have been adequately addressed in responses).
Therefore I intend to accept PEP 552 this Friday, unless grave
objections are raised on this mailing list (python-dev).

Congratulations Benjamin.  Gotta love those tristate options!

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncoghlan at gmail.com  Wed Sep 27 01:26:52 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 27 Sep 2017 15:26:52 +1000
Subject: [Python-Dev] PEP 554 v3 (new interpreters module)
In-Reply-To: <20170926090445.6b57ddd1@fsol>
References: <20170918124636.7ee8de0b@fsol> <20170923114545.115b901b@fsol>
	<20170926090445.6b57ddd1@fsol>
Message-ID:

On 26 September 2017 at 17:04, Antoine Pitrou wrote:
> On Mon, 25 Sep 2017 17:42:02 -0700 Nathaniel Smith wrote:
>> Unbounded queues also introduce unbounded latency and memory usage in
>> realistic situations.
>
> This doesn't seem to pose much of a problem in common use cases,
> though.  How many Python programs have you seen switch from an
> unbounded to a bounded Queue to solve this problem?
>
> Conversely, choosing a buffer size is tricky.  How do you know up front
> which amount you need?  Is a fixed buffer size even ok or do you want
> it to fluctuate based on the current conditions?
>
> And regardless, my point was that a buffer is desirable.  That send()
> may block when the buffer is full doesn't change that it won't block in
> the common case.

It's also the case that unlike Go channels, which were designed from
scratch on the basis of implementing pure CSP, Python has an
established behavioural precedent in the APIs of queue.Queue and
collections.deque: they're unbounded by default, and you have to opt
in to making them bounded.

>> There's a reason why sockets
>> always have bounded buffers -- it's sometimes painful, but the pain is
>> intrinsic to building distributed systems, and unbounded buffers just
>> paper over it.
>
> Papering over a problem is sometimes the right answer actually :-)  For
> example, most Python programs assume memory is unbounded...
>
> If I'm using a queue or channel to push events to a logging system,
> should I really block at every send() call?  Most probably I'd rather
> run ahead instead.

While the article title is clickbaity,
http://www.jtolds.com/writing/2016/03/go-channels-are-bad-and-you-should-feel-bad/
actually has a good discussion of this point.  Search for "compose" to
find the relevant section ("Channels don't compose well with other
concurrency primitives").

The specific problem cited is that only offering unbuffered or
bounded-buffer channels means that every send call becomes a potential
deadlock scenario, as all that needs to happen is for you to be
holding a different synchronisation primitive when the send call
blocks.

>> > Also, suddenly an interpreter's ability to exploit CPU time is
>> > dependent on another interpreter's ability to consume data in a timely
>> > manner (what if the other interpreter is e.g. stuck on some disk I/O?).
>> > IMHO it would be better not to have such coupling.
>>
>> A small buffer probably is useful in some cases, yeah -- basically
>> enough to smooth out scheduler jitter.
>
> That's not about scheduler jitter, but catering for activities which
> occur at inherently different speeds or rhythms.  Requiring things to
> run in lockstep removes a lot of flexibility and makes it harder to
> exploit CPU resources fully.

The fact that the proposal now allows for M:N sender:receiver
relationships (just as queue.Queue does with threads) makes that
problem worse, since you may now have variability not only on the
message consumption side, but also on the message production side.

Consider this example where you have an event processing thread pool
that we're attempting to isolate from blocking IO by using channels
rather than coroutines.

Desired flow:

1.
Listener thread receives external message from socket
2. Listener thread files message for processing on receive channel
3. Listener thread returns to blocking on the receive socket
4. Processing thread picks up message from receive channel
5. Processing thread processes message
6. Processing thread puts reply on the send channel
7. Sending thread picks up message from send channel
8. Sending thread makes a blocking network send call to transmit the message
9. Sending thread returns to blocking on the send channel

When queue.Queue is used to pass the messages between threads, such an
arrangement will be effectively non-blocking as long as the send rate
is greater than or equal to the receive rate.  However, the GIL means
it won't exploit all available cores, even if we create multiple
processing threads: you have to switch to multiprocessing for that,
with all the extra overhead that entails.

So I see the essential premise of PEP 554 as being to ask the question
"If each of these threads was running its own *interpreter*, could we
use Sans IO style protocols with interpreter channels to separate
internally "synchronous" processing threads from separate IO threads
operating at system boundaries, without having to make the entire
application pervasively asynchronous?"

If channels are an unbuffered blocking primitive, then we don't get
that benefit: even when there are additional received messages to be
processed, the processing thread will block until the previous send
has completed.  Switching the listener and sender threads over to
asynchronous IO would help with that, but they'd also end up having to
implement their own message buffering to manage the lack of buffering
in the core channel primitive.

By contrast, if the core channels are designed to offer an unbounded
buffer by default, then you can get close-to-CSP semantics just by
setting the buffer size to 1 (it's still not exactly CSP, since that
has a buffer size of 0, but you at least get the semantics of having
to alternate sending and receiving of messages).

>> > I expect more often than expected, in complex systems :-)  For example,
>> > you could have a recv() loop that also from time to time send()s some
>> > data on another queue, depending on what is received.  But if that
>> > send()'s recipient also has the same structure (a recv() loop which
>> > send()s from time to time), then it's easy to imagine the two getting
>> > into a deadlock.
>>
>> You kind of want to be able to create deadlocks, since the alternative
>> is processes that can't coordinate and end up stuck in livelocks or
>> with unbounded memory use etc.
>
> I am not advocating we make it *impossible* to create deadlocks; just
> saying we should not make them more *likely* than they need to be.

Right, and I think the queue.Queue and collections.deque model works
well for that, since you can start introducing queue bounds to
propagate backpressure through a system if you're seeing undesirable
memory growth.

>> It's fairly reasonable to implement a mutex using a CSP-style
>> unbuffered channel (send = acquire, receive = release).  And the same
>> trick turns a channel with a fixed-size buffer into a bounded
>> semaphore.  It won't be as efficient as a modern specialized mutex
>> implementation, of course, but it's workable.
>
> We are drifting away from the point I was trying to make here.  I was
> pointing out that the claim that nothing can be shared is a lie.
> If it's possible to share a small datum (a synchronized counter aka
> semaphore) between processes, certainly there's no technical reason
> that should prevent it between interpreters.
>
> By the way, I do think efficiency is a concern here.  Otherwise
> subinterpreters don't even have a point (just use multiprocessing).

Agreed, and I think the interaction between the threading module and
the interpreters module is one we're going to have to explicitly call
out as being covered by the provisional status of the interpreters
module, as I think it could be incredibly valuable to be able to send
at least some threading objects through channels, and have them be an
interpreter-specific reference to a common underlying sync primitive.

>> Unfortunately while technically you can construct a buffered channel
>> out of an unbuffered channel, the construction's pretty unreasonable
>> (it needs two dedicated threads per channel).
>
> And the reverse is quite cumbersome as well.  So we should favour the
> construct that's more convenient for users, or provide both.

As noted above, I think consistency with design intuitions formed
through the use of queue.Queue is also an important consideration.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From victor.stinner at gmail.com  Wed Sep 27 08:56:32 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 27 Sep 2017 14:56:32 +0200
Subject: [Python-Dev] API design question: how to extend sys.settrace()?
Message-ID:

Hi,

In bpo-29400, it was proposed to add the ability to trace not only
function calls but also instructions at the bytecode level.  I like the
idea, but I don't see how to extend sys.settrace() to add a new
"trace_instructions: bool" optional (keyword-only?) parameter without
breaking backward compatibility.  Should we add a new function
instead?

Would it be possible to make the API "future-proof", so we can extend
it again later?

I have almost never used sys.settrace(), so I prefer to ask on this
python-dev list.

Copy of George King's message:
https://bugs.python.org/issue29400#msg298441

"""
After reviewing the thread, I'm reminded that the main design problem
concerns preserving behavior of this idiom:
"old=sys.gettrace(); ...; sys.settrace(old)"

If we add more state, i.e. the `trace_instructions` bool, then the
above idiom no longer correctly stores/reloads the full state.

Here are the options that I see:

1. New APIs:
* `gettrace() -> Callable # no change.`
* `settrace(tracefunc: Callable) -> None # no change.`
* `gettraceinst() -> Callable # new.`
* `settraceinst(tracefunc: Callable) -> None # new.`

Behavior:
* `settrace()` raises if `gettraceinst()` is not None.
* `settraceinst()` raises if `gettrace()` is not None.

2. Add keyword arg to `settrace`.
* `gettrace() -> Callable # no change.`
* `settrace(tracefunc: Callable, trace_instructions:Optional[bool]=False) -> None # added keyword.`
* `gettracestate() -> Dict[str, Any] # new.`

Behavior:
* `gettracestate()` returns the complete trace-related state:
currently, `tracefunc` and `trace_instructions` fields.
* `gettrace()` raises an exception if any of the trace state does not
match the old behavior, i.e. if `trace_instructions` is set.

3. New API, but with extensible keyword args:
* `settracestate(tracefunc: Callable, trace_instructions:Optional[bool]=False) -> None # new.`
* `gettracestate() -> Dict[str, Any] # new.`

Behavior:
* `settrace()` raises if `gettracestate()` is not None.
* `settracestate()` raises if `gettrace()` is not None.
As I see it:
* approach #1 is more conservative because it does not modify the old API.
* approach #2 is more extensible because all future features can be
represented by the state dictionary.
* approach #3 has both of these strengths, but is also the most work.

Examples of possible future features via keywords:
* branch-level tracing, as briefly discussed above.
* instruction filtering set, which could be a more general version of
the branch tracing.

I intend to prototype one or both of these, but I'm also feeling a bit
of time pressure to get the basic feature on track for 3.7.
"""

settraceinst() doesn't seem "future-proof" to me.

Victor

From hrvoje.niksic at avl.com  Wed Sep 27 10:44:14 2017
From: hrvoje.niksic at avl.com (Hrvoje Niksic)
Date: Wed, 27 Sep 2017 16:44:14 +0200
Subject: [Python-Dev] API design question: how to extend sys.settrace()?
In-Reply-To:
References:
Message-ID:

On 09/27/2017 02:56 PM, Victor Stinner wrote:
> Hi,
>
> In bpo-29400, it was proposed to add the ability to trace not only
> function calls but also instructions at the bytecode level.  I like the
> idea, but I don't see how to extend sys.settrace() to add a new
> "trace_instructions: bool" optional (keyword-only?) parameter without
> breaking backward compatibility.  Should we add a new function
> instead?

One possibility would be for settrace to query the capability on the
part of the provided callable.  For example:

def settrace(tracefn):
    if getattr(tracefn, '__settrace_trace_instructions__', False):
        ... we need to trace instructions ...

In general, the "trace function" can be upgraded to the concept of a
"trace object" with (optional) methods other than __call__.  This is
extensible, easy to document, and fully supports the original
"old=sys.gettrace(); ...; sys.settrace(old)" idiom.

From storchaka at gmail.com  Wed Sep 27 11:10:20 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 27 Sep 2017 18:10:20 +0300
Subject: [Python-Dev] API design question: how to extend sys.settrace()?
In-Reply-To:
References:
Message-ID:

On 27.09.17 15:56, Victor Stinner wrote:
> In bpo-29400, it was proposed to add the ability to trace not only
> function calls but also instructions at the bytecode level.  I like the
> idea, but I don't see how to extend sys.settrace() to add a new
> "trace_instructions: bool" optional (keyword-only?) parameter without
> breaking backward compatibility.  Should we add a new function
> instead?

I'm afraid that this change breaks an assumption in frame_setlineno()
about the state of the stack.  This can corrupt the stack if you jump
from an instruction which is part of a Python operation.  For example,
FOR_ITER expects an iterator on the stack.  If you jump to the end of
the loop from the middle of an assignment operator and skip, say,
STORE_FAST, you will leave an arbitrary value on the stack.  This can
lead to unpredictable consequences.

From guido at python.org  Wed Sep 27 11:18:26 2017
From: guido at python.org (Guido van Rossum)
Date: Wed, 27 Sep 2017 08:18:26 -0700
Subject: [Python-Dev] API design question: how to extend sys.settrace()?
In-Reply-To:
References:
Message-ID:

On Wed, Sep 27, 2017 at 8:10 AM, Serhiy Storchaka wrote:

> I'm afraid that this change breaks an assumption in frame_setlineno() about
> the state of the stack.  This can corrupt the stack if you jump from an
> instruction which is part of a Python operation.  For example, FOR_ITER
> expects an iterator on the stack.
> If you jump to the end of the loop from
> the middle of an assignment operator and skip, say, STORE_FAST, you will
> leave an arbitrary value on the stack.  This can lead to unpredictable
> consequences.

Well, probably OT but the solution for that would be to stop using a
local stack and instead use explicit addressing.

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jcea at jcea.es  Wed Sep 27 11:56:02 2017
From: jcea at jcea.es (Jesus Cea)
Date: Wed, 27 Sep 2017 17:56:02 +0200
Subject: [Python-Dev] New security-announce@python.org mailing list
In-Reply-To:
References:
Message-ID:

On 21/09/17 17:30, Barry Warsaw wrote:
> https://mail.python.org/mailman/listinfo/security-announce

"No such list security-announce".

> https://mail.python.org/mailman/listinfo/security-sig

"No such list security-sig".

--
Jesús Cea Avión                         _/_/      _/_/_/        _/_/_/
jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
Twitter: @jcea                        _/_/    _/_/          _/_/_/_/_/
jabber / xmpp:jcea at jabber.org  _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 473 bytes
Desc: OpenPGP digital signature
URL:

From barry at python.org  Wed Sep 27 12:16:08 2017
From: barry at python.org (Barry Warsaw)
Date: Wed, 27 Sep 2017 12:16:08 -0400
Subject: [Python-Dev] New security-announce@python.org mailing list
In-Reply-To:
References:
Message-ID:

On Sep 27, 2017, at 11:56, Jesus Cea wrote:
>
> On 21/09/17 17:30, Barry Warsaw wrote:
>> https://mail.python.org/mailman/listinfo/security-announce
>
> "No such list security-announce".
>
>> https://mail.python.org/mailman/listinfo/security-sig
>
> "No such list security-sig".

These lists were just migrated to Mailman 3.  Please use:

https://mail.python.org/mm3/mailman3/lists/security-announce.python.org/
https://mail.python.org/mm3/mailman3/lists/security-sig.python.org/

Cheers,
-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL:

From robertomartinezp at gmail.com  Wed Sep 27 12:17:25 2017
From: robertomartinezp at gmail.com (Roberto Martínez)
Date: Wed, 27 Sep 2017 16:17:25 +0000
Subject: [Python-Dev] New security-announce@python.org mailing list
In-Reply-To:
References:
Message-ID:

I can't see the lists either.  Same message "No such list
security-announce".

Cheers.

On Wed, Sep 27, 2017, 18:05 Jesus Cea wrote:

> On 21/09/17 17:30, Barry Warsaw wrote:
> > https://mail.python.org/mailman/listinfo/security-announce
>
> "No such list security-announce".
>
> > https://mail.python.org/mailman/listinfo/security-sig
>
> "No such list security-sig".
From guido at python.org  Wed Sep 27 11:18:26 2017
From: guido at python.org (Guido van Rossum)
Date: Wed, 27 Sep 2017 08:18:26 -0700
Subject: [Python-Dev] API design question: how to extend sys.settrace()?
In-Reply-To:
References:
Message-ID:

On Wed, Sep 27, 2017 at 8:10 AM, Serhiy Storchaka wrote:

> I'm afraid that this change breaks an assumption in frame_setlineno()
> about the state of the stack. This can corrupt the stack if you jump
> from an instruction which is part of a larger Python operation. For
> example FOR_ITER expects an iterator on the stack. If you jump to the
> end of the loop from the middle of an assignment and skip, say,
> STORE_FAST, you will leave an arbitrary value on the stack. This can
> lead to unpredictable consequences.

Well, probably OT but the solution for that would be to stop using a
local stack and instead use explicit addressing.

--
--Guido van Rossum (python.org/~guido)

From jcea at jcea.es  Wed Sep 27 11:56:02 2017
From: jcea at jcea.es (Jesus Cea)
Date: Wed, 27 Sep 2017 17:56:02 +0200
Subject: [Python-Dev] New security-announce at python.org mailing list
In-Reply-To:
References:
Message-ID:

On 21/09/17 17:30, Barry Warsaw wrote:
> https://mail.python.org/mailman/listinfo/security-announce

"No such list security-announce".

> https://mail.python.org/mailman/listinfo/security-sig

"No such list security-sig".

--
Jesús Cea Avión                         _/_/      _/_/_/        _/_/_/
jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
Twitter: @jcea                        _/_/    _/_/          _/_/_/_/_/
jabber / xmpp:jcea at jabber.org  _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz

From barry at python.org  Wed Sep 27 12:16:08 2017
From: barry at python.org (Barry Warsaw)
Date: Wed, 27 Sep 2017 12:16:08 -0400
Subject: [Python-Dev] New security-announce at python.org mailing list
In-Reply-To:
References:
Message-ID:

On Sep 27, 2017, at 11:56, Jesus Cea wrote:
>
> On 21/09/17 17:30, Barry Warsaw wrote:
>> https://mail.python.org/mailman/listinfo/security-announce
>
> "No such list security-announce".
>
>> https://mail.python.org/mailman/listinfo/security-sig
>
> "No such list security-sig".

These lists were just migrated to Mailman 3. Please use:

https://mail.python.org/mm3/mailman3/lists/security-announce.python.org/
https://mail.python.org/mm3/mailman3/lists/security-sig.python.org/

Cheers,
-Barry

From robertomartinezp at gmail.com  Wed Sep 27 12:17:25 2017
From: robertomartinezp at gmail.com (Roberto Martínez)
Date: Wed, 27 Sep 2017 16:17:25 +0000
Subject: [Python-Dev] New security-announce at python.org mailing list
In-Reply-To:
References:
Message-ID:

I can't see the lists either. Same message "No such list
security-announce".

Cheers.

On Wed, Sep 27, 2017, 18:05 Jesus Cea wrote:

> On 21/09/17 17:30, Barry Warsaw wrote:
> > https://mail.python.org/mailman/listinfo/security-announce
>
> "No such list security-announce".
>
> > https://mail.python.org/mailman/listinfo/security-sig
>
> "No such list security-sig".
>
> --
> Jesús Cea Avión                         _/_/      _/_/_/        _/_/_/
> jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
> Twitter: @jcea                        _/_/    _/_/          _/_/_/_/_/
> jabber / xmpp:jcea at jabber.org  _/_/  _/_/    _/_/          _/_/  _/_/
> "Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
> "My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
> "El amor es poner tu felicidad en la felicidad de otro" - Leibniz
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/robertomartinezp%40gmail.com

From guido at python.org  Wed Sep 27 12:24:12 2017
From: guido at python.org (Guido van Rossum)
Date: Wed, 27 Sep 2017 09:24:12 -0700
Subject: [Python-Dev] New security-announce at python.org mailing list
In-Reply-To:
References:
Message-ID:

Can we add redirects to the old list locations?

On Wed, Sep 27, 2017 at 9:17 AM, Roberto Martínez <
robertomartinezp at gmail.com> wrote:

> I can't see the lists either. Same message "No such list
> security-announce".
>
> Cheers.
>
> On Wed, Sep 27, 2017, 18:05 Jesus Cea wrote:
>
>> On 21/09/17 17:30, Barry Warsaw wrote:
>> > https://mail.python.org/mailman/listinfo/security-announce
>>
>> "No such list security-announce".
>>
>> > https://mail.python.org/mailman/listinfo/security-sig
>>
>> "No such list security-sig".
>>
>> --
>> Jesús Cea Avión                         _/_/      _/_/_/        _/_/_/
>> jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
>> Twitter: @jcea                        _/_/    _/_/          _/_/_/_/_/
>> jabber / xmpp:jcea at jabber.org  _/_/  _/_/    _/_/          _/_/  _/_/
>> "Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
>> "My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
>> "El amor es poner tu felicidad en la felicidad de otro" - Leibniz
>>
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe: https://mail.python.org/mailman/options/python-dev/
>> robertomartinezp%40gmail.com
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/
> guido%40python.org

--
--Guido van Rossum (python.org/~guido)

From ncoghlan at gmail.com  Wed Sep 27 12:39:07 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 28 Sep 2017 02:39:07 +1000
Subject: [Python-Dev] API design question: how to extend sys.settrace()?
In-Reply-To:
References:
Message-ID:

On 27 September 2017 at 22:56, Victor Stinner wrote:
> Hi,
>
> In bpo-29400, it was proposed to add the ability to trace not only
> function calls but also instructions at the bytecode level. I like the
> idea, but I don't see how to extend sys.settrace() to add a new
> "trace_instructions: bool" optional (keyword-only?) parameter without
> breaking backward compatibility. Should we add a new function
> instead?
As part of investigating the current signal safety problems in with
statements [1], I added the ability to trace lines, opcodes, both, or
neither by setting a couple of flags on a per-frame basis:
https://docs.python.org/dev/whatsnew/3.7.html#other-cpython-implementation-changes

So the idea is that if you want per-opcode tracing, your trace function
has to turn it on for each frame when handling the call event, and you
accept the responsibility of not messing up the frame stack.

Cheers,
Nick.

[1] See https://bugs.python.org/issue29988 if you're interested in the
gory details

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
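A short sketch of the per-frame opt-in Nick describes, written against
the 3.7 development branch; frame.f_trace_opcodes and the 'opcode'
event are the additions documented in the What's New entry linked
above:

    import sys

    def tracer(frame, event, arg):
        if event == 'call':
            # The trace function itself flips the per-frame switch.
            frame.f_trace_opcodes = True
        elif event == 'opcode':
            print('opcode at offset', frame.f_lasti, 'on line', frame.f_lineno)
        return tracer

    def demo():
        x = 1
        return x + 1

    sys.settrace(tracer)
    demo()
    sys.settrace(None)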
From barry at python.org  Wed Sep 27 13:18:39 2017
From: barry at python.org (Barry Warsaw)
Date: Wed, 27 Sep 2017 13:18:39 -0400
Subject: [Python-Dev] New security-announce at python.org mailing list
In-Reply-To:
References:
Message-ID: <4A9D53C3-338F-4317-82B9-F73050A89755@python.org>

On Sep 27, 2017, at 12:24, Guido van Rossum wrote:
>
> Can we add redirects to the old list locations?

I'm not sure we need it for security-announce given how new that list
is and that we only have one message to it. I've forwarded this request
to postmaster@ though.

-Barry

From gwk.lists at gmail.com  Wed Sep 27 16:23:22 2017
From: gwk.lists at gmail.com (George King)
Date: Wed, 27 Sep 2017 16:23:22 -0400
Subject: [Python-Dev] API design question: how to extend sys.settrace()?
Message-ID:

Victor, thank you for starting this conversation. Nick, I just looked
at your patch and I think it is a better solution than mine, because it
does not involve adding to or changing the sys API. I will close my
pull request.

The reason for my interest in this area is that I'm experimenting with
a code coverage tool that does per-opcode tracing. I just updated it to
use the new f_trace_opcodes feature and it *almost* worked: Nick's
implementation calls the opcode trace before updating the line number,
whereas my version updated line numbers first, so the event stream is
slightly different.

See:
https://github.com/python/cpython/commit/5a8516701f5140c8c989c40e261a4f4e20e8af86#diff-7f17c8d8448b7b6f90549035d2147a9f

From my perspective it makes more sense to do the update to f_lineno
first, followed by the opcode trace, because then line events and
opcode events corresponding to the same opcode will have the same line
number; as it is they come out different.

Is the current order intentional? Otherwise I'll submit a patch.

Either way, I'm really happy that this functionality has made it into
master!

Thanks,
George

From ncoghlan at gmail.com  Wed Sep 27 20:28:52 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 28 Sep 2017 10:28:52 +1000
Subject: [Python-Dev] API design question: how to extend sys.settrace()?
In-Reply-To:
References:
Message-ID:

On 28 September 2017 at 06:23, George King wrote:
> Victor, thank you for starting this conversation. Nick, I just looked
> at your patch and I think it is a better solution than mine, because it
> does not involve adding to or changing the sys API. I will close my
> pull request.
>
> The reason for my interest in this area is that I'm experimenting with
> a code coverage tool that does per-opcode tracing. I just updated it to
> use the new f_trace_opcodes feature and it *almost* worked: Nick's
> implementation calls the opcode trace before updating the line number,
> whereas my version updated line numbers first, so the event stream is
> slightly different.
>
> See: https://github.com/python/cpython/commit/5a8516701f5140c8c989c40e261a4f4e20e8af86#diff-7f17c8d8448b7b6f90549035d2147a9f
>
> From my perspective it makes more sense to do the update to f_lineno
> first, followed by the opcode trace, because then line events and
> opcode events corresponding to the same opcode will have the same line
> number; as it is they come out different.
>
> Is the current order intentional? Otherwise I'll submit a patch.

Line numbers simply weren't relevant to my use case (injecting
inconveniently timed exceptions), so I never checked whether or not
f_lineno was accurate when the trace event was emitted.

So yeah, a follow-up bug report and PR to fix that would be appreciated
(no need for a NEWS entry, since it's a bug in a not-yet-released
feature).

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ezio.melotti at gmail.com  Wed Sep 27 21:54:53 2017
From: ezio.melotti at gmail.com (Ezio Melotti)
Date: Thu, 28 Sep 2017 04:54:53 +0300
Subject: [Python-Dev] bugs.python.org updated
Message-ID:

Hi,
yesterday I updated the version of Roundup used to run our
bugs.python.org instance and 4 other instances (the meta, jython,
setuptools, and roundup trackers). If everything went right you
shouldn't have noticed anything different :)

This update included ~300 changesets from upstream and required an
additional ~30 to update our instances and our fork of Roundup. A
number of features that we added to our fork over the years have been
ported upstream and they have now been removed from our fork, which is
now -- except for the github integration -- almost aligned with
upstream.

If you notice any issue with the bug tracker (/with/, not /in/ --
that's expected!), please report it to the meta tracker
(http://psf.upfronthosting.co.za/roundup/meta/) and/or to me. If there
are no reports and everything works fine, I still have a few more
updates coming up ;)

A big thanks to John Rouillard (from the Roundup team) and R. David
Murray for the help!

Best Regards,
Ezio Melotti

P.S. Roundup started moving towards Python 3.

From senthil at uthcode.com  Wed Sep 27 22:00:29 2017
From: senthil at uthcode.com (Senthil Kumaran)
Date: Wed, 27 Sep 2017 19:00:29 -0700
Subject: [Python-Dev] bugs.python.org updated
In-Reply-To:
References:
Message-ID:

On Wed, Sep 27, 2017 at 6:54 PM, Ezio Melotti wrote:
>
> This update included ~300 changesets from upstream and required an
> additional ~30 to update our instances and our fork of Roundup. A number
> of features that we added to our fork over the years have been ported
> upstream and they have now been removed from our fork, which is now --
> except for the github integration -- almost aligned with upstream.
>
> P.S. Roundup started moving towards Python 3.

Thanks a lot for your work on this, Ezio!
From victor.stinner at gmail.com  Thu Sep 28 04:16:10 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 28 Sep 2017 10:16:10 +0200
Subject: [Python-Dev] bugs.python.org updated
In-Reply-To:
References:
Message-ID:

2017-09-28 3:54 GMT+02:00 Ezio Melotti:
> This update included ~300 changesets from upstream and required an
> additional ~30 to update our instances and our fork of Roundup. A number of
> features that we added to our fork over the years have been ported upstream
> and they have now been removed from our fork, which is now -- except for the
> github integration -- almost aligned with upstream.

Oh, impressive. It's great when we succeed in using the upstream
version! Thank you for putting all the downstream patches upstream.

Victor

From guido at python.org  Fri Sep 29 10:40:06 2017
From: guido at python.org (Guido van Rossum)
Date: Fri, 29 Sep 2017 07:40:06 -0700
Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files)
In-Reply-To:
References:
Message-ID:

It's Friday!

There have been no further comments. PEP 550 is now accepted.

Congrats, Benjamin! Go ahead and send your implementation for review.

--Guido

On Tue, Sep 26, 2017 at 1:47 PM, Guido van Rossum wrote:
> I've read the current version of PEP 552 over and I think everything looks
> good for acceptance. I believe there are no outstanding objections (or they
> have been adequately addressed in responses).
>
> Therefore I intend to accept PEP 552 this Friday, unless grave objections
> are raised on this mailing list (python-dev).
>
> Congratulations Benjamin. Gotta love those tristate options!
>
> --
> --Guido van Rossum (python.org/~guido)

--
--Guido van Rossum (python.org/~guido)

From me at louie.lu  Fri Sep 29 10:44:12 2017
From: me at louie.lu (Louie Lu)
Date: Fri, 29 Sep 2017 22:44:12 +0800
Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files)
In-Reply-To:
References:
Message-ID:

Hi all,

Should the accepted PEP be 552, not 550?

Thanks,
Louie.

2017-09-29 22:40 GMT+08:00 Guido van Rossum:
> It's Friday!
>
> There have been no further comments. PEP 550 is now accepted.
>
> Congrats, Benjamin! Go ahead and send your implementation for review.
>
> --Guido
>
> On Tue, Sep 26, 2017 at 1:47 PM, Guido van Rossum wrote:
>> I've read the current version of PEP 552 over and I think everything looks
>> good for acceptance. I believe there are no outstanding objections (or they
>> have been adequately addressed in responses).
>>
>> Therefore I intend to accept PEP 552 this Friday, unless grave objections
>> are raised on this mailing list (python-dev).
>>
>> Congratulations Benjamin. Gotta love those tristate options!
>>
>> --
>> --Guido van Rossum (python.org/~guido)
>
> --
> --Guido van Rossum (python.org/~guido)
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/me%40louie.lu

From guido at python.org  Fri Sep 29 11:18:27 2017
From: guido at python.org (Guido van Rossum)
Date: Fri, 29 Sep 2017 08:18:27 -0700
Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files)
In-Reply-To:
References:
Message-ID:

Let me try that again.

There have been no further comments. PEP 552 is now accepted.

Congrats, Benjamin! Go ahead and send your implementation for review.

Oops. Let me try that again.

PS.
PEP 550 is still unaccepted, awaiting a new revision from Yury and
Elvis.

--
--Guido van Rossum (python.org/~guido)

From status at bugs.python.org  Fri Sep 29 12:09:47 2017
From: status at bugs.python.org (Python tracker)
Date: Fri, 29 Sep 2017 18:09:47 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20170929160947.09D7911A98B@psf.upfronthosting.co.za>

ACTIVITY SUMMARY (2017-09-22 - 2017-09-29)
Python tracker at https://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    6208 (+12)
  closed  37181 (+74)
  total   43389 (+86)

Open issues with patches: 2384

Issues opened (54)
==================

#25287: test_crypt fails on OpenBSD
  https://bugs.python.org/issue25287  reopened by serhiy.storchaka
#31553: Extend json.tool to handle jsonlines (with a flag)
  https://bugs.python.org/issue31553  opened by Eric Moyer
#31554: Warn when __loader__ != __spec__.loader
  https://bugs.python.org/issue31554  opened by brett.cannon
#31555: Windows pyd slower when not loaded via load_dynamic
  https://bugs.python.org/issue31555  opened by Safihre
#31556: asyncio.wait_for can cancel futures faster with timeout==0
  https://bugs.python.org/issue31556  opened by hellysmile
#31557: tarfile: incorrectly treats regular file as directory
  https://bugs.python.org/issue31557  opened by Joe Tsai
#31558: gc.freeze() - an API to mark objects as uncollectable
  https://bugs.python.org/issue31558  opened by lukasz.langa
#31562: snakebite.net is not available
  https://bugs.python.org/issue31562  opened by serhiy.storchaka
#31565: open() doesn't use locale.getpreferredencoding()
  https://bugs.python.org/issue31565  opened by serhiy.storchaka
#31567: Inconsistent documentation around decorators
  https://bugs.python.org/issue31567  opened by dmiyakawa
#31572: Avoid suppressing all exceptions in PyObject_HasAttr()
  https://bugs.python.org/issue31572  opened by serhiy.storchaka
#31573: PyStructSequence_New() doesn't validate its input type (crashe
  https://bugs.python.org/issue31573  opened by Oren Milman
#31574: Add dtrace hook for importlib
  https://bugs.python.org/issue31574  opened by christian.heimes
#31577: crash in os.utime() in case of a bad ns argument
  https://bugs.python.org/issue31577  opened by Oren Milman
#31581: Reduce the number of imports for functools
  https://bugs.python.org/issue31581  opened by inada.naoki
#31582: Add _pth breadcrumb to sys.path documentation
  https://bugs.python.org/issue31582  opened by Traveler Hauptman
#31583: 2to3 call for file in current directory yields error
  https://bugs.python.org/issue31583  opened by denis-osipov
#31584: Documentation Language mixed up
  https://bugs.python.org/issue31584  opened by asl
#31589: Links for French documentation pdf is broken
  https://bugs.python.org/issue31589  opened by fabrice
#31590: CSV module incorrectly treats escaped newlines as new records
  https://bugs.python.org/issue31590  opened by mallyvai
#31591: Closing socket raises AttributeError: 'collections.deque' obje
  https://bugs.python.org/issue31591  opened by reidfaiv
#31592: assertion failure in Python/ast.c in case of a bad unicodedata
  https://bugs.python.org/issue31592  opened by Oren Milman
#31594: Make bytes and bytearray maketrans accept dictionaries as firs
  https://bugs.python.org/issue31594  opened by oleksandr.suvorov
#31596: expose pthread_getcpuclockid in time module
  https://bugs.python.org/issue31596  opened by pdox
#31601: Availability of utimensat and futimens not checked correctly o
  https://bugs.python.org/issue31601  opened by jmr
#31602: assertion failure in zipimporter.get_source() in case of a bad
  https://bugs.python.org/issue31602  opened by Oren Milman
#31603: Please add argument to override stdin/out/err in the input bui
  https://bugs.python.org/issue31603  opened by wt
#31604: unittest.TestLoader().loadTestsFromTestCase(...) fails when ad
  https://bugs.python.org/issue31604  opened by Erasmus Cedernaes
#31606: [Windows] Tests failing on installed Python 3.7a1 on Windows
  https://bugs.python.org/issue31606  opened by haypo
#31607: Add listsize in pdb.py
  https://bugs.python.org/issue31607  opened by matrixise
#31608: crash in methods of a subclass of _collections.deque with a ba
  https://bugs.python.org/issue31608  opened by Oren Milman
#31609: PCbuild\clean.bat fails if the path contains whitespaces
  https://bugs.python.org/issue31609  opened by serhiy.storchaka
#31610: Use select.poll instead of select.select in SocketServer.BaseS
  https://bugs.python.org/issue31610  opened by ?????????????? ????????????
#31611: Tests failures using -u largefile when the disk is full
  https://bugs.python.org/issue31611  opened by haypo
#31612: Building 3.6 fails on Windows
  https://bugs.python.org/issue31612  opened by serhiy.storchaka
#31613: tkinter.simpledialog default buttons (not buttonbox) are not l
  https://bugs.python.org/issue31613  opened by jcrmatos
#31618: Change sys.settrace opcode tracing to occur after frame line n
  https://bugs.python.org/issue31618  opened by gwk
#31619: Strange error when convert hexadecimal with underscores to int
  https://bugs.python.org/issue31619  opened by serhiy.storchaka
#31620: asyncio.Queue leaks memory if the queue is empty and consumers
  https://bugs.python.org/issue31620  opened by zackelan
#31622: Make threading.get_ident() return an opaque type
  https://bugs.python.org/issue31622  opened by pdox
#31623: Allow to build MSI installer from the official tarball
  https://bugs.python.org/issue31623  opened by Ivan.Pozdeev
#31625: stop using ranlib
  https://bugs.python.org/issue31625  opened by benjamin.peterson
#31626: Crash in _PyUnicode_DecodeUnicodeEscape on OpenBSD
  https://bugs.python.org/issue31626  opened by serhiy.storchaka
#31627: test_mailbox fails if the hostname is empty
  https://bugs.python.org/issue31627  opened by serhiy.storchaka
#31628: test_emails failure on FreeBSD
  https://bugs.python.org/issue31628  opened by serhiy.storchaka
#31629: test_multiprocessing_fork failure on FreeBSD
  https://bugs.python.org/issue31629  opened by serhiy.storchaka
#31630: math.tan has poor accuracy near pi/2 on OpenBSD
  https://bugs.python.org/issue31630  opened by serhiy.storchaka
#31631: test_c_locale_coercion fails on OpenBSD
  https://bugs.python.org/issue31631  opened by serhiy.storchaka
#31632: Asyncio: SSL transport does not support set_protocol()
  https://bugs.python.org/issue31632  opened by jlacoline
#31634: Consider installing wheel in ensurepip by default
  https://bugs.python.org/issue31634  opened by Segev Finer
#31635: test_strptime failure on OpenBSD
  https://bugs.python.org/issue31635  opened by serhiy.storchaka
#31636: test_locale failure on OpenBSD
  https://bugs.python.org/issue31636  opened by serhiy.storchaka
#31638: zipapp module should support compression
  https://bugs.python.org/issue31638  opened by zmwangx
#1649329: Extract file-finding and language-handling code from gettext.f
  https://bugs.python.org/issue1649329  reopened by merwok

Most recent 15 issues with no replies (15)
==========================================

#31638: zipapp module should support compression
  https://bugs.python.org/issue31638
#31636: test_locale failure on OpenBSD
  https://bugs.python.org/issue31636
#31635: test_strptime failure on OpenBSD
  https://bugs.python.org/issue31635
#31634: Consider installing wheel in ensurepip by default
  https://bugs.python.org/issue31634
#31632: Asyncio: SSL transport does not support set_protocol()
  https://bugs.python.org/issue31632
#31631: test_c_locale_coercion fails on OpenBSD
  https://bugs.python.org/issue31631
#31630: math.tan has poor accuracy near pi/2 on OpenBSD
  https://bugs.python.org/issue31630
#31629: test_multiprocessing_fork failure on FreeBSD
  https://bugs.python.org/issue31629
#31628: test_emails failure on FreeBSD
  https://bugs.python.org/issue31628
#31609: PCbuild\clean.bat fails if the path contains whitespaces
  https://bugs.python.org/issue31609
#31602: assertion failure in zipimporter.get_source() in case of a bad
  https://bugs.python.org/issue31602
#31601: Availability of utimensat and futimens not checked correctly o
  https://bugs.python.org/issue31601
#31596: expose pthread_getcpuclockid in time module
  https://bugs.python.org/issue31596
#31590: CSV module incorrectly treats escaped newlines as new records
  https://bugs.python.org/issue31590
#31583: 2to3 call for file in current directory yields error
  https://bugs.python.org/issue31583

Most recent 15 issues waiting for review (15)
msgs #31612: Building 3.6 fails on Windows https://bugs.python.org/issue31612 8 msgs #29729: RFE: uuid.UUID(bytes=...): support bytes-like types, not only https://bugs.python.org/issue29729 7 msgs #31558: gc.freeze() - an API to mark objects as uncollectable https://bugs.python.org/issue31558 7 msgs #31530: Python 2.7 readahead feature of file objects is not thread saf https://bugs.python.org/issue31530 6 msgs Issues closed (72) ================== #5885: uuid.uuid1() is too slow https://bugs.python.org/issue5885 closed by haypo #11063: Rework uuid module: lazy initialization and add a new C extens https://bugs.python.org/issue11063 closed by pitrou #18257: Two copies of python-config https://bugs.python.org/issue18257 closed by r.david.murray #18558: Iterable glossary entry needs clarification https://bugs.python.org/issue18558 closed by rhettinger #20519: Replace uuid ctypes usage with an extension module. https://bugs.python.org/issue20519 closed by haypo #22140: "python-config --includes" returns a wrong path (double prefix https://bugs.python.org/issue22140 closed by benjamin.peterson #23702: docs.python.org/3/howto/descriptor.html still refers to "unbou https://bugs.python.org/issue23702 closed by rhettinger #25351: pyvenv activate script failure with specific bash option https://bugs.python.org/issue25351 closed by haypo #25532: infinite loop when running inspect.unwrap over unittest.mock.c https://bugs.python.org/issue25532 closed by ncoghlan #25732: functools.total_ordering does not correctly implement not equa https://bugs.python.org/issue25732 closed by serhiy.storchaka #27385: itertools.groupby has misleading doc string https://bugs.python.org/issue27385 closed by rhettinger #28129: assertion failures in ctypes https://bugs.python.org/issue28129 closed by haypo #28754: Argument Clinic for bisect.bisect_left https://bugs.python.org/issue28754 closed by rhettinger #29400: Add instruction level tracing via sys.settrace https://bugs.python.org/issue29400 closed by gwk #29624: Python 3.5.3 x86 web installer cannot install launcher https://bugs.python.org/issue29624 closed by steve.dower #30085: Discourage operator.__dunder__ functions https://bugs.python.org/issue30085 closed by terry.reedy #30152: Reduce the number of imports for argparse https://bugs.python.org/issue30152 closed by serhiy.storchaka #30346: Odd behavior when unpacking `itertools.groupby` https://bugs.python.org/issue30346 closed by serhiy.storchaka #30347: itertools.groupby() can fail a C assert() https://bugs.python.org/issue30347 closed by serhiy.storchaka #30947: Update embeded copy of libexpat from 2.2.1 to 2.2.3 https://bugs.python.org/issue30947 closed by haypo #30995: Support logging.getLogger(style='{') https://bugs.python.org/issue30995 closed by vinay.sajip #31153: Update docstrings of itertools functions https://bugs.python.org/issue31153 closed by rhettinger #31159: Doc: Language switch can't switch on specific cases https://bugs.python.org/issue31159 closed by mdk #31311: a SystemError and a crash in PyCData_setstate() when __dict__ https://bugs.python.org/issue31311 closed by serhiy.storchaka #31351: ensurepip discards pip's return code which leads to broken ven https://bugs.python.org/issue31351 closed by ncoghlan #31355: Remove Travis CI macOS job: rely on buildbots https://bugs.python.org/issue31355 closed by haypo #31389: Give pdb.set_trace() an optional `header` keyword argument https://bugs.python.org/issue31389 closed by barry #31423: Error while building PDF documentation 
https://bugs.python.org/issue31423 closed by zach.ware #31443: Possibly out of date C extension documentation https://bugs.python.org/issue31443 closed by skrah #31459: IDLE: Rename Class Browser as Module Browser https://bugs.python.org/issue31459 closed by terry.reedy #31490: assertion failure in ctypes in case an _anonymous_ attr appear https://bugs.python.org/issue31490 closed by serhiy.storchaka #31492: assertion failures in case a module has a bad __name__ attribu https://bugs.python.org/issue31492 closed by serhiy.storchaka #31494: Valgrind suppression file https://bugs.python.org/issue31494 closed by skrah #31502: IDLE: Config dialog again deletes custom themes and keysets. https://bugs.python.org/issue31502 closed by terry.reedy #31505: assertion failure in json, in case _json.make_encoder() receiv https://bugs.python.org/issue31505 closed by serhiy.storchaka #31508: Running test_ttk_guionly logs "test_widgets.py:1562: UserWarni https://bugs.python.org/issue31508 closed by serhiy.storchaka #31549: test_strptime and test_time fail on non-English Windows https://bugs.python.org/issue31549 closed by serhiy.storchaka #31559: IDLE: test_browser is failed when run twice https://bugs.python.org/issue31559 closed by terry.reedy #31560: hashlib.blake2: error in example for signed cookies https://bugs.python.org/issue31560 closed by benjamin.peterson #31561: difflib pathological behavior with mixed line endings https://bugs.python.org/issue31561 closed by rhettinger #31563: Change repr of dict_keys, dict_values, and dict_items to use c https://bugs.python.org/issue31563 closed by serhiy.storchaka #31564: NewType can subclass NewType https://bugs.python.org/issue31564 closed by Mariatta #31566: assertion failure in _warnings.warn() in case of a bad __name_ https://bugs.python.org/issue31566 closed by serhiy.storchaka #31568: Configure thinks it finds python3.5 but doesn't https://bugs.python.org/issue31568 closed by haypo #31569: inconsistent case of PCbuild/ directory https://bugs.python.org/issue31569 closed by steve.dower #31570: minor bug in documentataion relating to sending html email https://bugs.python.org/issue31570 closed by Mariatta #31571: Redundand information on Doc/reference/lexical_analysis.rst https://bugs.python.org/issue31571 closed by Mariatta #31575: Functional Programming HOWTO sub-optimal example for reduce https://bugs.python.org/issue31575 closed by rhettinger #31576: problem in math https://bugs.python.org/issue31576 closed by haypo #31578: Unexpected Floating Point Division Result https://bugs.python.org/issue31578 closed by r.david.murray #31579: Reference leak in enumerate https://bugs.python.org/issue31579 closed by serhiy.storchaka #31580: Defer compiling regular expressions https://bugs.python.org/issue31580 closed by barry #31585: Refactor the enumerate.__next__ implementation https://bugs.python.org/issue31585 closed by rhettinger #31586: SystemError in _collections._count_element() in case of a bad https://bugs.python.org/issue31586 closed by rhettinger #31587: Enum doesn't allow name https://bugs.python.org/issue31587 closed by Robert Gomu??ka #31588: SystemError in class creation in case of a metaclass with a ba https://bugs.python.org/issue31588 closed by ncoghlan #31593: [2.7][3.6] test_socketserver ForkingMixIn tests leaks child pr https://bugs.python.org/issue31593 closed by haypo #31595: Preferring MSVC 2017 in Python 3.6.x new minor releases https://bugs.python.org/issue31595 closed by steve.dower #31597: ipaddress.hosts() doesn't return anything on 
#31598: pyping 0.0.5 package crashed on ZeroDivisionError: float divis
  https://bugs.python.org/issue31598  closed by r.david.murray
#31599: bug in doctest when comparing - 0.0 == 0.0
  https://bugs.python.org/issue31599  closed by r.david.murray
#31600: please allow association of google login with current account
  https://bugs.python.org/issue31600  closed by serhiy.storchaka
#31605: meta issue: bugs.python.org search shows only issues with rece
  https://bugs.python.org/issue31605  closed by ezio.melotti
#31614: can't list groupby generator without breaking the sub groups g
  https://bugs.python.org/issue31614  closed by rhettinger
#31615: Python interpreter is failing due to BufferOverflow Exception
  https://bugs.python.org/issue31615  closed by serhiy.storchaka
#31616: Windows installer: Python binaries are user-writable
  https://bugs.python.org/issue31616  closed by steve.dower
#31617: os.path.join() return wrong value in windows
  https://bugs.python.org/issue31617  closed by eryksun
#31621: Pluralization typo in Language Reference section 7.12
  https://bugs.python.org/issue31621  closed by Mariatta
#31624: remove support for BSD/OS
  https://bugs.python.org/issue31624  closed by benjamin.peterson
#31633: test_crypt fails on OpenBSD
  https://bugs.python.org/issue31633  closed by serhiy.storchaka
#31637: integer overflow in the size of a ctypes.Array
  https://bugs.python.org/issue31637  closed by Oren Milman
#1612262: IDLE: Include nested functions and classes in module browser
  https://bugs.python.org/issue1612262  closed by terry.reedy

From benjamin at python.org  Fri Sep 29 12:17:16 2017
From: benjamin at python.org (Benjamin Peterson)
Date: Fri, 29 Sep 2017 09:17:16 -0700
Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files)
In-Reply-To:
References:
Message-ID: <1506701836.3115188.1122665216.743447DD@webmail.messagingengine.com>

Thanks, Guido and everyone who submitted feedback!

I guess I know what I'll be doing this weekend.

On Fri, Sep 29, 2017, at 08:18, Guido van Rossum wrote:
> Let me try that again.
>
> There have been no further comments. PEP 552 is now accepted.
>
> Congrats, Benjamin! Go ahead and send your implementation for review.
>
> Oops. Let me try that again.
>
> PS. PEP 550 is still unaccepted, awaiting a new revision from Yury and
> Elvis.
>
> --
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/benjamin%40python.org

From brett at python.org  Fri Sep 29 12:38:14 2017
From: brett at python.org (Brett Cannon)
Date: Fri, 29 Sep 2017 16:38:14 +0000
Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files)
In-Reply-To: <1506701836.3115188.1122665216.743447DD@webmail.messagingengine.com>
References: <1506701836.3115188.1122665216.743447DD@webmail.messagingengine.com>
Message-ID:

BTW, if you find the bytecode-specific APIs are sub-par while trying to
update them, let me know as I have been toying with cleaning them up and
centralizing them under importlib for a while and just never gotten
around to sitting down and coming up with a better design that warranted
putting the time into it. :)
On Fri, 29 Sep 2017 at 09:17 Benjamin Peterson wrote:

> Thanks, Guido and everyone who submitted feedback!
>
> I guess I know what I'll be doing this weekend.
>
> On Fri, Sep 29, 2017, at 08:18, Guido van Rossum wrote:
> > Let me try that again.
> >
> > There have been no further comments. PEP 552 is now accepted.
> >
> > Congrats, Benjamin! Go ahead and send your implementation for review.
> >
> > Oops. Let me try that again.
> >
> > PS. PEP 550 is still unaccepted, awaiting a new revision from Yury and
> > Elvis.
> >
> > --
> > --Guido van Rossum (python.org/~guido)
> > _______________________________________________
> > Python-Dev mailing list
> > Python-Dev at python.org
> > https://mail.python.org/mailman/listinfo/python-dev
> > Unsubscribe:
> > https://mail.python.org/mailman/options/python-dev/benjamin%40python.org
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/brett%40python.org

From benjamin at python.org  Sat Sep 30 21:46:34 2017
From: benjamin at python.org (Benjamin Peterson)
Date: Sat, 30 Sep 2017 18:46:34 -0700
Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files)
In-Reply-To:
References: <1506701836.3115188.1122665216.743447DD@webmail.messagingengine.com>
Message-ID: <1506822394.2443337.1123772232.2C322F02@webmail.messagingengine.com>

What do you mean by bytecode-specific APIs? The internal importlib ones?

On Fri, Sep 29, 2017, at 09:38, Brett Cannon wrote:
> BTW, if you find the bytecode-specific APIs are sub-par while trying to
> update them, let me know as I have been toying with cleaning them up and
> centralizing them under importlib for a while and just never gotten
> around to sitting down and coming up with a better design that warranted
> putting the time into it. :)
>
> On Fri, 29 Sep 2017 at 09:17 Benjamin Peterson wrote:
>
> > Thanks, Guido and everyone who submitted feedback!
> >
> > I guess I know what I'll be doing this weekend.
> >
> > On Fri, Sep 29, 2017, at 08:18, Guido van Rossum wrote:
> > > Let me try that again.
> > >
> > > There have been no further comments. PEP 552 is now accepted.
> > >
> > > Congrats, Benjamin! Go ahead and send your implementation for review.
> > >
> > > Oops. Let me try that again.
> > >
> > > PS. PEP 550 is still unaccepted, awaiting a new revision from Yury and
> > > Elvis.
> > >
> > > --
> > > --Guido van Rossum (python.org/~guido)
> > > _______________________________________________
> > > Python-Dev mailing list
> > > Python-Dev at python.org
> > > https://mail.python.org/mailman/listinfo/python-dev
> > > Unsubscribe:
> > > https://mail.python.org/mailman/options/python-dev/benjamin%40python.org
> >
> > _______________________________________________
> > Python-Dev mailing list
> > Python-Dev at python.org
> > https://mail.python.org/mailman/listinfo/python-dev
> > Unsubscribe:
> > https://mail.python.org/mailman/options/python-dev/brett%40python.org