From storchaka at gmail.com Sun Oct 1 04:12:35 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 1 Oct 2017 11:12:35 +0300 Subject: [Python-Dev] bpo-30806 netrc.__repr__() is broken for writing to file (GH-2491) In-Reply-To: <3y402J0GMyzFqwk@mail.python.org> References: <3y402J0GMyzFqwk@mail.python.org> Message-ID: <8c28b48c-521e-daa9-59fe-d94286769ef4@gmail.com> 30.09.17 10:10, INADA Naoki ????: > https://github.com/python/cpython/commit/b24cd055ecb3eea9a15405a6ca72dafc739e6531 > commit: b24cd055ecb3eea9a15405a6ca72dafc739e6531 > branch: master > author: James Sexton > committer: INADA Naoki > date: 2017-09-30T16:10:31+09:00 > summary: > > bpo-30806 netrc.__repr__() is broken for writing to file (GH-2491) > > netrc file format doesn't support quotes and escapes. > > See https://linux.die.net/man/5/netrc The commit message looks confusing to me. Is netrc.__repr__() is broken now? Or this change makes netrc file format supporting quotes and escapes now? Please read the following thread: https://mail.python.org/pipermail/python-dev/2011-May/111303.html. From k7hoven at gmail.com Sun Oct 1 12:13:49 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sun, 1 Oct 2017 19:13:49 +0300 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: References: Message-ID: On Sep 29, 2017 18:21, "Guido van Rossum" wrote: PS. PEP 550 is still unaccepted, awaiting a new revision from Yury and Elvis. This is getting really off-topic, but I do have updates to add to PEP 555 if there is interest in that. IMO, 555 is better and most likely faster than 550, but on the other hand, the issues with PEP 550 are most likely not going to be a problem for me personally. -- Koos -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sun Oct 1 12:26:42 2017 From: guido at python.org (Guido van Rossum) Date: Sun, 1 Oct 2017 09:26:42 -0700 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: References: Message-ID: Your PEP is currently incomplete. If you don't finish it, it is not even a contender. But TBH it's not my favorite anyway, so you could also just withdraw it. On Oct 1, 2017 9:13 AM, "Koos Zevenhoven" wrote: > On Sep 29, 2017 18:21, "Guido van Rossum" wrote: > > > PS. PEP 550 is still unaccepted, awaiting a new revision from Yury and > Elvis. > > > This is getting really off-topic, but I do have updates to add to PEP 555 > if there is interest in that. IMO, 555 is better and most likely faster > than 550, but on the other hand, the issues with PEP 550 are most likely > not going to be a problem for me personally. > > -- Koos > -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Sun Oct 1 16:52:31 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sun, 1 Oct 2017 23:52:31 +0300 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: References: Message-ID: On Oct 1, 2017 19:26, "Guido van Rossum" wrote: Your PEP is currently incomplete. If you don't finish it, it is not even a contender. But TBH it's not my favorite anyway, so you could also just withdraw it. I can withdraw it if you ask me to, but I don't want to withdraw it without any reason. I haven't changed my mind about the big picture. 
OTOH, PEP 521 is elegant and could be used to implement PEP 555, but 521 is almost certainly less performant and has some problems regarding context manager wrappers that use composition instead of inheritance. -- Koos On Oct 1, 2017 9:13 AM, "Koos Zevenhoven" wrote: > On Sep 29, 2017 18:21, "Guido van Rossum" wrote: > > > PS. PEP 550 is still unaccepted, awaiting a new revision from Yury and > Elvis. > > > This is getting really off-topic, but I do have updates to add to PEP 555 > if there is interest in that. IMO, 555 is better and most likely faster > than 550, but on the other hand, the issues with PEP 550 are most likely > not going to be a problem for me personally. > > -- Koos > -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Sun Oct 1 22:04:51 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Mon, 02 Oct 2017 02:04:51 +0000 Subject: [Python-Dev] Investigating time for `import requests` Message-ID: See also https://github.com/requests/requests/issues/4315 I tried new `-X importtime` option to `import requests`. Full output is here: https://gist.github.com/methane/96d58a29e57e5be97769897462ee1c7e Currently, it took about 110ms. And major parts are from Python stdlib. Followings are root of slow stdlib subtrees. import time: self [us] | cumulative | imported package import time: 1374 | 14038 | logging import time: 2636 | 4255 | socket import time: 2902 | 11004 | ssl import time: 1162 | 16694 | http.client import time: 656 | 5331 | cgi import time: 7338 | 7867 | http.cookiejar import time: 2930 | 2930 | http.cookies *1. logging* logging is slow because it is imported in early stage. It imports many common, relatively slow packages. (collections, functools, enum, re). Especially, traceback module is slow because linecache. import time: 1419 | 5016 | tokenize import time: 200 | 5910 | linecache import time: 347 | 8869 | traceback I think it's worth enough to import linecache lazily. *2. socket* import time: 807 | 1221 | selectors import time: 2636 | 4255 | socket socket imports selectors for socket.send_file(). And selectors module use ABC. That's why selectors is bit slow. And socket module creates four enums. That's why import socket took more than 2.5ms excluding subimports. *3. ssl* import time: 2007 | 2007 | ipaddress import time: 2386 | 2386 | textwrap import time: 2723 | 2723 | _ssl ... import time: 306 | 988 | base64 import time: 2902 | 11004 | ssl I already created pull request about removing textwrap dependency from ssl. https://github.com/python/cpython/pull/3849 ipaddress and _ssl module are bit slow too. But I don't know we can improve them or not. ssl itself took 2.9 ms. It's because ssl has six enums. *4. http.client* import time: 1376 | 2448 | email.header ... import time: 1469 | 7791 | email.utils import time: 408 | 10646 | email._policybase import time: 939 | 12210 | email.feedparser import time: 322 | 12720 | email.parser ... import time: 599 | 1361 | email.message import time: 1162 | 16694 | http.client email.parser has very large import tree. But I don't know how to break the tree. *5. cgi* import time: 1083 | 1083 | html.entities import time: 560 | 1643 | html ... import time: 656 | 2609 | shutil import time: 424 | 3033 | tempfile import time: 656 | 5331 | cgi cgi module uses tempfile to save uploaded file. But requests imports cgi just for `cgi.parse_header()`. tempfile is not used. Maybe, it's worth enough to import it lazily. FYI, cgi depends on very slow email.parser too. 
But this tree doesn't contain it because http.client is imported before cgi. Even though it's not problem for requests, it may affects to real CGI application. Of course, startup time is very important for CGI applications too. *6. http.cookiejar and http.cookies* It's slow because it has many `re.compile()` *Ideas* There are some places to break large import tree by "import in function" hack. ABC is slow, and it's used widely without almost no real need. (Who need selectors is ABC?) We can't remove ABC dependency because of backward compatibility. But I hope ABC is implemented in C by Python 3.7. Enum is slow, maybe slower than most people think. I don't know why exactly, but I suspect that it's because namespace dict implemented in Python. Anyway, I think we can have C implementation of IntEnum and IntFlag, like namedtpule vs PyStructSequence. It doesn't need to 100% compatible with current enum. Especially, no need for using metaclass. Another major slowness comes from compiling regular expression. I think we can increase cache size of `re.compile` and use ondemand cached compiling (e.g. `re.match()`), instead of "compile at import time" in many modules. PEP 562 -- Module __getattr__ helps a lot too. It make possible to split collection module and strings module. (strings module is used often for constants like strings.ascii_letters, but strings.Template cause import time re.compile()) Regards, -- Inada Naoki -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Oct 1 22:34:55 2017 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 1 Oct 2017 19:34:55 -0700 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: References: Message-ID: On Sun, Oct 1, 2017 at 7:04 PM, INADA Naoki wrote: > 4. http.client > > import time: 1376 | 2448 | email.header > ... > import time: 1469 | 7791 | email.utils > import time: 408 | 10646 | email._policybase > import time: 939 | 12210 | email.feedparser > import time: 322 | 12720 | email.parser > ... > import time: 599 | 1361 | email.message > import time: 1162 | 16694 | http.client > > email.parser has very large import tree. > But I don't know how to break the tree. There is some work to get urllib3/requests to stop using http.client, though it's not clear if/when it will actually happen: https://github.com/shazow/urllib3/pull/1068 > Another major slowness comes from compiling regular expression. > I think we can increase cache size of `re.compile` and use ondemand cached > compiling (e.g. `re.match()`), > instead of "compile at import time" in many modules. In principle re.compile() itself could be made lazy -- return a regular exception object that just holds the string, and then compiles and caches it the first time it's used. Might be tricky to do in a backwards compatibility way if it moves detection of invalid regexes from compile time to use time, but it could be an opt-in flag. -n -- Nathaniel J. Smith -- https://vorpus.org From v+python at g.nevcal.com Sun Oct 1 22:49:04 2017 From: v+python at g.nevcal.com (Glenn Linderman) Date: Sun, 1 Oct 2017 19:49:04 -0700 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: References: Message-ID: On 10/1/2017 7:34 PM, Nathaniel Smith wrote: >> Another major slowness comes from compiling regular expression. >> I think we can increase cache size of `re.compile` and use ondemand cached >> compiling (e.g. `re.match()`), >> instead of "compile at import time" in many modules. 
> In principle re.compile() itself could be made lazy -- return a > regular exception object that just holds the string, and then compiles > and caches it the first time it's used. Might be tricky to do in a > backwards compatibility way if it moves detection of invalid regexes > from compile time to use time, but it could be an opt-in flag. Would be interesting to know how many of the in-module, compile time re.compile calls use dynamic values, versus string constants. Seems like string constant parameters to re.compile calls could be moved to on-first-use compiling without significant backwards incompatibility impact if there is an adequate test suite... and if there isn't an adequate test suite, should we care about the deferred detection? -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sun Oct 1 23:42:18 2017 From: guido at python.org (Guido van Rossum) Date: Sun, 1 Oct 2017 20:42:18 -0700 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: References: Message-ID: On Sun, Oct 1, 2017 at 1:52 PM, Koos Zevenhoven wrote: > On Oct 1, 2017 19:26, "Guido van Rossum" wrote: > > Your PEP is currently incomplete. If you don't finish it, it is not even a > contender. But TBH it's not my favorite anyway, so you could also just > withdraw it. > > > I can withdraw it if you ask me to, but I don't want to withdraw it > without any reason. I haven't changed my mind about the big picture. OTOH, > PEP 521 is elegant and could be used to implement PEP 555, but 521 is > almost certainly less performant and has some problems regarding context > manager wrappers that use composition instead of inheritance. > It is my understanding that PEP 521 (which proposes to add optional __suspend__ and __resume__ methods to the context manager protocol, to be called whenever a frame is suspended or resumed inside a `with` block) is no longer a contender because it would be way too slow. I haven't read it recently or thought about it, so I don't know what the second issue you mention is about (though it's presumably about the `yield` in a context manager implemented using a generator decorated with `@contextlib.contextmanager`). So it's really between PEP 550 and PEP 555. And there are currently too many parts of PEP 555 that are left to the imagination of the reader. So, again, I ask you to put up or shut up. It's your choice. If you don't want to do the work completing the PEP you might as well withdraw (once I am satisfied with Yury's PEP I will just accept it when there's no contender). If you do complete it I will probably still choose PEP 550 -- but at the moment the choice would be between something I understand completely and something that's too poorly specified to be able to reason about it. --Guido > -- Koos > > > > On Oct 1, 2017 9:13 AM, "Koos Zevenhoven" wrote: > >> On Sep 29, 2017 18:21, "Guido van Rossum" wrote: >> >> >> PS. PEP 550 is still unaccepted, awaiting a new revision from Yury and >> Elvis. >> >> >> This is getting really off-topic, but I do have updates to add to PEP 555 >> if there is interest in that. IMO, 555 is better and most likely faster >> than 550, but on the other hand, the issues with PEP 550 are most likely >> not going to be a problem for me personally. >> >> -- Koos >> > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Mon Oct 2 00:03:38 2017 From: guido at python.org (Guido van Rossum) Date: Sun, 1 Oct 2017 21:03:38 -0700 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: References: Message-ID: One more thing. I would really appreciate it if you properly wrapped lines in your PEP around column 72 instead of using a single line per paragraph. This is the standard convention, see the template in PEP 12. On Sun, Oct 1, 2017 at 8:42 PM, Guido van Rossum wrote: > On Sun, Oct 1, 2017 at 1:52 PM, Koos Zevenhoven wrote: > >> On Oct 1, 2017 19:26, "Guido van Rossum" wrote: >> >> Your PEP is currently incomplete. If you don't finish it, it is not even >> a contender. But TBH it's not my favorite anyway, so you could also just >> withdraw it. >> >> >> I can withdraw it if you ask me to, but I don't want to withdraw it >> without any reason. I haven't changed my mind about the big picture. OTOH, >> PEP 521 is elegant and could be used to implement PEP 555, but 521 is >> almost certainly less performant and has some problems regarding context >> manager wrappers that use composition instead of inheritance. >> > > It is my understanding that PEP 521 (which proposes to add optional > __suspend__ and __resume__ methods to the context manager protocol, to be > called whenever a frame is suspended or resumed inside a `with` block) is > no longer a contender because it would be way too slow. I haven't read it > recently or thought about it, so I don't know what the second issue you > mention is about (though it's presumably about the `yield` in a context > manager implemented using a generator decorated with > `@contextlib.contextmanager`). > > So it's really between PEP 550 and PEP 555. And there are currently too > many parts of PEP 555 that are left to the imagination of the reader. So, > again, I ask you to put up or shut up. It's your choice. If you don't want > to do the work completing the PEP you might as well withdraw (once I am > satisfied with Yury's PEP I will just accept it when there's no contender). > If you do complete it I will probably still choose PEP 550 -- but at the > moment the choice would be between something I understand completely and > something that's too poorly specified to be able to reason about it. > > --Guido > > >> -- Koos >> >> >> >> On Oct 1, 2017 9:13 AM, "Koos Zevenhoven" wrote: >> >>> On Sep 29, 2017 18:21, "Guido van Rossum" wrote: >>> >>> >>> PS. PEP 550 is still unaccepted, awaiting a new revision from Yury and >>> Elvis. >>> >>> >>> This is getting really off-topic, but I do have updates to add to PEP >>> 555 if there is interest in that. IMO, 555 is better and most likely faster >>> than 550, but on the other hand, the issues with PEP 550 are most likely >>> not going to be a problem for me personally. >>> >>> -- Koos >>> >> >> > > > -- > --Guido van Rossum (python.org/~guido) > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Oct 2 00:44:57 2017 From: guido at python.org (Guido van Rossum) Date: Sun, 1 Oct 2017 21:44:57 -0700 Subject: [Python-Dev] PEP 553 In-Reply-To: <1B4494FE-AA40-4FF7-B81A-B0FB14F5F259@python.org> References: <1B4494FE-AA40-4FF7-B81A-B0FB14F5F259@python.org> Message-ID: Hope you don't mind me CC'ing python-dev. On Sun, Oct 1, 2017 at 9:38 AM, Barry Warsaw wrote: > You seem to be in PEP review mode :) > > What do you think about 553? 
Still want to wait, or do you think it?s > missing anything? So far, all the feedback has been positive, and I think > we can basically just close the open issues and both the PEP and > implementation are ready to go. > > Happy to do more work on it if you feel it?s necessary. > -B > I'm basically in agreement. Some quick notes: - There's a comma instead of a period at the end of the 4th bullet in the Rationale: "Breaking the idiom up into two lines further complicates the use of the debugger," . Also I don't understand how this complicates use (even though I also always type it as one line). And the lint warning is actually useful when you accidentally leave this in code you send to CI (assuming that runs a linter, as it should). TBH the biggest argument (to me) is that I simply don't know *how* I would enter some IDE's debugger programmatically. I think it should also be pointed out that even if an IDE has a way to specify conditional breakpoints, the UI may be such that it's easier to just add the check to the code -- and then the breakpoint() option is much more attractive than having to look up how it's done in your particular IDE (especially since this is not all that common). - There's no rationale for the *args, **kwds part of the breakpoint() signature. (I vaguely recall someone on the mailing list asking for it but it seemed far-fetched at best.) - The explanation of the relationship between sys.breakpoint() and sys.__breakpointhook__ was unclear to me -- I had to go to the docs for __displayhook__ ( https://docs.python.org/3/library/sys.html#sys.__displayhook__) to understand that sys.__breakpointhook__ is simply initialized to the same function as sys.breakpointhook, and the idea is that if you screw up you can restore sys.breakpointhook from the other rather than having to restart your process. Is that in fact it? The text "``sys.__breakpointhook__`` then stashes the default value of ``sys.breakpointhook()`` to make it easy to reset" seems to imply some action that would happen... when? how? - Some pseudo-code would be nice. It seems that it's like this: # in builtins def breakpoint(*args, **kwds): import sys return sys.breakpointhook(*args, **kwds) # in sys def breakpointhook(*args, **kwds): import os hook = os.getenv('PYTHONBREAKPOINT') if hook == '0': return None if not hook: import pdb return pdb.set_trace(*args, **kwds) if '.' not in hook: import builtins mod = builtins funcname = hook else: modname, funcname = hook.rsplit('.', 1) __import__(modname) import sys mod = sys.modules[modname] func = getattr(mod, funcname) return func(*args, **kwds) __breakpointhook__ = breakpointhook Except that the error handling should be a bit better. (In particular the PEP specifies a try/except around most of the code in sys.breakpointhook() that issues a RuntimeWarning and returns None.) - Not sure what the PEP's language around evaluation of PYTHONBREAKPOINT means for the above pseudo code. I *think* the PEP author's opinion is that the above pseudo-code is fine. Since programs can mutate their own environment, I think something like `os.environ['PYTHONBREAKPOINT'] = 'foo.bar.baz'; breakpoint()` should result in foo.bar.baz() being imported and called, right? - I'm not quite sure what sort of fast-tracking for PYTHONBREAKPOINT=0 you had in mind beyond putting it first in the code above. - Did you get confirmation from other debuggers? E.g. does it work for IDLE, Wing IDE, PyCharm, and VS 2015? 
- I'm not sure what the point would be of making a call to breakpoint() a special opcode (it seems a lot of work for something of this nature). ISTM that if some IDE modifies bytecode it can do whatever it well please without a PEP. - I don't see the point of calling `pdb.pm()` at breakpoint time. But it could be done using the PEP with `import pdb; sys.breakpointhook = pdb.pm` right? So this hardly deserves an open issue. - I haven't read the actual implementation in the PR. A PEP should not depend on the actual proposed implementation for disambiguation of its specification (hence my proposal to add pseudo-code to the PEP). That's what I have! -- --Guido van Rossum (python.org/~guido ) -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.hettinger at gmail.com Mon Oct 2 01:13:22 2017 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 1 Oct 2017 22:13:22 -0700 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: References: Message-ID: > On Oct 1, 2017, at 7:34 PM, Nathaniel Smith wrote: > > In principle re.compile() itself could be made lazy -- return a > regular exception object that just holds the string, and then compiles > and caches it the first time it's used. Might be tricky to do in a > backwards compatibility way if it moves detection of invalid regexes > from compile time to use time, but it could be an opt-in flag. ISTM that someone writing ``re.compile(pattern)`` is explicitly saying they want the regex to be pre-compiled. For cache on first-use, we already have a way to do that with ``re.search(pattern, some string)`` which compiles and then caches. What would be more interesting would be to have a way to save the compiled regex in a pyc file so that it can be restored on load rather than recomputed. Also, we should remind ourselves that making more and more things lazy is a false optimization unless those things never get used. Otherwise, all we're doing is ending the timing before all the relevant work is done. If the lazy object does get used, we've made the actual total execution time worse (because of the overhead of the lazy evaluation logic). Raymond From tjreedy at udel.edu Mon Oct 2 02:15:45 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 2 Oct 2017 02:15:45 -0400 Subject: [Python-Dev] PEP 553 In-Reply-To: References: <1B4494FE-AA40-4FF7-B81A-B0FB14F5F259@python.org> Message-ID: On 10/2/2017 12:44 AM, Guido van Rossum wrote: > - There's no rationale for the *args, **kwds part of the breakpoint() > signature. (I vaguely recall someone on the mailing list asking for it > but it seemed far-fetched at best.) If IDLE's event-driven GUI debugger were rewritten to run in the user process, people wanting to debug a tkinter program should be able to pass in their root, with its mainloop, rather than having the debugger create its own, as it normally would. Something else could come up. > - The explanation of the relationship between sys.breakpoint() and > sys.__breakpointhook__ was unclear to me -- I had to go to the docs for > __displayhook__ > (https://docs.python.org/3/library/sys.html#sys.__displayhook__) to > understand that sys.__breakpointhook__ is simply initialized to the same > function as sys.breakpointhook, and the idea is that if you screw up you > can restore sys.breakpointhook from the other rather than having to > restart your process. Is that in fact it? 
The text > "``sys.__breakpointhook__`` then stashes the default value of > ``sys.breakpointhook()`` to make it easy to reset" seems to imply some > action that would happen... when? how? > > - Some pseudo-code would be nice. It seems that it's like this: This will be helpful to anyone implementing their own breakpointhook. > # in builtins > def breakpoint(*args, **kwds): > ??? import sys > ??? return sys.breakpointhook(*args, **kwds) > > # in sys > def breakpointhook(*args, **kwds): > ??? import os > ??? hook = os.getenv('PYTHONBREAKPOINT') > ??? if hook == '0': > ??????? return None > ??? if not hook: > ? ?? ?? import pdb > ??????? return pdb.set_trace(*args, **kwds) > > ??? if '.' not in hook: > ??????? import builtins > ??????? mod = builtins > ??????? funcname = hook > ??? else: > ? ?? ?? modname, funcname = hook.rsplit('.', 1) > ??????? __import__(modname) > ??????? import sys > ??????? mod = sys.modules[modname] > ??? func = getattr(mod, funcname) > ??? return func(*args, **kwds) > > __breakpointhook__ = breakpointhook > > Except that the error handling should be a bit better. (In particular > the PEP specifies a try/except around most of the code in > sys.breakpointhook() that issues a RuntimeWarning and returns None.) > > - Not sure what the PEP's language around evaluation of PYTHONBREAKPOINT > means for the above pseudo code. I *think* the PEP author's opinion is > that the above pseudo-code is fine. Since programs can mutate their own > environment, I think something like `os.environ['PYTHONBREAKPOINT'] = > 'foo.bar.baz'; breakpoint()` should result in foo.bar.baz() being > imported and called, right? > > - I'm not quite sure what sort of fast-tracking for PYTHONBREAKPOINT=0 > you had in mind beyond putting it first in the code above. > > - Did you get confirmation from other debuggers? E.g. does it work for > IDLE, Wing IDE, PyCharm, and VS 2015? > > - I'm not sure what the point would be of making a call to breakpoint() > a special opcode (it seems a lot of work for something of this nature). > ISTM that if some IDE modifies bytecode it can do whatever it well > please without a PEP. > > - I don't see the point of calling `pdb.pm ()` at > breakpoint time. But it could be done using the PEP with `import pdb; > sys.breakpointhook = pdb.pm ` right? So this hardly > deserves an open issue. > > - I haven't read the actual implementation in the PR. A PEP should not > depend on the actual proposed implementation for disambiguation of its > specification (hence my proposal to add pseudo-code to the PEP). -- Terry Jan Reedy From ncoghlan at gmail.com Mon Oct 2 03:39:11 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 2 Oct 2017 17:39:11 +1000 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: References: Message-ID: On 2 October 2017 at 15:13, Raymond Hettinger wrote: > >> On Oct 1, 2017, at 7:34 PM, Nathaniel Smith wrote: >> >> In principle re.compile() itself could be made lazy -- return a >> regular exception object that just holds the string, and then compiles >> and caches it the first time it's used. Might be tricky to do in a >> backwards compatibility way if it moves detection of invalid regexes >> from compile time to use time, but it could be an opt-in flag. > > ISTM that someone writing ``re.compile(pattern)`` is explicitly saying they want the regex to be pre-compiled. For cache on first-use, we already have a way to do that with ``re.search(pattern, some string)`` which compiles and then caches. 
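For concreteness, the kind of opt-in lazy pattern object sketched in the quoted idea of making re.compile() itself lazy could look roughly like this. This is purely illustrative, not an existing re API, and note that an invalid pattern would only be reported at first use:

    import re

    class LazyPattern:
        # Hold the pattern string; compile and cache on first use.
        def __init__(self, pattern, flags=0):
            self._pattern = pattern
            self._flags = flags
            self._compiled = None

        def _compile(self):
            if self._compiled is None:
                # Pattern errors surface here, at first use, rather than
                # at definition time -- the compatibility concern noted above.
                self._compiled = re.compile(self._pattern, self._flags)
            return self._compiled

        def match(self, string, *args):
            return self._compile().match(string, *args)

        def search(self, string, *args):
            return self._compile().search(string, *args)
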
> > What would be more interesting would be to have a way to save the compiled regex in a pyc file so that it can be restored on load rather than recomputed. > > Also, we should remind ourselves that making more and more things lazy is a false optimization unless those things never get used. Otherwise, all we're doing is ending the timing before all the relevant work is done. If the lazy object does get used, we've made the actual total execution time worse (because of the overhead of the lazy evaluation logic). Right, and I think the approach Inada-san took here is a good example of how to do that effectively (there are a lot of command line scripts and other startup-sensitive operations that will include an "import requests", but *not* directly import any of the other modules in its dependency tree, so "What requests uses" can identify a useful set of avoidable imports. A Flask "Hello world" app could likely provide another such sample, as could some example data analysis notebooks). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Mon Oct 2 04:57:01 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 2 Oct 2017 09:57:01 +0100 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: References: Message-ID: On 2 October 2017 at 06:13, Raymond Hettinger wrote: > >> On Oct 1, 2017, at 7:34 PM, Nathaniel Smith wrote: >> >> In principle re.compile() itself could be made lazy -- return a >> regular exception object that just holds the string, and then compiles >> and caches it the first time it's used. Might be tricky to do in a >> backwards compatibility way if it moves detection of invalid regexes >> from compile time to use time, but it could be an opt-in flag. > > ISTM that someone writing ``re.compile(pattern)`` is explicitly saying they want the regex to be pre-compiled. For cache on first-use, we already have a way to do that with ``re.search(pattern, some string)`` which compiles and then caches. In practice, I don't think the fact that re.search() et al cache the compiled expressions is that well known (it's mentioned in the re.compile docs, but not in the re.search docs) and so people often compile up front because they think it helps, rather than actually measuring to check. Also, many regexes are long and complex, so factoring them out as global variables is a reasonable practice. And it's easy to imagine people deciding that putting the re.compile step into the global, rather than having the global be a string that gets passed to re.search, is a sensible thing to do (I know I'd do that, without even thinking about it). So I think that cache on first use is likely to be a useful optimisation in practical terms. I don't have any feel for how many uses of re.compile up front would be harmed if we defer compilation to first use (other than "probably not many") but we could make it opt-in if necessary - we'd hit the same problem of people not thinking to opt in, though. Paul From christian at python.org Mon Oct 2 05:06:17 2017 From: christian at python.org (Christian Heimes) Date: Mon, 2 Oct 2017 11:06:17 +0200 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: References: Message-ID: On 2017-10-02 04:04, INADA Naoki wrote: > *3. ssl* > > import time:? ? ? 2007 |? ? ? ?2007 |? ? ? ? ? ? ? ? ? ? ?ipaddress > import time:? ? ? 2386 |? ? ? ?2386 |? ? ? ? ? ? ? ? ? ? ?textwrap > import time:? ? ? 2723 |? ? ? ?2723 |? ? ? ? ? ? ? ? ? ? ?_ssl > ... > import time:? ? ? ?306 |? ? ? ? 988 |? 
? ? ? ? ? ? ? ? ? ?base64 > import time:? ? ? 2902 |? ? ? 11004 |? ? ? ? ? ? ? ? ? ?ssl > > I already created pull request about removing textwrap dependency from ssl. > https://github.com/python/cpython/pull/3849 Thanks for the patch. I left a comment on the PR. Please update your patch and give me a chance to review patches next time. > ipaddress and _ssl module are bit slow too.? But I don't know we can > improve them or not. The _ssl extension module has to initialize OpenSSL. It is expected to take a while. For 3.7 I'll replace ssl.match_hostname with OpenSSL function. The ssl module will no longer depend on re and ipaddress module. > ssl itself took 2.9 ms.? It's because ssl has six enums. Why are enums so slow? Christian From raymond.hettinger at gmail.com Mon Oct 2 05:42:19 2017 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Mon, 2 Oct 2017 02:42:19 -0700 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: References: Message-ID: > On Oct 2, 2017, at 12:39 AM, Nick Coghlan wrote: > > "What requests uses" can identify a useful set of > avoidable imports. A Flask "Hello world" app could likely provide > another such sample, as could some example data analysis notebooks). Right. It is probably worthwhile to identify which parts of the library are typically imported but are not ever used. And likewise, identify a core set of commonly used tools that are going to be almost unavoidable in sufficiently interesting applications (like using requests to access a REST API, running a micro-webframework, or invoking mercurial). Presumably, if any of this is going to make a difference to end users, we need to see if there is any avoidable work that takes a significant fraction of the total time from invocation through the point where the user first sees meaningful output. That would include loading from nonvolatile storage, executing the various imports, and doing the actual application. I don't expect to find anything that would help users of Django, Flask, and Bottle since those are typically long-running apps where we value response time more than startup time. For scripts using the requests module, there will be some fruit because not everything that is imported is used. However, that may not be significant because scripts using requests tend to be I/O bound. In the timings below, 6% of the running time is used to load and run python.exe, another 16% is used to import requests, and the remaining 78% is devoted to the actual task of running a simple REST API query. It would be interesting to see how much of the 16% could be avoided without major alterations to requests, to urllib3, and to the standard library. For mercurial, "hg log" or "hg commit" will likely be instructive about what portion of the imports actually get used. A push or pull will likely be I/O bound so those commands are less informative. Raymond --------- Quick timing for a minimal script using the requests module ----------- $ cat > demo_github_rest_api.py import requests info = requests.get('https://api.github.com/users/raymondh').json() print('%(name)s works at %(company)s. Contact at %(email)s' % info) $ time python3.6 demo_github_rest_api.py Raymond Hettinger works at SauceLabs. 
Contact at None real 0m0.561s user 0m0.134s sys 0m0.018s $ time python3.6 -c "import requests" real 0m0.125s user 0m0.104s sys 0m0.014s $ time python3.6 -c "" real 0m0.036s user 0m0.024s sys 0m0.005s From christian at python.org Mon Oct 2 05:47:05 2017 From: christian at python.org (Christian Heimes) Date: Mon, 2 Oct 2017 11:47:05 +0200 Subject: [Python-Dev] Python startup optimization: script vs. service Message-ID: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org> Hello python-dev, it's great to see that so many developers are working on speeding up Python's startup. The improvements are going to make Python more suitable for command line scripts. However I'm worried that some approaches are going to make other use cases slower and less efficient. I'm talking about downsides of lazy initialization and deferred imports. For short running command line scripts, lazy initialization of regular expressions and deferred import of rarely used modules can greatly reduce startup time and reduce memory usage. For long running processes, deferring imports and initialization can be a huge performance problem. A typical server application should initialize as much as possible at startup and then signal its partners that it is ready to serve requests. A deferred import of a module is going to slow down the first request that happens to require the module. This is unacceptable for some applications, e.g. Raymond's example of speed trading. It's even worse for forking servers. A forking HTTP server handles each request in a forked child. Each child process has to compile a lazy regular expression or important a deferred module over and over. uWSGI's emperor / vassal mode us a pre-fork model with multiple server processes to efficiently share memory with copy-on-write semantics. Lazy imports will make the approach less efficient and slow down forking of new vassals. TL;DR please refrain from moving imports into functions or implementing lazy modes, until we have figured out how to satisfy requirements of both scripts and long running services. We probably need a PEP... Christian From songofacandy at gmail.com Mon Oct 2 07:10:32 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Mon, 02 Oct 2017 11:10:32 +0000 Subject: [Python-Dev] Python startup optimization: script vs. service In-Reply-To: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org> References: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org> Message-ID: Hi. My company is using Python for web service. So I understand what you're worrying. I'm against fine grained, massive lazy loading too. But I think we're careful enough for lazy importing. https://github.com/python/cpython/pull/3849 In this PR, I stop using textwrap entirely, instead of lazy import. https://github.com/python/cpython/pull/3796 In this PR, lazy loading only happens when uuid1 is used. But uuid1 is very uncommon for nowdays. https://github.com/python/cpython/pull/3757 In this PR, singledispatch is lazy loading types and weakref. But singledispatch is used as decorator. So if web application uses singledispatch, it's loaded before preforking. https://github.com/python/cpython/pull/1269 In this PR, there are some lazy imports. But the number of lazy imports seems small enough. I don't think we're going to too aggressive. In case of regular expression, we're about starting discussion. No real changes are made yet. For example, tokenize.py has large regular expressions. 
But most of web application uses only one of them: linecache.py uses tokenize.open(), and it uses regular expression for encoding cookie. (Note that traceback is using linecache. It's very commonly imported.) So 90% of time and memory for importing tokenize is just a waste not only CLI application, but also web applications. I have not create PR to lazy importing linecache or tokenize, because I'm worrying about "import them at first traceback". I feel Go's habit helps in some cases; "A little copying is better than a little dependency." (https://go-proverbs.github.io/ ) Maybe, copying `tokenize.open()` into linecache is better than lazy loading tokenize. Anyway, I completely agree with you; we should careful enough about lazy (importing | compiling). Regards, On Mon, Oct 2, 2017 at 6:47 PM Christian Heimes wrote: > Hello python-dev, > > it's great to see that so many developers are working on speeding up > Python's startup. The improvements are going to make Python more > suitable for command line scripts. However I'm worried that some > approaches are going to make other use cases slower and less efficient. > I'm talking about downsides of lazy initialization and deferred imports. > > > For short running command line scripts, lazy initialization of regular > expressions and deferred import of rarely used modules can greatly > reduce startup time and reduce memory usage. > > > For long running processes, deferring imports and initialization can be > a huge performance problem. A typical server application should > initialize as much as possible at startup and then signal its partners > that it is ready to serve requests. A deferred import of a module is > going to slow down the first request that happens to require the module. > This is unacceptable for some applications, e.g. Raymond's example of > speed trading. > > It's even worse for forking servers. A forking HTTP server handles each > request in a forked child. Each child process has to compile a lazy > regular expression or important a deferred module over and over. > uWSGI's emperor / vassal mode us a pre-fork model with multiple server > processes to efficiently share memory with copy-on-write semantics. Lazy > imports will make the approach less efficient and slow down forking of > new vassals. > > > TL;DR please refrain from moving imports into functions or implementing > lazy modes, until we have figured out how to satisfy requirements of > both scripts and long running services. We probably need a PEP... > > Christian > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com > -- Inada Naoki -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Mon Oct 2 09:26:09 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 2 Oct 2017 15:26:09 +0200 Subject: [Python-Dev] Python startup optimization: script vs. service In-Reply-To: References: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org> Message-ID: 2017-10-02 13:10 GMT+02:00 INADA Naoki : > https://github.com/python/cpython/pull/3796 > In this PR, lazy loading only happens when uuid1 is used. > But uuid1 is very uncommon for nowdays. Antoine Pitrou added a new C extension _uuid which is imported as soon as uuid(.py) is imported. On Linux at least, the main "overhead" is still done on "import uuid". 
But Antoine change optimized a lot "import uuid" import time! > https://github.com/python/cpython/pull/3757 > In this PR, singledispatch is lazy loading types and weakref. > But singledispatch is used as decorator. > So if web application uses singledispatch, it's loaded before preforking. While "import module" is fast, maybe we should use sometimes a global variable to cache the import. module = None def func(): global module if module is None: import module ... I'm not sure that it's possible to write an helper for such pattern. In *this case*, it's ok, since @singledispatch is more designed to be used with top-level functions, not on nested functions. So the overhead is only at startup, not at runtime in practice. > Maybe, copying `tokenize.open()` into linecache is better than lazy loading > tokenize. Please don't copy code, only do that if we have no other choice. > Anyway, I completely agree with you; we should careful enough about lazy > (importing | compiling). I think that most core devs are aware of tradeoffs and we try to find a compromise on a case by case basis. Victor From k7hoven at gmail.com Mon Oct 2 10:03:17 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Mon, 2 Oct 2017 17:03:17 +0300 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: References: Message-ID: On Mon, Oct 2, 2017 at 6:42 AM, Guido van Rossum wrote: > On Sun, Oct 1, 2017 at 1:52 PM, Koos Zevenhoven wrote: > >> On Oct 1, 2017 19:26, "Guido van Rossum" wrote: >> >> Your PEP is currently incomplete. If you don't finish it, it is not even >> a contender. But TBH it's not my favorite anyway, so you could also just >> withdraw it. >> >> >> I can withdraw it if you ask me to, but I don't want to withdraw it >> without any reason. I haven't changed my mind about the big picture. OTOH, >> PEP 521 is elegant and could be used to implement PEP 555, but 521 is >> almost certainly less performant and has some problems regarding context >> manager wrappers that use composition instead of inheritance. >> > > It is my understanding that PEP 521 (which proposes to add optional > __suspend__ and __resume__ methods to the context manager protocol, to be > called whenever a frame is suspended or resumed inside a `with` block) is > no longer a contender because it would be way too slow. I haven't read it > recently or thought about it, so I don't know what the second issue you > mention is about (though it's presumably about the `yield` in a context > manager implemented using a generator decorated with > `@contextlib.contextmanager`). > > ?Well, it's not completely unrelated to that. The problem I'm talking about is perhaps most easily seen from a simple context manager wrapper that uses composition instead of inheritance: class Wrapper: def __init__(self): self._wrapped = SomeContextManager() def __enter__(self): print("Entering context") return self._wrapped.__enter__() def __exit__(self): self._wrapped.__exit__() print("Exited context") Now, if the wrapped contextmanager becomes a PEP 521 one with __suspend__ and __resume__, the Wrapper class is broken, because it does not respect __suspend__ and __resume__. So actually this is a backwards compatiblity issue. 
But if the wrapper is made using inheritance, the problem goes away: class Wrapper(SomeContextManager): def __enter__(self): print("Entering context") return super().__enter__() def __exit__(self): super().__exit__() print("Exited context") Now the wrapper cleanly inherits the new optional __suspend__ and __resume__ from the wrapped context manager type. ??Koos > -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Oct 2 10:45:30 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 2 Oct 2017 07:45:30 -0700 Subject: [Python-Dev] PEP 553 In-Reply-To: References: <1B4494FE-AA40-4FF7-B81A-B0FB14F5F259@python.org> Message-ID: On Sun, Oct 1, 2017 at 11:15 PM, Terry Reedy wrote: > On 10/2/2017 12:44 AM, Guido van Rossum wrote: > > - There's no rationale for the *args, **kwds part of the breakpoint() >> signature. (I vaguely recall someone on the mailing list asking for it but >> it seemed far-fetched at best.) >> > > If IDLE's event-driven GUI debugger were rewritten to run in the user > process, people wanting to debug a tkinter program should be able to pass > in their root, with its mainloop, rather than having the debugger create > its own, as it normally would. Something else could come up. > But if they care so much, they could also use a small wrapper as the sys.breakpointhook that retrieves the root and calls the IDLE debugger with that. Why is adding the root to the breakpoint() call better than that? To me, the main attraction for breakpoint is that there's something I can type quickly and insert at any point in the code. During a debugging session I may try setting it in many different places. If I have to also pass the root each time I type "breakpoint()" that's just an unnecessary detail compared to having it done automatically by the hook. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian at python.org Mon Oct 2 10:48:37 2017 From: christian at python.org (Christian Heimes) Date: Mon, 2 Oct 2017 16:48:37 +0200 Subject: [Python-Dev] Python startup optimization: script vs. service In-Reply-To: <2418C6BB-2AB0-4407-ADE6-ED795DD0B965@gmail.com> References: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org> <2418C6BB-2AB0-4407-ADE6-ED795DD0B965@gmail.com> Message-ID: <213d0382-dfdd-dfc5-2e1e-3b408acac256@python.org> On 2017-10-02 14:05, George King wrote: > I?m new to this issue, but curious: could the long-running server > mitigate lazy loading problems simply by explicitly importing the > deferred modules, e.g. at the top of __main__.py? It would require some > performance tracing or other analysis to figure out what needed to be > imported, but this might be a very easy way to win back response times > for demanding applications. Conversely, small scripts currently have no > recourse. That approach could work, but I think that it is the wrong approach. I'd rather keep Python optimized for long-running processes and introduce a new mode / option to optimize for short-running scripts. Christian From victor.stinner at gmail.com Mon Oct 2 10:53:09 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 2 Oct 2017 16:53:09 +0200 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: References: Message-ID: Please start a new thread on python-dev. It's unrelated to "deterministic pyc files". 
Victor From gwk.lists at gmail.com Mon Oct 2 08:05:27 2017 From: gwk.lists at gmail.com (George King) Date: Mon, 2 Oct 2017 08:05:27 -0400 Subject: [Python-Dev] Python startup optimization: script vs. service In-Reply-To: References: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org> Message-ID: <2418C6BB-2AB0-4407-ADE6-ED795DD0B965@gmail.com> I?m new to this issue, but curious: could the long-running server mitigate lazy loading problems simply by explicitly importing the deferred modules, e.g. at the top of __main__.py? It would require some performance tracing or other analysis to figure out what needed to be imported, but this might be a very easy way to win back response times for demanding applications. Conversely, small scripts currently have no recourse. > On Oct 2, 2017, at 7:10 AM, INADA Naoki wrote: > > Hi. > > My company is using Python for web service. > So I understand what you're worrying. > I'm against fine grained, massive lazy loading too. > > But I think we're careful enough for lazy importing. > > https://github.com/python/cpython/pull/3849 > In this PR, I stop using textwrap entirely, instead of lazy import. > > https://github.com/python/cpython/pull/3796 > In this PR, lazy loading only happens when uuid1 is used. > But uuid1 is very uncommon for nowdays. > > https://github.com/python/cpython/pull/3757 > In this PR, singledispatch is lazy loading types and weakref. > But singledispatch is used as decorator. > So if web application uses singledispatch, it's loaded before preforking. > > https://github.com/python/cpython/pull/1269 > In this PR, there are some lazy imports. > But the number of lazy imports seems small enough. > > I don't think we're going to too aggressive. > > In case of regular expression, we're about starting discussion. > No real changes are made yet. > > For example, tokenize.py has large regular expressions. > But most of web application uses only one of them: linecache.py uses > tokenize.open(), and it uses regular expression for encoding cookie. > (Note that traceback is using linecache. It's very commonly imported.) > > So 90% of time and memory for importing tokenize is just a waste not > only CLI application, but also web applications. > I have not create PR to lazy importing linecache or tokenize, because > I'm worrying about "import them at first traceback". > > I feel Go's habit helps in some cases; "A little copying is better than a little dependency." > (https://go-proverbs.github.io/ ) > Maybe, copying `tokenize.open()` into linecache is better than lazy loading tokenize. > > > Anyway, I completely agree with you; we should careful enough about lazy (importing | compiling). > > Regards, > > On Mon, Oct 2, 2017 at 6:47 PM Christian Heimes > wrote: > Hello python-dev, > > it's great to see that so many developers are working on speeding up > Python's startup. The improvements are going to make Python more > suitable for command line scripts. However I'm worried that some > approaches are going to make other use cases slower and less efficient. > I'm talking about downsides of lazy initialization and deferred imports. > > > For short running command line scripts, lazy initialization of regular > expressions and deferred import of rarely used modules can greatly > reduce startup time and reduce memory usage. > > > For long running processes, deferring imports and initialization can be > a huge performance problem. 
A typical server application should > initialize as much as possible at startup and then signal its partners > that it is ready to serve requests. A deferred import of a module is > going to slow down the first request that happens to require the module. > This is unacceptable for some applications, e.g. Raymond's example of > speed trading. > > It's even worse for forking servers. A forking HTTP server handles each > request in a forked child. Each child process has to compile a lazy > regular expression or important a deferred module over and over. > uWSGI's emperor / vassal mode us a pre-fork model with multiple server > processes to efficiently share memory with copy-on-write semantics. Lazy > imports will make the approach less efficient and slow down forking of > new vassals. > > > TL;DR please refrain from moving imports into functions or implementing > lazy modes, until we have figured out how to satisfy requirements of > both scripts and long running services. We probably need a PEP... > > Christian > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com > -- > Inada Naoki > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/gwk.lists%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian at python.org Mon Oct 2 10:59:08 2017 From: christian at python.org (Christian Heimes) Date: Mon, 2 Oct 2017 16:59:08 +0200 Subject: [Python-Dev] Python startup optimization: script vs. service In-Reply-To: References: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org> Message-ID: <7a7f7c3c-4792-d695-3b91-ebb02da9a980@python.org> On 2017-10-02 15:26, Victor Stinner wrote: > 2017-10-02 13:10 GMT+02:00 INADA Naoki : >> https://github.com/python/cpython/pull/3796 >> In this PR, lazy loading only happens when uuid1 is used. >> But uuid1 is very uncommon for nowdays. > > Antoine Pitrou added a new C extension _uuid which is imported as soon > as uuid(.py) is imported. On Linux at least, the main "overhead" is > still done on "import uuid". But Antoine change optimized a lot > "import uuid" import time! > >> https://github.com/python/cpython/pull/3757 >> In this PR, singledispatch is lazy loading types and weakref. >> But singledispatch is used as decorator. >> So if web application uses singledispatch, it's loaded before preforking. > > While "import module" is fast, maybe we should use sometimes a global > variable to cache the import. > > module = None > def func(): > global module > if module is None: import module > ... > > I'm not sure that it's possible to write an helper for such pattern. I would rather like to see a function in importlib that handles deferred imports: modulename = importlib.deferred_import('modulename') def deferred_import(name): if name in sys.modules: # special case 'None' here return sys.modules[name] else: return ModuleProxy(name) ModuleProxy is a module type subclass that loads the module on demand. Christian From barry at python.org Mon Oct 2 10:59:27 2017 From: barry at python.org (Barry Warsaw) Date: Mon, 2 Oct 2017 10:59:27 -0400 Subject: [Python-Dev] Python startup optimization: script vs. 
service In-Reply-To: <213d0382-dfdd-dfc5-2e1e-3b408acac256@python.org> References: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org> <2418C6BB-2AB0-4407-ADE6-ED795DD0B965@gmail.com> <213d0382-dfdd-dfc5-2e1e-3b408acac256@python.org> Message-ID: <9E4EA9B9-E7AD-4E57-8284-CF4B3D69B5D0@python.org> On Oct 2, 2017, at 10:48, Christian Heimes wrote: > > That approach could work, but I think that it is the wrong approach. I'd > rather keep Python optimized for long-running processes and introduce a > new mode / option to optimize for short-running scripts. What would that look like, how would it be invoked, and how would that change the behavior of the interpreter? -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From victor.stinner at gmail.com Mon Oct 2 11:02:15 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 2 Oct 2017 17:02:15 +0200 Subject: [Python-Dev] Python startup optimization: script vs. service In-Reply-To: <213d0382-dfdd-dfc5-2e1e-3b408acac256@python.org> References: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org> <2418C6BB-2AB0-4407-ADE6-ED795DD0B965@gmail.com> <213d0382-dfdd-dfc5-2e1e-3b408acac256@python.org> Message-ID: 2017-10-02 16:48 GMT+02:00 Christian Heimes : > That approach could work, but I think that it is the wrong approach. I'd > rather keep Python optimized for long-running processes and introduce a > new mode / option to optimize for short-running scripts. "Filling caches on demand" is an old pattern. I don't think that we are doing anything new here. If we add an opt-in option, I would prefer to have an option to explicitly "fill caches", rather than the opposite. I know another example of "lazy cache": base64.b85encode() fills a cache at the first call. Victor From gwk.lists at gmail.com Mon Oct 2 11:12:48 2017 From: gwk.lists at gmail.com (George King) Date: Mon, 2 Oct 2017 11:12:48 -0400 Subject: [Python-Dev] Python startup optimization: script vs. service In-Reply-To: <213d0382-dfdd-dfc5-2e1e-3b408acac256@python.org> References: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org> <2418C6BB-2AB0-4407-ADE6-ED795DD0B965@gmail.com> <213d0382-dfdd-dfc5-2e1e-3b408acac256@python.org> Message-ID: <14C2AE09-7E72-4A8A-951B-443642549436@gmail.com> Fair, but can you justify your preference? From my perspective, I write many small command line scripts, and all of them would benefit from faster load time. Am I going to have to stick mode-setting incantations at the top of every single one? I occasionally write simple servers, and none of them would suffer for having the first request respond slightly slowly. In many cases they have slow first response times anyway due to file system warmup, etc. > On Oct 2, 2017, at 10:48 AM, Christian Heimes wrote: > > On 2017-10-02 14:05, George King wrote: >> I?m new to this issue, but curious: could the long-running server >> mitigate lazy loading problems simply by explicitly importing the >> deferred modules, e.g. at the top of __main__.py? It would require some >> performance tracing or other analysis to figure out what needed to be >> imported, but this might be a very easy way to win back response times >> for demanding applications. Conversely, small scripts currently have no >> recourse. > > That approach could work, but I think that it is the wrong approach. 
I'd
> rather keep Python optimized for long-running processes and introduce a
> new mode / option to optimize for short-running scripts.
>
> Christian

From barry at python.org Mon Oct 2 11:15:35 2017
From: barry at python.org (Barry Warsaw)
Date: Mon, 2 Oct 2017 11:15:35 -0400
Subject: [Python-Dev] Investigating time for `import requests`
In-Reply-To: 
References: 
Message-ID: <41524B9D-BB91-4866-BC7F-F4E31A879E4B@python.org>

On Oct 1, 2017, at 22:34, Nathaniel Smith wrote:
>
> In principle re.compile() itself could be made lazy -- return a
> regular expression object that just holds the string, and then compiles
> and caches it the first time it's used. Might be tricky to do in a
> backwards-compatible way if it moves detection of invalid regexes
> from compile time to use time, but it could be an opt-in flag.

I already tried that experiment.  1) there are tricky corner cases; 2)
nobody liked the change in semantics when re.compile() was made lazy.

https://bugs.python.org/issue31580
https://github.com/python/cpython/pull/3755

I think there are opportunities for an explicit API for lazy compilation
of regular expressions, but I'm skeptical of the adoption curve making it
worthwhile.  But maybe I'm wrong!

-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL: 

From storchaka at gmail.com Mon Oct 2 11:38:42 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 2 Oct 2017 18:38:42 +0300
Subject: [Python-Dev] Python startup optimization: script vs. service
In-Reply-To: 
References: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org>
Message-ID: 

02.10.17 16:26, Victor Stinner wrote:
> While "import module" is fast, maybe we should sometimes use a global
> variable to cache the import.
>
> module = None
> def func():
>     global module
>     if module is None: import module
>     ...

I optimized "import module", and I think it can be optimized even more,
up to making the above trick unnecessary. Currently there is an overhead
of checking that the module found in sys.modules is not imported right
now.

From greg at krypto.org Mon Oct 2 12:25:00 2017
From: greg at krypto.org (Gregory P. Smith)
Date: Mon, 02 Oct 2017 16:25:00 +0000
Subject: [Python-Dev] Python startup optimization: script vs. service
In-Reply-To: 
References: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org>
 <2418C6BB-2AB0-4407-ADE6-ED795DD0B965@gmail.com>
 <213d0382-dfdd-dfc5-2e1e-3b408acac256@python.org>
Message-ID: 

On Mon, Oct 2, 2017 at 8:03 AM Victor Stinner wrote:

> 2017-10-02 16:48 GMT+02:00 Christian Heimes :
> > That approach could work, but I think that it is the wrong approach. I'd
> > rather keep Python optimized for long-running processes and introduce a
> > new mode / option to optimize for short-running scripts.
>
> "Filling caches on demand" is an old pattern. I don't think that we
> are doing anything new here.
>
> If we add an opt-in option, I would prefer to have an option to
> explicitly "fill caches", rather than the opposite.
>
+1 the common case benefits from the laziness. The much less common piece
of code that needs to pre-initialize as much as possible to avoid work
happening at an inopportune future time (prior to forking, while handling
latency sensitive real time requests yet still being written in CPython,
etc.) knows its needs and can ask for it.

-gps
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From christian at python.org Mon Oct 2 13:02:42 2017
From: christian at python.org (Christian Heimes)
Date: Mon, 2 Oct 2017 19:02:42 +0200
Subject: [Python-Dev] Python startup optimization: script vs. service
In-Reply-To: <9E4EA9B9-E7AD-4E57-8284-CF4B3D69B5D0@python.org>
References: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org>
 <2418C6BB-2AB0-4407-ADE6-ED795DD0B965@gmail.com>
 <213d0382-dfdd-dfc5-2e1e-3b408acac256@python.org>
 <9E4EA9B9-E7AD-4E57-8284-CF4B3D69B5D0@python.org>
Message-ID: 

On 2017-10-02 16:59, Barry Warsaw wrote:
> On Oct 2, 2017, at 10:48, Christian Heimes wrote:
>>
>> That approach could work, but I think that it is the wrong approach. I'd
>> rather keep Python optimized for long-running processes and introduce a
>> new mode / option to optimize for short-running scripts.
>
> What would that look like, how would it be invoked, and how would that
> change the behavior of the interpreter?

I haven't given it much thought yet. Here are just some wild ideas:

- add '-l' command line option (l for lazy)
- in lazy mode, delay some slow operations (re compile, enum, ...)
- delay some imports in lazy mode, e.g. with a deferred import proxy

Christian

From k7hoven at gmail.com Mon Oct 2 13:13:54 2017
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Mon, 2 Oct 2017 20:13:54 +0300
Subject: [Python-Dev] Inheritance vs composition in backcompat (PEP521)
In-Reply-To: 
References: 
Message-ID: 

Hi all,

It was suggested that I start a new thread, because the other thread
drifted away from its original topic. So here, in case someone is
interested:

On Oct 2, 2017 17:03, "Koos Zevenhoven" wrote:

On Mon, Oct 2, 2017 at 6:42 AM, Guido van Rossum wrote:

On Sun, Oct 1, 2017 at 1:52 PM, Koos Zevenhoven wrote:

On Oct 1, 2017 19:26, "Guido van Rossum" wrote:

Your PEP is currently incomplete. If you don't finish it, it is not even a
contender. But TBH it's not my favorite anyway, so you could also just
withdraw it.

I can withdraw it if you ask me to, but I don't want to withdraw it
without any reason. I haven't changed my mind about the big picture.
OTOH, PEP 521 is elegant and could be used to implement PEP 555, but 521
is almost certainly less performant and has some problems regarding
context manager wrappers that use composition instead of inheritance.

It is my understanding that PEP 521 (which proposes to add optional
__suspend__ and __resume__ methods to the context manager protocol, to be
called whenever a frame is suspended or resumed inside a `with` block) is
no longer a contender because it would be way too slow. I haven't read it
recently or thought about it, so I don't know what the second issue you
mention is about (though it's presumably about the `yield` in a context
manager implemented using a generator decorated with
`@contextlib.contextmanager`).

Well, it's not completely unrelated to that. The problem I'm talking
about is perhaps most easily seen from a simple context manager wrapper
that uses composition instead of inheritance:

class Wrapper:
    def __init__(self):
        self._wrapped = SomeContextManager()

    def __enter__(self):
        print("Entering context")
        return self._wrapped.__enter__()

    def __exit__(self, *exc_info):
        self._wrapped.__exit__(*exc_info)
        print("Exited context")

Now, if the wrapped context manager becomes a PEP 521 one with __suspend__
and __resume__, the Wrapper class is broken, because it does not respect
__suspend__ and __resume__. So actually this is a backwards compatibility
issue.
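To make that concrete, here is a small self-contained sketch (all names
here are hypothetical; under PEP 521 the `with` machinery would look for
the optional hooks on the object used in the with statement, i.e. the
wrapper, not on the wrapped CM):

class UpgradedCM:                       # a CM that grew the PEP 521 hooks
    def __enter__(self):
        return self
    def __exit__(self, *exc_info):
        return False
    def __suspend__(self):
        print("suspended")
    def __resume__(self):
        print("resumed")

class ComposedWrapper:                  # composition-based wrapper, as above
    def __init__(self):
        self._wrapped = UpgradedCM()
    def __enter__(self):
        return self._wrapped.__enter__()
    def __exit__(self, *exc_info):
        return self._wrapped.__exit__(*exc_info)

print(hasattr(ComposedWrapper(), '__suspend__'))           # False: the hooks are hidden
print(hasattr(ComposedWrapper()._wrapped, '__suspend__'))  # True: only the inner CM has them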
But if the wrapper is made using inheritance, the problem goes away:

class Wrapper(SomeContextManager):
    def __enter__(self):
        print("Entering context")
        return super().__enter__()

    def __exit__(self, *exc_info):
        super().__exit__(*exc_info)
        print("Exited context")

Now the wrapper cleanly inherits the new optional __suspend__ and
__resume__ from the wrapped context manager type.

-- Koos

-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From brett at python.org Mon Oct 2 13:29:50 2017
From: brett at python.org (Brett Cannon)
Date: Mon, 02 Oct 2017 17:29:50 +0000
Subject: [Python-Dev] Python startup optimization: script vs. service
In-Reply-To: <7a7f7c3c-4792-d695-3b91-ebb02da9a980@python.org>
References: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org>
 <7a7f7c3c-4792-d695-3b91-ebb02da9a980@python.org>
Message-ID: 

On Mon, 2 Oct 2017 at 08:00 Christian Heimes wrote:

> On 2017-10-02 15:26, Victor Stinner wrote:
> > 2017-10-02 13:10 GMT+02:00 INADA Naoki :
> >> https://github.com/python/cpython/pull/3796
> >> In this PR, lazy loading only happens when uuid1 is used.
> >> But uuid1 is very uncommon nowadays.
> >
> > Antoine Pitrou added a new C extension _uuid which is imported as soon
> > as uuid(.py) is imported. On Linux at least, the main "overhead" is
> > still done on "import uuid". But Antoine's change optimized the
> > "import uuid" import time a lot!
> >
> >> https://github.com/python/cpython/pull/3757
> >> In this PR, singledispatch is lazy loading types and weakref.
> >> But singledispatch is used as a decorator.
> >> So if a web application uses singledispatch, it's loaded before
> preforking.
> >
> > While "import module" is fast, maybe we should sometimes use a global
> > variable to cache the import.
> >
> > module = None
> > def func():
> >     global module
> >     if module is None: import module
> >     ...
> >
> > I'm not sure that it's possible to write a helper for such a pattern.
>
> I would rather like to see a function in importlib that handles deferred
> imports:
>
> modulename = importlib.deferred_import('modulename')
>
> def deferred_import(name):
>     if name in sys.modules:
>         # special case 'None' here
>         return sys.modules[name]
>     else:
>         return ModuleProxy(name)
>
> ModuleProxy is a module type subclass that loads the module on demand.
>

My current design for an opt-in lazy importing setup includes an
explicit function for importlib that's mainly targeted for the stdlib
and its startup module needs, but could be used by others:
https://notebooks.azure.com/Brett/libraries/di2Btqj7zSI/html/Lazy%20importing.ipynb
.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tjreedy at udel.edu Mon Oct 2 14:01:35 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 2 Oct 2017 14:01:35 -0400
Subject: [Python-Dev] Investigating time for `import requests`
In-Reply-To: 
References: 
Message-ID: 

On 10/2/2017 4:57 AM, Paul Moore wrote:

> In practice, I don't think the fact that re.search() et al cache the
> compiled expressions is that well known (it's mentioned in the
> re.compile docs, but not in the re.search docs)

We could add redundant mentions in the functions ;-).

> and so people often compile up front because they think it helps,
> rather than actually measuring to check.
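For what it's worth, here is a rough way to actually measure it (an
illustrative sketch only; absolute numbers vary by machine and pattern):

import timeit

setup = "import re; pat = re.compile(r'\\d+'); s = 'abc123def'"
# module-level function: looks the compiled pattern up in re's internal
# cache on every call
print(timeit.timeit("re.search(r'\\d+', s)", setup=setup))
# precompiled pattern object: no cache lookup per call
print(timeit.timeit("pat.search(s)", setup=setup))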
-- Terry Jan Reedy From barry at python.org Mon Oct 2 14:02:28 2017 From: barry at python.org (Barry Warsaw) Date: Mon, 2 Oct 2017 14:02:28 -0400 Subject: [Python-Dev] PEP 553 In-Reply-To: References: <1B4494FE-AA40-4FF7-B81A-B0FB14F5F259@python.org> Message-ID: Thanks for the review Guido! The PEP and implementation have been updated to address your comments, but let me briefly respond here. > On Oct 2, 2017, at 00:44, Guido van Rossum wrote: > - There's a comma instead of a period at the end of the 4th bullet in the Rationale: "Breaking the idiom up into two lines further complicates the use of the debugger,? . Thanks, fixed. > Also I don't understand how this complicates use I?ve addressed that with some additional wording in the PEP. Basically, it?s my contention that splitting it up on two lines introduces more opportunity for mistakes. > TBH the biggest argument (to me) is that I simply don't know *how* I would enter some IDE's debugger programmatically. I think it should also be pointed out that even if an IDE has a way to specify conditional breakpoints, the UI may be such that it's easier to just add the check to the code -- and then the breakpoint() option is much more attractive than having to look up how it's done in your particular IDE (especially since this is not all that common). This is a really excellent point! I?ve reworked that section of the PEP to make this clear. > - There's no rationale for the *args, **kwds part of the breakpoint() signature. (I vaguely recall someone on the mailing list asking for it but it seemed far-fetched at best.) I?ve added some rationale. The idea comes from optional `header` argument to IPython?s programmatic debugger API. I liked that enough to add it to pdb.set_trace() for 3.7. IPython accepts other optional arguments, so I think we do want to allow those to be passed through the call chain. I expect any debugger?s advertised entry point to make these optional, so `breakpoint()` will always just work. > - The explanation of the relationship between sys.breakpoint() and sys.__breakpointhook__ was unclear to me I think you understand it correctly, and I?ve hopefully clarified that in the PEP now, so you wouldn?t have to read the __displayhook__ (or __excepthook__) docs to understand how it works. > - Some pseudo-code would be nice. Great idea; added that to the PEP (pretty close to what you have, but with the warnings handling, etc.) > I think something like `os.environ['PYTHONBREAKPOINT'] = 'foo.bar.baz'; breakpoint()` should result in foo.bar.baz() being imported and called, right? Correct. Clarified in the PEP now. > - I'm not quite sure what sort of fast-tracking for PYTHONBREAKPOINT=0 you had in mind beyond putting it first in the code above. That?s pretty close to it. Clarified. > - Did you get confirmation from other debuggers? E.g. does it work for IDLE, Wing IDE, PyCharm, and VS 2015? From some of them, yes. Terry confirmed for IDLE, and I posted a statement in favor of the PEP from the PyCharm folks. I?m pretty sure Steve confirmed that this would be useful for VS, and I haven?t heard from the Wing folks. > - I'm not sure what the point would be of making a call to breakpoint() a special opcode (it seems a lot of work for something of this nature). ISTM that if some IDE modifies bytecode it can do whatever it well please without a PEP. I?m strongly against including anything related to a new bytecode to PEP 553; they?re just IMHO orthogonal issues, and I?ve removed this as an open issue for 553. 
I understand why some debugger developers want it though. There was a talk at Pycon 2017 about what PyCharm does. They have to rewrite the bytecode to insert a call to a ?trampoline function? which in many ways is the equivalent of breakpoint() and sys.breakpointhook(). I.e. it?s a small function that sets up and calls a more complicated function to do the actual debugging. IIRC, they open up space for 4 bytecodes, with all the fixups that implies. The idea was that there could be a single bytecode that essentially calls builtin breakpoint(). Steve indicated that this might also be useful for VS. There?s a fair bit that would have to be fleshed out to make this idea real, but as I say, I think it shouldn?t have anything to do with PEP 553, except that it could probably build on the APIs we?re adding here. > - I don't see the point of calling `pdb.pm()` at breakpoint time. But it could be done using the PEP with `import pdb; sys.breakpointhook = pdb.pm` right? So this hardly deserves an open issue. Correct, and I?ve removed this open issue. > - I haven't read the actual implementation in the PR. A PEP should not depend on the actual proposed implementation for disambiguation of its specification (hence my proposal to add pseudo-code to the PEP). > > That's what I have! Cool, that?s very helpful, thanks! -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From christian at python.org Mon Oct 2 14:18:35 2017 From: christian at python.org (Christian Heimes) Date: Mon, 2 Oct 2017 20:18:35 +0200 Subject: [Python-Dev] Python startup optimization: script vs. service In-Reply-To: References: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org> <7a7f7c3c-4792-d695-3b91-ebb02da9a980@python.org> Message-ID: On 2017-10-02 19:29, Brett Cannon wrote: > My current design for an opt-in lazy importing setup includes an > explicit function for importlib that's mainly targeted for the stdlib > and it's startup module needs, but could be used by others: > https://notebooks.azure.com/Brett/libraries/di2Btqj7zSI/html/Lazy%20importing.ipynb Awesome, thanks Brett! :) Small nit pick, you need to add a special case for blocked imports. From brett at python.org Mon Oct 2 14:56:15 2017 From: brett at python.org (Brett Cannon) Date: Mon, 02 Oct 2017 18:56:15 +0000 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: References: Message-ID: On Mon, 2 Oct 2017 at 02:43 Raymond Hettinger wrote: > > > On Oct 2, 2017, at 12:39 AM, Nick Coghlan wrote: > > > > "What requests uses" can identify a useful set of > > avoidable imports. A Flask "Hello world" app could likely provide > > another such sample, as could some example data analysis notebooks). > > Right. It is probably worthwhile to identify which parts of the library > are typically imported but are not ever used. And likewise, identify a > core set of commonly used tools that are going to be almost unavoidable in > sufficiently interesting applications (like using requests to access a REST > API, running a micro-webframework, or invoking mercurial). > > Presumably, if any of this is going to make a difference to end users, we > need to see if there is any avoidable work that takes a significant > fraction of the total time from invocation through the point where the user > first sees meaningful output. 
That would include loading from nonvolatile > storage, executing the various imports, and doing the actual application. > > I don't expect to find anything that would help users of Django, Flask, > and Bottle since those are typically long-running apps where we value > response time more than startup time. > > For scripts using the requests module, there will be some fruit because > not everything that is imported is used. However, that may not be > significant because scripts using requests tend to be I/O bound. In the > timings below, 6% of the running time is used to load and run python.exe, > another 16% is used to import requests, and the remaining 78% is devoted to > the actual task of running a simple REST API query. It would be interesting > to see how much of the 16% could be avoided without major alterations to > requests, to urllib3, and to the standard library. > > For mercurial, "hg log" or "hg commit" will likely be instructive about > what portion of the imports actually get used. A push or pull will likely > be I/O bound so those commands are less informative. > So Mercurial specifically is an odd duck because they already do lazy importing (in fact they are using the lazy loading support from importlib). In terms of all of this discussion of tweaking import to be lazy, I think the best approach would be providing an opt-in solution that CLI tools can turn on ASAP while the default stays eager. That way everyone gets what they want while the stdlib provides a shared solution that's maintained alongside import itself to make sure it functions appropriately. -Brett > > > Raymond > > > --------- Quick timing for a minimal script using the requests module > ----------- > > $ cat > demo_github_rest_api.py > import requests > info = requests.get('https://api.github.com/users/raymondh').json() > print('%(name)s works at %(company)s. Contact at %(email)s' % info) > > $ time python3.6 demo_github_rest_api.py > Raymond Hettinger works at SauceLabs. Contact at None > > real 0m0.561s > user 0m0.134s > sys 0m0.018s > > $ time python3.6 -c "import requests" > > real 0m0.125s > user 0m0.104s > sys 0m0.014s > > $ time python3.6 -c "" > > real 0m0.036s > user 0m0.024s > sys 0m0.005s > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Mon Oct 2 14:58:30 2017 From: brett at python.org (Brett Cannon) Date: Mon, 02 Oct 2017 18:58:30 +0000 Subject: [Python-Dev] Python startup optimization: script vs. service In-Reply-To: References: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org> <7a7f7c3c-4792-d695-3b91-ebb02da9a980@python.org> Message-ID: On Mon, 2 Oct 2017 at 11:19 Christian Heimes wrote: > On 2017-10-02 19:29, Brett Cannon wrote: > > My current design for an opt-in lazy importing setup includes an > > explicit function for importlib that's mainly targeted for the stdlib > > and it's startup module needs, but could be used by others: > > > https://notebooks.azure.com/Brett/libraries/di2Btqj7zSI/html/Lazy%20importing.ipynb > > Awesome, thanks Brett! :) > Well, I have to find the time to try and get this in for Python 3.7 (I'm currently working with Barry on a pkg_resources replacement so there's a queue :) . > > Small nit pick, you need to add a special case for blocked imports. 
> Added a note at the end of the notebook about needing to make sure that's properly supported. -Brett > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Mon Oct 2 15:01:18 2017 From: brett at python.org (Brett Cannon) Date: Mon, 02 Oct 2017 19:01:18 +0000 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: <1506822394.2443337.1123772232.2C322F02@webmail.messagingengine.com> References: <1506701836.3115188.1122665216.743447DD@webmail.messagingengine.com> <1506822394.2443337.1123772232.2C322F02@webmail.messagingengine.com> Message-ID: On Sat, 30 Sep 2017 at 18:46 Benjamin Peterson wrote: > What do you mean by bytecode-specific APIs? The internal importlib ones? > There's that, but more specifically py_compile and compileall. -Brett > > On Fri, Sep 29, 2017, at 09:38, Brett Cannon wrote: > > BTW, if you find the bytecode-specific APIs are sub-par while trying to > > update them, let me know as I have been toying with cleaning them up and > > centralizing them under importlib for a while and just never gotten > > around > > to sitting down and coming up with a better design that warranted putting > > the time into it. :) > > > > On Fri, 29 Sep 2017 at 09:17 Benjamin Peterson > > wrote: > > > > > Thanks, Guido and everyone who submitted feedback! > > > > > > I guess I know what I'll be doing this weekend. > > > > > > > > > On Fri, Sep 29, 2017, at 08:18, Guido van Rossum wrote: > > > > Let me try that again. > > > > > > > > There have been no further comments. PEP 552 is now accepted. > > > > > > > > Congrats, Benjamin! Go ahead and send your implementation for > > > > review.Oops. > > > > Let me try that again. > > > > > > > > PS. PEP 550 is still unaccepted, awaiting a new revision from Yury > and > > > > Elvis. > > > > > > > > -- > > > > --Guido van Rossum (python.org/~guido ) > > > > _______________________________________________ > > > > Python-Dev mailing list > > > > Python-Dev at python.org > > > > https://mail.python.org/mailman/listinfo/python-dev > > > > Unsubscribe: > > > > > https://mail.python.org/mailman/options/python-dev/benjamin%40python.org > > > > > > _______________________________________________ > > > Python-Dev mailing list > > > Python-Dev at python.org > > > https://mail.python.org/mailman/listinfo/python-dev > > > Unsubscribe: > > > https://mail.python.org/mailman/options/python-dev/brett%40python.org > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Mon Oct 2 15:29:15 2017 From: barry at python.org (Barry Warsaw) Date: Mon, 2 Oct 2017 15:29:15 -0400 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: References: Message-ID: <641723C8-AE9E-44F1-94AE-68A716980318@python.org> On Oct 2, 2017, at 14:56, Brett Cannon wrote: > So Mercurial specifically is an odd duck because they already do lazy importing (in fact they are using the lazy loading support from importlib). In terms of all of this discussion of tweaking import to be lazy, I think the best approach would be providing an opt-in solution that CLI tools can turn on ASAP while the default stays eager. 
That way everyone gets what they want while the stdlib provides a shared solution that's maintained alongside import itself to make sure it functions appropriately. The problem I think is that to get full benefit of lazy loading, it has to be turned on globally for bare ?import? statements. A typical application has tons of dependencies and all those libraries are also doing module global imports, so unless lazy loading somehow covers them, it?ll be an incomplete gain. But of course it?ll take forever for all your dependencies to use whatever new API we come up with, and if it?s not as convenient to write as ?import foo? then I suspect it won?t much catch on anyway. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From tjreedy at udel.edu Mon Oct 2 16:10:06 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 2 Oct 2017 16:10:06 -0400 Subject: [Python-Dev] PEP 553 In-Reply-To: References: <1B4494FE-AA40-4FF7-B81A-B0FB14F5F259@python.org> Message-ID: On 10/2/2017 10:45 AM, Guido van Rossum wrote: > On Sun, Oct 1, 2017 at 11:15 PM, Terry Reedy > wrote: > > On 10/2/2017 12:44 AM, Guido van Rossum wrote: > > - There's no rationale for the *args, **kwds part of the > breakpoint() signature. (I vaguely recall someone on the mailing > list asking for it but it seemed far-fetched at best.) > > > If IDLE's event-driven GUI debugger were rewritten to run in the > user process, people wanting to debug a tkinter program should be > able to pass in their root, with its mainloop, rather than having > the debugger create its own, as it normally would.? Something else > could come up. > > > But if they care so much, they could also use a small wrapper as the > sys.breakpointhook that retrieves the root and calls the IDLE debugger > with that. 'They' include beginners that need the simplicity of breakpoint() the most. > Why is adding the root to the breakpoint() call better than > that? To me, the main attraction for breakpoint is that there's > something I can type quickly and insert at any point in the code. I agree. > During a debugging session > I may try setting it in many different places. If I > have to also pass the root each time I type "breakpoint()" that's just > an unnecessary detail compared to having it done automatically by the hook. Even though pdb.set_trace re-initializes each call, idb.whatever should *not*. So it should set something that can be queried. My idea was that a person should pass root only on the first call. But that founders on the fact that 'first call' may not be deterministic. if cond: breakpoint() breakpoint() Besides which, someone might insert breakpoint() before creating a root. So I will try instead initializing with iroot = tk._default_root if tk._default_root else tk.Tk() and stick with iroot.update() and avoid i.mainloop() A revised tk-based debugger, closer to pdb than it is now, will require some experimentation. I would like to be able to use it to debug IDLE run from a command line, and that will be a fairly severe test of compatibility with a tkinter application. You could approve breakpoint() without args now and add them if and when there were more convincing need. 
-- Terry Jan Reedy From solipsis at pitrou.net Mon Oct 2 16:55:22 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 2 Oct 2017 22:55:22 +0200 Subject: [Python-Dev] Investigating time for `import requests` References: Message-ID: <20171002225522.07e2b34a@fsol> On Mon, 02 Oct 2017 18:56:15 +0000 Brett Cannon wrote: > > So Mercurial specifically is an odd duck because they already do lazy > importing (in fact they are using the lazy loading support from importlib). Do they? I was under the impression they had their own home-baked, GPL-licensed, lazy-loading __import__ re-implementation. At least they used to, perhaps they switched to something else (probably still GPL-licensed, though). Regards Antoine. From solipsis at pitrou.net Mon Oct 2 16:57:02 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 2 Oct 2017 22:57:02 +0200 Subject: [Python-Dev] Investigating time for `import requests` References: <41524B9D-BB91-4866-BC7F-F4E31A879E4B@python.org> Message-ID: <20171002225702.785a53d4@fsol> On Mon, 2 Oct 2017 11:15:35 -0400 Barry Warsaw wrote: > > I think there are opportunities for an explicit API for lazy compilation of regular expressions, but I?m skeptical of the adoption curve making it worthwhile. But maybe I?m wrong! We already have two caching schemes available in the re module: one explicit and eager with re.compile(), one implicit and lazy with re.search() and friends. I doubt we really need a third one :-) Regards Antoine. From guido at python.org Mon Oct 2 17:02:34 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 2 Oct 2017 14:02:34 -0700 Subject: [Python-Dev] Inheritance vs composition in backcompat (PEP521) In-Reply-To: References: Message-ID: On Mon, Oct 2, 2017 at 10:13 AM, Koos Zevenhoven wrote: > Hi all, It was suggested that I start a new thread, because the other > thread drifted away from its original topic. So here, in case someone is > interested: > > On Oct 2, 2017 17:03, "Koos Zevenhoven wrote: > > On Mon, Oct 2, 2017 at 6:42 AM, Guido van Rossum wrote: > > On Sun, Oct 1, 2017 at 1:52 PM, Koos Zevenhoven wrote: > > On Oct 1, 2017 19:26, "Guido van Rossum" wrote: > > Your PEP is currently incomplete. If you don't finish it, it is not even a > contender. But TBH it's not my favorite anyway, so you could also just > withdraw it. > > > I can withdraw it if you ask me to, but I don't want to withdraw it > without any reason. I haven't changed my mind about the big picture. OTOH, > PEP 521 is elegant and could be used to implement PEP 555, but 521 is > almost certainly less performant and has some problems regarding context > manager wrappers that use composition instead of inheritance. > > > It is my understanding that PEP 521 (which proposes to add optional > __suspend__ and __resume__ methods to the context manager protocol, to be > called whenever a frame is suspended or resumed inside a `with` block) is > no longer a contender because it would be way too slow. I haven't read it > recently or thought about it, so I don't know what the second issue you > mention is about (though it's presumably about the `yield` in a context > manager implemented using a generator decorated with > `@contextlib.contextmanager`). > > > ?Well, it's not completely unrelated to that. 
The problem I'm talking > about is perhaps most easily seen from a simple context manager wrapper > that uses composition instead of inheritance: > > class Wrapper: > def __init__(self): > self._wrapped = SomeContextManager() > > def __enter__(self): > print("Entering context") > return self._wrapped.__enter__() > > def __exit__(self): > self._wrapped.__exit__() > print("Exited context") > > > Now, if the wrapped contextmanager becomes a PEP 521 one with __suspend__ > and __resume__, the Wrapper class is broken, because it does not respect > __suspend__ and __resume__. So actually this is a backwards compatiblity > issue. > > Why is it backwards incompatible? I'd think that without PEP 521 it would be broken in exactly the same way because there's no __suspend__/__resume__ at all. > But if the wrapper is made using inheritance, the problem goes away: > > > class Wrapper(SomeContextManager): > def __enter__(self): > print("Entering context") > return super().__enter__() > > def __exit__(self): > super().__exit__() > print("Exited context") > > > Now the wrapper cleanly inherits the new optional __suspend__ and > __resume__ from the wrapped context manager type. > > In any case this is completely academic because PEP 521 is not going to happen. Nathaniel himself has said so (I think in the context of discussing PEP 550). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Oct 2 17:36:59 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 2 Oct 2017 14:36:59 -0700 Subject: [Python-Dev] PEP 553 In-Reply-To: References: <1B4494FE-AA40-4FF7-B81A-B0FB14F5F259@python.org> Message-ID: On Mon, Oct 2, 2017 at 11:02 AM, Barry Warsaw wrote: > Thanks for the review Guido! The PEP and implementation have been updated > to address your comments, but let me briefly respond here. > > > On Oct 2, 2017, at 00:44, Guido van Rossum wrote: > > > - There's a comma instead of a period at the end of the 4th bullet in > the Rationale: "Breaking the idiom up into two lines further complicates > the use of the debugger,? . > > Thanks, fixed. > > > Also I don't understand how this complicates use > > I?ve addressed that with some additional wording in the PEP. Basically, > it?s my contention that splitting it up on two lines introduces more > opportunity for mistakes. > > > TBH the biggest argument (to me) is that I simply don't know *how* I > would enter some IDE's debugger programmatically. I think it should also be > pointed out that even if an IDE has a way to specify conditional > breakpoints, the UI may be such that it's easier to just add the check to > the code -- and then the breakpoint() option is much more attractive than > having to look up how it's done in your particular IDE (especially since > this is not all that common). > > This is a really excellent point! I?ve reworked that section of the PEP > to make this clear. > > > - There's no rationale for the *args, **kwds part of the breakpoint() > signature. (I vaguely recall someone on the mailing list asking for it but > it seemed far-fetched at best.) > > I?ve added some rationale. The idea comes from optional `header` argument > to IPython?s programmatic debugger API. I liked that enough to add it to > pdb.set_trace() for 3.7. IPython accepts other optional arguments, so I > think we do want to allow those to be passed through the call chain. 
I > expect any debugger?s advertised entry point to make these optional, so > `breakpoint()` will always just work. > > > - The explanation of the relationship between sys.breakpoint() and > sys.__breakpointhook__ was unclear to me > > I think you understand it correctly, and I?ve hopefully clarified that in > the PEP now, so you wouldn?t have to read the __displayhook__ (or > __excepthook__) docs to understand how it works. > > > - Some pseudo-code would be nice. > > Great idea; added that to the PEP (pretty close to what you have, but with > the warnings handling, etc.) > > > I think something like `os.environ['PYTHONBREAKPOINT'] = 'foo.bar.baz'; > breakpoint()` should result in foo.bar.baz() being imported and called, > right? > > Correct. Clarified in the PEP now. > > > - I'm not quite sure what sort of fast-tracking for PYTHONBREAKPOINT=0 > you had in mind beyond putting it first in the code above. > > That?s pretty close to it. Clarified. > > > - Did you get confirmation from other debuggers? E.g. does it work for > IDLE, Wing IDE, PyCharm, and VS 2015? > > From some of them, yes. Terry confirmed for IDLE, and I posted a > statement in favor of the PEP from the PyCharm folks. I?m pretty sure > Steve confirmed that this would be useful for VS, and I haven?t heard from > the Wing folks. > > > - I'm not sure what the point would be of making a call to breakpoint() > a special opcode (it seems a lot of work for something of this nature). > ISTM that if some IDE modifies bytecode it can do whatever it well please > without a PEP. > > I?m strongly against including anything related to a new bytecode to PEP > 553; they?re just IMHO orthogonal issues, and I?ve removed this as an open > issue for 553. > > I understand why some debugger developers want it though. There was a > talk at Pycon 2017 about what PyCharm does. They have to rewrite the > bytecode to insert a call to a ?trampoline function? which in many ways is > the equivalent of breakpoint() and sys.breakpointhook(). I.e. it?s a small > function that sets up and calls a more complicated function to do the > actual debugging. IIRC, they open up space for 4 bytecodes, with all the > fixups that implies. The idea was that there could be a single bytecode > that essentially calls builtin breakpoint(). Steve indicated that this > might also be useful for VS. > > There?s a fair bit that would have to be fleshed out to make this idea > real, but as I say, I think it shouldn?t have anything to do with PEP 553, > except that it could probably build on the APIs we?re adding here. > > > - I don't see the point of calling `pdb.pm()` at breakpoint time. But > it could be done using the PEP with `import pdb; sys.breakpointhook = > pdb.pm` right? So this hardly deserves an open issue. > > Correct, and I?ve removed this open issue. > > > - I haven't read the actual implementation in the PR. A PEP should not > depend on the actual proposed implementation for disambiguation of its > specification (hence my proposal to add pseudo-code to the PEP). > > > > That's what I have! > > Cool, that?s very helpful, thanks! > I've seen your updates and it is now acceptable, except for *one* nit: in builtins.breakpoint() the pseudo code raises RuntimeError if sys.breakpointhook is missing or None. OTOH sys.breakpointhook() just issues a RuntimeWarning when something's wrong with the hook. Maybe builtins.breakpoint() should also just warn if it can't find the hook? 
Setting `sys.breakpointhook = None` might be the simplest way to programmatically disable breakpoints. Why not allow it? To Terry: Barry has given another excellent argument for passing through *args, **kwds so I remove my objection to it, regardless of what I think of your argument about the IDLE debugger and root windows. (It's been too long since I used Tkinter, so I don't trust my intuition there much anyways.) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Mon Oct 2 17:52:21 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 3 Oct 2017 00:52:21 +0300 Subject: [Python-Dev] Inheritance vs composition in backcompat (PEP521) In-Reply-To: References: Message-ID: On Oct 3, 2017 00:02, "Guido van Rossum" wrote: On Mon, Oct 2, 2017 at 10:13 AM, Koos Zevenhoven wrote: > Hi all, It was suggested that I start a new thread, because the other > thread drifted away from its original topic. So here, in case someone is > interested: > > On Oct 2, 2017 17:03, "Koos Zevenhoven wrote: > > On Mon, Oct 2, 2017 at 6:42 AM, Guido van Rossum wrote: > > On Sun, Oct 1, 2017 at 1:52 PM, Koos Zevenhoven wrote: > > On Oct 1, 2017 19:26, "Guido van Rossum" wrote: > > Your PEP is currently incomplete. If you don't finish it, it is not even a > contender. But TBH it's not my favorite anyway, so you could also just > withdraw it. > > > I can withdraw it if you ask me to, but I don't want to withdraw it > without any reason. I haven't changed my mind about the big picture. OTOH, > PEP 521 is elegant and could be used to implement PEP 555, but 521 is > almost certainly less performant and has some problems regarding context > manager wrappers that use composition instead of inheritance. > > > It is my understanding that PEP 521 (which proposes to add optional > __suspend__ and __resume__ methods to the context manager protocol, to be > called whenever a frame is suspended or resumed inside a `with` block) is > no longer a contender because it would be way too slow. I haven't read it > recently or thought about it, so I don't know what the second issue you > mention is about (though it's presumably about the `yield` in a context > manager implemented using a generator decorated with > `@contextlib.contextmanager`). > > > ?Well, it's not completely unrelated to that. The problem I'm talking > about is perhaps most easily seen from a simple context manager wrapper > that uses composition instead of inheritance: > > class Wrapper: > def __init__(self): > self._wrapped = SomeContextManager() > > def __enter__(self): > print("Entering context") > return self._wrapped.__enter__() > > def __exit__(self): > self._wrapped.__exit__() > print("Exited context") > > > Now, if the wrapped contextmanager becomes a PEP 521 one with __suspend__ > and __resume__, the Wrapper class is broken, because it does not respect > __suspend__ and __resume__. So actually this is a backwards compatiblity > issue. > > Why is it backwards incompatible? I'd think that without PEP 521 it would be broken in exactly the same way because there's no __suspend__/__resume__ at all. The wrapper is (would be) broken because it depends on the internal implementation of the wrapped CM. Maybe the author of SomeContextManager wants to upgrade the CM to also work in coroutines and generators. But it could be a more subtle change in the CM implementation. 
The problem becomes more serious and more obvious if you don't know which context manager you are wrapping: class Wrapper: def __init__(self, contextmanager): self._wrapped = contextmanager def __enter__(self): print("Entering context") return self._wrapped.__enter__() def __exit__(self): self._wrapped.__exit__() print("Exited context") The wrapper is (would be) broken because it does not work for all CMs anymore. But if the wrapper is made using inheritance, the problem goes away: > > > class Wrapper(SomeContextManager): > def __enter__(self): > print("Entering context") > return super().__enter__() > > def __exit__(self): > super().__exit__() > print("Exited context") > > > Now the wrapper cleanly inherits the new optional __suspend__ and > __resume__ from the wrapped context manager type. > > In any case this is completely academic because PEP 521 is not going to happen. Nathaniel himself has said so (I think in the context of discussing PEP 550). I don't mind this (or Nathaniel ;-) being academic. The backwards incompatibility issue I've just described applies to any extension via composition, if the underlying type/protocol grows new members (like the CM protocol would have gained __suspend__ and __resume__ in PEP521). -- Koos (mobile) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Oct 2 17:59:52 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 2 Oct 2017 14:59:52 -0700 Subject: [Python-Dev] Inheritance vs composition in backcompat (PEP521) In-Reply-To: References: Message-ID: Mon, Oct 2, 2017 at 2:52 PM, Koos Zevenhoven wrote > I don't mind this (or Nathaniel ;-) being academic. The backwards > incompatibility issue I've just described applies to any extension via > composition, if the underlying type/protocol grows new members (like the CM > protocol would have gained __suspend__ and __resume__ in PEP521). > Since you seem to have a good grasp on this issue, does PEP 550 suffer from the same problem? (Or PEP 555, for that matter? :-) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Mon Oct 2 18:03:25 2017 From: barry at python.org (Barry Warsaw) Date: Mon, 2 Oct 2017 18:03:25 -0400 Subject: [Python-Dev] PEP 553 In-Reply-To: References: <1B4494FE-AA40-4FF7-B81A-B0FB14F5F259@python.org> Message-ID: On Oct 2, 2017, at 17:36, Guido van Rossum wrote: > I've seen your updates and it is now acceptable, except for *one* nit: in builtins.breakpoint() the pseudo code raises RuntimeError if sys.breakpointhook is missing or None. OTOH sys.breakpointhook() just issues a RuntimeWarning when something's wrong with the hook. Maybe builtins.breakpoint() should also just warn if it can't find the hook? Setting `sys.breakpointhook = None` might be the simplest way to programmatically disable breakpoints. Why not allow it? Oh, actually the pseudocode doesn?t match the C implementation exactly in this regard. Currently the C implementation is more like: def breakpoint(*args, **kws): import sys missing = object() hook = getattr(sys, 'breakpointhook', missing) if hook is missing: raise RuntimeError('lost sys.breakpointhook') return hook(*args, **kws) The intent being, much like the other sys-hooks, that if PySys_GetObject("breakpointhook?) returns NULL, Something Bad Happened, so we have to set an error string and bail. (PySys_GetObject() does not set an exception.) E.g. >>> def foo(): ... print('yes') ... breakpoint() ... 
print('no') ... >>> del sys.breakpointhook >>> foo() yes Traceback (most recent call last): File "", line 1, in File "", line 3, in foo RuntimeError: lost sys.breakpointhook Setting `sys.breakpoint = None` could be an interesting use case, but that?s not currently special in any way: >>> sys.breakpointhook = None >>> foo() yes Traceback (most recent call last): File "", line 1, in File "", line 3, in foo TypeError: 'NoneType' object is not callable I?m open to special-casing this if you think it?s useful. (I?ll update the pseudocode in the PEP.) Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From k7hoven at gmail.com Mon Oct 2 18:11:28 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 3 Oct 2017 01:11:28 +0300 Subject: [Python-Dev] Inheritance vs composition in backcompat (PEP521) In-Reply-To: References: Message-ID: On Oct 3, 2017 01:00, "Guido van Rossum" wrote: Mon, Oct 2, 2017 at 2:52 PM, Koos Zevenhoven wrote I don't mind this (or Nathaniel ;-) being academic. The backwards > incompatibility issue I've just described applies to any extension via > composition, if the underlying type/protocol grows new members (like the CM > protocol would have gained __suspend__ and __resume__ in PEP521). > Since you seem to have a good grasp on this issue, does PEP 550 suffer from the same problem? (Or PEP 555, for that matter? :-) Neither has this particular issue, because they don't extend an existing protocol. If this thread has any significance, it will most likely be elsewhere. -- Koos (mobile) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Mon Oct 2 18:19:50 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 3 Oct 2017 01:19:50 +0300 Subject: [Python-Dev] Inheritance vs composition in backcompat (PEP521) In-Reply-To: References: Message-ID: On Oct 3, 2017 01:11, "Koos Zevenhoven" wrote: On Oct 3, 2017 01:00, "Guido van Rossum" wrote: Mon, Oct 2, 2017 at 2:52 PM, Koos Zevenhoven wrote I don't mind this (or Nathaniel ;-) being academic. The backwards > incompatibility issue I've just described applies to any extension via > composition, if the underlying type/protocol grows new members (like the CM > protocol would have gained __suspend__ and __resume__ in PEP521). > Since you seem to have a good grasp on this issue, does PEP 550 suffer from the same problem? (Or PEP 555, for that matter? :-) Neither has this particular issue, because they don't extend an existing protocol. If this thread has any significance, it will most likely be elsewhere. That said, I did come across this thought while trying to find flaws in my own PEP ;) -- Koos -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Oct 2 18:43:50 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 2 Oct 2017 15:43:50 -0700 Subject: [Python-Dev] PEP 553 In-Reply-To: References: <1B4494FE-AA40-4FF7-B81A-B0FB14F5F259@python.org> Message-ID: On Mon, Oct 2, 2017 at 3:03 PM, Barry Warsaw wrote: > On Oct 2, 2017, at 17:36, Guido van Rossum wrote: > > > I've seen your updates and it is now acceptable, except for *one* nit: > in builtins.breakpoint() the pseudo code raises RuntimeError if > sys.breakpointhook is missing or None. 
OTOH sys.breakpointhook() just > issues a RuntimeWarning when something's wrong with the hook. Maybe > builtins.breakpoint() should also just warn if it can't find the hook? > Setting `sys.breakpointhook = None` might be the simplest way to > programmatically disable breakpoints. Why not allow it? > > Oh, actually the pseudocode doesn?t match the C implementation exactly in > this regard. Currently the C implementation is more like: > > def breakpoint(*args, **kws): > import sys > missing = object() > hook = getattr(sys, 'breakpointhook', missing) > if hook is missing: > raise RuntimeError('lost sys.breakpointhook') > return hook(*args, **kws) > > The intent being, much like the other sys-hooks, that if PySys_GetObject("breakpointhook?) > returns NULL, Something Bad Happened, so we have to set an error string and > bail. (PySys_GetObject() does not set an exception.) > > E.g. > > >>> def foo(): > ... print('yes') > ... breakpoint() > ... print('no') > ... > >>> del sys.breakpointhook > >>> foo() > yes > Traceback (most recent call last): > File "", line 1, in > File "", line 3, in foo > RuntimeError: lost sys.breakpointhook > > > Setting `sys.breakpoint = None` could be an interesting use case, but > that?s not currently special in any way: > > >>> sys.breakpointhook = None > >>> foo() > yes > Traceback (most recent call last): > File "", line 1, in > File "", line 3, in foo > TypeError: 'NoneType' object is not callable > > > I?m open to special-casing this if you think it?s useful. > > (I?ll update the pseudocode in the PEP.) > OK. That then concludes the review of your PEP. It is now accepted! Congrats. I am looking forward to using the backport. :-) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Mon Oct 2 20:06:11 2017 From: barry at python.org (Barry Warsaw) Date: Mon, 2 Oct 2017 20:06:11 -0400 Subject: [Python-Dev] PEP 553 In-Reply-To: References: <1B4494FE-AA40-4FF7-B81A-B0FB14F5F259@python.org> Message-ID: <11E419CD-6E87-44C6-965B-A07DEF8237AE@python.org> On Oct 2, 2017, at 18:43, Guido van Rossum wrote: > > OK. That then concludes the review of your PEP. It is now accepted! Congrats. I am looking forward to using the backport. :-) Yay, thanks! We?ll see if I can sneak that backport past Ned. :) -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From ronaldoussoren at mac.com Mon Oct 2 19:47:34 2017 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Tue, 03 Oct 2017 08:47:34 +0900 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: <641723C8-AE9E-44F1-94AE-68A716980318@python.org> References: <641723C8-AE9E-44F1-94AE-68A716980318@python.org> Message-ID: <302A53D9-D6F1-4872-ACF2-522889412FA1@mac.com> Op 3 okt. 2017 om 04:29 heeft Barry Warsaw het volgende geschreven: > On Oct 2, 2017, at 14:56, Brett Cannon wrote: > >> So Mercurial specifically is an odd duck because they already do lazy importing (in fact they are using the lazy loading support from importlib). In terms of all of this discussion of tweaking import to be lazy, I think the best approach would be providing an opt-in solution that CLI tools can turn on ASAP while the default stays eager. That way everyone gets what they want while the stdlib provides a shared solution that's maintained alongside import itself to make sure it functions appropriately. 
> > The problem I think is that to get full benefit of lazy loading, it has to be turned on globally for bare ?import? statements. A typical application has tons of dependencies and all those libraries are also doing module global imports, so unless lazy loading somehow covers them, it?ll be an incomplete gain. But of course it?ll take forever for all your dependencies to use whatever new API we come up with, and if it?s not as convenient to write as ?import foo? then I suspect it won?t much catch on anyway. > One thing to keep in mind is that imports can have important side-effects. Turning every import statement into a lazy import will not be backward compatible. Ronald From ericsnowcurrently at gmail.com Mon Oct 2 21:31:30 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 2 Oct 2017 21:31:30 -0400 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: Message-ID: On Thu, Sep 14, 2017 at 8:44 PM, Nick Coghlan wrote: > Not really, because the only way to ensure object separation (i.e no > refcounted objects accessible from multiple interpreters at once) with > a bytes-based API would be to either: > > 1. Always copy (eliminating most of the low overhead communications > benefits that subinterpreters may offer over multiple processes) > 2. Make the bytes implementation more complicated by allowing multiple > bytes objects to share the same underlying storage while presenting as > distinct objects in different interpreters > 3. Make the output on the receiving side not actually a bytes object, > but instead a view onto memory owned by another object in a different > interpreter (a "memory view", one might say) 4. Pass Bytes through directly. The only problem of which I'm aware is that when Py_DECREF() triggers Bytes.__del__(), it happens in the current interpreter, which may not be the "owner" (i.e. allocated the object). So the solution would be to make PyBytesType.tp_free() effectively run as a "pending call" under the owner. This would require two things: 1. a new PyBytesObject.owner field (PyInterpreterState *), or a separate owner table, which would be set when the object is passed through a channel 2. a Py_AddPendingCall() that targets a specific interpreter (which I expect would be desirable regardless) Then, when the object has an owner, PyBytesType.tp_free() would add a pending call on the owner to call PyObject_Del() on the Bytes object. The catch is that currently "pending" calls (via Py_AddPendingCall) are run only in the main thread of the main interpreter. We'd need a similar mechanism that targets a specific interpreter . > By contrast, if we allow an actual bytes object to be shared, then > either every INCREF or DECREF on that bytes object becomes a > synchronisation point, or else we end up needing some kind of > secondary per-interpreter refcount where the interpreter doesn't drop > its shared reference to the original object in its source interpreter > until the internal refcount in the borrowing interpreter drops to > zero. There shouldn't be a need to synchronize on INCREF. If both interpreters have at least 1 reference then either one adding a reference shouldn't be a problem. If only one interpreter has a reference then the other won't be adding any references. If neither has a reference then neither is going to add any references. Perhaps I've missed something. Under what circumstances would INCREF happen while the refcount is 0? 
On DECREF there shouldn't be a problem except possibly with a small race between decrementing the refcount and checking for a refcount of 0. We could address that several different ways, including allowing the pending call to get queued only once (or being a noop the second time). FWIW, I'm not opposed to the CIV/memoryview approach, but want to make sure we really can't use Bytes before going down that route. -eric From ericsnowcurrently at gmail.com Mon Oct 2 22:15:01 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 2 Oct 2017 22:15:01 -0400 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: <20170923114545.115b901b@fsol> References: <20170918124636.7ee8de0b@fsol> <20170923114545.115b901b@fsol> Message-ID: After having looked it over, I'm leaning toward supporting buffering, as well as not blocking by default. Neither adds much complexity to the implementation. On Sat, Sep 23, 2017 at 5:45 AM, Antoine Pitrou wrote: > On Fri, 22 Sep 2017 19:09:01 -0600 > Eric Snow wrote: >> > send() blocking until someone else calls recv() is not only bad for >> > performance, >> >> What is the performance problem? > > Intuitively, there must be some kind of context switch (interpreter > switch?) at each send() call to let the other end receive the data, > since you don't have any internal buffering. There would be an internal size-1 buffer. >> (FWIW, CSP >> provides rigorous guarantees about deadlock detection (which Go >> leverages), though I'm not sure how much benefit that can offer such a >> dynamic language as Python.) > > Hmm... deadlock detection is one thing, but when detected you must still > solve those deadlock issues, right? Yeah, I haven't given much thought into how we could leverage that capability but my gut feeling is that we won't have much opportunity to do so. :) >> I'm not sure I understand your concern here. Perhaps I used the word >> "sharing" too ambiguously? By "sharing" I mean that the two actors >> have read access to something that at least one of them can modify. >> If they both only have read-only access then it's effectively the same >> as if they are not sharing. > > Right. What I mean is that you *can* share very simple "data" under > the form of synchronization primitives. You may want to synchronize > your interpreters even they don't share user-visible memory areas. The > point of synchronization is not only to avoid memory corruption but > also to regulate and orchestrate processing amongst multiple workers > (for example processes or interpreters). For example, a semaphore is > an easy way to implement "I want no more than N workers to do this > thing at the same time" ("this thing" can be something such as disk > I/O). I'm still not convinced that sharing synchronization primitives is important enough to be worth including it in the PEP. It can be added later, or via an extension module in the meantime. To that end, I'll add a mechanism to the PEP for third-party types to indicate that they can be passed through channels. Something like "obj.__channel_support__ = True". -eric From ericsnowcurrently at gmail.com Mon Oct 2 22:16:47 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 2 Oct 2017 22:16:47 -0400 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: Message-ID: On Mon, Oct 2, 2017 at 9:31 PM, Eric Snow wrote: > On DECREF there shouldn't be a problem except possibly with a small > race between decrementing the refcount and checking for a refcount of > 0. 
We could address that several different ways, including allowing > the pending call to get queued only once (or being a noop the second > time). Alternately, the channel could own a reference and DECREF it in the owning interpreter once the refcount reaches 1. -eric From ericsnowcurrently at gmail.com Mon Oct 2 22:20:09 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 2 Oct 2017 22:20:09 -0400 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: <20170918124636.7ee8de0b@fsol> <20170923114545.115b901b@fsol> Message-ID: On Mon, Sep 25, 2017 at 8:42 PM, Nathaniel Smith wrote: > It's fairly reasonable to implement a mutex using a CSP-style > unbuffered channel (send = acquire, receive = release). And the same > trick turns a channel with a fixed-size buffer into a bounded > semaphore. It won't be as efficient as a modern specialized mutex > implementation, of course, but it's workable. > > Unfortunately while technically you can construct a buffered channel > out of an unbuffered channel, the construction's pretty unreasonable > (it needs two dedicated threads per channel). Yeah, if threading's synchronization primitives make sense between interpreters then we'll add direct support. Using channels for that isn't a good option. -eric From ericsnowcurrently at gmail.com Mon Oct 2 22:35:19 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 2 Oct 2017 22:35:19 -0400 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: <20170918124636.7ee8de0b@fsol> <20170923114545.115b901b@fsol> <20170926090445.6b57ddd1@fsol> Message-ID: On Wed, Sep 27, 2017 at 1:26 AM, Nick Coghlan wrote: > It's also the case that unlike Go channels, which were designed from > scratch on the basis of implementing pure CSP, FWIW, Go's channels (and goroutines) don't implement pure CSP. They provide a variant that the Go authors felt was more in-line with the language's flavor. The channels in the PEP aim to support a more pure implementation. > Python has an > established behavioural precedent in the APIs of queue.Queue and > collections.deque: they're unbounded by default, and you have to opt > in to making them bounded. Right. That's part of why I'm leaning toward support for buffered channels. > While the article title is clickbaity, > http://www.jtolds.com/writing/2016/03/go-channels-are-bad-and-you-should-feel-bad/ > actually has a good discussion of this point. Search for "compose" to > find the relevant section ("Channels don?t compose well with other > concurrency primitives"). > > The specific problem cited is that only offering unbuffered or > bounded-buffer channels means that every send call becomes a potential > deadlock scenario, as all that needs to happen is for you to be > holding a different synchronisation primitive when the send call > blocks. Yeah, that blog post was a reference for me as I was designing the PEP's channels. > The fact that the proposal now allows for M:N sender:receiver > relationships (just as queue.Queue does with threads) makes that > problem worse, since you may now have variability not only on the > message consumption side, but also on the message production side. > > Consider this example where you have an event processing thread pool > that we're attempting to isolate from blocking IO by using channels > rather than coroutines. > > Desired flow: > > 1. Listener thread receives external message from socket > 2. Listener thread files message for processing on receive channel > 3. 
Listener thread returns to blocking on the receive socket > > 4. Processing thread picks up message from receive channel > 5. Processing thread processes message > 6. Processing thread puts reply on the send channel > > 7. Sending thread picks up message from send channel > 8. Sending thread makes a blocking network send call to transmit the message > 9. Sending thread returns to blocking on the send channel > > When queue.Queue is used to pass the messages between threads, such an > arrangement will be effectively non-blocking as long as the send rate > is greater than or equal to the receive rate. However, the GIL means > it won't exploit all available cores, even if we create multiple > processing threads: you have to switch to multiprocessing for that, > with all the extra overhead that entails. > > So I see the essential premise of PEP 554 as being to ask the question > "If each of these threads was running its own *interpreter*, could we > use Sans IO style protocols with interpreter channels to separate > internally "synchronous" processing threads from separate IO threads > operating at system boundaries, without having to make the entire > application pervasively asynchronous?" +1 > If channels are an unbuffered blocking primitive, then we don't get > that benefit: even when there are additional receive messages to be > processed, the processing thread will block until the previous send > has completed. Switching the listener and sender threads over to > asynchronous IO would help with that, but they'd also end up having to > implement their own message buffering to manage the lack of buffering > in the core channel primitive. > > By contrast, if the core channels are designed to offer an unbounded > buffer by default, then you can get close-to-CSP semantics just by > setting the buffer size to 1 (it's still not exactly CSP, since that > has a buffer size of 0, but you at least get the semantics of having > to alternate sending and receiving of messages). Yep, I came to the same conclusion. >> By the way, I do think efficiency is a concern here. Otherwise >> subinterpreters don't even have a point (just use multiprocessing). > > Agreed, and I think the interaction between the threading module and > the interpreters module is one we're going to have to explicitly call > out as being covered by the provisional status of the interpreters > module, as I think it could be incredibly valuable to be able to send > at least some threading objects through channels, and have them be an > interpreter-specific reference to a common underlying sync primitive. Agreed. I'll add a note to the PEP. -eric From songofacandy at gmail.com Mon Oct 2 23:29:40 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Tue, 03 Oct 2017 03:29:40 +0000 Subject: [Python-Dev] Make re.compile faster Message-ID: Before deferring re.compile, can we make it faster? I profiled `import string` and small optimization can make it 2x faster! (but it's not backward compatible) Before optimize: import time: self [us] | cumulative | imported package import time: 2339 | 9623 | string string module took about 2.3 ms to import. I found: * RegexFlag.__and__ and __new__ is called very often. 
* _optimize_charset is slow, because re.UNICODE | re.IGNORECASE diff --git a/Lib/sre_compile.py b/Lib/sre_compile.py index 144620c6d1..7c662247d4 100644 --- a/Lib/sre_compile.py +++ b/Lib/sre_compile.py @@ -582,7 +582,7 @@ def isstring(obj): def _code(p, flags): - flags = p.pattern.flags | flags + flags = int(p.pattern.flags) | int(flags) code = [] # compile info block diff --git a/Lib/string.py b/Lib/string.py index b46e60c38f..fedd92246d 100644 --- a/Lib/string.py +++ b/Lib/string.py @@ -81,7 +81,7 @@ class Template(metaclass=_TemplateMetaclass): delimiter = '$' idpattern = r'[_a-z][_a-z0-9]*' braceidpattern = None - flags = _re.IGNORECASE + flags = _re.IGNORECASE | _re.ASCII def __init__(self, template): self.template = template patched: import time: 1191 | 8479 | string Of course, this patch is not backward compatible. [a-z] doesn't match with '?' or '?' anymore. But who cares? (in sre_compile.py) # LATIN SMALL LETTER I, LATIN SMALL LETTER DOTLESS I (0x69, 0x131), # i? # LATIN SMALL LETTER S, LATIN SMALL LETTER LONG S (0x73, 0x17f), # s? There are some other `re.I(GNORECASE)` options in stdlib. I'll check them. More optimization can be done with implementing sre_parse and sre_compile in C. But I have no time for it in this year. Regards, -- Inada Naoki -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Tue Oct 3 01:35:52 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 3 Oct 2017 08:35:52 +0300 Subject: [Python-Dev] Make re.compile faster In-Reply-To: References: Message-ID: 03.10.17 06:29, INADA Naoki ????: > Before deferring re.compile, can we make it faster? > > I profiled `import string` and small optimization can make it 2x faster! > (but it's not backward compatible) Please open an issue for this. > I found: > > * RegexFlag.__and__ and __new__ is called very often. > * _optimize_charset is slow, because re.UNICODE | re.IGNORECASE > > diff --git a/Lib/sre_compile.py b/Lib/sre_compile.py > index 144620c6d1..7c662247d4 100644 > --- a/Lib/sre_compile.py > +++ b/Lib/sre_compile.py > @@ -582,7 +582,7 @@ def isstring(obj): > > ?def _code(p, flags): > > -? ? flags = p.pattern.flags | flags > +? ? flags = int(p.pattern.flags) | int(flags) > ? ? ?code = [] > > ? ? ?# compile info block Maybe cast flags to int earlier, in sre_compile.compile()? > diff --git a/Lib/string.py b/Lib/string.py > index b46e60c38f..fedd92246d 100644 > --- a/Lib/string.py > +++ b/Lib/string.py > @@ -81,7 +81,7 @@ class Template(metaclass=_TemplateMetaclass): > ? ? ?delimiter = '$' > ? ? ?idpattern = r'[_a-z][_a-z0-9]*' > ? ? ?braceidpattern = None > -? ? flags = _re.IGNORECASE > +? ? flags = _re.IGNORECASE | _re.ASCII > > ? ? ?def __init__(self, template): > ? ? ? ? ?self.template = template > > patched: > import time:? ? ? 1191 |? ? ? ?8479 | string > > Of course, this patch is not backward compatible. [a-z] doesn't match > with '?' or '?' anymore. > But who cares? This looks like a bug fix. I'm wondering if it is worth to backport it to 3.6. But the change itself can break a user code that changes idpattern without touching flags. There is other way, but it should be discussed on the bug tracker. From storchaka at gmail.com Tue Oct 3 01:41:07 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 3 Oct 2017 08:41:07 +0300 Subject: [Python-Dev] Make re.compile faster In-Reply-To: References: Message-ID: 03.10.17 06:29, INADA Naoki ????: > More optimization can be done with implementing sre_parse and > sre_compile in C. 
> But I have no time for it in this year. And please don't do this! This would make maintaining the re module hard. The performance of the compiler is less important than correctness and performance of matching and searching. From victor.stinner at gmail.com Tue Oct 3 05:18:59 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 3 Oct 2017 11:18:59 +0200 Subject: [Python-Dev] Make re.compile faster In-Reply-To: References: Message-ID: > * RegexFlag.__and__ and __new__ is called very often. Yeah, when the re module was modified to use enums for flags, re.compile() became slower: https://pyperformance.readthedocs.io/cpython_results_2017.html#slowdown https://speed.python.org/timeline/#/?exe=12&ben=regex_compile&env=1&revs=200&equid=off&quarts=on&extr=on It would be nice if internally we could use integers again to reduce this overhead, without losing the nice representation: >>> re.I <RegexFlag.IGNORECASE: 2> Victor From k7hoven at gmail.com Tue Oct 3 05:58:05 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 3 Oct 2017 12:58:05 +0300 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: <302A53D9-D6F1-4872-ACF2-522889412FA1@mac.com> References: <641723C8-AE9E-44F1-94AE-68A716980318@python.org> <302A53D9-D6F1-4872-ACF2-522889412FA1@mac.com> Message-ID: I've probably missed a lot of this discussion, but this lazy import discussion confuses me. We already have both eager import (import at the top of the file), and lazy import (import right before use). The former is good when you know you need the module, and the latter is good when having the overhead at first use is preferable to having the overhead at startup. But like Raymond was saying, this is of course especially relevant when that import is likely never used. Maybe the fact that the latter is not recommended gives people the feeling that we don't have lazy imports, although we do. What we *don't* have, however, is *partially* lazy imports and partially executed code, something like: on demand: class Foo: # a lot of stuff here def foo_function(my_foo, bar): # more stuff here When executed, the `on demand` block would only keep track of which names are being bound to (here, "Foo" and "foo_function"), and on the lookup of those names in the namespace, the code would actually be run. Then you could also do on demand: import sometimes_needed_module Or on demand: from . import all, submodules, of, this, package This would of course drift away from "namespaces are simply dicts". But who cares, if they still provide the dict interface. See e.g. this example with automatic lazy imports: https://gist.github.com/k7hoven/21c5532ce19b306b08bb4e82cfe5a609 Another thing we *don't* have is unimporting. What if I know that I'm only going to need some particular module in this one initialization function. Why should I keep it in memory for the whole lifetime of the program? -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Tue Oct 3 07:00:10 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 3 Oct 2017 13:00:10 +0200 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: <20170918124636.7ee8de0b@fsol> <20170923114545.115b901b@fsol> Message-ID: <20171003130010.3a95a673@fsol> On Mon, 2 Oct 2017 22:15:01 -0400 Eric Snow wrote: > > I'm still not convinced that sharing synchronization primitives is > important enough to be worth including it in the PEP.
It can be added > later, or via an extension module in the meantime. To that end, I'll > add a mechanism to the PEP for third-party types to indicate that they > can be passed through channels. Something like > "obj.__channel_support__ = True". How would that work? If it's simply a matter of flipping a bit, why don't we do it for all objects? Regards Antoine. From barry at python.org Tue Oct 3 10:14:58 2017 From: barry at python.org (Barry Warsaw) Date: Tue, 3 Oct 2017 10:14:58 -0400 Subject: [Python-Dev] Make re.compile faster In-Reply-To: References: Message-ID: On Oct 3, 2017, at 01:35, Serhiy Storchaka wrote: > >> diff --git a/Lib/string.py b/Lib/string.py >> index b46e60c38f..fedd92246d 100644 >> --- a/Lib/string.py >> +++ b/Lib/string.py >> @@ -81,7 +81,7 @@ class Template(metaclass=_TemplateMetaclass): >> delimiter = '$' >> idpattern = r'[_a-z][_a-z0-9]*' >> braceidpattern = None >> - flags = _re.IGNORECASE >> + flags = _re.IGNORECASE | _re.ASCII >> def __init__(self, template): >> self.template = template >> patched: >> import time: 1191 | 8479 | string >> Of course, this patch is not backward compatible. [a-z] doesn't match with '?' or '?' anymore. >> But who cares? > > This looks like a bug fix. I'm wondering if it is worth to backport it to 3.6. But the change itself can break a user code that changes idpattern without touching flags. There is other way, but it should be discussed on the bug tracker. It?s definitely an API change, as I mention in the bug tracker. It?s *probably* safe in practice given that the documentation does say that identifiers are ASCII by default, but it also means that a client who wants to use Unicode previously didn?t have to touch flags, and after this change would now have to do so. `flags` is part of the public API. Maybe for subclasses you could say that if delimiter, idpattern, or braceidpattern are anything but the defaults, fall back to just re.IGNORECASE. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From barry at python.org Tue Oct 3 10:21:55 2017 From: barry at python.org (Barry Warsaw) Date: Tue, 3 Oct 2017 10:21:55 -0400 Subject: [Python-Dev] Make re.compile faster In-Reply-To: References: Message-ID: <0DD27349-B37C-408C-9C0B-918DEC8F52DA@python.org> On Oct 3, 2017, at 01:41, Serhiy Storchaka wrote: > > 03.10.17 06:29, INADA Naoki ????: >> More optimization can be done with implementing sre_parse and sre_compile in C. >> But I have no time for it in this year. > > And please don't do this! This would make maintaining the re module hard. The performance of the compiler is less important than correctness and performance of matching and searching. What if the compiler could recognize constant arguments to re.compile() and do the regex compilation at that point? You?d need a way to represent the precompiled regex in the bytecode, and it would technically be a semantic change since regex problems would be discovered at compilation time instead of runtime - but that might be a good thing. You could also make that an optimization flag for opt-in, or a flag to allow opt out. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From solipsis at pitrou.net Tue Oct 3 10:30:17 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 3 Oct 2017 16:30:17 +0200 Subject: [Python-Dev] Make re.compile faster References: <0DD27349-B37C-408C-9C0B-918DEC8F52DA@python.org> Message-ID: <20171003163017.439d61c4@fsol> On Tue, 3 Oct 2017 10:21:55 -0400 Barry Warsaw wrote: > On Oct 3, 2017, at 01:41, Serhiy Storchaka wrote: > > > > 03.10.17 06:29, INADA Naoki ????: > >> More optimization can be done with implementing sre_parse and sre_compile in C. > >> But I have no time for it in this year. > > > > And please don't do this! This would make maintaining the re module hard. The performance of the compiler is less important than correctness and performance of matching and searching. > > What if the compiler could recognize constant arguments to re.compile() and do the regex compilation at that point? You?d need a way to represent the precompiled regex in the bytecode, and it would technically be a semantic change since regex problems would be discovered at compilation time instead of runtime - but that might be a good thing. You could also make that an optimization flag for opt-in, or a flag to allow opt out. We need a regex literal! With bytes, formatted, and bytes formatted variants. Regards Antoine. From ericsnowcurrently at gmail.com Tue Oct 3 10:36:55 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 3 Oct 2017 08:36:55 -0600 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: <20171003130010.3a95a673@fsol> References: <20170918124636.7ee8de0b@fsol> <20170923114545.115b901b@fsol> <20171003130010.3a95a673@fsol> Message-ID: On Tue, Oct 3, 2017 at 5:00 AM, Antoine Pitrou wrote: > On Mon, 2 Oct 2017 22:15:01 -0400 > Eric Snow wrote: >> >> I'm still not convinced that sharing synchronization primitives is >> important enough to be worth including it in the PEP. It can be added >> later, or via an extension module in the meantime. To that end, I'll >> add a mechanism to the PEP for third-party types to indicate that they >> can be passed through channels. Something like >> "obj.__channel_support__ = True". > > How would that work? If it's simply a matter of flipping a bit, why > don't we do it for all objects? The type would also have to be safe to share between interpreters. :) Eventually I'd like to make that work for all immutable objects (and immutable containers thereof), but until then each type must be adapted individually. The PEP starts off with just Bytes. -eric From solipsis at pitrou.net Tue Oct 3 10:55:48 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 3 Oct 2017 16:55:48 +0200 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: <20170918124636.7ee8de0b@fsol> <20170923114545.115b901b@fsol> <20171003130010.3a95a673@fsol> Message-ID: <20171003165548.3f1c8bdc@fsol> On Tue, 3 Oct 2017 08:36:55 -0600 Eric Snow wrote: > On Tue, Oct 3, 2017 at 5:00 AM, Antoine Pitrou wrote: > > On Mon, 2 Oct 2017 22:15:01 -0400 > > Eric Snow wrote: > >> > >> I'm still not convinced that sharing synchronization primitives is > >> important enough to be worth including it in the PEP. It can be added > >> later, or via an extension module in the meantime. To that end, I'll > >> add a mechanism to the PEP for third-party types to indicate that they > >> can be passed through channels. Something like > >> "obj.__channel_support__ = True". 
> > > > How would that work? If it's simply a matter of flipping a bit, why > > don't we do it for all objects? > > The type would also have to be safe to share between interpreters. :) But what does it mean to be safe to share, while the exact degree and nature of the isolation between interpreters (and also their concurrent execution) is unspecified? I think we need a sharing protocol, not just a flag. We also need to think carefully about that protocol, so that it does not imply unnecessary memory copies. Therefore I think the protocol should be something like the buffer protocol, that allows to acquire and release a set of shared memory areas, but without imposing any semantics onto those memory areas (each type implementing its own semantics). And there needs to be a dedicated reference counting for object shares, so that the original object can be notified when all its shares have vanished. Regards Antoine. From sf at fermigier.com Tue Oct 3 06:18:04 2017 From: sf at fermigier.com (=?UTF-8?Q?St=C3=A9fane_Fermigier?=) Date: Tue, 3 Oct 2017 12:18:04 +0200 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: References: Message-ID: Hi, On Mon, Oct 2, 2017 at 11:42 AM, Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > > > I don't expect to find anything that would help users of Django, Flask, > and Bottle since those are typically long-running apps where we value > response time more than startup time. > Actually, as web developers, we also value startup time when in development mode, specially when we are in "hot reload" mode (when the app restarts automatically each time we save a development file). In my mid-sized projects (~10 kE LOC, ~150 pip dependencies) it takes between 5 and 10s. This is probably the upper limit to "stay in flow". Same for unit tests. There is this famous Gary Bernhardt talk [https://youtu.be/RAxiiRPHS9k?t=12m ] he argues that a whole unit test suite should be able to run in < 1s and actually show examples where the developer is able to run hundreds of tests in less that 1s. Note: In my projects, it take 3-4 seconds just to collect them (using pytest --collect-only), but I suspect Python's startup time is only responsible for a small part of this delay. Still, this is an important point to keep in mind. S. -- Stefane Fermigier - http://fermigier.com/ - http://twitter.com/sfermigier - http://linkedin.com/in/sfermigier Founder & CEO, Abilian - Enterprise Social Software - http://www.abilian.com/ Chairman, Free&OSS Group / Systematic Cluster - http://www.gt-logiciel-libre.org/ Co-Chairman, National Council for Free & Open Source Software (CNLL) - http://cnll.fr/ Founder & Organiser, PyData Paris - http://pydata.fr/ --- ?You never change things by ?ghting the existing reality. To change something, build a new model that makes the existing model obsolete.? ? R. Buckminster Fuller -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Tue Oct 3 11:03:30 2017 From: barry at python.org (Barry Warsaw) Date: Tue, 03 Oct 2017 11:03:30 -0400 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: References: Message-ID: Guido van Rossum wrote: > There have been no further comments. PEP 552 is now accepted. > > Congrats, Benjamin! Go ahead and send your implementation for review.Oops. > Let me try that again. While I'm very glad PEP 552 has been accepted, it occurs to me that it will now be more difficult to parse the various pyc file formats from Python. 
E.g. I used to be able to just open the pyc in binary mode, read all the bytes, and then lop off the first 8 bytes to get to the code object. With the addition of the source file size, I now have to (maybe, if I have to also read old-style pyc files) lop off the front 12 bytes, but okay. With PEP 552, I have to do a lot more work to just get at the code object. How many bytes at the front of the file do I need to skip past? What about all the metadata at the front of the pyc, how do I interpret that if I want to get at it from Python code? Should the PEP 552 implementation add an API, probably to importlib.util, that would understand all current and future formats? Something like this perhaps? class PycFileSpec: magic_number: bytes timestamp: Optional[bytes] # maybe an int? datetime? source_size: Optional[bytes] bit_field: Optional[bytes] code_object: bytes def parse_pyc(path: str) -> PycFileSpec: Cheers, -Barry From stefan_ml at behnel.de Tue Oct 3 11:13:35 2017 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 3 Oct 2017 17:13:35 +0200 Subject: [Python-Dev] Make re.compile faster In-Reply-To: References: Message-ID: INADA Naoki schrieb am 03.10.2017 um 05:29: > Before deferring re.compile, can we make it faster? I tried cythonizing both sre_compile.py and sre_parse.py, which gave me a speedup of a bit more than 2x. There is definitely space left for further improvements since I didn't know much about the code, and also didn't dig very deeply. I used this benchmark to get uncached patterns: [re_compile("[a-z]{%d}[0-9]+[0-9a-z]*[%d-9]" % (i, i%8)) for i in range(20000)] Time for Python master version: 2.14 seconds Time for Cython compiled version: 1.05 seconds I used the latest Cython master for it, as I had to make a couple of type inference improvements for bytearray objects along the way. Cython's master branch is here: https://github.com/cython/cython My CPython changes are here: https://github.com/scoder/cpython/compare/master...scoder:cythonized_sre_compile They are mostly just external type declarations and a tiny type inference helper fix. I could have used the more maintainable PEP-484 annotations for local variables right in the .py files, but AFAIK, those are still not wanted in the standard library. And they also won't suffice for switching to extension types in sre_parse. Together with the integer flag changes, that could give a pretty noticible improvement overall. Stefan From gvanrossum at gmail.com Tue Oct 3 11:15:04 2017 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue, 3 Oct 2017 08:15:04 -0700 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: References: Message-ID: It's really not that hard. You just check the magic number and if it's the new one, skip 4 words. No need to understand the internals of the header. On Oct 3, 2017 08:06, "Barry Warsaw" wrote: > Guido van Rossum wrote: > > There have been no further comments. PEP 552 is now accepted. > > > > Congrats, Benjamin! Go ahead and send your implementation for > review.Oops. > > Let me try that again. > > While I'm very glad PEP 552 has been accepted, it occurs to me that it > will now be more difficult to parse the various pyc file formats from > Python. E.g. I used to be able to just open the pyc in binary mode, > read all the bytes, and then lop off the first 8 bytes to get to the > code object. With the addition of the source file size, I now have to > (maybe, if I have to also read old-style pyc files) lop off the front 12 > bytes, but okay. 
> > With PEP 552, I have to do a lot more work to just get at the code > object. How many bytes at the front of the file do I need to skip past? > What about all the metadata at the front of the pyc, how do I interpret > that if I want to get at it from Python code? > > Should the PEP 552 implementation add an API, probably to > importlib.util, that would understand all current and future formats? > Something like this perhaps? > > class PycFileSpec: > magic_number: bytes > timestamp: Optional[bytes] # maybe an int? datetime? > source_size: Optional[bytes] > bit_field: Optional[bytes] > code_object: bytes > > def parse_pyc(path: str) -> PycFileSpec: > > Cheers, > -Barry > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Tue Oct 3 11:24:43 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 3 Oct 2017 17:24:43 +0200 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) References: Message-ID: <20171003172443.6ff71b9e@fsol> On Tue, 3 Oct 2017 08:15:04 -0700 Guido van Rossum wrote: > It's really not that hard. You just check the magic number and if it's the > new one, skip 4 words. No need to understand the internals of the header. Still, I agree with Barry that an API would be nice. Regards Antoine. > > On Oct 3, 2017 08:06, "Barry Warsaw" wrote: > > > Guido van Rossum wrote: > > > There have been no further comments. PEP 552 is now accepted. > > > > > > Congrats, Benjamin! Go ahead and send your implementation for > > review.Oops. > > > Let me try that again. > > > > While I'm very glad PEP 552 has been accepted, it occurs to me that it > > will now be more difficult to parse the various pyc file formats from > > Python. E.g. I used to be able to just open the pyc in binary mode, > > read all the bytes, and then lop off the first 8 bytes to get to the > > code object. With the addition of the source file size, I now have to > > (maybe, if I have to also read old-style pyc files) lop off the front 12 > > bytes, but okay. > > > > With PEP 552, I have to do a lot more work to just get at the code > > object. How many bytes at the front of the file do I need to skip past? > > What about all the metadata at the front of the pyc, how do I interpret > > that if I want to get at it from Python code? > > > > Should the PEP 552 implementation add an API, probably to > > importlib.util, that would understand all current and future formats? > > Something like this perhaps? > > > > class PycFileSpec: > > magic_number: bytes > > timestamp: Optional[bytes] # maybe an int? datetime? 
> > source_size: Optional[bytes] > > bit_field: Optional[bytes] > > code_object: bytes > > > > def parse_pyc(path: str) -> PycFileSpec: > > > > Cheers, > > -Barry > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > > guido%40python.org > > > From storchaka at gmail.com Tue Oct 3 11:28:21 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 3 Oct 2017 18:28:21 +0300 Subject: [Python-Dev] Make re.compile faster In-Reply-To: <0DD27349-B37C-408C-9C0B-918DEC8F52DA@python.org> References: <0DD27349-B37C-408C-9C0B-918DEC8F52DA@python.org> Message-ID: 03.10.17 17:21, Barry Warsaw ????: > What if the compiler could recognize constant arguments to re.compile() and do the regex compilation at that point? You?d need a way to represent the precompiled regex in the bytecode, and it would technically be a semantic change since regex problems would be discovered at compilation time instead of runtime - but that might be a good thing. You could also make that an optimization flag for opt-in, or a flag to allow opt out. The representation of the compiled regex is an implementation detail. It is even not exposed since the regex is compiled. And it is changed faster than bytecode and marshal format. It can be changed even in a bugfix release. For implementing this idea we need: 1. Invent a universal portable regex bytecode. It shouldn't contain flaws and limitations and should support all features of Unicode regexps and possible extensions. It should also predict future Unicode changes and be able to code them. 2. Add support of regex objects in marshal format. 3. Implement an advanced AST optimizer. 4. Rewrite the regex compiler in C or make the AST optimizer able to execute Python code. I think we are far away from this. Any of the above problems is much larger and can give larger benefit than changing several microseconds at startup. Forget about this. Let's first get rid of GIL! From guido at python.org Tue Oct 3 11:47:05 2017 From: guido at python.org (Guido van Rossum) Date: Tue, 3 Oct 2017 08:47:05 -0700 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: <20171003172443.6ff71b9e@fsol> References: <20171003172443.6ff71b9e@fsol> Message-ID: I'm fine with adding an API, though I don't think that an API that knows about all current (historic) and future formats belongs in importlib.util -- that module only concerns itself with the *current* format. In terms of the API design I'd make take an IO[bytes] and just read and parse the header, so after that you can use marshal.load() straight from the file object. File size, mtime and bitfield should be represented as ints (the parser should take care of endianness).The hash should be a bytes. On Tue, Oct 3, 2017 at 8:24 AM, Antoine Pitrou wrote: > On Tue, 3 Oct 2017 08:15:04 -0700 > Guido van Rossum wrote: > > It's really not that hard. You just check the magic number and if it's > the > > new one, skip 4 words. No need to understand the internals of the header. > > Still, I agree with Barry that an API would be nice. > > Regards > > Antoine. > > > > > On Oct 3, 2017 08:06, "Barry Warsaw" wrote: > > > > > Guido van Rossum wrote: > > > > There have been no further comments. PEP 552 is now accepted. > > > > > > > > Congrats, Benjamin! Go ahead and send your implementation for > > > review.Oops. > > > > Let me try that again. 
> > > > > > While I'm very glad PEP 552 has been accepted, it occurs to me that it > > > will now be more difficult to parse the various pyc file formats from > > > Python. E.g. I used to be able to just open the pyc in binary mode, > > > read all the bytes, and then lop off the first 8 bytes to get to the > > > code object. With the addition of the source file size, I now have to > > > (maybe, if I have to also read old-style pyc files) lop off the front > 12 > > > bytes, but okay. > > > > > > With PEP 552, I have to do a lot more work to just get at the code > > > object. How many bytes at the front of the file do I need to skip > past? > > > What about all the metadata at the front of the pyc, how do I > interpret > > > that if I want to get at it from Python code? > > > > > > Should the PEP 552 implementation add an API, probably to > > > importlib.util, that would understand all current and future formats? > > > Something like this perhaps? > > > > > > class PycFileSpec: > > > magic_number: bytes > > > timestamp: Optional[bytes] # maybe an int? datetime? > > > source_size: Optional[bytes] > > > bit_field: Optional[bytes] > > > code_object: bytes > > > > > > def parse_pyc(path: str) -> PycFileSpec: > > > > > > Cheers, > > > -Barry > > > > > > _______________________________________________ > > > Python-Dev mailing list > > > Python-Dev at python.org > > > https://mail.python.org/mailman/listinfo/python-dev > > > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > > > guido%40python.org > > > > > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Tue Oct 3 12:29:30 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 3 Oct 2017 19:29:30 +0300 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: References: Message-ID: 03.10.17 18:15, Guido van Rossum ????: > It's really not that hard. You just check the magic number and if it's > the new one, skip 4 words. No need to understand the internals of the > header. Hence you should know all old magic numbers to determine if the read magic number is the new one. Right? From storchaka at gmail.com Tue Oct 3 12:53:29 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 3 Oct 2017 19:53:29 +0300 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: References: Message-ID: 26.09.17 23:47, Guido van Rossum ????: > I've read the current version of PEP 552 over and I think everything > looks good for acceptance. I believe there are no outstanding objections > (or they have been adequately addressed in responses). > > Therefore I intend to accept PEP 552 this Friday, unless grave > objections are raised on this mailing list (python-dev). > > Congratulations Benjamin. Gotta love those tristate options! While PEP 552 is accepted, I would want to see some changes. 1. Increase the size of the constant part of the signature to at least 32 bits. Currently only the third and forth bytes are constant, and they are '\r\n', that is often occurred in text files. The first two bytes can be different in every Python version. This make hard detecting pyc files by utilities like file (1). 2. 
Split the "version" of pyc files by "major" and "minor" parts. Every major version is incompatible with other major versions, the interpreter accepts only one particular major version. It can't be changed in a bugfix release. But all minor versions inside the same major version are forward and backward compatible. The interpreter should be able to execute pyc file with arbitrary minor version, but it can use minor version of pyc file to handle errors in older versions. Minor version can be changed in a bugfix release. I hope this can help us with issues like https://bugs.python.org/issue29537. Currently 3.5 supports two magic numbers. If we change the pyc format, it would be easy to make the above changes. From steve.dower at python.org Tue Oct 3 13:01:29 2017 From: steve.dower at python.org (Steve Dower) Date: Tue, 3 Oct 2017 10:01:29 -0700 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: <20171003165548.3f1c8bdc@fsol> References: <20170918124636.7ee8de0b@fsol> <20170923114545.115b901b@fsol> <20171003130010.3a95a673@fsol> <20171003165548.3f1c8bdc@fsol> Message-ID: On 03Oct2017 0755, Antoine Pitrou wrote: > On Tue, 3 Oct 2017 08:36:55 -0600 > Eric Snow wrote: >> On Tue, Oct 3, 2017 at 5:00 AM, Antoine Pitrou wrote: >>> On Mon, 2 Oct 2017 22:15:01 -0400 >>> Eric Snow wrote: >>>> >>>> I'm still not convinced that sharing synchronization primitives is >>>> important enough to be worth including it in the PEP. It can be added >>>> later, or via an extension module in the meantime. To that end, I'll >>>> add a mechanism to the PEP for third-party types to indicate that they >>>> can be passed through channels. Something like >>>> "obj.__channel_support__ = True". >>> >>> How would that work? If it's simply a matter of flipping a bit, why >>> don't we do it for all objects? >> >> The type would also have to be safe to share between interpreters. :) > > But what does it mean to be safe to share, while the exact degree > and nature of the isolation between interpreters (and also their > concurrent execution) is unspecified? > > I think we need a sharing protocol, not just a flag. The easiest such protocol is essentially: * an object can represent itself as bytes (e.g. generate a bytes object representing some global token, such as a kernel handle or memory address) * those bytes are sent over the standard channel * the object can instantiate itself from those bytes (e.g. wrap the existing handle, create a memoryview over the same block of memory, etc.) * cross-interpreter refcounting is either ignored (because the kernel is refcounting the resource) or manual (by including more shared info in the token) Since this is trivial to implement over the basic bytes channel, and doesn't even require a standard protocol except for convenience, Eric decided to avoid blocking the core functionality on this. I'm inclined to agree - get the basic functionality supported and let people build on it before we try to lock down something we don't fully understand yet. About the only thing that seems to be worth doing up-front is some sort of pending-call callback mechanism between interpreters, but even that doesn't need to block the core functionality (you can do it trivially with threads and another channel right now, and there's always room to make something more efficient later). There are plenty of smart people out there who can and will figure out the best way to design this. 
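To make the bytes-token idea sketched in the bullets above concrete, here is a minimal illustration of what such an object could look like on the Python side. All of the names here (SharedBlock, to_bytes, from_bytes, send_channel, recv_channel) are hypothetical, invented purely for this sketch; they are not part of PEP 554 or of any existing API:

class SharedBlock:
    """Illustrative only: stands for an object backed by an OS-managed
    resource (shared memory, a kernel handle, ...) that more than one
    interpreter can attach to."""

    def __init__(self, token):
        # 'token' is a small bytes object naming the underlying resource,
        # e.g. a shared-memory name or a serialized handle.
        self.token = token

    def to_bytes(self):
        # Only the token crosses the channel, never the payload itself,
        # so there is no per-message copy of the data.
        return self.token

    @classmethod
    def from_bytes(cls, raw):
        # The receiving interpreter re-attaches to the same resource.
        return cls(raw)

# Sending side:    send_channel.send(block.to_bytes())
# Receiving side:  block = SharedBlock.from_bytes(recv_channel.recv())

Lifetime management (the manual refcounting mentioned in the last bullet) would then live behind whatever the token refers to, rather than in the channel itself.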
By giving them the tools and the ability to design something awesome, we're more likely to get something awesome than by committing to a complete design now. Right now, they're all blocked on the fact that subinterpreters are incredibly hard to start running, let alone experiment with. Eric's PEP will fix that part and enable others to take it from building blocks to powerful libraries. Cheers, Steve From benjamin at python.org Tue Oct 3 13:29:23 2017 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 03 Oct 2017 10:29:23 -0700 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: References: Message-ID: <1507051763.2766090.1126540976.75FD41A3@webmail.messagingengine.com> On Tue, Oct 3, 2017, at 08:03, Barry Warsaw wrote: > Guido van Rossum wrote: > > There have been no further comments. PEP 552 is now accepted. > > > > Congrats, Benjamin! Go ahead and send your implementation for review.Oops. > > Let me try that again. > > While I'm very glad PEP 552 has been accepted, it occurs to me that it > will now be more difficult to parse the various pyc file formats from > Python. E.g. I used to be able to just open the pyc in binary mode, > read all the bytes, and then lop off the first 8 bytes to get to the > code object. With the addition of the source file size, I now have to > (maybe, if I have to also read old-style pyc files) lop off the front 12 > bytes, but okay. > > With PEP 552, I have to do a lot more work to just get at the code > object. How many bytes at the front of the file do I need to skip past? > What about all the metadata at the front of the pyc, how do I interpret > that if I want to get at it from Python code? As Guido points out, the header is just now always 4 32-bit words rather than 3. Not long ago we underwent the transition from 2-3 words without widespread disaster. > > Should the PEP 552 implementation add an API, probably to > importlib.util, that would understand all current and future formats? > Something like this perhaps? > > class PycFileSpec: > magic_number: bytes > timestamp: Optional[bytes] # maybe an int? datetime? > source_size: Optional[bytes] > bit_field: Optional[bytes] > code_object: bytes > > def parse_pyc(path: str) -> PycFileSpec: I'm not sure turning the implementation details of our internal formats into APIs is the way to go. From nad at python.org Tue Oct 3 16:06:44 2017 From: nad at python.org (Ned Deily) Date: Tue, 3 Oct 2017 16:06:44 -0400 Subject: [Python-Dev] [RELEASE] Python 3.6.3 is now available Message-ID: <3D8F5841-F970-4C00-9A49-B4F8FD24CF79@python.org> On behalf of the Python development community and the Python 3.6 release team, I am happy to announce the availability of Python 3.6.3, the third maintenance release of Python 3.6. Detailed information about the changes made in 3.6.3 can be found in the change log here: https://docs.python.org/3.6/whatsnew/changelog.html#python-3-6-3-final Please see "What?s New In Python 3.6" for more information about the new features in Python 3.6: https://docs.python.org/3.6/whatsnew/3.6.html You can download Python 3.6.3 here: https://www.python.org/downloads/release/python-363/ The next maintenance release of Python 3.6 is expected to follow in about 3 months, around the end of 2017-12. More information about the 3.6 release schedule can be found here: https://www.python.org/dev/peps/pep-0494/ Enjoy! 
-- Ned Deily nad at python.org -- [] From victor.stinner at gmail.com Tue Oct 3 16:56:26 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 3 Oct 2017 22:56:26 +0200 Subject: [Python-Dev] [RELEASE] Python 3.6.3 is now available In-Reply-To: <3D8F5841-F970-4C00-9A49-B4F8FD24CF79@python.org> References: <3D8F5841-F970-4C00-9A49-B4F8FD24CF79@python.org> Message-ID: Hi, Good news: Python 3.6.3 has no more known security vulnerabilities ;-) Python 3.6.3 fixes two security vulnerabilities: "urllib FTP protocol stream injection" https://python-security.readthedocs.io/vuln/urllib_ftp_protocol_stream_injection.html "Expat 2.2.3" (don't impact Linux, since Linux distros use the system expat library) https://python-security.readthedocs.io/vuln/expat_2.2.3.html Note: I'm not sure that the vulnerabilities fixed in Expat 2.2.2 and Expat 2.2.3 really impacted Python, since Python uses its own entropy source to set the "hash secret", but well, it's usually safer to use a more recent library version :-) Victor 2017-10-03 22:06 GMT+02:00 Ned Deily : > On behalf of the Python development community and the Python 3.6 > release team, I am happy to announce the availability of Python 3.6.3, > the third maintenance release of Python 3.6. Detailed information > about the changes made in 3.6.3 can be found in the change log here: > > https://docs.python.org/3.6/whatsnew/changelog.html#python-3-6-3-final > > Please see "What?s New In Python 3.6" for more information about the > new features in Python 3.6: > > https://docs.python.org/3.6/whatsnew/3.6.html > > You can download Python 3.6.3 here: > > https://www.python.org/downloads/release/python-363/ > > The next maintenance release of Python 3.6 is expected to follow in > about 3 months, around the end of 2017-12. More information about the > 3.6 release schedule can be found here: > > https://www.python.org/dev/peps/pep-0494/ > > Enjoy! > > -- > Ned Deily > nad at python.org -- [] > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com From ncoghlan at gmail.com Wed Oct 4 00:24:10 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 4 Oct 2017 14:24:10 +1000 Subject: [Python-Dev] Python startup optimization: script vs. service In-Reply-To: References: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org> <2418C6BB-2AB0-4407-ADE6-ED795DD0B965@gmail.com> <213d0382-dfdd-dfc5-2e1e-3b408acac256@python.org> <9E4EA9B9-E7AD-4E57-8284-CF4B3D69B5D0@python.org> Message-ID: On 3 October 2017 at 03:02, Christian Heimes wrote: > On 2017-10-02 16:59, Barry Warsaw wrote: >> On Oct 2, 2017, at 10:48, Christian Heimes wrote: >>> >>> That approach could work, but I think that it is the wrong approach. I'd >>> rather keep Python optimized for long-running processes and introduce a >>> new mode / option to optimize for short-running scripts. >> >> What would that look like, how would it be invoked, and how would that change the behavior of the interpreter? > > I haven't given it much thought yet. Here are just some wild ideas: > > - add '-l' command line option (l for lazy) > - in lazy mode, delay some slow operations (re compile, enum, ...) > - delay some imports in lazy mode, e.g. 
with a deferred import proxy I don't think that is the right way to structure the defaults, since the web services world is in the middle of moving back closer to the CLI/CGI model, where a platform like AWS Lambda will take care of spinning up language interpreter instances on demand, using them to process a single request, and then discarding them. It's also somewhat unreliable to pass command line options as part of shebang lines, and packaging tools need to be able to generate shebang lines that are compatible with a wide variety of Python implementations. By contrast, long running Python services will typically be using some form of WSGI server (whether that's mod_wsgi, uWSGI, gunicorn, Twisted, tornado, or something else) that can choose to adjust *their* defaults to force the underlying language runtime into an "eager state initialisation" mode, even if the default setting is to respect requests for lazy initialisation. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Oct 4 01:07:31 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 4 Oct 2017 15:07:31 +1000 Subject: [Python-Dev] Inheritance vs composition in backcompat (PEP521) In-Reply-To: References: Message-ID: On 3 October 2017 at 03:13, Koos Zevenhoven wrote: > Well, it's not completely unrelated to that. The problem I'm talking about > is perhaps most easily seen from a simple context manager wrapper that uses > composition instead of inheritance: > > class Wrapper: > def __init__(self): > self._wrapped = SomeContextManager() > > def __enter__(self): > print("Entering context") > return self._wrapped.__enter__() > > def __exit__(self): > self._wrapped.__exit__() > print("Exited context") > > > Now, if the wrapped contextmanager becomes a PEP 521 one with __suspend__ > and __resume__, the Wrapper class is broken, because it does not respect > __suspend__ and __resume__. So actually this is a backwards compatiblity > issue. This is a known problem, and one of the main reasons that having a truly transparent object proxy like https://wrapt.readthedocs.io/en/latest/wrappers.html#object-proxy as part of the standard library would be highly desirable. Actually getting such a proxy defined, implemented, and integrated isn't going to be easy though, so while Graham (Dumpleton, the author of wrapt) is generally amenable to the idea, he doesn't have the time or inclination to do that work himself. In the meantime, we mostly work around the problem by defining new protocols rather than extending existing ones, but it still means it takes longer than it otherwise would for full support for new interfaces to ripple out through various object proxying libraries (especially for hard-to-proxy protocols like the new asynchronous ones that require particular methods to be defined as coroutines). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Oct 4 01:36:37 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 4 Oct 2017 15:36:37 +1000 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: Message-ID: On 3 October 2017 at 11:31, Eric Snow wrote: > There shouldn't be a need to synchronize on INCREF. If both > interpreters have at least 1 reference then either one adding a > reference shouldn't be a problem. If only one interpreter has a > reference then the other won't be adding any references. If neither > has a reference then neither is going to add any references. Perhaps > I've missed something.
Under what circumstances would INCREF happen > while the refcount is 0? The problem relates to the fact that there aren't any memory barriers around CPython's INCREF operations (they're implemented as an ordinary C post-increment operation), so you can get the following scenario: * thread on CPU A has the sole reference (ob_refcnt=1) * thread on CPU B acquires a new reference, but hasn't pushed the updated ob_refcnt value back to the shared memory cache yet * original thread on CPU A drops its reference, *thinks* the refcnt is now zero, and deletes the object * bad things now happen in CPU B as the thread running there tries to use a deleted object :) The GIL currently protects us from this, as switching CPUs requires switching threads, which means the original thread has to release the GIL (flushing all of its state changes to the shared cache), and the new thread has to acquire it (hence refreshing its local cache from the shared one). The need to switch all incref/decref operations over to using atomic thread-safe primitives when removing the GIL is one of the main reasons that attempting to remove the GIL *within* an interpreter is expensive (and why Larry et al are having to explore completely different ref count management strategies for the GILectomy). By contrast, if you rely on a new memoryview variant to mediate all data sharing between interpreters, then you can make sure that *it* is using synchronisation primitives as needed to ensure the required cache coherency across different CPUs, without any negative impacts on regular single interpreter code (which can still rely on the cache coherency guarantees provided by the GIL). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Wed Oct 4 05:52:32 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 4 Oct 2017 11:52:32 +0200 Subject: [Python-Dev] Reorganize Python categories (Core, Library, ...)? Message-ID: Hi, Python uses a few categories to group bugs (on bugs.python.org) and NEWS entries (in the Python changelog). List used by the blurb tool: #.. section: Security #.. section: Core and Builtins #.. section: Library #.. section: Documentation #.. section: Tests #.. section: Build #.. section: Windows #.. section: macOS #.. section: IDLE #.. section: Tools/Demos #.. section: C API My problem is that almost all changes go into "Library" category. When I read long changelogs, it's sometimes hard to identify quickly the context (ex: impacted modules) of a change. It's also hard to find open bugs of a specific module on bugs.python.org, since almost all bugs are in the very generic "Library" category. Using full text returns "false positives". I would prefer to see more specific categories like: * Buildbots: only issues specific to buildbots * Networking: socket, asyncio, asyncore, asynchat modules * Security: ssl module but also vulnerabilities in any other part of CPython -- we already added a Security category in NEWS/blurb * Parallelim: multiprocessing and concurrent.futures modules It's hard to find categories generic enough to not only contain a single item, but not contain too many items neither. Other ideas: * XML: xml.doc, xml.etree, xml.parsers, xml.sax modules * Import machinery: imp and importlib modules * Typing: abc and typing modules The best would be to have a mapping of a module name into a category, and make sure that all modules have a category. 
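As a rough illustration of that idea, the mapping could be as simple as a dict keyed by top-level package name, with anything unknown falling back to the generic "Library" bucket. The categories and module choices below are only examples to show the shape of it, not a proposed final list:

CATEGORY_BY_MODULE = {
    "socket": "Networking",
    "asyncio": "Networking",
    "ssl": "Security",
    "multiprocessing": "Parallelism",
    "concurrent": "Parallelism",
    "xml": "XML",
    "importlib": "Import machinery",
    "typing": "Typing",
}

def category_for(module_name):
    # Look up by the top-level package name; unknown modules fall back
    # to the generic "Library" bucket.
    top_level = module_name.split(".")[0]
    return CATEGORY_BY_MODULE.get(top_level, "Library")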
We might try to count the number of commits and NEWS entries of the last 12 months to decide if a category has the correct size. I don't think that we need a distinct categoy for each module. We can put many uncommon modules in a generic category. By the way, we need maybe also a new "module name" field in the bug tracker. But then comes the question of normalizing module names. For example, should "email.message" be normalized to "email"? Maybe store "email.message" but use "email" for search, display the module in the issue title, etc. Victor From k7hoven at gmail.com Wed Oct 4 06:22:44 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Wed, 4 Oct 2017 13:22:44 +0300 Subject: [Python-Dev] Inheritance vs composition in backcompat (PEP521) In-Reply-To: References: Message-ID: On Wed, Oct 4, 2017 at 8:07 AM, Nick Coghlan wrote: > On 3 October 2017 at 03:13, Koos Zevenhoven wrote: > > Well, it's not completely unrelated to that. The problem I'm talking > about > > is perhaps most easily seen from a simple context manager wrapper that > uses > > composition instead of inheritance: > > > > class Wrapper: > > def __init__(self): > > self._wrapped = SomeContextManager() > > > > def __enter__(self): > > print("Entering context") > > return self._wrapped.__enter__() > > > > def __exit__(self): > > self._wrapped.__exit__() > > print("Exited context") > > > > > > Now, if the wrapped contextmanager becomes a PEP 521 one with __suspend__ > > and __resume__, the Wrapper class is broken, because it does not respect > > __suspend__ and __resume__. So actually this is a backwards compatiblity > > issue. > > This is a known problem, and one of the main reasons that having a > truly transparent object proxy like > https://wrapt.readthedocs.io/en/latest/wrappers.html#object-proxy as > part of the standard library would be highly desirable. > > This is barely related to the problem I describe. The wrapper is not supposed to pretend to *be* the underlying object. It's just supposed to extend its functionality. Maybe it's just me, but using a transparent object proxy for this sounds like someone trying to avoid inheritance for no reason and at any cost. Inheritance probably has faster method access, and makes it more obvious what's going on: def Wrapper(contextmanager): class Wrapper(type(contextmanager)): def __enter__(self): print("Entering context") return contextmanager.__enter__() def __exit__(self): contextmanager.__exit__() print("Exited context") return Wrapper() A wrapper based on a transparent object proxy is just a non-transparent replacement for inheritance. Its wrapper nature is non-transparent because it pretends to `be` the original object, while it's actually a wrapper. But an object cannot `be` another object as long as the `is` operator won't return True. And any straightforward way to implement that would add performance overhead for normal objects. I do remember sometimes wanting a transparent object proxy. But not for normal wrappers. But I don't think I've gone as far as looking for a library to do that, because it seems that you can only go half way anyway. ??Koos ?-- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian at python.org Wed Oct 4 07:52:53 2017 From: brian at python.org (Brian Curtin) Date: Wed, 4 Oct 2017 07:52:53 -0400 Subject: [Python-Dev] Reorganize Python categories (Core, Library, ...)? 
In-Reply-To: References: Message-ID: On Wed, Oct 4, 2017 at 5:52 AM, Victor Stinner wrote: > Hi, > > Python uses a few categories to group bugs (on bugs.python.org) and > NEWS entries (in the Python changelog). List used by the blurb tool: > > #.. section: Security > #.. section: Core and Builtins > #.. section: Library > #.. section: Documentation > #.. section: Tests > #.. section: Build > #.. section: Windows > #.. section: macOS > #.. section: IDLE > #.. section: Tools/Demos > #.. section: C API > > My problem is that almost all changes go into "Library" category. When > I read long changelogs, it's sometimes hard to identify quickly the > context (ex: impacted modules) of a change. > > It's also hard to find open bugs of a specific module on > bugs.python.org, since almost all bugs are in the very generic > "Library" category. Using full text returns "false positives". > > I would prefer to see more specific categories like: > > * Buildbots: only issues specific to buildbots > I would expect anything listed under buildbot to be about infrastructure changes related to the running of build machines. I think what you're getting at are the bugs that appear on build machines that weren't otherwise caught during the development of a recent change. In the end those are still just bugs in code, so I'm not sure I would group them at such a high level. Wouldn't this be a better use of the priority field? -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Oct 4 08:33:57 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 4 Oct 2017 22:33:57 +1000 Subject: [Python-Dev] Inheritance vs composition in backcompat (PEP521) In-Reply-To: References: Message-ID: On 4 October 2017 at 20:22, Koos Zevenhoven wrote: > On Wed, Oct 4, 2017 at 8:07 AM, Nick Coghlan wrote: >> >> On 3 October 2017 at 03:13, Koos Zevenhoven wrote: >> > Well, it's not completely unrelated to that. The problem I'm talking >> > about >> > is perhaps most easily seen from a simple context manager wrapper that >> > uses >> > composition instead of inheritance: >> > >> > class Wrapper: >> > def __init__(self): >> > self._wrapped = SomeContextManager() >> > >> > def __enter__(self): >> > print("Entering context") >> > return self._wrapped.__enter__() >> > >> > def __exit__(self): >> > self._wrapped.__exit__() >> > print("Exited context") >> > >> > >> > Now, if the wrapped contextmanager becomes a PEP 521 one with >> > __suspend__ >> > and __resume__, the Wrapper class is broken, because it does not respect >> > __suspend__ and __resume__. So actually this is a backwards compatiblity >> > issue. >> >> This is a known problem, and one of the main reasons that having a >> truly transparent object proxy like >> https://wrapt.readthedocs.io/en/latest/wrappers.html#object-proxy as >> part of the standard library would be highly desirable. >> > > This is barely related to the problem I describe. The wrapper is not > supposed to pretend to *be* the underlying object. It's just supposed to > extend its functionality. If a wrapper *isn't* trying to act as a transparent object proxy, and is instead adapting it to a particular protocol, then yes, you'll need to update the wrapper when the protocol is extended. That's not a backwards compatibility problem, because the only way to encounter it is to update your code to rely on the new extended protocol - your *existing* code will continue to work fine, since that, by definition, can't be relying on the new protocol extension. 
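For what it's worth, here is a rough sketch of what "updating the wrapper" would mean for the Wrapper example quoted above, assuming the __suspend__/__resume__ hooks behave as described in this thread. SomeContextManager stands for any wrapped context manager, as in the original example, and the standard __exit__ signature is used:

    class Wrapper:
        def __init__(self):
            self._wrapped = SomeContextManager()

        def __enter__(self):
            print("Entering context")
            return self._wrapped.__enter__()

        def __exit__(self, *exc_info):
            self._wrapped.__exit__(*exc_info)
            print("Exited context")

        # Forward the new optional hooks only when the wrapped object
        # actually defines them.
        def __suspend__(self):
            suspend = getattr(self._wrapped, "__suspend__", None)
            if suspend is not None:
                suspend()

        def __resume__(self):
            resume = getattr(self._wrapped, "__resume__", None)
            if resume is not None:
                resume()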
Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Wed Oct 4 08:36:32 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 4 Oct 2017 14:36:32 +0200 Subject: [Python-Dev] Reorganize Python categories (Core, Library, ...)? References: Message-ID: <20171004143632.668fbc0a@fsol> On Wed, 4 Oct 2017 11:52:32 +0200 Victor Stinner wrote: > > It's also hard to find open bugs of a specific module on > bugs.python.org, since almost all bugs are in the very generic > "Library" category. Using full text returns "false positives". > > I would prefer to see more specific categories like: > > * Buildbots: only issues specific to buildbots > * Networking: socket, asyncio, asyncore, asynchat modules > * Security: ssl module but also vulnerabilities in any other part of > CPython -- we already added a Security category in NEWS/blurb > * Parallelim: multiprocessing and concurrent.futures modules This is mixing different taxonomies and will make things ambiguous. If there's a crash in socket.sendmsg() that affects mainly multiprocessing, should it be in "Networking", "Security" or "Parallelism"? If there's a bug where SSLSocket.recvinto() doesn't accept some writable buffers, is it "Networking" or "Security"? etc. I agree with making the "Library" section finer-grained, but then shouldn't the subsection be simply the top-level module/package name? (e.g. "collections", "xml", "logging", "asyncio", "concurrent"...) Also, perhaps the "blurb" tool can suggest a category depending on which stdlib files were modified, though there must be an easy way for the committer to override that choice. > I don't think that we need a distinct categoy for each module. We can > put many uncommon modules in a generic category. What is the problem with having a distinct category for each module? At worse, the logic which generates Docs from blurb files can merge some categories together if desired. There's no problem with having a very fine-grained categorization *on disk*, since the presentation can be made different. OTOH if the categorization is coarse-grained on disk (such is the case currently), the presentation layer can't recreate the information that was lost when committing. Regards Antoine. From k7hoven at gmail.com Wed Oct 4 08:45:15 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Wed, 4 Oct 2017 15:45:15 +0300 Subject: [Python-Dev] Inheritance vs composition in backcompat (PEP521) In-Reply-To: References: Message-ID: On Wed, Oct 4, 2017 at 3:33 PM, Nick Coghlan wrote: > On 4 October 2017 at 20:22, Koos Zevenhoven wrote: > > On Wed, Oct 4, 2017 at 8:07 AM, Nick Coghlan wrote: > >> > >> On 3 October 2017 at 03:13, Koos Zevenhoven wrote: > >> > Well, it's not completely unrelated to that. The problem I'm talking > >> > about > >> > is perhaps most easily seen from a simple context manager wrapper that > >> > uses > >> > composition instead of inheritance: > >> > > >> > class Wrapper: > >> > def __init__(self): > >> > self._wrapped = SomeContextManager() > >> > > >> > def __enter__(self): > >> > print("Entering context") > >> > return self._wrapped.__enter__() > >> > > >> > def __exit__(self): > >> > self._wrapped.__exit__() > >> > print("Exited context") > >> > > >> > > >> > Now, if the wrapped contextmanager becomes a PEP 521 one with > >> > __suspend__ > >> > and __resume__, the Wrapper class is broken, because it does not > respect > >> > __suspend__ and __resume__. So actually this is a backwards > compatiblity > >> > issue. 
> >> > >> This is a known problem, and one of the main reasons that having a > >> truly transparent object proxy like > >> https://wrapt.readthedocs.io/en/latest/wrappers.html#object-proxy as > >> part of the standard library would be highly desirable. > >> > > > > This is barely related to the problem I describe. The wrapper is not > > supposed to pretend to *be* the underlying object. It's just supposed to > > extend its functionality. > > If a wrapper *isn't* trying to act as a transparent object proxy, and > is instead adapting it to a particular protocol, then yes, you'll need > to update the wrapper when the protocol is extended. > > ?Yes, but it still means that the change in the dependency (in this case a standard Python protocol) breaks the wrapper code.? Remember that the wrappeR class and the wrappeD class can be implemented in different libraries. > That's not a backwards compatibility problem, because the only way to > encounter it is to update your code to rely on the new extended > protocol - your *existing* code will continue to work fine, since > that, by definition, can't be relying on the new protocol extension. > > ?No, not all code is "your" code. Clearly this is not a well-known problem. This is a backwards-compatibility problem for the author of the wrappeR, not for the author of the wrappeD object. ??Koos ? -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Oct 4 09:04:42 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 4 Oct 2017 23:04:42 +1000 Subject: [Python-Dev] Inheritance vs composition in backcompat (PEP521) In-Reply-To: References: Message-ID: On 4 October 2017 at 22:45, Koos Zevenhoven wrote: > On Wed, Oct 4, 2017 at 3:33 PM, Nick Coghlan wrote: >> That's not a backwards compatibility problem, because the only way to >> encounter it is to update your code to rely on the new extended >> protocol - your *existing* code will continue to work fine, since >> that, by definition, can't be relying on the new protocol extension. >> > > No, not all code is "your" code. Clearly this is not a well-known problem. > This is a backwards-compatibility problem for the author of the wrappeR, not > for the author of the wrappeD object. No, you're misusing the phrase "backwards compatibility", and confusing it with "feature enablement". Preserving backwards compatibility just means "existing code and functionality don't break". It has nothing to do with whether or not other support libraries and frameworks might need to change in order to enable full access to a new language feature. Take the length hint protocol defined in PEP 424 for example: that extended the iterator protocol to include a new optional __length_hint__ method, such that container constructors can make a more reasonable guess as to how much space they should pre-allocate when being initialised from an iterator or iterable rather than another container. That protocol means that many container wrappers break the optimisation. That's not a compatibility problem, it just means those wrappers don't support the feature, and it would potentially be a useful enhancement if they did. Similarly, when context managers were added, folks needed to add appropriate implementations of the protocol in order to be able to actually make use of the feature. If a library didn't support it natively, then they either needed to write their own context manager, or else contribute an enhancement to that library. 
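As a purely illustrative example of the __length_hint__ case, a hypothetical iterator wrapper regains the pre-allocation optimisation simply by forwarding the hint, e.g. via operator.length_hint():

    import operator

    class LoggingIterator:
        """Hypothetical wrapper that logs each item it yields."""

        def __init__(self, iterable):
            self._it = iter(iterable)

        def __iter__(self):
            return self

        def __next__(self):
            value = next(self._it)
            print("yielding", value)
            return value

        def __length_hint__(self):
            # Forward the optional PEP 424 hint so that, for example,
            # list(LoggingIterator(range(10000))) can still pre-allocate.
            # The second argument is the fallback when no hint is available.
            return operator.length_hint(self._it, 0)

Without the last method the wrapper still works correctly, it just loses the optimisation - which is exactly the distinction being drawn here.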
This pattern applies whenever a new protocol is added or an existing protocol is extended: whether or not you can actually rely on the new feature will depend on whether or not all your dependencies also support it. The best case scenarios are those where we can enable a new feature in a few key standard library APIs, and then most third party APIs will transparently pick up the new behaviour (e.g. as we did for the fspath protocol). However, even in situations like that, there may still be other code that makes no longer correct assumptions, and blocks access to the new feature (e.g. by including an explicit isinstance check). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Wed Oct 4 09:22:48 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 4 Oct 2017 15:22:48 +0200 Subject: [Python-Dev] Reorganize Python categories (Core, Library, ...)? In-Reply-To: <20171004143632.668fbc0a@fsol> References: <20171004143632.668fbc0a@fsol> Message-ID: 2017-10-04 14:36 GMT+02:00 Antoine Pitrou : > If there's a crash in socket.sendmsg() that affects mainly > multiprocessing, should it be in "Networking", "Security" or > "Parallelism"? bugs.python.org allows you to select zero or *multiple* categories :-) It's common for a bug's categories to evolve. For example, a buildbot issue is first tagged as "Tests", but then moves into the correct category once the problem is better understood. > If there's a bug where SSLSocket.recvinto() doesn't > accept some writable buffers, is it "Networking" or "Security"? etc. Usually, when we reach the final fix, it becomes much easier to pick the correct category. Between Networking and Security, Security wins since it's more important to list security fixes first in the changelog. > I agree with making the "Library" section finer-grained, but then > shouldn't the subsection be simply the top-level module/package name? > (e.g. "collections", "xml", "logging", "asyncio", "concurrent"...) Yeah, that's another option. I don't know how to solve the problem, I just listed the issues I have with the bug tracker and the changelog :-) > Also, perhaps the "blurb" tool can suggest a category depending on > which stdlib files were modified, though there must be an easy way for > the committer to override that choice. This is why a mapping module name => category would help, yes. > What is the problem with having a distinct category for each module? > At worse, the logic which generates Docs from blurb files can merge > some categories together if desired. There's no problem with having a > very fine-grained categorization *on disk*, since the presentation can > be made different. OTOH if the categorization is coarse-grained on > disk (such is the case currently), the presentation layer can't > recreate the information that was lost when committing. Technically, blurb is currently limited to a main category written in the filename. Maybe blurb could evolve to store the modified module (or modules?) to infer the category? I don't know. At least in the bug tracker, I would prefer to have the module name *and* a distinct list of categories. As I wrote, as the analysis progresses, the module name can change, and so can the categories. Yesterday, I analyzed a bug in test_cgitb. In fact, the bug was in test_imp, something completely different :-) test_imp has side effects, causing a bug in test_cgitb.
Victor From barry at python.org Wed Oct 4 09:39:21 2017 From: barry at python.org (Barry Warsaw) Date: Wed, 4 Oct 2017 09:39:21 -0400 Subject: [Python-Dev] Reorganize Python categories (Core, Library, ...)? In-Reply-To: References: Message-ID: On Oct 4, 2017, at 05:52, Victor Stinner wrote: > My problem is that almost all changes go into "Library" category. When > I read long changelogs, it's sometimes hard to identify quickly the > context (ex: impacted modules) of a change. > > It's also hard to find open bugs of a specific module on > bugs.python.org, since almost all bugs are in the very generic > "Library" category. Using full text returns "false positives". > > It's hard to find categories generic enough to not only contain a > single item, but not contain too many items neither. Other ideas: > > * XML: xml.doc, xml.etree, xml.parsers, xml.sax modules > * Import machinery: imp and importlib modules > * Typing: abc and typing modules I often run into the same problem. If we?re going to split up the Library section, then I think it makes sense to follow the top-level organization of the library manual: https://docs.python.org/3/library/index.html That already provides a mapping from module to category, and for the most part it?s a taxonomy that makes sense and is time proven. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From ericsnowcurrently at gmail.com Wed Oct 4 09:51:26 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Wed, 4 Oct 2017 07:51:26 -0600 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: Message-ID: On Tue, Oct 3, 2017 at 11:36 PM, Nick Coghlan wrote: > The problem relates to the fact that there aren't any memory barriers > around CPython's INCREF operations (they're implemented as an ordinary > C post-increment operation), so you can get the following scenario: > > * thread on CPU A has the sole reference (ob_refcnt=1) > * thread on CPU B acquires a new reference, but hasn't pushed the > updated ob_refcnt value back to the shared memory cache yet > * original thread on CPU A drops its reference, *thinks* the refcnt is > now zero, and deletes the object > * bad things now happen in CPU B as the thread running there tries to > use a deleted object :) I'm not clear on where we'd run into this problem with channels. Mirroring your scenario: * interpreter A (in thread on CPU A) INCREFs the object (the GIL is still held) * interp A sends the object to the channel * interp B (in thread on CPU B) receives the object from the channel * the new reference is held until interp B DECREFs the object >From what I see, at no point do we get a refcount of 0, such that there would be a race on the object being deleted. The only problem I'm aware of (it dawned on me last night), is in the case that the interpreter that created the object gets deleted before the object does. In that case we can't pass the deletion back to the original interpreter. (I don't think this problem is necessarily exclusive to the solution I've proposed for Bytes.) 
-eric From k7hoven at gmail.com Wed Oct 4 09:51:23 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Wed, 4 Oct 2017 16:51:23 +0300 Subject: [Python-Dev] Inheritance vs composition in backcompat (PEP521) In-Reply-To: References: Message-ID: On Wed, Oct 4, 2017 at 4:04 PM, Nick Coghlan wrote: > On 4 October 2017 at 22:45, Koos Zevenhoven wrote: > > On Wed, Oct 4, 2017 at 3:33 PM, Nick Coghlan wrote: > >> That's not a backwards compatibility problem, because the only way to > >> encounter it is to update your code to rely on the new extended > >> protocol - your *existing* code will continue to work fine, since > >> that, by definition, can't be relying on the new protocol extension. > >> > > > > No, not all code is "your" code. Clearly this is not a well-known > problem. > > This is a backwards-compatibility problem for the author of the wrappeR, > not > > for the author of the wrappeD object. > > No, you're misusing the phrase "backwards compatibility", and > confusing it with "feature enablement". > > Preserving backwards compatibility just means "existing code and > functionality don't break". It has nothing to do with whether or not > other support libraries and frameworks might need to change in order > to enable full access to a new language feature. > > ?It's not about full access to a new language feature. It's about the wrappeR promising it can wrap any ?context manager, which it then no longer can. If the __suspend__ and __resume__ methods are ignored, that is not about "not having full access to a new feature" ? that's broken code. The error message you get (if any) may not contain any hint of what went wrong. Take the length hint protocol defined in PEP 424 for example: that > extended the iterator protocol to include a new optional > __length_hint__ method, such that container constructors can make a > more reasonable guess as to how much space they should pre-allocate > when being initialised from an iterator or iterable rather than > another container. > > ?This is slightly similar, but not really. Not using __length_hint__ does not affect the correctness of code. > That protocol means that many container wrappers break the > optimisation. That's not a compatibility problem, it just means those > wrappers don't support the feature, and it would potentially be a > useful enhancement if they did. > > ?Again, ignoring __length_hint__ does not lead to broken code, so that just means the wrapper is as slow or as fast as it was before. ?So I still think it's an issue for the author of the wrapper to fix??even if just by documenting that the wrapper does not support the new protocol members. But that would not be necessary if the wrapper uses inheritance. (Of course there may be another reason to not use inheritance, but just overriding two methods seems like a good case for inheritance.). ? ?This discussion seems pretty pointless by now. It's true that *some* code needs to change for this to be a problem. Updating only the Python version does not break a codebase if libraries aren't updated, and even then, breakage is not very likely, I suppose. It all depends on the kind of change that is made. For __length_hint__, you only risk not getting the performance improvement. For __suspend__ and __resume__, there's a small chance of problems. For some other change, it might be even riskier. But this is definitely not the most dangerous type of compatibility issue. 
??Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Wed Oct 4 10:14:22 2017 From: barry at python.org (Barry Warsaw) Date: Wed, 4 Oct 2017 10:14:22 -0400 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: <1507051763.2766090.1126540976.75FD41A3@webmail.messagingengine.com> References: <1507051763.2766090.1126540976.75FD41A3@webmail.messagingengine.com> Message-ID: On Oct 3, 2017, at 13:29, Benjamin Peterson wrote: > I'm not sure turning the implementation details of our internal formats > into APIs is the way to go. I still think an API in the stdlib would be useful and appropriate, but it?s not like this couldn?t be done as a 3rd party module. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From gvanrossum at gmail.com Wed Oct 4 11:08:50 2017 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed, 4 Oct 2017 08:08:50 -0700 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: References: Message-ID: On Oct 3, 2017 9:55 AM, "Serhiy Storchaka" wrote: While PEP 552 is accepted, I would want to see some changes. 1. Increase the size of the constant part of the signature to at least 32 bits. Currently only the third and forth bytes are constant, and they are '\r\n', that is often occurred in text files. The first two bytes can be different in every Python version. This make hard detecting pyc files by utilities like file (1). 2. Split the "version" of pyc files by "major" and "minor" parts. Every major version is incompatible with other major versions, the interpreter accepts only one particular major version. It can't be changed in a bugfix release. But all minor versions inside the same major version are forward and backward compatible. The interpreter should be able to execute pyc file with arbitrary minor version, but it can use minor version of pyc file to handle errors in older versions. Minor version can be changed in a bugfix release. I hope this can help us with issues like https://bugs.python.org/issue29537. Currently 3.5 supports two magic numbers. If we change the pyc format, it would be easy to make the above changes. IIUC the PEP doesn't commit to any particular magic word format, so this can be negotiated separately, on the tracker (unless there's a PEP specifying the internal structure of the magic word?). --Guido -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Wed Oct 4 11:50:33 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 4 Oct 2017 17:50:33 +0200 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) References: Message-ID: <20171004175033.2e42d3a8@fsol> On Mon, 2 Oct 2017 21:31:30 -0400 Eric Snow wrote: > > > By contrast, if we allow an actual bytes object to be shared, then > > either every INCREF or DECREF on that bytes object becomes a > > synchronisation point, or else we end up needing some kind of > > secondary per-interpreter refcount where the interpreter doesn't drop > > its shared reference to the original object in its source interpreter > > until the internal refcount in the borrowing interpreter drops to > > zero. > > There shouldn't be a need to synchronize on INCREF. 
If both > interpreters have at least 1 reference then either one adding a > reference shouldn't be a problem. I'm not sure what Nick meant by "synchronization point", but at least you certainly need INCREF and DECREF to be atomic, which is a departure from today's Py_INCREF / Py_DECREF behaviour (and is significantly slower, even on high-level benchmarks). Regards Antoine. From k7hoven at gmail.com Wed Oct 4 11:54:39 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Wed, 4 Oct 2017 18:54:39 +0300 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: Message-ID: On Wed, Oct 4, 2017 at 4:51 PM, Eric Snow wrote: > On Tue, Oct 3, 2017 at 11:36 PM, Nick Coghlan wrote: > > The problem relates to the fact that there aren't any memory barriers > > around CPython's INCREF operations (they're implemented as an ordinary > > C post-increment operation), so you can get the following scenario: > > > > * thread on CPU A has the sole reference (ob_refcnt=1) > > * thread on CPU B acquires a new reference, but hasn't pushed the > > updated ob_refcnt value back to the shared memory cache yet > > * original thread on CPU A drops its reference, *thinks* the refcnt is > > now zero, and deletes the object > > * bad things now happen in CPU B as the thread running there tries to > > use a deleted object :) > > I'm not clear on where we'd run into this problem with channels. > Mirroring your scenario: > > * interpreter A (in thread on CPU A) INCREFs the object (the GIL is still > held) > * interp A sends the object to the channel > * interp B (in thread on CPU B) receives the object from the channel > * the new reference is held until interp B DECREFs the object > > From what I see, at no point do we get a refcount of 0, such that > there would be a race on the object being deleted. > > ? So what you're saying is that when Larry finishes the gilectomy, subinterpreters will work GIL-free too??-) ???Koos ? The only problem I'm aware of (it dawned on me last night), is in the > case that the interpreter that created the object gets deleted before > the object does. In that case we can't pass the deletion back to the > original interpreter. (I don't think this problem is necessarily > exclusive to the solution I've proposed for Bytes.) > > -eric > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > k7hoven%40gmail.com > -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Wed Oct 4 11:51:18 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 4 Oct 2017 17:51:18 +0200 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) References: <1507051763.2766090.1126540976.75FD41A3@webmail.messagingengine.com> Message-ID: <20171004175118.66c1b46e@fsol> On Wed, 4 Oct 2017 10:14:22 -0400 Barry Warsaw wrote: > On Oct 3, 2017, at 13:29, Benjamin Peterson wrote: > > > I'm not sure turning the implementation details of our internal formats > > into APIs is the way to go. > > I still think an API in the stdlib would be useful and appropriate, but it?s not like this couldn?t be done as a 3rd party module. It can also be an implementation-specific API for which we don't guarantee anything in the future. The consenting adults rule would apply. Regards Antoine. 
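To illustrate the kind of thing such an implementation-specific (or third party) API might cover, here is a small sketch. The helper name is invented, and it reads today's pre-PEP 552 header layout: the 4-byte magic number followed by two little-endian 32-bit fields for the source mtime and the source size:

    import struct
    from importlib.util import MAGIC_NUMBER

    def read_pyc_header(path):
        # Hypothetical helper, not an existing stdlib function.
        with open(path, "rb") as f:
            header = f.read(12)
        if header[:4] != MAGIC_NUMBER:
            raise ValueError("pyc was compiled by a different Python version")
        mtime, source_size = struct.unpack("<II", header[4:12])
        return mtime, source_size

Anything beyond that (for example the new fields proposed by PEP 552) is exactly the part that would stay unguaranteed between releases.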
From solipsis at pitrou.net Wed Oct 4 11:55:09 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 4 Oct 2017 17:55:09 +0200 Subject: [Python-Dev] Reorganize Python categories (Core, Library, ...)? In-Reply-To: References: <20171004143632.668fbc0a@fsol> Message-ID: <20171004175509.78a46799@fsol> On Wed, 4 Oct 2017 15:22:48 +0200 Victor Stinner wrote: > 2017-10-04 14:36 GMT+02:00 Antoine Pitrou : > > If there's a crash in socket.sendmsg() that affects mainly > > multiprocessing, should it be in "Networking", "Security" or > > "Parallelism"? > > bugs.python.org allows you to select zero or *multiple* categories :-) I'm getting confused. Are you talking about NEWS file categories or bugs.python.org categories? > > If there's a bug where SSLSocket.recvinto() doesn't > > accept some writable buffers, is it "Networking" or "Security"? etc. > > Usually, when the reach the final fix, it becomes much easier to pick > the correct category. There is no definite "correct category" when you're mixing different classification schemes (what kind of bug it is -- bug/security/enhancement/etc. --, what functional domain it pertains to -- networking/concurrency/etc. --, which stdlib API it affects). That's the problem I was pointing to. Regards Antoine. From solipsis at pitrou.net Wed Oct 4 11:59:28 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 4 Oct 2017 17:59:28 +0200 Subject: [Python-Dev] Reorganize Python categories (Core, Library, ...)? References: Message-ID: <20171004175928.00c63fa6@fsol> On Wed, 4 Oct 2017 09:39:21 -0400 Barry Warsaw wrote: > On Oct 4, 2017, at 05:52, Victor Stinner wrote: > > > My problem is that almost all changes go into "Library" category. When > > I read long changelogs, it's sometimes hard to identify quickly the > > context (ex: impacted modules) of a change. > > > > It's also hard to find open bugs of a specific module on > > bugs.python.org, since almost all bugs are in the very generic > > "Library" category. Using full text returns "false positives". > > > > It's hard to find categories generic enough to not only contain a > > single item, but not contain too many items neither. Other ideas: > > > > * XML: xml.doc, xml.etree, xml.parsers, xml.sax modules > > * Import machinery: imp and importlib modules > > * Typing: abc and typing modules > > I often run into the same problem. If we?re going to split up the Library section, then I think it makes sense to follow the top-level organization of the library manual: > > https://docs.python.org/3/library/index.html I think I'd rather type the module name than have to look up the proper category in the documentation. IOW, the module name -> category mapping alluded to by Victor would need to exist somewhere in programmatic (or machine-readable) form. But then we might as well store the actual module name in the NEWS files and do the mapping when generating the presentation :-) Regards Antoine. 
From solipsis at pitrou.net Wed Oct 4 12:02:07 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 4 Oct 2017 18:02:07 +0200 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) References: <20171004175033.2e42d3a8@fsol> Message-ID: <20171004180207.5d9bc7e9@fsol> On Wed, 4 Oct 2017 17:50:33 +0200 Antoine Pitrou wrote: > On Mon, 2 Oct 2017 21:31:30 -0400 > Eric Snow wrote: > > > > > By contrast, if we allow an actual bytes object to be shared, then > > > either every INCREF or DECREF on that bytes object becomes a > > > synchronisation point, or else we end up needing some kind of > > > secondary per-interpreter refcount where the interpreter doesn't drop > > > its shared reference to the original object in its source interpreter > > > until the internal refcount in the borrowing interpreter drops to > > > zero. > > > > There shouldn't be a need to synchronize on INCREF. If both > > interpreters have at least 1 reference then either one adding a > > reference shouldn't be a problem. > > I'm not sure what Nick meant by "synchronization point", but at least > you certainly need INCREF and DECREF to be atomic, which is a departure > from today's Py_INCREF / Py_DECREF behaviour (and is significantly > slower, even on high-level benchmarks). To be clear, I'm writing this under the hypothesis of per-interpreter GILs. I'm not really interested in the per-process GIL case :-) Regards Antoine. From benjamin at python.org Wed Oct 4 13:53:51 2017 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 04 Oct 2017 10:53:51 -0700 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: References: <1507051763.2766090.1126540976.75FD41A3@webmail.messagingengine.com> Message-ID: <1507139631.839954.1127905064.60C88DA2@webmail.messagingengine.com> On Wed, Oct 4, 2017, at 07:14, Barry Warsaw wrote: > On Oct 3, 2017, at 13:29, Benjamin Peterson wrote: > > > I'm not sure turning the implementation details of our internal formats > > into APIs is the way to go. > > I still think an API in the stdlib would be useful and appropriate, but > it?s not like this couldn?t be done as a 3rd party module. It might be helpful to enumerate the usecases for such an API. Perhaps a narrow, specialized API could satisfy most needs in a supportable way. From barry at python.org Wed Oct 4 14:06:48 2017 From: barry at python.org (Barry Warsaw) Date: Wed, 4 Oct 2017 14:06:48 -0400 Subject: [Python-Dev] PEP 553; the interaction between $PYTHONBREAKPOINT and -E Message-ID: <4238B62F-8D68-444C-BDFA-7A235E3E6691@python.org> Victor brings up a good question in his review of the PEP 553 implementation. https://github.com/python/cpython/pull/3355 https://bugs.python.org/issue31353 The question is whether $PYTHONBREAKPOINT should be ignored if -E is given? I think it makes sense for $PYTHONBREAKPOINT to be sensitive to -E, but in thinking about it some more, it might make better sense for the semantics to be that when -E is given, we treat it like PYTHONBREAKPOINT=0, i.e. disable the breakpoint, rather than fallback to the `pdb.set_trace` default. My thinking is this: -E is often used in production environments to prevent stray environment settings from affecting the Python process. In those environments, you probably also want to prevent stray breakpoints from stopping the process, so it?s more helpful to disable breakpoint processing when -E is given rather than running pdb.set_trace(). 
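For context, here is a simplified sketch of the dispatch the default sys.breakpointhook() performs under PEP 553 (an illustration only, not the actual implementation); the open question is effectively which of the first two branches -E should map to:

    import importlib, os, warnings

    def default_breakpointhook(*args, **kws):
        hookname = os.environ.get("PYTHONBREAKPOINT")  # does -E skip this lookup, or force "0"?
        if not hookname:             # unset or empty: the documented default
            import pdb
            return pdb.set_trace(*args, **kws)
        if hookname == "0":          # explicitly disabled: breakpoint() becomes a no-op
            return None
        modname, _, funcname = hookname.rpartition(".")
        try:
            module = importlib.import_module(modname or "builtins")
            hook = getattr(module, funcname)
        except (ImportError, AttributeError):
            warnings.warn("Ignoring unimportable $PYTHONBREAKPOINT: %r" % hookname,
                          RuntimeWarning)
            return None
        return hook(*args, **kws)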
If you have a strong opinion either way, please follow up here, on the PR, or on the bug tracker. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From guido at python.org Wed Oct 4 14:14:58 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 4 Oct 2017 11:14:58 -0700 Subject: [Python-Dev] PEP 553; the interaction between $PYTHONBREAKPOINT and -E In-Reply-To: <4238B62F-8D68-444C-BDFA-7A235E3E6691@python.org> References: <4238B62F-8D68-444C-BDFA-7A235E3E6691@python.org> Message-ID: Treating -E as PYTHONBREAKPOINT=0 makes sense. On Wed, Oct 4, 2017 at 11:06 AM, Barry Warsaw wrote: > Victor brings up a good question in his review of the PEP 553 > implementation. > > https://github.com/python/cpython/pull/3355 > https://bugs.python.org/issue31353 > > The question is whether $PYTHONBREAKPOINT should be ignored if -E is given? > > I think it makes sense for $PYTHONBREAKPOINT to be sensitive to -E, but in > thinking about it some more, it might make better sense for the semantics > to be that when -E is given, we treat it like PYTHONBREAKPOINT=0, i.e. > disable the breakpoint, rather than fallback to the `pdb.set_trace` default. > > My thinking is this: -E is often used in production environments to > prevent stray environment settings from affecting the Python process. In > those environments, you probably also want to prevent stray breakpoints > from stopping the process, so it?s more helpful to disable breakpoint > processing when -E is given rather than running pdb.set_trace(). > > If you have a strong opinion either way, please follow up here, on the PR, > or on the bug tracker. > > Cheers, > -Barry > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Wed Oct 4 16:52:50 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 4 Oct 2017 22:52:50 +0200 Subject: [Python-Dev] PEP 553; the interaction between $PYTHONBREAKPOINT and -E References: <4238B62F-8D68-444C-BDFA-7A235E3E6691@python.org> Message-ID: <20171004225250.7b14234b@fsol> On Wed, 4 Oct 2017 14:06:48 -0400 Barry Warsaw wrote: > Victor brings up a good question in his review of the PEP 553 implementation. > > https://github.com/python/cpython/pull/3355 > https://bugs.python.org/issue31353 > > The question is whether $PYTHONBREAKPOINT should be ignored if -E is given? > > I think it makes sense for $PYTHONBREAKPOINT to be sensitive to -E, but in thinking about it some more, it might make better sense for the semantics to be that when -E is given, we treat it like PYTHONBREAKPOINT=0, i.e. disable the breakpoint, rather than fallback to the `pdb.set_trace` default. """Special cases aren't special enough to break the rules.""" People expect -E to disable envvar-driven overrides, so just treat it like that and don't try to second-guess the user. Regards Antoine. 
From guido at python.org Wed Oct 4 17:03:32 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 4 Oct 2017 14:03:32 -0700 Subject: [Python-Dev] PEP 553; the interaction between $PYTHONBREAKPOINT and -E In-Reply-To: <20171004225250.7b14234b@fsol> References: <4238B62F-8D68-444C-BDFA-7A235E3E6691@python.org> <20171004225250.7b14234b@fsol> Message-ID: Well that also makes sense. On Wed, Oct 4, 2017 at 1:52 PM, Antoine Pitrou wrote: > On Wed, 4 Oct 2017 14:06:48 -0400 > Barry Warsaw wrote: > > Victor brings up a good question in his review of the PEP 553 > implementation. > > > > https://github.com/python/cpython/pull/3355 > > https://bugs.python.org/issue31353 > > > > The question is whether $PYTHONBREAKPOINT should be ignored if -E is > given? > > > > I think it makes sense for $PYTHONBREAKPOINT to be sensitive to -E, but > in thinking about it some more, it might make better sense for the > semantics to be that when -E is given, we treat it like PYTHONBREAKPOINT=0, > i.e. disable the breakpoint, rather than fallback to the `pdb.set_trace` > default. > > """Special cases aren't special enough to break the rules.""" > > People expect -E to disable envvar-driven overrides, so just treat it > like that and don't try to second-guess the user. > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From h4ck3r2007123 at gmail.com Wed Oct 4 18:56:14 2017 From: h4ck3r2007123 at gmail.com (VERY ANONYMOUS) Date: Wed, 4 Oct 2017 15:56:14 -0700 Subject: [Python-Dev] PEP 544 Message-ID: i want to learn -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Wed Oct 4 19:58:13 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 5 Oct 2017 10:58:13 +1100 Subject: [Python-Dev] PEP 544 In-Reply-To: References: Message-ID: <20171004235813.GF13110@ando.pearwood.info> On Wed, Oct 04, 2017 at 03:56:14PM -0700, VERY ANONYMOUS wrote: > i want to learn Start by learning to communicate in full sentences. You want to learn what? Core development? Python? How to program? English? This is not a mailing list for Python beginners. Try the "tutor" or "python-list" mailing lists. -- Steve From yarkot1 at gmail.com Wed Oct 4 20:22:56 2017 From: yarkot1 at gmail.com (Yarko Tymciurak) Date: Wed, 4 Oct 2017 19:22:56 -0500 Subject: [Python-Dev] PEP 553 In-Reply-To: <11E419CD-6E87-44C6-965B-A07DEF8237AE@python.org> References: <1B4494FE-AA40-4FF7-B81A-B0FB14F5F259@python.org> <11E419CD-6E87-44C6-965B-A07DEF8237AE@python.org> Message-ID: Barry suggested I bring this up here. It seems the right time to at least discuss this: RE: PEP 553 enabling / disabling breakpoints --- I've recently started using a simple conditional breakpoint in ipython, and wonder if - in addition to Nick Coghlan's request for the env 'PYTHONBREAKPOINT' (boolean?), it would make sense (I _think_ so) to add a condition parameter to the breakpoint() call. This does raise several questions, but it seems that it could make for a simple unified way to conditionally call an arbitrary debugger. 
What I found useful (in the contecxt of ipython - but general enough) you can see in this gist: https://gist.github.com/yarko/bdaa9d3178a6db03e160fdbabb3a9885 If PEP 553's breakpoint() were to follow this sort of interface (with "condition"), it raises a couple of questions: - how would a missing (default) parameter be done? - how would parameters to be passed to the debugger "of record" be passed in (named tuple? - sort of ugly) - would PYTHONBREAKPOINT be a global switch (I think yes), vs a `condition` default. I have no dog in the fight, but to raise the possibility (?) of having PEP 553 implement simple conditional breakpoint processing. Any / all comments much appreciated. Regards, Yarko On Mon, Oct 2, 2017 at 7:06 PM, Barry Warsaw wrote: > On Oct 2, 2017, at 18:43, Guido van Rossum wrote: > > > > OK. That then concludes the review of your PEP. It is now accepted! > Congrats. I am looking forward to using the backport. :-) > > Yay, thanks! We?ll see if I can sneak that backport past Ned. :) > > -Barry > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > yarkot1%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Wed Oct 4 20:28:16 2017 From: barry at python.org (Barry Warsaw) Date: Wed, 4 Oct 2017 20:28:16 -0400 Subject: [Python-Dev] PEP 553; the interaction between $PYTHONBREAKPOINT and -E In-Reply-To: <20171004225250.7b14234b@fsol> References: <4238B62F-8D68-444C-BDFA-7A235E3E6691@python.org> <20171004225250.7b14234b@fsol> Message-ID: > """Special cases aren't special enough to break the rules.""" > > People expect -E to disable envvar-driven overrides, so just treat it > like that and don't try to second-guess the user. And of course "Although practicality beats purity.? :) So while I agree that the consistency argument makes sense, does it make the most practical sense? I?m not sure. On the PR, Nick suggests even another option: treat -E as all other environment variables, but then -I would be PYTHONBREAKPOINT=0. Since the documentation for -I says "(implies -E and -s)? that seems even more special-case-y to me. "In the face of ambiguity, refuse the temptation to guess.? I?m really not sure what the right answer is, including to *not* make PYTHONBREAKPOINT obey -E. Unfortunately we probably won?t really get a good answer in practice until Python 3.7 is released, so maybe I just choose one and document that the behavior of PYTHONBREAKPOINT under -E is provision for now. If that?s acceptable, then I would just treat -E for PYTHONBREAKPOINT the same as all other environment variables, and we?ll see how it goes. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From barry at python.org Wed Oct 4 20:50:47 2017 From: barry at python.org (Barry Warsaw) Date: Wed, 4 Oct 2017 20:50:47 -0400 Subject: [Python-Dev] PEP 553 In-Reply-To: References: <1B4494FE-AA40-4FF7-B81A-B0FB14F5F259@python.org> <11E419CD-6E87-44C6-965B-A07DEF8237AE@python.org> Message-ID: <4D2C4576-CDFC-4D73-A8B5-184B558A5BDC@python.org> On Oct 4, 2017, at 20:22, Yarko Tymciurak wrote: > I've recently started using a simple conditional breakpoint in ipython, and wonder if - in addition to Nick Coghlan's request for the env 'PYTHONBREAKPOINT' (boolean?), it would make sense (I _think_ so) to add a condition parameter to the breakpoint() call. This does raise several questions, but it seems that it could make for a simple unified way to conditionally call an arbitrary debugger. What I found useful (in the contecxt of ipython - but general enough) you can see in this gist: https://gist.github.com/yarko/bdaa9d3178a6db03e160fdbabb3a9885 > > If PEP 553's breakpoint() were to follow this sort of interface (with "condition"), it raises a couple of questions: > - how would a missing (default) parameter be done? > - how would parameters to be passed to the debugger "of record" be passed in (named tuple? - sort of ugly) > - would PYTHONBREAKPOINT be a global switch (I think yes), vs a `condition` default. > > I have no dog in the fight, but to raise the possibility (?) of having PEP 553 implement simple conditional breakpoint processing. Thanks for bringing this up Yarko. I think this could be done with the current specification for PEP 553 and an additional API from the various debuggers. I don?t think it needs to be part of PEP 553 explicitly, given the additional complications you describe above. Remember that both built-in breakpoint() and sys.breakpointhook() accept *args, **kws, and it is left up to the actual debugger API to interpret/accept those additional arguments. So let?s say you wanted to implement this behavior with pdb. I think you could do something as simple as: def conditional_set_trace(*, condition=True): if condition: pdb.set_trace() sys.breakpointhook = conditional_set_trace Then in your code, you would just write: def foo(value): breakpoint(condition=(value < 0)) With the IPython gist you referenced, you wouldn?t even need that convenience function. Just set sys.breakpointhook=conditional_breakpoint.breakpoint_ and viola! You could also PYTHONBREAKPOINT=conditional_breakpoint.breakpoint_ python3.7 ? and it should Just Work. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From v+python at g.nevcal.com Wed Oct 4 20:34:55 2017 From: v+python at g.nevcal.com (Glenn Linderman) Date: Wed, 4 Oct 2017 17:34:55 -0700 Subject: [Python-Dev] PEP 553 In-Reply-To: References: <1B4494FE-AA40-4FF7-B81A-B0FB14F5F259@python.org> <11E419CD-6E87-44C6-965B-A07DEF8237AE@python.org> Message-ID: <8d443a51-929b-e611-bfc9-b998805eecde@g.nevcal.com> On 10/4/2017 5:22 PM, Yarko Tymciurak wrote: > Barry suggested I bring this up here. > > It seems the right time to at least discuss this: > > RE:? PEP 553 enabling / disabling breakpoints --- > > I've recently started using a simple conditional breakpoint in > ipython, and wonder if? - in addition to Nick Coghlan's request for > the env 'PYTHONBREAKPOINT'? 
(boolean?), it would make sense (I _think_ > so) to add a condition parameter to the breakpoint() call.? This does > raise several questions, but it seems that it could make for a simple > unified way to conditionally call an arbitrary debugger.? What I found > useful (in the contecxt of ipython - but general enough) you can see > in this gist: > https://gist.github.com/yarko/bdaa9d3178a6db03e160fdbabb3a9885 > > If PEP 553's breakpoint() were to follow this sort of interface (with > "condition"), it raises a couple of questions: > - how would a missing (default) parameter be done? > - how would parameters to be passed to the debugger "of record" be > passed in (named tuple? - sort of ugly) > - would PYTHONBREAKPOINT be a global switch (I think yes), vs a > `condition` default. > > I have no dog in the fight, but to raise the possibility (?) of having > PEP 553 implement simple conditional breakpoint processing. > > Any / all comments much appreciated. > breakpoint() already accepts arguments. Therefore no change to the PEP is needed to implement your suggestion. What you are suggesting is simply a convention among debuggers to handle a parameter named "condition" in a particular manner. It seems to me that if condition: ??? breakpoint() would be faster and clearer, but there is nothing to prevent a debugger from implementing your suggestion if it seems useful to the developers of the debugger. If it is useful enough to enough people, the users will clamor for other debuggers to implement it also. -------------- next part -------------- An HTML attachment was scrubbed... URL: From yarkot1 at gmail.com Wed Oct 4 21:09:36 2017 From: yarkot1 at gmail.com (Yarko Tymciurak) Date: Wed, 4 Oct 2017 20:09:36 -0500 Subject: [Python-Dev] PEP 553 In-Reply-To: <4D2C4576-CDFC-4D73-A8B5-184B558A5BDC@python.org> References: <1B4494FE-AA40-4FF7-B81A-B0FB14F5F259@python.org> <11E419CD-6E87-44C6-965B-A07DEF8237AE@python.org> <4D2C4576-CDFC-4D73-A8B5-184B558A5BDC@python.org> Message-ID: On Wed, Oct 4, 2017 at 7:50 PM, Barry Warsaw wrote: > On Oct 4, 2017, at 20:22, Yarko Tymciurak wrote: > > > I've recently started using a simple conditional breakpoint in ipython, > and wonder if - in addition to Nick Coghlan's request for the env > 'PYTHONBREAKPOINT' (boolean?), it would make sense (I _think_ so) to add a > condition parameter to the breakpoint() call. This does raise several > questions, but it seems that it could make for a simple unified way to > conditionally call an arbitrary debugger. What I found useful (in the > contecxt of ipython - but general enough) you can see in this gist: > https://gist.github.com/yarko/bdaa9d3178a6db03e160fdbabb3a9885 > > > > If PEP 553's breakpoint() were to follow this sort of interface (with > "condition"), it raises a couple of questions: > > - how would a missing (default) parameter be done? > > - how would parameters to be passed to the debugger "of record" be > passed in (named tuple? - sort of ugly) > > - would PYTHONBREAKPOINT be a global switch (I think yes), vs a > `condition` default. > > > > I have no dog in the fight, but to raise the possibility (?) of having > PEP 553 implement simple conditional breakpoint processing. > > Thanks for bringing this up Yarko. I think this could be done with the > current specification for PEP 553 and an additional API from the various > debuggers. I don?t think it needs to be part of PEP 553 explicitly, given > the additional complications you describe above. 
> > Remember that both built-in breakpoint() and sys.breakpointhook() accept > *args, **kws, and it is left up to the actual debugger API to > interpret/accept those additional arguments. So let?s say you wanted to > implement this behavior with pdb. I think you could do something as simple > as: > > def conditional_set_trace(*, condition=True): > if condition: > pdb.set_trace() > > sys.breakpointhook = conditional_set_trace > > Then in your code, you would just write: > > def foo(value): > breakpoint(condition=(value < 0)) > > With the IPython gist you referenced, you wouldn?t even need that > convenience function. Just set sys.breakpointhook=conditional_breakpoint.breakpoint_ > and viola! > > You could also PYTHONBREAKPOINT=conditional_breakpoint.breakpoint_ > python3.7 ? and it should Just Work. > Thanks Barry - yes, I see: you're correct. Thanks for the pep! - Yarko > > Cheers, > -Barry > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > yarkot1%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Oct 4 21:41:18 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 5 Oct 2017 11:41:18 +1000 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: Message-ID: On 4 October 2017 at 23:51, Eric Snow wrote: > On Tue, Oct 3, 2017 at 11:36 PM, Nick Coghlan wrote: >> The problem relates to the fact that there aren't any memory barriers >> around CPython's INCREF operations (they're implemented as an ordinary >> C post-increment operation), so you can get the following scenario: >> >> * thread on CPU A has the sole reference (ob_refcnt=1) >> * thread on CPU B acquires a new reference, but hasn't pushed the >> updated ob_refcnt value back to the shared memory cache yet >> * original thread on CPU A drops its reference, *thinks* the refcnt is >> now zero, and deletes the object >> * bad things now happen in CPU B as the thread running there tries to >> use a deleted object :) > > I'm not clear on where we'd run into this problem with channels. > Mirroring your scenario: > > * interpreter A (in thread on CPU A) INCREFs the object (the GIL is still held) > * interp A sends the object to the channel > * interp B (in thread on CPU B) receives the object from the channel > * the new reference is held until interp B DECREFs the object > > From what I see, at no point do we get a refcount of 0, such that > there would be a race on the object being deleted. Having the sending interpreter do the INCREF just changes the problem to be a memory leak waiting to happen rather than an access-after-free issue, since the problematic non-synchronised scenario then becomes: * thread on CPU A has two references (ob_refcnt=2) * it sends a reference to a thread on CPU B via a channel * thread on CPU A releases its reference (ob_refcnt=1) * updated ob_refcnt value hasn't made it back to the shared memory cache yet * thread on CPU B releases its reference (ob_refcnt=1) * both threads have released their reference, but the refcnt is still 1 -> object leaks! We simply can't have INCREFs and DECREFs happening in different threads without some way of ensuring cache coherency for *both* operations - otherwise we risk either the refcount going to zero when it shouldn't, or *not* going to zero when it should. 
The current CPython implementation relies on the process global GIL for that purpose, so none of these problems will show up until you start trying to replace that with per-interpreter locks. Free threaded reference counting relies on (expensive) atomic increments & decrements. The cross-interpreter view proposal aims to allow per-interpreter GILs without introducing atomic increments & decrements by instead relying on the view itself to ensure that it's holding the right GIL for the object whose refcount it's manipulating, and the receiving interpreter explicitly closing the view when it's done with it. So while CIVs wouldn't be as easy to use as regular object references: 1. They'd be no harder to use than memoryviews in general 2. They'd structurally ensure that regular object refcounts can still rely on "protected by the GIL" semantics 3. They'd structurally ensure zero performance degradation for regular object refcounts 4. By virtue of being memoryview based, they'd encourage the adoption of interfaces and practices that can be adapted to multiple processes through the use of techniques like shared memory regions and memory mapped files (see http://www.boost.org/doc/libs/1_54_0/doc/html/interprocess/sharedmemorybetweenprocesses.html for some detailed explanations of how that works, and https://arrow.apache.org/ for an example of ways tools like Pandas can use that to enable zero-copy data sharing) > The only problem I'm aware of (it dawned on me last night), is in the > case that the interpreter that created the object gets deleted before > the object does. In that case we can't pass the deletion back to the > original interpreter. (I don't think this problem is necessarily > exclusive to the solution I've proposed for Bytes.) The cross-interpreter-view idea proposes to deal with that by having the CIV hold a strong reference not only to the sending object (which is already part of the regular memoryview semantics), but *also* to the sending interpreter - that way, neither the sending object nor the sending interpreter can go away until the receiving interpreter closes the view. The refcount-integrity-ensuring sequence of events becomes: 1. Sending interpreter submits the object to the channel 2. Channel creates a CIV with references to the sending interpreter & sending object, and a view on the sending object's memory 3. Receiving interpreter gets the CIV from the channel 4. Receiving interpreter closes the CIV either explicitly or via __del__ (the latter would emit ResourceWarning) 5. CIV switches execution back to the sending interpreter and releases both the memory buffer and the reference to the sending object 6. CIV switches execution back to the receiving interpreter, and releases its reference to the sending interpreter 7. Execution continues in the receiving interpreter Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Oct 4 21:52:25 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 5 Oct 2017 11:52:25 +1000 Subject: [Python-Dev] PEP 553; the interaction between $PYTHONBREAKPOINT and -E In-Reply-To: References: <4238B62F-8D68-444C-BDFA-7A235E3E6691@python.org> <20171004225250.7b14234b@fsol> Message-ID: On 5 October 2017 at 10:28, Barry Warsaw wrote: >> """Special cases aren't special enough to break the rules.""" >> >> People expect -E to disable envvar-driven overrides, so just treat it >> like that and don't try to second-guess the user. > > And of course "Although practicality beats purity.? 
:) > > So while I agree that the consistency argument makes sense, does it make the most practical sense? > > I?m not sure. On the PR, Nick suggests even another option: treat -E as all other environment variables, but then -I would be PYTHONBREAKPOINT=0. Since the documentation for -I says "(implies -E and -s)? that seems even more special-case-y to me. -I is inherently a special-case, since it's effectively our "system Python mode", while we don't actually have a separate system Python binary. > Unfortunately we probably won?t really get a good answer in practice until Python 3.7 is released, so maybe I just choose one and document that the behavior of PYTHONBREAKPOINT under -E is provision for now. If that?s acceptable, then I would just treat -E for PYTHONBREAKPOINT the same as all other environment variables, and we?ll see how it goes. I'd be fine with this as the main reason I wanted PYTHONBREAKPOINT=0 was for pre-merge CI systems, and those tend to have tightly controlled environment settings, so you don't need to rely on -E or -I when running your tests. That said, it may also be worth considering a "-X nobreakpoints" option (and then -I could imply "-E -s -X nobreakpoints"). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Wed Oct 4 22:12:48 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 4 Oct 2017 19:12:48 -0700 Subject: [Python-Dev] PEP 553 In-Reply-To: References: <1B4494FE-AA40-4FF7-B81A-B0FB14F5F259@python.org> <11E419CD-6E87-44C6-965B-A07DEF8237AE@python.org> <4D2C4576-CDFC-4D73-A8B5-184B558A5BDC@python.org> Message-ID: Yarko, there's one thing I don't understand. Maybe you can enlighten me. Why would you prefer breakpoint(x >= 1000) over if x >= 1000: breakpoint() ? The latter seems unambiguous and requires thinking all around. Is there something in iPython that makes this impractical? -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Wed Oct 4 23:31:05 2017 From: barry at python.org (Barry Warsaw) Date: Wed, 4 Oct 2017 23:31:05 -0400 Subject: [Python-Dev] PEP 553; the interaction between $PYTHONBREAKPOINT and -E In-Reply-To: References: <4238B62F-8D68-444C-BDFA-7A235E3E6691@python.org> <20171004225250.7b14234b@fsol> Message-ID: On Oct 4, 2017, at 21:52, Nick Coghlan wrote: > >> Unfortunately we probably won?t really get a good answer in practice until Python 3.7 is released, so maybe I just choose one and document that the behavior of PYTHONBREAKPOINT under -E is provision for now. If that?s acceptable, then I would just treat -E for PYTHONBREAKPOINT the same as all other environment variables, and we?ll see how it goes. > > I'd be fine with this as the main reason I wanted PYTHONBREAKPOINT=0 > was for pre-merge CI systems, and those tend to have tightly > controlled environment settings, so you don't need to rely on -E or -I > when running your tests. > > That said, it may also be worth considering a "-X nobreakpoints" > option (and then -I could imply "-E -s -X nobreakpoints"). Thanks for the feedback Nick. For now we?ll go with the standard behavior of -E and see how it goes. We can always add a -X later. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From victor.stinner at gmail.com Thu Oct 5 02:58:44 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 5 Oct 2017 08:58:44 +0200 Subject: [Python-Dev] PEP 553; the interaction between $PYTHONBREAKPOINT and -E In-Reply-To: References: <4238B62F-8D68-444C-BDFA-7A235E3E6691@python.org> <20171004225250.7b14234b@fsol> Message-ID: I concur with Antoine, please don't add a special case for -E. But it seems like you already agreed with that :-) Victor Le 5 oct. 2017 05:33, "Barry Warsaw" a ?crit : > On Oct 4, 2017, at 21:52, Nick Coghlan wrote: > > > >> Unfortunately we probably won?t really get a good answer in practice > until Python 3.7 is released, so maybe I just choose one and document that > the behavior of PYTHONBREAKPOINT under -E is provision for now. If that?s > acceptable, then I would just treat -E for PYTHONBREAKPOINT the same as all > other environment variables, and we?ll see how it goes. > > > > I'd be fine with this as the main reason I wanted PYTHONBREAKPOINT=0 > > was for pre-merge CI systems, and those tend to have tightly > > controlled environment settings, so you don't need to rely on -E or -I > > when running your tests. > > > > That said, it may also be worth considering a "-X nobreakpoints" > > option (and then -I could imply "-E -s -X nobreakpoints"). > > Thanks for the feedback Nick. For now we?ll go with the standard behavior > of -E and see how it goes. We can always add a -X later. > > Cheers, > -Barry > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > victor.stinner%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Thu Oct 5 04:17:46 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 5 Oct 2017 11:17:46 +0300 Subject: [Python-Dev] PEP 553; the interaction between $PYTHONBREAKPOINT and -E In-Reply-To: <4238B62F-8D68-444C-BDFA-7A235E3E6691@python.org> References: <4238B62F-8D68-444C-BDFA-7A235E3E6691@python.org> Message-ID: 04.10.17 21:06, Barry Warsaw ????: > Victor brings up a good question in his review of the PEP 553 implementation. > > https://github.com/python/cpython/pull/3355 > https://bugs.python.org/issue31353 > > The question is whether $PYTHONBREAKPOINT should be ignored if -E is given? > > I think it makes sense for $PYTHONBREAKPOINT to be sensitive to -E, but in thinking about it some more, it might make better sense for the semantics to be that when -E is given, we treat it like PYTHONBREAKPOINT=0, i.e. disable the breakpoint, rather than fallback to the `pdb.set_trace` default. > > My thinking is this: -E is often used in production environments to prevent stray environment settings from affecting the Python process. In those environments, you probably also want to prevent stray breakpoints from stopping the process, so it?s more helpful to disable breakpoint processing when -E is given rather than running pdb.set_trace(). > > If you have a strong opinion either way, please follow up here, on the PR, or on the bug tracker. What if make the default value depending on the debug level? In debug mode it is "pdb.set_trace", in optimized mode it is "0". Then in production environments you can use -E -O for ignoring environment settings and disable breakpoints. 
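A minimal sketch of that idea, using only existing pieces (sys.flags.optimize, the PEP 553 sys.breakpointhook, and pdb.set_trace); the hook name is made up here, and this is an illustration of the suggestion rather than anything PEP 553 itself specifies:

    import sys

    def optimize_aware_hook(*args, **kwargs):
        # Under -O/-OO, behave as if PYTHONBREAKPOINT=0 and ignore breakpoint() calls.
        if sys.flags.optimize:
            return None
        # Otherwise fall back to the usual pdb behaviour, as the default hook does.
        import pdb
        return pdb.set_trace(*args, **kwargs)

    sys.breakpointhook = optimize_aware_hook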
From ericsnowcurrently at gmail.com Thu Oct 5 04:45:26 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 5 Oct 2017 02:45:26 -0600 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: Message-ID: On Tue, Oct 3, 2017 at 8:55 AM, Antoine Pitrou wrote: > I think we need a sharing protocol, not just a flag. We also need to > think carefully about that protocol, so that it does not imply > unnecessary memory copies. Therefore I think the protocol should be > something like the buffer protocol, that allows to acquire and release > a set of shared memory areas, but without imposing any semantics onto > those memory areas (each type implementing its own semantics). And > there needs to be a dedicated reference counting for object shares, so > that the original object can be notified when all its shares have > vanished. I've come to agree. :) I actually came to the same conclusion tonight before I'd been able to read through your message carefully. My idea is below. Your suggestion about protecting shared memory areas is something to discuss further, though I'm not sure it's strictly necessary yet (before we stop sharing the GIL). On Wed, Oct 4, 2017 at 7:41 PM, Nick Coghlan wrote: > Having the sending interpreter do the INCREF just changes the problem > to be a memory leak waiting to happen rather than an access-after-free > issue, since the problematic non-synchronised scenario then becomes: > > * thread on CPU A has two references (ob_refcnt=2) > * it sends a reference to a thread on CPU B via a channel > * thread on CPU A releases its reference (ob_refcnt=1) > * updated ob_refcnt value hasn't made it back to the shared memory cache yet > * thread on CPU B releases its reference (ob_refcnt=1) > * both threads have released their reference, but the refcnt is still > 1 -> object leaks! > > We simply can't have INCREFs and DECREFs happening in different > threads without some way of ensuring cache coherency for *both* > operations - otherwise we risk either the refcount going to zero when > it shouldn't, or *not* going to zero when it should. > > The current CPython implementation relies on the process global GIL > for that purpose, so none of these problems will show up until you > start trying to replace that with per-interpreter locks. > > Free threaded reference counting relies on (expensive) atomic > increments & decrements. Right. I'm not sure why I was missing that, but I'm clear now. Below is a rough idea of what I think may work instead (the result of much tossing and turning in bed*). 
While we're still sharing a GIL between interpreters: Channel.send(obj): # in interp A incref(obj) if type(obj).tp_share == NULL: raise ValueError("not a shareable type") ch.objects.append(obj) Channel.recv(): # in interp B orig = ch.objects.pop(0) obj = orig.tp_share() return obj bytes.tp_share(): return self After we move to not sharing the GIL between interpreters: Channel.send(obj): # in interp A incref(obj) if type(obj).tp_share == NULL: raise ValueError("not a shareable type") set_owner(obj) # obj.owner or add an obj -> interp entry to global table ch.objects.append(obj) Channel.recv(): # in interp B orig = ch.objects.pop(0) obj = orig.tp_share() set_shared(obj, orig) # add to a global table return obj bytes.tp_share(): obj = blank_bytes(len(self)) obj.ob_sval = self.ob_sval # hand-wavy memory sharing return obj bytes.tp_free(): # under no-shared-GIL: # most of this could be pulled into a macro for re-use orig = lookup_shared(self) if orig != NULL: current = release_LIL() interp = lookup_owner(orig) acquire_LIL(interp) decref(orig) release_LIL(interp) acquire_LIL(current) # clear shared/owner tables # clear/release self.ob_sval free(self) The CIV approach could be facilitated through something like a new SharedBuffer type, or through a separate BufferViewChannel, etc. Most notably, this approach avoids hard-coding specific type support into channels and should work out fine under no-shared-GIL subinterpreters. One nice thing about the tp_share slot is that it makes it much easier (along with C-API for managing the global owned/shared tables) to implement other types that are legal to pass through channels. Such could be provided via extension modules. Numpy arrays could be made to support it, if that's your thing. Antoine could give tp_share to locks and semaphores. :) Of course, any such types would have to ensure that they are actually safe to share between intepreters without a GIL between them... For PEP 554, I'd only propose the tp_share slot and its use in Channel.send()/.recv(). The parts related to global tables and memory sharing and tp_free() wouldn't be necessary until we stop sharing the GIL between interpreters. However, I believe that tp_share would make us ready for that. -eric * I should know by now that some ideas sound better in the middle of the night than they do the next day, but this idea is keeping me awake so I'll risk it! :) From ncoghlan at gmail.com Thu Oct 5 06:57:10 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 5 Oct 2017 20:57:10 +1000 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: Message-ID: On 5 October 2017 at 18:45, Eric Snow wrote: > After we move to not sharing the GIL between interpreters: > > Channel.send(obj): # in interp A > incref(obj) > if type(obj).tp_share == NULL: > raise ValueError("not a shareable type") > set_owner(obj) # obj.owner or add an obj -> interp entry to global > table > ch.objects.append(obj) > > Channel.recv(): # in interp B > orig = ch.objects.pop(0) > obj = orig.tp_share() > set_shared(obj, orig) # add to a global table > return obj > This would be hard to get to work reliably, because "orig.tp_share()" would be running in the receiving interpreter, but all the attributes of "orig" would have been allocated by the sending interpreter. 
It gets more reliable if it's *Channel.send* that calls tp_share() though, but moving the call to the sending side makes it clear that a tp_share protocol would still need to rely on a more primitive set of "shareable objects" that were the permitted return values from the tp_share call. And that's the real pay-off that comes from defining this in terms of the memoryview protocol: Py_buffer structs *aren't* Python objects, so it's only a regular C struct that gets passed across the interpreter boundary (the reference to the original objects gets carried along passively as part of the CIV - it never gets *used* in the receiving interpreter). > bytes.tp_share(): > obj = blank_bytes(len(self)) > obj.ob_sval = self.ob_sval # hand-wavy memory sharing > return obj > This is effectively reinventing memoryview, while trying to pretend it's an ordinary bytes object. Don't reinvent memoryview :) > bytes.tp_free(): # under no-shared-GIL: > # most of this could be pulled into a macro for re-use > orig = lookup_shared(self) > if orig != NULL: > current = release_LIL() > interp = lookup_owner(orig) > acquire_LIL(interp) > decref(orig) > release_LIL(interp) > acquire_LIL(current) > # clear shared/owner tables > # clear/release self.ob_sval > free(self) > I don't think we should be touching the behaviour of core builtins solely to enable message passing to subinterpreters without a shared GIL. The simplest possible variant of CIVs that I can think of would be able to avoid that outcome by being a memoryview subclass, since they just need to hold the extra reference to the original interpreter, and include some logic to swtich interpreters at the appropriate time. That said, I think there's definitely a useful design question to ask in this area, not about bytes (which can be readily represented by a memoryview variant in the receiving interpreter), but about *strings*: they have a more complex internal layout than bytes objects, but as long as the receiving interpreter can make sure that the original string continues to exist, then you could usefully implement a "strview" type to avoid having to go through an encode/decode cycle just to pass a string to another subinterpreter. That would provide a reasonable compelling argument that CIVs *shouldn't* be implemented as memoryview subclasses, but instead defined as *containing* a managed view of an object owned by a different interpreter. That way, even if the initial implementation only supported CIVs that contained a memoryview instance, we'd have the freedom to define other kinds of views later (such as strview), while being able to reuse the same CIV machinery. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Thu Oct 5 07:33:58 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 5 Oct 2017 13:33:58 +0200 Subject: [Python-Dev] PEP 553; the interaction between $PYTHONBREAKPOINT and -E In-Reply-To: References: <4238B62F-8D68-444C-BDFA-7A235E3E6691@python.org> Message-ID: > What if make the default value depending on the debug level? In debug mode > it is "pdb.set_trace", in optimized mode it is "0". Then in production > environments you can use -E -O for ignoring environment settings and disable > breakpoints. I don't know what is the best option, but I dislike adding two options, PYTHONBREAKPOINT and -X nobreakpoint, for the same features. I would become complicated to know which option has the priority. 
I would prefer a generic "release mode" option. In the past, I proposed the opposite: a "developer mode": https://mail.python.org/pipermail/python-ideas/2016-March/039314.html "python3 -X dev" would be an "alias" to "PYTHONMALLOC=debug python3.6 -Wd -bb -X faulthandler script.py". Python has more and more options to enable debug checks at runtime, it's hard to be aware of all of them. My intent is to run tests in "developer mode": if tests pass, you are sure that they will pass in the regular mode since the developer mode only enables more checks at runtime, it shouldn't change the behaviour. It seems like the consensus is more to run Python in "release mode" by default, since it was decided to hide DeprecationWarning by default. I understood that the default mode targets end users. Victor From barry at python.org Thu Oct 5 09:44:24 2017 From: barry at python.org (Barry Warsaw) Date: Thu, 5 Oct 2017 09:44:24 -0400 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: <1507139631.839954.1127905064.60C88DA2@webmail.messagingengine.com> References: <1507051763.2766090.1126540976.75FD41A3@webmail.messagingengine.com> <1507139631.839954.1127905064.60C88DA2@webmail.messagingengine.com> Message-ID: On Oct 4, 2017, at 13:53, Benjamin Peterson wrote: > It might be helpful to enumerate the usecases for such an API. Perhaps a > narrow, specialized API could satisfy most needs in a supportable way. Currently `python -m dis thing.py` compiles the source then disassembles it. It would be kind of cool if you could pass a .pyc file to -m dis, in which case you?d need to unpack the header to get to the code object. A naive implementation would unpack the magic number and refuse to disassemble any files that don?t match whatever that version of Python understands. A more robust (possibly 3rd party) implementation could potentially disassemble a range of magic numbers and formats, and an API to get at the code object and metadata would help. I was thinking about the bytecode hacking that some debuggers do. This API would help them support multiple versions of Python. They could use the API to discover what pyc format was in use, extract the code object, hack the bytecode and possibly rewrite a new PEP 3147 style pyc file with the debugger bytecodes inserted. Third party bytecode optimizers could use the API to unpack multiple versions of pyc files, do their optimizations, and rewrite new files with the proper format. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From guido at python.org Thu Oct 5 10:32:28 2017 From: guido at python.org (Guido van Rossum) Date: Thu, 5 Oct 2017 07:32:28 -0700 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: References: <1507051763.2766090.1126540976.75FD41A3@webmail.messagingengine.com> <1507139631.839954.1127905064.60C88DA2@webmail.messagingengine.com> Message-ID: Honestly I think the API for accessing historic pyc headers should itself also be 3rd party. CPython itself should not bother (backwards compatibility with pyc files has never been a feature). On Thu, Oct 5, 2017 at 6:44 AM, Barry Warsaw wrote: > On Oct 4, 2017, at 13:53, Benjamin Peterson wrote: > > > It might be helpful to enumerate the usecases for such an API. Perhaps a > > narrow, specialized API could satisfy most needs in a supportable way. 
> > Currently `python -m dis thing.py` compiles the source then disassembles > it. It would be kind of cool if you could pass a .pyc file to -m dis, in > which case you?d need to unpack the header to get to the code object. A > naive implementation would unpack the magic number and refuse to > disassemble any files that don?t match whatever that version of Python > understands. A more robust (possibly 3rd party) implementation could > potentially disassemble a range of magic numbers and formats, and an API to > get at the code object and metadata would help. > > I was thinking about the bytecode hacking that some debuggers do. This > API would help them support multiple versions of Python. They could use > the API to discover what pyc format was in use, extract the code object, > hack the bytecode and possibly rewrite a new PEP 3147 style pyc file with > the debugger bytecodes inserted. > > Third party bytecode optimizers could use the API to unpack multiple > versions of pyc files, do their optimizations, and rewrite new files with > the proper format. > > Cheers, > -Barry > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.rodola at gmail.com Thu Oct 5 16:35:05 2017 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Thu, 5 Oct 2017 22:35:05 +0200 Subject: [Python-Dev] Reorganize Python categories (Core, Library, ...)? In-Reply-To: References: Message-ID: On Wed, Oct 4, 2017 at 11:52 AM, Victor Stinner wrote: > Hi, > > Python uses a few categories to group bugs (on bugs.python.org) and > NEWS entries (in the Python changelog). List used by the blurb tool: > > #.. section: Security > #.. section: Core and Builtins > #.. section: Library > #.. section: Documentation > #.. section: Tests > #.. section: Build > #.. section: Windows > #.. section: macOS > #.. section: IDLE > #.. section: Tools/Demos > #.. section: C API > > My problem is that almost all changes go into "Library" category. When > I read long changelogs, it's sometimes hard to identify quickly the > context (ex: impacted modules) of a change. > > It's also hard to find open bugs of a specific module on > bugs.python.org, since almost all bugs are in the very generic > "Library" category. Using full text returns "false positives". > > I would prefer to see more specific categories like: > > * Buildbots: only issues specific to buildbots > * Networking: socket, asyncio, asyncore, asynchat modules > * Security: ssl module but also vulnerabilities in any other part of > CPython -- we already added a Security category in NEWS/blurb > * Parallelim: multiprocessing and concurrent.futures modules > > It's hard to find categories generic enough to not only contain a > single item, but not contain too many items neither. Other ideas: > > * XML: xml.doc, xml.etree, xml.parsers, xml.sax modules > * Import machinery: imp and importlib modules > * Typing: abc and typing modules > > The best would be to have a mapping of a module name into a category, > and make sure that all modules have a category. We might try to count > the number of commits and NEWS entries of the last 12 months to decide > if a category has the correct size. > > I don't think that we need a distinct categoy for each module. 
We can > put many uncommon modules in a generic category. > > By the way, we need maybe also a new "module name" field in the bug > tracker. But then comes the question of normalizing module names. For > example, should "email.message" be normalized to "email"? Maybe store > "email.message" but use "email" for search, display the module in the > issue title, etc. > > Victor Personally I've always dreamed about having *all* module names. That would reflect experts.rst file: https://github.com/python/devguide/blob/master/experts.rst -- Giampaolo - http://grodola.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Thu Oct 5 17:08:58 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Fri, 6 Oct 2017 00:08:58 +0300 Subject: [Python-Dev] Inheritance vs composition in backcompat (PEP521) In-Reply-To: References: Message-ID: On Tue, Oct 3, 2017 at 1:11 AM, Koos Zevenhoven wrote: > On Oct 3, 2017 01:00, "Guido van Rossum" wrote: > > Mon, Oct 2, 2017 at 2:52 PM, Koos Zevenhoven wrote > > I don't mind this (or Nathaniel ;-) being academic. The backwards >> incompatibility issue I've just described applies to any extension via >> composition, if the underlying type/protocol grows new members (like the CM >> protocol would have gained __suspend__ and __resume__ in PEP521). >> > > Since you seem to have a good grasp on this issue, does PEP 550 suffer > from the same problem? (Or PEP 555, for that matter? :-) > > > > Neither has this particular issue, because they don't extend an existing > protocol. If this thread has any significance, it will most likely be > elsewhere. > ?Actually, I realize I should be more precise with terminology regarding "extending an existing protocol"/"growing new members". Below, I'm still using PEP 521 as an example (sorry). In fact, in some sense, "adding" __suspend__ and __resume__ to context managers *does not* extend the context manager protocol, even though it kind of looks like it does. There would instead be two separate protocols: (A) The traditional PEP 343 context manager: __enter__ __exit__ (B) The hyphothetical PEP 521 context manager: __enter__ __suspend__ __resume__ __exit__ Protocols A and B are incompatible in both directions: * It is generally not safe to use a type-A context manager assuming it implements B. * It is generally not safe to use a type-B context manager assuming it implements A. But if you now have a type-B object, it looks like it's also type-A, especially for code that is not aware of the existence of B. This is where the problems come from: a wrapper for type A does the wrong thing when wrapping a type-B object (except when using inheritance). [Side note: Another interpretation of the situation is that, instead of adding protocol B, A is removed and is replaced with: (C) The hypothetical PEP 521 context manager with optional members: __enter__ __suspend__ (optional) __resume__ (optional) __exit__ But now the same problems just come from the fact that A no longer exists while there is code out there that assumes A. But this is only a useful interpretation if you are the only user of the protocol or if it's otherwise ok to remove A. So let's go back to the A-B interpretation.] Q: Could the problem of protocol conflict be solved? One way to tell A and B apart would be to always explicitly mark the protocol with a base class. Obviously this is not the case with existing uses of context managers. 
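To make the wrapper problem above concrete, here is a small sketch in which hypothetical __suspend__/__resume__ methods stand in for the PEP 521 additions (all class names are made up):

    class TrackingCM:
        # A hypothetical "type-B" manager implementing the extended protocol.
        def __enter__(self):
            return self
        def __suspend__(self):
            pass
        def __resume__(self):
            pass
        def __exit__(self, *exc):
            return False

    class ComposedWrapper:
        # Written against protocol A only, wrapping by composition: the extra
        # methods of the wrapped manager are silently hidden from B-aware code.
        def __init__(self, cm):
            self._cm = cm
        def __enter__(self):
            return self._cm.__enter__()
        def __exit__(self, *exc):
            return self._cm.__exit__(*exc)

    class InheritedWrapper(TrackingCM):
        # Wrapping by inheritance keeps __suspend__/__resume__ visible, so the
        # wrapper still satisfies protocol B even though it only overrides A's methods.
        def __exit__(self, *exc):
            return super().__exit__(*exc)

    assert hasattr(InheritedWrapper(), "__suspend__")
    assert not hasattr(ComposedWrapper(TrackingCM()), "__suspend__")

A B-aware framework that looks for __suspend__ would treat the composed wrapper as a plain type-A manager and silently skip the suspend/resume calls, whereas an explicit base class per protocol would at least make that mismatch detectable.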
But there's another way, which is to change the naming: (A) The traditional PEP 343 context manager: __enter__ __exit__ (Z) The *modified* hyphothetical PEP 521 context manager: __begin__ __suspend__ __resume__ __end__ Now, A and Z are easy to tell apart. A context manager wrapper designed for type A immediately fails if used to wrap a type-Z object. But of course the whole context manager concept now suddenly became a lot more complicated. It is interesting that, in the A-B scheme, making a general context manager wrapper using inheritance *just works*, even if A is not a subprotocol of B and B is not a subprotocol of A. Anyway, a lot of this is amplified by the fact that the methods of the context manager protocols are not independent functionality. Instead, calling one of them leads to the requirement that the other methods are also called at the right moments. --Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Thu Oct 5 17:27:49 2017 From: barry at python.org (Barry Warsaw) Date: Thu, 5 Oct 2017 17:27:49 -0400 Subject: [Python-Dev] PEP 553; the interaction between $PYTHONBREAKPOINT and -E In-Reply-To: References: <4238B62F-8D68-444C-BDFA-7A235E3E6691@python.org> Message-ID: <97FACE44-D2EE-4AD4-905C-73D59055AB97@python.org> > I don't know what is the best option, but I dislike adding two > options, PYTHONBREAKPOINT and -X nobreakpoint, for the same features. > I would become complicated to know which option has the priority. Just to close the loop, I?ve landed the PEP 553 PR treating PYTHONBREAKPOINT the same as all other environment variables when -E is present. Let?s see how that goes. Thanks all for the great feedback and reviews. Now I?m thinking about putting a backport version on PyPI. :) Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From ericsnowcurrently at gmail.com Thu Oct 5 21:48:59 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 5 Oct 2017 19:48:59 -0600 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: Message-ID: On Thu, Oct 5, 2017 at 4:57 AM, Nick Coghlan wrote: > This would be hard to get to work reliably, because "orig.tp_share()" would > be running in the receiving interpreter, but all the attributes of "orig" > would have been allocated by the sending interpreter. It gets more reliable > if it's *Channel.send* that calls tp_share() though, but moving the call to > the sending side makes it clear that a tp_share protocol would still need to > rely on a more primitive set of "shareable objects" that were the permitted > return values from the tp_share call. The point of running tp_share() in the receiving interpreter is to force allocation under that interpreter, so that GC applies there. I agree that you basically can't do anything in tp_share() that would affect the sending interpreter, including INCREF and DECREF. Since we INCREFed in send(), we know that the we have a safe reference, so we don't have to worry about that part in tp_share(). We would only be able to do low-level things (like the buffer protocol) that don't interact with the original object's interpreter. Given that this is a quite low-level tp slot and low-level functionality, I'd expect that a sufficiently clear entry (i.e. warning) in the docs would be enough for the few that dare. 
>From my perspective adding the tp_share slot allows for much more experimentation with object sharing (right now, long before we get to considering how to stop sharing the GIL) by us *and* third parties. None of the alternatives seem to offer the same opportunity while still working out *after* we stop sharing the GIL. > > And that's the real pay-off that comes from defining this in terms of the > memoryview protocol: Py_buffer structs *aren't* Python objects, so it's only > a regular C struct that gets passed across the interpreter boundary (the > reference to the original objects gets carried along passively as part of > the CIV - it never gets *used* in the receiving interpreter). Yeah, the (PEP 3118) buffer protocol offers precedent in a number of ways that are applicable to channels here. I'm simply reticent to lock PEP 554 into such a specific solution as the buffer-specific CIV. I'm trying to accommodate anticipated future needs while keeping the PEP as simple and basic as possible. It's driving me nuts! :P Things were *much* simpler before I added Channels to the PEP. :) > >> >> bytes.tp_share(): >> obj = blank_bytes(len(self)) >> obj.ob_sval = self.ob_sval # hand-wavy memory sharing >> return obj > > > This is effectively reinventing memoryview, while trying to pretend it's an > ordinary bytes object. Don't reinvent memoryview :) > >> >> bytes.tp_free(): # under no-shared-GIL: >> # most of this could be pulled into a macro for re-use >> orig = lookup_shared(self) >> if orig != NULL: >> current = release_LIL() >> interp = lookup_owner(orig) >> acquire_LIL(interp) >> decref(orig) >> release_LIL(interp) >> acquire_LIL(current) >> # clear shared/owner tables >> # clear/release self.ob_sval >> free(self) > > > I don't think we should be touching the behaviour of core builtins solely to > enable message passing to subinterpreters without a shared GIL. Keep in mind that I included the above as a possible solution using tp_share() that would work *after* we stop sharing the GIL. My point is that with tp_share() we have a solution that works now *and* will work later. I don't care how we use tp_share to do so. :) I long to be able to say in the PEP that you can pass bytes through the channel and get bytes on the other side. That said, I'm not sure how this could be made to work without involving tp_free(). If that is really off the table (even in the simplest possible ways) then I don't think there is a way to actually share objects of builtin types between interpreters other than through views like CIV. We could still support tp_share() for the sake of third parties, which would facilitate that simplicity I was aiming for in sending data between interpreters, as well as leaving the door open for nearly all the same experimentation. However, I expect that most *uses* of channels will involve builtin types, particularly as we start off, so having to rely on view types for builtins would add not-insignificant awkwardness to using channels. I'd still like to avoid that if possible, so let's not rush to completely close the door on small modifications to tp_free for builtins. :) Regardless, I still (after a night's rest and a day of not thinking about it) consider tp_share() to be the solution I'd been hoping we'd find, whether or not we can apply it to builtin types. 
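To illustrate the "not-insignificant awkwardness" of view types with plain builtins: a view over bytes is not a drop-in replacement for bytes, so receiving code either loses the familiar bytes API or pays for a copy anyway. A small single-interpreter example:

    raw = b"GET /index.html HTTP/1.1\r\n"
    view = memoryview(raw)                  # zero-copy view over the same memory

    print(raw.startswith(b"GET"))           # True: bytes has the familiar API
    print(hasattr(view, "startswith"))      # False: the view type does not
    print(bytes(view).startswith(b"GET"))   # True, but only after copying the data back out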
> > The simplest possible variant of CIVs that I can think of would be able to > avoid that outcome by being a memoryview subclass, since they just need to > hold the extra reference to the original interpreter, and include some logic > to swtich interpreters at the appropriate time. > > That said, I think there's definitely a useful design question to ask in > this area, not about bytes (which can be readily represented by a memoryview > variant in the receiving interpreter), but about *strings*: they have a more > complex internal layout than bytes objects, but as long as the receiving > interpreter can make sure that the original string continues to exist, then > you could usefully implement a "strview" type to avoid having to go through > an encode/decode cycle just to pass a string to another subinterpreter. > > That would provide a reasonable compelling argument that CIVs *shouldn't* be > implemented as memoryview subclasses, but instead defined as *containing* a > managed view of an object owned by a different interpreter. > > That way, even if the initial implementation only supported CIVs that > contained a memoryview instance, we'd have the freedom to define other kinds > of views later (such as strview), while being able to reuse the same CIV > machinery. Hmm, so a CIV implementation that accomplishes something similar to tp_share()? For some reason I'm seeing similarities between CIV-vs.-tp_share and the import machinery before PEP 451. Before we added module specs, import hook authors had to do a bunch of the busy work that the import machinery does for you now by leveraging module specs. Back then we worked to provide a number of helpers to reduce that extra pain of writing an import hook. Now the helpers are irrelevant and the extra burden is gone. My mind is drawn to the comparison between that and the question of CIV vs. tp_share(). CIV would be more like the post-451 import world, where I expect the CIV would take care of the data sharing operations. That said, the situation in PEP 554 is sufficiently different that I'm not convinced a generic CIV protocol would be better. I'm not sure how much CIV could do for you over helpers+tp_share. Anyway, here are the leading approaches that I'm looking at now: * adding a tp_share slot + you send() the object directly and recv() the object coming out of tp_share() (which will probably be the same type as the original) + this would eventually require small changes in tp_free for participating types + we would likely provide helpers (eventually), similar to the new buffer protocol, to make it easier to manage sharing data * simulating tp_share via an external global registry (or a registry on the Channel type) + it would still be hard to make work without hooking into tp_free() * CIVs hard-coded in Channel (or BufferViewChannel, etc.) for specific types (e.g. buffers) + you send() the object like normal, but recv() the view * a CIV protocol on Channel by which you can add support for more types + you send() the object like normal but recv() the view + could work through subclassing or a registry + a lot of conceptual similarity with tp_share+tp_free * a CIV-like proxy + you wrap the object, send() the proxy, and recv() a proxy + this is entirely compatible with tp_share() Here are what I consider the key metrics relative to the utility of a solution (not in any significant order): * how hard to understand as a Python programmer? * how much extra work (if any) for folks calling Channel.send()? 
* how much extra work (if any) for folks calling Channel.recv()? * how complex is the CPython implementation? * how hard to understand as a type author (wanting to add support for their type)? * how hard to add support for a new type? * what variety of types could be supported? * what breadth of experimentation opens up? The most important thing to me is keeping things simple for Python programmers. After that is ease-of-use for type authors. However, I also want to put us in a good position in 3.7 to experiment extensively with subinterpreters, so that's a big consideration. Consequently, for PEP 554 my goal is to find a solution for object sharing that keeps things simple in Python while laying a basic foundation we can build on at the C level, so we don't get locked in but still maximize our opportunities to experiment. :) -eric From ablacktshirt at gmail.com Thu Oct 5 22:19:42 2017 From: ablacktshirt at gmail.com (Yubin Ruan) Date: Fri, 6 Oct 2017 10:19:42 +0800 Subject: [Python-Dev] how/where is open() implemented ? Message-ID: Hi, I am looking for the implementation of open() in the src, but so far I am not able to do this. >From my observation, the implementation of open() in python2/3 does not employ the open(2) system call. However without open(2) how can one possibly obtain a file descriptor? Yubin From ncoghlan at gmail.com Thu Oct 5 22:31:56 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 6 Oct 2017 12:31:56 +1000 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: References: <1507051763.2766090.1126540976.75FD41A3@webmail.messagingengine.com> <1507139631.839954.1127905064.60C88DA2@webmail.messagingengine.com> Message-ID: On 5 October 2017 at 23:44, Barry Warsaw wrote: > On Oct 4, 2017, at 13:53, Benjamin Peterson wrote: > > > It might be helpful to enumerate the usecases for such an API. Perhaps a > > narrow, specialized API could satisfy most needs in a supportable way. > > Currently `python -m dis thing.py` compiles the source then disassembles > it. It would be kind of cool if you could pass a .pyc file to -m dis, in > which case you?d need to unpack the header to get to the code object. A > naive implementation would unpack the magic number and refuse to > disassemble any files that don?t match whatever that version of Python > understands. A more robust (possibly 3rd party) implementation could > potentially disassemble a range of magic numbers and formats, and an API to > get at the code object and metadata would help. > > I was thinking about the bytecode hacking that some debuggers do. This > API would help them support multiple versions of Python. They could use > the API to discover what pyc format was in use, extract the code object, > hack the bytecode and possibly rewrite a new PEP 3147 style pyc file with > the debugger bytecodes inserted. > > Third party bytecode optimizers could use the API to unpack multiple > versions of pyc files, do their optimizations, and rewrite new files with > the proper format. > Actually doing that properly also requires keeping track of which opcodes were valid in different versions of the eval loop, so as Guido suggests, such an abstraction layer would make the most sense as a third party project that tracked: - the magic number for each CPython feature release (plus the 3.5.3+ anomaly) - the pyc header format for each CPython feature release - the valid opcode set for each CPython feature release - any other version dependent variations (e.g. 
the expected stack layout for BUILD_MAP changed in Python 3.5, when the evaluation order for dict displays was updated to be key then value, rather than the other way around) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Oct 5 22:35:36 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 6 Oct 2017 12:35:36 +1000 Subject: [Python-Dev] Reorganize Python categories (Core, Library, ...)? In-Reply-To: References: Message-ID: On 6 October 2017 at 06:35, Giampaolo Rodola' wrote: > On Wed, Oct 4, 2017 at 11:52 AM, Victor Stinner > wrote: > >> By the way, we need maybe also a new "module name" field in the bug >> tracker. But then comes the question of normalizing module names. For >> example, should "email.message" be normalized to "email"? Maybe store >> "email.message" but use "email" for search, display the module in the >> issue title, etc. >> >> Victor > > > Personally I've always dreamed about having *all* module names. That would > reflect experts.rst file: > https://github.com/python/devguide/blob/master/experts.rst > Right. One UX note though, based on similarly long lists in the Bugzilla component fields for Fedora and RHEL: list boxes don't scale well to really long lists of items, so such a field would ideally be based on a combo-box with typeahead support. (We have something like that already for the nosy list, where the typeahead support checks for Experts Index entries) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From jelle.zijlstra at gmail.com Thu Oct 5 22:53:29 2017 From: jelle.zijlstra at gmail.com (Jelle Zijlstra) Date: Thu, 5 Oct 2017 19:53:29 -0700 Subject: [Python-Dev] how/where is open() implemented ? In-Reply-To: References: Message-ID: 2017-10-05 19:19 GMT-07:00 Yubin Ruan : > Hi, > I am looking for the implementation of open() in the src, but so far I > am not able to do this. > > In Python 3, builtins.open is the same as io.open, which is implemented in the _io_open function in Modules/_io/_iomodule.c. > From my observation, the implementation of open() in python2/3 does > not employ the open(2) system call. However without open(2) how can > one possibly obtain a file descriptor? > There is a call to open() (the C function) in _io_FileIO___init___impl in Modules/_io/fileio.c. I haven't traced through all the code, but I suspect builtins.open ends up calling that. > > Yubin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > jelle.zijlstra%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Thu Oct 5 23:00:22 2017 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 6 Oct 2017 04:00:22 +0100 Subject: [Python-Dev] how/where is open() implemented ? In-Reply-To: References: Message-ID: <01038caa-103c-f775-f152-ca898fa56be4@mrabarnett.plus.com> On 2017-10-06 03:19, Yubin Ruan wrote: > Hi, > I am looking for the implementation of open() in the src, but so far I > am not able to do this. > >>From my observation, the implementation of open() in python2/3 does > not employ the open(2) system call. However without open(2) how can > one possibly obtain a file descriptor? 
> I think it's somewhere in here: https://github.com/python/cpython/blob/master/Modules/_io/fileio.c From ncoghlan at gmail.com Thu Oct 5 23:38:23 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 6 Oct 2017 13:38:23 +1000 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: Message-ID: On 6 October 2017 at 11:48, Eric Snow wrote: > > And that's the real pay-off that comes from defining this in terms of the > > memoryview protocol: Py_buffer structs *aren't* Python objects, so it's > only > > a regular C struct that gets passed across the interpreter boundary (the > > reference to the original objects gets carried along passively as part of > > the CIV - it never gets *used* in the receiving interpreter). > > Yeah, the (PEP 3118) buffer protocol offers precedent in a number of > ways that are applicable to channels here. I'm simply reticent to > lock PEP 554 into such a specific solution as the buffer-specific CIV. > I'm trying to accommodate anticipated future needs while keeping the > PEP as simple and basic as possible. It's driving me nuts! :P Things > were *much* simpler before I added Channels to the PEP. :) > Starting with memory-sharing only doesn't lock us into anything, since you can still add a more flexible kind of channel based on a different protocol later if it turns out that memory sharing isn't enough. By contrast, if you make the initial channel semantics incompatible with multiprocessing by design, you *will* prevent anyone from experimenting with replicating the shared memory based channel API for communicating between processes :) That said, if you'd prefer to keep the "Channel" name available for the possible introduction of object channels at a later date, you could call the initial memoryview based channel a "MemChannel". > > I don't think we should be touching the behaviour of core builtins > solely to > > enable message passing to subinterpreters without a shared GIL. > > Keep in mind that I included the above as a possible solution using > tp_share() that would work *after* we stop sharing the GIL. My point > is that with tp_share() we have a solution that works now *and* will > work later. I don't care how we use tp_share to do so. :) I long to > be able to say in the PEP that you can pass bytes through the channel > and get bytes on the other side. > Memory views are a builtin type as well, and they emphasise the practical benefit we're trying to get relative to typical multiprocessing arranagements: zero-copy data sharing. So here's my proposed experimentation-enabling development strategy: 1. Start out with a MemChannel API, that accepts any buffer-exporting object as input, and outputs only a cross-interpreter memoryview subclass 2. Use that as the basis for the work to get to a per-interpreter locking arrangement that allows subinterpreters to fully exploit multiple CPUs 3. Only then try to design a Channel API that allows for sharing builtin immutable objects between interpreters (bytes, strings, numbers), at a time when you can be certain you won't be inadvertently making it harder to make the GIL a truly per-interpreter lock, rather than the current process global runtime lock. 
The key benefit of this approach is that we *know* MemChannel can work: the buffer protocol already operates at the level of C structs and pointers, not Python objects, and there are already plenty of interesting buffer-protocol-supporting objects around, so as long as the CIV switches interpreters at the right time, there aren't any fundamentally new runtime level capabilities needed to implement it. The lower level MemChannel API could then also be replicated for multiprocessing, while the higher level more speculative object-based Channel API would be specific to subinterpreters (and probably only ever designed and implemented if you first succeed in making subinterpreters sufficiently independent that they don't rely on a process-wide GIL any more). So I'm not saying "Never design an object-sharing protocol specifically for use with subinterpreters". I'm saying "You don't have a demonstrated need for that yet, so don't try to define it until you do". > My mind is drawn to the comparison between that and the question of > CIV vs. tp_share(). CIV would be more like the post-451 import world, > where I expect the CIV would take care of the data sharing operations. > That said, the situation in PEP 554 is sufficiently different that I'm > not convinced a generic CIV protocol would be better. I'm not sure > how much CIV could do for you over helpers+tp_share. > > Anyway, here are the leading approaches that I'm looking at now: > > * adding a tp_share slot > + you send() the object directly and recv() the object coming out of > tp_share() > (which will probably be the same type as the original) > + this would eventually require small changes in tp_free for > participating types > + we would likely provide helpers (eventually), similar to the new > buffer protocol, > to make it easier to manage sharing data > I'm skeptical about this approach because you'll be designing in a vacuum against future possible constraints that you can't test yet: the inherent complexity in the object sharing protocol will come from *not* having a process-wide GIL, but you'll be starting out with a process-wide GIL still in place. And that means third parties will inevitably rely on the process-wide GIL in their tp_share implementations (despite their best intentions), and you'll end up with the same issue that causes problems for the rest of the C API. By contrast, if you delay this step until *after* the GIL has successfully been shifted to being per-interpreter, then by the time the new protocol is defined, people will also be able to test their tp_share implementations properly. At that point, you'd also presumably have evidence of demand to justify the introduction of a new core language protocol, as: * folks will only complain about the limitations of MemChannel if they're actually using subinterpreters * the complaints about the limitations of MemChannel would help guide the object sharing protocol design > * simulating tp_share via an external global registry (or a registry > on the Channel type) > + it would still be hard to make work without hooking into tp_free() > * CIVs hard-coded in Channel (or BufferViewChannel, etc.) for specific > types (e.g. 
buffers) > + you send() the object like normal, but recv() the view > * a CIV protocol on Channel by which you can add support for more types > + you send() the object like normal but recv() the view > + could work through subclassing or a registry > + a lot of conceptual similarity with tp_share+tp_free > * a CIV-like proxy > + you wrap the object, send() the proxy, and recv() a proxy > + this is entirely compatible with tp_share() > * Allow for multiple channel types, such that MemChannel is merely the *first* channel type, rather than the *only* channel type + Allows PEP 554 to be restricted to things we already know can be made to work + Doesn't block the introduction of an object-sharing based Channel in some future release + Allows for at least some channel types to be adapted for use with shared memory and multiprocessing > Here are what I consider the key metrics relative to the utility of a > solution (not in any significant order): > > * how hard to understand as a Python programmer? > Not especially important yet - this is more a criterion for the final API, not the initial experimental platform. > * how much extra work (if any) for folks calling Channel.send()? > * how much extra work (if any) for folks calling Channel.recv()? > I don't think either are particularly important yet, although we also don't want to raise any pointless barriers to experimentation. > * how complex is the CPython implementation? > This is critical, since we want to minimise any potential for undesirable side effects on regular single interpreter code. > * how hard to understand as a type author (wanting to add support for > their type)? > * how hard to add support for a new type? > * what variety of types could be supported? > * what breadth of experimentation opens up? > You missed the big one: what risk does the initial channel design pose to the underlying objective of making the GIL a genuinely per-interpreter lock? If we don't eventually reach the latter goal, then subinterpreters won't really offer much in the way of compelling benefits over just using a thread pool and queue.Queue. MemChannel poses zero additional risk to that, since we wouldn't be sharing actual Python objects between interpreters, only C pointers and structs. By contrast, introducing an object channel early poses significant new risks to that goal, since it will force you to solve hard protocol design and refcount management problems *before* making the switch, rather than being able to defer the design of the object channel protocol until *after* you've already enabled the ability to run subinterpreters in completely independent threads. > The most important thing to me is keeping things simple for Python > programmers. After that is ease-of-use for type authors. However, I > also want to put us in a good position in 3.7 to experiment > extensively with subinterpreters, so that's a big consideration. > > Consequently, for PEP 554 my goal is to find a solution for object > sharing that keeps things simple in Python while laying a basic > foundation we can build on at the C level, so we don't get locked in > but still maximize our opportunities to experiment. 
:) > I think our priorities are quite different then, as I believe PEP 554 should be focused on defining a relatively easy to implement API that nevertheless makes it possible to write interesting programs while working on the goal of making the GIL per-interpreter, without worrying too much about whether or not the initial cross-interpreter communication channels closely resemble the final ones that will be intended for more general use. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From status at bugs.python.org Fri Oct 6 12:09:52 2017 From: status at bugs.python.org (Python tracker) Date: Fri, 6 Oct 2017 18:09:52 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20171006160952.A856E11A85C@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2017-09-29 - 2017-10-06) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 6225 (+17) closed 37243 (+62) total 43468 (+79) Open issues with patches: 2393 Issues opened (53) ================== #11063: Rework uuid module: lazy initialization and add a new C extens https://bugs.python.org/issue11063 reopened by haypo #31178: [EASY] subprocess: TypeError: can't concat str to bytes, in _e https://bugs.python.org/issue31178 reopened by haypo #31415: Add -X option to show import time https://bugs.python.org/issue31415 reopened by terry.reedy #31639: http.server and SimpleHTTPServer hang after a few requests https://bugs.python.org/issue31639 opened by mattpr #31640: Document exit() from parse_args https://bugs.python.org/issue31640 opened by CharlesMerriam #31642: None value in sys.modules no longer blocks import https://bugs.python.org/issue31642 opened by christian.heimes #31643: test_uuid: test_getnode and test_windll_getnode fail if connec https://bugs.python.org/issue31643 opened by Ivan.Pozdeev #31645: openssl build fails in win32 if .pl extension is not associate https://bugs.python.org/issue31645 opened by Ivan.Pozdeev #31647: asyncio: StreamWriter write_eof() after close raises mysteriou https://bugs.python.org/issue31647 opened by twisteroid ambassador #31650: implement PEP 552 https://bugs.python.org/issue31650 opened by benjamin.peterson #31652: make install fails: no module _ctypes https://bugs.python.org/issue31652 opened by Dandan Lee #31653: Don't release the GIL if we can acquire a multiprocessing sema https://bugs.python.org/issue31653 opened by Daniel Colascione #31654: ctypes should support atomic operations https://bugs.python.org/issue31654 opened by Daniel Colascione #31655: SimpleNamespace accepts non-string keyword names https://bugs.python.org/issue31655 opened by serhiy.storchaka #31658: xml.sax.parse won't accept path objects https://bugs.python.org/issue31658 opened by craigh #31659: ssl module should not use textwrap for wrapping PEM format. 
https://bugs.python.org/issue31659 opened by inada.naoki #31660: sys.executable different in os.execv'd python3.6 virtualenv se https://bugs.python.org/issue31660 opened by Stephen Moore #31664: Add support of new crypt methods https://bugs.python.org/issue31664 opened by serhiy.storchaka #31665: Edit "Setting [windows] environmental variables" https://bugs.python.org/issue31665 opened by terry.reedy #31666: Pandas_datareader Error Message - ModuleNotFoundError: No modu https://bugs.python.org/issue31666 opened by Scott Tucholka #31667: Wrong links in the gettext.NullTranslations class https://bugs.python.org/issue31667 opened by linkid #31668: "fixFirefoxAnchorBug" function in doctools.js causes navigatin https://bugs.python.org/issue31668 opened by fireattack #31670: Associate .wasm with application/wasm https://bugs.python.org/issue31670 opened by flagxor #31672: string.Template should use re.ASCII flag https://bugs.python.org/issue31672 opened by inada.naoki #31674: Buildbots: random "Failed to connect to github.com port 443: C https://bugs.python.org/issue31674 opened by haypo #31676: test.test_imp.ImportTests.test_load_source has side effects https://bugs.python.org/issue31676 opened by serhiy.storchaka #31678: Incorrect C Function name for timedelta https://bugs.python.org/issue31678 opened by phobosmir #31680: Expose curses library name and version on Python level https://bugs.python.org/issue31680 opened by serhiy.storchaka #31681: pkgutil.get_data() leaks open files in Python 2.7 https://bugs.python.org/issue31681 opened by Elvis.Pranskevichus #31683: a stack overflow on windows in faulthandler._fatal_error() https://bugs.python.org/issue31683 opened by Oren Milman #31684: Scientific formatting of decimal 0 different from float 0 https://bugs.python.org/issue31684 opened by Aaron.Meurer #31686: GZip library doesn't properly close files https://bugs.python.org/issue31686 opened by Jake Lever #31687: test_semaphore_tracker() of test_multiprocessing_spawn fails r https://bugs.python.org/issue31687 opened by haypo #31690: Make RE "a", "L" and "u" inline flags local https://bugs.python.org/issue31690 opened by serhiy.storchaka #31691: Include missing info on required build steps and how to build https://bugs.python.org/issue31691 opened by Ivan.Pozdeev #31692: [2.7] Test `test_huntrleaks()` of test_regrtest fails in debug https://bugs.python.org/issue31692 opened by ishcherb #31694: Running Windows installer with LauncherOnly=1 should not regis https://bugs.python.org/issue31694 opened by uranusjr #31695: Improve bigmem tests https://bugs.python.org/issue31695 opened by serhiy.storchaka #31698: Add REQ_NAME to the node.h API https://bugs.python.org/issue31698 opened by Jelle Zijlstra #31699: Deadlocks in `concurrent.futures.ProcessPoolExecutor` with pic https://bugs.python.org/issue31699 opened by tomMoral #31700: one-argument version for Generator.typing https://bugs.python.org/issue31700 opened by srittau #31701: faulthandler dumps 'Windows fatal exception: code 0xe06d7363' https://bugs.python.org/issue31701 opened by Fynn Be #31702: Allow to specify the number of rounds for SHA-* hashing in cry https://bugs.python.org/issue31702 opened by serhiy.storchaka #31704: HTTP check lowercase response from proxy https://bugs.python.org/issue31704 opened by alvaromunoz #31705: test_sha256 from test_socket fails on ppc64le arch https://bugs.python.org/issue31705 opened by cstratak #31706: urlencode should accept generator as values for mappings when https://bugs.python.org/issue31706 opened 
by freitafr #31710: setup.py: _ctypes won't getbuilt when system ffi is only in $P https://bugs.python.org/issue31710 opened by pmpp #31711: ssl.SSLSocket.send(b"") fails https://bugs.python.org/issue31711 opened by joernheissler #31712: subprocess with stderr=subprocess.STDOUT hang https://bugs.python.org/issue31712 opened by l4mer #31713: python3 python-config script generates invalid includes https://bugs.python.org/issue31713 opened by matthewlweber #31714: Improve re documentation https://bugs.python.org/issue31714 opened by serhiy.storchaka #31715: Add mimetype for extension .mjs https://bugs.python.org/issue31715 opened by bradleymeck #31717: Socket documentation threading misstep? https://bugs.python.org/issue31717 opened by apoplexy Most recent 15 issues with no replies (15) ========================================== #31717: Socket documentation threading misstep? https://bugs.python.org/issue31717 #31715: Add mimetype for extension .mjs https://bugs.python.org/issue31715 #31713: python3 python-config script generates invalid includes https://bugs.python.org/issue31713 #31711: ssl.SSLSocket.send(b"") fails https://bugs.python.org/issue31711 #31705: test_sha256 from test_socket fails on ppc64le arch https://bugs.python.org/issue31705 #31704: HTTP check lowercase response from proxy https://bugs.python.org/issue31704 #31702: Allow to specify the number of rounds for SHA-* hashing in cry https://bugs.python.org/issue31702 #31700: one-argument version for Generator.typing https://bugs.python.org/issue31700 #31699: Deadlocks in `concurrent.futures.ProcessPoolExecutor` with pic https://bugs.python.org/issue31699 #31698: Add REQ_NAME to the node.h API https://bugs.python.org/issue31698 #31695: Improve bigmem tests https://bugs.python.org/issue31695 #31691: Include missing info on required build steps and how to build https://bugs.python.org/issue31691 #31687: test_semaphore_tracker() of test_multiprocessing_spawn fails r https://bugs.python.org/issue31687 #31686: GZip library doesn't properly close files https://bugs.python.org/issue31686 #31681: pkgutil.get_data() leaks open files in Python 2.7 https://bugs.python.org/issue31681 Most recent 15 issues waiting for review (15) ============================================= #31715: Add mimetype for extension .mjs https://bugs.python.org/issue31715 #31714: Improve re documentation https://bugs.python.org/issue31714 #31706: urlencode should accept generator as values for mappings when https://bugs.python.org/issue31706 #31691: Include missing info on required build steps and how to build https://bugs.python.org/issue31691 #31690: Make RE "a", "L" and "u" inline flags local https://bugs.python.org/issue31690 #31683: a stack overflow on windows in faulthandler._fatal_error() https://bugs.python.org/issue31683 #31681: pkgutil.get_data() leaks open files in Python 2.7 https://bugs.python.org/issue31681 #31678: Incorrect C Function name for timedelta https://bugs.python.org/issue31678 #31676: test.test_imp.ImportTests.test_load_source has side effects https://bugs.python.org/issue31676 #31672: string.Template should use re.ASCII flag https://bugs.python.org/issue31672 #31670: Associate .wasm with application/wasm https://bugs.python.org/issue31670 #31667: Wrong links in the gettext.NullTranslations class https://bugs.python.org/issue31667 #31664: Add support of new crypt methods https://bugs.python.org/issue31664 #31659: ssl module should not use textwrap for wrapping PEM format. 
https://bugs.python.org/issue31659 #31645: openssl build fails in win32 if .pl extension is not associate https://bugs.python.org/issue31645 Top 10 most discussed issues (10) ================================= #31654: ctypes should support atomic operations https://bugs.python.org/issue31654 13 msgs #11063: Rework uuid module: lazy initialization and add a new C extens https://bugs.python.org/issue11063 11 msgs #31622: Make threading.get_ident() return an opaque type https://bugs.python.org/issue31622 11 msgs #31672: string.Template should use re.ASCII flag https://bugs.python.org/issue31672 10 msgs #31589: Links for French documentation PDF is broken: LaTeX issue with https://bugs.python.org/issue31589 9 msgs #31630: math.tan has poor accuracy near pi/2 on OpenBSD https://bugs.python.org/issue31630 9 msgs #31619: Strange error when convert hexadecimal with underscores to int https://bugs.python.org/issue31619 7 msgs #31415: Add -X option to show import time https://bugs.python.org/issue31415 6 msgs #31684: Scientific formatting of decimal 0 different from float 0 https://bugs.python.org/issue31684 6 msgs #31583: 2to3 call for file in current directory yields error https://bugs.python.org/issue31583 5 msgs Issues closed (60) ================== #20323: Argument Clinic: docstring_prototype output causes build failu https://bugs.python.org/issue20323 closed by zach.ware #21411: Enable Treat Warning as Error on 32-bit Windows https://bugs.python.org/issue21411 closed by zach.ware #23283: Backport Tools/clinic to 3.4 https://bugs.python.org/issue23283 closed by zach.ware #25153: PCbuild/*.vcxproj* should use CRLF line endings https://bugs.python.org/issue25153 closed by zach.ware #25658: PyThread assumes pthread_key_t is an integer, which is against https://bugs.python.org/issue25658 closed by ncoghlan #27494: 2to3 parser failure caused by a comma after a generator expres https://bugs.python.org/issue27494 closed by benjamin.peterson #29041: Reference leaks on Windows https://bugs.python.org/issue29041 closed by zach.ware #30397: Expose regular expression and match objects types in the re mo https://bugs.python.org/issue30397 closed by serhiy.storchaka #30404: Make stdout and stderr truly unbuffered when using -u option https://bugs.python.org/issue30404 closed by serhiy.storchaka #30406: async and await should be keywords in 3.7 https://bugs.python.org/issue30406 closed by yselivanov #30465: FormattedValue expressions have wrong lineno and col_offset in https://bugs.python.org/issue30465 closed by eric.smith #30872: Update curses docs to Python 3 https://bugs.python.org/issue30872 closed by serhiy.storchaka #31158: test_pty: test_basic() fails randomly on Travis CI https://bugs.python.org/issue31158 closed by haypo #31285: a SystemError and an assertion failure in warnings.warn_explic https://bugs.python.org/issue31285 closed by serhiy.storchaka #31336: Speed up _PyType_Lookup() for class creation https://bugs.python.org/issue31336 closed by serhiy.storchaka #31353: Implement PEP 553 - built-in breakpoint() https://bugs.python.org/issue31353 closed by barry #31460: IDLE: Revise ModuleBrowser API https://bugs.python.org/issue31460 closed by terry.reedy #31478: assertion failure in random.seed() in case the seed argument h https://bugs.python.org/issue31478 closed by serhiy.storchaka #31510: test_many_processes() of test_multiprocessing_spawn failed on https://bugs.python.org/issue31510 closed by haypo #31516: current_thread() becomes "dummy" thread during shutdown 
https://bugs.python.org/issue31516 closed by pitrou #31540: Adding context in concurrent.futures.ProcessPoolExecutor https://bugs.python.org/issue31540 closed by pitrou #31555: Windows pyd slower when not loaded via load_dynamic https://bugs.python.org/issue31555 closed by steve.dower #31556: asyncio.wait_for can cancel futures faster with timeout==0 https://bugs.python.org/issue31556 closed by yselivanov #31574: Add dtrace hook for importlib https://bugs.python.org/issue31574 closed by lukasz.langa #31581: Reduce the number of imports for functools https://bugs.python.org/issue31581 closed by inada.naoki #31592: assertion failure in Python/ast.c in case of a bad unicodedata https://bugs.python.org/issue31592 closed by serhiy.storchaka #31596: expose pthread_getcpuclockid in time module https://bugs.python.org/issue31596 closed by benjamin.peterson #31602: assertion failure in zipimporter.get_source() in case of a bad https://bugs.python.org/issue31602 closed by brett.cannon #31627: test_mailbox fails if the hostname is empty https://bugs.python.org/issue31627 closed by serhiy.storchaka #31634: Consider installing wheel in ensurepip by default https://bugs.python.org/issue31634 closed by ncoghlan #31638: zipapp module should support compression https://bugs.python.org/issue31638 closed by paul.moore #31641: concurrent.futures.as_completed() no longer accepts arbitrary https://bugs.python.org/issue31641 closed by ned.deily #31644: bug in datetime.datetime.timestamp https://bugs.python.org/issue31644 closed by eric.smith #31646: bug in time.mktime https://bugs.python.org/issue31646 closed by r.david.murray #31648: Improve ElementPath https://bugs.python.org/issue31648 closed by serhiy.storchaka #31649: IDLE: Make _htest, _utest parameters keyword-only. https://bugs.python.org/issue31649 closed by terry.reedy #31651: io.FileIO cannot write more than 2GB (-4096) bytes??? must be https://bugs.python.org/issue31651 closed by benjamin.peterson #31656: Bitwise operations for bytes-type https://bugs.python.org/issue31656 closed by r.david.murray #31657: unit test for optimization levels does not cover __debug__ cas https://bugs.python.org/issue31657 closed by Mariatta #31661: Issues with request rate in robotparser https://bugs.python.org/issue31661 closed by berker.peksag #31662: trivial typos in Tools/msi/uploadrelease.bat https://bugs.python.org/issue31662 closed by steve.dower #31663: pyautogui.typewrite() method doesn't work as expected. https://bugs.python.org/issue31663 closed by terry.reedy #31669: string.Template: code, docs and PEP all disagree on definition https://bugs.python.org/issue31669 closed by serhiy.storchaka #31671: IntFlag makes re.compile slower https://bugs.python.org/issue31671 closed by inada.naoki #31673: Fix the name of Tkinter's adderrorinfo method https://bugs.python.org/issue31673 closed by serhiy.storchaka #31675: Tkinter: memory leak in splitlines() and split() https://bugs.python.org/issue31675 closed by serhiy.storchaka #31677: email.header uses re.IGNORECASE without re.ASCII https://bugs.python.org/issue31677 closed by inada.naoki #31679: pydot missing write, write_png, etc https://bugs.python.org/issue31679 closed by zach.ware #31682: Exception: Cannot import `win32api`! 
https://bugs.python.org/issue31682 closed by zach.ware #31685: Cannot connect to CE path `127.0.0.1:8010` https://bugs.python.org/issue31685 closed by zach.ware #31688: scope error https://bugs.python.org/issue31688 closed by r.david.murray #31689: random.choices does not work with negative weights https://bugs.python.org/issue31689 closed by rhettinger #31693: Document Py_GETENV https://bugs.python.org/issue31693 closed by barry #31696: don't mention GCC in sys.version when built with Clang https://bugs.python.org/issue31696 closed by benjamin.peterson #31697: Regression in futures.as_completed with ProcessPoolExecutor. https://bugs.python.org/issue31697 closed by coady #31703: [EASY] Running test_builtin twice fails on input tty tests https://bugs.python.org/issue31703 closed by serhiy.storchaka #31707: Irrational fractions https://bugs.python.org/issue31707 closed by mark.dickinson #31708: Allow use of asynchronous generator expressions in synchronous https://bugs.python.org/issue31708 closed by yselivanov #31709: Drop support for asynchronous __aiter__ https://bugs.python.org/issue31709 closed by yselivanov #31716: os.path.isdir returns true for dots https://bugs.python.org/issue31716 closed by morha13 From k7hoven at gmail.com Fri Oct 6 12:29:24 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Fri, 6 Oct 2017 19:29:24 +0300 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: Message-ID: While I'm actually trying not to say much here so that I can avoid this discussion now, here's just a couple of ideas and thoughts from me at this point: (A) Instead of sending bytes and receiving memoryviews, one could consider sending *and* receiving memoryviews for now. That could then be extended into more types of objects in the future without changing the basic concept of the channel. Probably, the memoryview would need to be copied (but not the data of course). But I'm guessing copying a memoryview would be quite fast. This would hopefully require less API changes or additions in the future. OTOH, giving it a different name like MemChannel or making it 3rd party will buy some more time to figure out the right API. But maybe that's not needed. (B) We would probably then like to pretend that the object coming out the other end of a Channel *is* the original object. As long as these channels are the only way to directly pass objects between interpreters, there are essentially only two ways to tell the difference (AFAICT): 1. Calling id(...) and sending it over to the other interpreter and checking if it's the same. 2. When the same object is sent twice to the same interpreter. Then one can compare the two with id(...) or using the `is` operator. There are solutions to the problems too: 1. Send the id() from the sending interpreter along with the sent object so that the receiving interpreter can somehow attach it to the object and then return it from id(...). 2. When an object is received, make a lookup in an interpreter-wide cache to see if an object by this id has already been received. If yes, take that one. Now it should essentially look like the received object is really "the same one" as in the sending interpreter. This should also work with multiple interpreters and multiple channels, as long as the id is always preserved. (C) One further complication regarding memoryview in general is that .release() should probably be propagated to the sending interpreter somehow. 
(D) I think someone already mentioned this one, but would it not be better to start a new interpreter in the background in a new thread by default? I think this would make things simpler and leave more freedom regarding the implementation in the future. If you need to run an interpreter within the current thread, you could perhaps optionally do that too. ??Koos PS. I have lots of thoughts related to this, but I can't afford to engage in them now. (Anyway, it's probably more urgent to get some stuff with PEP 555 and its spin-off thoughts out of the way). On Fri, Oct 6, 2017 at 6:38 AM, Nick Coghlan wrote: > On 6 October 2017 at 11:48, Eric Snow wrote: > >> > And that's the real pay-off that comes from defining this in terms of >> the >> > memoryview protocol: Py_buffer structs *aren't* Python objects, so it's >> only >> > a regular C struct that gets passed across the interpreter boundary (the >> > reference to the original objects gets carried along passively as part >> of >> > the CIV - it never gets *used* in the receiving interpreter). >> >> Yeah, the (PEP 3118) buffer protocol offers precedent in a number of >> ways that are applicable to channels here. I'm simply reticent to >> lock PEP 554 into such a specific solution as the buffer-specific CIV. >> I'm trying to accommodate anticipated future needs while keeping the >> PEP as simple and basic as possible. It's driving me nuts! :P Things >> were *much* simpler before I added Channels to the PEP. :) >> > > Starting with memory-sharing only doesn't lock us into anything, since you > can still add a more flexible kind of channel based on a different protocol > later if it turns out that memory sharing isn't enough. > > By contrast, if you make the initial channel semantics incompatible with > multiprocessing by design, you *will* prevent anyone from experimenting > with replicating the shared memory based channel API for communicating > between processes :) > > That said, if you'd prefer to keep the "Channel" name available for the > possible introduction of object channels at a later date, you could call > the initial memoryview based channel a "MemChannel". > > >> > I don't think we should be touching the behaviour of core builtins >> solely to >> > enable message passing to subinterpreters without a shared GIL. >> >> Keep in mind that I included the above as a possible solution using >> tp_share() that would work *after* we stop sharing the GIL. My point >> is that with tp_share() we have a solution that works now *and* will >> work later. I don't care how we use tp_share to do so. :) I long to >> be able to say in the PEP that you can pass bytes through the channel >> and get bytes on the other side. >> > > Memory views are a builtin type as well, and they emphasise the practical > benefit we're trying to get relative to typical multiprocessing > arranagements: zero-copy data sharing. > > So here's my proposed experimentation-enabling development strategy: > > 1. Start out with a MemChannel API, that accepts any buffer-exporting > object as input, and outputs only a cross-interpreter memoryview subclass > 2. Use that as the basis for the work to get to a per-interpreter locking > arrangement that allows subinterpreters to fully exploit multiple CPUs > 3. 
Only then try to design a Channel API that allows for sharing builtin > immutable objects between interpreters (bytes, strings, numbers), at a time > when you can be certain you won't be inadvertently making it harder to make > the GIL a truly per-interpreter lock, rather than the current process > global runtime lock. > > The key benefit of this approach is that we *know* MemChannel can work: > the buffer protocol already operates at the level of C structs and > pointers, not Python objects, and there are already plenty of interesting > buffer-protocol-supporting objects around, so as long as the CIV switches > interpreters at the right time, there aren't any fundamentally new runtime > level capabilities needed to implement it. > > The lower level MemChannel API could then also be replicated for > multiprocessing, while the higher level more speculative object-based > Channel API would be specific to subinterpreters (and probably only ever > designed and implemented if you first succeed in making subinterpreters > sufficiently independent that they don't rely on a process-wide GIL any > more). > > So I'm not saying "Never design an object-sharing protocol specifically > for use with subinterpreters". I'm saying "You don't have a demonstrated > need for that yet, so don't try to define it until you do". > > > >> My mind is drawn to the comparison between that and the question of >> CIV vs. tp_share(). CIV would be more like the post-451 import world, >> where I expect the CIV would take care of the data sharing operations. >> That said, the situation in PEP 554 is sufficiently different that I'm >> not convinced a generic CIV protocol would be better. I'm not sure >> how much CIV could do for you over helpers+tp_share. >> >> Anyway, here are the leading approaches that I'm looking at now: >> >> * adding a tp_share slot >> + you send() the object directly and recv() the object coming out of >> tp_share() >> (which will probably be the same type as the original) >> + this would eventually require small changes in tp_free for >> participating types >> + we would likely provide helpers (eventually), similar to the new >> buffer protocol, >> to make it easier to manage sharing data >> > > I'm skeptical about this approach because you'll be designing in a vacuum > against future possible constraints that you can't test yet: the inherent > complexity in the object sharing protocol will come from *not* having a > process-wide GIL, but you'll be starting out with a process-wide GIL still > in place. And that means third parties will inevitably rely on the > process-wide GIL in their tp_share implementations (despite their best > intentions), and you'll end up with the same issue that causes problems for > the rest of the C API. > > By contrast, if you delay this step until *after* the GIL has successfully > been shifted to being per-interpreter, then by the time the new protocol is > defined, people will also be able to test their tp_share implementations > properly. 
> > At that point, you'd also presumably have evidence of demand to justify > the introduction of a new core language protocol, as: > > * folks will only complain about the limitations of MemChannel if they're > actually using subinterpreters > * the complaints about the limitations of MemChannel would help guide the > object sharing protocol design > > >> * simulating tp_share via an external global registry (or a registry >> on the Channel type) >> + it would still be hard to make work without hooking into tp_free() >> * CIVs hard-coded in Channel (or BufferViewChannel, etc.) for specific >> types (e.g. buffers) >> + you send() the object like normal, but recv() the view >> * a CIV protocol on Channel by which you can add support for more types >> + you send() the object like normal but recv() the view >> + could work through subclassing or a registry >> + a lot of conceptual similarity with tp_share+tp_free >> * a CIV-like proxy >> + you wrap the object, send() the proxy, and recv() a proxy >> + this is entirely compatible with tp_share() >> > > * Allow for multiple channel types, such that MemChannel is merely the > *first* channel type, rather than the *only* channel type > + Allows PEP 554 to be restricted to things we already know can be made > to work > + Doesn't block the introduction of an object-sharing based Channel in > some future release > + Allows for at least some channel types to be adapted for use with > shared memory and multiprocessing > > >> Here are what I consider the key metrics relative to the utility of a >> solution (not in any significant order): >> >> * how hard to understand as a Python programmer? >> > > Not especially important yet - this is more a criterion for the final API, > not the initial experimental platform. > > >> * how much extra work (if any) for folks calling Channel.send()? >> * how much extra work (if any) for folks calling Channel.recv()? >> > > I don't think either are particularly important yet, although we also > don't want to raise any pointless barriers to experimentation. > > >> * how complex is the CPython implementation? >> > > This is critical, since we want to minimise any potential for undesirable > side effects on regular single interpreter code. > > >> * how hard to understand as a type author (wanting to add support for >> their type)? >> * how hard to add support for a new type? >> * what variety of types could be supported? >> * what breadth of experimentation opens up? >> > > You missed the big one: what risk does the initial channel design pose to > the underlying objective of making the GIL a genuinely per-interpreter lock? > > If we don't eventually reach the latter goal, then subinterpreters won't > really offer much in the way of compelling benefits over just using a > thread pool and queue.Queue. > > MemChannel poses zero additional risk to that, since we wouldn't be > sharing actual Python objects between interpreters, only C pointers and > structs. > > By contrast, introducing an object channel early poses significant new > risks to that goal, since it will force you to solve hard protocol design > and refcount management problems *before* making the switch, rather than > being able to defer the design of the object channel protocol until *after* > you've already enabled the ability to run subinterpreters in completely > independent threads. > > >> The most important thing to me is keeping things simple for Python >> programmers. After that is ease-of-use for type authors. 
However, I >> also want to put us in a good position in 3.7 to experiment >> extensively with subinterpreters, so that's a big consideration. >> >> Consequently, for PEP 554 my goal is to find a solution for object >> sharing that keeps things simple in Python while laying a basic >> foundation we can build on at the C level, so we don't get locked in >> but still maximize our opportunities to experiment. :) >> > > I think our priorities are quite different then, as I believe PEP 554 > should be focused on defining a relatively easy to implement API that > nevertheless makes it possible to write interesting programs while working > on the goal of making the GIL per-interpreter, without worrying too much > about whether or not the initial cross-interpreter communication channels > closely resemble the final ones that will be intended for more general use. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > k7hoven%40gmail.com > > -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From yarkot1 at gmail.com Fri Oct 6 14:53:28 2017 From: yarkot1 at gmail.com (Yarko Tymciurak) Date: Fri, 6 Oct 2017 13:53:28 -0500 Subject: [Python-Dev] PEP 553 In-Reply-To: References: <1B4494FE-AA40-4FF7-B81A-B0FB14F5F259@python.org> <11E419CD-6E87-44C6-965B-A07DEF8237AE@python.org> <4D2C4576-CDFC-4D73-A8B5-184B558A5BDC@python.org> Message-ID: apologies - I didn't "reply all" to this. For the record: I made an argument (in reply) about interactive tinkering, and setting "condition", and Guido replied essentially that "if condition: breakpoint()" is just as good for tinkering... a condition parameter to debuggers is not useful, and not as explicit. Yes - agreed (and the `gist` which I tinkered w/ one day - I've now discarded ;-). Thanks, Guido! - Yarko On Wed, Oct 4, 2017 at 9:12 PM, Guido van Rossum wrote: > Yarko, there's one thing I don't understand. Maybe you can enlighten me. > Why would you prefer > > breakpoint(x >= 1000) > > over > > if x >= 1000: breakpoint() > > ? > > The latter seems unambiguous and requires thinking all around. Is there > something in iPython that makes this impractical? > > -- > --Guido van Rossum (python.org/~guido) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sun Oct 8 04:02:01 2017 From: cournape at gmail.com (David Cournapeau) Date: Sun, 8 Oct 2017 17:02:01 +0900 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: References: Message-ID: On Mon, Oct 2, 2017 at 6:42 PM, Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > > > On Oct 2, 2017, at 12:39 AM, Nick Coghlan wrote: > > > > "What requests uses" can identify a useful set of > > avoidable imports. A Flask "Hello world" app could likely provide > > another such sample, as could some example data analysis notebooks). > > Right. It is probably worthwhile to identify which parts of the library > are typically imported but are not ever used. And likewise, identify a > core set of commonly used tools that are going to be almost unavoidable in > sufficiently interesting applications (like using requests to access a REST > API, running a micro-webframework, or invoking mercurial). 
> > Presumably, if any of this is going to make a difference to end users, we > need to see if there is any avoidable work that takes a significant > fraction of the total time from invocation through the point where the user > first sees meaningful output. That would include loading from nonvolatile > storage, executing the various imports, and doing the actual application. > > I don't expect to find anything that would help users of Django, Flask, > and Bottle since those are typically long-running apps where we value > response time more than startup time. > > For scripts using the requests module, there will be some fruit because > not everything that is imported is used. However, that may not be > significant because scripts using requests tend to be I/O bound. In the > timings below, 6% of the running time is used to load and run python.exe, > another 16% is used to import requests, and the remaining 78% is devoted to > the actual task of running a simple REST API query. It would be interesting > to see how much of the 16% could be avoided without major alterations to > requests, to urllib3, and to the standard library. > It is certainly true that for a CLI tool that actually makes any network I/O, especially SSL, import times will quickly be negligible. It becomes tricky for complex tools, because of error management. For example, a common pattern I have used in the past is to have a high level "catch all exceptions" function that dispatch the CLI command: try: main_function(...) except ErrorKind1: .... except requests.exceptions.SSLError: # gives complete message about options when receiving SSL errors, e.g. invalid certificate This pattern requires importing requests every time the command is run, even if no network IO is actually done. For complex CLI tools, maybe most command don't use network IO (the tool in question was a complete packages manager), but you pay ~100 ms because of requests import for every command. It is particularly visible because commands latency starts to be felt around 100-150 ms, and while you can do a lot in python in 100-150 ms, you can't do much in 0-50 ms. David > For mercurial, "hg log" or "hg commit" will likely be instructive about > what portion of the imports actually get used. A push or pull will likely > be I/O bound so those commands are less informative. > > > Raymond > > > --------- Quick timing for a minimal script using the requests module > ----------- > > $ cat > demo_github_rest_api.py > import requests > info = requests.get('https://api.github.com/users/raymondh').json() > print('%(name)s works at %(company)s. Contact at %(email)s' % info) > > $ time python3.6 demo_github_rest_api.py > Raymond Hettinger works at SauceLabs. Contact at None > > real 0m0.561s > user 0m0.134s > sys 0m0.018s > > $ time python3.6 -c "import requests" > > real 0m0.125s > user 0m0.104s > sys 0m0.014s > > $ time python3.6 -c "" > > real 0m0.036s > user 0m0.024s > sys 0m0.005s > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > cournape%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rosuav at gmail.com Sun Oct 8 07:44:51 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 8 Oct 2017 22:44:51 +1100 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: References: Message-ID: On Sun, Oct 8, 2017 at 7:02 PM, David Cournapeau wrote: > It is certainly true that for a CLI tool that actually makes any network > I/O, especially SSL, import times will quickly be negligible. It becomes > tricky for complex tools, because of error management. For example, a common > pattern I have used in the past is to have a high level "catch all > exceptions" function that dispatch the CLI command: > > try: > main_function(...) > except ErrorKind1: > .... > except requests.exceptions.SSLError: > # gives complete message about options when receiving SSL errors, e.g. > invalid certificate > > This pattern requires importing requests every time the command is run, even > if no network IO is actually done. For complex CLI tools, maybe most command > don't use network IO (the tool in question was a complete packages manager), > but you pay ~100 ms because of requests import for every command. It is > particularly visible because commands latency starts to be felt around > 100-150 ms, and while you can do a lot in python in 100-150 ms, you can't do > much in 0-50 ms. This would be a perfect use-case for lazy importing, then. You'd pay the price of the import only if you get an error that isn't caught by one of the preceding except blocks. ChrisA From k7hoven at gmail.com Sun Oct 8 11:13:19 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sun, 8 Oct 2017 18:13:19 +0300 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: References: Message-ID: On Sun, Oct 8, 2017 at 11:02 AM, David Cournapeau wrote: > > On Mon, Oct 2, 2017 at 6:42 PM, Raymond Hettinger < > raymond.hettinger at gmail.com> wrote: > >> >> > On Oct 2, 2017, at 12:39 AM, Nick Coghlan wrote: >> > >> > "What requests uses" can identify a useful set of >> > avoidable imports. A Flask "Hello world" app could likely provide >> > another such sample, as could some example data analysis notebooks). >> >> Right. It is probably worthwhile to identify which parts of the library >> are typically imported but are not ever used. And likewise, identify a >> core set of commonly used tools that are going to be almost unavoidable in >> sufficiently interesting applications (like using requests to access a REST >> API, running a micro-webframework, or invoking mercurial). >> >> Presumably, if any of this is going to make a difference to end users, we >> need to see if there is any avoidable work that takes a significant >> fraction of the total time from invocation through the point where the user >> first sees meaningful output. That would include loading from nonvolatile >> storage, executing the various imports, and doing the actual application. >> >> I don't expect to find anything that would help users of Django, Flask, >> and Bottle since those are typically long-running apps where we value >> response time more than startup time. >> >> For scripts using the requests module, there will be some fruit because >> not everything that is imported is used. However, that may not be >> significant because scripts using requests tend to be I/O bound. In the >> timings below, 6% of the running time is used to load and run python.exe, >> another 16% is used to import requests, and the remaining 78% is devoted to >> the actual task of running a simple REST API query. 
It would be interesting >> to see how much of the 16% could be avoided without major alterations to >> requests, to urllib3, and to the standard library. >> > > It is certainly true that for a CLI tool that actually makes any network > I/O, especially SSL, import times will quickly be negligible. It becomes > tricky for complex tools, because of error management. For example, a > common pattern I have used in the past is to have a high level "catch all > exceptions" function that dispatch the CLI command: > > try: > main_function(...) > except ErrorKind1: > .... > except requests.exceptions.SSLError: > # gives complete message about options when receiving SSL errors, e.g. > invalid certificate > > This pattern requires importing requests every time the command is run, > even if no network IO is actually done. For complex CLI tools, maybe most > command don't use network IO (the tool in question was a complete packages > manager), but you pay ~100 ms because of requests import for every command. > It is particularly visible because commands latency starts to be felt > around 100-150 ms, and while you can do a lot in python in 100-150 ms, you > can't do much in 0-50 ms. > > Yes. ?OTOH, ?it can also happen that the *imports* are in fact what use the network IO. At the office, I usually import from a network drive. For instance, `import requests` takes a little less than a second, and `import IPython` usually takes more than a second, with some variation. ??Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Sun Oct 8 11:24:13 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sun, 8 Oct 2017 18:24:13 +0300 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: References: Message-ID: On Sun, Oct 8, 2017 at 2:44 PM, Chris Angelico wrote: > On Sun, Oct 8, 2017 at 7:02 PM, David Cournapeau > wrote: > > It is certainly true that for a CLI tool that actually makes any network > > I/O, especially SSL, import times will quickly be negligible. It becomes > > tricky for complex tools, because of error management. For example, a > common > > pattern I have used in the past is to have a high level "catch all > > exceptions" function that dispatch the CLI command: > > > > try: > > main_function(...) > > except ErrorKind1: > > .... > > except requests.exceptions.SSLError: > > # gives complete message about options when receiving SSL errors, > e.g. > > invalid certificate > > > > This pattern requires importing requests every time the command is run, > even > > if no network IO is actually done. For complex CLI tools, maybe most > command > > don't use network IO (the tool in question was a complete packages > manager), > > but you pay ~100 ms because of requests import for every command. It is > > particularly visible because commands latency starts to be felt around > > 100-150 ms, and while you can do a lot in python in 100-150 ms, you > can't do > > much in 0-50 ms. > > This would be a perfect use-case for lazy importing, then. You'd pay > the price of the import only if you get an error that isn't caught by > one of the preceding except blocks. > ?I suppose it might be convenient to be able to do something like: with autoimport: try: main_function(...) ? except ErrorKind1: ... except requests.exceptions.SLLError: ... The easiest workaround at the moment is still pretty clumsy: def import_SLLError(): from requests.exceptions import SLLError return SLLError ... 
except import_SLLError(): But what happens if that gives you an ImportError? ??Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Sun Oct 8 18:46:12 2017 From: eric at trueblade.com (Eric V. Smith) Date: Sun, 8 Oct 2017 18:46:12 -0400 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: References: Message-ID: <701464c4-94fa-fdfe-0e7d-121e8df428be@trueblade.com> > The easiest workaround at the moment is still pretty clumsy: > > def import_SLLError(): > ? ? from requests.exceptions import SLLError > ? ? return SLLError > > ... > > > ? ? except import_SLLError(): > > > But what happens if that gives you an ImportError? You can't catch a requests exception unless requests has already been imported, you could do something like: except Exception as ex: if 'requests' in sys.modules: import requests # this is basically free at this point if isinstance(ex, requests.exceptions): ... Eric. From ncoghlan at gmail.com Sun Oct 8 22:27:17 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 9 Oct 2017 12:27:17 +1000 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: Message-ID: On 7 October 2017 at 02:29, Koos Zevenhoven wrote: > While I'm actually trying not to say much here so that I can avoid this > discussion now, here's just a couple of ideas and thoughts from me at this > point: > > (A) > Instead of sending bytes and receiving memoryviews, one could consider > sending *and* receiving memoryviews for now. That could then be extended > into more types of objects in the future without changing the basic concept > of the channel. Probably, the memoryview would need to be copied (but not > the data of course). But I'm guessing copying a memoryview would be quite > fast. > The proposal is to allow sending any buffer-exporting object, so sending a memoryview would be supported. > This would hopefully require less API changes or additions in the future. > OTOH, giving it a different name like MemChannel or making it 3rd party > will buy some more time to figure out the right API. But maybe that's not > needed. > I think having both a memory-centric data channel and an object-centric data channel would be useful long term, so I don't see a lot of downsides to starting with the easier-to-implement MemChannel, and then looking at how to define a plain Channel later. For example, it occurs to me is that the closest current equivalent we have to an object level counterpart to the memory buffer protocol would be the weak reference protocol, wherein a multi-interpreter-aware proxy object could actually take care of switching interpreters as needed when manipulating reference counts. While weakrefs themselves wouldn't be usable in the general case (many builtin types don't support weak references, and we'd want to support strong cross-interpreter references anyway), a wrapt-style object proxy would provide us with a way to maintain a single strong reference to the original object in its originating interpreter (implicitly switching to that interpreter as needed), while also maintaining a regular local reference count on the proxy object in the receiving interpreter. And here's the neat thing: since subinterpreters share an address space, it would be possible to experiment with an object-proxy based channel by passing object pointers over a memoryview based channel. 
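A very rough sketch of that experiment, assuming a PEP 554 style channel whose send() takes bytes and whose recv() returns a memoryview (everything here is CPython-specific, the helper names are made up, and it is only safe while the sending side keeps its reference alive):

    import ctypes
    import struct

    _sent_objects = []  # sender-side registry: keeps the object alive for the receiver

    def send_object_pointer(channel, obj):
        _sent_objects.append(obj)
        # CPython detail: id(obj) is the object's address in the shared address space
        channel.send(struct.pack("P", id(obj)))

    def recv_raw_object(channel):
        addr, = struct.unpack("P", channel.recv())
        # Hands back the original object without any interpreter switching;
        # a real proxy would switch to the sending interpreter before touching
        # refcounts or attributes.
        return ctypes.cast(addr, ctypes.py_object).value

The interesting part is that only the struct-packed pointer ever crosses the channel, so the channel itself stays a pure memory channel.
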
> (B) > We would probably then like to pretend that the object coming out the > other end of a Channel *is* the original object. As long as these channels > are the only way to directly pass objects between interpreters, there are > essentially only two ways to tell the difference (AFAICT): > > 1. Calling id(...) and sending it over to the other interpreter and > checking if it's the same. > > 2. When the same object is sent twice to the same interpreter. Then one > can compare the two with id(...) or using the `is` operator. > > There are solutions to the problems too: > > 1. Send the id() from the sending interpreter along with the sent object > so that the receiving interpreter can somehow attach it to the object and > then return it from id(...). > > 2. When an object is received, make a lookup in an interpreter-wide cache > to see if an object by this id has already been received. If yes, take that > one. > > Now it should essentially look like the received object is really "the > same one" as in the sending interpreter. This should also work with > multiple interpreters and multiple channels, as long as the id is always > preserved. > I don't personally think we want to expend much (if any) effort on presenting the illusion that the objects on either end of the channel are the "same" object, but postponing the question entirely is also one of the benefits I see to starting with MemChannel, and leaving the object-centric Channel until later. > (C) > One further complication regarding memoryview in general is that > .release() should probably be propagated to the sending interpreter somehow. > Yep, switching interpreters when releasing the buffer is the main reason you couldn't use a regular memoryview for this purpose - you need a variant that holds a strong reference to the sending interpreter, and switches back to it for the buffer release operation. > (D) > I think someone already mentioned this one, but would it not be better to > start a new interpreter in the background in a new thread by default? I > think this would make things simpler and leave more freedom regarding the > implementation in the future. If you need to run an interpreter within the > current thread, you could perhaps optionally do that too. > Not really, as that approach doesn't compose as well with existing thread management primitives like concurrent.futures.ThreadPoolExecutor. It also doesn't match the way the existing subinterpreter machinery works, where threads can change their active interpreter. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Sun Oct 8 22:39:28 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 9 Oct 2017 12:39:28 +1000 Subject: [Python-Dev] PEP 554 v3 (new interpreters module) In-Reply-To: References: Message-ID: On 14 September 2017 at 11:44, Eric Snow wrote: > Examples > ======== > > Run isolated code > ----------------- > > :: > > interp = interpreters.create() > print('before') > interp.run('print("during")') > print('after') > A few more suggestions for examples: Running a module: main_module = mod_name interp.run(f"import runpy; runpy.run_module({main_module!r})") Running as script (including zip archives & directories): main_script = path_name interp.run(f"import runpy; runpy.run_path({main_script!r})") Running in a thread pool executor: interps = [interpreters.create() for i in range(5)] with concurrent.futures.ThreadPoolExecutor(max_workers=len(interps)) as pool: print('before') for interp in interps: pool.submit(interp.run, 'print("starting"); print("stopping")' print('after') That last one is prompted by the questions about the benefits of keeping the notion of an interpreter state distinct from the notion of a main thread (it allows a single "MainThread" object to be mapped to different OS level threads at different points in time, which means it's easier to combine with existing constructs for managing OS level thread pools). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Mon Oct 9 12:40:10 2017 From: brett at python.org (Brett Cannon) Date: Mon, 09 Oct 2017 16:40:10 +0000 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: <641723C8-AE9E-44F1-94AE-68A716980318@python.org> References: <641723C8-AE9E-44F1-94AE-68A716980318@python.org> Message-ID: On Mon, Oct 2, 2017, 12:30 Barry Warsaw, wrote: > On Oct 2, 2017, at 14:56, Brett Cannon wrote: > > > So Mercurial specifically is an odd duck because they already do lazy > importing (in fact they are using the lazy loading support from importlib). > In terms of all of this discussion of tweaking import to be lazy, I think > the best approach would be providing an opt-in solution that CLI tools can > turn on ASAP while the default stays eager. That way everyone gets what > they want while the stdlib provides a shared solution that's maintained > alongside import itself to make sure it functions appropriately. > > The problem I think is that to get full benefit of lazy loading, it has to > be turned on globally for bare ?import? statements. A typical application > has tons of dependencies and all those libraries are also doing module > global imports, so unless lazy loading somehow covers them, it?ll be an > incomplete gain. But of course it?ll take forever for all your > dependencies to use whatever new API we come up with, and if it?s not as > convenient to write as ?import foo? then I suspect it won?t much catch on > anyway. > My approach supports global "switch" to make it transparent (see the notebook for details). I'm just saying you could also support a function for lazy importing when you have only a module or two you to be lazy about while being eager otherwise. 
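A sketch of what such an opt-in helper could look like, built on the importlib.util.LazyLoader machinery that already exists in the stdlib (the function name is just illustrative, and error handling for missing modules is omitted):

    import importlib.util
    import sys

    def lazy_import(name):
        """Register *name* in sys.modules, but defer executing it until first attribute access."""
        spec = importlib.util.find_spec(name)
        spec.loader = importlib.util.LazyLoader(spec.loader)
        module = importlib.util.module_from_spec(spec)
        sys.modules[name] = module
        spec.loader.exec_module(module)
        return module

    requests = lazy_import('requests')  # near-free until requests.<something> is touched

Everything else keeps using plain import statements and stays eager.
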
-Brett > -Barry > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Mon Oct 9 12:44:47 2017 From: brett at python.org (Brett Cannon) Date: Mon, 09 Oct 2017 16:44:47 +0000 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: <20171002225522.07e2b34a@fsol> References: <20171002225522.07e2b34a@fsol> Message-ID: On Mon, Oct 2, 2017, 13:56 Antoine Pitrou, wrote: > On Mon, 02 Oct 2017 18:56:15 +0000 > Brett Cannon wrote: > > > > So Mercurial specifically is an odd duck because they already do lazy > > importing (in fact they are using the lazy loading support from > importlib). > > Do they? I was under the impression they had their own home-baked, > GPL-licensed, lazy-loading __import__ re-implementation. > > At least they used to, perhaps they switched to something else > (probably still GPL-licensed, though). > Their Python 3 port wraps the stdlib code in their old API (they showed me the code at PyCon US). So the GPL bit is for API adapting. -Brett > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Mon Oct 9 12:48:07 2017 From: brett at python.org (Brett Cannon) Date: Mon, 09 Oct 2017 16:48:07 +0000 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: <302A53D9-D6F1-4872-ACF2-522889412FA1@mac.com> References: <641723C8-AE9E-44F1-94AE-68A716980318@python.org> <302A53D9-D6F1-4872-ACF2-522889412FA1@mac.com> Message-ID: On Mon, Oct 2, 2017, 17:49 Ronald Oussoren, wrote: > Op 3 okt. 2017 om 04:29 heeft Barry Warsaw het > volgende geschreven: > > > On Oct 2, 2017, at 14:56, Brett Cannon wrote: > > > >> So Mercurial specifically is an odd duck because they already do lazy > importing (in fact they are using the lazy loading support from importlib). > In terms of all of this discussion of tweaking import to be lazy, I think > the best approach would be providing an opt-in solution that CLI tools can > turn on ASAP while the default stays eager. That way everyone gets what > they want while the stdlib provides a shared solution that's maintained > alongside import itself to make sure it functions appropriately. > > > > The problem I think is that to get full benefit of lazy loading, it has > to be turned on globally for bare ?import? statements. A typical > application has tons of dependencies and all those libraries are also doing > module global imports, so unless lazy loading somehow covers them, it?ll be > an incomplete gain. But of course it?ll take forever for all your > dependencies to use whatever new API we come up with, and if it?s not as > convenient to write as ?import foo? then I suspect it won?t much catch on > anyway. > > > > One thing to keep in mind is that imports can have important side-effects. > Turning every import statement into a lazy import will not be backward > compatible. > Yep, and that's a lesson Mercurial shared with me at PyCon US this year. 
My planned approach has a blacklist for modules to only load eagerly. -Brett > Ronald > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Mon Oct 9 13:02:30 2017 From: brett at python.org (Brett Cannon) Date: Mon, 09 Oct 2017 17:02:30 +0000 Subject: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files) In-Reply-To: <20171004175118.66c1b46e@fsol> References: <1507051763.2766090.1126540976.75FD41A3@webmail.messagingengine.com> <20171004175118.66c1b46e@fsol> Message-ID: On Wed, Oct 4, 2017, 09:00 Antoine Pitrou, wrote: > On Wed, 4 Oct 2017 10:14:22 -0400 > Barry Warsaw wrote: > > On Oct 3, 2017, at 13:29, Benjamin Peterson wrote: > > > > > I'm not sure turning the implementation details of our internal formats > > > into APIs is the way to go. > > > > I still think an API in the stdlib would be useful and appropriate, but > it?s not like this couldn?t be done as a 3rd party module. > > It can also be an implementation-specific API for which we don't > guarantee anything in the future. The consenting adults rule would > apply. > I've toyed with the idea of coming up with an API for bytecode files, but having lived through the last file format change I could never come up with one that didn't just chop off the header or would be a maintenance burden. But having an API that followed just what was in that Python release like the ast module would solve that problem. Then 3rd-party code could wrap it it smooth out differences between versions. -Brett > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rhinerfeldnyc at gmail.com Tue Oct 10 15:49:01 2017 From: rhinerfeldnyc at gmail.com (Richard Hinerfeld) Date: Tue, 10 Oct 2017 15:49:01 -0400 Subject: [Python-Dev] Python3.5.4 Compiled on Linux gives the following error messgae Message-ID: richard at debian:~/Python-3.5.4$ ./python -m test -v test_gdb == CPython 3.5.4 (default, Oct 10 2017, 00:27:44) [GCC 4.7.2] == Linux-3.2.0-4-686-pae-i686-with-debian-7.11 little-endian == hash algorithm: siphash24 32bit == /home/richard/Python-3.5.4/build/test_python_5566 Testing with flags: sys.flags(debug=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, ignore_environment=0, verbose=0, bytes_warning=0, quiet=0, hash_randomization=1, isolated=0) 0:00:00 load avg: 0.10 [1/1] test_gdb test_gdb skipped -- gdb not built with embedded python support 1 test skipped: test_gdb Please note I have downloaded a new version of GDB and compiled it with the config option --with-python. I then installed it on my system. I get the same error Thank You -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From victor.stinner at gmail.com Tue Oct 10 17:03:18 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 10 Oct 2017 23:03:18 +0200 Subject: [Python-Dev] Python3.5.4 Compiled on Linux gives the following error messgae In-Reply-To: References: Message-ID: Hi, 2017-10-10 21:49 GMT+02:00 Richard Hinerfeld : > test_gdb skipped -- gdb not built with embedded python support > (...) > Please note I have downloaded a new version of GDB and compiled it with > the config option --with-python. > I then installed it on my system. Python looks for "gdb" program in the PATH. Did you include the path where you installed the more recent and complete gdb in the PATH environment variable? Short extract of test_gdb.py: proc = subprocess.Popen(["gdb", "-nx", "--version"], ...) Victor From python-dev at mgmiller.net Wed Oct 11 16:33:43 2017 From: python-dev at mgmiller.net (Mike Miller) Date: Wed, 11 Oct 2017 13:33:43 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com> <61662340-c713-3ca2-db3c-50e0480f9eb6@mail.de> Message-ID: <9bc33149-3745-c1ce-26a4-4425b0b7a646@mgmiller.net> (Apologies for reviving a dead horse, but may not be around at the blessed time.) As potential names of this concept, I liked record and row, but agreed they were a bit too specific and not quite exact. In my recent (unrelated) reading however, I came across another term and think it might fit better, called an "entity." It has some nice properties: - Traditional dictionary definition, meaning "thing" - Same specificity as the current base-class name: object - Corresponds to a class or instance (depending on context) in data terminology From: http://ewebarchitecture.com/web-databases/database-entities An entity is a thing or object of importance about which data must be captured. Information about an entity is captured in the form of attributes and/or relationships. All things aren't entities?only those about which information should be captured. If something is a candidate for being an entity and it has no attributes or relationships, it isn't an entity. Thoughts? Another candidate is "container" but is not very descriptive. -Mike On 2017-09-16 11:14, Steve Holden wrote: > I therefore propose "row", which is sufficiently neutral to avoid most current > opposition and yet a common field-oriented mechanism for accessing units of > retrieved data by name. From ncoghlan at gmail.com Wed Oct 11 22:56:56 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 12 Oct 2017 12:56:56 +1000 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <9bc33149-3745-c1ce-26a4-4425b0b7a646@mgmiller.net> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com> <61662340-c713-3ca2-db3c-50e0480f9eb6@mail.de> <9bc33149-3745-c1ce-26a4-4425b0b7a646@mgmiller.net> Message-ID: On 12 October 2017 at 06:33, Mike Miller wrote: > (Apologies for reviving a dead horse, but may not be around at the blessed > time.) > > As potential names of this concept, I liked record and row, but agreed > they were a bit too specific and not quite exact. In my recent (unrelated) > reading however, I came across another term and think it might fit better, > called an "entity." 
> > It has some nice properties: > > - Traditional dictionary definition, meaning "thing" > - Same specificity as the current base-class name: object > - Corresponds to a class or instance (depending on context) in data > terminology > >From my perspective, the main benefit of a compound name like "data class" is that it emphasises a deliberate behavioural choice (adopted from attrs): data classes are just regular classes, with some definition time logic to help define data fields. By contrast, if we give them their own name (as with suggestions like record, row, entity), that makes them start to sound more like enums: an alternative base class with different runtime behaviour from a regular class. Cheers, Nick. P.S. I'll grant that this reasoning doesn't entirely mesh with the naming of "Abstract Base Class", but that phrase at least explicitly has the word "base" in it, suggesting that inheritance is involved in the way it works. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From python-dev at mgmiller.net Thu Oct 12 00:49:22 2017 From: python-dev at mgmiller.net (Mike Miller) Date: Wed, 11 Oct 2017 21:49:22 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com> <61662340-c713-3ca2-db3c-50e0480f9eb6@mail.de> <9bc33149-3745-c1ce-26a4-4425b0b7a646@mgmiller.net> Message-ID: <35b4d62e-e104-8436-217c-84e09cd49e7f@mgmiller.net> On 2017-10-11 19:56, Nick Coghlan wrote: > From my perspective, the main benefit of a compound name like "data class" is > that it emphasises a deliberate behavioural choice (adopted from attrs): data > classes are just regular classes, with some definition time logic to help define > data fields. IMO, the problem with the dataclass name isn't the data part, but the "class" part. No other class has "class" in its name(?), not even object. The Department of Redundancy Department will love it. If it must be a compound name, it should rather be dataobject, no? > By contrast, if we give them their own name (as with suggestions like record, > row, entity), that makes them start to sound more like enums: an alternative > base class with different runtime behaviour from a regular class. This pep also adds many methods for use at runtime, though perhaps the behavior is more subtle. > P.S. I'll grant that this reasoning doesn't entirely mesh with the naming of > "Abstract Base Class", but that phrase at least explicitly has the word "base" > in it, suggesting that inheritance is involved in the way it works. There was some discussion over inheritance vs. decoration, not sure if it was settled. (Just noticed that the abc module got away with a class name of "ABC," perhaps dataclass would be more palatable as "DC", though entity sounds a bit nicer.) 
Cheers, -Mike From ncoghlan at gmail.com Thu Oct 12 02:05:43 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 12 Oct 2017 16:05:43 +1000 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <35b4d62e-e104-8436-217c-84e09cd49e7f@mgmiller.net> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com> <61662340-c713-3ca2-db3c-50e0480f9eb6@mail.de> <9bc33149-3745-c1ce-26a4-4425b0b7a646@mgmiller.net> <35b4d62e-e104-8436-217c-84e09cd49e7f@mgmiller.net> Message-ID: On 12 October 2017 at 14:49, Mike Miller wrote: > > On 2017-10-11 19:56, Nick Coghlan wrote: > >> From my perspective, the main benefit of a compound name like "data >> class" is that it emphasises a deliberate behavioural choice (adopted from >> attrs): data classes are just regular classes, with some definition time >> logic to help define data fields. >> > > IMO, the problem with the dataclass name isn't the data part, but the > "class" part. No other class has "class" in its name(?), not even object. > The Department of Redundancy Department will love it. > > If it must be a compound name, it should rather be dataobject, no? > No, because dataclass is the name of a class decorator ("This class is a data class"), not the name of a type. It's akin to "static method", "class method", and "instance method" for function definitions (although the last one isn't a typical decorator, since it's the default behaviour for functions placed inside a class). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From python-dev at mgmiller.net Thu Oct 12 04:20:30 2017 From: python-dev at mgmiller.net (Mike Miller) Date: Thu, 12 Oct 2017 01:20:30 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com> <61662340-c713-3ca2-db3c-50e0480f9eb6@mail.de> <9bc33149-3745-c1ce-26a4-4425b0b7a646@mgmiller.net> Message-ID: <2e0b6992-190d-56a8-4d03-6d7554215101@mgmiller.net> On 2017-10-12 00:36, St?fane Fermigier wrote: > "An object that is not defined by its attributes, but rather by a thread of > continuity and its identity." (from > https://en.wikipedia.org/wiki/Domain-driven_design#Building_blocks) Not sure I follow all this, but Python objects do have identities once instantiated. e.g. >>> id('') > See also the more general Wikipedia definition "An entity is something that > exists as itself, as a subject or as an object, actually or potentially, > concretely or abstractly, physically or not." > (https://en.wikipedia.org/wiki/Entity). > > In the context of DDD, entities are usually opposed to value objects: "An object > that contains attributes but has no conceptual identity. They should be treated > as immutable.". (https://en.wikipedia.org/wiki/Domain-driven_design#Building_blocks) > > Attrs, and by extension the dataclass proposal (I guess), provide some support > for both: > > - Providing support for quickly constructing immutable objects from a bag of > attributes, and providing equality based on those attributes, it helps implement > Value Objects (not sure much more is needed actually) > > - By supporting equality based on some "primary key", it will also help with > maintaining the concept of "equality" in entities. I don't believe either module particularly supports or restricts immutability? 
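(For what it's worth, both libraries do offer opt-in immutability, as the links later in the thread show: attrs via frozen=True and PEP 557 via "frozen instances". A tiny sketch using the dataclasses module as it eventually shipped in Python 3.7:)

from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Point:
    x: int
    y: int

p = Point(1, 2)
try:
    p.x = 10                              # attribute assignment is blocked
except FrozenInstanceError:
    print("frozen instances are immutable after construction")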
-Mike From steve at holdenweb.com Thu Oct 12 06:20:14 2017 From: steve at holdenweb.com (Steve Holden) Date: Thu, 12 Oct 2017 11:20:14 +0100 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <2e0b6992-190d-56a8-4d03-6d7554215101@mgmiller.net> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com> <61662340-c713-3ca2-db3c-50e0480f9eb6@mail.de> <9bc33149-3745-c1ce-26a4-4425b0b7a646@mgmiller.net> <2e0b6992-190d-56a8-4d03-6d7554215101@mgmiller.net> Message-ID: On Thu, Oct 12, 2017 at 9:20 AM, Mike Miller wrote: > > On 2017-10-12 00:36, St?fane Fermigier wrote: > >> "An object that is not defined by its attributes, but rather by a thread >> of continuity and its identity." (from https://en.wikipedia.org/wiki/ >> Domain-driven_design#Building_blocks) >> > > Not sure I follow all this, but Python objects do have identities once > instantiated. e.g. >>> id('') > > ?It seems to me that the quoted document is attempting to make a distinction ?similar to the one between classes (entities) and instances (value objects). The reason I liked "row" as a name is because it resembles "vector" and hence is loosely assocaited with the concept of a tuple as well as being familiar to database users. In fact the answer to a relational query was, I believe, originally formally defined as a set of tuples. ?Sometimes one can simply be too hifalutin' [ http://www.dictionary.com/browse/hifalutin]?, and language that attempts to be precise obscures meaning to the less specialised reader. See also the more general Wikipedia definition "An entity is something that >> exists as itself, as a subject or as an object, actually or potentially, >> concretely or abstractly, physically or not." ( >> https://en.wikipedia.org/wiki/Entity). >> >> In the context of DDD, entities are usually opposed to value objects: "An >> object that contains attributes but has no conceptual identity. They should >> be treated as immutable.". (https://en.wikipedia.org/wiki >> /Domain-driven_design#Building_blocks) >> >> Attrs, and by extension the dataclass proposal (I guess), provide some >> support for both: >> >> - Providing support for quickly constructing immutable objects from a bag >> of attributes, and providing equality based on those attributes, it helps >> implement Value Objects (not sure much more is needed actually) >> >> - By supporting equality based on some "primary key", it will also help >> with maintaining the concept of "equality" in entities. >> > > I don't believe either module particularly supports or restricts > immutability? > > -Mike > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gjcarneiro at gmail.com Thu Oct 12 09:16:21 2017 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Thu, 12 Oct 2017 14:16:21 +0100 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com> <61662340-c713-3ca2-db3c-50e0480f9eb6@mail.de> <9bc33149-3745-c1ce-26a4-4425b0b7a646@mgmiller.net> <2e0b6992-190d-56a8-4d03-6d7554215101@mgmiller.net> Message-ID: On 12 October 2017 at 11:20, Steve Holden wrote: > On Thu, Oct 12, 2017 at 9:20 AM, Mike Miller > wrote: > >> >> On 2017-10-12 00:36, St?fane Fermigier wrote: >> >>> "An object that is not defined by its attributes, but rather by a thread >>> of continuity and its identity." 
(from https://en.wikipedia.org/wiki/ >>> Domain-driven_design#Building_blocks) >>> >> >> Not sure I follow all this, but Python objects do have identities once >> instantiated. e.g. >>> id('') >> >> ?It seems to me that the quoted document is attempting to make a > distinction ?similar to the one between classes (entities) and instances > (value objects). The reason I liked "row" as a name is because it > resembles "vector" and hence is loosely assocaited with the concept of a > tuple as well as being familiar to database users. In fact the answer to a > relational query was, I believe, originally formally defined as a set of > tuples. > But rows and tuples are usually immutable, at least in database terms. These data classes are not immutable (by default). If you want tuple-like behaviour, you can continue to use tuples. I see dataclasses as something closer to C `struct`. Most likely someone already considered `struct` as name; if not, please consider it. Else stick with dataclass, it's a good name IMHO. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Oct 12 10:46:15 2017 From: guido at python.org (Guido van Rossum) Date: Thu, 12 Oct 2017 07:46:15 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com> <61662340-c713-3ca2-db3c-50e0480f9eb6@mail.de> <9bc33149-3745-c1ce-26a4-4425b0b7a646@mgmiller.net> <2e0b6992-190d-56a8-4d03-6d7554215101@mgmiller.net> Message-ID: I am still firmly convinced that @dataclass is the right name for the decorator (and `dataclasses` for the module). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Thu Oct 12 10:52:50 2017 From: barry at python.org (Barry Warsaw) Date: Thu, 12 Oct 2017 10:52:50 -0400 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com> <61662340-c713-3ca2-db3c-50e0480f9eb6@mail.de> <9bc33149-3745-c1ce-26a4-4425b0b7a646@mgmiller.net> <2e0b6992-190d-56a8-4d03-6d7554215101@mgmiller.net> Message-ID: <59BCE37A-BA0F-4556-8377-148E2AC55D0B@python.org> On Oct 12, 2017, at 10:46, Guido van Rossum wrote: > > I am still firmly convinced that @dataclass is the right name for the decorator (and `dataclasses` for the module). Darn, and I was going to suggest they be called EricTheHalfABees, with enums being renamed to EricTheHalfNotBees. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From sf at fermigier.com Thu Oct 12 05:24:18 2017 From: sf at fermigier.com (=?UTF-8?Q?St=C3=A9fane_Fermigier?=) Date: Thu, 12 Oct 2017 11:24:18 +0200 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <2e0b6992-190d-56a8-4d03-6d7554215101@mgmiller.net> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com> <61662340-c713-3ca2-db3c-50e0480f9eb6@mail.de> <9bc33149-3745-c1ce-26a4-4425b0b7a646@mgmiller.net> <2e0b6992-190d-56a8-4d03-6d7554215101@mgmiller.net> Message-ID: On Thu, Oct 12, 2017 at 10:20 AM, Mike Miller wrote: > > On 2017-10-12 00:36, Stéfane Fermigier wrote: > >> "An object that is not defined by its attributes, but rather by a thread >> of continuity and its identity." (from https://en.wikipedia.org/wiki/ >> Domain-driven_design#Building_blocks) >> > > Not sure I follow all this, but Python objects do have identities once > instantiated. e.g. >>> id('') Yes, for the lifetime of the object in the Python VM. But if you are dealing with objects that are persisted using some kind of ORM, ODM, OODB, then it won't work. It's quite common (but not always the best solution) to use some kind of UUID to represent the identity of each entity. Also, there can be circumstances where two objects can exist at the same time in the VM which represent the same object, in which case one should ensure that a == b iff a.uid == b.uid (in the case where 'uid' is the attribute used to carry the unique identifier). > I don't believe either module particularly supports or restricts > immutability? http://www.attrs.org/en/stable/examples.html#immutability https://www.python.org/dev/peps/pep-0557/#frozen-instances S. > > > -Mike > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/sfermigie > r%2Blists%40gmail.com > > -- Stefane Fermigier - http://fermigier.com/ - http://twitter.com/sfermigier - http://linkedin.com/in/sfermigier Founder & CEO, Abilian - Enterprise Social Software - http://www.abilian.com/ Chairman, Free&OSS Group / Systematic Cluster - http://www.gt-logiciel-libre.org/ Co-Chairman, National Council for Free & Open Source Software (CNLL) - http://cnll.fr/ Founder & Organiser, PyData Paris - http://pydata.fr/ --- "You never change things by fighting the existing reality. To change something, build a new model that makes the existing model obsolete." -- R. Buckminster Fuller -------------- next part -------------- An HTML attachment was scrubbed... URL: From sf at fermigier.com Thu Oct 12 03:36:46 2017 From: sf at fermigier.com (=?UTF-8?Q?St=C3=A9fane_Fermigier?=) Date: Thu, 12 Oct 2017 09:36:46 +0200 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: <9bc33149-3745-c1ce-26a4-4425b0b7a646@mgmiller.net> References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com> <61662340-c713-3ca2-db3c-50e0480f9eb6@mail.de> <9bc33149-3745-c1ce-26a4-4425b0b7a646@mgmiller.net> Message-ID: On Wed, Oct 11, 2017 at 10:33 PM, Mike Miller wrote: > (Apologies for reviving a dead horse, but may not be around at the blessed > time.) > > As potential names of this concept, I liked record and row, but agreed > they were a bit too specific and not quite exact.
In my recent (unrelated) > reading however, I came across another term and think it might fit better, > called an "entity." > I'm not familiar with ER modelling but I would advise against using the term "entity", as it has, in domain-driven design (DDD) a very specific meaning: "An object that is not defined by its attributes, but rather by a thread of continuity and its identity." (from https://en.wikipedia.org/wiki/Domain-driven_design#Building_blocks) See also the more general Wikipedia definition "An entity is something that exists as itself, as a subject or as an object, actually or potentially, concretely or abstractly, physically or not." ( https://en.wikipedia.org/wiki/Entity). In the context of DDD, entities are usually opposed to value objects: "An object that contains attributes but has no conceptual identity. They should be treated as immutable.". ( https://en.wikipedia.org/wiki/Domain-driven_design#Building_blocks) Attrs, and by extension the dataclass proposal (I guess), provide some support for both: - Providing support for quickly constructing immutable objects from a bag of attributes, and providing equality based on those attributes, it helps implement Value Objects (not sure much more is needed actually) - By supporting equality based on some "primary key", it will also help with maintaining the concept of "equality" in entities. It would be great if the dataclass proposal could help implement DDD technical concepts in Python, but its terminology should not conflict the DDD terminology, if we want to avoid confusion. Cheers, S. -- Stefane Fermigier - http://fermigier.com/ - http://twitter.com/sfermigier - http://linkedin.com/in/sfermigier Founder & CEO, Abilian - Enterprise Social Software - http://www.abilian.com/ Chairman, Free&OSS Group / Systematic Cluster - http://www.gt-logiciel-libre.org/ Co-Chairman, National Council for Free & Open Source Software (CNLL) - http://cnll.fr/ Founder & Organiser, PyData Paris - http://pydata.fr/ --- ?You never change things by ?ghting the existing reality. To change something, build a new model that makes the existing model obsolete.? ? R. Buckminster Fuller -------------- next part -------------- An HTML attachment was scrubbed... URL: From python-dev at mgmiller.net Thu Oct 12 11:24:31 2017 From: python-dev at mgmiller.net (Mike Miller) Date: Thu, 12 Oct 2017 08:24:31 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com> <61662340-c713-3ca2-db3c-50e0480f9eb6@mail.de> <9bc33149-3745-c1ce-26a4-4425b0b7a646@mgmiller.net> <35b4d62e-e104-8436-217c-84e09cd49e7f@mgmiller.net> Message-ID: On 2017-10-11 23:05, Nick Coghlan wrote: > It's akin to "static method", "class method", and "instance method" for function > definitions (although the last one isn't a typical decorator, since it's the > default behaviour for functions placed inside a class). Thanks, I'm still thinking of it as inheritance for some reason. -Mike From lkb.teichmann at gmail.com Thu Oct 12 14:21:29 2017 From: lkb.teichmann at gmail.com (Martin Teichmann) Date: Thu, 12 Oct 2017 20:21:29 +0200 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <0da1ba37-ff98-5394-ba55-f59855048ff3@trueblade.com> Message-ID: Hi list, first, a big thanks to the authors of PEP 557! Great idea! For me, the dataclasses were a typical example for inheritance, to be more precise, for metaclasses. 
I was astonished to see them implemented using decorators, and I was not the only one, citing Guido: > I think it would be useful to write 1-2 sentences about the problem with > inheritance -- in that case you pretty much have to use a metaclass, and the > use of a metaclass makes life harder for people who want to use their own > metaclass (since metaclasses don't combine without some manual > intervention). Python is at a weird point here. At about every new release of Python, a new idea shows up that could be easily solved using metaclasses, yet every time we hesitate to use them, because of said necessary manual intervention for metaclass combination. So I think we have two options now: We could deprecate metaclasses, going down routes like PEP 487's __init_subclass__. Unfortunately, for data classes __init_subclass__ it is too late in the class creation process for it to influence the __slots__ mechanism. A __new_subclass__, that acts earlier, could do the job, but to me that simply sounds like reinventing the wheel of metaclasses. The other option would be to simply make metaclasses work properly. We would just have to define a way to automatically combine metaclasses. Guido once mention once (here: https://mail.python.org/pipermail/python-dev/2017-June/148501.html) that he left out automatic synthesis of combined metaclasses on purpose, but given that this seems to be a major problem, I think it is about time to overthink this decision. So I propose to add such an automatic synthesis. My idea is that a metaclass author can define the __or__ and __ror__ methods for automatic metaclass synthesis. Then if a class C inherits from two classes A and B with metaclasses MetaA and MetaB, the metaclass would be MetaA | MetaB. Greetings Martin From guido at python.org Thu Oct 12 14:57:52 2017 From: guido at python.org (Guido van Rossum) Date: Thu, 12 Oct 2017 11:57:52 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <0da1ba37-ff98-5394-ba55-f59855048ff3@trueblade.com> Message-ID: You're right that if it were easier to combine metaclasses we would not shy away from them so easily. Perhaps you and others interested in this topic can try to prototype an implementation and see how it would work in practice (with some realistic existing metaclasses)? Then the next step would be to write a PEP. But in this case I really recommend trying to implement it first (in pure Python) to see if it can actually work. On Thu, Oct 12, 2017 at 11:21 AM, Martin Teichmann wrote: > Hi list, > > first, a big thanks to the authors of PEP 557! Great idea! > > For me, the dataclasses were a typical example for inheritance, to be > more precise, for metaclasses. I was astonished to see them > implemented using decorators, and I was not the only one, citing > Guido: > > > I think it would be useful to write 1-2 sentences about the problem with > > inheritance -- in that case you pretty much have to use a metaclass, and > the > > use of a metaclass makes life harder for people who want to use their own > > metaclass (since metaclasses don't combine without some manual > > intervention). > > Python is at a weird point here. At about every new release of Python, > a new idea shows up that could be easily solved using metaclasses, yet > every time we hesitate to use them, because of said necessary manual > intervention for metaclass combination. 
> > So I think we have two options now: We could deprecate metaclasses, > going down routes like PEP 487's __init_subclass__. Unfortunately, for > data classes __init_subclass__ it is too late in the class creation > process for it to influence the __slots__ mechanism. A > __new_subclass__, that acts earlier, could do the job, but to me that > simply sounds like reinventing the wheel of metaclasses. > > The other option would be to simply make metaclasses work properly. We > would just have to define a way to automatically combine metaclasses. > Guido once mention once (here: > https://mail.python.org/pipermail/python-dev/2017-June/148501.html) > that he left out automatic synthesis of combined metaclasses on > purpose, but given that this seems to be a major problem, I think it > is about time to overthink this decision. > > So I propose to add such an automatic synthesis. My idea is that a > metaclass author can define the __or__ and __ror__ methods for > automatic metaclass synthesis. Then if a class C inherits from two > classes A and B with metaclasses MetaA and MetaB, the metaclass would > be MetaA | MetaB. > > Greetings > > Martin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Oct 12 18:33:18 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 12 Oct 2017 15:33:18 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com> <61662340-c713-3ca2-db3c-50e0480f9eb6@mail.de> <9bc33149-3745-c1ce-26a4-4425b0b7a646@mgmiller.net> <2e0b6992-190d-56a8-4d03-6d7554215101@mgmiller.net> Message-ID: On Thu, Oct 12, 2017 at 3:20 AM, Steve Holden wrote: > The reason I liked "row" as a name is because it resembles "vector" and > hence is loosely assocaited with the concept of a tuple as well as being > familiar to database users. In fact the answer to a relational query was, I > believe, originally formally defined as a set of tuples. > > Is the intent that these things preserve order? In which case, row is OK (though I still don't see what's wrong with record). I still don't love it though -- it gives the expectation of a row in a data table (or csv file, or ...), which will be a common use case, but really, it doesn't conceptually have anything to do with tabular data. In fact, one might want to store a bunch of these in, say, a 2D (or 3D) array, and then row would be pretty weird.... I don't much like entity either -- it is either way too generic -- everything is an entity! even less specific than "object". Or too specific (and incorrect) in the lexicon of particular domains. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov
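(Returning briefly to the metaclass-combination subthread above: the pure-Python prototype Guido asks for could start from the classic "derive a metaclass from the metaclasses of all bases" recipe. The helper below is a rough sketch of that manual synthesis step, not Martin's proposed __or__/__ror__ protocol, and it assumes the metaclasses involved are layout-compatible with one another.)

def combined_metaclass(*bases):
    """Return a single metaclass deriving from the metaclasses of all bases."""
    metas = []
    for base in bases:
        meta = type(base)
        if meta is not type and meta not in metas:
            metas.append(meta)
    if not metas:
        return type
    if len(metas) == 1:
        return metas[0]
    # Synthesize a new metaclass that inherits from all of them.
    return type("_".join(m.__name__ for m in metas), tuple(metas), {})

# Usage sketch:
#     class C(A, B, metaclass=combined_metaclass(A, B)):
#         ...

An automatic __or__-style protocol would essentially let metaclass authors customize (or veto) this synthesis step instead of requiring the manual call.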
From chris.barker at noaa.gov Thu Oct 12 18:44:41 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 12 Oct 2017 15:44:41 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <0da1ba37-ff98-5394-ba55-f59855048ff3@trueblade.com> Message-ID: I think we've drifted into a new topic, but... I was astonished to see them > implemented using decorators, and I was not the only one. ... > Python is at a weird point here. At about every new release of Python, > a new idea shows up that could be easily solved using metaclasses, yet > every time we hesitate to use them, because of said necessary manual > intervention for metaclass combination. > > So I think we have two options now: We could deprecate metaclasses, > I was thinking about this last spring, when I tried to cram all sorts of python metaprogramming into one 3hr class... Trying to come up with an example for metaclasses, I couldn't come up with anything that couldn't be done more clearly (to me) with a class decorator. I also found some commentary on the web (sorry, no links :-( ) indicating that metaclasses were added before class decorators, and that they really don't have a compelling use case any more. Now it seems that not only do they not have a compelling use case; in some (many) instances there are compelling reasons to NOT use them, and rather use decorators. So why not deprecate them, or at least discourage their use? The other option would be to simply make metaclasses work properly. We > would just have to define a way to automatically combine metaclasses. > "just"? Anyway, let's say that is doable -- would you then be able to do something with metaclasses that you could not do with decorators? Or do it in a cleaner, easier to write or understand way? There-should-be-one--and-preferably-only-one--obvious-way-to-do-it-ly yours, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Thu Oct 12 19:44:19 2017 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 12 Oct 2017 19:44:19 -0400 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com> <61662340-c713-3ca2-db3c-50e0480f9eb6@mail.de> <9bc33149-3745-c1ce-26a4-4425b0b7a646@mgmiller.net> <2e0b6992-190d-56a8-4d03-6d7554215101@mgmiller.net> Message-ID: On 10/12/2017 6:33 PM, Chris Barker wrote: > On Thu, Oct 12, 2017 at 3:20 AM, Steve Holden > wrote: > > The reason I liked "row" as a name is because it resembles > "vector" and hence is loosely assocaited with the concept of a tuple > as well as being familiar to database users. In fact the answer to a > relational query was, I believe, originally formally defined as a > set of tuples. > > > Is the intent that these things preserve order? In the sense that the parameters to __init__(), the appearance in the repr, the order of the returned tuple in as_tuple(), and the order of comparisons will be the same as the order that the fields are defined, then yes. Eric.
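(A small illustration of that ordering guarantee, using the dataclasses module as it eventually shipped in Python 3.7: the PEP's as_tuple ended up spelled astuple, and order=True is what generates the comparison methods.)

from dataclasses import dataclass, astuple

@dataclass(order=True)
class Point:
    x: int
    y: int
    z: int

p = Point(1, 2, 3)                        # __init__ parameters follow field definition order
print(p)                                  # Point(x=1, y=2, z=3): repr uses the same order
print(astuple(p))                         # (1, 2, 3)
print(Point(1, 2, 3) < Point(1, 2, 4))    # True: comparisons are tuple-style, in field order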
From ethan at stoneleaf.us Thu Oct 12 20:56:41 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 12 Oct 2017 17:56:41 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <0da1ba37-ff98-5394-ba55-f59855048ff3@trueblade.com> Message-ID: <59E00F49.1050307@stoneleaf.us> On 10/12/2017 03:44 PM, Chris Barker wrote: > I think we've drifted into a new topic, but... > I was thinking about this last spring, when I tried to cram all sorts of python metaprogramming into one 3hr class... > > Trying to come up with a an exam[ple for metclasses, I couldn't come up with anything that couldn't be done more claerly > (to me) with a class decorator. The Enum data type requires metaclasses. Any time you want to modify the behavior of a class (not its instances, the class itself) you need a metaclass. Agreed that it's pretty rare, but we need them. -- ~Ethan~ From raymond.hettinger at gmail.com Thu Oct 12 23:09:12 2017 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Thu, 12 Oct 2017 20:09:12 -0700 Subject: [Python-Dev] PEP 557: Data Classes In-Reply-To: References: <88ef501a-90de-a63d-03d3-7a9f15124aa0@trueblade.com> <750de277-4975-48a9-7ef5-817b78bcbdde@trueblade.com> <61662340-c713-3ca2-db3c-50e0480f9eb6@mail.de> <9bc33149-3745-c1ce-26a4-4425b0b7a646@mgmiller.net> <2e0b6992-190d-56a8-4d03-6d7554215101@mgmiller.net> Message-ID: > On Oct 12, 2017, at 7:46 AM, Guido van Rossum wrote: > > I am still firmly convinced that @dataclass is the right name for the decorator (and `dataclasses` for the module). +1 from me. The singular/plural pair has the same nice feel as "from fractions import Fraction", "from itertools import product" and "from collections import namedtuple". Raymond From ncoghlan at gmail.com Fri Oct 13 02:30:50 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 13 Oct 2017 16:30:50 +1000 Subject: [Python-Dev] What is the design purpose of metaclasses vs code generating decorators? (was Re: PEP 557: Data Classes) Message-ID: On 13 October 2017 at 04:21, Martin Teichmann wrote: > For me, the dataclasses were a typical example for inheritance, to be > more precise, for metaclasses. I was astonished to see them > implemented using decorators, and I was not the only one, citing > Guido: > > > I think it would be useful to write 1-2 sentences about the problem with > > inheritance -- in that case you pretty much have to use a metaclass, and > the > > use of a metaclass makes life harder for people who want to use their own > > metaclass (since metaclasses don't combine without some manual > > intervention). > > Python is at a weird point here. At about every new release of Python, > a new idea shows up that could be easily solved using metaclasses, yet > every time we hesitate to use them, because of said necessary manual > intervention for metaclass combination. > Metaclasses currently tend to serve two distinct purposes: 1. Actually altering the runtime behaviour of a class and its children in non-standard ways (e.g. enums, ABCs, ORMs) 2. Boilerplate reduction in class definitions, reducing the amount of code you need to write as the author of that class Nobody has a problem with using metaclasses for the first purpose - that's what they're for. It's the second use case where they're problematic, as the fact that they're preserved on the class becomes a leaky implementation detail, and the lack of a JIT in CPython means they can also end up being expensive from a runtime performance perspective. 
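(A concrete stdlib example of that second, boilerplate-reduction category is functools.total_ordering: it fills in the missing comparison methods at class definition time and leaves a perfectly ordinary class behind, with no custom metaclass involved.)

from functools import total_ordering

@total_ordering
class Version:
    def __init__(self, major, minor):
        self.major, self.minor = major, minor

    def __eq__(self, other):
        return (self.major, self.minor) == (other.major, other.minor)

    def __lt__(self, other):
        return (self.major, self.minor) < (other.major, other.minor)

# __le__, __gt__ and __ge__ are generated at definition time;
# type(Version) is still plain type.
print(Version(3, 7) >= Version(3, 6))     # True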
Mixin classes have the same problem: something that the author may want to handle as an internal implementation detail leaks through to the runtime state of the class object. Code generating decorators like functools.total_ordering and dataclasses.dataclass (aka attr.s) instead aim at the boilerplate reduction problem directly: they let you declare in the class body the parts that you need to specify as the class designer, and then fill in at class definition time the parts that can be inferred from that base. If all you have access to is the runtime class, it behaves almost exactly as if you had written out all the autogenerated methods by hand (there may be subtle differences in the method metadata, such as the values of `__qualname__` and `__globals__`). Such decorators also do more work at class definition time in order to reduce the amount of runtime overhead introduced by reliance on chained method calls in a non-JITted Python runtime. As such, the code generating decorators have a clear domain of applicability: boilerplate reduction for class definitions without impacting the way instances behave (other than attribute and method injection), and without implicitly impacting subclass definitions (other than through regular inheritance behaviour). As far as the dataclass interaction with `__slots__` goes, that's a problem largely specific to slots (and `__metaclass__` before it), in that they're the only characteristics of a class definition that affect how CPython allocates memory for the class object itself (the descriptors for the slots are stored as a pointer array after the class struct, rather than only in the class dict). Given PEP 526 variable annotations, __slots__ could potentially benefit from a __metaclass__ style makeover, allowing an "infer_slots=True" keyword argument to type.__new__ to request that the list of slots be inferred from __annotations__ (Slot inference would conflict with setting class level default values, but that's a real conflict, as you'd be trying to use the same name on the class object for both the slot descriptor and the default value) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From lkb.teichmann at gmail.com Fri Oct 13 05:35:17 2017 From: lkb.teichmann at gmail.com (Martin Teichmann) Date: Fri, 13 Oct 2017 11:35:17 +0200 Subject: [Python-Dev] What is the design purpose of metaclasses vs code generating decorators? (was Re: PEP 557: Data Classes) In-Reply-To: References: Message-ID: > Metaclasses currently tend to serve two distinct purposes: > > 1. Actually altering the runtime behaviour of a class and its children in > non-standard ways (e.g. enums, ABCs, ORMs) > 2. Boilerplate reduction in class definitions, reducing the amount of code > you need to write as the author of that class > > Nobody has a problem with using metaclasses for the first purpose - that's > what they're for. I am that nobody. The examples you give would be much nicer solved with decorators. Especially for ABCs it would be much better, because the fact that a class is an ABC is explicitly not inherited - its entire reason of existence is to be inherited with all the abstractness going away. And yet, currently the concrete class will still inherit from ABC. The following way of writing ABCs looks much nicer to me: @abstractclass class Spam: @abstractmethod def ham(self): ... The same holds for enums. 
Inheriting from enums is possible, but weird, given that you cannot add new enums to it. So, especially when comparing to the dataclasses, the following looks appealing to me:

@enum
class Breakfast:
    spam = 0
    ham = 1

I'm not an expert on ORMs, but they seem to me a lot like data classes in the context of this discussion. I am aware that currently some things are easier done with metaclasses. But given that decorators can create an entirely new class if they like, they have all the power to do what they want, and even in a way much easier understood by people. As an example, here is the autoslot decorator:

def autoslot(cls):
    """turn all class variables into slots"""
    cls.__slots__ = list(cls.__dict__)
    return type(cls.__name__, cls.__bases__, dict(cls.__dict__))

So I am personally more and more leaning towards a metaclass-free future. Cheers Martin From k7hoven at gmail.com Fri Oct 13 07:06:14 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Fri, 13 Oct 2017 14:06:14 +0300 Subject: [Python-Dev] What is the design purpose of metaclasses vs code generating decorators? (was Re: PEP 557: Data Classes) In-Reply-To: References: Message-ID: While I really can't continue to be active in this discussion now [*], here are some thoughts based on observations I made: These three PEPs are all essentially solving an occurrence of the same problem: PEP 549 Instance descriptors PEP 562 Module __getattr__ PEP 560 Core support for generic types (the __class_getitem__ part) https://www.python.org/dev/peps/pep-0549/ https://www.python.org/dev/peps/pep-0562/ https://www.python.org/dev/peps/pep-0560/ PEPs 549 and 562 want an instance of ModuleType (i.e. modules) to define something on itself that looks like there was something defined on the class. For PEP 549 this is a property and for 562 it's a __getattr__ method. PEP 560 wants a __class_getitem__ method, defined on a class (instance of a metaclass), to look like there was a __getitem__ on the metaclass. PEP 560 is thus an attempt at a more fine-grained definition of a metaclass-like feature, where conflicts are less likely or can potentially be better dealt with. While PEPs 549 and 562 are doing a very similar thing to PEP 560 in theory, these use cases do not fall nicely into Nick's classification of uses for metaclasses in the email below. PEP 560 is trying to avoid a metaclass (a subclass of type) as an additional base class specifically for one class object. PEPs 549 and 562 are trying to avoid an additional class (a subclass of ModuleType) as an additional base class specifically for this one module. Whether or not fine-grainedness is the answer, it might make sense to list more different related use cases. Probably even the peps repo has more examples than the three I listed above. -- Koos [*] I'll try to be able to do what's needed for the PEP 555 discussion -- now still on python-ideas. On Fri, Oct 13, 2017 at 9:30 AM, Nick Coghlan wrote: > On 13 October 2017 at 04:21, Martin Teichmann > wrote: > >> For me, the dataclasses were a typical example for inheritance, to be >> more precise, for metaclasses. I was astonished to see them >> implemented using decorators, and I was not the only one, citing >> Guido: >> >> > I think it would be useful to write 1-2 sentences about the problem with >> > inheritance -- in that case you pretty much have to use a metaclass, >> and the >> > use of a metaclass makes life harder for people who want to use their >> own >> > metaclass (since metaclasses don't combine without some manual >> > intervention).
>> >> Python is at a weird point here. At about every new release of Python, >> a new idea shows up that could be easily solved using metaclasses, yet >> every time we hesitate to use them, because of said necessary manual >> intervention for metaclass combination. >> > > Metaclasses currently tend to serve two distinct purposes: > > 1. Actually altering the runtime behaviour of a class and its children in > non-standard ways (e.g. enums, ABCs, ORMs) > 2. Boilerplate reduction in class definitions, reducing the amount of code > you need to write as the author of that class > > Nobody has a problem with using metaclasses for the first purpose - that's > what they're for. > > It's the second use case where they're problematic, as the fact that > they're preserved on the class becomes a leaky implementation detail, and > the lack of a JIT in CPython means they can also end up being expensive > from a runtime performance perspective. > > Mixin classes have the same problem: something that the author may want to > handle as an internal implementation detail leaks through to the runtime > state of the class object. > > Code generating decorators like functools.total_ordering and > dataclasses.dataclass (aka attr.s) instead aim at the boilerplate reduction > problem directly: they let you declare in the class body the parts that you > need to specify as the class designer, and then fill in at class definition > time the parts that can be inferred from that base. > > If all you have access to is the runtime class, it behaves almost exactly > as if you had written out all the autogenerated methods by hand (there may > be subtle differences in the method metadata, such as the values of > `__qualname__` and `__globals__`). > > Such decorators also do more work at class definition time in order to > reduce the amount of runtime overhead introduced by reliance on chained > method calls in a non-JITted Python runtime. > > As such, the code generating decorators have a clear domain of > applicability: boilerplate reduction for class definitions without > impacting the way instances behave (other than attribute and method > injection), and without implicitly impacting subclass definitions (other > than through regular inheritance behaviour). > > As far as the dataclass interaction with `__slots__` goes, that's a > problem largely specific to slots (and `__metaclass__` before it), in that > they're the only characteristics of a class definition that affect how > CPython allocates memory for the class object itself (the descriptors for > the slots are stored as a pointer array after the class struct, rather than > only in the class dict). > > Given PEP 526 variable annotations, __slots__ could potentially benefit > from a __metaclass__ style makeover, allowing an "infer_slots=True" keyword > argument to type.__new__ to request that the list of slots be inferred from > __annotations__ (Slot inference would conflict with setting class level > default values, but that's a real conflict, as you'd be trying to use the > same name on the class object for both the slot descriptor and the default > value) > > Cheers, > Nick. 
> > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > k7hoven%40gmail.com > > -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Oct 13 08:53:23 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 13 Oct 2017 22:53:23 +1000 Subject: [Python-Dev] What is the design purpose of metaclasses vs code generating decorators? (was Re: PEP 557: Data Classes) In-Reply-To: References: Message-ID: On 13 October 2017 at 19:35, Martin Teichmann wrote: > > Metaclasses currently tend to serve two distinct purposes: > > > > 1. Actually altering the runtime behaviour of a class and its children in > > non-standard ways (e.g. enums, ABCs, ORMs) > > 2. Boilerplate reduction in class definitions, reducing the amount of > code > > you need to write as the author of that class > > > > Nobody has a problem with using metaclasses for the first purpose - > that's > > what they're for. > > I am that nobody. The examples you give would be much nicer solved > with decorators. Especially for ABCs it would be much better, because > the fact that a class is an ABC is explicitly not inherited - its > entire reason of existence is to be inherited with all the > abstractness going away. And yet, currently the concrete class will > still inherit from ABC. Aye, ABCs are definitely a case where I think it would be valuable to have a class decorator that: 1. Transplants any concrete method implementations from the ABC 2. Ensures that the class being defined actually implements all the required abstract methods The register method doesn't do either of those things, while inheritance has the unwanted side-effect of changing the metaclass of even concrete subclasses. As a handwavey concept, something like: @abc.implements(collections.Mapping) class MyMapping: ... # Just implement the abstract methods, get the rest injected So I am personally more and more leaning towards a metaclass-free future. > I agree with this view in the sense that I'd like the number of use cases that *require* a custom metaclass to get ever smaller (replacing them with regular inheritance + definition time method injection), but I also think they'll always have a place as a way for folks to explore the design space of what's possible given full control over the class definition process. That way, proposals like __init_subclass__ and __set_name__ can be based on pattern extraction from cases where people have decided the feature was valuable enough to be worth the hassle of maintaining a custom metaclass. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From akishorecert at gmail.com Fri Oct 13 10:58:29 2017 From: akishorecert at gmail.com (Kishore Kumar Alajangi) Date: Fri, 13 Oct 2017 20:28:29 +0530 Subject: [Python-Dev] Scrapy Question Message-ID: Hi Experts, Could someone guide me how to use the code in below question(Link). https://stackoverflow.com/questions/46711909/extract-urls-recursively-from-website-archives-in-scrapy Thanks, KK. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From phd at phdru.name Fri Oct 13 11:05:51 2017 From: phd at phdru.name (Oleg Broytman) Date: Fri, 13 Oct 2017 17:05:51 +0200 Subject: [Python-Dev] Scrapy Question In-Reply-To: References: Message-ID: <20171013150551.GA9739@phdru.name> Hello. This mailing list is to work on developing Python (adding new features to Python itself and fixing bugs); if you're having problems learning, understanding or using Python, please find another forum. Probably python-list/comp.lang.python mailing list/news group is the best place; there are Python developers who participate in it; you may get a faster, and probably more complete, answer there. See http://www.python.org/community/ for other lists/news groups/fora. Thank you for understanding. On Fri, Oct 13, 2017 at 08:28:29PM +0530, Kishore Kumar Alajangi wrote: > Hi Experts, > > Could someone guide me how to use the code in below question(Link). > https://stackoverflow.com/questions/46711909/extract-urls-recursively-from-website-archives-in-scrapy > > Thanks, > KK. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From guido at python.org Fri Oct 13 11:57:00 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 13 Oct 2017 08:57:00 -0700 Subject: [Python-Dev] What is the design purpose of metaclasses vs code generating decorators? (was Re: PEP 557: Data Classes) In-Reply-To: References: Message-ID: This is food for thought. I'll have to let it sink in a bit, but you may be on to something. Since the question was asked at some point, yes, metaclasses are much older than class decorators. At some point I found the book Putting Metaclasses to Work by Ira Forman and Scott Danforth ( https://www.amazon.com/Putting-Metaclasses-Work-Ira-Forman/dp/0201433052) and translated the book's ideas from C++ to Python, except for the automatic merging of multiple inherited metaclasses. But in many cases class decorators are more useful. I do worry that things like your autoslots decorator example might be problematic because they create a new class, throwing away a lot of work that was already done. But perhaps the right way to address this would be to move the decision about the instance layout to a later phase? (Not sure if that makes sense though.) --Guido On Fri, Oct 13, 2017 at 2:35 AM, Martin Teichmann wrote: > > Metaclasses currently tend to serve two distinct purposes: > > > > 1. Actually altering the runtime behaviour of a class and its children in > > non-standard ways (e.g. enums, ABCs, ORMs) > > 2. Boilerplate reduction in class definitions, reducing the amount of > code > > you need to write as the author of that class > > > > Nobody has a problem with using metaclasses for the first purpose - > that's > > what they're for. > > I am that nobody. The examples you give would be much nicer solved > with decorators. Especially for ABCs it would be much better, because > the fact that a class is an ABC is explicitly not inherited - its > entire reason of existence is to be inherited with all the > abstractness going away. And yet, currently the concrete class will > still inherit from ABC. The following way of writing ABCs looks much > nicer to me: > > @abstractclass > class Spam: > @abstractmethod > def ham(self): > ... > > The same holds for enums. Inheriting from enums is possible, but > weird, given that you cannot add new enums to it. 
So, especially when > comparing to the dataclasses, the following looks appealing to me: > > @enum > class Breakfast: > spam = 0 > ham = 1 > > I'm not an expert on ORMs, but they seem to me a lot like data classes > in the context of this discussion. > > I am aware that currently some things are easier done with > metaclasses. But given that decorators can create an entirely new > class if they like, they have all the power to do what they want, and > even in a way much easier understood by people. > > As an example, here the autoslot decorator: > > def autoslot(cls): > """turn all class variables into slots""" > cls.__slots__ = list(cls.__dict__) > return type(cls.__name__, cls.__bases__, class.__dict__) > > So I am personally more and more leaning towards a metaclass-free future. > > Cheers > > Martin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From status at bugs.python.org Fri Oct 13 12:09:42 2017 From: status at bugs.python.org (Python tracker) Date: Fri, 13 Oct 2017 18:09:42 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20171013160942.CA23656C48@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2017-10-06 - 2017-10-13) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 6251 (+26) closed 37280 (+37) total 43531 (+63) Open issues with patches: 2410 Issues opened (47) ================== #27172: Undeprecate inspect.getfullargspec() https://bugs.python.org/issue27172 reopened by larry #31718: some methods of uninitialized io.IncrementalNewlineDecoder obj https://bugs.python.org/issue31718 opened by Oren Milman #31721: assertion failure in FutureObj_finalize() after setting _log_t https://bugs.python.org/issue31721 opened by Oren Milman #31722: _io.IncrementalNewlineDecoder doesn't inherit codecs.Increment https://bugs.python.org/issue31722 opened by serhiy.storchaka #31724: test_xmlrpc_net should use something other than buildbot.pytho https://bugs.python.org/issue31724 opened by zach.ware #31725: Turtle/tkinter: NameError crashes ipython with "Tcl_AsyncDelet https://bugs.python.org/issue31725 opened by Rick J. Pelleg #31726: Missing token.COMMENT https://bugs.python.org/issue31726 opened by fpom #31727: FTP_TLS errors when https://bugs.python.org/issue31727 opened by jonathan-lp #31729: multiprocesssing.pool.AsyncResult undocumented field https://bugs.python.org/issue31729 opened by gene at nlc.co.nz #31731: [2.7] test_io hangs on x86 Gentoo Refleaks 2.7 https://bugs.python.org/issue31731 opened by haypo #31733: [2.7] Add PYTHONSHOWREFCOUNT environment variable to Python 2. 
https://bugs.python.org/issue31733 opened by haypo #31734: crash or SystemError in sqlite3.Cache in case it is uninitiali https://bugs.python.org/issue31734 opened by Oren Milman #31735: Documentation incorrectly states how descriptors are invoked https://bugs.python.org/issue31735 opened by Paul Pinterits #31737: Documentation renders incorrectly https://bugs.python.org/issue31737 opened by gibson042 #31738: Lib/site.py: method `abs_paths` is not documented https://bugs.python.org/issue31738 opened by lielfr #31739: socket.close recommended but not demonstrated in same-page exa https://bugs.python.org/issue31739 opened by Nathaniel Manista #31742: Default to emitting FutureWarning for provisional APIs https://bugs.python.org/issue31742 opened by ncoghlan #31743: Proportional Width Font on Generated Python Docs PDFs https://bugs.python.org/issue31743 opened by synthmeat #31745: Overloading "Py_GetPath" does not work https://bugs.python.org/issue31745 opened by kayhayen #31746: crashes in sqlite3.Connection in case it is uninitialized or p https://bugs.python.org/issue31746 opened by Oren Milman #31748: configure fails to detect fchdir() using CFLAGS="-Werror -Wall https://bugs.python.org/issue31748 opened by dilyan.palauzov #31749: Request: Human readable byte amounts in the standard library https://bugs.python.org/issue31749 opened by miserlou2 #31750: expose PyCell_Type in types module https://bugs.python.org/issue31750 opened by akvadrako #31751: Support for C++ 11 and/or C++ 14 in python.org installer https://bugs.python.org/issue31751 opened by hardkrash #31752: Assertion failure in timedelta() in case of bad __divmod__ https://bugs.python.org/issue31752 opened by serhiy.storchaka #31753: Unnecessary closure in ast.literal_eval https://bugs.python.org/issue31753 opened by Aaron Hall #31754: Documented type of parameter 'itemsize' to PyBuffer_FillContig https://bugs.python.org/issue31754 opened by snoeberger #31756: subprocess.run should alias universal_newlines to text https://bugs.python.org/issue31756 opened by andrewclegg #31757: Tutorial: Fibonacci numbers start with 1, 1 https://bugs.python.org/issue31757 opened by skyhein #31758: various refleaks in _elementtree https://bugs.python.org/issue31758 opened by Oren Milman #31760: Re-definition of _POSIX_C_SOURCE with Fedora 26. 
https://bugs.python.org/issue31760 opened by matrixise #31763: Add NOTICE level to the logging module https://bugs.python.org/issue31763 opened by mp5023 #31764: sqlite3.Cursor.close() crashes in case the Cursor object is un https://bugs.python.org/issue31764 opened by Oren Milman #31765: BUG: System deadlocks performing big loop operations in python https://bugs.python.org/issue31765 opened by Nik101 #31767: Windows Installer fails with error 0x80091007 when trying to i https://bugs.python.org/issue31767 opened by Igor.Skochinsky #31768: argparse drops '|'s when the arguments are too long https://bugs.python.org/issue31768 opened by caveman #31769: configure includes user CFLAGS when detecting pthreads support https://bugs.python.org/issue31769 opened by floppymaster #31770: crash and refleaks when calling sqlite3.Cursor.__init__() more https://bugs.python.org/issue31770 opened by Oren Milman #31771: tkinter geometry string +- when window ovelaps left or top of https://bugs.python.org/issue31771 opened by 8baWnoVi #31773: Rewrite _PyTime_GetWinPerfCounter() for _PyTime_t https://bugs.python.org/issue31773 opened by haypo #31774: tarfile.open ignores custom bufsize value when creating a new https://bugs.python.org/issue31774 opened by cstratak #31775: Support unbuffered TextIOWrapper https://bugs.python.org/issue31775 opened by haypo #31776: Missing "raise from None" in /Lib/xml/etree/ElementPath.py https://bugs.python.org/issue31776 opened by pablogsal #31777: IDLE: Let users add to font selection https://bugs.python.org/issue31777 opened by terry.reedy #31778: ast.literal_eval supports non-literals in Python 3 https://bugs.python.org/issue31778 opened by David Bieber #31779: assertion failures and a crash when using an uninitialized str https://bugs.python.org/issue31779 opened by Oren Milman #31780: Using format spec ',x' displays incorrect error message https://bugs.python.org/issue31780 opened by FHTMitchell Most recent 15 issues with no replies (15) ========================================== #31780: Using format spec ',x' displays incorrect error message https://bugs.python.org/issue31780 #31779: assertion failures and a crash when using an uninitialized str https://bugs.python.org/issue31779 #31778: ast.literal_eval supports non-literals in Python 3 https://bugs.python.org/issue31778 #31777: IDLE: Let users add to font selection https://bugs.python.org/issue31777 #31776: Missing "raise from None" in /Lib/xml/etree/ElementPath.py https://bugs.python.org/issue31776 #31774: tarfile.open ignores custom bufsize value when creating a new https://bugs.python.org/issue31774 #31770: crash and refleaks when calling sqlite3.Cursor.__init__() more https://bugs.python.org/issue31770 #31764: sqlite3.Cursor.close() crashes in case the Cursor object is un https://bugs.python.org/issue31764 #31760: Re-definition of _POSIX_C_SOURCE with Fedora 26. 
https://bugs.python.org/issue31760 #31754: Documented type of parameter 'itemsize' to PyBuffer_FillContig https://bugs.python.org/issue31754 #31746: crashes in sqlite3.Connection in case it is uninitialized or p https://bugs.python.org/issue31746 #31745: Overloading "Py_GetPath" does not work https://bugs.python.org/issue31745 #31743: Proportional Width Font on Generated Python Docs PDFs https://bugs.python.org/issue31743 #31729: multiprocesssing.pool.AsyncResult undocumented field https://bugs.python.org/issue31729 #31721: assertion failure in FutureObj_finalize() after setting _log_t https://bugs.python.org/issue31721 Most recent 15 issues waiting for review (15) ============================================= #31779: assertion failures and a crash when using an uninitialized str https://bugs.python.org/issue31779 #31776: Missing "raise from None" in /Lib/xml/etree/ElementPath.py https://bugs.python.org/issue31776 #31773: Rewrite _PyTime_GetWinPerfCounter() for _PyTime_t https://bugs.python.org/issue31773 #31770: crash and refleaks when calling sqlite3.Cursor.__init__() more https://bugs.python.org/issue31770 #31764: sqlite3.Cursor.close() crashes in case the Cursor object is un https://bugs.python.org/issue31764 #31763: Add NOTICE level to the logging module https://bugs.python.org/issue31763 #31758: various refleaks in _elementtree https://bugs.python.org/issue31758 #31752: Assertion failure in timedelta() in case of bad __divmod__ https://bugs.python.org/issue31752 #31748: configure fails to detect fchdir() using CFLAGS="-Werror -Wall https://bugs.python.org/issue31748 #31746: crashes in sqlite3.Connection in case it is uninitialized or p https://bugs.python.org/issue31746 #31734: crash or SystemError in sqlite3.Cache in case it is uninitiali https://bugs.python.org/issue31734 #31733: [2.7] Add PYTHONSHOWREFCOUNT environment variable to Python 2. 
https://bugs.python.org/issue31733 #31724: test_xmlrpc_net should use something other than buildbot.pytho https://bugs.python.org/issue31724 #31722: _io.IncrementalNewlineDecoder doesn't inherit codecs.Increment https://bugs.python.org/issue31722 #31718: some methods of uninitialized io.IncrementalNewlineDecoder obj https://bugs.python.org/issue31718 Top 10 most discussed issues (10) ================================= #31742: Default to emitting FutureWarning for provisional APIs https://bugs.python.org/issue31742 27 msgs #31692: [2.7] Test `test_huntrleaks()` of test_regrtest fails in debug https://bugs.python.org/issue31692 14 msgs #31701: faulthandler dumps 'Windows fatal exception: code 0xe06d7363' https://bugs.python.org/issue31701 10 msgs #30744: Local variable assignment is broken when combined with threads https://bugs.python.org/issue30744 9 msgs #31165: list_slice() does crash if the list is mutated indirectly by P https://bugs.python.org/issue31165 8 msgs #31558: gc.freeze() - an API to mark objects as uncollectable https://bugs.python.org/issue31558 8 msgs #31748: configure fails to detect fchdir() using CFLAGS="-Werror -Wall https://bugs.python.org/issue31748 8 msgs #31749: Request: Human readable byte amounts in the standard library https://bugs.python.org/issue31749 8 msgs #13802: IDLE font settings: use multiple character sets in examples https://bugs.python.org/issue13802 7 msgs #31327: bug in dateutil\tz\tz.py https://bugs.python.org/issue31327 7 msgs Issues closed (34) ================== #26546: Provide translated french translation on docs.python.org https://bugs.python.org/issue26546 closed by mdk #27867: various issues due to misuse of PySlice_GetIndicesEx https://bugs.python.org/issue27867 closed by serhiy.storchaka #28157: Document time module constants (timezone, tzname, etc.) 
as dep https://bugs.python.org/issue28157 closed by berker.peksag #28647: python --help: -u is misdocumented as binary mode https://bugs.python.org/issue28647 closed by berker.peksag #30058: Buffer overflow in kqueue.control() https://bugs.python.org/issue30058 closed by serhiy.storchaka #30767: logging must check exc_info correctly https://bugs.python.org/issue30767 closed by vinay.sajip #31507: email.utils.parseaddr has no docstring https://bugs.python.org/issue31507 closed by Mariatta #31509: test_subprocess hangs randomly on AMD64 Windows10 3.x https://bugs.python.org/issue31509 closed by haypo #31537: Bug in readline module documentation example https://bugs.python.org/issue31537 closed by Mariatta #31567: Inconsistent documentation around decorators https://bugs.python.org/issue31567 closed by merwok #31591: Closing socket raises AttributeError: 'collections.deque' obje https://bugs.python.org/issue31591 closed by reidfaiv #31642: None value in sys.modules no longer blocks import https://bugs.python.org/issue31642 closed by serhiy.storchaka #31655: SimpleNamespace accepts non-string keyword names https://bugs.python.org/issue31655 closed by serhiy.storchaka #31666: Pandas_datareader Error Message - ModuleNotFoundError: No modu https://bugs.python.org/issue31666 closed by matrixise #31681: pkgutil.get_data() leaks open files in Python 2.7 https://bugs.python.org/issue31681 closed by merwok #31683: a stack overflow on windows in faulthandler._fatal_error() https://bugs.python.org/issue31683 closed by haypo #31712: subprocess with stderr=subprocess.STDOUT hang https://bugs.python.org/issue31712 closed by martin.panter #31719: [2.7] test_regrtest.test_crashed() fails on s390x https://bugs.python.org/issue31719 closed by haypo #31720: msilib.MSIError is spelled MsiError in documentation https://bugs.python.org/issue31720 closed by Mariatta #31723: refleaks in zipimport when calling zipimporter.__init__() more https://bugs.python.org/issue31723 closed by haypo #31728: crashes in _elementtree due to unsafe decrefs of Element.text https://bugs.python.org/issue31728 closed by serhiy.storchaka #31730: list unhashable, can not be use as key to dict https://bugs.python.org/issue31730 closed by ned.deily #31732: Add TRACE level to the logging module https://bugs.python.org/issue31732 closed by vinay.sajip #31736: PEP 485 tiny typo https://bugs.python.org/issue31736 closed by berker.peksag #31740: refleaks when calling sqlite3.Connection.__init__() more than https://bugs.python.org/issue31740 closed by haypo #31741: backports import path can not be overridden in Windows (Linux https://bugs.python.org/issue31741 closed by zach.ware #31744: Python 2.7.14 Fails to compile on CentOS/RHEL7 https://bugs.python.org/issue31744 closed by ned.deily #31747: fatal error LNK1120 in PCbuild\python3dll.vcxproj https://bugs.python.org/issue31747 closed by denis-osipov #31755: SetType is missing in the 'types' module https://bugs.python.org/issue31755 closed by christian.heimes #31759: re wont recover nor fail on runaway regular expression https://bugs.python.org/issue31759 closed by tim.peters #31761: regrtest: faulthandler.enable() fails with io.UnsupportedOpera https://bugs.python.org/issue31761 closed by Mariatta #31762: Issue in login https://bugs.python.org/issue31762 closed by r.david.murray #31766: Python 3.5 missing from documentation https://bugs.python.org/issue31766 closed by ned.deily #31772: SourceLoader uses stale bytecode in case of equal mtime second https://bugs.python.org/issue31772 closed by 
brett.cannon From ethan at stoneleaf.us Fri Oct 13 15:10:50 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 13 Oct 2017 12:10:50 -0700 Subject: [Python-Dev] What is the design purpose of metaclasses vs code generating decorators? (was Re: PEP 557: Data Classes) In-Reply-To: References: Message-ID: <59E10FBA.3040207@stoneleaf.us> On 10/13/2017 02:35 AM, Martin Teichmann wrote: >> Metaclasses currently tend to serve two distinct purposes: >> >> 1. Actually altering the runtime behaviour of a class and its children in >> non-standard ways (e.g. enums, ABCs, ORMs) >> 2. Boilerplate reduction in class definitions, reducing the amount of code >> you need to write as the author of that class >> >> Nobody has a problem with using metaclasses for the first purpose - that's >> what they're for. > > I am that nobody. The examples you give would be much nicer solved > with decorators. > The same holds for enums. Inheriting from enums is possible, but > weird, given that you cannot add new enums to it. So, especially when > comparing to the dataclasses, the following looks appealing to me: > > @enum > class Breakfast: > spam = 0 > ham = 1 Things that will not work if Enum does not have a metaclass: list(EnumClass) -> list of enum members dir(EnumClass) -> custom list of "interesting" items len(EnumClass) -> number of members member in EnumClass -> True or False - protection from adding, deleting, and changing members - guards against reusing the same name twice - possible to have properties and members with the same name (i.e. "value" and "name") -- ~Ethan~ From tinchester at gmail.com Fri Oct 13 15:18:29 2017 From: tinchester at gmail.com (=?UTF-8?Q?Tin_Tvrtkovi=C4=87?=) Date: Fri, 13 Oct 2017 19:18:29 +0000 Subject: [Python-Dev] What is the design purpose of metaclasses vs code generating decorators? (was Re: PEP 557: Data Classes) In-Reply-To: References: Message-ID: > > Date: Fri, 13 Oct 2017 08:57:00 -0700 > From: Guido van Rossum > To: Martin Teichmann > Cc: Python-Dev > Subject: Re: [Python-Dev] What is the design purpose of metaclasses vs > code generating decorators? (was Re: PEP 557: Data Classes) > Message-ID: > < > CAP7+vJKBVuDqf09zTWDAuvQ-cCNM+cF82c22s2NJOj+A9k7_kA at mail.gmail.com> > Content-Type: text/plain; charset="utf-8" > > This is food for thought. I'll have to let it sink in a bit, but you may be > on to something. > > Since the question was asked at some point, yes, metaclasses are much older > than class decorators. At some point I found the book Putting Metaclasses > to Work by Ira Forman and Scott Danforth ( > https://www.amazon.com/Putting-Metaclasses-Work-Ira-Forman/dp/0201433052) > and translated the book's ideas from C++ to Python, except for the > automatic merging of multiple inherited metaclasses. > > But in many cases class decorators are more useful. > > I do worry that things like your autoslots decorator example might be > problematic because they create a new class, throwing away a lot of work > that was already done. But perhaps the right way to address this would be > to move the decision about the instance layout to a later phase? (Not sure > if that makes sense though.) > > --Guido > Just FYI, recreating the class with slots runs into problems with regards to PEP 3135 (New Super). In attrs we resort to black magic to update the __class__ cell in existing methods. Being able to add slotness later would be good, but not that useful for us since we have to support down to 2.7. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From random832 at fastmail.com Fri Oct 13 17:02:13 2017 From: random832 at fastmail.com (Random832) Date: Fri, 13 Oct 2017 17:02:13 -0400 Subject: [Python-Dev] What is the design purpose of metaclasses vs code generating decorators? (was Re: PEP 557: Data Classes) In-Reply-To: References: Message-ID: <1507928533.1276577.1138175840.6BEEDE5B@webmail.messagingengine.com> On Fri, Oct 13, 2017, at 02:30, Nick Coghlan wrote: > Metaclasses currently tend to serve two distinct purposes: > > 1. Actually altering the runtime behaviour of a class and its children > in non-standard ways (e.g. enums, ABCs, ORMs) > 2. Boilerplate reduction in class definitions, reducing the amount of > code you need to write as the author of that class > > Nobody has a problem with using metaclasses for the first purpose - > that's what they're for. > > It's the second use case where they're problematic, as the fact that > they're preserved on the class becomes a leaky implementation detail, > and the lack of a JIT in CPython means they can also end up being > expensive from a runtime performance perspective. What about a metaclass that isn't a metaclass? A metaclass can be any callable and can return any object, e.g. a normal type. def AutoSlotMeta(name, bases, dct, real_metaclass=type): """turn all class variables into slots""" dct['__slots__'] = list(dct) return real_metaclass(name, bases, dct) From lkb.teichmann at gmail.com Sat Oct 14 10:37:13 2017 From: lkb.teichmann at gmail.com (Martin Teichmann) Date: Sat, 14 Oct 2017 16:37:13 +0200 Subject: [Python-Dev] What is the design purpose of metaclasses vs code generating decorators? (was Re: PEP 557: Data Classes) In-Reply-To: <59E10FBA.3040207@stoneleaf.us> References: <59E10FBA.3040207@stoneleaf.us> Message-ID: > Things that will not work if Enum does not have a metaclass: > > list(EnumClass) -> list of enum members > dir(EnumClass) -> custom list of "interesting" items > len(EnumClass) -> number of members > member in EnumClass -> True or False > > - protection from adding, deleting, and changing members > - guards against reusing the same name twice > - possible to have properties and members with the same name (i.e. "value" > and "name") In current Python this is true. But if we would go down the route of PEP 560 (which I just found, I wasn't involved in its discussion), then we could just add all the needed functionality to classes. I would do it slightly different than proposed in PEP 560: classmethods are very similar to methods on a metaclass. They are just not called by the special method machinery. I propose that the following is possible: >>> class Spam: ... @classmethod ... def __getitem__(self, item): ... return "Ham" >>> Spam[3] Ham this should solve most of your usecases. When thinking about how an automatic metaclass combiner would look like, I realized that it should ideally just reproduce the class mro, just with metaclasses. So if a class has an mro of [A, B, C, object], its metaclass should have an mro of unique_everseen([type(A), type(B), type(C), type]). But in this case, why add this layer at all? Just give the class the ability to do everything a metaclass could do, using mechanisms like @classmethod, and we're done. Greetings Martin From lkb.teichmann at gmail.com Sat Oct 14 11:00:44 2017 From: lkb.teichmann at gmail.com (Martin Teichmann) Date: Sat, 14 Oct 2017 17:00:44 +0200 Subject: [Python-Dev] What is the design purpose of metaclasses vs code generating decorators? 
(was Re: PEP 557: Data Classes) In-Reply-To: References: Message-ID: >> I do worry that things like your autoslots decorator example might be >> problematic because they create a new class, throwing away a lot of work >> that was already done. But perhaps the right way to address this would be >> to move the decision about the instance layout to a later phase? (Not sure >> if that makes sense though.) >> >> --Guido > > > Just FYI, recreating the class with slots runs into problems with regards to > PEP 3135 (New Super). In attrs we resort to black magic to update the > __class__ cell in existing methods. Being able to add slotness later would > be good, but not that useful for us since we have to support down to 2.7. You're both bringing up an important point here: while in function decorators it is normal to return a completely new function (albeit one that wraps the original), this is close to impossible for classes. You cannot just wrap a class in another one. You may inherit from it, but that's already often not what one wants. While I am not worried about the poor computers having to do a lot of work creating a throwaway class, I do see the problem with the new super. What I would like to see is something like the @wraps decorator for classes, such that you could write something like: def class_decorator(cls): @wraps_class class MyNewCoolClass: """my cool functionality here""" return MyNewCoolClass wraps_class would then copy over everything such that the new class gets it. Unfortunately, this won't work, because of the new super. The new super is about the only thing that cannot be dynamically changed in Python. While it is no problem to make a function a method of a random class (just use __get__), it is not possible to move a function from one class to another, because you cannot change its binding to __class__, which is used by super(). And even if we could, the method whose __class__ we want to change might hide in a wrapper. The current behavior of __class__ is weird, it is set to the class that type.__new__ creates. So even if another metaclasses __new__ or a decorator returns another class, the method's __class__ would still point to the original class, which might even not exist anymore. One might argue that this is due to the fact that it is not well-defined what __class__ should be set to. But this is not true, it is crystal clear: __class__ should be set to the class from whose __dict__ the method was drawn. I thought about a new three-parameter __get__(self, instance, owner, supplier), which would then set __class__ to the supplier. This is a beautiful concept, that is unfortunately not so simple when it comes to methods hidden in wrappers. Greetings Martin From ncoghlan at gmail.com Sat Oct 14 11:38:15 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 15 Oct 2017 01:38:15 +1000 Subject: [Python-Dev] What is the design purpose of metaclasses vs code generating decorators? (was Re: PEP 557: Data Classes) In-Reply-To: References: Message-ID: On 15 October 2017 at 01:00, Martin Teichmann wrote: > While I am not worried about the poor computers having to do a lot of > work creating a throwaway class, I do see the problem with the new > super. 
What I would like to see is something like the @wraps decorator > for classes, such that you could write something like: > > def class_decorator(cls): > @wraps_class > class MyNewCoolClass: > """my cool functionality here""" > return MyNewCoolClass > > wraps_class would then copy over everything such that the new class gets > it. > > Unfortunately, this won't work, because of the new super. The new > super is about the only thing that cannot be dynamically changed in > Python. While it is no problem to make a function a method of a random > class (just use __get__), it is not possible to move a function from > one class to another, because you cannot change its binding to > __class__, which is used by super(). And even if we could, the method > whose __class__ we want to change might hide in a wrapper. > Up until 3.6, we made it so that the class cell for zero-arg super was an almost entirely hidden implementation detail at class creation time. To allow zero-arg super() class methods to work from __init_subclass__, we changed that to include __classcell__ in the execution namespace passed to the metaclass. So it seems to me that to enable the class *replacement* use case, we'd only need one new thing: access to that cell as an attribute on the class itself. That way, when creating the new class you could set "ns['__classcell__'] = old_cls.__classcell__" before calling the new metaclass, which would allow the new class to take over zero-arg super() resolution for any methods defined lexically inside the original. For the class *duplication* use case (where you want to leave the original class intact), the main thing this would let you do is to reliably detect that there is at least one method in the class body using the zero-arg super() form (when cls.__classcell__ is non-None), and either issue a warning or fail outright (since methods that rely on cooperative multiple inheritance need a specific defining class or else the runtime parent resolution gets inconsistent). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Sat Oct 14 11:49:30 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 14 Oct 2017 08:49:30 -0700 Subject: [Python-Dev] What is the design purpose of metaclasses vs code generating decorators? (was Re: PEP 557: Data Classes) In-Reply-To: References: <59E10FBA.3040207@stoneleaf.us> Message-ID: <59E2320A.3010209@stoneleaf.us> On 10/14/2017 07:37 AM, Martin Teichmann wrote: >> Things that will not work if Enum does not have a metaclass: >> >> list(EnumClass) -> list of enum members >> dir(EnumClass) -> custom list of "interesting" items >> len(EnumClass) -> number of members >> member in EnumClass -> True or False >> >> - protection from adding, deleting, and changing members >> - guards against reusing the same name twice >> - possible to have properties and members with the same name (i.e. "value" >> and "name") > > In current Python this is true. But if we would go down the route of > PEP 560 (which I just found, I wasn't involved in its discussion), > then we could just add all the needed functionality to classes. > > I would do it slightly different than proposed in PEP 560: > classmethods are very similar to methods on a metaclass. They are just > not called by the special method machinery. I propose that the > following is possible: > > >>> class Spam: > ... @classmethod > ... def __getitem__(self, item): > ... 
return "Ham" > > >>> Spam[3] > Ham > > this should solve most of your usecases. The problem with your solution is you couldn't then have a __getitem__ for the instances -- it's an either/or situation. The problem with PEP 560 is that it doesn't allow the class definition protections that a metaclass does. -- ~Ethan~ From levkivskyi at gmail.com Sat Oct 14 11:57:59 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Sat, 14 Oct 2017 17:57:59 +0200 Subject: [Python-Dev] What is the design purpose of metaclasses vs code generating decorators? (was Re: PEP 557: Data Classes) In-Reply-To: <59E2320A.3010209@stoneleaf.us> References: <59E10FBA.3040207@stoneleaf.us> <59E2320A.3010209@stoneleaf.us> Message-ID: On 14 October 2017 at 17:49, Ethan Furman wrote: > The problem with PEP 560 is that it doesn't allow the class definition > protections that a metaclass does. > Since the discussion turned to PEP 560, I can say that I don't want this to be a general mechanism, the PEP rationale section gives several specific examples why we don't want metaclasses to implement generic class machinery/internals. Could you please elaborate more what is wrong with PEP 560 and what do you mean by "class definition protections" -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Sat Oct 14 12:14:34 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 14 Oct 2017 09:14:34 -0700 Subject: [Python-Dev] PEP 560 vs metaclass' class definition protections [was Re: What is the design purpose of metaclasses ...] In-Reply-To: References: <59E10FBA.3040207@stoneleaf.us> <59E2320A.3010209@stoneleaf.us> Message-ID: <59E237EA.4060708@stoneleaf.us> On 10/14/2017 08:57 AM, Ivan Levkivskyi wrote: > On 14 October 2017 at 17:49, Ethan Furman wrote: >> The problem with PEP 560 is that it doesn't allow the class definition >> protections that a metaclass does. > > Since the discussion turned to PEP 560, I can say that I don't want this > to be a general mechanism, the PEP rationale section gives several specific > examples why we don't want metaclasses to implement generic class > machinery/internals. > > Could you please elaborate more what is wrong with PEP 560 and what do you > mean by "class definition protections" Nothing is wrong with PEP 560. What I am referring to is: class MyEnum(Enum): red = 0 red = 1 The Enum metaclass machinery will raise an error at the "red = 1" line because it detects the redefinition of "red". This check can only happen during class definition, so only the metaclass can do it. -- ~Ethan~ From ncoghlan at gmail.com Sat Oct 14 12:58:06 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 15 Oct 2017 02:58:06 +1000 Subject: [Python-Dev] PEP 560 vs metaclass' class definition protections [was Re: What is the design purpose of metaclasses ...] In-Reply-To: <59E237EA.4060708@stoneleaf.us> References: <59E10FBA.3040207@stoneleaf.us> <59E2320A.3010209@stoneleaf.us> <59E237EA.4060708@stoneleaf.us> Message-ID: On 15 October 2017 at 02:14, Ethan Furman wrote: > On 10/14/2017 08:57 AM, Ivan Levkivskyi wrote: > >> On 14 October 2017 at 17:49, Ethan Furman wrote: >> > > The problem with PEP 560 is that it doesn't allow the class definition >>> >> >> protections that a metaclass does. 
> >> >> Since the discussion turned to PEP 560, I can say that I don't want this >> > > to be a general mechanism, the PEP rationale section gives several > specific > > examples why we don't want metaclasses to implement generic class > > machinery/internals. > >> >> Could you please elaborate more what is wrong with PEP 560 and what do you >> > > mean by "class definition protections" > > Nothing is wrong with PEP 560. What I am referring to is: > > class MyEnum(Enum): > red = 0 > red = 1 > > The Enum metaclass machinery will raise an error at the "red = 1" line > because it detects the redefinition of "red". This check can only happen > during class definition, so only the metaclass can do it. > That's not necessarily an inherent restriction though - if we did decide to go even further in the direction of "How do we let base classes override semantics that currently require a custom metaclass?", then there's a fairly clear parallel between "mcl.__init__/bases.__init_subclass__" and "mcl.__prepare__/bases.__prepare_subclass__". OTOH, if you have multiple bases with competing __prepare__ methods you really *should* get a metaclass conflict, since the class body can only be executed in one namespace. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From levkivskyi at gmail.com Sat Oct 14 14:30:33 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Sat, 14 Oct 2017 20:30:33 +0200 Subject: [Python-Dev] PEP 560 vs metaclass' class definition protections [was Re: What is the design purpose of metaclasses ...] In-Reply-To: <59E237EA.4060708@stoneleaf.us> References: <59E10FBA.3040207@stoneleaf.us> <59E2320A.3010209@stoneleaf.us> <59E237EA.4060708@stoneleaf.us> Message-ID: On 14 October 2017 at 18:14, Ethan Furman wrote: > On 10/14/2017 08:57 AM, Ivan Levkivskyi wrote: > >> >> Could you please elaborate more what is wrong with PEP 560 and what do >> you mean by "class definition protections" >> > > Nothing is wrong with PEP 560. What I am referring to is: > [snip] > > OK thanks, then let us keep PEP 560 to its original scope. Its design is specific to generic classes, so it will probably not help with "wider" metaclass problems. As a side note, I don't think elimination of metaclasses should be a "goal by itself". This is a powerful and flexible mechanism, but there are specific situations where metaclasses don't work well because of e.g. frequent conflicts or performance penalties. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Sat Oct 14 15:25:21 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 14 Oct 2017 12:25:21 -0700 Subject: [Python-Dev] PEP 560 vs metaclass' class definition protections [was Re: What is the design purpose of metaclasses ...] In-Reply-To: References: <59E10FBA.3040207@stoneleaf.us> <59E2320A.3010209@stoneleaf.us> <59E237EA.4060708@stoneleaf.us> Message-ID: <59E264A1.60105@stoneleaf.us> On 10/14/2017 11:30 AM, Ivan Levkivskyi wrote: > As a side note, I don't think elimination of metaclasses should be a "goal by itself". > This is a powerful and flexible mechanism, but there are specific situations where > metaclasses don't work well because of e.g. frequent conflicts or performance penalties. 
+1 -- ~Ethan~ From yselivanov.ml at gmail.com Sun Oct 15 21:33:16 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Sun, 15 Oct 2017 21:33:16 -0400 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion Message-ID: Hi, It looks like the discussion about the execution context became extremely hard to follow. There are many opinions on how the spec for generators should look like. What seems to be "natural" behaviour/example to one, seems to be completely unreasonable to other people. Recent emails from Guido indicate that he doesn't want to implement execution contexts for generators (at least in 3.7). In another thread Guido said this: "... Because coroutines and generators are similar under the covers, Yury demonstrated the issue with generators instead of coroutines (which are unfamiliar to many people). And then somehow we got hung up about fixing the problem in the example." And Guido is right. My initial motivation to write PEP 550 was to solve my own pain point, have a solution for async code. 'threading.local' is completely unusable there, but complex code bases demand a working solution. I thought that because coroutines and generators are so similar under the hood, I can design a simple solution that will cover all edge cases. Turns out it is not possible to do it in one pass. Therefore, in order to make some progress, I propose to split the problem in half: Stage 1. A new execution context PEP to solve the problem *just for async code*. The PEP will target Python 3.7 and completely ignore synchronous generators and asynchronous generators. It will be based on PEP 550 v1 (no chained lookups, immutable mapping or CoW as an optimization) and borrow some good API decisions from PEP 550 v3+ (contextvars module, ContextVar class). The API (and C-API) will be designed to be future proof and ultimately allow transition to the stage 2. Stage 2. When Python 3.7 is out, we'll see how people use execution contexts for async code and collect feedback. If we recognize that Python users want execution contexts for generators/asynchronous generators, we'll make a new PEP to add support for them in Python 3.8. That future discussion will be focused on generators specifically, and therefore I expect it to be somewhat more focused. I will start working on the new PEP for stage 1 tomorrow. I expect to have a first version by the end of the week. I will also publish PEP 550 v1 as a separate PEP (as v1 is a totally different PEP anyways). Thanks, Yury From ncoghlan at gmail.com Sun Oct 15 23:12:38 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 16 Oct 2017 13:12:38 +1000 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On 16 October 2017 at 11:33, Yury Selivanov wrote: > Stage 2. When Python 3.7 is out, we'll see how people use execution > contexts for async code and collect feedback. If we recognize that > Python users want execution contexts for generators/asynchronous > generators, we'll make a new PEP to add support for them in Python > 3.8. That future discussion will be focused on generators > specifically, and therefore I expect it to be somewhat more focused. > As long as it's made clear that the interaction between context variables and generators is formally undefined in 3.7, I think that's reasonable - folks that want to ensure the current behaviour indefinitely should keep using thread locals rather than switching over to context variables. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Oct 15 23:17:46 2017 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 15 Oct 2017 20:17:46 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On Sun, Oct 15, 2017 at 6:33 PM, Yury Selivanov wrote: > Hi, > > It looks like the discussion about the execution context became > extremely hard to follow. There are many opinions on how the spec for > generators should look like. What seems to be "natural" > behaviour/example to one, seems to be completely unreasonable to other > people. Recent emails from Guido indicate that he doesn't want to > implement execution contexts for generators (at least in 3.7). > > In another thread Guido said this: "... Because coroutines and > generators are similar under the covers, Yury demonstrated the issue > with generators instead of coroutines (which are unfamiliar to many > people). And then somehow we got hung up about fixing the problem in > the example." > > And Guido is right. My initial motivation to write PEP 550 was to > solve my own pain point, have a solution for async code. > 'threading.local' is completely unusable there, but complex code bases > demand a working solution. I thought that because coroutines and > generators are so similar under the hood, I can design a simple > solution that will cover all edge cases. Turns out it is not possible > to do it in one pass. > > Therefore, in order to make some progress, I propose to split the > problem in half: > > Stage 1. A new execution context PEP to solve the problem *just for > async code*. The PEP will target Python 3.7 and completely ignore > synchronous generators and asynchronous generators. It will be based > on PEP 550 v1 (no chained lookups, immutable mapping or CoW as an > optimization) and borrow some good API decisions from PEP 550 v3+ > (contextvars module, ContextVar class). The API (and C-API) will be > designed to be future proof and ultimately allow transition to the > stage 2. If you want to ignore generators/async generators, then I think you don't even want PEP 550 v1, you just want something like a {set,get}_context_state API that lets you access the ThreadState's context dict (or rather, an opaque ContextState object that holds the context dict), and then task schedulers can call them at appropriate moments. -n -- Nathaniel J. Smith -- https://vorpus.org From guido at python.org Mon Oct 16 01:04:31 2017 From: guido at python.org (Guido van Rossum) Date: Sun, 15 Oct 2017 22:04:31 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On Sun, Oct 15, 2017 at 8:12 PM, Nick Coghlan wrote: > On 16 October 2017 at 11:33, Yury Selivanov > wrote: > >> Stage 2. When Python 3.7 is out, we'll see how people use execution >> contexts for async code and collect feedback. If we recognize that >> Python users want execution contexts for generators/asynchronous >> generators, we'll make a new PEP to add support for them in Python >> 3.8. That future discussion will be focused on generators >> specifically, and therefore I expect it to be somewhat more focused. 
>> > > As long as it's made clear that the interaction between context variables > and generators is formally undefined in 3.7, I think that's reasonable - > folks that want to ensure the current behaviour indefinitely should keep > using thread locals rather than switching over to context variables. > It shouldn't be formally undefined. It should have the semantics it acquires when you combine the existing (well-defined) formal semantics of generators with the (to be defined) formal semantics of context variables. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Oct 16 01:10:29 2017 From: guido at python.org (Guido van Rossum) Date: Sun, 15 Oct 2017 22:10:29 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On Sun, Oct 15, 2017 at 8:17 PM, Nathaniel Smith wrote: > On Sun, Oct 15, 2017 at 6:33 PM, Yury Selivanov > wrote: > > Stage 1. A new execution context PEP to solve the problem *just for > > async code*. The PEP will target Python 3.7 and completely ignore > > synchronous generators and asynchronous generators. It will be based > > on PEP 550 v1 (no chained lookups, immutable mapping or CoW as an > > optimization) and borrow some good API decisions from PEP 550 v3+ > > (contextvars module, ContextVar class). The API (and C-API) will be > > designed to be future proof and ultimately allow transition to the > > stage 2. > > If you want to ignore generators/async generators, then I think you > don't even want PEP 550 v1, you just want something like a > {set,get}_context_state API that lets you access the ThreadState's > context dict (or rather, an opaque ContextState object that holds the > context dict), and then task schedulers can call them at appropriate > moments. > Yes, that's what I meant by "ignoring generators". And I'd like there to be a "current context" that's a per-thread MutableMapping with ContextVar keys. Maybe there's not much more to it apart from naming the APIs for getting and setting it? To be clear, I am fine with this being a specific subtype of MutableMapping. But I don't see much benefit in making it more abstract than that. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Oct 16 01:26:43 2017 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 15 Oct 2017 22:26:43 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On Sun, Oct 15, 2017 at 10:10 PM, Guido van Rossum wrote: > On Sun, Oct 15, 2017 at 8:17 PM, Nathaniel Smith wrote: >> >> On Sun, Oct 15, 2017 at 6:33 PM, Yury Selivanov >> wrote: >> > Stage 1. A new execution context PEP to solve the problem *just for >> > async code*. The PEP will target Python 3.7 and completely ignore >> > synchronous generators and asynchronous generators. It will be based >> > on PEP 550 v1 (no chained lookups, immutable mapping or CoW as an >> > optimization) and borrow some good API decisions from PEP 550 v3+ >> > (contextvars module, ContextVar class). The API (and C-API) will be >> > designed to be future proof and ultimately allow transition to the >> > stage 2. 
>> >> If you want to ignore generators/async generators, then I think you >> don't even want PEP 550 v1, you just want something like a >> {set,get}_context_state API that lets you access the ThreadState's >> context dict (or rather, an opaque ContextState object that holds the >> context dict), and then task schedulers can call them at appropriate >> moments. > > > Yes, that's what I meant by "ignoring generators". And I'd like there to be > a "current context" that's a per-thread MutableMapping with ContextVar keys. > Maybe there's not much more to it apart from naming the APIs for getting and > setting it? To be clear, I am fine with this being a specific subtype of > MutableMapping. But I don't see much benefit in making it more abstract than > that. We don't need it to be abstract (it's fine to have a single concrete mapping type that we always use internally), but I think we do want it to be opaque (instead of exposing the MutableMapping interface, the only way to get/set specific values should be through the ContextVar interface). The advantages are: - This allows C level caching of values in ContextVar objects (in particular, funneling mutations through a limited API makes cache invalidation *much* easier) - It gives us flexibility to change the underlying data structure without breaking API, or for different implementations to make different choices -- in particular, it's not clear whether a dict or HAMT is better, and it's not clear whether a regular dict or WeakKeyDict is better. The first point (caching) I think is the really compelling one: in practice decimal and numpy are already using tricky caching code to reduce the overhead of accessing the ThreadState dict, and this gets even trickier with context-local state which has more cache invalidation points, so if we don't do this in the interpreter then it could actually become a blocker for adoption. OTOH it's easy for the interpreter itself to do this caching, and it makes everyone faster. -n -- Nathaniel J. Smith -- https://vorpus.org From p.f.moore at gmail.com Mon Oct 16 04:26:23 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 16 Oct 2017 09:26:23 +0100 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On 16 October 2017 at 02:33, Yury Selivanov wrote: > Stage 1. A new execution context PEP to solve the problem *just for > async code*. The PEP will target Python 3.7 and completely ignore > synchronous generators and asynchronous generators. It will be based > on PEP 550 v1 (no chained lookups, immutable mapping or CoW as an > optimization) and borrow some good API decisions from PEP 550 v3+ > (contextvars module, ContextVar class). The API (and C-API) will be > designed to be future proof and ultimately allow transition to the > stage 2. So would decimal contexts stick to using threading.local? If so, presumably they'd still have problems with async. If not, won't you still be stuck with having to define the new semantics they have when used with generators? Or would it be out of scope for the PEP to take a position on what decimal does? 
Paul

From victor.stinner at gmail.com Mon Oct 16 06:42:30 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 16 Oct 2017 12:42:30 +0200
Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution
Message-ID:

Hi,

While discussions on this PEP are not over on python-ideas, I proposed
this PEP directly on python-dev since I consider that my PEP already
summarizes current and past proposed alternatives.

python-ideas threads:

* Add time.time_ns(): system clock with nanosecond resolution
* Why not picoseconds?

The PEP 564 will shortly be online at:
https://www.python.org/dev/peps/pep-0564/

Victor


PEP: 564
Title: Add new time functions with nanosecond resolution
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 16-October-2017
Python-Version: 3.7

Abstract
========

Add five new functions to the ``time`` module: ``time_ns()``,
``perf_counter_ns()``, ``monotonic_ns()``, ``clock_gettime_ns()`` and
``clock_settime_ns()``. They are similar to the functions without the
``_ns`` suffix, but have nanosecond resolution: they use a number of
nanoseconds as a Python int.

The best ``time.time_ns()`` resolution measured in Python is 3 times
better than the ``time.time()`` resolution on Linux and Windows.

Rationale
=========

Float type limited to 104 days
------------------------------

The clock resolution of desktop and laptop computers is getting closer
to nanosecond resolution. More and more clocks have a frequency in MHz,
up to GHz for the CPU TSC clock.

The Python ``time.time()`` function returns the current time as a
floating point number which is usually a 64-bit binary floating point
number (in the IEEE 754 format).

The problem is that the float type starts to lose nanoseconds after 104
days. Conversion from nanoseconds (``int``) to seconds (``float``) and
then back to nanoseconds (``int``) to check if conversions lose
precision::

    # no precision loss
    >>> x = 2 ** 52 + 1; int(float(x * 1e-9) * 1e9) - x
    0
    # precision loss! (1 nanosecond)
    >>> x = 2 ** 53 + 1; int(float(x * 1e-9) * 1e9) - x
    -1
    >>> print(datetime.timedelta(seconds=2 ** 53 / 1e9))
    104 days, 5:59:59.254741

``time.time()`` returns seconds elapsed since the UNIX epoch: January
1st, 1970. This function loses precision since May 1970 (47 years ago)::

    >>> import datetime
    >>> unix_epoch = datetime.datetime(1970, 1, 1)
    >>> print(unix_epoch + datetime.timedelta(seconds=2**53 / 1e9))
    1970-04-15 05:59:59.254741

Previous rejected PEP
---------------------

Five years ago, the PEP 410 proposed a large and complex change in all
Python functions returning time to support nanosecond resolution using
the ``decimal.Decimal`` type. The PEP was rejected for different reasons:

* The idea of adding a new optional parameter to change the result type
  was rejected. It's an uncommon (and bad?) programming practice in
  Python.

* It was not clear if hardware clocks really had a resolution of 1
  nanosecond, especially at the Python level.

* The ``decimal.Decimal`` type is uncommon in Python and so requires
  adapting code to handle it.

CPython enhancements of the last 5 years
----------------------------------------

Since the PEP 410 was rejected:

* The ``os.stat_result`` structure got 3 new fields for timestamps as
  nanoseconds (Python ``int``): ``st_atime_ns``, ``st_ctime_ns`` and
  ``st_mtime_ns``.

* The PEP 418 was accepted, Python 3.3 got 3 new clocks:
  ``time.monotonic()``, ``time.perf_counter()`` and
  ``time.process_time()``.
* The CPython private "pytime" C API handling time now uses a new
  ``_PyTime_t`` type: a simple 64-bit signed integer (C ``int64_t``).
  The ``_PyTime_t`` unit is an implementation detail and not part of the
  API. The unit is currently ``1 nanosecond``.

Existing Python APIs using nanoseconds as int
---------------------------------------------

The ``os.stat_result`` structure has 3 fields for timestamps as
nanoseconds (``int``): ``st_atime_ns``, ``st_ctime_ns`` and
``st_mtime_ns``.

The ``ns`` parameter of the ``os.utime()`` function accepts a
``(atime_ns: int, mtime_ns: int)`` tuple: nanoseconds.

Changes
=======

New functions
-------------

This PEP adds five new functions to the ``time`` module:

* ``time.clock_gettime_ns(clock_id)``
* ``time.clock_settime_ns(clock_id, time: int)``
* ``time.perf_counter_ns()``
* ``time.monotonic_ns()``
* ``time.time_ns()``

These functions are similar to the versions without the ``_ns`` suffix,
but use nanoseconds as Python ``int``. For example,
``time.monotonic_ns() == int(time.monotonic() * 1e9)`` if the
``monotonic()`` value is small enough not to lose precision.

Unchanged functions
-------------------

This PEP only proposes to add new functions getting or setting clocks
with nanosecond resolution. Clocks are likely to lose precision,
especially when their reference is the UNIX epoch.

Python has other functions handling time (get time, timeout, etc.), but
no nanosecond variant is proposed for them since they are less likely
to lose precision.

Examples of unchanged functions:

* ``os`` module: ``sched_rr_get_interval()``, ``times()``, ``wait3()``
  and ``wait4()``
* ``resource`` module: ``ru_utime`` and ``ru_stime`` fields of
  ``getrusage()``
* ``signal`` module: ``getitimer()``, ``setitimer()``
* ``time`` module: ``clock_getres()``

Since the ``time.clock()`` function was deprecated in Python 3.3, no
``time.clock_ns()`` is added.

Alternatives and discussion
===========================

Sub-nanosecond resolution
-------------------------

The ``time.time_ns()`` API is not "future-proof": if clock resolutions
increase, new Python functions may be needed.

In practice, the resolution of 1 nanosecond is currently enough for all
structures used by all operating system functions.

Hardware clocks with a resolution better than 1 nanosecond already
exist. For example, the frequency of a CPU TSC clock is the CPU base
frequency: the resolution is around 0.3 ns for a CPU running at 3 GHz.
Users who have access to such hardware and really need sub-nanosecond
resolution can easily extend Python for their needs. Such a rare use
case doesn't justify designing the Python standard library to support
sub-nanosecond resolution.

For the CPython implementation, nanosecond resolution is convenient: the
standard and well supported ``int64_t`` type can be used to store time.
It supports a time delta between -292 years and 292 years. Using the
UNIX epoch as reference, this type supports time from year 1677 to year
2262::

    >>> 1970 - 2 ** 63 / (10 ** 9 * 3600 * 24 * 365.25)
    1677.728976954687
    >>> 1970 + 2 ** 63 / (10 ** 9 * 3600 * 24 * 365.25)
    2262.271023045313

Different types
---------------

It was proposed to modify ``time.time()`` to use a float type with
better precision. The PEP 410 proposed to use ``decimal.Decimal``, but
it was rejected. Apart from ``decimal.Decimal``, no portable ``float``
type with better precision is currently available in Python. Changing
the builtin Python ``float`` type is out of the scope of this PEP.
Other ideas of new types were proposed to support larger or arbitrary
precision: fractions, structures or 2-tuples using integers,
fixed-precision floating point numbers, etc. See also the PEP 410 for a
previous long discussion on other types.

Adding a new type requires more effort to support it than reusing
``int``. The standard library, third party code and applications would
have to be modified to support it.

The Python ``int`` type is well known, well supported, easy to
manipulate, and supports all arithmetic operations like
``dt = t2 - t1``.

Moreover, using nanoseconds as an integer is not new in Python; it's
already used for ``os.stat_result`` and
``os.utime(ns=(atime_ns, mtime_ns))``.

.. note::
   If the Python ``float`` type becomes larger (ex: decimal128 or
   float128), the ``time.time()`` precision will increase as well.

Different API
-------------

The ``time.time(ns=False)`` API was proposed to avoid adding new
functions. It's an uncommon (and bad?) programming practice in Python to
change the result type depending on a parameter.

Different options were proposed to allow the user to choose the time
resolution. If each Python module uses a different resolution, it can
become difficult to handle different resolutions, instead of just
seconds (``time.time()`` returning ``float``) and nanoseconds
(``time.time_ns()`` returning ``int``). Moreover, as written above,
there is no need for a resolution better than 1 nanosecond in practice
in the Python standard library.

Annex: Clocks Resolution in Python
==================================

Script to measure the smallest difference between two ``time.time()``
and ``time.time_ns()`` reads, ignoring differences of zero::

    import math
    import time

    LOOPS = 10 ** 6

    print("time.time_ns(): %s" % time.time_ns())
    print("time.time(): %s" % time.time())

    min_dt = [abs(time.time_ns() - time.time_ns()) for _ in range(LOOPS)]
    min_dt = min(filter(bool, min_dt))
    print("min time_ns() delta: %s ns" % min_dt)

    min_dt = [abs(time.time() - time.time()) for _ in range(LOOPS)]
    min_dt = min(filter(bool, min_dt))
    print("min time() delta: %s ns" % math.ceil(min_dt * 1e9))

Results of time(), perf_counter() and monotonic().

Linux (kernel 4.12 on Fedora 26):

* time_ns(): **84 ns**
* time(): **239 ns**
* perf_counter_ns(): 84 ns
* perf_counter(): 82 ns
* monotonic_ns(): 84 ns
* monotonic(): 81 ns

Windows 8.1:

* time_ns(): **318000 ns**
* time(): **894070 ns**
* perf_counter_ns(): 100 ns
* perf_counter(): 100 ns
* monotonic_ns(): 15000000 ns
* monotonic(): 15000000 ns

The difference on ``time.time()`` is significant: **84 ns (2.8x better)
vs 239 ns on Linux and 318 us (2.8x better) vs 894 us on Windows**. The
difference (precision loss) will grow larger in the coming years, since
every day adds 86,400,000,000,000 nanoseconds to the system clock.

The difference on ``time.perf_counter()`` and ``time.monotonic()`` is
not visible in this quick script since the script runs for less than 1
minute, and the uptime of the computer used to run the script was less
than 1 week. A significant difference should be seen with an uptime of
104 days or greater.

.. note::
   Internally, Python starts the ``monotonic()`` and ``perf_counter()``
   clocks at zero on some platforms, which indirectly reduces the
   precision loss.

Copyright
=========

This document has been placed in the public domain.
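As an illustration of the precision loss described in the Rationale (a
minimal sketch, not part of the PEP text; the timestamp is an assumed
value, not a live clock read)::

    t_ns = 1508162400123456789        # assumed timestamp in nanoseconds (October 2017)
    as_float = t_ns / 10**9           # what time.time() returns: seconds as a float
    roundtrip_ns = round(as_float * 10**9)
    print(t_ns - roundtrip_ns)        # typically non-zero: nanoseconds were lost

    # With the proposed time.time_ns(), the value stays an int end to end:
    #     t_ns = time.time_ns()       # no precision loss
    #     seconds = t_ns / 10**9      # convert to float only for display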
From ncoghlan at gmail.com Mon Oct 16 07:44:18 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 16 Oct 2017 21:44:18 +1000 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On 16 October 2017 at 18:26, Paul Moore wrote: > On 16 October 2017 at 02:33, Yury Selivanov > wrote: > > Stage 1. A new execution context PEP to solve the problem *just for > > async code*. The PEP will target Python 3.7 and completely ignore > > synchronous generators and asynchronous generators. It will be based > > on PEP 550 v1 (no chained lookups, immutable mapping or CoW as an > > optimization) and borrow some good API decisions from PEP 550 v3+ > > (contextvars module, ContextVar class). The API (and C-API) will be > > designed to be future proof and ultimately allow transition to the > > stage 2. > > So would decimal contexts stick to using threading.local? If so, > presumably they'd still have problems with async. If not, won't you > still be stuck with having to define the new semantics they have when > used with generators? Or would it be out of scope for the PEP to take > a position on what decimal does? > Decimal could (and should) still switch over in order to make itself more coroutine-friendly, as in this version of the proposal, the key design parameters would be: - for synchronous code that never changes the execution context, context variables and thread locals are essentially equivalent (since there will be exactly one execution context per thread) - for asynchronous code, each task managed by the event loop will get its own execution context (each of which is distinct from the event loop's own execution context) So while I was initially disappointed by the suggestion, I'm coming around to the perspective that it's probably a good pragmatic way to improve context variable adoption rates, since it makes it straightforward for folks to seamlessly switch between using context variables when they're available, and falling back to thread local variables otherwise (and perhaps restricting their coroutine support to Python versions that offer context variables). The downside is that you'll still need to explicitly revert the decimal context before yielding from a generator if you didn't want the context change to "leak", but that's not a new constraint - it's one that already exists for the thread-local based decimal context API. So going down this path would lock in the *default* semantics for the interaction between context variables and generators as being the same as the interaction between thread locals and generators, but would still leave the door open to subsequently introducing an opt-in API like the "contextvars.iter_in_context" idea for cases where folks decided they wanted to do something different (like capturing the context at the point where iterator was created and then temporarily switching back to that on each iteration). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From p.f.moore at gmail.com Mon Oct 16 08:03:30 2017
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 16 Oct 2017 13:03:30 +0100
Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion
In-Reply-To: 
References: 
Message-ID: 

On 16 October 2017 at 12:44, Nick Coghlan wrote:
> The downside is that you'll still need to explicitly revert the decimal
> context before yielding from a generator if you didn't want the context
> change to "leak", but that's not a new constraint - it's one that already
> exists for the thread-local based decimal context API.

Ah, OK. Now I follow. Thanks for clarifying.
Paul

From victor.stinner at gmail.com Mon Oct 16 09:50:15 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 16 Oct 2017 15:50:15 +0200
Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution
In-Reply-To: 
References: 
Message-ID: 

I read again the discussions on python-ideas and noticed that I forgot
to mention the "time_ns module" idea. I also added a section to give
concrete examples of the precision loss.

https://github.com/python/peps/commit/a4828def403913dbae7452b4f9b9d62a0c83a278

Issues caused by precision loss
-------------------------------

Example 1: measure time delta
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A server is running for longer than 104 days. A clock is read before
and after running a function to measure its performance. This benchmark
loses precision only because of the float type used by clocks, not
because of the clock resolution.

On Python microbenchmarks, it is common to see function calls taking
less than 100 ns. A difference of a single nanosecond becomes
significant.

Example 2: compare time with different resolution
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Two programs "A" and "B" are running on the same system, so they use
the same system clock. The program A reads the system clock with
nanosecond resolution and writes the timestamp with nanosecond
resolution. The program B reads the timestamp with nanosecond
resolution, but compares it to the system clock read with a worse
resolution. To simplify the example, let's say that it reads the clock
with second resolution. In that case, there is a window of 1 second
during which the program B can see the timestamp written by A as "in
the future".

Nowadays, more and more databases and filesystems support storing time
with nanosecond resolution.

.. note::
   This issue was already fixed for file modification time by adding the
   ``st_mtime_ns`` field to the ``os.stat()`` result, and by accepting
   nanoseconds in ``os.utime()``. This PEP proposes to generalize the
   fix.

(...)

Modify time.time() result type
------------------------------

It was proposed to modify ``time.time()`` to return a different float
type with better precision.

PEP 410 proposed to use ``decimal.Decimal`` which already exists and
supports arbitrary precision, but it was rejected. Apart from
``decimal.Decimal``, no portable ``float`` type with better precision is
currently available in Python.

Changing the builtin Python ``float`` type is out of the scope of this
PEP.

Moreover, changing existing functions to return a new type introduces a
risk of breaking backward compatibility even if the new type is
designed carefully.

(...)
New time_ns module
------------------

Add a new ``time_ns`` module which contains the five new functions:

* ``time_ns.clock_gettime(clock_id)``
* ``time_ns.clock_settime(clock_id, time: int)``
* ``time_ns.perf_counter()``
* ``time_ns.monotonic()``
* ``time_ns.time()``

The first question is whether the ``time_ns`` module should expose
exactly the same API (constants, functions, etc.) as the ``time``
module. It can be painful to maintain two flavors of the ``time``
module. How are users supposed to make a choice between these two
modules?

If tomorrow, other nanosecond variants are needed in the ``os`` module,
will we have to add a new ``os_ns`` module as well? There are functions
related to time in many modules: ``time``, ``os``, ``signal``,
``resource``, ``select``, etc.

Another idea is to add a ``time.ns`` submodule or a nested namespace to
get the ``time.ns.time()`` syntax.

Victor

From apaulross at gmail.com Mon Oct 16 10:35:20 2017
From: apaulross at gmail.com (Paul Ross)
Date: Mon, 16 Oct 2017 15:35:20 +0100
Subject: [Python-Dev] Preprocessing the CPython Source Tree
Message-ID: 

I have implemented a C preprocessor written in Python which gives some
useful visualisations of source code, particularly macro usage:
https://github.com/paulross/cpip

I have been running this on the CPython source code and it occurs to me
that this might be useful to the python-dev community.

For example the Python dictionary source code is visualised here:
http://cpip.readthedocs.io/en/latest/_static/dictobject.c/index_dictobject.c_a3f5bfec1ed531371fb1a2bcdcb2e9c2.html

I found this really useful when I was getting a segfault during a
dictionary insert from my C code. The segfault was on this line
http://cpip.readthedocs.io/en/latest/_static/dictobject.c/dictobject.c_a3f5bfec1ed531371fb1a2bcdcb2e9c2.html#1130
but it is hard to see what is going on with macros inside macros. If you
click on the link on the left end of the line it takes you to the full
expansion of the macros
http://cpip.readthedocs.io/en/latest/_static/dictobject.c/dictobject.c.html#1130
as this is what the compiler and debugger see. I could examine these
values in GDB and figure out what was going on.

I could also figure out what that MAINTAIN_TRACKING macro was doing by
looking at the macros page generated by CPIP:
http://cpip.readthedocs.io/en/latest/_static/dictobject.c/macros_ref.html#_TUFJTlRBSU5fVFJBQ0tJTkdfMA__
and following those links.

I was wondering if it would be valuable to python-dev developers if this
tool was run regularly over the CPython source tree(s). A single source
tree takes about 12 CPU hours to process and generates 8GB of HTML/SVG.
If this is useful, then where should it be hosted?

Regards,

Paul Ross
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From solipsis at pitrou.net Mon Oct 16 11:06:31 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 16 Oct 2017 17:06:31 +0200
Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution
References: 
Message-ID: <20171016170631.375edd56@fsol>

Hi,

On Mon, 16 Oct 2017 12:42:30 +0200
Victor Stinner wrote:
>
> ``time.time()`` returns seconds elapsed since the UNIX epoch: January
> 1st, 1970. This function loses precision since May 1970 (47 years ago)::

This is a funny sentence. I doubt computers (Unix or not) had
nanosecond clocks in May 1970.
> This PEP adds five new functions to the ``time`` module:
>
> * ``time.clock_gettime_ns(clock_id)``
> * ``time.clock_settime_ns(clock_id, time: int)``
> * ``time.perf_counter_ns()``
> * ``time.monotonic_ns()``
> * ``time.time_ns()``

Why not ``time.process_time_ns()``?

> Hardware clock with a resolution better than 1 nanosecond already
> exists. For example, the frequency of a CPU TSC clock is the CPU base
> frequency: the resolution is around 0.3 ns for a CPU running at 3
> GHz. Users who have access to such hardware and really need
> sub-nanosecond resolution can easyly extend Python for their needs.

Typo: easily. But how easy is it?

> Such rare use case don't justify to design the Python standard library
> to support sub-nanosecond resolution.

I suspect that assertion will be challenged at some point :-)
Though I agree with the ease of implementation argument (about int64_t
being wide enough for nanoseconds but not picoseconds).

Regards

Antoine.

From victor.stinner at gmail.com Mon Oct 16 11:23:15 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 16 Oct 2017 17:23:15 +0200
Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution
In-Reply-To: <20171016170631.375edd56@fsol>
References: <20171016170631.375edd56@fsol>
Message-ID: 

2017-10-16 17:06 GMT+02:00 Antoine Pitrou :
>> This PEP adds five new functions to the ``time`` module:
>>
>> * ``time.clock_gettime_ns(clock_id)``
>> * ``time.clock_settime_ns(clock_id, time: int)``
>> * ``time.perf_counter_ns()``
>> * ``time.monotonic_ns()``
>> * ``time.time_ns()``
>
> Why not ``time.process_time_ns()``?

I wrote my first email on python-ideas only to ask this question, but
I got no answer to it, only proposals of other solutions to get time
with nanosecond resolution. So I picked the simplest option: start
simple, only add new clocks, and maybe add more "_ns" functions later.

If we add process_time_ns(), should we also add nanosecond resolution
to other functions related to process or CPU time?

* Add "ru_utime_ns" and "ru_stime_ns" to the resource.struct_rusage
used by os.wait3(), os.wait4() and resource.getrusage()
* For os.times(): add os.times_ns()? For this one, I prefer to add a
new function rather than duplicating *all* fields of os.times_result,
since all fields store durations

Victor

From benhoyt at gmail.com Mon Oct 16 11:37:37 2017
From: benhoyt at gmail.com (Ben Hoyt)
Date: Mon, 16 Oct 2017 11:37:37 -0400
Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution
In-Reply-To: 
References: 
Message-ID: 

I've read the examples you wrote here, but I'm struggling to see what the
real-life use cases are for this. When would you care about *both* very
long-running servers (104 days+) and nanosecond precision? I'm not saying
it could never happen, but would want to see real "experience reports" of
when this is needed.

-Ben

On Mon, Oct 16, 2017 at 9:50 AM, Victor Stinner wrote:

> I read again the discussions on python-ideas and noticed that I forgot
> to mention the "time_ns module" idea. I also added a section to give
> concrete examples of the precision loss.
>
> https://github.com/python/peps/commit/a4828def403913dbae7452b4f9b9d6
> 2a0c83a278
>
> Issues caused by precision loss
> -------------------------------
>
> Example 1: measure time delta
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> A server is running for longer than 104 days. A clock is read before
> and after running a function to measure its performance.
This benchmark > lose precision only because the float type used by clocks, not because > of the clock resolution. > > On Python microbenchmarks, it is common to see function calls taking > less than 100 ns. A difference of a single nanosecond becomes > significant. > > Example 2: compare time with different resolution > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > Two programs "A" and "B" are runing on the same system, so use the system > block. The program A reads the system clock with nanosecond resolution > and writes the timestamp with nanosecond resolution. The program B reads > the timestamp with nanosecond resolution, but compares it to the system > clock read with a worse resolution. To simplify the example, let's say > that it reads the clock with second resolution. If that case, there is a > window of 1 second while the program B can see the timestamp written by A > as "in the future". > > Nowadays, more and more databases and filesystems support storing time > with nanosecond resolution. > > .. note:: > This issue was already fixed for file modification time by adding the > ``st_mtime_ns`` field to the ``os.stat()`` result, and by accepting > nanoseconds in ``os.utime()``. This PEP proposes to generalize the > fix. > > (...) > > Modify time.time() result type > ------------------------------ > > It was proposed to modify ``time.time()`` to return a different float > type with better precision. > > The PEP 410 proposed to use ``decimal.Decimal`` which already exists and > supports arbitray precision, but it was rejected. Apart > ``decimal.Decimal``, no portable ``float`` type with better precision is > currently available in Python. > > Changing the builtin Python ``float`` type is out of the scope of this > PEP. > > Moreover, changing existing functions to return a new type introduces a > risk of breaking the backward compatibility even the new type is > designed carefully. > > (...) > > New time_ns module > ------------------ > > Add a new ``time_ns`` module which contains the five new functions: > > * ``time_ns.clock_gettime(clock_id)`` > * ``time_ns.clock_settime(clock_id, time: int)`` > * ``time_ns.perf_counter()`` > * ``time_ns.monotonic()`` > * ``time_ns.time()`` > > The first question is if the ``time_ns`` should expose exactly the same > API (constants, functions, etc.) than the ``time`` module. It can be > painful to maintain two flavors of the ``time`` module. How users use > suppose to make a choice between these two modules? > > If tomorrow, other nanosecond variant are needed in the ``os`` module, > will we have to add a new ``os_ns`` module as well? There are functions > related to time in many modules: ``time``, ``os``, ``signal``, > ``resource``, ``select``, etc. > > Another idea is to add a ``time.ns`` submodule or a nested-namespace to > get the ``time.ns.time()`` syntax. > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > benhoyt%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... 
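To make the quoted "Example 2" a bit more concrete, here is a rough
simulation of the resolution mismatch using today's functions (illustrative
only: second vs. sub-second resolution stands in for second vs. nanosecond,
since ``time.time_ns()`` does not exist yet):

    import time

    # Program "A" records an event with sub-second resolution.
    event = time.time()             # e.g. 1508171987.123456

    # Program "B" reads the same system clock, truncated to whole seconds.
    now_coarse = int(time.time())   # e.g. 1508171987

    # Until the next full second, B sees A's event as being "in the future".
    print(event > now_coarse)       # usually True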
URL: From guido at python.org Mon Oct 16 11:49:43 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Oct 2017 08:49:43 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On Sun, Oct 15, 2017 at 10:26 PM, Nathaniel Smith wrote: > On Sun, Oct 15, 2017 at 10:10 PM, Guido van Rossum > wrote: > > Yes, that's what I meant by "ignoring generators". And I'd like there to > be > > a "current context" that's a per-thread MutableMapping with ContextVar > keys. > > Maybe there's not much more to it apart from naming the APIs for getting > and > > setting it? To be clear, I am fine with this being a specific subtype of > > MutableMapping. But I don't see much benefit in making it more abstract > than > > that. > > We don't need it to be abstract (it's fine to have a single concrete > mapping type that we always use internally), but I think we do want it > to be opaque (instead of exposing the MutableMapping interface, the > only way to get/set specific values should be through the ContextVar > interface). The advantages are: > > - This allows C level caching of values in ContextVar objects (in > particular, funneling mutations through a limited API makes cache > invalidation *much* easier) > Well the MutableMapping could still be a proxy or something that invalidates the cache when mutated. That's why I said it should be a single concrete mapping type. (It also doesn't have to derive from MutableMapping -- it's sufficient for it to be a duck type for one, or perhaps some Python-level code could `register()` it. > - It gives us flexibility to change the underlying data structure > without breaking API, or for different implementations to make > different choices -- in particular, it's not clear whether a dict or > HAMT is better, and it's not clear whether a regular dict or > WeakKeyDict is better. > I would keep it simple and supid, but WeakKeyDict is a subtype of MutableMapping, and I'm sure we can find a way to implement the full MutableMapping interface on top of HAMT as well. > The first point (caching) I think is the really compelling one: in > practice decimal and numpy are already using tricky caching code to > reduce the overhead of accessing the ThreadState dict, and this gets > even trickier with context-local state which has more cache > invalidation points, so if we don't do this in the interpreter then it > could actually become a blocker for adoption. OTOH it's easy for the > interpreter itself to do this caching, and it makes everyone faster. > I agree, but I don't see how making the type a subtype (or duck type) of MutableMapping prevents any of those strategies. (Maybe you were equating MutableMapping with "subtype of dict"?) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
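As a toy model of the semantics being discussed here (per-thread only, no
caching, no per-task contexts, and certainly not the proposed
implementation), a ``ContextVar`` that simply keys into a per-thread mapping
could look like this:

    import threading

    class _State(threading.local):
        def __init__(self):
            # Stand-in for the per-thread "current context" mapping.
            self.context = {}

    _state = _State()

    class ContextVar:
        def __init__(self, name, *, default=None):
            self.name = name
            self.default = default

        def get(self):
            return _state.context.get(self, self.default)

        def set(self, value):
            _state.context[self] = value

    request_id = ContextVar("request_id")
    request_id.set(42)
    print(request_id.get())   # 42 in this thread, None in any other thread

The open question in the thread is mainly whether that ``context`` mapping
should be exposed as a MutableMapping-like object or kept opaque behind the
ContextVar methods.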
URL: From solipsis at pitrou.net Mon Oct 16 11:42:09 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 16 Oct 2017 17:42:09 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> Message-ID: <20171016174209.51cbf40c@fsol> On Mon, 16 Oct 2017 17:23:15 +0200 Victor Stinner wrote: > 2017-10-16 17:06 GMT+02:00 Antoine Pitrou : > >> This PEP adds five new functions to the ``time`` module: > >> > >> * ``time.clock_gettime_ns(clock_id)`` > >> * ``time.clock_settime_ns(clock_id, time: int)`` > >> * ``time.perf_counter_ns()`` > >> * ``time.monotonic_ns()`` > >> * ``time.time_ns()`` > > > > Why not ``time.process_time_ns()``? > > I only wrote my first email on python-ideas to ask this question, but > I got no answer on this question, only proposal of other solutions to > get time with nanosecond resolution. So I picked the simplest option: > start simple, only add new clocks, and maybe add more "_ns" functions > later. > > If we add process_time_ns(), should we also add nanosecond resolution > to other functions related to process or CPU time? Restricting this PEP to the time module would be fine with me. Regards Antoine. From guido at python.org Mon Oct 16 11:58:03 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Oct 2017 08:58:03 -0700 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: Message-ID: On Mon, Oct 16, 2017 at 8:37 AM, Ben Hoyt wrote: > I've read the examples you wrote here, but I'm struggling to see what the > real-life use cases are for this. When would you care about *both* very > long-running servers (104 days+) and nanosecond precision? I'm not saying > it could never happen, but would want to see real "experience reports" of > when this is needed. > A long-running server might still want to log precise *durations* of various events. (Durations of events are the bread and butter of server performance tuning.) And for this it might want to use the most precise clock available, which is perf_counter(). But if perf_counter()'s epoch is the start of the process, after 104 days it can no longer report ns precision due to float rounding (even though the internal counter does not lose ns). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Mon Oct 16 12:00:02 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 16 Oct 2017 18:00:02 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: Message-ID: 2017-10-16 17:37 GMT+02:00 Ben Hoyt : > I've read the examples you wrote here, but I'm struggling to see what the > real-life use cases are for this. When would you care about *both* very > long-running servers (104 days+) and nanosecond precision? I'm not saying it > could never happen, but would want to see real "experience reports" of when > this is needed. The second example doesn't depend on the system uptime nor how long the program is running. 
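A back-of-the-envelope check of that 104 days figure (illustrative
arithmetic only): a C double represents every integer exactly only up to
2**53, so a nanosecond count stored in a float starts dropping individual
nanoseconds after roughly 2**53 ns:

    ns = 2 ** 53                          # limit of exactly representable integers
    print(ns / (10 ** 9 * 86400))         # about 104.25 days

    # Just below the limit the float round-trip is still lossless,
    # just above it one-nanosecond differences disappear.
    print(int(float(ns - 1)) == ns - 1)   # True
    print(int(float(ns + 1)) == ns + 1)   # False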
You can hit the issue just after the system finished to boot: "Example 2: compare time with different resolution" https://www.python.org/dev/peps/pep-0564/#example-2-compare-time-with-different-resolution Victor From victor.stinner at gmail.com Mon Oct 16 12:06:06 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 16 Oct 2017 18:06:06 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: <20171016174209.51cbf40c@fsol> References: <20171016170631.375edd56@fsol> <20171016174209.51cbf40c@fsol> Message-ID: 2017-10-16 17:42 GMT+02:00 Antoine Pitrou : > Restricting this PEP to the time module would be fine with me. Maybe I should add a short sentence to keep the question open, but exclude it from the direct scope of the PEP? For example: "New nanosecond flavor of these functions may be added later, if a concrete use case comes in." What do you think? Victor From yselivanov.ml at gmail.com Mon Oct 16 12:11:26 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 16 Oct 2017 12:11:26 -0400 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On Mon, Oct 16, 2017 at 11:49 AM, Guido van Rossum wrote: > On Sun, Oct 15, 2017 at 10:26 PM, Nathaniel Smith wrote: >> >> On Sun, Oct 15, 2017 at 10:10 PM, Guido van Rossum >> wrote: >> > Yes, that's what I meant by "ignoring generators". And I'd like there to >> > be >> > a "current context" that's a per-thread MutableMapping with ContextVar >> > keys. >> > Maybe there's not much more to it apart from naming the APIs for getting >> > and >> > setting it? To be clear, I am fine with this being a specific subtype of >> > MutableMapping. But I don't see much benefit in making it more abstract >> > than >> > that. >> >> We don't need it to be abstract (it's fine to have a single concrete >> mapping type that we always use internally), but I think we do want it >> to be opaque (instead of exposing the MutableMapping interface, the >> only way to get/set specific values should be through the ContextVar >> interface). The advantages are: >> >> - This allows C level caching of values in ContextVar objects (in >> particular, funneling mutations through a limited API makes cache >> invalidation *much* easier) > > > Well the MutableMapping could still be a proxy or something that invalidates > the cache when mutated. That's why I said it should be a single concrete > mapping type. (It also doesn't have to derive from MutableMapping -- it's > sufficient for it to be a duck type for one, or perhaps some Python-level > code could `register()` it. Yeah, we can do a proxy. > >> >> - It gives us flexibility to change the underlying data structure >> without breaking API, or for different implementations to make >> different choices -- in particular, it's not clear whether a dict or >> HAMT is better, and it's not clear whether a regular dict or >> WeakKeyDict is better. > > > I would keep it simple and supid, but WeakKeyDict is a subtype of > MutableMapping, and I'm sure we can find a way to implement the full > MutableMapping interface on top of HAMT as well. Correct. 
> >> >> The first point (caching) I think is the really compelling one: in >> practice decimal and numpy are already using tricky caching code to >> reduce the overhead of accessing the ThreadState dict, and this gets >> even trickier with context-local state which has more cache >> invalidation points, so if we don't do this in the interpreter then it >> could actually become a blocker for adoption. OTOH it's easy for the >> interpreter itself to do this caching, and it makes everyone faster. > > > I agree, but I don't see how making the type a subtype (or duck type) of > MutableMapping prevents any of those strategies. (Maybe you were equating > MutableMapping with "subtype of dict"?) Question: why do we want EC objects to be mappings? I'd rather make them opaque, which will result in less code and make it more future-proof. The key arguments for keeping ContextVar abstraction: * Naturally avoids name clashes. * Allows to implement efficient caching. This is important if we want libraries like decimal/numpy to start using it. * Abstracts away the actual implementation of the EC. This is a future-proof solution, with which we can enable EC support for generators in the future. We already know two possible solutions (PEP 550 v1, PEP 550 current), and ContextVar is a good enough abstraction to support both of them. IMO ContextVar.set() and ContextVar.get() is a simple and nice API to work with the EC. Most people (aside framework authors) won't even need to work with EC objects directly anyways. Yury From benhoyt at gmail.com Mon Oct 16 12:14:00 2017 From: benhoyt at gmail.com (Ben Hoyt) Date: Mon, 16 Oct 2017 12:14:00 -0400 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: Message-ID: Got it -- fair enough. We deploy so often where I work (a couple of times a week at least) that 104 days seems like an eternity. But I can see where for a very stable file server or something you might well run it that long without deploying. Then again, why are you doing performance tuning on a "very stable server"? -Ben On Mon, Oct 16, 2017 at 11:58 AM, Guido van Rossum wrote: > On Mon, Oct 16, 2017 at 8:37 AM, Ben Hoyt wrote: > >> I've read the examples you wrote here, but I'm struggling to see what the >> real-life use cases are for this. When would you care about *both* very >> long-running servers (104 days+) and nanosecond precision? I'm not saying >> it could never happen, but would want to see real "experience reports" of >> when this is needed. >> > > A long-running server might still want to log precise *durations* of > various events. (Durations of events are the bread and butter of server > performance tuning.) And for this it might want to use the most precise > clock available, which is perf_counter(). But if perf_counter()'s epoch is > the start of the process, after 104 days it can no longer report ns > precision due to float rounding (even though the internal counter does not > lose ns). > > -- > --Guido van Rossum (python.org/~guido) > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From solipsis at pitrou.net Mon Oct 16 12:28:33 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 16 Oct 2017 18:28:33 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> <20171016174209.51cbf40c@fsol> Message-ID: <20171016182833.57ab74b5@fsol> On Mon, 16 Oct 2017 18:06:06 +0200 Victor Stinner wrote: > 2017-10-16 17:42 GMT+02:00 Antoine Pitrou : > > Restricting this PEP to the time module would be fine with me. > > Maybe I should add a short sentence to keep the question open, but > exclude it from the direct scope of the PEP? For example: > > "New nanosecond flavor of these functions may be added later, if a > concrete use case comes in." > > What do you think? It sounds fine to me! Regards Antoine. From victor.stinner at gmail.com Mon Oct 16 12:28:14 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 16 Oct 2017 18:28:14 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: Message-ID: 2017-10-16 18:14 GMT+02:00 Ben Hoyt : > Got it -- fair enough. > > We deploy so often where I work (a couple of times a week at least) that 104 > days seems like an eternity. But I can see where for a very stable file > server or something you might well run it that long without deploying. Then > again, why are you doing performance tuning on a "very stable server"? I'm not sure of what you mean by "performance *tuning*". My idea in the example is more to collect live performance metrics to make sure that everything is fine on your "very stable server". Send these metrics to your favorite time serie database like Gnocchi, Graphite, Graphana or whatever. Victor From yselivanov.ml at gmail.com Mon Oct 16 12:53:00 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 16 Oct 2017 12:53:00 -0400 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On Mon, Oct 16, 2017 at 7:44 AM, Nick Coghlan wrote: [..] > So going down this path would lock in the *default* semantics for the > interaction between context variables and generators as being the same as > the interaction between thread locals and generators, but would still leave > the door open to subsequently introducing an opt-in API like the > "contextvars.iter_in_context" idea for cases where folks decided they wanted > to do something different (like capturing the context at the point where > iterator was created and then temporarily switching back to that on each > iteration). I think we can still implement context isolation in generators in later versions for ContextVars. In 3.7, ContextVars will only support async tasks and threads. Using them in generators will be *documented* as unsafe, as the context will "leak out". Fixing generators in some later version of Python will then be a feature/bug fix. I expect almost no backwards compatibility issue, same as I wouldn't expect them if we switched decimal to PEP 550 in 3.7. Yury From benhoyt at gmail.com Mon Oct 16 12:53:11 2017 From: benhoyt at gmail.com (Ben Hoyt) Date: Mon, 16 Oct 2017 12:53:11 -0400 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: Message-ID: Makes sense, thanks. -Ben On Mon, Oct 16, 2017 at 12:28 PM, Victor Stinner wrote: > 2017-10-16 18:14 GMT+02:00 Ben Hoyt : > > Got it -- fair enough. 
> > > > We deploy so often where I work (a couple of times a week at least) that > 104 > > days seems like an eternity. But I can see where for a very stable file > > server or something you might well run it that long without deploying. > Then > > again, why are you doing performance tuning on a "very stable server"? > > I'm not sure of what you mean by "performance *tuning*". My idea in > the example is more to collect live performance metrics to make sure > that everything is fine on your "very stable server". Send these > metrics to your favorite time serie database like Gnocchi, Graphite, > Graphana or whatever. > > Victor > -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Mon Oct 16 12:53:18 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 16 Oct 2017 18:53:18 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: <20171016182833.57ab74b5@fsol> References: <20171016170631.375edd56@fsol> <20171016174209.51cbf40c@fsol> <20171016182833.57ab74b5@fsol> Message-ID: 2017-10-16 18:28 GMT+02:00 Antoine Pitrou : >> What do you think? > > It sounds fine to me! Ok fine, I updated the PEP. Let's start simple with the few functions (5 "clock" functions) which are "obviously" impacted by the precission loss. Victor From solipsis at pitrou.net Mon Oct 16 12:59:44 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 16 Oct 2017 18:59:44 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> <20171016174209.51cbf40c@fsol> <20171016182833.57ab74b5@fsol> Message-ID: <20171016185944.1fee1d7a@fsol> On Mon, 16 Oct 2017 18:53:18 +0200 Victor Stinner wrote: > 2017-10-16 18:28 GMT+02:00 Antoine Pitrou : > >> What do you think? > > > > It sounds fine to me! > > Ok fine, I updated the PEP. Let's start simple with the few functions > (5 "clock" functions) which are "obviously" impacted by the precission > loss. It should be 6 functions, right? > > Victor From guido at python.org Mon Oct 16 13:00:14 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Oct 2017 10:00:14 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On Mon, Oct 16, 2017 at 9:11 AM, Yury Selivanov wrote: > On Mon, Oct 16, 2017 at 11:49 AM, Guido van Rossum > wrote: > > On Sun, Oct 15, 2017 at 10:26 PM, Nathaniel Smith wrote: > >> We don't need it to be abstract (it's fine to have a single concrete > >> mapping type that we always use internally), but I think we do want it > >> to be opaque (instead of exposing the MutableMapping interface, the > >> only way to get/set specific values should be through the ContextVar > >> interface). The advantages are: > >> > >> - This allows C level caching of values in ContextVar objects (in > >> particular, funneling mutations through a limited API makes cache > >> invalidation *much* easier) > > > Well the MutableMapping could still be a proxy or something that > invalidates > > the cache when mutated. That's why I said it should be a single concrete > > mapping type. (It also doesn't have to derive from MutableMapping -- it's > > sufficient for it to be a duck type for one, or perhaps some Python-level > > code could `register()` it. > > Yeah, we can do a proxy. 
> > >> - It gives us flexibility to change the underlying data structure > >> without breaking API, or for different implementations to make > >> different choices -- in particular, it's not clear whether a dict or > >> HAMT is better, and it's not clear whether a regular dict or > >> WeakKeyDict is better. > > > I would keep it simple and supid, but WeakKeyDict is a subtype of > > MutableMapping, and I'm sure we can find a way to implement the full > > MutableMapping interface on top of HAMT as well. > > Correct. > > >> The first point (caching) I think is the really compelling one: in > >> practice decimal and numpy are already using tricky caching code to > >> reduce the overhead of accessing the ThreadState dict, and this gets > >> even trickier with context-local state which has more cache > >> invalidation points, so if we don't do this in the interpreter then it > >> could actually become a blocker for adoption. OTOH it's easy for the > >> interpreter itself to do this caching, and it makes everyone faster. > > > I agree, but I don't see how making the type a subtype (or duck type) of > > MutableMapping prevents any of those strategies. (Maybe you were equating > > MutableMapping with "subtype of dict"?) > > Question: why do we want EC objects to be mappings? I'd rather make > them opaque, which will result in less code and make it more > future-proof. > I'd rather have them mappings, since that's what they represent. It helps users understand what's going on behind the scenes, just like modules, classes and (most) instances have a `__dict__` that you can look at and (in most cases) manipulate. > The key arguments for keeping ContextVar abstraction: > To be clear, I do want to keep ContextVar! > * Naturally avoids name clashes. > > * Allows to implement efficient caching. This is important if we want > libraries like decimal/numpy to start using it. > > * Abstracts away the actual implementation of the EC. This is a > future-proof solution, with which we can enable EC support for > generators in the future. We already know two possible solutions (PEP > 550 v1, PEP 550 current), and ContextVar is a good enough abstraction > to support both of them. > > IMO ContextVar.set() and ContextVar.get() is a simple and nice API to > work with the EC. Most people (aside framework authors) won't even > need to work with EC objects directly anyways. > Sure. But (unlike you, it seems) I find it important that users can understand their actions in terms of operations on the mapping representing the context. Its type should be a specific class that inherits from `MutableMapping[ContextVar, object]`. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Mon Oct 16 13:20:44 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 16 Oct 2017 19:20:44 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: <20171016185944.1fee1d7a@fsol> References: <20171016170631.375edd56@fsol> <20171016174209.51cbf40c@fsol> <20171016182833.57ab74b5@fsol> <20171016185944.1fee1d7a@fsol> Message-ID: Oh, now I'm confused. I misunderstood your previous message. I understood that you changed you mind and didn't want to add process_time_ns(). Can you elaborate why you consider that time.process_time_ns() is needed, but not the nanosecond flavor of os.times() nor resource.getrusage()? These functions use the same or similar clock, no? 
Depending on platform, time.process_time() may be implemented with resource.getrusage(), os.times() or something else. Victor -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Mon Oct 16 13:24:32 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 16 Oct 2017 19:24:32 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> <20171016174209.51cbf40c@fsol> <20171016182833.57ab74b5@fsol> <20171016185944.1fee1d7a@fsol> Message-ID: <20171016192432.1dd80750@fsol> On Mon, 16 Oct 2017 19:20:44 +0200 Victor Stinner wrote: > Oh, now I'm confused. I misunderstood your previous message. I understood > that you changed you mind and didn't want to add process_time_ns(). > > Can you elaborate why you consider that time.process_time_ns() is needed, > but not the nanosecond flavor of os.times() nor resource.getrusage()? These > functions use the same or similar clock, no? I didn't say they weren't needed, I said that we could restrict ourselves to the time module for the time being if it makes things easier. But if you want to tackle all of them at once, go for it! :-) Regards Antoine. From guido at python.org Mon Oct 16 13:35:17 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Oct 2017 10:35:17 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On Mon, Oct 16, 2017 at 9:53 AM, Yury Selivanov wrote: > I think we can still implement context isolation in generators in > later versions for ContextVars. In 3.7, ContextVars will only support > async tasks and threads. Using them in generators will be > *documented* as unsafe, as the context will "leak out". Fixing > generators in some later version of Python will then be a feature/bug > fix. I expect almost no backwards compatibility issue, same as I > wouldn't expect them if we switched decimal to PEP 550 in 3.7. > Context also leaks into a generator. That's a feature too. Basically a generator does not have its own context; in that respect it's no different from a regular function call. The apparent difference is that it's possible to call next() on a generator object from different contexts (that's always been possible, in today's Python you can do this from multiple threads and there's even protection against re-entering a generator frame that's already active in another thread -- the GIL helps here of course). I expect that any future (post-3.7) changes to how context work in generators will have to support this as the default behavior, and to get other behavior the generator will have to be marked or wrapped somehow. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Mon Oct 16 14:12:39 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 16 Oct 2017 11:12:39 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: <59E4F697.9060003@stoneleaf.us> On 10/16/2017 09:11 AM, Yury Selivanov wrote: > Question: why do we want EC objects to be mappings? I'd rather make > them opaque, which will result in less code and make it more > future-proof. > > The key arguments for keeping ContextVar abstraction: > > * Naturally avoids name clashes. > > * Allows to implement efficient caching. This is important if we want > libraries like decimal/numpy to start using it. 
> > * Abstracts away the actual implementation of the EC. This is a > future-proof solution, with which we can enable EC support for > generators in the future. We already know two possible solutions (PEP > 550 v1, PEP 550 current), and ContextVar is a good enough abstraction > to support both of them. > > IMO ContextVar.set() and ContextVar.get() is a simple and nice API to > work with the EC. Most people (aside framework authors) won't even > need to work with EC objects directly anyways. Framework/library authors are users too. Please don't make the interface unpleasant to use. What would be really nice is to have attribute access like thread locals. Instead of working with individual ContextVars you grab the LocalContext and access the vars as attributes. I don't recall reading in the PEP why this is a bad idea. -- ~Ethan~ From rhinerfeldnyc at gmail.com Mon Oct 16 18:21:28 2017 From: rhinerfeldnyc at gmail.com (Richard Hinerfeld) Date: Mon, 16 Oct 2017 18:21:28 -0400 Subject: [Python-Dev] Compiling Python-3.6.3 fails two tests test_math and test_cmath Message-ID: Compiling Python-3.6.3 on Linux fails two tests: test_math and test_cmatg -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- running build running build_ext The following modules found by detect_modules() in setup.py, have been built by the Makefile instead, as configured by the Setup files: atexit pwd time running build_scripts copying and adjusting /home/richard/Python-3.6.3/Tools/scripts/pydoc3 -> build/scripts-3.6 copying and adjusting /home/richard/Python-3.6.3/Tools/scripts/idle3 -> build/scripts-3.6 copying and adjusting /home/richard/Python-3.6.3/Tools/scripts/2to3 -> build/scripts-3.6 copying and adjusting /home/richard/Python-3.6.3/Tools/scripts/pyvenv -> build/scripts-3.6 changing mode of build/scripts-3.6/pydoc3 from 644 to 755 changing mode of build/scripts-3.6/idle3 from 644 to 755 changing mode of build/scripts-3.6/2to3 from 644 to 755 changing mode of build/scripts-3.6/pyvenv from 644 to 755 renaming build/scripts-3.6/pydoc3 to build/scripts-3.6/pydoc3.6 renaming build/scripts-3.6/idle3 to build/scripts-3.6/idle3.6 renaming build/scripts-3.6/2to3 to build/scripts-3.6/2to3-3.6 renaming build/scripts-3.6/pyvenv to build/scripts-3.6/pyvenv-3.6 ./python ./Tools/scripts/run_tests.py -v test_cmath == CPython 3.6.3 (default, Oct 16 2017, 14:42:21) [GCC 4.7.2] == Linux-3.2.0-4-686-pae-i686-with-debian-7.11 little-endian == cwd: /home/richard/Python-3.6.3/build/test_python_10507 == CPU count: 1 == encodings: locale=UTF-8, FS=utf-8 Using random seed 5661358 Run tests in parallel using 3 child processes 0:00:01 load avg: 0.24 [1/1/1] test_cmath failed testAtanSign (test.test_cmath.CMathTests) ... ok testAtanhSign (test.test_cmath.CMathTests) ... ok testTanhSign (test.test_cmath.CMathTests) ... ok test_abs (test.test_cmath.CMathTests) ... ok test_abs_overflows (test.test_cmath.CMathTests) ... ok test_cmath_matches_math (test.test_cmath.CMathTests) ... ok test_constants (test.test_cmath.CMathTests) ... ok test_infinity_and_nan_constants (test.test_cmath.CMathTests) ... ok test_input_type (test.test_cmath.CMathTests) ... ok test_isfinite (test.test_cmath.CMathTests) ... ok test_isinf (test.test_cmath.CMathTests) ... ok test_isnan (test.test_cmath.CMathTests) ... ok test_phase (test.test_cmath.CMathTests) ... ok test_polar (test.test_cmath.CMathTests) ... ok test_polar_errno (test.test_cmath.CMathTests) ... ok test_rect (test.test_cmath.CMathTests) ... 
ok test_specific_values (test.test_cmath.CMathTests) ... FAIL test_user_object (test.test_cmath.CMathTests) ... ok test_asymmetry (test.test_cmath.IsCloseTests) ... ok test_complex_near_zero (test.test_cmath.IsCloseTests) ... ok test_complex_values (test.test_cmath.IsCloseTests) ... ok test_decimals (test.test_cmath.IsCloseTests) ... ok test_eight_decimal_places (test.test_cmath.IsCloseTests) ... ok test_fractions (test.test_cmath.IsCloseTests) ... ok test_identical (test.test_cmath.IsCloseTests) ... ok test_identical_infinite (test.test_cmath.IsCloseTests) ... ok test_inf_ninf_nan (test.test_cmath.IsCloseTests) ... ok test_integers (test.test_cmath.IsCloseTests) ... ok test_near_zero (test.test_cmath.IsCloseTests) ... ok test_negative_tolerances (test.test_cmath.IsCloseTests) ... ok test_reject_complex_tolerances (test.test_cmath.IsCloseTests) ... ok test_zero_tolerance (test.test_cmath.IsCloseTests) ... ok ====================================================================== FAIL: test_specific_values (test.test_cmath.CMathTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/richard/Python-3.6.3/Lib/test/test_cmath.py", line 418, in test_specific_values msg=error_message) File "/home/richard/Python-3.6.3/Lib/test/test_cmath.py", line 149, in rAssertAlmostEqual '{!r} and {!r} are not sufficiently close'.format(a, b)) AssertionError: tan0064: tan(complex(1.5707963267948961, 0.0)) Expected: complex(1978937966095219.0, 0.0) Received: complex(1978945885716843.0, 0.0) Received value insufficiently close to expected value. ---------------------------------------------------------------------- Ran 32 tests in 0.316s FAILED (failures=1) 1 test failed: test_cmath Re-running failed tests in verbose mode Re-running test 'test_cmath' in verbose mode testAtanSign (test.test_cmath.CMathTests) ... ok testAtanhSign (test.test_cmath.CMathTests) ... ok testTanhSign (test.test_cmath.CMathTests) ... ok test_abs (test.test_cmath.CMathTests) ... ok test_abs_overflows (test.test_cmath.CMathTests) ... ok test_cmath_matches_math (test.test_cmath.CMathTests) ... ok test_constants (test.test_cmath.CMathTests) ... ok test_infinity_and_nan_constants (test.test_cmath.CMathTests) ... ok test_input_type (test.test_cmath.CMathTests) ... ok test_isfinite (test.test_cmath.CMathTests) ... ok test_isinf (test.test_cmath.CMathTests) ... ok test_isnan (test.test_cmath.CMathTests) ... ok test_phase (test.test_cmath.CMathTests) ... ok test_polar (test.test_cmath.CMathTests) ... ok test_polar_errno (test.test_cmath.CMathTests) ... ok test_rect (test.test_cmath.CMathTests) ... ok test_specific_values (test.test_cmath.CMathTests) ... FAIL test_user_object (test.test_cmath.CMathTests) ... ok test_asymmetry (test.test_cmath.IsCloseTests) ... ok test_complex_near_zero (test.test_cmath.IsCloseTests) ... ok test_complex_values (test.test_cmath.IsCloseTests) ... ok test_decimals (test.test_cmath.IsCloseTests) ... ok test_eight_decimal_places (test.test_cmath.IsCloseTests) ... ok test_fractions (test.test_cmath.IsCloseTests) ... ok test_identical (test.test_cmath.IsCloseTests) ... ok test_identical_infinite (test.test_cmath.IsCloseTests) ... ok test_inf_ninf_nan (test.test_cmath.IsCloseTests) ... ok test_integers (test.test_cmath.IsCloseTests) ... ok test_near_zero (test.test_cmath.IsCloseTests) ... ok test_negative_tolerances (test.test_cmath.IsCloseTests) ... ok test_reject_complex_tolerances (test.test_cmath.IsCloseTests) ... 
ok test_zero_tolerance (test.test_cmath.IsCloseTests) ... ok ====================================================================== FAIL: test_specific_values (test.test_cmath.CMathTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/richard/Python-3.6.3/Lib/test/test_cmath.py", line 418, in test_specific_values msg=error_message) File "/home/richard/Python-3.6.3/Lib/test/test_cmath.py", line 149, in rAssertAlmostEqual '{!r} and {!r} are not sufficiently close'.format(a, b)) AssertionError: tan0064: tan(complex(1.5707963267948961, 0.0)) Expected: complex(1978937966095219.0, 0.0) Received: complex(1978945885716843.0, 0.0) Received value insufficiently close to expected value. ---------------------------------------------------------------------- Ran 32 tests in 0.326s FAILED (failures=1) 1 test failed again: test_cmath Total duration: 2 sec Tests result: FAILURE -------------- next part -------------- running build running build_ext The following modules found by detect_modules() in setup.py, have been built by the Makefile instead, as configured by the Setup files: atexit pwd time running build_scripts copying and adjusting /home/richard/Python-3.6.3/Tools/scripts/pydoc3 -> build/scripts-3.6 copying and adjusting /home/richard/Python-3.6.3/Tools/scripts/idle3 -> build/scripts-3.6 copying and adjusting /home/richard/Python-3.6.3/Tools/scripts/2to3 -> build/scripts-3.6 copying and adjusting /home/richard/Python-3.6.3/Tools/scripts/pyvenv -> build/scripts-3.6 changing mode of build/scripts-3.6/pydoc3 from 644 to 755 changing mode of build/scripts-3.6/idle3 from 644 to 755 changing mode of build/scripts-3.6/2to3 from 644 to 755 changing mode of build/scripts-3.6/pyvenv from 644 to 755 renaming build/scripts-3.6/pydoc3 to build/scripts-3.6/pydoc3.6 renaming build/scripts-3.6/idle3 to build/scripts-3.6/idle3.6 renaming build/scripts-3.6/2to3 to build/scripts-3.6/2to3-3.6 renaming build/scripts-3.6/pyvenv to build/scripts-3.6/pyvenv-3.6 ./python ./Tools/scripts/run_tests.py -v test_math == CPython 3.6.3 (default, Oct 16 2017, 14:42:21) [GCC 4.7.2] == Linux-3.2.0-4-686-pae-i686-with-debian-7.11 little-endian == cwd: /home/richard/Python-3.6.3/build/test_python_10483 == CPU count: 1 == encodings: locale=UTF-8, FS=utf-8 Using random seed 2843454 Run tests in parallel using 3 child processes 0:00:04 load avg: 0.10 [1/1/1] test_math failed testAcos (test.test_math.MathTests) ... ok testAcosh (test.test_math.MathTests) ... ok testAsin (test.test_math.MathTests) ... ok testAsinh (test.test_math.MathTests) ... ok testAtan (test.test_math.MathTests) ... ok testAtan2 (test.test_math.MathTests) ... ok testAtanh (test.test_math.MathTests) ... ok testCeil (test.test_math.MathTests) ... ok testConstants (test.test_math.MathTests) ... ok testCopysign (test.test_math.MathTests) ... ok testCos (test.test_math.MathTests) ... ok testCosh (test.test_math.MathTests) ... ok testDegrees (test.test_math.MathTests) ... ok testExp (test.test_math.MathTests) ... ok testFabs (test.test_math.MathTests) ... ok testFactorial (test.test_math.MathTests) ... ok testFactorialHugeInputs (test.test_math.MathTests) ... ok testFloor (test.test_math.MathTests) ... ok testFmod (test.test_math.MathTests) ... ok testFrexp (test.test_math.MathTests) ... ok testFsum (test.test_math.MathTests) ... skipped 'fsum is not exact on machines with double rounding' testGcd (test.test_math.MathTests) ... ok testHypot (test.test_math.MathTests) ... 
ok testIsfinite (test.test_math.MathTests) ... ok testIsinf (test.test_math.MathTests) ... ok testIsnan (test.test_math.MathTests) ... ok testLdexp (test.test_math.MathTests) ... ok testLog (test.test_math.MathTests) ... ok testLog10 (test.test_math.MathTests) ... ok testLog1p (test.test_math.MathTests) ... ok testLog2 (test.test_math.MathTests) ... ok testLog2Exact (test.test_math.MathTests) ... ok testModf (test.test_math.MathTests) ... ok testPow (test.test_math.MathTests) ... ok testRadians (test.test_math.MathTests) ... ok testSin (test.test_math.MathTests) ... ok testSinh (test.test_math.MathTests) ... ok testSqrt (test.test_math.MathTests) ... ok testTan (test.test_math.MathTests) ... ok testTanh (test.test_math.MathTests) ... ok testTanhSign (test.test_math.MathTests) ... ok test_exceptions (test.test_math.MathTests) ... ok test_inf_constant (test.test_math.MathTests) ... ok test_mtestfile (test.test_math.MathTests) ... ok test_nan_constant (test.test_math.MathTests) ... ok test_testfile (test.test_math.MathTests) ... FAIL test_trunc (test.test_math.MathTests) ... ok test_asymmetry (test.test_math.IsCloseTests) ... ok test_decimals (test.test_math.IsCloseTests) ... ok test_eight_decimal_places (test.test_math.IsCloseTests) ... ok test_fractions (test.test_math.IsCloseTests) ... ok test_identical (test.test_math.IsCloseTests) ... ok test_identical_infinite (test.test_math.IsCloseTests) ... ok test_inf_ninf_nan (test.test_math.IsCloseTests) ... ok test_integers (test.test_math.IsCloseTests) ... ok test_near_zero (test.test_math.IsCloseTests) ... ok test_negative_tolerances (test.test_math.IsCloseTests) ... ok test_zero_tolerance (test.test_math.IsCloseTests) ... ok /home/richard/Python-3.6.3/Lib/test/ieee754.txt Doctest: ieee754.txt ... ok ====================================================================== FAIL: test_testfile (test.test_math.MathTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/richard/Python-3.6.3/Lib/test/test_math.py", line 1220, in test_testfile '\n '.join(failures)) AssertionError: Failures in test_testfile: tan0064: tan(1.5707963267948961): expected 1978937966095219.0, got 1978945885716843.0 (error = 7.92e+09 (31678486496 ulps); permitted error = 0 or 5 ulps) ---------------------------------------------------------------------- Ran 59 tests in 3.161s FAILED (failures=1, skipped=1) 1 test failed: test_math Re-running failed tests in verbose mode Re-running test 'test_math' in verbose mode testAcos (test.test_math.MathTests) ... ok testAcosh (test.test_math.MathTests) ... ok testAsin (test.test_math.MathTests) ... ok testAsinh (test.test_math.MathTests) ... ok testAtan (test.test_math.MathTests) ... ok testAtan2 (test.test_math.MathTests) ... ok testAtanh (test.test_math.MathTests) ... ok testCeil (test.test_math.MathTests) ... ok testConstants (test.test_math.MathTests) ... ok testCopysign (test.test_math.MathTests) ... ok testCos (test.test_math.MathTests) ... ok testCosh (test.test_math.MathTests) ... ok testDegrees (test.test_math.MathTests) ... ok testExp (test.test_math.MathTests) ... ok testFabs (test.test_math.MathTests) ... ok testFactorial (test.test_math.MathTests) ... ok testFactorialHugeInputs (test.test_math.MathTests) ... ok testFloor (test.test_math.MathTests) ... ok testFmod (test.test_math.MathTests) ... ok testFrexp (test.test_math.MathTests) ... ok testFsum (test.test_math.MathTests) ... 
skipped 'fsum is not exact on machines with double rounding' testGcd (test.test_math.MathTests) ... ok testHypot (test.test_math.MathTests) ... ok testIsfinite (test.test_math.MathTests) ... ok testIsinf (test.test_math.MathTests) ... ok testIsnan (test.test_math.MathTests) ... ok testLdexp (test.test_math.MathTests) ... ok testLog (test.test_math.MathTests) ... ok testLog10 (test.test_math.MathTests) ... ok testLog1p (test.test_math.MathTests) ... ok testLog2 (test.test_math.MathTests) ... ok testLog2Exact (test.test_math.MathTests) ... ok testModf (test.test_math.MathTests) ... ok testPow (test.test_math.MathTests) ... ok testRadians (test.test_math.MathTests) ... ok testSin (test.test_math.MathTests) ... ok testSinh (test.test_math.MathTests) ... ok testSqrt (test.test_math.MathTests) ... ok testTan (test.test_math.MathTests) ... ok testTanh (test.test_math.MathTests) ... ok testTanhSign (test.test_math.MathTests) ... ok test_exceptions (test.test_math.MathTests) ... ok test_inf_constant (test.test_math.MathTests) ... ok test_mtestfile (test.test_math.MathTests) ... ok test_nan_constant (test.test_math.MathTests) ... ok test_testfile (test.test_math.MathTests) ... FAIL test_trunc (test.test_math.MathTests) ... ok test_asymmetry (test.test_math.IsCloseTests) ... ok test_decimals (test.test_math.IsCloseTests) ... ok test_eight_decimal_places (test.test_math.IsCloseTests) ... ok test_fractions (test.test_math.IsCloseTests) ... ok test_identical (test.test_math.IsCloseTests) ... ok test_identical_infinite (test.test_math.IsCloseTests) ... ok test_inf_ninf_nan (test.test_math.IsCloseTests) ... ok test_integers (test.test_math.IsCloseTests) ... ok test_near_zero (test.test_math.IsCloseTests) ... ok test_negative_tolerances (test.test_math.IsCloseTests) ... ok test_zero_tolerance (test.test_math.IsCloseTests) ... ok /home/richard/Python-3.6.3/Lib/test/ieee754.txt Doctest: ieee754.txt ... ok ====================================================================== FAIL: test_testfile (test.test_math.MathTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/richard/Python-3.6.3/Lib/test/test_math.py", line 1220, in test_testfile '\n '.join(failures)) AssertionError: Failures in test_testfile: tan0064: tan(1.5707963267948961): expected 1978937966095219.0, got 1978945885716843.0 (error = 7.92e+09 (31678486496 ulps); permitted error = 0 or 5 ulps) ---------------------------------------------------------------------- Ran 59 tests in 3.183s FAILED (failures=1, skipped=1) 1 test failed again: test_math Total duration: 7 sec Tests result: FAILURE From tim.peters at gmail.com Mon Oct 16 18:27:04 2017 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 16 Oct 2017 17:27:04 -0500 Subject: [Python-Dev] Compiling Python-3.6.3 fails two tests test_math and test_cmath In-Reply-To: References: Message-ID: [Richard Hinerfeld ] > Compiling Python-3.6.3 on Linux fails two tests: test_math and test_cmatg Precisely which version of Linux? 
The same failure has already been reported on OpenBSD here: https://bugs.python.org/issue31630 From njs at pobox.com Mon Oct 16 20:29:19 2017 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 16 Oct 2017 17:29:19 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: <59E4F697.9060003@stoneleaf.us> References: <59E4F697.9060003@stoneleaf.us> Message-ID: On Mon, Oct 16, 2017 at 11:12 AM, Ethan Furman wrote: > What would be really nice is to have attribute access like thread locals. > Instead of working with individual ContextVars you grab the LocalContext and > access the vars as attributes. I don't recall reading in the PEP why this > is a bad idea. You're mixing up levels -- the way threading.local objects work is that there's one big dict that's hidden inside the interpreter (in the ThreadState), and it holds a separate little dict for each threading.local. The dict holding ContextVars is similar to the big dict; a threading.local itself is like a ContextVar that holds a dict. (And the reason it's this way is that it's easy to build either version on top of the other, and we did some survey of threading.local usage and the ContextVar style usage was simpler in the majority of cases.) For threading.local there's no way to get at the big dict at all from Python; it's hidden inside the C APIs and threading internals. I'm guessing you've never missed this :-). For ContextVars we can't hide it that much, because async frameworks need to be able to swap the current dict when switching tasks and clone it when starting a new task, but those are the only absolutely necessary operations. -n -- Nathaniel J. Smith -- https://vorpus.org From njs at pobox.com Mon Oct 16 20:57:06 2017 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 16 Oct 2017 17:57:06 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On Mon, Oct 16, 2017 at 8:49 AM, Guido van Rossum wrote: > On Sun, Oct 15, 2017 at 10:26 PM, Nathaniel Smith wrote: >> >> On Sun, Oct 15, 2017 at 10:10 PM, Guido van Rossum >> wrote: >> > Yes, that's what I meant by "ignoring generators". And I'd like there to >> > be >> > a "current context" that's a per-thread MutableMapping with ContextVar >> > keys. >> > Maybe there's not much more to it apart from naming the APIs for getting >> > and >> > setting it? To be clear, I am fine with this being a specific subtype of >> > MutableMapping. But I don't see much benefit in making it more abstract >> > than >> > that. >> >> We don't need it to be abstract (it's fine to have a single concrete >> mapping type that we always use internally), but I think we do want it >> to be opaque (instead of exposing the MutableMapping interface, the >> only way to get/set specific values should be through the ContextVar >> interface). The advantages are: >> >> - This allows C level caching of values in ContextVar objects (in >> particular, funneling mutations through a limited API makes cache >> invalidation *much* easier) > > > Well the MutableMapping could still be a proxy or something that invalidates > the cache when mutated. That's why I said it should be a single concrete > mapping type. (It also doesn't have to derive from MutableMapping -- it's > sufficient for it to be a duck type for one, or perhaps some Python-level > code could `register()` it. MutableMapping is just a really complicated interface -- you have to deal with iterator invalidation and popitem and implementing view classes and all that. 
It seems like a lot of code for a feature that no-one seems to worry about missing right now. (In fact, I suspect the extra code required to implement the full MutableMapping interface on top of a basic HAMT type is larger than the extra code to implement the current PEP 550 draft's chaining semantics on top of this proposal for a minimal PEP 550.) What do you think of something like: class Context: def __init__(self, /, init: MutableMapping[ContextVar,object] = {}): ... def as_dict(self) -> Dict[ContextVar, object]: "Returns a snapshot of the internal state." def copy(self) -> Context: "Equivalent to (but maybe faster than) Context(self.as_dict())." I like the idea of making it possible to set up arbitrary Contexts and introspect them, because sometimes you do need to debug weird issues or do some wacky stuff deep in the guts of a coroutine scheduler, but this would give us that without implementing MutableMapping's 17 methods and 7 helper classes. -n -- Nathaniel J. Smith -- https://vorpus.org From ethan at stoneleaf.us Mon Oct 16 22:29:44 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 16 Oct 2017 19:29:44 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: <59E4F697.9060003@stoneleaf.us> Message-ID: <59E56B18.1020900@stoneleaf.us> On 10/16/2017 05:29 PM, Nathaniel Smith wrote: > On Mon, Oct 16, 2017 at 11:12 AM, Ethan Furman wrote: >> What would be really nice is to have attribute access like thread locals. >> Instead of working with individual ContextVars you grab the LocalContext and >> access the vars as attributes. I don't recall reading in the PEP why this >> is a bad idea. > > You're mixing up levels -- the way threading.local objects work is > that there's one big dict that's hidden inside the interpreter (in the > ThreadState), and it holds a separate little dict for each > threading.local. The dict holding ContextVars is similar to the big > dict; a threading.local itself is like a ContextVar that holds a dict. > (And the reason it's this way is that it's easy to build either > version on top of the other, and we did some survey of threading.local > usage and the ContextVar style usage was simpler in the majority of > cases.) > > For threading.local there's no way to get at the big dict at all from > Python; it's hidden inside the C APIs and threading internals. I'm > guessing you've never missed this :-). For ContextVars we can't hide > it that much, because async frameworks need to be able to swap the > current dict when switching tasks and clone it when starting a new > task, but those are the only absolutely necessary operations. Ah, thank you. -- ~Ethan~ From guido at python.org Mon Oct 16 23:40:29 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Oct 2017 20:40:29 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: Hm. I really like the idea that you can implement and demonstrate all of ContextVar by manipulating the underlying mapping. And surely compared to the effort of implementing the HAMT itself (including all its edge cases) surely implementing the mutable mapping API should be considered recreational programming. On Mon, Oct 16, 2017 at 5:57 PM, Nathaniel Smith wrote: > On Mon, Oct 16, 2017 at 8:49 AM, Guido van Rossum > wrote: > > On Sun, Oct 15, 2017 at 10:26 PM, Nathaniel Smith wrote: > >> > >> On Sun, Oct 15, 2017 at 10:10 PM, Guido van Rossum > >> wrote: > >> > Yes, that's what I meant by "ignoring generators". 
And I'd like there > to > >> > be > >> > a "current context" that's a per-thread MutableMapping with ContextVar > >> > keys. > >> > Maybe there's not much more to it apart from naming the APIs for > getting > >> > and > >> > setting it? To be clear, I am fine with this being a specific subtype > of > >> > MutableMapping. But I don't see much benefit in making it more > abstract > >> > than > >> > that. > >> > >> We don't need it to be abstract (it's fine to have a single concrete > >> mapping type that we always use internally), but I think we do want it > >> to be opaque (instead of exposing the MutableMapping interface, the > >> only way to get/set specific values should be through the ContextVar > >> interface). The advantages are: > >> > >> - This allows C level caching of values in ContextVar objects (in > >> particular, funneling mutations through a limited API makes cache > >> invalidation *much* easier) > > > > > > Well the MutableMapping could still be a proxy or something that > invalidates > > the cache when mutated. That's why I said it should be a single concrete > > mapping type. (It also doesn't have to derive from MutableMapping -- it's > > sufficient for it to be a duck type for one, or perhaps some Python-level > > code could `register()` it. > > MutableMapping is just a really complicated interface -- you have to > deal with iterator invalidation and popitem and implementing view > classes and all that. It seems like a lot of code for a feature that > no-one seems to worry about missing right now. (In fact, I suspect the > extra code required to implement the full MutableMapping interface on > top of a basic HAMT type is larger than the extra code to implement > the current PEP 550 draft's chaining semantics on top of this proposal > for a minimal PEP 550.) > > What do you think of something like: > > class Context: > def __init__(self, /, init: MutableMapping[ContextVar,object] = {}): > ... > > def as_dict(self) -> Dict[ContextVar, object]: > "Returns a snapshot of the internal state." > > def copy(self) -> Context: > "Equivalent to (but maybe faster than) Context(self.as_dict())." > > I like the idea of making it possible to set up arbitrary Contexts and > introspect them, because sometimes you do need to debug weird issues > or do some wacky stuff deep in the guts of a coroutine scheduler, but > this would give us that without implementing MutableMapping's 17 > methods and 7 helper classes. > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Oct 17 00:09:27 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 17 Oct 2017 14:09:27 +1000 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On 17 October 2017 at 03:00, Guido van Rossum wrote: > On Mon, Oct 16, 2017 at 9:11 AM, Yury Selivanov > wrote: > >> > I agree, but I don't see how making the type a subtype (or duck type) of >> > MutableMapping prevents any of those strategies. (Maybe you were >> equating >> > MutableMapping with "subtype of dict"?) >> >> Question: why do we want EC objects to be mappings? I'd rather make >> them opaque, which will result in less code and make it more >> future-proof. >> > > I'd rather have them mappings, since that's what they represent. 
It helps > users understand what's going on behind the scenes, just like modules, > classes and (most) instances have a `__dict__` that you can look at and (in > most cases) manipulate. > Perhaps rather than requiring that EC's *be* mappings, we could instead require that they expose a mapping API as their __dict__ attribute, similar to the way class dictionaries work? Then the latter could return a proxy that translated mapping operations into the appropriate method calls on the ContextVar being used as the key. Something like: class ExecutionContextProxy: def __init__(self, ec): self._ec = ec # Omitted from the methods below: checking if this EC is the # active EC, and implicitly switching to it if it isn't (for read ops) # or complaining (for write ops) # Individual operations call methods on the key itself def __getitem__(self, key): return key.get() def __setitem__(self, key, value): if not isinstance(key, ContextVar): raise TypeError("Execution context keys must be context variables") key.set(value) def __delitem__(self, key): key.delete() # The key set would be the context vars assigned in the active context def __contains__(self, key): # Note: PEP 550 currently calls the below method ec.vars(), # but I just realised that's confusing, given that the vars() builtin # returns a mapping return key in self._ec.assigned_vars() def __iter__(self): return iter(self._ec.assigned_vars()) def keys(self): return self._ec.assigned_vars() # These are the simple iterator versions of values() and items() # but they could be enhanced to return dynamic views instead def values(self): for k in self._ec.assigned_vars(): yield k.get() def items(self): for k in self._ec.assigned_vars(): yield (k, k.get()) The nice thing about defining the mapping API as a wrapper around otherwise opaque interpreter internals is that it makes it clearer which operations are expected to matter for runtime performance (i.e. the ones handled by the ExecutionContext itself), and which are mainly being provided as intuition pumps for humans attempting to understand how execution contexts actually work (whether for debugging purposes, or simply out of curiosity) If there's a part of the mapping proxy API where we don't have a strong intuition about how it should work, then instead of attempting to guess suitable semantics, we can instead define it as raising RuntimeError for now, and then wait and see if the appropriate semantics become clearer over time. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Oct 17 00:31:35 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Oct 2017 21:31:35 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: No, that version just defers to magic in ContextVar.get/set, whereas what I'd like to see is that the latter are just implemented in terms of manipulating the mapping directly. The only operations for which speed matters would be __getitem__ and __setitem__; most other methods just defer to those. __delitem__ must also be a primitive, as must __iter__ and __len__ -- but those don't need to be as speedy (however __delitem__ must really work!). 
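To make the shape of that suggestion concrete, here is a minimal illustrative sketch (not taken from PEP 550 or from any of the posted implementations; get_execution_context() and the class names are assumed purely for the example). The five primitives are implemented directly, and everything else falls out of the MutableMapping mixins:

    from collections.abc import MutableMapping

    class Context(MutableMapping):
        """Mapping-backed execution context (illustration only)."""

        def __init__(self, data=None):
            self._data = dict(data or {})   # ContextVar -> value

        # The primitives: __getitem__/__setitem__ need to be fast,
        # __delitem__ must really work, __iter__/__len__ round it out.
        def __getitem__(self, var):
            return self._data[var]

        def __setitem__(self, var, value):
            self._data[var] = value

        def __delitem__(self, var):
            del self._data[var]

        def __iter__(self):
            return iter(self._data)

        def __len__(self):
            return len(self._data)
        # get(), pop(), update(), keys(), items(), values(), ... all come
        # for free from the MutableMapping mixin methods.

    _current_context = Context()

    def get_execution_context():
        # Placeholder: a real implementation would return the active
        # per-thread (or per-task) context, not a module-level global.
        return _current_context

    class ContextVar:
        """Convenience wrapper that always talks to the current context."""

        def __init__(self, name, *, default=None):
            self.name = name
            self.default = default

        def get(self):
            return get_execution_context().get(self, self.default)

        def set(self, value):
            get_execution_context()[self] = value

        def delete(self):
            del get_execution_context()[self]

With that layering, var.set(value) is just sugar for ctx[var] = value, and the mapping itself stays introspectable for debugging.
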
On Mon, Oct 16, 2017 at 9:09 PM, Nick Coghlan wrote: > On 17 October 2017 at 03:00, Guido van Rossum wrote: > >> On Mon, Oct 16, 2017 at 9:11 AM, Yury Selivanov >> wrote: >> >>> > I agree, but I don't see how making the type a subtype (or duck type) >>> of >>> > MutableMapping prevents any of those strategies. (Maybe you were >>> equating >>> > MutableMapping with "subtype of dict"?) >>> >>> Question: why do we want EC objects to be mappings? I'd rather make >>> them opaque, which will result in less code and make it more >>> future-proof. >>> >> >> I'd rather have them mappings, since that's what they represent. It helps >> users understand what's going on behind the scenes, just like modules, >> classes and (most) instances have a `__dict__` that you can look at and (in >> most cases) manipulate. >> > > Perhaps rather than requiring that EC's *be* mappings, we could instead > require that they expose a mapping API as their __dict__ attribute, similar > to the way class dictionaries work? > > Then the latter could return a proxy that translated mapping operations > into the appropriate method calls on the ContextVar being used as the key. > > Something like: > > class ExecutionContextProxy: > def __init__(self, ec): > self._ec = ec > # Omitted from the methods below: checking if this EC is the > # active EC, and implicitly switching to it if it isn't (for > read ops) > # or complaining (for write ops) > > # Individual operations call methods on the key itself > def __getitem__(self, key): > return key.get() > def __setitem__(self, key, value): > if not isinstance(key, ContextVar): > raise TypeError("Execution context keys must be context > variables") > key.set(value) > def __delitem__(self, key): > key.delete() > > # The key set would be the context vars assigned in the active > context > def __contains__(self, key): > # Note: PEP 550 currently calls the below method ec.vars(), > # but I just realised that's confusing, given that the vars() > builtin > # returns a mapping > return key in self._ec.assigned_vars() > def __iter__(self): > return iter(self._ec.assigned_vars()) > def keys(self): > return self._ec.assigned_vars() > > # These are the simple iterator versions of values() and items() > # but they could be enhanced to return dynamic views instead > def values(self): > for k in self._ec.assigned_vars(): > yield k.get() > def items(self): > for k in self._ec.assigned_vars(): > yield (k, k.get()) > > The nice thing about defining the mapping API as a wrapper around > otherwise opaque interpreter internals is that it makes it clearer which > operations are expected to matter for runtime performance (i.e. the ones > handled by the ExecutionContext itself), and which are mainly being > provided as intuition pumps for humans attempting to understand how > execution contexts actually work (whether for debugging purposes, or simply > out of curiosity) > > If there's a part of the mapping proxy API where we don't have a strong > intuition about how it should work, then instead of attempting to guess > suitable semantics, we can instead define it as raising RuntimeError for > now, and then wait and see if the appropriate semantics become clearer over > time. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Tue Oct 17 01:02:52 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 17 Oct 2017 15:02:52 +1000 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On 17 October 2017 at 14:31, Guido van Rossum wrote: > No, that version just defers to magic in ContextVar.get/set, whereas what > I'd like to see is that the latter are just implemented in terms of > manipulating the mapping directly. The only operations for which speed > matters would be __getitem__ and __setitem__; most other methods just defer > to those. __delitem__ must also be a primitive, as must __iter__ and > __len__ -- but those don't need to be as speedy (however __delitem__ must > really work!). > To have the mapping API at the base of the design, we'd want to go back to using the ContextKey version of the API as the core primitive (to ensure we don't get name conflicts between different modules and packages), and then have ContextVar be a convenience wrapper that always accesses the currently active context: class ContextKey: ... class ExecutionContext: ... class ContextVar: def __init__(self, name): self._key = ContextKey(name) def get(self): return get_execution_context()[self._key] def set(self, value): get_execution_context()[self._key] = value def delete(self, value): del get_execution_context()[self._key] While I'd defer to Yury on the technical feasibility, I'd expect that version could probably be made to work *if* you were amenable to some of the mapping methods on the execution context raising RuntimeError in order to avoid locking ourselves in to particular design decisions before we're ready to make them. The reason I say that is because one of the biggest future-proofing concerns when it comes to exposing a mapping as the lowest API layer is that it makes the following code pattern possible: ec = get_execution_context() # Change to a different execution context ec[key] = new_value The appropriate semantics for that case (modifying a context that isn't the currently active one) are *really* unclear, which is why PEP 550 structures the API to prevent it (context variables can only manipulate the active context, not arbitrary contexts). However, even with a mapping at the lowest layer, a similar API constraint could still be introduced via a runtime guard in the mutation methods: if get_execution_context() is not self: raise RuntimeError("Cannot modify an inactive execution context") That way, to actually mutate a different context, you'd still have to switch contexts, just as you have to switch threads in C if you want to modify another thread's thread specific storage. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Oct 17 01:12:46 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 17 Oct 2017 15:12:46 +1000 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On 17 October 2017 at 15:02, Nick Coghlan wrote: > On 17 October 2017 at 14:31, Guido van Rossum wrote: > >> No, that version just defers to magic in ContextVar.get/set, whereas what >> I'd like to see is that the latter are just implemented in terms of >> manipulating the mapping directly. The only operations for which speed >> matters would be __getitem__ and __setitem__; most other methods just defer >> to those. 
__delitem__ must also be a primitive, as must __iter__ and >> __len__ -- but those don't need to be as speedy (however __delitem__ must >> really work!). >> > > To have the mapping API at the base of the design, we'd want to go back to > using the ContextKey version of the API as the core primitive (to ensure we > don't get name conflicts between different modules and packages), and then > have ContextVar be a convenience wrapper that always accesses the currently > active context: > > class ContextKey: > ... > class ExecutionContext: > ... > > class ContextVar: > def __init__(self, name): > self._key = ContextKey(name) > > def get(self): > return get_execution_context()[self._key] > > def set(self, value): > get_execution_context()[self._key] = value > > def delete(self, value): > del get_execution_context()[self._key] > Tangent: if we do go this way, it actually maps pretty nicely to the idea of a "threading.ThreadVar" API that wraps threading.local(): class ThreadVar: def __init__(self, name): self._name = name self._storage = threading.local() def get(self): return self._storage.value def set(self, value): self._storage.value = value def delete(self): del self._storage.value (Note: real implementations of either idea would need to pay more attention to producing clear exception messages and instance representations) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Tue Oct 17 06:16:01 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 17 Oct 2017 12:16:01 +0200 Subject: [Python-Dev] Two new environment variables to debug Python 2.7 Message-ID: Hi, FYI I just merged two pull requests into Python 2.7, each add a new environment variable changing the behaviour in debug mode: bpo-31733: don't dump "[xxx refs]" into stderr by default anymore Set PYTHONSHOWREFCOUNT=1 commit 3c082a7fdb472f02bcac7a7f8fe1e3a34a11b70b bpo-31692, bpo-19527: don't dump allocations counts by default anymore Set PYTHONSHOWALLOCCOUNT=1 commit 7b4ba62e388474e811268322b47f80d464933541 I never ever used "[xxx refs]" to detect a reference leak. I only use "./python -m test -R 3:3 test_xxx" to detect reference leaks. To be honest, usually I only run such test explicitly when one of our "Refleaks" buildbot starts to comlain :-) Victor From victor.stinner at gmail.com Tue Oct 17 09:11:12 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 17 Oct 2017 15:11:12 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: Message-ID: > Since the ``time.clock()`` function was deprecated in Python 3.3, no > ``time.clock_ns()`` is added. FYI I just proposed a change to *remove* time.clock() from Python 3.7: https://bugs.python.org/issue31803 This change is not required by, nor directly related to, the PEP 564. Victor From guido at python.org Tue Oct 17 11:51:38 2017 From: guido at python.org (Guido van Rossum) Date: Tue, 17 Oct 2017 08:51:38 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On Mon, Oct 16, 2017 at 10:02 PM, Nick Coghlan wrote: > On 17 October 2017 at 14:31, Guido van Rossum wrote: > >> No, that version just defers to magic in ContextVar.get/set, whereas what >> I'd like to see is that the latter are just implemented in terms of >> manipulating the mapping directly. 
The only operations for which speed >> matters would be __getitem__ and __setitem__; most other methods just defer >> to those. __delitem__ must also be a primitive, as must __iter__ and >> __len__ -- but those don't need to be as speedy (however __delitem__ must >> really work!). >> > > To have the mapping API at the base of the design, we'd want to go back to > using the ContextKey version of the API as the core primitive (to ensure we > don't get name conflicts between different modules and packages), and then > have ContextVar be a convenience wrapper that always accesses the currently > active context: > > class ContextKey: > ... > class ExecutionContext: > ... > > class ContextVar: > def __init__(self, name): > self._key = ContextKey(name) > > def get(self): > return get_execution_context()[self._key] > > def set(self, value): > get_execution_context()[self._key] = value > > def delete(self, value): > del get_execution_context()[self._key] > Why would we need this extra layer? I would assume that the key can just be the ContextVar object itself, e.g. return get_execution_context()[self] or get_execution_context()[self] = value > While I'd defer to Yury on the technical feasibility, I'd expect that > version could probably be made to work *if* you were amenable to some of > the mapping methods on the execution context raising RuntimeError in order > to avoid locking ourselves in to particular design decisions before we're > ready to make them. > > The reason I say that is because one of the biggest future-proofing > concerns when it comes to exposing a mapping as the lowest API layer is > that it makes the following code pattern possible: > > ec = get_execution_context() > # Change to a different execution context > ec[key] = new_value > > The appropriate semantics for that case (modifying a context that isn't > the currently active one) are *really* unclear, which is why PEP 550 > structures the API to prevent it (context variables can only manipulate the > active context, not arbitrary contexts). > But why on earth would you want to prevent that? If there's some caching involved that I have overlooked (another problem with the complexity of the design, or perhaps the explanation), couldn't mutating the non-context simply set a dirty bit to ensure that if it ever gets made the current context again the cache must be considered invalidated? > However, even with a mapping at the lowest layer, a similar API constraint > could still be introduced via a runtime guard in the mutation methods: > > if get_execution_context() is not self: > raise RuntimeError("Cannot modify an inactive execution context") > > That way, to actually mutate a different context, you'd still have to > switch contexts, just as you have to switch threads in C if you want to > modify another thread's thread specific storage. > But that sounds really perverse. If anything, modifying an EC that's not any thread's current context should be *simpler* than modifying the current context. (I'm okay with a prohibition on modifying another *thread's* current context.) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From yselivanov.ml at gmail.com Tue Oct 17 11:54:14 2017
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Tue, 17 Oct 2017 11:54:14 -0400
Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion
In-Reply-To:
References:
Message-ID:

On Tue, Oct 17, 2017 at 1:02 AM, Nick Coghlan wrote:
> On 17 October 2017 at 14:31, Guido van Rossum wrote:
>>
>> No, that version just defers to magic in ContextVar.get/set, whereas what
>> I'd like to see is that the latter are just implemented in terms of
>> manipulating the mapping directly. The only operations for which speed
>> matters would be __getitem__ and __setitem__; most other methods just defer
>> to those. __delitem__ must also be a primitive, as must __iter__ and __len__
>> -- but those don't need to be as speedy (however __delitem__ must really
>> work!).
>
>
> To have the mapping API at the base of the design, we'd want to go back to
> using the ContextKey version of the API as the core primitive (to ensure we
> don't get name conflicts between different modules and packages), and then
> have ContextVar be a convenience wrapper that always accesses the currently
> active context:
>
>     class ContextKey:
>         ...
>     class ExecutionContext:
>         ...
>
>     class ContextVar:
>         def __init__(self, name):
>             self._key = ContextKey(name)
>
>         def get(self):
>             return get_execution_context()[self._key]
>
>         def set(self, value):
>             get_execution_context()[self._key] = value
>
>         def delete(self, value):
>             del get_execution_context()[self._key]

ContextVar itself will be hashable, we don't need ContextKeys.

>
> While I'd defer to Yury on the technical feasibility, I'd expect that
> version could probably be made to work *if* you were amenable to some of the
> mapping methods on the execution context raising RuntimeError in order to
> avoid locking ourselves in to particular design decisions before we're ready
> to make them.
>
> The reason I say that is because one of the biggest future-proofing concerns
> when it comes to exposing a mapping as the lowest API layer is that it makes
> the following code pattern possible:
>
> ec = get_execution_context()
> # Change to a different execution context
> ec[key] = new_value

I *really* don't want to make ECs behave like 'locals()'. That will
make everything way more complicated.

My way of thinking about this: "get_execution_context()" returns you a
shallow copy of the current EC (at least conceptually). So making any
modifications on it won't affect the current environment. The only
way to actually apply the modified EC object to the environment will
be its 'run(callable)' method.

Yury

From nas at arctrix.com Tue Oct 17 13:39:36 2017
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 17 Oct 2017 17:39:36 +0000 (UTC)
Subject: [Python-Dev] Python startup optimization: script vs. service
References: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org>
 <2418C6BB-2AB0-4407-ADE6-ED795DD0B965@gmail.com>
 <213d0382-dfdd-dfc5-2e1e-3b408acac256@python.org>
Message-ID:

Christian Heimes wrote:
> That approach could work, but I think that it is the wrong
> approach. I'd rather keep Python optimized for long-running
> processes and introduce a new mode / option to optimize for
> short-running scripts.

Another idea is to run a fake transaction through the process
before forking. That will "warm up" things so that most of the lazy
init is already done.

After returning from the core sprint, I have gotten over my initial
enthusiasm for my "lazy module defs" idea.
It is just too big of a change for Python to accept at this point.
I still hope there would be a way to make LOAD_NAME/LOAD_GLOBAL
trigger something like __getattr__(). That would allow libraries that
want to aggressively do lazy-init to do so in the clean way.

The main reason that Python startup is slow is that we do far too
much work on module import (e.g. initializing data structures that
never get used). Reducing that work will almost necessarily impact
pre-fork model programs (e.g. they expect the init to be done before
the fork).

As someone who uses that model heavily, I would still be okay with
the "lazification" as I think there are many more programs that
would be helped vs the ones hurt. Initializing everything that your
program might possibly need right at startup time doesn't seem
like a goal to strive for. I can understand if you have a different
opinion though.

A third approach would be to do more init work at compile time.
E.g. for re.compile, if the compiled result could be stored in the
.pyc, that would eliminate a lot of time for short scripts and for
long-running programs. Some Lisp systems have "compiler macros".
They are basically a hook to allow programs to do some work before
the code is sent to the compiler. If something like that existed in
Python, it could be used by re.compile to generate a compiled
representation of the regex to store in the .pyc file. That kind of
behavior is pretty different than the "there is only runtime" model
that Python generally tries to follow.

Spit-ball idea, thought up just now:

    PAT = __compiled__(re.compile(...))

The expression in __compiled__(..) would be evaluated by the
compiler and the resulting value would become the value to store in
the .pyc. If you are running the code as the script, __compiled__
just returns its argument unchanged.

Cheers,

Neil

From nas at arctrix.com Tue Oct 17 13:42:44 2017
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 17 Oct 2017 17:42:44 +0000 (UTC)
Subject: [Python-Dev] Reorganize Python categories (Core, Library, ...)?
References: <20171004143632.668fbc0a@fsol>
 <20171004175509.78a46799@fsol>
Message-ID:

Antoine Pitrou wrote:
> There is no definite "correct category" when you're mixing different
> classification schemes (what kind of bug it is --
> bug/security/enhancement/etc. --, what functional domain it pertains
> to -- networking/concurrency/etc. --, which stdlib API it affects).

I think there should be a set of tags rather than a single category.
In the blurb entry, you could apply all the tags that are relevant.

From guido at python.org Tue Oct 17 14:25:37 2017
From: guido at python.org (Guido van Rossum)
Date: Tue, 17 Oct 2017 11:25:37 -0700
Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion
In-Reply-To:
References:
Message-ID:

On Tue, Oct 17, 2017 at 8:54 AM, Yury Selivanov wrote:

> On Tue, Oct 17, 2017 at 1:02 AM, Nick Coghlan wrote:
> > The reason I say that is because one of the biggest future-proofing
> concerns
> > when it comes to exposing a mapping as the lowest API layer is that it
> makes
> > the following code pattern possible:
> >
> > ec = get_execution_context()
> > # Change to a different execution context
> > ec[key] = new_value
>
> I *really* don't want to make ECs behave like 'locals()'. That will
> make everything way more complicated.
>

At least some of the problems with locals() have more to do with the
legacy of that function than with inherent difficulties.
And local variables might be optimized by a JIT in a way that context vars never will be (or at least if we ever get to that point we will be able to redesign the API first). > My way of thinking about this: "get_execution_context()" returns you a > shallow copy of the current EC (at least conceptually). So making any > modifications on it won't affect the current environment. The only > way to actually apply the modified EC object to the environment will > be its 'run(callable)' method. > I understand that you don't want to throw away the implementation work you've already done. But I find that the abstractions you've introduced are getting in the way of helping people understand what they can do with context variables, and I really want to go back to a model that is *much* closer to understanding how instance variables are just self.__dict__. (Even though there are possible complications due to __slots__ and @property.) In short, I really don't think there's a need for context variables to be faster than instance variables. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Oct 17 15:25:33 2017 From: guido at python.org (Guido van Rossum) Date: Tue, 17 Oct 2017 12:25:33 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: Also, IMO this is all the interface we should need to explain to users (even framework authors): https://github.com/gvanrossum/pep550/blob/master/simpler.py On Tue, Oct 17, 2017 at 11:25 AM, Guido van Rossum wrote: > On Tue, Oct 17, 2017 at 8:54 AM, Yury Selivanov > wrote: > >> On Tue, Oct 17, 2017 at 1:02 AM, Nick Coghlan wrote: >> > The reason I say that is because one of the biggest future-proofing >> concerns >> > when it comes to exposing a mapping as the lowest API layer is that it >> makes >> > the following code pattern possible: >> > >> > ec = get_execution_context() >> > # Change to a different execution context >> > ec[key] = new_value >> >> I *really* don't want to make ECs behave like 'locals()'. That will >> make everything way more complicated. >> > > At least some of the problems with locals() have more to do with the > legacy of that function than with inherent difficulties. And local > variables might be optimized by a JIT in a way that context vars never will > be (or at least if we ever get to that point we will be able to redesign > the API first). > > >> My way of thinking about this: "get_execution_context()" returns you a >> shallow copy of the current EC (at least conceptually). So making any >> modifications on it won't affect the current environment. The only >> way to actually apply the modified EC object to the environment will >> be its 'run(callable)' method. >> > > I understand that you don't want to throw away the implementation work > you've already done. But I find that the abstractions you've introduced are > getting in the way of helping people understand what they can do with > context variables, and I really want to go back to a model that is *much* > closer to understanding how instance variables are just self.__dict__. > (Even though there are possible complications due to __slots__ and > @property.) > > In short, I really don't think there's a need for context variables to be > faster than instance variables. 
> > -- > --Guido van Rossum (python.org/~guido) > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From nad at python.org Tue Oct 17 15:35:57 2017 From: nad at python.org (Ned Deily) Date: Tue, 17 Oct 2017 15:35:57 -0400 Subject: [Python-Dev] [RELEASE] Python 3.7.0a2 is now available for testing Message-ID: <77DF3A3A-DC16-46A3-A933-B0CDA771EAB9@python.org> Python 3.7.0a2 is the second of four planned alpha previews of Python 3.7, the next feature release of Python. During the alpha phase, Python 3.7 remains under heavy development: additional features will be added and existing features may be modified or deleted. Please keep in mind that this is a preview release and its use is not recommended for production environments. The next preview, 3.7.0a3, is planned for 2017-11-27. You can find Python 3.7.0a2 and more information here: https://www.python.org/downloads/release/python-370a2/ -- Ned Deily nad at python.org -- [] From njs at pobox.com Tue Oct 17 15:51:24 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 17 Oct 2017 12:51:24 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On Oct 17, 2017 11:25 AM, "Guido van Rossum" wrote: In short, I really don't think there's a need for context variables to be faster than instance variables. There really is: currently the cost of looking up a thread local through the C API is a dict lookup, which is faster than instance variable lookup, and decimal and numpy have both found that that's already too expensive. Or maybe you're just talking about the speed when the cache misses, in which case never mind :-). -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov.ml at gmail.com Tue Oct 17 15:55:38 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Tue, 17 Oct 2017 15:55:38 -0400 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On Tue, Oct 17, 2017 at 2:25 PM, Guido van Rossum wrote: > On Tue, Oct 17, 2017 at 8:54 AM, Yury Selivanov [..] >> My way of thinking about this: "get_execution_context()" returns you a >> shallow copy of the current EC (at least conceptually). So making any >> modifications on it won't affect the current environment. The only >> way to actually apply the modified EC object to the environment will >> be its 'run(callable)' method. > > > I understand that you don't want to throw away the implementation work > you've already done. But I find that the abstractions you've introduced are > getting in the way of helping people understand what they can do with > context variables, and I really want to go back to a model that is *much* > closer to understanding how instance variables are just self.__dict__. (Even > though there are possible complications due to __slots__ and @property.) I don't really care about the implementation work that has already been done, it's OK if I write it from scratch again. I actually like what you did in https://github.com/gvanrossum/pep550/blob/master/simpler.py, it seems reasonable. The only thing that I'd change is to remove "set_ctx" from the public API and add "Context.run(callable)". This makes the API more flexible to potential future changes and amendments. > In short, I really don't think there's a need for context variables to be > faster than instance variables. 
Well, even our current idea about the API, doesn't really prohibit us from adding a cache to ContextVar.get(). That would be an implementation detail, right? Same as our latest optimization for creating bound methods (CALL_METHOD & LOAD_METHOD opcodes), which avoids creating bound method instances when it's OK not to. Yury From victor.stinner at gmail.com Tue Oct 17 16:10:42 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 17 Oct 2017 22:10:42 +0200 Subject: [Python-Dev] PEP 510 (function specialization) rejected Message-ID: Hi, I rejected my own PEP 510 "Specialize functions with guards" that I wrote in January 2016: https://github.com/python/peps/commit/c99fb8bf5b5c16c170e1603a1c66a74e93a4ae84 "This PEP was rejected by its author since the design didn't show any significant speedup, but also because of the lack of time to implement the most advanced and complex optimizations." I stopped working on my FAT Python project almost one year ago: https://faster-cpython.readthedocs.io/fat_python.html Victor From guido at python.org Tue Oct 17 16:16:19 2017 From: guido at python.org (Guido van Rossum) Date: Tue, 17 Oct 2017 13:16:19 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On Tue, Oct 17, 2017 at 12:51 PM, Nathaniel Smith wrote: > On Oct 17, 2017 11:25 AM, "Guido van Rossum" wrote: > > > In short, I really don't think there's a need for context variables to be > faster than instance variables. > > > There really is: currently the cost of looking up a thread local through > the C API is a dict lookup, which is faster than instance variable lookup, > and decimal and numpy have both found that that's already too expensive. > (At first I found this hard to believe, but then I realized that decimal and numpy presumably access these from C code where a typical operation like Decimal.__add__ is much faster than a dict lookup. So point taken.) > Or maybe you're just talking about the speed when the cache misses, in > which case never mind :-). > I'm happy to support caching the snot out of this, but it seems we agree that the "semantics" can be specified without taking the caching into account, and that's what I'm after. I presume that each ContextVar object will have one cached value? Because that's what PEP 550 currently specifies. Surely it wouldn't be hard for a direct __setitem__ (or __delitem__) call to cause the cache to be invalidated, regardless of whether the affected context is a *current* context or not (a bunch of different approaches suggest themselves).Oother mutating methods can be implemented in terms of __setitem__ (or __delitem__), and I don't care for them to be as fast as a cached lookup. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Tue Oct 17 16:23:59 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 17 Oct 2017 22:23:59 +0200 Subject: [Python-Dev] PEP 511 (code transformers) rejected Message-ID: Hi, I rejected my own PEP 511 "API for code transformers" that I wrote in January 2016: https://github.com/python/peps/commit/9d8fd950014a80324791d7dae3c130b1b64fdace Rejection Notice: """ This PEP was rejected by its author. This PEP was seen as blessing new Python-like programming languages which are close but incompatible with the regular Python language. It was decided to not promote syntaxes incompatible with Python. 
This PEP was also seen as a nice tool to experiment new Python features, but it is already possible to experiment them without the PEP, only with importlib hooks. If a feature becomes useful, it should be directly part of Python, instead of depending on an third party Python module. Finally, this PEP was driven was the FAT Python optimization project which was abandonned in 2016, since it was not possible to show any significant speedup, but also because of the lack of time to implement the most advanced and complex optimizations. """ Victor From guido at python.org Tue Oct 17 16:25:11 2017 From: guido at python.org (Guido van Rossum) Date: Tue, 17 Oct 2017 13:25:11 -0700 Subject: [Python-Dev] PEP 510 (function specialization) rejected In-Reply-To: References: Message-ID: It takes courage to admit failures like this! I think this is a good call. It echoes the experiences with Unladen Swallow and Pyston. Despite what people may think, CPython really isn't slow, given the large set of constraints on the implementation. On Tue, Oct 17, 2017 at 1:10 PM, Victor Stinner wrote: > Hi, > > I rejected my own PEP 510 "Specialize functions with guards" that I > wrote in January 2016: > > https://github.com/python/peps/commit/c99fb8bf5b5c16c170e1603a1c66a7 > 4e93a4ae84 > > "This PEP was rejected by its author since the design didn't show any > significant speedup, but also because of the lack of time to implement > the most advanced and complex optimizations." > > I stopped working on my FAT Python project almost one year ago: > https://faster-cpython.readthedocs.io/fat_python.html > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Tue Oct 17 16:28:33 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 17 Oct 2017 22:28:33 +0200 Subject: [Python-Dev] PEP 510 (function specialization) rejected In-Reply-To: References: Message-ID: 2017-10-17 22:25 GMT+02:00 Guido van Rossum : > It takes courage to admit failures like this! I think this is a good call. > It echoes the experiences with Unladen Swallow and Pyston. Despite what > people may think, CPython really isn't slow, given the large set of > constraints on the implementation. Oh, I still have a long queue of optimization ideas that I want to try :-) But first, I would like to fix the issue blocking all significant optimizations: make the stable ABI usable to allow to change major CPython design choices without breaking C extensions. https://haypo.github.io/new-python-c-api.html Sadly, I didn't find time yet to write a proper PEP for that. Victor From desmoulinmichel at gmail.com Tue Oct 17 16:36:16 2017 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Tue, 17 Oct 2017 22:36:16 +0200 Subject: [Python-Dev] Python startup optimization: script vs. service In-Reply-To: References: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org> <2418C6BB-2AB0-4407-ADE6-ED795DD0B965@gmail.com> <213d0382-dfdd-dfc5-2e1e-3b408acac256@python.org> Message-ID: <28da9b10-05ea-a3df-ff08-f2e8d12cb891@gmail.com> Maybe it's time to bring back the debate on the "lazy" keyword then ? Rendering any statement arbitrarily lazy could help with perfs. 
It would also make hacks like ugettext_lazy in Django useless. And would render moot the extensions of f-strings for lazily rendered ones. And bring lazy imports in the mix. Le 17/10/2017 ? 19:39, Neil Schemenauer a ?crit?: > Christian Heimes wrote: >> That approach could work, but I think that it is the wrong >> approach. I'd rather keep Python optimized for long-running >> processes and introduce a new mode / option to optimize for >> short-running scripts. > > Another idea is to run a fake transasaction through the process > before forking. That will "warm up" things so that most of the lazy > init is already done. > > After returning from the core sprint, I have gotten over my initial > enthusiam for my "lazy module defs" idea. It is just too big of a > change for Python to accept that this point. I still hope there > would be a way to make LOAD_NAME/LOAD_GLOBAL trigger something like > __getattr__(). That would allow libraries that want to aggressively > do lazy-init to do so in the clean way. > > The main reason that Python startup is slow is that we do far too > much work on module import (e.g. initializing data structures that > never get used). Reducing that work will almost necessarily impact > pre-fork model programs (e.g. they expect the init to be done before > the fork). > > As someone who uses that model heavily, I would still be okay with > the "lazification" as I think there are many more programs that > would be helped vs the ones hurt. Initializing everything that your > program might possibibly need right at startup time doesn't seem > like a goal to strive for. I can understand if you have a different > opinion though. > > A third approach would be to do more init work at compile time. > E.g. for re.compile, if the compiled result could be stored in the > .pyc, that would eliminate a lot of time for short scripts and for > long-running programs. Some Lisp systems have "compiler macros". > They are basically a hook to allow programs to do some work before > the code is sent to the compiler. If something like that existed in > Python, it could be used by re.compile to generate a compiled > representation of the regex to store in the .pyc file. That kind of > behavior is pretty different than the "there is only runtime" model > that Python generally tries to follow. > > Spit-ball idea, thought up just now: > > PAT = __compiled__(re.compile(...)) > > The expression in __compiled__(..) would be evaluated by the > compiler and the resulting value would become the value to store in > th .pyc. If you are running the code as the script, __compiled__ > just returns its argument unchanged. > > Cheers, > > Neil > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/desmoulinmichel%40gmail.com > From guido at python.org Tue Oct 17 16:45:19 2017 From: guido at python.org (Guido van Rossum) Date: Tue, 17 Oct 2017 13:45:19 -0700 Subject: [Python-Dev] Python startup optimization: script vs. service In-Reply-To: <28da9b10-05ea-a3df-ff08-f2e8d12cb891@gmail.com> References: <8381f166-5d37-a847-7792-7e0c7f3a9704@python.org> <2418C6BB-2AB0-4407-ADE6-ED795DD0B965@gmail.com> <213d0382-dfdd-dfc5-2e1e-3b408acac256@python.org> <28da9b10-05ea-a3df-ff08-f2e8d12cb891@gmail.com> Message-ID: Let's kick this part of the discussion back to python-ideas. 
On Tue, Oct 17, 2017 at 1:36 PM, Michel Desmoulin wrote: > Maybe it's time to bring back the debate on the "lazy" keyword then ? > Rendering any statement arbitrarily lazy could help with perfs. It would > also make hacks like ugettext_lazy in Django useless. And would render > moot the extensions of f-strings for lazily rendered ones. And bring > lazy imports in the mix. > > Le 17/10/2017 ? 19:39, Neil Schemenauer a ?crit : > > Christian Heimes wrote: > >> That approach could work, but I think that it is the wrong > >> approach. I'd rather keep Python optimized for long-running > >> processes and introduce a new mode / option to optimize for > >> short-running scripts. > > > > Another idea is to run a fake transasaction through the process > > before forking. That will "warm up" things so that most of the lazy > > init is already done. > > > > After returning from the core sprint, I have gotten over my initial > > enthusiam for my "lazy module defs" idea. It is just too big of a > > change for Python to accept that this point. I still hope there > > would be a way to make LOAD_NAME/LOAD_GLOBAL trigger something like > > __getattr__(). That would allow libraries that want to aggressively > > do lazy-init to do so in the clean way. > > > > The main reason that Python startup is slow is that we do far too > > much work on module import (e.g. initializing data structures that > > never get used). Reducing that work will almost necessarily impact > > pre-fork model programs (e.g. they expect the init to be done before > > the fork). > > > > As someone who uses that model heavily, I would still be okay with > > the "lazification" as I think there are many more programs that > > would be helped vs the ones hurt. Initializing everything that your > > program might possibibly need right at startup time doesn't seem > > like a goal to strive for. I can understand if you have a different > > opinion though. > > > > A third approach would be to do more init work at compile time. > > E.g. for re.compile, if the compiled result could be stored in the > > .pyc, that would eliminate a lot of time for short scripts and for > > long-running programs. Some Lisp systems have "compiler macros". > > They are basically a hook to allow programs to do some work before > > the code is sent to the compiler. If something like that existed in > > Python, it could be used by re.compile to generate a compiled > > representation of the regex to store in the .pyc file. That kind of > > behavior is pretty different than the "there is only runtime" model > > that Python generally tries to follow. > > > > Spit-ball idea, thought up just now: > > > > PAT = __compiled__(re.compile(...)) > > > > The expression in __compiled__(..) would be evaluated by the > > compiler and the resulting value would become the value to store in > > th .pyc. If you are running the code as the script, __compiled__ > > just returns its argument unchanged. 
> > > > Cheers, > > > > Neil > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > desmoulinmichel%40gmail.com > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Tue Oct 17 18:05:24 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 18 Oct 2017 00:05:24 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: <20171016170631.375edd56@fsol> References: <20171016170631.375edd56@fsol> Message-ID: Antoine Pitrou: > Why not ``time.process_time_ns()``? I measured the minimum delta between two clock reads, ignoring zeros. I tested time.process_time(), os.times(), resource.getrusage(), and their nanosecond variants (with my WIP implementation of the PEP 564). Linux: * process_time_ns(): 1 ns * process_time(): 2 ns * resource.getrusage(): 1 us ru_usage structure uses timeval, so it makes sense * clock(): 1 us CLOCKS_PER_SECOND = 1,000,000 => res = 1 us * times_ns().elapsed, times().elapsed: 10 ms os.sysconf("SC_CLK_TCK") == HZ = 100 => res = 10 ms * times_ns().user, times().user: 10 ms os.sysconf("SC_CLK_TCK") == HZ = 100 => res = 10 ms Windows: * process_time(), process_time_ns(): 15.6 ms * os.times().user, os.times_ns().user: 15.6 ms Note: I didn't test os.wait3() and os.wait4(), but they also use the ru_usage structure and so probably also have a resolution of 1 us. It looks like *currently*, only time.process_time() has a resolution in nanoseconds (smaller than 1 us). I propose to only add time.process_time_ns(), as you proposed. We might add nanosecond variant for the other functions once operating systems will add new functions with better resolution. Victor From victor.stinner at gmail.com Tue Oct 17 19:14:31 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 18 Oct 2017 01:14:31 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> Message-ID: I updated my PEP 564 to add time.process_time_ns(): https://github.com/python/peps/blob/master/pep-0564.rst The HTML version should be updated shortly: https://www.python.org/dev/peps/pep-0564/ I better explained why some functions got a new nanosecond variant, whereas others don't. The rationale is the precision loss affecting only a few functions in practice. I completed the "Annex: Clocks Resolution in Python" with more numbers, again, to explain why some functions don't need a nanosecond variant. Thanks Antoine, the PEP now looks better to me :-) Victor 2017-10-18 0:05 GMT+02:00 Victor Stinner : > Antoine Pitrou: >> Why not ``time.process_time_ns()``? > > I measured the minimum delta between two clock reads, ignoring zeros. > I tested time.process_time(), os.times(), resource.getrusage(), and > their nanosecond variants (with my WIP implementation of the PEP 564). 
> > Linux: > > * process_time_ns(): 1 ns > * process_time(): 2 ns > * resource.getrusage(): 1 us > ru_usage structure uses timeval, so it makes sense > * clock(): 1 us > CLOCKS_PER_SECOND = 1,000,000 => res = 1 us > * times_ns().elapsed, times().elapsed: 10 ms > os.sysconf("SC_CLK_TCK") == HZ = 100 => res = 10 ms > * times_ns().user, times().user: 10 ms > os.sysconf("SC_CLK_TCK") == HZ = 100 => res = 10 ms > > Windows: > > * process_time(), process_time_ns(): 15.6 ms > * os.times().user, os.times_ns().user: 15.6 ms > > Note: I didn't test os.wait3() and os.wait4(), but they also use the > ru_usage structure and so probably also have a resolution of 1 us. > > It looks like *currently*, only time.process_time() has a resolution > in nanoseconds (smaller than 1 us). I propose to only add > time.process_time_ns(), as you proposed. > > We might add nanosecond variant for the other functions once operating > systems will add new functions with better resolution. > > Victor From ncoghlan at gmail.com Wed Oct 18 00:40:24 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 Oct 2017 14:40:24 +1000 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On 18 October 2017 at 05:55, Yury Selivanov wrote: > On Tue, Oct 17, 2017 at 2:25 PM, Guido van Rossum > wrote: > > On Tue, Oct 17, 2017 at 8:54 AM, Yury Selivanov > > [..] > >> My way of thinking about this: "get_execution_context()" returns you a > >> shallow copy of the current EC (at least conceptually). So making any > >> modifications on it won't affect the current environment. The only > >> way to actually apply the modified EC object to the environment will > >> be its 'run(callable)' method. > > > > > > I understand that you don't want to throw away the implementation work > > you've already done. But I find that the abstractions you've introduced > are > > getting in the way of helping people understand what they can do with > > context variables, and I really want to go back to a model that is *much* > > closer to understanding how instance variables are just self.__dict__. > (Even > > though there are possible complications due to __slots__ and @property.) > > I don't really care about the implementation work that has already > been done, it's OK if I write it from scratch again. > > I actually like what you did in > https://github.com/gvanrossum/pep550/blob/master/simpler.py, it seems > reasonable. The only thing that I'd change is to remove "set_ctx" > from the public API and add "Context.run(callable)". This makes the > API more flexible to potential future changes and amendments. > Yep, with that tweak, I like Guido's suggested API as well. Attempting to explain why I think we want "Context.run(callable)" rather "context_vars.set_ctx()" by drawing an analogy to thread local storage: 1. In C, the compiler & CPU work together to ensure you can't access another thread's thread locals. 2. In Python's thread locals API, we do the same thing: you can only get access to the running thread's thread locals, not anyone else's At the Python API layer, we don't expose the ability to switch explicitly to another thread state while remaining within the current function. Instead, we only offer two options: starting a new thread, and waiting for a thread to finish execution. The lifecycle of the thread local storage is then intrinsically linked to the lifecycle of the thread it belongs to. 
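As a rough sketch of what that function-scoped switching looks like in code (names assumed purely for illustration, not taken from a final API), a Context.run() method would bound the context switch to a single call, the same way start()/join() bound the lifetime of a worker thread's locals:

    class Context:
        """Illustrative only: a context plus a run() method that keeps
        any switch strictly scoped to one call."""

        def __init__(self, data=None):
            self._data = dict(data or {})

        def run(self, func, *args, **kwargs):
            # Make this context the active one only while func() runs;
            # the caller's context is always restored on the way out.
            global _active_context
            saved = _active_context
            _active_context = self
            try:
                return func(*args, **kwargs)
            finally:
                _active_context = saved

    _active_context = None

    def handler():
        # Only while this call is running is ctx the active context;
        # the caller's context is untouched once run() returns.
        ...

    ctx = Context()
    ctx.run(handler)
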
That intrinsic link makes various aspects of thread local storage easier to reason about, since the active thread state can't change in the middle of a running function - even if the current thread gets suspended by the OS, resuming the function also implies resuming the original thread. Including a "contextvars.set_ctx" API would be akin to making PyThreadState_Swap a public Python-level API, rather than only exposing _thread.start_new_thread the way we do now. One reason we *don't* do that is because it would make thread locals much harder to reason about - every function call could have an implicit side effect of changing the active thread state, which would mean the thread locals at the start of the function could differ from those at the end of the function, even if the function itself didn't do anything to change them. Only offering Context.run(callable) provides a similar "the only changes to the execution context will be those this function, or a function it called, explicitly initiates" protection for context variables, and Guido's requested API simplifications make this aspect even easier to reason about: after any given function call, you can be certain of being back in the context you started in, because we wouldn't expose any Python level API that allowed an execution context switch to persist beyond the frame that initiated it. ==== The above is my main rationale for preferring contextvars.Context.run() to contextvars.set_ctx(), but it's not the only reason I prefer it. At a more abstract design philosophy level, I think the distinction between symmetric and asymmetric coroutines is relevant here [2]: * in symmetric coroutines, there's a single operation that says "switch to running this other coroutine" * in asymmetric coroutines, there are separate operations for starting or resuming coroutine and for suspending the currently running one Python's native coroutines are asymmetric - we don't provide a "switch to this coroutine" primitive, we instead provide an API for starting or resuming a coroutine (via cr.__next__(), cr.send() & cr.throw()), and an API for suspending one (via await). The contextvars.set_ctx() API would be suitable for symmetric coroutines, as there's no implied notion of parent context/child context, just a notion of switching which context is active. The Context.run() API aligns better with asymmetric coroutines, as there's a clear distinction between the parent frame (the one initiating the context switch) and the child frame (the one running in the designated context). As a practical matter, Context.run also composes nicely (in combination with functools.partial) for use with any existing API based on submitting functions for delayed execution, or execution in another thread or process: - sched - concurrent.futures - arbitrary callback APIs - method based protocols (including iteration) By contrast, "contextvars.set_ctx" would need various wrappers to handle correctly reverting the context change, and would hence be prone to "changed the active context without changing it back" bugs (which can be especially fun when you're dealing with a shared pool of worker threads or processes). Cheers, Nick. 
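To make the composition point concrete, a sketch using the draft names from this thread (contextvars.new_context() and Context.run(callable) are the proposed API under discussion, not an existing module):

    import functools
    from concurrent.futures import ThreadPoolExecutor
    import contextvars  # draft API from this thread: new_context(), Context.run()

    def handle_request():
        ...  # code that reads ContextVars set up for this request

    ec = contextvars.new_context()                   # proposed constructor
    job = functools.partial(ec.run, handle_request)  # "run handle_request in ec"

    with ThreadPoolExecutor() as pool:
        pool.submit(job).result()   # the worker runs inside ec, and is back in
                                    # its own context as soon as run() returns

No extra wrapper is needed to restore the previous context - run() does that itself, which is exactly the property that makes it safe to hand such jobs to a shared pool.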
[1] Technically C extensions can play games with this via PyThreadState_Swap, but I'm not going to worry about that here [2] https://stackoverflow.com/questions/41891989/what-is-the-difference-between-asymmetric-and-symmetric-coroutines -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Wed Oct 18 02:25:15 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 17 Oct 2017 23:25:15 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: <59E6F3CB.40500@stoneleaf.us> On 10/17/2017 09:40 PM, Nick Coghlan wrote: > On 18 October 2017 at 05:55, Yury Selivanov wrote: >> I actually like what you did in >> https://github.com/gvanrossum/pep550/blob/master/simpler.py >> , it seems >> reasonable. The only thing that I'd change is to remove "set_ctx" >> from the public API and add "Context.run(callable)". This makes the >> API more flexible to potential future changes and amendments. > > Yep, with that tweak, I like Guido's suggested API as well. > > > Attempting to explain why I think we want "Context.run(callable)" rather "context_vars.set_ctx()" by drawing an analogy > to thread local storage: > > 1. In C, the compiler & CPU work together to ensure you can't access another thread's thread locals. > 2. In Python's thread locals API, we do the same thing: you can only get access to the running thread's thread locals, > not anyone else's > > At the Python API layer, we don't expose the ability to switch explicitly to another thread state while remaining within > the current function. Instead, we only offer two options: starting a new thread, and waiting for a thread to finish > execution. The lifecycle of the thread local storage is then intrinsically linked to the lifecycle of the thread it > belongs to. I seem to remember mention about frameworks being able to modify contexts for various tasks/coroutines; if the framework cannot create and switch to a new context how will it set them up? Or did I misunderstand? -- ~Ethan~ From ncoghlan at gmail.com Wed Oct 18 02:25:29 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 Oct 2017 16:25:29 +1000 Subject: [Python-Dev] PEP 510 (function specialization) rejected In-Reply-To: References: Message-ID: On 18 October 2017 at 06:25, Guido van Rossum wrote: > It takes courage to admit failures like this! I think this is a good call. > It echoes the experiences with Unladen Swallow and Pyston. > And Armin Rigo's experience with psyco before that. Despite what people may think, CPython really isn't slow, given the large > set of constraints on the implementation. 
> Antonio Cuni had a good PyPy presentation at EuroPython indirectly talking about the fact that when folks say "Python is slow", what they often mean is "Many of Python's conceptual abstractions come at a high runtime cost in the reference implementation": https://speakerdeck.com/antocuni/the-joy-of-pypy-jit-abstractions-for-free That means the general language level performance pay-offs for alternative implementations come from working out how to make the abstraction layers cheaper, as experience shows that opt-in ahead-of-time techniques like Cython, vectorisation, and binary extension modules can do a much better job of dealing with the clearly identifiable low level performance bottlenecks (Readers that aren't familiar with the concept may be interested in [1] as a good recent example of the effectiveness of the latter approach). Cheers, Nick. [1] https://blog.sentry.io/2016/10/19/fixing-python-performance-with-rust.html -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Oct 18 02:56:50 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 Oct 2017 16:56:50 +1000 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: <59E6F3CB.40500@stoneleaf.us> References: <59E6F3CB.40500@stoneleaf.us> Message-ID: On 18 October 2017 at 16:25, Ethan Furman wrote: > On 10/17/2017 09:40 PM, Nick Coghlan wrote: > >> At the Python API layer, we don't expose the ability to switch explicitly >> to another thread state while remaining within >> the current function. Instead, we only offer two options: starting a new >> thread, and waiting for a thread to finish >> execution. The lifecycle of the thread local storage is then >> intrinsically linked to the lifecycle of the thread it >> belongs to. >> > > I seem to remember mention about frameworks being able to modify contexts > for various tasks/coroutines; if the framework cannot create and switch to > a new context how will it set them up? Or did I misunderstand? That's what Context.run() will handle. >From a usage perspective, what Yury and I are suggesting is that switching execution to a new context should always implicitly switch back when the called operation returns: def run_in_isolated_ec(operation): ec = contextvars.new_context() return ec.run(operation) That's safe and simple to use, and integrates nicely with the APIs for resuming coroutines (next(cr), cr.__next__(), cr.send(), cr.throw()). By contrast, exposing set_ctx() directly puts the responsibility for reverting the active context back to the initial one on the caller: def run_in_isolated_ec(operation): ec = contextvars.new_context() initial_ec = contextvar.get_ctx() contextvars.set_ctx(ec) try: return operation() finally: # What happens if we forget to revert back to the previous context here? contextvars.set_ctx(initial_ec) While the contextvars implementation is going to need a context switching capability like that internally in order to implement ec.run() (just as the threading implementation's C API includes PyThreadState_Swap), we don't currently have a use case for exposing it as a Python level API. 
And if we don't expose an API that allows it, then we can delay specifying what effect the following code should have all the calling function or coroutine (or whether it's a runtime error): def buggy_run_in_isolated_ec(operation): ec = contextvars.new_context() contextvars.set_ctx(ec) return operation() # Oops, forgot to revert the active context to the one we were called with So if we start out only exposing "Context.run()" at the Python layer (covering all currently known use cases), then any subsequent addition of an in-place context switching API can be guided by specific examples of situations where Context.run() isn't sufficient. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Wed Oct 18 05:55:26 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 18 Oct 2017 11:55:26 +0200 Subject: [Python-Dev] Tracking fixes of security vulnerabilies: we are good! Message-ID: Hi, Since the beginning of the year, I'm working on a tool to track if all security vulnerabilities are fixed in all Python maintained versions (versions still accepting security fixes): http://python-security.readthedocs.io/vulnerabilities.html Currently, five branches are maintained: 2.7, 3.4, 3.5, 3.6 and master. https://devguide.python.org/#status-of-python-branches Thanks to Ned Deily and Georg Brandl, Python 3.3 reached its end-of-life (EOL) last month, after 5 years of good service (as expected). It reduced the number of maintained branches from six to five :-) Python 3.3.7 released last months contains the last security fixes. The good news is that we got releases last months with fixes for almost all security vulnerabilities. Only Python 3.4 and Python 3.5 have two known vulnerabilities, but I consider that their severity is not high hopefully. "Expat 2.2.3" is not fixed yet in Python 3.4 and 3.5, but I'm not sure that Python is really affected by fixed Expat vulnerabilities, since Python uses its own code to generate a secret key for the Expat "hash secret". Our embedded expat copy is used on Windows and macOS, but not on Linux. "update zlib to 1.2.11" was fixed in the Python 3.4 branch, but no release was made yet. This issue only impacts Windows. Linux and macOS use the system zlib. Victor From guido at python.org Wed Oct 18 13:06:24 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 18 Oct 2017 10:06:24 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On Tue, Oct 17, 2017 at 9:40 PM, Nick Coghlan wrote: > On 18 October 2017 at 05:55, Yury Selivanov > wrote: > >> I actually like what you did in >> https://github.com/gvanrossum/pep550/blob/master/simpler.py, it seems >> reasonable. The only thing that I'd change is to remove "set_ctx" >> from the public API and add "Context.run(callable)". This makes the >> API more flexible to potential future changes and amendments. >> > > Yep, with that tweak, I like Guido's suggested API as well. > I've added the suggested Context.run() method. > > Attempting to explain why I think we want "Context.run(callable)" rather > "context_vars.set_ctx()" by drawing an analogy to thread local storage: > > 1. In C, the compiler & CPU work together to ensure you can't access > another thread's thread locals. > But why is that so important? I wouldn't recommend doing it, but it might be handy for a debugger to be able to inspect a thread's thread-locals. 
As it is, it seems a debugger can only access thread-locals for the thread in which the debugger itself runs. It has better access to the real locals on the thread's stack of frames! > 2. In Python's thread locals API, we do the same thing: you can only get > access to the running thread's thread locals, not anyone else's > But there's no real benefit in this. In C, I could imagine a compiler optimizing access to thread-locals, but in Python that's moot. > At the Python API layer, we don't expose the ability to switch explicitly > to another thread state while remaining within the current function. > Instead, we only offer two options: starting a new thread, and waiting for > a thread to finish execution. The lifecycle of the thread local storage is > then intrinsically linked to the lifecycle of the thread it belongs to. > To me this feels more a side-effect of the implementation (perhaps inherited from C's implementation) than an intentional design. To be clear, I think it's totally fine for *clients* of the ContextVar API -- e.g. numpy or decimal -- to assume that their context doesn't change arbitrarily while they're happily executing in a single frame or calling stuff they trust not to change the context. (IOW all changes to a particular ContextVar would be through that ContextVar object, not through behind-the-scenes manipulation of the thread's current context). But for *frameworks* (e.g. asyncio or Twisted) I find it simpler to think about the context in terms of `set_ctx` and `get_ctx`, and I worry that *hiding* these might block off certain API design patterns that some framework might want to use -- who knows, maybe Nathaniel (who is fond of `with` ) might come up with a context manager to run a block of code in a different context (perhaps cloned from the current one). > That intrinsic link makes various aspects of thread local storage easier > to reason about, since the active thread state can't change in the middle > of a running function - even if the current thread gets suspended by the > OS, resuming the function also implies resuming the original thread. > I don't feel reasoning would be much impaired. When reasoning about code we make assumptions that are theoretically unsafe all the time (e.g. "nobody will move the clock back"). > Including a "contextvars.set_ctx" API would be akin to making > PyThreadState_Swap a public Python-level API, rather than only exposing > _thread.start_new_thread the way we do now. > It's different for threads, because they are the bedrock of execution, and nobody is interested in implementing their own threading framework that doesn't build on this same bedrock. > One reason we *don't* do that is because it would make thread locals much > harder to reason about - every function call could have an implicit side > effect of changing the active thread state, which would mean the thread > locals at the start of the function could differ from those at the end of > the function, even if the function itself didn't do anything to change them. > Hm. Threads are still hard to reason about, because for everything *but* thread-locals there is always the possibility that it's being mutated by another thread... So I don't think we should get our knickers twisted over thread-local variables. 
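For instance, the kind of `with`-based helper alluded to above could be a fairly small wrapper over the debated get_ctx()/set_ctx() pair (names assumed from this thread; a sketch, not an existing API):

    from contextlib import contextmanager
    import contextvars  # get_ctx()/set_ctx() as debated here, not a shipped module

    @contextmanager
    def in_context(ctx):
        # run the body of the `with` block in ctx, then restore the
        # caller's context no matter how the block exits
        saved = contextvars.get_ctx()
        contextvars.set_ctx(ctx)
        try:
            yield ctx
        finally:
            contextvars.set_ctx(saved)

A framework could then write "with in_context(contextvars.new_context()): ..." to get Context.run()-like behaviour for a whole block.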
> Only offering Context.run(callable) provides a similar "the only changes > to the execution context will be those this function, or a function it > called, explicitly initiates" protection for context variables, and Guido's > requested API simplifications make this aspect even easier to reason about: > after any given function call, you can be certain of being back in the > context you started in, because we wouldn't expose any Python level API > that allowed an execution context switch to persist beyond the frame that > initiated it. > And as long as you're not calling something that's a specific framework's API for messing with the context, that's a fine assumption. I just don't see the need to try to "enforce" this by hiding the underlying API. (Especially since I presume that at the C API level it will still be possible -- else how would Context.run() itself be implemented?) > ==== > > The above is my main rationale for preferring contextvars.Context.run() to > contextvars.set_ctx(), but it's not the only reason I prefer it. > > At a more abstract design philosophy level, I think the distinction > between symmetric and asymmetric coroutines is relevant here [2]: > > * in symmetric coroutines, there's a single operation that says "switch to > running this other coroutine" > * in asymmetric coroutines, there are separate operations for starting or > resuming coroutine and for suspending the currently running one > > Python's native coroutines are asymmetric - we don't provide a "switch to > this coroutine" primitive, we instead provide an API for starting or > resuming a coroutine (via cr.__next__(), cr.send() & cr.throw()), and an > API for suspending one (via await). > > The contextvars.set_ctx() API would be suitable for symmetric coroutines, > as there's no implied notion of parent context/child context, just a notion > of switching which context is active. > > The Context.run() API aligns better with asymmetric coroutines, as there's > a clear distinction between the parent frame (the one initiating the > context switch) and the child frame (the one running in the designated > context). > Sure. But a *framework* might build something different. > As a practical matter, Context.run also composes nicely (in combination > with functools.partial) for use with any existing API based on submitting > functions for delayed execution, or execution in another thread or process: > > - sched > - concurrent.futures > - arbitrary callback APIs > - method based protocols (including iteration) > > By contrast, "contextvars.set_ctx" would need various wrappers to handle > correctly reverting the context change, and would hence be prone to > "changed the active context without changing it back" bugs (which can be > especially fun when you're dealing with a shared pool of worker threads or > processes). > So let's have both. Cheers, > Nick. > > [1] Technically C extensions can play games with this via > PyThreadState_Swap, but I'm not going to worry about that here > [2] https://stackoverflow.com/questions/41891989/what-is- > the-difference-between-asymmetric-and-symmetric-coroutines > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From yselivanov.ml at gmail.com Wed Oct 18 13:50:07 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Wed, 18 Oct 2017 13:50:07 -0400 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On Wed, Oct 18, 2017 at 1:06 PM, Guido van Rossum wrote: > On Tue, Oct 17, 2017 at 9:40 PM, Nick Coghlan wrote: [..] >> By contrast, "contextvars.set_ctx" would need various wrappers to handle >> correctly reverting the context change, and would hence be prone to "changed >> the active context without changing it back" bugs (which can be especially >> fun when you're dealing with a shared pool of worker threads or processes). > > > So let's have both. The main reason why I don't like 'set_ctx()' is because it would make it harder for us to adopt PEP 550-like design later in the future (*if* we need that.) PEP 550 is designed in such a way, that 'generator.send()' is the only thing that can control the actual stack of LCs. If users can call 'set_ctx' themselves, it means that it's no longer safe for 'generator.send()' to simply pop the topmost LC from the stack. This can be worked around, potentially, but the we don't actually need 'set_ctx' in asyncio or in any other async framework. There is simply no hard motivation to have it. That's why I'd like to have just Context.run(), because it's sufficient, and it doesn't burn the bridge to PEP 550-like design. Yury From ethan at stoneleaf.us Wed Oct 18 14:10:46 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 18 Oct 2017 11:10:46 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: <59E79926.1050203@stoneleaf.us> On 10/18/2017 10:50 AM, Yury Selivanov wrote: > On Wed, Oct 18, 2017 at 1:06 PM, Guido van Rossum wrote: >> On Tue, Oct 17, 2017 at 9:40 PM, Nick Coghlan wrote: > [..] >>> By contrast, "contextvars.set_ctx" would need various wrappers to handle >>> correctly reverting the context change, and would hence be prone to "changed >>> the active context without changing it back" bugs (which can be especially >>> fun when you're dealing with a shared pool of worker threads or processes). >> >> >> So let's have both. > > The main reason why I don't like 'set_ctx()' is because it would make > it harder for us to adopt PEP 550-like design later in the future > (*if* we need that.) > > PEP 550 is designed in such a way, that 'generator.send()' is the only > thing that can control the actual stack of LCs. If users can call > 'set_ctx' themselves, it means that it's no longer safe for > 'generator.send()' to simply pop the topmost LC from the stack. I don't see why this is a concern -- Python is a "consenting adults" language. If users decide to start mucking around with advanced behavior and something breaks, well, they own all the pieces! ;) Unless it's extremely difficult to not seg-fault in such a situation I don't think this is a valid argument. -- ~Ethan~ From yselivanov.ml at gmail.com Wed Oct 18 14:15:59 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Wed, 18 Oct 2017 14:15:59 -0400 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: <59E79926.1050203@stoneleaf.us> References: <59E79926.1050203@stoneleaf.us> Message-ID: On Wed, Oct 18, 2017 at 2:10 PM, Ethan Furman wrote: [..] > Unless it's extremely difficult to not seg-fault in such a situation I don't > think this is a valid argument. 
Well, you don't think so, but I do, after writing a few implementations of this PEP family. It would complicate the design, but the function isn't even needed, strictly speaking. Yury From guido at python.org Wed Oct 18 14:21:34 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 18 Oct 2017 11:21:34 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On Wed, Oct 18, 2017 at 10:50 AM, Yury Selivanov wrote: > The main reason why I don't like 'set_ctx()' is because it would make > it harder for us to adopt PEP 550-like design later in the future > (*if* we need that.) > > PEP 550 is designed in such a way, that 'generator.send()' is the only > thing that can control the actual stack of LCs. If users can call > 'set_ctx' themselves, it means that it's no longer safe for > 'generator.send()' to simply pop the topmost LC from the stack. This > can be worked around, potentially, but the we don't actually need > 'set_ctx' in asyncio or in any other async framework. There is simply > no hard motivation to have it. That's why I'd like to have just > Context.run(), because it's sufficient, and it doesn't burn the bridge > to PEP 550-like design. > Honestly that stack-popping in send() always felt fragile to me, so I'd be happy if we didn't need to depend on it. That said I'm okay with presenting set_ctx() *primarily* as an educational tool for showing how Context.run() works. We could name it _set_ctx() and add a similar note as we have for sys._getframe(), basically keeping the door open for future changes that may render it non-functional without worries about backward compatibility (and without invoking the notion of "provisional" API). There's no problem with get_ctx() right? -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov.ml at gmail.com Wed Oct 18 14:45:38 2017 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Wed, 18 Oct 2017 14:45:38 -0400 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On Wed, Oct 18, 2017 at 2:21 PM, Guido van Rossum wrote: > On Wed, Oct 18, 2017 at 10:50 AM, Yury Selivanov > wrote: >> >> The main reason why I don't like 'set_ctx()' is because it would make >> it harder for us to adopt PEP 550-like design later in the future >> (*if* we need that.) >> >> PEP 550 is designed in such a way, that 'generator.send()' is the only >> thing that can control the actual stack of LCs. If users can call >> 'set_ctx' themselves, it means that it's no longer safe for >> 'generator.send()' to simply pop the topmost LC from the stack. This >> can be worked around, potentially, but the we don't actually need >> 'set_ctx' in asyncio or in any other async framework. There is simply >> no hard motivation to have it. That's why I'd like to have just >> Context.run(), because it's sufficient, and it doesn't burn the bridge >> to PEP 550-like design. > > > Honestly that stack-popping in send() always felt fragile to me, so I'd be > happy if we didn't need to depend on it. > > That said I'm okay with presenting set_ctx() *primarily* as an educational > tool for showing how Context.run() works. We could name it _set_ctx() and > add a similar note as we have for sys._getframe(), basically keeping the > door open for future changes that may render it non-functional without > worries about backward compatibility (and without invoking the notion of > "provisional" API). 
'_set_ctx()' + documentation bits work for me. I also assume that if you accept the PEP, you do it provisionally, right? That should make it possible for us to *slightly* tweak the implementation/API/semantics in 3.8 if needed. > There's no problem with get_ctx() right? Yes, 'get_ctx()' is absolutely fine. We still need it for async tasks/callbacks. Yury From guido at python.org Wed Oct 18 14:53:25 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 18 Oct 2017 11:53:25 -0700 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: Actually after recent debate I think this PEP should *not* be provisional. On Wed, Oct 18, 2017 at 11:45 AM, Yury Selivanov wrote: > On Wed, Oct 18, 2017 at 2:21 PM, Guido van Rossum > wrote: > > On Wed, Oct 18, 2017 at 10:50 AM, Yury Selivanov < > yselivanov.ml at gmail.com> > > wrote: > >> > >> The main reason why I don't like 'set_ctx()' is because it would make > >> it harder for us to adopt PEP 550-like design later in the future > >> (*if* we need that.) > >> > >> PEP 550 is designed in such a way, that 'generator.send()' is the only > >> thing that can control the actual stack of LCs. If users can call > >> 'set_ctx' themselves, it means that it's no longer safe for > >> 'generator.send()' to simply pop the topmost LC from the stack. This > >> can be worked around, potentially, but the we don't actually need > >> 'set_ctx' in asyncio or in any other async framework. There is simply > >> no hard motivation to have it. That's why I'd like to have just > >> Context.run(), because it's sufficient, and it doesn't burn the bridge > >> to PEP 550-like design. > > > > > > Honestly that stack-popping in send() always felt fragile to me, so I'd > be > > happy if we didn't need to depend on it. > > > > That said I'm okay with presenting set_ctx() *primarily* as an > educational > > tool for showing how Context.run() works. We could name it _set_ctx() and > > add a similar note as we have for sys._getframe(), basically keeping the > > door open for future changes that may render it non-functional without > > worries about backward compatibility (and without invoking the notion of > > "provisional" API). > > '_set_ctx()' + documentation bits work for me. I also assume that if > you accept the PEP, you do it provisionally, right? That should make > it possible for us to *slightly* tweak the > implementation/API/semantics in 3.8 if needed. > > > There's no problem with get_ctx() right? > > Yes, 'get_ctx()' is absolutely fine. We still need it for async > tasks/callbacks. > > Yury > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Oct 18 19:26:50 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Oct 2017 09:26:50 +1000 Subject: [Python-Dev] Timeout for PEP 550 / Execution Context discussion In-Reply-To: References: Message-ID: On 19 October 2017 at 04:53, Guido van Rossum wrote: > Actually after recent debate I think > this PEP should *not* be provisional. > +1 from me - "contextvars._set_ctx()" is the only part I think we're really unsure about in your latest API design, and marking that specific function as private will cover the fact that its semantics aren't guaranteed yet. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ronaldoussoren at mac.com Fri Oct 20 09:42:03 2017 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Fri, 20 Oct 2017 15:42:03 +0200 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: References: <641723C8-AE9E-44F1-94AE-68A716980318@python.org> <302A53D9-D6F1-4872-ACF2-522889412FA1@mac.com> Message-ID: <0B92C2DA-8073-4285-8630-6396587A4539@mac.com> Op 10 okt. 2017 om 01:48 heeft Brett Cannon > het volgende geschreven: > > > On Mon, Oct 2, 2017, 17:49 Ronald Oussoren, > wrote: > Op 3 okt. 2017 om 04:29 heeft Barry Warsaw > het volgende geschreven: > > > On Oct 2, 2017, at 14:56, Brett Cannon > wrote: > > > >> So Mercurial specifically is an odd duck because they already do lazy importing (in fact they are using the lazy loading support from importlib). In terms of all of this discussion of tweaking import to be lazy, I think the best approach would be providing an opt-in solution that CLI tools can turn on ASAP while the default stays eager. That way everyone gets what they want while the stdlib provides a shared solution that's maintained alongside import itself to make sure it functions appropriately. > > > > The problem I think is that to get full benefit of lazy loading, it has to be turned on globally for bare ?import? statements. A typical application has tons of dependencies and all those libraries are also doing module global imports, so unless lazy loading somehow covers them, it?ll be an incomplete gain. But of course it?ll take forever for all your dependencies to use whatever new API we come up with, and if it?s not as convenient to write as ?import foo? then I suspect it won?t much catch on anyway. > > > > One thing to keep in mind is that imports can have important side-effects. Turning every import statement into a lazy import will not be backward compatible. > > Yep, and that's a lesson Mercurial shared with me at PyCon US this year. My planned approach has a blacklist for modules to only load eagerly. I?m not sure if i understand. Do you want to turn on lazy loading for the stdlib only (with a blacklist for modules that won?t work that way), or generally? In the latter case this would still not be backward compatible. Ronald -------------- next part -------------- An HTML attachment was scrubbed... URL: From ronaldoussoren at mac.com Fri Oct 20 11:48:18 2017 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Fri, 20 Oct 2017 17:48:18 +0200 Subject: [Python-Dev] What is the design purpose of metaclasses vs code generating decorators? (was Re: PEP 557: Data Classes) In-Reply-To: References: <59E10FBA.3040207@stoneleaf.us> Message-ID: <10281E61-682A-4F67-B3E9-52861F6B0B85@mac.com> > On 14 Oct 2017, at 16:37, Martin Teichmann wrote: > >> Things that will not work if Enum does not have a metaclass: >> >> list(EnumClass) -> list of enum members >> dir(EnumClass) -> custom list of "interesting" items >> len(EnumClass) -> number of members >> member in EnumClass -> True or False >> >> - protection from adding, deleting, and changing members >> - guards against reusing the same name twice >> - possible to have properties and members with the same name (i.e. "value" >> and "name") > > In current Python this is true. But if we would go down the route of > PEP 560 (which I just found, I wasn't involved in its discussion), > then we could just add all the needed functionality to classes. > > I would do it slightly different than proposed in PEP 560: > classmethods are very similar to methods on a metaclass. 
They are just > not called by the special method machinery. I propose that the > following is possible: > >>>> class Spam: > ... @classmethod > ... def __getitem__(self, item): > ... return "Ham" > >>>> Spam[3] > Ham > > this should solve most of your usecases. Except when you want to implement __getitem__ for instances as well :-). An important difference between @classmethod and methods on the metaclass is that @classmethod methods live in the same namespace as instance methods, while methods on the metaclass don?t. I ran into similar problems in PyObjC: Apple?s Cocoa libraries use instance and class methods with the same name. That when using methods on a metaclass, but not when using something similar to @classmethod. Because of this PyObjC is a heavy user of metaclasses (generated from C for additional fun). A major disadvantage of this is that tends to confuse smart editors. Ronald From status at bugs.python.org Fri Oct 20 12:09:43 2017 From: status at bugs.python.org (Python tracker) Date: Fri, 20 Oct 2017 18:09:43 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20171020160943.0F39C11A85F@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2017-10-13 - 2017-10-20) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 6260 ( +9) closed 37318 (+38) total 43578 (+47) Open issues with patches: 2415 Issues opened (36) ================== #31761: Failures and crashes when running tests by import in IDLE https://bugs.python.org/issue31761 reopened by terry.reedy #31772: SourceLoader uses stale bytecode in case of equal mtime second https://bugs.python.org/issue31772 reopened by brett.cannon #31781: crashes when calling methods of an uninitialized zipimport.zip https://bugs.python.org/issue31781 opened by Oren Milman #31782: Add a timeout to multiprocessing's Pool.join https://bugs.python.org/issue31782 opened by Will Starms #31783: Race condition in ThreadPoolExecutor when scheduling new jobs https://bugs.python.org/issue31783 opened by Steven.Barker #31784: Implementation of the PEP 564: Add time.time_ns() https://bugs.python.org/issue31784 opened by haypo #31787: various refleaks when calling the __init__() method of an obje https://bugs.python.org/issue31787 opened by Oren Milman #31789: Better error message when failing to overload metaclass.mro https://bugs.python.org/issue31789 opened by bup #31790: double free or corruption (while using smem) https://bugs.python.org/issue31790 opened by zulthan #31791: Ensure that all PyTypeObject fields are set to non-NULL defaul https://bugs.python.org/issue31791 opened by pdox #31793: Allow to specialize smart quotes in documentation translations https://bugs.python.org/issue31793 opened by mdk #31794: Issues with test.autotest https://bugs.python.org/issue31794 opened by serhiy.storchaka #31798: `site.abs__file__` fails for modules where `__file__` cannot b https://bugs.python.org/issue31798 opened by Alexander McFarlane #31800: datetime.strptime: Support for parsing offsets with a colon https://bugs.python.org/issue31800 opened by mariocj89 #31801: vars() manipulation encounters problems with Enum https://bugs.python.org/issue31801 opened by ethan.furman #31802: 'import posixpath' fails if 'os.path' has not be imported alre https://bugs.python.org/issue31802 opened by Mark.Shannon #31803: time.clock() should emit a DeprecationWarning https://bugs.python.org/issue31803 opened by haypo #31804: 
multiprocessing calls flush on sys.stdout at exit even if it i https://bugs.python.org/issue31804 opened by Pox TheGreat #31805: support._is_gui_available() hangs on x86-64 Sierra 3.6/3.x bui https://bugs.python.org/issue31805 opened by haypo #31807: unitest.mock: Using autospec=True conflicts with 'wraps' https://bugs.python.org/issue31807 opened by John Villalovos #31808: tarfile.extractall fails to overwrite symlinks https://bugs.python.org/issue31808 opened by Frederic Beister #31809: ssl module unnecessarily pins the client curve when using ECDH https://bugs.python.org/issue31809 opened by grrrrrrrrr #31810: Travis CI, buildbots: run "make smelly" to check if CPython le https://bugs.python.org/issue31810 opened by haypo #31811: async and await missing from keyword list in lexical analysis https://bugs.python.org/issue31811 opened by Colin Dunklau #31812: Document PEP 545 (documentation translation) in What's New in https://bugs.python.org/issue31812 opened by haypo #31813: python -m enshure pip stucks https://bugs.python.org/issue31813 opened by Serhy Pyton #31814: subprocess_fork_exec more stable with vfork https://bugs.python.org/issue31814 opened by Albert.Zeyer #31815: Make itertools iterators interruptible https://bugs.python.org/issue31815 opened by serhiy.storchaka #31817: Compilation Error with Python 3.6.1/3.6.3 with Tkinter https://bugs.python.org/issue31817 opened by jpc2350 #31818: [macOS] _scproxy.get_proxies() crash -- get_proxies() is not f https://bugs.python.org/issue31818 opened by Mirko Friedenhagen #31821: pause_reading() doesn't work from connection_made() https://bugs.python.org/issue31821 opened by pitrou #31822: Document that urllib.parse.{Defrag,Split,Parse}Result are name https://bugs.python.org/issue31822 opened by Allen Li #31823: Opaque default value for close_fds argument in Popen.__init__ https://bugs.python.org/issue31823 opened by ?????????????? ?????????????????? 
#31824: Missing default argument detail in documentation of StreamRead https://bugs.python.org/issue31824 opened by PeterLovett #31826: Misleading __version__ attribute of modules in standard librar https://bugs.python.org/issue31826 opened by abukaj #31827: Remove os.stat_float_times() https://bugs.python.org/issue31827 opened by haypo Most recent 15 issues with no replies (15) ========================================== #31826: Misleading __version__ attribute of modules in standard librar https://bugs.python.org/issue31826 #31824: Missing default argument detail in documentation of StreamRead https://bugs.python.org/issue31824 #31823: Opaque default value for close_fds argument in Popen.__init__ https://bugs.python.org/issue31823 #31821: pause_reading() doesn't work from connection_made() https://bugs.python.org/issue31821 #31817: Compilation Error with Python 3.6.1/3.6.3 with Tkinter https://bugs.python.org/issue31817 #31812: Document PEP 545 (documentation translation) in What's New in https://bugs.python.org/issue31812 #31808: tarfile.extractall fails to overwrite symlinks https://bugs.python.org/issue31808 #31807: unitest.mock: Using autospec=True conflicts with 'wraps' https://bugs.python.org/issue31807 #31804: multiprocessing calls flush on sys.stdout at exit even if it i https://bugs.python.org/issue31804 #31802: 'import posixpath' fails if 'os.path' has not be imported alre https://bugs.python.org/issue31802 #31801: vars() manipulation encounters problems with Enum https://bugs.python.org/issue31801 #31794: Issues with test.autotest https://bugs.python.org/issue31794 #31793: Allow to specialize smart quotes in documentation translations https://bugs.python.org/issue31793 #31790: double free or corruption (while using smem) https://bugs.python.org/issue31790 #31789: Better error message when failing to overload metaclass.mro https://bugs.python.org/issue31789 Most recent 15 issues waiting for review (15) ============================================= #31827: Remove os.stat_float_times() https://bugs.python.org/issue31827 #31821: pause_reading() doesn't work from connection_made() https://bugs.python.org/issue31821 #31815: Make itertools iterators interruptible https://bugs.python.org/issue31815 #31810: Travis CI, buildbots: run "make smelly" to check if CPython le https://bugs.python.org/issue31810 #31809: ssl module unnecessarily pins the client curve when using ECDH https://bugs.python.org/issue31809 #31804: multiprocessing calls flush on sys.stdout at exit even if it i https://bugs.python.org/issue31804 #31803: time.clock() should emit a DeprecationWarning https://bugs.python.org/issue31803 #31802: 'import posixpath' fails if 'os.path' has not be imported alre https://bugs.python.org/issue31802 #31800: datetime.strptime: Support for parsing offsets with a colon https://bugs.python.org/issue31800 #31793: Allow to specialize smart quotes in documentation translations https://bugs.python.org/issue31793 #31787: various refleaks when calling the __init__() method of an obje https://bugs.python.org/issue31787 #31784: Implementation of the PEP 564: Add time.time_ns() https://bugs.python.org/issue31784 #31782: Add a timeout to multiprocessing's Pool.join https://bugs.python.org/issue31782 #31781: crashes when calling methods of an uninitialized zipimport.zip https://bugs.python.org/issue31781 #31779: assertion failures and a crash when using an uninitialized str https://bugs.python.org/issue31779 Top 10 most discussed issues (10) ================================= #31803: time.clock() 
should emit a DeprecationWarning https://bugs.python.org/issue31803 17 msgs #30744: Local variable assignment is broken when combined with threads https://bugs.python.org/issue30744 11 msgs #31815: Make itertools iterators interruptible https://bugs.python.org/issue31815 10 msgs #31778: ast.literal_eval supports non-literals in Python 3 https://bugs.python.org/issue31778 7 msgs #31818: [macOS] _scproxy.get_proxies() crash -- get_proxies() is not f https://bugs.python.org/issue31818 7 msgs #31800: datetime.strptime: Support for parsing offsets with a colon https://bugs.python.org/issue31800 6 msgs #29696: Use namedtuple in string.Formatter.parse iterator response https://bugs.python.org/issue29696 4 msgs #31742: Default to emitting FutureWarning for provisional APIs https://bugs.python.org/issue31742 4 msgs #31753: Unnecessary closure in ast.literal_eval https://bugs.python.org/issue31753 4 msgs #31791: Ensure that all PyTypeObject fields are set to non-NULL defaul https://bugs.python.org/issue31791 4 msgs Issues closed (40) ================== #13802: IDLE font settings: use multiple character sets in examples https://bugs.python.org/issue13802 closed by terry.reedy #25588: Run test suite from IDLE idlelib.run subprocess https://bugs.python.org/issue25588 closed by terry.reedy #26098: [WIP] PEP 510: Specialize functions with guards https://bugs.python.org/issue26098 closed by haypo #26145: [WIP] PEP 511: Add sys.set_code_transformers() https://bugs.python.org/issue26145 closed by haypo #28603: traceback module can't format/print unhashable exceptions https://bugs.python.org/issue28603 closed by serhiy.storchaka #30013: Compiler warning in Modules/posixmodule.c https://bugs.python.org/issue30013 closed by serhiy.storchaka #30457: Allow retrieve the number of waiters pending for most of the a https://bugs.python.org/issue30457 closed by yselivanov #30541: Add restricted mocks to the python unittest mocking framework https://bugs.python.org/issue30541 closed by haypo #31234: Make support.threading_cleanup() stricter https://bugs.python.org/issue31234 closed by haypo #31334: select.poll.poll fails on BSDs with arbitrary negative timeout https://bugs.python.org/issue31334 closed by serhiy.storchaka #31452: asyncio.gather does not cancel tasks if one fails https://bugs.python.org/issue31452 closed by yselivanov #31558: gc.freeze() - an API to mark objects as uncollectable https://bugs.python.org/issue31558 closed by lukasz.langa #31618: Change sys.settrace opcode tracing to occur after frame line n https://bugs.python.org/issue31618 closed by ncoghlan #31622: Make threading.get_ident() return an opaque type https://bugs.python.org/issue31622 closed by pdox #31632: Asyncio: SSL transport does not support set_protocol() https://bugs.python.org/issue31632 closed by yselivanov #31676: test.test_imp.ImportTests.test_load_source has side effects https://bugs.python.org/issue31676 closed by haypo #31692: [2.7] Test `test_huntrleaks()` of test_regrtest fails in debug https://bugs.python.org/issue31692 closed by haypo #31714: Improve re documentation https://bugs.python.org/issue31714 closed by serhiy.storchaka #31733: [2.7] Add PYTHONSHOWREFCOUNT environment variable to Python 2. 
https://bugs.python.org/issue31733 closed by haypo #31738: Lib/site.py: method `abs_paths` is not documented https://bugs.python.org/issue31738 closed by merwok #31754: Documented type of parameter 'itemsize' to PyBuffer_FillContig https://bugs.python.org/issue31754 closed by berker.peksag #31757: Tutorial: Fibonacci numbers start with 1, 1 https://bugs.python.org/issue31757 closed by rhettinger #31760: Re-definition of _POSIX_C_SOURCE with Fedora 26. https://bugs.python.org/issue31760 closed by martin.panter #31763: Add NOTICE level to the logging module https://bugs.python.org/issue31763 closed by rhettinger #31765: BUG: System deadlocks performing big loop operations in python https://bugs.python.org/issue31765 closed by terry.reedy #31776: Missing "raise from None" in /Lib/xml/etree/ElementPath.py https://bugs.python.org/issue31776 closed by serhiy.storchaka #31780: Using format spec ',x' displays incorrect error message https://bugs.python.org/issue31780 closed by terry.reedy #31785: Move instruction code from ceval.c to a separate file https://bugs.python.org/issue31785 closed by rhettinger #31786: In select.poll.poll() ms can be 0 if timeout < 0 https://bugs.python.org/issue31786 closed by serhiy.storchaka #31788: Typo in comments Modules/_ssl.c https://bugs.python.org/issue31788 closed by benjamin.peterson #31792: test_buffer altered the execution environment https://bugs.python.org/issue31792 closed by serhiy.storchaka #31795: Slicings documentation doesn't mention Ellipsis https://bugs.python.org/issue31795 closed by rhettinger #31796: The mccabe complexity output in module flake8. https://bugs.python.org/issue31796 closed by ned.deily #31797: Python 3.6.3: JSON loop fails using elif https://bugs.python.org/issue31797 closed by eric.smith #31799: Improve __spec__ discoverability https://bugs.python.org/issue31799 closed by barry #31806: Use _PyTime_ROUND_TIMEOUT in _threadmodule.c, timemodule.c and https://bugs.python.org/issue31806 closed by serhiy.storchaka #31816: Unexpected behaviour of `dir()` after implementation of __dir_ https://bugs.python.org/issue31816 closed by christian.heimes #31819: Add sock_recv_into to AbstractEventLoop https://bugs.python.org/issue31819 closed by yselivanov #31820: Calling email.message.set_payload twice produces an invalid em https://bugs.python.org/issue31820 closed by Zirak Ertan #31825: bytes decode raises OverflowError desipte errors='ignore' https://bugs.python.org/issue31825 closed by serhiy.storchaka From brett at python.org Fri Oct 20 13:23:07 2017 From: brett at python.org (Brett Cannon) Date: Fri, 20 Oct 2017 17:23:07 +0000 Subject: [Python-Dev] Investigating time for `import requests` In-Reply-To: <0B92C2DA-8073-4285-8630-6396587A4539@mac.com> References: <641723C8-AE9E-44F1-94AE-68A716980318@python.org> <302A53D9-D6F1-4872-ACF2-522889412FA1@mac.com> <0B92C2DA-8073-4285-8630-6396587A4539@mac.com> Message-ID: On Fri, 20 Oct 2017 at 06:42 Ronald Oussoren wrote: > Op 10 okt. 2017 om 01:48 heeft Brett Cannon het > volgende geschreven: > > > > On Mon, Oct 2, 2017, 17:49 Ronald Oussoren, > wrote: > >> Op 3 okt. 2017 om 04:29 heeft Barry Warsaw het >> volgende geschreven: >> >> > On Oct 2, 2017, at 14:56, Brett Cannon wrote: >> > >> >> So Mercurial specifically is an odd duck because they already do lazy >> importing (in fact they are using the lazy loading support from importlib). 
>> In terms of all of this discussion of tweaking import to be lazy, I think >> the best approach would be providing an opt-in solution that CLI tools can >> turn on ASAP while the default stays eager. That way everyone gets what >> they want while the stdlib provides a shared solution that's maintained >> alongside import itself to make sure it functions appropriately. >> > >> > The problem I think is that to get full benefit of lazy loading, it has >> to be turned on globally for bare ?import? statements. A typical >> application has tons of dependencies and all those libraries are also doing >> module global imports, so unless lazy loading somehow covers them, it?ll be >> an incomplete gain. But of course it?ll take forever for all your >> dependencies to use whatever new API we come up with, and if it?s not as >> convenient to write as ?import foo? then I suspect it won?t much catch on >> anyway. >> > >> >> One thing to keep in mind is that imports can have important >> side-effects. Turning every import statement into a lazy import will not be >> backward compatible. >> > > Yep, and that's a lesson Mercurial shared with me at PyCon US this year. > My planned approach has a blacklist for modules to only load eagerly. > > > I?m not sure if i understand. Do you want to turn on lazy loading for the > stdlib only (with a blacklist for modules that won?t work that way), or > generally? > Generally, but provide out of the box a blacklist that already contains troublesome modules in the stdlib. > In the latter case this would still not be backward compatible. > Correct, which is why it would be opt-in and projects would need to be diligent in keeping up their blacklist. -------------- next part -------------- An HTML attachment was scrubbed... URL: From francismb at email.de Sat Oct 21 07:39:38 2017 From: francismb at email.de (francismb) Date: Sat, 21 Oct 2017 13:39:38 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> Message-ID: <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> Hi Victor, On 10/18/2017 01:14 AM, Victor Stinner wrote: > I updated my PEP 564 to add time.process_time_ns(): > https://github.com/python/peps/blob/master/pep-0564.rst > > The HTML version should be updated shortly: > https://www.python.org/dev/peps/pep-0564/ ** In practive, the resolution of 1 nanosecond ** ** no need for resolution better than 1 nanosecond in practive in the Python standard library.** practice vs practice If I understood you correctly on Python-ideas (here just for the records, otherwise please ignore it): why not something like (please change '_in' for what you like): time.time_in(precision) time.monotonic_in(precision) where precision is an enumeration for: 'seconds', 'milliseconds' 'microseconds'... (or 's', 'ms', 'us', 'ns', ...) Thanks, --francis From guido at python.org Sat Oct 21 11:45:37 2017 From: guido at python.org (Guido van Rossum) Date: Sat, 21 Oct 2017 08:45:37 -0700 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> Message-ID: That sounds like unnecessary generality, and also suggests that the API might support precisions way beyond what is realistic. 
On Sat, Oct 21, 2017 at 4:39 AM, francismb wrote: > Hi Victor, > > On 10/18/2017 01:14 AM, Victor Stinner wrote: > > I updated my PEP 564 to add time.process_time_ns(): > > https://github.com/python/peps/blob/master/pep-0564.rst > > > > The HTML version should be updated shortly: > > https://www.python.org/dev/peps/pep-0564/ > > ** In practive, the resolution of 1 nanosecond ** > > ** no need for resolution better than 1 nanosecond in practive in the > Python standard library.** > > practice vs practice > > > > If I understood you correctly on Python-ideas (here just for the > records, otherwise please ignore it): > > why not something like (please change '_in' for what you like): > > time.time_in(precision) > time.monotonic_in(precision) > > > where precision is an enumeration for: 'seconds', 'milliseconds' > 'microseconds'... (or 's', 'ms', 'us', 'ns', ...) > > > Thanks, > --francis > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From francismb at email.de Sat Oct 21 14:23:55 2017 From: francismb at email.de (francismb) Date: Sat, 21 Oct 2017 20:23:55 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> Message-ID: If it sounds as there is no need or is unnecessary to you then it its ok :-), thank you for the feedback ! I'm just curious on: On 10/21/2017 05:45 PM, Guido van Rossum wrote: > That sounds like unnecessary generality, Meaning that the selection of precision on running time 'costs'? I understand that one can just multiply/divide the nanoseconds returned, (or it could be a factory) but wouldn't it help for future enhancements to reduce the number of functions (the 'pico' question)? > and also suggests that the API > might support precisions way beyond what is realistic. Doesn't that depends on the offered/supported enums (in that case down to 'ns' as Victor proposed) ? Thanks, --francis From victor.stinner at gmail.com Sat Oct 21 19:32:32 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 22 Oct 2017 01:32:32 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> Message-ID: Le 21 oct. 2017 20:31, "francismb" a ?crit : I understand that one can just multiply/divide the nanoseconds returned, (or it could be a factory) but wouldn't it help for future enhancements to reduce the number of functions (the 'pico' question)? If you are me to predict the future, I predict that CPU frequency will be stuck below 10 GHz for the next 10 years :-) Did you hear that the Moore law is no more true since 2012 (Intel said since 2015)? Since 2002, CPUs frequency are blocked around 3 GHz. Overclock records are around 8 GHz with very specialized hardware, not usable for a classical PC. I don't want to overengineer an API "just in case". Let's provide nanoseconds. We can discuss picoseconds later, maybe in 10 years? 
You can now start to bet if decimal128 will come before or after picoseconds in mainstream CPUs :-) By the way, we are talking about a resolution of 1 ns, but remember that a Python function call is closer to 50 ns. I am not sure that picosecond makes sense if CPU doesn't become much faster. I am too shy to put such predictions in a very offical PEP ;-) Victor -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Oct 21 23:01:40 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 22 Oct 2017 13:01:40 +1000 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> Message-ID: On 22 October 2017 at 09:32, Victor Stinner wrote: > Le 21 oct. 2017 20:31, "francismb" a ?crit : > > I understand that one can just multiply/divide the nanoseconds returned, > (or it could be a factory) but wouldn't it help for future enhancements > to reduce the number of functions (the 'pico' question)? > > > If you are me to predict the future, I predict that CPU frequency will be > stuck below 10 GHz for the next 10 years :-) > There are actually solid physical reasons for that prediction likely being true. Aside from the power consumption, heat dissipation, and EM radiation issues that arise with higher switching frequencies, you also start running into more problems with digital circuit metastability ([1], [2]): the more clock edges you have per second, the higher the chances of an asynchronous input changing state at a bad time. So yeah, for nanosecond resolution to not be good enough for programs running in Python, we're going to be talking about some genuinely fundamental changes in the nature of computing hardware, and it's currently unclear if or how established programming languages will make that jump (see [3] for a gentle introduction to the current state of practical quantum computing). At that point, picoseconds vs nanoseconds is likely to be the least of our conceptual modeling challenges :) Cheers, Nick. [1] https://en.wikipedia.org/wiki/Metastability_in_electronics [2] https://electronics.stackexchange.com/questions/14816/what-is-metastability [3] https://medium.com/@decodoku/how-to-program-a-quantum-computer-982a9329ed02 -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sun Oct 22 05:40:04 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 22 Oct 2017 11:40:04 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution References: Message-ID: <20171022114004.45b2d89b@fsol> Hi Victor, I made some small fixes to the PEP. As far as I'm concerned, the PEP is ok and should be approved :-) Regards Antoine. On Mon, 16 Oct 2017 12:42:30 +0200 Victor Stinner wrote: > Hi, > > While discussions on this PEP are not over on python-ideas, I proposed > this PEP directly on python-dev since I consider that my PEP already > summarizes current and past proposed alternatives. > > python-ideas threads: > > * Add time.time_ns(): system clock with nanosecond resolution > * Why not picoseconds? 
> > The PEP 564 will be shortly online at: > https://www.python.org/dev/peps/pep-0564/ > > Victor > > > PEP: 564 > Title: Add new time functions with nanosecond resolution > Version: $Revision$ > Last-Modified: $Date$ > Author: Victor Stinner > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 16-October-2017 > Python-Version: 3.7 > > > Abstract > ======== > > Add five new functions to the ``time`` module: ``time_ns()``, > ``perf_counter_ns()``, ``monotonic_ns()``, ``clock_gettime_ns()`` and > ``clock_settime_ns()``. They are similar to the function without the > ``_ns`` suffix, but have nanosecond resolution: use a number of > nanoseconds as a Python int. > > The best ``time.time_ns()`` resolution measured in Python is 3 times > better then ``time.time()`` resolution on Linux and Windows. > > > Rationale > ========= > > Float type limited to 104 days > ------------------------------ > > The clocks resolution of desktop and latop computers is getting closer > to nanosecond resolution. More and more clocks have a frequency in MHz, > up to GHz for the CPU TSC clock. > > The Python ``time.time()`` function returns the current time as a > floatting point number which is usually a 64-bit binary floatting number > (in the IEEE 754 format). > > The problem is that the float type starts to lose nanoseconds after 104 > days. Conversion from nanoseconds (``int``) to seconds (``float``) and > then back to nanoseconds (``int``) to check if conversions lose > precision:: > > # no precision loss > >>> x = 2 ** 52 + 1; int(float(x * 1e-9) * 1e9) - x > 0 > # precision loss! (1 nanosecond) > >>> x = 2 ** 53 + 1; int(float(x * 1e-9) * 1e9) - x > -1 > >>> print(datetime.timedelta(seconds=2 ** 53 / 1e9)) > 104 days, 5:59:59.254741 > > ``time.time()`` returns seconds elapsed since the UNIX epoch: January > 1st, 1970. This function loses precision since May 1970 (47 years ago):: > > >>> import datetime > >>> unix_epoch = datetime.datetime(1970, 1, 1) > >>> print(unix_epoch + datetime.timedelta(seconds=2**53 / 1e9)) > 1970-04-15 05:59:59.254741 > > > Previous rejected PEP > --------------------- > > Five years ago, the PEP 410 proposed a large and complex change in all > Python functions returning time to support nanosecond resolution using > the ``decimal.Decimal`` type. > > The PEP was rejected for different reasons: > > * The idea of adding a new optional parameter to change the result type > was rejected. It's an uncommon (and bad?) programming practice in > Python. > > * It was not clear if hardware clocks really had a resolution of 1 > nanosecond, especially at the Python level. > > * The ``decimal.Decimal`` type is uncommon in Python and so requires > to adapt code to handle it. > > > CPython enhancements of the last 5 years > ---------------------------------------- > > Since the PEP 410 was rejected: > > * The ``os.stat_result`` structure got 3 new fields for timestamps as > nanoseconds (Python ``int``): ``st_atime_ns``, ``st_ctime_ns`` > and ``st_mtime_ns``. > > * The PEP 418 was accepted, Python 3.3 got 3 new clocks: > ``time.monotonic()``, ``time.perf_counter()`` and > ``time.process_time()``. > > * The CPython private "pytime" C API handling time now uses a new > ``_PyTime_t`` type: simple 64-bit signed integer (C ``int64_t``). > The ``_PyTime_t`` unit is an implementation detail and not part of the > API. The unit is currently ``1 nanosecond``. 
> > Existing Python APIs using nanoseconds as int > --------------------------------------------- > > The ``os.stat_result`` structure has 3 fields for timestamps as > nanoseconds (``int``): ``st_atime_ns``, ``st_ctime_ns`` and > ``st_mtime_ns``. > > The ``ns`` parameter of the ``os.utime()`` function accepts a > ``(atime_ns: int, mtime_ns: int)`` tuple: nanoseconds. > > > Changes > ======= > > New functions > ------------- > > This PEP adds five new functions to the ``time`` module: > > * ``time.clock_gettime_ns(clock_id)`` > * ``time.clock_settime_ns(clock_id, time: int)`` > * ``time.perf_counter_ns()`` > * ``time.monotonic_ns()`` > * ``time.time_ns()`` > > These functions are similar to the version without the ``_ns`` suffix, > but use nanoseconds as Python ``int``. > > For example, ``time.monotonic_ns() == int(time.monotonic() * 1e9)`` if > ``monotonic()`` value is small enough to not lose precision. > > Unchanged functions > ------------------- > > This PEP only proposed to add new functions getting or setting clocks > with nanosecond resolution. Clocks are likely to lose precision, > especially when their reference is the UNIX epoch. > > Python has other functions handling time (get time, timeout, etc.), but > no nanosecond variant is proposed for them since they are less likely to > lose precision. > > Example of unchanged functions: > > * ``os`` module: ``sched_rr_get_interval()``, ``times()``, ``wait3()`` > and ``wait4()`` > > * ``resource`` module: ``ru_utime`` and ``ru_stime`` fields of > ``getrusage()`` > > * ``signal`` module: ``getitimer()``, ``setitimer()`` > > * ``time`` module: ``clock_getres()`` > > Since the ``time.clock()`` function was deprecated in Python 3.3, no > ``time.clock_ns()`` is added. > > > Alternatives and discussion > =========================== > > Sub-nanosecond resolution > ------------------------- > > ``time.time_ns()`` API is not "future-proof": if clocks resolutions > increase, new Python functions may be needed. > > In practive, the resolution of 1 nanosecond is currently enough for all > structures used by all operating systems functions. > > Hardware clock with a resolution better than 1 nanosecond already > exists. For example, the frequency of a CPU TSC clock is the CPU base > frequency: the resolution is around 0.3 ns for a CPU running at 3 > GHz. Users who have access to such hardware and really need > sub-nanosecond resolution can easyly extend Python for their needs. > Such rare use case don't justify to design the Python standard library > to support sub-nanosecond resolution. > > For the CPython implementation, nanosecond resolution is convenient: the > standard and well supported ``int64_t`` type can be used to store time. > It supports a time delta between -292 years and 292 years. Using the > UNIX epoch as reference, this type supports time since year 1677 to year > 2262:: > > >>> 1970 - 2 ** 63 / (10 ** 9 * 3600 * 24 * 365.25) > 1677.728976954687 > >>> 1970 + 2 ** 63 / (10 ** 9 * 3600 * 24 * 365.25) > 2262.271023045313 > > Different types > --------------- > > It was proposed to modify ``time.time()`` to use float type with better > precision. The PEP 410 proposed to use ``decimal.Decimal``, but it was > rejected. Apart ``decimal.Decimal``, no portable ``float`` type with > better precision is currently available in Python. Changing the builtin > Python ``float`` type is out of the scope of this PEP. 
> > Other ideas of new types were proposed to support larger or arbitrary > precision: fractions, structures or 2-tuple using integers, > fixed-precision floating point number, etc. > > See also the PEP 410 for a previous long discussion on other types. > > Adding a new type requires more effort to support it, than reusing > ``int``. The standard library, third party code and applications would > have to be modified to support it. > > The Python ``int`` type is well known, well supported, ease to > manipulate, and supports all arithmetic operations like: > ``dt = t2 - t1``. > > Moreover, using nanoseconds as integer is not new in Python, it's > already used for ``os.stat_result`` and > ``os.utime(ns=(atime_ns, mtime_ns))``. > > .. note:: > If the Python ``float`` type becomes larger (ex: decimal128 or > float128), the ``time.time()`` precision will increase as well. > > Different API > ------------- > > The ``time.time(ns=False)`` API was proposed to avoid adding new > functions. It's an uncommon (and bad?) programming practice in Python to > change the result type depending on a parameter. > > Different options were proposed to allow the user to choose the time > resolution. If each Python module uses a different resolution, it can > become difficult to handle different resolutions, instead of just > seconds (``time.time()`` returning ``float``) and nanoseconds > (``time.time_ns()`` returning ``int``). Moreover, as written above, > there is no need for resolution better than 1 nanosecond in practive in > the Python standard library. > > > Annex: Clocks Resolution in Python > ================================== > > Script ot measure the smallest difference between two ``time.time()`` and > ``time.time_ns()`` reads ignoring differences of zero:: > > import math > import time > > LOOPS = 10 ** 6 > > print("time.time_ns(): %s" % time.time_ns()) > print("time.time(): %s" % time.time()) > > min_dt = [abs(time.time_ns() - time.time_ns()) > for _ in range(LOOPS)] > min_dt = min(filter(bool, min_dt)) > print("min time_ns() delta: %s ns" % min_dt) > > min_dt = [abs(time.time() - time.time()) > for _ in range(LOOPS)] > min_dt = min(filter(bool, min_dt)) > print("min time() delta: %s ns" % math.ceil(min_dt * 1e9)) > > Results of time(), perf_counter() and monotonic(). > > Linux (kernel 4.12 on Fedora 26): > > * time_ns(): **84 ns** > * time(): **239 ns** > * perf_counter_ns(): 84 ns > * perf_counter(): 82 ns > * monotonic_ns(): 84 ns > * monotonic(): 81 ns > > Windows 8.1: > > * time_ns(): **318000 ns** > * time(): **894070 ns** > * perf_counter_ns(): 100 ns > * perf_counter(): 100 ns > * monotonic_ns(): 15000000 ns > * monotonic(): 15000000 ns > > The difference on ``time.time()`` is significant: **84 ns (2.8x better) > vs 239 ns on Linux and 318 us (2.8x better) vs 894 us on Windows**. The > difference (presion loss) will be larger next years since every day adds > 864,00,000,000,000 nanoseconds to the system clock. > > The difference on ``time.perf_counter()`` and ``time.monotonic clock()`` > is not visible in this quick script since the script runs less than 1 > minute, and the uptime of the computer used to run the script was > smaller than 1 week. A significant difference should be seen with an > uptime of 104 days or greater. > > .. note:: > Internally, Python starts ``monotonic()`` and ``perf_counter()`` > clocks at zero on some platforms which indirectly reduce the > precision loss. > > > > Copyright > ========= > > This document has been placed in the public domain. 
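As a small self-contained illustration of the precision-loss argument made in the PEP: the nanosecond value below is made up, and time.time_ns() is the function the PEP proposes rather than something available in a released Python yet.

    import time

    ns = 1508716923123456789   # a hypothetical instant in October 2017, in nanoseconds

    # Routing it through a float, which is what time.time() returns, silently
    # loses the trailing nanosecond digits (float64 keeps roughly 15-16
    # significant decimal digits):
    seconds = ns / 1e9
    assert int(seconds * 1e9) != ns

    # Today's API:
    print(time.time())          # float, e.g. 1508716923.1234567

    # With PEP 564 the same clock could be read without rounding:
    #   t0 = time.time_ns()     # int, e.g. 1508716923123456789
    #   elapsed_ns = time.time_ns() - t0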
From wes.turner at gmail.com Sun Oct 22 11:06:46 2017 From: wes.turner at gmail.com (Wes Turner) Date: Sun, 22 Oct 2017 11:06:46 -0400 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> Message-ID: On Saturday, October 21, 2017, Nick Coghlan wrote: > On 22 October 2017 at 09:32, Victor Stinner > wrote: > >> Le 21 oct. 2017 20:31, "francismb" > > a ?crit : >> >> I understand that one can just multiply/divide the nanoseconds returned, >> (or it could be a factory) but wouldn't it help for future enhancements >> to reduce the number of functions (the 'pico' question)? >> >> >> If you are me to predict the future, I predict that CPU frequency will be >> stuck below 10 GHz for the next 10 years :-) >> > > There are actually solid physical reasons for that prediction likely being > true. Aside from the power consumption, heat dissipation, and EM radiation > issues that arise with higher switching frequencies, you also start running > into more problems with digital circuit metastability ([1], [2]): the more > clock edges you have per second, the higher the chances of an asynchronous > input changing state at a bad time. > > So yeah, for nanosecond resolution to not be good enough for programs > running in Python, we're going to be talking about some genuinely > fundamental changes in the nature of computing hardware, and it's currently > unclear if or how established programming languages will make that jump > (see [3] for a gentle introduction to the current state of practical > quantum computing). At that point, picoseconds vs nanoseconds is likely to > be the least of our conceptual modeling challenges :) > There are current applications with greater-than nanosecond precision: - relativity experiments - particle experiments Must they always use their own implementations of time., datetime. __init__, fromordinal, fromtimestamp ?! - https://scholar.google.com/scholar?q=femtosecond - https://scholar.google.com/scholar?q=attosecond - GPS now supports nanosecond resolution - https://en.wikipedia.org/wiki/Quantum_clock#More_accurate_experimental_clocks > In 2015 JILA evaluated the absolute frequency uncertainty of their latest strontium-87 optical lattice clock at 2.1 ? 10?18, which corresponds to a measurable gravitational time dilation for an elevation change of 2 cm (0.79 in) What about bus latency (and variance)? From https://www.nist.gov/publications/optical-two-way-time-and-frequency-transfer-over-free-space : > Optical two-way time and frequency transfer over free space > Abstract > The transfer of high-quality time-frequency signals between remote locations underpins many applications, including precision navigation and timing, clock-based geodesy, long-baseline interferometry, coherent radar arrays, tests of general relativity and fundamental constants, and future redefinition of the second. However, present microwave-based time-frequency transfer is inadequate for state-of-the-art optical clocks and oscillators that have femtosecond-level timing jitter and accuracies below 1 ? 10?17. Commensurate optically based transfer methods are therefore needed. Here we demonstrate optical time-frequency transfer over free space via two-way exchange between coherent frequency combs, each phase-locked to the local optical oscillator. We achieve 1 fs timing deviation, residual instability below 1 ? 10?18 at 1,000 s and systematic offsets below 4 ? 
10?19, despite frequent signal fading due to atmospheric turbulence or obstructions across the 2 km link. This free-space transfer can enable terrestrial links to support clock-based geodesy. Combined with satellite-based optical communications, it provides a path towards global-scale geodesy, high-accuracy time-frequency distribution and satellite-based relativity experiments. How much wider must an epoch-relative time struct be for various realistic time precisions/accuracies? 10-6 micro ? 10-9 nano n -- int64 10-12 pico p 10-15 femto f 10-18 atto a 10-21 zepto z 10-24 yocto y I'm at a loss to recommend a library to prefix these with the epoch; but future compatibility may be a helpful, realistic objective. Natural keys with such time resolution are still unfortunately likely to collide. > > Cheers, > Nick. > > [1] https://en.wikipedia.org/wiki/Metastability_in_electronics > [2] https://electronics.stackexchange.com/questions/ > 14816/what-is-metastability > [3] https://medium.com/@decodoku/how-to-program-a-quantum- > computer-982a9329ed02 > > > -- > Nick Coghlan | ncoghlan at gmail.com > | Brisbane, > Australia > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sun Oct 22 11:23:44 2017 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 23 Oct 2017 02:23:44 +1100 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> Message-ID: On Mon, Oct 23, 2017 at 2:06 AM, Wes Turner wrote: > What about bus latency (and variance)? I'm currently in Los Angeles. Bus latency is measured in minutes, and may easily exceed sixty of them. :| Seriously though: For applications requiring accurate representation of relativistic effects, the stdlib datetime module has a good few problems besides lacking sub-nanosecond precision. I'd be inclined to YAGNI this away unless/until some third-party module demonstrates that there's actually a use for a datetime module that can handle all that. ChrisA From ncoghlan at gmail.com Sun Oct 22 11:36:41 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 23 Oct 2017 01:36:41 +1000 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> Message-ID: On 23 October 2017 at 01:06, Wes Turner wrote: > On Saturday, October 21, 2017, Nick Coghlan wrote: > >> So yeah, for nanosecond resolution to not be good enough for programs >> running in Python, we're going to be talking about some genuinely >> fundamental changes in the nature of computing hardware, and it's currently >> unclear if or how established programming languages will make that jump >> (see [3] for a gentle introduction to the current state of practical >> quantum computing). At that point, picoseconds vs nanoseconds is likely to >> be the least of our conceptual modeling challenges :) >> > There are current applications with greater-than nanosecond precision: > > - relativity experiments > - particle experiments > > Must they always use their own implementations of time., datetime. > __init__, fromordinal, fromtimestamp ?! 
> Yes, as time is a critical part of their experimental setup - when you're operating at relativistic speeds and the kinds of energy levels that particle accelerators hit, it's a bad idea to assume that regular time libraries that assume Newtonian physics applies are going to be up to the task. Normal software assumes a nanosecond is almost no time at all - in high energy particle physics, a nanosecond is enough time for light to travel 30 centimetres, and a high energy particle that stuck around that long before decaying into a lower energy state would be classified as "long lived". Cheers. Nick. P.S. "Don't take code out of the environment it was designed for and assume it will just keep working normally" is one of the main lessons folks learned from the destruction of the first Ariane 5 launch rocket in 1996 (see the first paragraph in https://en.wikipedia.org/wiki/Ariane_5#Notable_launches ) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Sun Oct 22 13:30:49 2017 From: mertz at gnosis.cx (David Mertz) Date: Sun, 22 Oct 2017 10:30:49 -0700 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> Message-ID: I worked at a molecular dynamics lab for a number of years. I advocated switching all our code to using attosecond units (rather than fractional picoseconds). However, this had nothing whatsoever to do with the machine clock speeds, but only with the physical quantities represented and the scaling/rounding math. It didn't happen, for various reasons. But if it had, I certainly wouldn't have expected standard library support for this. The 'time' module is about wall clock out calendar time, not about *simulation time*. FWIW, a very long simulation might cover a millisecond of simulated time.... we're a very long way from looking at molecular behavior over 104 days. On Oct 22, 2017 8:10 AM, "Wes Turner" wrote: On Saturday, October 21, 2017, Nick Coghlan wrote: > On 22 October 2017 at 09:32, Victor Stinner > wrote: > >> Le 21 oct. 2017 20:31, "francismb" a ?crit : >> >> I understand that one can just multiply/divide the nanoseconds returned, >> (or it could be a factory) but wouldn't it help for future enhancements >> to reduce the number of functions (the 'pico' question)? >> >> >> If you are me to predict the future, I predict that CPU frequency will be >> stuck below 10 GHz for the next 10 years :-) >> > > There are actually solid physical reasons for that prediction likely being > true. Aside from the power consumption, heat dissipation, and EM radiation > issues that arise with higher switching frequencies, you also start running > into more problems with digital circuit metastability ([1], [2]): the more > clock edges you have per second, the higher the chances of an asynchronous > input changing state at a bad time. > > So yeah, for nanosecond resolution to not be good enough for programs > running in Python, we're going to be talking about some genuinely > fundamental changes in the nature of computing hardware, and it's currently > unclear if or how established programming languages will make that jump > (see [3] for a gentle introduction to the current state of practical > quantum computing). 
At that point, picoseconds vs nanoseconds is likely to > be the least of our conceptual modeling challenges :) > There are current applications with greater-than nanosecond precision: - relativity experiments - particle experiments Must they always use their own implementations of time., datetime. __init__, fromordinal, fromtimestamp ?! - https://scholar.google.com/scholar?q=femtosecond - https://scholar.google.com/scholar?q=attosecond - GPS now supports nanosecond resolution - https://en.wikipedia.org/wiki/Quantum_clock#More_accurate_ experimental_clocks > In 2015 JILA evaluated the absolute frequency uncertainty of their latest strontium-87 optical lattice clock at 2.1 ? 10?18, which corresponds to a measurable gravitational time dilation for an elevation change of 2 cm (0.79 in) What about bus latency (and variance)? >From https://www.nist.gov/publications/optical-two-way- time-and-frequency-transfer-over-free-space : > Optical two-way time and frequency transfer over free space > Abstract > The transfer of high-quality time-frequency signals between remote locations underpins many applications, including precision navigation and timing, clock-based geodesy, long-baseline interferometry, coherent radar arrays, tests of general relativity and fundamental constants, and future redefinition of the second. However, present microwave-based time-frequency transfer is inadequate for state-of-the-art optical clocks and oscillators that have femtosecond-level timing jitter and accuracies below 1 ? 10?17. Commensurate optically based transfer methods are therefore needed. Here we demonstrate optical time-frequency transfer over free space via two-way exchange between coherent frequency combs, each phase-locked to the local optical oscillator. We achieve 1 fs timing deviation, residual instability below 1 ? 10?18 at 1,000 s and systematic offsets below 4 ? 10?19, despite frequent signal fading due to atmospheric turbulence or obstructions across the 2 km link. This free-space transfer can enable terrestrial links to support clock-based geodesy. Combined with satellite-based optical communications, it provides a path towards global-scale geodesy, high-accuracy time-frequency distribution and satellite-based relativity experiments. How much wider must an epoch-relative time struct be for various realistic time precisions/accuracies? 10-6 micro ? 10-9 nano n -- int64 10-12 pico p 10-15 femto f 10-18 atto a 10-21 zepto z 10-24 yocto y I'm at a loss to recommend a library to prefix these with the epoch; but future compatibility may be a helpful, realistic objective. Natural keys with such time resolution are still unfortunately likely to collide. > > Cheers, > Nick. > > [1] https://en.wikipedia.org/wiki/Metastability_in_electronics > [2] https://electronics.stackexchange.com/questions/14816/what- > is-metastability > [3] https://medium.com/@decodoku/how-to-program-a-quantum-comput > er-982a9329ed02 > > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/ mertz%40gnosis.cx -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wes.turner at gmail.com Sun Oct 22 16:42:40 2017 From: wes.turner at gmail.com (Wes Turner) Date: Sun, 22 Oct 2017 16:42:40 -0400 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> Message-ID: On Sunday, October 22, 2017, David Mertz wrote: > I worked at a molecular dynamics lab for a number of years. I advocated > switching all our code to using attosecond units (rather than fractional > picoseconds). > > However, this had nothing whatsoever to do with the machine clock speeds, > but only with the physical quantities represented and the scaling/rounding > math. > > It didn't happen, for various reasons. But if it had, I certainly wouldn't > have expected standard library support for this. The 'time' module is about > wall clock out calendar time, not about *simulation time*. > > FWIW, a very long simulation might cover a millisecond of simulated > time.... we're a very long way from looking at molecular behavior over 104 > days. > Maybe that's why we haven't found any CTCs (closed timelike curves) yet. Aligning simulation data in context to other events may be enlightening: is there a good library for handing high precision time units in Python (and/or CFFI)? ... http://opendata.cern.ch/ http://opendata.cern.ch/getting-started/CMS > > > On Oct 22, 2017 8:10 AM, "Wes Turner" > wrote: > > > > On Saturday, October 21, 2017, Nick Coghlan > wrote: > >> On 22 October 2017 at 09:32, Victor Stinner >> wrote: >> >>> Le 21 oct. 2017 20:31, "francismb" a ?crit : >>> >>> I understand that one can just multiply/divide the nanoseconds returned, >>> (or it could be a factory) but wouldn't it help for future enhancements >>> to reduce the number of functions (the 'pico' question)? >>> >>> >>> If you are me to predict the future, I predict that CPU frequency will >>> be stuck below 10 GHz for the next 10 years :-) >>> >> >> There are actually solid physical reasons for that prediction likely >> being true. Aside from the power consumption, heat dissipation, and EM >> radiation issues that arise with higher switching frequencies, you also >> start running into more problems with digital circuit metastability ([1], >> [2]): the more clock edges you have per second, the higher the chances of >> an asynchronous input changing state at a bad time. >> >> So yeah, for nanosecond resolution to not be good enough for programs >> running in Python, we're going to be talking about some genuinely >> fundamental changes in the nature of computing hardware, and it's currently >> unclear if or how established programming languages will make that jump >> (see [3] for a gentle introduction to the current state of practical >> quantum computing). At that point, picoseconds vs nanoseconds is likely to >> be the least of our conceptual modeling challenges :) >> > > There are current applications with greater-than nanosecond precision: > > - relativity experiments > - particle experiments > > Must they always use their own implementations of time., datetime. > __init__, fromordinal, fromtimestamp ?! > > - https://scholar.google.com/scholar?q=femtosecond > - https://scholar.google.com/scholar?q=attosecond > - GPS now supports nanosecond resolution > - > > https://en.wikipedia.org/wiki/Quantum_clock#More_accurate_ex > perimental_clocks > > > In 2015 JILA evaluated the > absolute frequency uncertainty of their latest strontium-87 > optical lattice > clock at 2.1 ? 
10?18, which corresponds to a measurable gravitational > time dilation > for an > elevation change of 2 cm (0.79 in) > > What about bus latency (and variance)? > > From https://www.nist.gov/publications/optical-two-way-time-and- > frequency-transfer-over-free-space : > > > Optical two-way time and frequency transfer over free space > > Abstract > > The transfer of high-quality time-frequency signals between remote > locations underpins many applications, including precision navigation and > timing, clock-based geodesy, long-baseline interferometry, coherent radar > arrays, tests of general relativity and fundamental constants, and future > redefinition of the second. However, present microwave-based time-frequency > transfer is inadequate for state-of-the-art optical clocks and oscillators > that have femtosecond-level timing jitter and accuracies below 1 ? 10?17. > Commensurate optically based transfer methods are therefore needed. Here we > demonstrate optical time-frequency transfer over free space via two-way > exchange between coherent frequency combs, each phase-locked to the local > optical oscillator. We achieve 1 fs timing deviation, residual instability > below 1 ? 10?18 at 1,000 s and systematic offsets below 4 ? 10?19, > despite frequent signal fading due to atmospheric turbulence or > obstructions across the 2 km link. This free-space transfer can enable > terrestrial links to support clock-based geodesy. Combined with > satellite-based optical communications, it provides a path towards > global-scale geodesy, high-accuracy time-frequency distribution and > satellite-based relativity experiments. > > How much wider must an epoch-relative time struct be for various realistic > time precisions/accuracies? > > 10-6 micro ? > 10-9 nano n -- int64 > 10-12 pico p > 10-15 femto f > 10-18 atto a > 10-21 zepto z > 10-24 yocto y > > I'm at a loss to recommend a library to prefix these with the epoch; but > future compatibility may be a helpful, realistic objective. > > Natural keys with such time resolution are still unfortunately likely to > collide. > > >> >> Cheers, >> Nick. >> >> [1] https://en.wikipedia.org/wiki/Metastability_in_electronics >> [2] https://electronics.stackexchange.com/questions/14816/what-i >> s-metastability >> [3] https://medium.com/@decodoku/how-to-program-a-quantum-comput >> er-982a9329ed02 >> >> >> -- >> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/mertz% > 40gnosis.cx > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Sun Oct 22 19:54:37 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 23 Oct 2017 01:54:37 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> Message-ID: Le 22 oct. 2017 17:06, "Wes Turner" a ?crit : Must they always use their own implementations of time., datetime. __init__, fromordinal, fromtimestamp ?! Yes, exactly. Note: Adding resolution better than 1 us to datetime is not in the scope of the PEP but there is an issue, open since a long time. I don't think that time.time_ns() is usable for such experiment. Again, calling a function is Python takes around 50 ns. 
Victor -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at holdenweb.com Mon Oct 23 12:55:45 2017 From: steve at holdenweb.com (Steve Holden) Date: Mon, 23 Oct 2017 17:55:45 +0100 Subject: [Python-Dev] [OT] Early PyCon Pictures, anyone? Message-ID: Hi all, I've giving a talk on the history of PyCon at PyCon UK this weekend. I'd love to include some photos from the early conferences but alas most of the links I've found on the web are stale and broken. If anyone has pictures, or valid links to such pictures, I'd be delighted to hear about them. Thanks Steve Holden -------------- next part -------------- An HTML attachment was scrubbed... URL: From python-dev at mgmiller.net Mon Oct 23 18:03:09 2017 From: python-dev at mgmiller.net (Mike Miller) Date: Mon, 23 Oct 2017 15:03:09 -0700 Subject: [Python-Dev] iso8601 parsing Message-ID: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> Hi, Could anyone put this five year-old bug about parsing iso8601 format date-times on the front burner? http://bugs.python.org/issue15873 In the comments there's a lot of hand-wringing about different variations that bogged it down, but right now I only need it to handle the output of datetime.isoformat(): >>> dt.isoformat() '2017-10-20T08:20:08.986166+00:00' Perhaps if we could get that minimum first step in, it could be iterated on and made more lenient in the future. Thank you, -Mike From chris.barker at noaa.gov Mon Oct 23 18:59:20 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 23 Oct 2017 15:59:20 -0700 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> Message-ID: On Sun, Oct 22, 2017 at 1:42 PM, Wes Turner wrote: > Aligning simulation data in context to other events may be enlightening: > is there a good library for handing high precision time units in Python > (and/or CFFI)? > Well, numpy's datetime64 can be set to use (almost) whatever unit you want: https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays. datetime.html#datetime-units Though it uses a single epoch, which I don't think ever made sense with femtoseconds.... And it has other problems, but it was designed that way, just for the reason. However, while there has been discussion of improvements, like making the epoch settable, none of them have happened, which makes me think that no one is using it for physics experiments, but rather plain old human calendar time... -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjol at tjol.eu Mon Oct 23 19:19:49 2017 From: tjol at tjol.eu (Thomas Jollans) Date: Tue, 24 Oct 2017 01:19:49 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> Message-ID: <0eee7a98-a470-556f-5231-5afab78c5248@tjol.eu> On 22/10/17 17:06, Wes Turner wrote: > There are current applications with greater-than nanosecond precision: > > - relativity experiments > - particle experiments > > Must they always use their own implementations of time., datetime. > __init__, fromordinal, fromtimestamp ?! 
> > - https://scholar.google.com/scholar?q=femtosecond > - https://scholar.google.com/scholar?q=attosecond > - GPS now supports nanosecond resolution > - Sure, but in these kinds of experiments you don't have a "timestamp" in the usual sense. You'll have some kind of high-precision "clock", but in most cases there's no way and no reason to synchronise this to wall time. You end up distinguishing between "macro-time" (wall time) and "micro-time" (time in the experiment relative to something) In a particle accelerator, you care about measuring relative times of almost-simultaneous detection events with extremely high precision. You'll also presumably have a timestamp for the event, but you won't be able or willing to measure that with anything like the same accuracy. While you might be able to say that you detected, say, a muon at 01:23:45.6789 at ?t=543.6ps*, you have femtosecond resolution, you have a timestamp, but you don't have a femtosecond timestamp. In ultrafast spectroscopy, we get a time resolution equal to the duration of your laser pulses (fs-ps), but all the micro-times measured will be relative to some reference laser pulse, which repeats at >MHz frequencies. We also integrate over millions of events - wall-time timestamps don't enter into it. In summary, yes, when writing software for experiments working with high time resolution you have to write your own implementations of whatever data formats best describe time as you're measuring it, which generally won't line up with time as a PC (or a railway company) looks at it. Cheers Thomas * The example is implausible not least because I understand muon chambers tend to be a fair bit bigger than 15cm, but you get my point. From hasan.diwan at gmail.com Mon Oct 23 20:33:56 2017 From: hasan.diwan at gmail.com (Hasan Diwan) Date: Mon, 23 Oct 2017 17:33:56 -0700 Subject: [Python-Dev] iso8601 parsing In-Reply-To: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> Message-ID: If one simply replaces the 'T' with a space and trims it after the '.', IIRC, it parses fine. -- H On Oct 23, 2017 15:16, "Mike Miller" wrote: > Hi, > > Could anyone put this five year-old bug about parsing iso8601 format > date-times on the front burner? > > http://bugs.python.org/issue15873 > > In the comments there's a lot of hand-wringing about different variations > that bogged it down, but right now I only need it to handle the output of > datetime.isoformat(): > > >>> dt.isoformat() > '2017-10-20T08:20:08.986166+00:00' > > Perhaps if we could get that minimum first step in, it could be iterated > on and made more lenient in the future. > > Thank you, > -Mike > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/hasan. > diwan%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wes.turner at gmail.com Mon Oct 23 22:18:42 2017 From: wes.turner at gmail.com (Wes Turner) Date: Mon, 23 Oct 2017 22:18:42 -0400 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: <0eee7a98-a470-556f-5231-5afab78c5248@tjol.eu> References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> <0eee7a98-a470-556f-5231-5afab78c5248@tjol.eu> Message-ID: On Monday, October 23, 2017, Thomas Jollans wrote: > On 22/10/17 17:06, Wes Turner wrote: > > There are current applications with greater-than nanosecond precision: > > > > - relativity experiments > > - particle experiments > > > > Must they always use their own implementations of time., datetime. > > __init__, fromordinal, fromtimestamp ?! > > > > - https://scholar.google.com/scholar?q=femtosecond > > - https://scholar.google.com/scholar?q=attosecond > > - GPS now supports nanosecond resolution > > - > > Sure, but in these kinds of experiments you don't have a "timestamp" in > the usual sense. > > You'll have some kind of high-precision "clock", but in most cases > there's no way and no reason to synchronise this to wall time. You end > up distinguishing between "macro-time" (wall time) and "micro-time" > (time in the experiment relative to something) > > In a particle accelerator, you care about measuring relative times of > almost-simultaneous detection events with extremely high precision. > You'll also presumably have a timestamp for the event, but you won't be > able or willing to measure that with anything like the same accuracy. > > While you might be able to say that you detected, say, a muon at > 01:23:45.6789 at ?t=543.6ps*, you have femtosecond resolution, you have > a timestamp, but you don't have a femtosecond timestamp. > > In ultrafast spectroscopy, we get a time resolution equal to the > duration of your laser pulses (fs-ps), but all the micro-times measured > will be relative to some reference laser pulse, which repeats at >MHz > frequencies. We also integrate over millions of events - wall-time > timestamps don't enter into it. > > In summary, yes, when writing software for experiments working with high > time resolution you have to write your own implementations of whatever > data formats best describe time as you're measuring it, which generally > won't line up with time as a PC (or a railway company) looks at it. (Sorry, maybe too OT) So these experiments are all done in isolation; referent to t=0. > Aligning simulation data in context to other events may be enlightening: IIUC, https://en.wikipedia.org/wiki/Quantum_mechanics_of_time_travel implies that there are (or may) Are potentially connections between events over greater periods of time. It's unfortunate that aligning this data requires adding offsets and working with nonstandard adhoc time structs. A problem for another day, I suppose. Thanks for adding time_ns(l. > Cheers > Thomas > > > * The example is implausible not least because I understand muon > chambers tend to be a fair bit bigger than 15cm, but you get my point. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > wes.turner%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From victor.stinner at gmail.com Tue Oct 24 03:00:45 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 24 Oct 2017 09:00:45 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: <0eee7a98-a470-556f-5231-5afab78c5248@tjol.eu> References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> <0eee7a98-a470-556f-5231-5afab78c5248@tjol.eu> Message-ID: Thanks Thomas, it was interesting! You confirmed that time.time_ns() and other system clocks exposed by Python are inappropriate for sub-nanosecond physical experiment. By the way, you mentionned that clocks are not synchronized. That's another revelant point. Even if system clocks are synchronized on a single computer, I read that you cannot reach nanosecond resolution for a NTP synchronization even in a small LAN. For large systems or distributed systems, a "global (synchronized) clock" is not an option. You cannot synchronize clocks correctly, so your algorithms must not rely on time, or at least not too precise resolution. I am saying that to again repeat that we are far from sub-second nanosecond resolution for system clock. Victor Le 24 oct. 2017 01:39, "Thomas Jollans" a ?crit : > On 22/10/17 17:06, Wes Turner wrote: > > There are current applications with greater-than nanosecond precision: > > > > - relativity experiments > > - particle experiments > > > > Must they always use their own implementations of time., datetime. > > __init__, fromordinal, fromtimestamp ?! > > > > - https://scholar.google.com/scholar?q=femtosecond > > - https://scholar.google.com/scholar?q=attosecond > > - GPS now supports nanosecond resolution > > - > > Sure, but in these kinds of experiments you don't have a "timestamp" in > the usual sense. > > You'll have some kind of high-precision "clock", but in most cases > there's no way and no reason to synchronise this to wall time. You end > up distinguishing between "macro-time" (wall time) and "micro-time" > (time in the experiment relative to something) > > In a particle accelerator, you care about measuring relative times of > almost-simultaneous detection events with extremely high precision. > You'll also presumably have a timestamp for the event, but you won't be > able or willing to measure that with anything like the same accuracy. > > While you might be able to say that you detected, say, a muon at > 01:23:45.6789 at ?t=543.6ps*, you have femtosecond resolution, you have > a timestamp, but you don't have a femtosecond timestamp. > > In ultrafast spectroscopy, we get a time resolution equal to the > duration of your laser pulses (fs-ps), but all the micro-times measured > will be relative to some reference laser pulse, which repeats at >MHz > frequencies. We also integrate over millions of events - wall-time > timestamps don't enter into it. > > In summary, yes, when writing software for experiments working with high > time resolution you have to write your own implementations of whatever > data formats best describe time as you're measuring it, which generally > won't line up with time as a PC (or a railway company) looks at it. > > Cheers > Thomas > > > * The example is implausible not least because I understand muon > chambers tend to be a fair bit bigger than 15cm, but you get my point. 
> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > victor.stinner%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Tue Oct 24 05:22:15 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 24 Oct 2017 11:22:15 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> <0eee7a98-a470-556f-5231-5afab78c5248@tjol.eu> Message-ID: <20171024112215.01b98851@fsol> On Tue, 24 Oct 2017 09:00:45 +0200 Victor Stinner wrote: > By the way, you mentionned that clocks are not synchronized. That's another > revelant point. Even if system clocks are synchronized on a single > computer, I read that you cannot reach nanosecond resolution for a NTP > synchronization even in a small LAN. > > For large systems or distributed systems, a "global (synchronized) clock" > is not an option. You cannot synchronize clocks correctly, so your > algorithms must not rely on time, or at least not too precise resolution. > > I am saying that to again repeat that we are far from sub-second nanosecond > resolution for system clock. What does synchronization have to do with it? If synchronization matters, then your PEP should be rejected, because current computers using NTP can't synchronize with a better precision than 230 ns. See https://blog.cloudflare.com/how-to-achieve-low-latency/ Regards Antoine. From wes.turner at gmail.com Tue Oct 24 06:36:11 2017 From: wes.turner at gmail.com (Wes Turner) Date: Tue, 24 Oct 2017 06:36:11 -0400 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: <20171024112215.01b98851@fsol> References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> <0eee7a98-a470-556f-5231-5afab78c5248@tjol.eu> <20171024112215.01b98851@fsol> Message-ID: On Tuesday, October 24, 2017, Antoine Pitrou > wrote: > On Tue, 24 Oct 2017 09:00:45 +0200 > Victor Stinner wrote: > > By the way, you mentionned that clocks are not synchronized. That's > another > > revelant point. Even if system clocks are synchronized on a single > > computer, I read that you cannot reach nanosecond resolution for a NTP > > synchronization even in a small LAN. > > > > For large systems or distributed systems, a "global (synchronized) clock" > > is not an option. You cannot synchronize clocks correctly, so your > > algorithms must not rely on time, or at least not too precise resolution. > > > > I am saying that to again repeat that we are far from sub-second > nanosecond > > resolution for system clock. > > What does synchronization have to do with it? If synchronization > matters, then your PEP should be rejected, because current computers > using NTP can't synchronize with a better precision than 230 ns. >From https://en.wikipedia.org/wiki/Virtual_black_hole : > In the derivation of his equations, Einstein suggested that physical space-time is Riemannian, ie curved. A small domain of it is approximately flat space-time. >From https://en.wikipedia.org/wiki/Quantum_foam : > Based on the uncertainty principles of quantum mechanics and the general theory of relativity, there is no reason that spacetime needs to be fundamentally smooth. 
Instead, in a quantum theory of gravity, spacetime would consist of many small, ever-changing regions in which space and time are not definite, but fluctuate in a foam-like manner. So, in regards to time synchronization, FWIU: - WWVB "can provide time with an accuracy of about 100 microseconds" - GPS time can synchronize down to "tens of nanoseconds" - Blockchains work around local timestamp issues by "enforcing" linearity > > See https://blog.cloudflare.com/how-to-achieve-low-latency/ > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/wes. > turner%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Tue Oct 24 07:20:53 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 24 Oct 2017 13:20:53 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: <20171024112215.01b98851@fsol> References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> <0eee7a98-a470-556f-5231-5afab78c5248@tjol.eu> <20171024112215.01b98851@fsol> Message-ID: 2017-10-24 11:22 GMT+02:00 Antoine Pitrou : > What does synchronization have to do with it? If synchronization > matters, then your PEP should be rejected, because current computers > using NTP can't synchronize with a better precision than 230 ns. Currently, the PEP 564 is mostly designed for handling time on the same computer. Better resolution inside the same process, and "synchronization" between two processes running on the same host: https://www.python.org/dev/peps/pep-0564/#issues-caused-by-precision-loss Maybe tomorrow, time.time_ns() will help for use cases with more computers :-) > See https://blog.cloudflare.com/how-to-achieve-low-latency/ This article doesn't mention NTP, synchronization or nanoseconds. Where did you see "230 ns" for NTP? Victor From antoine at python.org Tue Oct 24 07:25:27 2017 From: antoine at python.org (Antoine Pitrou) Date: Tue, 24 Oct 2017 13:25:27 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> <0eee7a98-a470-556f-5231-5afab78c5248@tjol.eu> <20171024112215.01b98851@fsol> Message-ID: Le 24/10/2017 ? 13:20, Victor Stinner a ?crit?: >> See https://blog.cloudflare.com/how-to-achieve-low-latency/ > > This article doesn't mention NTP, synchronization or nanoseconds. NTP is layered over UDP. The article shows base case UDP latencies of around 15?s over 10Gbps Ethernet. Regards Antoine. From victor.stinner at gmail.com Tue Oct 24 11:31:48 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 24 Oct 2017 17:31:48 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> <0eee7a98-a470-556f-5231-5afab78c5248@tjol.eu> <20171024112215.01b98851@fsol> Message-ID: Warning: the PEP 564 doesn't make any assumption about clock synchronizations. My intent is only to expose what the operating system provides without losing precision. That's all :-) 2017-10-24 13:25 GMT+02:00 Antoine Pitrou : > NTP is layered over UDP. The article shows base case UDP latencies of > around 15?s over 10Gbps Ethernet. Ah ok. 
IMHO the discussion became off-topic somewhere, but I'm curious, so I searched about the best NTP accuracy and found: https://blog.meinbergglobal.com/2013/11/22/ntp-vs-ptp-network-timing-smackdown/ "Is the accuracy you need measured in microseconds or nanoseconds? If the answer is yes, you want PTP (IEEE 1588). If the answer is in milliseconds or seconds, then you want NTP." "There is even ongoing standards work to use technology developed at CERN (...) to extend PTP to picoseconds." It seems like PTP is more accurate than NTP. Victor From chris.barker at noaa.gov Tue Oct 24 17:26:14 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 24 Oct 2017 14:26:14 -0700 Subject: [Python-Dev] iso8601 parsing In-Reply-To: References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> Message-ID: On Mon, Oct 23, 2017 at 5:33 PM, Hasan Diwan wrote: > If one simply replaces the 'T' with a space and trims it after the '.', > IIRC, it parses fine. > sure, but really, can anyone argue that it's not a good idea for datetime ot be able to read the iso format it puts out??? -CHB > -- H > > On Oct 23, 2017 15:16, "Mike Miller" wrote: > >> Hi, >> >> Could anyone put this five year-old bug about parsing iso8601 format >> date-times on the front burner? >> >> http://bugs.python.org/issue15873 >> >> In the comments there's a lot of hand-wringing about different variations >> that bogged it down, but right now I only need it to handle the output of >> datetime.isoformat(): >> >> >>> dt.isoformat() >> '2017-10-20T08:20:08.986166+00:00' >> >> Perhaps if we could get that minimum first step in, it could be iterated >> on and made more lenient in the future. >> >> Thank you, >> -Mike >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/hasan.diw >> an%40gmail.com >> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > chris.barker%40noaa.gov > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Oct 24 17:53:58 2017 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 24 Oct 2017 17:53:58 -0400 Subject: [Python-Dev] iso8601 parsing In-Reply-To: References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> Message-ID: On Tue, Oct 24, 2017 at 5:26 PM, Chris Barker wrote: > On Mon, Oct 23, 2017 at 5:33 PM, Hasan Diwan wrote: >> > can anyone argue that it's not a good idea for datetime ot > be able to read the iso format it puts out? No, but the last time I suggested that that datetime types should satisfy the same invariants as numbers, namely T(repr(x)) == x, the idea was met will silence. I, on the other hand, am not very enthusiastic about named constructors such as date.isoparse(). Compared with date(s:str), this is one more method name to remember, plus the potential for abuse as an instance method. What is d.isoparse('2017-11-24')? 
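For reference, a minimal stdlib-only sketch of the round trip that bpo-15873 asks for. It only targets the strings that datetime.isoformat() itself emits, not general ISO 8601, and the helper name and the format fallbacks are illustrative rather than a proposed API:

    import re
    from datetime import datetime

    def parse_isoformat(s):
        # strptime's %z does not accept a colon inside the UTC offset,
        # so normalise '+00:00' to '+0000' before parsing.
        s = re.sub(r'([+-]\d{2}):(\d{2})$', r'\1\2', s)
        for fmt in ('%Y-%m-%dT%H:%M:%S.%f%z', '%Y-%m-%dT%H:%M:%S%z',
                    '%Y-%m-%dT%H:%M:%S.%f', '%Y-%m-%dT%H:%M:%S'):
            try:
                return datetime.strptime(s, fmt)
            except ValueError:
                pass
        raise ValueError('not an isoformat() string: %r' % (s,))

    print(parse_isoformat('2017-10-20T08:20:08.986166+00:00'))
    # -> 2017-10-20 08:20:08.986166+00:00

Offsets containing seconds, or the reduced timespec variants, would need more cases; the point is only how small the target format is compared with full ISO 8601.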
From elprans at gmail.com Tue Oct 24 20:11:43 2017 From: elprans at gmail.com (Elvis Pranskevichus) Date: Tue, 24 Oct 2017 20:11:43 -0400 Subject: [Python-Dev] iso8601 parsing In-Reply-To: References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> Message-ID: <11362716.0I2SPu8sME@hammer.magicstack.net> On Tuesday, October 24, 2017 5:53:58 PM EDT Alexander Belopolsky wrote: > No, but the last time I suggested that that datetime types should > satisfy the same invariants as numbers, namely > T(repr(x)) == x, the idea was met will silence. I, on the other hand, > am not very enthusiastic about named constructors such as > date.isoparse(). Compared with date(s:str), this is one more method > name to remember, plus the potential for abuse as an instance method. > What is d.isoparse('2017-11-24')? Agreed. datetime(s:str) seems like a far more natural and consistent choice. Elvis From tritium-list at sdamon.com Wed Oct 25 11:45:15 2017 From: tritium-list at sdamon.com (Alex Walters) Date: Wed, 25 Oct 2017 11:45:15 -0400 Subject: [Python-Dev] iso8601 parsing In-Reply-To: References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> Message-ID: <0f9001d34da8$41900b80$c4b02280$@sdamon.com> > -----Original Message----- > From: Python-Dev [mailto:python-dev-bounces+tritium- > list=sdamon.com at python.org] On Behalf Of Alexander Belopolsky > Sent: Tuesday, October 24, 2017 5:54 PM > To: Chris Barker > Cc: Python-Dev > Subject: Re: [Python-Dev] iso8601 parsing > > On Tue, Oct 24, 2017 at 5:26 PM, Chris Barker > wrote: > > On Mon, Oct 23, 2017 at 5:33 PM, Hasan Diwan > wrote: > >> > > can anyone argue that it's not a good idea for datetime ot > > be able to read the iso format it puts out? > > No, but the last time I suggested that that datetime types should > satisfy the same invariants as numbers, namely > T(repr(x)) == x, the idea was met will silence. I, on the other hand, > am not very enthusiastic about named constructors such as > date.isoparse(). Compared with date(s:str), this is one more method > name to remember, plus the potential for abuse as an instance method. > What is d.isoparse('2017-11-24')? Datetime.datetime.fromiso() (classmethod) is much more in keeping with the rest of the datetime api - in fact, I have tried calling that method more than once, before remembering datetime *doesn't* have that classmethod. Making it a classmethod solves any concerns about calling it as an instance method (the same way d.now() and d.strptime() just create and return a new datetime objects, not mutates the current). In fact, looking at the docs, most of the methods are classmethods, so an additional classmethod is fitting. I really do not like the idea of making the first positional argument of the datetime constructor int or str. What happens when you pass a string for the first argument and ints for subsequent arguments? You would have to raise a typeerror or valueerror. I don't like that API design - it means the type of the first argument changes the semantic meaning of subsequent arguments, and that just adds a level of confusion to any api. You might be able to get away with that in third party code, but this is the standard library, and this is the time manipulation module in the standard library - you have to assume that this is one of the first modules a new user uses, we have to keep the api sane. 
The only way I can think of keeping the api sane and still pass an iso string to the constructor is to pass it as a keyword argument - and at that point you have to remember the argument name anyway, so you might as well make it a classmethod to match everything else in the library. $0.02 > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/tritium- > list%40sdamon.com From alexander.belopolsky at gmail.com Wed Oct 25 12:07:23 2017 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 25 Oct 2017 12:07:23 -0400 Subject: [Python-Dev] iso8601 parsing In-Reply-To: <0f9001d34da8$41900b80$c4b02280$@sdamon.com> References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> <0f9001d34da8$41900b80$c4b02280$@sdamon.com> Message-ID: <4C4C3B3F-762B-4813-A232-188743DA9371@gmail.com> > On Oct 25, 2017, at 11:45 AM, Alex Walters wrote: > > it means > the type of the first argument changes the semantic meaning of subsequent > arguments, and that just adds a level of confusion to any api. No, it does not. Passing a string as the first of three arguments will still be a type error. From tritium-list at sdamon.com Wed Oct 25 15:47:28 2017 From: tritium-list at sdamon.com (Alex Walters) Date: Wed, 25 Oct 2017 15:47:28 -0400 Subject: [Python-Dev] iso8601 parsing In-Reply-To: <4C4C3B3F-762B-4813-A232-188743DA9371@gmail.com> References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> <0f9001d34da8$41900b80$c4b02280$@sdamon.com> <4C4C3B3F-762B-4813-A232-188743DA9371@gmail.com> Message-ID: <0fb101d34dca$174114b0$45c33e10$@sdamon.com> > -----Original Message----- > From: Alexander Belopolsky [mailto:alexander.belopolsky at gmail.com] > Sent: Wednesday, October 25, 2017 12:07 PM > To: Alex Walters > Cc: Chris Barker ; Python-Dev dev at python.org> > Subject: Re: [Python-Dev] iso8601 parsing > > > > > On Oct 25, 2017, at 11:45 AM, Alex Walters > wrote: > > > > it means > > the type of the first argument changes the semantic meaning of > subsequent > > arguments, and that just adds a level of confusion to any api. > > No, it does not. Passing a string as the first of three arguments will still be a > type error. And that is a confusing api. The problem has already been solved by classmethod alternate constructors - they are already used widely in the datetime api. NOT using classmethod constructors is new and confusing for the SINGLE use case of parsing iso formatted dates and times. Why is that special? Why isn't ordinal time special enough to get into __init__? 
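To make the two spellings being debated concrete, here is a toy sketch (an illustration only, not a proposal for the real implementation): a classmethod alternate constructor in the style of the existing datetime classmethods, next to a constructor that switches on the type of its first argument. The subclass, the fromisoformat name, and the use of strptime as a stand-in parser are assumptions made for the example.

from datetime import datetime

class MyDatetime(datetime):
    """Toy subclass contrasting the two APIs under discussion."""

    @classmethod
    def fromisoformat(cls, s):
        # Classmethod alternate constructor, matching the style of
        # fromtimestamp()/fromordinal().  strptime stands in for a real
        # ISO 8601 parser and only handles one fixed layout here.
        return cls.strptime(s, '%Y-%m-%dT%H:%M:%S.%f')

    def __new__(cls, *args, **kwargs):
        # Type-switching constructor: a lone str argument is parsed,
        # anything else takes the normal numeric path.
        if len(args) == 1 and not kwargs and isinstance(args[0], str):
            return cls.fromisoformat(args[0])
        return super().__new__(cls, *args, **kwargs)

# Both spellings produce the same value:
assert (MyDatetime.fromisoformat('2017-10-25T17:16:20.973107')
        == MyDatetime('2017-10-25T17:16:20.973107'))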
From tritium-list at sdamon.com Wed Oct 25 15:48:49 2017 From: tritium-list at sdamon.com (Alex Walters) Date: Wed, 25 Oct 2017 15:48:49 -0400 Subject: [Python-Dev] iso8601 parsing In-Reply-To: <11362716.0I2SPu8sME@hammer.magicstack.net> References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> <11362716.0I2SPu8sME@hammer.magicstack.net> Message-ID: <0fba01d34dca$47ad7940$d7086bc0$@sdamon.com> > -----Original Message----- > From: Python-Dev [mailto:python-dev-bounces+tritium- > list=sdamon.com at python.org] On Behalf Of Elvis Pranskevichus > Sent: Tuesday, October 24, 2017 8:12 PM > To: python-dev at python.org > Cc: Chris Barker > Subject: Re: [Python-Dev] iso8601 parsing > > On Tuesday, October 24, 2017 5:53:58 PM EDT Alexander Belopolsky wrote: > > No, but the last time I suggested that that datetime types should > > satisfy the same invariants as numbers, namely > > T(repr(x)) == x, the idea was met will silence. I, on the other hand, > > am not very enthusiastic about named constructors such as > > date.isoparse(). Compared with date(s:str), this is one more method > > name to remember, plus the potential for abuse as an instance method. > > What is d.isoparse('2017-11-24')? > > Agreed. datetime(s:str) seems like a far more natural and consistent > choice. It's inconsistent with the rest of the module. All other constructions of datetime objects are on classmethods. Why make parsing ISO time special? > > Elvis > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/tritium- > list%40sdamon.com From alexander.belopolsky at gmail.com Wed Oct 25 16:32:39 2017 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 25 Oct 2017 16:32:39 -0400 Subject: [Python-Dev] iso8601 parsing In-Reply-To: <0fba01d34dca$47ad7940$d7086bc0$@sdamon.com> References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> <11362716.0I2SPu8sME@hammer.magicstack.net> <0fba01d34dca$47ad7940$d7086bc0$@sdamon.com> Message-ID: On Wed, Oct 25, 2017 at 3:48 PM, Alex Walters wrote: > Why make parsing ISO time special? It's not the ISO format per se that is special, but parsing of str(x). For all numeric types, int, float, complex and even fractions.Fraction, we have a roundtrip invariant T(str(x)) == x. Datetime types are a special kind of numbers, but they don't follow this established pattern. This is annoying when you deal with time series where it is common to have text files with a mix of dates, timestamps and numbers. You can write generic code to deal with ints and floats, but have to special-case anything time related. From tritium-list at sdamon.com Wed Oct 25 17:18:08 2017 From: tritium-list at sdamon.com (Alex Walters) Date: Wed, 25 Oct 2017 17:18:08 -0400 Subject: [Python-Dev] iso8601 parsing In-Reply-To: References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> <11362716.0I2SPu8sME@hammer.magicstack.net> <0fba01d34dca$47ad7940$d7086bc0$@sdamon.com> Message-ID: <0fbb01d34dd6$c1bee690$453cb3b0$@sdamon.com> > -----Original Message----- > From: Alexander Belopolsky [mailto:alexander.belopolsky at gmail.com] > Sent: Wednesday, October 25, 2017 4:33 PM > To: Alex Walters > Cc: Elvis Pranskevichus ; Python-Dev dev at python.org>; Chris Barker > Subject: Re: [Python-Dev] iso8601 parsing > > On Wed, Oct 25, 2017 at 3:48 PM, Alex Walters > wrote: > > Why make parsing ISO time special? 
> > It's not the ISO format per se that is special, but parsing of str(x). > For all numeric types, int, float, complex and even > fractions.Fraction, we have a roundtrip invariant T(str(x)) == x. > Datetime types are a special kind of numbers, but they don't follow > this established pattern. This is annoying when you deal with time > series where it is common to have text files with a mix of dates, > timestamps and numbers. You can write generic code to deal with ints > and floats, but have to special-case anything time related. >>> repr(datetime.datetime.now()) 'datetime.datetime(2017, 10, 25, 17, 16, 20, 973107)' You can already roundtrip the repr of datetime objects with eval (if you care to do so). You get iso formatting from a method on dt objects, I don?t see why it should be parsed by anything but a classmethod. From random832 at fastmail.com Wed Oct 25 17:29:50 2017 From: random832 at fastmail.com (Random832) Date: Wed, 25 Oct 2017 17:29:50 -0400 Subject: [Python-Dev] iso8601 parsing In-Reply-To: References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> <11362716.0I2SPu8sME@hammer.magicstack.net> <0fba01d34dca$47ad7940$d7086bc0$@sdamon.com> Message-ID: <1508966990.119900.1151076584.7F83F87B@webmail.messagingengine.com> On Wed, Oct 25, 2017, at 16:32, Alexander Belopolsky wrote: > This is annoying when you deal with time > series where it is common to have text files with a mix of dates, > timestamps and numbers. You can write generic code to deal with ints > and floats, but have to special-case anything time related. Generic code that uses a Callable[[str], ...] instead of a type works fine with a class method. column1.parser = int column2.parser = float column3.parser = datetime.parse_iso column4.parser = json.loads It is *very slightly* more complex than a model that needs the type also for some reason and has the type pull double duty as the parser... but why do that? From chris.barker at noaa.gov Wed Oct 25 17:30:00 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 25 Oct 2017 14:30:00 -0700 Subject: [Python-Dev] iso8601 parsing In-Reply-To: <0fbb01d34dd6$c1bee690$453cb3b0$@sdamon.com> References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> <11362716.0I2SPu8sME@hammer.magicstack.net> <0fba01d34dca$47ad7940$d7086bc0$@sdamon.com> <0fbb01d34dd6$c1bee690$453cb3b0$@sdamon.com> Message-ID: +1 on a classmethod constructor +0 on a based-on-type default constructor +inf on SOMETHING! Let's get passed the bike shedding and make this work! -CHB On Wed, Oct 25, 2017 at 2:18 PM, Alex Walters wrote: > > > > -----Original Message----- > > From: Alexander Belopolsky [mailto:alexander.belopolsky at gmail.com] > > Sent: Wednesday, October 25, 2017 4:33 PM > > To: Alex Walters > > Cc: Elvis Pranskevichus ; Python-Dev > dev at python.org>; Chris Barker > > Subject: Re: [Python-Dev] iso8601 parsing > > > > On Wed, Oct 25, 2017 at 3:48 PM, Alex Walters > > wrote: > > > Why make parsing ISO time special? > > > > It's not the ISO format per se that is special, but parsing of str(x). > > For all numeric types, int, float, complex and even > > fractions.Fraction, we have a roundtrip invariant T(str(x)) == x. > > Datetime types are a special kind of numbers, but they don't follow > > this established pattern. This is annoying when you deal with time > > series where it is common to have text files with a mix of dates, > > timestamps and numbers. You can write generic code to deal with ints > > and floats, but have to special-case anything time related. 
> > >>> repr(datetime.datetime.now()) > 'datetime.datetime(2017, 10, 25, 17, 16, 20, 973107)' > > You can already roundtrip the repr of datetime objects with eval (if you > care to do so). You get iso formatting from a method on dt objects, I > don?t see why it should be parsed by anything but a classmethod. > > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Wed Oct 25 17:30:57 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 26 Oct 2017 08:30:57 +1100 Subject: [Python-Dev] iso8601 parsing In-Reply-To: References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> <11362716.0I2SPu8sME@hammer.magicstack.net> <0fba01d34dca$47ad7940$d7086bc0$@sdamon.com> Message-ID: <20171025213056.GP9068@ando.pearwood.info> On Wed, Oct 25, 2017 at 04:32:39PM -0400, Alexander Belopolsky wrote: > On Wed, Oct 25, 2017 at 3:48 PM, Alex Walters wrote: > > Why make parsing ISO time special? > > It's not the ISO format per se that is special, but parsing of str(x). > For all numeric types, int, float, complex and even > fractions.Fraction, we have a roundtrip invariant T(str(x)) == x. > Datetime types are a special kind of numbers, but they don't follow > this established pattern. This is annoying when you deal with time > series where it is common to have text files with a mix of dates, > timestamps and numbers. You can write generic code to deal with ints > and floats, but have to special-case anything time related. Maybe I'm just being slow today, but I don't see how you can write "generic code" to convert text to int/float/complex/Fraction, but not times. The only difference is that instead of calling the type directly, you call the appropriate classmethod. What am I missing? -- Steven From alexander.belopolsky at gmail.com Wed Oct 25 19:22:43 2017 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 25 Oct 2017 19:22:43 -0400 Subject: [Python-Dev] iso8601 parsing In-Reply-To: <20171025213056.GP9068@ando.pearwood.info> References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> <11362716.0I2SPu8sME@hammer.magicstack.net> <0fba01d34dca$47ad7940$d7086bc0$@sdamon.com> <20171025213056.GP9068@ando.pearwood.info> Message-ID: On Wed, Oct 25, 2017 at 5:30 PM, Steven D'Aprano wrote: > Maybe I'm just being slow today, but I don't see how you can write > "generic code" to convert text to int/float/complex/Fraction, but not > times. The only difference is that instead of calling the type directly, > you call the appropriate classmethod. > > What am I missing? Nothing. The only annoyance is that the data processing code typically needs to know the type anyway, so the classmethod is one more variable to keep track of. From alexander.belopolsky at gmail.com Wed Oct 25 19:27:08 2017 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 25 Oct 2017 19:27:08 -0400 Subject: [Python-Dev] iso8601 parsing In-Reply-To: References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> <11362716.0I2SPu8sME@hammer.magicstack.net> <0fba01d34dca$47ad7940$d7086bc0$@sdamon.com> <0fbb01d34dd6$c1bee690$453cb3b0$@sdamon.com> Message-ID: On Wed, Oct 25, 2017 at 5:30 PM, Chris Barker wrote: > Let's get passed the bike shedding and make this work! Sure. Submitting a pull request for would be a good start. 
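The round-trip invariant being discussed is easy to see with the standard library alone: the numeric types can all be rebuilt from their str() form, while datetime cannot, which is why generic column-parsing code ends up special-casing anything time related. A minimal demonstration (the datetime call is expected to fail):

from datetime import datetime
from fractions import Fraction

# Numeric types satisfy T(str(x)) == x ...
for x in (42, 3.25, complex(1, 2), Fraction(2, 3)):
    assert type(x)(str(x)) == x

# ... but datetime does not accept its own str() output.
dt = datetime(2017, 10, 25, 17, 16, 20)
try:
    datetime(str(dt))
except TypeError:
    print('datetime(str(dt)) is not supported')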
From chris.barker at noaa.gov Wed Oct 25 19:45:32 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 25 Oct 2017 16:45:32 -0700 Subject: [Python-Dev] iso8601 parsing In-Reply-To: References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> <11362716.0I2SPu8sME@hammer.magicstack.net> <0fba01d34dca$47ad7940$d7086bc0$@sdamon.com> <20171025213056.GP9068@ando.pearwood.info> Message-ID: On Wed, Oct 25, 2017 at 4:22 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > times. The only difference is that instead of calling the type directly, > > you call the appropriate classmethod. > > > > What am I missing? > > Nothing. The only annoyance is that the data processing code typically > needs to know the type anyway, so the classmethod is one more variable > to keep track of. I don't think anyone is arguing that is is hard to do either way, or that hard to use either way. I think it comes down to a trade-off between: Having an API for datetime that is like the datetime for number types: int(str(an_int)) == an_int so: datetime(str(a_datetime)) == a_datetime Alexander strongly supports that. Or an API that is perhaps more like the rest of the datetime api, which has a number of alternate constructors: datetime.now() datetime.fromordinal() datetime.fromtimestamp() And has: datetime.isoformat() So a datetime.fromisoformat() would make a lot of sense. I would note that the number types don't have all those alternate constructors Also, the number types mostly have a single parameter (except int has an optional base parameter). So I'm not sure the parallel is that strong. Would it be horrible if we did both? After all, right now, datetime has: In [16]: dt.isoformat() Out[16]: '2017-10-25T16:30:48.744489' and In [18]: dt.__str__() Out[18]: '2017-10-25 16:30:48.744489' which do almost the same thing (I understand the "T" is option in iso 8601) However, maybe they are different when you add a time zone? ISO 8601 support offsets, but not time zones -- presumably the __str__ supports the full datetime tzinfo somehow. Which may be why .isoformat() exists. Though I don't think that means you couldn't have the __init__ parse proper ISO strings, in addition to the full datetime __str__ results. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Oct 25 19:52:05 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 25 Oct 2017 16:52:05 -0700 Subject: [Python-Dev] iso8601 parsing In-Reply-To: References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> <11362716.0I2SPu8sME@hammer.magicstack.net> <0fba01d34dca$47ad7940$d7086bc0$@sdamon.com> <20171025213056.GP9068@ando.pearwood.info> Message-ID: I think this ship has long sailed. Sorry Alexander, but I see a new class method in your future. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wes.turner at gmail.com Wed Oct 25 22:37:09 2017 From: wes.turner at gmail.com (Wes Turner) Date: Wed, 25 Oct 2017 22:37:09 -0400 Subject: [Python-Dev] iso8601 parsing In-Reply-To: References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> <11362716.0I2SPu8sME@hammer.magicstack.net> <0fba01d34dca$47ad7940$d7086bc0$@sdamon.com> <20171025213056.GP9068@ando.pearwood.info> Message-ID: On Wednesday, October 25, 2017, Chris Barker wrote: > On Wed, Oct 25, 2017 at 4:22 PM, Alexander Belopolsky < > alexander.belopolsky at gmail.com > > wrote: > >> > times. The only difference is that instead of calling the type directly, >> > you call the appropriate classmethod. >> > >> > What am I missing? >> >> Nothing. The only annoyance is that the data processing code typically >> needs to know the type anyway, so the classmethod is one more variable >> to keep track of. > > > I don't think anyone is arguing that is is hard to do either way, or that > hard to use either way. > > I think it comes down to a trade-off between: > > Having an API for datetime that is like the datetime for number types: > > int(str(an_int)) == an_int > > so: > > datetime(str(a_datetime)) == a_datetime > > Alexander strongly supports that. > > Or an API that is perhaps more like the rest of the datetime api, which > has a number of alternate constructors: > > datetime.now() > > datetime.fromordinal() > > datetime.fromtimestamp() > > And has: > > datetime.isoformat() > > So a > > datetime.fromisoformat() > > would make a lot of sense. > > I would note that the number types don't have all those alternate > constructors Also, the number types mostly have a single parameter (except > int has an optional base parameter). So I'm not sure the parallel is that > strong. > > Would it be horrible if we did both? > > After all, right now, datetime has: > > In [16]: dt.isoformat() > Out[16]: '2017-10-25T16:30:48.744489' > > and > In [18]: dt.__str__() > Out[18]: '2017-10-25 16:30:48.744489' > > which do almost the same thing (I understand the "T" is option in iso 8601) > > However, maybe they are different when you add a time zone? > > ISO 8601 support offsets, but not time zones -- presumably the __str__ > supports the full datetime tzinfo somehow. Which may be why .isoformat() > exists. > ISO8601 does support timezones. https://en.wikipedia.org/wiki/ISO_8601#Time_zone_designators I might be wrong, but I think many of the third party libraries listed here default to either UTC or timezone-naieve timezones: https://github.com/vinta/awesome-python/blob/master/README.md#date-and-time Ctrl-F for 'tzinfo=' in the docs really doesn't explain how to just do it with my local time. Here's an example with a *custom* GMT1 tzinfo subclass: https://docs.python.org/3/library/datetime.html#datetime.time.tzname > Though I don't think that means you couldn't have the __init__ parse > proper ISO strings, in addition to the full datetime __str__ results. > What would you call the str argument? Does it accept strptime args or only ISO8601? Would all of that string parsing logic be a performance regression from the current constructor? Does it accept None or empty string? ... Deserializing dates from JSON (without #JSONLD and xsd:dateTime (ISO8601)) types is nasty, regardless (try/except, *custom* schema awareness). And pickle is dangerous. AFAIU, we should not ever eval(repr(dt: datetime)). ... Should the date time constructor support nanos= (just like time_ns())? ENH: Add nanosecond support to the time and datetime constructors ... 
The astropy Time class supports a string argument as the first parameter sometimes: http://docs.astropy.org/en/stable/time/#inferring-input-format Astropy does support a "year zero". > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fred at fdrake.net Wed Oct 25 23:41:59 2017 From: fred at fdrake.net (Fred Drake) Date: Wed, 25 Oct 2017 23:41:59 -0400 Subject: [Python-Dev] iso8601 parsing In-Reply-To: References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> <11362716.0I2SPu8sME@hammer.magicstack.net> <0fba01d34dca$47ad7940$d7086bc0$@sdamon.com> <20171025213056.GP9068@ando.pearwood.info> Message-ID: On Wed, Oct 25, 2017 at 10:37 PM, Wes Turner wrote: > What would you call the str argument? Does it accept strptime args or only > ISO8601? There'd be no reason to accept a format. That wouldn't make sense. A .fromiso(s:str) should only accept an ISO 8601 string, though I'd advocate tolerating both space and "T". > Would all of that string parsing logic be a performance regression > from the current constructor? Does it accept None or empty string? It's an alternate constructor, so should not impact the existing constructor (though it could employ the existing constructor to get work done). It should not accept anything but a valid ISO 8601 string. > Should the date time constructor support nanos= (just like time_ns())? No. It should support exactly up to 6 decimal digits to populate the microsecond field. > ENH: Add nanosecond support to the time and datetime constructors This should be left for a separate change, if we determine it should be implemented for the datetime and timedelta types. -Fred -- Fred L. Drake, Jr. "A storm broke loose in my mind." --Albert Einstein From ncoghlan at gmail.com Thu Oct 26 01:37:48 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 26 Oct 2017 15:37:48 +1000 Subject: [Python-Dev] iso8601 parsing In-Reply-To: References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> <11362716.0I2SPu8sME@hammer.magicstack.net> <0fba01d34dca$47ad7940$d7086bc0$@sdamon.com> <0fbb01d34dd6$c1bee690$453cb3b0$@sdamon.com> Message-ID: On 26 October 2017 at 07:30, Chris Barker wrote: > +1 on a classmethod constructor > +0 on a based-on-type default constructor > > +inf on SOMETHING! > > Let's get passed the bike shedding and make this work! > I'll also note that these aren't either/or outcomes: adding a str-specific classmethod *doesn't* preclude implicitly calling that class method from the default constructor later based on the input type. For example, decimal.Decimal.from_float() was added before the type constructor gained native support for float inputs, due to concerns about potential binary-vs-decimal rounding errors arising from doing such conversions implicitly. So we can add "datetime.fromisoformat(isotime: str)" now, and then *if* we later decide to support the "type(x)(str(x)) == x" numeric invariant for the datetime classes, that can be specified as "If the constructor arguments consist of a single string, that is handled by calling the fromisoformat class method". Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Thu Oct 26 05:24:54 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 26 Oct 2017 11:24:54 +0200 Subject: [Python-Dev] Migrate python-dev to Mailman 3? 
Message-ID: Hi, We are using Mailman 3 for the new buildbot-status mailing list and it works well: https://mail.python.org/mm3/archives/list/buildbot-status at python.org/ I prefer to read archives with this UI, it's simpler to follow threads, and it's possible to reply on the web UI! To be honest, we got some issues when the new security-announce mailing list was quickly migrated from Mailman 2 to Mailman 3, but issues were quicky fixed as well. Would it be possible to migrate python-dev to Mailman 3? Do you see any blocker issue? I sent to email to the Python postmaster as well. Victor From solipsis at pitrou.net Thu Oct 26 06:01:37 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 26 Oct 2017 12:01:37 +0200 Subject: [Python-Dev] Migrate python-dev to Mailman 3? References: Message-ID: <20171026120137.1de34389@fsol> On Thu, 26 Oct 2017 11:24:54 +0200 Victor Stinner wrote: > > We are using Mailman 3 for the new buildbot-status mailing list and it > works well: > > https://mail.python.org/mm3/archives/list/buildbot-status at python.org/ > > I prefer to read archives with this UI, it's simpler to follow > threads, and it's possible to reply on the web UI! Personally, I really don't like that UI. Is it possible to have a pipermail-style UI as an alternative? (I don't care about buildbot-status, but I do care about python-dev) Regards Antoine. From victor.stinner at gmail.com Thu Oct 26 06:15:36 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 26 Oct 2017 12:15:36 +0200 Subject: [Python-Dev] Migrate python-dev to Mailman 3? In-Reply-To: <20171026120137.1de34389@fsol> References: <20171026120137.1de34389@fsol> Message-ID: 2017-10-26 12:01 GMT+02:00 Antoine Pitrou : > Is it possible to have a > pipermail-style UI as an alternative? I don't know pipermail. Do you have an example? I don't think that Mailman 3 gives the choice of the UI for archives. I didn't ask anyone to write a new software. I only proposed to use what we already have. And yeah, I expect that some people will complain, as each time that we make any kind of change :-) Each UI has advantages and drawbacks. The main drawback of Mailman 2 archives is that discussions are splitted between each month. It can be a pain to follow a long discussion done in multiple months. Sadly, I don't know if Mailman 3 handles this case better :-D Victor From antoine at python.org Thu Oct 26 06:19:45 2017 From: antoine at python.org (Antoine Pitrou) Date: Thu, 26 Oct 2017 12:19:45 +0200 Subject: [Python-Dev] Migrate python-dev to Mailman 3? In-Reply-To: References: <20171026120137.1de34389@fsol> Message-ID: <090a325f-6b04-2cbc-e677-0493dc0273ad@python.org> Le 26/10/2017 ? 12:15, Victor Stinner a ?crit?: > 2017-10-26 12:01 GMT+02:00 Antoine Pitrou : >> Is it possible to have a >> pipermail-style UI as an alternative? > > I don't know pipermail. Do you have an example? https://mail.python.org/pipermail/python-dev/ :-) > The main drawback of Mailman 2 archives is that discussions are > splitted between each month. It can be a pain to follow a long > discussion done in multiple months. Sadly, I don't know if Mailman 3 > handles this case better :-D If I take https://mail.python.org/mm3/archives/list/buildbot-status at python.org/thread/MZ7QOZM6V7OALPSYSNSIHGGSLXMQHCF2/ as an example, I don't think it will allow to follow a long discussion *at all*. Can you imagine a 100-message thread displayed that way? 
The pipermail UI isn't perfect (the monthly segregation can be annoying as you point out), but at least it has a synthetic and easy-to-navigate tree view. Regards Antoine. From ncoghlan at gmail.com Thu Oct 26 08:40:40 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 26 Oct 2017 22:40:40 +1000 Subject: [Python-Dev] Migrate python-dev to Mailman 3? In-Reply-To: <090a325f-6b04-2cbc-e677-0493dc0273ad@python.org> References: <20171026120137.1de34389@fsol> <090a325f-6b04-2cbc-e677-0493dc0273ad@python.org> Message-ID: On 26 October 2017 at 20:19, Antoine Pitrou wrote: > > Le 26/10/2017 ? 12:15, Victor Stinner a ?crit : > > 2017-10-26 12:01 GMT+02:00 Antoine Pitrou : > >> Is it possible to have a > >> pipermail-style UI as an alternative? > > > > I don't know pipermail. Do you have an example? > > https://mail.python.org/pipermail/python-dev/ :-) > > > The main drawback of Mailman 2 archives is that discussions are > > splitted between each month. It can be a pain to follow a long > > discussion done in multiple months. Sadly, I don't know if Mailman 3 > > handles this case better :-D > > If I take > https://mail.python.org/mm3/archives/list/buildbot-status@ > python.org/thread/MZ7QOZM6V7OALPSYSNSIHGGSLXMQHCF2/ > as an example, I don't think it will allow to follow a long discussion > *at all*. Can you imagine a 100-message thread displayed that way? > If folks want to see MM3 in action with some more active lists, I'd suggest looking at the Fedora MM3 instance, especially the main dev list: https://lists.fedoraproject.org/archives/list/devel at lists.fedoraproject.org/ There's a 100 message thread about Firefox 57 here, and it's no harder to read than a long thread on any other web forum: https://lists.fedoraproject.org/archives/list/devel at lists.fedoraproject.org/thread/K5HSROMAIMNYSJONB5EIAQKWKYNFYSHK/ (if you switch to the strictly chronological display, it's almost *identical* to a web forum) https://lists.fedoraproject.org/archives/list/devel at lists.fedoraproject.org/thread/WKEB6M7J2WTFJBZYD7AZ4JB6J2O6VEWK/ is an example of a thread that was first posted back in July, but then updated more recently when the change slipped from F27 into F28. If you look at the activity for a month, as in https://lists.fedoraproject.org/archives/list/devel at lists.fedoraproject.org/2017/10/?count=50, the archiver will show you a single entry for each thread active in that month, with a link through to the consolidate archive view. Pages for individual messages do exist (e.g. https://lists.fedoraproject.org/archives/list/devel at lists.fedoraproject.org/message/HBA3O755BWRZMTDBBOCUHCKC3RREGTII/ ), and I'd expect Aurelian to be amenable to accepting a PR at https://gitlab.com/mailman/hyperkitty if anyone was particularly keen to add pipermail style forward/back buttons to those pages. Similarly, I'd be surprised if anyone objected to a toggle on the thread view page that allowed you to opt in to hiding the full message contents by default (and hence get back to a more pipermail style "Subject-lines-and-poster-details-only" overview). Cheers, Nick. P.S. MM3 supports a multi-archiver design, so it would presumably also be possible to write a static-HTML-only pipermail style archiver that ran in parallel with the interactive web gateway. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From antoine at python.org Thu Oct 26 08:52:30 2017 From: antoine at python.org (Antoine Pitrou) Date: Thu, 26 Oct 2017 14:52:30 +0200 Subject: [Python-Dev] Migrate python-dev to Mailman 3? In-Reply-To: References: <20171026120137.1de34389@fsol> <090a325f-6b04-2cbc-e677-0493dc0273ad@python.org> Message-ID: <723abc42-9c69-fb45-86ca-e04c5ef0a84e@python.org> Le 26/10/2017 ? 14:40, Nick Coghlan a ?crit?: > > If folks want to see MM3 in action with some more active lists, I'd > suggest looking at the Fedora MM3 instance, especially the main dev > list: > https://lists.fedoraproject.org/archives/list/devel at lists.fedoraproject.org> > There's a 100 message thread about Firefox 57 here, and it's no harder > to read than a long thread on any other web forum: > https://lists.fedoraproject.org/archives/list/devel at lists.fedoraproject.org/thread/K5HSROMAIMNYSJONB5EIAQKWKYNFYSHK/ > (if you switch to the strictly chronological display, it's almost > *identical* to a web forum) Thanks for posting these examples. The comparison with "other" web forums is irrelevant, though, since we're talking about replacing the pipermail UI (which is not laid out like a web forum, but as a dense synthetic tree view). IMHO, common web forums (I assume you're talking the phpBB kind) are unfit for presenting a structured discussion and they're not a very interesting point of comparison :-) > Pages for individual messages do exist (e.g. > https://lists.fedoraproject.org/archives/list/devel at lists.fedoraproject.org/message/HBA3O755BWRZMTDBBOCUHCKC3RREGTII/ > ), and I'd expect Aurelian to be amenable to accepting a PR at > https://gitlab.com/mailman/hyperkitty if anyone was particularly keen to > add pipermail style forward/back buttons to those pages. I have no doubt that it's possible to submit PRs to improve MM3's current UI. Still, someone has to do the work, and until it is done I find that a migration would be detrimental to my personal use of the ML archives. YMMV :-) Regards Antoine. From p.f.moore at gmail.com Thu Oct 26 09:43:49 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 26 Oct 2017 14:43:49 +0100 Subject: [Python-Dev] Migrate python-dev to Mailman 3? In-Reply-To: References: Message-ID: On 26 October 2017 at 10:24, Victor Stinner wrote: > We are using Mailman 3 for the new buildbot-status mailing list and it > works well: > > https://mail.python.org/mm3/archives/list/buildbot-status at python.org/ > > I prefer to read archives with this UI, it's simpler to follow > threads, and it's possible to reply on the web UI! > > To be honest, we got some issues when the new security-announce > mailing list was quickly migrated from Mailman 2 to Mailman 3, but > issues were quicky fixed as well. > > Would it be possible to migrate python-dev to Mailman 3? Do you see > any blocker issue? > > I sent to email to the Python postmaster as well. My only use of the pipermail archives is to find permanent URLs for mails I want to refer people to. My usage goes as follows: 1. Google search for a post. 2. Paste in the URL to an email. Or, if I have the post already (usually in my email client). 1. Check the date and subject of the post. 2. Go to the pipermail article by month, and scan the list for the subject and author. 3. Click on the link, check it's the right email, copy the URL. 4. Paste it into my email. I don't use the archives for reading. If the above two usages are still available, I don't care. But in particular, the fact that individual posts are searchable from Google is important to me. 
And in the second usage, having a single scrollable webpage with no extraneous data just subject/author and a bit of threading by indentation speeds up my usage a lot - the UI you linked to (and the monthly archive page with the initial lines of postings on it) is FAR too cluttered to be usable for my purposes. So basically, what I'm asking is what would be the support for the use case "Find a permanent link to an archived article as fast as possible based on subject/author or a Google search". Finally, how would a transition be handled? I assume the old archives would be retained, so would there be a cut-off date and people would have to know to use the old or new archives based on the date of the message? Paul From donald at stufft.io Thu Oct 26 10:01:35 2017 From: donald at stufft.io (Donald Stufft) Date: Thu, 26 Oct 2017 10:01:35 -0400 Subject: [Python-Dev] Migrate python-dev to Mailman 3? In-Reply-To: <090a325f-6b04-2cbc-e677-0493dc0273ad@python.org> References: <20171026120137.1de34389@fsol> <090a325f-6b04-2cbc-e677-0493dc0273ad@python.org> Message-ID: > On Oct 26, 2017, at 6:19 AM, Antoine Pitrou wrote: > > The pipermail UI isn't perfect (the monthly segregation can be annoying > as you point out), but at least it has a synthetic and easy-to-navigate > tree view. Pipermail is *horrible* and it?s tree view makes things actively harder to read in large part because once the depth of the tree gets beyond like,, 3? or so, it just gives up trying to make it a tree and starts rendering all descendants past a certain point as siblings in a nonsensical order making it impossible to follow along on a discussion as everything ends up out of order. -------------- next part -------------- An HTML attachment was scrubbed... URL: From antoine at python.org Thu Oct 26 10:10:27 2017 From: antoine at python.org (Antoine Pitrou) Date: Thu, 26 Oct 2017 16:10:27 +0200 Subject: [Python-Dev] Migrate python-dev to Mailman 3? In-Reply-To: References: <20171026120137.1de34389@fsol> <090a325f-6b04-2cbc-e677-0493dc0273ad@python.org> Message-ID: Le 26/10/2017 ? 16:01, Donald Stufft a ?crit?: > >> On Oct 26, 2017, at 6:19 AM, Antoine Pitrou > > wrote: >> >> The pipermail UI isn't perfect (the monthly segregation can be annoying >> as you point out), but at least it has a synthetic and easy-to-navigate >> tree view. > > Pipermail is *horrible* and it?s tree view makes things actively harder > to read in large part because once the depth of the tree gets beyond > like,, 3? or so, it just gives up trying to make it a tree and starts > rendering all descendants past a certain point as siblings in a > nonsensical order making it impossible to follow along on a discussion > as everything ends up out of order. You're right. It shows that I'm used to pipermail's deficiencies, and don't notice them as much as I used to do. However, MM3 seems to be doing the exact same thing that pipermail does when it comes to capping the tree view indentation to a certain limit. If you scroll down the following page enough (or you can search for example the sentence "I don't believe anyone outside of Firefox enthusiasts and the package maintainer were even aware there was an issue to discuss"), you'll see some replies displayed at the same indentation level as the message they reply to: https://lists.fedoraproject.org/archives/list/devel at lists.fedoraproject.org/thread/K5HSROMAIMNYSJONB5EIAQKWKYNFYSHK/ Regards Antoine. 
From wes.turner at gmail.com Thu Oct 26 10:28:02 2017 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 26 Oct 2017 10:28:02 -0400 Subject: [Python-Dev] Migrate python-dev to Mailman 3? In-Reply-To: References: Message-ID: On Thursday, October 26, 2017, Paul Moore wrote: > On 26 October 2017 at 10:24, Victor Stinner > wrote: > > We are using Mailman 3 for the new buildbot-status mailing list and it > > works well: > > > > https://mail.python.org/mm3/archives/list/buildbot-status at python.org/ > > > > I prefer to read archives with this UI, it's simpler to follow > > threads, and it's possible to reply on the web UI! > > > > To be honest, we got some issues when the new security-announce > > mailing list was quickly migrated from Mailman 2 to Mailman 3, but > > issues were quicky fixed as well. > > > > Would it be possible to migrate python-dev to Mailman 3? Do you see > > any blocker issue? > > > > I sent to email to the Python postmaster as well. > > My only use of the pipermail archives is to find permanent URLs for > mails I want to refer people to. My usage goes as follows: > > 1. Google search for a post. > 2. Paste in the URL to an email. > > Or, if I have the post already (usually in my email client). > > 1. Check the date and subject of the post. > 2. Go to the pipermail article by month, and scan the list for the > subject and author. > 3. Click on the link, check it's the right email, copy the URL. > 4. Paste it into my email. > > I don't use the archives for reading. If the above two usages are > still available, I don't care. But in particular, the fact that > individual posts are searchable from Google is important to me. And in > the second usage, having a single scrollable webpage with no > extraneous data just subject/author and a bit of threading by > indentation speeds up my usage a lot - the UI you linked to (and the > monthly archive page with the initial lines of postings on it) is FAR > too cluttered to be usable for my purposes. > > This: > So basically, what I'm asking is what would be the support for the use > case "Find a permanent link to an archived article as fast as possible > based on subject/author or a Google search". The complexity of this process is also very wastefully frustrating to me. (Maybe it's in the next month's message tree? No fulltext search? No way to even do an inurl: search because of the URIs?!) Isn't there a way to append a permalink to the relayed message footers? Google Groups and Github do this and it saves a lot of time. [Re-searches for things] Mailman3 adds an RFC 5064 "Archived-At" header with a link that some clients provide the ability to open in a normal human browser: http://dustymabe.com/2016/01/10/archived-at-email-header-from-mailman-3-lists/ I often click the "view it on Github" link in GitHub issue emails. (It's after the '--' email signature delimiter, so it doesn't take up so much room). "[feature] Add permalink to mail message to the footer when delivering email" https://gitlab.com/mailman/hyperkitty/issues/27 > Finally, how would a transition be handled? I assume the old archives > would be retained, so would there be a cut-off date and people would > have to know to use the old or new archives based on the date of the > message? Could an HTTP redirect help with directing users to the new or old archives? 
> > Paul > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > wes.turner%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Oct 26 12:45:31 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 26 Oct 2017 09:45:31 -0700 Subject: [Python-Dev] iso8601 parsing In-Reply-To: References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> <11362716.0I2SPu8sME@hammer.magicstack.net> <0fba01d34dca$47ad7940$d7086bc0$@sdamon.com> <20171025213056.GP9068@ando.pearwood.info> Message-ID: On Wed, Oct 25, 2017 at 7:37 PM, Wes Turner wrote: > ISO 8601 support offsets, but not time zones -- presumably the __str__ >> supports the full datetime tzinfo somehow. Which may be why .isoformat() >> exists. >> > > ISO8601 does support timezones. > https://en.wikipedia.org/wiki/ISO_8601#Time_zone_designators > No, it doesn't -- it may call them "timezones", but it only supports offsets -- that is, and offset of -6 could be US Eastern Standard Time or US Central Daylight TIme (or I got that backwards :-) ) The point is that an offset is really easy, and timezones (with Daylight savings and all that) are a frickin' nightmare, but ARE supported by datetime Note that the vocabulary is not precise here, as I see this in the Pyton docs: *class *datetime.timezone A class that implements the tzinfo abstract base class as a fixed offset from the UTC. So THAT is supported by iso8601, and, indeed maps naturally to it. Which means we can round trip isp8601 datetimes nicely, but can't round trip a datetime with a "full featured" tzinfo attached. I don't think this really has any impact on the proposal, though: it's clear what to do when parsing a iso Datetime. I might be wrong, but I think many of the third party libraries listed here > default to either UTC or timezone-naieve timezones: > https://github.com/vinta/awesome-python/blob/master/ > README.md#date-and-time > This is a key point that I hope is obvious: If an ISO string has NO offset or timezone indicator, then a naive datetime should be created. (I say, I "hope" it's obvious, because the numpy datetime64 implementation initially (and for years) would apply the machine local timezone to a bare iso string -- which was a f-ing nightmare!) > Ctrl-F for 'tzinfo=' in the docs really doesn't explain how to just do it > with my local time. > > Here's an example with a *custom* GMT1 tzinfo subclass: > https://docs.python.org/3/library/datetime.html#datetime.time.tzname > Here it is: class GMT1(tzinfo): def utcoffset(self, dt): return timedelta(hours=1) def dst(self, dt): return timedelta(0) def tzname(self,dt): return "Europe/Prague" I hope Prague doesn't do DST, or that would be just wrong ... What would you call the str argument? Does it accept strptime args or only > ISO8601? > I think Fred answered this, but iso 8601 only. we already have strptime if you need to parse anything else. Would all of that string parsing logic be a performance regression from the > current constructor? Does it accept None or empty string? > I suppose you need to do a little type checking first, so a tiny one. Though maybe just catching an Exception, so really tiny. The current constructor only takes numbers, so yes the string parsing version would be slower, but only if you use it... 
Deserializing dates from JSON (without #JSONLD and xsd:dateTime (ISO8601)) > types is nasty, regardless (try/except, *custom* schema awareness). And > pickle is dangerous. > > AFAIU, we should not ever eval(repr(dt: datetime)). > why not? isn't that what __repr__ is supposed to do? Or do you mean not that it shouldn't work, but that we shouldn't do it? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From tritium-list at sdamon.com Thu Oct 26 13:07:12 2017 From: tritium-list at sdamon.com (Alex Walters) Date: Thu, 26 Oct 2017 13:07:12 -0400 Subject: [Python-Dev] iso8601 parsing In-Reply-To: References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> <11362716.0I2SPu8sME@hammer.magicstack.net> <0fba01d34dca$47ad7940$d7086bc0$@sdamon.com> <20171025213056.GP9068@ando.pearwood.info> Message-ID: <111f01d34e7c$deeea790$9ccbf6b0$@sdamon.com> From: Python-Dev [mailto:python-dev-bounces+tritium-list=sdamon.com at python.org] On Behalf Of Chris Barker Sent: Thursday, October 26, 2017 12:46 PM To: Wes Turner Cc: Python-Dev Subject: Re: [Python-Dev] iso8601 parsing > No, it doesn't -- it may call them "timezones", but it only supports offsets -- that is, and offset of -6 could be US Eastern Standard Time or US Central Daylight TIme (or I got that backwards :-) ) US Central Standard, Mountain Daylight. (Eastern is -5/-4DST) -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at msapiro.net Thu Oct 26 13:48:11 2017 From: mark at msapiro.net (Mark Sapiro) Date: Thu, 26 Oct 2017 10:48:11 -0700 Subject: [Python-Dev] Migrate python-dev to Mailman 3? In-Reply-To: References: Message-ID: On 10/26/2017 07:28 AM, Wes Turner wrote: > > > On Thursday, October 26, 2017, Paul Moore wrote: > > On 26 October 2017 at 10:24, Victor Stinner > wrote: > > We are using Mailman 3 for the new buildbot-status mailing list and it > > works well: > > > > > https://mail.python.org/mm3/archives/list/buildbot-status at python.org/ > > ... > > My only use of the pipermail archives is to find permanent URLs for > mails I want to refer people to. My usage goes as follows: > > 1. Google search for a post. > 2. Paste in the URL to an email. > > Or, if I have the post already (usually in my email client). > > 1. Check the date and subject of the post. > 2. Go to the pipermail article by month, and scan the list for the > subject and author. > 3. Click on the link, check it's the right email, copy the URL. > 4. Paste it into my email. > > I don't use the archives for reading. If the above two usages are > still available, I don't care. But in particular, the fact that > individual posts are searchable from Google is important to me. A Google search narrowed with "site:mail.python.org" and perhaps "inurl:listname at python.org" works for HyperKitty archives as well. Also, the archive itself has a "search this list" box. ...> The complexity of this process is also very wastefully frustrating to > me. (Maybe it's in the next month's message tree? No fulltext search? No > way to even do an inurl: search because of the URIs?!) I don't see these issues. There is a full text search box on the archive page and I don't see the problem with Google inurl: > Isn't there a way to append a permalink to the relayed message footers? 
> Google Groups and Github do this and it saves a lot of time. As you note below, there is an Archived-At: header. I have just submitted an RFE at to enable placing this in the message header/footer. > [Re-searches for things] > > Mailman3 adds an RFC 5064 "Archived-At" header with a link that some > clients provide the ability to open in a normal human browser: > > http://dustymabe.com/2016/01/10/archived-at-email-header-from-mailman-3-lists/ > > I often click the "view it on Github" link in GitHub issue emails. (It's > after the '--' email signature delimiter, so it doesn't take up so much > room). > ? > "[feature] Add permalink to mail message to the footer when delivering > email" > https://gitlab.com/mailman/hyperkitty/issues/27 This needs to be in Mailman Core, not HyperKitty. As I note above, I filed an RFE with core and also referenced it in the HyperKitty issue > Finally, how would a transition be handled? I assume the old archives > would be retained, so would there be a cut-off date and people would > have to know to use the old or new archives based on the date of the > message? > > > Could an HTTP redirect help with directing users to the new or old archives? What we did when migrating security-sig is we migrated the archive but kept the old one and added this message and link to the old archive page. "This list has been migrated to Mailman 3. This archive is not being updated. Here is the new archive including these old posts." We also redirected to . We purposely didn't redirect the old archive so that saved URLs would still work. We did the same things for security-announce and clearly can do the same for future migrations. Finally note that Mailman 3 supports archivers other than HyperKitty. For example, one can configure a list to archive at www.mail-archive.com, in such a way that the Archived-At: permalink points to the message at www.mail-archive.com. -- Mark Sapiro The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan From wes.turner at gmail.com Thu Oct 26 16:26:03 2017 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 26 Oct 2017 16:26:03 -0400 Subject: [Python-Dev] iso8601 parsing In-Reply-To: References: <01e69881-3710-87c8-f47a-dfc427ec65b5@mgmiller.net> <11362716.0I2SPu8sME@hammer.magicstack.net> <0fba01d34dca$47ad7940$d7086bc0$@sdamon.com> <20171025213056.GP9068@ando.pearwood.info> Message-ID: On Thursday, October 26, 2017, Chris Barker wrote: > On Wed, Oct 25, 2017 at 7:37 PM, Wes Turner > wrote: > >> ISO 8601 support offsets, but not time zones -- presumably the __str__ >>> supports the full datetime tzinfo somehow. Which may be why .isoformat() >>> exists. >>> >> >> ISO8601 does support timezones. >> https://en.wikipedia.org/wiki/ISO_8601#Time_zone_designators >> > > No, it doesn't -- it may call them "timezones", but it only supports > offsets -- that is, and offset of -6 could be US Eastern Standard Time or > US Central Daylight TIme (or I got that backwards :-) ) > > The point is that an offset is really easy, and timezones (with Daylight > savings and all that) are a frickin' nightmare, but ARE supported by > datetime > > Note that the vocabulary is not precise here, as I see this in the Pyton > docs: > > *class *datetime.timezone > > A class that implements the tzinfo > abstract > base class as a fixed offset from the UTC. > So THAT is supported by iso8601, and, indeed maps naturally to it. > Got it, thanks. 
> > Which means we can round trip isp8601 datetimes nicely, but can't round > trip a datetime with a "full featured" tzinfo attached. > Because an iso8601 string only persists the offset. > > I don't think this really has any impact on the proposal, though: it's > clear what to do when parsing a iso Datetime. > > I might be wrong, but I think many of the third party libraries listed >> here default to either UTC or timezone-naieve timezones: >> https://github.com/vinta/awesome-python/blob/master/README. >> md#date-and-time >> > > This is a key point that I hope is obvious: > > > If an ISO string has NO offset or timezone indicator, then a naive > datetime should be created. > > > (I say, I "hope" it's obvious, because the numpy datetime64 implementation > initially (and for years) would apply the machine local timezone to a bare > iso string -- which was a f-ing nightmare!) > astropy.time.Time supports numpy. > > >> Ctrl-F for 'tzinfo=' in the docs really doesn't explain how to just do it >> with my local time. >> >> Here's an example with a *custom* GMT1 tzinfo subclass: >> https://docs.python.org/3/library/datetime.html#datetime.time.tzname >> > > Here it is: > > class GMT1(tzinfo): > def utcoffset(self, dt): > return timedelta(hours=1) > def dst(self, dt): > return timedelta(0) > def tzname(self,dt): > return "Europe/Prague" > > I hope Prague doesn't do DST, or that would be just wrong ... > Pendulum seems to have a faster timezone lookup than pytz: https://pendulum.eustace.io/blog/a-faster-alternative-to-pyz.html Both pendulum and pytz are in conda-forge (the new basis for the anaconda distribution). > > What would you call the str argument? Does it accept strptime args or only >> ISO8601? >> > > I think Fred answered this, but iso 8601 only. we already have strptime if > you need to parse anything else. > > Would all of that string parsing logic be a performance regression from >> the current constructor? Does it accept None or empty string? >> > > I suppose you need to do a little type checking first, so a tiny one. > > Though maybe just catching an Exception, so really tiny. > > The current constructor only takes numbers, so yes the string parsing > version would be slower, but only if you use it... > > Deserializing dates from JSON (without #JSONLD and xsd:dateTime (ISO8601)) >> types is nasty, regardless (try/except, *custom* schema awareness). And >> pickle is dangerous. >> >> AFAIU, we should not ever eval(repr(dt: datetime)). >> > > why not? isn't that what __repr__ is supposed to do? > repr(dict) now returns ellipses ... for cyclical dicts; so I'm assuming that repr only MAY be eval'able. > > Or do you mean not that it shouldn't work, but that we shouldn't do it? > That We shouldn't ever eval untrusted data / code. (That's why we need package hashes, signatures, and TUF). > > -CHB > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE > > (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Thu Oct 26 16:36:15 2017 From: wes.turner at gmail.com (Wes Turner) Date: Thu, 26 Oct 2017 16:36:15 -0400 Subject: [Python-Dev] Migrate python-dev to Mailman 3? 
In-Reply-To: References: Message-ID: On Thursday, October 26, 2017, Mark Sapiro wrote: > On 10/26/2017 07:28 AM, Wes Turner wrote: > > > > > > On Thursday, October 26, 2017, Paul Moore > wrote: > > > > On 26 October 2017 at 10:24, Victor Stinner > > > wrote: > > > We are using Mailman 3 for the new buildbot-status mailing list > and it > > > works well: > > > > > > > > https://mail.python.org/mm3/archives/list/buildbot-status@ > python.org/ > > > > ... > > > > My only use of the pipermail archives is to find permanent URLs for > > mails I want to refer people to. My usage goes as follows: > > > > 1. Google search for a post. > > 2. Paste in the URL to an email. > > > > Or, if I have the post already (usually in my email client). > > > > 1. Check the date and subject of the post. > > 2. Go to the pipermail article by month, and scan the list for the > > subject and author. > > 3. Click on the link, check it's the right email, copy the URL. > > 4. Paste it into my email. > > > > I don't use the archives for reading. If the above two usages are > > still available, I don't care. But in particular, the fact that > > individual posts are searchable from Google is important to me. > > > A Google search narrowed with "site:mail.python.org" and perhaps > "inurl:listname at python.org " works for HyperKitty archives > as well. Also, > the archive itself has a "search this list" box. Gmail also supports "list:python.org" now. > > > ...> The complexity of this process is also very wastefully frustrating to > > me. (Maybe it's in the next month's message tree? No fulltext search? No > > way to even do an inurl: search because of the URIs?!) > > > I don't see these issues. There is a full text search box on the archive > page and I don't see the problem with Google inurl: This URL style would work with inurl: inurl:x.TLD/THREADID/msgid These can't span across the year-month or otherwise catch other threads in the result set: inurl:mail.python.org/pipermail/astropy/2017-September/0001.html inurl:mail.python.org/pipermail/astropy/2018-January/0002.html > > > > Isn't there a way to append a permalink to the relayed message footers? > > Google Groups and Github do this and it saves a lot of time. > > > As you note below, there is an Archived-At: header. I have just > submitted an RFE at to > enable placing this in the message header/footer. > > > > [Re-searches for things] > > > > Mailman3 adds an RFC 5064 "Archived-At" header with a link that some > > clients provide the ability to open in a normal human browser: > > > > http://dustymabe.com/2016/01/10/archived-at-email-header- > from-mailman-3-lists/ > > > > I often click the "view it on Github" link in GitHub issue emails. (It's > > after the '--' email signature delimiter, so it doesn't take up so much > > room). > > > > "[feature] Add permalink to mail message to the footer when delivering > > email" > > https://gitlab.com/mailman/hyperkitty/issues/27 > > > This needs to be in Mailman Core, not HyperKitty. As I note above, I > filed an RFE with core and also referenced it in the HyperKitty issue Thanks! > > > > Finally, how would a transition be handled? I assume the old archives > > would be retained, so would there be a cut-off date and people would > > have to know to use the old or new archives based on the date of the > > message? > > > > > > Could an HTTP redirect help with directing users to the new or old > archives? 
> > > What we did when migrating security-sig is we migrated the archive but > kept the old one and added this message and link to the old archive page. > > "This list has been migrated to Mailman 3. This archive is not being > updated. Here is the new archive including these old posts." > > We also redirected > to > . > > We purposely didn't redirect the old archive so that saved URLs would > still work. > > We did the same things for security-announce and clearly can do the same > for future migrations. Great. > > Finally note that Mailman 3 supports archivers other than HyperKitty. > For example, one can configure a list to archive at > www.mail-archive.com, in such a way that the Archived-At: permalink > points to the message at www.mail-archive.com. Someday someone will have the time to implement this in e.g. posterious or hyperkitty or from a complete mbox: https://github.com/westurner/wiki/wiki/ideas#open-source-mailing-list-extractor Thanks again! > > -- > Mark Sapiro > The highway is for > gamblers, > San Francisco Bay Area, California better use your sense - B. Dylan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at ethanhs.me Thu Oct 26 18:42:19 2017 From: ethan at ethanhs.me (Ethan Smith) Date: Thu, 26 Oct 2017 15:42:19 -0700 Subject: [Python-Dev] PEP 561: Distributing and Packaging Type Information Message-ID: Hello all, I have completed an implementation for PEP 561, and believe it is time to share the PEP and implementation with python-dev Python-ideas threads: * PEP 561: Distributing and Packaging Type Information * PEP 561 v2 - Packaging Static Type Information * PEP 561: Distributing Type Information V3 The live version is here: https://www.python.org/dev/peps/pep-0561/ As always, duplicated below. Ethan Smith --------------------------------------------------- PEP: 561 Title: Distributing and Packaging Type Information Author: Ethan Smith Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 09-Sep-2017 Python-Version: 3.7 Post-History: Abstract ======== PEP 484 introduced type hinting to Python, with goals of making typing gradual and easy to adopt. Currently, typing information must be distributed manually. This PEP provides a standardized means to package and distribute type information and an ordering for type checkers to resolve modules and collect this information for type checking using existing packaging architecture. Rationale ========= Currently, package authors wish to distribute code that has inline type information. However, there is no standard method to distribute packages with inline type annotations or syntax that can simultaneously be used at runtime and in type checking. Additionally, if one wished to ship typing information privately the only method would be via setting ``MYPYPATH`` or the equivalent to manually point to stubs. If the package can be released publicly, it can be added to typeshed [1]_. However, this does not scale and becomes a burden on the maintainers of typeshed. Additionally, it ties bugfixes to releases of the tool using typeshed. PEP 484 has a brief section on distributing typing information. In this section [2]_ the PEP recommends using ``shared/typehints/pythonX.Y/`` for shipping stub files. However, manually adding a path to stub files for each third party library does not scale. The simplest approach people have taken is to add ``site-packages`` to their ``MYPYPATH``, but this causes type checkers to fail on packages that are highly dynamic (e.g. 
sqlalchemy and Django).

Specification
=============

There are several motivations and methods of supporting typing in a
package. This PEP recognizes three (3) types of packages that may be
created:

1. The package maintainer would like to add type information inline.

2. The package maintainer would like to add type information via stubs.

3. A third party would like to share stub files for a package, but the
   maintainer does not want to include them in the source of the package.

This PEP aims to support these scenarios and make them simple to add to
packaging and deployment.

The two major parts of this specification are the packaging specifications
and the resolution order for resolving module type information. The
packaging spec is based on and extends PEP 345 metadata. The type checking
spec is meant to replace the ``shared/typehints/pythonX.Y/`` spec of PEP
484 [2]_.

New third party stub libraries are encouraged to distribute stubs via the
third party packaging proposed in this PEP in place of being added to
typeshed. Typeshed will remain in use, but if maintainers are found, third
party stubs in typeshed are encouraged to be split into their own package.

Packaging Type Information
--------------------------

In order to make packaging and distributing type information as simple and
easy as possible, the distribution of type information and typed Python
code is done through existing packaging frameworks.

This PEP adds a new item to the ``*.distinfo/METADATA`` file to contain
metadata about a package's support for typing. The new item is optional,
but must have a name of ``Typed`` and have a value of either ``inline`` or
``stubs``, if present.

Metadata Examples::

    Typed: inline
    Typed: stubs


Stub Only Packages
''''''''''''''''''

For package maintainers wishing to ship stub files containing all of their
type information, it is preferred that the ``*.pyi`` stubs are alongside the
corresponding ``*.py`` files. However, the stubs may be put in a sub-folder
of the Python sources, with the same name as the directory the ``*.py``
files are in. For example, the ``flyingcircus`` package would have its
stubs in the folder ``flyingcircus/flyingcircus/``. This path is chosen so
that if stubs are not found in ``flyingcircus/`` the type checker may treat
the subdirectory as a normal package. The normal resolution order of
checking ``*.pyi`` before ``*.py`` will be maintained.

Third Party Stub Packages
'''''''''''''''''''''''''

Third parties seeking to distribute stub files are encouraged to contact
the maintainer of the package about distribution alongside the package. If
the maintainer does not wish to maintain or package stub files or type
information inline, then a "third party stub package" should be created.

The structure is similar to, but slightly different from, that of stub only
packages. If the stubs are for the library ``flyingcircus`` then the package
should be named ``flyingcircus-stubs`` and the stub files should be put in a
sub-directory named ``flyingcircus``. This allows the stubs to be checked as
if they were in a regular package.

In addition, the third party stub package should indicate which version(s)
of the runtime package are supported by indicating the runtime package's
version(s) through the normal dependency data. For example, if there was a
stub package ``flyingcircus-stubs``, it can indicate the versions of the
runtime ``flyingcircus`` package supported through ``install_requires``
in distutils based tools, or the equivalent in other packaging tools.
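For illustration only, here is a rough, non-normative sketch of how such a
stub package might be declared with setuptools (the package names, the
version bounds, and the chosen tool are all hypothetical; any packaging
tool that can express the same metadata would work equally well)::

    # setup.py for a hypothetical "flyingcircus-stubs" distribution (sketch only).
    # The stub files (e.g. flyingcircus/__init__.pyi) sit in a sub-directory
    # named after the runtime package, per the layout described above.
    from setuptools import setup

    setup(
        name="flyingcircus-stubs",
        version="1.0.0",
        packages=["flyingcircus"],
        # ship only the *.pyi stub files from that directory
        package_data={"flyingcircus": ["*.pyi"]},
        # record which runtime flyingcircus versions these stubs describe
        install_requires=["flyingcircus>=1.0,<2.0"],
    )

The important part for this PEP is the dependency declaration, which lets
tools (and people) see which runtime releases the stubs are meant to track.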
Type Checker Module Resolution Order ------------------------------------ The following is the order that type checkers supporting this PEP should resolve modules containing type information: 1. User code - the files the type checker is running on. 2. Stubs or Python source manually put in the beginning of the path. Type checkers should provide this to allow the user complete control of which stubs to use, and patch broken stubs/inline types from packages. 3. Third party stub packages - these packages can supersede the installed untyped packages. They can be found at ``pkg-stubs`` for package ``pkg``, however it is encouraged to check the package's metadata using packaging query APIs such as ``pkg_resources`` to assure that the package is meant for type checking, and is compatible with the installed version. 4. Inline packages - finally, if there is nothing overriding the installed package, and it opts into type checking. 5. Typeshed (if used) - Provides the stdlib types and several third party libraries Type checkers that check a different Python version than the version they run on must find the type information in the ``site-packages``/``dist-packages`` of that Python version. This can be queried e.g. ``pythonX.Y -c 'import site; print(site.getsitepackages())'``. It is also recommended that the type checker allow for the user to point to a particular Python binary, in case it is not in the path. To check if a package has opted into type checking, type checkers are recommended to use the ``pkg_resources`` module to query the package metadata. If the ``typed`` package metadata has ``None`` as its value, the package has not opted into type checking, and the type checker should skip that package. Implementation ============== A CPython branch with a modified distutils supporting the ``typed`` setup keyword lives here: [impl]_. In addition, a sample package with inline types is available [typed_pkg]_, as well as a sample package [pkg_checker]_ which reads the metadata of installed packages and reports on their status as either not typed, inline typed, or a stub package. Acknowledgements ================ This PEP would not have been possible without the ideas, feedback, and support of Ivan Levkivskyi, Jelle Zijlstra, Nick Coghlan, Daniel F Moisset, and Guido van Rossum. Version History =============== * 2017-10-26 * Added implementation references. * Added acknowledgements and version history. * 2017-10-06 * Rewritten to use .distinfo/METADATA over a distutils specific command. * Clarify versioning of third party stub packages. * 2017-09-11 * Added information about current solutions and typeshed. * Clarify rationale. References ========== .. [1] Typeshed (https://github.com/python/typeshed) .. [2] PEP 484, Storing and Distributing Stub Files (https://www.python.org/dev/peps/pep-0484/#storing-and-distributing-stub-files) .. [impl] CPython sample implementation (https://github.com/ethanhs/cpython/tree/typeddist) .. [typed_pkg] Sample typed package (https://github.com/ethanhs/sample-typed-package) .. [pkg_checker] Sample package checker (https://github.com/ethanhs/check_typedpkg) Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From phd at phdru.name Thu Oct 26 19:27:25 2017 From: phd at phdru.name (Oleg Broytman) Date: Fri, 27 Oct 2017 01:27:25 +0200 Subject: [Python-Dev] PEP 561: Distributing and Packaging Type Information In-Reply-To: References: Message-ID: <20171026232725.GA914@phdru.name> Hi! On Thu, Oct 26, 2017 at 03:42:19PM -0700, Ethan Smith wrote: > Post-History: Not sure if postings to python-ideas count, but Post-History: 10-Sep-2017, 12-Sep-2017, 26-Oct-2017 Refs: https://mail.python.org/pipermail/python-ideas/2017-September/047015.html https://mail.python.org/pipermail/python-ideas/2017-September/047083.html > This PEP adds a new item to the > ``*.distinfo/METADATA`` file *.dist-info/ Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From mariatta.wijaya at gmail.com Thu Oct 26 19:48:23 2017 From: mariatta.wijaya at gmail.com (Mariatta Wijaya) Date: Thu, 26 Oct 2017 16:48:23 -0700 Subject: [Python-Dev] PEP 561: Distributing and Packaging Type Information In-Reply-To: <20171026232725.GA914@phdru.name> References: <20171026232725.GA914@phdru.name> Message-ID: > > Not sure if postings to python-ideas count, PEP 1 says: Post-History is used to record the dates of when new versions of the PEP are posted to python-list and/or python-dev. So, no ? Mariatta Wijaya -------------- next part -------------- An HTML attachment was scrubbed... URL: From phd at phdru.name Thu Oct 26 19:57:02 2017 From: phd at phdru.name (Oleg Broytman) Date: Fri, 27 Oct 2017 01:57:02 +0200 Subject: [Python-Dev] PEP 561: Distributing and Packaging Type Information In-Reply-To: References: <20171026232725.GA914@phdru.name> Message-ID: <20171026235702.GA1976@phdru.name> On Thu, Oct 26, 2017 at 04:48:23PM -0700, Mariatta Wijaya wrote: > > > > Not sure if postings to python-ideas count, > > PEP 1 says: > > Post-History is used to record the dates of when new versions of the PEP > are posted to python-list and/or python-dev. That's was added in 2003: https://hg.python.org/peps/annotate/96614829c145/pep-0001.txt https://github.com/python/peps/commit/0a690292ffe2cdc547dbad3bdbdb46672012b536 I don't remember if python-ideas has already been created. ;-) > So, no ? I'm not so sure. :-) > Mariatta Wijaya Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From ethan at ethanhs.me Thu Oct 26 20:00:40 2017 From: ethan at ethanhs.me (Ethan Smith) Date: Thu, 26 Oct 2017 17:00:40 -0700 Subject: [Python-Dev] PEP 561: Distributing and Packaging Type Information In-Reply-To: References: <20171026232725.GA914@phdru.name> Message-ID: On Thu, Oct 26, 2017 at 4:48 PM, Mariatta Wijaya wrote: > Not sure if postings to python-ideas count, > > > PEP 1 says: > > Post-History is used to record the dates of when new versions of the PEP > are posted to python-list and/or python-dev. > > So, no ? > Reading PEP 12, https://www.python.org/dev/peps/pep-0012/#id24 - Leave Post-History alone for now; you'll add dates to this header each time you post your PEP to python-list at python.org or python-dev at python.org. If you posted your PEP to the lists on August 14, 2001 and September 3, 2001, the Post-History header would look like: Post-History: 14-Aug-2001, 03-Sept-2001 You must manually add new dates and check them in. If you don't have check-in privileges, send your changes to the PEP editors. Perhaps it is outdated and needs to have python-ideas added? 
python-ideas was created around 2006 (according to the archives), so after PEP 1/12 were written. > This PEP adds a new item to the > ``*.distinfo/METADATA`` file *.dist-info/ Thank you for catching that. I will fix that with my next round of edits. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gvanrossum at gmail.com Thu Oct 26 20:03:41 2017 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu, 26 Oct 2017 17:03:41 -0700 Subject: [Python-Dev] PEP 561: Distributing and Packaging Type Information In-Reply-To: <20171026235702.GA1976@phdru.name> References: <20171026232725.GA914@phdru.name> <20171026235702.GA1976@phdru.name> Message-ID: I think python-ideas does count here. Many PEPs evolve mostly there. On Oct 26, 2017 4:59 PM, "Oleg Broytman" wrote: > On Thu, Oct 26, 2017 at 04:48:23PM -0700, Mariatta Wijaya < > mariatta.wijaya at gmail.com> wrote: > > > > > > Not sure if postings to python-ideas count, > > > > PEP 1 says: > > > > Post-History is used to record the dates of when new versions of the PEP > > are posted to python-list and/or python-dev. > > That's was added in 2003: > https://hg.python.org/peps/annotate/96614829c145/pep-0001.txt > https://github.com/python/peps/commit/0a690292ffe2cdc547dbad3bdbdb46 > 672012b536 > I don't remember if python-ideas has already been created. ;-) > > > So, no ? > > I'm not so sure. :-) > > > Mariatta Wijaya > > Oleg. > -- > Oleg Broytman http://phdru.name/ phd at phdru.name > Programmers don't die, they just GOSUB without RETURN. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mariatta.wijaya at gmail.com Thu Oct 26 20:21:57 2017 From: mariatta.wijaya at gmail.com (Mariatta Wijaya) Date: Thu, 26 Oct 2017 17:21:57 -0700 Subject: [Python-Dev] PEP 561: Distributing and Packaging Type Information In-Reply-To: References: <20171026232725.GA914@phdru.name> <20171026235702.GA1976@phdru.name> Message-ID: Ok I created an issue https://github.com/python/peps/issues/440, maybe someone can work on updating the wordings in PEP 1 and PEP 12. Thanks :) Mariatta Wijaya On Thu, Oct 26, 2017 at 5:03 PM, Guido van Rossum wrote: > I think python-ideas does count here. Many PEPs evolve mostly there. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From phd at phdru.name Thu Oct 26 20:31:31 2017 From: phd at phdru.name (Oleg Broytman) Date: Fri, 27 Oct 2017 02:31:31 +0200 Subject: [Python-Dev] PEP 561: Distributing and Packaging Type Information In-Reply-To: References: <20171026232725.GA914@phdru.name> <20171026235702.GA1976@phdru.name> Message-ID: <20171027003131.GA4066@phdru.name> Proposed pull request: https://github.com/python/peps/pull/441 On Thu, Oct 26, 2017 at 05:21:57PM -0700, Mariatta Wijaya wrote: > Ok I created an issue https://github.com/python/peps/issues/440, maybe > someone can work on updating the wordings in PEP 1 and PEP 12. > Thanks :) > > Mariatta Wijaya > > On Thu, Oct 26, 2017 at 5:03 PM, Guido van Rossum > wrote: > > > I think python-ideas does count here. Many PEPs evolve mostly there. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. 
From barry at python.org Thu Oct 26 21:38:45 2017
From: barry at python.org (Barry Warsaw)
Date: Thu, 26 Oct 2017 21:38:45 -0400
Subject: [Python-Dev] PEP 561: Distributing and Packaging Type Information
In-Reply-To: 
References: <20171026232725.GA914@phdru.name>
 <20171026235702.GA1976@phdru.name>
Message-ID: 

On Oct 26, 2017, at 20:03, Guido van Rossum wrote:
>
> I think python-ideas does count here. Many PEPs evolve mostly there.

True, but there was some discussion of this way back when.

The way I remember it was that, while there are many outlets to discuss
PEPs (including those pointed to by the optional Discussions-To header),
python-dev is the "forum of record". This means that python-dev is the
only mailing list you *have* to follow if you want to be informed of a
PEP's status in a timely manner. Thus Post-History is supposed to reflect
the history of when the PEP is sent to python-dev. python-list is included
because it's the primary mailing list followed by people who aren't
developing Python but are still interested in it.

Maybe this needs to be reconsidered here in 2017, but that's the rationale
for the wording of PEP 1.

Cheers,
-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL: 

From barry at python.org Thu Oct 26 22:01:12 2017
From: barry at python.org (Barry Warsaw)
Date: Thu, 26 Oct 2017 22:01:12 -0400
Subject: [Python-Dev] Migrate python-dev to Mailman 3?
In-Reply-To: 
References: <20171026120137.1de34389@fsol>
Message-ID: 

On Oct 26, 2017, at 06:15, Victor Stinner wrote:

> I don't think that Mailman 3 gives the choice of the UI for archives.

Technically, it does. Mailman 3 has a pluggable architecture and supports
multiple archives enabled site-wide and opt-in by individual lists.
HyperKitty is the default archiver, and the one we promote, but it doesn't
have to be the only archiver enabled. In fact, we come with plugins for
mail-archive.com and MHonarc. It *might* even be possible to enable a
standalone Pipermail and route messages to that if one were so inclined.
The choice of archivers is not mutually exclusive.

Practically speaking though, there just aren't a ton of well maintained
FLOSS archivers to choose from. HyperKitty *is* well maintained. Frankly
speaking, Pipermail is not.

Cheers,
-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL: 

From barry at python.org Thu Oct 26 22:11:04 2017
From: barry at python.org (Barry Warsaw)
Date: Thu, 26 Oct 2017 22:11:04 -0400
Subject: [Python-Dev] Migrate python-dev to Mailman 3?
In-Reply-To: 
References: <20171026120137.1de34389@fsol>
 <090a325f-6b04-2cbc-e677-0493dc0273ad@python.org>
Message-ID: <3D92AED4-B97F-43D7-9EC5-D311A3911734@python.org>

On Oct 26, 2017, at 10:01, Donald Stufft wrote:

> Pipermail is *horrible*

Pipermail also has a fatal flaw, and we have been hit by it several times
in our past. It's fundamental to Pipermail's design and can't be fixed.
Fortunately, HyperKitty was designed and implemented correctly so it
doesn't suffer this flaw.

Pipermail indexes messages sequentially, and if you ever regenerate the
archive from the source mbox, it's almost guaranteed that your messages
will get different URLs. Worse, you can't even automate a mapping from new
URLs to old URLs.
This is especially likely in archives that go back as far as python-dev
does, because there was a bug back in the day where even the source mbox
file got corrupted, where the separator between messages was broken. We
tried to implement a fix for that, but it's a heuristic and it's not
perfect.

We say that Pipermail does not have "stable urls". Thankfully HyperKitty
does! So even if you regenerate the HyperKitty archive, your messages will
end up with the same URLs. This lets us implement Archived-At stably, and
the algorithm at [1] even lets us pre-calculate the URL, so we can even
include the URL to where the message *will* be once it's archived, even
without talking to HyperKitty, or any of the other archivers that are
enabled (and support the algorithm of course).

So HyperKitty is miles ahead of Pipermail in design and implementation.
Sure it's different, but people also forget how really buggy Pipermail was
for a long time. (And trust me, you really don't even want to look at the
code. ;)

Cheers,
-Barry

[1] https://wiki.list.org/DEV/Stable%20URLs

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL: 

From guido at python.org Fri Oct 27 00:12:10 2017
From: guido at python.org (Guido van Rossum)
Date: Thu, 26 Oct 2017 21:12:10 -0700
Subject: [Python-Dev] PEP 561: Distributing and Packaging Type Information
In-Reply-To: 
References: <20171026232725.GA914@phdru.name>
 <20171026235702.GA1976@phdru.name>
Message-ID: 

Heh, you're right that was the reasoning. But I think python-list is much
less valuable than python-ideas for PEP authors. So let's change it.

On Thu, Oct 26, 2017 at 6:38 PM, Barry Warsaw wrote:

> On Oct 26, 2017, at 20:03, Guido van Rossum wrote:
> >
> > I think python-ideas does count here. Many PEPs evolve mostly there.
>
> True, but there was some discussion of this way back when.
>
> The way I remember it was that, while there are many outlets to discuss
> PEPs (including those pointed to by the optional Discussions-To header),
> python-dev is the "forum of record". This means that python-dev is the
> only mailing list you *have* to follow if you want to be informed of a
> PEP's status in a timely manner. Thus Post-History is supposed to reflect
> the history of when the PEP is sent to python-dev. python-list is included
> because it's the primary mailing list followed by people who aren't
> developing Python but are still interested in it.
>
> Maybe this needs to be reconsidered here in 2017, but that's the rationale
> for the wording of PEP 1.
>
> Cheers,
> -Barry
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/
> guido%40python.org
>
>

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ncoghlan at gmail.com Fri Oct 27 00:35:58 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 27 Oct 2017 14:35:58 +1000
Subject: [Python-Dev] Migrate python-dev to Mailman 3?
In-Reply-To: 
References: 
Message-ID: 

On 27 October 2017 at 00:28, Wes Turner wrote:

>
> On Thursday, October 26, 2017, Paul Moore wrote:
>
>> So basically, what I'm asking is what would be the support for the use
>> case "Find a permanent link to an archived article as fast as possible
>> based on subject/author or a Google search".
> > > The complexity of this process is also very wastefully frustrating to me. > (Maybe it's in the next month's message tree? No fulltext search? No way to > even do an inurl: search because of the URIs?!) > > Isn't there a way to append a permalink to the relayed message footers? > Google Groups and Github do this and it saves a lot of time. > MM3 injects an Archived-At header, as the permalink URLs for individual messages are generated based on a hash of a suitable subset of the message headers (I don't know if it's possible to opt-in to including those in the message footer or not, though). As Barry explained, this isn't possible with pipermail, as those archive URLs are dynamically generated and are completely independent of the message contents and metadata (this is also why you can't safely delete messages from MM2 archives: doing so will renumber the archive URLs for all subsequent messages). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Oct 27 03:44:50 2017 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 27 Oct 2017 00:44:50 -0700 Subject: [Python-Dev] PEP 561: Distributing and Packaging Type Information In-Reply-To: References: Message-ID: On Thu, Oct 26, 2017 at 3:42 PM, Ethan Smith wrote: > However, the stubs may be put in a sub-folder > of the Python sources, with the same name the ``*.py`` files are in. For > example, the ``flyingcircus`` package would have its stubs in the folder > ``flyingcircus/flyingcircus/``. This path is chosen so that if stubs are > not found in ``flyingcircus/`` the type checker may treat the subdirectory as > a normal package. I admit that I find this aesthetically unpleasant. Wouldn't something like __typestubs__/ be a more Pythonic name? (And also avoid potential name clashes, e.g. my async_generator package has a top-level export called async_generator; normally you do 'from async_generator import async_generator'. I think that might cause problems if I created an async_generator/async_generator/ directory, especially post-PEP 420.) I also don't understand the given rationale -- it sounds like you want to be able say well, if ${SOME_DIR_ON_PYTHONPATH}/flyingcircus/ doesn't contain stubs, then just stick the ${SOME_DIR_ON_PYTHONPATH}/flyingcircus/ directory *itself* onto PYTHONPATH, and then try again. But that's clearly the wrong thing, because then you'll also be adding a bunch of other random junk into that directory into the top-level namespace. For example, suddenly the flyingcircus.summarise_proust module has become a top-level summarise_proust package. I must be misunderstanding something? > Type Checker Module Resolution Order > ------------------------------------ > > The following is the order that type checkers supporting this PEP should > resolve modules containing type information: > > 1. User code - the files the type checker is running on. > > 2. Stubs or Python source manually put in the beginning of the path. Type > checkers should provide this to allow the user complete control of which > stubs to use, and patch broken stubs/inline types from packages. > > 3. Third party stub packages - these packages can supersede the installed > untyped packages. 
They can be found at ``pkg-stubs`` for package ``pkg``, > however it is encouraged to check the package's metadata using packaging > query APIs such as ``pkg_resources`` to assure that the package is meant > for type checking, and is compatible with the installed version. Am I right that this means you need to be able to map from import names to distribution names? I.e., if you see 'import foo', you need to figure out which *.dist-info directory contains metadata for the 'foo' package? How do you plan to do this? The problem is that technically, import names and distribution names are totally unrelated namespaces -- for example, the '_pytest' package comes from the 'pytest' distribution, the 'pylab' package comes from 'matplotlib', and 'pip install scikit-learn' gives you a package imported as 'sklearn'. Namespace packages are also challenging, because a single top-level package might actually be spread across multiple distributions. -n -- Nathaniel J. Smith -- https://vorpus.org From solipsis at pitrou.net Fri Oct 27 05:31:04 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 27 Oct 2017 11:31:04 +0200 Subject: [Python-Dev] PEP 561: Distributing and Packaging Type Information References: Message-ID: <20171027113104.4cbc4194@fsol> On Thu, 26 Oct 2017 15:42:19 -0700 Ethan Smith wrote: > Stub Only Packages > '''''''''''''''''' > > For package maintainers wishing to ship stub files containing all of their > type information, it is prefered that the ``*.pyi`` stubs are alongside the > corresponding ``*.py`` files. However, the stubs may be put in a sub-folder > of the Python sources, with the same name the ``*.py`` files are in. For > example, the ``flyingcircus`` package would have its stubs in the folder > ``flyingcircus/flyingcircus/``. This path is chosen so that if stubs are > not found in ``flyingcircus/`` the type checker may treat the subdirectory as > a normal package. The normal resolution order of checking ``*.pyi`` before > ``*.py`` will be maintained. I am not sure I understand the rationale for this. What would be the problem with looking for the stubs in a directory named, e.g; "flyingcircus/__typing__"? Regards Antoine. From solipsis at pitrou.net Fri Oct 27 05:43:20 2017 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 27 Oct 2017 11:43:20 +0200 Subject: [Python-Dev] PEP 561: Distributing and Packaging Type Information References: <20171027113104.4cbc4194@fsol> Message-ID: <20171027114320.5388a85f@fsol> On Fri, 27 Oct 2017 11:31:04 +0200 Antoine Pitrou wrote: > On Thu, 26 Oct 2017 15:42:19 -0700 > Ethan Smith wrote: > > Stub Only Packages > > '''''''''''''''''' > > > > For package maintainers wishing to ship stub files containing all of their > > type information, it is prefered that the ``*.pyi`` stubs are alongside the > > corresponding ``*.py`` files. However, the stubs may be put in a sub-folder > > of the Python sources, with the same name the ``*.py`` files are in. For > > example, the ``flyingcircus`` package would have its stubs in the folder > > ``flyingcircus/flyingcircus/``. This path is chosen so that if stubs are > > not found in ``flyingcircus/`` the type checker may treat the subdirectory as > > a normal package. The normal resolution order of checking ``*.pyi`` before > > ``*.py`` will be maintained. > > I am not sure I understand the rationale for this. What would be the > problem with looking for the stubs in a directory named, e.g; > "flyingcircus/__typing__"? I just saw Nathaniel asked the same question above. Sorry for the noise! 
Regards Antoine. From horos22 at gmail.com Fri Oct 27 02:03:05 2017 From: horos22 at gmail.com (Ed Peschko) Date: Thu, 26 Oct 2017 23:03:05 -0700 Subject: [Python-Dev] \G (match last position) regex operator non-existant in python? Message-ID: All, perl has a regex assertion (\G) that allows multiple-match regular expressions to be able to use the position of the last match. Perl's documentation puts it this way: \G Match only at pos() (e.g. at the end-of-match position of prior m//g) Anyways, this is exceedingly powerful for matching regularly structured free-form records, and I was really surprised when I found out that python did not have it. For example, if findall supported this, it would be possible to write things like this (a quick and dirty ifconfig parser): pat = re.compile(r'\G(\S+)(.*?\n)(?=\S+|\Z)', re.S) val = """ eth2 Link encap:Ethernet HWaddr xx inet addr: xx.xx.xx.xx Bcast:xx.xx.xx.xx Mask:xx.xx.xx.xx ... lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 """ matches = re.findall(pat, val) So - why doesn't python have this? is it something that simply was overlooked, or is there another method of doing the same thing with arbitrarily complex freeform records? thanks much.. From stefan at bytereef.org Fri Oct 27 04:17:22 2017 From: stefan at bytereef.org (Stefan Krah) Date: Fri, 27 Oct 2017 10:17:22 +0200 Subject: [Python-Dev] If aligned_alloc() is missing on your platform, please let us know. Message-ID: <20171027081722.GA3757@bytereef.org> Hello, we want to add aligned versions of allocation functions to 3.7: https://bugs.python.org/issue18835 C11 has aligned_alloc(). Linux, BSD, OSX, MSVC, Android all have either posix_memalign() or _aligned_malloc(). Cygwin apparently has posix_memalign(). MinGW has: https://github.com/Alexpux/mingw-w64/blob/master/mingw-w64-crt/misc/mingw-aligned-malloc.c Victor wrote a patch and would like to avoid adding a (probably unnecessary) emulation function. I agree with that. So if any platform does not have some form of aligned_alloc(), please speak up. Stefan Krah From stefan at bytereef.org Fri Oct 27 04:35:19 2017 From: stefan at bytereef.org (Stefan Krah) Date: Fri, 27 Oct 2017 10:35:19 +0200 Subject: [Python-Dev] Migrate python-dev to Mailman 3? Message-ID: <20171027083519.GA4094@bytereef.org> Barry Warsaw wrote: > In fact, we come with plugins for mail-archive.com and MHonarc. MHonarc output is nice, practically the same as pipermail: https://lists.debian.org/debian-x/2010/12/threads.html If it is possible to enable (and maintain!) a MHonarc archive with just blue links along with the new interface, everyone should be happy. Stefan Krah From barry at python.org Fri Oct 27 11:27:14 2017 From: barry at python.org (Barry Warsaw) Date: Fri, 27 Oct 2017 11:27:14 -0400 Subject: [Python-Dev] PEP Post-History (was Re: PEP 561: Distributing and Packaging Type Information) In-Reply-To: References: <20171026232725.GA914@phdru.name> <20171026235702.GA1976@phdru.name> Message-ID: <0A6D483C-9DA5-43E4-9A4C-CF4AD04C3651@python.org> On Oct 27, 2017, at 00:12, Guido van Rossum wrote: > > Heh, you're right that was the reasoning. But I think python-list is much less valuable than python-ideas for PEP authors. So let's change it. Sounds good. I just want to make sure we keep python-dev in the loop. This is a process change though, so I?ll work with the PR#441 author to get the update into the PEPs, and then make the announcement on the relevant mailing lists. 
Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From guido at python.org Fri Oct 27 11:35:58 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 27 Oct 2017 08:35:58 -0700 Subject: [Python-Dev] \G (match last position) regex operator non-existant in python? In-Reply-To: References: Message-ID: The "why" question is not very interesting -- it probably wasn't in PCRE and nobody was familiar with it when we moved off PCRE (maybe it wasn't even in Perl at the time -- it was ~15 years ago). I didn't understand your description of \G so I googled it and found a helpful StackOverflow article: https://stackoverflow.com/questions/21971701/when-is-g-useful-application-in-a-regex. >From this I understand that when using e.g. findall() it forces successive matches to be adjacent. In general this seems to be a unique property of \G: it preserves *state* from one match to the next. This will make it somewhat difficult to implement -- e.g. that state should probably be thread-local in case multiple threads use the same compiled regex. It's also unclear when that state should be reset. (Only when you compile the regex? Each time you pass it a different source string?) So I'm not sure it's reasonable to add. But I also don't see a reason why it shouldn't be added -- presuming we can decide on good answer for the questions above about the "scope" of the anchor. I think it's okay to start a discussion on bugs.python.org about the precise specification of \G for Python. OTOH I expect that most core devs won't find this a very interesting problem (Python relies on regexes for parsing a lot less than Perl does). Good luck! On Thu, Oct 26, 2017 at 11:03 PM, Ed Peschko wrote: > All, > > perl has a regex assertion (\G) that allows multiple-match regular > expressions to be able to use the position of the last match. Perl's > documentation puts it this way: > > \G Match only at pos() (e.g. at the end-of-match position of prior > m//g) > > Anyways, this is exceedingly powerful for matching regularly > structured free-form records, and I was really surprised when I found > out that python did not have it. For example, if findall supported > this, it would be possible to write things like this (a quick and > dirty ifconfig parser): > > pat = re.compile(r'\G(\S+)(.*?\n)(?=\S+|\Z)', re.S) > > val = """ > eth2 Link encap:Ethernet HWaddr xx > inet addr: xx.xx.xx.xx Bcast:xx.xx.xx.xx Mask:xx.xx.xx.xx > ... > lo Link encap:Local Loopback > inet addr:127.0.0.1 Mask:255.0.0.0 > """ > matches = re.findall(pat, val) > > So - why doesn't python have this? is it something that simply was > overlooked, or is there another method of doing the same thing with > arbitrarily complex freeform records? > > thanks much.. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Fri Oct 27 11:36:45 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 27 Oct 2017 08:36:45 -0700 Subject: [Python-Dev] PEP Post-History (was Re: PEP 561: Distributing and Packaging Type Information) In-Reply-To: <0A6D483C-9DA5-43E4-9A4C-CF4AD04C3651@python.org> References: <20171026232725.GA914@phdru.name> <20171026235702.GA1976@phdru.name> <0A6D483C-9DA5-43E4-9A4C-CF4AD04C3651@python.org> Message-ID: Great! On Fri, Oct 27, 2017 at 8:27 AM, Barry Warsaw wrote: > On Oct 27, 2017, at 00:12, Guido van Rossum wrote: > > > > Heh, you're right that was the reasoning. But I think python-list is > much less valuable than python-ideas for PEP authors. So let's change it. > > Sounds good. I just want to make sure we keep python-dev in the loop. > > This is a process change though, so I?ll work with the PR#441 author to > get the update into the PEPs, and then make the announcement on the > relevant mailing lists. > > Cheers, > -Barry > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Fri Oct 27 11:50:48 2017 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 27 Oct 2017 10:50:48 -0500 Subject: [Python-Dev] \G (match last position) regex operator non-existant in python? In-Reply-To: References: Message-ID: Note that Matthew Barnett's `regex` module already supports \G, and a great many other features that weren't around 15 years ago ;-) either: https://pypi.python.org/pypi/regex/ I haven't followed this in detail. I'm just surprised once per year that it hasn't been folded into the core ;-) [nothing new below] On Fri, Oct 27, 2017 at 10:35 AM, Guido van Rossum wrote: > The "why" question is not very interesting -- it probably wasn't in PCRE and > nobody was familiar with it when we moved off PCRE (maybe it wasn't even in > Perl at the time -- it was ~15 years ago). > > I didn't understand your description of \G so I googled it and found a > helpful StackOverflow article: > https://stackoverflow.com/questions/21971701/when-is-g-useful-application-in-a-regex. > From this I understand that when using e.g. findall() it forces successive > matches to be adjacent. > > In general this seems to be a unique property of \G: it preserves *state* > from one match to the next. This will make it somewhat difficult to > implement -- e.g. that state should probably be thread-local in case > multiple threads use the same compiled regex. It's also unclear when that > state should be reset. (Only when you compile the regex? Each time you pass > it a different source string?) > > So I'm not sure it's reasonable to add. But I also don't see a reason why it > shouldn't be added -- presuming we can decide on good answer for the > questions above about the "scope" of the anchor. > > I think it's okay to start a discussion on bugs.python.org about the precise > specification of \G for Python. OTOH I expect that most core devs won't find > this a very interesting problem (Python relies on regexes for parsing a lot > less than Perl does). > > Good luck! 
> > On Thu, Oct 26, 2017 at 11:03 PM, Ed Peschko wrote: >> >> All, >> >> perl has a regex assertion (\G) that allows multiple-match regular >> expressions to be able to use the position of the last match. Perl's >> documentation puts it this way: >> >> \G Match only at pos() (e.g. at the end-of-match position of prior >> m//g) >> >> Anyways, this is exceedingly powerful for matching regularly >> structured free-form records, and I was really surprised when I found >> out that python did not have it. For example, if findall supported >> this, it would be possible to write things like this (a quick and >> dirty ifconfig parser): >> >> pat = re.compile(r'\G(\S+)(.*?\n)(?=\S+|\Z)', re.S) >> >> val = """ >> eth2 Link encap:Ethernet HWaddr xx >> inet addr: xx.xx.xx.xx Bcast:xx.xx.xx.xx Mask:xx.xx.xx.xx >> ... >> lo Link encap:Local Loopback >> inet addr:127.0.0.1 Mask:255.0.0.0 >> """ >> matches = re.findall(pat, val) >> >> So - why doesn't python have this? is it something that simply was >> overlooked, or is there another method of doing the same thing with >> arbitrarily complex freeform records? >> >> thanks much.. >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > > -- > --Guido van Rossum (python.org/~guido) > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/tim.peters%40gmail.com > From guido at python.org Fri Oct 27 11:57:57 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 27 Oct 2017 08:57:57 -0700 Subject: [Python-Dev] \G (match last position) regex operator non-existant in python? In-Reply-To: References: Message-ID: Oh. Yes, that is being discussed about once a year two. It seems Matthew isn't very interested in helping out with the port, and there are some concerns about backwards compatibility with the `re` module. I think it needs a champion! On Fri, Oct 27, 2017 at 8:50 AM, Tim Peters wrote: > Note that Matthew Barnett's `regex` module already supports \G, and a > great many other features that weren't around 15 years ago ;-) either: > > https://pypi.python.org/pypi/regex/ > > I haven't followed this in detail. I'm just surprised once per year > that it hasn't been folded into the core ;-) > > [nothing new below] > > On Fri, Oct 27, 2017 at 10:35 AM, Guido van Rossum > wrote: > > The "why" question is not very interesting -- it probably wasn't in PCRE > and > > nobody was familiar with it when we moved off PCRE (maybe it wasn't even > in > > Perl at the time -- it was ~15 years ago). > > > > I didn't understand your description of \G so I googled it and found a > > helpful StackOverflow article: > > https://stackoverflow.com/questions/21971701/when-is-g- > useful-application-in-a-regex. > > From this I understand that when using e.g. findall() it forces > successive > > matches to be adjacent. > > > > In general this seems to be a unique property of \G: it preserves *state* > > from one match to the next. This will make it somewhat difficult to > > implement -- e.g. that state should probably be thread-local in case > > multiple threads use the same compiled regex. It's also unclear when that > > state should be reset. (Only when you compile the regex? 
Each time you > pass > > it a different source string?) > > > > So I'm not sure it's reasonable to add. But I also don't see a reason > why it > > shouldn't be added -- presuming we can decide on good answer for the > > questions above about the "scope" of the anchor. > > > > I think it's okay to start a discussion on bugs.python.org about the > precise > > specification of \G for Python. OTOH I expect that most core devs won't > find > > this a very interesting problem (Python relies on regexes for parsing a > lot > > less than Perl does). > > > > Good luck! > > > > On Thu, Oct 26, 2017 at 11:03 PM, Ed Peschko wrote: > >> > >> All, > >> > >> perl has a regex assertion (\G) that allows multiple-match regular > >> expressions to be able to use the position of the last match. Perl's > >> documentation puts it this way: > >> > >> \G Match only at pos() (e.g. at the end-of-match position of prior > >> m//g) > >> > >> Anyways, this is exceedingly powerful for matching regularly > >> structured free-form records, and I was really surprised when I found > >> out that python did not have it. For example, if findall supported > >> this, it would be possible to write things like this (a quick and > >> dirty ifconfig parser): > >> > >> pat = re.compile(r'\G(\S+)(.*?\n)(?=\S+|\Z)', re.S) > >> > >> val = """ > >> eth2 Link encap:Ethernet HWaddr xx > >> inet addr: xx.xx.xx.xx Bcast:xx.xx.xx.xx Mask:xx.xx.xx.xx > >> ... > >> lo Link encap:Local Loopback > >> inet addr:127.0.0.1 Mask:255.0.0.0 > >> """ > >> matches = re.findall(pat, val) > >> > >> So - why doesn't python have this? is it something that simply was > >> overlooked, or is there another method of doing the same thing with > >> arbitrarily complex freeform records? > >> > >> thanks much.. > >> _______________________________________________ > >> Python-Dev mailing list > >> Python-Dev at python.org > >> https://mail.python.org/mailman/listinfo/python-dev > >> Unsubscribe: > >> https://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > > > > > > > -- > > --Guido van Rossum (python.org/~guido) > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > > https://mail.python.org/mailman/options/python-dev/ > tim.peters%40gmail.com > > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From status at bugs.python.org Fri Oct 27 12:09:41 2017 From: status at bugs.python.org (Python tracker) Date: Fri, 27 Oct 2017 18:09:41 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20171027160941.083E211A85B@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2017-10-20 - 2017-10-27) Python tracker at https://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. 
Issues counts and deltas: open 6260 ( +0) closed 37377 (+59) total 43637 (+59) Open issues with patches: 2411 Issues opened (51) ================== #18835: Add aligned memory variants to the suite of PyMem functions/ma https://bugs.python.org/issue18835 reopened by skrah #28224: Compilation warnings on Windows: export 'PyInit_xx' specified https://bugs.python.org/issue28224 reopened by skrah #30768: PyThread_acquire_lock_timed() should recompute the timeout whe https://bugs.python.org/issue30768 reopened by haypo #31811: async and await missing from keyword list in lexical analysis https://bugs.python.org/issue31811 reopened by yselivanov #31828: Support Py_tss_NEEDS_INIT outside of static initialisation https://bugs.python.org/issue31828 opened by scoder #31829: Portability issues with pickle https://bugs.python.org/issue31829 opened by serhiy.storchaka #31830: asyncio.create_subprocess_exec doesn't capture all stdout outp https://bugs.python.org/issue31830 opened by cannedrag #31831: EmailMessage.add_attachment(filename="long or sp??cial") crash https://bugs.python.org/issue31831 opened by calimeroteknik #31834: BLAKE2: the (pure) SSE2 impl forced on x86_64 is slower than r https://bugs.python.org/issue31834 opened by mgorny #31836: test_code_module fails after test_idle https://bugs.python.org/issue31836 opened by serhiy.storchaka #31837: ParseError in test_all_project_files() https://bugs.python.org/issue31837 opened by serhiy.storchaka #31839: datetime: add method to parse isoformat() output https://bugs.python.org/issue31839 opened by orent #31841: Several methods of collections.UserString do not return instan https://bugs.python.org/issue31841 opened by vaultah #31842: pathlib: "Incorrect function" during resolve() https://bugs.python.org/issue31842 opened by earonesty2 #31843: sqlite3.connect() should accept PathLike objects https://bugs.python.org/issue31843 opened by Allen Li #31844: HTMLParser: undocumented not implemented method https://bugs.python.org/issue31844 opened by srittau #31846: Error in 3.6.3 epub docs https://bugs.python.org/issue31846 opened by n8henrie #31848: "aifc" module does not always initialize "Aifc_read._ssnd_chun https://bugs.python.org/issue31848 opened by Zero #31849: Python/pyhash.c warning: comparison of integers of different s https://bugs.python.org/issue31849 opened by xdegaye #31850: test_nntplib failed with "nntplib.NNTPDataError: line too long https://bugs.python.org/issue31850 opened by haypo #31851: test_subprocess hangs randomly on x86 Windows7 3.x https://bugs.python.org/issue31851 opened by haypo #31852: Crashes with lines of the form "async \" https://bugs.python.org/issue31852 opened by Alexandre Hamelin #31853: Use super().method instead of socket.method in SSLSocket https://bugs.python.org/issue31853 opened by earonesty #31854: Add mmap.ACCESS_DEFAULT to namespace https://bugs.python.org/issue31854 opened by Justus Schwabedal #31855: mock_open is not compatible with read(n) (and pickle.load) https://bugs.python.org/issue31855 opened by ron.rothman #31858: IDLE: cleanup use of sys.ps1 and never set it. 
https://bugs.python.org/issue31858 opened by terry.reedy #31859: sharedctypes.RawArray initialization https://bugs.python.org/issue31859 opened by meetaig #31860: IDLE: Make font sample editable https://bugs.python.org/issue31860 opened by serhiy.storchaka #31861: aiter() and anext() built-in functions https://bugs.python.org/issue31861 opened by davide.rizzo #31862: Port the standard library to PEP 489 multiphase initialization https://bugs.python.org/issue31862 opened by Dormouse759 #31863: Inconsistent returncode/exitcode for terminated child processe https://bugs.python.org/issue31863 opened by Akos Kiss #31865: html.unescape does not work as per documentation https://bugs.python.org/issue31865 opened by cardin #31867: Duplicated keys in MIME type_map with different values https://bugs.python.org/issue31867 opened by Mark.Shannon #31868: Null pointer dereference in ndb.ndbm get when used with a defa https://bugs.python.org/issue31868 opened by tmiasko #31869: commentary on ssl.PROTOCOL_TLS https://bugs.python.org/issue31869 opened by J Sloot #31870: add timeout parameter for get_server_certificate in ssl.py https://bugs.python.org/issue31870 opened by Nixawk #31871: Support for file descriptor params in os.path https://bugs.python.org/issue31871 opened by Mateusz Kurek #31872: SSL BIO is broken for internationalized domains https://bugs.python.org/issue31872 opened by asvetlov #31873: Inconsistent capitalization of proper noun - Unicode. https://bugs.python.org/issue31873 opened by toonarmycaptain #31874: [feature] runpy.run_module should mimic script launch behavior https://bugs.python.org/issue31874 opened by jason.coombs #31875: Error 0x80070642: Failed to install MSI package. https://bugs.python.org/issue31875 opened by Gareth Moger #31876: python363.chm includes gibberish https://bugs.python.org/issue31876 opened by Nim #31878: Cygwin: _socket module does not compile due to missing ioctl d https://bugs.python.org/issue31878 opened by erik.bray #31879: Launcher fails on custom command starting with "python" https://bugs.python.org/issue31879 opened by mrh1997 #31880: subprocess process interaction with IDLEX GUI causes pygnuplot https://bugs.python.org/issue31880 opened by jbrearley #31881: subprocess.returncode not set depending on arguments to subpro https://bugs.python.org/issue31881 opened by nthompson #31882: Cygwin: asyncio and asyncore test suites hang indefinitely due https://bugs.python.org/issue31882 opened by erik.bray #31883: Cygwin: heap corruption bug in wcsxfrm https://bugs.python.org/issue31883 opened by erik.bray #31884: subprocess set priority on windows https://bugs.python.org/issue31884 opened by JamesGKent #31885: Cygwin: socket test suites hang indefinitely due to bug in Cyg https://bugs.python.org/issue31885 opened by erik.bray #31886: Multiprocessing.Pool hangs after re-spawning several worker pr https://bugs.python.org/issue31886 opened by olarn Most recent 15 issues with no replies (15) ========================================== #31886: Multiprocessing.Pool hangs after re-spawning several worker pr https://bugs.python.org/issue31886 #31885: Cygwin: socket test suites hang indefinitely due to bug in Cyg https://bugs.python.org/issue31885 #31884: subprocess set priority on windows https://bugs.python.org/issue31884 #31883: Cygwin: heap corruption bug in wcsxfrm https://bugs.python.org/issue31883 #31882: Cygwin: asyncio and asyncore test suites hang indefinitely due https://bugs.python.org/issue31882 #31880: subprocess process interaction with IDLEX GUI 
causes pygnuplot https://bugs.python.org/issue31880 #31879: Launcher fails on custom command starting with "python" https://bugs.python.org/issue31879 #31878: Cygwin: _socket module does not compile due to missing ioctl d https://bugs.python.org/issue31878 #31876: python363.chm includes gibberish https://bugs.python.org/issue31876 #31875: Error 0x80070642: Failed to install MSI package. https://bugs.python.org/issue31875 #31872: SSL BIO is broken for internationalized domains https://bugs.python.org/issue31872 #31871: Support for file descriptor params in os.path https://bugs.python.org/issue31871 #31870: add timeout parameter for get_server_certificate in ssl.py https://bugs.python.org/issue31870 #31869: commentary on ssl.PROTOCOL_TLS https://bugs.python.org/issue31869 #31865: html.unescape does not work as per documentation https://bugs.python.org/issue31865 Most recent 15 issues waiting for review (15) ============================================= #31885: Cygwin: socket test suites hang indefinitely due to bug in Cyg https://bugs.python.org/issue31885 #31884: subprocess set priority on windows https://bugs.python.org/issue31884 #31883: Cygwin: heap corruption bug in wcsxfrm https://bugs.python.org/issue31883 #31882: Cygwin: asyncio and asyncore test suites hang indefinitely due https://bugs.python.org/issue31882 #31878: Cygwin: _socket module does not compile due to missing ioctl d https://bugs.python.org/issue31878 #31873: Inconsistent capitalization of proper noun - Unicode. https://bugs.python.org/issue31873 #31868: Null pointer dereference in ndb.ndbm get when used with a defa https://bugs.python.org/issue31868 #31862: Port the standard library to PEP 489 multiphase initialization https://bugs.python.org/issue31862 #31860: IDLE: Make font sample editable https://bugs.python.org/issue31860 #31858: IDLE: cleanup use of sys.ps1 and never set it. 
https://bugs.python.org/issue31858 #31854: Add mmap.ACCESS_DEFAULT to namespace https://bugs.python.org/issue31854 #31852: Crashes with lines of the form "async \" https://bugs.python.org/issue31852 #31836: test_code_module fails after test_idle https://bugs.python.org/issue31836 #31834: BLAKE2: the (pure) SSE2 impl forced on x86_64 is slower than r https://bugs.python.org/issue31834 #31829: Portability issues with pickle https://bugs.python.org/issue31829 Top 10 most discussed issues (10) ================================= #18835: Add aligned memory variants to the suite of PyMem functions/ma https://bugs.python.org/issue18835 16 msgs #31803: time.clock() should emit a DeprecationWarning https://bugs.python.org/issue31803 16 msgs #31626: Writing in freed memory in _PyMem_DebugRawRealloc() after shri https://bugs.python.org/issue31626 10 msgs #27987: obmalloc's 8-byte alignment causes undefined behavior https://bugs.python.org/issue27987 9 msgs #20180: Derby #11: Convert 50 sites to Argument Clinic across 9 files https://bugs.python.org/issue20180 8 msgs #30768: PyThread_acquire_lock_timed() should recompute the timeout whe https://bugs.python.org/issue30768 7 msgs #16737: Different behaviours in script run directly and via runpy.run_ https://bugs.python.org/issue16737 6 msgs #25083: Python can sometimes create incorrect .pyc files https://bugs.python.org/issue25083 6 msgs #31831: EmailMessage.add_attachment(filename="long or sp??cial") crash https://bugs.python.org/issue31831 6 msgs #31826: Misleading __version__ attribute of modules in standard librar https://bugs.python.org/issue31826 5 msgs Issues closed (57) ================== #20825: containment test for "ip_network in ip_network" https://bugs.python.org/issue20825 closed by pitrou #21720: "TypeError: Item in ``from list'' not a string" message https://bugs.python.org/issue21720 closed by serhiy.storchaka #22898: segfault during shutdown attempting to log ResourceWarning https://bugs.python.org/issue22898 closed by xdegaye #24920: shutil.get_terminal_size throws AttributeError https://bugs.python.org/issue24920 closed by serhiy.storchaka #25612: nested try..excepts don't work correctly for generators https://bugs.python.org/issue25612 closed by pitrou #26123: http.client status code constants incompatible with Python 3.4 https://bugs.python.org/issue26123 closed by srittau #28028: Convert warnings to SyntaxWarning in parser https://bugs.python.org/issue28028 closed by serhiy.storchaka #28280: Always return a list from PyMapping_Keys/PyMapping_Values/PyMa https://bugs.python.org/issue28280 closed by serhiy.storchaka #28281: Remove year limits from calendar https://bugs.python.org/issue28281 closed by belopolsky #28292: Make Calendar.itermonthdates() behave consistently in edge cas https://bugs.python.org/issue28292 closed by belopolsky #28326: multiprocessing.Process depends on sys.stdout being open https://bugs.python.org/issue28326 closed by pitrou #28506: Multiprocessing Pool starmap - struct.error: 'i' format requir https://bugs.python.org/issue28506 closed by serhiy.storchaka #28645: Drop __aiter__ compatibility layer from 3.7 https://bugs.python.org/issue28645 closed by yselivanov #28936: test_global_err_then_warn in test_syntax is no longer valid https://bugs.python.org/issue28936 closed by serhiy.storchaka #29933: asyncio: set_write_buffer_limits() doc doesn't specify unit of https://bugs.python.org/issue29933 closed by berker.peksag #30302: Improve .__repr__ implementation for datetime.timedelta 
https://bugs.python.org/issue30302 closed by berker.peksag #30484: Garbage Collector can cause Segfault whilst iterating dictiona https://bugs.python.org/issue30484 closed by berker.peksag #30549: ProcessPoolExecutor hangs forever if the object raises on __ge https://bugs.python.org/issue30549 closed by berker.peksag #30553: Add HTTP Response code 421 https://bugs.python.org/issue30553 closed by berker.peksag #30639: inspect.getfile(obj) calls object repr on failure https://bugs.python.org/issue30639 closed by yselivanov #30695: add a nomemory_allocator to the _testcapi module https://bugs.python.org/issue30695 closed by xdegaye #30697: segfault in PyErr_NormalizeException() after memory exhaustion https://bugs.python.org/issue30697 closed by xdegaye #30722: Tools/demo/redemo.py broken https://bugs.python.org/issue30722 closed by berker.peksag #30762: Misleading message ???can't concat bytes to str??? https://bugs.python.org/issue30762 closed by berker.peksag #30817: Abort in PyErr_PrintEx() when no memory https://bugs.python.org/issue30817 closed by xdegaye #30937: csv module examples miss newline='' when opening files https://bugs.python.org/issue30937 closed by berker.peksag #30949: Provide assertion functions in unittest.mock https://bugs.python.org/issue30949 closed by berker.peksag #31053: Unnecessary argument in command example https://bugs.python.org/issue31053 closed by berker.peksag #31174: test_tools leaks randomly references on x86 Gentoo Refleaks 3. https://bugs.python.org/issue31174 closed by haypo #31227: regrtest: reseed random with the same seed before running a te https://bugs.python.org/issue31227 closed by haypo #31545: Fixing documentation for timedelta. https://bugs.python.org/issue31545 closed by berker.peksag #31653: Don't release the GIL if we can acquire a multiprocessing sema https://bugs.python.org/issue31653 closed by pitrou #31664: Add support of new crypt methods https://bugs.python.org/issue31664 closed by serhiy.storchaka #31667: Wrong links in the gettext.NullTranslations class https://bugs.python.org/issue31667 closed by serhiy.storchaka #31674: Buildbots: random "Failed to connect to github.com port 443: C https://bugs.python.org/issue31674 closed by haypo #31690: Make RE "a", "L" and "u" inline flags local https://bugs.python.org/issue31690 closed by serhiy.storchaka #31752: Assertion failure in timedelta() in case of bad __divmod__ https://bugs.python.org/issue31752 closed by serhiy.storchaka #31756: subprocess.run should alias universal_newlines to text https://bugs.python.org/issue31756 closed by gregory.p.smith #31781: crashes when calling methods of an uninitialized zipimport.zip https://bugs.python.org/issue31781 closed by brett.cannon #31800: datetime.strptime: Support for parsing offsets with a colon https://bugs.python.org/issue31800 closed by belopolsky #31804: multiprocessing calls flush on sys.stdout at exit even if it i https://bugs.python.org/issue31804 closed by pitrou #31809: ssl module unnecessarily pins the client curve when using ECDH https://bugs.python.org/issue31809 closed by grrrrrrrrr #31810: Travis CI, buildbots: run "make smelly" to check if CPython le https://bugs.python.org/issue31810 closed by haypo #31812: Document PEP 545 (documentation translation) in What's New in https://bugs.python.org/issue31812 closed by haypo #31827: Remove os.stat_float_times() https://bugs.python.org/issue31827 closed by haypo #31832: Async list comprehension (PEP 530) causes SyntaxError in Pytho https://bugs.python.org/issue31832 closed by 
zach.ware #31833: Compile fail on gentoo for MIPS CPU of loongson 2f https://bugs.python.org/issue31833 closed by pitrou #31835: _PyFunction_FastCallDict and _PyFunction_FastCallKeywords: fas https://bugs.python.org/issue31835 closed by haypo #31838: Python 3.4 supported SSL version https://bugs.python.org/issue31838 closed by sabian2008 #31840: `ImportError` is` raised` only when `python -m unittest discov https://bugs.python.org/issue31840 closed by serhiy.storchaka #31845: PYTHONDONTWRITEBYTECODE and PYTHONOPTIMIZE have no effect https://bugs.python.org/issue31845 closed by ncoghlan #31847: Fix commented out tests in test_syntax https://bugs.python.org/issue31847 closed by serhiy.storchaka #31856: Unexpected behavior of re module when VERBOSE flag is set https://bugs.python.org/issue31856 closed by mrabarnett #31857: Make the behavior of USE_STACKCHECK deterministic https://bugs.python.org/issue31857 closed by benjamin.peterson #31864: datetime violates Postel's law https://bugs.python.org/issue31864 closed by r.david.murray #31866: clean out some more AtheOS code https://bugs.python.org/issue31866 closed by benjamin.peterson #31877: Build fails on Cygwin since issue28180 https://bugs.python.org/issue31877 closed by ncoghlan From horos22 at gmail.com Fri Oct 27 13:05:28 2017 From: horos22 at gmail.com (Ed Peschko) Date: Fri, 27 Oct 2017 10:05:28 -0700 Subject: [Python-Dev] \G (match last position) regex operator non-existant in python? In-Reply-To: References: Message-ID: > From this I understand that when using e.g. findall() it forces successive matches to be adjacent. yes, I admit that this is a clearer description of what \G does. My only defense is that I wrote my description when it was late. :) I can only stress how useful it is, especially for debugging regexes. Basically if you are cutting up any string into discrete chunks, you want to make sure that you aren't missing any chunks in the middle when you do the cut. without \G, you can miss large sections of string, and it is easy to overlook. with \G, you are guaranteed to see exactly where your regex falls down. In addition, there are specific regexes that you can only write with \G (eg. c parsers) Anyways, I'll look at regex. On Fri, Oct 27, 2017 at 8:35 AM, Guido van Rossum wrote: > The "why" question is not very interesting -- it probably wasn't in PCRE and > nobody was familiar with it when we moved off PCRE (maybe it wasn't even in > Perl at the time -- it was ~15 years ago). > > I didn't understand your description of \G so I googled it and found a > helpful StackOverflow article: > https://stackoverflow.com/questions/21971701/when-is-g-useful-application-in-a-regex. > From this I understand that when using e.g. findall() it forces successive > matches to be adjacent. > > In general this seems to be a unique property of \G: it preserves *state* > from one match to the next. This will make it somewhat difficult to > implement -- e.g. that state should probably be thread-local in case > multiple threads use the same compiled regex. It's also unclear when that > state should be reset. (Only when you compile the regex? Each time you pass > it a different source string?) > > So I'm not sure it's reasonable to add. But I also don't see a reason why it > shouldn't be added -- presuming we can decide on good answer for the > questions above about the "scope" of the anchor. > > I think it's okay to start a discussion on bugs.python.org about the precise > specification of \G for Python. 
OTOH I expect that most core devs won't find > this a very interesting problem (Python relies on regexes for parsing a lot > less than Perl does). > > Good luck! > > On Thu, Oct 26, 2017 at 11:03 PM, Ed Peschko wrote: >> >> All, >> >> perl has a regex assertion (\G) that allows multiple-match regular >> expressions to be able to use the position of the last match. Perl's >> documentation puts it this way: >> >> \G Match only at pos() (e.g. at the end-of-match position of prior >> m//g) >> >> Anyways, this is exceedingly powerful for matching regularly >> structured free-form records, and I was really surprised when I found >> out that python did not have it. For example, if findall supported >> this, it would be possible to write things like this (a quick and >> dirty ifconfig parser): >> >> pat = re.compile(r'\G(\S+)(.*?\n)(?=\S+|\Z)', re.S) >> >> val = """ >> eth2 Link encap:Ethernet HWaddr xx >> inet addr: xx.xx.xx.xx Bcast:xx.xx.xx.xx Mask:xx.xx.xx.xx >> ... >> lo Link encap:Local Loopback >> inet addr:127.0.0.1 Mask:255.0.0.0 >> """ >> matches = re.findall(pat, val) >> >> So - why doesn't python have this? is it something that simply was >> overlooked, or is there another method of doing the same thing with >> arbitrarily complex freeform records? >> >> thanks much.. >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > > -- > --Guido van Rossum (python.org/~guido) From barry at python.org Fri Oct 27 15:36:07 2017 From: barry at python.org (Barry Warsaw) Date: Fri, 27 Oct 2017 15:36:07 -0400 Subject: [Python-Dev] PEP Post-History Message-ID: <61E959D3-ECEB-4897-B3F6-3D04D8F52E90@python.org> We?ve made a small change to the PEP process which may affect readers of python-list and python-ideas, so I?d like to inform you of it. This change was made to PEP 1 and PEP 12. PEPs must have a Post-History header which records the dates at which the PEP is posted to mailing lists, in order to keep the general Python community in the loop as a PEP moves to final status. Until now, PEPs in development were supposed to be posted at least to python-dev and optionally to python-list[1]. This guideline predated the creation of the python-ideas mailing list. We?ve now changed this guideline so that Post-History will record the dates at which the PEP is posted to python-dev and optionally python-ideas. python-list is dropped from this requirement. python-dev is always the primary mailing list of record for Python development, and PEPs under development should be posted to python-dev as appropriate. python-ideas is the list for discussion of more speculative changes to Python, and it?s often where more complex PEPs, and even proto-PEPs are first raised and their concepts are hashed out. As such, it makes more sense to change the guideline to include python-ideas and/or python-dev. In the effort to keep the forums of record to a manageable number, python-list is dropped. If you have been watching for new PEPs to be posted to python-list, you are invited to follow either python-dev or python-ideas. Cheers, -Barry (on behalf of the Python development community) https://mail.python.org/mailman/listinfo/python-dev https://mail.python.org/mailman/listinfo/python-ideas Both python-dev and python-ideas are available via Gmane. 
[1] PEPs may have a Discussions-To header which changes the list of forums where the PEP is discussed. In that case, Post-History records the dates that the PEP is posted to those forums. See PEP 1 for details. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From jwilk at jwilk.net Fri Oct 27 19:04:16 2017 From: jwilk at jwilk.net (Jakub Wilk) Date: Sat, 28 Oct 2017 01:04:16 +0200 Subject: [Python-Dev] \G (match last position) regex operator non-existant in python? In-Reply-To: References: Message-ID: <20171027230416.hoht4v7tcgdd4vkd@jwilk.net> * Guido van Rossum , 2017-10-27, 08:35: >The "why" question is not very interesting -- it probably wasn't in >PCRE and nobody was familiar with it when we moved off PCRE (maybe it >wasn't even in Perl at the time -- it was ~15 years ago). Perl supports \G since v5.0, released in 1994. PCRE supports it since v4.0, released in 2003. -- Jakub Wilk From ncoghlan at gmail.com Sat Oct 28 01:43:47 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Oct 2017 15:43:47 +1000 Subject: [Python-Dev] If aligned_alloc() is missing on your platform, please let us know. In-Reply-To: <20171027081722.GA3757@bytereef.org> References: <20171027081722.GA3757@bytereef.org> Message-ID: On 27 October 2017 at 18:17, Stefan Krah wrote: > Victor wrote a patch and would like to avoid adding a (probably > unnecessary) > emulation function. I agree with that. > > So if any platform does not have some form of aligned_alloc(), please > speak up. > I think Victor's suggested strategy in https://bugs.python.org/issue18835#msg305122 of deferring emulation support to its own follow-up issue is a good one, since there are two distinct cases to consider: 1. CPython's own support for platforms where we don't have a native aligned memory allocation API to call is covered by PEP 11, so if all current buildbots still work, then it will be up to the folks interested in a platform that *doesn't* offer aligned allocations to provide both a suitable emulation and a buildbot to test it on. 2.While all of the CPython-provided memory allocators will implement the new slots, the folks implementing their own custom allocators will need a defined upgrade path in the "Porting" section of the What's New guide. For that, an explicit error on 3.7+ that says "Configured custom allocator doesn't implement aligned memory allocations" is likely going to be easier to debug than subtle issues with the way an implicit emulation layer interacts with the other memory allocator slots. To appropriately address 2, we need more info not about which platforms natively support aligned memory allocations, but rather from folks that actually are implementing their own custom allocators. It may be that making the low level cross-platform raw alligned allocators available as a public API (independently of the switchable allocator machinery) will be a more appropriate way of handling that case than providing an emulation layer in terms of the old slots. That is, the suggested approach for custom allocator developers would be: 1. Test against 3.7, get an error complaining the new slots aren't defined 2. 
Set the new slots to the CPython provided cross-platform abstraction (if appropriate), or else wrap that abstraction layer as needed, or else implement your own aligned allocation routines If the only reason for the custom allocator was to allow tracemalloc to track additional memory that wouldn't otherwise be handled through CPython's memory allocators, then consider switching to using PyTraceMalloc_Track()/PyTraceMalloc_Untrack() instead. Either way, that transition strategy discussion shouldn't block making the feature itself available - it will be a lot easier to discuss transition enablement if potentially affected folks can download 3.7.0a3 and tell us what actually breaks, rather than asking them to speculate about what they think *might* break. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Oct 28 03:09:33 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Oct 2017 17:09:33 +1000 Subject: [Python-Dev] \G (match last position) regex operator non-existant in python? In-Reply-To: References: Message-ID: On 28 October 2017 at 01:57, Guido van Rossum wrote: > Oh. Yes, that is being discussed about once a year two. It seems Matthew > isn't very interested in helping out with the port, and there are some > concerns about backwards compatibility with the `re` module. I think it > needs a champion! > Matthew's been amenable to the idea when it comes up, and he explicitly wrote the module to be usable as a drop-in replacement for "re" (hence the re-compatible v0 behaviour still being the default). The resistance has more been from our side, since this is a case where existing regex module users are clearly better off if it remains a separate project, as that keeps upgrades independent of the relatively slow standard library release cycle (and allows it to be used on Python 2.7 as well as in 3.x). By contrast, the potential benefits of standard library inclusion accrue primarily to Python newcomers and folks writing scripts without the benefit of package management tools, since they'll have a more capable regex engine available as part of the assumed language baseline. That means that if we add regex to the standard library in the regular way, there's a more than fair chance that we'll end up with an outcome like the json vs simplejson split, where we have one variant in the standard library, and another variant on PyPI, and the variants may drift apart over time if their maintenance is being handled by different people. (Note: one may argue that we already have this split in the form of re vs regex. So if regex was brought in specifically to replace _sre as the re module implementation, rather than as a new public API, then we at least wouldn't be making anything *worse* from a future behavioural consistency perspective, but we'd be risking a compatibility break for anyone depending on _sre and other internal implementation details of the re module). One potential alternative approach that is then brought up (often by me) is to suggest instead *bundling* the regex module with CPython, without actually bringing it fully within the regular standard library maintenance process. 
The idea there would be to both make the module available by default in python.org downloads, *and* make it clear to redistributors that the module is part of the expected baseline of Python functionality, but otherwise keep it entirely in its current independently upgradable form. That would still be hard (since it would involve establishing new maintenance policy precedents that go beyond the current special-casing of `pip` in order to bootstrap PyPI access), but would have the additional benefit of paving the way for doing similar things with other modules where we'd like them to be part of the assumed baseline for end users, but also have reasons for wanting to avoid tightly coupling them to the standard libary's regular maintenance policy (most notably, requests). And that's where discussions tend to fizzle out: * outright replacement of the current re module implementation with a private copy of the regex module introduces compatibility risks that would need a fiat decision from you as BDFL to say "Let's do it anyway, make sure the test suite still works, and then figure out how to cope with any other consequences as they arise" * going down the bundling path requires making some explicit community management decisions around what we actually want the standard library to *be* (and whether or not there's a difference between "the standard library" and "the assumed available package set" for Python installations that are expected to run arbitrary third party scripts rather than specific applications) * having both the current re API and implementation *and* a new regex based API and implementation in the standard library indefinitely seems like it would be a maintainability nightmare that delivered the worst of all possible outcomes for everyone involved (CPython maintainers, regex maintainers, Python end users) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From francismb at email.de Sat Oct 28 04:57:57 2017 From: francismb at email.de (francismb) Date: Sat, 28 Oct 2017 10:57:57 +0200 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> Message-ID: <90a1475b-4fa1-3323-fae8-29d65d54fcc5@email.de> Hi David, On 10/22/2017 07:30 PM, David Mertz wrote: > The 'time' module is about > wall clock out calendar time, not about *simulation time*. means that the other scale direction makes more sense for the module? aka 'get_time('us'), get_time('ms'), 'get_time('s') Thanks, --francis From stefan at bytereef.org Sat Oct 28 08:36:27 2017 From: stefan at bytereef.org (Stefan Krah) Date: Sat, 28 Oct 2017 14:36:27 +0200 Subject: [Python-Dev] If aligned_alloc() is missing on your platform, please let us know. In-Reply-To: References: <20171027081722.GA3757@bytereef.org> Message-ID: <20171028123627.GA3197@bytereef.org> On Sat, Oct 28, 2017 at 03:43:47PM +1000, Nick Coghlan wrote: > 1. CPython's own support for platforms where we don't have a native aligned > memory allocation API to call is covered by PEP 11, so if all current > buildbots still work, then it will be up to the folks interested in a > platform that *doesn't* offer aligned allocations to provide both a > suitable emulation and a buildbot to test it on. Indeed, the feature is backed up by PEP 11. 
> 2.While all of the CPython-provided memory allocators will implement the > new slots, the folks implementing their own custom allocators will need a > defined upgrade path in the "Porting" section of the What's New guide. For > that, an explicit error on 3.7+ that says "Configured custom allocator > doesn't implement aligned memory allocations" is likely going to be easier > to debug than subtle issues with the way an implicit emulation layer > interacts with the other memory allocator slots. > > To appropriately address 2, we need more info not about which platforms > natively support aligned memory allocations, but rather from folks that > actually are implementing their own custom allocators. It may be that > making the low level cross-platform raw alligned allocators available as a > public API (independently of the switchable allocator machinery) will be a > more appropriate way of handling that case than providing an emulation > layer in terms of the old slots. I don't have an opinion whether new slots should be available. For my use case I just need PyMem_AlignedAlloc(), PyMem_AlignedFree() that automatically use the faster allocator for 'sizeof(void *) <= align <= ALIGNMENT' and 'size <= SMALL_REQUEST_THRESHOLD'. So yes, it would be nice to hear from people who implement custom allocators. Stefan Krah From ls73039 at djusdstudents.org Fri Oct 27 16:43:48 2017 From: ls73039 at djusdstudents.org (London) Date: Fri, 27 Oct 2017 13:43:48 -0700 Subject: [Python-Dev] PEP 530 Message-ID: can you help me get idol for my computer -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sat Oct 28 17:05:25 2017 From: guido at python.org (Guido van Rossum) Date: Sat, 28 Oct 2017 14:05:25 -0700 Subject: [Python-Dev] \G (match last position) regex operator non-existant in python? In-Reply-To: References: Message-ID: On Sat, Oct 28, 2017 at 12:09 AM, Nick Coghlan wrote: > On 28 October 2017 at 01:57, Guido van Rossum wrote: > >> Oh. Yes, that is being discussed about once a year two. It seems Matthew >> isn't very interested in helping out with the port, and there are some >> concerns about backwards compatibility with the `re` module. I think it >> needs a champion! >> > > Matthew's been amenable to the idea when it comes up, and he explicitly > wrote the module to be usable as a drop-in replacement for "re" (hence the > re-compatible v0 behaviour still being the default). > > The resistance has more been from our side, since this is a case where > existing regex module users are clearly better off if it remains a separate > project, as that keeps upgrades independent of the relatively slow standard > library release cycle (and allows it to be used on Python 2.7 as well as in > 3.x). By contrast, the potential benefits of standard library inclusion > accrue primarily to Python newcomers and folks writing scripts without the > benefit of package management tools, since they'll have a more capable > regex engine available as part of the assumed language baseline. > > That means that if we add regex to the standard library in the regular > way, there's a more than fair chance that we'll end up with an outcome like > the json vs simplejson split, where we have one variant in the standard > library, and another variant on PyPI, and the variants may drift apart over > time if their maintenance is being handled by different people. (Note: one > may argue that we already have this split in the form of re vs regex. 
So if > regex was brought in specifically to replace _sre as the re module > implementation, rather than as a new public API, then we at least wouldn't > be making anything *worse* from a future behavioural consistency > perspective, but we'd be risking a compatibility break for anyone depending > on _sre and other internal implementation details of the re module). > > One potential alternative approach that is then brought up (often by me) > is to suggest instead *bundling* the regex module with CPython, without > actually bringing it fully within the regular standard library maintenance > process. The idea there would be to both make the module available by > default in python.org downloads, *and* make it clear to redistributors > that the module is part of the expected baseline of Python functionality, > but otherwise keep it entirely in its current independently upgradable form. > > That would still be hard (since it would involve establishing new > maintenance policy precedents that go beyond the current special-casing of > `pip` in order to bootstrap PyPI access), but would have the additional > benefit of paving the way for doing similar things with other modules where > we'd like them to be part of the assumed baseline for end users, but also > have reasons for wanting to avoid tightly coupling them to the standard > libary's regular maintenance policy (most notably, requests). > > And that's where discussions tend to fizzle out: > > * outright replacement of the current re module implementation with a > private copy of the regex module introduces compatibility risks that would > need a fiat decision from you as BDFL to say "Let's do it anyway, make sure > the test suite still works, and then figure out how to cope with any other > consequences as they arise" > * going down the bundling path requires making some explicit community > management decisions around what we actually want the standard library to > *be* (and whether or not there's a difference between "the standard > library" and "the assumed available package set" for Python installations > that are expected to run arbitrary third party scripts rather than specific > applications) > * having both the current re API and implementation *and* a new regex > based API and implementation in the standard library indefinitely seems > like it would be a maintainability nightmare that delivered the worst of > all possible outcomes for everyone involved (CPython maintainers, regex > maintainers, Python end users) > Maybe it would be easier if Matthew were amenable to maintaining the stdlib version and only add new features to the PyPI version when they've also been added to the stdlib version. IOW if he were committed to *not* letting the [simple]json thing happen. I don't condone having two different regex implementations/APIs bundled in any form, even if one were to be deprecated -- we'd never get rid of the deprecated one until 4.0. (FWIW I don't condone this pattern for other packages/modules either.) Note that even if we outright switched there would *still* be two versions, because regex itself has an internal versioning scheme where V0 claims to be strictly compatible with re and V1 explicitly changes the matching rules in some cases. (I don't know if this means that you have to request V1 to use \G though.) The other problem with outright replacement is that despite Matthew's best efforts there may be subtle incompatibilities that will break people's code in surprising ways. 
I don't recall much about our current 're' test suite -- I'm sure it tests every feature, but I'm not sure how far it goes in testing edge cases. IIRC this is where in the past we've always erred on the side of (extreme) caution, and my recollection is of Matthew being (understandably!) pretty lukewarm about doing extra work to help assess this -- IIRC he's totally fine with the status quo. If there's new information or a change in Matthew's outlook I'd be happy to reconsider it. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Sat Oct 28 17:33:21 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 28 Oct 2017 17:33:21 -0400 Subject: [Python-Dev] PEP 530 In-Reply-To: References: Message-ID: On 10/27/2017 4:43 PM, London wrote: > can you help me get idol for my computer Post questions about using python on python-list and include information about what OS you are running and what version of Python you want. -- Terry Jan Reedy From python at mrabarnett.plus.com Sat Oct 28 19:31:01 2017 From: python at mrabarnett.plus.com (MRAB) Date: Sun, 29 Oct 2017 00:31:01 +0100 Subject: [Python-Dev] \G (match last position) regex operator non-existant in python? In-Reply-To: References: Message-ID: <3bb00f2f-91fa-69e5-0dc6-31ffdae5b786@mrabarnett.plus.com> On 2017-10-28 22:05, Guido van Rossum wrote: > On Sat, Oct 28, 2017 at 12:09 AM, Nick Coghlan > wrote: > > On 28 October 2017 at 01:57, Guido van Rossum > wrote: > > Oh. Yes, that is being discussed about once a year two. It > seems Matthew isn't very interested in helping out with the > port, and there are some concerns about backwards > compatibility with the `re` module. I think it needs a champion! > > > Matthew's been amenable to the idea when it comes up, and he > explicitly wrote the module to be usable as a drop-in replacement > for "re" (hence the re-compatible v0 behaviour still being the > default). > > The resistance has more been from our side, since this is a case > where existing regex module users are clearly better off if it > remains a separate project, as that keeps upgrades independent of > the relatively slow standard library release cycle (and allows it > to be used on Python 2.7 as well as in 3.x). By contrast, the > potential benefits of standard library inclusion accrue primarily > to Python newcomers and folks writing scripts without the benefit > of package management tools, since they'll have a more capable > regex engine available as part of the assumed language baseline. > > That means that if we add regex to the standard library in the > regular way, there's a more than fair chance that we'll end up > with an outcome like the json vs simplejson split, where we have > one variant in the standard library, and another variant on PyPI, > and the variants may drift apart over time if their maintenance is > being handled by different people. (Note: one may argue that we > already have this split in the form of re vs regex. So if regex > was brought in specifically to replace _sre as the re module > implementation, rather than as a new public API, then we at least > wouldn't be making anything *worse* from a future behavioural > consistency perspective, but we'd be risking a compatibility break > for anyone depending on _sre and other internal implementation > details of the re module). 
> > One potential alternative approach that is then brought up (often > by me) is to suggest instead *bundling* the regex module with > CPython, without actually bringing it fully within the regular > standard library maintenance process. The idea there would be to > both make the module available by default in python.org > downloads, *and* make it clear to > redistributors that the module is part of the expected baseline of > Python functionality, but otherwise keep it entirely in its > current independently upgradable form. > > That would still be hard (since it would involve establishing new > maintenance policy precedents that go beyond the current > special-casing of `pip` in order to bootstrap PyPI access), but > would have the additional benefit of paving the way for doing > similar things with other modules where we'd like them to be part > of the assumed baseline for end users, but also have reasons for > wanting to avoid tightly coupling them to the standard libary's > regular maintenance policy (most notably, requests). > > And that's where discussions tend to fizzle out: > > * outright replacement of the current re module implementation > with a private copy of the regex module introduces compatibility > risks that would need a fiat decision from you as BDFL to say > "Let's do it anyway, make sure the test suite still works, and > then figure out how to cope with any other consequences as they arise" > * going down the bundling path requires making some explicit > community management decisions around what we actually want the > standard library to *be* (and whether or not there's a difference > between "the standard library" and "the assumed available package > set" for Python installations that are expected to run arbitrary > third party scripts rather than specific applications) > * having both the current re API and implementation *and* a new > regex based API and implementation in the standard library > indefinitely seems like it would be a maintainability nightmare > that delivered the worst of all possible outcomes for everyone > involved (CPython maintainers, regex maintainers, Python end users) > > > Maybe it would be easier if Matthew were amenable to maintaining the > stdlib version and only add new features to the PyPI version when > they've also been added to the stdlib version. IOW if he were > committed to *not* letting the [simple]json thing happen. > > I don't condone having two different regex implementations/APIs > bundled in any form, even if one were to be deprecated -- we'd never > get rid of the deprecated one until 4.0. (FWIW I don't condone this > pattern for other packages/modules either.) Note that even if we > outright switched there would *still* be two versions, because regex > itself has an internal versioning scheme where V0 claims to be > strictly compatible with re and V1 explicitly changes the matching > rules in some cases. (I don't know if this means that you have to > request V1 to use \G though.) > > The other problem with outright replacement is that despite Matthew's > best efforts there may be subtle incompatibilities that will break > people's code in surprising ways. I don't recall much about our > current 're' test suite -- I'm sure it tests every feature, but I'm > not sure how far it goes in testing edge cases. IIRC this is where in > the past we've always erred on the side of (extreme) caution, and my > recollection is of Matthew being (understandably!) 
pretty lukewarm > about doing extra work to help assess this -- IIRC he's totally fine > with the status quo. > > If there's new information or a change in Matthew's outlook I'd be > happy to reconsider it. > At one time I was in favour of including it in the stdlib, but then I changed my mind. Having it outside gives me more flexibility, and I'm happy with just using pip. Not that I'm planning on making any further additions, just bug fixes and updates to follow the Unicode updates. I think I've crammed enough into it already. There's only so much you can do with the regex syntax with its handful of metacharacters and possible escape sequences... From steve at pearwood.info Sat Oct 28 19:48:55 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 29 Oct 2017 10:48:55 +1100 Subject: [Python-Dev] \G (match last position) regex operator non-existant in python? In-Reply-To: <3bb00f2f-91fa-69e5-0dc6-31ffdae5b786@mrabarnett.plus.com> References: <3bb00f2f-91fa-69e5-0dc6-31ffdae5b786@mrabarnett.plus.com> Message-ID: <20171028234855.GW9068@ando.pearwood.info> On Sun, Oct 29, 2017 at 12:31:01AM +0100, MRAB wrote: > Not that I'm planning on making any further additions, just bug fixes > and updates to follow the Unicode updates. I think I've crammed enough > into it already. There's only so much you can do with the regex syntax > with its handful of metacharacters and possible escape sequences... What do you think of the Perl 6 regex syntax? https://en.wikipedia.org/wiki/Perl_6_rules#Changes_from_Perl_5 -- Steve From pludemann at google.com Sat Oct 28 20:09:48 2017 From: pludemann at google.com (Peter Ludemann) Date: Sat, 28 Oct 2017 17:09:48 -0700 Subject: [Python-Dev] \G (match last position) regex operator non-existant in python? In-Reply-To: <20171028234855.GW9068@ando.pearwood.info> References: <3bb00f2f-91fa-69e5-0dc6-31ffdae5b786@mrabarnett.plus.com> <20171028234855.GW9068@ando.pearwood.info> Message-ID: On 28 October 2017 at 16:48, Steven D'Aprano wrote: > On Sun, Oct 29, 2017 at 12:31:01AM +0100, MRAB wrote: > > > Not that I'm planning on making any further additions, just bug fixes > > and updates to follow the Unicode updates. I think I've crammed enough > > into it already. There's only so much you can do with the regex syntax > > with its handful of metacharacters and possible escape sequences... > > What do you think of the Perl 6 regex syntax? > > https://en.wikipedia.org/wiki/Perl_6_rules#Changes_from_Perl_5 ?If you're going to change the notation, why not use notations similar to what linguists use for FSTs? These allow building FSTs (with operations such as adding/subtracting/composing/projecting FSTs) with millions of states ? and there are some impressive optimisers for them also, so that encoding a dictionary with inflections is both more compact and faster than a hash of just the words without inflections. Some of this work is open source, but I haven't kept up with it. If you're interested, you can start here: http://web.stanford.edu/~laurik/? http://web.stanford.edu/~laurik/publications/TR-2010-01.pdf http://web.stanford.edu/group/cslipublications/cslipublications/site/1575864347.shtml etc. ;) > > > > > -- > Steve > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > pludemann%40google.com > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From python at mrabarnett.plus.com Sat Oct 28 20:41:48 2017 From: python at mrabarnett.plus.com (MRAB) Date: Sun, 29 Oct 2017 01:41:48 +0100 Subject: [Python-Dev] \G (match last position) regex operator non-existant in python? In-Reply-To: <20171028234855.GW9068@ando.pearwood.info> References: <3bb00f2f-91fa-69e5-0dc6-31ffdae5b786@mrabarnett.plus.com> <20171028234855.GW9068@ando.pearwood.info> Message-ID: On 2017-10-29 00:48, Steven D'Aprano wrote: > On Sun, Oct 29, 2017 at 12:31:01AM +0100, MRAB wrote: > >> Not that I'm planning on making any further additions, just bug fixes >> and updates to follow the Unicode updates. I think I've crammed enough >> into it already. There's only so much you can do with the regex syntax >> with its handful of metacharacters and possible escape sequences... > > What do you think of the Perl 6 regex syntax? > > https://en.wikipedia.org/wiki/Perl_6_rules#Changes_from_Perl_5 > I think I prefer something that's more like PEG, with quoted literals, perhaps because it looks more like a programming language, but also because it's clearer than saying "these characters are literal, but those aren't". That webpage says "Literals: word characters (letters, numbers and underscore) matched literally", but is that all letters? And what about diacritics, and combining characters? I'm not keen on and , I like & and ! better, but then how would you write a lookbehind? Named rules are good, better than regex's use of named capture groups, and if you quote literal, then you wouldn't need to wrap rule call in <...>, as Perl 6 does. From ncoghlan at gmail.com Sat Oct 28 23:13:43 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Oct 2017 13:13:43 +1000 Subject: [Python-Dev] \G (match last position) regex operator non-existant in python? In-Reply-To: <3bb00f2f-91fa-69e5-0dc6-31ffdae5b786@mrabarnett.plus.com> References: <3bb00f2f-91fa-69e5-0dc6-31ffdae5b786@mrabarnett.plus.com> Message-ID: On 29 October 2017 at 09:31, MRAB wrote: > On 2017-10-28 22:05, Guido van Rossum wrote: > >> I don't condone having two different regex implementations/APIs bundled >> in any form, even if one were to be deprecated -- we'd never get rid of the >> deprecated one until 4.0. (FWIW I don't condone this pattern for other >> packages/modules either.) Note that even if we outright switched there >> would *still* be two versions, because regex itself has an internal >> versioning scheme where V0 claims to be strictly compatible with re and V1 >> explicitly changes the matching rules in some cases. (I don't know if this >> means that you have to request V1 to use \G though.) >> >> The other problem with outright replacement is that despite Matthew's >> best efforts there may be subtle incompatibilities that will break people's >> code in surprising ways. I don't recall much about our current 're' test >> suite -- I'm sure it tests every feature, but I'm not sure how far it goes >> in testing edge cases. IIRC this is where in the past we've always erred on >> the side of (extreme) caution, and my recollection is of Matthew being >> (understandably!) pretty lukewarm about doing extra work to help assess >> this -- IIRC he's totally fine with the status quo. >> >> If there's new information or a change in Matthew's outlook I'd be happy >> to reconsider it. >> >> At one time I was in favour of including it in the stdlib, but then I > changed my mind. Having it outside gives me more flexibility, and I'm happy > with just using pip. 
> OK, so I think that leaves the notion of a "Recommended baseline package set" as the most realistic option we have for improvement in this area - coming up with a way for us as the standard library maintainers to make particular 3rd party components more readily available to end users, while also providing explicit guidance to 3rd party redistributors that we think those components should be offered by default in general purpose scripting environments. I'll start a thread on python-ideas about that, as I think we could get quite some way towards that goal with just some minor additions to the ensurepip and venv modules (using existing documented third party recommendations like those in the re docs for regex and the urllib.request docs for requests), without actually bundling anything directly into the python.org installers. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Sat Oct 28 19:35:59 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 29 Oct 2017 12:35:59 +1300 Subject: [Python-Dev] \G (match last position) regex operator non-existant in python? In-Reply-To: References: Message-ID: <59F5145F.9080409@canterbury.ac.nz> Guido van Rossum wrote: > From this I understand that when using e.g. findall() it forces > successive matches to be adjacent. Seems to me this would be better addressed using an option to findall() rather than being part of the regex. That would avoid the issue of where to keep the state. -- Greg From storchaka at gmail.com Sun Oct 29 08:27:22 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 29 Oct 2017 14:27:22 +0200 Subject: [Python-Dev] \G (match last position) regex operator non-existant in python? In-Reply-To: References: Message-ID: 27.10.17 18:35, Guido van Rossum wrote: > The "why" question is not very interesting -- it probably wasn't in PCRE > and nobody was familiar with it when we moved off PCRE (maybe it wasn't > even in Perl at the time -- it was ~15 years ago). > > I didn't understand your description of \G so I googled it and found a > helpful StackOverflow article: > https://stackoverflow.com/questions/21971701/when-is-g-useful-application-in-a-regex. > From this I understand that when using e.g. findall() it forces > successive matches to be adjacent. This looks too Perlish to me. In Perl, regular expressions are part of the language syntax, and they can even contain Perl expressions. Arguments to them are passed implicitly (as they are to Perl's analogs of str.strip() and str.split()), and results are saved in global special variables. Loops can also be implicit. It seems to me that \G makes sense only for re.findall() and re.finditer(), not for re.match(), re.search() or re.split(). In Python all of this is explicit. Compiled regular expressions are objects, and you can pass start and end positions to Pattern.match(). The Python equivalent of \G looks to me like:

    p = re.compile(...)
    i = 0
    while True:
        m = p.match(s, i)
        if not m: break
        ...
        i = m.end()

One can also use the undocumented Pattern.scanner() method. Actually, Pattern.finditer() is implemented as iter(Pattern.scanner().search). iter(Pattern.scanner().match) would return an iterator of adjacent matches. I think it would be more Pythonic (and much easier) to add a boolean parameter to finditer() and findall() than to introduce a \G operator.
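For example, here is a rough sketch of the difference, relying on the undocumented scanner API mentioned above (illustrative only; the pattern and sample string are made up for the example):

    import re

    p = re.compile(r'\w+')
    s = 'one two  three'

    # finditer() has search semantics: it skips over text that does not
    # match, so all three words are found.
    print([m.group() for m in p.finditer(s)])               # ['one', 'two', 'three']

    # Adjacent matching (what \G anchors to): each match must start exactly
    # where the previous one ended, so iteration stops at the first gap.
    scanner = p.scanner(s)
    print([m.group() for m in iter(scanner.match, None)])   # ['one']

A hypothetical adjacent=True flag on finditer() and findall() could expose the second behaviour without users ever touching the scanner object.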
From storchaka at gmail.com Sun Oct 29 11:19:40 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 29 Oct 2017 17:19:40 +0200 Subject: [Python-Dev] The type of the result of the copy() method Message-ID: The copy() methods of list, dict, bytearray, set, frozenset, WeakValueDictionary, WeakKeyDictionary return an instance of the base type containing the content of the original collection. The copy() methods of deque, defaultdict, OrderedDict, Counter, ChainMap, UserDict, UserList, WeakSet, ElementTree.Element return an instance of the same type as the original collection. The copy() method of mappingproxy returns a copy of the underlying mapping (using its copy() method). os.environ.copy() returns a dict. Shouldn't it be more consistent? From storchaka at gmail.com Sun Oct 29 11:42:20 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 29 Oct 2017 17:42:20 +0200 Subject: [Python-Dev] Migrate python-dev to Mailman 3? In-Reply-To: References: Message-ID: 26.10.17 12:24, Victor Stinner wrote: > We are using Mailman 3 for the new buildbot-status mailing list and it > works well: > > https://mail.python.org/mm3/archives/list/buildbot-status at python.org/ > > I prefer to read archives with this UI, it's simpler to follow > threads, and it's possible to reply on the web UI! > > To be honest, we got some issues when the new security-announce > mailing list was quickly migrated from Mailman 2 to Mailman 3, but > issues were quickly fixed as well. > > Would it be possible to migrate python-dev to Mailman 3? Do you see > any blocker issue? +1! The current UI is almost unusable. When you read a message, the only navigation links available are "prev/next in the thread" and back to the global list of messages. So you either read all messages sequentially in some linearized order and lose context when jumping from the end of one branch to the start of another, or switch to the tree view and open every message in a separate tab and switch between tabs. I preferred to use Gmane, but its web interface no longer works. Does Mailman 3 provide an NNTP interface? The NNTP interface of Gmane still works, but it can be switched off at any time. It would be more reliable to not depend on an unstable third-party service. From brett at python.org Sun Oct 29 12:40:35 2017 From: brett at python.org (Brett Cannon) Date: Sun, 29 Oct 2017 16:40:35 +0000 Subject: [Python-Dev] The type of the result of the copy() method In-Reply-To: References: Message-ID: It probably should be more consistent and I have a vague recollection that this has been brought up before. On Sun, Oct 29, 2017, 08:21 Serhiy Storchaka, wrote: > The copy() methods of list, dict, bytearray, set, frozenset, > WeakValueDictionary, WeakKeyDictionary return an instance of the base > type containing the content of the original collection. > > The copy() methods of deque, defaultdict, OrderedDict, Counter, > ChainMap, UserDict, UserList, WeakSet, ElementTree.Element return an > instance of the same type as the original collection. > > The copy() method of mappingproxy returns a copy of the underlying > mapping (using its copy() method). > > os.environ.copy() returns a dict. > > Shouldn't it be more consistent?
> > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Sun Oct 29 12:54:09 2017 From: python at mrabarnett.plus.com (MRAB) Date: Sun, 29 Oct 2017 16:54:09 +0000 Subject: [Python-Dev] \G (match last position) regex operator non-existant in python? In-Reply-To: References: Message-ID: <63415be4-670f-9084-215b-bab6dae5cf9d@mrabarnett.plus.com> On 2017-10-29 12:27, Serhiy Storchaka wrote: > 27.10.17 18:35, Guido van Rossum wrote: >> The "why" question is not very interesting -- it probably wasn't in PCRE >> and nobody was familiar with it when we moved off PCRE (maybe it wasn't >> even in Perl at the time -- it was ~15 years ago). >> >> I didn't understand your description of \G so I googled it and found a >> helpful StackOverflow article: >> https://stackoverflow.com/questions/21971701/when-is-g-useful-application-in-a-regex. >> From this I understand that when using e.g. findall() it forces >> successive matches to be adjacent. > > This looks too Perlish to me. In Perl regular expressions are the part > of language syntax, they can contain even Perl expressions. Arguments to > them are passed implicitly (as well as to Perl's analogs of str.strip() > and str.split()) and results are saved in global special variables. > Loops also can be implicit. > > It seems to me that \G makes sense only to re.findall() and > re.finditer(), not to re.match(), re.search() or re.split(). > > In Python all this is explicit. Compiled regular expressions are > objects, and you can pass start and end positions to Pattern.match(). > The Python equivalent of \G looks to me like: > > p = re.compile(...) > i = 0 > while True: > m = p.match(s, i) > if not m: break > ... > i = m.end() > > You're correct. \G matches at the start position, so .search(r'\G\w+') behaves the same as .match(r'\w+'). findall and finditer perform a series of searches, but with \G at the start they'll perform a series of matches, each anchored at where the previous one ended. > The one also can use the undocumented Pattern.scanner() method. Actually > Pattern.finditer() is implemented as iter(Pattern.scanner().search). > iter(Pattern.scanner().match) would return an iterator of adjacent matches. > > I think it would be more Pythonic (and much easier) to add a boolean > parameter to finditer() and findall() than introduce a \G operator. > From raymond.hettinger at gmail.com Sun Oct 29 12:57:10 2017 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 29 Oct 2017 09:57:10 -0700 Subject: [Python-Dev] The type of the result of the copy() method In-Reply-To: References: Message-ID: <0F5114C6-B982-44DD-89EE-8C188E31645F@gmail.com> > On Oct 29, 2017, at 8:19 AM, Serhiy Storchaka wrote: > > The copy() methods of list, dict, bytearray, set, frozenset, WeakValueDictionary, WeakKeyDictionary return an instance of the base type containing the content of the original collection. > > The copy() methods of deque, defaultdict, OrderedDict, Counter, ChainMap, UserDict, UserList, WeakSet, ElementTree.Element return an instance of the same type as the original collection. > > The copy() method of mappingproxy returns a copy of the underlying mapping (using its copy() method). > > os.environ.copy() returns a dict.
> > Shouldn't it be more consistent? Not really. It is up to the class designer to make a decision about what the most useful behavior would be for subclassers. Note for a regular Python class, copy.copy() by default creates an instance of the subclass. On the other hand, instances like int() are harder to subclass because all the int operations such as __add__ produce exact int() instances (this is likely because so few assumptions can be made about the subclass and because it isn't clear what the semantics would be otherwise). Also, the time to argue and change APIs is BEFORE they are released, not a decade or two after they've lived successfully in the wild. Raymond From guido at python.org Sun Oct 29 13:04:30 2017 From: guido at python.org (Guido van Rossum) Date: Sun, 29 Oct 2017 10:04:30 -0700 Subject: [Python-Dev] The type of the result of the copy() method In-Reply-To: References: Message-ID: It's somewhat problematic. If I subclass dict with a different constructor, but I don't overload copy(), how can the dict.copy() method construct a correct instance of the subclass? Even if the constructor signatures match, how can dict.copy() make sure it copies all attributes properly? Without an answer to these questions I think it's better to admit defeat and return a dict instance -- classes that want to do better should overload copy(). I notice that Counter.copy() has all the problems I indicate here -- it works as long as you don't add attributes or change the constructor signature. I bet this isn't documented anywhere. On Sun, Oct 29, 2017 at 9:40 AM, Brett Cannon wrote: > It probably should be more consistent and I have a vague recollection that > this has been brought up before. > > On Sun, Oct 29, 2017, 08:21 Serhiy Storchaka, wrote: > >> The copy() methods of list, dict, bytearray, set, frozenset, >> WeakValueDictionary, WeakKeyDictionary return an instance of the base >> type containing the content of the original collection. >> >> The copy() methods of deque, defaultdict, OrderedDict, Counter, >> ChainMap, UserDict, UserList, WeakSet, ElementTree.Element return an >> instance of the same type as the original collection. >> >> The copy() method of mappingproxy returns a copy of the underlying >> mapping (using its copy() method). >> >> os.environ.copy() returns a dict. >> >> Shouldn't it be more consistent? >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/ >> brett%40python.org >> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Sun Oct 29 13:08:04 2017 From: barry at python.org (Barry Warsaw) Date: Sun, 29 Oct 2017 13:08:04 -0400 Subject: [Python-Dev] Migrate python-dev to Mailman 3? In-Reply-To: References: Message-ID: On Oct 29, 2017, at 11:42, Serhiy Storchaka wrote: > Does Mailman 3 provide a NNTP interface? The NNTP interface of Gmane still works, but it can be switched off at any time. It would be more reliable to not depend on an unstable third-party service. 
I use the NNTP interface of Gmane too (although not for python-dev), and agree with everything you're saying here. Right now, however, MM3 does not have a built-in NNTP server. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From raymond.hettinger at gmail.com Sun Oct 29 13:41:25 2017 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 29 Oct 2017 10:41:25 -0700 Subject: [Python-Dev] The type of the result of the copy() method In-Reply-To: References: Message-ID: > On Oct 29, 2017, at 10:04 AM, Guido van Rossum wrote: > > Without an answer to these questions I think it's better to admit defeat and return a dict instance I think it is better to admit success and recognize that these APIs have fared well in the wild. Focusing just on OrderedDict() and dict(), I don't see how to change the copy() method for either of them without breaking existing code. OrderedDict *is* a dict subclass but really does need to have copy() return an OrderedDict. The *default* behavior for any pure Python class is for copy.copy() to return an instance of that class. We really don't want ChainMap() to return a dict instance -- that would defeat the whole purpose of having a ChainMap in the first place. And unlike the original builtin classes, most of the collection classes were specifically designed to be easily subclassable (not making the subclasser do work unnecessarily). These aren't accidental behaviors:

    class ChainMap(MutableMapping):

        def copy(self):
            'New ChainMap or subclass with a new copy of maps[0] and refs to maps[1:]'
            return self.__class__(self.maps[0].copy(), *self.maps[1:])

Do you really want that changed to:

    return ChainMap(self.maps[0].copy(), *self.maps[1:])

Or to:

    return dict(self)

Do you really want Serhiy to sweep through the code and change all of these long-standing APIs, overriding the decisions of the people who designed those classes, and breaking all user code that reasonably relied on those useful and intentional behaviors? Raymond P.S. Possibly related: We've gone out of our way in many classes to have a __repr__ that uses the name of the subclass. Presumably, this is to make life easier for subclassers (one less method they have to override), but it does make an assumption about what the subclass signature looks like. IIRC, our position on that has been that a subclasser who changes the signature would then need to override the __repr__. ISTM that similar reasoning would apply to copy. From guido at python.org Sun Oct 29 14:44:17 2017 From: guido at python.org (Guido van Rossum) Date: Sun, 29 Oct 2017 11:44:17 -0700 Subject: [Python-Dev] The type of the result of the copy() method In-Reply-To: References: Message-ID: On Sun, Oct 29, 2017 at 10:41 AM, Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > > > On Oct 29, 2017, at 10:04 AM, Guido van Rossum wrote: > > > > Without an answer to these questions I think it's better to admit defeat > and return a dict instance > > I think it is better to admit success and recognize that these APIs have > fared well in the wild. > Oh, I agree! Focusing just on OrderedDict() and dict(), I don't see how to change the > copy() method for either of them without breaking existing code. > OrderedDict *is* a dict subclass but really does need to have copy() return > an OrderedDict. > And I wasn't proposing that.
I like what OrderedDict does -- I was just suggesting that the *default* dict.copy() needn't worry about this. > The *default* behavior for any pure python class is for copy.copy() to > return an instance of that class. We really don't want ChainMap() to > return a dict instance -- that would defeat the whole purpose of having a > ChainMap in the first place. > Of course. And unlike the original builtin classes, most of the collection classes > were specifically designed to be easily subclassable (not making the > subclasser do work unnecessarily). These aren't accidental behaviors: > > class ChainMap(MutableMapping): > > def copy(self): > 'New ChainMap or subclass with a new copy of maps[0] and refs > to maps[1:]' > return self.__class__(self.maps[0].copy(), *self.maps[1:]) > > Do you really want that changed to: > > return ChainMap(self.maps[0].copy(), *self.maps[1:]) > > Or to: > > return dict(self) > I think you've misread what I meant. (The defeat I referred to was accepting the status quo, no matter how inconsistent it seems, not a withdrawal to some other seemingly inconsistent but different rule.) > Do you really want Serhiy to sweep through the code and change all of > these long standing APIs, overriding the decisions of the people who > designed those classes, and breaking all user code that reasonably relied > on those useful and intentional behaviors? > No, and I never said that. Calm down. Raymond > > > P.S. Possibly related: We've gone out of way in many classes to have a > __repr__ that uses the name of the subclass. Presumably, this is to make > life easier for subclassers (one less method they have to override), but it > does make an assumption about what the subclass signature looks like. > IIRC, our position on that has been that a subclasser who changes the > signature would then need to override the __repr__. ISTM that similar > reasoning would apply to copy. > I don't think the same reasoning applies. When the string returned doesn't indicate the true class of the object, debugging becomes a lot harder. If the signature in the repr() output is wrong, the user can probably deal with that. And yes, the subclasser who wants the best possible repr() needs to override it, but the use cases don't match. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jwilk at jwilk.net Sun Oct 29 14:41:00 2017 From: jwilk at jwilk.net (Jakub Wilk) Date: Sun, 29 Oct 2017 19:41:00 +0100 Subject: [Python-Dev] \G (match last position) regex operator non-existant in python? In-Reply-To: References: Message-ID: <20171029184100.bfes7jyg66le22go@jwilk.net> * Guido van Rossum , 2017-10-28, 14:05: >even if we outright switched there would *still* be two versions, >because regex itself has an internal versioning scheme where V0 claims >to be strictly compatible with re and V1 explicitly changes the >matching rules in some cases. (I don't know if this means that you have >to request V1 to use \G though.) No, \G is available in the V0 mode. -- Jakub Wilk From ethan at ethanhs.me Mon Oct 30 00:21:00 2017 From: ethan at ethanhs.me (Ethan Smith) Date: Sun, 29 Oct 2017 21:21:00 -0700 Subject: [Python-Dev] PEP 561: Distributing and Packaging Type Information In-Reply-To: References: Message-ID: On Fri, Oct 27, 2017 at 12:44 AM, Nathaniel Smith wrote: > On Thu, Oct 26, 2017 at 3:42 PM, Ethan Smith wrote: > > However, the stubs may be put in a sub-folder > > of the Python sources, with the same name the ``*.py`` files are in. 
For > > example, the ``flyingcircus`` package would have its stubs in the folder > > ``flyingcircus/flyingcircus/``. This path is chosen so that if stubs are > > not found in ``flyingcircus/`` the type checker may treat the > subdirectory as > > a normal package. > > I admit that I find this aesthetically unpleasant. Wouldn't something > like __typestubs__/ be a more Pythonic name? (And also avoid potential > name clashes, e.g. my async_generator package has a top-level export > called async_generator; normally you do 'from async_generator import > async_generator'. I think that might cause problems if I created an > async_generator/async_generator/ directory, especially post-PEP 420.) > I agree, this is unpleasant, I am now of the thought that if maintainers do not wish to ship stubs alongside their Python code, they should just create separate stub-only packages. I don't think there is a particular need to special case this for minor convenience. > > Type Checker Module Resolution Order > > ------------------------------------ > > > > The following is the order that type checkers supporting this PEP should > > resolve modules containing type information: > > > > 1. User code - the files the type checker is running on. > > > > 2. Stubs or Python source manually put in the beginning of the path. Type > > checkers should provide this to allow the user complete control of > which > > stubs to use, and patch broken stubs/inline types from packages. > > > > 3. Third party stub packages - these packages can supersede the installed > > untyped packages. They can be found at ``pkg-stubs`` for package > ``pkg``, > > however it is encouraged to check the package's metadata using > packaging > > query APIs such as ``pkg_resources`` to assure that the package is > meant > > for type checking, and is compatible with the installed version. > > Am I right that this means you need to be able to map from import > names to distribution names? I.e., if you see 'import foo', you need > to figure out which *.dist-info directory contains metadata for the > 'foo' package? How do you plan to do this? > > The problem is that technically, import names and distribution names > are totally unrelated namespaces -- for example, the '_pytest' package > comes from the 'pytest' distribution, the 'pylab' package comes from > 'matplotlib', and 'pip install scikit-learn' gives you a package > imported as 'sklearn'. Namespace packages are also challenging, > because a single top-level package might actually be spread across > multiple distributions. > > This is a problem. What I now realize is that the typing metadata is needed for *packages* and not distributions. I will work on a new proposal that makes the metadata per-package. It will require a slightly more complicated proposal, but I feel that it is necessary. Thank you for realizing this issue with my proposal, I probably should have caught it earlier. -n > > -- > Nathaniel J. Smith -- https://vorpus.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Mon Oct 30 07:00:22 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 30 Oct 2017 12:00:22 +0100 Subject: [Python-Dev] Migrate python-dev to Mailman 3? 
In-Reply-To: <20171026120137.1de34389@fsol> References: <20171026120137.1de34389@fsol> Message-ID: 2017-10-26 12:01 GMT+02:00 Antoine Pitrou : >> We are using Mailman 3 for the new buildbot-status mailing list and it >> works well: >> >> https://mail.python.org/mm3/archives/list/buildbot-status at python.org/ >> >> I prefer to read archives with this UI, it's simpler to follow >> threads, and it's possible to reply on the web UI! > > Personally, I really don't like that UI. Is it possible to have a > pipermail-style UI as an alternative? > (...) >> I don't know pipermail. Do you have an example? > https://mail.python.org/pipermail/python-dev/ :-) Oh, I didn't know that Mailman 2 archives are called "pipermail". Well, there are already other archives already available if you want another UI: http://code.activestate.com/lists/python-dev/ http://dir.gmane.org/gmane.comp.python.devel -- using NNTP https://groups.google.com/forum/#!forum/dev-python https://lists.gt.net/python/dev/ And maybe others. -- It's really hard to design an UI liked by everyone :-) I prefer Mailman 3 UI (HyperKitty), I prefer to get all emails of a thread on a single page, and the new UI has a few nice features like "Most active discussions", "Activity Summary", "favorites", tags, etc. Except of Antoine Pitrou, does everybody else like the new UI? :-) I expect that Mailman 3 is more actively developed than Mailman 2. By the way, I hope that Mailman 3 and HyperKity support and runs on Python 3, whereas Mailman 2 is more likely stuck at Python 2, no? ;-) Victor From victor.stinner at gmail.com Mon Oct 30 07:09:28 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 30 Oct 2017 12:09:28 +0100 Subject: [Python-Dev] If aligned_alloc() is missing on your platform, please let us know. In-Reply-To: <20171027081722.GA3757@bytereef.org> References: <20171027081722.GA3757@bytereef.org> Message-ID: 2017-10-27 10:17 GMT+02:00 Stefan Krah : > Victor wrote a patch and would like to avoid adding a (probably unnecessary) > emulation function. I agree with that. > (...) > So if any platform does not have some form of aligned_alloc(), please > speak up. I'm not really opposed to implement an aligned allocator on top of an existing allocator. I only propose to discuss that in a separated issue. I wrote "PEP 445 -- Add new APIs to customize Python memory allocators" to implement tracemalloc, but also because I was working on a Python version patched to use custom memory allocators, different than malloc()/free(): https://www.python.org/dev/peps/pep-0445/#rationale IMHO only users of PyMem_SetAllocators() would need such "fallback". I expect all modern platforms to provide a "aligned" memory allocator. Victor From p.f.moore at gmail.com Mon Oct 30 07:14:26 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 30 Oct 2017 11:14:26 +0000 Subject: [Python-Dev] Migrate python-dev to Mailman 3? In-Reply-To: References: <20171026120137.1de34389@fsol> Message-ID: On 30 October 2017 at 11:00, Victor Stinner wrote: > Except of Antoine Pitrou, does everybody else like the new UI? :-) As I said, I don't particularly like it, but I don't expect to need it if we get an archived-at header in the mails, and Google indexes individual mails in the archive correctly. Paul From stefan at bytereef.org Mon Oct 30 07:15:29 2017 From: stefan at bytereef.org (Stefan Krah) Date: Mon, 30 Oct 2017 12:15:29 +0100 Subject: [Python-Dev] Migrate python-dev to Mailman 3? 
In-Reply-To: References: <20171026120137.1de34389@fsol> Message-ID: <20171030111529.GA15477@bytereef.org> On Mon, Oct 30, 2017 at 12:00:22PM +0100, Victor Stinner wrote: > Except of Antoine Pitrou, does everybody else like the new UI? :-) No, I don't like it. If there is a promise to keep an additional, MHonArc or Pipermail archive *with an implicit promise of long term support*, I don't care. Despite the mentioned shortcomings of Pipermail, it is 5 times faster for me to navigate and less stressful to look at. Stefan Krah From guido at python.org Mon Oct 30 10:46:42 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 30 Oct 2017 07:46:42 -0700 Subject: [Python-Dev] Migrate python-dev to Mailman 3? In-Reply-To: <20171030111529.GA15477@bytereef.org> References: <20171026120137.1de34389@fsol> <20171030111529.GA15477@bytereef.org> Message-ID: I love MM3 and hyperkitty. But I rarely peruse the archives -- I only go to pipermail to get a link to a specific message from the past so I can copy it into a current message. IIUC that functionality is actually better in hyperkitty because when a pipermail archive is rebuilt the message numbers come out differently. On Mon, Oct 30, 2017 at 4:15 AM, Stefan Krah wrote: > On Mon, Oct 30, 2017 at 12:00:22PM +0100, Victor Stinner wrote: > > Except of Antoine Pitrou, does everybody else like the new UI? :-) > > No, I don't like it. If there is a promise to keep an additional, MHonArc > or Pipermail archive *with an implicit promise of long term support*, I > don't > care. > > > Despite the mentioned shortcomings of Pipermail, it is 5 times faster > for me to navigate and less stressful to look at. > > > > Stefan Krah > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Mon Oct 30 10:59:39 2017 From: barry at python.org (Barry Warsaw) Date: Mon, 30 Oct 2017 07:59:39 -0700 Subject: [Python-Dev] Migrate python-dev to Mailman 3? In-Reply-To: References: <20171026120137.1de34389@fsol> Message-ID: <4A220C29-5C33-4772-A2C9-DA3A67DF50BD@python.org> On Oct 30, 2017, at 04:00, Victor Stinner wrote: > It's really hard to design an UI liked by everyone :-) It?s impossible to design *anything* that?s liked by everyone :). > I expect that Mailman 3 is more actively developed than Mailman 2. By > the way, I hope that Mailman 3 and HyperKity support and runs on > Python 3, whereas Mailman 2 is more likely stuck at Python 2, no? ;-) Mailman 3 Core has been Python 3 for a long time. HyperKitty and Postorius (the new admin web u/i) are both Django projects and while currently effectively Python 2, they are being actively ported to Python 3. mailmanclient, the official library of bindings to the Core REST API, is of course both Python 2 and 3. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From stefan at bytereef.org Mon Oct 30 11:57:00 2017 From: stefan at bytereef.org (Stefan Krah) Date: Mon, 30 Oct 2017 16:57:00 +0100 Subject: [Python-Dev] Migrate python-dev to Mailman 3? 
In-Reply-To: References: <20171026120137.1de34389@fsol> <20171030111529.GA15477@bytereef.org> Message-ID: <20171030155700.GA18119@bytereef.org> On Mon, Oct 30, 2017 at 07:46:42AM -0700, Guido van Rossum wrote: > I love MM3 and hyperkitty. But I rarely peruse the archives -- I only go to > pipermail to get a link to a specific message from the past so I can copy > it into a current message. IIUC that functionality is actually better in > hyperkitty because when a pipermail archive is rebuilt the message numbers > come out differently. Yes, I use the archives differently. When I'm temporarily unsubscribed due to overload I scan the archives for interesting topics and indeed sometimes read whole threads. I think Pipermail is great for that. Quiet design, nice font, good contrast for speed reading. Stefan Krah From mariatta.wijaya at gmail.com Mon Oct 30 12:35:17 2017 From: mariatta.wijaya at gmail.com (Mariatta Wijaya) Date: Mon, 30 Oct 2017 09:35:17 -0700 Subject: [Python-Dev] Migrate python-dev to Mailman 3? In-Reply-To: <20171030155700.GA18119@bytereef.org> References: <20171026120137.1de34389@fsol> <20171030111529.GA15477@bytereef.org> <20171030155700.GA18119@bytereef.org> Message-ID: > Except of Antoine Pitrou, does everybody else like the new UI? :-) I love the new UI. +1000 for migrating. Mariatta Wijaya On Mon, Oct 30, 2017 at 8:57 AM, Stefan Krah wrote: > On Mon, Oct 30, 2017 at 07:46:42AM -0700, Guido van Rossum wrote: > > I love MM3 and hyperkitty. But I rarely peruse the archives -- I only go > to > > pipermail to get a link to a specific message from the past so I can copy > > it into a current message. IIUC that functionality is actually better in > > hyperkitty because when a pipermail archive is rebuilt the message > numbers > > come out differently. > > Yes, I use the archives differently. When I'm temporarily unsubscribed > due to overload I scan the archives for interesting topics and indeed > sometimes read whole threads. > > I think Pipermail is great for that. Quiet design, nice font, good contrast > for speed reading. > > > > Stefan Krah > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > mariatta.wijaya%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Oct 30 13:18:04 2017 From: guido at python.org (Guido van Rossum) Date: Mon, 30 Oct 2017 10:18:04 -0700 Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond resolution In-Reply-To: <90a1475b-4fa1-3323-fae8-29d65d54fcc5@email.de> References: <20171016170631.375edd56@fsol> <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de> <90a1475b-4fa1-3323-fae8-29d65d54fcc5@email.de> Message-ID: I have read PEP 564 and (mostly) followed the discussion in this thread, and I am happy with the PEP. I am hereby approving PEP 564. Congratulations Victor! -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cedugenio at gmail.com Mon Oct 30 15:41:20 2017 From: cedugenio at gmail.com (Carlos Eugenio) Date: Mon, 30 Oct 2017 17:41:20 -0200 Subject: [Python-Dev] Convert Sqlite Function from cx_Oracle Message-ID: ============================================================== SQLITE3 Function def get_db(): def dict_factory(cursor, row): d = {} for idx, col in enumerate(cursor.description): d[col[0]] = row[idx] return d db = getattr(g, '_database', None) if db is None: db = g._database = sqlite3.connect(DATABASE) db.row_factory = dict_factory return db ================================================================ I try this form but isnt ok . Can I help me ? import cx_Oracle con = cx_Oracle.connect('xxxx/xxxx at xxxxxxx/xxxxxxx') cur = con.cursor() cur.execute("select * from test") desc = [d[0] for d in cur.description] result = [dict(zip(dec,line))for line in cur] print (result) cur.close() -- Carlos Eug?nio -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Mon Oct 30 16:21:21 2017 From: brett at python.org (Brett Cannon) Date: Mon, 30 Oct 2017 20:21:21 +0000 Subject: [Python-Dev] Migrate python-dev to Mailman 3? In-Reply-To: References: <20171026120137.1de34389@fsol> <20171030111529.GA15477@bytereef.org> <20171030155700.GA18119@bytereef.org> Message-ID: On Mon, 30 Oct 2017 at 09:36 Mariatta Wijaya wrote: > > Except of Antoine Pitrou, does everybody else like the new UI? :-) > > I love the new UI. +1000 for migrating. > I personally prefer MM3 + HyperKitty compared to MM2 + pipermail. -Brett > > > > Mariatta Wijaya > > On Mon, Oct 30, 2017 at 8:57 AM, Stefan Krah wrote: > >> On Mon, Oct 30, 2017 at 07:46:42AM -0700, Guido van Rossum wrote: >> > I love MM3 and hyperkitty. But I rarely peruse the archives -- I only >> go to >> > pipermail to get a link to a specific message from the past so I can >> copy >> > it into a current message. IIUC that functionality is actually better in >> > hyperkitty because when a pipermail archive is rebuilt the message >> numbers >> > come out differently. >> >> Yes, I use the archives differently. When I'm temporarily unsubscribed >> due to overload I scan the archives for interesting topics and indeed >> sometimes read whole threads. >> >> I think Pipermail is great for that. Quiet design, nice font, good >> contrast >> for speed reading. >> >> >> >> Stefan Krah >> >> >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> > Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/mariatta.wijaya%40gmail.com >> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From phd at phdru.name Mon Oct 30 16:45:01 2017 From: phd at phdru.name (Oleg Broytman) Date: Mon, 30 Oct 2017 21:45:01 +0100 Subject: [Python-Dev] Convert Sqlite Function from cx_Oracle In-Reply-To: References: Message-ID: <20171030204501.GA17665@phdru.name> Hello. This mailing list is to work on developing Python (adding new features to Python itself and fixing bugs); if you're having problems learning, understanding or using Python, please find another forum. 
Probably python-list/comp.lang.python mailing list/news group is the best
place; there are Python developers who participate in it; you may get a
faster, and probably more complete, answer there.
See http://www.python.org/community/ for other lists/news groups/fora.

Thank you for understanding.

On Mon, Oct 30, 2017 at 05:41:20PM -0200, Carlos Eugenio wrote:
> ==============================================================
> SQLITE3 Function
>
> def get_db():
>     def dict_factory(cursor, row):
>         d = {}
>         for idx, col in enumerate(cursor.description):
>             d[col[0]] = row[idx]
>         return d
>
>     db = getattr(g, '_database', None)
>     if db is None:
>         db = g._database = sqlite3.connect(DATABASE)
>         db.row_factory = dict_factory
>     return db
> ================================================================
>
> I try this form but isnt ok . Can I help me ?
>
>
> import cx_Oracle
> con = cx_Oracle.connect('xxxx/xxxx at xxxxxxx/xxxxxxx')
>
> cur = con.cursor()
> cur.execute("select * from test")
>
> desc = [d[0] for d in cur.description]
>
> result = [dict(zip(dec,line))for line in cur]
>
> print (result)
>
> cur.close()
>
> --
> Carlos Eug??nio

Oleg.
--
Oleg Broytman            http://phdru.name/            phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From ethan at stoneleaf.us  Mon Oct 30 18:51:37 2017
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 30 Oct 2017 15:51:37 -0700
Subject: [Python-Dev] PEP 564: Add new time functions with nanosecond
 resolution
In-Reply-To: 
References: <20171016170631.375edd56@fsol>
 <4f15b978-d786-3b80-77f7-c6cb7d313573@email.de>
 <90a1475b-4fa1-3323-fae8-29d65d54fcc5@email.de>
Message-ID: <59F7ACF9.7050202@stoneleaf.us>

On 10/30/2017 10:18 AM, Guido van Rossum wrote:
> I have read PEP 564 and (mostly) followed the discussion in this thread,
> and I am happy with the PEP. I am hereby approving PEP 564.
> Congratulations Victor!

Congrats, Victor!

From storchaka at gmail.com  Tue Oct 31 06:12:35 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 31 Oct 2017 12:12:35 +0200
Subject: [Python-Dev] The type of the result of the copy() method
In-Reply-To: 
References: 
Message-ID: 

29.10.17 19:04, Guido van Rossum wrote:
> It's somewhat problematic. If I subclass dict with a different
> constructor, but I don't overload copy(), how can the dict.copy() method
> construct a correct instance of the subclass? Even if the constructor
> signatures match, how can dict.copy() make sure it copies all attributes
> properly? Without an answer to these questions I think it's better to
> admit defeat and return a dict instance -- classes that want to do
> better should overload copy().
>
> I notice that Counter.copy() has all the problems I indicate here -- it
> works as long as you don't add attributes or change the constructor
> signature. I bet this isn't documented anywhere.

I am familiar with these reasons, and agree with them. But I'm curious
why some collections chose the way of creating an instance of the same
class. For creating an instance of the same class we have the
__copy__() method.

An attempt to preserve a class in the returned value can cause problems.
For example, the __add__() and __mul__() methods of deque first make a
copy of the same type, and this can cause a crash [1]. Of course this
does not occur in real code; it is just one more way of crashing the
interpreter from Python code. list and tuple are free from this problem
since their corresponding methods (as well as copy()) create an instance
of the corresponding base type.
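For concreteness, here is a short illustrative session (run under Python
3.6; MyDict, MyOD, MyDeque and MyList are throwaway subclasses invented
just for this example) showing the two behaviours side by side:

>>> import copy
>>> from collections import OrderedDict, deque
>>> class MyDict(dict):
...     pass
...
>>> type(MyDict(a=1).copy())        # dict.copy() returns a plain dict
<class 'dict'>
>>> type(copy.copy(MyDict(a=1)))    # copy.copy() preserves the subclass
<class '__main__.MyDict'>
>>> class MyOD(OrderedDict):
...     pass
...
>>> type(MyOD(a=1).copy())          # OrderedDict.copy() preserves the subclass
<class '__main__.MyOD'>
>>> class MyDeque(deque):
...     pass
...
>>> class MyList(list):
...     pass
...
>>> type(MyDeque('ab') + MyDeque('cd'))   # deque.__add__ copies the type
<class '__main__.MyDeque'>
>>> type(MyList('ab') + MyList('cd'))     # list.__add__ returns a plain list
<class 'list'>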
I think there were reasons for copying the type in results. It would be
nice to formalize the criteria: in what cases copy() and other methods
should return an instance of the base class, and in what cases they
should create an instance of the same type as the original object. This
would help with new types. And maybe we need to change some existing type
(the inconsistency between WeakKeyDictionary and WeakSet looks weird).

[1] https://bugs.python.org/issue31608

From storchaka at gmail.com  Tue Oct 31 06:37:27 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 31 Oct 2017 12:37:27 +0200
Subject: [Python-Dev] The syntax of replacement fields in format strings
Message-ID: 

According to the specification of format string syntax [1] (I meant
str.format(), not f-strings), both argument name and attribute name must
be Python identifiers.

But the current implementation is more lenient and allows arbitrary
sequences of characters as long as they don't contain '.', '[', ']', '{',
'}', ':', '!'.

>>> '{#}'.format_map({'#': 42})
'42'
>>> import types
>>> '{0.#}'.format(types.SimpleNamespace(**{'#': 42}))
'42'

This can be confusing due to the similarity between the format string
syntaxes of str.format() and f-strings.

>>> name = 'abc'
>>> f'{name.upper()}'
'ABC'
>>> '{name.upper()}'.format(name='abc')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'upper()'

If we accepted only identifiers, we could produce a more specific error
message.

Is there a bug in the documentation or in the implementation?

[1] https://docs.python.org/3/library/string.html#format-string-syntax

From eric at trueblade.com  Tue Oct 31 06:52:23 2017
From: eric at trueblade.com (Eric V. Smith)
Date: Tue, 31 Oct 2017 06:52:23 -0400
Subject: [Python-Dev] The syntax of replacement fields in format strings
In-Reply-To: 
References: 
Message-ID: <22C593DA-A74B-49D9-BBE1-2C887A02C0C4@trueblade.com>

If I had it to do over again, I'd implement it more strictly and only
allow chars that are valid in identifiers.

But see https://bugs.python.org/issue31907 for a case that is currently
valid and would break if we changed how it worked.

I'm not sure it's worth the churn of deprecating this and eventually
making it illegal.

--
Eric.

> On Oct 31, 2017, at 6:37 AM, Serhiy Storchaka wrote:
>
> According to the specification of format string syntax [1] (I meant
> str.format(), not f-strings), both argument name and attribute name must
> be Python identifiers.
>
> But the current implementation is more lenient and allows arbitrary
> sequences of characters as long as they don't contain '.', '[', ']', '{',
> '}', ':', '!'.
>
> >>> '{#}'.format_map({'#': 42})
> '42'
> >>> import types
> >>> '{0.#}'.format(types.SimpleNamespace(**{'#': 42}))
> '42'
>
> This can be confusing due to the similarity between the format string
> syntaxes of str.format() and f-strings.
>
> >>> name = 'abc'
> >>> f'{name.upper()}'
> 'ABC'
> >>> '{name.upper()}'.format(name='abc')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AttributeError: 'str' object has no attribute 'upper()'
>
> If we accepted only identifiers, we could produce a more specific error
> message.
>
> Is there a bug in the documentation or in the implementation?
>
> [1] https://docs.python.org/3/library/string.html#format-string-syntax
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/
> eric%2Ba-python-dev%40trueblade.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From guido at python.org  Tue Oct 31 11:32:00 2017
From: guido at python.org (Guido van Rossum)
Date: Tue, 31 Oct 2017 08:32:00 -0700
Subject: [Python-Dev] The type of the result of the copy() method
In-Reply-To: 
References: 
Message-ID: 

On Tue, Oct 31, 2017 at 3:12 AM, Serhiy Storchaka wrote:

> 29.10.17 19:04, Guido van Rossum wrote:
>
>> It's somewhat problematic. If I subclass dict with a different
>> constructor, but I don't overload copy(), how can the dict.copy() method
>> construct a correct instance of the subclass? Even if the constructor
>> signatures match, how can dict.copy() make sure it copies all attributes
>> properly? Without an answer to these questions I think it's better to admit
>> defeat and return a dict instance -- classes that want to do better should
>> overload copy().
>>
>> I notice that Counter.copy() has all the problems I indicate here -- it
>> works as long as you don't add attributes or change the constructor
>> signature. I bet this isn't documented anywhere.
>>
>
> I am familiar with these reasons, and agree with them. But I'm curious why
> some collections chose the way of creating an instance of the same class.
> For creating an instance of the same class we have the __copy__() method.
>
> An attempt to preserve a class in the returned value can cause problems.
> For example, the __add__() and __mul__() methods of deque first make a copy
> of the same type, and this can cause a crash [1]. Of course this does not
> occur in real code; it is just one more way of crashing the interpreter
> from Python code. list and tuple are free from this problem since their
> corresponding methods (as well as copy()) create an instance of the
> corresponding base type.
>
> I think there were reasons for copying the type in results. It would be
> nice to formalize the criteria: in what cases copy() and other methods
> should return an instance of the base class, and in what cases they should
> create an instance of the same type as the original object. This would
> help with new types. And maybe we need to change some existing type (the
> inconsistency between WeakKeyDictionary and WeakSet looks weird).
>
> [1] https://bugs.python.org/issue31608
>

I think it all depends on the use case. (Though in some cases I suspect
the class' author didn't think too hard about it.) The more strict rule
should be that a base class cannot know how to create a subclass instance
and hence it should not bother. (Or perhaps it should use the __copy__
protocol.)

But there are some cases where a useful pattern is subclassing a stdlib
class just to add some convenience methods to it, without changing its
essence. In those cases, it might be convenient that by default you get
something that preserves its type (and full contents) when copying without
having to explicitly implement copy() or __copy__().

Another useful rule is that if a class *does* have a copy() method, a
subclass *ought* to override it (or __copy__()) to make it work right. IOW
from the class author's POV, copy() should not attempt to copy the type of
a subclass.
But from the user's POV copy() is more useful if it copies the type. This
places the burden on the subclass author to override copy() or __copy__().

Traditionally we've done a terrible job at documenting what you should do
to subclass a class, and what you can expect from the base class (e.g.
which parts of the base class are part of the API for subclasses, and
which parts are truly private -- underscores aren't used consistently in
many class implementations). For those classes that currently preserve the
type in copy(), perhaps we could document that if one overrides __init__()
or __new__() one should also override copy() or __copy__(). And for future
classes we should recommend whether it's preferred to preserve the type in
copy() or not -- I'm not actually sure what to recommend here. I guess it
depends on what other methods of the class return new instances. If there
are a lot (like for int or str) then copy() should follow those methods'
lead.

Sorry about the rambling, this is hard to get consistent.

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From guido at python.org  Tue Oct 31 11:41:28 2017
From: guido at python.org (Guido van Rossum)
Date: Tue, 31 Oct 2017 08:41:28 -0700
Subject: [Python-Dev] The syntax of replacement fields in format strings
In-Reply-To: <22C593DA-A74B-49D9-BBE1-2C887A02C0C4@trueblade.com>
References: <22C593DA-A74B-49D9-BBE1-2C887A02C0C4@trueblade.com>
Message-ID: 

I'd say let sleeping dogs lie.

On Tue, Oct 31, 2017 at 3:52 AM, Eric V. Smith wrote:

> If I had it to do over again, I'd implement it more strictly and only
> allow chars that are valid in identifiers.
>
> But see https://bugs.python.org/issue31907 for a case that is currently
> valid and would break if we changed how it worked.
>
> I'm not sure it's worth the churn of deprecating this and eventually
> making it illegal.
>
> --
> Eric.
>
> On Oct 31, 2017, at 6:37 AM, Serhiy Storchaka wrote:
>
> According to the specification of format string syntax [1] (I meant
> str.format(), not f-strings), both argument name and attribute name must
> be Python identifiers.
>
> But the current implementation is more lenient and allows arbitrary
> sequences of characters as long as they don't contain '.', '[', ']', '{',
> '}', ':', '!'.
>
> >>> '{#}'.format_map({'#': 42})
> '42'
> >>> import types
> >>> '{0.#}'.format(types.SimpleNamespace(**{'#': 42}))
> '42'
>
> This can be confusing due to the similarity between the format string
> syntaxes of str.format() and f-strings.
>
> >>> name = 'abc'
> >>> f'{name.upper()}'
> 'ABC'
> >>> '{name.upper()}'.format(name='abc')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AttributeError: 'str' object has no attribute 'upper()'
>
> If we accepted only identifiers, we could produce a more specific error
> message.
>
> Is there a bug in the documentation or in the implementation?
> > [1] https://docs.python.org/3/library/string.html#format-string-syntax > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > eric%2Ba-python-dev%40trueblade.com > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL:
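A small illustration, added here for reference rather than taken from the
original thread: string.Formatter exposes the parsing step, which shows
why '{name.upper()}' fails with an AttributeError instead of being
rejected up front.

>>> import string
>>> list(string.Formatter().parse('{#}'))
[('', '#', '', None)]
>>> list(string.Formatter().parse('{name.upper()}'))
[('', 'name.upper()', '', None)]

The field name is accepted as-is at parse time; str.format() then splits
it on '.' and '[', uses the first part ('name') to select the argument,
and resolves the remainder with getattr() -- so it ends up asking the
string 'abc' for an attribute literally named 'upper()'.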