From k7hoven at gmail.com Thu Sep 1 09:11:05 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Thu, 1 Sep 2016 16:11:05 +0300 Subject: [Python-Dev] PEP 526 ready for review: Syntax for Variable and Attribute Annotations In-Reply-To: References: Message-ID: On Wed, Aug 31, 2016 at 12:20 AM, Guido van Rossum wrote: > I'm happy to present PEP 526 for your collective review: > https://www.python.org/dev/peps/pep-0526/ (HTML) > https://github.com/python/peps/blob/master/pep-0526.txt (source) > > There's also an implementation ready: > https://github.com/ilevkivskyi/cpython/tree/pep-526 > > I don't want to post the full text here but I encourage feedback on > the high-order ideas, including but not limited to > > - Whether (given PEP 484's relative success) it's worth adding syntax > for variable/attribute annotations. While a large number of Python programmers may not be interested in type hinting local variables inside functions, I can see other potential benefits in this. When I start sketching a new class, I'm often tempted to write down the names of the attributes first, before starting to implement ``__init__``. Sometimes I even write temporary comments for this purpose. This syntax would naturally provide a way to sketch the list of attributes. Yes, there is already __slots__, but I'm not sure that is a good example of readability. Also, when reading code, it may be hard to tell which (instance) attributes the class implements. To have these listed in the beginning of the class could therefore improve the readability. In this light, I'm not sure it's a good idea to allow attribute type hints inside methods. > > - Whether the keyword-free syntax idea proposed here is best: > NAME: TYPE > TARGET: TYPE = VALUE > I wonder if this would be better: def NAME: TYPE def NAME: TYPE = VALUE Maybe it's just me, but I've always thought 'def' is Python's least logically used keyword. 
It seems to come from 'define', but what is it about 'define' that makes it relate to functions only. Adding an optional 'def' for other variables might even be a tiny bit of added consistency. Note that we could then also have this: def NAME Which would, again for readability (see above), be a way to express that "there is an instance variable called X, but no type hint for now". I can't think of a *good* way to do this with the keyword-free version for people that don't use type hints. And then there could also be a simple decorator like @slotted_attributes that automatically generates "__slots__" from the annotations. -- Koos > Note that there's an extensive list of rejected ideas in the PEP; > please be so kind to read it before posting here: > https://www.python.org/dev/peps/pep-0526/#rejected-proposals-and-things-left-out-for-now > > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/k7hoven%40gmail.com -- + Koos Zevenhoven + http://twitter.com/k7hoven + From guido at python.org Thu Sep 1 10:46:38 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 1 Sep 2016 07:46:38 -0700 Subject: [Python-Dev] PEP 526 ready for review: Syntax for Variable and Attribute Annotations In-Reply-To: References: Message-ID: On Thu, Sep 1, 2016 at 6:11 AM, Koos Zevenhoven wrote: > While a large amount of Python programmers may not be interested in > type hinting local variables inside functions, I can see other > potential benefits in this. IOW, PEP 3157 is not dead yet. Indeed. > When I start sketching a new class, I'm often tempted to write down > the names of the attributes first, before starting to implement > ``__init__``. Sometimes I even write temporary comments for this > purpose. 
This syntax would naturally provide a way to sketch the list > of attributes. Yes, there is already __slots__, but I'm not sure that > is a good example of readability. Agreed, it can't get much cleaner than NAME: TYPE. > Also, when reading code, it may be hard to tell which (instance) > attributes the class implements. To have these listed in the beginning > of the class could therefore improve the readability. Right. That has been my observation using PEP 484's type comments extensively for annotating instance variables at the class level. E.g. much of mypy's own code is written this way, and it really is a huge help. But foo = None # type: List[int] while it gives me the info I'm looking for, is not great notation-wise, and that's why I started thinking about an alternative: foo: List[int] (in either case, the __init__ contains something like `self.foo = []`). > In this light, I'm not sure it's a good idea to allow attribute type > hints inside methods. Those are meant for the coding style where all attributes are initialized in the method and people just want to add annotations there. This is already in heavy use in some PEP-484-annotated code bases I know of, using # type comments, and I think it will be easier to get people to switch to syntactic annotations if they can mechanically translate those uses. (In fact we are planning an automatic translator.) >> - Whether the keyword-free syntax idea proposed here is best: >> NAME: TYPE >> TARGET: TYPE = VALUE > > I wonder if this would be better: > > def NAME: TYPE > def NAME: TYPE = VALUE > > Maybe it's just me, but I've always thought 'def' is Python's least > logically used keyword. It seems to come from 'define', but what is it > about 'define' that makes it relate to functions only. Adding an > optional 'def' for other variables might even be a tiny bit of added > consistency. Here I strongly disagree. Everyone will be confused. 
> Note that we could then also have this: > > def NAME > > Which would, again for readability (see above), be a way to express > that "there is an instance variable called X, but no type hint for > now". I can't think of a *good* way to do this with the keyword-free > version for people that don't use type hints. > > And then there could also be a simple decorator like > @slotted_attributes that automatically generates "__slots__" from the > annotations. This I like, or something like it. It can be a follow-up design. (I.e. a separate PEP, once we have experience with PEP 526.) -- --Guido van Rossum (python.org/~guido) From christian at python.org Thu Sep 1 12:19:35 2016 From: christian at python.org (Christian Heimes) Date: Thu, 1 Sep 2016 18:19:35 +0200 Subject: [Python-Dev] Patch reviews In-Reply-To: References: Message-ID: On 2016-08-31 22:31, Christian Heimes wrote: > Hi, > > I have 7 patches for 3.6 ready for merging. The new features were > discussed on Security-SIG and reviewed by Victor or GPS. The patches > just need one final review and an ACK. The first three patches should > land in 2.7, 3.4 and 3.5, too. 
> > http://bugs.python.org/issue26470 > Make OpenSSL module compatible with OpenSSL 1.1.0 > > https://bugs.python.org/issue27850 > Remove 3DES from cipher list (sweet32 CVE-2016-2183) > Also adds ChaCha20 Poly1305 > > http://bugs.python.org/issue27691 > X509 cert with GEN_RID subject alt name causes SytemError > > http://bugs.python.org/issue27866 > ssl: get list of enabled ciphers > > https://bugs.python.org/issue27744 > Add AF_ALG (Linux Kernel crypto) to socket module > > http://bugs.python.org/issue16113 > Add SHA-3 and SHAKE (Keccak) support > > http://bugs.python.org/issue26798 > add BLAKE2 to hashlib And another one: http://bugs.python.org/issue27928 Add hashlib.scrypt Christian From steve at pearwood.info Thu Sep 1 12:21:18 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 2 Sep 2016 02:21:18 +1000 Subject: [Python-Dev] PEP 526 ready for review: Syntax for Variable and Attribute Annotations In-Reply-To: References: Message-ID: <20160901162117.GX26300@ando.pearwood.info> On Thu, Sep 01, 2016 at 04:11:05PM +0300, Koos Zevenhoven wrote: > Maybe it's just me, but I've always thought 'def' is Python's least > logically used keyword. It seems to come from 'define', but what is it > about 'define' that makes it relate to functions only. Convention. You can't use "def" to define both functions and classes: def function(x): ... def Class(x): ... is ambiguous, which is the function and which is the class? So we cannot avoid at least one limitation: "def is for functions, or classes, but not both". Given that, it isn't that weird to make the rule "def is only for functions". [...] > Note that we could then also have this: > > def NAME > > Which would, again for readability (see above), be a way to express > that "there is an instance variable called X, but no type hint for > now". I can't think of a *good* way to do this with the keyword-free > version for people that don't use type hints. 
The simplest way would be to say "go on, one type hint won't hurt, there's no meaningful runtime cost, just do it". from typing import Any class X: NAME: Any Since I'm not running a type checker, it doesn't matter what hint I use, but Any is probably the least inaccurate. But I think there's a better way. Unless I've missed something, there's no way to pre-declare an instance attribute without specifying a type. (Even if that type is Any.) So how about we allow None as a type-hint on its own: NAME: None as equivalent to a declaration *without* a hint. The reader, and the type-checker, can see that there's an instance attribute called NAME, but in the absence of an actual hint, the type will have to be inferred, just as if it wasn't declared at all. The risk is that somebody will "helpfully" correct the "obvious typo" and change it to NAME = None, but I think that will usually be harmless. I can invent examples where they will behave differently, but they feel contrived to me: class X: spam: None # declaration only, without a hint def method(self): if not hasattr(self, "spam"): raise XError("spam not set") return self.spam def setup(self, arg): if hasattr(self, "spam"): raise XError("spam already set") self.spam = arg Changing the declaration to an assignment does change the behaviour of the class, but I think that will be obvious when it happens. -- Steve From levkivskyi at gmail.com Thu Sep 1 12:30:24 2016 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Thu, 1 Sep 2016 18:30:24 +0200 Subject: [Python-Dev] PEP 526 ready for review: Syntax for Variable and Attribute Annotations In-Reply-To: <20160901162117.GX26300@ando.pearwood.info> References: <20160901162117.GX26300@ando.pearwood.info> Message-ID: On 1 September 2016 at 18:21, Steven D'Aprano wrote: > The simplest way would be to say "go on, one type hint won't hurt, > there's no meaningful runtime cost, just do it". 
> > from typing import Any > > class X: > NAME: Any > > Since I'm not running a type checker, it doesn't matter what hint I use, > but Any is probably the least inaccurate. > > But I think there's a better way. > > Unless I've missed something, there's no way to pre-declare an instance > attribute without specifying a type. (Even if that type is Any.) So how > about we allow None as a type-hint on its own: > > NAME: None > > as equivalent to a declaration *without* a hint. The reader, and the > type-checker, can see that there's an instance attribute called NAME, > but in the absense of an actual hint, the type will have to be inferred, > just as if it wasn't declared at all. > There is a convention for function annotations in PEP 484 that a missing annotation is equivalent to Any, so that I like your first option more. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Thu Sep 1 13:01:23 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Thu, 1 Sep 2016 20:01:23 +0300 Subject: [Python-Dev] PEP 526 ready for review: Syntax for Variable and Attribute Annotations In-Reply-To: References: Message-ID: On Thu, Sep 1, 2016 at 5:46 PM, Guido van Rossum wrote: > On Thu, Sep 1, 2016 at 6:11 AM, Koos Zevenhoven wrote: >> While a large amount of Python programmers may not be interested in >> type hinting local variables inside functions, I can see other >> potential benefits in this. > > IOW, PEP 3157 is not dead yet. Indeed. > PEP 3157? Is that a typo or is there such a thing somewhere? [...] >> Also, when reading code, it may be hard to tell which (instance) >> attributes the class implements. To have these listed in the beginning >> of the class could therefore improve the readability. > > Right. That has been my observation using PEP 484's type comments > extensively for annotating instance variables at the class level. E.g. > much of mypy's own code is written this way, and it really is a huge > help. 
But > > foo = None # type: List[int] > > while it gives me the info I'm looking for, is not great > notation-wise, and that's why I started thinking about an alternative: > > foo: List[int] > > (in either case, the __init__ contains something like `self.foo = []`). > >> In this light, I'm not sure it's a good idea to allow attribute type >> hints inside methods. > > Those are meant for the coding style where all attributes are > initialized in the method and people just want to add annotations > there. This is already in heavy use in some PEP-484-annotated code > bases I know of, using # type comments, and I think it will be easier > to get people to switch to syntactic annotations if they can > mechanically translate those uses. (In fact we are planning an > automatic translator.) I suppose the translator would be somewhat more complicated if it were to move the type hints to the beginning of the class suite. Anyway, I hope there will at least be a recommendation somewhere (PEP 8?) to not mix the two styles of attribute annotation (beginning of class / in method). The whole readability benefit turns against itself if there are some non-ClassVar variables annotated outside __init__ and then the rest somewhere in __init__ and in whatever initialization helper methods __init__ happens to call. [...] >> Note that we could then also have this: >> >> def NAME >> >> Which would, again for readability (see above), be a way to express >> that "there is an instance variable called X, but no type hint for >> now". I can't think of a *good* way to do this with the keyword-free >> version for people that don't use type hints. >> >> And then there could also be a simple decorator like >> @slotted_attributes that automatically generates "__slots__" from the >> annotations. > > This I like, or something like it. It can be a follow-up design. (I.e. > a separate PEP, once we have experiece with PEP 526.) 
I think there should be a syntax for this that does not involve type hints, but I can't seem to come up with anything that works with the keyword-free version :(. -- Koos > -- > --Guido van Rossum (python.org/~guido) -- + Koos Zevenhoven + http://twitter.com/k7hoven + From steve at pearwood.info Thu Sep 1 13:30:23 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 2 Sep 2016 03:30:23 +1000 Subject: [Python-Dev] PEP 526 ready for review: Syntax for Variable and Attribute Annotations In-Reply-To: References: Message-ID: <20160901173023.GZ26300@ando.pearwood.info> On Tue, Aug 30, 2016 at 02:20:26PM -0700, Guido van Rossum wrote: > - Whether (given PEP 484's relative success) it's worth adding syntax > for variable/attribute annotations. The PEP makes a good case that it does. > - Whether the keyword-free syntax idea proposed here is best: > NAME: TYPE > TARGET: TYPE = VALUE I think so. That looks similar to the syntax used by TypeScript: http://www.typescriptlang.org/docs/handbook/type-inference.html let zoo: Animal[] = [new Rhino(), new Elephant(), new Snake()]; Some additional thoughts: Is it okay to declare something as both an instance and class attribute? class X: spam: int spam: ClassVar[str] = 'surprise!' def __init__(self): self.spam = 999 I would expect it should be okay. It is more common in Python circles to talk about class and instance *attributes* than "variables". Class variable might be okay in a language like Java where classes themselves aren't first-class values, but in Python "class variable" always makes me think it is talking about a variable which is a class, just like a string variable or list variable. Can we have ClassAttr[] instead of ClassVar[]? Other than that, +1 on the PEP. -- Steve From ethan at stoneleaf.us Thu Sep 1 15:36:53 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 01 Sep 2016 12:36:53 -0700 Subject: [Python-Dev] PEP 467: last round (?) Message-ID: <57C88355.9000302@stoneleaf.us> One more iteration. 
PEPs repo not updated yet. Changes are renaming of methods to be ``fromsize()`` and ``fromord()``, and moving ``memoryview`` to an Open Questions section. PEP: 467 Title: Minor API improvements for binary sequences Version: $Revision$ Last-Modified: $Date$ Author: Nick Coghlan , Ethan Furman Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 2014-03-30 Python-Version: 3.6 Post-History: 2014-03-30 2014-08-15 2014-08-16 2016-06-07 2016-09-01 Abstract ======== During the initial development of the Python 3 language specification, the core ``bytes`` type for arbitrary binary data started as the mutable type that is now referred to as ``bytearray``. Other aspects of operating in the binary domain in Python have also evolved over the course of the Python 3 series. This PEP proposes five small adjustments to the APIs of the ``bytes`` and ``bytearray`` types to make it easier to operate entirely in the binary domain: * Deprecate passing single integer values to ``bytes`` and ``bytearray`` * Add ``bytes.fromsize`` and ``bytearray.fromsize`` alternative constructors * Add ``bytes.fromord`` and ``bytearray.fromord`` alternative constructors * Add ``bytes.getbyte`` and ``bytearray.getbyte`` byte retrieval methods * Add ``bytes.iterbytes`` and ``bytearray.iterbytes`` alternative iterators Proposals ========= Deprecation of current "zero-initialised sequence" behaviour without removal ---------------------------------------------------------------------------- Currently, the ``bytes`` and ``bytearray`` constructors accept an integer argument and interpret it as meaning to create a zero-initialised sequence of the given size:: >>> bytes(3) b'\x00\x00\x00' >>> bytearray(3) bytearray(b'\x00\x00\x00') This PEP proposes to deprecate that behaviour in Python 3.6, but to leave it in place for at least as long as Python 2.7 is supported, possibly indefinitely. No other changes are proposed to the existing constructors. 
Addition of explicit "count and byte initialised sequence" constructors ----------------------------------------------------------------------- To replace the deprecated behaviour, this PEP proposes the addition of an explicit ``fromsize`` alternative constructor as a class method on both ``bytes`` and ``bytearray`` whose first argument is the count, and whose second argument is the fill byte to use (defaults to ``\x00``):: >>> bytes.fromsize(3) b'\x00\x00\x00' >>> bytearray.fromsize(3) bytearray(b'\x00\x00\x00') >>> bytes.fromsize(5, b'\x0a') b'\x0a\x0a\x0a\x0a\x0a' >>> bytearray.fromsize(5, b'\x0a') bytearray(b'\x0a\x0a\x0a\x0a\x0a') ``fromsize`` will behave just as the current constructors behave when passed a single integer, while allowing for non-zero fill values when needed. Addition of "bchr" function and explicit "single byte" constructors ------------------------------------------------------------------- As binary counterparts to the text ``chr`` function, this PEP proposes the addition of a ``bchr`` function and an explicit ``fromord`` alternative constructor as a class method on both ``bytes`` and ``bytearray``:: >>> bchr(ord("A")) b'A' >>> bchr(ord(b"A")) b'A' >>> bytes.fromord(65) b'A' >>> bytearray.fromord(65) bytearray(b'A') These methods will only accept integers in the range 0 to 255 (inclusive):: >>> bytes.fromord(512) Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: integer must be in range(0, 256) >>> bytes.fromord(1.0) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: 'float' object cannot be interpreted as an integer While this does create some duplication, there are valid reasons for it: * the ``bchr`` builtin is to recreate the ord/chr/unichr trio from Python 2 under a different naming scheme * the class method is mainly for the ``bytearray.fromord`` case, with ``bytes.fromord`` added for consistency The documentation of the ``ord`` builtin will be updated to explicitly note that ``bchr`` is the primary 
inverse operation for binary data, while ``chr`` is the inverse operation for text data, and that ``bytes.fromord`` and ``bytearray.fromord`` also exist. Behaviourally, ``bytes.fromord(x)`` will be equivalent to the current ``bytes([x])`` (and similarly for ``bytearray``). The new spelling is expected to be easier to discover and easier to read (especially when used in conjunction with indexing operations on binary sequence types). As a separate method, the new spelling will also work better with higher order functions like ``map``. Addition of "getbyte" method to retrieve a single byte ------------------------------------------------------ This PEP proposes that ``bytes`` and ``bytearray`` gain the method ``getbyte`` which will always return ``bytes``:: >>> b'abc'.getbyte(0) b'a' If an index is asked for that doesn't exist, ``IndexError`` is raised:: >>> b'abc'.getbyte(9) Traceback (most recent call last): File "<stdin>", line 1, in <module> IndexError: index out of range Addition of optimised iterator methods that produce ``bytes`` objects --------------------------------------------------------------------- This PEP proposes that ``bytes`` and ``bytearray`` gain an optimised ``iterbytes`` method that produces length 1 ``bytes`` objects rather than integers:: for x in data.iterbytes(): # x is a length 1 ``bytes`` object, rather than an integer For example:: >>> tuple(b"ABC".iterbytes()) (b'A', b'B', b'C') Design discussion ================= Why not rely on sequence repetition to create zero-initialised sequences? ------------------------------------------------------------------------- Zero-initialised sequences can be created via sequence repetition:: >>> b'\x00' * 3 b'\x00\x00\x00' >>> bytearray(b'\x00') * 3 bytearray(b'\x00\x00\x00') However, this was also the case when the ``bytearray`` type was originally designed, and the decision was made to add explicit support for it in the type constructor. 
The immutable ``bytes`` type then inherited that feature when it was introduced in PEP 3137. This PEP isn't revisiting that original design decision, just changing the spelling as users sometimes find the current behaviour of the binary sequence constructors surprising. In particular, there's a reasonable case to be made that ``bytes(x)`` (where ``x`` is an integer) should behave like the ``bytes.fromint(x)`` proposal in this PEP. Providing both behaviours as separate class methods avoids that ambiguity. Open Questions ============== Do we add ``iterbytes`` to ``memoryview``, or modify ``memoryview.cast()`` to accept ``'s'`` as a single-byte interpretation? Or do we ignore memory for now and add it later? References ========== .. [1] Initial March 2014 discussion thread on python-ideas (https://mail.python.org/pipermail/python-ideas/2014-March/027295.html) .. [2] Guido's initial feedback in that thread (https://mail.python.org/pipermail/python-ideas/2014-March/027376.html) .. [3] Issue proposing moving zero-initialised sequences to a dedicated API (http://bugs.python.org/issue20895) .. [4] Issue proposing to use calloc() for zero-initialised binary sequences (http://bugs.python.org/issue21644) .. [5] August 2014 discussion thread on python-dev (https://mail.python.org/pipermail/python-ideas/2014-March/027295.html) .. [6] June 2016 discussion thread on python-dev (https://mail.python.org/pipermail/python-dev/2016-June/144875.html) Copyright ========= This document has been placed in the public domain. 
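For anyone who wants to experiment with the proposed APIs before an implementation lands, the behaviours above can be approximated in pure Python. This is only a sketch — the names match the PEP, but the PEP proposes these as a builtin and as C-level class methods on ``bytes``/``bytearray``, not as module-level functions:

```python
# Pure-Python approximations of the PEP 467 proposals, for
# experimentation only.

def bchr(i):
    """Binary counterpart of chr(): map one integer to one byte.
    bytes.fromord / bytearray.fromord would behave the same way."""
    if not isinstance(i, int):
        raise TypeError("an integer is required")
    if not 0 <= i < 256:
        raise ValueError("integer must be in range(0, 256)")
    return bytes([i])

def fromsize(count, fill=b'\x00'):
    """bytes.fromsize(): explicit count-and-fill construction."""
    return fill * count

def getbyte(data, index):
    """bytes.getbyte(): indexing that returns bytes, not an int."""
    # data[index] handles negative indices and raises IndexError
    # for out-of-range indices, just as the proposed method would.
    return bchr(data[index])

def iterbytes(data):
    """bytes.iterbytes(): iterate as length-1 bytes objects."""
    return (bchr(b) for b in data)
```

For example, ``tuple(iterbytes(b"ABC"))`` gives ``(b'A', b'B', b'C')``, matching the doctest in the PEP text above.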
From guido at python.org Thu Sep 1 16:14:59 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 1 Sep 2016 13:14:59 -0700 Subject: [Python-Dev] PEP 526 ready for review: Syntax for Variable and Attribute Annotations In-Reply-To: <20160901173023.GZ26300@ando.pearwood.info> References: <20160901173023.GZ26300@ando.pearwood.info> Message-ID: On Thu, Sep 1, 2016 at 10:30 AM, Steven D'Aprano wrote: > On Tue, Aug 30, 2016 at 02:20:26PM -0700, Guido van Rossum wrote: > >> - Whether (given PEP 484's relative success) it's worth adding syntax >> for variable/attribute annotations. > > The PEP makes a good case that it does. Thanks, I agree. :-) >> - Whether the keyword-free syntax idea proposed here is best: >> NAME: TYPE >> TARGET: TYPE = VALUE > > I think so. > > That looks like similar to the syntax used by TypeScript: > > http://www.typescriptlang.org/docs/handbook/type-inference.html > > let zoo: Animal[] = [new Rhino(), new Elephant(), new Snake()]; And Rust. In the tracker issue we're still tweaking this, e.g. the latest idea is that after all we'd like to simplify the syntax to TARGET: TYPE [= VALUE] Please read the end of the tracker discussion: https://github.com/python/typing/issues/258#issuecomment-244188268 > Some additional thoughts: > > Is it okay to declare something as both an instance and class attribute? > > class X: > spam: int > spam: ClassVar[Str] = 'suprise!' > > def __init__(self): > self.spam = 999 > > > I would expect it should be okay. I think it would be confusing because the class var would be used as the default if the instance var is not defined. > It is more common in Python circles to talk about class and instance > *attributes* than "variables". Class variable might be okay in a > language like Java where classes themselves aren't first-class values, > but in Python "class variable" always makes me think it is talking about > a variable which is a class, just like a string variable or list > variable. 
Can we have ClassAttr[] instead of ClassVar[]? We went back and forth on this. I really don't like to use the word attribute here, because a method is also an attribute. And instance variable sounds more natural to me than instance attribute. Also we now have global variables, class variables, instance variables, and local variables, all of which can be annotated. (The PEP's language is actually a bit inconsistent here.) > Other than that, +1 on the PEP. -- --Guido van Rossum (python.org/~guido) From guido at python.org Thu Sep 1 16:25:17 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 1 Sep 2016 13:25:17 -0700 Subject: [Python-Dev] PEP 526 ready for review: Syntax for Variable and Attribute Annotations In-Reply-To: References: Message-ID: On Thu, Sep 1, 2016 at 10:01 AM, Koos Zevenhoven wrote: > On Thu, Sep 1, 2016 at 5:46 PM, Guido van Rossum wrote: >> IOW, PEP 3157 is not dead yet. Indeed. >> > > PEP 3157? Is that a typo or is there such a thing somewhere? Sorry, 3107 (the original Function Annotations PEP). > [...] > I hope there will at least be a recommendation somewhere (PEP 8?) to not > mix the two styles of attribute annotation (beginning of class / in > method). The whole readability benefit turns against itself if there > are some non-ClassVar variables annotated outside __init__ and then > the rest somewhere in __init__ and in whatever initialization helper > methods __init__ happens to call. Yeah, but then again, in general I don't believe you can legislate the writing of readable code using crude syntactic means. Not mixing the two in the same class sounds like pretty good advice though, and a linter should be able to catch that easily. 
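To make the terminology concrete, here is a condensed version of the Starship example from the PEP draft (runnable on the reference implementation or any later 3.6+ interpreter; ``ClassVar`` comes from ``typing``):

```python
# PEP 526 syntax: a bare annotation is recorded in __annotations__
# but does not create the attribute; ClassVar marks names that type
# checkers should treat as class variables only. Condensed from the
# Starship example in the PEP draft.
from typing import ClassVar, Dict

class Starship:
    captain: str                          # instance variable, no default
    damage: int = 0                       # instance variable with default
    stats: ClassVar[Dict[str, int]] = {}  # class variable, shared

# The bare annotation does not create Starship.captain:
assert 'captain' in Starship.__annotations__
assert not hasattr(Starship, 'captain')
assert Starship.damage == 0
```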
-- --Guido van Rossum (python.org/~guido) From guido at python.org Thu Sep 1 16:37:37 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 1 Sep 2016 13:37:37 -0700 Subject: [Python-Dev] PEP 526 ready for review: Syntax for Variable and Attribute Annotations In-Reply-To: References: <20160901162117.GX26300@ando.pearwood.info> Message-ID: On Thu, Sep 1, 2016 at 9:30 AM, Ivan Levkivskyi wrote: > On 1 September 2016 at 18:21, Steven D'Aprano wrote: [...] >> Unless I've missed something, there's no way to pre-declare an instance >> attribute without specifying a type. (Even if that type is Any.) So how >> about we allow None as a type-hint on its own: >> >> NAME: None >> >> as equivalent to a declaration *without* a hint. The reader, and the >> type-checker, can see that there's an instance attribute called NAME, >> but in the absense of an actual hint, the type will have to be inferred, >> just as if it wasn't declared at all. > There is a convention for function annotations in PEP 484 that a missing > annotation is equivalent to Any, so that I like your first option more. But Steven wasn't proposing it to mean Any, he was proposing it to mean "type checker should infer". Where I presume the inference should be done based on the assignment in __init__ only. I'm not sure if this needs special syntax (a type checker might behave the same way without this, so we could just use a comment) but even if we did decide we wanted to support NAME: None for this case, we don't have to change Python, since this already conforms to the syntax in PEP 526 (the type is None). We'd still have to update the PEP to tell the authors of type checkers about this special feature, since otherwise it would mean "NAME has type NoneType" (remember that PEP 484 defines None as a shortcut for NoneType == type(None)). But that's not a very useful type for a variable... 
But I'm not in a hurry for that -- I'm only hoping to get the basic syntax accepted by Python 3.6 beta 1 so that we can start using this in 5 years from now rather than 7 years from now. -- --Guido van Rossum (python.org/~guido) From victor.stinner at gmail.com Thu Sep 1 17:06:15 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 1 Sep 2016 23:06:15 +0200 Subject: [Python-Dev] PEP 467: last round (?) In-Reply-To: <57C88355.9000302@stoneleaf.us> References: <57C88355.9000302@stoneleaf.us> Message-ID: 2016-09-01 21:36 GMT+02:00 Ethan Furman : > Abstract > ======== > > This PEP proposes five small adjustments to the APIs of the ``bytes`` and > ``bytearray`` types to make it easier to operate entirely in the binary > domain: You should add bchr() in the Abstract. > * Deprecate passing single integer values to ``bytes`` and ``bytearray`` > * Add ``bytes.fromsize`` and ``bytearray.fromsize`` alternative constructors I understand that main reason for this change is to catch bugs when bytes(obj) is used and obj is not supposed to be an integer. So I expect that bytes(int) will be quickly deprecated, but the PEP doesn't schedule a removal of the feature. So it looks more than only adding an alias to bytes(int). I would prefer to either schedule a removal of bytes(int), or remove bytes.fromsize() from the PEP. > * Add ``bytes.fromord`` and ``bytearray.fromord`` alternative constructors Hum, you already propose to add a builtin function. Why would we need two ways to create a single byte? I'm talking about bchr(int)==bytes.fromord(int). I'm not sure that there is an use case for bytearray.fromord(int). > * Add ``bytes.getbyte`` and ``bytearray.getbyte`` byte retrieval methods > * Add ``bytes.iterbytes`` and ``bytearray.iterbytes`` alternative iterators I like these ones :-) > In particular, there's a reasonable case to be made > that ``bytes(x)`` (where ``x`` is an integer) should behave like the > ``bytes.fromint(x)`` proposal in this PEP. "fromint"? 
Is it bytes.fromord()/bchr()? > Open Questions > ============== > > Do we add ``iterbytes`` to ``memoryview``, or modify > ``memoryview.cast()`` to accept ``'s'`` as a single-byte interpretation? Or > do we ignore memory for now and add it later? It's nice to have bytes.iterbytes() to help porting Python 2 code, but I'm not sure that this function would be super popular in new Python 3 code. I don't think that a memoryview.iterbytes() (or cast("s")) would be useful. Victor From victor.stinner at gmail.com Thu Sep 1 17:15:51 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 1 Sep 2016 23:15:51 +0200 Subject: [Python-Dev] Patch reviews In-Reply-To: References: Message-ID: 2016-08-31 22:31 GMT+02:00 Christian Heimes : > https://bugs.python.org/issue27744 > Add AF_ALG (Linux Kernel crypto) to socket module This patch adds a new socket.sendmsg_afalg() method on Linux. "afalg" comes from AF_ALG which means "Address Family Algorithm". It's documented as "af_alg: User-space algorithm interface" in crypto/af_alg.c. IMHO the method should be just "sendmsg_alg()", because "afalg" is redundant. The AF_ prefix is only used to work around a C limitation: there is no namespace in the language, all symbols are in one single giant namespace. I don't expect that a platform will add a new sendmsg_alg() C function. If it's the case, we will see how to handle the name conflict ;-) Victor From ethan at stoneleaf.us Thu Sep 1 18:04:33 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 01 Sep 2016 15:04:33 -0700 Subject: [Python-Dev] PEP 467: last round (?) 
In-Reply-To: References: <57C88355.9000302@stoneleaf.us> Message-ID: <57C8A5F1.4060204@stoneleaf.us> On 09/01/2016 02:06 PM, Victor Stinner wrote: > 2016-09-01 21:36 GMT+02:00 Ethan Furman: >> Abstract >> ======== >> >> This PEP proposes five small adjustments to the APIs of the ``bytes`` and >> ``bytearray`` types to make it easier to operate entirely in the binary >> domain: > > You should add bchr() in the Abstract. Done. >> * Deprecate passing single integer values to ``bytes`` and ``bytearray`` >> * Add ``bytes.fromsize`` and ``bytearray.fromsize`` alternative constructors > > I understand that main reason for this change is to catch bugs when > bytes(obj) is used and obj is not supposed to be an integer. > > So I expect that bytes(int) will be quickly deprecated, but the PEP > doesn't schedule a removal of the feature. So it looks more than only > adding an alias to bytes(int). > > I would prefer to either schedule a removal of bytes(int), or remove > bytes.fromsize() from the PEP. The PEP states that ``bytes(x)`` will not be removed while 2.7 is supported. Once 2.7 is no longer a concern we can visit the question of removing that behavior. >> * Add ``bytes.fromord`` and ``bytearray.fromord`` alternative constructors > > Hum, you already propose to add a builtin function. Why would we need > two ways to create a single byte? - `bchr` to mirror `chr` - `fromord` to replace the mistaken purpose of the default constructor >> * Add ``bytes.getbyte`` and ``bytearray.getbyte`` byte retrieval methods >> * Add ``bytes.iterbytes`` and ``bytearray.iterbytes`` alternative iterators > > I like these ones :-) Cool. >> In particular, there's a reasonable case to be made >> that ``bytes(x)`` (where ``x`` is an integer) should behave like the >> ``bytes.fromint(x)`` proposal in this PEP. > > "fromint"? Is it bytes.fromord()/bchr()? Oops, fixed. 
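For anyone who wants to play with the proposals before the C versions land, they are easy to approximate in today's Python. This is a rough sketch only -- the real additions would be C methods on ``bytes``/``bytearray``, and the exact signatures in the PEP may differ:

```python
# Pure-Python approximations of the PEP 467 proposals (illustration only).

def bchr(i):
    """Like chr(), but returns a bytes object of length 1."""
    return bytes([i])  # raises ValueError if not 0 <= i <= 255

def fromsize(size, fill=b'\x00'):
    """Approximation of bytes.fromsize(n): a fill-repeated sequence."""
    return fill * size

def getbyte(b, index):
    """Approximation of b.getbyte(i): the byte at i as length-1 bytes."""
    return b[index:index + 1]

def iterbytes(b):
    """Approximation of b.iterbytes(): yield length-1 bytes, not ints."""
    return (b[i:i + 1] for i in range(len(b)))

print(bchr(65))                 # b'A'
print(fromsize(3))              # b'\x00\x00\x00'
print(list(iterbytes(b'abc')))  # [b'a', b'b', b'c']
```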
-- ~Ethan From steve.dower at python.org Thu Sep 1 18:28:53 2016 From: steve.dower at python.org (Steve Dower) Date: Thu, 1 Sep 2016 15:28:53 -0700 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 Message-ID: I'm about to be offline for a few days, so I wanted to get my current draft PEPs out so people can read and review them. I don't believe there is a lot of change as a result of either PEP, but the impact of what change there is needs to be weighed against the benefits. If anything, I'm likely to have underplayed the impact of this change (though I've had a *lot* of support for this one). Just stating my biases up-front - take it as you wish. See https://bugs.python.org/issue1602 for the current proposed patch for this PEP. I will likely update it after my upcoming flights, but it's in pretty good shape right now. Cheers, Steve --- https://github.com/python/peps/blob/master/pep-0528.txt --- PEP: 528 Title: Change Windows console encoding to UTF-8 Version: $Revision$ Last-Modified: $Date$ Author: Steve Dower Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 27-Aug-2016 Post-History: 01-Sep-2016 Abstract ======== Historically, Python uses the ANSI APIs for interacting with the Windows operating system, often via C Runtime functions. However, these have long been discouraged in favor of the UTF-16 APIs. Within the operating system, all text is represented as UTF-16, and the ANSI APIs perform encoding and decoding using the active code page. This PEP proposes changing the default standard stream implementation on Windows to use the Unicode APIs. This will allow users to print and input the full range of Unicode characters at the default Windows console. This also requires a subtle change to how the tokenizer parses text from readline hooks, which should have no backwards compatibility issues.
Specific Changes ================ Add _io.WindowsConsoleIO ------------------------ Currently an instance of ``_io.FileIO`` is used to wrap the file descriptors representing standard input, output and error. We add a new class (implemented in C) ``_io.WindowsConsoleIO`` that acts as a raw IO object using the Windows console functions, specifically, ``ReadConsoleW`` and ``WriteConsoleW``. This class will be used when the legacy-mode flag is not in effect, when opening a standard stream by file descriptor and the stream is a console buffer rather than a redirected file. Otherwise, ``_io.FileIO`` will be used as it is today. This is a raw (bytes) IO class that requires text to be passed encoded with utf-8, which will be decoded to utf-16-le and passed to the Windows APIs. Similarly, bytes read from the class will be provided by the operating system as utf-16-le and converted into utf-8 when returned to Python. The use of an ASCII compatible encoding is required to maintain compatibility with code that bypasses the ``TextIOWrapper`` and directly writes ASCII bytes to the standard streams (for example, [process_stdinreader.py]_). Code that assumes a particular encoding for the standard streams other than ASCII will likely break. Add _PyOS_WindowsConsoleReadline -------------------------------- To allow Unicode entry at the interactive prompt, a new readline hook is required. The existing ``PyOS_StdioReadline`` function will delegate to the new ``_PyOS_WindowsConsoleReadline`` function when reading from a file descriptor that is a console buffer and the legacy-mode flag is not in effect (the logic should be identical to above). Since the readline interface is required to return an 8-bit encoded string with no embedded nulls, the ``_PyOS_WindowsConsoleReadline`` function transcodes from utf-16-le as read from the operating system into utf-8. 
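The conversion at this boundary is simple to model in pure Python. The helper names below are made up for illustration, and this is a sketch of the semantics only -- the real implementation is C code inside ``_io.WindowsConsoleIO``:

```python
# Model of the utf-8 <-> utf-16-le conversion at the console boundary
# (illustration only; the actual code is C, calling the *W console APIs).

def to_console(data):
    """Bytes written by Python (utf-8) -> bytes for WriteConsoleW (utf-16-le)."""
    return data.decode('utf-8').encode('utf-16-le')

def from_console(data):
    """Bytes read via ReadConsoleW (utf-16-le) -> bytes returned to Python (utf-8)."""
    return data.decode('utf-16-le').encode('utf-8')

utf8 = 'h\u00e9llo\n'.encode('utf-8')
assert from_console(to_console(utf8)) == utf8  # round-trips losslessly
```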
The function ``PyRun_InteractiveOneObject`` which currently obtains the encoding from ``sys.stdin`` will select utf-8 unless the legacy-mode flag is in effect. This may require readline hooks to change their encodings to utf-8, or to require legacy-mode for correct behaviour. Add legacy mode --------------- Launching Python with the environment variable ``PYTHONLEGACYWINDOWSSTDIO`` set will enable the legacy-mode flag, which completely restores the previous behaviour. Alternative Approaches ====================== The ``win_unicode_console`` package [win_unicode_console]_ is a pure-Python alternative to changing the default behaviour of the console. Code that may break =================== The following code patterns may break or see different behaviour as a result of this change. All of these code samples require explicitly choosing to use a raw file object in place of a more convenient wrapper that would prevent any visible change. Assuming stdin/stdout encoding ------------------------------ Code that assumes that the encoding required by ``sys.stdin.buffer`` or ``sys.stdout.buffer`` is ``'mbcs'`` or a more specific encoding may currently be working by chance, but could encounter issues under this change. For example:: sys.stdout.buffer.write(text.encode('mbcs')) r = sys.stdin.buffer.read(16).decode('cp437') To correct this code, the encoding specified on the ``TextIOWrapper`` should be used, either implicitly or explicitly:: # Fix 1: Use wrapper correctly sys.stdout.write(text) r = sys.stdin.read(16) # Fix 2: Use encoding explicitly sys.stdout.buffer.write(text.encode(sys.stdout.encoding)) r = sys.stdin.buffer.read(16).decode(sys.stdin.encoding) Incorrectly using the raw object -------------------------------- Code that uses the raw IO object and does not correctly handle partial reads and writes may be affected. 
This is particularly important for reads, where the number of characters read will never exceed one-fourth of the number of bytes allowed, as there is no feasible way to prevent input from encoding as much longer utf-8 strings::

    >>> stdin = open(sys.stdin.fileno(), 'rb')
    >>> data = stdin.raw.read(15)
    abcdefghijklm
    b'abc'
    # data contains at most 3 characters, and never more than 12 bytes
    # error, as "defghijklm\r\n" is passed to the interactive prompt

To correct this code, the buffered reader/writer should be used, or the caller should continue reading until its buffer is full::

    # Fix 1: Use the buffered reader/writer
    >>> stdin = open(sys.stdin.fileno(), 'rb')
    >>> data = stdin.read(15)
    abcdefghijklm
    b'abcdefghijklm\r\n'

    # Fix 2: Loop until enough bytes have been read
    >>> stdin = open(sys.stdin.fileno(), 'rb')
    >>> b = b''
    >>> while len(b) < 15:
    ...     b += stdin.raw.read(15)
    abcdefghijklm
    b'abcdefghijklm\r\n'

Copyright ========= This document has been placed in the public domain. References ========== .. [process_stdinreader.py] Twisted's process_stdinreader.py (https://github.com/twisted/twisted/blob/trunk/src/twisted/test/process_stdinreader.py) .. [win_unicode_console] win_unicode_console package (https://pypi.org/project/win_unicode_console/) From steve.dower at python.org Thu Sep 1 18:31:26 2016 From: steve.dower at python.org (Steve Dower) Date: Thu, 1 Sep 2016 15:31:26 -0700 Subject: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8 Message-ID: I'm about to be offline for a few days, so I wanted to get my current draft PEPs out so people can read and review them. I don't believe there is a lot of change as a result of either PEP, but the impact of what change there is needs to be weighed against the benefits. We've already had some thorough discussion on this one and failed to reach agreement on whether we can make this change in 3.6 or if it needs a deprecation cycle that is more visible than the one we started in 3.3.
In the latter case, we need to determine how visible that should be (i.e. warnings visible by default, visible for non-Windows platforms, value-dependent warnings/errors, etc.). IMHO, the argument about having the change be on-by-default or off-by-default is irrelevant until we decide on the deprecation issue, at which point it is obvious what the default should be. See https://bugs.python.org/issue27781 for the current proposed patch. I do need to update it in order to merge against default it seems (work for my upcoming flight). Cheers, Steve --- https://github.com/python/peps/blob/master/pep-0529.txt --- PEP: 529 Title: Change Windows filesystem encoding to UTF-8 Version: $Revision$ Last-Modified: $Date$ Author: Steve Dower Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 27-Aug-2016 Post-History: 01-Sep-2016 Abstract ======== Historically, Python uses the ANSI APIs for interacting with the Windows operating system, often via C Runtime functions. However, these have been long discouraged in favor of the UTF-16 APIs. Within the operating system, all text is represented as UTF-16, and the ANSI APIs perform encoding and decoding using the active code page. This PEP proposes changing the default filesystem encoding on Windows to utf-8, and changing all filesystem functions to use the Unicode APIs for filesystem paths. This will not affect code that uses strings to represent paths, however those that use bytes for paths will now be able to correctly round-trip all valid paths in Windows filesystems. Currently, the conversions between Unicode (in the OS) and bytes (in Python) were lossy and would fail to round-trip characters outside of the user's active code page. Notably, this does not impact the encoding of the contents of files. These will continue to default to locale.getpreferredencoding (for text files) or plain bytes (for binary files). 
This only affects the encoding used when users pass a bytes object to Python where it is then passed to the operating system as a path name. Background ========== File system paths are almost universally represented as text with an encoding determined by the file system. In Python, we expose these paths via a number of interfaces, such as the ``os`` and ``io`` modules. Paths may be passed either direction across these interfaces, that is, from the filesystem to the application (for example, ``os.listdir()``), or from the application to the filesystem (for example, ``os.unlink()``). When paths are passed between the filesystem and the application, they are either passed through as a bytes blob or converted to/from str using ``os.fsencode()`` or ``sys.getfilesystemencoding()``. The result of encoding a string with ``sys.getfilesystemencoding()`` is a blob of bytes in the native format for the default file system. On Windows, the native format for the filesystem is utf-16-le. The recommended platform APIs for accessing the filesystem all accept and return text encoded in this format. However, prior to Windows NT (and possibly further back), the native format was a configurable machine option and a separate set of APIs existed to accept this format. The option (the "active code page") and these APIs (the "*A functions") still exist in recent versions of Windows for backwards compatibility, though new functionality often only has a utf-16-le API (the "*W functions"). In Python, str is recommended because it can correctly round-trip all characters used in paths (on POSIX with surrogateescape handling; on Windows because str maps to the native representation). On Windows bytes cannot round-trip all characters used in paths, as Python internally uses the *A functions and hence the encoding is "whatever the active code page is". 
Since the active code page cannot represent all Unicode characters, the conversion of a path into bytes can lose information without warning or any available indication. As a demonstration of this:: >>> open('test\uAB00.txt', 'wb').close() >>> import glob >>> glob.glob('test*') ['test\uab00.txt'] >>> glob.glob(b'test*') [b'test?.txt'] The Unicode character in the second call to glob has been replaced by a '?', which means passing the path back into the filesystem will result in a ``FileNotFoundError``. The same results may be observed with ``os.listdir()`` or any function that matches the return type to the parameter type. While one user-accessible fix is to use str everywhere, POSIX systems generally do not suffer from data loss when using bytes exclusively as the bytes are the canonical representation. Even if the encoding is "incorrect" by some standard, the file system will still map the bytes back to the file. Making use of this avoids the cost of decoding and reencoding, such that (theoretically, and only on POSIX), code such as this may be faster because of the use of `b'.'` compared to using `'.'`:: >>> for f in os.listdir(b'.'): ... os.stat(f) ... As a result, POSIX-focused library authors prefer to use bytes to represent paths. For some authors it is also a convenience, as their code may receive bytes already known to be encoded correctly, while others are attempting to simplify porting their code from Python 2. However, the correctness assumptions do not carry over to Windows where Unicode is the canonical representation, and errors may result. This potential data loss is why the use of bytes paths on Windows was deprecated in Python 3.3 - all of the above code snippets produce deprecation warnings on Windows. Proposal ======== Currently the default filesystem encoding is 'mbcs', which is a meta-encoder that uses the active code page. However, when bytes are passed to the filesystem they go through the *A APIs and the operating system handles encoding. 
In this case, paths are always encoded using the equivalent of 'mbcs:replace' - we have no ability to change this (though there is a user/machine configuration option to change the encoding from CP_ACP to CP_OEM, so it won't necessarily always match mbcs...) This proposal would remove all use of the *A APIs and only ever call the *W APIs. When Windows returns paths to Python as str, they will be decoded from utf-16-le and returned as text (in whatever the minimal representation is). When Windows returns paths to Python as bytes, they will be decoded from utf-16-le to utf-8 using surrogatepass (Windows does not validate surrogate pairs, so it is possible to have invalid surrogates in filenames). Equally, when paths are provided as bytes, they are decoded from utf-8 into utf-16-le and passed to the *W APIs. The use of utf-8 will not be configurable, with the possible exception of a "legacy mode" environment variable or X-flag. surrogateescape does not apply here, as the concern is not about retaining non-sensical bytes. Any path returned from the operating system will be valid Unicode, while bytes paths created by the user may raise a decoding error (currently these would raise ``OSError`` or a subclass). The choice of utf-8 bytes (as opposed to utf-16-le bytes) is to ensure the ability to round-trip without breaking the functionality of the ``os.path`` module, which assumes an ASCII-compatible encoding. Using utf-16-le as the encoding is more pure, but will cause more issues than are resolved. This change would also undeprecate the use of bytes paths on Windows. No change to the semantics of using bytes as a path is required - as before, they must be encoded with the encoding specified by ``sys.getfilesystemencoding()``. 
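The proposed bytes<->str mapping can already be demonstrated with the codecs that exist today. The helper names below are made up for illustration; this sketches the semantics, not the implementation:

```python
# Sketch of the proposed path-encoding semantics on Windows: str paths
# are canonical, bytes paths are their utf-8/surrogatepass form.

def fsencode_proposed(path):
    # surrogatepass lets lone surrogates (which are legal in Windows
    # filenames) round-trip instead of raising UnicodeEncodeError.
    return path.encode('utf-8', 'surrogatepass')

def fsdecode_proposed(path):
    return path.decode('utf-8', 'surrogatepass')

name = 'test\uAB00.txt'  # the example from the PEP
assert fsdecode_proposed(fsencode_proposed(name)) == name  # round-trips

lone = 'bad\ud800name'   # lone surrogate: representable in NTFS names
assert fsdecode_proposed(fsencode_proposed(lone)) == lone
```

Compare this with the current behaviour, where encoding ``'test\uAB00.txt'`` through the active code page replaces the character with ``'?'`` and the round-trip fails.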
Specific Changes ================ Update sys.getfilesystemencoding -------------------------------- Remove the default value for ``Py_FileSystemDefaultEncoding`` and set it in ``initfsencoding()`` to utf-8, or if the legacy-mode switch is enabled to mbcs. Update the implementations of ``PyUnicode_DecodeFSDefaultAndSize`` and ``PyUnicode_EncodeFSDefault`` to use the standard utf-8 codec with surrogatepass error mode, or if the legacy-mode switch is enabled the code page codec with replace error mode. Update path_converter --------------------- Update the path converter to always decode bytes or buffer objects into text using ``PyUnicode_DecodeFSDefaultAndSize``. Change the ``narrow`` field from a ``char*`` string into a flag that indicates whether the original object was bytes. This is required for functions that need to return paths using the same type as was originally provided. Remove unused ANSI code ----------------------- Remove all code paths using the ``narrow`` field, as these will no longer be reachable by any caller. These are only used within ``posixmodule.c``. Other uses of paths should have use of bytes paths replaced with decoding and use of the *W APIs. Add legacy mode --------------- Add a legacy mode flag, enabled by the environment variable ``PYTHONLEGACYWINDOWSFSENCODING``. When this flag is set, the default filesystem encoding is set to mbcs rather than utf-8, and the error mode is set to 'replace' rather than 'strict'. The ``path_converter`` will continue to decode to wide characters and only *W APIs will be called, however, the bytes passed in and received from Python will be encoded the same as prior to this change. Undeprecate bytes paths on Windows ---------------------------------- Using bytes as paths on Windows is currently deprecated. We would announce that this is no longer the case, and that paths when encoded as bytes should use whatever is returned from ``sys.getfilesystemencoding()`` rather than the user's active code page. 
Rejected Alternatives ===================== Use strict mbcs decoding ------------------------ This is essentially the same as the proposed change, but instead of changing ``sys.getfilesystemencoding()`` to utf-8 it is changed to mbcs (which dynamically maps to the active code page). This approach allows the use of new functionality that is only available as *W APIs and also detection of encoding/decoding errors. For example, rather than silently replacing Unicode characters with '?', it would be possible to warn or fail the operation. Compared to the proposed fix, this could enable some new functionality but does not fix any of the problems described initially. New runtime errors may cause some problems to be more obvious and lead to fixes, provided library maintainers are interested in supporting Windows and adding a separate code path to treat filesystem paths as strings. Making the encoding mbcs without strict errors is equivalent to the legacy-mode switch being enabled by default. This is a possible course of action if there is significant breakage of actual code and a need to extend the deprecation period, but still a desire to have the simplifications to the CPython source. Make bytes paths an error on Windows ------------------------------------ By preventing the use of bytes paths on Windows completely we prevent users from hitting encoding issues. However, the motivation for this PEP is to increase the likelihood that code written on POSIX will also work correctly on Windows. This alternative would move in the other direction and make such code completely incompatible. As this does not benefit users in any way, we reject it. Make bytes paths an error on all platforms ------------------------------------------ By deprecating and then disabling the use of bytes paths on all platforms we prevent users from hitting encoding issues regardless of where the code was originally written.
This would require a full deprecation cycle, as there are currently no warnings on platforms other than Windows. This is likely to be seen as a hostile action against Python developers in general, and as such is rejected at this time. Code that may break =================== The following code patterns may break or see different behaviour as a result of this change. Note that all of these examples produce deprecation warnings on Python 3.3 and later. Not managing encodings across boundaries ---------------------------------------- Code that does not manage encodings when crossing protocol boundaries may currently be working by chance, but could encounter issues when either encoding changes. For example:: filename = open('filename_in_mbcs.txt', 'rb').read() text = open(filename, 'r').read() To correct this code, the encoding of the bytes in ``filename`` should be specified, either when reading from the file or before using the value:: # Fix 1: Open file as text filename = open('filename_in_mbcs.txt', 'r', encoding='mbcs').read() text = open(filename, 'r').read() # Fix 2: Decode path filename = open('filename_in_mbcs.txt', 'rb').read() text = open(filename.decode('mbcs'), 'r').read() Explicitly using 'mbcs' ----------------------- Code that explicitly encodes text using 'mbcs' before passing to file system APIs. For example:: filename = open('files.txt', 'r').readline() text = open(filename.encode('mbcs'), 'r') To correct this code, the string should be passed without explicit encoding, or should use ``os.fsencode()``:: # Fix 1: Do not encode the string filename = open('files.txt', 'r').readline() text = open(filename, 'r') # Fix 2: Use correct encoding filename = open('files.txt', 'r').readline() text = open(os.fsencode(filename), 'r') Copyright ========= This document has been placed in the public domain. 
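As a practical note, the ``os.fsencode()``/``os.fsdecode()`` helpers used in the fixes above track ``sys.getfilesystemencoding()`` (and its error handler) automatically, so code written in terms of them keeps working no matter what the default becomes:

```python
import os

# os.fsencode()/os.fsdecode() always apply sys.getfilesystemencoding(),
# so they are the portable way to cross the str/bytes path boundary.
name = 'files.txt'
encoded = os.fsencode(name)
assert isinstance(encoded, bytes)
assert os.fsdecode(encoded) == name

# str input to fsdecode passes through unchanged, which makes it safe
# to call on values of either type:
assert os.fsdecode(name) == name
```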
From yselivanov.ml at gmail.com Thu Sep 1 18:34:06 2016 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Thu, 1 Sep 2016 15:34:06 -0700 Subject: [Python-Dev] PEP 525, third round, better finalization Message-ID: Hi, I've spent quite a while thinking and experimenting with PEP 525 trying to figure out how to make asynchronous generator (AG) finalization reliable. I've tried to replace the callback for GCed AGs with a callback that intercepts the first iteration of AGs. Turns out it's very hard to work with weak-refs and make the asyncio event loop reliably track and shut down all open AGs. My new approach is to replace the "sys.set_asyncgen_finalizer(finalizer)" function with "sys.set_asyncgen_hooks(firstiter=None, finalizer=None)". This design allows us to: 1. intercept the first iteration of an AG. That makes it possible for event loops to keep a weak set of all "open" AGs, and to implement a "shutdown" method to close the loop and close all AGs *reliably*. 2. intercept GC of AGs. That makes it possible to call "aclose" on GCed AGs to guarantee that 'finally' and 'async with' statements are properly closed. 3. in later Python versions we can add more hooks, although I can't think of anything else we need to add right now. I'm posting below the only updated PEP section. The latest PEP revision should also be available on python.org shortly. All new proposed changes are available to play with in my fork of CPython here: https://github.com/1st1/cpython/tree/async_gen Finalization ------------ PEP 492 requires an event loop or a scheduler to run coroutines. Because asynchronous generators are meant to be used from coroutines, they also require an event loop to run and finalize them. Asynchronous generators can have ``try..finally`` blocks, as well as ``async with``. It is important to provide a guarantee that, even when partially iterated, and then garbage collected, generators can be safely finalized.
For example::

    async def square_series(con, to):
        async with con.transaction():
            cursor = con.cursor(
                'SELECT generate_series(0, $1) AS i', to)
            async for row in cursor:
                yield row['i'] ** 2

    async for i in square_series(con, 1000):
        if i == 100:
            break

The above code defines an asynchronous generator that uses ``async with`` to iterate over a database cursor in a transaction. The generator is then iterated over with ``async for``, which interrupts the iteration at some point. The ``square_series()`` generator will then be garbage collected, and without a mechanism to asynchronously close the generator, the Python interpreter would not be able to do anything. To solve this problem we propose to do the following: 1. Implement an ``aclose`` method on asynchronous generators returning a special *awaitable*. When awaited it throws a ``GeneratorExit`` into the suspended generator and iterates over it until either a ``GeneratorExit`` or a ``StopAsyncIteration`` occurs. This is very similar to what the ``close()`` method does to regular Python generators, except that an event loop is required to execute ``aclose()``. 2. Raise a ``RuntimeError`` when an asynchronous generator executes a ``yield`` expression in its ``finally`` block (using ``await`` is fine, though)::

    async def gen():
        try:
            yield
        finally:
            await asyncio.sleep(1)  # Can use 'await'.
            yield                   # Cannot use 'yield',
                                    # this line will trigger a
                                    # RuntimeError.

3. Add two new methods to the ``sys`` module: ``set_asyncgen_hooks()`` and ``get_asyncgen_hooks()``. The idea behind ``sys.set_asyncgen_hooks()`` is to allow event loops to intercept asynchronous generator iteration and finalization, so that the end user does not need to care about the finalization problem, and everything just works. ``sys.set_asyncgen_hooks()`` accepts two arguments: * ``firstiter``: a callable which will be called when an asynchronous generator is iterated for the first time.
* ``finalizer``: a callable which will be called when an asynchronous generator is about to be GCed. When an asynchronous generator is iterated for the first time, it stores a reference to the current finalizer. If there is none, a ``RuntimeError`` is raised. This provides a strong guarantee that every asynchronous generator object will always have a finalizer installed by the correct event loop. When an asynchronous generator is about to be garbage collected, it calls its cached finalizer. The assumption is that the finalizer will schedule an ``aclose()`` call with the loop that was active when the iteration started. For instance, here is how asyncio is modified to allow safe finalization of asynchronous generators::

    # asyncio/base_events.py

    class BaseEventLoop:

        def run_forever(self):
            ...
            old_hooks = sys.get_asyncgen_hooks()
            sys.set_asyncgen_hooks(finalizer=self._finalize_asyncgen)
            try:
                ...
            finally:
                sys.set_asyncgen_hooks(*old_hooks)
                ...

        def _finalize_asyncgen(self, gen):
            self.create_task(gen.aclose())

The second argument, ``firstiter``, allows event loops to maintain a weak set of asynchronous generators instantiated under their control. This makes it possible to implement "shutdown" mechanisms to safely finalize all open generators and close the event loop. ``sys.set_asyncgen_hooks()`` is thread-specific, so several event loops running in parallel threads can use it safely. ``sys.get_asyncgen_hooks()`` returns a namedtuple-like structure with ``firstiter`` and ``finalizer`` fields. From victor.stinner at gmail.com Thu Sep 1 19:07:49 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 2 Sep 2016 01:07:49 +0200 Subject: [Python-Dev] PEP 467: last round (?) In-Reply-To: <57C8A5F1.4060204@stoneleaf.us> References: <57C88355.9000302@stoneleaf.us> <57C8A5F1.4060204@stoneleaf.us> Message-ID: 2016-09-02 0:04 GMT+02:00 Ethan Furman : > - `fromord` to replace the mistaken purpose of the default constructor To replace a bogus bytes(obj)?
If someone writes bytes(obj) but expects to create a byte string from an integer, why not use bchr() to fix the code? Victor From random832 at fastmail.com Thu Sep 1 19:28:29 2016 From: random832 at fastmail.com (Random832) Date: Thu, 01 Sep 2016 19:28:29 -0400 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: References: Message-ID: <1472772509.386843.713295953.53162396@webmail.messagingengine.com> On Thu, Sep 1, 2016, at 18:28, Steve Dower wrote: > This is a raw (bytes) IO class that requires text to be passed encoded > with utf-8, which will be decoded to utf-16-le and passed to the Windows APIs. > Similarly, bytes read from the class will be provided by the operating > system as utf-16-le and converted into utf-8 when returned to Python. What happens if a character is broken across a buffer boundary? e.g. if someone tries to read or write one byte at a time (you can't do a partial read of zero bytes, there's no way to distinguish that from an EOF.) Is there going to be a higher-level text I/O class that bypasses the UTF-8 encoding step when the underlying bytes stream is a console? What if we did that but left the encoding as mbcs? I.e. the console is a text stream that can magically handle characters that aren't representable in its encoding. Note that if anything does os.read/write to the console's file descriptors, they're gonna get MBCS and there's nothing we can do about it.
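(For reference, the stdlib already has machinery for the split-character case raised here: an incremental decoder buffers an incomplete multi-byte sequence until the rest arrives. This is a sketch of the idea, not of what the proposed console class actually does:)

```python
import codecs

# An incremental decoder holds on to a partial multi-byte sequence until
# the remaining bytes arrive, instead of failing at a buffer boundary.
dec = codecs.getincrementaldecoder('utf-8')()

data = 'h\u00e9llo'.encode('utf-8')  # b'h\xc3\xa9llo'
assert dec.decode(data[:2]) == 'h'   # the lone 0xc3 byte is buffered
assert dec.decode(data[2:]) == '\u00e9llo'  # completed on the next chunk
```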
From steve.dower at python.org Thu Sep 1 22:35:26 2016 From: steve.dower at python.org (Steve Dower) Date: Thu, 1 Sep 2016 19:35:26 -0700 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: <1472772509.386843.713295953.53162396@webmail.messagingengine.com> References: <1472772509.386843.713295953.53162396@webmail.messagingengine.com> Message-ID: My original plan was to bypass the utf8 encoding step, but that was going to cause major issues with code that blindly assumes it can do things like sys.stdout.buffer.write(b"\n") (rather than b"\n\0" - and who'd imagine you needed to do that). I didn't want to set up secret handshakes either, at least until there's a proven performance issue. I'd need to test to be sure, but writing an incomplete code point should just truncate to before that point. It may currently raise OSError if that truncated to zero length, as I believe that's not currently distinguished from an error. What behavior would you propose? Reads of less than four bytes fail instantly, as in the worst case we need four bytes to represent one Unicode character. This is an unfortunate reality of trying to limit it to one system call - you'll never get a full buffer from a single read, as there is no simple mapping between length-as-utf8 and length-as-utf16 for an arbitrary string. Top-posted from my Windows Phone -----Original Message----- From: "Random832" Sent: ?9/?1/?2016 16:31 To: "python-dev at python.org" Subject: Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 On Thu, Sep 1, 2016, at 18:28, Steve Dower wrote: > This is a raw (bytes) IO class that requires text to be passed encoded > with utf-8, which will be decoded to utf-16-le and passed to the Windows APIs. > Similarly, bytes read from the class will be provided by the operating > system as utf-16-le and converted into utf-8 when returned to Python. What happens if a character is broken across a buffer boundary? e.g. 
if someone tries to read or write one byte at a time (you can't do a partial read of zero bytes, there's no way to distinguish that from an EOF.) Is there going to be a higher-level text I/O class that bypasses the UTF-8 encoding step when the underlying bytes stream is a console? What if we did that but left the encoding as mbcs? I.e. the console is a text stream that can magically handle characters that aren't representable in its encoding. Note that if anything does os.read/write to the console's file descriptors, they're gonna get MBCS and there's nothing we can do about it. From ncoghlan at gmail.com Thu Sep 1 22:38:27 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 2 Sep 2016 12:38:27 +1000 Subject: [Python-Dev] PEP 526 ready for review: Syntax for Variable and Attribute Annotations In-Reply-To: <20160901162117.GX26300@ando.pearwood.info> References: <20160901162117.GX26300@ando.pearwood.info> Message-ID: On 2 September 2016 at 02:21, Steven D'Aprano wrote: > Unless I've missed something, there's no way to pre-declare an instance > attribute without specifying a type. (Even if that type is Any.) So how > about we allow None as a type-hint on its own: > > NAME: None None already has a meaning as an annotation - it's a shorthand for "type(None)". While for variables and parameters, that's usually only seen in combination with Union, and even though Union[T, None] has a preferred spelling as Optional[T], there's also the "-> None" case to specify that a function doesn't return a value. Having "-> None" mean "no return value" and "NAME: None" mean "infer type from later assignment" would be quite confusing.
However, a standalone Ellipsis doesn't currently have a meaning as a type annotation (it's only meaningful when subscripting Tuple and Callable), so a spelling like this might work: NAME: ... That spelling could then also be used in function definitions to say "infer the return type from the return statements rather than assuming Any":

    def inferred_return_type() -> ...:
        return some_other_function()

Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Fri Sep 2 04:58:28 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 2 Sep 2016 09:58:28 +0100 Subject: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8 In-Reply-To: References: Message-ID: On 1 September 2016 at 23:31, Steve Dower wrote: [...] > As a result, POSIX-focused library authors prefer to use bytes to represent > paths. A minor point, but in my experience, a lot of POSIX-focused authors are happy to move to a better text/bytes separation, so I'd soften this to "some POSIX-focused library authors...". Other than that minor point, this looks great - +1 from me. Paul From njs at pobox.com Fri Sep 2 05:13:34 2016 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 2 Sep 2016 02:13:34 -0700 Subject: [Python-Dev] PEP 525, third round, better finalization In-Reply-To: References: Message-ID: On Thu, Sep 1, 2016 at 3:34 PM, Yury Selivanov wrote: > Hi, > > I've spent quite a while thinking and experimenting with PEP 525 trying to > figure out how to make asynchronous generators (AG) finalization reliable. > I've tried to replace the callback for GCed with a callback to intercept > first iteration of AGs. Turns out it's very hard to work with weak-refs and > make asyncio event loop to reliably track and shutdown all open AGs. > > My new approach is to replace the "sys.set_asyncgen_finalizer(finalizer)" > function with "sys.set_asyncgen_hooks(firstiter=None, finalizer=None)".
1) Can/should these hooks be used by other types besides async generators? (e.g., async iterators that are not async generators?) What would that look like?

2) In the asyncio design it's legal for an event loop to be stopped and then started again. Currently (I guess for this reason?) asyncio event loops do not forcefully clean up resources associated with them on shutdown. For example, if I open a StreamReader, loop.stop() and loop.close() will not automatically close it for me. When, concretely, are you imagining that asyncio will run these finalizers?

3) Should the cleanup code in the generator be able to distinguish between "this iterator has left scope" versus "the event loop is being violently shut down"?

4) More fundamentally -- this revision is definitely an improvement, but it doesn't really address the main concern I have. Let me see if I can restate it more clearly. Let's define 3 levels of cleanup handling:

Level 0: resources (e.g. file descriptors) cannot be reliably cleaned up.
Level 1: resources are cleaned up reliably, but at an unpredictable time.
Level 2: resources are cleaned up both reliably and promptly.

In Python 3.5, unless you're very anal about writing cumbersome 'async with' blocks around every single 'async for', resources owned by async iterators land at level 0. (Because the only cleanup method available is __del__, and __del__ cannot make async calls, so if you need async calls to do clean up then you're just doomed.) I think the revised draft does a good job of moving async generators from level 0 to level 1 -- the finalizer hook gives a way to effectively call back into the event loop from __del__, and the shutdown hook gives us a way to guarantee that the cleanup happens while the event loop is still running. But... IIUC, it's now generally agreed that for Python code, level 1 is simply *not good enough*.
(Or to be a little more precise, it's good enough for the case where the resource being cleaned up is memory, because the garbage collector knows when memory is short, but it's not good enough for resources like file descriptors.) The classic example of this is code like:

    # used to be good, now considered poor style:
    def get_file_contents(path):
        handle = open(path)
        return handle.read()

This works OK on CPython because the reference-counting gc will call handle.__del__() at the end of the scope (so on CPython it's at level 2), but it famously causes huge problems when porting to PyPy with its much faster and more sophisticated gc that only runs when triggered by memory pressure. (Or for "PyPy" you can substitute "Jython", "IronPython", whatever.) Technically this code doesn't actually "leak" file descriptors on PyPy, because handle.__del__() will get called *eventually* (this code is at level 1, not level 0), but by the time "eventually" arrives your server process has probably run out of file descriptors and crashed. Level 1 isn't good enough. So now we have all learned to instead write

    # good modern Python style:
    def get_file_contents(path):
        with open(path) as handle:
            return handle.read()

and we have fancy tools like the ResourceWarning machinery to help us catch these bugs. Here's the analogous example for async generators. This is a useful, realistic async generator, that lets us incrementally read from a TCP connection that streams newline-separated JSON documents:

    async def read_json_lines_from_server(host, port):
        reader, writer = await asyncio.open_connection(host, port)
        async for line in reader:
            yield json.loads(line)

You would expect to use this like:

    async for data in read_json_lines_from_server(host, port):
        ...
BUT, with the current PEP 525 proposal, trying to use this generator in this way is exactly analogous to the open(path).read() case: on CPython it will work fine -- the generator object will leave scope at the end of the 'async for' loop, cleanup methods will be called, etc. But on PyPy, the weakref callback will not be triggered until some arbitrary time later, you will "leak" file descriptors, and your server will crash. For correct operation, you have to replace the simple 'async for' loop with this lovely construct:

    async with aclosing(read_json_lines_from_server(host, port)) as ait:
        async for data in ait:
            ...

Of course, you only have to do this on loops whose iterator might potentially hold resources like file descriptors, either currently or in the future. So... uh... basically that's all loops, I guess? If you want to be a good defensive programmer? Conclusion: if you care about PyPy support then AFAICT the current PEP 525 cleanup design doesn't provide any benefits -- you still have to write exactly the same cumbersome defensive code as you would if the finalizer hooks were left out entirely. If anything, the PEP 525 finalizer hooks are actually harmful, because they encourage people to write CPython-specific code that blows up in hard-to-test-for ways on PyPy. As a practical note, this is particularly concerning since the impression I got from PyCon this year is that PyPy's big production use case is running big async network servers. Currently these are on twisted, but PyPy landed asyncio/async/await support, like, last week: https://pypy35syntax.blogspot.com/ and on top of that they just got at least $250k in funding to further polish their Python 3 support, so people are going to be actually running code like my example on PyPy very soon now. tl;dr: AFAICT this revision of PEP 525 is enough to make it work reliably on CPython, but I have serious concerns that it bakes a CPython-specific design into the language.
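(For reference, the aclosing() helper used in the construct above is not in the stdlib at the time of writing; the name is borrowed from contextlib.closing, and a minimal sketch of what it has to do is:)

```python
class aclosing:
    # Minimal sketch of an async analogue of contextlib.closing;
    # the stdlib has no such helper yet, so the name is hypothetical here.
    def __init__(self, agen):
        self._agen = agen

    async def __aenter__(self):
        return self._agen

    async def __aexit__(self, exc_type, exc, tb):
        # Deterministically run the generator's cleanup (its 'finally'
        # blocks etc.) instead of waiting for the garbage collector.
        await self._agen.aclose()
```

With a wrapper like this, cleanup happens at the end of the 'async with' block on every implementation, not just on a refcounting one.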
I would prefer a design that actually aims for "level 2" cleanup semantics (for example, [1]) -n [1] https://mail.python.org/pipermail/python-ideas/2016-August/041868.html -- Nathaniel J. Smith -- https://vorpus.org From p.f.moore at gmail.com Fri Sep 2 05:23:04 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 2 Sep 2016 10:23:04 +0100 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: References: <1472772509.386843.713295953.53162396@webmail.messagingengine.com> Message-ID: On 2 September 2016 at 03:35, Steve Dower wrote: > I'd need to test to be sure, but writing an incomplete code point should > just truncate to before that point. It may currently raise OSError if that > truncated to zero length, as I believe that's not currently distinguished > from an error. What behavior would you propose? For "correct" behaviour, you should retain the unwritten bytes, and write them as part of the next call (essentially making the API stateful, in the same way that incremental codecs work). I'm pretty sure that this could cause actual problems, for example I think invoke (https://github.com/pyinvoke/invoke) gets byte streams from subprocesses and dumps them direct to stdout in blocks (so could easily end up splitting multibyte sequences). It's arguable that it should be decoding the bytes from the subprocess and then re-encoding them, but that gets us into "guess the encoding used by the subprocess" territory. The problem is that we're not going to simply drop some bad data in the common case - it's not so much the dropping of the start of an incomplete code point that bothers me, as the encoding error you hit at the start of the *next* block of data you send. So people will get random, unexplained, encoding errors. I don't see an easy answer here other than a stateful API. > Reads of less than four bytes fail instantly, as in the worst case we need > four bytes to represent one Unicode character.
This is an unfortunate > reality of trying to limit it to one system call - you'll never get a full > buffer from a single read, as there is no simple mapping between > length-as-utf8 and length-as-utf16 for an arbitrary string. And here - "read a single byte" is a not uncommon way of getting some data. Once again see invoke: https://github.com/pyinvoke/invoke/blob/master/invoke/platform.py#L147 used at https://github.com/pyinvoke/invoke/blob/master/invoke/runners.py#L548 I'm not saying that there's an easy answer here, but this *will* break code. And actually, it's in violation of the documentation: see https://docs.python.org/3/library/io.html#io.RawIOBase.read """ read(size=-1) Read up to size bytes from the object and return them. As a convenience, if size is unspecified or -1, readall() is called. Otherwise, only one system call is ever made. Fewer than size bytes may be returned if the operating system call returns fewer than size bytes. If 0 bytes are returned, and size was not 0, this indicates end of file. If the object is in non-blocking mode and no bytes are available, None is returned. """ You're not allowed to return 0 bytes if the requested size was not 0, and you're not at EOF. Having said all this, I'm strongly +1 on the idea of this PEP, it would be fantastic to resolve the above issues and get this in. Paul From levkivskyi at gmail.com Fri Sep 2 09:43:40 2016 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Fri, 2 Sep 2016 15:43:40 +0200 Subject: [Python-Dev] PEP 526 ready for review: Syntax for Variable and Attribute Annotations In-Reply-To: References: <20160901162117.GX26300@ando.pearwood.info> Message-ID: On 1 September 2016 at 22:37, Guido van Rossum wrote: > On Thu, Sep 1, 2016 at 9:30 AM, Ivan Levkivskyi wrote: > > There is a convention for function annotations in PEP 484 that a missing > > annotation is equivalent to Any, so that I like your first option more. 
> > But Steven wasn't proposing it to mean Any, he was proposing it to > mean "type checker should infer". Where I presume the inference should > be done based on the assignment in __init__ only. Sorry for misunderstanding. On 2 September 2016 at 04:38, Nick Coghlan wrote: > However, a standalone Ellipsis doesn't currently have a meaning as a > type annotation (it's only meaningful when subscripting Tuple and > Callable), so a spelling like this might work: > > NAME: ... > > That spelling could then also be used in function definitions to say > "infer the return type from the return statements rather than assuming > Any" Interesting idea. This is somehow similar to one of the existing uses of Ellipsis: in numpy it infers how many dimensions the full slice needs to have; it is like saying "You know what I mean". So I am +1 on this solution. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at hotpy.org Fri Sep 2 09:47:00 2016 From: mark at hotpy.org (Mark Shannon) Date: Fri, 2 Sep 2016 14:47:00 +0100 Subject: [Python-Dev] Please reject or postpone PEP 526 Message-ID: <57C982D4.1060405@hotpy.org> Hi everyone, I think we should reject, or at least postpone PEP 526. PEP 526 represents a major change to the language, however there are, I believe, a number of technical flaws with the PEP. It is probable that with significant revisions it can be a worthwhile addition to the language, but that cannot happen in time for 3.6 beta 1 (in 11 days). PEP 526 claims to be an extension of PEP 484, but I don't think that is entirely correct. PEP 484 was primarily about defining the semantics of pre-existing syntax. PEP 526 is about adding new syntax. Historically the bar for adding new syntax has been set very high. I don't think that PEP 526, in its current form, reaches that bar. Below is a list of the issues I have with the PEP as it stands.
In many cases it makes it more effort than type comments
========================================================

Type hints should be as easy to use as possible, and that means pushing as much work as possible onto the checker, and not burdening the programmer.

Attaching type hints to variables, rather than expressions, reduces the potential for inference. This makes it harder for the programmer, but easier for the checker, which is the wrong way around. For example, given a function:

    def spam(x: Optional[List[int]]) -> None: ...

With type comments, this is intuitively correct and should type check:

    def eggs(cond: bool):
        if cond:
            x = None
        else:
            x = []  # type: List[int]
        spam(x)  # Here we can infer the type of x

With PEP 526 we lose the ability to infer types.

    def eggs(cond: bool):
        if cond:
            x = None  # Not legal due to type declaration below
        else:
            x: List[int] = []
        spam(x)

So we need to use a more complex type:

    def eggs(cond: bool):
        x: Optional[List[int]]
        if cond:
            x = None  # Now legal
        else:
            x = []
        spam(x)

I don't think this improves readability. Whether this is an acceptable change is debatable, but it does need some debate.

It limits the use of variables
==============================

In Python a name (variable) is just a binding that refers to an object. A name only exists in a meaningful sense once an object has been assigned to it. Any attempt to use that name, without an object bound to it, will result in a NameError.

PEP 526 makes variables more than just bindings, as any rebinding must conform to the given type. This loses us some of the dynamism for which we all love Python. Quoting from the PEP:
```
a: int
a: str # Static type checker will warn about this.
```
In other words, it is illegal for a checker to split up the variable, even though it is straightforward to do so. However, without the type declarations,
```
a = 1
a = "Hi"
```
is just fine. Useless, but fine.

We should be free to add extra variables, whenever we choose, for clarity.
For example,

    total = foo() - bar()

should not be treated differently from:

    revenue = foo()
    tax = bar()
    total = revenue - tax

If types are inferred, there is no problem. However, if they must be declared, then the use of meaningfully named variables is discouraged.

[A note about type-inference: Type inference is not a universal panacea, but it can make life a lot easier for programmers in statically typed languages. Languages like C# use local type inference extensively and it means that many variables often do not need their type declared. We should take care not to limit the ability of checkers to infer values and types and make programmers' lives easier. Within a function, type inference is near perfect, failing only occasionally for some generic types. One place where type inference definitely breaks down is across calls, which is why PEP 484 is necessary.]

It is premature
===============

There are still plenty of issues to iron out w.r.t. PEP 484 types. I don't think we should be adding more syntax, until we have a *precise* idea of what is required. PEP 484 states: "If type hinting proves useful in general, a syntax for typing variables may be provided in a future Python version." Has it proved useful in general? I don't think it has. Maybe it will in future, but it hasn't yet.

It seems confused about class attributes and instance attributes
================================================================

The PEP also includes a section on how to define class attributes and instance attributes. It seems that everything needs to be defined in the class scope, even if it is not an attribute of the class, but of its instances. This seems confusing, both to the human reader and to a machine analyser.
Example from PEP 526:

    class Starship:

        captain: str = 'Picard'
        damage: int
        stats: ClassVar[Dict[str, int]] = {}

        def __init__(self, damage: int, captain: str = None):
            self.damage = damage
            if captain:
                self.captain = captain  # Else keep the default

With type hints as they currently exist, the same code is shorter and doesn't contaminate the class namespace with the 'damage' attribute.

    class Starship:

        captain = 'Picard'
        stats = {}  # type: Dict[str, int]

        def __init__(self, damage: int, captain: str = None):
            self.damage = damage  # Can infer type as int
            if captain:
                self.captain = captain  # Can infer type as str

This isn't an argument against adding type syntax for attributes in general, just that the form suggested in PEP 526 doesn't seem to follow Python semantics. One could imagine applying minimal PEP 526 style hints, with standard Python semantics and relying on type inference, as follows:

    class Starship:

        captain = 'Picard'
        stats: Dict[str, int] = {}

        def __init__(self, damage: int, captain: str = None):
            self.damage = damage
            if captain:
                self.captain = captain

The PEP overstates the existing use of static typing in Python
==============================================================

Finally, in the rejected proposal section, under "Should we introduce variable annotations at all?" it states that "Variable annotations have already been around for almost two years in the form of type comments, sanctioned by PEP 484." I don't think that this is entirely true. PEP 484 was about the syntax for types, declaring parameter and return types, and declaring custom types to be generic. PEP 484 does include a description of type comments, but they are always annotations on assignment statements and were primarily intended for use in stub files.

Please don't turn Python into some sort of inferior Java. There is potential in this PEP, but in its current form I think it should be rejected.

Cheers,
Mark.
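[A concrete data point on the class-namespace question: under the PEP 526 reference implementation, a bare annotation in a class body only records an entry in __annotations__ and does not bind a class attribute, which can be checked directly once the branch is installed:]

```python
# Behaviour under the PEP 526 reference implementation (and Python 3.6+):
class Starship:
    captain: str = 'Picard'
    damage: int  # annotation only; no value is assigned

print(Starship.__annotations__)     # has entries for both 'captain' and 'damage'
print(Starship.captain)             # 'Picard'
print(hasattr(Starship, 'damage'))  # False: no class attribute is created
```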
From rymg19 at gmail.com Fri Sep 2 11:03:54 2016 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Fri, 2 Sep 2016 10:03:54 -0500 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: <57C982D4.1060405@hotpy.org> References: <57C982D4.1060405@hotpy.org> Message-ID: On Sep 2, 2016 8:51 AM, "Mark Shannon" wrote: > > Hi everyone, > > I think we should reject, or at least postpone PEP 526. > > PEP 526 represents a major change to the language, however there are, I believe, a number of technical flaws with the PEP. > > It is probable that with significant revisions it can be a worthwhile addition to the language, but that cannot happen in time for 3.6 beta 1 (in 11 days). > > PEP 526 claims to be an extension of PEP 484, but I don't think that is entirely correct. > PEP 484 was primarily about defining the semantics of pre-existing syntax. PEP 526 is about adding new syntax. > Historically the bar for adding new syntax has been set very high. I don't think that PEP 526, in its current form, reaches that bar. > > Below is a list of the issues I have with the PEP as it stands. > > In many cases it makes it more effort than type comments > ======================================================== > > Type hints should be as easy to use as possible, and that means pushing as much work as possible onto the checker, and not burdening the programmer. > > Attaching type hints to variables, rather than expressions, reduces the potential for inference. This makes it harder for programmer, but easier for the checker, which is the wrong way around. > > For example,, given a function: > def spam(x: Optional[List[int]])->None: ... > > With type comments, this is intuitively correct and should type check: > def eggs(cond:bool): > if cond: > x = None > else: > x = [] # type: List[int] > spam(x) # Here we can infer the type of x > > With PEP 526 we loose the ability to infer types. 
> def eggs(cond:bool): > if cond: > x = None # Not legal due to type declaration below > else: > x: List[int] = [] > spam(x) > > So we need to use a more complex type > def eggs(cond:bool): > x: Optional[List[int]] > if cond: > x = None # Now legal > else: > x: = [] > spam(x) > > I don't think this improves readability. > Whether this is an acceptable change is debatable, but it does need some debate. > > It limits the use of variables > ============================== > > In Python a name (variable) is just a binding that refers to an object. > A name only exists in a meaningful sense once an object has been assigned to it. Any attempt to use that name, without an object bound to it, will result in a NameError. > > PEP 526 makes variables more than just bindings, as any rebinding must conform to the given type. This looses us some of the dynamism for which we all love Python. > > Quoting from the PEP: > ``` > a: int > a: str # Static type checker will warn about this. > ``` > In other words, it is illegal for a checker to split up the variable, even though it is straightforward to do so. > > However, without the type declarations, > ``` > a = 1 > a = "Hi" > ``` > is just fine. Useless, but fine. > But isn't that the same way with type comments? Except uglier? > We should be free to add extra variables, whenever we choose, for clarity. For example, > total = foo() - bar() > should not be treated differently from: > revenue = foo() > tax = bar() > total = revenue - tax > > If types are inferred, there is no problem. > However, if they must be declared, then the use of meaningfully named variables is discouraged. > > [A note about type-inference: > Type inference is not a universal panacea, but it can make life a lot easier for programmers in statically type languages. > Languages like C# use local type inference extensively and it means that many variables often do not need their type declared. 
We should take care not to limit the ability of checkers to infer values and types and make programmers' lives easier. > Within a function, type inference is near perfect, failing only occasionally for some generic types. > One place where type inference definitely breaks down is across calls, which is why PEP 484 is necessary. > ] > > It is premature > =============== > > There are still plenty of issues to iron out w.r.t. PEP 484 types. I don't think we should be adding more syntax, until we have a *precise* idea of what is required. > > PEP 484 states: > "If type hinting proves useful in general, a syntax for typing variables may be provided in a future Python version." > Has it proved useful in general? I don't think it has. Maybe it will in future, but it hasn't yet. > > It seems confused about class attributes and instance attributes > ================================================================ > > The PEP also includes a section of how to define class attributes and instance attributes. It seems that everything needs to be defined in the class scope, even it is not an attribute of the class, but of its instances. This seems confusing, both to human reader and machine analyser. > > Example from PEP 526: > > class Starship: > > captain: str = 'Picard' > damage: int > stats: ClassVar[Dict[str, int]] = {} > > def __init__(self, damage: int, captain: str = None): > self.damage = damage > if captain: > self.captain = captain # Else keep the default > > With type hints as they currently exist, the same code is shorter and > doesn't contaminate the class namespace with the 'damage' attribute. 
> > class Starship: > > captain = 'Picard' > stats = {} # type: Dict[str, int] > > def __init__(self, damage: int, captain: str = None): > self.damage = damage # Can infer type as int > if captain: > self.captain = captain # Can infer type as str > > > This isn't an argument against adding type syntax for attributes in general, just that the form suggested in PEP 526 doesn't seem to follow Python semantics. > > One could imagine applying minimal PEP 526 style hints, with standard Python semantics and relying on type inference, as follows: > > class Starship: > > captain = 'Picard' > stats: Dict[str, int] = {} > > def __init__(self, damage: int, captain: str = None): > self.damage = damage > if captain: > self.captain = captain > > The PEP overstates the existing use of static typing in Python > ============================================================== > > Finally, in the rejected proposal section, under "Should we introduce variable annotations at all?" it states that "Variable annotations have already been around for almost two years in the form of type comments, sanctioned by PEP 484." > I don't think that this is entirely true. > PEP 484 was about the syntax for types, declaring parameter and return types, and declaring custom types to be generic. > PEP 484 does include a description of type comments, but they are always annotations on assignment statements and were primarily intended for use in stub files. > > > > Please don't turn Python into some sort of inferior Java. I think we're fine; there aren't any `AbstractAccountManagerInterfacePageSQLDatabaseConnPipeOrientedNewVersionProtocol`s. > There is potential in this PEP, but in its current form I think it should be rejected. > > Cheers, > Mark. 
> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com -- Ryan [ERROR]: Your autotools build scripts are 200 lines longer than your program. Something's wrong. http://kirbyfan64.github.io/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Amresh.Sajjanshetty at netapp.com Fri Sep 2 02:49:31 2016 From: Amresh.Sajjanshetty at netapp.com (Sajjanshetty, Amresh) Date: Fri, 2 Sep 2016 06:49:31 +0000 Subject: [Python-Dev] Need help in debugging the python core Message-ID: Dear All, I'm using asyncio and paramiko to multiplex different channels into a single SSH connection. Things were working fine until recently, but suddenly Python started crashing whenever I tried to write to the channel. I have very limited knowledge of how the Python interpreter works, so I'm finding it difficult to understand the stack trace. Can you please help me understand the backtrace below? bash-4.2$ gdb /usr/software/bin/python3.4.3 core.60015 Traceback (most recent call last): File "", line 70, in File "", line 67, in GdbSetPythonDirectory File "/usr/software/share/gdb/python/gdb/__init__.py", line 19, in import _gdb ImportError: No module named _gdb GNU gdb (GDB) 7.5 Copyright (C) 2012 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-unknown-linux-gnu". For bug reporting instructions, please see: ... Reading symbols from /usr/software/bin/python3.4.3...done. warning: core file may not match specified executable file.
[New LWP 60015] [New LWP 60018] [New LWP 60019] [New LWP 60020] [New LWP 60021] [New LWP 60022] [New LWP 60023] [New LWP 60024] [Thread debugging using libthread_db enabled] Using host libthread_db library "/usr/software/lib/libthread_db.so.1". Core was generated by `/usr/software/bin/python3.4.3 /x/eng/bbrtp/users/amresh/sshproxy_3896926_160824'. Program terminated with signal 11, Segmentation fault. #0 _PyObject_Malloc (ctx=0x0, nbytes=52) at Objects/obmalloc.c:1159 1159 Objects/obmalloc.c: No such file or directory. (gdb) bt #0 _PyObject_Malloc (ctx=0x0, nbytes=52) at Objects/obmalloc.c:1159 #1 0x00007ff2e511474a in PyUnicode_New (maxchar=, size=3) at Objects/unicodeobject.c:1093 #2 PyUnicode_New (size=3, maxchar=) at Objects/unicodeobject.c:1033 #3 0x00007ff2e5139da2 in _PyUnicodeWriter_PrepareInternal (writer=writer at entry=0x7fff3d5c8640, length=, maxchar=, maxchar at entry=127) at Objects/unicodeobject.c:13327 #4 0x00007ff2e513f38b in PyUnicode_DecodeUTF8Stateful (s=s at entry=0x7ff2e3572f78 "tcp\245reuse\001\253socket_type\244pull\251transport\246zeromq", size=size at entry=3, errors=errors at entry=0x7ff2dee5dd70 "strict", consumed=consumed at entry=0x0) at Objects/unicodeobject.c:4757 #5 0x00007ff2e5140690 in PyUnicode_Decode (s=0x7ff2e3572f78 "tcp\245reuse\001\253socket_type\244pull\251transport\246zeromq", size=3, encoding=0x7ff2dee5df28 "utf-8", errors=0x7ff2dee5dd70 "strict") at Objects/unicodeobject.c:3012 #6 0x00007ff2de49bfdf in unpack_callback_raw (o=, l=3, p=0x7ff2e3572f78 "tcp\245reuse\001\253socket_type\244pull\251transport\246zeromq", u=0x7fff3d5c8840, b=) at msgpack/unpack.h:229 #7 unpack_execute (ctx=ctx at entry=0x7fff3d5c8840, data=0x7ff2e3572ec0 "\205\245_auth\300\245_call\246expect\243_i", , len=, off=off at entry=0x7fff3d5c8820) at msgpack/unpack_template.h:312 #8 0x00007ff2de49fe3d in __pyx_pf_7msgpack_9_unpacker_2unpackb (__pyx_v_packed=__pyx_v_packed at entry=0x7ff2e3572ea0, __pyx_v_object_hook=__pyx_v_object_hook at 
entry=0x7ff2e54934b0 <_Py_NoneStruct>, __pyx_v_list_hook=__pyx_v_list_hook at entry=0x7ff2e54934b0 <_Py_NoneStruct>, __pyx_v_use_list=1, __pyx_v_encoding=0x7ff2dee5df08, __pyx_v_unicode_errors=0x7ff2dee5dd50, __pyx_v_object_pairs_hook=0x7ff2e54934b0 <_Py_NoneStruct>, __pyx_v_ext_hook=0x13db2d8, __pyx_v_max_str_len=__pyx_v_max_str_len at entry=2147483647, __pyx_v_max_bin_len=__pyx_v_max_bin_len at entry=2147483647, __pyx_v_max_array_len=2147483647, __pyx_v_max_map_len=2147483647, __pyx_v_max_ext_len=__pyx_v_max_ext_len at entry=2147483647, __pyx_self=) at msgpack/_unpacker.pyx:139 #9 0x00007ff2de4a1395 in __pyx_pw_7msgpack_9_unpacker_3unpackb (__pyx_self=, __pyx_args=, __pyx_kwds=) at msgpack/_unpacker.pyx:102 #10 0x00007ff2e5174ed3 in do_call (nk=, na=, pp_stack=0x7fff3d5d2b80, func=0x7ff2df20ddc8) at Python/ceval.c:4463 #11 call_function (oparg=, pp_stack=0x7fff3d5d2b80) at Python/ceval.c:4264 #12 PyEval_EvalFrameEx (f=f at entry=0x7ff2def02208, throwflag=throwflag at entry=0) at Python/ceval.c:2838 #13 0x00007ff2e5175f45 in PyEval_EvalCodeEx (_co=, globals=, locals=locals at entry=0x0, args=, argcount=argcount at entry=1, kws=0x7ff2deefec30, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at Python/ceval.c:3588 #14 0x00007ff2e51734da in fast_function (nk=, na=1, n=, pp_stack=0x7fff3d5d2e10, func=0x7ff2dee9b7b8) at Python/ceval.c:4344 #15 call_function (oparg=, pp_stack=0x7fff3d5d2e10) at Python/ceval.c:4262 #16 PyEval_EvalFrameEx (f=f at entry=0x7ff2deefea98, throwflag=throwflag at entry=0) at Python/ceval.c:2838 #17 0x00007ff2e5175f45 in PyEval_EvalCodeEx (_co=, globals=, locals=locals at entry=0x0, args=, argcount=argcount at entry=1, kws=0x14566c8, kwcount=0, defs=0x7ff2deeaedb8, defcount=1, kwdefs=0x0, closure=0x0) at Python/ceval.c:3588 #18 0x00007ff2e51734da in fast_function (nk=, na=1, n=, pp_stack=0x7fff3d5d30a0, func=0x7ff2dee2dd90) at Python/ceval.c:4344 #19 call_function (oparg=, pp_stack=0x7fff3d5d30a0) at Python/ceval.c:4262 #20 
PyEval_EvalFrameEx (f=f at entry=0x1456478, throwflag=throwflag at entry=0) at Python/ceval.c:2838 #21 0x00007ff2e5175f45 in PyEval_EvalCodeEx (_co=, globals=, locals=locals at entry=0x0, args=args at entry=0x7ff2d87364c0, argcount=1, kws=kws at entry=0x7ff2dee1de40, kwcount=kwcount at entry=3, defs=defs at entry=0x7ff2e0820fd8, defcount=defcount at entry=3, kwdefs=0x0, closure=0x0) at Python/ceval.c:3588 #22 0x00007ff2e50d3320 in function_call (func=0x7ff2df1e9a60, arg=0x7ff2d87364a8, kw=0x7ff2d8738248) at Objects/funcobject.c:632 #23 0x00007ff2e50a76ca in PyObject_Call (func=func at entry=0x7ff2df1e9a60, arg=arg at entry=0x7ff2d87364a8, kw=kw at entry=0x7ff2d8738248) at Objects/abstract.c:2040 #24 0x00007ff2e50be55d in method_call (func=0x7ff2df1e9a60, arg=0x7ff2d87364a8, kw=0x7ff2d8738248) at Objects/classobject.c:347 #25 0x00007ff2e50a76ca in PyObject_Call (func=0x7ff2dee30e88, arg=arg at entry=0x7ff2e433d048, kw=kw at entry=0x7ff2d8738248) at Objects/abstract.c:2040 #26 0x00007ff2e51d9301 in partial_call (pto=0x7ff2deee1db8, args=, kw=0x0) at ./Modules/_functoolsmodule.c:127 #27 0x00007ff2e50a76ca in PyObject_Call (func=func at entry=0x7ff2deee1db8, arg=arg at entry=0x7ff2e433d048, kw=kw at entry=0x0) at Objects/abstract.c:2040 #28 0x00007ff2e51700a0 in ext_do_call (nk=-466366392, na=0, flags=, pp_stack=0x7fff3d5d3540, func=0x7ff2deee1db8) at Python/ceval.c:4561 #29 PyEval_EvalFrameEx (f=, throwflag=throwflag at entry=0) at Python/ceval.c:2878 #30 0x00007ff2e51756a9 in fast_function (nk=, na=1, n=1, pp_stack=0x7fff3d5d3710, func=0x7ff2e1540730) at Python/ceval.c:4334 #31 call_function (oparg=, pp_stack=0x7fff3d5d3710) at Python/ceval.c:4262 #32 PyEval_EvalFrameEx (f=, throwflag=throwflag at entry=0) at Python/ceval.c:2838 #33 0x00007ff2e51756a9 in fast_function (nk=, na=1, n=1, pp_stack=0x7fff3d5d38f0, func=0x7ff2e12f2f28) at Python/ceval.c:4334 #34 call_function (oparg=, pp_stack=0x7fff3d5d38f0) at Python/ceval.c:4262 #35 PyEval_EvalFrameEx (f=, 
throwflag=throwflag@entry=0) at Python/ceval.c:2838
#36 0x00007ff2e51756a9 in fast_function (nk=<optimized out>, na=1, n=1, pp_stack=0x7fff3d5d3ad0, func=0x7ff2e12f0c80) at Python/ceval.c:4334
#37 call_function (oparg=<optimized out>, pp_stack=0x7fff3d5d3ad0) at Python/ceval.c:4262
#38 PyEval_EvalFrameEx (f=<optimized out>, throwflag=throwflag@entry=0) at Python/ceval.c:2838
#39 0x00007ff2e51756a9 in fast_function (nk=<optimized out>, na=1, n=1, pp_stack=0x7fff3d5d3cb0, func=0x7ff2df1e9ae8) at Python/ceval.c:4334
#40 call_function (oparg=<optimized out>, pp_stack=0x7fff3d5d3cb0) at Python/ceval.c:4262
#41 PyEval_EvalFrameEx (f=f@entry=0xf796b8, throwflag=throwflag@entry=0) at Python/ceval.c:2838
#42 0x00007ff2e5175f45 in PyEval_EvalCodeEx (_co=_co@entry=0x7ff2e400c660, globals=globals@entry=0x7ff2e42df488, locals=locals@entry=0x7ff2e42df488, args=args@entry=0x0, argcount=argcount@entry=0, kws=kws@entry=0x0, kwcount=kwcount@entry=0, defs=defs@entry=0x0, defcount=defcount@entry=0, kwdefs=kwdefs@entry=0x0, closure=closure@entry=0x0) at Python/ceval.c:3588
#43 0x00007ff2e517601b in PyEval_EvalCode (co=co@entry=0x7ff2e400c660, globals=globals@entry=0x7ff2e42df488, locals=locals@entry=0x7ff2e42df488) at Python/ceval.c:775
#44 0x00007ff2e519c09e in run_mod (arena=0xfc2990, flags=0x7fff3d5d3f50, locals=0x7ff2e42df488, globals=0x7ff2e42df488, filename=0x7ff2e41d64b0, mod=0x103b4f0) at Python/pythonrun.c:2180
#45 PyRun_FileExFlags (fp=fp@entry=0xf77b60, filename_str=filename_str@entry=0x7ff2e41d81d0 "/x/eng/bbrtp/users/amresh/sshproxy_3896926_1608240818/test/nate/lib/NATE/Service/SSHProxy.py", start=start@entry=257, globals=globals@entry=0x7ff2e42df488, locals=locals@entry=0x7ff2e42df488, closeit=closeit@entry=1, flags=flags@entry=0x7fff3d5d3f50) at Python/pythonrun.c:2133
#46 0x00007ff2e519ced5 in PyRun_SimpleFileExFlags (fp=fp@entry=0xf77b60, filename=<optimized out>, closeit=closeit@entry=1, flags=flags@entry=0x7fff3d5d3f50) at Python/pythonrun.c:1606
---Type <return> to continue, or q
<return> to quit---
#47 0x00007ff2e519df09 in PyRun_AnyFileExFlags (fp=fp@entry=0xf77b60, filename=<optimized out>, closeit=closeit@entry=1, flags=flags@entry=0x7fff3d5d3f50) at Python/pythonrun.c:1292
#48 0x00007ff2e51b6af5 in run_file (p_cf=0x7fff3d5d3f50, filename=0xf0e7f0 L"/x/eng/bbrtp/users/amresh/sshproxy_3896926_1608240818/test/nate/lib/NATE/Service/SSHProxy.py", fp=0xf77b60) at Modules/main.c:319
#49 Py_Main (argc=argc@entry=6, argv=argv@entry=0xee6010) at Modules/main.c:751
#50 0x0000000000400aa6 in main (argc=6, argv=<optimized out>) at ./Modules/python.c:69
(gdb)

Thanks and Regards,
Amresh
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From guido at python.org  Fri Sep 2 11:59:25 2016
From: guido at python.org (Guido van Rossum)
Date: Fri, 2 Sep 2016 08:59:25 -0700
Subject: [Python-Dev] Please reject or postpone PEP 526
In-Reply-To: <57C982D4.1060405@hotpy.org>
References: <57C982D4.1060405@hotpy.org>
Message-ID: 

On Fri, Sep 2, 2016 at 6:47 AM, Mark Shannon wrote:
> Hi everyone,
>
> I think we should reject, or at least postpone PEP 526.
>
> PEP 526 represents a major change to the language; however, there are, I
> believe, a number of technical flaws with the PEP.
>
> It is probable that with significant revisions it can be a worthwhile
> addition to the language, but that cannot happen in time for 3.6 beta 1
> (in 11 days).
>
> PEP 526 claims to be an extension of PEP 484, but I don't think that is
> entirely correct.
> PEP 484 was primarily about defining the semantics of pre-existing syntax.
> PEP 526 is about adding new syntax.
> Historically the bar for adding new syntax has been set very high. I don't
> think that PEP 526, in its current form, reaches that bar.
>
> Below is a list of the issues I have with the PEP as it stands.
>
> In many cases it makes it more effort than type comments
> ========================================================
>
> Type hints should be as easy to use as possible, and that means pushing as
> much work as possible onto the checker, and not burdening the programmer.
>
> Attaching type hints to variables, rather than expressions, reduces the
> potential for inference. This makes it harder for the programmer, but
> easier for the checker, which is the wrong way around.
>
> For example, given a function:
>     def spam(x: Optional[List[int]]) -> None: ...
>
> With type comments, this is intuitively correct and should type check:
>     def eggs(cond: bool):
>         if cond:
>             x = None
>         else:
>             x = []  # type: List[int]
>         spam(x)  # Here we can infer the type of x
>
> With PEP 526 we lose the ability to infer types.
>     def eggs(cond: bool):
>         if cond:
>             x = None  # Not legal due to type declaration below
>         else:
>             x: List[int] = []
>         spam(x)
>
> So we need to use a more complex type:
>     def eggs(cond: bool):
>         x: Optional[List[int]]
>         if cond:
>             x = None  # Now legal
>         else:
>             x = []
>         spam(x)
>
> I don't think this improves readability.
> Whether this is an acceptable change is debatable, but it does need some
> debate.

It looks like you're misinterpreting the intent of the PEP. It is not meant
to legislate the behavior of the type checker in this way. In mypy, the
first example is already rejected because it wants the annotation on the
first occurrence. The plan is for mypy not to change its behavior -- the
old form

    TARGET = VALUE  # type: TYPE

will be treated the same way as the new form

    TARGET: TYPE = VALUE

(If you have a beef with what this means in mypy you should probably take
it up with mypy, not with PEP 526.)

> It limits the use of variables
> ==============================
>
> In Python a name (variable) is just a binding that refers to an object.
> A name only exists in a meaningful sense once an object has been assigned
> to it.
> Any attempt to use that name, without an object bound to it, will result
> in a NameError.

(Or UnboundLocalError, if the compiler knows there is an assignment to the
name anywhere in the same (function) scope.)

> PEP 526 makes variables more than just bindings, as any rebinding must
> conform to the given type. This loses us some of the dynamism for which
> we all love Python.

Thanks for catching this; that's not the intent.

> Quoting from the PEP:
> ```
> a: int
> a: str # Static type checker will warn about this.
> ```
> In other words, it is illegal for a checker to split up the variable,
> even though it is straightforward to do so.

One of my co-authors has gone too far here. The intent is not to legislate
what should happen in this case but to leave it to the checker. In mypy,
the equivalent syntax using type comments is currently indeed rejected,
but we're considering a change here
(https://github.com/python/mypy/issues/1174). The PEP 526 syntax will not
make a difference here.

> However, without the type declarations,
> ```
> a = 1
> a = "Hi"
> ```
> is just fine.

Useless, but fine. And believe me, I want to keep it this way. I will
amend the example and clarify the intent in the text.

> We should be free to add extra variables, whenever we choose, for clarity.
> For example,
>     total = foo() - bar()
> should not be treated differently from:
>     revenue = foo()
>     tax = bar()
>     total = revenue - tax
>
> If types are inferred, there is no problem.
> However, if they must be declared, then the use of meaningfully named
> variables is discouraged.

There is no mandate to declare variables! I actually see the main use of
variable annotations in class bodies, where it can create a lot of clarity
around which instance variables exist and what they mean.

> [A note about type-inference:
> Type inference is not a universal panacea, but it can make life a lot
> easier for programmers in statically typed languages.
> Languages like C# use local type inference extensively and it means that
> many variables often do not need their type declared. We should take care
> not to limit the ability of checkers to infer values and types and make
> programmers' lives easier.
> Within a function, type inference is near perfect, failing only
> occasionally for some generic types.
> One place where type inference definitely breaks down is across calls,
> which is why PEP 484 is necessary.
> ]

Totally agreed. But type annotations are not *just* for the checker. I am
regularly delighted when I find function annotations in code that I have
to read for the first time, because it helps my understanding. Many others
at Dropbox (where we have been doing a fairly large-scale experiment with
the introduction of mypy) agree.

> It is premature
> ===============
>
> There are still plenty of issues to iron out w.r.t. PEP 484 types. I don't
> think we should be adding more syntax, until we have a *precise* idea of
> what is required.
>
> PEP 484 states:
> "If type hinting proves useful in general, a syntax for typing variables
> may be provided in a future Python version."
> Has it proved useful in general? I don't think it has. Maybe it will in
> future, but it hasn't yet.

PEP 526 does not alter this situation. It doesn't define new types, only
new places where types can be used syntactically, and it is careful to
give them the same syntactic description as PEP 484 (it's an expression).

Practical use of mypy has shown that we use `# type` comments on variables
with some regularity, and not having them in the AST has been a problem
for tools. For example, we have had to teach flake8 and pylint about type
comments (so that we could continue to benefit from their "unused import"
and "undefined variable" tests), and in both cases it was a gross hack.
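[Editor's illustration: the tooling point above is easy to check first-hand with the stdlib `ast` module. The snippet strings below are mine, not from the thread; they show that a PEP 526 annotation surfaces as a dedicated `ast.AnnAssign` node, while a type comment simply vanishes during parsing.]

```python
import ast

# PEP 526 annotations are ordinary syntax, so they show up in the AST
# as ast.AnnAssign nodes that tools can inspect directly.
annotated = ast.parse("x: int = 1").body[0]
print(type(annotated).__name__)   # AnnAssign
print(annotated.annotation.id)    # int

# A type comment, by contrast, is discarded like any other comment,
# so the AST for the old form is a plain Assign node.
commented = ast.parse("x = 1  # type: int").body[0]
print(type(commented).__name__)   # Assign

# At runtime the new syntax also records the annotation at module level,
# which the comment form cannot do.
ns = {}
exec("y: int = 1", ns)
print(ns["__annotations__"])      # {'y': <class 'int'>}
```

This is exactly why linters needed special-casing for type comments: the information is simply not present in the tree they analyze.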
> It seems confused about class attributes and instance attributes
> ================================================================
>
> The PEP also includes a section on how to define class attributes and
> instance attributes. It seems that everything needs to be defined in the
> class scope, even if it is not an attribute of the class, but of its
> instances. This seems confusing, both to human reader and machine
> analyser.

And yet of course most other OO languages, like C++ and Java, also let you
define instance and class variables in the class body (and again with a
default of "instance"; IIRC you have to use "static" in both languages for
class variables).

> Example from PEP 526:
>
> class Starship:
>
>     captain: str = 'Picard'
>     damage: int
>     stats: ClassVar[Dict[str, int]] = {}
>
>     def __init__(self, damage: int, captain: str = None):
>         self.damage = damage
>         if captain:
>             self.captain = captain  # Else keep the default
>
> With type hints as they currently exist, the same code is shorter and
> doesn't contaminate the class namespace with the 'damage' attribute.
>
> class Starship:
>
>     captain = 'Picard'
>     stats = {}  # type: Dict[str, int]
>
>     def __init__(self, damage: int, captain: str = None):
>         self.damage = damage  # Can infer type as int
>         if captain:
>             self.captain = captain  # Can infer type as str

It is also harder for the human reader to discover that there's a `damage`
instance attribute. (Many classes I've reviewed at Dropbox have pages and
pages of __init__ code, defining dozens of instance variables, sometimes
using init-helper methods.)

For example, if you look at the code for asyncio's Future class you'll see
this block at the class level:

```
class Future:
    """..."""

    # Class variables serving as defaults for instance variables.
    _state = _PENDING
    _result = None
    _exception = None
    _loop = None
    _source_traceback = None
    _blocking = False  # proper use of future (yield vs yield from)
    _log_traceback = False  # Used for Python 3.4 and later
    _tb_logger = None  # Used for Python 3.3 only

    def __init__(self, *, loop=None):
        ...
```

The terminology here is actually somewhat confused, but these are all
default values for instance variables. Because the defaults here are all
immutable, the assignments are put here instead of in __init__ to save a
little space in the dict and to make __init__ shorter (it only has to set
those instance variables that have mutable values).

There's also a highly technical reason for preferring that some instance
variables are given a default value in the class -- that way if an
exception happens in __init__ and there is recovery code that tries to use
some instance variable that __init__ hadn't initialized yet (e.g. in an
attempt to log the object) it avoids AttributeErrors for those variables
that have defaults set on the class. This has happened to me often enough
that it is now a standard idiom in my head.

> This isn't an argument against adding type syntax for attributes in
> general, just that the form suggested in PEP 526 doesn't seem to follow
> Python semantics.

I'm happy to weaken the semantics as mandated in the PEP. In fact I had
thought it already doesn't mandate any semantics (apart from storing
certain forms of annotations in __annotations__, to match PEP 3107 and
PEP 484), although I agree some examples have crept in that may appear
more normative than we meant them.
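[Editor's illustration: both the defaults-on-the-class idiom and the runtime behavior of class-body annotations can be observed directly. A minimal sketch, assuming the PEP 526 syntax as implemented for 3.6; the class below is an invented variant of the PEP's `Starship` example.]

```python
from typing import ClassVar, Dict


class Starship:
    # A bare annotation documents an instance attribute without creating
    # a class attribute; it is only recorded in Starship.__annotations__.
    damage: int
    # An annotated assignment creates a normal class attribute that serves
    # as a per-instance default, like the Future idiom quoted above.
    captain: str = 'Picard'
    stats: ClassVar[Dict[str, int]] = {}

    def __init__(self, damage: int) -> None:
        self.damage = damage


ship = Starship(5)
print(hasattr(Starship, 'damage'))       # False: annotation alone binds nothing
print(sorted(Starship.__annotations__))  # ['captain', 'damage', 'stats']
print(ship.captain)                      # Picard, found on the class
ship.captain = 'Janeway'                 # instance value shadows the default
print(Starship.captain)                  # Picard, class default untouched
```

Note that `damage` never lands in the class dict: an annotation without a value only lands in `__annotations__`, so the "contaminates the class namespace" worry does not apply at runtime.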
> One could imagine applying minimal PEP 526 style hints, with standard
> Python semantics and relying on type inference, as follows:
>
> class Starship:
>
>     captain = 'Picard'
>     stats: Dict[str, int] = {}
>
>     def __init__(self, damage: int, captain: str = None):
>         self.damage = damage
>         if captain:
>             self.captain = captain
>
> The PEP overstates the existing use of static typing in Python
> ==============================================================
>
> Finally, in the rejected proposal section, under "Should we introduce
> variable annotations at all?" it states that "Variable annotations have
> already been around for almost two years in the form of type comments,
> sanctioned by PEP 484."
> I don't think that this is entirely true.
> PEP 484 was about the syntax for types, declaring parameter and return
> types, and declaring custom types to be generic.
> PEP 484 does include a description of type comments, but they are always
> annotations on assignment statements and were primarily intended for use
> in stub files.

That is a mischaracterization of the intent of type comments in PEP 484;
they are not primarily meant for stubs (the only thing I find tying the
two together is the use of "..." as the initial value in stubs).

> Please don't turn Python into some sort of inferior Java.
> There is potential in this PEP, but in its current form I think it should
> be rejected.

Thanks for your feedback. We will be sure not to turn Python into Java!
But I am unconvinced that your objections are reason to reject the PEP --
you seem to be fine with the general *syntax* proposed; your concerns are
about the specific rules to be used by a type checker. I expect we'll be
arguing about those for years to come -- maybe one day a PEP will come
along that ties the semantics of types down, but PEP 526 is not it.

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From status at bugs.python.org  Fri Sep 2 12:08:47 2016
From: status at bugs.python.org (Python tracker)
Date: Fri, 2 Sep 2016 18:08:47 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20160902160847.8783456672@psf.upfronthosting.co.za>

ACTIVITY SUMMARY (2016-08-26 - 2016-09-02)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    5630 (+24)
  closed 34067 (+49)
  total  39697 (+73)

Open issues with patches: 2441

Issues opened (52)
==================

#27868: Unconditionally state when a build succeeds
     http://bugs.python.org/issue27868  opened by brett.cannon

#27869: test failures on Bash on Windows
     http://bugs.python.org/issue27869  opened by brett.cannon

#27872: Update os/os.path docs to mention path-like object support
     http://bugs.python.org/issue27872  opened by brett.cannon

#27873: multiprocessing.pool.Pool.map should take more than one iterab
     http://bugs.python.org/issue27873  opened by Jason Yu

#27874: inconsistent sys.path behavior when using PythonXX.zip
     http://bugs.python.org/issue27874  opened by Joseph.Shen

#27875: Syslogs /usr/sbin/foo as /foo instead of as foo
     http://bugs.python.org/issue27875  opened by canvon

#27876: Add SSLContext.set_version_range(minver, maxver=None)
     http://bugs.python.org/issue27876  opened by christian.heimes

#27877: Add recipe for "valueless" Enums to docs
     http://bugs.python.org/issue27877  opened by John Hagen

#27879: add os.syncfs()
     http://bugs.python.org/issue27879  opened by mmarkk

#27880: cPickle fails on large objects (still - 2011 and counting)
     http://bugs.python.org/issue27880  opened by robert at smithpierce.net

#27881: Fix possible bugs when setting sqlite3.Connection.isolation_le
     http://bugs.python.org/issue27881  opened by xiang.zhang

#27883: sqlite-3.14.1 for Python_3.6.0b1 ?
     http://bugs.python.org/issue27883  opened by Big Stone

#27884: during 'make install', pre-existing site-packages residents ar
     http://bugs.python.org/issue27884  opened by MattDMo

#27886: Docs: the difference between rename and replace is not obvious
     http://bugs.python.org/issue27886  opened by asvetlov

#27889: ctypes interfers with signal handling
     http://bugs.python.org/issue27889  opened by Andre Merzky

#27890: platform.release() incorrect in Python 3.5.2 on Windows 2008Se
     http://bugs.python.org/issue27890  opened by James Domingo

#27892: Idlelib: document or move delayed imports
     http://bugs.python.org/issue27892  opened by terry.reedy

#27896: Allow passing sphinx options to Doc/Makefile
     http://bugs.python.org/issue27896  opened by sizeof

#27897: Avoid possible crash in pysqlite_connection_create_collation
     http://bugs.python.org/issue27897  opened by xiang.zhang

#27898: regexp performance degradation between 2.7.6 and 2.7.12
     http://bugs.python.org/issue27898  opened by steve.newcomb

#27900: ctypes fails to find ncurses via ncursesw on Arch Linux
     http://bugs.python.org/issue27900  opened by blueyed

#27901: inspect.ismethod returns different results on the same basic c
     http://bugs.python.org/issue27901  opened by anthony-flury

#27902: pstats.Stats: strip_dirs() method cannot handle file paths fro
     http://bugs.python.org/issue27902  opened by Jaroslav

#27903: Avoid ResourceWarnings from platform._dist_try_harder
     http://bugs.python.org/issue27903  opened by scop

#27905: Add documentation for typing.Type
     http://bugs.python.org/issue27905  opened by michael0x2a

#27906: Socket accept exhaustion during high TCP traffic
     http://bugs.python.org/issue27906  opened by kevinconway

#27908: del _limbo[self] KeyError
     http://bugs.python.org/issue27908  opened by Dima.Tisnek

#27910: Doc/library/traceback.rst ???
 references to tuples should be r
     http://bugs.python.org/issue27910  opened by torsava

#27911: Unnecessary error checks in exec_builtin_or_dynamic
     http://bugs.python.org/issue27911  opened by xiang.zhang

#27914: Incorrect comment in PyModule_ExcDef
     http://bugs.python.org/issue27914  opened by xiang.zhang

#27915: Use 'ascii' instead of 'us-ascii' to bypass lookup machinery
     http://bugs.python.org/issue27915  opened by scop

#27916: Use time.monotonic instead of time.time where appropriate
     http://bugs.python.org/issue27916  opened by scop

#27918: Running test suites without gui but still having windows flash
     http://bugs.python.org/issue27918  opened by xiang.zhang

#27919: Deprecate and remove extra_path distribution kwarg
     http://bugs.python.org/issue27919  opened by jason.coombs

#27920: Embedding python in a shared library fails to import the Pytho
     http://bugs.python.org/issue27920  opened by suzaku

#27921: f-strings: do not allow backslashes
     http://bugs.python.org/issue27921  opened by eric.smith

#27923: PEP 467 -- Minor API improvements for binary sequences
     http://bugs.python.org/issue27923  opened by elias

#27925: Nicer interface to convert hashlib digests to int
     http://bugs.python.org/issue27925  opened by steven.daprano

#27926: ctypes is too slow to convert a Python list to a C array
     http://bugs.python.org/issue27926  opened by Tom Cornebize

#27927: argparse: default propagation of formatter_class from Argument
     http://bugs.python.org/issue27927  opened by Benjamin Giesers

#27928: Add hashlib.scrypt
     http://bugs.python.org/issue27928  opened by christian.heimes

#27929: asyncio.AbstractEventLoop.sock_connect broken for AF_BLUETOOTH
     http://bugs.python.org/issue27929  opened by Robert.Jordens

#27930: logging's QueueListener drops log messages
     http://bugs.python.org/issue27930  opened by encukou

#27931: Email parse IndexError <""@wiarcom.com>
     http://bugs.python.org/issue27931  opened by ???????????????????? ????????????
#27932: platform.win32_ver() leaks in 2.7.12
     http://bugs.python.org/issue27932  opened by Okko.Willeboordse

#27934: json float encoding incorrect for dbus.Double
     http://bugs.python.org/issue27934  opened by eajames

#27935: logging level FATAL missing in _nameToLevel
     http://bugs.python.org/issue27935  opened by Ondřej Medek

#27936: Inconsistent round behavior between float and int
     http://bugs.python.org/issue27936  opened by Jonatan Skogsfors

#27937: logging.getLevelName microoptimization
     http://bugs.python.org/issue27937  opened by Ondřej Medek

#27938: PyUnicode_AsEncodedString, PyUnicode_Decode: add fast-path for
     http://bugs.python.org/issue27938  opened by haypo

#27939: Tkinter mainloop raises when setting the value of ttk.LabeledS
     http://bugs.python.org/issue27939  opened by goyodiaz

#27940: xml.etree: Avoid XML declaration for the "ascii" encoding
     http://bugs.python.org/issue27940  opened by haypo

Most recent 15 issues with no replies (15)
==========================================

#27939: Tkinter mainloop raises when setting the value of ttk.LabeledS
     http://bugs.python.org/issue27939

#27930: logging's QueueListener drops log messages
     http://bugs.python.org/issue27930

#27927: argparse: default propagation of formatter_class from Argument
     http://bugs.python.org/issue27927

#27914: Incorrect comment in PyModule_ExcDef
     http://bugs.python.org/issue27914

#27908: del _limbo[self] KeyError
     http://bugs.python.org/issue27908

#27905: Add documentation for typing.Type
     http://bugs.python.org/issue27905

#27903: Avoid ResourceWarnings from platform._dist_try_harder
     http://bugs.python.org/issue27903

#27900: ctypes fails to find ncurses via ncursesw on Arch Linux
     http://bugs.python.org/issue27900

#27897: Avoid possible crash in pysqlite_connection_create_collation
     http://bugs.python.org/issue27897

#27896: Allow passing sphinx options to Doc/Makefile
     http://bugs.python.org/issue27896

#27880: cPickle fails on large objects (still - 2011 and counting)
     http://bugs.python.org/issue27880
#27876: Add SSLContext.set_version_range(minver, maxver=None)
     http://bugs.python.org/issue27876

#27875: Syslogs /usr/sbin/foo as /foo instead of as foo
     http://bugs.python.org/issue27875

#27872: Update os/os.path docs to mention path-like object support
     http://bugs.python.org/issue27872

#27869: test failures on Bash on Windows
     http://bugs.python.org/issue27869

Most recent 15 issues waiting for review (15)
=============================================

#27940: xml.etree: Avoid XML declaration for the "ascii" encoding
     http://bugs.python.org/issue27940

#27938: PyUnicode_AsEncodedString, PyUnicode_Decode: add fast-path for
     http://bugs.python.org/issue27938

#27936: Inconsistent round behavior between float and int
     http://bugs.python.org/issue27936

#27935: logging level FATAL missing in _nameToLevel
     http://bugs.python.org/issue27935

#27934: json float encoding incorrect for dbus.Double
     http://bugs.python.org/issue27934

#27931: Email parse IndexError <""@wiarcom.com>
     http://bugs.python.org/issue27931

#27928: Add hashlib.scrypt
     http://bugs.python.org/issue27928

#27923: PEP 467 -- Minor API improvements for binary sequences
     http://bugs.python.org/issue27923

#27921: f-strings: do not allow backslashes
     http://bugs.python.org/issue27921

#27918: Running test suites without gui but still having windows flash
     http://bugs.python.org/issue27918

#27916: Use time.monotonic instead of time.time where appropriate
     http://bugs.python.org/issue27916

#27915: Use 'ascii' instead of 'us-ascii' to bypass lookup machinery
     http://bugs.python.org/issue27915

#27914: Incorrect comment in PyModule_ExcDef
     http://bugs.python.org/issue27914

#27911: Unnecessary error checks in exec_builtin_or_dynamic
     http://bugs.python.org/issue27911

#27910: Doc/library/traceback.rst ???
 references to tuples should be r
     http://bugs.python.org/issue27910

Top 10 most discussed issues (10)
=================================

#27761: Private _nth_root function loses accuracy
     http://bugs.python.org/issue27761  15 msgs

#23591: enum: Add Flags and IntFlags
     http://bugs.python.org/issue23591  12 msgs

#27881: Fix possible bugs when setting sqlite3.Connection.isolation_le
     http://bugs.python.org/issue27881  12 msgs

#26470: Make OpenSSL module compatible with OpenSSL 1.1.0
     http://bugs.python.org/issue26470  11 msgs

#27867: various issues due to misuse of PySlice_GetIndicesEx
     http://bugs.python.org/issue27867  9 msgs

#27918: Running test suites without gui but still having windows flash
     http://bugs.python.org/issue27918  9 msgs

#27923: PEP 467 -- Minor API improvements for binary sequences
     http://bugs.python.org/issue27923  9 msgs

#22458: Add fractions benchmark
     http://bugs.python.org/issue22458  8 msgs

#26530: tracemalloc: add C API to manually track/untrack memory alloca
     http://bugs.python.org/issue26530  8 msgs

#27898: regexp performance degradation between 2.7.6 and 2.7.12
     http://bugs.python.org/issue27898  8 msgs

Issues closed (48)
==================

#10513: sqlite3.InterfaceError after commit
     http://bugs.python.org/issue10513  closed by berker.peksag

#12319: [http.client] HTTPConnection.request not support "chunked" Tra
     http://bugs.python.org/issue12319  closed by martin.panter

#12885: distutils.filelist.findall() fails on broken symlink in Py2.x
     http://bugs.python.org/issue12885  closed by jason.coombs

#18899: make pystone.py Py3 compatible in benchmark suite
     http://bugs.python.org/issue18899  closed by berker.peksag

#20562: sqlite3 returns result set with doubled first entry
     http://bugs.python.org/issue20562  closed by berker.peksag

#22881: show median in benchmark results
     http://bugs.python.org/issue22881  closed by scoder

#23129: sqlite3 COMMIT nested in SELECT returns unexpected results
     http://bugs.python.org/issue23129  closed by berker.peksag

#23229: add inf, nan,
 infj, nanj to cmath module
     http://bugs.python.org/issue23229  closed by mark.dickinson

#23968: rename the platform directory from plat-$(MACHDEP) to plat-$(P
     http://bugs.python.org/issue23968  closed by doko

#24648: Allocation of values array in split dicts should use small obj
     http://bugs.python.org/issue24648  closed by haypo

#25402: More accurate estimation of the number of digits in int to dec
     http://bugs.python.org/issue25402  closed by mark.dickinson

#25423: Deprecate benchmarks that execute too quickly
     http://bugs.python.org/issue25423  closed by brett.cannon

#26027: Support Path objects in the posix module
     http://bugs.python.org/issue26027  closed by brett.cannon

#26275: perf.py: calibrate benchmarks using time, not using a fixed nu
     http://bugs.python.org/issue26275  closed by haypo

#26814: [WIP] Add a new _PyObject_FastCall() function which avoids the
     http://bugs.python.org/issue26814  closed by haypo

#27128: Add _PyObject_FastCall()
     http://bugs.python.org/issue27128  closed by haypo

#27214: a potential future bug and an optimization that mostly undermi
     http://bugs.python.org/issue27214  closed by mark.dickinson

#27283: Add a "What's New" entry for PEP 519
     http://bugs.python.org/issue27283  closed by brett.cannon

#27444: Python doesn't build due to test_float.py broken on non-IEEE m
     http://bugs.python.org/issue27444  closed by mark.dickinson

#27506: make bytes/bytearray translate's delete a keyword argument
     http://bugs.python.org/issue27506  closed by martin.panter

#27524: Update os.path for PEP 519/__fspath__()
     http://bugs.python.org/issue27524  closed by brett.cannon

#27706: Random.seed, whose purpose is purportedly determinism, behaves
     http://bugs.python.org/issue27706  closed by rhettinger

#27809: Add _PyFunction_FastCallDict(): fast call with keyword argumen
     http://bugs.python.org/issue27809  closed by haypo

#27818: Speed up number format spec parsing
     http://bugs.python.org/issue27818  closed by serhiy.storchaka

#27842: Order CSV header fields
     http://bugs.python.org/issue27842  closed by rhettinger

#27861: sqlite3 type confusion and multiple frees
     http://bugs.python.org/issue27861  closed by serhiy.storchaka

#27870: Left shift of zero allocates memory
     http://bugs.python.org/issue27870  closed by mark.dickinson

#27871: ctypes docs must be more explicit about the type a func return
     http://bugs.python.org/issue27871  closed by eryksun

#27878: Unicode word boundries
     http://bugs.python.org/issue27878  closed by SilentGhost

#27882: Python docs on 9.2 Math module lists math.log2 as function but
     http://bugs.python.org/issue27882  closed by ebarry

#27885: Add a Crypto++ Wrapper for when Someone needs to use Crypto++
     http://bugs.python.org/issue27885  closed by r.david.murray

#27887: Installation failed
     http://bugs.python.org/issue27887  closed by Aleksandar Petrovic

#27888: Hide pip install/uninstall windows in setup
     http://bugs.python.org/issue27888  closed by steve.dower

#27891: Consistently group and sort imports within idlelib modules.
     http://bugs.python.org/issue27891  closed by terry.reedy

#27893: email.parser.BytesParser.parsebytes docs fix
     http://bugs.python.org/issue27893  closed by r.david.murray

#27894: Fix to_addrs refs in smtplib docs
     http://bugs.python.org/issue27894  closed by rhettinger

#27895: Spelling fixes
     http://bugs.python.org/issue27895  closed by rhettinger

#27899: Apostrophe is not replace with ' ElementTree.tostring (al
     http://bugs.python.org/issue27899  closed by serhiy.storchaka

#27904: Let logging format more messages on demand
     http://bugs.python.org/issue27904  closed by python-dev

#27907: Misspelled variable in test_asyncio/test_events
     http://bugs.python.org/issue27907  closed by gvanrossum

#27909: Py_INCREF(NULL) in _imp.create_builtin
     http://bugs.python.org/issue27909  closed by rhettinger

#27912: Distutils should use warehouse for index
     http://bugs.python.org/issue27912  closed by jason.coombs

#27913: Difflib.ndiff (Problem on identification of changes as Diff St
     http://bugs.python.org/issue27913
     closed by SilentGhost

#27917: Choose platform triplets for android builds
     http://bugs.python.org/issue27917  closed by doko

#27922: Make IDLE tests less flashy
     http://bugs.python.org/issue27922  closed by terry.reedy

#27924: ensurepip raises TypeError after pip uninstall
     http://bugs.python.org/issue27924  closed by jayvdb

#27933: functools.lru_cache seems to not work when renaming decorated
     http://bugs.python.org/issue27933  closed by ?????????? ????????????????

#901727: extra_path kwarg to setup() undocumented
     http://bugs.python.org/issue901727  closed by jason.coombs

From guido at python.org  Fri Sep 2 12:17:53 2016
From: guido at python.org (Guido van Rossum)
Date: Fri, 2 Sep 2016 09:17:53 -0700
Subject: [Python-Dev] PEP 526 ready for review: Syntax for Variable and
 Attribute Annotations
In-Reply-To: 
References: <20160901162117.GX26300@ando.pearwood.info>
Message-ID: 

On Fri, Sep 2, 2016 at 6:43 AM, Ivan Levkivskyi wrote:
> On 2 September 2016 at 04:38, Nick Coghlan wrote:
>> However, a standalone Ellipsis doesn't currently have a meaning as a
>> type annotation (it's only meaningful when subscripting Tuple and
>> Callable), so a spelling like this might work:
>>
>>     NAME: ...
>>
>> That spelling could then also be used in function definitions to say
>> "infer the return type from the return statements rather than assuming
>> Any"
>
> Interesting idea.
> This is somewhat similar to one of the existing uses of Ellipsis: in
> numpy it infers how many dimensions the full slice needs to have; it is
> like saying "You know what I mean". So I am +1 on this solution.

I like it too, but I think it's better to leave any firm promises about
the *semantics* of variable annotations out of the PEP. I just spoke to
someone who noted that the PEP is likely to evoke an outsize emotional
response. (Similar to what happened with PEP 484.)
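[Editor's illustration: the `NAME: ...` spelling needs no new grammar, since an annotation is an arbitrary expression and a bare Ellipsis is one. The `Config` class below is invented; what a checker should *infer* from `...` is exactly the open question in this thread, the sketch only shows the syntax is already available.]

```python
# A bare Ellipsis is a valid annotation expression, so "NAME: ..." parses
# and is recorded in __annotations__ without binding the name.
class Config:
    host: ...          # "there is an attribute, but no type hint for now"
    port: int = 8080   # an annotated assignment does bind a default

print(Config.__annotations__['host'] is Ellipsis)  # True
print('host' in vars(Config))                      # False: nothing was bound
print(Config.port)                                 # 8080
```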
Pinning down the semantics is not why I am pushing for PEP 526 -- I only want to pin down the *syntax* to the point where we won't have to change it again for many versions, since it's much harder to change the syntax than it is to change the behavior of type checkers (which have fewer backwards compatibility constraints, a faster release cycle, and narrower user bases than core Python itself).

--
--Guido van Rossum (python.org/~guido)

From steve at pearwood.info Fri Sep 2 12:40:37 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 3 Sep 2016 02:40:37 +1000
Subject: [Python-Dev] Please reject or postpone PEP 526
In-Reply-To: <57C982D4.1060405@hotpy.org>
References: <57C982D4.1060405@hotpy.org>
Message-ID: <20160902164035.GB26300@ando.pearwood.info>

Hi Mark,

I'm going to trim your post drastically, down to the bare essentials, in order to keep this already long post down to a manageable size.

On Fri, Sep 02, 2016 at 02:47:00PM +0100, Mark Shannon wrote:
[...]

> With type comments, this is intuitively correct and should type check:
> def eggs(cond:bool):
>     if cond:
>         x = None
>     else:
>         x = []  # type: List[int]
>     spam(x)  # Here we can infer the type of x

It isn't correct. You've declared something to be a list of ints, but assigned the value None to it! How is that not an error? The error is more obvious if you swap the order of assignments:

    if cond:
        x = []  # type: List[int]
    else:
        x = None

MyPy currently requires the experimental --strict-optional flag to detect this error:

    [steve at ando ~]$ mypy --strict-optional test.py
    test.py: note: In function "eggs":
    test.py:10: error: Incompatible types in assignment (expression has type None, variable has type List[int])

Changing that from comment syntax to (proposed) Python syntax will not change that. There is no semantic difference to the type checker between

    x = []  # type: List[int]

and

    x: List[int] = []

and any type checker will have to treat them identically.

> With PEP 526 we lose the ability to infer types.
On re-reading the PEP, I have just noticed that nowhere does it explicitly state that checkers are still expected to perform type inference. However, this is the companion to PEP 484, which states:

    Type comments

    No first-class syntax support for explicitly marking variables as being
    of a specific type is added by this PEP. TO HELP WITH TYPE INFERENCE IN
    COMPLEX CASES, a comment of the following format may be used: [...]

(Emphasis added.) So the intention is that, regardless of whether you use a type annotation using a comment or the proposed syntax, that is intended to *help* the checker perform inference, not to replace it. Perhaps this PEP should include an explicit note to that end.

[...]

> So we need to use a more complex type
> def eggs(cond:bool):
>     x: Optional[List[int]]
>     if cond:
>         x = None  # Now legal
>     else:
>         x = []
>     spam(x)
>
> I don't think this improves readability.

Maybe not, but it sure improves *correctness*. A human reader might be able to logically deduce that x = None and x = [] are both valid, given that spam() takes either a list or None, but I'm not sure if that level of intelligence is currently possible in type inference. (Maybe it is, and MyPy simply doesn't support it yet.) So it may be that this is a case where you do have to apply an explicit type hint, using *either* a type comment or this new proposed syntax:

    x: Optional[List[int]]
    if cond:
        x = None
    else:
        x = []

should be precisely the same as:

    if cond:
        x = None  # type: Optional[List[int]]
    else:
        x = []

> Quoting from the PEP:
> ```
> a: int
> a: str  # Static type checker will warn about this.
> ```
> In other words, it is illegal for a checker to split up the variable,
> even though it is straightforward to do so.

No, it is a *warning*, not an error. Remember, the Python interpreter itself won't care. The type checker is optional and not part of the interpreter. You can still write code like:

    a = 1
    do_something(a)
    a = "string"
    do_another(a)

and Python will be happy.
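[Editorial aside: that point is easy to demonstrate; annotations have no effect on execution. A quick runnable sketch, where `do_something` and `do_another` are hypothetical stand-ins for the calls above:]

```python
# Hypothetical stand-ins for the do_something/do_another calls above.
def do_something(value):
    return value + 1

def do_another(value):
    return value.upper()

a = 1
do_something(a)        # a is an int here
a = "string"           # rebinding a to a different type: legal at runtime
result = do_another(a)
print(result)          # STRING
```

Only an external checker such as mypy would flag the rebinding; the interpreter never does.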
But if you do run a type checker, it should warn you that you're changing types, as that suggests the possibility of a type error. (And it also goes against a lot of people's style guidelines.)

> We should be free to add extra variables, whenever we choose, for
> clarity. For example,
>     total = foo() - bar()
> should not be treated differently from:
>     revenue = foo()
>     tax = bar()
>     total = revenue - tax
>
> If types are inferred, there is no problem.
> However, if they must be declared, then the use of meaningfully named
> variables is discouraged.

Was there something in the PEP that led you to believe that they "must" be declared? Under "Non-goals", the PEP states in bold text: "the authors have no desire to ever make type hints mandatory" so I'm not sure why you think that types must be declared. Perhaps the PEP should make it more obvious that type hints on variables are *in addition to* and not a substitute for type inference.

> PEP 484 states:
> "If type hinting proves useful in general, a syntax for typing variables
> may be provided in a future Python version."
> Has it proved useful in general? I don't think it has.

According to the PEP, it has proved useful in typeshed.

> It seems confused about class attributes and instance attributes
> ================================================================
>
> The PEP also includes a section on how to define class attributes and
> instance attributes. It seems that everything needs to be defined in the
> class scope, even if it is not an attribute of the class, but of its
> instances.

Quoting from the PEP: "As a matter of convenience, instance attributes can be annotated in __init__ or other methods, rather than in class" Perhaps the PEP could do with a little more justification for why we would want to declare instance attributes in the class rather than in __init__.
> Example from PEP 526:
>
> class Starship:
>
>     captain: str = 'Picard'
>     damage: int
>     stats: ClassVar[Dict[str, int]] = {}
>
>     def __init__(self, damage: int, captain: str = None):
>         self.damage = damage
>         if captain:
>             self.captain = captain  # Else keep the default

On re-reading this, I too wonder why damage is being declared in the class body. Can the type checker not infer that self.damage has the same type as damage in the __init__ method?

> Finally, in the rejected proposal section, under "Should we introduce
> variable annotations at all?" it states that "Variable annotations have
> already been around for almost two years in the form of type comments,
> sanctioned by PEP 484."
> I don't think that this is entirely true.

PEP 484 itself was created almost two years ago (Sept 2014) and although it doesn't list prior art for type comments, I seem to recall that it copied the idea from MyPy. I expect that MyPy (and maybe even linters like PyLint, PyFlakes, etc.) have been using type comments for "almost two years", if not longer.

> PEP 484 was about the syntax for types, declaring parameter and return
> types, and declaring custom types to be generic.
> PEP 484 does include a description of type comments, but they are always
> annotations on assignment statements and were primarily intended for use
> in stub files.

I'm not seeing what distinction you think you are making here. What distinction do you see between:

    x: int = func(value)

and

    x = func(value)  # type: int

--
Steve

From srkunze at mail.de Fri Sep 2 12:46:41 2016
From: srkunze at mail.de (Sven R. Kunze)
Date: Fri, 2 Sep 2016 18:46:41 +0200
Subject: [Python-Dev] Please reject or postpone PEP 526
In-Reply-To: <57C982D4.1060405@hotpy.org>
References: <57C982D4.1060405@hotpy.org>
Message-ID: <3d8dbb4e-7223-491f-2641-42c1c1a7d026@mail.de>

Hi Mark,

I agree with you about postponing. Not so much because of the issues you mentioned.
Those all seem resolvable to me, and they mostly concern type checkers, linters, and coding styles, not Python itself. However, I also don't like the rushing through as if this beta were the only chance to get it into Python. Python hasn't had variable annotations for decades until now, and Python will still be there in 10 decades or so. Thus, 2 more years to wait and to hammer out the details does not seem much compared to the entire lifetime of this language.

The PEP also remains silent about when to use annotations. For recent additions like f-strings or async, it's completely clear when to use them. For variable annotations, however, it's not (all variables? most used variables? least used variables?). So, I also agree with you that improving type checkers is the better way than adding static type annotations all over the place. Python is dynamic, and types should also be as dynamic and redundancy-free as possible. Thus, some guidance would be great here.

Cheers,
Sven

On 02.09.2016 15:47, Mark Shannon wrote:
> Hi everyone,
>
> I think we should reject, or at least postpone PEP 526.
>
> PEP 526 represents a major change to the language, however there are,
> I believe, a number of technical flaws with the PEP.
>
> It is probable that with significant revisions it can be a worthwhile
> addition to the language, but that cannot happen in time for 3.6 beta
> 1 (in 11 days).
>
> PEP 526 claims to be an extension of PEP 484, but I don't think that
> is entirely correct.
> PEP 484 was primarily about defining the semantics of pre-existing
> syntax. PEP 526 is about adding new syntax.
> Historically the bar for adding new syntax has been set very high. I
> don't think that PEP 526, in its current form, reaches that bar.
>
> Below is a list of the issues I have with the PEP as it stands.
>
> In many cases it makes it more effort than type comments
> ========================================================
>
> Type hints should be as easy to use as possible, and that means
> pushing as much work as possible onto the checker, and not burdening
> the programmer.
>
> Attaching type hints to variables, rather than expressions, reduces
> the potential for inference. This makes it harder for the programmer,
> but easier for the checker, which is the wrong way around.
>
> For example, given a function:
> def spam(x: Optional[List[int]])->None: ...
>
> With type comments, this is intuitively correct and should type check:
> def eggs(cond:bool):
>     if cond:
>         x = None
>     else:
>         x = []  # type: List[int]
>     spam(x)  # Here we can infer the type of x
>
> With PEP 526 we lose the ability to infer types.
> def eggs(cond:bool):
>     if cond:
>         x = None  # Not legal due to type declaration below
>     else:
>         x: List[int] = []
>     spam(x)
>
> So we need to use a more complex type:
> def eggs(cond:bool):
>     x: Optional[List[int]]
>     if cond:
>         x = None  # Now legal
>     else:
>         x = []
>     spam(x)
>
> I don't think this improves readability.
> Whether this is an acceptable change is debatable, but it does need
> some debate.
>
> It limits the use of variables
> ==============================
>
> In Python a name (variable) is just a binding that refers to an object.
> A name only exists in a meaningful sense once an object has been
> assigned to it. Any attempt to use that name, without an object bound
> to it, will result in a NameError.
>
> PEP 526 makes variables more than just bindings, as any rebinding must
> conform to the given type. This loses us some of the dynamism for
> which we all love Python.
>
> Quoting from the PEP:
> ```
> a: int
> a: str  # Static type checker will warn about this.
> ```
> In other words, it is illegal for a checker to split up the variable,
> even though it is straightforward to do so.
>
> However, without the type declarations,
> ```
> a = 1
> a = "Hi"
> ```
> is just fine. Useless, but fine.
>
> We should be free to add extra variables, whenever we choose, for
> clarity. For example,
>     total = foo() - bar()
> should not be treated differently from:
>     revenue = foo()
>     tax = bar()
>     total = revenue - tax
>
> If types are inferred, there is no problem.
> However, if they must be declared, then the use of meaningfully named
> variables is discouraged.
>
> [A note about type-inference:
> Type inference is not a universal panacea, but it can make life a lot
> easier for programmers in statically typed languages.
> Languages like C# use local type inference extensively and it means
> that many variables often do not need their type declared. We should
> take care not to limit the ability of checkers to infer values and
> types and make programmers' lives easier.
> Within a function, type inference is near perfect, failing only
> occasionally for some generic types.
> One place where type inference definitely breaks down is across calls,
> which is why PEP 484 is necessary.
> ]
>
> It is premature
> ===============
>
> There are still plenty of issues to iron out w.r.t. PEP 484 types. I
> don't think we should be adding more syntax, until we have a *precise*
> idea of what is required.
>
> PEP 484 states:
> "If type hinting proves useful in general, a syntax for typing
> variables may be provided in a future Python version."
> Has it proved useful in general? I don't think it has. Maybe it will
> in future, but it hasn't yet.
>
> It seems confused about class attributes and instance attributes
> ================================================================
>
> The PEP also includes a section on how to define class attributes and
> instance attributes. It seems that everything needs to be defined in
> the class scope, even if it is not an attribute of the class, but of
> its instances. This seems confusing, both to the human reader and
> machine analyser.
>
> Example from PEP 526:
>
> class Starship:
>
>     captain: str = 'Picard'
>     damage: int
>     stats: ClassVar[Dict[str, int]] = {}
>
>     def __init__(self, damage: int, captain: str = None):
>         self.damage = damage
>         if captain:
>             self.captain = captain  # Else keep the default
>
> With type hints as they currently exist, the same code is shorter and
> doesn't contaminate the class namespace with the 'damage' attribute.
>
> class Starship:
>
>     captain = 'Picard'
>     stats = {}  # type: Dict[str, int]
>
>     def __init__(self, damage: int, captain: str = None):
>         self.damage = damage  # Can infer type as int
>         if captain:
>             self.captain = captain  # Can infer type as str
>
> This isn't an argument against adding type syntax for attributes in
> general, just that the form suggested in PEP 526 doesn't seem to
> follow Python semantics.
>
> One could imagine applying minimal PEP 526 style hints, with standard
> Python semantics and relying on type inference, as follows:
>
> class Starship:
>
>     captain = 'Picard'
>     stats: Dict[str, int] = {}
>
>     def __init__(self, damage: int, captain: str = None):
>         self.damage = damage
>         if captain:
>             self.captain = captain
>
> The PEP overstates the existing use of static typing in Python
> ==============================================================
>
> Finally, in the rejected proposal section, under "Should we introduce
> variable annotations at all?" it states that "Variable annotations
> have already been around for almost two years in the form of type
> comments, sanctioned by PEP 484."
> I don't think that this is entirely true.
> PEP 484 was about the syntax for types, declaring parameter and return
> types, and declaring custom types to be generic.
> PEP 484 does include a description of type comments, but they are
> always annotations on assignment statements and were primarily
> intended for use in stub files.
>
> Please don't turn Python into some sort of inferior Java.
> There is potential in this PEP, but in its current form I think it
> should be rejected.
>
> Cheers,
> Mark.
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/srkunze%40mail.de

From k7hoven at gmail.com Fri Sep 2 13:10:24 2016
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Fri, 2 Sep 2016 20:10:24 +0300
Subject: [Python-Dev] Please reject or postpone PEP 526
In-Reply-To: <57C982D4.1060405@hotpy.org>
References: <57C982D4.1060405@hotpy.org>
Message-ID:

I agree about some concerns and disagree about several. I see most use for class/instance attribute annotations, both for type checking and for other uses. I'm least sure about their syntax and about annotations in functions or at module level in the proposed form. Some comments below:

On Fri, Sep 2, 2016 at 4:47 PM, Mark Shannon wrote:
> Hi everyone,
>
> I think we should reject, or at least postpone PEP 526.
>
> PEP 526 represents a major change to the language, however there are, I
> believe, a number of technical flaws with the PEP.
[...]
>
> In many cases it makes it more effort than type comments
> ========================================================
>
> Type hints should be as easy to use as possible, and that means pushing as
> much work as possible onto the checker, and not burdening the programmer.
>
> Attaching type hints to variables, rather than expressions, reduces the
> potential for inference. This makes it harder for the programmer, but
> easier for the checker, which is the wrong way around.

The more you infer types, the less you check them. It's up to the programmers to choose the amount of annotation.

> For example, given a function:
> def spam(x: Optional[List[int]])->None: ...
> With type comments, this is intuitively correct and should type check:
> def eggs(cond:bool):
>     if cond:
>         x = None
>     else:
>         x = []  # type: List[int]
>     spam(x)  # Here we can infer the type of x
>
> With PEP 526 we lose the ability to infer types.
> def eggs(cond:bool):
>     if cond:
>         x = None  # Not legal due to type declaration below
>     else:
>         x: List[int] = []
>     spam(x)

I'm also a little worried about not being able to reannotate a name.

> So we need to use a more complex type:
> def eggs(cond:bool):
>     x: Optional[List[int]]
>     if cond:
>         x = None  # Now legal
>     else:
>         x = []
>     spam(x)

A good checker should be able to infer that x is a union type at the point that it's passed to spam, even without the type annotation. For example:

def eggs(cond:bool):
    if cond:
        x = 1
    else:
        x = 1.5
    spam(x)  # a good type checker infers that x is of type Union[int, float]

Or with annotations:

def eggs(cond:bool):
    if cond:
        x : int = foo()    # foo may not have a return type hint
    else:
        x : float = bar()  # bar may not have a return type hint
    spam(x)  # a good type checker infers that x is of type Union[int, float]

[...]
> It limits the use of variables
> ==============================
>
> In Python a name (variable) is just a binding that refers to an object.
> A name only exists in a meaningful sense once an object has been assigned
> to it. Any attempt to use that name, without an object bound to it, will
> result in a NameError.

IIUC, that would still be the case after PEP 526.

[...]
>
> We should be free to add extra variables, whenever we choose, for clarity.
> For example,
>     total = foo() - bar()
> should not be treated differently from:
>     revenue = foo()
>     tax = bar()
>     total = revenue - tax
>
> If types are inferred, there is no problem.
> However, if they must be declared, then the use of meaningfully named
> variables is discouraged.

Who says they *must* be declared?

[...]
> It is premature
> ===============
>
> There are still plenty of issues to iron out w.r.t.
PEP 484 types.
> I don't think we should be adding more syntax, until we have a *precise*
> idea of what is required.
>
> PEP 484 states:
> "If type hinting proves useful in general, a syntax for typing variables
> may be provided in a future Python version."
> Has it proved useful in general? I don't think it has. Maybe it will in
> future, but it hasn't yet.

Yeah, I hope someone has enough experience to know whether this is the right thing for Python as a whole.

> It seems confused about class attributes and instance attributes
> ================================================================
>
> The PEP also includes a section on how to define class attributes and
> instance attributes. It seems that everything needs to be defined in the
> class scope, even if it is not an attribute of the class, but of its
> instances. This seems confusing, both to the human reader and machine
> analyser.

I don't see the problem here; isn't that how it's usually done in strongly typed languages? And methods are defined in the class scope too (well yes, they do also exist in the class namespace, but anyway...). But I agree in the sense that the proposed syntax is far from explicit about these being instance attributes by default.

> Example from PEP 526:
>
> class Starship:
>
>     captain: str = 'Picard'
>     damage: int
>     stats: ClassVar[Dict[str, int]] = {}
>
>     def __init__(self, damage: int, captain: str = None):
>         self.damage = damage
>         if captain:
>             self.captain = captain  # Else keep the default
>
> With type hints as they currently exist, the same code is shorter and
> doesn't contaminate the class namespace with the 'damage' attribute.

IIUC, 'damage' will not be in the class namespace according to PEP 526.
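[Editorial aside: that reading can be checked against the reference implementation. A value-less annotation in a class body only records the name in `__annotations__`; it creates no class attribute. A small sketch (Python 3.6+):]

```python
from typing import ClassVar, Dict

class Starship:
    captain: str = 'Picard'               # class attribute *and* annotation
    damage: int                           # annotation only: no attribute
    stats: ClassVar[Dict[str, int]] = {}  # class-variable annotation

print(hasattr(Starship, 'damage'))        # False
print(sorted(Starship.__annotations__))   # ['captain', 'damage', 'stats']
```

So the class namespace is not contaminated; only `__annotations__` records the name.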
> class Starship:
>
>     captain = 'Picard'
>     stats = {}  # type: Dict[str, int]
>
>     def __init__(self, damage: int, captain: str = None):
>         self.damage = damage  # Can infer type as int
>         if captain:
>             self.captain = captain  # Can infer type as str

And that's one of the reasons why there should be annotations without setting a type hint (as I wrote in the other thread).

> This isn't an argument against adding type syntax for attributes in
> general, just that the form suggested in PEP 526 doesn't seem to follow
> Python semantics.
>
> One could imagine applying minimal PEP 526 style hints, with standard
> Python semantics and relying on type inference, as follows:
>
> class Starship:
>
>     captain = 'Picard'
>     stats: Dict[str, int] = {}
>
>     def __init__(self, damage: int, captain: str = None):
>         self.damage = damage
>         if captain:
>             self.captain = captain

I don't like this, because some of the attributes are introduced at class level and some inside __init__, so it is easy to miss that there is such a thing as 'damage' (at least in more complicated examples). I keep repeating myself, but again this is where we need non-type-hinting attribute declarations.

-- Koos

> The PEP overstates the existing use of static typing in Python
> ==============================================================
[...]
> Please don't turn Python into some sort of inferior Java.
> There is potential in this PEP, but in its current form I think it
> should be rejected.
>
> Cheers,
> Mark.
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/k7hoven%40gmail.com

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +

From victor.stinner at gmail.com Fri Sep 2 13:44:45 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 2 Sep 2016 19:44:45 +0200
Subject: [Python-Dev] Need help in debugging the python core
In-Reply-To: References: Message-ID:

2016-09-02 8:49 GMT+02:00 Sajjanshetty, Amresh :
> I'm using asyncio and paramiko to multiplex different channels into a single
> SSH connection.

Hum, asyncio found bugs in CPython. Please try with a more recent version of CPython than 3.4.3 :-/

> Program terminated with signal 11, Segmentation fault.
>
> #0 _PyObject_Malloc (ctx=0x0, nbytes=52) at Objects/obmalloc.c:1159

Hum, a crash on a memory allocation is usually a buffer overflow. Please retry with Python 3.6 using PYTHONMALLOC=debug:
https://docs.python.org/dev/using/cmdline.html#envvar-PYTHONMALLOC

Calling gc.collect() regularly may help PYTHONMALLOC=debug to detect buffer overflows earlier.

Victor

From victor.stinner at gmail.com Fri Sep 2 13:47:15 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 2 Sep 2016 19:47:15 +0200
Subject: [Python-Dev] Need help in debugging the python core
In-Reply-To: References: Message-ID:

Oh, I forgot to mention that it would help to get the Python traceback on the crash. Try faulthandler: add faulthandler.enable() at the beginning of your program.
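[Editorial aside: Victor's suggestion amounts to two lines at program start; a minimal sketch:]

```python
import faulthandler

# Enable the fault handler as early as possible. On a fatal signal
# (SIGSEGV, SIGFPE, SIGABRT, ...) it dumps the Python traceback of
# every thread to stderr before the process dies.
faulthandler.enable()

print(faulthandler.is_enabled())  # True
```

It can also be enabled without touching the code, via `python -X faulthandler` or the PYTHONFAULTHANDLER environment variable.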
https://docs.python.org/dev/library/faulthandler.html

Maybe I should write some tools for debugging such bugs one day :-)

Victor

From steve.dower at python.org Fri Sep 2 13:47:41 2016
From: steve.dower at python.org (Steve Dower)
Date: Fri, 2 Sep 2016 10:47:41 -0700
Subject: [Python-Dev] Please reject or postpone PEP 526
In-Reply-To: <20160902164035.GB26300@ando.pearwood.info>
References: <57C982D4.1060405@hotpy.org> <20160902164035.GB26300@ando.pearwood.info>
Message-ID:

"I'm not seeing what distinction you think you are making here. What distinction do you see between:
    x: int = func(value)
and
    x = func(value)  # type: int"

Not sure whether I agree with Mark on this particular point, but the difference I see here is that the first describes what types x may ever contain, while the latter describes what type is being assigned to x right here. So one is a variable annotation while the other is an expression annotation.

Personally, I prefer expression annotations over variable annotations, as there are many other languages I'd prefer if I wanted variables to have fixed types (e.g. C++, where I actually enjoy doing horrible things with implicit casting ;) ).

Variable annotations appear to be inherently restrictive, so either we need serious clarification as to why they are not, or they actually are and we ought to be more sure that it's the direction we want the language to go.

Cheers,
Steve

Top-posted from my Windows Phone

-----Original Message-----
From: "Steven D'Aprano"
Sent: 9/2/2016 9:43
To: "python-dev at python.org"
Subject: Re: [Python-Dev] Please reject or postpone PEP 526

Hi Mark,

I'm going to trim your post drastically, down to the bare essentials, in order to keep this already long post down to a manageable size.

On Fri, Sep 02, 2016 at 02:47:00PM +0100, Mark Shannon wrote:
[...]
> PEP 484 was about the syntax for types, declaring parameter and return > types, and declaring custom types to be generic. > PEP 484 does include a description of type comments, but they are always > annotations on assignment statements and were primarily intended for use > in stub files. I'm not seeing what distinction you think you are making here. What distinction do you see between: x: int = func(value) and x = func(value) #type: int -- Steve _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/steve.dower%40python.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From burkhardameier at gmail.com Fri Sep 2 13:49:47 2016 From: burkhardameier at gmail.com (Burkhard Meier) Date: Fri, 2 Sep 2016 10:49:47 -0700 Subject: [Python-Dev] Need help in debugging the python core In-Reply-To: References: Message-ID: How could I help? Burkhard On Fri, Sep 2, 2016 at 10:47 AM, Victor Stinner wrote: > Oh, I forgot to mention that it would help to get the Python traceback > on the crash. Try faulthandler: add faulthandler.enable() at the > beginning of your program. > https://docs.python.org/dev/library/faulthandler.html > > Maybe I should write once tools to debug such bug :-) > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > burkhardameier%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From levkivskyi at gmail.com Fri Sep 2 13:51:17 2016 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Fri, 2 Sep 2016 19:51:17 +0200 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <57C982D4.1060405@hotpy.org> Message-ID: On 2 September 2016 at 17:59, Guido van Rossum wrote: > On Fri, Sep 2, 2016 at 6:47 AM, Mark Shannon wrote: > > Quoting from the PEP: > > ``` > > a: int > > a: str # Static type checker will warn about this. > > ``` > > In other words, it is illegal for a checker to split up the variable, even > > though it is straightforward to do so. > > One of my co-authors has gone too far here. The intent is not to legislate what should happen in this case but to leave it to the checker. In mypy, the equivalent syntax using type comments is currently indeed rejected, but we're considering a change here ( https://github.com/python/mypy/issues/1174). The PEP 526 syntax will not make a difference here. If I remember correctly, I added this example. At that time the intention was to propose to "loosen" the behaviour of type checkers (note it is a warning, not an error like in mypy). But now I agree with Guido that we should be even more liberal. We could left this to type checker to decide what to do (they could even have options like -Werror or -Wignore). -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Fri Sep 2 13:54:06 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Fri, 2 Sep 2016 20:54:06 +0300 Subject: [Python-Dev] PEP 467: last round (?) In-Reply-To: <57C88355.9000302@stoneleaf.us> References: <57C88355.9000302@stoneleaf.us> Message-ID: Some quick comments below, a few more later: On Thu, Sep 1, 2016 at 10:36 PM, Ethan Furman wrote: > One more iteration. PEPs repo not updated yet. Changes are renaming of > methods to be ``fromsize()`` and ``fromord()``, and moving ``memoryview`` to > an Open Questions section. 
> > > PEP: 467 > Title: Minor API improvements for binary sequences > Version: $Revision$ > Last-Modified: $Date$ > Author: Nick Coghlan , Ethan Furman < ethan at stoneleaf.us> > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 2014-03-30 > Python-Version: 3.6 > Post-History: 2014-03-30 2014-08-15 2014-08-16 2016-06-07 2016-09-01 > > > Abstract > ======== > > During the initial development of the Python 3 language specification, the > core ``bytes`` type for arbitrary binary data started as the mutable type > that is now referred to as ``bytearray``. Other aspects of operating in > the binary domain in Python have also evolved over the course of the Python > 3 series. > > This PEP proposes five small adjustments to the APIs of the ``bytes`` and > ``bytearray`` types to make it easier to operate entirely in the binary > domain: > > * Deprecate passing single integer values to ``bytes`` and ``bytearray`` > * Add ``bytes.fromsize`` and ``bytearray.fromsize`` alternative constructors > * Add ``bytes.fromord`` and ``bytearray.fromord`` alternative constructors > * Add ``bytes.getbyte`` and ``bytearray.getbyte`` byte retrieval methods > * Add ``bytes.iterbytes`` and ``bytearray.iterbytes`` alternative iterators I wonder if from_something with an underscore is more consistent (according to a quick search perhaps yes). What about bytes.getchar and iterchars? A 'byte' in python 3 seems to be an integer. (I would still like a .chars property that gives a sequence view with __getitem__ and __len__ so that the getchar and iterchars methods are not needed) chrb seems to be more in line with some bytes versions in for instance os than bchr. Do we really need chrb? Better to introduce from_int or from_ord also in str and recommend that over chr? 
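The point that "a byte is an integer" in Python 3 is easy to demonstrate with the current types, and it is exactly the wart that ``getbyte``/``iterbytes`` (or a ``getchar``/``iterchars`` spelling) would paper over:

```python
data = b"ABC"

assert data[0] == 65                     # indexing yields an int
assert data[0:1] == b"A"                 # slicing yields length-1 bytes
assert list(data) == [65, 66, 67]        # iteration follows indexing
# Today's workaround for iterating over length-1 bytes objects:
assert [data[i:i + 1] for i in range(len(data))] == [b"A", b"B", b"C"]
```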
-- Koos (mobile) > > Proposals > ========= > > Deprecation of current "zero-initialised sequence" behaviour without removal > ---------------------------------------------------------------------------- > > Currently, the ``bytes`` and ``bytearray`` constructors accept an integer > argument and interpret it as meaning to create a zero-initialised sequence > of the given size:: > > >>> bytes(3) > b'\x00\x00\x00' > >>> bytearray(3) > bytearray(b'\x00\x00\x00') > > This PEP proposes to deprecate that behaviour in Python 3.6, but to leave > it in place for at least as long as Python 2.7 is supported, possibly > indefinitely. > > No other changes are proposed to the existing constructors. > > > Addition of explicit "count and byte initialised sequence" constructors > ----------------------------------------------------------------------- > > To replace the deprecated behaviour, this PEP proposes the addition of an > explicit ``fromsize`` alternative constructor as a class method on both > ``bytes`` and ``bytearray`` whose first argument is the count, and whose > second argument is the fill byte to use (defaults to ``\x00``):: > > >>> bytes.fromsize(3) > b'\x00\x00\x00' > >>> bytearray.fromsize(3) > bytearray(b'\x00\x00\x00') > >>> bytes.fromsize(5, b'\x0a') > b'\x0a\x0a\x0a\x0a\x0a' > >>> bytearray.fromsize(5, b'\x0a') > bytearray(b'\x0a\x0a\x0a\x0a\x0a') > > ``fromsize`` will behave just as the current constructors behave when passed > a single > integer, while allowing for non-zero fill values when needed. 
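Note that ``fromsize`` is only a proposal at this point; the behaviour it captures can already be spelled with the current constructors and sequence repetition:

```python
# Current spellings that the proposed fromsize() would replace:
assert bytes(3) == b'\x00\x00\x00'              # to be deprecated
assert bytearray(3) == bytearray(b'\x00\x00\x00')
assert b'\x0a' * 5 == b'\x0a\x0a\x0a\x0a\x0a'   # non-zero fill today
```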
> > > Addition of "bchr" function and explicit "single byte" constructors > ------------------------------------------------------------------- > > As binary counterparts to the text ``chr`` function, this PEP proposes > the addition of a ``bchr`` function and an explicit ``fromord`` alternative > constructor as a class method on both ``bytes`` and ``bytearray``:: > > >>> bchr(ord("A")) > b'A' > >>> bchr(ord(b"A")) > b'A' > >>> bytes.fromord(65) > b'A' > >>> bytearray.fromord(65) > bytearray(b'A') > > These methods will only accept integers in the range 0 to 255 (inclusive):: > > >>> bytes.fromord(512) > Traceback (most recent call last): > File "", line 1, in > ValueError: integer must be in range(0, 256) > > >>> bytes.fromord(1.0) > Traceback (most recent call last): > File "", line 1, in > TypeError: 'float' object cannot be interpreted as an integer > > While this does create some duplication, there are valid reasons for it:: > > * the ``bchr`` builtin is to recreate the ord/chr/unichr trio from Python > 2 under a different naming scheme > * the class method is mainly for the ``bytearray.fromord`` case, with > ``bytes.fromord`` added for consistency > > The documentation of the ``ord`` builtin will be updated to explicitly note > that ``bchr`` is the primary inverse operation for binary data, while > ``chr`` > is the inverse operation for text data, and that ``bytes.fromord`` and > ``bytearray.fromord`` also exist. > > Behaviourally, ``bytes.fromord(x)`` will be equivalent to the current > ``bytes([x])`` (and similarly for ``bytearray``). The new spelling is > expected to be easier to discover and easier to read (especially when used > in conjunction with indexing operations on binary sequence types). > > As a separate method, the new spelling will also work better with higher > order functions like ``map``. 
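``bchr`` and ``fromord`` likewise do not exist yet; their closest current equivalents, including the 0..255 range check the PEP specifies, look like this:

```python
assert bytes([65]) == b'A'                 # what bytes.fromord(65) would give
assert bytearray([65]) == bytearray(b'A')
assert chr(65) == 'A'                      # the existing text-side inverse

# The range check already applies to the list form today:
try:
    bytes([512])
except ValueError:
    pass
else:
    raise AssertionError("expected ValueError")
```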
> > > Addition of "getbyte" method to retrieve a single byte > ------------------------------------------------------ > > This PEP proposes that ``bytes`` and ``bytearray`` gain the method > ``getbyte`` > which will always return ``bytes``:: > > >>> b'abc'.getbyte(0) > b'a' > > If an index is asked for that doesn't exist, ``IndexError`` is raised:: > > >>> b'abc'.getbyte(9) > Traceback (most recent call last): > File "", line 1, in > IndexError: index out of range > > > Addition of optimised iterator methods that produce ``bytes`` objects > --------------------------------------------------------------------- > > This PEP proposes that ``bytes`` and ``bytearray``gain an optimised > ``iterbytes`` method that produces length 1 ``bytes`` objects rather than > integers:: > > for x in data.iterbytes(): > # x is a length 1 ``bytes`` object, rather than an integer > > For example:: > > >>> tuple(b"ABC".iterbytes()) > (b'A', b'B', b'C') > > > Design discussion > ================= > > Why not rely on sequence repetition to create zero-initialised sequences? > ------------------------------------------------------------------------- > > Zero-initialised sequences can be created via sequence repetition:: > > >>> b'\x00' * 3 > b'\x00\x00\x00' > >>> bytearray(b'\x00') * 3 > bytearray(b'\x00\x00\x00') > > However, this was also the case when the ``bytearray`` type was originally > designed, and the decision was made to add explicit support for it in the > type constructor. The immutable ``bytes`` type then inherited that feature > when it was introduced in PEP 3137. > > This PEP isn't revisiting that original design decision, just changing the > spelling as users sometimes find the current behaviour of the binary > sequence > constructors surprising. In particular, there's a reasonable case to be made > that ``bytes(x)`` (where ``x`` is an integer) should behave like the > ``bytes.fromint(x)`` proposal in this PEP. 
Providing both behaviours as > separate > class methods avoids that ambiguity. > > > Open Questions > ============== > > Do we add ``iterbytes`` to ``memoryview``, or modify > ``memoryview.cast()`` to accept ``'s'`` as a single-byte interpretation? Or > do we ignore memory for now and add it later? > > > References > ========== > > .. [1] Initial March 2014 discussion thread on python-ideas > (https://mail.python.org/pipermail/python-ideas/2014-March/027295.html) > .. [2] Guido's initial feedback in that thread > (https://mail.python.org/pipermail/python-ideas/2014-March/027376.html) > .. [3] Issue proposing moving zero-initialised sequences to a dedicated API > (http://bugs.python.org/issue20895) > .. [4] Issue proposing to use calloc() for zero-initialised binary sequences > (http://bugs.python.org/issue21644) > .. [5] August 2014 discussion thread on python-dev > (https://mail.python.org/pipermail/python-ideas/2014-March/027295.html) > .. [6] June 2016 discussion thread on python-dev > (https://mail.python.org/pipermail/python-dev/2016-June/144875.html) > > > Copyright > ========= > > This document has been placed in the public domain. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/k7hoven%40gmail.com -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... 
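On the ``memoryview`` open question: ``memoryview.cast()`` already accepts the struct format ``'c'``, which makes the elements length-1 ``bytes`` objects, so an ``iterbytes``-like view is arguably available there today:

```python
view = memoryview(b"ABC")
assert list(view) == [65, 66, 67]                  # default 'B' format: ints
assert list(view.cast('c')) == [b'A', b'B', b'C']  # 'c' format: 1-byte bytes
```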
URL: From levkivskyi at gmail.com Fri Sep 2 13:56:06 2016 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Fri, 2 Sep 2016 19:56:06 +0200 Subject: [Python-Dev] PEP 526 ready for review: Syntax for Variable and Attribute Annotations In-Reply-To: References: <20160901162117.GX26300@ando.pearwood.info> Message-ID: On 2 September 2016 at 18:17, Guido van Rossum wrote: > On Fri, Sep 2, 2016 at 6:43 AM, Ivan Levkivskyi wrote: > > On 2 September 2016 at 04:38, Nick Coghlan wrote: > >> However, a standalone Ellipsis doesn't currently have a meaning as a > >> type annotation (it's only meaningful when subscripting Tuple and > >> Callable), so a spelling like this might work: > >> > >> NAME: ... > >> > >> That spelling could then also be used in function definitions to say > >> "infer the return type from the return statements rather than assuming > >> Any" > > > > Interesting idea. > > This is somehow similar to one of the existing use of Ellipsis: in numpy it > > infers how many dimensions needs to have the full slice, it is like saying > > "You know what I mean". So I am +1 on this solution. > > I like it too, but I think it's better to leave any firm promises > about the *semantics* of variable annotations out of the PEP. I just > spoke to someone who noted that the PEP is likely to evoke an outsize > emotional response. (Similar to what happened with PEP 484.) > > Pinning down the semantics is not why I am pushing for PEP 526 -- I > only want to pin down the *syntax* to the point where we won't have to > change it again for many versions, since it's much harder to change > the syntax than it is to change the behavior of type checkers (which > have fewer backwards compatibility constraints, a faster release > cycle, and narrower user bases than core Python itself). This is a good point. I totally agree. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... 
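For what it's worth, the ``NAME: ...`` spelling needs no new syntax: ``...`` is an ordinary expression, so under the PEP 526 implementation it would simply be recorded as the annotation. A sketch (assumes Python 3.6+):

```python
class Sketch:
    x: ...        # "there is an attribute x, no type hint for now"
    y: int = 0

assert Sketch.__annotations__['x'] is Ellipsis
assert Sketch.__annotations__['y'] is int
assert not hasattr(Sketch, 'x')   # the bare annotation assigns nothing
```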
URL: From steve at pearwood.info Fri Sep 2 14:04:07 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 3 Sep 2016 04:04:07 +1000 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <57C982D4.1060405@hotpy.org> Message-ID: <20160902180407.GC26300@ando.pearwood.info> On Fri, Sep 02, 2016 at 08:10:24PM +0300, Koos Zevenhoven wrote: > A good checker should be able to infer that x is a union type at the > point that it's passed to spam, even without the type annotation. For > example: > > def eggs(cond:bool): > if cond: > x = 1 > else: > x = 1.5 > spam(x) # a good type checker infers that x is of type Union[int, float] Oh I really hope not. I wouldn't call that a *good* type checker. I would call that a type checker that is overly permissive. Maybe you think that it's okay because ints and floats are somewhat compatible. But suppose I wrote: if cond: x = HTTPServer(*args) else: x = 1.5 Would you want the checker to infer Union[HTTPServer, float]? I wouldn't. I would want the checker to complain that the two branches of the `if` result in different types for x. If I really mean it, then I can give a type-hint. In any case, this PEP isn't about specifying when to declare variable types, it is for picking syntax. Do you have a better idea for variable syntax? -- Steve From steve at pearwood.info Fri Sep 2 14:19:13 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 3 Sep 2016 04:19:13 +1000 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <57C982D4.1060405@hotpy.org> <20160902164035.GB26300@ando.pearwood.info> Message-ID: <20160902181912.GD26300@ando.pearwood.info> On Fri, Sep 02, 2016 at 10:47:41AM -0700, Steve Dower wrote: > "I'm not seeing what distinction you think you are making here. 
What > distinction do you see between: > > x: int = func(value) > > and > > x = func(value) #type: int" > > Not sure whether I agree with Mark on this particular point, but the > difference I see here is that the first describes what types x may > ever contain, while the latter describes what type of being assigned > to x right here. So one is a variable annotation while the other is an > expression annotation. Ultimately Python is a dynamically typed language, and that's not changing. This means types are fundamentally associated with *values*, not *variables* (names). But in practice, you can go a long way by pretending that it is the variable that carries the type. That's the point of the static type checker: if you see that x holds an int here, then assume (unless told differently) that x should always be an int. Because in practice, most exceptions to that are due to bugs, or at least sloppy code. Of course, it is up to the type checker to decide how strict it wants to be, whether to treat violations as a warning or a error, whether to offer the user a flag to set the behaviour, etc. None of this is relevant to the PEP. The PEP only specifies the syntax, leaving enforcement or non-enforcement to the checker, and it says: PEP 484 introduced type hints, a.k.a. type annotations. While its main focus was function annotations, it also introduced the notion of type comments to annotate VARIABLES [emphasis added] not expressions. And: This PEP aims at adding syntax to Python for annotating the types of variables and attributes, instead of expressing them through comments which to me obviously implies that the two ways (type comment, and variable type hint) are intended to be absolutely identical in semantics, at least as far as the type-checker is concerned. (They will have different semantics at runtime: the comment is just a comment, while the type hint will set an __annotations__ mapping.) 
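The runtime half of that parenthetical is easy to pin down: the annotation leaves a trace in ``__annotations__`` while the comment does not. A sketch (Python 3.6+):

```python
class Config:
    x: int = 1           # PEP 526 annotation: visible at runtime
    y = 2  # type: int   # PEP 484 type comment: just a comment

assert Config.__annotations__ == {'x': int}
assert 'y' not in Config.__annotations__
assert Config.x == 1 and Config.y == 2
```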
But perhaps the PEP needs to make it explicit that they are to be treated exactly the same. -- Steve From stefan at bytereef.org Fri Sep 2 14:37:30 2016 From: stefan at bytereef.org (Stefan Krah) Date: Fri, 2 Sep 2016 20:37:30 +0200 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: <20160902181912.GD26300@ando.pearwood.info> References: <57C982D4.1060405@hotpy.org> <20160902164035.GB26300@ando.pearwood.info> <20160902181912.GD26300@ando.pearwood.info> Message-ID: <20160902183730.GA10941@bytereef.org> [Replying to Steve Dower] On Sat, Sep 03, 2016 at 04:19:13AM +1000, Steven D'Aprano wrote: > On Fri, Sep 02, 2016 at 10:47:41AM -0700, Steve Dower wrote: > > "I'm not seeing what distinction you think you are making here. What > > distinction do you see between: > > > > x: int = func(value) > > > > and > > > > x = func(value) #type: int" > > > > Not sure whether I agree with Mark on this particular point, but the > > difference I see here is that the first describes what types x may > > ever contain, while the latter describes what type of being assigned > > to x right here. So one is a variable annotation while the other is an > > expression annotation. I see it differently, but I'm quite used to OCaml: # let f () = let x : int = 10 in let x : float = 320.0 in x;; Warning 26: unused variable x. val f : unit -> float = # f();; - : float = 320. Like in Python, in OCaml variables can be rebound and indeed have different types with different explicit type constraints. Expressions can also be annotated, but require parentheses (silly example): # let x = (10 * 20 : int);; val x : int = 200 So I'm quite happy with the proposed syntax in the PEP, perhaps the parenthesized expression annotations could also be added. But these are only very rarely needed. 
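The Python counterpart of the OCaml rebinding above is accepted by the proposed syntax; at runtime the later binding simply wins, and per PEP 526 local-variable annotations are not evaluated or stored anywhere. A sketch (Python 3.6+):

```python
def f() -> float:
    x: int = 10
    x: float = 320.0   # rebinding with a new annotation; a checker may warn
    return x

assert f() == 320.0
assert f.__annotations__ == {'return': float}  # locals leave no trace
```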
Stefan Krah From Amresh.Sajjanshetty at netapp.com Fri Sep 2 14:36:34 2016 From: Amresh.Sajjanshetty at netapp.com (Sajjanshetty, Amresh) Date: Fri, 2 Sep 2016 18:36:34 +0000 Subject: [Python-Dev] Need help in debugging the python core In-Reply-To: References: Message-ID: <00BB959A-08AF-4D81-850B-2DFFACC63D48@netapp.com> Surprisingly I?m not seeing the core dump/crash after adding ?faulthandler.enable()? . Would it catch the signal and ignore by default? Thanks and Regards, Amresh From: Burkhard Meier Date: Friday, September 2, 2016 at 11:19 PM To: Victor Stinner Cc: Amresh Sajjanshetty , "python-dev at python.org" Subject: Re: [Python-Dev] Need help in debugging the python core How could I help? Burkhard On Fri, Sep 2, 2016 at 10:47 AM, Victor Stinner > wrote: Oh, I forgot to mention that it would help to get the Python traceback on the crash. Try faulthandler: add faulthandler.enable() at the beginning of your program. https://docs.python.org/dev/library/faulthandler.html Maybe I should write once tools to debug such bug :-) Victor _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/burkhardameier%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From burkhardameier at gmail.com Fri Sep 2 14:42:39 2016 From: burkhardameier at gmail.com (Burkhard Meier) Date: Fri, 2 Sep 2016 11:42:39 -0700 Subject: [Python-Dev] Need help in debugging the python core In-Reply-To: References: Message-ID: You are using bash? On Sep 2, 2016 8:56 AM, "Sajjanshetty, Amresh" < Amresh.Sajjanshetty at netapp.com> wrote: > Dear All, > > > > I?m using asyncio and paramiko to multiplex different channels into a > single SSH connection. Things were working fine till recently but suddenly > started seeing that python getting crashed whenever I tried to write to the > channel. 
I have very limited knowledge on how python interpreter works, so > I?m finding difficulty in understanding the stack trace. Can you please > help in understanding the below backtarce. > > > > bash-4.2$ gdb /usr/software/bin/python3.4.3 core.60015 > > Traceback (most recent call last): > > File "", line 70, in > > File "", line 67, in GdbSetPythonDirectory > > File "/usr/software/share/gdb/python/gdb/__init__.py", line 19, in > > > import _gdb > > ImportError: No module named _gdb > > GNU gdb (GDB) 7.5 > > Copyright (C) 2012 Free Software Foundation, Inc. > > License GPLv3+: GNU GPL version 3 or later html> > > This is free software: you are free to change and redistribute it. > > There is NO WARRANTY, to the extent permitted by law. Type "show copying" > > and "show warranty" for details. > > This GDB was configured as "x86_64-unknown-linux-gnu". > > For bug reporting instructions, please see: > > ... > > Reading symbols from /usr/software/bin/python3.4.3...done. > > > > warning: core file may not match specified executable file. > > [New LWP 60015] > > [New LWP 60018] > > [New LWP 60019] > > [New LWP 60020] > > [New LWP 60021] > > [New LWP 60022] > > [New LWP 60023] > > [New LWP 60024] > > [Thread debugging using libthread_db enabled] > > Using host libthread_db library "/usr/software/lib/libthread_db.so.1". > > Core was generated by `/usr/software/bin/python3.4.3 > /x/eng/bbrtp/users/amresh/sshproxy_3896926_160824'. > > Program terminated with signal 11, Segmentation fault. > > #0 _PyObject_Malloc (ctx=0x0, nbytes=52) at Objects/obmalloc.c:1159 > > 1159 Objects/obmalloc.c: No such file or directory. 
> > (gdb) bt > > #0 _PyObject_Malloc (ctx=0x0, nbytes=52) at Objects/obmalloc.c:1159 > > #1 0x00007ff2e511474a in PyUnicode_New (maxchar=, size=3) > at Objects/unicodeobject.c:1093 > > #2 PyUnicode_New (size=3, maxchar=) at > Objects/unicodeobject.c:1033 > > #3 0x00007ff2e5139da2 in _PyUnicodeWriter_PrepareInternal > (writer=writer at entry=0x7fff3d5c8640, length=, > maxchar=, maxchar at entry=127) at > Objects/unicodeobject.c:13327 > > #4 0x00007ff2e513f38b in PyUnicode_DecodeUTF8Stateful (s=s at entry=0x7ff2e3572f78 > "tcp\245reuse\001\253socket_type\244pull\251transport\246zeromq", > size=size at entry=3, > > errors=errors at entry=0x7ff2dee5dd70 "strict", consumed=consumed at entry=0x0) > at Objects/unicodeobject.c:4757 > > #5 0x00007ff2e5140690 in PyUnicode_Decode (s=0x7ff2e3572f78 > "tcp\245reuse\001\253socket_type\244pull\251transport\246zeromq", size=3, > encoding=0x7ff2dee5df28 "utf-8", errors=0x7ff2dee5dd70 "strict") > > at Objects/unicodeobject.c:3012 > > #6 0x00007ff2de49bfdf in unpack_callback_raw (o=, l=3, > p=0x7ff2e3572f78 "tcp\245reuse\001\253socket_type\244pull\251transport\246zeromq", > u=0x7fff3d5c8840, b=) > > at msgpack/unpack.h:229 > > #7 unpack_execute (ctx=ctx at entry=0x7fff3d5c8840, > data=0x7ff2e3572ec0 "\205\245_auth\300\245_call\246expect\243_i", > , len=, off=off at entry > =0x7fff3d5c8820) > > at msgpack/unpack_template.h:312 > > #8 0x00007ff2de49fe3d in __pyx_pf_7msgpack_9_unpacker_2unpackb > (__pyx_v_packed=__pyx_v_packed at entry=0x7ff2e3572ea0, > __pyx_v_object_hook=__pyx_v_object_hook at entry=0x7ff2e54934b0 > <_Py_NoneStruct>, > > __pyx_v_list_hook=__pyx_v_list_hook at entry=0x7ff2e54934b0 > <_Py_NoneStruct>, __pyx_v_use_list=1, __pyx_v_encoding=0x7ff2dee5df08, > __pyx_v_unicode_errors=0x7ff2dee5dd50, > > __pyx_v_object_pairs_hook=0x7ff2e54934b0 <_Py_NoneStruct>, > __pyx_v_ext_hook=0x13db2d8, __pyx_v_max_str_len=__pyx_v_max_str_len at entry= > 2147483647, > > __pyx_v_max_bin_len=__pyx_v_max_bin_len at entry=2147483647, > 
__pyx_v_max_array_len=2147483647, __pyx_v_max_map_len=2147483647, > __pyx_v_max_ext_len=__pyx_v_max_ext_len at entry=2147483647, > > __pyx_self=) at msgpack/_unpacker.pyx:139 > > #9 0x00007ff2de4a1395 in __pyx_pw_7msgpack_9_unpacker_3unpackb > (__pyx_self=, __pyx_args=, > __pyx_kwds=) at msgpack/_unpacker.pyx:102 > > #10 0x00007ff2e5174ed3 in do_call (nk=, na=, > pp_stack=0x7fff3d5d2b80, func=0x7ff2df20ddc8) at Python/ceval.c:4463 > > #11 call_function (oparg=, pp_stack=0x7fff3d5d2b80) at > Python/ceval.c:4264 > > #12 PyEval_EvalFrameEx (f=f at entry=0x7ff2def02208, > throwflag=throwflag at entry=0) at Python/ceval.c:2838 > > #13 0x00007ff2e5175f45 in PyEval_EvalCodeEx (_co=, > globals=, locals=locals at entry=0x0, args=, > argcount=argcount at entry=1, kws=0x7ff2deefec30, kwcount=0, defs=0x0, > > defcount=0, kwdefs=0x0, closure=0x0) at Python/ceval.c:3588 > > #14 0x00007ff2e51734da in fast_function (nk=, na=1, > n=, pp_stack=0x7fff3d5d2e10, func=0x7ff2dee9b7b8) at > Python/ceval.c:4344 > > #15 call_function (oparg=, pp_stack=0x7fff3d5d2e10) at > Python/ceval.c:4262 > > #16 PyEval_EvalFrameEx (f=f at entry=0x7ff2deefea98, > throwflag=throwflag at entry=0) at Python/ceval.c:2838 > > #17 0x00007ff2e5175f45 in PyEval_EvalCodeEx (_co=, > globals=, locals=locals at entry=0x0, args=, > argcount=argcount at entry=1, kws=0x14566c8, kwcount=0, > > defs=0x7ff2deeaedb8, defcount=1, kwdefs=0x0, closure=0x0) at > Python/ceval.c:3588 > > #18 0x00007ff2e51734da in fast_function (nk=, na=1, > n=, pp_stack=0x7fff3d5d30a0, func=0x7ff2dee2dd90) at > Python/ceval.c:4344 > > #19 call_function (oparg=, pp_stack=0x7fff3d5d30a0) at > Python/ceval.c:4262 > > #20 PyEval_EvalFrameEx (f=f at entry=0x1456478, throwflag=throwflag at entry=0) > at Python/ceval.c:2838 > > #21 0x00007ff2e5175f45 in PyEval_EvalCodeEx (_co=, > globals=, locals=locals at entry=0x0, args=args at entry=0x7ff2d87364c0, > argcount=1, kws=kws at entry=0x7ff2dee1de40, > > kwcount=kwcount at entry=3, defs=defs at 
entry=0x7ff2e0820fd8, > defcount=defcount at entry=3, kwdefs=0x0, closure=0x0) at Python/ceval.c:3588 > > #22 0x00007ff2e50d3320 in function_call (func=0x7ff2df1e9a60, > arg=0x7ff2d87364a8, kw=0x7ff2d8738248) at Objects/funcobject.c:632 > > #23 0x00007ff2e50a76ca in PyObject_Call (func=func at entry=0x7ff2df1e9a60, > arg=arg at entry=0x7ff2d87364a8, kw=kw at entry=0x7ff2d8738248) at > Objects/abstract.c:2040 > > #24 0x00007ff2e50be55d in method_call (func=0x7ff2df1e9a60, > arg=0x7ff2d87364a8, kw=0x7ff2d8738248) at Objects/classobject.c:347 > > #25 0x00007ff2e50a76ca in PyObject_Call (func=0x7ff2dee30e88, arg=arg at entry=0x7ff2e433d048, > kw=kw at entry=0x7ff2d8738248) at Objects/abstract.c:2040 > > #26 0x00007ff2e51d9301 in partial_call (pto=0x7ff2deee1db8, > args=, kw=0x0) at ./Modules/_functoolsmodule.c:127 > > #27 0x00007ff2e50a76ca in PyObject_Call (func=func at entry=0x7ff2deee1db8, > arg=arg at entry=0x7ff2e433d048, kw=kw at entry=0x0) at Objects/abstract.c:2040 > > #28 0x00007ff2e51700a0 in ext_do_call (nk=-466366392, na=0, > flags=, pp_stack=0x7fff3d5d3540, func=0x7ff2deee1db8) at > Python/ceval.c:4561 > > #29 PyEval_EvalFrameEx (f=, throwflag=throwflag at entry=0) > at Python/ceval.c:2878 > > #30 0x00007ff2e51756a9 in fast_function (nk=, na=1, n=1, > pp_stack=0x7fff3d5d3710, func=0x7ff2e1540730) at Python/ceval.c:4334 > > #31 call_function (oparg=, pp_stack=0x7fff3d5d3710) at > Python/ceval.c:4262 > > #32 PyEval_EvalFrameEx (f=, throwflag=throwflag at entry=0) > at Python/ceval.c:2838 > > #33 0x00007ff2e51756a9 in fast_function (nk=, na=1, n=1, > pp_stack=0x7fff3d5d38f0, func=0x7ff2e12f2f28) at Python/ceval.c:4334 > > #34 call_function (oparg=, pp_stack=0x7fff3d5d38f0) at > Python/ceval.c:4262 > > #35 PyEval_EvalFrameEx (f=, throwflag=throwflag at entry=0) > at Python/ceval.c:2838 > > #36 0x00007ff2e51756a9 in fast_function (nk=, na=1, n=1, > pp_stack=0x7fff3d5d3ad0, func=0x7ff2e12f0c80) at Python/ceval.c:4334 > > #37 call_function (oparg=, 
pp_stack=0x7fff3d5d3ad0) at > Python/ceval.c:4262 > > #38 PyEval_EvalFrameEx (f=, throwflag=throwflag at entry=0) > at Python/ceval.c:2838 > > #39 0x00007ff2e51756a9 in fast_function (nk=, na=1, n=1, > pp_stack=0x7fff3d5d3cb0, func=0x7ff2df1e9ae8) at Python/ceval.c:4334 > > #40 call_function (oparg=, pp_stack=0x7fff3d5d3cb0) at > Python/ceval.c:4262 > > #41 PyEval_EvalFrameEx (f=f at entry=0xf796b8, throwflag=throwflag at entry=0) > at Python/ceval.c:2838 > > #42 0x00007ff2e5175f45 in PyEval_EvalCodeEx (_co=_co at entry=0x7ff2e400c660, > globals=globals at entry=0x7ff2e42df488, locals=locals at entry=0x7ff2e42df488, > args=args at entry=0x0, argcount=argcount at entry=0, > > kws=kws at entry=0x0, kwcount=kwcount at entry=0, defs=defs at entry=0x0, > defcount=defcount at entry=0, kwdefs=kwdefs at entry=0x0, closure=closure at entry=0x0) > at Python/ceval.c:3588 > > #43 0x00007ff2e517601b in PyEval_EvalCode (co=co at entry=0x7ff2e400c660, > globals=globals at entry=0x7ff2e42df488, locals=locals at entry=0x7ff2e42df488) > at Python/ceval.c:775 > > #44 0x00007ff2e519c09e in run_mod (arena=0xfc2990, flags=0x7fff3d5d3f50, > locals=0x7ff2e42df488, globals=0x7ff2e42df488, filename=0x7ff2e41d64b0, > mod=0x103b4f0) at Python/pythonrun.c:2180 > > #45 PyRun_FileExFlags (fp=fp at entry=0xf77b60, filename_str=filename_str at entry=0x7ff2e41d81d0 > "/x/eng/bbrtp/users/amresh/sshproxy_3896926_1608240818/ > test/nate/lib/NATE/Service/SSHProxy.py", > > start=start at entry=257, globals=globals at entry=0x7ff2e42df488, > locals=locals at entry=0x7ff2e42df488, closeit=closeit at entry=1, > flags=flags at entry=0x7fff3d5d3f50) at Python/pythonrun.c:2133 > > #46 0x00007ff2e519ced5 in PyRun_SimpleFileExFlags (fp=fp at entry=0xf77b60, > filename=, closeit=closeit at entry=1, flags=flags at entry=0x7fff3d5d3f50) > at Python/pythonrun.c:1606 > > ---Type to continue, or q to quit--- > > #47 0x00007ff2e519df09 in PyRun_AnyFileExFlags (fp=fp at entry=0xf77b60, > filename=, closeit=closeit at 
entry=1, flags=flags at entry=0x7fff3d5d3f50) > at Python/pythonrun.c:1292 > > #48 0x00007ff2e51b6af5 in run_file (p_cf=0x7fff3d5d3f50, filename=0xf0e7f0 > L"/x/eng/bbrtp/users/amresh/sshproxy_3896926_1608240818/ > test/nate/lib/NATE/Service/SSHProxy.py", fp=0xf77b60) at > Modules/main.c:319 > > #49 Py_Main (argc=argc at entry=6, argv=argv at entry=0xee6010) at > Modules/main.c:751 > > #50 0x0000000000400aa6 in main (argc=6, argv=) at > ./Modules/python.c:69 > > (gdb) > > > > Thanks and Regards, > > Amresh > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > burkhardameier%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Amresh.Sajjanshetty at netapp.com Fri Sep 2 14:48:01 2016 From: Amresh.Sajjanshetty at netapp.com (Sajjanshetty, Amresh) Date: Fri, 2 Sep 2016 18:48:01 +0000 Subject: [Python-Dev] Need help in debugging the python core In-Reply-To: References: Message-ID: <01A70524-4E55-4486-AA7B-A4A8AA8A8344@netapp.com> Yes Thanks and Regards, Amresh From: Burkhard Meier Date: Saturday, September 3, 2016 at 12:12 AM To: Amresh Sajjanshetty Cc: "python-dev at python.org" Subject: Re: [Python-Dev] Need help in debugging the python core You are using bash? On Sep 2, 2016 8:56 AM, "Sajjanshetty, Amresh" > wrote: Dear All, I?m using asyncio and paramiko to multiplex different channels into a single SSH connection. Things were working fine till recently but suddenly started seeing that python getting crashed whenever I tried to write to the channel. I have very limited knowledge on how python interpreter works, so I?m finding difficulty in understanding the stack trace. Can you please help in understanding the below backtarce. 
#23 0x00007ff2e50a76ca in PyObject_Call (func=func at entry=0x7ff2df1e9a60, arg=arg at entry=0x7ff2d87364a8, kw=kw at entry=0x7ff2d8738248) at Objects/abstract.c:2040 #24 0x00007ff2e50be55d in method_call (func=0x7ff2df1e9a60, arg=0x7ff2d87364a8, kw=0x7ff2d8738248) at Objects/classobject.c:347 #25 0x00007ff2e50a76ca in PyObject_Call (func=0x7ff2dee30e88, arg=arg at entry=0x7ff2e433d048, kw=kw at entry=0x7ff2d8738248) at Objects/abstract.c:2040 #26 0x00007ff2e51d9301 in partial_call (pto=0x7ff2deee1db8, args=, kw=0x0) at ./Modules/_functoolsmodule.c:127 #27 0x00007ff2e50a76ca in PyObject_Call (func=func at entry=0x7ff2deee1db8, arg=arg at entry=0x7ff2e433d048, kw=kw at entry=0x0) at Objects/abstract.c:2040 #28 0x00007ff2e51700a0 in ext_do_call (nk=-466366392, na=0, flags=, pp_stack=0x7fff3d5d3540, func=0x7ff2deee1db8) at Python/ceval.c:4561 #29 PyEval_EvalFrameEx (f=, throwflag=throwflag at entry=0) at Python/ceval.c:2878 #30 0x00007ff2e51756a9 in fast_function (nk=, na=1, n=1, pp_stack=0x7fff3d5d3710, func=0x7ff2e1540730) at Python/ceval.c:4334 #31 call_function (oparg=, pp_stack=0x7fff3d5d3710) at Python/ceval.c:4262 #32 PyEval_EvalFrameEx (f=, throwflag=throwflag at entry=0) at Python/ceval.c:2838 #33 0x00007ff2e51756a9 in fast_function (nk=, na=1, n=1, pp_stack=0x7fff3d5d38f0, func=0x7ff2e12f2f28) at Python/ceval.c:4334 #34 call_function (oparg=, pp_stack=0x7fff3d5d38f0) at Python/ceval.c:4262 #35 PyEval_EvalFrameEx (f=, throwflag=throwflag at entry=0) at Python/ceval.c:2838 #36 0x00007ff2e51756a9 in fast_function (nk=, na=1, n=1, pp_stack=0x7fff3d5d3ad0, func=0x7ff2e12f0c80) at Python/ceval.c:4334 #37 call_function (oparg=, pp_stack=0x7fff3d5d3ad0) at Python/ceval.c:4262 #38 PyEval_EvalFrameEx (f=, throwflag=throwflag at entry=0) at Python/ceval.c:2838 #39 0x00007ff2e51756a9 in fast_function (nk=, na=1, n=1, pp_stack=0x7fff3d5d3cb0, func=0x7ff2df1e9ae8) at Python/ceval.c:4334 #40 call_function (oparg=, pp_stack=0x7fff3d5d3cb0) at Python/ceval.c:4262 #41 
PyEval_EvalFrameEx (f=f at entry=0xf796b8, throwflag=throwflag at entry=0) at Python/ceval.c:2838 #42 0x00007ff2e5175f45 in PyEval_EvalCodeEx (_co=_co at entry=0x7ff2e400c660, globals=globals at entry=0x7ff2e42df488, locals=locals at entry=0x7ff2e42df488, args=args at entry=0x0, argcount=argcount at entry=0, kws=kws at entry=0x0, kwcount=kwcount at entry=0, defs=defs at entry=0x0, defcount=defcount at entry=0, kwdefs=kwdefs at entry=0x0, closure=closure at entry=0x0) at Python/ceval.c:3588 #43 0x00007ff2e517601b in PyEval_EvalCode (co=co at entry=0x7ff2e400c660, globals=globals at entry=0x7ff2e42df488, locals=locals at entry=0x7ff2e42df488) at Python/ceval.c:775 #44 0x00007ff2e519c09e in run_mod (arena=0xfc2990, flags=0x7fff3d5d3f50, locals=0x7ff2e42df488, globals=0x7ff2e42df488, filename=0x7ff2e41d64b0, mod=0x103b4f0) at Python/pythonrun.c:2180 #45 PyRun_FileExFlags (fp=fp at entry=0xf77b60, filename_str=filename_str at entry=0x7ff2e41d81d0 "/x/eng/bbrtp/users/amresh/sshproxy_3896926_1608240818/test/nate/lib/NATE/Service/SSHProxy.py", start=start at entry=257, globals=globals at entry=0x7ff2e42df488, locals=locals at entry=0x7ff2e42df488, closeit=closeit at entry=1, flags=flags at entry=0x7fff3d5d3f50) at Python/pythonrun.c:2133 #46 0x00007ff2e519ced5 in PyRun_SimpleFileExFlags (fp=fp at entry=0xf77b60, filename=, closeit=closeit at entry=1, flags=flags at entry=0x7fff3d5d3f50) at Python/pythonrun.c:1606 ---Type to continue, or q to quit--- #47 0x00007ff2e519df09 in PyRun_AnyFileExFlags (fp=fp at entry=0xf77b60, filename=, closeit=closeit at entry=1, flags=flags at entry=0x7fff3d5d3f50) at Python/pythonrun.c:1292 #48 0x00007ff2e51b6af5 in run_file (p_cf=0x7fff3d5d3f50, filename=0xf0e7f0 L"/x/eng/bbrtp/users/amresh/sshproxy_3896926_1608240818/test/nate/lib/NATE/Service/SSHProxy.py", fp=0xf77b60) at Modules/main.c:319 #49 Py_Main (argc=argc at entry=6, argv=argv at entry=0xee6010) at Modules/main.c:751 #50 0x0000000000400aa6 in main (argc=6, argv=) at 
./Modules/python.c:69

(gdb)

Thanks and Regards,
Amresh

_______________________________________________
Python-Dev mailing list
Python-Dev at python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/burkhardameier%40gmail.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dinov at microsoft.com Fri Sep 2 14:55:12 2016
From: dinov at microsoft.com (Dino Viehland)
Date: Fri, 2 Sep 2016 18:55:12 +0000
Subject: [Python-Dev] Update on PEP 523 and adding a co_extra field to code objects
In-Reply-To: References: <20160830180729.63540a92@fsol> <20160830193051.532e8d9a@fsol> <20160830194813.0cbb261b@fsol> <20160830202037.28943cec@fsol>
Message-ID:

So I ran the tests with both a list and a tuple. They were about 5%
slower on a handful of benchmarks, and then the difference between the
tuple and list again had a few benchmarks that were around 5% slower.
There was one benchmark where the tuple was significantly slower for
some reason (mako_v2), coming in at 1.4x slower. It seems to me we
should go with the tuple just because the common case will be having a
single object and it'll be even less common to have these changing
very frequently.

-----Original Message-----
From: Python-Dev [mailto:python-dev-bounces+dinov=microsoft.com at python.org] On Behalf Of Chris Angelico
Sent: Tuesday, August 30, 2016 2:11 PM
To: python-dev
Subject: Re: [Python-Dev] Update on PEP 523 and adding a co_extra field to code objects

On Wed, Aug 31, 2016 at 4:55 AM, Serhiy Storchaka wrote:
> On 30.08.16 21:20, Antoine Pitrou wrote:
>>
>> On Tue, 30 Aug 2016 18:12:01 +0000
>> Brett Cannon wrote:
>>>>
>>>> Why not make it always a list? List objects are reasonably cheap
>>>> in memory and access time... (unlike dicts)
>>>
>>>
>>> Because I would prefer to avoid any form of unnecessary performance
>>> overhead for the common case.
>> >> But the performance overhead of iterating over a 1-element list is
>> small enough (it's just an array access after a pointer dereference)
>> that it may not be larger than the overhead of the multiple tests and
>> conditional branches your example shows.
>
> Iterating over a tuple is even faster. It needs one pointer
> dereference less.
>
> And for memory efficiency we can use just a raw array of pointers.

Didn't all this kind of thing come up when function annotations were
discussed? Insane schemes like dictionaries with UUID keys and so on.
The decision then was YAGNI. The decision now, IMO, should be the same.
Keep things simple.

ChrisA
_______________________________________________
Python-Dev mailing list
Python-Dev at python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/dinov%40microsoft.com

From guido at python.org Fri Sep 2 15:33:43 2016
From: guido at python.org (Guido van Rossum)
Date: Fri, 2 Sep 2016 12:33:43 -0700
Subject: [Python-Dev] Please reject or postpone PEP 526
In-Reply-To: References: <57C982D4.1060405@hotpy.org> <20160902164035.GB26300@ando.pearwood.info>
Message-ID:

On Fri, Sep 2, 2016 at 10:47 AM, Steve Dower wrote: > "I'm not seeing what distinction you think you are making here.
What
> distinction do you see between:
>
> x: int = func(value)
>
> and
>
> x = func(value) # type: int"
>
> Not sure whether I agree with Mark on this particular point, but the
> difference I see here is that the first describes what types x may ever
> contain, while the latter describes what type is being assigned to x right
> here. So one is a variable annotation while the other is an expression
> annotation.

But that's not what type comments mean! They don't annotate the
expression. They annotate the variable. The text in PEP 484 that
introduces them is clear about this (it never mentions expressions,
only variables).

> Personally, I prefer expression annotations over variable annotations, as
> there are many other languages I like where variables have fixed types (e.g.
> C++, where I actually enjoy doing horrible things with implicit casting ;)
> ).
>
> Variable annotations appear to be inherently restrictive, so either we need
> serious clarification as to why they are not, or they actually are and we
> ought to be more sure that it's the direction we want the language to go.

At runtime the variable annotations are ignored. And a type checker
will only ask for them when it cannot infer the type. So I think we'll
be fine.

--
--Guido van Rossum (python.org/~guido)

From dinov at microsoft.com Fri Sep 2 14:57:06 2016
From: dinov at microsoft.com (Dino Viehland)
Date: Fri, 2 Sep 2016 18:57:06 +0000
Subject: [Python-Dev] Update on PEP 523 and adding a co_extra field to code objects
In-Reply-To: References: <20160830180729.63540a92@fsol> <20160830193051.532e8d9a@fsol> <20160830194813.0cbb261b@fsol> <20160830202037.28943cec@fsol>
Message-ID:

So it looks like both list and tuple are within about 5% of using
co_extra directly. Using a tuple instead of a list is about a wash
except for mako_v2, where list is 1.4x slower for some reason (which I
didn't dig into).
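The copy-on-update scheme being benchmarked here can be sketched in pure Python (the class and helper names below are hypothetical illustrations; the real co_extra field is a C-level member of code objects):

```python
# Sketch of the co_extra-as-tuple idea: each tool scans the tuple for
# its own entry, and updates replace the whole tuple rather than
# mutating it in place.
class JitData:
    pass

class ProfilerData:
    pass

def find_extra(extra, kind):
    """Linear search for a tool's entry, the cost of supporting multiple users."""
    for item in extra:
        if isinstance(item, kind):
            return item
    return None

def add_extra(extra, item):
    """Copy-on-update: build a fresh tuple instead of mutating the old one."""
    return extra + (item,)

extra = ()                                  # co_extra starts out empty
extra = add_extra(extra, JitData())
extra = add_extra(extra, ProfilerData())
assert isinstance(find_extra(extra, ProfilerData), ProfilerData)
assert find_extra(extra, dict) is None      # no entry of that kind
```

With a single entry, the common case, the search degenerates to one isinstance check, which is why the tuple variant stays close to direct field access in the benchmarks.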
I would say that using a tuple and copying the tuple on updates makes sense as we don't expect these to change very often and we don't expect collisions to happen very often. > -----Original Message----- > From: Python-Dev [mailto:python-dev- > bounces+dinov=microsoft.com at python.org] On Behalf Of Chris Angelico > Sent: Tuesday, August 30, 2016 2:11 PM > To: python-dev > Subject: Re: [Python-Dev] Update on PEP 523 and adding a co_extra field to > code objects > > On Wed, Aug 31, 2016 at 4:55 AM, Serhiy Storchaka > wrote: > > On 30.08.16 21:20, Antoine Pitrou wrote: > >> > >> On Tue, 30 Aug 2016 18:12:01 +0000 > >> Brett Cannon wrote: > >>>> > >>>> Why not make it always a list? List objects are reasonably cheap > >>>> in memory and access time... (unlike dicts) > >>> > >>> > >>> Because I would prefer to avoid any form of unnecessary performance > >>> overhead for the common case. > >> > >> > >> But the performance overhead of iterating over a 1-element list is > >> small enough (it's just an array access after a pointer dereference) > >> that it may not be larger than the overhead of the multiple tests and > >> conditional branches your example shows. > > > > > > Iterating over a tuple is even faster. It needs one pointer > > dereference less. > > > > And for memory efficiency we can use just a raw array of pointers. > > Didn't all this kind of thing come up when function annotations were > discussed? Insane schemes like dictionaries with UUID keys and so on. > The decision then was YAGNI. The decision now, IMO, should be the same. > Keep things simple. 
> > ChrisA
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/dinov%40microsoft.com

From brett at python.org Fri Sep 2 17:56:36 2016
From: brett at python.org (Brett Cannon)
Date: Fri, 02 Sep 2016 21:56:36 +0000
Subject: [Python-Dev] Update on PEP 523 and adding a co_extra field to code objects
In-Reply-To: References: <20160830180729.63540a92@fsol> <20160830193051.532e8d9a@fsol> <20160830194813.0cbb261b@fsol> <20160830202037.28943cec@fsol>
Message-ID:

On Fri, 2 Sep 2016 at 13:31 Dino Viehland via Python-Dev
< python-dev at python.org> wrote:
> So it looks like both list and tuple are about within 5% of using co_extra
> directly. Using a tuple instead of a list is about a wash except for
> mako_v2 where list is 1.4x slower for some reason (which I didn't dig into).
>
> I would say that using a tuple and copying the tuple on updates makes
> sense as we don't expect these to change very often and we don't expect
> collisions to happen very often.

So would making co_extra a PyTupleObject instead of PyObject alleviate
people's worry of a collision problem? You're going to have to hold
the GIL anyway to interact with the tuple so there won't be any race
condition in replacing the tuple when it's grown (or initially set).
-Brett > > > -----Original Message----- > > From: Python-Dev [mailto:python-dev- > > bounces+dinov=microsoft.com at python.org] On Behalf Of Chris Angelico > > Sent: Tuesday, August 30, 2016 2:11 PM > > To: python-dev > > Subject: Re: [Python-Dev] Update on PEP 523 and adding a co_extra field > to > > code objects > > > > On Wed, Aug 31, 2016 at 4:55 AM, Serhiy Storchaka > > wrote: > > > On 30.08.16 21:20, Antoine Pitrou wrote: > > >> > > >> On Tue, 30 Aug 2016 18:12:01 +0000 > > >> Brett Cannon wrote: > > >>>> > > >>>> Why not make it always a list? List objects are reasonably cheap > > >>>> in memory and access time... (unlike dicts) > > >>> > > >>> > > >>> Because I would prefer to avoid any form of unnecessary performance > > >>> overhead for the common case. > > >> > > >> > > >> But the performance overhead of iterating over a 1-element list is > > >> small enough (it's just an array access after a pointer dereference) > > >> that it may not be larger than the overhead of the multiple tests and > > >> conditional branches your example shows. > > > > > > > > > Iterating over a tuple is even faster. It needs one pointer > > > dereference less. > > > > > > And for memory efficiency we can use just a raw array of pointers. > > > > Didn't all this kind of thing come up when function annotations were > > discussed? Insane schemes like dictionaries with UUID keys and so on. > > The decision then was YAGNI. The decision now, IMO, should be the same. > > Keep things simple. 
> >
> > ChrisA
> > _______________________________________________
> > Python-Dev mailing list
> > Python-Dev at python.org
> > https://mail.python.org/mailman/listinfo/python-dev
> > Unsubscribe: https://mail.python.org/mailman/options/python-dev/dinov%40microsoft.com
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/brett%40python.org
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rosuav at gmail.com Fri Sep 2 18:10:46 2016
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 3 Sep 2016 08:10:46 +1000
Subject: [Python-Dev] Update on PEP 523 and adding a co_extra field to code objects
In-Reply-To: References: <20160830180729.63540a92@fsol> <20160830193051.532e8d9a@fsol> <20160830194813.0cbb261b@fsol> <20160830202037.28943cec@fsol>
Message-ID:

On Sat, Sep 3, 2016 at 7:56 AM, Brett Cannon wrote:
> On Fri, 2 Sep 2016 at 13:31 Dino Viehland via Python-Dev
> wrote:
>>
>> So it looks like both list and tuple are about within 5% of using co_extra
>> directly. Using a tuple instead of a list is about a wash except for
>> mako_v2 where list is 1.4x slower for some reason (which I didn't dig into).
>> >> I would say that using a tuple and copying the tuple on updates makes
>> sense as we don't expect these to change very often and we don't expect
>> collisions to happen very often.
>
> So would making co_extra a PyTupleObject instead of PyObject alleviate
> people's worry of a collision problem? You're going to have to hold the GIL
> anyway to interact with the tuple so there won't be any race condition in
> replacing the tuple when it's grown (or initially set).
>

I'm not following how this solves the collision problem. If you have a
tuple, how do the two (or more) users of it know which index they're
using? They'd need to keep track separately for each object, or else
inefficiently search the tuple for an object of appropriate type every
time. What am I missing here?

ChrisA

From guido at python.org Fri Sep 2 18:19:44 2016
From: guido at python.org (Guido van Rossum)
Date: Fri, 2 Sep 2016 15:19:44 -0700
Subject: [Python-Dev] Updated version of PEP 526 (Syntax for Variable Annotations)
Message-ID:

We've prepared an updated version of PEP 526:
https://www.python.org/dev/peps/pep-0526/

This changes the title to "Syntax for Variable Annotations", now that
we've settled on global, class, instance, and local variables as the
things you might annotate.

There is one substantial change: where the previous version supported only

    NAME: TYPE
    TARGET: TYPE = VALUE

the new PEP removes the distinction and just allows

    TARGET: TYPE [= VALUE]

This simplifies the explanations a bit and enables type checkers to
support separating the annotation from the assignment for instance
variables in the __init__ method, e.g.

    def __init__(self):
        self.name: str
        if ...:
            self.name = ...
        else:
            self.name = ...

The other changes are all minor editing nits, or clarifications about
the scope of the PEP.
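The `TARGET: TYPE [= VALUE]` form can be exercised directly on an interpreter with PEP 526 support (Python 3.6+); a small sketch of what the syntax does, and deliberately does not do, at runtime:

```python
# Runs on Python 3.6+ (PEP 526). A bare annotation is recorded for
# introspection but creates no binding; an annotated assignment
# behaves like an ordinary assignment.
class Node:
    label: str        # annotation only: no class attribute is created
    count: int = 0    # annotation plus a real assignment

    def __init__(self, label: str) -> None:
        self.label = label   # the instance variable annotated above

assert Node.__annotations__ == {'label': str, 'count': int}
assert not hasattr(Node, 'label')   # the bare annotation left no binding
assert Node.count == 0
assert Node("spam").label == "spam"
```

This is the runtime half of the PEP; what a type checker infers or enforces from the same annotations is left entirely to the checker.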
The scope clarification is important: while I really want the new syntax settled in 3.6, I have no intention to pin down the way type checkers use this syntax, apart from the observation that TARGET: TYPE = VALUE is just meant as a cleaner way to write what you'd currently write using PEP 484 as TARGET = VALUE # type: TYPE The PEP does *not* claim that you have to use variable annotations -- in fact we'd prefer that they were unnecessary, but the prevalence of type comments in code we've annotated so far makes it clear that there are plenty of uses for them, and we'd rather have a clean syntax for them that tools can see in the AST. -- --Guido van Rossum (python.org/~guido) From brett at python.org Fri Sep 2 18:45:17 2016 From: brett at python.org (Brett Cannon) Date: Fri, 02 Sep 2016 22:45:17 +0000 Subject: [Python-Dev] Update on PEP 523 and adding a co_extra field to code objects In-Reply-To: References: <20160830180729.63540a92@fsol> <20160830193051.532e8d9a@fsol> <20160830194813.0cbb261b@fsol> <20160830202037.28943cec@fsol> Message-ID: On Fri, 2 Sep 2016 at 15:11 Chris Angelico wrote: > On Sat, Sep 3, 2016 at 7:56 AM, Brett Cannon wrote: > > On Fri, 2 Sep 2016 at 13:31 Dino Viehland via Python-Dev > > wrote: > >> > >> So it looks like both list and tuple are about within 5% of using > co_extra > >> directly. Using a tuple instead of a list is about a wash except for > >> make_v2 where list is 1.4x slower for some reason (which I didn't dig > into). > >> > >> I would say that using a tuple and copying the tuple on updates makes > >> sense as we don't expect these to change very often and we don't expect > >> collisions to happen very often. > > > > > > So would making co_extra a PyTupleObject instead of PyObject alleviate > > people's worry of a collision problem? You're going to have to hold the > GIL > > anyway to interact with the tuple so there won't be any race condition in > > replacing the tuple when it's grown (or initially set). 
> > > > I'm not following how this solves the collision problem. If you have a > tuple, how do the two (or more) users of it know which index they're > using? They'd need to keep track separately for each object, or else > inefficiently search the tuple for an object of appropriate type every > time. What am I missing here? > You're not missing anything, you just have to pay for the search cost, otherwise we're back to square one here of not worrying about the case of multiple users. I don't see how you can have multiple users of a single struct field and yet not have to do some search of some data structure to find the relevant object you care about. We've tried maps and dicts and they were too slow, and we proposed not worrying about multiple users but people didn't like the idea of either not caring or relying on some implicit practice that evolved around the co_extra field. Using a tuple seems to be the best option we can come up with short of developing a linked list which isn't that much better than a tuple if you're simply storing PyObjects. So either we're sticking with the lack of coordination as outlined in the PEP because you don't imagine people using a combination of Pyjion, vmprof, and/or some debugger simultaneously, or you do and we have to just eat the performance degradation. -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Fri Sep 2 18:49:00 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sat, 3 Sep 2016 01:49:00 +0300 Subject: [Python-Dev] What should a good type checker do? (was: Please reject or postpone PEP 526) Message-ID: On Fri, Sep 2, 2016 at 9:04 PM, Steven D'Aprano wrote: > On Fri, Sep 02, 2016 at 08:10:24PM +0300, Koos Zevenhoven wrote: > >> A good checker should be able to infer that x is a union type at the >> point that it's passed to spam, even without the type annotation. 
>> For example:
>>
>> def eggs(cond:bool):
>>     if cond:
>>         x = 1
>>     else:
>>         x = 1.5
>>     spam(x)  # a good type checker infers that x is of type Union[int, float]
>
> Oh I really hope not. I wouldn't call that a *good* type checker. I
> would call that a type checker that is overly permissive.

I guess it's perfectly fine if we disagree about type checking ideals,
and I can imagine the justification for you thinking that way. There
can also be different type checkers, which can have different modes.

But assume (a) that the above function is perfectly working code, and
spam(...) accepts Union[int, float]. Why would I want the type checker
to complain?

Then again, (b) instead of that being working code, it might be an
error and spam only takes float. No problem, the type checker will
catch that. In case of (b), to get the behavior you want (but in my
hypothetical semantics), this could be annotated as

    def eggs(cond:bool):
        x : float
        if cond:
            x = 1  # type checker says error
        else:
            x = 1.5
        spam(x)

So here the programmer thinks the type of x should be more constrained
than what spam(...) accepts. Or you might have something like this

    def eggs(cond:bool):
        if cond:
            x = 1
        else:
            x = 1.5
        # type checker has inferred x to be Union[int, float]
        x : float  # type checker finds an error
        spam(x)

Here, the same error is found, but at a different location.

> Maybe you think that it's okay because ints and floats are somewhat
> compatible. But suppose I wrote:
>
>     if cond:
>         x = HTTPServer(*args)
>     else:
>         x = 1.5

It might be clear by now, but no, that's not why I wrote that. That
was just a slightly more "realistic" example than this HTTP & 1.5 one.

[...]
> Do you have a better idea for variable
> syntax?

I had one but it turned out it was worse.
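Case (a) above is runnable as-is once `spam` is given the union signature. A sketch (that checkers join the branch types into a union is an assumption about mypy-style inference; the code itself runs anywhere):

```python
from typing import Union

def spam(x: Union[int, float]) -> float:
    return x * 2

def eggs(cond: bool) -> float:
    if cond:
        x = 1      # on this branch, x: int
    else:
        x = 1.5    # on this branch, x: float
    return spam(x) # joined type Union[int, float]; spam accepts it

assert eggs(True) == 2
assert eggs(False) == 3.0
```

Under the stricter ideal, the same code with `x: float` declared up front would be flagged on the `x = 1` branch instead, moving the error to the assignment rather than the call site.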
-- Koos > > > -- > Steve > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/k7hoven%40gmail.com -- + Koos Zevenhoven + http://twitter.com/k7hoven + From rosuav at gmail.com Fri Sep 2 18:50:37 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 3 Sep 2016 08:50:37 +1000 Subject: [Python-Dev] Update on PEP 523 and adding a co_extra field to code objects In-Reply-To: References: <20160830180729.63540a92@fsol> <20160830193051.532e8d9a@fsol> <20160830194813.0cbb261b@fsol> <20160830202037.28943cec@fsol> Message-ID: On Sat, Sep 3, 2016 at 8:45 AM, Brett Cannon wrote: >> I'm not following how this solves the collision problem. If you have a >> tuple, how do the two (or more) users of it know which index they're >> using? They'd need to keep track separately for each object, or else >> inefficiently search the tuple for an object of appropriate type every >> time. What am I missing here? > > > You're not missing anything, you just have to pay for the search cost, > otherwise we're back to square one here of not worrying about the case of > multiple users. I don't see how you can have multiple users of a single > struct field and yet not have to do some search of some data structure to > find the relevant object you care about. We've tried maps and dicts and they > were too slow, and we proposed not worrying about multiple users but people > didn't like the idea of either not caring or relying on some implicit > practice that evolved around the co_extra field. Using a tuple seems to be > the best option we can come up with short of developing a linked list which > isn't that much better than a tuple if you're simply storing PyObjects. 
So > either we're sticking with the lack of coordination as outlined in the PEP > because you don't imagine people using a combination of Pyjion, vmprof, > and/or some debugger simultaneously, or you do and we have to just eat the > performance degradation. Got it, thanks. I hope the vagaries of linear search don't mess with profilers - a debugger isn't going to be bothered by whether it gets first slot or second, but profiling and performance might get subtle differences based on which thing looks at a function first. A dict would avoid that (constant-time lookups with a pre-selected key will be consistent), but costs a lot more. ChrisA From rosuav at gmail.com Fri Sep 2 19:01:41 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 3 Sep 2016 09:01:41 +1000 Subject: [Python-Dev] What should a good type checker do? (was: Please reject or postpone PEP 526) In-Reply-To: References: Message-ID: On Sat, Sep 3, 2016 at 8:49 AM, Koos Zevenhoven wrote: > On Fri, Sep 2, 2016 at 9:04 PM, Steven D'Aprano wrote: >> On Fri, Sep 02, 2016 at 08:10:24PM +0300, Koos Zevenhoven wrote: >> >>> A good checker should be able to infer that x is a union type at the >>> point that it's passed to spam, even without the type annotation. For >>> example: >>> >>> def eggs(cond:bool): >>> if cond: >>> x = 1 >>> else: >>> x = 1.5 >>> spam(x) # a good type checker infers that x is of type Union[int, float] >> >> Oh I really hope not. I wouldn't call that a *good* type checker. I >> would call that a type checker that is overly permissive. > > I guess it's perfectly fine if we disagree about type checking ideals, > and I can imagine the justification for you thinking that way. There > can also be different type checkers, and which can have different > modes. > > But assume (a) that the above function is perfectly working code, and > spam(...) accepts Union[int, float]. Why would I want the type checker > to complain? 
I wonder if it would be different if you wrote that as a single expression:

    x = 1 if cond else 1.5
    x = sum([1] + [0.5] * cond)

What should type inference decide x is in these cases? Assume an
arbitrarily smart type checker that can implement your ideal; it's
equally plausible to pretend that the type checker can recognize an
if/else block (or even an if/elif/else tree of arbitrary length) as a
single "assignment" operation. IMO both of these examples - and by
extension, the if/else of the original - should be assigning a Union
type.

Lots of Python code assumes that smallish integers [1] are entirely
compatible with floats. Is Python 4 going to have to deal with the
int/float distinction the way Python 3 did for bytes/text, or are they
fundamentally compatible concepts? Is the "small integer" like the
"ASCII byte/character", a kind of hybrid beast that people treat as
simultaneously two types? (Personally, I don't think it's anything
like bytes/text. But I'm open to argument.) Forcing people to write
1.0 just to be compatible with 1.5 will cause a lot of annoyance. I
leave it to you to decide whether there's a fundamental difference
that needs to be acknowledged, or just subtleties of representational
limitations to be ignored until they become a problem.

ChrisA

[1] And by "smallish" I mean less than 2**53. Big enough for a lot of
purposes. Bigger (by definition) than JavaScript's integers, which cap
out at 2**32.

From guido at python.org Fri Sep 2 19:08:48 2016
From: guido at python.org (Guido van Rossum)
Date: Fri, 2 Sep 2016 16:08:48 -0700
Subject: [Python-Dev] What should a good type checker do? (was: Please reject or postpone PEP 526)
In-Reply-To: References:
Message-ID:

Won't you all agree that this thread belongs on python-ideas?
-- --Guido van Rossum (python.org/~guido) From random832 at fastmail.com Fri Sep 2 19:14:53 2016 From: random832 at fastmail.com (Random832) Date: Fri, 02 Sep 2016 19:14:53 -0400 Subject: [Python-Dev] What should a good type checker do? (was: Please reject or postpone PEP 526) In-Reply-To: References: Message-ID: <1472858093.3247554.714367961.00FF63AD@webmail.messagingengine.com> On Fri, Sep 2, 2016, at 18:49, Koos Zevenhoven wrote: > Then again, (b) instead of that being working code, it might be an > error and spam only takes float. No problem, the type checker will > catch that. There are very few functions that should only take float and not int. > On Fri, Sep 2, 2016 at 9:04 PM, Steven D'Aprano > wrote: > > Maybe you think that it's okay because ints and floats are somewhat > > compatible. But suppose I wrote: > > > > if cond: > > x = HTTPServer(*args) > > else: > > x = 1.5 > > It might be clear by now, but no, that's not why I wrote that. That > was just a slightly more "realistic" example than this HTTP & 1.5 one. The other thing is... I'd kind of want it to infer Number in the first case. And if I assign both a list and a generator expression to something, that should be Iterable[whatever] and maybe even whatever gets worked out for "proper iterable and not string or bytes or memoryview". Certainly if you can return HTTPServer() or None it should infer Union[HTTPServer, None], otherwise spelled Optional[HTTPServer]. Maybe what we need is a "protoize"-alike that can run through a source file and produce a stub file with all its inferences, for manual inspection. So if you see something nonsensical like Union[HTTPServer, float] you can think "wait a minute, where's that coming from" and go look at the code. From ethan at stoneleaf.us Fri Sep 2 19:44:08 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 02 Sep 2016 16:44:08 -0700 Subject: [Python-Dev] PEP 467: last round (?) 
In-Reply-To: References: <57C88355.9000302@stoneleaf.us> <57C8A5F1.4060204@stoneleaf.us> Message-ID: <57CA0EC8.5030508@stoneleaf.us> On 09/01/2016 04:07 PM, Victor Stinner wrote: > 2016-09-02 0:04 GMT+02:00 Ethan Furman: >> - `fromord` to replace the mistaken purpose of the default constructor > > To replace a bogus bytes(obj)? If someone writes bytes(obj) but expect > to create a byte string from an integer, why not using bchr() to fix > the code? The problem with only having `bchr` is that it doesn't help with `bytearray`; the problem with not having `bchr` is who wants to write `bytes.fromord`? So we need `bchr`, and we need `bytearray.fromord`; and since the major difference between `bytes` and `bytearray` is that one is mutable and one is not, `bytes` should also have `fromord`. -- ~Ethan~ From random832 at fastmail.com Fri Sep 2 20:17:24 2016 From: random832 at fastmail.com (Random832) Date: Fri, 02 Sep 2016 20:17:24 -0400 Subject: [Python-Dev] PEP 467: last round (?) In-Reply-To: <57CA0EC8.5030508@stoneleaf.us> References: <57C88355.9000302@stoneleaf.us> <57C8A5F1.4060204@stoneleaf.us> <57CA0EC8.5030508@stoneleaf.us> Message-ID: <1472861844.3258795.714404505.0822A4A7@webmail.messagingengine.com> On Fri, Sep 2, 2016, at 19:44, Ethan Furman wrote: > The problem with only having `bchr` is that it doesn't help with > `bytearray`; What is the use case for bytearray.fromord? Even in the rare case someone needs it, why not bytearray(bchr(...))? 
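Since bchr() is only a proposal in PEP 467 and not an existing builtin, a short pure-Python stand-in makes the bytearray(bchr(...)) spelling above concrete:

```python
def bchr(i):
    # Sketch of PEP 467's proposed builtin: chr() for bytes.
    if not 0 <= i <= 255:
        raise ValueError("bchr() arg not in range(256)")
    return bytes([i])

print(bchr(65))             # b'A'
print(bytearray(bchr(65)))  # bytearray(b'A')
```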
From python at mrabarnett.plus.com Fri Sep 2 20:36:18 2016 From: python at mrabarnett.plus.com (MRAB) Date: Sat, 3 Sep 2016 01:36:18 +0100 Subject: [Python-Dev] Update on PEP 523 and adding a co_extra field to code objects In-Reply-To: References: <20160830180729.63540a92@fsol> <20160830193051.532e8d9a@fsol> <20160830194813.0cbb261b@fsol> <20160830202037.28943cec@fsol> Message-ID: On 2016-09-02 23:45, Brett Cannon wrote: > > > On Fri, 2 Sep 2016 at 15:11 Chris Angelico > wrote: > > On Sat, Sep 3, 2016 at 7:56 AM, Brett Cannon > wrote: > > On Fri, 2 Sep 2016 at 13:31 Dino Viehland via Python-Dev > > > wrote: > >> > >> So it looks like both list and tuple are about within 5% of using > co_extra > >> directly. Using a tuple instead of a list is about a wash except for > >> make_v2 where list is 1.4x slower for some reason (which I didn't > dig into). > >> > >> I would say that using a tuple and copying the tuple on updates makes > >> sense as we don't expect these to change very often and we don't > expect > >> collisions to happen very often. > > > > > > So would making co_extra a PyTupleObject instead of PyObject alleviate > > people's worry of a collision problem? You're going to have to > hold the GIL > > anyway to interact with the tuple so there won't be any race > condition in > > replacing the tuple when it's grown (or initially set). > > > > I'm not following how this solves the collision problem. If you have a > tuple, how do the two (or more) users of it know which index they're > using? They'd need to keep track separately for each object, or else > inefficiently search the tuple for an object of appropriate type every > time. What am I missing here? > > > You're not missing anything, you just have to pay for the search cost, > otherwise we're back to square one here of not worrying about the case > of multiple users. 
I don't see how you can have multiple users of a > single struct field and yet not have to do some search of some data > structure to find the relevant object you care about. We've tried maps > and dicts and they were too slow, and we proposed not worrying about > multiple users but people didn't like the idea of either not caring or > relying on some implicit practice that evolved around the co_extra > field. Using a tuple seems to be the best option we can come up with > short of developing a linked list which isn't that much better than a > tuple if you're simply storing PyObjects. So either we're sticking with > the lack of coordination as outlined in the PEP because you don't > imagine people using a combination of Pyjion, vmprof, and/or some > debugger simultaneously, or you do and we have to just eat the > performance degradation. > Could the users register themselves first? They could then be told what index to use. From greg.ewing at canterbury.ac.nz Fri Sep 2 21:18:34 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 03 Sep 2016 13:18:34 +1200 Subject: [Python-Dev] What should a good type checker do? (was: Please reject or postpone PEP 526) In-Reply-To: References: Message-ID: <57CA24EA.1080002@canterbury.ac.nz> Chris Angelico wrote: > Forcing people to write 1.0 just to be compatible with 1.5 will cause > a lot of annoyance. Indeed, this would be unacceptable IMO. The checker could have a type 'smallint' that it considers promotable to float. But that wouldn't avoid the problem entirely, because e.g. adding two smallints doesn't necessarily give a smallint. Seems to me the practical thing is just to always allow ints to be promoted to floats. It's possible for runtime errors to result, but this is Python, so we're used to those. 
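A small sketch of the promotion Greg describes: PEP 484 already adopts this convention for annotations (an int argument is acceptable where a float is expected), and at runtime the promotion simply happens:

```python
def halve(x: float) -> float:
    # Annotated as float, but passing an int works fine at runtime:
    # true division promotes it.
    return x / 2

print(halve(3))    # 1.5
print(halve(3.0))  # 1.5
```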
-- Greg From brett at python.org Fri Sep 2 22:16:16 2016 From: brett at python.org (Brett Cannon) Date: Sat, 03 Sep 2016 02:16:16 +0000 Subject: [Python-Dev] Update on PEP 523 and adding a co_extra field to code objects In-Reply-To: References: <20160830180729.63540a92@fsol> <20160830193051.532e8d9a@fsol> <20160830194813.0cbb261b@fsol> <20160830202037.28943cec@fsol> Message-ID: On Fri, 2 Sep 2016 at 17:37 MRAB wrote: > On 2016-09-02 23:45, Brett Cannon wrote: > > > > > > On Fri, 2 Sep 2016 at 15:11 Chris Angelico > > wrote: > > > > On Sat, Sep 3, 2016 at 7:56 AM, Brett Cannon > > wrote: > > > On Fri, 2 Sep 2016 at 13:31 Dino Viehland via Python-Dev > > > > wrote: > > >> > > >> So it looks like both list and tuple are about within 5% of using > > co_extra > > >> directly. Using a tuple instead of a list is about a wash except > for > > >> make_v2 where list is 1.4x slower for some reason (which I didn't > > dig into). > > >> > > >> I would say that using a tuple and copying the tuple on updates > makes > > >> sense as we don't expect these to change very often and we don't > > expect > > >> collisions to happen very often. > > > > > > > > > So would making co_extra a PyTupleObject instead of PyObject > alleviate > > > people's worry of a collision problem? You're going to have to > > hold the GIL > > > anyway to interact with the tuple so there won't be any race > > condition in > > > replacing the tuple when it's grown (or initially set). > > > > > > > I'm not following how this solves the collision problem. If you have > a > > tuple, how do the two (or more) users of it know which index they're > > using? They'd need to keep track separately for each object, or else > > inefficiently search the tuple for an object of appropriate type > every > > time. What am I missing here? > > > > > > You're not missing anything, you just have to pay for the search cost, > > otherwise we're back to square one here of not worrying about the case > > of multiple users. 
I don't see how you can have multiple users of a > > single struct field and yet not have to do some search of some data > > structure to find the relevant object you care about. We've tried maps > > and dicts and they were too slow, and we proposed not worrying about > > multiple users but people didn't like the idea of either not caring or > > relying on some implicit practice that evolved around the co_extra > > field. Using a tuple seems to be the best option we can come up with > > short of developing a linked list which isn't that much better than a > > tuple if you're simply storing PyObjects. So either we're sticking with > > the lack of coordination as outlined in the PEP because you don't > > imagine people using a combination of Pyjion, vmprof, and/or some > > debugger simultaneously, or you do and we have to just eat the > > performance degradation. > > > Could the users register themselves first? They could then be told what > index to use. > But that requires they register before any tuple is created, else they run the risk of seeing a tuple that was created before they registered. To cover that issue you then have to check the length at which point it's no more expensive than just iterating through a tuple (especially in the common case of a tuple of length 1). -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Fri Sep 2 21:17:54 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 03 Sep 2016 13:17:54 +1200 Subject: [Python-Dev] PEP 467: last round (?) In-Reply-To: <57CA0EC8.5030508@stoneleaf.us> References: <57C88355.9000302@stoneleaf.us> <57C8A5F1.4060204@stoneleaf.us> <57CA0EC8.5030508@stoneleaf.us> Message-ID: <57CA24C2.4030703@canterbury.ac.nz> Ethan Furman wrote: > The problem with only having `bchr` is that it doesn't help with > `bytearray`; the problem with not having `bchr` is who wants to write > `bytes.fromord`? 
If we called it 'bytes.fnord' (From Numeric Ordinal) people would want to write it just for the fun factor. -- Greg From victor.stinner at gmail.com Sat Sep 3 04:47:57 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Sat, 3 Sep 2016 10:47:57 +0200 Subject: [Python-Dev] PEP 467: last round (?) In-Reply-To: <1472861844.3258795.714404505.0822A4A7@webmail.messagingengine.com> References: <57C88355.9000302@stoneleaf.us> <57C8A5F1.4060204@stoneleaf.us> <57CA0EC8.5030508@stoneleaf.us> <1472861844.3258795.714404505.0822A4A7@webmail.messagingengine.com> Message-ID: Yes, this was my point: I don't think that we need a bytearray method to create a mutable string from a single byte. Victor Le samedi 3 septembre 2016, Random832 a ?crit : > On Fri, Sep 2, 2016, at 19:44, Ethan Furman wrote: > > The problem with only having `bchr` is that it doesn't help with > > `bytearray`; > > What is the use case for bytearray.fromord? Even in the rare case > someone needs it, why not bytearray(bchr(...))? > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > victor.stinner%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From turnbull.stephen.fw at u.tsukuba.ac.jp Sat Sep 3 04:03:11 2016 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Sat, 3 Sep 2016 17:03:11 +0900 Subject: [Python-Dev] Emotional responses to PEPs 484 and 526 In-Reply-To: References: <20160901162117.GX26300@ando.pearwood.info> Message-ID: <22474.33727.604932.509479@turnbull.sk.tsukuba.ac.jp> Guido van Rossum writes: > I just spoke to someone who noted that [PEP 526] is likely to evoke > an outsize emotional response. (Similar to what happened with PEP 484.) Emotional, yes, but I resent the "outsize" part. 
Although that word to the wise is undoubtedly enough, i.e., tl:dr if you like, let me explain why I have one foot in each camp. Compare Nick's version of "scientific code with SI units": from circuit_units import A, V, Ohm, seconds delta: A for delta in [-500n, 0, 500n]: input: A = 2.75u + delta wait(seconds(1u)) expected: V = Ohm(100k)*input tolerance: V = 2.2m fails = check_output(expected, tolerance) print('%s: I(in)=%rA, measured V(out)=%rV, expected V(out)=%rV, diff=%rV.' % ( 'FAIL' if fails else 'pass', input, get_output(), expected, get_output() - expected )) with from circuit_units import VoltType, uA, mV, kOhm, u_second expected: VoltType for delta in [-0.5*uA, 0*uA, 0.5*uA]: input = 2.75*uA + delta wait(1*u_second) expected = (100*kOhm)*input tolerance = 2.2*mV fails = check_output(expected, tolerance) print('%s: I(in)=%rA, measured V(out)=%rV, expected V(out)=%rV, diff=%rV.' % ( 'FAIL' if fails else 'pass', input, get_output(), expected, get_output() - expected )) In Nick's version, literals like 500n ("500 nano-whatevers") require Ken Kundert's proposed syntax change. I left that in because it streamlines the expressions. I wrote the latter because I really disliked Nick's version, streamlined with "SI scale syntax" or not. Nick didn't explicitly type the *_output functions, so I didn't either.[1] I assume they're annotated in their module of definition. The important point about the second version is that if we accept the hypothesis that the pseudo-literals like '[0.5*uA, 0*uA, 0.5*uA]' are good enough to implicitly type the variables they're assigned to (as they are in this snippet), mypy will catch "unit errors" (which the circuit_units module converts into TypeErrors) in the expressions. I think that this hypothesis is appropriate in the context of the thread. Therefore, I think Nick's version was an abuse of variable annotation. I don't mean to criticize Nick, as he was trying to make the best of an unlikely proposal. 
But if Nick can fall into this trap[2], I think the fears of many that type annotations will grow like fungus on code that really doesn't need them, and arguably is better without them, are quite reasonable. The point here is *not* that Nick's version is "horrible" (as many of the "emotional" type refuseniks might say), whatever that might mean. I can easily imagine that the snippet above is part of the embedded software for air traffic control or medical life support equipment, and a belt and suspenders approach ("make units visible to reviewers" + "mypy checking" + "full coverage in unit tests") is warranted. Ie, Nick's version is much better than mine in that context because the hypothesis that "implicit declaration" is good enough is invalid. But in the context of discussion of how to make measurement units visible and readable in a Python program, he grabbed an inappropriate tool because it was close to hand. My version does everything the OP asked for, it furthermore makes mypy into a units checker, and it does so in a way that any intermediate Python programmer (and many novices as well) can immediately grasp. If Nick had been given the constraint that novices should be able to read it, I suspect he would have written the same snippet I did. Nick? Footnotes: [1] I think both versions could have better variable naming along with a few other changes to make them more readable, but those aspects were dictated by an earlier post. I don't have the nerve to touch Nick's code, so left those as is. [2] Or perhaps he did it intentionally, trying to combine two cool ideas, variable type annotations and SI units, in one example. If so, I think that this combination was unwise in context. 
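The circuit_units module above is hypothetical. As a rough sketch of the runtime half of the idea - unit mismatches surfacing as TypeErrors regardless of any static checker - one could imagine something like:

```python
class Quantity:
    """Toy unit-carrying value; real unit libraries are far more complete."""

    def __init__(self, value, unit):
        self.value = value
        self.unit = unit

    def __add__(self, other):
        if self.unit != other.unit:
            # This is the "unit error converted into a TypeError" behaviour.
            raise TypeError("cannot add %s to %s" % (self.unit, other.unit))
        return Quantity(self.value + other.value, self.unit)

uA = Quantity(1e-6, "A")
mV = Quantity(1e-3, "V")
current = Quantity(2.75e-6, "A")

print((current + uA).value)  # same unit: fine
try:
    current + mV             # mixed units: rejected at runtime
except TypeError as exc:
    print(exc)
```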
From drekin at gmail.com Sat Sep 3 05:48:58 2016 From: drekin at gmail.com (=?UTF-8?B?QWRhbSBCYXJ0b8Wh?=) Date: Sat, 3 Sep 2016 11:48:58 +0200 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 Message-ID: Paul Moore (p.f.moore at gmail.com) on Fri Sep 2 05:23:04 EDT 2016 wrote > > On 2 September 2016 at 03:35, Steve Dower > wrote: > >* I'd need to test to be sure, but writing an incomplete code point should > *>* just truncate to before that point. It may currently raise OSError if that > *>* truncated to zero length, as I believe that's not currently distinguished > *>* from an error. What behavior would you propose? > * > For "correct" behaviour, you should retain the unwritten bytes, and > write them as part of the next call (essentially making the API > stateful, in the same way that incremental codecs work). I'm pretty > sure that this could cause actual problems, for example I think invoke > (https://github.com/pyinvoke/invoke) gets byte streams from > subprocesses and dumps them direct to stdout in blocks (so could > easily end up splitting multibyte sequences). It''s arguable that it > should be decoding the bytes from the subprocess and then re-encoding > them, but that gets us into "guess the encoding used by the > subprocess" territory. > > The problem is that we're not going to simply drop some bad data in > the common case - it's not so much the dropping of the start of an > incomplete code point that bothers me, as the encoding error you hit > at the start of the *next* block of data you send. So people will get > random, unexplained, encoding errors. > > I don't see an easy answer here other than a stateful API. > > Isn't the buffered IO wrapper for this? > >* Reads of less than four bytes fail instantly, as in the worst case we need > *>* four bytes to represent one Unicode character. 
This is an unfortunate > *>* reality of trying to limit it to one system call - you'll never get a full > *>* buffer from a single read, as there is no simple mapping between > *>* length-as-utf8 and length-as-utf16 for an arbitrary string. > * > And here - "read a single byte" is a not uncommon way of getting some > data. Once again see invoke: > https://github.com/pyinvoke/invoke/blob/master/invoke/platform.py#L147 > > used at > https://github.com/pyinvoke/invoke/blob/master/invoke/runners.py#L548 > > I'm not saying that there's an easy answer here, but this *will* break > code. And actually, it's in violation of the documentation: see https://docs.python.org/3/library/io.html#io.RawIOBase.read > > """ > read(size=-1) > > Read up to size bytes from the object and return them. As a > convenience, if size is unspecified or -1, readall() is called. > Otherwise, only one system call is ever made. Fewer than size bytes > may be returned if the operating system call returns fewer than size > bytes. > > If 0 bytes are returned, and size was not 0, this indicates end of > file. If the object is in non-blocking mode and no bytes are > available, None is returned. > """ > > You're not allowed to return 0 bytes if the requested size was not 0, > and you're not at EOF. > > That's why it should rather be signaled by an exception. Even when one doesn't transcode UTF-16 to UTF-8, reading just one byte is still impossible, so I would argue the behavior is incorrect here as well. I raise ValueError in win_unicode_console. Adam Bartoš -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From drekin at gmail.com Sat Sep 3 05:59:27 2016 From: drekin at gmail.com (=?UTF-8?B?QWRhbSBCYXJ0b8Wh?=) Date: Sat, 3 Sep 2016 11:59:27 +0200 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 Message-ID: Steve Dower (steve.dower at python.org) on Thu Sep 1 18:28:53 EDT 2016 wrote I'm about to be offline for a few days, so I wanted to get my current > draft PEPs out for people can read and review. > > I don't believe there is a lot of change as a result of either PEP, but > the impact of what change there is needs to be weighed against the benefits. > > If anything, I'm likely to have underplayed the impact of this change > (though I've had a *lot* of support for this one). Just stating my > biases up-front - take it as you wish. > > See https://bugs.python.org/issue1602 for the current proposed patch for > this PEP. I will likely update it after my upcoming flights, but it's in > pretty good shape right now. > > Cheers, > Steve > > Did you consider that the hard-wired readline hook `_PyOS_WindowsConsoleReadline` won't be needed in future if http://bugs.python.org/issue17620 gets resolved so the default hook on Windows just reads from sys.stdin? This would also reduce code duplicity and all the Read/WriteConsoleW stuff would be gathered together in one special class. Regards, Adam Barto? -------------- next part -------------- An HTML attachment was scrubbed... URL: From drekin at gmail.com Sat Sep 3 06:05:02 2016 From: drekin at gmail.com (=?UTF-8?B?QWRhbSBCYXJ0b8Wh?=) Date: Sat, 3 Sep 2016 12:05:02 +0200 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: References: Message-ID: > > The use of an ASCII compatible encoding is required to maintain > compatibility with code that bypasses the TextIOWrapper and directly > writes ASCII bytes to the standard streams (for example, [process_stdinreader.py] > ). 
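The stateful handling Paul asked for earlier in the thread - carrying an incomplete multibyte sequence over to the next read instead of dropping it - is essentially what the stdlib's incremental codecs already provide; a minimal sketch, independent of the console patch itself:

```python
import codecs

dec = codecs.getincrementaldecoder("utf-8")()

# The euro sign is three bytes in UTF-8; feed it split across two "reads".
part1 = dec.decode(b"\xe2\x82")  # incomplete: returns '' and buffers the bytes
part2 = dec.decode(b"\xac")      # completes the character
print(repr(part1), repr(part2))  # '' '€'
```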
> Code that assumes a particular encoding for the standard streams other than > ASCII will likely break. Note that for example in IDLE there are sys.std* stream objects that don't have buffer attribute. I would argue that it is incorrect to suppose that there is always one. Adam Barto? -------------- next part -------------- An HTML attachment was scrubbed... URL: From vadmium+py at gmail.com Sat Sep 3 07:26:06 2016 From: vadmium+py at gmail.com (Martin Panter) Date: Sat, 3 Sep 2016 11:26:06 +0000 Subject: [Python-Dev] PEP 467: last round (?) In-Reply-To: References: <57C88355.9000302@stoneleaf.us> Message-ID: On 2 September 2016 at 17:54, Koos Zevenhoven wrote: > On Thu, Sep 1, 2016 at 10:36 PM, Ethan Furman wrote: >> * Deprecate passing single integer values to ``bytes`` and ``bytearray`` >> * Add ``bytes.fromsize`` and ``bytearray.fromsize`` alternative >> constructors >> * Add ``bytes.fromord`` and ``bytearray.fromord`` alternative constructors >> * Add ``bytes.getbyte`` and ``bytearray.getbyte`` byte retrieval methods >> * Add ``bytes.iterbytes`` and ``bytearray.iterbytes`` alternative >> iterators > > I wonder if from_something with an underscore is more consistent (according > to a quick search perhaps yes). That would not be too inconsistent with the sister constructor bytes.fromhex(). From vadmium+py at gmail.com Sat Sep 3 07:35:41 2016 From: vadmium+py at gmail.com (Martin Panter) Date: Sat, 3 Sep 2016 11:35:41 +0000 Subject: [Python-Dev] PEP 467: last round (?) In-Reply-To: References: <57C88355.9000302@stoneleaf.us> <57C8A5F1.4060204@stoneleaf.us> <57CA0EC8.5030508@stoneleaf.us> <1472861844.3258795.714404505.0822A4A7@webmail.messagingengine.com> Message-ID: > Le samedi 3 septembre 2016, Random832 a ?crit : >> On Fri, Sep 2, 2016, at 19:44, Ethan Furman wrote: >> > The problem with only having `bchr` is that it doesn't help with >> > `bytearray`; >> >> What is the use case for bytearray.fromord? 
Even in the rare case >> someone needs it, why not bytearray(bchr(...))? On 3 September 2016 at 08:47, Victor Stinner wrote: > Yes, this was my point: I don't think that we need a bytearray method to > create a mutable string from a single byte. I agree with the above. Having an easy way to turn an int into a bytes object is good. But I think the built-in bchr() function on its own is enough. Just like we have bytes object literals, but the closest we have for a bytearray literal is bytearray(b". . ."). From vadmium+py at gmail.com Sat Sep 3 08:08:53 2016 From: vadmium+py at gmail.com (Martin Panter) Date: Sat, 3 Sep 2016 12:08:53 +0000 Subject: [Python-Dev] PEP 467: last round (?) In-Reply-To: <57C88355.9000302@stoneleaf.us> References: <57C88355.9000302@stoneleaf.us> Message-ID: On 1 September 2016 at 19:36, Ethan Furman wrote: > Deprecation of current "zero-initialised sequence" behaviour without removal > ---------------------------------------------------------------------------- > > Currently, the ``bytes`` and ``bytearray`` constructors accept an integer > argument and interpret it as meaning to create a zero-initialised sequence > of the given size:: > > >>> bytes(3) > b'\x00\x00\x00' > >>> bytearray(3) > bytearray(b'\x00\x00\x00') > > This PEP proposes to deprecate that behaviour in Python 3.6, but to leave > it in place for at least as long as Python 2.7 is supported, possibly > indefinitely. Can you clarify what ?deprecate? means? Just add a note in the documentation, or make calls trigger a DeprecationWarning as well? Having bytearray(n) trigger a DeprecationWarning would be a minor annoyance for code being compatible with Python 2 and 3, since bytearray(n) is supported in Python 2. 
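For reference, the behaviour being deprecated, plus a stand-in for the proposed fromsize constructor (fromsize does not exist yet; this helper only illustrates the intended semantics):

```python
# The current zero-initialised constructor behaviour under discussion:
assert bytes(3) == b"\x00\x00\x00"
assert bytearray(3) == bytearray(b"\x00\x00\x00")

def fromsize(cls, size, fill=b"\x00"):
    # Sketch of the proposed bytes.fromsize / bytearray.fromsize.
    if size < 0:
        raise ValueError("size must be non-negative")
    return cls(fill * size)

print(fromsize(bytes, 3))      # b'\x00\x00\x00'
print(fromsize(bytearray, 2))  # bytearray(b'\x00\x00')
```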
> Addition of "getbyte" method to retrieve a single byte > ------------------------------------------------------ > > This PEP proposes that ``bytes`` and ``bytearray`` gain the method > ``getbyte`` > which will always return ``bytes``:: Should getbyte() handle negative indexes? E.g. getbyte(-1) returning the last byte. > Open Questions > ============== > > Do we add ``iterbytes`` to ``memoryview``, or modify > ``memoryview.cast()`` to accept ``'s'`` as a single-byte interpretation? Or > do we ignore memory for now and add it later? Apparently memoryview.cast('s') comes from Nick Coghlan: . However, since 3.5 (https://bugs.python.org/issue15944) you can call cast("c") on most memoryviews, which I think already does what you want: >>> tuple(memoryview(b"ABC").cast("c")) (b'A', b'B', b'C') From ncoghlan at gmail.com Sat Sep 3 08:26:07 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 3 Sep 2016 22:26:07 +1000 Subject: [Python-Dev] PEP 526 ready for review: Syntax for Variable and Attribute Annotations In-Reply-To: References: <20160901162117.GX26300@ando.pearwood.info> Message-ID: On 3 September 2016 at 02:17, Guido van Rossum wrote: > Pinning down the semantics is not why I am pushing for PEP 526 -- I > only want to pin down the *syntax* to the point where we won't have to > change it again for many versions, since it's much harder to change > the syntax than it is to change the behavior of type checkers (which > have fewer backwards compatibility constraints, a faster release > cycle, and narrower user bases than core Python itself). +1 from me as well for omitting any new type semantics that aren't absolutely necessary from the PEP (i.e. nothing beyond ClassVar) - I only figured it was worth bringing up here as the question had already arisen. Cheers, Nick. 
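Returning to Martin's cast("c") observation above: the proposed iterbytes can already be approximated in current Python (the iterbytes name itself is still only a proposal):

```python
def iterbytes(data):
    # Yield length-1 bytes objects, approximating PEP 467's iterbytes,
    # using the memoryview.cast("c") behaviour available since 3.5.
    return iter(memoryview(data).cast("c"))

print(list(iterbytes(b"ABC")))            # [b'A', b'B', b'C']
print(list(iterbytes(bytearray(b"xy"))))  # [b'x', b'y']
```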
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Sep 3 09:19:37 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 3 Sep 2016 23:19:37 +1000 Subject: [Python-Dev] Emotional responses to PEPs 484 and 526 In-Reply-To: <22474.33727.604932.509479@turnbull.sk.tsukuba.ac.jp> References: <20160901162117.GX26300@ando.pearwood.info> <22474.33727.604932.509479@turnbull.sk.tsukuba.ac.jp> Message-ID: On 3 September 2016 at 18:03, Stephen J. Turnbull wrote: > Therefore, I think Nick's version was an abuse of variable annotation. > I don't mean to criticize Nick, as he was trying to make the best of > an unlikely proposal. But if Nick can fall into this trap[2], I think > the fears of many that type annotations will grow like fungus on code > that really doesn't need them, and arguably is better without them, > are quite reasonable. I suggest lots of things of python-ideas that I would probably oppose if they ever made it as far as python-dev - enabling that kind of speculative freedom is a large part of *why* we have a brainstorming list. For me, type annotations fall into the same category in practice as metaclasses and structural linters: if you're still asking yourself the question "Do I need one?" the answer is an emphatic "No". They're tools designed to solve particular problems, so you reach for them when you have those problems, rather than as a matter of course. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From vadmium+py at gmail.com Sat Sep 3 10:01:49 2016 From: vadmium+py at gmail.com (Martin Panter) Date: Sat, 3 Sep 2016 14:01:49 +0000 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: <1472772509.386843.713295953.53162396@webmail.messagingengine.com> References: <1472772509.386843.713295953.53162396@webmail.messagingengine.com> Message-ID: On 1 September 2016 at 23:28, Random832 wrote: > On Thu, Sep 1, 2016, at 18:28, Steve Dower wrote: >> This is a raw (bytes) IO class that requires text to be passed encoded >> with utf-8, which will be decoded to utf-16-le and passed to the Windows APIs. >> Similarly, bytes read from the class will be provided by the operating >> system as utf-16-le and converted into utf-8 when returned to Python. > > What happens if a character is broken across a buffer boundary? e.g. if > someone tries to read or write one byte at a time (you can't do a > partial read of zero bytes, there's no way to distinguish that from an > EOF.) > > Is there going to be a higher-level text I/O class that bypasses the > UTF-8 encoding step when the underlying bytes stream is a console? What > if we did that but left the encoding as mbcs? I.e. the console is text > stream that can magically handle characters that aren't representable in > its encoding. Note that if anything does os.read/write to the console's > file descriptors, they're gonna get MBCS and there's nothing we can do > about it. Maybe it is too complicated and impractical, but I have imagined that the sys.stdin/stdout/stderr could be custom TextIOBase objects. They would not be wrappers or do much encoding (other than maybe newline encoding). To solve the compatibility problems with code that uses stdout.buffer or whatever, you could add a custom ?buffer? 
object, something like my AsciiBufferMixin class: https://gist.github.com/vadmium/d1b07d771fbf4347683c005c40991c02 Just putting this idea out there, but maybe Steve?s UTF-8 encoding solution is good enough. From eric at trueblade.com Sat Sep 3 10:36:45 2016 From: eric at trueblade.com (Eric V. Smith) Date: Sat, 3 Sep 2016 10:36:45 -0400 Subject: [Python-Dev] [New-bugs-announce] [issue27948] f-strings: allow backslashes only in the string parts, not in the expression parts In-Reply-To: <1472909045.16.0.0196162777566.issue27948@psf.upfronthosting.co.za> References: <1472909045.16.0.0196162777566.issue27948@psf.upfronthosting.co.za> Message-ID: <505ac248-37e2-19b6-3d28-bb77f4d9428e@trueblade.com> I'm aware of the buildbot failures due to this commit. I'm working on it. Sorry about that: tests passed on my machine. Eric. On 09/03/2016 09:24 AM, Eric V. Smith wrote: > > New submission from Eric V. Smith: > > See issue 27921. > > Currently (and for 3.6 beta 1), backslashes are not allowed anywhere in f-strings. This needs to be changed to allow them in the string parts, but not in the expression parts. > > Also, require that the start and end of an expression be literal '{' and '}, not escapes like '\0x7b' and '\u007d'. 
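The rule described in the issue above, shown in code (this reflects the 3.6 behaviour being implemented: backslashes stay legal in the literal text parts of an f-string, while the expression parts must avoid them):

```python
name = "world"

# Backslash in the *string* part of an f-string: allowed.
greeting = f"hello\n{name}"
print(greeting)

# A backslash inside the *expression* part, e.g. f"{'a' + '\n'}", was a
# SyntaxError under this rule; the usual workaround is to hoist it out:
sep = "\n"
print(f"{sep.join(['a', 'b'])}")
```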
> > ---------- > assignee: eric.smith > components: Interpreter Core > messages: 274294 > nosy: eric.smith > priority: normal > severity: normal > stage: needs patch > status: open > title: f-strings: allow backslashes only in the string parts, not in the expression parts > type: behavior > versions: Python 3.6 > > _______________________________________ > Python tracker > > _______________________________________ > _______________________________________________ > New-bugs-announce mailing list > New-bugs-announce at python.org > https://mail.python.org/mailman/listinfo/new-bugs-announce > From ncoghlan at gmail.com Sat Sep 3 10:49:10 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 4 Sep 2016 00:49:10 +1000 Subject: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8 In-Reply-To: References: Message-ID: On 2 September 2016 at 08:31, Steve Dower wrote: > This proposal would remove all use of the *A APIs and only ever call the *W > APIs. When Windows returns paths to Python as str, they will be decoded from > utf-16-le and returned as text (in whatever the minimal representation is). > When > Windows returns paths to Python as bytes, they will be decoded from > utf-16-le to > utf-8 using surrogatepass (Windows does not validate surrogate pairs, so it > is > possible to have invalid surrogates in filenames). Equally, when paths are > provided as bytes, they are decoded from utf-8 into utf-16-le and passed to > the > *W APIs. The overall proposal looks good to me, there's just a terminology glitch here: utf-8 <-> utf-16-le should either be described as transcoding, or else as decoding and then re-encoding. As they're both text codecs, there's no "decoding" operation that switches between them. 
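Nick's terminology point, sketched in code: what PEP 529 describes for bytes paths is a decode-then-re-encode between the two text codecs (an illustration of the described behaviour, not the actual CPython implementation):

```python
def bytes_path_to_wide(path):
    # utf-8 bytes path -> utf-16-le for the Windows *W APIs.
    # surrogatepass tolerates the lone surrogates Windows filenames allow.
    return path.decode("utf-8", "surrogatepass").encode("utf-16-le", "surrogatepass")

def wide_path_to_bytes(wide):
    # utf-16-le from the OS -> utf-8 bytes for Python callers.
    return wide.decode("utf-16-le", "surrogatepass").encode("utf-8", "surrogatepass")

p = "directory/ファイル".encode("utf-8")
assert wide_path_to_bytes(bytes_path_to_wide(p)) == p

lone = b"\xed\xa0\x80"  # an unpaired surrogate, encoded with surrogatepass
assert wide_path_to_bytes(bytes_path_to_wide(lone)) == lone
```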
As far as the timing of this particular change goes, I think you make a good case that all of the cases that will see a behaviour change with this PEP have already been receiving deprecation warnings since 3.3, which would make it acceptable to change the default behaviour in 3.6. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From turnbull.stephen.fw at u.tsukuba.ac.jp Sat Sep 3 11:17:32 2016 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Sun, 4 Sep 2016 00:17:32 +0900 Subject: [Python-Dev] What should a good type checker do? (was: Please reject or postpone PEP 526) In-Reply-To: <57CA24EA.1080002@canterbury.ac.nz> References: <57CA24EA.1080002@canterbury.ac.nz> Message-ID: <22474.59788.248766.537464@turnbull.sk.tsukuba.ac.jp> Please respect Reply-To, set to python-ideas. Greg Ewing writes: > Chris Angelico wrote: > > Forcing people to write 1.0 just to be compatible with 1.5 will cause > > a lot of annoyance. > > Indeed, this would be unacceptable IMO. But "forcing" won't happen. Just ignore the warning. *All* such Python programs will continue to run (or crash) exactly as if the type declarations weren't there. If you don't like the warning, either don't run the typechecker, or change your code to placate it. But allowing escapes from a typechecker means allowing escapes. All of them, not just the ones you or I have preapproved. I want my typechecker to be paranoid, and loud about it. That doesn't mean I would never use a type like "Floatable" (ie, any type subject to implicit conversion to float). But in the original example, I would probably placate the typechecker. YMMV, of course. From turnbull.stephen.fw at u.tsukuba.ac.jp Sat Sep 3 11:18:01 2016 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. 
Turnbull) Date: Sun, 4 Sep 2016 00:18:01 +0900 Subject: [Python-Dev] [erratum] Emotional responses to PEPs 484 and 526 In-Reply-To: <22474.33727.604932.509479@turnbull.sk.tsukuba.ac.jp> References: <20160901162117.GX26300@ando.pearwood.info> <22474.33727.604932.509479@turnbull.sk.tsukuba.ac.jp> Message-ID: <22474.59817.995209.791686@turnbull.sk.tsukuba.ac.jp> Stephen J. Turnbull writes: > My version ... furthermore makes mypy into a units checker, That isn't true, mypy does want annotations on all the variables it checks and does not infer them from initializer type. Sorry for the misinformation. From levkivskyi at gmail.com Sat Sep 3 11:40:24 2016 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Sat, 3 Sep 2016 17:40:24 +0200 Subject: [Python-Dev] [erratum] Emotional responses to PEPs 484 and 526 In-Reply-To: <22474.59817.995209.791686@turnbull.sk.tsukuba.ac.jp> References: <20160901162117.GX26300@ando.pearwood.info> <22474.33727.604932.509479@turnbull.sk.tsukuba.ac.jp> <22474.59817.995209.791686@turnbull.sk.tsukuba.ac.jp> Message-ID: On 3 September 2016 at 17:18, Stephen J. Turnbull < turnbull.stephen.fw at u.tsukuba.ac.jp> wrote: > Stephen J. Turnbull writes: > > > My version ... furthermore makes mypy into a units checker, > > That isn't true, mypy does want annotations on all the variables it > checks and does not infer them from initializer type. > I have heard that pytype (https://github.com/google/pytype) does more type inference (although it has some weaknesses). In general, I think it is OK that the amount of annotations needed depends on the type checker (there is actually a note on this in the last revision of PEP 526). -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Sat Sep 3 11:41:58 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 03 Sep 2016 08:41:58 -0700 Subject: [Python-Dev] PEP 467: last round (?) 
In-Reply-To: References: <57C88355.9000302@stoneleaf.us> Message-ID: <57CAEF46.6000402@stoneleaf.us> On 09/03/2016 05:08 AM, Martin Panter wrote: > On 1 September 2016 at 19:36, Ethan Furman wrote: >> Deprecation of current "zero-initialised sequence" behaviour without removal >> ---------------------------------------------------------------------------- >> >> Currently, the ``bytes`` and ``bytearray`` constructors accept an integer >> argument and interpret it as meaning to create a zero-initialised sequence >> of the given size:: >> >> >>> bytes(3) >> b'\x00\x00\x00' >> >>> bytearray(3) >> bytearray(b'\x00\x00\x00') >> >> This PEP proposes to deprecate that behaviour in Python 3.6, but to leave >> it in place for at least as long as Python 2.7 is supported, possibly >> indefinitely. > > Can you clarify what "deprecate" means? Just add a note in the > documentation, [...] This one. >> Addition of "getbyte" method to retrieve a single byte >> ------------------------------------------------------ >> >> This PEP proposes that ``bytes`` and ``bytearray`` gain the method >> ``getbyte`` >> which will always return ``bytes``:: > > Should getbyte() handle negative indexes? E.g. getbyte(-1) returning > the last byte. Yes. >> Open Questions >> ============== >> >> Do we add ``iterbytes`` to ``memoryview``, or modify >> ``memoryview.cast()`` to accept ``'s'`` as a single-byte interpretation? Or >> do we ignore memory for now and add it later? > > Apparently memoryview.cast('s') comes from Nick Coghlan: > . > However, since 3.5 (https://bugs.python.org/issue15944) you can call > cast("c") on most memoryviews, which I think already does what you > want: > >>>> tuple(memoryview(b"ABC").cast("c")) > (b'A', b'B', b'C') Nice!
-- ~Ethan~ From ncoghlan at gmail.com Sat Sep 3 11:42:41 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 4 Sep 2016 01:42:41 +1000 Subject: [Python-Dev] PEP 525, third round, better finalization In-Reply-To: References: Message-ID: On 2 September 2016 at 19:13, Nathaniel Smith wrote: > This works OK on CPython because the reference-counting gc will call > handle.__del__() at the end of the scope (so on CPython it's at level > 2), but it famously causes huge problems when porting to PyPy with > its much faster and more sophisticated gc that only runs when > triggered by memory pressure. (Or for "PyPy" you can substitute > "Jython", "IronPython", whatever.) Technically this code doesn't > actually "leak" file descriptors on PyPy, because handle.__del__() > will get called *eventually* (this code is at level 1, not level 0), > but by the time "eventually" arrives your server process has probably > run out of file descriptors and crashed. Level 1 isn't good enough. So > now we have all learned to instead write > > # good modern Python style: > def get_file_contents(path): > with open(path) as handle: > return handle.read() This only works if the file fits in memory - otherwise you just have to accept the fact that you need to leave the file handle open until you're "done with the iterator", which means deferring the resource management to the caller. > and we have fancy tools like the ResourceWarning machinery to help us > catch these bugs. > > Here's the analogous example for async generators. This is a useful, > realistic async generator, that lets us incrementally read from a TCP > connection that streams newline-separated JSON documents: > > async def read_json_lines_from_server(host, port): > async for line in asyncio.open_connection(host, port)[0]: > yield json.loads(line) > > You would expect to use this like: > > async for data in read_json_lines_from_server(host, port): > ...
The actual synchronous equivalent to this would look more like: def read_data_from_file(path): with open(path) as f: for line in f: yield line (Assume we're doing something interesting to each line, rather than reproducing normal file iteration behaviour) And that has the same problem as your asynchronous example: the caller needs to worry about resource management on the generator and do: with closing(read_data_from_file(path)) as itr: for line in itr: ... Which means the problem causing your concern doesn't arise from the generator being asynchronous - it comes from the fact the generator actually *needs* to hold the FD open in order to work as intended (if it didn't, then the code wouldn't need to be asynchronous). > BUT, with the current PEP 525 proposal, trying to use this generator > in this way is exactly analogous to the open(path).read() case: on > CPython it will work fine -- the generator object will leave scope at > the end of the 'async for' loop, cleanup methods will be called, etc. > But on PyPy, the weakref callback will not be triggered until some > arbitrary time later, you will "leak" file descriptors, and your > server will crash. That suggests the PyPy GC should probably be tracking pressure on more resources than just memory when deciding whether or not to trigger a GC run. > For correct operation, you have to replace the > simple 'async for' loop with this lovely construct: > > async with aclosing(read_json_lines_from_server(host, port)) as ait: > async for data in ait: > ... > > Of course, you only have to do this on loops whose iterator might > potentially hold resources like file descriptors, either currently or > in the future. So... uh... basically that's all loops, I guess? If you > want to be a good defensive programmer?
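[The deterministic cleanup that `closing()` provides boils down to calling the generator's close() method, which raises GeneratorExit inside the frame so the finally block runs immediately. A small illustration, using an in-memory stream rather than a real file:]

```python
import io

cleaned = []

def read_lines(handle):
    try:
        for line in handle:
            yield line
    finally:
        cleaned.append("cleanup ran")

gen = read_lines(io.StringIO("first\nsecond\n"))
assert next(gen) == "first\n"

# Closing the generator runs the finally block right now,
# rather than whenever the GC happens to collect the object:
gen.close()
assert cleaned == ["cleanup ran"]
```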
At that level of defensiveness in asynchronous code, you need to start treating all external resources (including file descriptors) as a managed pool, just as we have process and thread pools in the standard library, and many database and networking libraries offer connection pooling. It limits your per-process concurrency, but that limit exists anyway at the operating system level - modelling it explicitly just lets you manage how the application handles those limits. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ethan at stoneleaf.us Sat Sep 3 11:43:18 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 03 Sep 2016 08:43:18 -0700 Subject: [Python-Dev] PEP 467: last round (?) In-Reply-To: <57CA24C2.4030703@canterbury.ac.nz> References: <57C88355.9000302@stoneleaf.us> <57C8A5F1.4060204@stoneleaf.us> <57CA0EC8.5030508@stoneleaf.us> <57CA24C2.4030703@canterbury.ac.nz> Message-ID: <57CAEF96.1060306@stoneleaf.us> On 09/02/2016 06:17 PM, Greg Ewing wrote: > Ethan Furman wrote: >> The problem with only having `bchr` is that it doesn't >> help with `bytearray`; the problem with not having >> `bchr` is who wants to write `bytes.fromord`? > > If we called it 'bytes.fnord' (From Numeric Ordinal) > people would want to write it just for the fun factor. Very good point! :) -- ~Ethan~ From k7hoven at gmail.com Sat Sep 3 11:44:29 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sat, 3 Sep 2016 18:44:29 +0300 Subject: [Python-Dev] What should a good type checker do? (was: Please reject or postpone PEP 526) In-Reply-To: <22474.59788.248766.537464@turnbull.sk.tsukuba.ac.jp> References: <57CA24EA.1080002@canterbury.ac.nz> <22474.59788.248766.537464@turnbull.sk.tsukuba.ac.jp> Message-ID: What's up with the weird subthreads, Stephen?! On Guido's suggestion, I'm working on posting those type-checking thoughts here. -- Koos On Sat, Sep 3, 2016 at 6:17 PM, Stephen J. Turnbull wrote: > Please respect Reply-To, set to python-ideas. 
> > Greg Ewing writes: > > Chris Angelico wrote: > > > Forcing people to write 1.0 just to be compatible with 1.5 will cause > > > a lot of annoyance. > > > > Indeed, this would be unacceptable IMO. > > But "forcing" won't happen. Just ignore the warning. *All* such > Python programs will continue to run (or crash) exactly as if the type > declarations weren't there. If you don't like the warning, either > don't run the typechecker, or change your code to placate it. > > But allowing escapes from a typechecker means allowing escapes. All > of them, not just the ones you or I have preapproved. I want my > typechecker to be paranoid, and loud about it. > > That doesn't mean I would never use a type like "Floatable" (ie, any > type subject to implicit conversion to float). But in the original > example, I would probably placate the typechecker. YMMV, of course. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/k7hoven%40gmail.com -- + Koos Zevenhoven + http://twitter.com/k7hoven + From ncoghlan at gmail.com Sat Sep 3 12:09:31 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 4 Sep 2016 02:09:31 +1000 Subject: [Python-Dev] Update on PEP 523 and adding a co_extra field to code objects In-Reply-To: References: <20160830180729.63540a92@fsol> <20160830193051.532e8d9a@fsol> <20160830194813.0cbb261b@fsol> <20160830202037.28943cec@fsol> Message-ID: On 3 September 2016 at 08:50, Chris Angelico wrote: > Got it, thanks. I hope the vagaries of linear search don't mess with > profilers - a debugger isn't going to be bothered by whether it gets > first slot or second, but profiling and performance might get subtle > differences based on which thing looks at a function first. A dict > would avoid that (constant-time lookups with a pre-selected key will > be consistent), but costs a lot more. 
Profiling with a debugger enabled is going to see a lot more interference from the debugger than it is from a linear search through a small tuple for its own state :) Optimising compilers and VM profilers are clearly a case where cooperation will be desirable, as are optimising compilers and debuggers. However, that cooperation is still going to need to be worked out on a pairwise basis - the PEP can't magically make arbitrary pairs of plugins compatible, all it can do is define some rules and guidelines that make it easier for plugins to cooperate when they want to do so. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Sat Sep 3 12:16:32 2016 From: guido at python.org (Guido van Rossum) Date: Sat, 3 Sep 2016 09:16:32 -0700 Subject: [Python-Dev] [erratum] Emotional responses to PEPs 484 and 526 In-Reply-To: <22474.59817.995209.791686@turnbull.sk.tsukuba.ac.jp> References: <20160901162117.GX26300@ando.pearwood.info> <22474.33727.604932.509479@turnbull.sk.tsukuba.ac.jp> <22474.59817.995209.791686@turnbull.sk.tsukuba.ac.jp> Message-ID: On Sat, Sep 3, 2016 at 8:18 AM, Stephen J. Turnbull wrote: > Stephen J. Turnbull writes: > > > My version ... furthermore makes mypy into a units checker, > > That isn't true, mypy does want annotations on all the variables it > checks and does not infer them from initializer type. But it does! Mypy emphatically does *not* need annotations on all variables; it infers most variable types from the first expression assigned to them. E.g. here: output = [] n = 0 output.append(n) reveal_type(output) it will reveal the type List[int] without any help from annotations. There are cases where it does require annotation on empty containers, when it's less obvious how the container is filled, and other, more complicated situations, but a sequence of assignments as in Nick's example is a piece of cake for it. 
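[A sketch of the distinction Guido describes, for anyone trying this out — the annotations shown are for the checker only, and the code runs unchanged without mypy:]

```python
from typing import List

# Inferred: mypy types these from the first assignments on its own.
output = []          # becomes List[int] once ints are appended
n = 0                # inferred as int
output.append(n)

# Needs help: an empty container whose contents aren't obvious nearby
# is the classic case where an explicit annotation is required.
names: List[str] = []
names.append("Guido")

assert output == [0] and names == ["Guido"]
```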
In fact, the one place where *I* wanted a type annotation was here: expected: V = Ohm(100k)*input because I haven't had a need to use Ohm's law in a long time, so I could personally use the hint that Ohm times Amps makes Volts (but again, given suitable class definitions, mypy wouldn't have needed that annotation). -- --Guido van Rossum (python.org/~guido) From rosuav at gmail.com Sat Sep 3 12:25:59 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 4 Sep 2016 02:25:59 +1000 Subject: [Python-Dev] Update on PEP 523 and adding a co_extra field to code objects In-Reply-To: References: <20160830180729.63540a92@fsol> <20160830193051.532e8d9a@fsol> <20160830194813.0cbb261b@fsol> <20160830202037.28943cec@fsol> Message-ID: On Sun, Sep 4, 2016 at 2:09 AM, Nick Coghlan wrote: > On 3 September 2016 at 08:50, Chris Angelico wrote: >> Got it, thanks. I hope the vagaries of linear search don't mess with >> profilers - a debugger isn't going to be bothered by whether it gets >> first slot or second, but profiling and performance might get subtle >> differences based on which thing looks at a function first. A dict >> would avoid that (constant-time lookups with a pre-selected key will >> be consistent), but costs a lot more. > > Profiling with a debugger enabled is going to see a lot more > interference from the debugger than it is from a linear search through > a small tuple for its own state :) Right; I was contrasting the debugger at one end (linear search is utterly dwarfed by other costs) with a profiler at the other end (wants minimal cost, and minimal noise, and a linear search gives cost and noise). In between, an optimizer is an example of something that could mess with the profiler based on activation ordering (and thus which one gets first slot). > Optimising compilers and VM profilers are clearly a case where > cooperation will be desirable, as are optimising compilers and > debuggers. 
However, that cooperation is still going to need to be > worked out on a pairwise basis - the PEP can't magically make > arbitrary pairs of plugins compatible, all it can do is define some > rules and guidelines that make it easier for plugins to cooperate when > they want to do so. Obviously, but AIUI the rules sound pretty simple: 1) Base compiler: co_extra = () 2) Modifier: co_extra += (MyState(),) 3) Repeat #2 for other tools 4) for obj in co_extra: if obj.__class__ is MyState: do stuff Anyone who puts a non-tuple into co_extra is playing badly with other people. Anyone who doesn't use a custom class is risking collisions. Beyond that, it should be pretty straight-forward. ChrisA From ncoghlan at gmail.com Sat Sep 3 12:27:44 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 4 Sep 2016 02:27:44 +1000 Subject: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8 In-Reply-To: References: Message-ID: On 4 September 2016 at 00:49, Nick Coghlan wrote: > On 2 September 2016 at 08:31, Steve Dower wrote: >> This proposal would remove all use of the *A APIs and only ever call the *W >> APIs. When Windows returns paths to Python as str, they will be decoded from >> utf-16-le and returned as text (in whatever the minimal representation is). >> When >> Windows returns paths to Python as bytes, they will be decoded from >> utf-16-le to >> utf-8 using surrogatepass (Windows does not validate surrogate pairs, so it >> is >> possible to have invalid surrogates in filenames). Equally, when paths are >> provided as bytes, they are decoded from utf-8 into utf-16-le and passed to >> the >> *W APIs. > > The overall proposal looks good to me, there's just a terminology > glitch here: utf-8 <-> utf-16-le should either be described as > transcoding, or else as decoding and then re-encoding. As they're both > text codecs, there's no "decoding" operation that switches between > them. 
After also reading the Windows console encoding PEP, I realised there's a couple of missing discussions here regarding the impacts on sys.argv, os.environ, and os.environb. The reason that's relevant is that "sys.getfilesystemencoding" is a bit of a misnomer, as it's also used to determine the assumed encoding of command line arguments and environment variables. With the PEP currently stating that all use of the "*A" Windows APIs will be removed, I'm guessing these will just start working as expected, but it should be covered explicitly. In addition, if the subprocess module is going to be excluded from these changes, that should be called out explicitly (keeping in mind that on *nix, the only subprocess pipe configurations that are straightforward to set up in Python 3 are raw binary mode and universal newlines mode, with the latter implicitly treating the pipes as UTF-8 text) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Sep 3 12:48:05 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 4 Sep 2016 02:48:05 +1000 Subject: [Python-Dev] PEP 467: last round (?) In-Reply-To: References: <57C88355.9000302@stoneleaf.us> <57C8A5F1.4060204@stoneleaf.us> <57CA0EC8.5030508@stoneleaf.us> <1472861844.3258795.714404505.0822A4A7@webmail.messagingengine.com> Message-ID: On 3 September 2016 at 21:35, Martin Panter wrote: >> Le samedi 3 septembre 2016, Random832 a écrit : >>> On Fri, Sep 2, 2016, at 19:44, Ethan Furman wrote: >>> > The problem with only having `bchr` is that it doesn't help with >>> > `bytearray`; >>> >>> What is the use case for bytearray.fromord? Even in the rare case >>> someone needs it, why not bytearray(bchr(...))? > > On 3 September 2016 at 08:47, Victor Stinner wrote: >> Yes, this was my point: I don't think that we need a bytearray method to >> create a mutable string from a single byte. > > I agree with the above. Having an easy way to turn an int into a bytes > object is good.
But I think the built-in bchr() function on its own is > enough. Just like we have bytes object literals, but the closest we > have for a bytearray literal is bytearray(b". . ."). This is a good point - earlier versions of the PEP didn't include bchr(), they just had the class methods, so "bytearray(bchr(...))" wasn't an available spelling (if I remember the original API design correctly, it would have been something like "bytearray(bytes.byte(...))"), which meant there was a strong consistency argument in having the alternate constructor on both types. Now that the PEP proposes the "bchr" builtin, the "fromord" constructors look less necessary. Given that, and the uncertain deprecation time frame for accepting integers in the main bytes and bytearray constructors, perhaps both the "fromsize" and "fromord" parts of the proposal can be deferred indefinitely in favour of just adding the bchr() builtin? We wouldn't gain the "initialise a region of memory to an arbitrary value" feature, but it can be argued that wanting that is a sign someone may be better off with a more specialised memory manipulation library, rather than relying solely on the builtins. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Sep 3 12:59:16 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 4 Sep 2016 02:59:16 +1000 Subject: [Python-Dev] PEP 467: last round (?) In-Reply-To: References: <57C88355.9000302@stoneleaf.us> Message-ID: On 3 September 2016 at 03:54, Koos Zevenhoven wrote: > chrb seems to be more in line with some bytes versions in for instance os > than bchr. The mnemonic for the current name in the PEP is that bchr is to chr as b"" is to "". The PEP should probably say that in addition to pointing out the 'unichr' Python 2 inspiration, though. 
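[For reference, the proposed builtin is small enough to sketch in pure Python — hypothetical code, since `bchr` is a PEP 467 proposal rather than an existing builtin:]

```python
def bchr(i):
    """bchr is to chr as b"" is to "": one integer -> a length-1 bytes."""
    if not 0 <= i <= 255:
        raise ValueError("bchr() argument must be in range(256)")
    return bytes([i])

assert bchr(65) == b"A"
# And the bytearray case from the thread:
assert bytearray(bchr(0)) == bytearray(b"\x00")
```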
The other big difference between this and the os module case is that the resulting builtin constructor pairs here are str/chr (arbitrary text, single code point) and bytes/bchr (arbitrary binary data, single binary octet). By contrast, os.getcwd() and os.getcwdb() (and similar APIs) are both referring to the same operating system level operation, they're just requesting a different return type for the data. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From drekin at gmail.com Sat Sep 3 14:31:05 2016 From: drekin at gmail.com (=?UTF-8?B?QWRhbSBCYXJ0b8Wh?=) Date: Sat, 3 Sep 2016 20:31:05 +0200 Subject: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8 Message-ID: Nick Coghlan (ncoghlan at gmail.com) on Sat Sep 3 12:27:44 EDT 2016 wrote: > After also reading the Windows console encoding PEP, I realised > there's a couple of missing discussions here regarding the impacts on > sys.argv, os.environ, and os.environb. > > The reason that's relevant is that "sys.getfilesystemencoding" is a > bit of a misnomer, as it's also used to determine the assumed encoding > of command line arguments and environment variables. > > Regarding sys.argv, AFAIK Unicode arguments work well on Python 3. Even non-BMP characters are transferred correctly. Adam Bartoš -------------- next part -------------- An HTML attachment was scrubbed...
URL: From oscar.j.benjamin at gmail.com Sat Sep 3 14:38:15 2016 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Sat, 3 Sep 2016 19:38:15 +0100 Subject: [Python-Dev] PEP 525, third round, better finalization In-Reply-To: References: Message-ID: On 3 September 2016 at 16:42, Nick Coghlan wrote: > On 2 September 2016 at 19:13, Nathaniel Smith wrote: >> This works OK on CPython because the reference-counting gc will call >> handle.__del__() at the end of the scope (so on CPython it's at level >> 2), but it famously causes huge problems when porting to PyPy with >> its much faster and more sophisticated gc that only runs when >> triggered by memory pressure. (Or for "PyPy" you can substitute >> "Jython", "IronPython", whatever.) Technically this code doesn't >> actually "leak" file descriptors on PyPy, because handle.__del__() >> will get called *eventually* (this code is at level 1, not level 0), >> but by the time "eventually" arrives your server process has probably >> run out of file descriptors and crashed. Level 1 isn't good enough. So >> now we have all learned to instead write ... >> BUT, with the current PEP 525 proposal, trying to use this generator >> in this way is exactly analogous to the open(path).read() case: on >> CPython it will work fine -- the generator object will leave scope at >> the end of the 'async for' loop, cleanup methods will be called, etc. >> But on PyPy, the weakref callback will not be triggered until some >> arbitrary time later, you will "leak" file descriptors, and your >> server will crash. > > That suggests the PyPy GC should probably be tracking pressure on more > resources than just memory when deciding whether or not to trigger a > GC run. PyPy's GC is conformant to the language spec AFAICT: https://docs.python.org/3/reference/datamodel.html#object.__del__ """ object.__del__(self) Called when the instance is about to be destroyed. This is also called a destructor.
If a base class has a __del__() method, the derived class's __del__() method, if any, must explicitly call it to ensure proper deletion of the base class part of the instance. Note that it is possible (though not recommended!) for the __del__() method to postpone destruction of the instance by creating a new reference to it. It may then be called at a later time when this new reference is deleted. It is not guaranteed that __del__() methods are called for objects that still exist when the interpreter exits. """ Note the last sentence. It is also not guaranteed (across different Python implementations and regardless of the CPython-specific notes in the docs) that any particular object will cease to exist before the interpreter exits. Taken together these two imply that it is not guaranteed that *any* __del__ method will ever be called. Antoine's excellent work in PEP 442 has improved the situation with CPython but the language spec (covering all implementations) remains the same and changing that requires a new PEP and coordination with other implementations. Without changing that, it is a mistake to base a new core language feature (async finalisation) on CPython-specific implementation details. Already using with (or try/finally etc.) inside a generator function behaves differently under PyPy: $ cat gentest.py def generator_needs_finalisation(): try: for n in range(10): yield n finally: print('Doing important cleanup') for obj in generator_needs_finalisation(): if obj == 5: break print('Process exit') $ python gentest.py Doing important cleanup Process exit So here the cleanup is triggered by the reference count of the generator falling at the break statement. Under CPython this corresponds to Nathaniel's "level 2" cleanup.
If we keep another reference around it gets done at process exit: $ cat gentest2.py def generator_needs_finalisation(): try: for n in range(10): yield n finally: print('Doing important cleanup') gen = generator_needs_finalisation() for obj in gen: if obj == 5: break print('Process exit') $ python gentest2.py Process exit Doing important cleanup So that's Nathaniel's "level 1" cleanup. However if you run either of these scripts under PyPy the cleanup simply won't occur (i.e. "level 0" cleanup): $ pypy gentest.py Process exit $ pypy gentest2.py Process exit I don't think PyPy is in breach of the language spec here. Python made a decision a long time ago to shun RAII-style implicit cleanup in favour of with-style explicit cleanup. The solution to this problem is to move resource management outside of the generator functions. This is true for ordinary generators without an event-loop etc. The example in the PEP is async def square_series(con, to): async with con.transaction(): cursor = con.cursor( 'SELECT generate_series(0, $1) AS i', to) async for row in cursor: yield row['i'] ** 2 async for i in square_series(con, 1000): if i == 100: break The normal generator equivalent of this is: def square_series(con, to): with con.transaction(): cursor = con.cursor( 'SELECT generate_series(0, $1) AS i', to) for row in cursor: yield row['i'] ** 2 This code is already broken: move the with statement outside to the caller of the generator function.
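[The repair described here — hoisting the with statement to the caller and reducing the generator to a pure filter — can be sketched like this. The con/cursor objects are hypothetical database handles, so the runnable part uses plain row dicts:]

```python
def squares(rows):
    # A pure iterator-style filter: it owns no external resources,
    # so abandoning it part-way through leaks nothing.
    for row in rows:
        yield row["i"] ** 2

# The caller owns the resource lifetime (sketch; con is hypothetical):
#
#     with con.transaction():
#         cursor = con.cursor('SELECT generate_series(0, $1) AS i', to)
#         for sq in squares(cursor):
#             if sq == 100:
#                 break

rows = [{"i": n} for n in range(5)]
assert list(squares(rows)) == [0, 1, 4, 9, 16]
```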
Going back to Nathaniel's example: def get_file_contents(path): with open(path) as handle: return handle.read() Nick wants it to be a generator function so we don't have to load the whole file into memory i.e.: def get_file_lines(path): with open(path) as handle: yield from handle However this is now broken if the iterator is not fully consumed: for line in get_file_lines(path): if line.startswith('#'): break The answer is to move the with statement outside and pass the handle into your generator function: def get_file_lines(handle): yield from handle with open(path) as handle: for line in get_file_lines(handle): if line.startswith('#'): break Of course in this case get_file_lines is trivial and can be omitted but this fix works more generally in the case that get_file_lines actually does some processing on the lines of the file: move the with statement outside and turn the generator function into an iterator-style filter. -- Oscar From yselivanov.ml at gmail.com Sat Sep 3 15:13:14 2016 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Sat, 3 Sep 2016 12:13:14 -0700 Subject: [Python-Dev] PEP 525, third round, better finalization In-Reply-To: References: Message-ID: <3599dca9-7c27-c1fc-0469-6a79c4167d31@gmail.com> Hi Nathaniel, On 2016-09-02 2:13 AM, Nathaniel Smith wrote: > On Thu, Sep 1, 2016 at 3:34 PM, Yury Selivanov wrote: >> Hi, >> >> I've spent quite a while thinking and experimenting with PEP 525 trying to >> figure out how to make asynchronous generators (AG) finalization reliable. >> I've tried to replace the callback for GCed with a callback to intercept >> first iteration of AGs. Turns out it's very hard to work with weak-refs and >> make asyncio event loop to reliably track and shutdown all open AGs. >> >> My new approach is to replace the "sys.set_asyncgen_finalizer(finalizer)" >> function with "sys.set_asyncgen_hooks(firstiter=None, finalizer=None)". > 1) Can/should these hooks be used by other types besides async
(e.g., async iterators that are not async generators?) > What would that look like? Asynchronous iterators (classes implementing __aiter__, __anext__) should use __del__ for any cleanup purposes. sys.set_asyncgen_hooks only supports asynchronous generators. > > 2) In the asyncio design it's legal for an event loop to be stopped > and then started again. Currently (I guess for this reason?) asyncio > event loops do not forcefully clean up resources associated with them > on shutdown. For example, if I open a StreamReader, loop.stop() and > loop.close() will not automatically close it for me. When, concretely, > are you imagining that asyncio will run these finalizers? I think we will add another API method to asyncio event loop, which users will call before closing the loop. In my reference implementation I added `loop.shutdown()` synchronous method. > > 3) Should the cleanup code in the generator be able to distinguish > between "this iterator has left scope" versus "the event loop is being > violently shut down"? This is already handled in the reference implementation. When an AG is iterated for the first time, the loop starts tracking it by adding it to a weak set. When the AG is about to be GCed, the loop removes it from the weak set, and schedules its 'aclose()'. If 'loop.shutdown' is called it means that the loop is being "violently shutdown", so we schedule 'aclose' for all AGs in the weak set. > > 4) More fundamentally -- this revision is definitely an improvement, > but it doesn't really address the main concern I have. Let me see if I > can restate it more clearly. > > Let's define 3 levels of cleanup handling: > > Level 0: resources (e.g. file descriptors) cannot be reliably cleaned up. > > Level 1: resources are cleaned up reliably, but at an unpredictable time. > > Level 2: resources are cleaned up both reliably and promptly. 
>
> In Python 3.5, unless you're very anal about writing cumbersome 'async
> with' blocks around every single 'async for', resources owned by async
> iterators land at level 0. (Because the only cleanup method available
> is __del__, and __del__ cannot make async calls, so if you need async
> calls to do clean up then you're just doomed.)
>
> I think the revised draft does a good job of moving async
> generators from level 0 to level 1 -- the finalizer hook gives a way
> to effectively call back into the event loop from __del__, and the
> shutdown hook gives us a way to guarantee that the cleanup happens
> while the event loop is still running.

Right. It's good to hear that you agree that the latest revision of the
PEP makes AG cleanup reliable (albeit unpredictable when exactly that
will happen, more on that below). My goal was exactly this - make the
mechanism reliable, with the same predictability as what we have for
__del__.

> But... IIUC, it's now generally agreed that for Python code, level 1
> is simply *not good enough*. (Or to be a little more precise, it's
> good enough for the case where the resource being cleaned up is
> memory, because the garbage collector knows when memory is short, but
> it's not good enough for resources like file descriptors.) The classic
> example of this is code like:

I think this is where I don't agree with you 100%. There are no strict
guarantees that an object will be GCed in a timely manner in CPython or
PyPy. If it's part of a ref cycle, it might not be cleaned up at all.

All in all, in all your examples I don't see the exact place where AGs
are different from, let's say, synchronous generators. For instance:

> async def read_json_lines_from_server(host, port):
>     async for line in asyncio.open_connection(host, port)[0]:
>         yield json.loads(line)
>
> You would expect to use this like:
>
> async for data in read_json_lines_from_server(host, port):
>     ...
If you rewrite the above code without the 'async' keyword, you'd have a
synchronous generator with *exactly* the same problems.

> tl;dr: AFAICT this revision of PEP 525 is enough to make it work
> reliably on CPython, but I have serious concerns that it bakes a
> CPython-specific design into the language. I would prefer a design
> that actually aims for "level 2" cleanup semantics (for example, [1])
>

I honestly don't see why PEP 525 can't be implemented in PyPy. The
finalizing mechanism is built on top of the existing finalization of
synchronous generators, which is already implemented in PyPy. The
design of PEP 525 doesn't exploit any CPython-specific features (like
ref counting). If an alternative implementation of the Python
interpreter implements __del__ semantics properly, it shouldn't have
any problems with implementing PEP 525.

Thank you,
Yury

From ncoghlan at gmail.com  Sat Sep  3 15:15:01 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 4 Sep 2016 05:15:01 +1000
Subject: [Python-Dev] PEP 525, third round, better finalization
In-Reply-To: 
References: 
Message-ID: 

On 4 September 2016 at 04:38, Oscar Benjamin wrote:
> On 3 September 2016 at 16:42, Nick Coghlan wrote:
>> On 2 September 2016 at 19:13, Nathaniel Smith wrote:
>>> This works OK on CPython because the reference-counting gc will call
>>> handle.__del__() at the end of the scope (so on CPython it's at level
>>> 2), but it famously causes huge problems when porting to PyPy with
>>> its much faster and more sophisticated gc that only runs when
>>> triggered by memory pressure. (Or for "PyPy" you can substitute
>>> "Jython", "IronPython", whatever.) Technically this code doesn't
>>> actually "leak" file descriptors on PyPy, because handle.__del__()
>>> will get called *eventually* (this code is at level 1, not level 0),
>>> but by the time "eventually" arrives your server process has probably
>>> run out of file descriptors and crashed. Level 1 isn't good enough.
So
>>> now we have all learned to instead write
> ...
>>> BUT, with the current PEP 525 proposal, trying to use this generator
>>> in this way is exactly analogous to the open(path).read() case: on
>>> CPython it will work fine -- the generator object will leave scope at
>>> the end of the 'async for' loop, cleanup methods will be called, etc.
>>> But on PyPy, the weakref callback will not be triggered until some
>>> arbitrary time later, you will "leak" file descriptors, and your
>>> server will crash.
>>
>> That suggests the PyPy GC should probably be tracking pressure on more
>> resources than just memory when deciding whether or not to trigger a
>> GC run.
>
> PyPy's GC is conformant to the language spec

The language spec doesn't say anything about what triggers GC cycles -
that's purely a decision for runtime implementors based on the
programming experience they want to provide their users. CPython runs
GC pretty eagerly, with it being immediate when the automatic
reference counting is sufficient and the cyclic GC doesn't have to get
involved at all.

If I understand correctly, PyPy currently decides whether or not to
trigger a GC cycle based primarily on memory pressure, even though the
uncollected garbage may also be holding on to system resources other
than memory (like file descriptors).

For synchronous code, that's a relatively easy burden to push back
onto the programmer - assuming fair thread scheduling, a with
statement can reliably ensure prompt resource cleanup.

That assurance goes out the window as soon as you explicitly pause
code execution inside the body of the with statement - it doesn't
matter whether it's via yield, yield from, or await, you've completely
lost that assurance of immediacy.
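A tiny synchronous illustration of that loss of immediacy (the names here
are invented, not from the thread): once a generator suspends inside a
with block, __exit__ does not run until something resumes or closes the
generator:

```python
class Resource:
    """Toy resource that records when it is acquired and released."""
    def __init__(self, log):
        self.log = log
    def __enter__(self):
        self.log.append("acquired")
        return self
    def __exit__(self, *exc):
        self.log.append("released")

def read_items(log):
    # The with block is suspended at each yield, along with the frame.
    with Resource(log):
        yield 1
        yield 2

log = []
gen = read_items(log)
next(gen)    # resource acquired; we stop iterating here
# At this point __exit__ has NOT run: cleanup waits for close() or GC.
gen.close()  # GeneratorExit finally triggers __exit__
```

On CPython, dropping the last reference to `gen` would close it promptly
via refcounting; on a runtime that collects only under memory pressure,
the release can be delayed arbitrarily.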
At that point, even CPython doesn't ensure prompt release of resources - it just promises to try to clean things up as soon as it can and as best it can (which is usually pretty soon and pretty well, with recent iterations of 3.x, but event loops will still happily keep things alive indefinitely if they're waiting for events that never happen). For synchronous generators, you can make your API a bit more complicated, and ask your caller to handle the manual resource management, but you may not want to do that. The asynchronous case is even worse though, as there, you often simply can't readily push the burden back onto the programmer, because the code is *meant* to be waiting for events and reacting to them, rather than proceeding deterministically from beginning to end. So while it's good that PEP 492 and 525 attempt to adapt synchronous resource management models to the asynchronous world, it's also important to remember that there's a fundamental mismatch of underlying concepts when it comes to trying to pair up deterministic resource management with asynchronous code - you're often going to want to tip the model on its side and set up a dedicated resource manager that other components can interact with, and then have the resource manager take care of promptly releasing the resources when the other components go away (perhaps with notions of leases and lease renewals if you simply cannot afford unexpected delays in resources being released). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From yselivanov.ml at gmail.com Sat Sep 3 15:16:30 2016 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Sat, 3 Sep 2016 12:16:30 -0700 Subject: [Python-Dev] PEP 525, third round, better finalization In-Reply-To: References: Message-ID: <9b4c4c7a-5fce-8cf6-7b55-e9377f748ad4@gmail.com> Hi Oscar, > I don't think PyPy is in breach of the language spec here. 
Python made a decision a long time ago to shun RAII-style implicit
cleanup in favour of with-style explicit cleanup.
>
> The solution to this problem is to move resource management outside of
> the generator functions. This is true for ordinary generators without
> an event-loop etc. The example in the PEP is
>
> async def square_series(con, to):
>     async with con.transaction():
>         cursor = con.cursor(
>             'SELECT generate_series(0, $1) AS i', to)
>         async for row in cursor:
>             yield row['i'] ** 2
>
> async for i in square_series(con, 1000):
>     if i == 100:
>         break
>
> The normal generator equivalent of this is:
>
> def square_series(con, to):
>     with con.transaction():
>         cursor = con.cursor(
>             'SELECT generate_series(0, $1) AS i', to)
>         for row in cursor:
>             yield row['i'] ** 2
>
> This code is already broken: move the with statement outside to the
> caller of the generator function.

Exactly. I used 'async with' in the PEP to demonstrate that the cleanup
mechanisms are powerful enough to handle bad code patterns.

Thank you,
Yury

From brett at python.org  Sat Sep  3 15:27:19 2016
From: brett at python.org (Brett Cannon)
Date: Sat, 03 Sep 2016 19:27:19 +0000
Subject: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra
Message-ID: 

Below is the `co_extra` section of PEP 523 with the update saying that
users are expected to put a tuple in the field for easier simultaneous
use of the field.

Since the `co_extra` discussions do not affect CPython itself I'm
planning on landing the changes stemming from the PEP probably on
Monday.

----------

Expanding ``PyCodeObject``
--------------------------

One field is to be added to the ``PyCodeObject`` struct
[#pycodeobject]_::

  typedef struct {
     ...
     PyObject *co_extra;  /* "Scratch space" for the code object. */
  } PyCodeObject;

The ``co_extra`` will be ``NULL`` by default and will not be used by
CPython itself. Third-party code is free to use the field as desired.
Values stored in the field are expected to not be required in order for
the code object to function, allowing the loss of the data of the field
to be acceptable. The field will be freed like all other fields on
``PyCodeObject`` during deallocation using ``Py_XDECREF()``.

Code using the field is expected to always store a tuple in the field.
This allows for multiple users of the field to not trample over each
other while being as performant as possible. Typical usage of the field
is expected to roughly follow the following pseudo-code::

  if co_extra is None:
      data = DataClass()
      co_extra = (data,)
  else:
      assert isinstance(co_extra, tuple)
      for x in co_extra:
          if isinstance(x, DataClass):
              data = x
              break
      else:
          data = DataClass()
          co_extra += (data,)

Using a list was considered but was found to be less performant; with
JIT usage being a key use-case, performance was deemed important enough
that a tuple was chosen over a list. A tuple also makes more sense
semantically as the objects stored in the tuple will be heterogeneous.

A dict was also considered, but once again performance was more
important. While a dict will have constant overhead in looking up data,
the overhead for the common case of a single object being stored in the
data structure leads to a tuple having better performance
characteristics (i.e. iterating a tuple of length 1 is faster than the
overhead of hashing and looking up an object in a dict).
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From yselivanov.ml at gmail.com  Sat Sep  3 17:14:34 2016
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Sat, 3 Sep 2016 14:14:34 -0700
Subject: [Python-Dev] PEP 526 ready for review: Syntax for Variable and Attribute Annotations
In-Reply-To: 
References: 
Message-ID: 

On 2016-08-30 2:20 PM, Guido van Rossum wrote:
> I'm happy to present PEP 526 for your collective review:
> https://www.python.org/dev/peps/pep-0526/ (HTML)
> https://github.com/python/peps/blob/master/pep-0526.txt (source)
>
> There's also an implementation ready:
> https://github.com/ilevkivskyi/cpython/tree/pep-526
>
> I don't want to post the full text here but I encourage feedback on
> the high-order ideas, including but not limited to
>
> - Whether (given PEP 484's relative success) it's worth adding syntax
> for variable/attribute annotations.
>
> - Whether the keyword-free syntax idea proposed here is best:
> NAME: TYPE
> TARGET: TYPE = VALUE

I'm in favour of the PEP, and I like the syntax. I find it much better
than any previously discussed alternatives.

Static typing is becoming increasingly popular, and the benefits of
using static type checkers for big code bases are clear. The PEP
doesn't really change the semantics of the language, it only allows
better tooling (using comments for annotations was fine too, but
dedicated syntax makes this feature a first-class citizen).

Yury

From yselivanov.ml at gmail.com  Sat Sep  3 18:03:17 2016
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Sat, 3 Sep 2016 15:03:17 -0700
Subject: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra
In-Reply-To: 
References: 
Message-ID: <594cf0f3-8e6f-e4ee-8ad4-cb619e1b3401@gmail.com>

On 2016-09-03 12:27 PM, Brett Cannon wrote:
> Below is the `co_extra` section of PEP 523 with the update saying that
> users are expected to put a tuple in the field for easier simultaneous
> use of the field.
>
> Since the `co_extra` discussions do not affect CPython itself I'm
> planning on landing the changes stemming from the PEP probably on Monday.

Tuples are immutable. If you have multiple co_extra users then they
will have to either mutate the tuple (which isn't always possible, for
instance, you can't increase its size), or replace it with another
tuple.

Creating lists is a bit more expensive, but item access speed should be
in the same ballpark.

Another question -- sorry if this was discussed before -- why do we
want a PyObject* there at all? I.e. why don't we create a dedicated
struct CoExtraContainer to manage the stuff in co_extra? My
understanding is that the users of co_extra are C-level Python
optimizers and profilers, which don't need the overhead of the CPython
API.

This way my work to add an extra caching layer (which I'm very much
willing to continue to work on) wouldn't require another set of extra
fields for code objects.

Yury

From k7hoven at gmail.com  Sat Sep  3 18:06:01 2016
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Sun, 4 Sep 2016 01:06:01 +0300
Subject: [Python-Dev] PEP 467: last round (?)
In-Reply-To: 
References: <57C88355.9000302@stoneleaf.us>
Message-ID: 

On Sat, Sep 3, 2016 at 7:59 PM, Nick Coghlan wrote:
> On 3 September 2016 at 03:54, Koos Zevenhoven wrote:
>> chrb seems to be more in line with some bytes versions in for instance os
>> than bchr.
>
> The mnemonic for the current name in the PEP is that bchr is to chr as
> b"" is to "". The PEP should probably say that in addition to pointing
> out the 'unichr' Python 2 inspiration, though.

Thanks for explaining. Indeed I hope that unichr does not affect any
naming decisions that will remain in the language for a long time.

> The other big difference between this and the os module case, is that
> the resulting builtin constructor pairs here are str/chr (arbitrary
> text, single code point) and bytes/bchr (arbitrary binary data, single
> binary octet).
By contrast, os.getcwd() and os.getcwdb() (and similar
> APIs) are both referring to the same operating system level operation,
> they're just requesting a different return type for the data.

But chr and "bchr" are also requesting a different return type. The
difference is that the data is not coming from an os-level operation
but from an int.

I guess one reason I don't like bchr (nor chrb, really) is that they
look just like a random sequence of letters in builtins, but not
recognizable the way asdf would be.

I guess I have one last pair of suggestions for the name of this
function: bytes.chr or bytes.char.

-- Koos

> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +

From random832 at fastmail.com  Sat Sep  3 18:10:06 2016
From: random832 at fastmail.com (Random832)
Date: Sat, 03 Sep 2016 18:10:06 -0400
Subject: [Python-Dev] PEP 467: last round (?)
In-Reply-To: 
References: <57C88355.9000302@stoneleaf.us>
Message-ID: <1472940606.890255.714952393.21FE6B71@webmail.messagingengine.com>

On Sat, Sep 3, 2016, at 08:08, Martin Panter wrote:
> On 1 September 2016 at 19:36, Ethan Furman wrote:
> > Deprecation of current "zero-initialised sequence" behaviour without removal
> > ----------------------------------------------------------------------------
> >
> > Currently, the ``bytes`` and ``bytearray`` constructors accept an integer
> > argument and interpret it as meaning to create a zero-initialised sequence
> > of the given size::
> >
> >     >>> bytes(3)
> >     b'\x00\x00\x00'
> >     >>> bytearray(3)
> >     bytearray(b'\x00\x00\x00')
> >
> > This PEP proposes to deprecate that behaviour in Python 3.6, but to leave
> > it in place for at least as long as Python 2.7 is supported, possibly
> > indefinitely.
>
> Can you clarify what "deprecate" means? Just add a note in the
> documentation, or make calls trigger a DeprecationWarning as well?
> Having bytearray(n) trigger a DeprecationWarning would be a minor
> annoyance for code compatible with Python 2 and 3, since
> bytearray(n) is supported in Python 2.

I don't think bytearray(n) should be deprecated. I don't think that
deprecating bytes(n) should entail also deprecating bytearray(n). If I
were designing these classes from scratch, I would not feel any impulse
to make their constructors take the same arguments or have the same
semantics, and I'm a bit unclear on what the reason for this decision
was.

I also don't think bytes.fromcount(n) is necessary. What's wrong with
b'\0'*n? I could swear this has been answered before, but I don't
recall what the answer was. I don't think the rationale mentioned in
the PEP is an adequate explanation, it references an earlier decision,
about a conceptually different class (it's an operation that's much
more common with mutable classes than immutable ones - when's the last
time you did (None,)*n relative to [None]*n), without actually
explaining the real reason for either underlying decision (having
bytearray(n) and having both classes take the same constructor
arguments).

I think that the functions we should add/keep are:

bytes(values: Union[bytes, bytearray, Iterable[int]])
bytearray(count: int)
bytearray(values: Union[bytes, bytearray, Iterable[int]])
bchr(integer)

If, incidentally, we're going to add a .fromsize method, it'd be nice
to add a way to provide a fill value other than 0. Also, maybe we
should also add it for list and tuple (with the default value None)?

For the (string, encoding) signatures, there's no good reason to keep
them [TOOWTDI is str.encode] but no good reason to get rid of them
either.

From random832 at fastmail.com  Sat Sep  3 18:11:00 2016
From: random832 at fastmail.com (Random832)
Date: Sat, 03 Sep 2016 18:11:00 -0400
Subject: [Python-Dev] PEP 467: last round (?)
In-Reply-To: References: <57C88355.9000302@stoneleaf.us> Message-ID: <1472940660.890622.714961169.3425F103@webmail.messagingengine.com> On Sat, Sep 3, 2016, at 18:06, Koos Zevenhoven wrote: > I guess one reason I don't like bchr (nor chrb, really) is that they > look just like a random sequence of letters in builtins, but not > recognizable the way asdf would be. > > I guess I have one last pair of suggestions for the name of this > function: bytes.chr or bytes.char. What about byte? Like, not bytes.byte, just builtins.byte. From k7hoven at gmail.com Sat Sep 3 18:21:32 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sun, 4 Sep 2016 01:21:32 +0300 Subject: [Python-Dev] PEP 467: last round (?) In-Reply-To: <57CAEF46.6000402@stoneleaf.us> References: <57C88355.9000302@stoneleaf.us> <57CAEF46.6000402@stoneleaf.us> Message-ID: On Sat, Sep 3, 2016 at 6:41 PM, Ethan Furman wrote: >>> >>> Open Questions >>> ============== >>> >>> Do we add ``iterbytes`` to ``memoryview``, or modify >>> ``memoryview.cast()`` to accept ``'s'`` as a single-byte interpretation? >>> Or >>> do we ignore memory for now and add it later? >> >> >> Apparently memoryview.cast('s') comes from Nick Coghlan: >> >> . >> However, since 3.5 (https://bugs.python.org/issue15944) you can call >> cast("c") on most memoryviews, which I think already does what you >> want: >> >>>>> tuple(memoryview(b"ABC").cast("c")) >> >> (b'A', b'B', b'C') > > > Nice! > Indeed! Exposing this as bytes_instance.chars would make porting from Python 2 really simple. Of course even better would be if slicing the view would return bytes, so the porting rule would be the same for all bytes subscripting: py2str[SOMETHING] becomes py3bytes.chars[SOMETHING] With the "c" memoryview there will be a distinction between slicing and indexing. And Random832 seems to be making some good points. 
--- Koos > -- > ~Ethan~ > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/k7hoven%40gmail.com -- + Koos Zevenhoven + http://twitter.com/k7hoven + From levkivskyi at gmail.com Sat Sep 3 18:23:37 2016 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Sun, 4 Sep 2016 00:23:37 +0200 Subject: [Python-Dev] PEP 467: last round (?) In-Reply-To: <1472940660.890622.714961169.3425F103@webmail.messagingengine.com> References: <57C88355.9000302@stoneleaf.us> <1472940660.890622.714961169.3425F103@webmail.messagingengine.com> Message-ID: On 4 September 2016 at 00:11, Random832 wrote: > On Sat, Sep 3, 2016, at 18:06, Koos Zevenhoven wrote: > > I guess one reason I don't like bchr (nor chrb, really) is that they > > look just like a random sequence of letters in builtins, but not > > recognizable the way asdf would be. > > > > I guess I have one last pair of suggestions for the name of this > > function: bytes.chr or bytes.char. > > What about byte? Like, not bytes.byte, just builtins.byte. > I like this option, it would be very "symmetric" to have, compare: >>>chr(42) '*' >>>str() '' with this: >>>byte(42) b'*' >>>bytes() b'' It is easy to explain and remember this. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Sat Sep 3 18:36:08 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sun, 4 Sep 2016 01:36:08 +0300 Subject: [Python-Dev] PEP 467: last round (?) 
In-Reply-To: References: <57C88355.9000302@stoneleaf.us> <1472940660.890622.714961169.3425F103@webmail.messagingengine.com> Message-ID: On Sun, Sep 4, 2016 at 1:23 AM, Ivan Levkivskyi wrote: > On 4 September 2016 at 00:11, Random832 wrote: >> >> On Sat, Sep 3, 2016, at 18:06, Koos Zevenhoven wrote: >> > I guess one reason I don't like bchr (nor chrb, really) is that they >> > look just like a random sequence of letters in builtins, but not >> > recognizable the way asdf would be. >> > >> > I guess I have one last pair of suggestions for the name of this >> > function: bytes.chr or bytes.char. >> >> What about byte? Like, not bytes.byte, just builtins.byte. > > > I like this option, it would be very "symmetric" to have, compare: > >>>>chr(42) > '*' >>>>str() > '' > > with this: > >>>>byte(42) > b'*' >>>>bytes() > b'' > > It is easy to explain and remember this. In one way, I like it, but on the other hand, indexing a bytes gives an integer, so maybe a 'byte' is just an integer in range(256). Also, having both byte and bytes would be a slight annoyance with autocomplete. -- Koos From rosuav at gmail.com Sat Sep 3 19:13:05 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 4 Sep 2016 09:13:05 +1000 Subject: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra In-Reply-To: <594cf0f3-8e6f-e4ee-8ad4-cb619e1b3401@gmail.com> References: <594cf0f3-8e6f-e4ee-8ad4-cb619e1b3401@gmail.com> Message-ID: On Sun, Sep 4, 2016 at 8:03 AM, Yury Selivanov wrote: > On 2016-09-03 12:27 PM, Brett Cannon wrote: >> >> Below is the `co_extra` section of PEP 523 with the update saying that >> users are expected to put a tuple in the field for easier simultaneous use >> of the field. >> >> Since the `co_extra` discussions do not affect CPython itself I'm planning >> on landing the changes stemming from the PEP probably on Monday. > > > Tuples are immutable. 
If you have multiple co_extra users then they will > have to either mutate tuple (which isn't always possible, for instance, you > can't increase size), or to replace it with another tuple. Replace it, but only as they register themselves with a particular function. Imagine a profiler doing something vaguely like this: class FunctionStats: def __init__(self): self.info = [whatever, whatever, blah blah] def profile(func): """Decorator to mark a function for profiling""" func.__code__.co_extra += (FunctionStats(),) return func Tuple immutability impacts the initialization only. After that, you just iterate over it. ChrisA From christian at python.org Sat Sep 3 19:15:09 2016 From: christian at python.org (Christian Heimes) Date: Sun, 4 Sep 2016 01:15:09 +0200 Subject: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra In-Reply-To: <594cf0f3-8e6f-e4ee-8ad4-cb619e1b3401@gmail.com> References: <594cf0f3-8e6f-e4ee-8ad4-cb619e1b3401@gmail.com> Message-ID: On 2016-09-04 00:03, Yury Selivanov wrote: > > > On 2016-09-03 12:27 PM, Brett Cannon wrote: >> Below is the `co_extra` section of PEP 523 with the update saying that >> users are expected to put a tuple in the field for easier simultaneous >> use of the field. >> >> Since the `co_extra` discussions do not affect CPython itself I'm >> planning on landing the changes stemming from the PEP probably on Monday. > > Tuples are immutable. If you have multiple co_extra users then they > will have to either mutate tuple (which isn't always possible, for > instance, you can't increase size), or to replace it with another tuple. > > Creating lists is a bit more expensive, but item access speed should be > in the same ballpark. > > Another question -- sorry if this was discussed before -- why do we want > a PyObject* there at all? I.e. why don't we create a dedicated struct > CoExtraContainer to manage the stuff in co_extra? 
My understanding is
> that the users of co_extra are C-level Python optimizers and profilers,
> which don't need the overhead of the CPython API.
>
> This way my work to add an extra caching layer (which I'm very much
> willing to continue to work on) wouldn't require another set of extra
> fields for code objects.

Quick idea before I go to bed:

You could adopt a similar API to OpenSSL's CRYPTO_get_ex_new_index()
API,
https://www.openssl.org/docs/manmaster/crypto/CRYPTO_get_ex_new_index.html

static int code_index = 0;

int PyCodeObject_NewIndex() {
    return code_index++;
}

A library like Pyjion has to acquire an index first. In further calls
it uses the index as an offset into the new co_extra field. Libraries
don't have to hard-code their offset and two libraries will never
conflict. PyCode_New() can pre-populate co_extra with a PyTuple of size
code_index. This avoids most resizes if you load Pyjion early. For
code_index == 0, leave the field NULL.

Christian

From yselivanov.ml at gmail.com  Sat Sep  3 19:42:50 2016
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Sat, 3 Sep 2016 16:42:50 -0700
Subject: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra
In-Reply-To: 
References: <594cf0f3-8e6f-e4ee-8ad4-cb619e1b3401@gmail.com>
Message-ID: 

On 2016-09-03 4:15 PM, Christian Heimes wrote:
> On 2016-09-04 00:03, Yury Selivanov wrote:
>>
>> On 2016-09-03 12:27 PM, Brett Cannon wrote:
>>> Below is the `co_extra` section of PEP 523 with the update saying that
>>> users are expected to put a tuple in the field for easier simultaneous
>>> use of the field.
>>>
>>> Since the `co_extra` discussions do not affect CPython itself I'm
>>> planning on landing the changes stemming from the PEP probably on Monday.
>> Tuples are immutable. If you have multiple co_extra users then they
>> will have to either mutate tuple (which isn't always possible, for
>> instance, you can't increase size), or to replace it with another tuple.
>> >> Creating lists is a bit more expensive, but item access speed should be >> in the same ballpark. >> >> Another question -- sorry if this was discussed before -- why do we want >> a PyObject* there at all? I.e. why don't we create a dedicated struct >> CoExtraContainer to manage the stuff in co_extra? My understanding is >> that the users of co_extra are C-level python optimizers and profilers, >> which don't need the overhead of CPython API. >> >> This way my work to add an extra caching layer (which I'm very much >> willing to continue to work on) wouldn't require another set of extra >> fields for code objects. > Quick idea before I go to bed: > > You could adopt a similar API to OpenSSL's CRYPTO_get_ex_new_index() > API, > https://www.openssl.org/docs/manmaster/crypto/CRYPTO_get_ex_new_index.html > > > static int code_index = 0; > > int PyCodeObject_NewIndex() { > return code_index++; > } > > A library like Pyjion has to acquire an index first. In further calls it > uses the index as offset into the new co_extra field. Libraries don't > have to hard-code their offset and two libraries will never conflict. > PyCode_New() can pre-populate co_extra with a PyTuple of size > code_index. This avoids most resizes if you load Pyjion early. For > code_index == 0 leaf the field NULL. Sounds like a very good idea! 
Yury From yselivanov.ml at gmail.com Sat Sep 3 19:49:53 2016 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Sat, 3 Sep 2016 16:49:53 -0700 Subject: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra In-Reply-To: References: <594cf0f3-8e6f-e4ee-8ad4-cb619e1b3401@gmail.com> Message-ID: <99623b88-4275-53a6-d179-0cbe210aa0d3@gmail.com> On 2016-09-03 4:13 PM, Chris Angelico wrote: > On Sun, Sep 4, 2016 at 8:03 AM, Yury Selivanov wrote: >> On 2016-09-03 12:27 PM, Brett Cannon wrote: >>> Below is the `co_extra` section of PEP 523 with the update saying that >>> users are expected to put a tuple in the field for easier simultaneous use >>> of the field. >>> >>> Since the `co_extra` discussions do not affect CPython itself I'm planning >>> on landing the changes stemming from the PEP probably on Monday. >> >> Tuples are immutable. If you have multiple co_extra users then they will >> have to either mutate tuple (which isn't always possible, for instance, you >> can't increase size), or to replace it with another tuple. > Replace it, but only as they register themselves with a particular > function. Imagine a profiler doing something vaguely like this: "Replacing" makes it error prone to cache the pointer even for small periods of time. Defining co_extra using Python C API forces us to acquire the GIL etc (aside from other performance penalties). Although we probably would recommend to use the GIL anyways, I'm not sure tuple really simplifies anything here. > > class FunctionStats: > def __init__(self): > self.info = [whatever, whatever, blah blah] > > def profile(func): > """Decorator to mark a function for profiling""" > func.__code__.co_extra += (FunctionStats(),) > return func > > Tuple immutability impacts the initialization only. After that, you > just iterate over it. I wasn't aware we wanted to expose co_extra to Python land. I'm not convinced it's a good idea, because exposing, say, Pyjion JIT state to Python doesn't make any sense. 
At least for Python 3.6 I don't think we would want to expose this field. Moreover, profiling Python with a pure Python profiler is kind of slow... I'm sure people use C for that anyways. Yury From rosuav at gmail.com Sat Sep 3 20:15:27 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 4 Sep 2016 10:15:27 +1000 Subject: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra In-Reply-To: <99623b88-4275-53a6-d179-0cbe210aa0d3@gmail.com> References: <594cf0f3-8e6f-e4ee-8ad4-cb619e1b3401@gmail.com> <99623b88-4275-53a6-d179-0cbe210aa0d3@gmail.com> Message-ID: On Sun, Sep 4, 2016 at 9:49 AM, Yury Selivanov wrote: > > > On 2016-09-03 4:13 PM, Chris Angelico wrote: >> Replace it, but only as they register themselves with a particular >> function. Imagine a profiler doing something vaguely like this: > > > "Replacing" makes it error prone to cache the pointer even for small periods > of time. Defining co_extra using Python C API forces us to acquire the GIL > etc (aside from other performance penalties). Although we probably would > recommend to use the GIL anyways, I'm not sure tuple really simplifies > anything here. If everyone behaves properly, it should be safe. tuple_pointer = co_extra max_index = len(tuple_pointer) is tuple_pointer[0] mine? No -- someone appends to the tuple -- is tuple_pointer[1] mine? No The only effect of caching is that, in effect, mutations aren't seen till the end of the iteration - a short time anyway. >> class FunctionStats: >> def __init__(self): >> self.info = [whatever, whatever, blah blah] >> >> def profile(func): >> """Decorator to mark a function for profiling""" >> func.__code__.co_extra += (FunctionStats(),) >> return func >> >> Tuple immutability impacts the initialization only. After that, you >> just iterate over it. > > > I wasn't aware we wanted to expose co_extra to Python land. I'm not > convinced it's a good idea, because exposing, say, Pyjion JIT state to > Python doesn't make any sense. 
At least for Python 3.6 I don't think we > would want to expose this field. > > Moreover, profiling Python with a pure Python profiler is kind of slow... > I'm sure people use C for that anyways. This is what I get for overly embracing the notion that Python is executable pseudo-code :) Yes, this would normally be happening in C, but notionally, it'll be like that. ChrisA From brett at python.org Sat Sep 3 20:19:44 2016 From: brett at python.org (Brett Cannon) Date: Sun, 04 Sep 2016 00:19:44 +0000 Subject: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra In-Reply-To: References: <594cf0f3-8e6f-e4ee-8ad4-cb619e1b3401@gmail.com> Message-ID: On Sat, 3 Sep 2016 at 16:43 Yury Selivanov wrote: > > > On 2016-09-03 4:15 PM, Christian Heimes wrote: > > On 2016-09-04 00:03, Yury Selivanov wrote: > >> > >> On 2016-09-03 12:27 PM, Brett Cannon wrote: > >>> Below is the `co_extra` section of PEP 523 with the update saying that > >>> users are expected to put a tuple in the field for easier simultaneous > >>> use of the field. > >>> > >>> Since the `co_extra` discussions do not affect CPython itself I'm > >>> planning on landing the changes stemming from the PEP probably on > Monday. > >> Tuples are immutable. If you have multiple co_extra users then they > >> will have to either mutate tuple (which isn't always possible, for > >> instance, you can't increase size), or to replace it with another tuple. > >> > >> Creating lists is a bit more expensive, but item access speed should be > >> in the same ballpark. > >> > >> Another question -- sorry if this was discussed before -- why do we want > >> a PyObject* there at all? I.e. why don't we create a dedicated struct > >> CoExtraContainer to manage the stuff in co_extra? My understanding is > >> that the users of co_extra are C-level python optimizers and profilers, > >> which don't need the overhead of CPython API. 
> As Chris pointed out in another email, the overhead is only in the allocation, not the iteration/access if you use the PyTuple macros to get the size and index into the tuple the overhead is negligible. > >> > >> This way my work to add an extra caching layer (which I'm very much > >> willing to continue to work on) wouldn't require another set of extra > >> fields for code objects. > > Quick idea before I go to bed: > > > > You could adopt a similar API to OpenSSL's CRYPTO_get_ex_new_index() > > API, > > > https://www.openssl.org/docs/manmaster/crypto/CRYPTO_get_ex_new_index.html > > > > > > static int code_index = 0; > > > > int PyCodeObject_NewIndex() { > > return code_index++; > > } > > > > A library like Pyjion has to acquire an index first. In further calls it > > uses the index as offset into the new co_extra field. Libraries don't > > have to hard-code their offset and two libraries will never conflict. > > PyCode_New() can pre-populate co_extra with a PyTuple of size > > code_index. This avoids most resizes if you load Pyjion early. For > > code_index == 0 leaf the field NULL. > > Sounds like a very good idea! > The problem with this is the pre-population. If you don't get your index assigned before the very first code object is allocated then you still have to manage the size of the tuple in co_extra. So what this would do is avoid the iteration but not the allocation overhead. If we open up the can of worms in terms of custom functions for this (which I was trying to avoid), then you end up with Py_ssize_t _PyCode_ExtraIndex(), PyObject * _PyCode_GetExtra(PyCodeObject *code, Py_ssize_t index), and int _PyCode_SetExtra(PyCodeObject *code, Py_ssize_t index, PyObject *data) which does all the right things for creating or resizing the tuple as necessary and which I think matches mostly what Nick had proposed earlier. 
But the pseudo-code for _PyCode_GetExtra() would be:: if co_extra is None: co_extra = (None,) * _next_extra_index; return None elif len(co_extra) < index - 1: ... pad out tuple return None else: return co_extra[index] Is that going to save us enough to want to have a custom API for this? -Brett > > Yury > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Sat Sep 3 20:20:47 2016 From: brett at python.org (Brett Cannon) Date: Sun, 04 Sep 2016 00:20:47 +0000 Subject: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra In-Reply-To: <99623b88-4275-53a6-d179-0cbe210aa0d3@gmail.com> References: <594cf0f3-8e6f-e4ee-8ad4-cb619e1b3401@gmail.com> <99623b88-4275-53a6-d179-0cbe210aa0d3@gmail.com> Message-ID: On Sat, 3 Sep 2016 at 16:55 Yury Selivanov wrote: > > > On 2016-09-03 4:13 PM, Chris Angelico wrote: > > On Sun, Sep 4, 2016 at 8:03 AM, Yury Selivanov > wrote: > >> On 2016-09-03 12:27 PM, Brett Cannon wrote: > >>> Below is the `co_extra` section of PEP 523 with the update saying that > >>> users are expected to put a tuple in the field for easier simultaneous > use > >>> of the field. > >>> > >>> Since the `co_extra` discussions do not affect CPython itself I'm > planning > >>> on landing the changes stemming from the PEP probably on Monday. > >> > >> Tuples are immutable. If you have multiple co_extra users then they > will > >> have to either mutate tuple (which isn't always possible, for instance, > you > >> can't increase size), or to replace it with another tuple. > > Replace it, but only as they register themselves with a particular > > function. Imagine a profiler doing something vaguely like this: > > "Replacing" makes it error prone to cache the pointer even for small > periods of time. Defining co_extra using Python C API forces us to > acquire the GIL etc (aside from other performance penalties). 
Although > we probably would recommend to use the GIL anyways, I'm not sure tuple > really simplifies anything here. > > > > > class FunctionStats: > > def __init__(self): > > self.info = [whatever, whatever, blah blah] > > > > def profile(func): > > """Decorator to mark a function for profiling""" > > func.__code__.co_extra += (FunctionStats(),) > > return func > > > > Tuple immutability impacts the initialization only. After that, you > > just iterate over it. > > I wasn't aware we wanted to expose co_extra to Python land. > We are most definitely not exposing the field to Python code. -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov.ml at gmail.com Sat Sep 3 20:27:11 2016 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Sat, 3 Sep 2016 17:27:11 -0700 Subject: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra In-Reply-To: References: <594cf0f3-8e6f-e4ee-8ad4-cb619e1b3401@gmail.com> Message-ID: <5e8cc8d6-4241-afdc-07af-c52826e08877@gmail.com> On 2016-09-03 5:19 PM, Brett Cannon wrote: > > > On Sat, 3 Sep 2016 at 16:43 Yury Selivanov > wrote: > > > > On 2016-09-03 4:15 PM, Christian Heimes wrote: > > On 2016-09-04 00:03, Yury Selivanov wrote: > >> > >> On 2016-09-03 12:27 PM, Brett Cannon wrote: > >>> Below is the `co_extra` section of PEP 523 with the update > saying that > >>> users are expected to put a tuple in the field for easier > simultaneous > >>> use of the field. > >>> > >>> Since the `co_extra` discussions do not affect CPython itself I'm > >>> planning on landing the changes stemming from the PEP probably > on Monday. > >> Tuples are immutable. If you have multiple co_extra users then > they > >> will have to either mutate tuple (which isn't always possible, for > >> instance, you can't increase size), or to replace it with > another tuple. > >> > >> Creating lists is a bit more expensive, but item access speed > should be > >> in the same ballpark. 
> >> > >> Another question -- sorry if this was discussed before -- why > do we want > >> a PyObject* there at all? I.e. why don't we create a dedicated > struct > >> CoExtraContainer to manage the stuff in co_extra? My > understanding is > >> that the users of co_extra are C-level python optimizers and > profilers, > >> which don't need the overhead of CPython API. > > > As Chris pointed out in another email, the overhead is only in the > allocation, not the iteration/access if you use the PyTuple macros to > get the size and index into the tuple the overhead is negligible. Yes, my point was that it's as cheap to use a list as a tuple for co_extra. If we decide to store PyObject in co_extra. > >> > >> This way my work to add an extra caching layer (which I'm very much > >> willing to continue to work on) wouldn't require another set of > extra > >> fields for code objects. > > Quick idea before I go to bed: > > > > You could adopt a similar API to OpenSSL's CRYPTO_get_ex_new_index() > > API, > > > https://www.openssl.org/docs/manmaster/crypto/CRYPTO_get_ex_new_index.html > > > > > > static int code_index = 0; > > > > int PyCodeObject_NewIndex() { > > return code_index++; > > } > > > > A library like Pyjion has to acquire an index first. In further > calls it > > uses the index as offset into the new co_extra field. Libraries > don't > > have to hard-code their offset and two libraries will never > conflict. > > PyCode_New() can pre-populate co_extra with a PyTuple of size > > code_index. This avoids most resizes if you load Pyjion early. For > > code_index == 0 leaf the field NULL. > > Sounds like a very good idea! > > > The problem with this is the pre-population. If you don't get your > index assigned before the very first code object is allocated then you > still have to manage the size of the tuple in co_extra. So what this > would do is avoid the iteration but not the allocation overhead. 
> > If we open up the can of worms in terms of custom functions for this > (which I was trying to avoid), then you end up with Py_ssize_t > _PyCode_ExtraIndex(), PyObject * > _PyCode_GetExtra(PyCodeObject *code, Py_ssize_t index), and int > _PyCode_SetExtra(PyCodeObject *code, Py_ssize_t index, PyObject *data) > which does all the right things for creating or resizing the tuple as > necessary and which I think matches mostly what Nick had proposed > earlier. But the pseudo-code for _PyCode_GetExtra() would be:: > > if co_extra is None: > co_extra = (None,) * _next_extra_index; > return None > elif len(co_extra) < index - 1: > ... pad out tuple > return None > else: > return co_extra[index] > > Is that going to save us enough to want to have a custom API for this? But without that new API (basically what Christian proposed) you'd need to iterate over the list in order to find the object that belongs to Pyjion. If we manage to implement my opcode caching idea, we'll have at least two known users of co_extra. Without a way to claim a particular index in co_extra you will have some overhead to locate your objects. 
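[Editor's note: the index-claiming scheme Christian proposed and Yury endorses above can be modelled in pure Python. This is an illustrative sketch only — every name here (``code_extra_new_index``, ``code_get_extra``, ``code_set_extra``) is a stand-in invented for this example, not the actual C-level API, and a plain list stands in for the co_extra storage.]

```python
import itertools

# Illustrative model only: a global counter hands out unique slots,
# so each tool claims its index once and never has to scan for
# ownership markers afterwards.
_next_index = itertools.count()

def code_extra_new_index():
    """Claim a unique slot in every code object's extra storage."""
    return next(_next_index)

def code_get_extra(extras, index):
    """Return the object stored at `index`, or None if never set."""
    if index < len(extras):
        return extras[index]
    return None

def code_set_extra(extras, index, value):
    """Store `value` at `index`, padding the storage as needed."""
    while len(extras) <= index:
        extras.append(None)
    extras[index] = value

# Two independent tools (say, a JIT and an opcode cache) each claim
# an index up front, then get O(1) access with no iteration.
jit_slot = code_extra_new_index()
cache_slot = code_extra_new_index()

co_extra = []  # stand-in for one code object's extra field
code_set_extra(co_extra, cache_slot, {"opcode_cache": {}})
assert code_get_extra(co_extra, jit_slot) is None
assert code_get_extra(co_extra, cache_slot) == {"opcode_cache": {}}
```

This is the trade-off being discussed: claimed indices avoid the per-access scan, at the cost of padding the storage when a late-registering tool's index exceeds the current size.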
Yury From brett at python.org Sat Sep 3 20:36:39 2016 From: brett at python.org (Brett Cannon) Date: Sun, 04 Sep 2016 00:36:39 +0000 Subject: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra In-Reply-To: <5e8cc8d6-4241-afdc-07af-c52826e08877@gmail.com> References: <594cf0f3-8e6f-e4ee-8ad4-cb619e1b3401@gmail.com> <5e8cc8d6-4241-afdc-07af-c52826e08877@gmail.com> Message-ID: On Sat, 3 Sep 2016 at 17:27 Yury Selivanov wrote: > > On 2016-09-03 5:19 PM, Brett Cannon wrote: > > > > > > On Sat, 3 Sep 2016 at 16:43 Yury Selivanov > > wrote: > > > > > > > > On 2016-09-03 4:15 PM, Christian Heimes wrote: > > > On 2016-09-04 00:03, Yury Selivanov wrote: > > >> > > >> On 2016-09-03 12:27 PM, Brett Cannon wrote: > > >>> Below is the `co_extra` section of PEP 523 with the update > > saying that > > >>> users are expected to put a tuple in the field for easier > > simultaneous > > >>> use of the field. > > >>> > > >>> Since the `co_extra` discussions do not affect CPython itself I'm > > >>> planning on landing the changes stemming from the PEP probably > > on Monday. > > >> Tuples are immutable. If you have multiple co_extra users then > > they > > >> will have to either mutate tuple (which isn't always possible, for > > >> instance, you can't increase size), or to replace it with > > another tuple. > > >> > > >> Creating lists is a bit more expensive, but item access speed > > should be > > >> in the same ballpark. > > >> > > >> Another question -- sorry if this was discussed before -- why > > do we want > > >> a PyObject* there at all? I.e. why don't we create a dedicated > > struct > > >> CoExtraContainer to manage the stuff in co_extra? My > > understanding is > > >> that the users of co_extra are C-level python optimizers and > > profilers, > > >> which don't need the overhead of CPython API. 
> > > > > > As Chris pointed out in another email, the overhead is only in the > > allocation, not the iteration/access if you use the PyTuple macros to > > get the size and index into the tuple the overhead is negligible. > > Yes, my point was that it's as cheap to use a list as a tuple for > co_extra. If we decide to store PyObject in co_extra. > > > >> > > >> This way my work to add an extra caching layer (which I'm very > much > > >> willing to continue to work on) wouldn't require another set of > > extra > > >> fields for code objects. > > > Quick idea before I go to bed: > > > > > > You could adopt a similar API to OpenSSL's > CRYPTO_get_ex_new_index() > > > API, > > > > > > https://www.openssl.org/docs/manmaster/crypto/CRYPTO_get_ex_new_index.html > > > > > > > > > static int code_index = 0; > > > > > > int PyCodeObject_NewIndex() { > > > return code_index++; > > > } > > > > > > A library like Pyjion has to acquire an index first. In further > > calls it > > > uses the index as offset into the new co_extra field. Libraries > > don't > > > have to hard-code their offset and two libraries will never > > conflict. > > > PyCode_New() can pre-populate co_extra with a PyTuple of size > > > code_index. This avoids most resizes if you load Pyjion early. For > > > code_index == 0 leaf the field NULL. > > > > Sounds like a very good idea! > > > > > > The problem with this is the pre-population. If you don't get your > > index assigned before the very first code object is allocated then you > > still have to manage the size of the tuple in co_extra. So what this > > would do is avoid the iteration but not the allocation overhead. 
> > > > If we open up the can of worms in terms of custom functions for this > > (which I was trying to avoid), then you end up with Py_ssize_t > > _PyCode_ExtraIndex(), PyObject * > > _PyCode_GetExtra(PyCodeObject *code, Py_ssize_t index), and int > > _PyCode_SetExtra(PyCodeObject *code, Py_ssize_t index, PyObject *data) > > which does all the right things for creating or resizing the tuple as > > necessary and which I think matches mostly what Nick had proposed > > earlier. But the pseudo-code for _PyCode_GetExtra() would be:: > > > > if co_extra is None: > > co_extra = (None,) * _next_extra_index; > > return None > > elif len(co_extra) < index - 1: > > ... pad out tuple > > return None > > else: > > return co_extra[index] > > > > Is that going to save us enough to want to have a custom API for this? > > But without that new API (basically what Christian proposed) you'd need > to iterate over the list in order to find the object that belongs to > Pyjion. Yes. > If we manage to implement my opcode caching idea, we'll have at > least two known users of co_extra. Without a way to claim a particular > index in co_extra you will have some overhead to locate your objects. > Two things. One, I would want any new API to start with an underscore so people know we can and will change its semantics as necessary. Two, Guido would have to re-accept the PEP as this is a shift in the use of the field if this is how people want to go. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From yselivanov.ml at gmail.com Sat Sep 3 20:45:19 2016 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Sat, 3 Sep 2016 17:45:19 -0700 Subject: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra In-Reply-To: References: <594cf0f3-8e6f-e4ee-8ad4-cb619e1b3401@gmail.com> <5e8cc8d6-4241-afdc-07af-c52826e08877@gmail.com> Message-ID: > > But without that new API (basically what Christian proposed) you'd > need > to iterate over the list in order to find the object that belongs to > Pyjion. > > > Yes. Yeah, which means the same for my opcode patch... Which unfortunately will make things slower :( > If we manage to implement my opcode caching idea, we'll have at > least two known users of co_extra. Without a way to claim a > particular > index in co_extra you will have some overhead to locate your objects. > > > Two things. One, I would want any new API to start with an underscore > so people know we can and will change its semantics as necessary. Two, > Guido would have to re-accept the PEP as this is a shift in the use of > the field if this is how people want to go. Since this isn't a user-facing/public API feature, are we *really* forced to accept/implement the PEP before the beta? I'd be happy to spend some time tomorrow/Monday to hammer out an alternative approach to co_extra. Let's see if we can find a slightly better approach. Yury From gvanrossum at gmail.com Sat Sep 3 20:59:35 2016 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat, 3 Sep 2016 17:59:35 -0700 Subject: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra In-Reply-To: References: <594cf0f3-8e6f-e4ee-8ad4-cb619e1b3401@gmail.com> <5e8cc8d6-4241-afdc-07af-c52826e08877@gmail.com> Message-ID: Brett, I have not followed everything here but I have no problem with tweaks at this level as long as you are happy with it. 
--Guido (mobile) On Sep 3, 2016 5:39 PM, "Brett Cannon" wrote: > > > On Sat, 3 Sep 2016 at 17:27 Yury Selivanov > wrote: > >> >> On 2016-09-03 5:19 PM, Brett Cannon wrote: >> > >> > >> > On Sat, 3 Sep 2016 at 16:43 Yury Selivanov > > > wrote: >> > >> > >> > >> > On 2016-09-03 4:15 PM, Christian Heimes wrote: >> > > On 2016-09-04 00:03, Yury Selivanov wrote: >> > >> >> > >> On 2016-09-03 12:27 PM, Brett Cannon wrote: >> > >>> Below is the `co_extra` section of PEP 523 with the update >> > saying that >> > >>> users are expected to put a tuple in the field for easier >> > simultaneous >> > >>> use of the field. >> > >>> >> > >>> Since the `co_extra` discussions do not affect CPython itself >> I'm >> > >>> planning on landing the changes stemming from the PEP probably >> > on Monday. >> > >> Tuples are immutable. If you have multiple co_extra users then >> > they >> > >> will have to either mutate tuple (which isn't always possible, >> for >> > >> instance, you can't increase size), or to replace it with >> > another tuple. >> > >> >> > >> Creating lists is a bit more expensive, but item access speed >> > should be >> > >> in the same ballpark. >> > >> >> > >> Another question -- sorry if this was discussed before -- why >> > do we want >> > >> a PyObject* there at all? I.e. why don't we create a dedicated >> > struct >> > >> CoExtraContainer to manage the stuff in co_extra? My >> > understanding is >> > >> that the users of co_extra are C-level python optimizers and >> > profilers, >> > >> which don't need the overhead of CPython API. >> > >> > >> > As Chris pointed out in another email, the overhead is only in the >> > allocation, not the iteration/access if you use the PyTuple macros to >> > get the size and index into the tuple the overhead is negligible. >> >> Yes, my point was that it's as cheap to use a list as a tuple for >> co_extra. If we decide to store PyObject in co_extra. 
>> >> > >> >> > >> This way my work to add an extra caching layer (which I'm very >> much >> > >> willing to continue to work on) wouldn't require another set of >> > extra >> > >> fields for code objects. >> > > Quick idea before I go to bed: >> > > >> > > You could adopt a similar API to OpenSSL's >> CRYPTO_get_ex_new_index() >> > > API, >> > > >> > https://www.openssl.org/docs/manmaster/crypto/CRYPTO_get_ >> ex_new_index.html >> > > >> > > >> > > static int code_index = 0; >> > > >> > > int PyCodeObject_NewIndex() { >> > > return code_index++; >> > > } >> > > >> > > A library like Pyjion has to acquire an index first. In further >> > calls it >> > > uses the index as offset into the new co_extra field. Libraries >> > don't >> > > have to hard-code their offset and two libraries will never >> > conflict. >> > > PyCode_New() can pre-populate co_extra with a PyTuple of size >> > > code_index. This avoids most resizes if you load Pyjion early. For >> > > code_index == 0 leaf the field NULL. >> > >> > Sounds like a very good idea! >> > >> > >> > The problem with this is the pre-population. If you don't get your >> > index assigned before the very first code object is allocated then you >> > still have to manage the size of the tuple in co_extra. So what this >> > would do is avoid the iteration but not the allocation overhead. >> > >> > If we open up the can of worms in terms of custom functions for this >> > (which I was trying to avoid), then you end up with Py_ssize_t >> > _PyCode_ExtraIndex(), PyObject * >> > _PyCode_GetExtra(PyCodeObject *code, Py_ssize_t index), and int >> > _PyCode_SetExtra(PyCodeObject *code, Py_ssize_t index, PyObject *data) >> > which does all the right things for creating or resizing the tuple as >> > necessary and which I think matches mostly what Nick had proposed >> > earlier. 
But the pseudo-code for _PyCode_GetExtra() would be:: >> > >> > if co_extra is None: >> > co_extra = (None,) * _next_extra_index; >> > return None >> > elif len(co_extra) < index - 1: >> > ... pad out tuple >> > return None >> > else: >> > return co_extra[index] >> > >> > Is that going to save us enough to want to have a custom API for this? >> >> But without that new API (basically what Christian proposed) you'd need >> to iterate over the list in order to find the object that belongs to >> Pyjion. > > > Yes. > > >> If we manage to implement my opcode caching idea, we'll have at >> least two known users of co_extra. Without a way to claim a particular >> index in co_extra you will have some overhead to locate your objects. >> > > Two things. One, I would want any new API to start with an underscore so > people know we can and will change its semantics as necessary. Two, Guido > would have to re-accept the PEP as this is a shift in the use of the field > if this is how people want to go. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Sat Sep 3 21:21:21 2016 From: brett at python.org (Brett Cannon) Date: Sun, 04 Sep 2016 01:21:21 +0000 Subject: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra In-Reply-To: References: <594cf0f3-8e6f-e4ee-8ad4-cb619e1b3401@gmail.com> <5e8cc8d6-4241-afdc-07af-c52826e08877@gmail.com> Message-ID: Great, thanks! On Sat, Sep 3, 2016, 17:59 Guido van Rossum wrote: > Brett, I have not followed everything here but I have no problem with > tweaks at this level as long as you are happy with it. 
> > --Guido (mobile) > > On Sep 3, 2016 5:39 PM, "Brett Cannon" wrote: > >> >> >> On Sat, 3 Sep 2016 at 17:27 Yury Selivanov >> wrote: >> >>> >>> On 2016-09-03 5:19 PM, Brett Cannon wrote: >>> > >>> > >>> > On Sat, 3 Sep 2016 at 16:43 Yury Selivanov >> > > wrote: >>> > >>> > >>> > >>> > On 2016-09-03 4:15 PM, Christian Heimes wrote: >>> > > On 2016-09-04 00:03, Yury Selivanov wrote: >>> > >> >>> > >> On 2016-09-03 12:27 PM, Brett Cannon wrote: >>> > >>> Below is the `co_extra` section of PEP 523 with the update >>> > saying that >>> > >>> users are expected to put a tuple in the field for easier >>> > simultaneous >>> > >>> use of the field. >>> > >>> >>> > >>> Since the `co_extra` discussions do not affect CPython itself >>> I'm >>> > >>> planning on landing the changes stemming from the PEP probably >>> > on Monday. >>> > >> Tuples are immutable. If you have multiple co_extra users then >>> > they >>> > >> will have to either mutate tuple (which isn't always possible, >>> for >>> > >> instance, you can't increase size), or to replace it with >>> > another tuple. >>> > >> >>> > >> Creating lists is a bit more expensive, but item access speed >>> > should be >>> > >> in the same ballpark. >>> > >> >>> > >> Another question -- sorry if this was discussed before -- why >>> > do we want >>> > >> a PyObject* there at all? I.e. why don't we create a dedicated >>> > struct >>> > >> CoExtraContainer to manage the stuff in co_extra? My >>> > understanding is >>> > >> that the users of co_extra are C-level python optimizers and >>> > profilers, >>> > >> which don't need the overhead of CPython API. >>> > >>> > >>> > As Chris pointed out in another email, the overhead is only in the >>> > allocation, not the iteration/access if you use the PyTuple macros to >>> > get the size and index into the tuple the overhead is negligible. >>> >>> Yes, my point was that it's as cheap to use a list as a tuple for >>> co_extra. If we decide to store PyObject in co_extra. 
>>> >>> > >> >>> > >> This way my work to add an extra caching layer (which I'm very >>> much >>> > >> willing to continue to work on) wouldn't require another set of >>> > extra >>> > >> fields for code objects. >>> > > Quick idea before I go to bed: >>> > > >>> > > You could adopt a similar API to OpenSSL's >>> CRYPTO_get_ex_new_index() >>> > > API, >>> > > >>> > >>> https://www.openssl.org/docs/manmaster/crypto/CRYPTO_get_ex_new_index.html >>> > > >>> > > >>> > > static int code_index = 0; >>> > > >>> > > int PyCodeObject_NewIndex() { >>> > > return code_index++; >>> > > } >>> > > >>> > > A library like Pyjion has to acquire an index first. In further >>> > calls it >>> > > uses the index as offset into the new co_extra field. Libraries >>> > don't >>> > > have to hard-code their offset and two libraries will never >>> > conflict. >>> > > PyCode_New() can pre-populate co_extra with a PyTuple of size >>> > > code_index. This avoids most resizes if you load Pyjion early. >>> For >>> > > code_index == 0 leaf the field NULL. >>> > >>> > Sounds like a very good idea! >>> > >>> > >>> > The problem with this is the pre-population. If you don't get your >>> > index assigned before the very first code object is allocated then you >>> > still have to manage the size of the tuple in co_extra. So what this >>> > would do is avoid the iteration but not the allocation overhead. >>> > >>> > If we open up the can of worms in terms of custom functions for this >>> > (which I was trying to avoid), then you end up with Py_ssize_t >>> > _PyCode_ExtraIndex(), PyObject * >>> > _PyCode_GetExtra(PyCodeObject *code, Py_ssize_t index), and int >>> > _PyCode_SetExtra(PyCodeObject *code, Py_ssize_t index, PyObject *data) >>> > which does all the right things for creating or resizing the tuple as >>> > necessary and which I think matches mostly what Nick had proposed >>> > earlier. 
But the pseudo-code for _PyCode_GetExtra() would be:: >>> > >>> > if co_extra is None: >>> > co_extra = (None,) * _next_extra_index; >>> > return None >>> > elif len(co_extra) < index - 1: >>> > ... pad out tuple >>> > return None >>> > else: >>> > return co_extra[index] >>> > >>> > Is that going to save us enough to want to have a custom API for this? >>> >>> But without that new API (basically what Christian proposed) you'd need >>> to iterate over the list in order to find the object that belongs to >>> Pyjion. >> >> >> Yes. >> >> >>> If we manage to implement my opcode caching idea, we'll have at >>> least two known users of co_extra. Without a way to claim a particular >>> index in co_extra you will have some overhead to locate your objects. >>> >> >> Two things. One, I would want any new API to start with an underscore so >> people know we can and will change its semantics as necessary. Two, Guido >> would have to re-accept the PEP as this is a shift in the use of the field >> if this is how people want to go. >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> > Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/guido%40python.org >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Sat Sep 3 21:22:47 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 04 Sep 2016 13:22:47 +1200 Subject: [Python-Dev] PEP 525, third round, better finalization In-Reply-To: References: Message-ID: <57CB7767.9000207@canterbury.ac.nz> Nick Coghlan wrote: > For synchronous code, that's a relatively easy burden to push back > onto the programmer - assuming fair thread scheduling, a with > statement can ensure reliably ensure prompt resource cleanup. 
> > That assurance goes out the window as soon as you explicitly pause > code execution inside the body of the with statement - it doesn't > matter whether its via yield, yield from, or await, you've completely > lost that assurance of immediacy. I don't see how this is any worse than a thread containing an ordinary with-statement that waits for something that will never happen. If that's the case, then you've got a deadlock, and you have more to worry about than resources not being released. I think what all this means is that an event loop must not simply drop async tasks on the floor. If it's asked to cancel a task, it should do that by throwing an appropriate exception into it and letting it unwind itself. To go along with that, the programmer needs to understand that he can't just fire off a task and abandon it if it uses external resources and is not guaranteed to finish under its own steam. He needs to arrange a timeout or other mechanism to cancel it if it doesn't complete in a timely manner. If those things are done, an async with should be exactly as adequate for resource cleanup as an ordinary with is in a thread. It also shouldn't be necessary to have any special protocol for finalising an async generator; async with together with a way of throwing an exception into a task should be all that's needed. -- Greg From brett at python.org Sat Sep 3 21:22:51 2016 From: brett at python.org (Brett Cannon) Date: Sun, 04 Sep 2016 01:22:51 +0000 Subject: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra In-Reply-To: References: <594cf0f3-8e6f-e4ee-8ad4-cb619e1b3401@gmail.com> <5e8cc8d6-4241-afdc-07af-c52826e08877@gmail.com> Message-ID: On Sat, Sep 3, 2016, 17:45 Yury Selivanov wrote: > > > > > But without that new API (basically what Christian proposed) you'd > > need > > to iterate over the list in order to find the object that belongs to > > Pyjion. > > > > > > Yes. > > Yeah, which means the same for my opcode patch... 
Which unfortunately > will make things slower :( > > > If we manage to implement my opcode caching idea, we'll have at > > least two known users of co_extra. Without a way to claim a > > particular > > index in co_extra you will have some overhead to locate your objects. > > > > > > Two things. One, I would want any new API to start with an underscore > > so people know we can and will change its semantics as necessary. Two, > > Guido would have to re-accept the PEP as this is a shift in the use of > > the field if this is how people want to go. > > > Since this isn't a user-facing/public API feature, are we *really* > forced to accept/implement the PEP before the beta? > I say yes since people could want to use it during the beta for testing (it's Ned's call in the end, though). > I'd be happy to spend some time tomorrow/Monday to hammer out an > alternative approach to co_extra. Let's see if we can find a slightly > better approach. > OK! -brett > Yury > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Sep 4 05:51:21 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 4 Sep 2016 19:51:21 +1000 Subject: [Python-Dev] PEP 467: last round (?) In-Reply-To: <1472940660.890622.714961169.3425F103@webmail.messagingengine.com> References: <57C88355.9000302@stoneleaf.us> <1472940660.890622.714961169.3425F103@webmail.messagingengine.com> Message-ID: On 4 September 2016 at 08:11, Random832 wrote: > On Sat, Sep 3, 2016, at 18:06, Koos Zevenhoven wrote: >> I guess one reason I don't like bchr (nor chrb, really) is that they >> look just like a random sequence of letters in builtins, but not >> recognizable the way asdf would be. >> >> I guess I have one last pair of suggestions for the name of this >> function: bytes.chr or bytes.char. The PEP started out with a classmethod, and that proved problematic due to length and the expectation of API symmetry with bytearray. 
A new builtin paralleling chr avoids both of those problems. > What about byte? Like, not bytes.byte, just builtins.byte. The main problem with "byte" as a name is that "bytes" is *not* an iterable of these - it's an iterable of ints. That concern doesn't arise with chr/str as they're both abbreviated singular nouns rather than one being the plural form of the other (it also doesn't hurt that str actually is an iterable of chr results). If we wanted a meaningful noun (other than byte) for the bchr concept, then the alternative term that most readily comes to mind for me is "octet", but I don't know how intuitive or memorable that would be for folks without an embedded systems or serial communications background (especially given that we already have 'oct', which does something completely different). That said, the PEP does propose "getbyte()" and "iterbytes()" for bytes-oriented indexing and iteration, so there's a reasonable consistency argument in favour of also proposing "byte" as the builtin factory function: * data.getbyte(idx) would be a more efficient alternative to byte(data[idx]) * data.iterbytes() would be a more efficient alternative to map(byte, data) With bchr, those mappings aren't as clear (plus there's a potentially unwanted "text" connotation arising from the use of the "chr" abbreviation). Cheers, Nick. 
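[The int-vs-bytes asymmetry Nick describes is easy to demonstrate with
today's Python; note that getbyte()/iterbytes() below are only the
PEP 467 proposal, not an existing API:]

```python
data = b"abc"

# Indexing a bytes object yields an int, unlike str, where
# indexing yields a length-1 str.
assert data[0] == 97
assert "abc"[0] == "a"

# Slicing is currently the way to get a length-1 bytes object.
assert data[0:1] == b"a"

# The proposed data.getbyte(0) / data.iterbytes() would produce
# length-1 bytes objects like these directly.
assert [data[i:i + 1] for i in range(len(data))] == [b"a", b"b", b"c"]
```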
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mark at hotpy.org Sun Sep 4 06:15:30 2016 From: mark at hotpy.org (Mark Shannon) Date: Sun, 4 Sep 2016 11:15:30 +0100 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: <20160902180407.GC26300@ando.pearwood.info> References: <57C982D4.1060405@hotpy.org> <20160902180407.GC26300@ando.pearwood.info> Message-ID: <7df55264-de3e-98cb-1641-bfe820acdde4@hotpy.org> On 02/09/16 19:04, Steven D'Aprano wrote: > On Fri, Sep 02, 2016 at 08:10:24PM +0300, Koos Zevenhoven wrote: > >> A good checker should be able to infer that x is a union type at the >> point that it's passed to spam, even without the type annotation. For >> example: >> >> def eggs(cond:bool): >> if cond: >> x = 1 >> else: >> x = 1.5 >> spam(x) # a good type checker infers that x is of type Union[int, float] > > Oh I really hope not. I wouldn't call that a *good* type checker. I > would call that a type checker that is overly permissive. Why would that be overly permissive? It infers the most precise type possible. > > Maybe you think that it's okay because ints and floats are somewhat > compatible. But suppose I wrote: > > if cond: > x = HTTPServer(*args) > else: > x = 1.5 > > Would you want the checker to infer Union[HTTPServer, float]? I > wouldn't. I would want the checker to complain that the two branches of > the `if` result in different types for x. If I really mean it, then I > can give a type-hint. Yes, the checker would infer that the type of x (strictly, all uses of x that are defined by these definitions) is Union[HTTPServer, float]. You example is incomplete, what do you do with x? If you pass x to a function that takes Union[HTTPServer, float] then there is no error. If you pass it to a function that takes a number then you get an error: "Cannot use HTTPServer (from line 2) as Number (line ...)" as one would expect. When it comes to checkers, people hate false positives. 
Flagging correct code as erroneous because it is bad 'style' is really unpopular. > > In any case, this PEP isn't about specifying when to declare variable > types, it is for picking syntax. Do you have a better idea for variable > syntax? No. I think that defining the type of variables, rather than expressions is a bad idea. Cheers, Mark. From k7hoven at gmail.com Sun Sep 4 06:43:42 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sun, 4 Sep 2016 13:43:42 +0300 Subject: [Python-Dev] PEP 467: last round (?) In-Reply-To: References: <57C88355.9000302@stoneleaf.us> <1472940660.890622.714961169.3425F103@webmail.messagingengine.com> Message-ID: On Sun, Sep 4, 2016 at 12:51 PM, Nick Coghlan wrote: > On 4 September 2016 at 08:11, Random832 wrote: >> On Sat, Sep 3, 2016, at 18:06, Koos Zevenhoven wrote: >>> I guess one reason I don't like bchr (nor chrb, really) is that they >>> look just like a random sequence of letters in builtins, but not >>> recognizable the way asdf would be. >>> >>> I guess I have one last pair of suggestions for the name of this >>> function: bytes.chr or bytes.char. > > The PEP started out with a classmethod, and that proved problematic > due to length and the expectation of API symmetry with bytearray. A > new builtin paralleling chr avoids both of those problems. > >> What about byte? Like, not bytes.byte, just builtins.byte. > > The main problem with "byte" as a name is that "bytes" is *not* an > iterable of these - it's an iterable of ints. That concern doesn't > arise with chr/str as they're both abbreviated singular nouns rather > than one being the plural form of the other (it also doesn't hurt that > str actually is an iterable of chr results). > Since you agree with me about this... [...] 
> > That said, the PEP does propose "getbyte()" and "iterbytes()" for
> bytes-oriented indexing and iteration, so there's a reasonable
> consistency argument in favour of also proposing "byte" as the builtin
> factory function:
>
> * data.getbyte(idx) would be a more efficient alternative to byte(data[idx])
> * data.iterbytes() would be a more efficient alternative to map(byte, data)
>

.. I don't understand the argument for having 'byte' in these names.
They should have 'char' or 'chr' in them for exactly the same reason
that the proposed builtin should have 'chr' in it instead of 'byte'.
If 'bytes' is an iterable of ints, then get_byte should probably
return an int.

I'm sorry, but this argument comes across as "we're proposing the
wrong thing here, so for consistency, we might want to do the wrong
thing in this other part too".

And didn't someone recently propose deprecating iterability of str
(not indexing, or slicing, just iterability)? Then str would also need
a way to provide an iterable or sequence view of the characters. For
consistency, the str functionality would probably need to mimic the
approach in bytes. IOW, this PEP may in fact ultimately dictate how to
get an iterable/sequence from a str object.

-- Koos

> With bchr, those mappings aren't as clear (plus there's a potentially
> unwanted "text" connotation arising from the use of the "chr"
> abbreviation).
>

Which mappings?

> Cheers,
> Nick.
> > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/k7hoven%40gmail.com -- + Koos Zevenhoven + http://twitter.com/k7hoven + From mark at hotpy.org Sun Sep 4 06:52:24 2016 From: mark at hotpy.org (Mark Shannon) Date: Sun, 4 Sep 2016 11:52:24 +0100 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: On 02/09/16 19:19, Steven D'Aprano wrote: > On Fri, Sep 02, 2016 at 10:47:41AM -0700, Steve Dower wrote: >> "I'm not seeing what distinction you think you are making here. What >> distinction do you see between: >> >> x: int = func(value) >> >> and >> >> x = func(value) #type: int" >> >> Not sure whether I agree with Mark on this particular point, but the >> difference I see here is that the first describes what types x may >> ever contain, while the latter describes what type of being assigned >> to x right here. So one is a variable annotation while the other is an >> expression annotation. > > Ultimately Python is a dynamically typed language, and that's not > changing. This means types are fundamentally associated with *values*, > not *variables* (names). But in practice, you can go a long way by > pretending that it is the variable that carries the type. That's the > point of the static type checker: if you see that x holds an int here, > then assume (unless told differently) that x should always be an int. > Because in practice, most exceptions to that are due to bugs, or at > least sloppy code. > > Of course, it is up to the type checker to decide how strict it wants to > be, whether to treat violations as a warning or a error, whether to > offer the user a flag to set the behaviour, etc. 
None of this is > relevant to the PEP. The PEP only specifies the syntax, leaving > enforcement or non-enforcement to the checker, and it says: > > PEP 484 introduced type hints, a.k.a. type annotations. While its > main focus was function annotations, it also introduced the notion > of type comments to annotate VARIABLES [emphasis added] If I recall, Guido and I agreed to differ on that point. We still do, it seems. We did manage to agree on the syntax though. > > not expressions. And: > > This PEP aims at adding syntax to Python for annotating the types > of variables and attributes, instead of expressing them through > comments > > which to me obviously implies that the two ways (type comment, and > variable type hint) are intended to be absolutely identical in > semantics, at least as far as the type-checker is concerned. The key difference is in placement. PEP 484 style variable = value # annotation Which reads to me as if the annotation refers to the value. PEP 526 variable: annotation = value Which reads very much as if the annotation refers to the variable. That is a change in terms of semantics and a change for the worse, in terms of expressibility. Cheers, Mark. From rosuav at gmail.com Sun Sep 4 06:57:15 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 4 Sep 2016 20:57:15 +1000 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: On Sun, Sep 4, 2016 at 8:52 PM, Mark Shannon wrote: > The key difference is in placement. > PEP 484 style > variable = value # annotation > > Which reads to me as if the annotation refers to the value. > PEP 526 > variable: annotation = value > > Which reads very much as if the annotation refers to the variable. > That is a change in terms of semantics and a change for the worse, in terms > of expressibility. 
So what you have is actually a change in *implication* (since the PEP
doesn't stipulate semantics); and the old way (the comment) implies
something contrary to the semantics of at least one of the type
checkers that uses it (MyPy). Are there any current type checkers that
actually do associate that with the value? That is, to have "variable
= func() # str" indicate the same type check as "def func() -> str"?
If not, this is a strong argument in favour of the PEP, since it would
synchronize the syntax with the current best-of-breed checkers.

ChrisA

From christian at python.org Sun Sep 4 06:57:41 2016
From: christian at python.org (Christian Heimes)
Date: Sun, 4 Sep 2016 12:57:41 +0200
Subject: [Python-Dev] Patch reviews
In-Reply-To:
References:
Message-ID: <08ebd955-c784-400c-084c-5f39fb52271b@python.org>

On 2016-09-01 23:15, Victor Stinner wrote:
> 2016-08-31 22:31 GMT+02:00 Christian Heimes :
>> https://bugs.python.org/issue27744
>> Add AF_ALG (Linux Kernel crypto) to socket module
>
> This patch adds a new socket.sendmsg_afalg() method on Linux.
>
> "afalg" comes from AF_ALG which means "Address Family Algorithm". It's
> documented as "af_alg: User-space algorithm interface" in
> crypto/af_alg.c.
>
> IMHO the method should be just "sendmsg_alg()", because "afalg" is
> redundant. The AF_ prefix is only used to work around a C limitation:
> there is no namespace in the language, all symbols are in one single
> giant namespace.
>
> I don't expect that a platform will add a new sendmsg_alg() C
> function. If it's the case, we will see how to handle the name
> conflict ;-)

Hi,

afalg is pretty much the standard name for Linux Kernel crypto. For
example OpenSSL 1.1.0 introduced a crypto engine to offload AES. The
engine is called 'afalg' [1]. Other documentation refers to the
interface as either afalg or AF_ALG, too. I prefer to use an
established name for the method.
Christian [1] https://github.com/openssl/openssl/blob/master/engines/afalg/e_afalg.c#L88 From mark at hotpy.org Sun Sep 4 07:31:26 2016 From: mark at hotpy.org (Mark Shannon) Date: Sun, 4 Sep 2016 12:31:26 +0100 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <57C982D4.1060405@hotpy.org> <20160902164035.GB26300@ando.pearwood.info> Message-ID: <84005c43-1465-22f3-2106-c5a0f3c21533@hotpy.org> On 02/09/16 20:33, Guido van Rossum wrote: > On Fri, Sep 2, 2016 at 10:47 AM, Steve Dower wrote: >> "I'm not seeing what distinction you think you are making here. What >> distinction do you see between: >> >> x: int = func(value) >> >> and >> >> x = func(value) # type: int" >> >> Not sure whether I agree with Mark on this particular point, but the >> difference I see here is that the first describes what types x may ever >> contain, while the latter describes what type of being assigned to x right >> here. So one is a variable annotation while the other is an expression >> annotation. > > But that's not what type comments mean! They don't annotate the > expression. They annotate the variable. The text in PEP 484 that > introduces them is clear about this (it never mentions expressions, > only variables). In PEP 484, the section on type comments says: (Quoting verbatim) """ No first-class syntax support for explicitly marking variables as being of a specific type is added by this PEP. To help with type inference in complex cases, a comment of the following format may be used... """ Some mentions of the type of a variable are made in other places in the PEP, but those were all added *after* I had approved the PEP. In other words PEP 484 specifically states that annotations are to help with type inference. As defined in PEP 526, I think that type annotations become a hindrance to type inference. Cheers, Mark. 
> >> Personally, I prefer expression annotations over variable annotations, as >> there are many other languages I'd prefer if variable have fixed types (e.g. >> C++, where I actually enjoy doing horrible things with implicit casting ;) >> ). >> >> Variable annotations appear to be inherently restrictive, so either we need >> serious clarification as to why they are not, or they actually are and we >> ought to be more sure that it's the direction we want the language to go. > > At runtime the variable annotations are ignored. And a type checker > will only ask for them when it cannot infer the type. So I think we'll > be fine. > From levkivskyi at gmail.com Sun Sep 4 07:32:00 2016 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Sun, 4 Sep 2016 13:32:00 +0200 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: On 4 September 2016 at 12:52, Mark Shannon wrote: > The key difference is in placement. > PEP 484 style > variable = value # annotation > > Which reads to me as if the annotation refers to the value. > PEP 526 > variable: annotation = value > > Which reads very much as if the annotation refers to the variable. > That is a change in terms of semantics and a change for the worse, in > terms of expressibility. I still think it is better to leave the decision to type checkers. The proposed syntax allows two forms: variable: annotation = value and variable: annotation The first form still could be interpreted by type checkers as annotation for value (a cast to more precise type): variable = cast(annotation, value) # visually also looks similar and PEP says that annotations "are intended to help with type inference in complex cases". Such interpretation could be useful for function local variables (the implementation is also optimised for such use case). 
While the second form (without value) indeed looks like annotation of variable, and is equivalent to: __annotations__['variable'] = annotation This form could be useful for annotating instance/class variables. Or just for documentation (I really like this). In addition, expression annotations are allowed expression: annotation [= value] This form is not used by 3rd party tools (as far as I know) but one of the possible use cases could be "check-points" for values: somedict[somefunc(somevar)]: annotation # type checker could flag this if something went wrong. Finally, I would like to reiterate, both interpretations (annotating value vs annotating variable) are possible and we (OK at least me, but it looks like Guido also agree) don't want to force 3rd party tools to use only one of those. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at hotpy.org Sun Sep 4 07:30:18 2016 From: mark at hotpy.org (Mark Shannon) Date: Sun, 4 Sep 2016 12:30:18 +0100 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: On 04/09/16 11:57, Chris Angelico wrote: > On Sun, Sep 4, 2016 at 8:52 PM, Mark Shannon wrote: >> The key difference is in placement. >> PEP 484 style >> variable = value # annotation >> >> Which reads to me as if the annotation refers to the value. >> PEP 526 >> variable: annotation = value >> >> Which reads very much as if the annotation refers to the variable. >> That is a change in terms of semantics and a change for the worse, in terms >> of expressibility. > > So what you have is actually a change in *implication* (since the PEP > doesn't stipulate semantics); and the old way (the comment) implies > something contrary to the semantics of at least one of the type > checkers that uses it (MyPy). Do we really want to make a major, irrevocable change to the language just because MyPy does something? 
MyPy is very far from complete (it doesn't even support Optional types yet).

> Are there any current type checkers that
> actually do associate that with the value? That is, to have "variable
> = func() # str" indicate the same type check as "def func() -> str"?
> If not, this is a strong argument in favour of the PEP, since it would
> synchronize the syntax with the current best-of-breed checkers.

I believe pytype uses value, rather than variable, tracking. It is thus
more precise. Of course, it is more of an inferencer than a checker.
We (semmle.com) do precise value tracking to infer a lot of "type"
errors in unannotated code (as the vast majority of Python code is).

It would be a real shame if PEP 526 mandates against checkers doing as
good a job as possible. Forcing all uses of a variable to have the same
type is a major and, IMO crippling, limitation.

E.g.

def foo(x:Optional[int])->int:
    if x is None:
        return -1
    return x + 1

If the type of the *variable* 'x' is Optional[int] then 'return x + 1'
doesn't type check. If the type of the *parameter* 'x' is Optional[int]
then a checker can readily verify the above code.

I want a checker to check my code and, with minimal annotations, give me
confidence that my code is correct.

Cheers,
Mark.
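[For reference: Mark's example above is already valid Python at runtime
today, since the interpreter ignores annotations; a minimal runnable
version, with the narrowing behaviour noted in comments:]

```python
from typing import Optional

def foo(x: Optional[int]) -> int:
    # A checker that narrows types can treat x as int after this guard,
    # even though the parameter itself is annotated Optional[int].
    if x is None:
        return -1
    return x + 1

assert foo(None) == -1
assert foo(41) == 42
```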
> > ChrisA > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/mark%40hotpy.org > From steve at pearwood.info Sun Sep 4 07:51:59 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 4 Sep 2016 21:51:59 +1000 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: <84005c43-1465-22f3-2106-c5a0f3c21533@hotpy.org> References: <57C982D4.1060405@hotpy.org> <20160902164035.GB26300@ando.pearwood.info> <84005c43-1465-22f3-2106-c5a0f3c21533@hotpy.org> Message-ID: <20160904115159.GL26300@ando.pearwood.info> On Sun, Sep 04, 2016 at 12:31:26PM +0100, Mark Shannon wrote: > In other words PEP 484 specifically states that annotations are to help > with type inference. As defined in PEP 526, I think that type > annotations become a hindrance to type inference. I'm pretty sure that they don't. Have you used any languages with type inference? Any type-checkers? If so, can you give some actual concrete examples of how annotating a variable hinders type inference? It sounds like you are spreading FUD at the moment. The whole point of type annotations is that you use them to deliberately over-ride what the checker would infer (if it infers the wrong thing, or cannot infer anything). I cannot see how you conclude from this that type annotations will be a hindrance to type inference. If you don't want to declare the type of a variable, simply DON'T declare the type, and let the checker infer whatever it can (which may be nothing, or may be the wrong type). 
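[A small sketch of that division of labour -- annotate only where you
want to override what would be inferred. The function and variable
names here are illustrative, and how any given checker treats the
unannotated form is up to the tool:]

```python
from typing import List, Union

def build() -> List[Union[int, float]]:
    # Without the annotation, a checker might infer List[int] from the
    # first append and reject the second; the explicit annotation
    # overrides whatever would otherwise be inferred for the variable.
    values: List[Union[int, float]] = []
    values.append(1)
    values.append(2.5)
    return values

assert build() == [1, 2.5]
```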
-- Steve

From levkivskyi at gmail.com Sun Sep 4 07:56:46 2016
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Sun, 4 Sep 2016 13:56:46 +0200
Subject: [Python-Dev] Please reject or postpone PEP 526
In-Reply-To:
References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org>
Message-ID:

On 4 September 2016 at 13:30, Mark Shannon wrote:
> It would be a real shame if PEP 526 mandates against checkers doing as
> good a job as possible. Forcing all uses of a variable to have the same
> type is a major and, IMO crippling, limitation.
>
> E.g.
> def foo(x:Optional[int])->int:
>     if x is None:
>         return -1
>     return x + 1
>
> If the type of the *variable* 'x' is Optional[int] then 'return x + 1'
> doesn't type check. If the type of the *parameter* 'x' is Optional[int]
> then a checker can readily verify the above code.

Mark,

First, in addition to the quote from my previous e-mail, I would like
to show another quote from PEP 526:

"This PEP does not require type checkers to change their type checking
rules. It merely provides a more readable syntax to replace type comments"

Second, almost exactly your example has been added to PEP 484:

class Reason(Enum):
    timeout = 1
    error = 2

def process(response: Union[str, Reason] = '') -> str:
    if response is Reason.timeout:
        return 'TIMEOUT'
    elif response is Reason.error:
        return 'ERROR'
    else:
        # response can be only str, all other possible values exhausted
        return 'PROCESSED: ' + response

I think mypy either already supports this or will support very soon
(and the same for Optional)

-- Ivan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From k7hoven at gmail.com Sun Sep 4 08:08:28 2016
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Sun, 4 Sep 2016 15:08:28 +0300
Subject: [Python-Dev] Please reject or postpone PEP 526
In-Reply-To:
References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org>
Message-ID:

On Sun, Sep 4, 2016 at 1:52 PM, Mark Shannon wrote:
[...]
>
> The key difference is in placement.
> PEP 484 style > variable = value # annotation > > Which reads to me as if the annotation refers to the value. > PEP 526 > variable: annotation = value > > Which reads very much as if the annotation refers to the variable. > That is a change in terms of semantics and a change for the worse, in terms > of expressibility. > You have probably noticed this already, but in the semantics which I have now explained more precisely on python-ideas https://mail.python.org/pipermail/python-ideas/2016-September/042076.html an annotation like variable: annotation = value is a little closer to an expression annotation. I.e. it does not say that 'variable' should *always* have the type given by 'annotation'. -- Koos > > Cheers, > Mark. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/k7hoven%40gmail.com -- + Koos Zevenhoven + http://twitter.com/k7hoven + From steve at pearwood.info Sun Sep 4 08:43:59 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 4 Sep 2016 22:43:59 +1000 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: <20160904124359.GM26300@ando.pearwood.info> On Sun, Sep 04, 2016 at 12:30:18PM +0100, Mark Shannon wrote: > It would be a real shame if PEP 526 mandates against checkers doing as > good as job as possible. Forcing all uses of a variable to have the same > type is a major and, IMO crippling, limitation. This is approaching FUD. Guido has already stated that the section of the PEP which implied that *any* change of type of a variable would be a warning (not an error) is too strong: https://mail.python.org/pipermail/python-dev/2016-September/146064.html and indeed the relevant section of the PEP has already been changed: Duplicate type annotations will be ignored. 
However, static type checkers may issue a warning for annotations
of the same variable by a different type:

    a: int
    a: str  # Static type checker may or may not warn about this.

This PEP does not mandate any behaviour for type-checkers. It describes
the syntax for type annotations in Python code. What type-checkers do
with that information is up to them.

> E.g.
> def foo(x:Optional[int])->int:
>     if x is None:
>         return -1
>     return x + 1
>
> If the type of the *variable* 'x' is Optional[int] then 'return x + 1'
> doesn't type check.

That makes no sense. Why wouldn't it type check? It may be that some
simple-minded type-checkers are incapable of checking that code because
it is too complex. If so, that's a limitation of that specific checker,
not of type-checkers in general. MyPy already can type-check that code.
See below.

> If the type of the *parameter* 'x' is Optional[int]
> then a checker can readily verify the above code.

This makes even less sense, since the parameter "x" is, of course,
precisely the same as the variable "x".

Here's MyPy in action, successfully checking code that you state can't
be checked:

[steve at ando ~]$ cat test.py
from typing import Optional

def foo(x:Optional[int])->int:
    if x is None:
        return -1
    return x + 1

def bar(x:Optional[int])->int:
    y = x # the type of y must be inferred
    if y is None:
        return y + 1
    return len(y)

[steve at ando ~]$ mypy --strict-optional test.py
test.py: note: In function "bar":
test.py:11: error: Unsupported operand types for + (None and "int")
test.py:12: error: Argument 1 to "len" has incompatible type "int"; expected "Sized"

foo passes the type check; bar fails.

> I want a checker to check my code and, with minimal annotations, give me
> confidence that my code is correct.

Don't we all.
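[Worth noting: the two static errors mypy reports for bar() correspond
to real runtime failures. A quick demonstration, with the annotations
stripped since the interpreter doesn't enforce them anyway:]

```python
def bar(x):
    y = x
    if y is None:
        return y + 1   # mypy flagged this: None + int
    return len(y)      # mypy flagged this: len() of an int

# Both branches fail at runtime, matching the two reported errors.
for arg in (None, 42):
    try:
        bar(arg)
    except TypeError as exc:
        print("TypeError:", exc)
```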
-- Steve

From steve at pearwood.info Sun Sep 4 09:07:43 2016
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 4 Sep 2016 23:07:43 +1000
Subject: [Python-Dev] Please reject or postpone PEP 526
In-Reply-To:
References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org>
Message-ID: <20160904130743.GO26300@ando.pearwood.info>

Referring to the alternative syntax forms:

    # Proposed
    x: int = func(value)

    # Already accepted
    x = func(value)  #type: int

On Sun, Sep 04, 2016 at 11:52:24AM +0100, Mark Shannon wrote:
> The key difference is in placement.
> PEP 484 style
> variable = value # annotation
>
> Which reads to me as if the annotation refers to the value.

Both Guido and the PEP have stated that it doesn't refer to the value,
but to the variable. But what does it even mean to say that it refers
to the value in the context of *static type-checking*? I know what it
means in the context of dynamic type-checking, but I don't see how that
has any relevance to a static checker.

I have seen a number of people commenting that the comment annotation
"applies to the expression", but I don't understand what this is
supposed to mean. How is that different from applying it to the
variable? (That's not a rhetorical question.)

Suppose I write this:

    mylist = []
    x = False or None or (mylist + [1])  #type: List[int]
    pass  # stand-in for arbitrary code
    x.append("definitely not an int")

Should the type-checker flag the call to x.append as an error? I hope
we all agree that it should. But it can only do that if it knows the
type of the variable `x`. This is a *static* type-checker, it doesn't
know what value x *actually* has at run-time because it isn't running
at run-time. As far as the static checker is concerned, it can only
flag that append as an error if it knows that `x` must be a list of ints.
If you distinguish the two cases:

    "the expression `False or None or (mylist + [1])` is List[int]"

versus:

    "the variable `x` is List[int]"

I don't even see what the first case could possibly mean. But whatever
it means, if it is different from the second case, then the type-checker
is pretty limited in what it can do.

> PEP 526
> variable: annotation = value
>
> Which reads very much as if the annotation refers to the variable.

Since the PEP makes it clear that the two forms are to be treated the
same, I think that whatever difference you think they have is not
relevant. They are *defined* to mean the same thing.

-- Steve

From levkivskyi at gmail.com Sun Sep 4 09:22:57 2016
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Sun, 4 Sep 2016 15:22:57 +0200
Subject: [Python-Dev] Please reject or postpone PEP 526
In-Reply-To: <20160904130743.GO26300@ando.pearwood.info>
References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> <20160904130743.GO26300@ando.pearwood.info>
Message-ID:

On 4 September 2016 at 15:07, Steven D'Aprano wrote:
> > PEP 526
> > variable: annotation = value
> >
> > Which reads very much as if the annotation refers to the variable.
>
> Since the PEP makes it clear that the two forms are to be treated the
> same, I think that whatever difference you think they have is not
> relevant. They are *defined* to mean the same thing.
>

Steve,

This has been discussed in the python/typing tracker. When you say
"mean" you are talking about semantics, but:

Me:
"""
The title of the PEP contains "Syntax", not "Semantics" because we
don't want to impose any new type semantics (apart from addition of
ClassVar)
"""

Guido:
"""
I have nothing to add to what @ilevkivskyi said about the semantics of
redefinition -- that is to be worked out between type checkers.
(Much like PEP 484 doesn't specify how type checkers should behave --
while it gives examples of suggested behavior, those examples are not
normative, and there are huge gray areas where the PEP doesn't give any
guidance. It's the same for PEP 526.)
"""

-- Ivan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From k7hoven at gmail.com Sun Sep 4 10:34:30 2016
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Sun, 4 Sep 2016 17:34:30 +0300
Subject: [Python-Dev] Please reject or postpone PEP 526
In-Reply-To: <20160904124359.GM26300@ando.pearwood.info>
References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> <20160904124359.GM26300@ando.pearwood.info>
Message-ID:

On Sun, Sep 4, 2016 at 3:43 PM, Steven D'Aprano wrote:
[...]
> [steve at ando ~]$ cat test.py
> from typing import Optional
>
> def foo(x:Optional[int])->int:
>     if x is None:
>         return -1
>     return x + 1
>
> def bar(x:Optional[int])->int:
>     y = x # the type of y must be inferred
>     if y is None:
>         return y + 1
>     return len(y)
>
> [steve at ando ~]$ mypy --strict-optional test.py
> test.py: note: In function "bar":
> test.py:11: error: Unsupported operand types for + (None and "int")
> test.py:12: error: Argument 1 to "len" has incompatible type "int"; expected "Sized"
>
> foo passes the type check; bar fails.
>

That's great. While mypy has nice features, these examples have little
to do with PEP 526 as they don't have variable annotations, not even
using comments. For some reason, pip install --upgrade mypy fails for
me at the moment, but at least mypy version 0.4.1 does not allow this:

from typing import Callable

def foo(cond: bool, bar : Callable, baz : Callable) -> float:
    if cond:
        x = bar()  # type: int
    else:
        x = baz()  # type: float
    return x / 2

and complains that "test.py:7: error: Name 'x' already defined".
Maybe someone can confirm this with a newer version.
Here,

def foo(cond: bool) -> float:
    if cond:
        x = 1
    else:
        x = 1.5
    return x / 2

you get a different error:

test.py:5: error: Incompatible types in assignment (expression has
type "float", variable has type "int")

Maybe someone can confirm this with a newer version, but IIUC this is
still the case.

>> I want a checker to check my code and, with minimal annotations, give me
>> confidence that my code is correct
>
> Don't we all.
>

I would add *with minimal restrictions on how the code is supposed to
be written* for type checking to work. It's not at all obvious that
everyone thinks that way. Hence, the "Semantics for type checking"
thread on python-ideas.

-- Koos

>
>
> --
> Steve
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/k7hoven%40gmail.com

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +

From ncoghlan at gmail.com Sun Sep 4 11:38:06 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 5 Sep 2016 01:38:06 +1000
Subject: [Python-Dev] PEP 467: last round (?)
In-Reply-To: References: <57C88355.9000302@stoneleaf.us>
	<1472940660.890622.714961169.3425F103@webmail.messagingengine.com>
Message-ID:

On 4 September 2016 at 20:43, Koos Zevenhoven wrote:
> On Sun, Sep 4, 2016 at 12:51 PM, Nick Coghlan wrote:
>> That said, the PEP does propose "getbyte()" and "iterbytes()" for
>> bytes-oriented indexing and iteration, so there's a reasonable
>> consistency argument in favour of also proposing "byte" as the builtin
>> factory function:
>>
>> * data.getbyte(idx) would be a more efficient alternative to byte(data[idx])
>> * data.iterbytes() would be a more efficient alternative to map(byte, data)
>>
>
> .. I don't understand the argument for having 'byte' in these names.
> They should have 'char' or 'chr' in them for exactly the same reason
> that the proposed builtin should have 'chr' in it instead of 'byte'.
> If 'bytes' is an iterable of ints, then get_byte should probably
> return an int
>
> I'm sorry, but this argument comes across as "we're proposing the
> wrong thing here, so for consistency, we might want to do the wrong
> thing in this other part too".

There are two self-consistent sets of names:

    bchr
    bytes.getbchr, bytearray.getbchr
    bytes.iterbchr, bytearray.iterbchr

    byte
    bytes.getbyte, bytearray.getbyte
    bytes.iterbytes, bytearray.iterbytes

The former set emphasises the "stringiness" of this behaviour, by
aligning with the chr() builtin.

The latter set emphasises that these APIs are still about working with
arbitrary binary data rather than text, with a Python "byte"
subsequently being a length 1 bytes object containing a single integer
between 0 and 255, rather than "what you get when you index or iterate
over a bytes instance".

Having noticed the discrepancy, my personal preference is to go with
the latter option (since it better fits the "executable pseudocode"
ideal and despite my reservations about "bytes objects contain int
objects rather than byte objects", that shouldn't be any more
confusing in the long run than explaining that str instances are
containers of length-1 str instances). The fact "byte" is much easier
to pronounce than bchr (bee-cher? bee-char?) also doesn't hurt.

However, I suspect we'll need to put both sets of names in front of
Guido and ask him to just pick whichever he prefers to get it resolved
one way or the other.

> And didn't someone recently propose deprecating iterability of str
> (not indexing, or slicing, just iterability)? Then str would also need
> a way to provide an iterable or sequence view of the characters. For
> consistency, the str functionality would probably need to mimic the
> approach in bytes.
IOW, this PEP may in fact ultimately dictate how to
> get an iterable/sequence from a str object.

Strings are not going to become atomic objects, no matter how many
times people suggest it.

>> With bchr, those mappings aren't as clear (plus there's a potentially
>> unwanted "text" connotation arising from the use of the "chr"
>> abbreviation).
>
> Which mappings?

The mapping between the builtin name and the method names.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Sun Sep 4 11:48:24 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 5 Sep 2016 01:48:24 +1000
Subject: [Python-Dev] Patch reviews
In-Reply-To: <08ebd955-c784-400c-084c-5f39fb52271b@python.org>
References: <08ebd955-c784-400c-084c-5f39fb52271b@python.org>
Message-ID:

On 4 September 2016 at 20:57, Christian Heimes wrote:
> On 2016-09-01 23:15, Victor Stinner wrote:
>> 2016-08-31 22:31 GMT+02:00 Christian Heimes :
>>> https://bugs.python.org/issue27744
>>> Add AF_ALG (Linux Kernel crypto) to socket module
>>
>> This patch adds a new socket.sendmsg_afalg() method on Linux.
>>
>> "afalg" comes from AF_ALG which means "Address Family Algorithm". It's
>> documented as "af_alg: User-space algorithm interface" in
>> crypto/af_alg.c.
>>
>> IMHO the method should be just "sendmsg_alg()", because "afalg" is
>> redundant. The AF_ prefix is only used to work around a C limitation:
>> there is no namespace in the language, all symbols are in one single
>> giant namespace.
>>
>> I don't expect that a platform will add a new sendmsg_alg() C
>> function. If it's the case, we will see how to handle the name
>> conflict ;-)
>
> Hi,
>
> afalg is pretty much the standard name for Linux Kernel crypto. For
> example OpenSSL 1.1.0 introduced a crypto engine to offload AES. The
> engine is called 'afalg' [1]. Other documentation refers to the
> interface as either afalg or AF_ALG, too. I prefer to use an established
> name for the method.
Right, there's a confusability problem here not just at the API level,
but at a general terminology level: "alg" is just short for
"algorithm", which is way too generic to be meaningful. Once you put
the "af" qualifier in front though, it's clear you're not just talking
about algorithms in general, you're referring to AF_ALG in particular.

Putting the alternatives into Google and seeing which one gives more
relevant results also suggests afalg as the clear winner, since the
first link returned is the OpenSSL engine for it, while even
qualifying "alg" with "ssl" doesn't get you relevant information:

* https://www.google.com.au/search?q=afalg
* https://www.google.com.au/search?q=alg
* https://www.google.com.au/search?q=alg#q=alg+ssl

(that afalg repo unfortunately doesn't have a useful README, but at
least the first link is relevant, unlike the results for "alg" and
"alg ssl")

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Sun Sep 4 12:43:06 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 5 Sep 2016 02:43:06 +1000
Subject: [Python-Dev] Please reject or postpone PEP 526
In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org>
Message-ID:

On 4 September 2016 at 21:32, Ivan Levkivskyi wrote:
> The first form still could be interpreted by type checkers
> as annotation for value (a cast to more precise type):
>
> variable = cast(annotation, value)  # visually also looks similar

I think a function based spelling needs to be discussed further, as it
seems to me that at least some of the goals of the PEP could be met
with a suitable definition of "cast" and "declare", with no syntactic
changes to Python.
Specifically, consider: def cast(value, annotation): return value def declare(annotation): return object() The idea here is that "cast" would be used as a hint to type checkers to annotate an *expression*, with the runtime semantic impact being exactly nil - the value just gets passed through unmodified. Annotated initialisations would then look like: from typing import cast primes = cast([], List[int]) class Starship: stats = cast({}, ClassVar[Dict[str, int]]) This preserves the relative annotation order of current type hints, where the item being annotated (parameter, function declaration, assignment statement) is on the left, and the annotation is on the right. In cases where the typechecker is able to infer a type for the expression, it may complain here when there's a mismatch between the type inference and the explicit declaration, so these would also be a form of type assertion. That leaves the case of declarations, where the aim is to provide a preemptive declaration that all assignments to a particular variable will include an implicit casting of the RHS. That would look like: from typing import declare captain = declare(str) Until it left the scope, or saw a new target declaration, a typechecker would then interpret future assignments to "captain" as if they had been written: captain = cast(RHS, str) With the above definition, this would have the runtime consequence of setting "captain" to a unique object() instance until the first assignment took place. 
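Since the two helpers sketched above are ordinary Python, their runtime
behaviour can be exercised directly; the concrete values in this sketch
are purely illustrative:

```python
from typing import Any, List

def cast(value: Any, annotation: Any) -> Any:
    # Pass-through: the annotation is only a hint for a static checker
    return value

def declare(annotation: Any) -> object:
    # Placeholder initialiser: a fresh, unique object per call
    return object()

primes = cast([], List[int])  # annotated initialisation
captain = declare(str)        # pre-declaration placeholder

assert primes == []                 # cast() leaves the value untouched
assert captain is not declare(str)  # every declare() call is distinct
captain = "Picard"                  # a later assignment replaces the placeholder
assert captain == "Picard"
```

A static checker would layer meaning on top of these calls; at runtime,
cast() is a no-op and declare() merely produces a unique placeholder.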
Both that assignment, and the runtime overhead of evaluating the declaration, can be avoided by moving the declaration into otherwise dead code: if 0: captain = declare(str) Considering the goals and problems listed in the PEP, this would be sufficient to address many of them: * These are real expressions, and hence will be highlighted appropriately * declare() allows annotations of undefined variables (sort of) * These annotations will be in the AST, just as function arguments, rather than as custom nodes * In situations where normal comments and type comments are used together, it is difficult to distinguish them For the other goals, the function-based approach may not help: * For conditional branches, it's only arguably an improvement if 0: my_var = declare(Logger) if some_value: my_var = function() else: my_var = another_function() * Readability for typeshed might improve for module level and class level declarations, but functions would likely want the leading "if 0:" noise for performance reasons * Since the code generator remains uninvolved, this still wouldn't allow annotation access at runtime (unless typing implemented some sys._getframe() hacks in declare() and cast()) However, exploring this possibility still seems like a good idea to me, as it should allow many of the currently thorny semantic questions to be resolved, and a future syntax-only PEP for 3.7+ can just be about defining syntactic sugar for semantics that can (by then) already be expressed via appropriate initialisers. Regards, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From k7hoven at gmail.com Sun Sep 4 12:58:50 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sun, 4 Sep 2016 19:58:50 +0300 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: On Sun, Sep 4, 2016 at 7:43 PM, Nick Coghlan wrote: > On 4 September 2016 at 21:32, Ivan Levkivskyi wrote: >> The first form still could be interpreted by type checkers >> as annotation for value (a cast to more precise type): >> >> variable = cast(annotation, value) # visually also looks similar > > I think a function based spelling needs to be discussed further, as it > seems to me that at least some of the goals of the PEP could be met > with a suitable definition of "cast" and "declare", with no syntactic > changes to Python. Specifically, consider: > > def cast(value, annotation): > return value > typing.cast already exists. -- Koos From levkivskyi at gmail.com Sun Sep 4 13:23:30 2016 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Sun, 4 Sep 2016 19:23:30 +0200 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: On 4 September 2016 at 18:43, Nick Coghlan wrote: > On 4 September 2016 at 21:32, Ivan Levkivskyi > wrote: > > The first form still could be interpreted by type checkers > > as annotation for value (a cast to more precise type): > > > > variable = cast(annotation, value) # visually also looks similar > > I think a function based spelling needs to be discussed further, as it > seems to me that at least some of the goals of the PEP could be met > with a suitable definition of "cast" and "declare", with no syntactic > changes to Python. Specifically, consider: > > def cast(value, annotation): > return value > > def declare(annotation): > return object() > Nick, If I understand you correctly, this idea is very similar to Undefined. 
It was proposed a year and a half ago, when PEP 484 was discussed.
At that time it was abandoned; it reappeared during the discussion
of this PEP, but many people (including me) didn't like this,
so that we decided to put it in the list of rejected ideas to this PEP.

Some downsides of this approach:

* People will start to expect Undefined (or whatever is returned by
declare()) everywhere (as in JavaScript) even if we prohibit this.

* Some runtime overhead is still present: annotation gets evaluated
at every call to cast, and many annotations involve creation of
class objects (especially generics) that are very costly.
Because of this overhead, such use of generics was prohibited in PEP 484:

x = Node[int]()  # prohibited by PEP 484
x = Node()  # type: Node[int]  # this was allowed

* Readability will probably be even worse than with comments:
many types already have brackets and parens, so that two more from cast()
is not good. Plus some noise of the if 0: that you mentioned, plus
"cast" everywhere.

However, exploring this possibility still seems like a good idea to
> me, as it should allow many of the currently thorny semantic questions
> to be resolved, and a future syntax-only PEP for 3.7+ can just be
> about defining syntactic sugar for semantics that can (by then)
> already be expressed via appropriate initialisers.
>

I think that the motivation of the PEP is exactly the opposite, this is
why it has "Syntax" not "Semantics" in the title. Also quoting Guido:

> But I'm not in a hurry for that -- I'm only hoping to get the basic
> syntax accepted by Python 3.6 beta 1 so that we can start using this
> in 5 years from now rather than 7 years from now.

I also think that semantics should be up to the type checkers.
Maybe it is not a perfect comparison, but prohibiting all type semantics
except one is like prohibiting all Python web frameworks except one.

--
Ivan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From ncoghlan at gmail.com Sun Sep 4 13:29:43 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 5 Sep 2016 03:29:43 +1000 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: <20160904115159.GL26300@ando.pearwood.info> References: <57C982D4.1060405@hotpy.org> <20160902164035.GB26300@ando.pearwood.info> <84005c43-1465-22f3-2106-c5a0f3c21533@hotpy.org> <20160904115159.GL26300@ando.pearwood.info> Message-ID: On 4 September 2016 at 21:51, Steven D'Aprano wrote: > On Sun, Sep 04, 2016 at 12:31:26PM +0100, Mark Shannon wrote: > >> In other words PEP 484 specifically states that annotations are to help >> with type inference. As defined in PEP 526, I think that type >> annotations become a hindrance to type inference. > > I'm pretty sure that they don't. > > Have you used any languages with type inference? Any type-checkers? If > so, can you give some actual concrete examples of how annotating a > variable hinders type inference? It sounds like you are spreading FUD at > the moment. Steven, this kind of credential checking is uncalled for - Mark is significantly more qualified than most of us to comment on this topic, since he actually works on a shipping software quality analysis product that does type inference on Python code (hence https://semmle.com/semmle-analysis-now-includes-python/ ), and was nominated as the BDFL-Delegate for PEP 484 because Guido trusted him to critically review the proposal and keep any insurmountable problems from getting through. Getting accused of spreading FUD when a topic is just plain hard to explain (due to the large expert/novice gap that needs to be bridged) is one of the reasons python-dev and python-ideas can end up feeling hostile to domain experts. We have the SIG system to help mitigate that problem, but it's vastly preferable if such folks also feel their expertise is welcomed on the main lists, rather than having it be rejected as an inconvenient complication. 
> The whole point of type annotations is that you use them to deliberately > over-ride what the checker would infer (if it infers the wrong thing, or > cannot infer anything). I cannot see how you conclude from this that > type annotations will be a hindrance to type inference. The problem arises for the "bare annotation" case, as that looks a *lot* like traditional declarations in languages where initialisation (which can specify a type) and assignment (which can't) are different operations. Consider this case: if arg is not None: x = list(arg) # Type of "x" is inferred as List[Any] or something more specific here if other_arg is not None: # This is fine, we know "x" is a list at this point x.extend(other_arg) else: x = None # Type of "x" is inferred as type(None) here # Type of "x" is inferred as Optional[List[Any]] from here on out Now, consider that case with PEP 526: x: Optional[List[Any]] # Oops, this is the type of "x" *after* the if statement, not *during* it if arg is not None: x = list(arg) if other_arg is not None: # If we believe the type declaration here, this code will (incorrectly) be flagged # (as None has no "extend" method) x.extend(other_arg) else: x = None The "pre-declaration as documentation" proposal turns out to be problematic in this case, as it misses the fact that different branches of the if statement contribute different types to what ultimately becomes a Union type for the rest of the code. In order to cover the entire code block, we have to make the pre-declaration match the state after the if statement, but then it's overly broad for any *particular* branch. 
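Wrapped in a function with placeholder parameters (the wrapper is not
part of the original example), the contested code is unambiguous at
runtime, whatever a checker makes of the pre-declaration:

```python
from typing import Any, List, Optional

def combine(arg: Optional[List[Any]],
            other_arg: Optional[List[Any]]) -> Optional[List[Any]]:
    x: Optional[List[Any]]  # the pre-declaration under discussion
    if arg is not None:
        x = list(arg)  # within this branch, x is a list at runtime...
        if other_arg is not None:
            x.extend(other_arg)  # ...so calling .extend() here is safe
    else:
        x = None
    return x

assert combine([1], [2]) == [1, 2]
assert combine([1], None) == [1]
assert combine(None, [2]) is None
```

The runtime behaviour is exactly what the comments in the original
snippet describe; the disagreement is only over what a checker should
infer for each branch.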
So in this case, attempting to entirely defer specification of the semantics creates a significant risk of type checkers written on the assumption of C++ or Java style type declarations actively inhibiting the dynamism of Python code, suggesting that the PEP would be well advised to declare not only that the PEP 484 semantics are unchanged, but also that a typechecker that flags the example above as unsafe is wrong to do so. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From levkivskyi at gmail.com Sun Sep 4 13:59:35 2016 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Sun, 4 Sep 2016 19:59:35 +0200 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <57C982D4.1060405@hotpy.org> <20160902164035.GB26300@ando.pearwood.info> <84005c43-1465-22f3-2106-c5a0f3c21533@hotpy.org> <20160904115159.GL26300@ando.pearwood.info> Message-ID: On 4 September 2016 at 19:29, Nick Coghlan wrote: > So in this case, attempting to entirely defer specification of the > semantics creates a significant risk of type checkers written on the > assumption of C++ or Java style type declarations actively inhibiting > the dynamism of Python code, suggesting that the PEP would be well > advised to declare not only that the PEP 484 semantics are unchanged, > but also that a typechecker that flags the example above as unsafe is > wrong to do so. > I don't think that a dedicated syntax will increase the risk more than the existing type comment syntax. Moreover, mainstream type checkers (mypy, pytype, etc) are far from C++ or Java, and as far as I know they are not going to change semantics. As I understand, the main point of Mark is that such syntax suggests visually a variable annotation, more than a value annotation. However, I think that the current behavior of type checkers will have more influence on perception of people rather than a visual appearance of annotation. 
Anyway, I think it is worth adding an explicit statement to the PEP that both interpretations are possible (maybe even add that value semantics is inherent to Python). But I don't think that we should *prohibit* something in the PEP. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Sep 4 13:59:34 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 5 Sep 2016 03:59:34 +1000 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: On 5 September 2016 at 03:23, Ivan Levkivskyi wrote: > On 4 September 2016 at 18:43, Nick Coghlan wrote: >> >> On 4 September 2016 at 21:32, Ivan Levkivskyi >> wrote: >> > The first form still could be interpreted by type checkers >> > as annotation for value (a cast to more precise type): >> > >> > variable = cast(annotation, value) # visually also looks similar >> >> I think a function based spelling needs to be discussed further, as it >> seems to me that at least some of the goals of the PEP could be met >> with a suitable definition of "cast" and "declare", with no syntactic >> changes to Python. Specifically, consider: >> >> def cast(value, annotation): >> return value >> >> def declare(annotation): >> return object() > > > Nick, If I understand you correctly, this idea is very similar to Undefined. > It was proposed a year and half ago, when PEP 484 was discussed. Not quite, as it deliberately doesn't create a singleton, precisely to avoid the problems a new singleton creates - if you use declare() as written, there's no way to a priori check for it at runtime (since each call produces a new object), so you have to either get the TypeError when you try to use it as whatever type it's supposed to be, or else use a static checker to find cases where you try to use it without initialising it properly first. 
Folks can also put it behind an "if 0:" or "if typing.TYPE_CHECKING" guard so it doesn't actually execute at runtime, and is only visible to static analysis. > At that time it was abandoned, it reappeared during the discussion > of this PEP, but many people (including me) didn't like this, > so that we decided to put it in the list of rejected ideas to this PEP. > > Some downsides of this approach: > > * People will start to expect Undefined (or whatever is returned by > declare()) > everywhere (as in Javascript) even if we prohibit this. Hence why I didn't use a singleton. > * Some runtime overhead is still present: annotation gets evaluated > at every call to cast, and many annotations involve creation of > class objects (especially generics) that are very costly. Hence the commentary about using an explicit guard to prevent execution ("if 0:" in my post for the dead code elimination, although "if typing.TYPE_CHECKING:" would be more self-explanatory). > * Readability will be probably even worse than with comments: > many types already have brackets and parens, so that two more form cast() > is not good. Plus some noise of the if 0: that you mentioned, plus > "cast" everywhere. I mostly agree, but the PEP still needs to address the fact that it's only a subset of the benefits that actually require new syntax, since it's that subset which provides the rationale for rejecting the use of a function based approach, while the rest provided the incentive to start looking for a way to replace the type comments. >> However, exploring this possibility still seems like a good idea to >> me, as it should allow many of the currently thorny semantic questions >> to be resolved, and a future syntax-only PEP for 3.7+ can just be >> about defining syntactic sugar for semantics that can (by then) >> already be expressed via appropriate initialisers. > > I think that motivation of the PEP is exactly opposite, this is why it has > "Syntax" not "Semantics" in title. 
Also quoting Guido:
>
>> But I'm not in a hurry for that -- I'm only hoping to get the basic
>> syntax accepted by Python 3.6 beta 1 so that we can start using this
>> in 5 years from now rather than 7 years from now.
>
> I also think that semantics should be up to the type checkers.
> Maybe it is not a perfect comparison, but prohibiting all type semantics
> except one is like prohibiting all Python web frameworks except one.

It's the semantics that worry people though, and it's easy for folks
actively working on typecheckers to think it's just as easy for the
rest of us to make plausible assumptions about the kind of code that
well-behaved typecheckers are going to allow as it is for you. That's
not the case, which means folks get concerned, especially those
accustomed to institutional environments where decisions about tool
use are still made by folks a long way removed from the day to day
experience of software development, rather than being delegated to the
engineering teams themselves.

I suspect you'll have an easier time of it on that front if you
include some examples of dynamically typed code that a well-behaved
type-checker *must* report as correct Python code, such as:

    x: Optional[List[Any]]
    # This is the type of "x" *after* the if statement, not *during* it
    if arg is not None:
        x = list(arg)
        if other_arg is not None:
            # A well-behaved typechecker should allow this due to
            # the more specific initialisation in this particular branch
            x.extend(other_arg)
    else:
        x = None

A typechecker could abide by that restriction by ignoring variable
declarations entirely and operating solely on its own type inference
from expressions, so any existing PEP 484 typechecker is likely to be
well-behaved in that regard.
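Incidentally, the bare declaration in the example above is inert at
runtime: under the reference implementation (Python 3.6+), the
annotation is recorded but no name is bound. The effect is easiest to
observe at class scope; this small sketch reuses the PEP's illustrative
names:

```python
from typing import Optional

class Starship:
    captain: str = "Picard"  # annotation plus initialiser: attribute exists
    stats: Optional[dict]    # bare annotation: recorded, nothing is bound

assert Starship.captain == "Picard"
assert not hasattr(Starship, "stats")  # no value was ever assigned
assert set(Starship.__annotations__) == {"captain", "stats"}
```

So a checker is free to treat the bare and initialised forms however it
likes; the interpreter itself only stores the annotation.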
Similarly, it would be reasonable to say that these three snippets
should all be equivalent from a typechecking perspective:

    x = None  # type: Optional[T]

    x: Optional[T] = None

    x: Optional[T]
    x = None

This more explicitly spells out what it means for PEP 526 to say that
it's purely about syntax and doesn't define any new semantics beyond
those already defined in PEP 484.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From levkivskyi at gmail.com Sun Sep 4 14:13:04 2016
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Sun, 4 Sep 2016 20:13:04 +0200
Subject: [Python-Dev] Please reject or postpone PEP 526
In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org>
Message-ID:

On 4 September 2016 at 19:59, Nick Coghlan wrote:

Nick, thank you for the good suggestions.

> I mostly agree, but the PEP still needs to address the fact that it's
> only a subset of the benefits that actually require new syntax, since
> it's that subset which provides the rationale for rejecting the use of
> a function based approach, while the rest provided the incentive to
> start looking for a way to replace the type comments.
>

I think I agree.
> I suspect you'll have an easier time of it on that front if you > include some examples of dynamically typed code that a well-behaved > type-checker *must* report as correct Python code, such as: > > x: Optional[List[Any]] > # This is the type of "x" *after* the if statement, not *during* it > if arg is not None: > x = list(arg) > if other_arg is not None: > # A well-behaved typechecker should allow this due to > # the more specific initialisation in this particular branch > x.extend(other_arg) > else: > x = None There are very similar examples in PEP 484 (section on singletons in unions), we could just copy those or use this example, but I am sure Guido will not agree to word "must" (although "should" maybe possible :-) > Similarly, it would be reasonable to say that these three snippets > should all be equivalent from a typechecking perspective: > > x = None # type: Optional[T] > > x: Optional[T] = None > > x: Optional[T] > x = None > Nice idea, explicit is better than implicit. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Sun Sep 4 14:31:59 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 5 Sep 2016 04:31:59 +1000 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: On 5 September 2016 at 04:13, Ivan Levkivskyi wrote: > On 4 September 2016 at 19:59, Nick Coghlan wrote: >> I suspect you'll have an easier time of it on that front if you >> include some examples of dynamically typed code that a well-behaved >> type-checker *must* report as correct Python code, such as: >> >> x: Optional[List[Any]] >> # This is the type of "x" *after* the if statement, not *during* it >> if arg is not None: >> x = list(arg) >> if other_arg is not None: >> # A well-behaved typechecker should allow this due to >> # the more specific initialisation in this particular branch >> x.extend(other_arg) >> else: >> x = None > > There are very similar examples in PEP 484 (section on singletons in > unions), > we could just copy those or use this example, > but I am sure Guido will not agree to word "must" (although "should" maybe > possible :-) "Should" would be fine by me :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From k7hoven at gmail.com Sun Sep 4 14:40:55 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sun, 4 Sep 2016 21:40:55 +0300 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: On Sun, Sep 4, 2016 at 9:13 PM, Ivan Levkivskyi wrote: > On 4 September 2016 at 19:59, Nick Coghlan wrote: [...] >> >> Similarly, it would be reasonable to say that these three snippets >> should all be equivalent from a typechecking perspective: >> >> x = None # type: Optional[T] >> >> x: Optional[T] = None >> >> x: Optional[T] >> x = None > > > Nice idea, explicit is better than implicit. 
> How is it going to help that these are equivalent within one checker, if the meaning may differ across checkers? -- Koos > -- > Ivan > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/k7hoven%40gmail.com > -- + Koos Zevenhoven + http://twitter.com/k7hoven + From guido at python.org Sun Sep 4 16:16:20 2016 From: guido at python.org (Guido van Rossum) Date: Sun, 4 Sep 2016 13:16:20 -0700 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: Everybody please stop panicking. PEP 526 does not make a stand on the behavior of type checkers (other than deferring to PEP 484). If you want to start a discussion about constraining type checkers please do it over at python-ideas. There is no rush as type checkers are not affected by the feature freeze. -- --Guido van Rossum (python.org/~guido) From k7hoven at gmail.com Sun Sep 4 16:42:26 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Sun, 4 Sep 2016 23:42:26 +0300 Subject: [Python-Dev] PEP 467: last round (?) In-Reply-To: References: <57C88355.9000302@stoneleaf.us> <1472940660.890622.714961169.3425F103@webmail.messagingengine.com> Message-ID: On Sun, Sep 4, 2016 at 6:38 PM, Nick Coghlan wrote: > > There are two self-consistent sets of names: > Let me add a few. 
I wonder if this is really used so much that bytes.chr is too long to type (and you can do bchr = bytes.chr if you want to): bytes.chr (or bchr in builtins) bytes.chr_at, bytearray.chr_at bytes.iterchr, bytearray.iterchr bytes.chr (or bchr in builtins) bytes.chrview, bytearray.chrview (sequence views) bytes.char (or bytes.chr or bchr in builtins) bytes.chars, bytearray.chars (sequence views) > bchr > bytes.getbchr, bytearray.getbchr > bytes.iterbchr, bytearray.iterbchr > > byte > bytes.getbyte, bytearray.getbyte > bytes.iterbytes, bytearray.iterbytes > > The former set emphasises the "stringiness" of this behaviour, by > aligning with the chr() builtin > > The latter set emphasises that these APIs are still about working with > arbitrary binary data rather than text, with a Python "byte" > subsequently being a length 1 bytes object containing a single integer > between 0 and 255, rather than "What you get when you index or iterate > over a bytes instance". > > Having noticed the discrepancy, my personal preference is to go with > the latter option (since it better fits the "executable pseudocode" > ideal and despite my reservations about "bytes objects contain int > objects rather than byte objects", that shouldn't be any more > confusing in the long run than explaining that str instances are > containers of length-1 str instances). The fact "byte" is much easier > to pronounce than bchr (bee-cher? bee-char?) also doesn't hurt. > > However, I suspect we'll need to put both sets of names in front of > Guido and ask him to just pick whichever he prefers to get it resolved > one way or the other. > >> And didn't someone recently propose deprecating iterability of str >> (not indexing, or slicing, just iterability)? Then str would also need >> a way to provide an iterable or sequence view of the characters. For >> consistency, the str functionality would probably need to mimic the >> approach in bytes. 
IOW, this PEP may in fact ultimately dictate how to >> get a iterable/sequence from a str object. > > Strings are not going to become atomic objects, no matter how many > times people suggest it. > You consider all non-iterable objects atomic? If str.__iter__ raises an exception, it does not turn str somehow atomic. I wouldn't be surprised by breaking changes of this nature to python at some point. The breakage will be quite significant, but easy to fix. -- Koos From greg.ewing at canterbury.ac.nz Sun Sep 4 17:59:12 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 05 Sep 2016 09:59:12 +1200 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: <20160904115159.GL26300@ando.pearwood.info> References: <57C982D4.1060405@hotpy.org> <20160902164035.GB26300@ando.pearwood.info> <84005c43-1465-22f3-2106-c5a0f3c21533@hotpy.org> <20160904115159.GL26300@ando.pearwood.info> Message-ID: <57CC9930.7070809@canterbury.ac.nz> > On Sun, Sep 04, 2016 at 12:31:26PM +0100, Mark Shannon wrote: > >> As defined in PEP 526, I think that type >>annotations become a hindrance to type inference. In Haskell-like languages, type annotations have no ability to influence whether types can be inferred. The compiler infers a type for everything, whether you annotate or not. The annotations serve as assertions about what the inferred types should be. If they don't match, it means the programmer has made a mistake somewhere. I don't think it's possible for an annotation to prevent the compiler from being able to infer a type where it could have inferred one without the annotation. -- Greg From random832 at fastmail.com Sun Sep 4 20:30:12 2016 From: random832 at fastmail.com (Random832) Date: Sun, 04 Sep 2016 20:30:12 -0400 Subject: [Python-Dev] PEP 467: last round (?) 
In-Reply-To: References: <57C88355.9000302@stoneleaf.us> <1472940660.890622.714961169.3425F103@webmail.messagingengine.com> Message-ID: <1473035412.2762262.715619409.62F5BAA2@webmail.messagingengine.com> On Sun, Sep 4, 2016, at 16:42, Koos Zevenhoven wrote: > On Sun, Sep 4, 2016 at 6:38 PM, Nick Coghlan wrote: > > > > There are two self-consistent sets of names: > > > > Let me add a few. I wonder if this is really used so much that > bytes.chr is too long to type (and you can do bchr = bytes.chr if you > want to): > > bytes.chr (or bchr in builtins) > bytes.chr_at, bytearray.chr_at Ugh, that "at" is too reminiscent of java. And it just feels wrong to spell it "chr" rather than "char" when there's a vowel elsewhere in the name. Hmm... how offensive to the zen of python would it be to have "magic" to allow both bytes.chr(65) and b'ABCDE'.chr[0]? (and possibly also iter(b'ABCDE'.chr)? That is, a descriptor which is callable on the class, but returns a view on instances? From ncoghlan at gmail.com Sun Sep 4 22:21:42 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 5 Sep 2016 12:21:42 +1000 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: On 5 September 2016 at 04:40, Koos Zevenhoven wrote: > On Sun, Sep 4, 2016 at 9:13 PM, Ivan Levkivskyi wrote: >> On 4 September 2016 at 19:59, Nick Coghlan wrote: > [...] >>> >>> Similarly, it would be reasonable to say that these three snippets >>> should all be equivalent from a typechecking perspective: >>> >>> x = None # type: Optional[T] >>> >>> x: Optional[T] = None >>> >>> x: Optional[T] >>> x = None >> >> >> Nice idea, explicit is better than implicit. > > How is it going to help that these are equivalent within one checker, > if the meaning may differ across checkers? 
For typechecker developers, it provides absolute clarity that the
semantics of the new annotations should match the behaviour of
existing type comments when there's an initialiser present, or of a
parameter annotation when there's no initialiser present.

For folks following along without necessarily keeping up with all the
nuances, it makes it more explicit what Guido means when he says "PEP
526 does not make a stand on the behavior of type checkers (other than
deferring to PEP 484)."

For example, the formulation of the divergent initialisation case
where I think the preferred semantics are already implied by PEP 484
can be looked at this way:

    x = None # type: Optional[List[T]]
    if arg is not None:
        x = list(arg)
        if other_arg is not None:
            x.extend(arg)

It would be a strange typechecker indeed that handled that case
differently from the new spellings made possible by PEP 526:

    x: Optional[List[T]] = None
    if arg is not None:
        x = list(arg)
        if other_arg is not None:
            x.extend(arg)

    x: Optional[List[T]]
    if arg is None:
        x = None
    else:
        x = list(arg)
        if other_arg is not None:
            x.extend(arg)

    x: Optional[List[T]]
    if arg is not None:
        x = list(arg)
        if other_arg is not None:
            x.extend(arg)
    else:
        x = None

Or from the semantics of PEP 484 parameter annotations:

    def my_func(arg:Optional[List[T]], other_arg=None):
        # other_arg is implicitly Optional[Any]
        if arg is not None and other_arg is not None:
            # Here, "arg" can be assumed to be List[T]
            # while "other_arg" is Any
            arg.extend(other_arg)

A self-consistent typechecker will either allow all of the above, or
prohibit all of the above, while a typechecker that *isn't*
self-consistent would be incredibly hard to use.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com  Sun Sep  4 23:06:58 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 5 Sep 2016 13:06:58 +1000
Subject: [Python-Dev] PEP 467: last round (?)
In-Reply-To: References: <57C88355.9000302@stoneleaf.us> <1472940660.890622.714961169.3425F103@webmail.messagingengine.com> Message-ID: On 5 September 2016 at 06:42, Koos Zevenhoven wrote: > On Sun, Sep 4, 2016 at 6:38 PM, Nick Coghlan wrote: >> >> There are two self-consistent sets of names: >> > > Let me add a few. I wonder if this is really used so much that > bytes.chr is too long to type (and you can do bchr = bytes.chr if you > want to) > > bytes.chr (or bchr in builtins) The main problem with class method based spellings is that we need to either duplicate it on bytearray or else break the bytearray/bytes symmetry and propose "bytearray(bytes.chr(x))" as the replacement for current cryptic "bytearray([x])" Consider: bytearray([x]) bytearray(bchr(x)) bytearray(byte(x)) bytearray(bytes.chr(x)) Folks that care about maintainability are generally willing to trade a few extra characters at development time for ease of reading later, but there are limits to how large a trade-off they can be asked to make if we expect the alternative to actually be used (since overly verbose code can be a readability problem in its own right). > bytes.chr_at, bytearray.chr_at > bytes.iterchr, bytearray.iterchr These don't work for me because I'd expect iterchr to take encoding and errors arguments and produce length 1 strings. You also run into a searchability problem as "chr" will get hits for both the chr builtin and bytes.chr, similar to the afalg problem that recently came up in another thread. While namespaces are a honking great idea, the fact that search is non-hierarchical means they still don't give API designers complete freedom to reuse names at will. 
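To ground the comparison, here is the current spelling these proposals
compete with, plus a "bchr"-style helper written out as a plain
function (hypothetical — no such builtin exists today, this is just a
sketch of the proposed behaviour):

```python
# Today, building a single-byte bytes/bytearray from an int uses a
# one-element list, which is the "cryptic" spelling under discussion.
x = 65
assert bytes([x]) == b'A'
assert bytearray([x]) == bytearray(b'A')

def bchr(i):
    """Sketch of the proposed helper: return a length-1 bytes object
    for an integer in range(256). Name and behaviour are assumptions
    drawn from this thread, not an existing API."""
    if not 0 <= i <= 255:
        raise ValueError("bytes must be in range(0, 256)")
    return bytes([i])

assert bchr(x) == b'A'
assert bytearray(bchr(x)) == bytearray(b'A')
```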
> bytes.chr (or bchr in builtins)
> bytes.chrview, bytearray.chrview (sequence views)
>
> bytes.char (or bytes.chr or bchr in builtins)
> bytes.chars, bytearray.chars (sequence views)

The views are already available via memoryview.cast if folks really
want them, but encouraging their use in general isn't a great idea, as
it means more function developers now need to ask themselves "What if
someone passes me a char view rather than a normal bytes object?".

>> Strings are not going to become atomic objects, no matter how many
>> times people suggest it.
>
> You consider all non-iterable objects atomic? If str.__iter__ raises
> an exception, it does not turn str somehow atomic.

"atomic" is an overloaded word in software design, but it's still the
right one for pointing out that sometimes people want strings to be
atomic, and sometimes they don't - it depends on what they're doing.

In particular, you can look up the many, many, many discussions of
providing a generic flatten operation for iterables, and how it always
founders on the question of types like str and bytes, which can both
be usefully viewed as an atomic unit of information, *and* as
containers of smaller units of information (NumPy arrays are another
excellent example of this problem).

> I wouldn't be
> surprised by breaking changes of this nature to python at some point.

I would, and you should be too:
http://www.curiousefficiency.org/posts/2014/08/python-4000.html

> The breakage will be quite significant, but easy to fix.

Please keep in mind that we're already 10 years into a breaking change
to Python's text handling model, with another decade or so still to go
before the legacy Python 2 text model is spoken of largely in terms
similar to the way COBOL is spoken of today.
There is no such thing as a "significant, but easy to fix" change when it comes to adjusting how a programming language handles text data, as text handling is a fundamental part of defining how a language is used to communicate with people. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve.dower at python.org Mon Sep 5 01:54:32 2016 From: steve.dower at python.org (Steve Dower) Date: Sun, 4 Sep 2016 22:54:32 -0700 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: References: Message-ID: I posted a minor update to PEP 528 at https://github.com/python/peps/blob/master/pep-0528.txt and a diff below. While there are likely to be technical and compatibility issues to resolve after the changes are applied, I don't believe they impact the decision to accept the change at the PEP-level (everyone who has raised potential issues has also been supportive of the change). Without real experience during the beta period, it's really hard to determine whether fixes should be made on our side or their side, so I think it's worth going ahead with the change, even if specific implementation details change between now and release. Cheers, Steve --- @@ -21,8 +21,7 @@ This PEP proposes changing the default standard stream implementation on Windows to use the Unicode APIs. This will allow users to print and input the full range of Unicode characters at the default Windows console. This also requires a -subtle change to how the tokenizer parses text from readline hooks, that should -have no backwards compatibility issues. +subtle change to how the tokenizer parses text from readline hooks. Specific Changes ================ @@ -46,7 +45,7 @@ The use of an ASCII compatible encoding is required to maintain compatibility with code that bypasses the ``TextIOWrapper`` and directly writes ASCII bytes to -the standard streams (for example, [process_stdinreader.py]_). 
Code that assumes +the standard streams (for example, `Twisted's process_stdinreader.py`_). Code that assumes a particular encoding for the standard streams other than ASCII will likely break. @@ -78,8 +77,9 @@ Alternative Approaches ====================== -The ``win_unicode_console`` package [win_unicode_console]_ is a pure-Python -alternative to changing the default behaviour of the console. +The `win_unicode_console package`_ is a pure-Python alternative to changing the +default behaviour of the console. It implements essentially the same +modifications as described here using pure Python code. Code that may break =================== @@ -94,21 +94,21 @@ Code that assumes that the encoding required by ``sys.stdin.buffer`` or ``sys.stdout.buffer`` is ``'mbcs'`` or a more specific encoding may currently be -working by chance, but could encounter issues under this change. For example:: +working by chance, but could encounter issues under this change. For example: - sys.stdout.buffer.write(text.encode('mbcs')) - r = sys.stdin.buffer.read(16).decode('cp437') + >>> sys.stdout.buffer.write(text.encode('mbcs')) + >>> r = sys.stdin.buffer.read(16).decode('cp437') To correct this code, the encoding specified on the ``TextIOWrapper`` should be -used, either implicitly or explicitly:: +used, either implicitly or explicitly: - # Fix 1: Use wrapper correctly - sys.stdout.write(text) - r = sys.stdin.read(16) + >>> # Fix 1: Use wrapper correctly + >>> sys.stdout.write(text) + >>> r = sys.stdin.read(16) - # Fix 2: Use encoding explicitly - sys.stdout.buffer.write(text.encode(sys.stdout.encoding)) - r = sys.stdin.buffer.read(16).decode(sys.stdin.encoding) + >>> # Fix 2: Use encoding explicitly + >>> sys.stdout.buffer.write(text.encode(sys.stdout.encoding)) + >>> r = sys.stdin.buffer.read(16).decode(sys.stdin.encoding) Incorrectly using the raw object -------------------------------- @@ -117,32 +117,57 @@ writes may be affected. 
This is particularly important for reads, where the number of characters read will never exceed one-fourth of the number of bytes allowed, as there is no feasible way to prevent input from encoding as much -longer utf-8 strings:: +longer utf-8 strings. - >>> stdin = open(sys.stdin.fileno(), 'rb') - >>> data = stdin.raw.read(15) + >>> raw_stdin = sys.stdin.buffer.raw + >>> data = raw_stdin.read(15) abcdefghijklm b'abc' # data contains at most 3 characters, and never more than 12 bytes # error, as "defghijklm\r\n" is passed to the interactive prompt To correct this code, the buffered reader/writer should be used, or the caller -should continue reading until its buffer is full.:: +should continue reading until its buffer is full. - # Fix 1: Use the buffered reader/writer - >>> stdin = open(sys.stdin.fileno(), 'rb') + >>> # Fix 1: Use the buffered reader/writer + >>> stdin = sys.stdin.buffer >>> data = stdin.read(15) abcedfghijklm b'abcdefghijklm\r\n' - # Fix 2: Loop until enough bytes have been read - >>> stdin = open(sys.stdin.fileno(), 'rb') + >>> # Fix 2: Loop until enough bytes have been read + >>> raw_stdin = sys.stdin.buffer.raw >>> b = b'' >>> while len(b) < 15: - ... b += stdin.raw.read(15) + ... b += raw_stdin.read(15) abcedfghijklm b'abcdefghijklm\r\n' +Using the raw object with small buffers +--------------------------------------- + +Code that uses the raw IO object and attempts to read less than four characters +will now receive an error. Because it's possible that any single character may +require up to four bytes when represented in utf-8, requests must fail. + + >>> raw_stdin = sys.stdin.buffer.raw + >>> data = raw_stdin.read(3) + Traceback (most recent call last): + File "", line 1, in + ValueError: must read at least 4 bytes + +The only workaround is to pass a larger buffer. 
+ + >>> # Fix: Request at least four bytes + >>> raw_stdin = sys.stdin.buffer.raw + >>> data = raw_stdin.read(4) + a + b'a' + >>> >>> + +(The extra ``>>>`` is due to the newline remaining in the input buffer and is +expected in this situation.) + Copyright ========= @@ -151,7 +176,5 @@ References ========== -.. [process_stdinreader.py] Twisted's process_stdinreader.py - (https://github.com/twisted/twisted/blob/trunk/src/twisted/test/process_stdinreader.py) -.. [win_unicode_console] win_unicode_console package - (https://pypi.org/project/win_unicode_console/) +.. _Twisted's process_stdinreader.py: https://github.com/twisted/twisted/blob/trunk/src/twisted/test/process_stdinreader.py +.. _win_unicode_console package: https://pypi.org/project/win_unicode_console/ From steve.dower at python.org Mon Sep 5 01:59:04 2016 From: steve.dower at python.org (Steve Dower) Date: Sun, 4 Sep 2016 22:59:04 -0700 Subject: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8 In-Reply-To: References: Message-ID: I posted an update to PEP 529 at https://github.com/python/peps/blob/master/pep-0529.txt and a diff below. The update includes more detail on the affected code within CPython - including a number of references to broken code that would be resolved with the change - and more details about the necessary changes. As with PEP 528, I don't think it's possible to predict the impact better than I already have, and the beta period will be essential to determine whether this change is completely unworkable. I am fully prepared to back out the change if necessary prior to RC. Cheers, Steve --- @@ -16,7 +16,8 @@ operating system, often via C Runtime functions. However, these have been long discouraged in favor of the UTF-16 APIs. Within the operating system, all text is represented as UTF-16, and the ANSI APIs perform encoding and decoding using -the active code page. +the active code page. See `Naming Files, Paths, and Namespaces`_ for +more details. 
This PEP proposes changing the default filesystem encoding on Windows to utf-8, and changing all filesystem functions to use the Unicode APIs for filesystem @@ -27,10 +28,10 @@ characters outside of the user's active code page. Notably, this does not impact the encoding of the contents of files. These will -continue to default to locale.getpreferredencoding (for text files) or plain -bytes (for binary files). This only affects the encoding used when users pass a -bytes object to Python where it is then passed to the operating system as a path -name. +continue to default to ``locale.getpreferredencoding()`` (for text files) or +plain bytes (for binary files). This only affects the encoding used when users +pass a bytes object to Python where it is then passed to the operating system as +a path name. Background ========== @@ -44,9 +45,10 @@ When paths are passed between the filesystem and the application, they are either passed through as a bytes blob or converted to/from str using -``os.fsencode()`` or ``sys.getfilesystemencoding()``. The result of encoding a -string with ``sys.getfilesystemencoding()`` is a blob of bytes in the native -format for the default file system. +``os.fsencode()`` and ``os.fsdecode()`` or explicit encoding using +``sys.getfilesystemencoding()``. The result of encoding a string with +``sys.getfilesystemencoding()`` is a blob of bytes in the native format for the +default file system. On Windows, the native format for the filesystem is utf-16-le. The recommended platform APIs for accessing the filesystem all accept and return text encoded in @@ -83,11 +85,11 @@ canonical representation. Even if the encoding is "incorrect" by some standard, the file system will still map the bytes back to the file. 
Making use of this avoids the cost of decoding and reencoding, such that (theoretically, and only -on POSIX), code such as this may be faster because of the use of `b'.'` compared -to using `'.'`:: +on POSIX), code such as this may be faster because of the use of ``b'.'`` +compared to using ``'.'``:: >>> for f in os.listdir(b'.'): - ... os.stat(f) + ... os.stat(f) ... As a result, POSIX-focused library authors prefer to use bytes to represent @@ -105,32 +107,31 @@ Currently the default filesystem encoding is 'mbcs', which is a meta-encoder that uses the active code page. However, when bytes are passed to the filesystem they go through the \*A APIs and the operating system handles encoding. In this -case, paths are always encoded using the equivalent of 'mbcs:replace' - we have -no ability to change this (though there is a user/machine configuration option -to change the encoding from CP_ACP to CP_OEM, so it won't necessarily always -match mbcs...) +case, paths are always encoded using the equivalent of 'mbcs:replace' with no +opportunity for Python to override or change this. This proposal would remove all use of the \*A APIs and only ever call the \*W -APIs. When Windows returns paths to Python as str, they will be decoded from +APIs. When Windows returns paths to Python as ``str``, they will be decoded from utf-16-le and returned as text (in whatever the minimal representation is). When -Windows returns paths to Python as bytes, they will be decoded from utf-16-le to -utf-8 using surrogatepass (Windows does not validate surrogate pairs, so it is -possible to have invalid surrogates in filenames). Equally, when paths are -provided as bytes, they are decoded from utf-8 into utf-16-le and passed to the -\*W APIs. +Python code requests paths as ``bytes``, the paths will be transcoded from +utf-16-le into utf-8 using surrogatepass (Windows does not validate surrogate +pairs, so it is possible to have invalid surrogates in filenames). 
Equally, when
+paths are provided as ``bytes``, they are transcoded from utf-8 into utf-16-le
+and passed to the \*W APIs.

-The use of utf-8 will not be configurable, with the possible exception of a
-"legacy mode" environment variable or X-flag.
+The use of utf-8 will not be configurable, except for the provision of a
+"legacy mode" flag to revert to the previous behaviour.

-surrogateescape does not apply here, as the concern is not about retaining
-non-sensical bytes. Any path returned from the operating system will be valid
-Unicode, while bytes paths created by the user may raise a decoding error
-(currently these would raise ``OSError`` or a subclass).
+The ``surrogateescape`` error mode does not apply here, as the concern is not
+about retaining non-sensical bytes. Any path returned from the operating system
+will be valid Unicode, while invalid paths created by the user should raise a
+decoding error (currently these would raise ``OSError`` or a subclass).

The choice of utf-8 bytes (as opposed to utf-16-le bytes) is to ensure the
-ability to round-trip without breaking the functionality of the ``os.path``
-module, which assumes an ASCII-compatible encoding. Using utf-16-le as the
-encoding is more pure, but will cause more issues than are resolved.
+ability to round-trip path names and allow basic manipulation (for example,
+using the ``os.path`` module) when assuming an ASCII-compatible encoding. Using
+utf-16-le as the encoding is more pure, but will cause more issues than are
+resolved.

This change would also undeprecate the use of bytes paths on Windows. No change
to the semantics of using bytes as a path is required - as before, they must be
@@ -145,16 +146,38 @@
Remove the default value for ``Py_FileSystemDefaultEncoding`` and set it in
``initfsencoding()`` to utf-8, or if the legacy-mode switch is enabled to mbcs.
-Update the implementations of ``PyUnicode_DecodeFSDefaultAndSize`` and -``PyUnicode_EncodeFSDefault`` to use the standard utf-8 codec with surrogatepass -error mode, or if the legacy-mode switch is enabled the code page codec with -replace error mode. +Update the implementations of ``PyUnicode_DecodeFSDefaultAndSize()`` and +``PyUnicode_EncodeFSDefault()`` to use the utf-8 codec, or if the legacy-mode +switch is enabled the existing mbcs codec. + +Add sys.getfilesystemencodeerrors +--------------------------------- + +As the error mode may now change between ``surrogatepass`` and ``replace``, +Python code that manually performs encoding also needs access to the current +error mode. This includes the implementation of ``os.fsencode()`` and +``os.fsdecode()``, which currently assume an error mode based on the codec. + +Add a public ``Py_FileSystemDefaultEncodeErrors``, similar to the existing +``Py_FileSystemDefaultEncoding``. The default value on Windows will be +``surrogatepass`` or in legacy mode, ``replace``. The default value on all other +platforms will be ``surrogateescape``. + +Add a public ``sys.getfilesystemencodeerrors()`` function that returns the +current error mode. + +Update the implementations of ``PyUnicode_DecodeFSDefaultAndSize()`` and +``PyUnicode_EncodeFSDefault()`` to use the variable for error mode rather than +constant strings. + +Update the implementations of ``os.fsencode()`` and ``os.fsdecode()`` to use +``sys.getfilesystemencodeerrors()`` instead of assuming the mode. Update path_converter --------------------- Update the path converter to always decode bytes or buffer objects into text -using ``PyUnicode_DecodeFSDefaultAndSize``. +using ``PyUnicode_DecodeFSDefaultAndSize()``. Change the ``narrow`` field from a ``char*`` string into a flag that indicates whether the original object was bytes. 
This is required for functions that need @@ -172,11 +195,13 @@ --------------- Add a legacy mode flag, enabled by the environment variable -``PYTHONLEGACYWINDOWSFSENCODING``. When this flag is set, the default filesystem -encoding is set to mbcs rather than utf-8, and the error mode is set to -'replace' rather than 'strict'. The ``path_converter`` will continue to decode -to wide characters and only \*W APIs will be called, however, the bytes passed in -and received from Python will be encoded the same as prior to this change. +``PYTHONLEGACYWINDOWSFSENCODING``. + +When this flag is set, the default filesystem encoding is set to mbcs rather +than utf-8, and the error mode is set to ``replace`` rather than +``surrogatepass``. Paths will continue to decode to wide characters and only \*W +APIs will be called, however, the bytes passed in and received from Python will +be encoded the same as prior to this change. Undeprecate bytes paths on Windows ---------------------------------- @@ -186,6 +211,52 @@ whatever is returned from ``sys.getfilesystemencoding()`` rather than the user's active code page. +Affected Modules +---------------- + +This PEP implicitly includes all modules within the Python that either pass path +names to the operating system, or otherwise use ``sys.getfilesystemencoding()``. 
+ +As of 3.6.0a4, the following modules require modification: + +* ``os`` +* ``_overlapped`` +* ``_socket`` +* ``subprocess`` +* ``zipimport`` + +The following modules use ``sys.getfilesystemencoding()`` but do not need +modification: + +* ``gc`` (already assumes bytes are utf-8) +* ``grp`` (not compiled for Windows) +* ``http.server`` (correctly includes codec name with transmitted data) +* ``idlelib.editor`` (should not be needed; has fallback handling) +* ``nis`` (not compiled for Windows) +* ``pwd`` (not compiled for Windows) +* ``spwd`` (not compiled for Windows) +* ``_ssl`` (only used for ASCII constants) +* ``tarfile`` (code unused on Windows) +* ``_tkinter`` (already assumes bytes are utf-8) +* ``wsgiref`` (assumed as the default encoding for unknown environments) +* ``zipapp`` (code unused on Windows) + +The following native code uses one of the encoding or decoding functions, but do +not require any modification: + +* ``Parser/parsetok.c`` (docs already specify ``sys.getfilesystemencoding()``) +* ``Python/ast.c`` (docs already specify ``sys.getfilesystemencoding()``) +* ``Python/compile.c`` (undocumented, but Python filesystem encoding implied) +* ``Python/errors.c`` (docs already specify ``os.fsdecode()``) +* ``Python/fileutils.c`` (code unused on Windows) +* ``Python/future.c`` (undocumented, but Python filesystem encoding implied) +* ``Python/import.c`` (docs already specify utf-8) +* ``Python/importdl.c`` (code unused on Windows) +* ``Python/pythonrun.c`` (docs already specify ``sys.getfilesystemencoding()``) +* ``Python/symtable.c`` (undocumented, but Python filesystem encoding implied) +* ``Python/thread.c`` (code unused on Windows) +* ``Python/traceback.c`` (encodes correctly for comparing strings) +* ``Python/_warnings.c`` (docs already specify ``os.fsdecode()``) Rejected Alternatives ===================== @@ -249,44 +320,50 @@ Code that does not manage encodings when crossing protocol boundaries may currently be working by chance, but could 
encounter issues when either encoding -changes. For example:: +changes. For example: - filename = open('filename_in_mbcs.txt', 'rb').read() - text = open(filename, 'r').read() + >>> filename = open('filename_in_mbcs.txt', 'rb').read() + >>> text = open(filename, 'r').read() To correct this code, the encoding of the bytes in ``filename`` should be -specified, either when reading from the file or before using the value:: +specified, either when reading from the file or before using the value: - # Fix 1: Open file as text - filename = open('filename_in_mbcs.txt', 'r', encoding='mbcs').read() - text = open(filename, 'r').read() + >>> # Fix 1: Open file as text + >>> filename = open('filename_in_mbcs.txt', 'r', encoding='mbcs').read() + >>> text = open(filename, 'r').read() - # Fix 2: Decode path - filename = open('filename_in_mbcs.txt', 'rb').read() - text = open(filename.decode('mbcs'), 'r').read() + >>> # Fix 2: Decode path + >>> filename = open('filename_in_mbcs.txt', 'rb').read() + >>> text = open(filename.decode('mbcs'), 'r').read() Explicitly using 'mbcs' ----------------------- Code that explicitly encodes text using 'mbcs' before passing to file system -APIs. For example:: +APIs is now passing incorrectly encoded bytes. 
For example: - filename = open('files.txt', 'r').readline() - text = open(filename.encode('mbcs'), 'r') + >>> filename = open('files.txt', 'r').readline() + >>> text = open(filename.encode('mbcs'), 'r') To correct this code, the string should be passed without explicit encoding, or -should use ``os.fsencode()``:: +should use ``os.fsencode()``: - # Fix 1: Do not encode the string - filename = open('files.txt', 'r').readline() - text = open(filename, 'r') + >>> # Fix 1: Do not encode the string + >>> filename = open('files.txt', 'r').readline() + >>> text = open(filename, 'r') - # Fix 2: Use correct encoding - filename = open('files.txt', 'r').readline() - text = open(os.fsencode(filename), 'r') + >>> # Fix 2: Use correct encoding + >>> filename = open('files.txt', 'r').readline() + >>> text = open(os.fsencode(filename), 'r') +References +========== + +.. _Naming Files, Paths, and Namespaces: + https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247.aspx + Copyright ========= From ncoghlan at gmail.com Mon Sep 5 02:58:14 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 5 Sep 2016 16:58:14 +1000 Subject: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8 In-Reply-To: References: Message-ID: On 5 September 2016 at 15:59, Steve Dower wrote: > +continue to default to ``locale.getpreferredencoding()`` (for text files) or > +plain bytes (for binary files). This only affects the encoding used when users > +pass a bytes object to Python where it is then passed to the operating system as > +a path name. For the three non-filesystem cases: I checked the situation for os.environb, and that's already unavailable on Windows (since os.supports_bytes_environ is False there), while sys.argv is apparently already handled correctly (i.e. always using the *W APIs). 
That means my only open question would be the handling of subprocess
module calls (both with and without shell=True), since that currently
works with binary arguments on *nix:

>>> subprocess.call([b"python", b"-c", "print('??????')".encode("utf-8")])
??????
0
>>> subprocess.call(b"python -c '%s'" % 'print("??????")'.encode("utf-8"), shell=True)
??????
0

While calling system native apps that way will still have many
portability challenges, there are also plenty of cases where folks use
sys.executable to launch new Python processes in a separate instance
of the currently running interpreter, and it would be good if these
changes brought cross-platform consistency to the handling of binary
arguments here as well.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From k7hoven at gmail.com  Mon Sep  5 04:19:38 2016
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Mon, 5 Sep 2016 11:19:38 +0300
Subject: [Python-Dev] Please reject or postpone PEP 526
In-Reply-To:
References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org>
Message-ID:

On Mon, Sep 5, 2016 at 5:21 AM, Nick Coghlan wrote:
> On 5 September 2016 at 04:40, Koos Zevenhoven wrote:
>> On Sun, Sep 4, 2016 at 9:13 PM, Ivan Levkivskyi wrote:
>>> On 4 September 2016 at 19:59, Nick Coghlan wrote:
>> [...]
>>>>
>>>> Similarly, it would be reasonable to say that these three snippets
>>>> should all be equivalent from a typechecking perspective:
>>>>
>>>> x = None # type: Optional[T]
>>>>
>>>> x: Optional[T] = None
>>>>
>>>> x: Optional[T]
>>>> x = None
>>>
>>>
>>> Nice idea, explicit is better than implicit.
>>
>> How is it going to help that these are equivalent within one checker,
>> if the meaning may differ across checkers?
>
> For typechecker developers, it provides absolute clarity that the
> semantics of the new annotations should match the behaviour of
> existing type comments when there's an initialiser present,

I understood that, but what's the benefit?
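For concreteness, the three spellings whose equivalence is at issue
can be written out runnably (Python 3.6+ for the annotation syntax;
the names here are just placeholders for illustration):

```python
from typing import Optional

x = None  # type: Optional[int]   # PEP 484 type comment

y: Optional[int] = None           # PEP 526: annotation with initialiser

z: Optional[int]                  # PEP 526: bare annotation...
z = None                          # ...followed by a normal assignment

# All three leave the variable bound to None at runtime; the type
# comment differs in being invisible to the interpreter (at module
# level it adds nothing to __annotations__, unlike y and z).
assert x is None and y is None and z is None
```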
I hope there will be a type checker that breaks this "rule". > or of a > parameter annotation when there's no initialiser present. No, your suggested addition does not provide any reference to this. (...luckily, because that would have been worse.) > For folks following along without necessarily keeping up with all the > nuances, it makes it more explicit what Guido means when he says "PEP > 526 does not make a stand on the > behavior of type checkers (other than deferring to PEP 484)." What you are proposing is exactly "making a stand on the behavior of type checkers", and the examples you provide below are all variations of the same situation and provide no justification for a general rule. Here's a general rule: The closer it gets to the end of drafting a PEP [1], the more carefully you have to justify changes. Justification is left as an exercise ;-). --Koos [1] or any document (or anything, I suppose) > For example, the formulation of the divergent initialisation case > where I think the preferred semantics are already implied by PEP 484 > can be looked at this way: > > x = None # type: Optional[List[T]] > if arg is not None: > x = list(arg) > if other_arg is not None: > x.extend(arg) > > It would be a strange typechecker indeed that handled that case > differently from the new spellings made possible by PEP 526: > > x: Optional[List[T]] = None > if arg is not None: > x = list(arg) > if other_arg is not None: > x.extend(arg) > > x: Optional[List[T]] > if arg is None: > x = None > else: > x = list(arg) > if other_arg is not None: > x.extend(arg) > > x: Optional[List[T]] > if arg is not None: > x = list(arg) > if other_arg is not None: > x.extend(arg) > else: > x = None > > Or from the semantics of PEP 484 parameter annotations: > > def my_func(arg:Optional[List[T]], other_arg=None): > # other_arg is implicitly Optional[Any] > if arg is not None and other_arg is not None: > # Here, "arg" can be assumed to be List[T] > # while "other_arg" is Any > 
arg.extend(other_arg) > > A self-consistent typechecker will either allow all of the above, or > prohibit all of the above, while a typechecker that *isn't* > self-consistent would be incredibly hard to use. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -- + Koos Zevenhoven + http://twitter.com/k7hoven + From p.f.moore at gmail.com Mon Sep 5 05:10:01 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 5 Sep 2016 10:10:01 +0100 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: References: Message-ID: On 5 September 2016 at 06:54, Steve Dower wrote: > +Using the raw object with small buffers > +--------------------------------------- > + > +Code that uses the raw IO object and attempts to read less than four characters > +will now receive an error. Because it's possible that any single character may > +require up to four bytes when represented in utf-8, requests must fail. I'm very concerned about this statement. It's clearly not true that the request *must* fail, as reading 1 byte from a UTF-8 enabled Linux console stream currently works (at least I believe it does). And there is code in the wild that works by doing a test that "there's input available" (using kbhit on Windows and select on Unix) and then doing read(1) to ensure a non-blocking read (the pyinvoke code I referenced earlier). If we're going to break this behaviour, I'd argue that we need to provide a working alternative. At a minimum, can the PEP include a recommended cross-platform means of implementing a non-blocking read from standard input, to replace the current approach? (If the recommendation is to read a larger 4-byte buffer and manage the process of retaining unused bytes yourself, then that's quite a major change to at least the code I'm thinking of in invoke, and I'm not sure read(4) guarantees that it *won't* block if only 1 byte is available without blocking...) 
Paul From vadmium+py at gmail.com Mon Sep 5 05:37:36 2016 From: vadmium+py at gmail.com (Martin Panter) Date: Mon, 5 Sep 2016 09:37:36 +0000 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: References: Message-ID: On 5 September 2016 at 09:10, Paul Moore wrote: > On 5 September 2016 at 06:54, Steve Dower wrote: >> +Using the raw object with small buffers >> +--------------------------------------- >> + >> +Code that uses the raw IO object and attempts to read less than four characters >> +will now receive an error. Because it's possible that any single character may >> +require up to four bytes when represented in utf-8, requests must fail. > > I'm very concerned about this statement. It's clearly not true that > the request *must* fail, as reading 1 byte from a UTF-8 enabled Linux > console stream currently works (at least I believe it does). And there > is code in the wild that works by doing a test that "there's input > available" (using kbhit on Windows and select on Unix) and then doing > read(1) to ensure a non-blocking read (the pyinvoke code I referenced > earlier). If we're going to break this behaviour, I'd argue that we > need to provide a working alternative. > > At a minimum, can the PEP include a recommended cross-platform means > of implementing a non-blocking read from standard input, to replace > the current approach? (If the recommendation is to read a larger > 4-byte buffer and manage the process of retaining unused bytes > yourself, then that's quite a major change to at least the code I'm > thinking of in invoke, and I'm not sure read(4) guarantees that it > *won't* block if only 1 byte is available without blocking...) FWIW, on Linux and Unix in general, if select() or similar indicates that some read data is available, calling raw read() with any buffer size should return at least one byte, whatever is available, without blocking. 
If the user has only typed one byte, read(4) would return that one byte immediately. But if you are using a BufferedReader (stdin.buffer rather than stdin.buffer.raw), then this guarantee is off and read(4) will block until it gets 4 bytes, or until EOF. From ncoghlan at gmail.com Mon Sep 5 06:04:07 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 5 Sep 2016 20:04:07 +1000 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: On 5 September 2016 at 18:19, Koos Zevenhoven wrote: > On Mon, Sep 5, 2016 at 5:21 AM, Nick Coghlan wrote: >> On 5 September 2016 at 04:40, Koos Zevenhoven wrote: >>> On Sun, Sep 4, 2016 at 9:13 PM, Ivan Levkivskyi wrote: >>>> On 4 September 2016 at 19:59, Nick Coghlan wrote: >>> [...] >>>>> >>>>> Similarly, it would be reasonable to say that these three snippets >>>>> should all be equivalent from a typechecking perspective: >>>>> >>>>> x = None # type: Optional[T] >>>>> >>>>> x: Optional[T] = None >>>>> >>>>> x: Optional[T] >>>>> x = None >>>> >>>> >>>> Nice idea, explicit is better than implicit. >>> >>> How is it going to help that these are equivalent within one checker, >>> if the meaning may differ across checkers? >> >> For typechecker developers, it provides absolute clarity that the >> semantics of the new annotations should match the behaviour of >> existing type comments when there's an initialiser present, > > I understood that, but what's the benefit? I hope there will be a type > checker that breaks this "rule". Such a typechecker means you're not writing Python anymore, you're writing Java/C++/C# in a language that isn't designed to be used that way. Fortunately, none of the current typecheckers have made that mistake, nor does anyone appear to be promoting this mindset outside this particular discussion. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Mon Sep 5 06:20:24 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 5 Sep 2016 11:20:24 +0100 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: References: Message-ID: On 5 September 2016 at 10:37, Martin Panter wrote: > On 5 September 2016 at 09:10, Paul Moore wrote: >> On 5 September 2016 at 06:54, Steve Dower wrote: >>> +Using the raw object with small buffers >>> +--------------------------------------- >>> + >>> +Code that uses the raw IO object and attempts to read less than four characters >>> +will now receive an error. Because it's possible that any single character may >>> +require up to four bytes when represented in utf-8, requests must fail. >> >> I'm very concerned about this statement. It's clearly not true that >> the request *must* fail, as reading 1 byte from a UTF-8 enabled Linux >> console stream currently works (at least I believe it does). And there >> is code in the wild that works by doing a test that "there's input >> available" (using kbhit on Windows and select on Unix) and then doing >> read(1) to ensure a non-blocking read (the pyinvoke code I referenced >> earlier). If we're going to break this behaviour, I'd argue that we >> need to provide a working alternative. >> >> At a minimum, can the PEP include a recommended cross-platform means >> of implementing a non-blocking read from standard input, to replace >> the current approach? (If the recommendation is to read a larger >> 4-byte buffer and manage the process of retaining unused bytes >> yourself, then that's quite a major change to at least the code I'm >> thinking of in invoke, and I'm not sure read(4) guarantees that it >> *won't* block if only 1 byte is available without blocking...) 
>
> FWIW, on Linux and Unix in general, if select() or similar indicates
> that some read data is available, calling raw read() with any buffer
> size should return at least one byte, whatever is available, without
> blocking. If the user has only typed one byte, read(4) would return
> that one byte immediately.
>
> But if you are using a BufferedReader (stdin.buffer rather than
> stdin.buffer.raw), then this guarantee is off and read(4) will block
> until it gets 4 bytes, or until EOF.

OK. So a correct non-blocking approach would be:

def ready_for_reading():
    if sys.platform == "win32":
        return msvcrt.kbhit()
    else:
        reads, _, _ = select.select([sys.stdin], [], [], 0.0)
        return bool(reads and reads[0] is sys.stdin)

if ready_for_reading():
    return sys.stdin.buffer.raw.read(4)

And using a buffer any less than 4 bytes long risks an error on input
(specifically, if a character that encodes to multiple UTF-8 bytes is
returned). OK. That's viable, I guess, although the *actual* code in
question is written to be valid on Python back to 2.7, and to work for
general file-like objects, so it'll still be some work to get the
incantation correct.

It would be nice to explain this explicitly in the docs, though, as
read(1) is pretty common, and doesn't typically expect to get an error
because of this.

Thanks,
Paul

From k7hoven at gmail.com Mon Sep 5 07:46:08 2016
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Mon, 5 Sep 2016 14:46:08 +0300
Subject: [Python-Dev] Please reject or postpone PEP 526
In-Reply-To: 
References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org>
Message-ID: 

On Mon, Sep 5, 2016 at 1:04 PM, Nick Coghlan wrote:
> On 5 September 2016 at 18:19, Koos Zevenhoven wrote:
>> On Mon, Sep 5, 2016 at 5:21 AM, Nick Coghlan wrote:
>>> On 5 September 2016 at 04:40, Koos Zevenhoven wrote:
>>>> On Sun, Sep 4, 2016 at 9:13 PM, Ivan Levkivskyi wrote:
>>>>> On 4 September 2016 at 19:59, Nick Coghlan wrote:
>>>> [...]
>>>>>>
>>>>>> Similarly, it would be reasonable to say that these three snippets
>>>>>> should all be equivalent from a typechecking perspective:
>>>>>>
>>>>>> x = None # type: Optional[T]
>>>>>>
>>>>>> x: Optional[T] = None
>>>>>>
>>>>>> x: Optional[T]
>>>>>> x = None
>>>>>
>>>>>
>>>>> Nice idea, explicit is better than implicit.
>>>>
>>>> How is it going to help that these are equivalent within one checker,
>>>> if the meaning may differ across checkers?
>>>
>>> For typechecker developers, it provides absolute clarity that the
>>> semantics of the new annotations should match the behaviour of
>>> existing type comments when there's an initialiser present,
>>
>> I understood that, but what's the benefit? I hope there will be a type
>> checker that breaks this "rule".
>
> Such a typechecker means you're not writing Python anymore, you're
> writing Java/C++/C# in a language that isn't designed to be used that
> way.

I'm glad those are all the languages you accuse me of. The list could
have been a lot worse. I actually have some good memories of Java. It
felt kind of cool at that age, and it taught me many things about
understanding the structure of large and complicated programs after I
had been programming for years in other languages, including C++. It
also taught me to value simplicity instead, so here we are.

> Fortunately, none of the current typecheckers have made that mistake,
> nor does anyone appear to be promoting this mindset outside this
> particular discussion.

The thing I'm promoting here is to not add anything to PEP 526 that
says what a type checker is supposed to do with type annotations.
Quite the opposite of Java/C++/C#, I would say.

We can, of course, speculate about the future of type checkers and the
implications of PEP 526 on it. That's what I'm trying to do on
python-ideas, speculate about the best kind of type checking
(achievable with PEP 526 annotations) [1].
--Koos [1] https://mail.python.org/pipermail/python-ideas/2016-September/042076.html > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -- + Koos Zevenhoven + http://twitter.com/k7hoven + From steve at pearwood.info Mon Sep 5 08:10:39 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 5 Sep 2016 22:10:39 +1000 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: <20160905121036.GR26300@ando.pearwood.info> On Mon, Sep 05, 2016 at 11:19:38AM +0300, Koos Zevenhoven wrote: > On Mon, Sep 5, 2016 at 5:21 AM, Nick Coghlan wrote: [...] > > On 5 September 2016 at 04:40, Koos Zevenhoven wrote: > >> On Sun, Sep 4, 2016 at 9:13 PM, Ivan Levkivskyi wrote: > >>> On 4 September 2016 at 19:59, Nick Coghlan wrote: > >> [...] [Ivan Levkivskyi] > >>>> Similarly, it would be reasonable to say that these three snippets > >>>> should all be equivalent from a typechecking perspective: > >>>> > >>>> x = None # type: Optional[T] > >>>> > >>>> x: Optional[T] = None > >>>> > >>>> x: Optional[T] > >>>> x = None [...] [Koos Zevenhoven] > >> How is it going to help that these are equivalent within one checker, > >> if the meaning may differ across checkers? Before I can give an answer to your [Koos'] question, I have to understand what you see as the problem here. I *think* that you are worried that two different checkers will disagree on what counts as a type error. That given the same chunk of code: x: Optional[T] = None if x: spam(x) else: x.eggs() two checkers will disagree as to whether or not the code is safe. Is that your concern? If not, can you explain in more detail what your concern is? 
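For concreteness, the kind of narrowing every checker already has to agree on
in code like that can be sketched as a runnable toy (the names here are made
up for illustration):

```python
from typing import Optional

def spam_length(x: Optional[str] = None) -> int:
    # Inside this branch a checker narrows x from Optional[str] to str,
    # so calling str operations on it is safe.
    if x is not None:
        return len(x)
    return 0

print(spam_length("eggs"), spam_length())
```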
[Nick Coghlan]
> > For typechecker developers, it provides absolute clarity that the
> > semantics of the new annotations should match the behaviour of
> > existing type comments when there's an initialiser present,

[Koos]
> I understood that, but what's the benefit?

Are you asking what is the benefit of having three forms of syntax for
the same thing?

The type comment syntax is required for Python 2 and
backwards-compatibility. That's a given.

The variable annotation syntax is required because the type comment
syntax is (according to the PEP) very much a second-best solution. See
the PEP:

https://www.python.org/dev/peps/pep-0526/#id4

So this is a proposal to create a *better* syntax for something which
already exists. The old version, using comments, cannot be deprecated
or removed, as it is required for Python 3.5 and older.

Once we allow

x: T = value

then there is benefit in also allowing:

x: T
x = value

since this supports some of the use cases that aren't well supported by
type comments or one-line variable annotations. E.g. very long or deeply
indented lines, situations where the assignment to x is inside an
if...else branch, or any other time you wish to declare the type of the
variable before actually setting the variable.

[Koos]
> I hope there will be a type checker that breaks this "rule".

I don't understand. Do you mean that you want three different
behaviours for these type annotations? What would they do differently?

To me, all three are clear and obvious ways of declaring the type of a
variable. Whether I write `x: T = expr` or `x = expr #type:T`, it
should be clear that I intend `x` to be treated as T. What would you do
differently?

[Nick]
> > or of a
> > parameter annotation when there's no initialiser present.

[Koos]
> No, your suggested addition does not provide any reference to this.
> (...luckily, because that would have been worse.)

I'm sorry, I don't follow you.
Are you suggesting that we should have the syntax `name:T = value` mean something different inside and outside of a function parameter list? def func(x:T = v): y:T = v The first line declares x as type T with default value v; the second line declares y as type T with initial value v. You say this is "worse"... worse than what? What behaviour would you prefer to see? [Nick] > > For folks following along without necessarily keeping up with all the > > nuances, it makes it more explicit what Guido means when he says "PEP > > 526 does not make a stand on the > > behavior of type checkers (other than deferring to PEP 484)." [Koos] > What you are proposing is exactly "making a stand on the behavior of > type checkers", and the examples you provide below are all variations > of the same situation and provide no justification for a general rule. I'm sorry, I don't understand this objection. The closest I can get to an answer would be: A general rule is better than a large number of unconnected, arbitrary, special cases. Does that help? -- Steve From steve.dower at python.org Mon Sep 5 09:36:23 2016 From: steve.dower at python.org (Steve Dower) Date: Mon, 5 Sep 2016 06:36:23 -0700 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: References: Message-ID: The best fix is to use a buffered reader, which will read all the available bytes and then let you .read(1), even if it happens to be an incomplete character. We could theoretically add buffering to the raw reader to handle one character, which would allow very small reads from raw, but that severely complicates things and the advice to use a buffered reader is good advice anyway. 
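The buffered-reader behaviour described here can be sketched as follows, with
``io.BytesIO`` standing in for the raw console object (illustrative only, not
from the PEP):

```python
import io

# A 4-byte UTF-8 sequence; reading it one byte at a time from a
# BufferedReader works even though each read returns an incomplete
# character. io.BytesIO stands in for the raw console stream here.
raw = io.BytesIO("\U0001F40D".encode("utf-8"))
buffered = io.BufferedReader(raw)
parts = [buffered.read(1) for _ in range(4)]  # four single-byte reads
print(b"".join(parts).decode("utf-8") == "\U0001F40D")
```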
Top-posted from my Windows Phone

-----Original Message-----
From: "Paul Moore"
Sent: 9/5/2016 3:23
To: "Martin Panter"
Cc: "Python Dev"
Subject: Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

On 5 September 2016 at 10:37, Martin Panter wrote:
> On 5 September 2016 at 09:10, Paul Moore wrote:
>> On 5 September 2016 at 06:54, Steve Dower wrote:
>>> +Using the raw object with small buffers
>>> +---------------------------------------
>>> +
>>> +Code that uses the raw IO object and attempts to read less than four characters
>>> +will now receive an error. Because it's possible that any single character may
>>> +require up to four bytes when represented in utf-8, requests must fail.
>>
>> I'm very concerned about this statement. It's clearly not true that
>> the request *must* fail, as reading 1 byte from a UTF-8 enabled Linux
>> console stream currently works (at least I believe it does). And there
>> is code in the wild that works by doing a test that "there's input
>> available" (using kbhit on Windows and select on Unix) and then doing
>> read(1) to ensure a non-blocking read (the pyinvoke code I referenced
>> earlier). If we're going to break this behaviour, I'd argue that we
>> need to provide a working alternative.
>>
>> At a minimum, can the PEP include a recommended cross-platform means
>> of implementing a non-blocking read from standard input, to replace
>> the current approach? (If the recommendation is to read a larger
>> 4-byte buffer and manage the process of retaining unused bytes
>> yourself, then that's quite a major change to at least the code I'm
>> thinking of in invoke, and I'm not sure read(4) guarantees that it
>> *won't* block if only 1 byte is available without blocking...)
>
> FWIW, on Linux and Unix in general, if select() or similar indicates
> that some read data is available, calling raw read() with any buffer
> size should return at least one byte, whatever is available, without
> blocking.
If the user has only typed one byte, read(4) would return
> that one byte immediately.
>
> But if you are using a BufferedReader (stdin.buffer rather than
> stdin.buffer.raw), then this guarantee is off and read(4) will block
> until it gets 4 bytes, or until EOF.

OK. So a correct non-blocking approach would be:

def ready_for_reading():
    if sys.platform == "win32":
        return msvcrt.kbhit()
    else:
        reads, _, _ = select.select([sys.stdin], [], [], 0.0)
        return bool(reads and reads[0] is sys.stdin)

if ready_for_reading():
    return sys.stdin.buffer.raw.read(4)

And using a buffer any less than 4 bytes long risks an error on input
(specifically, if a character that encodes to multiple UTF-8 bytes is
returned). OK. That's viable, I guess, although the *actual* code in
question is written to be valid on Python back to 2.7, and to work for
general file-like objects, so it'll still be some work to get the
incantation correct.

It would be nice to explain this explicitly in the docs, though, as
read(1) is pretty common, and doesn't typically expect to get an error
because of this.

Thanks,
Paul

From k7hoven at gmail.com Mon Sep 5 09:40:08 2016
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Mon, 5 Sep 2016 16:40:08 +0300
Subject: [Python-Dev] Please reject or postpone PEP 526
In-Reply-To: <20160905121036.GR26300@ando.pearwood.info>
References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org>
 <20160905121036.GR26300@ando.pearwood.info>
Message-ID: 

It looks like you are trying to make sense of this, but unfortunately
there's some added mess and copy&paste-like errors regarding who said
what.
I think no such errors remain in what I quote below:

On Mon, Sep 5, 2016 at 3:10 PM, Steven D'Aprano wrote:
>
> [Koos Zevenhoven]
>> >> How is it going to help that these are equivalent within one checker,
>> >> if the meaning may differ across checkers?
>
> Before I can give an answer to your [Koos'] question, I have to
> understand what you see as the problem here.

The problem was the suggested restrictive addition into PEP 526 with
no proper justification, especially since the PEP was not supposed to
restrict the semantics of type checking. I was asking how it would
help to add that restriction. Very simple. Maybe some people got
confused because I did want to *discuss* best practices for type
checking elsewhere.

> I *think* that you are worried that two different checkers will disagree
> on what counts as a type error. That given the same chunk of code:

In the long term, I'm worried about that, but there's nothing that PEP
526 can do about it at this point.

> [Nick Coghlan]
>> > For typechecker developers, it provides absolute clarity that the
>> > semantics of the new annotations should match the behaviour of
>> > existing type comments when there's an initialiser present,
>
> [Koos]
>> I understood that, but what's the benefit?
>
> Are you asking what is the benefit of having three forms of syntax for
> the same thing?

No, still the same thing: What is the benefit of that particular
restriction, when there are no other restrictions? Better just leave
it out.

> The type comment syntax is required for Python 2 and
> backwards-compatibility. That's a given.

Sure, but not all type checkers will have to care about Python 2.

> The variable annotation syntax is required because the type comment
> syntax is (according to the PEP) very much a second-best solution. See
> the PEP:
>
> https://www.python.org/dev/peps/pep-0526/#id4
>
> So this is a proposal to create a *better* syntax for something which
> already exists.
The old version, using comments, cannot be deprecated or
> removed, as it is required for Python 3.5 and older.

Right.

> Once we allow
>
> x: T = value
>
> then there is benefit in also allowing:
>
> x: T
> x = value
>
> since this supports some of the use cases that aren't well supported by
> type comments or one-line variable annotations. E.g. very long or deeply
> indented lines, situations where the assignment to x is inside an
> if...else branch, or any other time you wish to declare the type of the
> variable before actually setting the variable.

Sure.

> [Nick]
>> > For folks following along without necessarily keeping up with all the
>> > nuances, it makes it more explicit what Guido means when he says "PEP
>> > 526 does not make a stand on the
>> > behavior of type checkers (other than deferring to PEP 484)."
>
> [Koos]
>> What you are proposing is exactly "making a stand on the behavior of
>> type checkers", and the examples you provide below are all variations
>> of the same situation and provide no justification for a general rule.
>
> I'm sorry, I don't understand this objection. The closest I can get to
> an answer would be:
>
> A general rule is better than a large number of unconnected, arbitrary,
> special cases.

A general rule that does not solve a problem is worse than no rule.
--
Koos

>
> --
> Steve

--
+ Koos Zevenhoven
+ http://twitter.com/k7hoven +

From ncoghlan at gmail.com Mon Sep 5 09:46:59 2016
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 5 Sep 2016 23:46:59 +1000
Subject: [Python-Dev] Please reject or postpone PEP 526
In-Reply-To: 
References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org>
Message-ID: 

On 5 September 2016 at 21:46, Koos Zevenhoven wrote:
> The thing I'm promoting here is to not add anything to PEP 526 that
> says what a type checker is supposed to do with type annotations.

PEP 526 says it doesn't intend to expand the scope of typechecking
semantics beyond what PEP 484 already supports. For that to be true,
it needs to be able to define expected equivalencies between the
existing semantics of PEP 484 and the new syntax in PEP 526.

If those equivalencies can't be defined, then Mark's concerns are
valid, and the PEP either needs to be deferred as inadvertently
introducing new semantics while intending to only introduce new
syntax, or else the intended semantics need to be spelled out as they
were in PEP 484 so folks can judge the proposal accurately, rather
than attempting to judge it based on an invalid premise.

For initialised variables, the equivalence between the two PEPs is
straightforward: "x: T = expr" is equivalent to "x = expr # type: T"

If PEP 526 always required an initialiser, and didn't introduce
ClassVar, there'd be no controversy, and we'd already be done.
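Spelled out as code, that straightforward equivalence is (an illustrative
pair, not taken from either PEP):

```python
from typing import List

# The two spellings a checker treats identically: a PEP 484 type
# comment and a PEP 526 variable annotation (3.6+ syntax).
x = []  # type: List[int]
y: List[int] = []

print(x == y)
```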
A strict "no new semantics beyond PEP 484" interpretation would mean that these need to be interpreted the same way as parameter annotations: as a type hint on the outcome of the code executed up to that point, rather than as a type constraint on assignment statements in the code *following* that point. Consider: def simple_appender(base: List[T], value: T) -> None: base.append(value) This will typecheck fine - lists have append methods, and the value appended conforms to what our list expects. The parameter annotations mainly act as constraints on how this function is *called*, with the following all being problematic: simple_appender([1, 2, 3], "hello") # Container/value type mismatch simple_appender([1, 2, 3], None) # Value is not optional simple_appender((1, 2, 3), 4) # A tuple is not a list However, because of the way name binding in Python works, the annotations in *no way* constrain assignments inside the function body: def not_so_simple_appender(base: List[T], value: T) -> None: other_ref = base base = value other_ref.append(base) >From a dynamic typechecking perspective, that's just as valid as the original implementation, since the "List[T]" type of "other_ref" is inferred from the original type of "base" before it gets rebound to value and has its new type inferred as "T". This freedom to rebind an annotated name without a typechecker complaining is what Mark is referring to when he says that PEP 484 attaches annotations to expressions rather than types. Under such "parameter annotation like" semantics, uninitialised variable annotations would only make sense as a new form of post-initialisation assertion, and perhaps as some form of Eiffel-style class invariant documentation syntax. The usage to help ensure code correctness in multi-branch initialisation cases would then look something like this: if case1: x = ... elif case2: x = ... else: x = ... 
assert x : List[T] # If we get to here without x being List[T],
something's wrong

The interpreter could then optimise type assertions out entirely at
function level (even in __debug__ mode), and turn them into
annotations at module and class level (with typecheckers then deciding
how to process them).

That's not what the PEP proposes for uninitialised variables though:
it proposes processing them *before* a series of assignment
statements, which *only makes sense* if you plan to use them to
constrain those assignments in some way.

If you wanted to write something like that under a type assertion
spelling, then you could enlist the aid of the "all" builtin:

    assert all(x) : List[T] # All local assignments to "x" must abide
by this constraint
    if case1:
        x = ...
    elif case2:
        x = ...
    else:
        x = ...

So I've come around to the point of view of being a solid -1 on the
PEP as written - despite the best of intentions, it strongly
encourages "assert all(x): List[T]" as the default interpretation of
uninitialised variable annotations, and doesn't provide an easy way
to do arbitrary inline type assertions to statically check the
correctness of the preceding code the way we can with runtime
assertions and as would happen if the code in question was factored
out to an annotated function.

Stick the "assert" keyword in front of them though, call them type
assertions rather than type declarations, and require all() when you
want to constrain all assignments later in the function (or until the
next relevant type assertion), and I'm a solid +1.

Cheers,
Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From k7hoven at gmail.com Mon Sep 5 09:59:01 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Mon, 5 Sep 2016 16:59:01 +0300 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: Sorry, I don't have time to read emails of this length now, and perhaps I'm interpreting your emails more literally than you write them, anyway. If PEP 484 introduces unnecessary restrictions at this point, that's a separate issue. I see no need to copy those into PEP 526. I'll be posting my own remaining concerns regarding PEP 526 when I find the time. -- Koos On Mon, Sep 5, 2016 at 4:46 PM, Nick Coghlan wrote: > On 5 September 2016 at 21:46, Koos Zevenhoven wrote: >> The thing I'm promoting here is to not add anything to PEP 526 that >> says what a type checker is supposed to do with type annotations. > > PEP 526 says it doesn't intend to expand the scope of typechecking > semantics beyond what PEP 484 already supports. For that to be true, > it needs to be able to define expected equivalencies between the > existing semantics of PEP 484 and the new syntax in PEP 526. > > If those equivalencies can't be defined, then Mark's concerns are > valid, and the PEP either needs to be deferred as inadvertently > introducing new semantics while intending to only introduce new > syntax, or else the intended semantics need to be spelled out as they > were in PEP 484 so folks can judge the proposal accurately, rather > than attempting to judge it based on an invalid premise. > > For initialised variables, the equivalence between the two PEPs is > straightforward: "x: T = expr" is equivalent to "x = expr # type: T" > > If PEP 526 always required an initialiser, and didn't introduce > ClassVar, there'd be no controversy, and we'd already be done. 
> > However, the question of "Does this new syntax necessarily imply the > introduction of new semantics?" gets a lot murkier for uninitialised > variables. > > A strict "no new semantics beyond PEP 484" interpretation would mean > that these need to be interpreted the same way as parameter > annotations: as a type hint on the outcome of the code executed up to > that point, rather than as a type constraint on assignment statements > in the code *following* that point. > > Consider: > > def simple_appender(base: List[T], value: T) -> None: > base.append(value) > > This will typecheck fine - lists have append methods, and the value > appended conforms to what our list expects. > > The parameter annotations mainly act as constraints on how this > function is *called*, with the following all being problematic: > > simple_appender([1, 2, 3], "hello") # Container/value type mismatch > simple_appender([1, 2, 3], None) # Value is not optional > simple_appender((1, 2, 3), 4) # A tuple is not a list > > However, because of the way name binding in Python works, the > annotations in *no way* constrain assignments inside the function > body: > > def not_so_simple_appender(base: List[T], value: T) -> None: > other_ref = base > base = value > other_ref.append(base) > > From a dynamic typechecking perspective, that's just as valid as the > original implementation, since the "List[T]" type of "other_ref" is > inferred from the original type of "base" before it gets rebound to > value and has its new type inferred as "T". > > This freedom to rebind an annotated name without a typechecker > complaining is what Mark is referring to when he says that PEP 484 > attaches annotations to expressions rather than types. > > Under such "parameter annotation like" semantics, uninitialised > variable annotations would only make sense as a new form of > post-initialisation assertion, and perhaps as some form of > Eiffel-style class invariant documentation syntax. 
> > The usage to help ensure code correctness in multi-branch > initialisation cases would then look something like this: > > if case1: > x = ... > elif case2: > x = ... > else: > x = ... > assert x : List[T] # If we get to here without x being List[T], > something's wrong > > The interpreter could then optimise type assertions out entirely at > function level (even in __debug__ mode), and turn them into > annotations at module and class level (with typecheckers then deciding > how to process them). > > That's not what the PEP proposes for uninitialised variables though: > it proposes processing them *before* a series of assignment > statements, which *only makes sense* if you plan to use them to > constrain those assignments in some way. > > If you wanted to write something like that under a type assertion > spelling, then you could enlist the aid of the "all" builtin: > > assert all(x) : List[T] # All local assignments to "x" must abide > by this constraint > if case1: > x = ... > elif case2: > x = ... > else: > x = ... > > So I've come around to the point of view of being a solid -1 on the > PEP as written - despite the best of intentions, it strongly > encourages "assert all(x): List[T]" as the default interpretation of > unitialised variable annotations, and doesn't provide an easy way to > do arbitrary inline type assertions to statically check the > correctness of the preceding code the way we can with runtime > assertions and as would happen if the code in question was factored > out to an annotated function. > > Stick the "assert" keyword in front of them though, call them type > assertions rather than type declarations, and require all() when you > want to constrain all assignments later in the function (or until the > next relevant type assertion), and I'm a solid +1. > > Cheers, > Nick. 
> > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -- + Koos Zevenhoven + http://twitter.com/k7hoven + From ncoghlan at gmail.com Mon Sep 5 10:02:08 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 6 Sep 2016 00:02:08 +1000 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: On 5 September 2016 at 23:46, Nick Coghlan wrote: > Under such "parameter annotation like" semantics, uninitialised > variable annotations would only make sense as a new form of > post-initialisation assertion, and perhaps as some form of > Eiffel-style class invariant documentation syntax. Thinking further about the latter half of that comment, I realised that the PEP 484 equivalence I'd like to see for variable annotations in a class body is how they would relate to a property definition using the existing PEP 484 syntax. For example, consider: class AnnotatedProperty: @property def x(self) -> int: ... @x.setter def x(self, value: int) -> None: ... @x.deleter def x(self) -> None: ... It would be rather surprising if that typechecked differently from: class AnnotatedVariable: x: int For ClassVar, you'd similarly want: class AnnotatedClassVariable: x: ClassVar[int] to typecheck like "x" was declared as an annotated property on the metaclass. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mark at hotpy.org Mon Sep 5 10:19:46 2016 From: mark at hotpy.org (Mark Shannon) Date: Mon, 5 Sep 2016 15:19:46 +0100 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: <57CD7F02.7080106@hotpy.org> On 04/09/16 21:16, Guido van Rossum wrote: > Everybody please stop panicking. PEP 526 does not make a stand on the > behavior of type checkers (other than deferring to PEP 484). 
If you > want to start a discussion about constraining type checkers please do > it over at python-ideas. There is no rush as type checkers are not > affected by the feature freeze. > Indeed, we shouldn't panic. We should take our time, review this carefully and make sure that the version of typehints that lands in 3.7 is one that most of us are happy with and all of us can at least tolerate. Cheers, Mark. From k7hoven at gmail.com Mon Sep 5 10:24:15 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Mon, 5 Sep 2016 17:24:15 +0300 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: On Mon, Sep 5, 2016 at 5:02 PM, Nick Coghlan wrote: > On 5 September 2016 at 23:46, Nick Coghlan wrote: >> Under such "parameter annotation like" semantics, uninitialised >> variable annotations would only make sense as a new form of >> post-initialisation assertion, Why not discuss this in the python-ideas thread where I quote myself from last Friday regarding the notion of annotations as assertions? >> and perhaps as some form of >> Eiffel-style class invariant documentation syntax. I hope this is simpler than it sounds :-) > Thinking further about the latter half of that comment, I realised > that the PEP 484 equivalence I'd like to see for variable annotations > in a class body is how they would relate to a property definition > using the existing PEP 484 syntax. > > For example, consider: > > class AnnotatedProperty: > > @property > def x(self) -> int: > ... > > @x.setter > def x(self, value: int) -> None: > ... > > @x.deleter > def x(self) -> None: > ... > > It would be rather surprising if that typechecked differently from: > > class AnnotatedVariable: > > x: int > How about just using the latter way? That's much clearer. I doubt this needs a change in the PEP.
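[Editor's note: both spellings Nick compares are expressible with current syntax (the variable form needs Python 3.6+). A minimal runnable sketch, with the deleter omitted and the property backed by a hypothetical `_x` slot:]

```python
class AnnotatedProperty:
    # Property-based spelling: the annotations live on the accessor methods.
    @property
    def x(self) -> int:
        return self._x

    @x.setter
    def x(self, value: int) -> None:
        self._x = value


class AnnotatedVariable:
    # PEP 526 spelling: a single bare annotation in the class body.
    x: int


p = AnnotatedProperty()
p.x = 3
print(p.x)                                # 3
print(AnnotatedVariable.__annotations__)  # {'x': <class 'int'>}
```

At runtime the two differ (a descriptor versus a mere annotation entry); the question in the thread is whether a typechecker should treat them the same.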
> For ClassVar, you'd similarly want: > > > class AnnotatedClassVariable: > > x: ClassVar[int] > > to typecheck like "x" was declared as an annotated property on the metaclass. > Sure, there are many things that one may consider equivalent. I doubt you'll be able to list them all in a way that everyone agrees on. And I hope you don't take this as a challenge -- I'm in the don't-panic camp :). -- Koos > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -- + Koos Zevenhoven + http://twitter.com/k7hoven + From tds333 at mailbox.org Mon Sep 5 05:08:51 2016 From: tds333 at mailbox.org (Wolfgang) Date: Mon, 5 Sep 2016 11:08:51 +0200 Subject: [Python-Dev] PEP 526 ready for review: Syntax for Variable and Attribute Annotations In-Reply-To: References: Message-ID: <2272d90f-cf11-4589-e081-d48515aa8e4f@mailbox.org> Hi, first, if something like this is needed, I am fine with the syntax. But I think this change comes too late for 3.6. More time is needed to discuss all of this, and more time is needed for the type checkers to mature. Don't rush such a change, because it affects the language as a whole. So please defer it. Saying the syntax is fine does not mean I am happy with the addition. Fundamentally, I think this should not be added at the Python language syntax level. Over the years it will be misused and misunderstood by new users. It will affect all other users reading the code and may even mislead them. If variable and attribute annotation is needed, keep it at the stub file level, in *.pyi files, used only for the type checking stuff. Other users need not bother with them. And for stub files it can be as simple as: myvar = typing.Int() or any other valid syntax. For me, the whole business of specifying types in Python comes down to this: it can be useful to document a user interface (most of the time a function or method) and say which types are supported when you call it. Some day a documentation generator can use this type information and I will no longer need to specify it in the docstring as well. Personally, I would like to extend the pyi stub files to also carry the documentation, and keep the code as clean and short as possible. Sometimes the documentation is longer than the code and the code is no longer easy to find. Instead of putting everything into the language, put more into the stub files. Even micropython or other implementations with limited constraints don't need to carry all of this. Even if it is only part of the AST, it is overhead. Has anyone checked whether adding this could slow down the interpreter or interpreter startup, or increase memory consumption? Regards, Wolfgang On 30.08.2016 23:20, Guido van Rossum wrote: > I'm happy to present PEP 526 for your collective review: > https://www.python.org/dev/peps/pep-0526/ (HTML) > https://github.com/python/peps/blob/master/pep-0526.txt (source) > > There's also an implementation ready: > https://github.com/ilevkivskyi/cpython/tree/pep-526 > > I don't want to post the full text here but I encourage feedback on > the high-order ideas, including but not limited to > > - Whether (given PEP 484's relative success) it's worth adding syntax > for variable/attribute annotations.
> > - Whether the keyword-free syntax idea proposed here is best: > NAME: TYPE > TARGET: TYPE = VALUE > > Note that there's an extensive list of rejected ideas in the PEP; > please be so kind to read it before posting here: > https://www.python.org/dev/peps/pep-0526/#rejected-proposals-and-things-left-out-for-now > > From mark at hotpy.org Mon Sep 5 11:26:17 2016 From: mark at hotpy.org (Mark Shannon) Date: Mon, 5 Sep 2016 16:26:17 +0100 Subject: [Python-Dev] Do PEP 526 type declarations define the types of variables or not? Message-ID: <57CD8E99.8090205@hotpy.org> Hi, PEP 526 states that "This PEP aims at adding syntax to Python for annotating the types of variables" and Guido seems quite insistent that the declarations are for the types of variables. However, I get the impression that most (all) of the authors and proponents of PEP 526 are quite keen to emphasise that the PEP in no way limits type checkers from doing what they want.
This is rather contradictory. The behaviour of a typechecker is defined by the typesystem that it implements. Whether a type annotation determines the type of a variable or an expression changes what typesystems are feasible. So, stating that annotations define the type of variables *does* limit what a typechecker can or cannot do. Unless, of course, others have a different idea of what the "type of a variable" means. To me, it means that for all assignments `var = expr` the type of `expr` must be a subtype of the variable, and for all uses of var, the type of the use is the same as the type of the variable. In this example: def bar()->Optional[int]: ... def foo()->int: x:Optional[int] = bar() if x is None: return -1 return x According to PEP 526 the annotation `x:Optional[int]` means that the *variable* `x` has the type `Optional[int]`. So what is the type of `x` in `return x`? If it is `Optional[int]`, then a type checker is obliged to reject this code. If it is `int` then what does "type of a variable" actually mean, and why aren't the other uses of `x` int as well? Cheers, Mark. From guido at python.org Mon Sep 5 11:34:20 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 5 Sep 2016 08:34:20 -0700 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: <57CD7F02.7080106@hotpy.org> References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> <57CD7F02.7080106@hotpy.org> Message-ID: On Mon, Sep 5, 2016 at 7:19 AM, Mark Shannon wrote: > On 04/09/16 21:16, Guido van Rossum wrote: >> >> Everybody please stop panicking. PEP 526 does not make a stand on the >> behavior of type checkers (other than deferring to PEP 484). If you >> want to start a discussion about constraining type checkers please do >> it over at python-ideas. There is no rush as type checkers are not >> affected by the feature freeze. >> > > Indeed, we shouldn't panic.
We should take our time, review this carefully > and make sure that the version of typehints that lands in 3.7 is one that > most of us are happy with and all of us can at least tolerate. Right, we want the best possible version to land in 3.7. And in order to make that possible, I have to accept it *provisionally* for 3.6 and Ivan's implementation will go into 3.6b1. We will then have until 3.7 to experiment with it and tweak it as necessary. Maybe ClassVar will turn out to be pointless. Maybe we'll decide that we want to have a syntax for quickly annotating several variables with the same type (x, y, z: T). Maybe we'll change the rules for how or when __annotations__ is updated. Maybe we'll change slightly whether we'll allow annotating complex assignment targets like x[f()]. But without starting the experiment now we won't be able to evaluate any of those things. Waiting until 3.7 is just going to cause the exact same discussions that are going on now 18 months from now. Regarding how type checkers should use the new syntax, PEP 526 itself gives barely more guidance than PEP 3107, except that we now have PEP 484 to tell us what types ought to look like, *if* you want to use an external type checker. I hope that you and others will help write another PEP (informational?) to guide type checkers and their users. Given my own experience at Dropbox (much of it vicariously through the eyes of the many Dropbox engineers annotating their own code) I am *very* reluctant to try and specify the behavior of a type checker formally myself. As anyone who has used mypy on a sizeable project knows, there are a lot more details to sort out than how to handle branches that assign different values to the same variable.
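[Editor's note: the branch-assignment case Guido mentions can be made concrete. A sketch of the PEP 526 declare-then-assign pattern, runnable on 3.6+ and accepted by a narrowing checker such as mypy; the function and names are illustrative, not from the PEP:]

```python
from typing import Optional

def describe(flag: bool) -> str:
    x: Optional[str]      # declaration only: no runtime effect inside a function
    if flag:
        x = "yes"
    else:
        x = None
    if x is None:         # a narrowing checker treats x as plain str below this
        return "nothing"
    return x

print(describe(True))     # yes
print(describe(False))    # nothing
```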
For people who want to read about what it is like to use mypy seriously, I can recommend the series of three blog posts by Daniel Moisset starting here: http://www.machinalis.com/blog/a-day-with-mypy-part-1/ If you want to see a large open source code base that's annotated for mypy (with 97% coverage), I recommend looking at Zulip: https://github.com/zulip/zulip Try digging through the history and looking for commits mentioning mypy; a Google Summer of Code student did most of the work over the summer. (The syntax used is the Python-2-compatible version, but that's hardly relevant -- the important things to observe include how they use types and how they had to change their code to pacify mypy.) -- --Guido van Rossum (python.org/~guido) From rymg19 at gmail.com Mon Sep 5 12:15:21 2016 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Mon, 5 Sep 2016 11:15:21 -0500 Subject: [Python-Dev] Do PEP 526 type declarations define the types of variables or not? In-Reply-To: <57CD8E99.8090205@hotpy.org> References: <57CD8E99.8090205@hotpy.org> Message-ID: Maybe the PEP should just say it's for "annotating variables", and it would mention "primarily for the purpose of types"? -- Ryan [ERROR]: Your autotools build scripts are 200 lines longer than your program. Something's wrong. http://kirbyfan64.github.io/ On Sep 5, 2016 10:27 AM, "Mark Shannon" wrote: > Hi, > > PEP 526 states that "This PEP aims at adding syntax to Python for > annotating the types of variables" and Guido seems quite insistent that the > declarations are for the types of variables. > > However, I get the impression that most (all) of the authors and > proponents of PEP 526 are quite keen to emphasise that the PEP in no way > limits type checkers from doing what they want. > > This is rather contradictory. The behaviour of a typechecker is defined by > the typesystem that it implements. Whether a type annotation determines the > type of a variable or an expression alters changes what typesystems are > feasible.
So, stating that annotations define the type of variables *does* > limit what a typechecker can or cannot do. > > Unless of course, others may have a different idea of what the "type of a > variable" means. > To me, it means it means that for all assignments `var = expr` > the type of `expr` must be a subtype of the variable, > and for all uses of var, the type of the use is the same as the type of > the variable. > > In this example: > > def bar()->Optional[int]: ... > > def foo()->int: > x:Optional[int] = bar() > if x is None: > return -1 > return x > > According to PEP 526 the annotation `x:Optional[int]` > means that the *variable* `x` has the type `Optional[int]`. > So what is the type of `x` in `return x`? > If it is `Optional[int]`, then a type checker is obliged to reject this > code. If it is `int` then what does "type of a variable" actually mean, > and why aren't the other uses of `x` int as well? > > Cheers, > Mark. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com > From steve at pearwood.info Mon Sep 5 12:17:11 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 6 Sep 2016 02:17:11 +1000 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <20160905121036.GR26300@ando.pearwood.info> Message-ID: <20160905161710.GS26300@ando.pearwood.info> On Mon, Sep 05, 2016 at 04:40:08PM +0300, Koos Zevenhoven wrote: > On Mon, Sep 5, 2016 at 3:10 PM, Steven D'Aprano wrote: > > > > [Koos Zevenhoven] > >> >> How is it going to help that these are equivalent within one checker, > >> >> if the meaning may differ across checkers? > > > > Before I can give an answer to your [Koos'] question, I have to > > understand what you see as the problem here.
The problem was that suggested restrictive addition into PEP 526 with > no proper justification, especially since the PEP was not supposed to > restrict the semantics of type checking. What "suggested restrictive addition into PEP 526" are you referring to? Please be specific. > I was asking how it would > help to add that restriction. Very simple. Maybe some people got > confused because I did want to *discuss* best practices for type > checking elsewhere. I still can't answer your question, because I don't understand what restriction you are talking about. Unless you mean the restriction that variable annotations are to mean the same thing whether they are written as `x:T = v` or `x = v #type: T`. I don't see this as a restriction. > > The type comment syntax is required for Python 2 and backwards- > > compatibility. That's a given. > > Sure, but all type checkers will not have to care about Python 2. They will have to care about type comments until such time as they are ready to abandon all versions of Python older than 3.6. And even then, there will probably be code still written with type comments until Python 4000. -- Steve From rosuav at gmail.com Mon Sep 5 12:33:40 2016 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 6 Sep 2016 02:33:40 +1000 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: <20160905161710.GS26300@ando.pearwood.info> References: <20160905121036.GR26300@ando.pearwood.info> <20160905161710.GS26300@ando.pearwood.info> Message-ID: On Tue, Sep 6, 2016 at 2:17 AM, Steven D'Aprano wrote: >> > The type comment syntax is required for Python 2 and backwards- >> > compatibility. That's a given. >> >> Sure, but all type checkers will not have to care about Python 2. > > They will have to care about type comments until such time as they are > ready to abandon all versions of Python older than 3.6.
If the checker itself depends on new features (say, an improved AST parser that retains inline comments for subsequent evaluation), you could say "You must have Python 3.6 or better to use this checker", but the application itself would still be able to run on older versions. That's another reason not to delay this PEP until 3.7, as it'd push _everything_ another 18 months (or more) into the future. ChrisA From p.f.moore at gmail.com Mon Sep 5 12:41:22 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 5 Sep 2016 17:41:22 +0100 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: References: Message-ID: On 5 September 2016 at 14:36, Steve Dower wrote: > The best fix is to use a buffered reader, which will read all the available > bytes and then let you .read(1), even if it happens to be an incomplete > character. But this is sys.stdin.buffer.raw, we're talking about. People can't really layer anything on top of that, it's precisely because they are trying to *bypass* the existing layering (that doesn't work the way that they need it to, because it blocks) that is the problem here. > We could theoretically add buffering to the raw reader to handle one character, > which would allow very small reads from raw, but that severely complicates > things and the advice to use a buffered reader is good advice anyway. Can you provide an example of how I'd rewrite the code that I quoted previously to follow this advice? Note - this is not theoretical, I expect to have to provide a PR to fix exactly this code should this change go in. At the moment I can't find a way that doesn't impact the (currently working and not expected to need any change) Unix version of the code, most likely I'll have to add buffering of 4-byte reads (which as you say is complex). 
The problem I have is that we're forcing application code to do the buffering to cater for Windows (where you're proposing that the raw IO layer doesn't handle it and will potentially fail reads of <4 bytes). Code written for POSIX doesn't need to do that, and the additional maintenance overhead is potentially large enough to put POSIX developers off adding the necessary code - this is in direct contrast to the proposal to make fsencoding UTF-8 to make it easier for POSIX-compatible code to "just work" on Windows. If the goals are to handle Unicode correctly for stdin, and to work in a way that POSIX-compatible code works without special effort on Windows, then as far as I can see we have to handle the buffering of partial reads of UTF-8 code sequences (because POSIX does so). If, on the other hand, we just want Unicode to work on Windows, and we're not looking for POSIX code to work without change, then the proposed behaviour is OK (although I still maintain it needs to be flagged, as it's very close to being a compatibility break in practice, even if it's technically within the rules). Paul PS I'm not 100% sure that under POSIX read() will return partial UTF-8 byte sequences. I think it must, because otherwise a lot of code I've seen would be broken, but if a POSIX expert can confirm or deny my assumption, that would be great. From ethan at stoneleaf.us Mon Sep 5 12:58:42 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 05 Sep 2016 09:58:42 -0700 Subject: [Python-Dev] PEP 467: last round (?) 
In-Reply-To: References: <57C88355.9000302@stoneleaf.us> <57C8A5F1.4060204@stoneleaf.us> <57CA0EC8.5030508@stoneleaf.us> <1472861844.3258795.714404505.0822A4A7@webmail.messagingengine.com> Message-ID: <57CDA442.7080507@stoneleaf.us> On 09/03/2016 09:48 AM, Nick Coghlan wrote: > On 3 September 2016 at 21:35, Martin Panter wrote: >> On 3 September 2016 at 08:47, Victor Stinner wrote: >>> Le samedi 3 septembre 2016, Random832 a écrit : >>>> On Fri, Sep 2, 2016, at 19:44, Ethan Furman wrote: >>>>> The problem with only having `bchr` is that it doesn't help with >>>>> `bytearray`; >>>> >>>> What is the use case for bytearray.fromord? Even in the rare case >>>> someone needs it, why not bytearray(bchr(...))? >>> >>> Yes, this was my point: I don't think that we need a bytearray method to >>> create a mutable string from a single byte. >> >> I agree with the above. Having an easy way to turn an int into a bytes >> object is good. But I think the built-in bchr() function on its own is >> enough. Just like we have bytes object literals, but the closest we >> have for a bytearray literal is bytearray(b". . ."). > > This is a good point - earlier versions of the PEP didn't include > bchr(), they just had the class methods, so "bytearray(bchr(...))" > wasn't an available spelling (if I remember the original API design > correctly, it would have been something like > "bytearray(bytes.byte(...))"), which meant there was a strong > consistency argument in having the alternate constructor on both > types. Now that the PEP proposes the "bchr" builtin, the "fromord" > constructors look less necessary. tl;dr -- Sounds good to me. I'll update the PEP.
------- When this started the idea behind the methods that eventually came to be called "fromord" and "fromsize" was that they would be the two possible interpretations of "bytes(x)": the legacy Python2 behavior: >>> var = bytes('abc') >>> bytes(var[1]) 'b' the current Python 3 behavior: >>> var = b'abc' >>> bytes(var[1]) b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 \x00\x00' Digging deeper the problem turns out to be that indexing a bytes object changed: Python 2: >>> b'abc'[1] 'b' Python 3: >>> b'abc'[1] 98 If we pass an actual byte into the Python 3 bytes constructor it behaves as one would expect: >>> bytes(b'b') b'b' Given all this it can be argued that the real problem is that indexing a bytes object behaves differently depending on whether you retrieve a single byte with an index versus a single byte with a slice: >>> b'abc'[2] 99 >>> b'abc'[2:] b'c' Since we cannot fix that behavior, the question is how do we make it more livable? - we can add a built-in to transform the int back into a byte: >>> bchr(b'abc'[2]) b'c' - we can add a method to return a byte from the bytes object, not an int: >>> b'abc'.getbyte(2) b'c' - we can add a method to return a byte from an int: >>> bytes.fromint(b'abc'[2]) b'c' Which is all to say we have two problems to deal with: - getting bytes from a bytes object - getting bytes from an int Since "bytes.fromint()" and "bchr()" are the same, and given that "bchr(ordinal)" mirrors "chr(ordinal)", I think "bchr" is the better choice for getting bytes from an int. For getting bytes from bytes, "getbyte()" and "iterbytes" are good choices. 
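[Editor's note: `bchr` is a PEP 467 proposal, not an existing builtin; `bytes([i])` is today's spelling. A pure-Python sketch of the proposed behaviour:]

```python
def bchr(i: int) -> bytes:
    # Sketch of the proposed builtin: int -> single-byte bytes object,
    # mirroring how chr() turns an int into a one-character str.
    if not 0 <= i <= 255:
        raise ValueError("bchr() arg not in range(256)")
    return bytes([i])

print(bchr(98))           # b'b', mirroring chr(98) == 'b'
print(bchr(b'abc'[2]))    # b'c': undoes the int produced by indexing bytes
```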
> Given that, and the uncertain deprecation time frame for accepting > integers in the main bytes and bytearray constructors, perhaps both > the "fromsize" and "fromord" parts of the proposal can be deferred > indefinitely in favour of just adding the bchr() builtin? Agreed. -- ~Ethan~ From ethan at stoneleaf.us Mon Sep 5 13:24:07 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 05 Sep 2016 10:24:07 -0700 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: <57CDAA37.5040207@stoneleaf.us> On 09/05/2016 06:46 AM, Nick Coghlan wrote: [an easy to understand explanation for those of us who aren't type-inferring gurus] Thanks, Nick. I think I finally have a grip on what Mark was talking about, and about how these things should work. Much appreciated! -- ~Ethan~ From steve.dower at python.org Mon Sep 5 13:38:01 2016 From: steve.dower at python.org (Steve Dower) Date: Mon, 5 Sep 2016 10:38:01 -0700 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: References: Message-ID: <547a4a73-b696-26e3-d55f-04aca1fb4c7f@python.org> On 05Sep2016 0941, Paul Moore wrote: > On 5 September 2016 at 14:36, Steve Dower wrote: >> The best fix is to use a buffered reader, which will read all the available >> bytes and then let you .read(1), even if it happens to be an incomplete >> character. > > But this is sys.stdin.buffer.raw, we're talking about. People can't > really layer anything on top of that, it's precisely because they are > trying to *bypass* the existing layering (that doesn't work the way > that they need it to, because it blocks) that is the problem here. This layer also blocks, and always has. You need to go to platform specific functions anyway to get non-blocking functionality (which is also wrapped up in getc I believe, but that isn't used by FileIO or the new WinConsoleIO classes). 
>> We could theoretically add buffering to the raw reader to handle one character, >> which would allow very small reads from raw, but that severely complicates >> things and the advice to use a buffered reader is good advice anyway. > > Can you provide an example of how I'd rewrite the code that I quoted > previously to follow this advice? Note - this is not theoretical, I > expect to have to provide a PR to fix exactly this code should this > change go in. At the moment I can't find a way that doesn't impact the > (currently working and not expected to need any change) Unix version > of the code, most likely I'll have to add buffering of 4-byte reads > (which as you say is complex). The easiest way to follow it is to use "sys.stdin.buffer.read(1)" rather than "sys.stdin.buffer.raw.read(1)". > PS I'm not 100% sure that under POSIX read() will return partial UTF-8 > byte sequences. I think it must, because otherwise a lot of code I've > seen would be broken, but if a POSIX expert can confirm or deny my > assumption, that would be great. I just tested, and yes it returns partial characters. That's a good reason to do the single character buffering ourselves. Shouldn't be too hard to deal with. Cheers, Steve From guido at python.org Mon Sep 5 13:40:55 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 5 Sep 2016 10:40:55 -0700 Subject: [Python-Dev] Do PEP 526 type declarations define the types of variables or not? In-Reply-To: <57CD8E99.8090205@hotpy.org> References: <57CD8E99.8090205@hotpy.org> Message-ID: On Mon, Sep 5, 2016 at 8:26 AM, Mark Shannon wrote: > PEP 526 states that "This PEP aims at adding syntax to Python for annotating > the types of variables" and Guido seems quite insistent that the > declarations are for the types of variables. > > However, I get the impression that most (all) of the authors and proponents > of PEP 526 are quite keen to emphasise that the PEP in no way limits type > checkers from doing what they want. 
> > This is rather contradictory. The behaviour of a typechecker is defined by > the typesystem that it implements. Whether a type annotation determines the > type of a variable or an expression changes what typesystems are > feasible. So, stating that annotations define the type of variables *does* > limit what a typechecker can or cannot do. > > Unless of course, others may have a different idea of what the "type of a > variable" means. > To me, it means that for all assignments `var = expr` > the type of `expr` must be a subtype of the variable, > and for all uses of var, the type of the use is the same as the type of the > variable. > > In this example: > > def bar()->Optional[int]: ... > > def foo()->int: > x:Optional[int] = bar() > if x is None: > return -1 > return x > > According to PEP 526 the annotation `x:Optional[int]` > means that the *variable* `x` has the type `Optional[int]`. > So what is the type of `x` in `return x`? > If it is `Optional[int]`, then a type checker is obliged to reject this > code. If it is `int` then what does "type of a variable" actually mean, > and why aren't the other uses of `x` int as well? Oh, there is definitely a problem here if you interpret it that way. Of course I assume that other type checkers are at least as smart as mypy. :-) In mypy, the analysis of this example narrows the type x can have once `x is None` is determined to be false, so that the example passes. I guess this is a surprise if you think of type systems like Java's where the compiler forgets what it has learned, at least from the language spec's POV. But a Python type checker is more like a linter, and false positives (complaints about valid code) are much more problematic than false negatives (passing invalid code). So a Python type checker that is to gain acceptance among users must be much smarter than that, and neither PEP 484 nor PEP 526 is meant to require a type checker to complain about `return x` in the above example.
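To make the narrowing discussion concrete, here is a runnable version of Mark's example (with a stub `bar()` added so it can be executed; the stub's parameter is an illustration, not part of the original):

```python
from typing import Optional

def bar(value: Optional[int] = None) -> Optional[int]:
    # Stub standing in for Mark's bar(); the parameter exists only so
    # both branches of foo() can be exercised at runtime.
    return value

def foo(value: Optional[int] = None) -> int:
    x: Optional[int] = bar(value)
    if x is None:
        return -1
    # A narrowing checker such as mypy treats x as int from here on,
    # so `return x` is accepted despite the Optional[int] annotation.
    return x

print(foo(42), foo(None))  # 42 -1
```

At runtime the annotation changes nothing; the disagreement is purely about what a checker should infer for `x` after the `is None` test.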
I'm not sure how to change the language of the PEP though -- do you have a suggestion? It all seems to depend on how the reader interprets the meaning of very vague words like "variable" and "type". -- --Guido van Rossum (python.org/~guido) From steve at pearwood.info Mon Sep 5 13:57:01 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 6 Sep 2016 03:57:01 +1000 Subject: [Python-Dev] Do PEP 526 type declarations define the types of variables or not? In-Reply-To: <57CD8E99.8090205@hotpy.org> References: <57CD8E99.8090205@hotpy.org> Message-ID: <20160905175701.GT26300@ando.pearwood.info> On Mon, Sep 05, 2016 at 04:26:17PM +0100, Mark Shannon wrote: > In this example: > > def bar()->Optional[int]: ... > > def foo()->int: > x:Optional[int] = bar() > if x is None: > return -1 > return x > > According to PEP 526 the annotation `x:Optional[int]` > means that the *variable* `x` has the type `Optional[int]`. We can change that to read: x = bar() and let the type-checker infer the type of x. Introducing the annotation here is a red-herring: you have *exactly* the same issue whether we do type inference, a type comment, or the proposed variable annotation. > So what is the type of `x` in `return x`? The type of *the variable x* is still Optional[int]. But that's the wrong question. The right question is, what's the type of the return result? The return result is not "the variable x". The return result is the value produced by evaluating the expression `x` in the specific context of where the return statement is found. (To be precise, it is the *inferred* return value, of course, since the actual return value won't be produced until runtime.) > If it is `Optional[int]`, then a type checker is obliged to reject this > code. Not at all, because the function isn't returning "the variable x". It's returning the value currently bound to x, and *that* is known to be an int. It has to be an int, because if it were None, the function would have already returned -1. 
The return result is an expression that happens to consist of just a single term, in this case `x`. To make it more clear, let's change it to `return x+999`. The checker should be able to infer that since `x` must be an int here, the expression `x+999` will also be an int. This satisfies the return type. Of course `x+999` is just a stand-in for any expression that is known to return an int, and that includes the case where the expression is `x` alone. There's really not anything more mysterious going on here than the case where we have a Union type with two branches that depend on which type x actually is: def demo(x:Union[int, str])->int: # the next two lines are expected to fail the type check # since the checker can't tell if x is an int or a str x+1 len(x) # but the rest of the function should pass if isinstance(x, int): # here we know x is definitely an int y = x + 1 if isinstance(x, str): # and here we know x is definitely a str y = len(x) return y When I run MyPy on that, it gives: [steve at ando ~]$ mypy test.py test.py: note: In function "demo": test.py:6: error: Unsupported operand types for + ("Union[int, str]" and "int") test.py:7: error: Argument 1 to "len" has incompatible type "Union[int, str]"; expected "Sized" But all of this is a red herring. It has nothing to do with the proposed variable annotation syntax: it applies equally to type comments and function annotations. -- Steve From p.f.moore at gmail.com Mon Sep 5 14:10:10 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 5 Sep 2016 19:10:10 +0100 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: <547a4a73-b696-26e3-d55f-04aca1fb4c7f@python.org> References: <547a4a73-b696-26e3-d55f-04aca1fb4c7f@python.org> Message-ID: On 5 September 2016 at 18:38, Steve Dower wrote: >> Can you provide an example of how I'd rewrite the code that I quoted >> previously to follow this advice? 
Note - this is not theoretical, I >> expect to have to provide a PR to fix exactly this code should this >> change go in. At the moment I can't find a way that doesn't impact the >> (currently working and not expected to need any change) Unix version >> of the code, most likely I'll have to add buffering of 4-byte reads >> (which as you say is complex). > > The easiest way to follow it is to use "sys.stdin.buffer.read(1)" rather > than "sys.stdin.buffer.raw.read(1)". I may have got confused here. If I say sys.stdin.buffer.read(1), having first checked via kbhit() that there's a character available[1], then I will always get 1 byte returned, never the "buffer too small to return a full character" error that you talk about in the PEP? If so, then I don't understand when the error you propose will be raised (unless your comment here is based on what you say below that we'll now buffer and therefore the error is no longer needed). > >> PS I'm not 100% sure that under POSIX read() will return partial UTF-8 >> byte sequences. I think it must, because otherwise a lot of code I've >> seen would be broken, but if a POSIX expert can confirm or deny my >> assumption, that would be great. > > I just tested, and yes it returns partial characters. That's a good reason > to do the single character buffering ourselves. Shouldn't be too hard to > deal with. OK, cool. Again I'm slightly confused because isn't this what you said before "severely complicates things" - or was that only for the raw layer? I was over-simplifying the issue for pyinvoke, which in practice is complicated by not yet having completed the process disentangling bytes/unicode handling. As a result, I was handwaving somewhat about whether read() is called on a raw stream or a buffered stream - in practice I'm sure I can manage as long as the buffered level still handles single-byte reads. 
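One portable way to cope with reads that return partial UTF-8 sequences — the situation discussed above — is an incremental decoder, which buffers bytes until a complete character arrives. This is a sketch of the general technique, not the actual pyinvoke fix:

```python
import codecs

decoder = codecs.getincrementaldecoder('utf-8')()
data = '\u20ac'.encode('utf-8')  # Euro sign: 3 bytes in UTF-8
pieces = []
for i in range(len(data)):
    # Feed one byte at a time, as a read(1) loop would.
    pieces.append(decoder.decode(data[i:i + 1]))
# Intermediate calls return '' until the final byte completes the character.
print(pieces)  # ['', '', '€']
```

The same loop works unchanged on POSIX and Windows, which is why the advice to stay at (or above) the buffered layer sidesteps the whole problem.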
One thing I did think of, though - if someone *is* working at the raw IO level, they have to be prepared for the new "buffer too small to return a full character" error. That's OK. But what if they request reading 7 bytes, but the input consists of 6 characters that encode to 1 byte in UTF-8, followed by a character that encodes to 2 bytes? You can return 6 bytes, that's fine - but you'll presumably still need to read the extra character before you can determine that it won't fit - so you're still going to have to buffer to some degree, surely? I guess these are implementation details, though - I'll try to find some time to read the patch in order to understand this. It's not something that matters in terms of the PEP anyway, it's an implementation detail. Cheers, Paul [1] Yes, I know that's not the best approach, but it's as good as we get without adding rather too much scary Windows specific code. (The irony of trying to get this right given how much low-level Unix code I don't follow is already in there doesn't escape me :-() From guido at python.org Mon Sep 5 14:15:17 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 5 Sep 2016 11:15:17 -0700 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: <57CDAA37.5040207@stoneleaf.us> References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> <57CDAA37.5040207@stoneleaf.us> Message-ID: On Mon, Sep 5, 2016 at 10:24 AM, Ethan Furman wrote: > On 09/05/2016 06:46 AM, Nick Coghlan wrote: > > [an easy to understand explanation for those of us who aren't type-inferring > gurus] > > Thanks, Nick. I think I finally have a grip on what Mark was talking about, > and about how these things should work. > > Much appreciated! There must be some misunderstanding. The message from Nick with that timestamp (https://mail.python.org/pipermail/python-dev/2016-September/146200.html) hinges on an incorrect understanding of the intention of annotations without value (e.g.
`x: Optional[int]`), leading to a -1 on the PEP. I can't tell if this is an honest misunderstanding or a strawman, but I want to set the intention straight. First of all, the PEP does not require the type checker to interpret anything in a particular way; it intentionally shies away from prescribing semantics (other than the runtime semantics of updating __annotations__ or verifying that the target appears assignable). But there appears considerable fear about what expectations the PEP has of a reasonable type checker. In response to this I'll try to sketch how I think this should be implemented in mypy. There are actually at least two separate cases: if x is a local variable, the intention of `x: ` is quite different from when x occurs in a class. - When found in a class, all *uses* (which may appear in modules far away from the definition) must be considered to conform to the stated type -- as must all assignments to it, but I believe that's never been in doubt. There are just too many edge cases to consider to make stricter assumptions (e.g. threading, exceptions, signals), so that even after seeing `self.x = 42; use(self.x)` the call to use() cannot assume that self.x is still 42. - But when found inside a function referring to a local variable, mypy should treat the annotation as a restriction on assignment, and use its own inference engine to type-check *uses* of that variable. So that in this example (after Mark's): def bar() -> Optional[int]: ... def foo() -> int: x: Optional[int] x = bar() if x is None: return -1 return x there should not be an error on `return x` because mypy is smart enough to know it cannot be None at that point. I am at a loss how to modify the PEP to avoid this misunderstanding, since it appears it is entirely in the reader's mind. The PEP is not a tutorial but a spec for the implementation, and as a spec it is quite clear that it leaves the type-checking semantics up to individual type checkers. 
And I think that is the right thing to do -- in practice there are many other ways to write the above example, and mypy will understand some of them, but not others, while other type checkers may understand a different subset of examples. I can't possibly prescribe how type checkers should behave in each case -- I can't even tell which cases are important to distinguish. So writing down "the type checker should not report an error in the following case" in the PEP is not going to be helpful for anyone (in contrast, I think discussing examples on a mailing list *is* useful). Like a linter, a type checker has limited intelligence, and it will be a quality-of-implementation issue as to how useful a type checker will be in practice. But that's not the topic of PEP 526. -- --Guido van Rossum (python.org/~guido) From victor.stinner at gmail.com Mon Sep 5 14:44:04 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 5 Sep 2016 11:44:04 -0700 Subject: [Python-Dev] Push PEP 528 (use utf8 on Windows) right now, but revert before 3.6 if needed Message-ID: Hi, I just spoke with Steve Dower (thanks for the current sprint!) about PEP 528. We somehow agreed that we need to push his implementation of the PEP right now to get enough time to test as many applications as possible on Windows to have a wide view of all possible regressions. The hope is that enough users will test the first Python 3.6 beta 1 (feature freeze!) with their app on Windows. If we find blocking issues before Python 3.6 final, we still have time to revert code to restore the Python 3.5 behaviour (use ANSI code page for bytes). What do you think? Note: First I was strongly opposed to any kind of change related to encodings on Windows, but then I made my own tests and had to confess that my proposed changes break the world. Steve's approach makes more sense and is more realistic.
Victor From tjreedy at udel.edu Mon Sep 5 14:58:01 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 5 Sep 2016 14:58:01 -0400 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> <57CD7F02.7080106@hotpy.org> Message-ID: On 9/5/2016 11:34 AM, Guido van Rossum wrote: > On Mon, Sep 5, 2016 at 7:19 AM, Mark Shannon wrote: >> Indeed, we shouldn't panic. We should take our time, review this carefully >> and make sure that the version of typehints that lands in 3.7 is one that we >> most of us are happy with and all of us can at least tolerate. > > Right, we want the best possible version to land in 3.7. And in order > to make that possible, I have to accept it *provisionally* for 3.6 and Until now, the 'provisional' part has not been clear to me, and presumably others who have written as if acceptance meant 'baked in stone'. We have had provisional modules, but not, that I can think of, syntax that remains provisional past the x.y.0 release. > Ivan's implementation will go into 3.6b1. We will then have until 3.7 > to experiment with it and tweak it as necessary. New syntax is usually implemented within python itself, and can be fully experimented with during alpha and beta releases. In this case, the effective implementation will be in 3rd party checkers and experimentation will take longer. -- Terry Jan Reedy From levkivskyi at gmail.com Mon Sep 5 15:01:29 2016 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Mon, 5 Sep 2016 21:01:29 +0200 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> <57CDAA37.5040207@stoneleaf.us> Message-ID: On 5 September 2016 at 20:15, Guido van Rossum wrote: > There are actually at least two separate cases: if x is a local > variable, the intention of `x: ` is quite different from when x > occurs in a class. > If I understand you correctly this also matches my mental model. 
In local scope x: ann = value acts like a filter allowing only something compatible to be assigned at this point (and/or casting to a more precise type). While in a class or module it is part of an "API specification" for that class/module. > I am at a loss how to modify the PEP to avoid this misunderstanding, > since it appears it is entirely in the reader's mind. The PEP is not a > tutorial but a spec for the implementation, ... > I was thinking about changing terminology to name annotations, but that will not solve the problem. The PEP mentions a separate document (guidelines) that will be published. I think a real solution will be to make a separate PEP that will explain in detail what the preferred meaning of types is and what people and machines could do with types. Is anyone interested in going in this direction? I would like to especially invite Mark, you have a lot of experience with type inference that would be very helpful (also it seems to me that you are concerned about this). -- Ivan From nad at python.org Mon Sep 5 15:03:50 2016 From: nad at python.org (Ned Deily) Date: Mon, 5 Sep 2016 12:03:50 -0700 Subject: [Python-Dev] Push PEP 528 (use utf8 on Windows) right now, but revert before 3.6 if needed In-Reply-To: References: Message-ID: On 9/5/16 11:44, Victor Stinner wrote: > I just spoke with Steve Dower (thanks for the current sprint!) about > PEP 528. We somehow agreed that we need to push his implementation > of the PEP right now to get enough time to test as many applications > as possible on Windows to have a wide view of all possible > regressions. > > The hope is that enough users will test the first Python 3.6 beta 1 > (feature freeze!) with their app on Windows. > > If we find blocking issues before Python 3.6 final, we still have time > to revert code to restore the Python 3.5 behaviour (use ANSI code page > for bytes). > > What do you think? Let's do it.
Thanks to both of you for hashing this out. From steve.dower at python.org Mon Sep 5 15:30:51 2016 From: steve.dower at python.org (Steve Dower) Date: Mon, 5 Sep 2016 12:30:51 -0700 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: References: <547a4a73-b696-26e3-d55f-04aca1fb4c7f@python.org> Message-ID: <407d23cb-e7dc-4ea5-fdcd-dacb769068e6@python.org> On 05Sep2016 1110, Paul Moore wrote: > On 5 September 2016 at 18:38, Steve Dower wrote: >>> Can you provide an example of how I'd rewrite the code that I quoted >>> previously to follow this advice? Note - this is not theoretical, I >>> expect to have to provide a PR to fix exactly this code should this >>> change go in. At the moment I can't find a way that doesn't impact the >>> (currently working and not expected to need any change) Unix version >>> of the code, most likely I'll have to add buffering of 4-byte reads >>> (which as you say is complex). >> >> The easiest way to follow it is to use "sys.stdin.buffer.read(1)" rather >> than "sys.stdin.buffer.raw.read(1)". > > I may have got confused here. If I say sys.stdin.buffer.read(1), > having first checked via kbhit() that there's a character > available[1], then I will always get 1 byte returned, never the > "buffer too small to return a full character" error that you talk > about in the PEP? If so, then I don't understand when the error you > propose will be raised (unless your comment here is based on what you > say below that we'll now buffer and therefore the error is no longer > needed). I don't think using buffer.read and kbhit together is going to be reliable anyway, as you may not have read everything that's already buffered yet. It's likely feasible if you flush everything, but otherwise it's a bit messy. > One thing I did think of, though - if someone *is* working at the raw > IO level, they have to be prepared for the new "buffer too small to > return a full character" error. That's OK. 
But what if they request > reading 7 bytes, but the input consists of 6 character s that encode > to 1 byte in UTF-8, followed by a character that encodes to 2 bytes? > You can return 6 bytes, that's fine - but you'll presumably still need > to read the extra character before you can determine that it won't fit > - so you're still going to have to buffer to some degree, surely? I > guess this is implementation details, though - I'll try to find some > time to read the patch in order to understand this. It's not something > that matters in terms of the PEP anyway, it's an implementation > detail. If you do raw.read(7), we internally do "7 / 4" and decide to only read one wchar_t from the console. So the returned bytes will be between 1 and 4 bytes long and there will be more info waiting for next time you ask. The only case we can reasonably handle at the raw layer is "n / 4" is zero but n != 0, in which case we can read and cache up to 4 bytes (one wchar_t) and then return those in future calls. If we try to cache any more than that we're substituting for buffered reader, which I don't want to do. Does caching up to one (Unicode) character at a time sound reasonable? I think that won't be much trouble, since there's no interference between system calls in that case and it will be consistent with POSIX behaviour. Cheers, Steve From eryksun at gmail.com Mon Sep 5 15:34:33 2016 From: eryksun at gmail.com (eryk sun) Date: Mon, 5 Sep 2016 19:34:33 +0000 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: References: Message-ID: I have some suggestions. With ReadConsoleW, CPython can use the pInputControl parameter to set a CtrlWakeup mask. This enables a Unix-style Ctrl+D for ending a read without having to press enter. 
For example: >>> CTRL_MASK = 1 << 4 >>> inctrl = (ctypes.c_ulong * 4)(16, 0, CTRL_MASK, 0) >>> _ = kernel32.ReadConsoleW(hStdIn, buf, 100, pn, inctrl); print() spam >>> buf.value 'spam\x04' >>> pn[0] 5 read() would have to manually replace '\x04' with NUL. Ctrl+Z can also be added to the mask: >>> CTRL_MASK = 1 << 4 | 1 << 26 >>> inctrl = (ctypes.c_ulong * 4)(16, 0, CTRL_MASK, 0) >>> _ = kernel32.ReadConsoleW(hStdIn, buf, 100, pn, inctrl); print() spam >>> buf.value 'spam\x1a' I'd like a method to query, set and unset ENABLE_VIRTUAL_TERMINAL_PROCESSING mode for the screen buffer (sys.stdout and sys.stderr) without having to use ctypes. The console in Windows 10 has built-in VT100 emulation, but it's initially disabled. The cmd shell enables it, but Python scripts aren't always run from cmd.exe. Sometimes they're run in a new console from Explorer or via "start", etc. For example, IPython could check for this to provide more bells and whistles when PyReadline isn't installed. Finally, functions such as WriteConsoleInputW and ReadConsoleOutputCharacter require opening CONIN$ or CONOUT$ with GENERIC_READ | GENERIC_WRITE access. The initial handles given to a console process have read-write access. For opening a new handle by device name, WindowsConsoleIO should first try GENERIC_READ | GENERIC_WRITE -- with a fallback to either GENERIC_READ or GENERIC_WRITE. The fallback is necessary for CON, which uses the desired access to determine whether to open the input buffer or screen buffer. --- Paul, do you have example code that uses the 'raw' stream? Using the buffer should behave as it always has -- at least in this regard. sys.stdin.buffer requests a large block, such as 8 KB. But since the console defaults to a cooked mode (i.e. processed input and line input -- control keys, command-line editing, input history, and aliases), ReadConsole returns when enter is pressed or when interrupted. 
It returns at least '\r\n', unless interrupted by Ctrl+C, Ctrl+Break or a custom CtrlWakeup key. However, if line-input mode is disabled, ReadConsole returns as soon as one or more characters are available in the input buffer. As to kbhit() returning true, this does not mean that read(1) from console input won't block (not unless line-input mode is disabled). It does mean that getwch() won't block (note the "w" in there; this one reads Unicode characters). The CRT's conio functions (e.g. kbhit, getwch) put the console input buffer in a raw mode (e.g. ^C is read as '\x03' instead of generating a CTRL_C_EVENT) and call the lower-level functions PeekConsoleInputW (kbhit) and ReadConsoleInputW (getwch), to peek at and read input event records. --- Splitting surrogate pairs across reads is a problem. Granted, this should rarely be an issue given the size of the reads that the buffer requests and the typical line length. In most cases the buffer completely consumes the entire line in one read. But in principle the raw stream shouldn't replace split surrogates with the U+FFFD replacement character. For example, with Steve's patch from issue 1602: >>> _ = write_console_input('\U00010000\r\n');\ ... b1 = raw_read(4); b2 = raw_read(4); b3 = raw_read(8) ? >>> b1, b2 (b'\xef\xbf\xbd', b'\xef\xbf\xbd') Splitting UTF-8 sequences across writes is more common. Currently a raw write doesn't handle this correctly: >>> b = 'eggs \U00010000 spam\n'.encode('utf-8') >>> _ = raw_write(b[:6]); _ = raw_write(b[6:]) eggs ???? spam Also, the console is UCS-2, which can't be transcoded between UTF-16 and UTF-8. Supporting UCS-2 in the console would integrate nicely with the filesystem PEP. It makes it always possible to print os.listdir('.'), copy and paste, and read it back without data loss. It would probably be simpler to use UTF-16 in the main pipeline and implement Martin's suggestion to mix in a UTF-8 buffer. The UTF-16 buffer could be renamed as "wbuffer", for expert use.
However, if you're fully committed to transcoding in the raw layer, I'm certain that these problems can be addressed with small buffers and using Python's codec machinery for a flexible mix of "surrogatepass" and "replace" error handling. From steve.dower at python.org Mon Sep 5 15:54:06 2016 From: steve.dower at python.org (Steve Dower) Date: Mon, 5 Sep 2016 12:54:06 -0700 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: References: Message-ID: <5cfa8c73-14b7-8795-06c5-3266940da4b1@python.org> On 05Sep2016 1234, eryk sun wrote: > Also, the console is UCS-2, which can't be transcoded between UTF-16 > and UTF-8. Supporting UCS-2 in the console would integrate nicely with > the filesystem PEP. It makes it always possible to print > os.listdir('.'), copy and paste, and read it back without data loss. Supporting UTF-8 actually works better for this. We already use surrogatepass explicitly (on the filesystem side, with PEP 529) and implicitly (on the console side, using the Windows conversion API). > It would probably be simpler to use UTF-16 in the main pipeline and > implement Martin's suggestion to mix in a UTF-8 buffer. The UTF-16 > buffer could be renamed as "wbuffer", for expert use. However, if > you're fully committed to transcoding in the raw layer, I'm certain > that these problems can be addressed with small buffers and using > Python's codec machinery for a flexible mix of "surrogatepass" and > "replace" error handling. I don't think it actually makes things simpler. Having two buffers is generally a bad idea unless they are perfectly synced, which would be impossible here without data corruption (if you read half a utf-8 character sequence and then read the wide buffer, do you get that character or not?). Writing a partial character is easily avoidable by the user. We can either fail with an error or print garbage, and currently printing garbage is the most compatible behaviour. 
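The surrogatepass handling mentioned above, and the split-surrogate problem from the parent message, can both be demonstrated portably — a sketch of the mechanisms involved, while the real code paths live in the C implementation:

```python
import codecs

# A lone UTF-16 surrogate round-trips through UTF-8 with 'surrogatepass',
# which is how not-strictly-valid console/filesystem data stays lossless.
lone = '\ud800'
encoded = lone.encode('utf-8', 'surrogatepass')
assert encoded == b'\xed\xa0\x80'
assert encoded.decode('utf-8', 'surrogatepass') == lone

# An incremental UTF-16 decoder shows why a small cache suffices when a
# surrogate pair is split across two reads: the high half is buffered
# until the low half arrives.
dec = codecs.getincrementaldecoder('utf-16-le')()
first = dec.decode(b'\x00\xd8')   # high surrogate of U+10000
second = dec.decode(b'\x00\xdc')  # low surrogate completes the pair
print(repr(first), repr(second))  # '' '\U00010000'
```

Nothing is emitted for the incomplete half, and no U+FFFD substitution happens, which is the behaviour the raw stream would ideally match.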
(Also occurs on Linux - I have a VM running this week for testing this stuff.) Cheers, Steve From christian at python.org Mon Sep 5 15:57:24 2016 From: christian at python.org (Christian Heimes) Date: Mon, 5 Sep 2016 21:57:24 +0200 Subject: [Python-Dev] TLS handshake performance boost Message-ID: Hi, I have yet another patch for the ssl module, http://bugs.python.org/issue19500 . The patch adds support for SSL session resumption on the client side. An SSLContext automatically handles server-side sessions. SSL sessions speed up successive TLS connections to the same host considerably. My naïve benchmark shows about 15 to 20% performance improvements for short-lived connections to PyPI. In real-life applications with keep-alive, the speed-up will be a bit smaller. Cory expects that requests is going to be about 5% faster for subsequent requests. https://vincent.bernat.im/en/blog/2011-ssl-session-reuse-rfc5077.html has more information on the topic. Why is session handling different on the client side? OpenSSL does not re-use sessions on the client side automatically. To use session resumption, an SSL_SESSION must be copied from an established SSLSocket to a new SSLSocket before the handshake. OpenSSL has further restrictions, e.g. both sockets must use the same SSLContext. Sessions cannot be shared between SSLContexts. My patch takes care of these details. The basic features are pretty much done and tested. But I won't be able to write all documentation by the end of the week or to write a high-level mechanism to auto-reuse sessions. I'd still like to get the feature in before Monday. What do you think? Are you fine with a low-level session feature and reduced documentation for the beta release?
Christian From p.f.moore at gmail.com Mon Sep 5 16:08:46 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 5 Sep 2016 21:08:46 +0100 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: <407d23cb-e7dc-4ea5-fdcd-dacb769068e6@python.org> References: <547a4a73-b696-26e3-d55f-04aca1fb4c7f@python.org> <407d23cb-e7dc-4ea5-fdcd-dacb769068e6@python.org> Message-ID: On 5 September 2016 at 20:30, Steve Dower wrote: > The only case we can reasonably handle at the raw layer is "n / 4" is zero > but n != 0, in which case we can read and cache up to 4 bytes (one wchar_t) > and then return those in future calls. If we try to cache any more than that > we're substituting for buffered reader, which I don't want to do. > > Does caching up to one (Unicode) character at a time sound reasonable? I > think that won't be much trouble, since there's no interference between > system calls in that case and it will be consistent with POSIX behaviour. Caching a single character sounds perfectly OK. As I noted previously, my use case probably won't need to work at the raw level anyway, so I no longer expect to have code that will break, but I think that a 1-character buffer ensuring that we avoid surprises for code that was written for POSIX is a good trade-off. Paul From nad at python.org Mon Sep 5 16:15:24 2016 From: nad at python.org (Ned Deily) Date: Mon, 5 Sep 2016 13:15:24 -0700 Subject: [Python-Dev] TLS handshake performance boost In-Reply-To: References: Message-ID: On 9/5/16 12:57, Christian Heimes wrote: > I have yet another patch for the ssl module, > http://bugs.python.org/issue19500 . The patch adds support for SSL > session resumption on the client side. [...] > > My patch takes care of these details. The basic features are pretty much > done and tested. But I won't be able to write all documentation by the > end of the week or to write a high-level mechanism to auto-reuse > sessions. I still like to get the feature in before Monday. 
> > What do you think? Are you fine with low-level session feature and > reduced documentation for the beta release? Unless there are other objections, I'm willing to make an exception if you can get the current patch reviewed by one of the usual suspects from the Security Sig, the missing pieces are in before b2, and the API of the checked-in pieces doesn't change after b1. From p.f.moore at gmail.com Mon Sep 5 16:19:46 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 5 Sep 2016 21:19:46 +0100 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: References: Message-ID: On 5 September 2016 at 20:34, eryk sun wrote: > Paul, do you have example code that uses the 'raw' stream? Using the > buffer should behave as it always has -- at least in this regard. > sys.stdin.buffer requests a large block, such as 8 KB. But since the > console defaults to a cooked mode (i.e. processed input and line input > -- control keys, command-line editing, input history, and aliases), > ReadConsole returns when enter is pressed or when interrupted. It > returns at least '\r\n', unless interrupted by Ctrl+C, Ctrl+Break or a > custom CtrlWakeup key. However, if line-input mode is disabled, > ReadConsole returns as soon as one or more characters is available in > the input buffer. The code I'm looking at doesn't use the raw stream (I think). The problem I had (and the reason I was concerned) is that the code does some rather messy things, and without tracing back through the full code path, I'm not 100% sure *what* level of stream it's using. However, now that I know that the buffered layer won't ever error because 1 byte isn't enough to return a full character, if I need to change the code I can do so by switching to the buffered layer and fixing the issue that way (although with Steve's new proposal even that won't be necessary). 
> As to kbhit() returning true, this does not mean that read(1) from > console input won't block (not unless line-input mode is disabled). It > does mean that getwch() won't block (note the "w" in there; this one > reads Unicode characters).The CRT's conio functions (e.g. kbhit, > getwch) put the console input buffer in a raw mode (e.g. ^C is read as > '\x03' instead of generating a CTRL_C_EVENT) and call the lower-level > functions PeekConsoleInputW (kbhit) and ReadConsoleInputW (getwch), to > peek at and read input event records. I understand. The code I'm working on was originally written for pure POSIX, with all the termios calls to set the console into unbuffered mode. In addition, it was until recently using the Python 2 text model, and so there's a lot of places in the code where it's still confused about whether it's processing bytes or characters (we've got rid of a *lot* of "let's decode and see if that helps" calls...). At the moment, kbhit(), while not correct, is "good enough". When I get the time, and we get to a point where it's enough of a priority, I may well look at refactoring this stuff to use proper Windows calls via ctypes to do "read what's available". But that's a way off yet. Thanks for the information, though, I'll keep it in mind when we do get to a point where we're looking at this. Paul From eryksun at gmail.com Mon Sep 5 17:40:32 2016 From: eryksun at gmail.com (eryk sun) Date: Mon, 5 Sep 2016 21:40:32 +0000 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: <5cfa8c73-14b7-8795-06c5-3266940da4b1@python.org> References: <5cfa8c73-14b7-8795-06c5-3266940da4b1@python.org> Message-ID: On Mon, Sep 5, 2016 at 7:54 PM, Steve Dower wrote: > On 05Sep2016 1234, eryk sun wrote: >> >> Also, the console is UCS-2, which can't be transcoded between UTF-16 >> and UTF-8. Supporting UCS-2 in the console would integrate nicely with >> the filesystem PEP. 
It makes it always possible to print >> os.listdir('.'), copy and paste, and read it back without data loss. > Supporting UTF-8 actually works better for this. We already use > surrogatepass explicitly (on the filesystem side, with PEP 529) and > implicitly (on the console side, using the Windows conversion API). CP_UTF8 requires valid UTF-16 text. MultiByteToWideChar and WideCharToMultiByte are of no practical use here. For example: >>> raw_read = sys.stdin.buffer.raw.read >>> _ = write_console_input('\ud800\ud800\r\n'); raw_read(16) �� b'\xef\xbf\xbd\xef\xbf\xbd\r\n' This requires Python's "surrogatepass" error handler. It's also required to decode UTF-8 that's potentially WTF-8 from os.listdir(b'.'). Coming from the wild, there's a chance that arbitrary bytes have invalid sequences other than lone surrogates, so it needs to fall back on "replace" to deal with errors that "surrogatepass" doesn't handle. > Writing a partial character is easily avoidable by the user. We can either > fail with an error or print garbage, and currently printing garbage is the > most compatible behaviour. (Also occurs on Linux - I have a VM running this > week for testing this stuff.) Are you sure about that? The internal screen buffer of a Linux terminal is bytes; it doesn't transcode to a wide-character format. In the Unix world, almost everything is "get a byte, get a byte, get a byte, byte, byte". Here's what I see in Ubuntu using GNOME Terminal, for example: >>> raw_write = sys.stdout.buffer.raw.write >>> b = 'αβψδε\n'.encode() >>> b b'\xce\xb1\xce\xb2\xcf\x88\xce\xb4\xce\xb5\n' >>> for c in b: _ = raw_write(bytes([c])) ... αβψδε Here it is on Windows with your patch: >>> raw_write = sys.stdout.buffer.raw.write >>> b = 'αβψδε\n'.encode() >>> b b'\xce\xb1\xce\xb2\xcf\x88\xce\xb4\xce\xb5\n' >>> for c in b: _ = raw_write(bytes([c])) ... ����������
For the write case this can be addressed by identifying an incomplete sequence at the tail end and either buffering it as 'written' or rejecting it for the user/buffer to try again with the complete sequence. I think rejection isn't a good option when the incomplete sequence starts at index 0. That should be buffered. I prefer buffering in all cases. >> It would probably be simpler to use UTF-16 in the main pipeline and >> implement Martin's suggestion to mix in a UTF-8 buffer. The UTF-16 >> buffer could be renamed as "wbuffer", for expert use. However, if >> you're fully committed to transcoding in the raw layer, I'm certain >> that these problems can be addressed with small buffers and using >> Python's codec machinery for a flexible mix of "surrogatepass" and >> "replace" error handling. > > I don't think it actually makes things simpler. Having two buffers is > generally a bad idea unless they are perfectly synced, which would be > impossible here without data corruption (if you read half a utf-8 character > sequence and then read the wide buffer, do you get that character or not?). Martin's idea, as I understand it, is a UTF-8 buffer that reads from and writes to the text wrapper. It necessarily consumes at least one character and buffers it to allow reading per byte. Likewise for writing, it buffers bytes until it can write a character to the text wrapper. ISTM, it has to look for incomplete lead-continuation byte sequences at the tail end, to hold them until the sequence is complete, at which time it either decodes to a valid character or the U+FFFD replacement character. Also, I found that read(n) has to read a character at a time. That's the only way to emulate line-input mode to detect "\n" and stop reading. 
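[Editor's note: the "hold the incomplete tail until the sequence completes" behaviour described above is exactly what Python's incremental codec machinery already provides, so a write-side buffer could lean on it. A minimal illustration:]

```python
import codecs

# An incremental UTF-8 decoder buffers an incomplete trailing sequence
# internally and only emits whole characters -- the buffering behaviour
# proposed above for the console writer.
dec = codecs.getincrementaldecoder("utf-8")("replace")

pieces = [dec.decode(bytes([b])) for b in "αβψδε\n".encode()]
# A lead byte produces '', and the full character appears once its
# continuation byte arrives:
print("".join(pieces))          # αβψδε

# An incomplete sequence at the tail is held back...
assert dec.decode(b"\xce") == ""
# ...and becomes U+FFFD only if the stream ends mid-character.
assert dec.decode(b"", final=True) == "\ufffd"
```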
Technically this is implemented in a RawIOBase, which dictates that operations should use a single system call, but since it's interfacing with a text wrapper around a buffer around the actual UCS-2 raw console stream, any notion of a 'system call' would be a sham. Because of the UTF-8 buffering there is a synchronization issue, but it has character granularity. For example, when decoding UTF-8, you don't get half of a surrogate pair. You decode the full character, and write that as a discrete unit to the text wrapper. I'd have to experiment to see how bad this can get. If it's too confusing the idea isn't practical. On the plus side, when working with text it's all native UCS-2 up to the TextIOWrapper, so it's as efficient as possible, and as simple as possible. You don't have to worry about transcoding and dealing with partial surrogate pairs and partial UTF-8 sequences. All of that complexity is exported to the pure-Python UTF-8 buffer mixin, but it's not as bad there either because the interface is Text <=> WTF-8 instead of UCS-2 <=> WTF-8, and you don't have to worry about limiting yourself to a single read or write. But that's detrimental for anyone using the buffer's raw stream with the presumption that it does only make one system call that's thread safe. From steve.dower at python.org Mon Sep 5 17:45:13 2016 From: steve.dower at python.org (Steve Dower) Date: Mon, 5 Sep 2016 14:45:13 -0700 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: References: <547a4a73-b696-26e3-d55f-04aca1fb4c7f@python.org> <407d23cb-e7dc-4ea5-fdcd-dacb769068e6@python.org> Message-ID: <219890bf-c7a8-69d3-b952-0159cbfbf460@python.org> On 05Sep2016 1308, Paul Moore wrote: > On 5 September 2016 at 20:30, Steve Dower wrote: >> The only case we can reasonably handle at the raw layer is "n / 4" is zero >> but n != 0, in which case we can read and cache up to 4 bytes (one wchar_t) >> and then return those in future calls. 
If we try to cache any more than that >> we're substituting for buffered reader, which I don't want to do. >> >> Does caching up to one (Unicode) character at a time sound reasonable? I >> think that won't be much trouble, since there's no interference between >> system calls in that case and it will be consistent with POSIX behaviour. > > Caching a single character sounds perfectly OK. As I noted previously, > my use case probably won't need to work at the raw level anyway, so I > no longer expect to have code that will break, but I think that a > 1-character buffer ensuring that we avoid surprises for code that was > written for POSIX is a good trade-off. So it works, though the behaviour is a little strange when you do it from the interactive prompt: >>> sys.stdin.buffer.raw.read(1) ɒprint('hi') b'\xc9' >>> hi >>> sys.stdin.buffer.raw.read(1) b'\x92' >>> What happens here is the raw.read(1) rounds one byte up to one character, reads the turned alpha, returns a single byte of the two-byte encoded form and caches the second byte. Then interactive mode reads from stdin and gets the rest of the characters, starting from the print() and executes that. Finally the next call to raw.read(1) returns the cached second byte of the turned alpha. This is basically only a problem because the readline implementation is totally separate from the stdin object and doesn't know about the small cache (and for now, I think it's going to stay that way - merging readline and stdin would be great, but is a fairly significant task that won't make 3.6 at this stage). I feel like this is an acceptable edge case, as it will only show up when interleaving calls to raw.read(n < 4) with multibyte characters and input()/interactive prompts. We've taken the 99% compatible to 99.99% compatible, and I feel like going any further is practically certain to introduce bugs (I'm being very careful with the single character buffering, but even that feels risky).
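[Editor's note: the one-character cache Steve describes can be sketched in pure Python. The class and names below are illustrative only; the real implementation lives in C in the `_io` module and reads wide characters via ReadConsoleW.]

```python
import io

class OneCharCachingRaw(io.RawIOBase):
    """Sketch of the proposed raw reader: fetch one *character* from a
    wide-character source (standing in for ReadConsoleW), hand back as
    many bytes as asked for, and cache the undelivered UTF-8 tail."""

    def __init__(self, wide_source):
        self._source = wide_source       # text stream yielding str
        self._cache = b""

    def readable(self):
        return True

    def read(self, n=-1):
        if not self._cache:
            ch = self._source.read(1)    # one character per "system call"
            self._cache = ch.encode("utf-8")
        if n is None or n < 0:
            n = len(self._cache)
        result, self._cache = self._cache[:n], self._cache[n:]
        return result

raw = OneCharCachingRaw(io.StringIO("ɒp"))
print(raw.read(1))   # b'\xc9'  -- first byte of U+0252; second byte cached
print(raw.read(1))   # b'\x92'  -- the cached second byte
print(raw.read(1))   # b'p'
```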
Hopefully others agree with my risk assessment here, but speak up if you think it's worthwhile trying to deal with this final case. Cheers, Steve From greg.ewing at canterbury.ac.nz Mon Sep 5 18:16:37 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 06 Sep 2016 10:16:37 +1200 Subject: [Python-Dev] Do PEP 526 type declarations define the types of variables or not? In-Reply-To: <57CD8E99.8090205@hotpy.org> References: <57CD8E99.8090205@hotpy.org> Message-ID: <57CDEEC5.9030305@canterbury.ac.nz> Mark Shannon wrote: > Unless of course, others may have a different idea of what the "type of > a variable" means. > To me, it means it means that for all assignments `var = expr` > the type of `expr` must be a subtype of the variable, > and for all uses of var, the type of the use is the same as the type of > the variable. I think it means that, at any given point in time, the value of the variable is of the type of the variable or some subtype thereof. That interpretation leaves the type checker free to make more precise inferences if it can. For example, in... > def foo()->int: > x:Optional[int] = bar() > if x is None: > return -1 > return x ...the type checker could notice that, on the branch containing 'return x', the value of x must be of type int, so the code is okay. -- Greg From pludemann at google.com Mon Sep 5 18:42:42 2016 From: pludemann at google.com (Peter Ludemann) Date: Mon, 5 Sep 2016 15:42:42 -0700 Subject: [Python-Dev] Do PEP 526 type declarations define the types of variables or not? In-Reply-To: <57CDEEC5.9030305@canterbury.ac.nz> References: <57CD8E99.8090205@hotpy.org> <57CDEEC5.9030305@canterbury.ac.nz> Message-ID: I would take the opposite approach from Greg Ewing, namely that the annotation is not a permission of values but a starting point for the type inferencer; and the type checker/inferencer can complain if there's an inconsistency (for some definition of "inconsistency", which is not defined in the PEP). 
In most cases, this distinction doesn't matter, but it does affect what kinds of errors or warnings are generated. But ... perhaps people are overthinking these things? If we go back to the example without variable annotation: def bar()->Optional[int]: ... def foo(): x = bar() if x is None: return -1 return x then a straightforward flow-tracing type inferencer can *infer* all the annotations in foo: def foo() -> int: # *not* Optional[int] - see below x:Optional[int] = bar() # derived from definition of bar if x is None: # consistent with x:Optional[int] return -1 # implies return type of foo return x # implies return type of foo as Union[int, None] minus None, that is: int That is, the type annotations add no information in this example, but might be useful to a human. Perhaps they wouldn't show in the source code at all, but would instead be put into a database, for use by development tools - for example, Kythe -flavored tools, where the type data (and other usage information) are used for code search, editing, refactoring, etc. (Or the type information could be kept in a .pyi stub file, with an automated "merge" tool putting them into the .py file as desired.) On the other hand, a non-flow-tracing inferencer would derive 'def foo() -> Optional[int]' ... it would be a *design choice* of the type checker/inferencer as to whether that's an error, a warning, or silently allowed ... I can see arguments for all of these choices. In most cases, there's seldom any need for the programmer to add annotations to local variables. Global variables and class/instance attributes, however, can benefit from annotation. (As to my credentials, which some people seem to crave: I worked on an earlier version of Google's Python type inferencer (*pytype*) and I'm currently working on *pykythe *(to be open-sourced), which takes the function-level information and propagates it to the local variables, then adds that information (together with call graph information) to a Kythe database.) 
On 5 September 2016 at 15:16, Greg Ewing wrote: > Mark Shannon wrote: > > Unless of course, others may have a different idea of what the "type of a >> variable" means. >> To me, it means it means that for all assignments `var = expr` >> the type of `expr` must be a subtype of the variable, >> and for all uses of var, the type of the use is the same as the type of >> the variable. >> > > I think it means that, at any given point in time, the > value of the variable is of the type of the variable or > some subtype thereof. That interpretation leaves the > type checker free to make more precise inferences if > it can. For example, in... > > def foo()->int: >> x:Optional[int] = bar() >> if x is None: >> return -1 >> return x >> > > ...the type checker could notice that, on the branch > containing 'return x', the value of x must be of type > int, so the code is okay. > > -- > Greg > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/pludemann > %40google.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Mon Sep 5 18:49:45 2016 From: srkunze at mail.de (Sven R. Kunze) Date: Tue, 6 Sep 2016 00:49:45 +0200 Subject: [Python-Dev] Do PEP 526 type declarations define the types of variables or not? In-Reply-To: <57CD8E99.8090205@hotpy.org> References: <57CD8E99.8090205@hotpy.org> Message-ID: Didn't Koos say this works more like an expression annotation? IMO, the type of the expression is what is specified but the type of the variable can change over time (as you demonstrated). Sven PS: thinking this way, the new syntax is actually confusing as it annotates the variable not the expression. 
:-/ On 05.09.2016 17:26, Mark Shannon wrote: > Hi, > > PEP 526 states that "This PEP aims at adding syntax to Python for > annotating the types of variables" and Guido seems quite insistent > that the declarations are for the types of variables. > > However, I get the impression that most (all) of the authors and > proponents of PEP 526 are quite keen to emphasise that the PEP in no > way limits type checkers from doing what they want. > > This is rather contradictory. The behaviour of a typechecker is > defined by the typesystem that it implements. Whether a type > annotation determines the type of a variable or an expression alters > changes what typesystems are feasible. So, stating that annotations > define the type of variables *does* limit what a typechecker can or > cannot do. > > Unless of course, others may have a different idea of what the "type > of a variable" means. > To me, it means it means that for all assignments `var = expr` > the type of `expr` must be a subtype of the variable, > and for all uses of var, the type of the use is the same as the type > of the variable. > > In this example: > > def bar()->Optional[int]: ... > > def foo()->int: > x:Optional[int] = bar() > if x is None: > return -1 > return x > > According to PEP 526 the annotation `x:Optional[int]` > means that the *variable* `x` has the type `Optional[int]`. > So what is the type of `x` in `return x`? > If it is `Optional[int]`, then a type checker is obliged to reject > this code. If it is `int` then what does "type of a variable" actually > mean, > and why aren't the other uses of `x` int as well? > > Cheers, > Mark. 
> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/srkunze%40mail.de From k7hoven at gmail.com Mon Sep 5 19:40:28 2016 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Tue, 6 Sep 2016 02:40:28 +0300 Subject: [Python-Dev] Do PEP 526 type declarations define the types of variables or not? In-Reply-To: References: <57CD8E99.8090205@hotpy.org> Message-ID: On Tue, Sep 6, 2016 at 1:49 AM, Sven R. Kunze wrote: > Didn't Koos say this works more like an expression annotation? > > IMO, the type of the expression is what is specified but the type of the > variable can change over time (as you demonstrated). That's exactly the kind of semantics I'm describing in the python-ideas thread. An that's exactly how Python works: the type of a variable can change every time you assign a value to it (but not in between, unless you're doing funny stuff). So in a sense you annotate the *value* by annotating the variable at the point in the function where the value is assigned to it. There are open questions in this approach of course. But if you're interested, don't hesitate to discuss or ask questions in the python-ideas thread. I won't answer before I wake up, though ;). -- Koos > > Sven > > > PS: thinking this way, the new syntax is actually confusing as it annotates > the variable not the expression. :-/ > > > > On 05.09.2016 17:26, Mark Shannon wrote: >> >> Hi, >> >> PEP 526 states that "This PEP aims at adding syntax to Python for >> annotating the types of variables" and Guido seems quite insistent that the >> declarations are for the types of variables. >> >> However, I get the impression that most (all) of the authors and >> proponents of PEP 526 are quite keen to emphasise that the PEP in no way >> limits type checkers from doing what they want. >> >> This is rather contradictory. 
The behaviour of a typechecker is defined by >> the typesystem that it implements. Whether a type annotation determines the >> type of a variable or an expression alters changes what typesystems are >> feasible. So, stating that annotations define the type of variables *does* >> limit what a typechecker can or cannot do. >> >> Unless of course, others may have a different idea of what the "type of a >> variable" means. >> To me, it means it means that for all assignments `var = expr` >> the type of `expr` must be a subtype of the variable, >> and for all uses of var, the type of the use is the same as the type of >> the variable. >> >> In this example: >> >> def bar()->Optional[int]: ... >> >> def foo()->int: >> x:Optional[int] = bar() >> if x is None: >> return -1 >> return x >> >> According to PEP 526 the annotation `x:Optional[int]` >> means that the *variable* `x` has the type `Optional[int]`. >> So what is the type of `x` in `return x`? >> If it is `Optional[int]`, then a type checker is obliged to reject this >> code. If it is `int` then what does "type of a variable" actually mean, >> and why aren't the other uses of `x` int as well? >> >> Cheers, >> Mark. >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/srkunze%40mail.de > > > -- + Koos Zevenhoven + http://twitter.com/k7hoven + From jcgoble3 at gmail.com Mon Sep 5 19:45:06 2016 From: jcgoble3 at gmail.com (Jonathan Goble) Date: Mon, 5 Sep 2016 19:45:06 -0400 Subject: [Python-Dev] Where are the list and array.array implementations in CPython source? Message-ID: I'd like to study the CPython implementations of lists and array.array instances for a personal project of mine, but I'm very unfamiliar with the Python source code as it pertains to internals like this.
Which files would I need to look at to do this, and are there a few particular functions/structures I should pay attention to? I'm just looking for a brief pointer in the right direction here, not a full explanation of how it works -- I'll get that from studying the source code. :-) (Sorry if this isn't the best place to post, but I felt that a question about CPython's internals fit slightly better on python-dev rather than python-list, since this is where those familiar with that code are more likely to see the post.) From vadmium+py at gmail.com Mon Sep 5 20:10:26 2016 From: vadmium+py at gmail.com (Martin Panter) Date: Tue, 6 Sep 2016 00:10:26 +0000 Subject: [Python-Dev] Where are the list and array.array implementations in CPython source? In-Reply-To: References: Message-ID: On 5 September 2016 at 23:45, Jonathan Goble wrote: > I'd like to study the CPython implementations of lists and array.array > instances for a personal project of mine, but I've very unfamiliar > with the Python source code as it pertains to internals like this. > Which files would I need to look at to do this, Built-in objects are usually in the Objects/ directory, with a corresponding include file in the Include/ directory: https://hg.python.org/cpython/file/default/Objects/listobject.c https://hg.python.org/cpython/file/default/Include/listobject.h Modules implemented in C are usually in the Modules/ directory: https://hg.python.org/cpython/file/default/Modules/arraymodule.c > and are there a few > particular functions/structures I should pay attention to? I'm just > looking for a brief pointer in the right direction here, not a full > explanation of how it works -- I'll get that from studying the source > code. :-) From benjamin at python.org Mon Sep 5 20:10:34 2016 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 05 Sep 2016 17:10:34 -0700 Subject: [Python-Dev] Where are the list and array.array implementations in CPython source? 
In-Reply-To: References: Message-ID: <1473120634.589679.716603009.212461AF@webmail.messagingengine.com> Include/listobject.h Objects/listobject.c Modules/arraymodule.c On Mon, Sep 5, 2016, at 16:45, Jonathan Goble wrote: > I'd like to study the CPython implementations of lists and array.array > instances for a personal project of mine, but I've very unfamiliar > with the Python source code as it pertains to internals like this. > Which files would I need to look at to do this, and are there a few > particular functions/structures I should pay attention to? I'm just > looking for a brief pointer in the right direction here, not a full > explanation of how it works -- I'll get that from studying the source > code. :-) > > (Sorry if this isn't the best place to post, but I felt that a > question about CPython's internals fit slightly better on python-dev > rather than python-list, since this is where those familiar with that > code are more likely to see the post.) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/benjamin%40python.org From eryksun at gmail.com Mon Sep 5 20:26:41 2016 From: eryksun at gmail.com (eryk sun) Date: Tue, 6 Sep 2016 00:26:41 +0000 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: <219890bf-c7a8-69d3-b952-0159cbfbf460@python.org> References: <547a4a73-b696-26e3-d55f-04aca1fb4c7f@python.org> <407d23cb-e7dc-4ea5-fdcd-dacb769068e6@python.org> <219890bf-c7a8-69d3-b952-0159cbfbf460@python.org> Message-ID: On Mon, Sep 5, 2016 at 9:45 PM, Steve Dower wrote: > > So it works, though the behaviour is a little strange when you do it from > the interactive prompt: > >>>> sys.stdin.buffer.raw.read(1) > ?print('hi') > b'\xc9' >>>> hi >>>> sys.stdin.buffer.raw.read(1) > b'\x92' >>>> > > What happens here is the raw.read(1) rounds one byte up to one 
character, > reads the turned alpha, returns a single byte of the two byte encoded form > and caches the second byte. Then interactive mode reads from stdin and gets > the rest of the characters, starting from the print() and executes that. > Finally the next call to raw.read(1) returns the cached second byte of the > turned alpha. > > This is basically only a problem because the readline implementation is > totally separate from the stdin object and doesn't know about the small > cache (and for now, I think it's going to stay that way - merging readline > and stdin would be great, but is a fairly significant task that won't make > 3.6 at this stage). It needs to read a minimum of 2 codes in case the first character is a lead surrogate. It can use a length 2 WCHAR buffer and remember how many bytes have been written (for the general case -- not specifically for this case). Example failure using your 3rd patch: >>> _ = write_console_input("\U00010000print('hi')\r\n");\ ... raw_read(1) ?print('hi') b'\xef' >>> File "", line 1 ?print('hi') ^ SyntaxError: invalid character in identifier >>> raw_read(1) b'\xbf' >>> raw_read(1) b'\xbd' The raw read captures the first surrogate code, and transcodes it as the replacement character b'\xef\xbf\xbd' (U+FFFD). Then PyOS_Readline captures the 2nd surrogate and decodes it as the replacement character. In the general case in which a lead surrogate is the last code read, but not at index 0, it can use the internal buffer to save the code for the next call. Surrogates that aren't in valid pairs should be allowed to pass through via surrogatepass. This aims for consistency with the filesystem encoding PEP. From jcgoble3 at gmail.com Mon Sep 5 22:16:39 2016 From: jcgoble3 at gmail.com (Jonathan Goble) Date: Mon, 5 Sep 2016 22:16:39 -0400 Subject: [Python-Dev] Where are the list and array.array implementations in CPython source? 
In-Reply-To: <1473120634.589679.716603009.212461AF@webmail.messagingengine.com> References: <1473120634.589679.716603009.212461AF@webmail.messagingengine.com> Message-ID: On Mon, Sep 5, 2016 at 8:10 PM, Martin Panter wrote: > Built-in objects are usually in the Objects/ directory, with a > corresponding include file in the Include/ directory: > https://hg.python.org/cpython/file/default/Objects/listobject.c > https://hg.python.org/cpython/file/default/Include/listobject.h > > Modules implemented in C are usually in the Modules/ directory: > https://hg.python.org/cpython/file/default/Modules/arraymodule.c On Mon, Sep 5, 2016 at 8:10 PM, Benjamin Peterson wrote: > Include/listobject.h > Objects/listobject.c > Modules/arraymodule.c Thanks to both of you. I'll start looking at those soon. :) From ncoghlan at gmail.com Mon Sep 5 23:46:29 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 6 Sep 2016 13:46:29 +1000 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> <57CDAA37.5040207@stoneleaf.us> Message-ID: On 6 September 2016 at 04:15, Guido van Rossum wrote: > On Mon, Sep 5, 2016 at 10:24 AM, Ethan Furman wrote: >> On 09/05/2016 06:46 AM, Nick Coghlan wrote: >> >> [an easy to understand explanation for those of us who aren't type-inferring >> gurus] >> >> Thanks, Nick. I think I finally have a grip on what Mark was talking about, >> and about how these things should work. >> >> Much appreciated! > > There must be some misunderstanding. The message from Nick with that > timestamp (https://mail.python.org/pipermail/python-dev/2016-September/146200.html) > hinges on an incorrect understanding of the intention of annotations > without value (e.g. `x: Optional[int]`), leading to a -1 on the PEP. Short version of below: after sleeping on it, I'd be OK with the PEP again if it just *added* the explicit type assertions, such that the shorthand notation could be described in those terms. 
Specifically, "x: T = expr" would be syntactic sugar for: x = expr assert x: T While the bare "x: T" would be syntactic sugar for: assert all(x): T which in turn would imply that all future bindings of that assignment target should be accompanied by a type assertion (and typecheckers may differ in how they define "all future bindings"). Even if everyone always writes the short forms, the explicit assertions become a useful aid in explaining what those short forms mean. The main exploratory question pushed back to the typechecking community to answer by 3.7 would then be to resolve precisely what "assert all(TARGET): ANNOTATION" means for different kinds of target and for different scopes (e.g. constraining nonlocal name rebindings in closures, constraining attribute rebinding in modules, classes, and instances). > I can't tell if this is an honest misunderstanding or a strawman, but > I want to set the intention straight. I'm pretty sure I understand your intentions (and broadly agree with them), I just also agree with Mark that people are going to need some pretty strong hints that these are not Java/C/C++/C# style type declarations, and am suggesting a different way of getting there by being more prescriptive about your intended semantics. Specifically: * for 3.6, push everything into a new form of assert statement and define those assertions as syntactic sugar for PEP 484 constructs * for 3.7 (and provisionally in 3.6), consider blessing some of those assertions with the bare annotation syntax Folks are already comfortable with the notion of assertions not necessarily being executed at runtime, and they're also comfortable with them as a way of doing embedded correctness testing inline with the code. 
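[Editor's note: the proposed `assert x: T` spelling isn't valid syntax today, but its runtime flavour can be approximated with an ordinary helper function. The helper name and the `typing` introspection calls below are this editor's illustration, not anything from the PEP or PEP 484; `get_origin`/`get_args` require Python 3.8+.]

```python
from typing import Optional, Union, get_args, get_origin

def assert_value_type(value, tp):
    """Rough runtime analogue of the proposed ``assert value: tp``."""
    if get_origin(tp) is Union:          # Optional[int] == Union[int, None]
        assert isinstance(value, get_args(tp)), (value, tp)
    else:
        assert isinstance(value, tp), (value, tp)
    return value

# "x: Optional[int] = expr" read as sugar for an assignment plus a
# type assertion:
x = assert_value_type(42, Optional[int])
print(x)   # 42
```

Like ordinary assertions, such checks could be compiled away, which matches the "not necessarily executed at runtime" framing above.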
> First of all, the PEP does not require the type checker to interpret
> anything in a particular way; it intentionally shies away from
> prescribing semantics (other than the runtime semantics of updating
> __annotations__ or verifying that the target appears assignable).

Unfortunately, the ordering problem the PEP introduces means it pushes
very heavily in a particular direction, such that I think we're going
to be better off if you actually specify draft semantics in the PEP
(in terms of existing PEP 484 annotations), rather than leaving it
completely open to interpretation. It's still provisional, so you can
change your mind later, but the notion of describing a not-yet-bound
name is novel enough that I think more guidance (even if it's
provisional) is needed here than was needed in the case of function
annotations.

(I realise you already understand most of the background I go through
below - I'm spelling out my reasoning so you can hopefully figure out
where I'm diverging from your point of view)

If we look at PEP 484, all uses of annotations exist between two
pieces of code: one that produces a value, and one that binds the
value to a reference.

As such, they act as type assertions:

- on parameters, they assert "I am expecting this argument to be of this type"
- on assignments, they assert "I am expecting this initialiser to be of this type"

Typecheckers can then use those assertions in two ways: as a
constraint on the value producer, and as a more precise hint if type
inference either isn't possible (e.g. function parameters,
initialisation to None), or gives an overly broad answer (e.g. empty
containers).

The "x: T = expr" syntax is entirely conformant with that system - all
it does is change the spelling of the existing type hint comments.
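(As a concrete illustration of that spelling change - a sketch, not
taken from the PEP text - the two forms below express the same hint:)

```python
from typing import List

# PEP 484 type-comment spelling (works on any Python version):
names = []  # type: List[str]

# PEP 526 spelling of the same kind of hint (Python 3.6+):
ages: List[int] = []

names.append("Guido")
ages.append(60)
print(names, ages)  # ['Guido'] [60]
```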
Allowing "assert x: T" would permit that existing kind of type
assertion to be inserted at arbitrary points in the code without
otherwise affecting control flow or type inference, as if you had
written:

    # PEP 484
    def is_T(arg: T) -> None:
        pass

    is_T(x)

Or:

    # PEP 526
    x: T = x

By contrast, bare annotations on new assignment targets without an
initialiser can't be interpreted that way, as there is no *preceding
value to constrain*.

That inability to interpret them in the same sense as existing
annotations means that there's really only one plausible way to
interpret them if a typechecker is going to help ensure that the type
assertion is actually true in a given codebase: as a constraint on
*future* bindings to that particular target.

Typecheckers may differ in how they enforce that constraint, and how
the declared constraint influences the type inference process, but
that "explicit declaration of implicit future type assertions" is core
to the notion of bare variable annotations making any sense at all.

That's a genuinely new concept to introduce into the language, and the
PEP quite clearly intends bare annotations to be used that way given
its discussion of class invariants and the distinction between
instance variables with a class-level default and class variables that
shouldn't be shadowed on instances.

> But there appears considerable fear about what expectations the PEP
> has of a reasonable type checker. In response to this I'll try to
> sketch how I think this should be implemented in mypy.
>
> There are actually at least two separate cases: if x is a local
> variable, the intention of `x: <type>` is quite different from when x
> occurs in a class.

This is where I think the "assert all(x): T" notation is useful, as it
changes that core semantic question to "What does 'all' mean for a
type assertion?"

Based on your stated intentions for mypy, it provisionally means:

* for a local variable, "all future bindings in the current scope".
* for a class or module variable, "all future bindings in the current
scope, and all future bindings via attribute access".

Both initialised and bare variable annotations can then be defined as
syntactic sugar for explicit type assertions:

    # Initialised annotation
    x: T = expr

    x = expr
    assert x: T  # Equivalent type assertion

    # Bare annotation
    x: T
    x = expr

    assert all(x): T  # Equivalent type assertion
    x = expr
    assert x: T  # Assertion implied by all(x) above

(A full expansion would also show setting __annotations__, but that's
not my main concern here)

> I am at a loss how to modify the PEP to avoid this misunderstanding,
> since it appears it is entirely in the reader's mind. The PEP is not a
> tutorial but a spec for the implementation, and as a spec it is quite
> clear that it leaves the type-checking semantics up to individual type
> checkers. And I think that is the right thing to do -- in practice
> there are many other ways to write the above example, and mypy will
> understand some of them, but not others, while other type checkers may
> understand a different subset of examples. I can't possibly prescribe
> how type checkers should behave in each case -- I can't even tell
> which cases are important to distinguish.

Providing an easier path to decomposing the new syntax into
pre-existing PEP 484 semantics would definitely help me, and I suspect
it would help other folks as well.
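(One such decomposition already works today, with no new syntax at all:
a no-op helper function whose parameter annotation turns a call into an
inline type assertion that a PEP 484 checker verifies. A sketch - the
helper name is purely illustrative:)

```python
from typing import List

# No-op at runtime; a PEP 484 checker verifies the argument against
# the parameter annotation, so calling it acts as a type assertion.
def _conforms_to_list_of_int(x: List[int]) -> None:
    pass

data: List[int] = [1, 2, 3]
_conforms_to_list_of_int(data)   # checker-only "assert"; no runtime effect
print(data)  # [1, 2, 3]
```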
Recapping:

* Introduce "assert TARGET: ANNOTATION" as a new noop-at-runtime
syntactic primitive that typechecks as semantically equivalent to:

    def _conforms_to_type(x: ANNOTATION) -> None:
        pass
    _conforms_to_type(TARGET)

* Introduce "assert all(TARGET): ANNOTATION" as a way to declaratively
annotate future assignments to a particular target

* Define variable annotations in terms of those two new primitives

* Make it clear that there's currently still room for semantic
variation between typecheckers in defining precisely what "assert
all(TARGET): ANNOTATION" means

> So writing down "the type checker should not report an error in the
> following case" in the PEP is not going to be helpful for anyone (in
> contrast, I think discussing examples on a mailing list *is* useful).

Yeah, I've come around to agreeing with you on that point.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From guido at python.org  Tue Sep  6 00:04:59 2016
From: guido at python.org (Guido van Rossum)
Date: Mon, 5 Sep 2016 21:04:59 -0700
Subject: [Python-Dev] Please reject or postpone PEP 526
In-Reply-To:
References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org>
	<57CDAA37.5040207@stoneleaf.us>
Message-ID:

I'm sorry, but we're not going to invent new syntax this late in the
game. The syntax proposed by the PEP has been on my mind ever since
PEP 484 with very minor variations; I first proposed it seriously on
python-ideas over a month ago, we've been debating the details since
then, and it's got a solid implementation based on those debates by
Ivan Levkivskyi. In contrast, it looks like you just made the "assert
x: T" syntax up last night in response to the worries expressed by
Mark Shannon, and "assert" sounds a lot like a run-time constraint to
me.

Instead, I encourage you to participate in the writing of a separate
PEP explaining how type checkers are expected to work (since PEP 526
doesn't specify that).
Ivan is also interested in such a PEP and we hope Mark will also lend
us his expertise.

On Mon, Sep 5, 2016 at 8:46 PM, Nick Coghlan wrote:
> [...]

--
--Guido van Rossum (python.org/~guido)

From vadmium+py at gmail.com  Tue Sep  6 06:34:01 2016
From: vadmium+py at gmail.com (Martin Panter)
Date: Tue, 6 Sep 2016 10:34:01 +0000
Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8
In-Reply-To:
References: <5cfa8c73-14b7-8795-06c5-3266940da4b1@python.org>
Message-ID:

On 5 September 2016 at 21:40, eryk sun wrote:
> On Mon, Sep 5, 2016 at 7:54 PM, Steve Dower wrote:
>> On 05Sep2016 1234, eryk sun wrote:
>>> It would probably be simpler to use UTF-16 in the main pipeline and
>>> implement Martin's suggestion to mix in a UTF-8 buffer. The UTF-16
>>> buffer could be renamed as "wbuffer", for expert use. However, if
>>> you're fully committed to transcoding in the raw layer, I'm certain
>>> that these problems can be addressed with small buffers and using
>>> Python's codec machinery for a flexible mix of "surrogatepass" and
>>> "replace" error handling.
>>
>> I don't think it actually makes things simpler.
>> Having two buffers is
>> generally a bad idea unless they are perfectly synced, which would be
>> impossible here without data corruption (if you read half a utf-8
>> character sequence and then read the wide buffer, do you get that
>> character or not?).
>
> Martin's idea, as I understand it, is a UTF-8 buffer that reads from
> and writes to the text wrapper.

Yes, that was basically it. Though I had only thought as far as simple
encodings like ASCII, where one byte corresponds to one character. I
wonder if you really need UTF-8 support. Are the encoding values
currently encountered for Windows consoles all single-byte encodings,
or are they more complicated?

> It necessarily consumes at least one
> character and buffers it to allow reading per byte. Likewise for
> writing, it buffers bytes until it can write a character to the text
> wrapper. ISTM, it has to look for incomplete lead-continuation byte
> sequences at the tail end, to hold them until the sequence is
> complete, at which time it either decodes to a valid character or the
> U+FFFD replacement character.

This buffering behaviour would be necessary for multi-byte encodings
like UTF-8.

From k7hoven at gmail.com  Tue Sep  6 08:01:44 2016
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Tue, 6 Sep 2016 15:01:44 +0300
Subject: [Python-Dev] PEP 467: last round (?)
In-Reply-To: <1473035412.2762262.715619409.62F5BAA2@webmail.messagingengine.com>
References: <57C88355.9000302@stoneleaf.us>
	<1472940660.890622.714961169.3425F103@webmail.messagingengine.com>
	<1473035412.2762262.715619409.62F5BAA2@webmail.messagingengine.com>
Message-ID:

On Mon, Sep 5, 2016 at 3:30 AM, Random832 wrote:
> On Sun, Sep 4, 2016, at 16:42, Koos Zevenhoven wrote:
>> On Sun, Sep 4, 2016 at 6:38 PM, Nick Coghlan wrote:
>> >
>> > There are two self-consistent sets of names:
>> >
>> Let me add a few.
>> I wonder if this is really used so much that
>> bytes.chr is too long to type (and you can do bchr = bytes.chr if you
>> want to):
>>
>> bytes.chr (or bchr in builtins)
>> bytes.chr_at, bytearray.chr_at
>
> Ugh, that "at" is too reminiscent of java. And it just feels wrong to
> spell it "chr" rather than "char" when there's a vowel elsewhere in the
> name.

Oh, I didn't realize that connection. It's funny that I get a Java
connotation from get* methods ;).

> Hmm... how offensive to the zen of python would it be to have "magic" to
> allow both bytes.chr(65) and b'ABCDE'.chr[0]? (and possibly also
> iter(b'ABCDE'.chr)? That is, a descriptor which is callable on the
> class, but returns a view on instances?

Indeed quite magical, while I really like how easy it is to remember
this *once you realize what is going on*. I think bytes.char (on class)
and data.chars (on instance) would be quite similar.

-- Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +

From random832 at fastmail.com  Tue Sep  6 09:22:55 2016
From: random832 at fastmail.com (Random832)
Date: Tue, 06 Sep 2016 09:22:55 -0400
Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8
In-Reply-To:
References: <5cfa8c73-14b7-8795-06c5-3266940da4b1@python.org>
Message-ID: <1473168175.1294969.717131833.678CEDF7@webmail.messagingengine.com>

On Tue, Sep 6, 2016, at 06:34, Martin Panter wrote:
> Yes, that was basically it. Though I had only thought as far as simple
> encodings like ASCII, where one byte corresponds to one character. I
> wonder if you really need UTF-8 support. Are the encoding values
> currently encountered for Windows consoles all single-byte encodings
> or are they more complicated?
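(A short sketch of what "more complicated" looks like in practice: with
a multi-byte encoding, an incremental decoder has to buffer an
incomplete sequence instead of failing mid-character. The codec names
below are standard; the byte-at-a-time loop is purely illustrative:)

```python
import codecs

# Feed a UTF-8 byte stream one byte at a time, as a console read loop
# might; the incremental decoder holds back a partial sequence until
# the character is complete.
dec = codecs.getincrementaldecoder("utf-8")()
data = "héllo".encode("utf-8")          # 'é' encodes to two bytes
pieces = [dec.decode(data[i:i + 1]) for i in range(len(data))]
print(pieces)                           # ['h', '', 'é', 'l', 'l', 'o']

# Double-byte console code pages raise the same issue:
print(len("中".encode("cp936")))        # 2
```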
Windows supports Chinese, Japanese, and Korean encodings (code pages
936, 932, 949) that are multi-byte, with one or two bytes per
character. I'm not sure how that affects this, though, or what your
point about not needing UTF-8 is.

From k7hoven at gmail.com  Tue Sep  6 09:35:01 2016
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Tue, 6 Sep 2016 16:35:01 +0300
Subject: [Python-Dev] PEP 467: last round (?)
In-Reply-To:
References: <57C88355.9000302@stoneleaf.us>
	<1472940660.890622.714961169.3425F103@webmail.messagingengine.com>
Message-ID:

On Mon, Sep 5, 2016 at 6:06 AM, Nick Coghlan wrote:
> On 5 September 2016 at 06:42, Koos Zevenhoven wrote:
>> On Sun, Sep 4, 2016 at 6:38 PM, Nick Coghlan wrote:
>>>
>>> There are two self-consistent sets of names:
>>>
>> Let me add a few. I wonder if this is really used so much that
>> bytes.chr is too long to type (and you can do bchr = bytes.chr if you
>> want to)
>>
>> bytes.chr (or bchr in builtins)
>
> The main problem with class method based spellings is that we need to
> either duplicate it on bytearray or else break the bytearray/bytes
> symmetry and propose "bytearray(bytes.chr(x))" as the replacement for
> current cryptic "bytearray([x])"

Warning: some API-design philosophy below:

1. It's not as bad to break symmetry regarding what functionality is
offered for related object types (here: str, bytes, bytearray) as it is
to break symmetry in how the symmetric functionality is provided. IOW,
missing unnecessary functionality is less bad than exposing the
equivalent functionality under a different name. (This might be kind of
how Random832 was reasoning previously)

2. Symmetry is more important in object access functionality than it is
in instance creation. IOW, symmetry regarding 'constructors' (here:
bchr, bytes.chr, bytes.byte, ...) across different types is not as
crucial as symmetry in slicing. The reason is that the caller of a
constructor is likely to know which class it is instantiating.
A consumer of bytes/bytearray/str-like objects often does not know
which type is being dealt with. I might be crying over spilled milk
here, but that seems to be the point of the whole PEP.

That chars view thing might collect some of the milk back into a
bottle:

mystr[whatever] <-> mybytes.chars[whatever] <-> mybytearray.chars[whatever]
iter(mystr) <-> iter(mybytes.chars) <-> iter(mybytearray.chars)

Then introduce 'chars' on str and this becomes

mystring.chars[whatever] <-> mybytes.chars[whatever] <-> mybytearray.chars[whatever]
iter(mystr.chars) <-> iter(mybytes.chars) <-> iter(mybytearray.chars)

If iter(mystr.chars) is recommended and iter(mystr) discouraged, then
after a decade or two, the world may look quite different regarding how
important it is for a str to be iterable. This would solve multiple
problems at once. Well, I admit that "at once" is not really an
accurate description of the process :).

[...]

> You also run into a searchability problem as "chr" will get hits for
> both the chr builtin and bytes.chr, similar to the afalg problem that
> recently came up in another thread. While namespaces are a honking
> great idea, the fact that search is non-hierarchical means they still
> don't give API designers complete freedom to reuse names at will.

Oh, I can kind of see a point here, especially if the search hits
aren't related in any way. Why not just forget all symmetry if this is
an issue? But is it really a bad thing if by searching you find that
there's a chr for both str and bytes?

If I think, "I want to turn my int into a bytes 'character' kind of in
the way that chr turns my int into a str", what am I going to search or
google for? I can't speak for others, but I would probably search for
something that contains 'chr' and 'bytes'. Based on this, I'm unable to
see the search disadvantage of bytes.chr.

[...]
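(For concreteness, a minimal sketch of what such a chars view could
look like - the class name and API here are hypothetical illustrations,
not part of any PEP:)

```python
# Hypothetical "chars" view: indexing and iteration yield length-1
# bytes objects instead of the integers that bytes indexing gives.
class CharsView:
    def __init__(self, data):
        self._data = bytes(data)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, i):
        if isinstance(i, slice):
            return self._data[i]
        return self._data[i:i + 1]   # length-1 bytes, not an int

    def __iter__(self):
        return (self._data[i:i + 1] for i in range(len(self._data)))

view = CharsView(b"ABC")
print(view[0])      # b'A'  (compare: b"ABC"[0] == 65)
print(list(view))   # [b'A', b'B', b'C']
```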
>> bytes.char (or bytes.chr or bchr in builtins)
>> bytes.chars, bytearray.chars (sequence views)
>
> The views are already available via memoryview.cast if folks really
> want them, but encouraging their use in general isn't a great idea, as
> it means more function developers now need to ask themselves "What if
> someone passes me a char view rather than a normal bytes object?".

Thanks, I think this is the first real argument I hear against the char
view. In fact, I don't think people should ask themselves that
question, and just not accept bytes views as input. Would it be enough
to discourage storing and passing bytes views?

Anyway, the only error that would pass silently would be that the
passed-in object gets indexed (e.g. obj[0]) and a bytes-char comes out
instead of an int. But it would be a strange thing to do by the caller
to pass a char view into the bytes-consumer. I could imagine someone
wanting to pass a bytes view into a str-consumer. But there are no
significant silently-passing errors there. If str also gets .chars,
then it becomes even easier to support this.

-- Koos

>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +

From mark at hotpy.org  Tue Sep  6 11:25:48 2016
From: mark at hotpy.org (Mark Shannon)
Date: Tue, 6 Sep 2016 16:25:48 +0100
Subject: [Python-Dev] Do PEP 526 type declarations define the types of
	variables or not?
In-Reply-To: <57CDEEC5.9030305@canterbury.ac.nz>
References: <57CD8E99.8090205@hotpy.org>
	<57CDEEC5.9030305@canterbury.ac.nz>
Message-ID: <57CEDFFC.5040904@hotpy.org>

On 05/09/16 23:16, Greg Ewing wrote:
> Mark Shannon wrote:
>
>> Unless of course, others may have a different idea of what the "type
>> of a variable" means.
>> To me, it means that for all assignments `var = expr`
>> the type of `expr` must be a subtype of the variable,
>> and for all uses of var, the type of the use is the same as the type
>> of the variable.
>
> I think it means that, at any given point in time, the
> value of the variable is of the type of the variable or
> some subtype thereof. That interpretation leaves the
> type checker free to make more precise inferences if
> it can. For example, in...

How does that differ from annotating the type of the expression?

>> def foo() -> int:
>>     x: Optional[int] = bar()
>>     if x is None:
>>         return -1
>>     return x
>
> ...the type checker could notice that, on the branch
> containing 'return x', the value of x must be of type
> int, so the code is okay.

The issue is not whether the checker can tell that the type of the
*expression* is int, but whether it is forced to use the type of the
*variable*. The current wording of PEP 526 strongly implies the latter.

Cheers,
Mark.

From mark at hotpy.org  Tue Sep  6 11:25:58 2016
From: mark at hotpy.org (Mark Shannon)
Date: Tue, 6 Sep 2016 16:25:58 +0100
Subject: [Python-Dev] Do PEP 526 type declarations define the types of
	variables or not?
In-Reply-To:
References: <57CD8E99.8090205@hotpy.org>
Message-ID: <57CEE006.6030803@hotpy.org>

On 05/09/16 18:40, Guido van Rossum wrote:
> On Mon, Sep 5, 2016 at 8:26 AM, Mark Shannon wrote:
>> PEP 526 states that "This PEP aims at adding syntax to Python for
>> annotating the types of variables" and Guido seems quite insistent
>> that the declarations are for the types of variables.
>>
>> However, I get the impression that most (all) of the authors and
>> proponents of PEP 526 are quite keen to emphasise that the PEP in no
>> way limits type checkers from doing what they want.
>>
>> This is rather contradictory. The behaviour of a typechecker is
>> defined by the typesystem that it implements. Whether a type
>> annotation determines the type of a variable or an expression changes
>> what typesystems are feasible. So, stating that annotations define
>> the type of variables *does* limit what a typechecker can or cannot
>> do.
>> Unless of course, others may have a different idea of what the "type
>> of a variable" means.
>> To me, it means that for all assignments `var = expr`
>> the type of `expr` must be a subtype of the variable,
>> and for all uses of var, the type of the use is the same as the type
>> of the variable.
>>
>> In this example:
>>
>> def bar() -> Optional[int]: ...
>>
>> def foo() -> int:
>>     x: Optional[int] = bar()
>>     if x is None:
>>         return -1
>>     return x
>>
>> According to PEP 526 the annotation `x: Optional[int]`
>> means that the *variable* `x` has the type `Optional[int]`.
>> So what is the type of `x` in `return x`?
>> If it is `Optional[int]`, then a type checker is obliged to reject
>> this code. If it is `int`, then what does "type of a variable"
>> actually mean, and why aren't the other uses of `x` int as well?
>
> Oh, there is definitely a problem here if you interpret it that way.
> Of course I assume that other type checkers are at least as smart as
> mypy. :-) In mypy, the analysis of this example narrows the type x can
> have once `x is None` is determined to be false, so that the example
> passes.

The "smartness" of checkers is not the problem (for this example, at
least); the problem is that checkers must conform to the rules laid
down in PEP 484 and (in whatever form it finally takes) PEP 526. It
sounds like mypy doesn't conform to PEP 526, as it is ignoring the
declared type of x and using the inferred type. In fact, it looks as if
it is doing exactly what I proposed, which is that the annotation
describes the type of the expression, not the variable.

> I guess this is a surprise if you think of type systems like Java's
> where the compiler forgets what it has learned, at least from the
> language spec's POV. But a Python type checker is more like a linter,
> and false positives (complaints about valid code) are much more
> problematic than false negatives (passing invalid code).
The language of PEP 526 is strongly suggestive of a type system like
Java's. The extensive use of the term 'variable' rather than
'expression' and 'assignment' rather suggests that all definitions and
uses of a single variable have the same type.

> So a Python type checker that is to gain acceptance of users must be
> much smarter than that, and neither PEP 484 nor PEP 526 is meant to
> require a type checker to complain about `return x` in the above
> example.
>
> I'm not sure how to change the language of the PEP though -- do you
> have a suggestion? It all seems to depend on how the reader interprets
> the meaning of very vague words like "variable" and "type".

The problem with using the term "variable" is that it is *not* vague.
Variables in Python have well-defined scopes and lifetimes.

Cheers,
Mark.

From levkivskyi at gmail.com  Tue Sep  6 11:33:26 2016
From: levkivskyi at gmail.com (Ivan Levkivskyi)
Date: Tue, 6 Sep 2016 17:33:26 +0200
Subject: [Python-Dev] Do PEP 526 type declarations define the types of
	variables or not?
In-Reply-To: <57CEDFFC.5040904@hotpy.org>
References: <57CD8E99.8090205@hotpy.org>
	<57CDEEC5.9030305@canterbury.ac.nz>
	<57CEDFFC.5040904@hotpy.org>
Message-ID:

On 6 September 2016 at 17:25, Mark Shannon wrote:
> The issue is not whether the checker can tell that the type of the
> *expression* is int, but whether it is forced to use the type of the
> *variable*. The current wording of PEP 526 strongly implies the latter.

Mark,

Could you please point to exact locations in the PEP text and propose
an alternative wording, so that we will have a more concrete
discussion.

--
Ivan
URL: From ian at feete.org Mon Sep 5 12:51:47 2016 From: ian at feete.org (Ian Foote) Date: Mon, 5 Sep 2016 17:51:47 +0100 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: On 05/09/16 14:46, Nick Coghlan wrote: > That's not what the PEP proposes for uninitialised variables though: > it proposes processing them *before* a series of assignment > statements, which *only makes sense* if you plan to use them to > constrain those assignments in some way. > > If you wanted to write something like that under a type assertion > spelling, then you could enlist the aid of the "all" builtin: > > assert all(x) : List[T] # All local assignments to "x" must abide > by this constraint > if case1: > x = ... > elif case2: > x = ... > else: > x = ... > Would the `assert all(x)` be executed at runtime as well or would this be syntax only for type checkers? I think this particular spelling at least is potentially confusing. Regards, Ian F -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 473 bytes Desc: OpenPGP digital signature URL: From ncoghlan at gmail.com Tue Sep 6 12:00:38 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 7 Sep 2016 02:00:38 +1000 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> <57CDAA37.5040207@stoneleaf.us> Message-ID: On 6 September 2016 at 14:04, Guido van Rossum wrote: > I'm sorry, but we're not going to invent new syntax this late in the > game. The syntax proposed by the PEP has been on my mind ever since > PEP 484 with very minor variations; I first proposed it seriously on > python-ideas over a month ago, we've been debating the details since > then, and it's got a solid implementation based on those debates by > Ivan Levkivskyi. 
In contrast, it looks like you just made the "assert > x: T" syntax up last night in response to the worries expressed by > Mark Shannon, and "assert" sounds a lot like a run-time constraint to > me. That's a fair description, but the notation also helped me a lot in articulating the concepts I was concerned about without having to put dummy annotated functions everywhere :) > Instead, I encourage you to participate in the writing of a separate > PEP explaining how type checkers are expected to work (since PEP 526 > doesn't specify that). Ivan is also interested in such a PEP and we > hope Mark will also lend us his expertise. Aye, I'd be happy to help with that - I think everything proposed can be described in terms of existing PEP 484 primitives and the descriptor protocol, so the requirements on typecheckers would just be for them to be self-consistent, rather than defining fundamentally new behaviours. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Tue Sep 6 12:13:29 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 7 Sep 2016 02:13:29 +1000 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> Message-ID: On 6 September 2016 at 02:51, Ian Foote wrote: > On 05/09/16 14:46, Nick Coghlan wrote: >> That's not what the PEP proposes for uninitialised variables though: >> it proposes processing them *before* a series of assignment >> statements, which *only makes sense* if you plan to use them to >> constrain those assignments in some way. >> >> If you wanted to write something like that under a type assertion >> spelling, then you could enlist the aid of the "all" builtin: >> >> assert all(x) : List[T] # All local assignments to "x" must abide >> by this constraint >> if case1: >> x = ... >> elif case2: >> x = ... >> else: >> x = ... 
>> > > Would the `assert all(x)` be executed at runtime as well or would this > be syntax only for type checkers? I think this particular spelling at > least is potentially confusing. Only for typecheckers, same as the plans for function level bare annotations. Otherwise it wouldn't work, since you'd be calling "all()" on a non-iterable :) Guido doesn't like the syntax though, so the only place it would ever appear is explanatory notes describing the purpose of the new syntax, and hence can be replaced by something like: # After all future assignments to x, check that x conforms to T Cheers, Nick. P.S. Or, if you're particularly fond of mathematical notation, and we take type categories as sets: # ∀x: x ∈ T That would be a singularly unhelpful explanatory comment for the vast majority of folks, though :) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Tue Sep 6 12:22:46 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 6 Sep 2016 09:22:46 -0700 Subject: [Python-Dev] Please reject or postpone PEP 526 In-Reply-To: References: <5193a7a9-575e-aee2-a502-6aad2895d51a@hotpy.org> <57CDAA37.5040207@stoneleaf.us> Message-ID: On Tue, Sep 6, 2016 at 9:00 AM, Nick Coghlan wrote: > On 6 September 2016 at 14:04, Guido van Rossum wrote: >> I'm sorry, but we're not going to invent new syntax this late in the >> game. The syntax proposed by the PEP has been on my mind ever since >> PEP 484 with very minor variations; I first proposed it seriously on >> python-ideas over a month ago, we've been debating the details since >> then, and it's got a solid implementation based on those debates by >> Ivan Levkivskyi. In contrast, it looks like you just made the "assert >> x: T" syntax up last night in response to the worries expressed by >> Mark Shannon, and "assert" sounds a lot like a run-time constraint to >> me. 
> > That's a fair description, but the notation also helped me a lot in > articulating the concepts I was concerned about without having to put > dummy annotated functions everywhere :) Thanks Nick! It seems your writing has helped some others (e.g. Ethan) understand PEP 526. >> Instead, I encourage you to participate in the writing of a separate >> PEP explaining how type checkers are expected to work (since PEP 526 >> doesn't specify that). Ivan is also interested in such a PEP and we >> hope Mark will also lend us his expertise. > > Aye, I'd be happy to help with that - I think everything proposed can > be described in terms of existing PEP 484 primitives and the > descriptor protocol, so the requirements on typecheckers would just be > for them to be self-consistent, rather than defining fundamentally new > behaviours. Beware that there are by now some major type checkers that already claim conformance to PEP 484 in various ways: mypy, pytype, PyCharm, and probably Semmle.com where Mark works has one too. Each one has some specialty and each one is a work in progress, but a PEP shouldn't start out by declaring the approach used by any existing checker unlawful. As an example, mypy doesn't yet support Optional by default: it recognizes the syntax but it doesn't distinguish between e.g. int and Optional[int]. (It will do the right thing when you pass the `--strict-optional` flag, but there are still some issues with that before we can make it the default behavior.) As another example: mypy understands isinstance() checks so that e.g. the following works: def foo(x: Union[int, str]) -> str: if isinstance(x, str): return x return str(x) I don't think you can find anything in PEP 484 that says this should work; but without it mypy would be much less useful. (The example here is silly, but such code appears in real life frequently.) 
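For reference, the isinstance() example above padded out so it can be run directly; only the calls at the end are added for illustration:

```python
from typing import Union

def foo(x: Union[int, str]) -> str:
    if isinstance(x, str):
        return x        # a checker narrows x to str here
    return str(x)       # and to int here

print(foo(42))      # "42"
print(foo("spam"))  # "spam"
```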
One final thought: this is not the first time that Python has used syntax that looks like another language but gives it a different meaning. In fact, apart from `if`, almost everything in Python works differently than it works in C++ or Java. So I don't worry much about that. -- --Guido van Rossum (python.org/~guido) From vramachandra1996 at gmail.com Tue Sep 6 12:31:36 2016 From: vramachandra1996 at gmail.com (RAMU V) Date: Tue, 6 Sep 2016 22:01:36 +0530 Subject: [Python-Dev] Requesting on python directories Message-ID: <57ceef63.0986620a.f792.a8d7@mx.google.com> Sir I learn the basics of python from online and I am pretty much confident with my basics and I want to show them to the outside world by publishing them as packages in python. There were set of instructions for that process but I could not figure out the exact process so please help me with this because I learnt python out of my interest but not based on my syllabus or for job. Its like a refreshing enjoyment for my brain. Sent from Mail for Windows 10 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Sep 6 12:35:40 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 7 Sep 2016 02:35:40 +1000 Subject: [Python-Dev] Do PEP 526 type declarations define the types of variables or not? In-Reply-To: References: <57CD8E99.8090205@hotpy.org> <57CDEEC5.9030305@canterbury.ac.nz> <57CEDFFC.5040904@hotpy.org> Message-ID: On 7 September 2016 at 01:33, Ivan Levkivskyi wrote: > On 6 September 2016 at 17:25, Mark Shannon wrote: >> >> The issue is not whether the checker can tell that the type of the >> *expression* is int, but whether it is forced to use the type of the >> *variable*. The current wording of PEP 526 strongly implies the latter. > > Mark, > Could you please point to exact locations in the PEP text and propose an > alternative wording, so that we will have a more concrete discussion. 
Rather than trying to work that out on the list, it may make the most sense for Mark to put together a PR that rewords the parts of the PEP that he sees as constraining typecheckers to restrict *usage* of a variable based on its annotation, rather than just restricting future bindings to it. It seems to me everyone's actually in pretty good agreement on how we want variable annotations to work (constraining future assignments to abide by the declaration, without constraining use when inference indicates a more specific type), but the PEP may be using some particular terminology more loosely than is strictly correct in the context of type theory. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rymg19 at gmail.com Tue Sep 6 12:37:05 2016 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Tue, 6 Sep 2016 11:37:05 -0500 Subject: [Python-Dev] Requesting on python directories In-Reply-To: <57ceef63.0986620a.f792.a8d7@mx.google.com> References: <57ceef63.0986620a.f792.a8d7@mx.google.com> Message-ID: Wrong mailing list. This is for the discussion of development *of* Python, not *in* Python. You probably want: https://mail.python.org/mailman/listinfo/python-list Regardless, this page should answer your questions: https://packaging.python.org/distributing/ On Tue, Sep 6, 2016 at 11:31 AM, RAMU V wrote: > Sir > > I learn the basics of python from online and I am pretty > much confident with my basics and I want to show them to the outside world > by publishing them as packages in python. There were set of instructions > for that process but I could not figure out the exact process so please > help me with this because I learnt python out of my interest but not based > on my syllabus or for job. Its like a refreshing enjoyment for my brain. 
> > Sent from Mail for > Windows 10 > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > rymg19%40gmail.com > > -- Ryan [ERROR]: Your autotools build scripts are 200 lines longer than your program. Something's wrong. http://kirbyfan64.github.io/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Tue Sep 6 12:38:20 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 7 Sep 2016 02:38:20 +1000 Subject: [Python-Dev] Requesting on python directories In-Reply-To: <57ceef63.0986620a.f792.a8d7@mx.google.com> References: <57ceef63.0986620a.f792.a8d7@mx.google.com> Message-ID: <20160906163819.GY26300@ando.pearwood.info> Hello Ramu, This is the wrong place to ask for help with your question; this is for development of the Python interpreter. I suggest you subscribe to the Python-List mailing list. For help with publishing packages, see https://wiki.python.org/moin/CheeseShopTutorial On Tue, Sep 06, 2016 at 10:01:36PM +0530, RAMU V wrote: > Sir > I learn the basics of python from online and I am pretty much confident with my basics and I want to show them to the outside world by publishing them as packages in python. There were set of instructions for that process but I could not figure out the exact process so please help me with this because I learnt python out of my interest but not based on my syllabus or for job. Its like a refreshing enjoyment for my brain. > Sent from Mail for Windows 10 -- Steve 
In-Reply-To: References: <57CD8E99.8090205@hotpy.org> <57CDEEC5.9030305@canterbury.ac.nz> <57CEDFFC.5040904@hotpy.org> Message-ID: On 6 September 2016 at 18:35, Nick Coghlan wrote: > On 7 September 2016 at 01:33, Ivan Levkivskyi > wrote: > > On 6 September 2016 at 17:25, Mark Shannon wrote: > >> > >> The issue is not whether the checker can tell that the type of the > >> *expression* is int, but whether it is forced to use the type of the > >> *variable*. The current wording of PEP 526 strongly implies the latter. > > > > Mark, > > Could you please point to exact locations in the PEP text and propose an > > alternative wording, so that we will have a more concrete discussion. > > Rather than trying to work that out on the list, it may make the most > sense for Mark to put together a PR that rewords the parts of the PEP > that he sees as constraining typecheckers to restrict *usage* of a > variable based on its annotation, rather than just restricting future > bindings to it. > Thanks Nick, this is a good idea. Mark, I will be glad to discuss your PR to the master python/peps repo. -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From turnbull.stephen.fw at u.tsukuba.ac.jp Tue Sep 6 14:11:21 2016 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Wed, 7 Sep 2016 03:11:21 +0900 Subject: [Python-Dev] Do PEP 526 type declarations define the types of variables or not? In-Reply-To: <57CEE006.6030803@hotpy.org> References: <57CD8E99.8090205@hotpy.org> <57CEE006.6030803@hotpy.org> Message-ID: <22479.1737.757087.877789@turnbull.sk.tsukuba.ac.jp> Mark Shannon writes: > The problem with using the term "variable" is that it is *not* vague. > Variables in Python have well defined scopes and lifetimes. Sure, but *hints* are not well-defined by Python (except the syntax, once PEP 526 is implemented). A *hint* is something that the typechecker takes note of, and then does whatever it pleases with it. 
So can we do the practical thing here and agree that even though the type hint on a variable is constant, what the typechecker does with that type hint in different contexts might change? ------------------------------------------------------------------------ The rest is tl;dr (why I want type hints on variables, and why the term "annotating expressions" leaves me cold). I don't see how you can interpret z: complex = 1.0 without a notion of annotating the variable. The RHS is clearly of float type, and Python will assign a float to z. What's going on here? Maybe this: from math import exp z: complex = 1.0 print(exp(z)) ==> "MyPy to Steve! MyPy to Steve! You are confused!" Maybe "math" was a typo for "cmath". Maybe "complex" was a premature generalization. Maybe you wouldn't want to hear about it ... but I would. I think. Anyway, to find out if I *really* want that or not, I need a notion of hinting the variable. But: "although practicality beats purity". Like everybody else, I want a typechecker that minds its manners and says nothing about from math import exp z: complex = 1.0 try: print(exp(z)) except TypeError: print("Oh well, complex is better than complicated.") Finally, the notion of annotating expressions is incoherent: # Annotating (sub)expressions: the more the merrier! (x) : bool = (((y): int + (z): float) / (w): complex): quaternion # Ooh, an expression with no past and no future. Annotate it! (y + z) / w: quaternion No one has any intention of annotating expressions AFAICS -- people who talk about that really mean annotating the *value* of the expression on the RHS, but since it will *always* be on the RHS of an assignment, it's equivalent to annotating the value of the *target* on the LHS. From rosuav at gmail.com Tue Sep 6 14:36:27 2016 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 7 Sep 2016 04:36:27 +1000 Subject: [Python-Dev] Do PEP 526 type declarations define the types of variables or not? 
In-Reply-To: <22479.1737.757087.877789@turnbull.sk.tsukuba.ac.jp> References: <57CD8E99.8090205@hotpy.org> <57CEE006.6030803@hotpy.org> <22479.1737.757087.877789@turnbull.sk.tsukuba.ac.jp> Message-ID: On Wed, Sep 7, 2016 at 4:11 AM, Stephen J. Turnbull wrote: > Finally, the notion of annotating expressions is incoherent: > > # Annotating (sub)expressions: the more the merrier! > (x) : bool = (((y): int + (z): float) / (w): complex): quarternion > # Ooh, an expression with no past and no future. Annotate it! > (y + z) / w: quarternion Can't do that - parsing would become ambiguous. x = {1:int, 1.5:float, 2+3j:complex} print(type(x)) ChrisA From larry at hastings.org Tue Sep 6 18:49:36 2016 From: larry at hastings.org (Larry Hastings) Date: Tue, 6 Sep 2016 15:49:36 -0700 Subject: [Python-Dev] The Amazing Unreferenced Weakref Message-ID: <250318c6-6fa3-9c34-9e2b-b666869f2cae@hastings.org> This is all about current (3.6) trunk. In Objects/weakrefobject.c, we have the function PyObject_ClearWeakRefs(). This is called when a generic object that supports weakrefs is destroyed; this is the code that calls the callbacks. Here's a little paragraph of code from the center: for (i = 0; i < count; ++i) { PyWeakReference *next = current->wr_next; if (((PyObject *)current)->ob_refcnt > 0) { Py_INCREF(current); PyTuple_SET_ITEM(tuple, i * 2, (PyObject *) current); PyTuple_SET_ITEM(tuple, i * 2 + 1, current->wr_callback); } else { Py_DECREF(current->wr_callback); } current->wr_callback = NULL; clear_weakref(current); current = next; } "current" is the doubly-linked list of PyWeakReference objects stored inside the object that's getting destroyed. My question: under what circumstances would ob_refcnt ever be 0? The tp_dealloc handler for PyWeakReference * objects removes it from this list and frees the memory. How could the reference count reach 0 without tp_dealloc being called and it being removed from the list? Scratching my head like crazy, //arry/ p.s. 
If you're thinking "why does he care?", understanding this would maybe help with the Gilectomy. So yes there's a point to this question. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Sep 6 19:45:47 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 6 Sep 2016 16:45:47 -0700 Subject: [Python-Dev] PEP 447: Add __getdescriptor__ to metaclasses In-Reply-To: References: <0B7F0208-DEC6-4039-89B4-1FCD0071B092@mac.com> <6F326944-9BF0-4E38-B487-79BC0ADF17B3@mac.com> Message-ID: Hi Ronald, The feature freeze for 3.6 is closing in a few days; 3.6b1 will go out this weekend. Did you overcome the issue, or does your PEP need to be postponed until 3.7? --Guido On Sun, Jul 24, 2016 at 9:58 PM, Ronald Oussoren wrote: > > On 24 Jul 2016, at 13:06, Ronald Oussoren wrote: > > ... > > But on the other hand, that's why I wanted to use PyObjC to validate > the PEP in the first place. > > > I've hit a fairly significant issue with this, PyObjC's super contains more > magic than just this magic that would be fixed by PEP 447. I don't think > I'll be able to finish work on PEP 447 this week because of that, and in the > worst case will have to retire the PEP. > > The problem is as follows: to be able to map all of Cocoa's methods to > Python, PyObjC creates two proxy classes for every Cocoa class: the regular > class and its metaclass. The latter is used to store class methods. This is > needed because Objective-C classes can have instance and class methods with > the same name, as an example: > > @interface NSObject > -(NSString*)description; > +(NSString*)description > @end > > The first declaration for "description" is an instance method, the second is > a class method. The Python metaclass is mostly a hidden detail, users don't > explicitly interact with these classes and use the normal Python convention > for defining class methods. 
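A rough pure-Python sketch of the two-namespace arrangement described here -- a metaclass method and an instance method sharing one name. This is only an illustration of Python's ordinary lookup rules, not PyObjC's actual proxy machinery:

```python
class Meta(type):
    # Plays the role of the hidden metaclass that stores the
    # Objective-C *class* method named "description".
    def description(cls):
        return "class description"

class C(metaclass=Meta):
    # The Objective-C *instance* method of the same name.
    def description(self):
        return "instance description"

print(C().description())       # the instance method wins on instances
print(type(C).description(C))  # the metaclass still holds the other one

class D(metaclass=Meta):
    pass

print(D.description())         # found on the metaclass when not shadowed
```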
> > This works fine, problems start when you want to subclass in Python and > override the class method: > > class MyClass (NSObject): > @classmethod > def description(cls): > return "hello there from %r" % (super(MyClass, cls).description()) > > If you're used to normal Python code there's nothing wrong here, but getting > this to work required some magic in objc.super to ensure that its > __getattribute__ looks in the metaclass in this case and not the regular > class. The current PEP447-ised version of PyObjC has a number of test > failures because builtin.super obviously doesn't contain this hack (and > shouldn't). > > I think I can fix this for modern code that uses an argumentless call to > super by replacing the cell containing the __class__ reference when moving > the method from the regular class to the instance class. That would > obviously not work for the code I showed earlier, but that at least won't > fail silently and the error message is specific enough that I can include it > in PyObjC's documentation. > > Ronald > > > > > > Back to wrangling C code, > > Ronald > > > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/ronaldoussoren%40mac.com > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) From ericsnowcurrently at gmail.com Tue Sep 6 19:56:21 2016 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 6 Sep 2016 16:56:21 -0700 Subject: [Python-Dev] A Pseudo-Post-Mortem (not dead yet) on my Multi-Core Python Project. 
Message-ID: I'm not anticipating much discussion on this, but wanted to present a summary of my notes from the project I proposed last year and have since tabled. http://ericsnowcurrently.blogspot.com/2016/09/solving-mutli-core-python.html -eric From yselivanov.ml at gmail.com Tue Sep 6 20:10:49 2016 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Tue, 6 Sep 2016 17:10:49 -0700 Subject: [Python-Dev] PEP 525, fourth update Message-ID: <0351340c-4062-adc0-6687-4fa7633506f4@gmail.com> Hi, I've updated PEP 525 with a new section about asyncio changes. Essentially, asyncio event loop will get a new "shutdown_asyncgens" method that allows to close the loop and all associated AGs with it reliably. Only the updated section is pasted below: asyncio ------- The asyncio event loop will use ``sys.set_asyncgen_hooks()`` API to maintain a weak set of all scheduled asynchronous generators, and to schedule their ``aclose()`` coroutine methods when it is time for generators to be GCed. To make sure that asyncio programs can finalize all scheduled asynchronous generators reliably, we propose to add a new event loop method ``loop.shutdown_asyncgens(*, timeout=30)``. The method will schedule all currently open asynchronous generators to close with an ``aclose()`` call. After calling the ``loop.shutdown_asyncgens()`` method, the event loop will issue a warning whenever a new asynchronous generator is iterated for the first time. The idea is that after requesting all asynchronous generators to be shutdown, the program should not execute code that iterates over new asynchronous generators. An example of how ``shutdown_asyncgens`` should be used:: try: loop.run_forever() # or loop.run_until_complete(...) finally: loop.shutdown_asyncgens() loop.close() - Yury From greg at krypto.org Tue Sep 6 20:19:36 2016 From: greg at krypto.org (Gregory P. 
Smith) Date: Wed, 07 Sep 2016 00:19:36 +0000 Subject: [Python-Dev] The Amazing Unreferenced Weakref In-Reply-To: <250318c6-6fa3-9c34-9e2b-b666869f2cae@hastings.org> References: <250318c6-6fa3-9c34-9e2b-b666869f2cae@hastings.org> Message-ID: This code appears to have been added to fix https://bugs.python.org/issue3100 - A crash involving a weakref subclass. -gps On Tue, Sep 6, 2016 at 3:51 PM Larry Hastings wrote: > > This is all about current (3.6) trunk. > > In Objects/weakrefobject.c, we have the function > PyObject_ClearWeakRefs(). This is called when a generic object that > supports weakrefs is destroyed; this is the code that calls the callbacks. > Here's a little paragraph of code from the center: > > for (i = 0; i < count; ++i) { > PyWeakReference *next = current->wr_next; > > if (((PyObject *)current)->ob_refcnt > 0) > { > Py_INCREF(current); > PyTuple_SET_ITEM(tuple, i * 2, (PyObject *) current); > PyTuple_SET_ITEM(tuple, i * 2 + 1, current->wr_callback); > } > else { > Py_DECREF(current->wr_callback); > } > current->wr_callback = NULL; > clear_weakref(current); > current = next; > } > > "current" is the doubly-linked list of PyWeakReference objects stored > inside the object that's getting destroyed. > > My question: under what circumstances would ob_refcnt ever be 0? The > tp_dealloc handler for PyWeakReference * objects removes it from this list > and frees the memory. How could the reference count reach 0 without > tp_dealloc being called and it being removed from the list? > > Scratching my head like crazy, > > > */arry* > > p.s. If you're thinking "why does he care?", understanding this would > maybe help with the Gilectomy. So yes there's a point to this question. 
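For reference, the behaviour this C code implements can be observed from pure Python with a plain weakref and a callback (not the subclass edge case from issue 3100). On CPython, ``del`` drops the last strong reference immediately, so PyObject_ClearWeakRefs runs right away:

```python
import weakref

class Thing:
    pass

fired = []
t = Thing()
r = weakref.ref(t, fired.append)  # callback receives the (now dead) ref

del t               # last strong reference goes away here;
                    # the wr_next list is walked and the callback invoked
print(r() is None)  # True  -- the referent is gone
print(fired == [r]) # True  -- the callback ran exactly once
```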
> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/greg%40krypto.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Sep 6 22:10:28 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 6 Sep 2016 19:10:28 -0700 Subject: [Python-Dev] PEP 525, fourth update In-Reply-To: <0351340c-4062-adc0-6687-4fa7633506f4@gmail.com> References: <0351340c-4062-adc0-6687-4fa7633506f4@gmail.com> Message-ID: Thanks Yury! I am hereby accepting PEP 525 provisionally. The acceptance is so that you can go ahead and merge this into 3.6 before the feature freeze this weekend. The provisional status is because this is a big project and it's likely that we'll need to tweak some small aspect of the API once the code is in, even after 3.6.0 is out. (Similar to the way PEP 492, async/await, was accepted provisionally.) But I am cautiously optimistic and I am grateful to Yury for the care and effort he has put into it. --Guido On Tue, Sep 6, 2016 at 5:10 PM, Yury Selivanov wrote: > Hi, > > I've updated PEP 525 with a new section about asyncio changes. > > Essentially, asyncio event loop will get a new "shutdown_asyncgens" method > that allows to close the loop and all associated AGs with it reliably. > > Only the updated section is pasted below: > > > asyncio > ------- > > The asyncio event loop will use ``sys.set_asyncgen_hooks()`` API to > maintain a weak set of all scheduled asynchronous generators, and to > schedule their ``aclose()`` coroutine methods when it is time for > generators to be GCed. > > To make sure that asyncio programs can finalize all scheduled > asynchronous generators reliably, we propose to add a new event loop > method ``loop.shutdown_asyncgens(*, timeout=30)``. 
The method will > schedule all currently open asynchronous generators to close with an > ``aclose()`` call. > > After calling the ``loop.shutdown_asyncgens()`` method, the event loop > will issue a warning whenever a new asynchronous generator is iterated > for the first time. The idea is that after requesting all asynchronous > generators to be shutdown, the program should not execute code that > iterates over new asynchronous generators. > > An example of how ``shutdown_asyncgens`` should be used:: > > try: > loop.run_forever() > # or loop.run_until_complete(...) > finally: > loop.shutdown_asyncgens() > loop.close() > > - > Yury > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From guido at python.org Tue Sep 6 22:35:04 2016 From: guido at python.org (Guido van Rossum) Date: Tue, 6 Sep 2016 19:35:04 -0700 Subject: [Python-Dev] Do PEP 526 type declarations define the types of variables or not? In-Reply-To: <57CEE006.6030803@hotpy.org> References: <57CD8E99.8090205@hotpy.org> <57CEE006.6030803@hotpy.org> Message-ID: On Tue, Sep 6, 2016 at 8:25 AM, Mark Shannon wrote: > The "smartness" of checkers is not the problem (for this example, at least) > the problem is that checkers must conform to the rules laid down in PEP 484 > and (in whatever form it finally takes) PEP 526. > It sounds like mypy doesn't conform to PEP 526, as it ignoring the declared > type of x and using the inferred type. > In fact it looks as if it is doing exactly what I proposed, which is that > the annotation describes the type of the expression, not the variable. IMO neither PEP requires type checkers to behave this way. Maybe you read it between the lines when you reviewed PEP 484 and neither of us realized that we were interpreting the text differently? 
The words you have quoted previously mean different things to me than you seem to imply. >> I guess this is a surprise if you think of type systems like Java's >> where the compiler forgets what it has learned, at least from the >> language spec's POV. But a Python type checker is more like a linter, >> and false positives (complaints about valid code) are much more >> problematic than false negatives (passing invalid code). > The language of PEP 526 is strongly suggestive of a type system like Java. That suggestion is really in your mind. The PEP also quite clearly states that it does not specify what a type checker should do with the "declarations". > The extensive use of the term 'variable' rather than 'expression' and > 'assignment' rather suggests that all definitions and uses of a single > variable have the same type. Maybe you believe that Python's use of the word 'variable', combined with using `=` for assignment, also implies that Python's "variables" should behave like Java's "variables"? > The problem with using the term "variable" is that it is *not* vague. > Variables in python have well defined scopes and lifetimes. So? When a type checker can prove that in the expression `f(x)`, the type of the *expression* `x` will be compatible with the argument type expected by f, isn't that good enough? Why would the type given for the *variable* `x` have to be the only input to the type check for that expression? -- --Guido van Rossum (python.org/~guido) From eric at trueblade.com Wed Sep 7 00:21:17 2016 From: eric at trueblade.com (Eric V. Smith) Date: Wed, 7 Sep 2016 00:21:17 -0400 Subject: [Python-Dev] What's the status of PEP 515? Message-ID: The implementation of '_' in numeric literals is here: http://bugs.python.org/issue26331 And to add '_' in int.__format__ is here: http://bugs.python.org/issue27080 But I don't want to add support in int.__format__ unless numeric literal support is added. So, Georg and Serhiy: is issue 26331 going to get committed? 
If so, I'll commit 27080 (or you can). I just don't want the second part of PEP 515 to not make the deadline if the first part makes it in at the last minute. Thanks! Eric. From nad at python.org Wed Sep 7 02:04:38 2016 From: nad at python.org (Ned Deily) Date: Tue, 6 Sep 2016 23:04:38 -0700 Subject: [Python-Dev] What's the status of PEP 515? In-Reply-To: References: Message-ID: <5B15FB13-0DD0-45B9-84C3-A3FD86B10514@python.org> At the dev sprint today, we discussed PEP 515; several people are keen to see it get into 3.6. If someone doesn't get to it before tomorrow, one of the sprinters will try to do a final review and get it pushed. -- Ned Deily nad at python.org -- [] > On Sep 6, 2016, at 21:21, Eric V. Smith wrote: > > The implementation of '_' in numeric literals is here: > http://bugs.python.org/issue26331 > > And to add '_' in int.__format__ is here: > http://bugs.python.org/issue27080 > > But I don't want to add support in int.__format__ unless numeric literal support is added. > > So, Georg and Serhiy: is issue 26331 going to get committed? If so, I'll commit 27080 (or you can). I just don't want the second part of PEP 515 to not make the deadline if the first part makes it in at the last minute. > > Thanks! > Eric. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/nad%40python.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Wed Sep 7 05:09:27 2016 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 7 Sep 2016 12:09:27 +0300 Subject: [Python-Dev] What's the status of PEP 515? In-Reply-To: References: Message-ID: On 07.09.16 07:21, Eric V. 
Smith wrote: > The implementation of '_' in numeric literals is here: > http://bugs.python.org/issue26331 > > And to add '_' in int.__format__ is here: > http://bugs.python.org/issue27080 > > But I don't want to add support in int.__format__ unless numeric literal > support is added. > > So, Georg and Serhiy: is issue 26331 going to get committed? If so, I'll > commit 27080 (or you can). I just don't want the second part of PEP 515 > to not make the deadline if the first part makes it in at the last minute. I had not much time last weeks to make a review of such large patches. I'm going to make a review today. In any case I think the patch is good in general, and if there are any bugs we can fix them in the beta stage. From levkivskyi at gmail.com Wed Sep 7 10:10:05 2016 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Wed, 7 Sep 2016 16:10:05 +0200 Subject: [Python-Dev] Make "global after use" a SyntaxError Message-ID: Hi all, The documentation at https://docs.python.org/3/reference/simple_stmts.html says that: "Names listed in a global statement must not be used in the same code block textually preceding that global statement" But then later: "CPython implementation detail: The current implementation does not enforce the two restrictions, but programs should not abuse this freedom, as future implementations may enforce them..." Code like this def f(): x = 1 global x gives SyntaxWarning for several releases, maybe it is time to make it a SyntaxError? (I have opened an issue for this http://bugs.python.org/issue27999 I will submit a patch soon). -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Sep 7 11:20:01 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 8 Sep 2016 01:20:01 +1000 Subject: [Python-Dev] A Pseudo-Post-Mortem (not dead yet) on my Multi-Core Python Project. 
In-Reply-To: References: Message-ID: On 7 September 2016 at 09:56, Eric Snow wrote: > I'm not anticipating much discussion on this, but wanted to present a > summary of my notes from the project I proposed last year and have > since tabled. > > http://ericsnowcurrently.blogspot.com/2016/09/solving-mutli-core-python.html Thanks for that update. For the PEP 432 start-up changes, the draft implementation reached a point earlier this year where it aligns with the current PEP draft and works as intended, except for the fact that most config settings aren't actually using the new structs yet: https://bitbucket.org/ncoghlan/cpython_sandbox/branch/pep432_modular_bootstrap However, there were other things that seemed higher priority to work on or help coordinate, so I deferred actually wrangling the process of proposing it for inclusion as a private API and then incrementally migrating settings over to it (particular as I think there's a high chance of that migration process stalling out if I can't be sure I'll have time to work on it myself). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From yselivanov.ml at gmail.com Wed Sep 7 12:33:15 2016 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Wed, 7 Sep 2016 09:33:15 -0700 Subject: [Python-Dev] PEP 525, fourth update In-Reply-To: References: <0351340c-4062-adc0-6687-4fa7633506f4@gmail.com> Message-ID: <14ed258b-9446-c5a9-c618-8b492a0eaa37@gmail.com> Thank you, Guido! I've updated the PEP to make shutdown_asyncgens a coroutine, as we discussed. Yury On 2016-09-06 7:10 PM, Guido van Rossum wrote: > Thanks Yury! > > I am hereby accepting PEP 525 provisionally. The acceptance is so that > you can go ahead and merge this into 3.6 before the feature freeze > this weekend. The provisional status is because this is a big project > and it's likely that we'll need to tweak some small aspect of the API > once the code is in, even after 3.6.0 is out. 
(Similar to the way PEP > 492, async/await, was accepted provisionally.) But I am cautiously > optimistic and I am grateful to Yury for the care and effort he has > put into it. > > --Guido > > On Tue, Sep 6, 2016 at 5:10 PM, Yury Selivanov wrote: >> Hi, >> >> I've updated PEP 525 with a new section about asyncio changes. >> >> Essentially, asyncio event loop will get a new "shutdown_asyncgens" method >> that allows to close the loop and all associated AGs with it reliably. >> >> Only the updated section is pasted below: >> >> >> asyncio >> ------- >> >> The asyncio event loop will use ``sys.set_asyncgen_hooks()`` API to >> maintain a weak set of all scheduled asynchronous generators, and to >> schedule their ``aclose()`` coroutine methods when it is time for >> generators to be GCed. >> >> To make sure that asyncio programs can finalize all scheduled >> asynchronous generators reliably, we propose to add a new event loop >> method ``loop.shutdown_asyncgens(*, timeout=30)``. The method will >> schedule all currently open asynchronous generators to close with an >> ``aclose()`` call. >> >> After calling the ``loop.shutdown_asyncgens()`` method, the event loop >> will issue a warning whenever a new asynchronous generator is iterated >> for the first time. The idea is that after requesting all asynchronous >> generators to be shutdown, the program should not execute code that >> iterates over new asynchronous generators. >> >> An example of how ``shutdown_asyncgens`` should be used:: >> >> try: >> loop.run_forever() >> # or loop.run_until_complete(...) 
>> finally: >> loop.shutdown_asyncgens() >> loop.close() >> >> - >> Yury >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/guido%40python.org > > From guido at python.org Wed Sep 7 12:35:59 2016 From: guido at python.org (Guido van Rossum) Date: Wed, 7 Sep 2016 09:35:59 -0700 Subject: [Python-Dev] PEP 525, fourth update In-Reply-To: <14ed258b-9446-c5a9-c618-8b492a0eaa37@gmail.com> References: <0351340c-4062-adc0-6687-4fa7633506f4@gmail.com> <14ed258b-9446-c5a9-c618-8b492a0eaa37@gmail.com> Message-ID: Thanks Yury! (Everyone else following along, the PEP is accepted provisionally, and we may make small tweaks from time to time during Python 3.6's lifetime.) From guido at python.org Wed Sep 7 12:59:07 2016 From: guido at python.org (Guido van Rossum) Date: Wed, 7 Sep 2016 09:59:07 -0700 Subject: [Python-Dev] Make "global after use" a SyntaxError In-Reply-To: References: Message-ID: +1 On Wed, Sep 7, 2016 at 7:10 AM, Ivan Levkivskyi wrote: > Hi all, > > The documentation at https://docs.python.org/3/reference/simple_stmts.html > says that: > > "Names listed in a global statement must not be used in the same code block > textually preceding that global statement" > > But then later: > > "CPython implementation detail: The current implementation does not enforce > the two restrictions, > but programs should not abuse this freedom, as future implementations may > enforce them..." > > Code like this > > def f(): > x = 1 > global x > > gives SyntaxWarning for several releases, maybe it is time to make it a > SyntaxError? > > (I have opened an issue for this http://bugs.python.org/issue27999 I will > submit a patch soon). 
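For contrast, a version that would remain legal under the stricter rule puts the ``global`` declaration before any use of the name in the block (sketch mine):

```python
x = 0

def f():
    global x  # the declaration comes first...
    x = 1     # ...so this rebinds the module-level x, with no warning

f()
print(x)  # 1
```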
> > -- > Ivan > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) From guido at python.org Wed Sep 7 13:37:01 2016 From: guido at python.org (Guido van Rossum) Date: Wed, 7 Sep 2016 10:37:01 -0700 Subject: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8 In-Reply-To: References: Message-ID: I'm hijacking this thread to provisionally accept PEP 529. (I'll also do this for PEP 528, in its own thread.) I've talked things over with Steve and Victor and we're going to do an experiment (as now written up in the PEP: https://www.python.org/dev/peps/pep-0529/#beta-experiment) to tease out any issues with this change during the beta. If serious problems crop up we may have to roll back the changes and reject the PEP -- we won't get another chance at getting this right. (That would also mean that using the binary filesystem APIs will remain deprecated and will eventually be disallowed; as long as the PEP remains accepted they are undeprecated.) Congrats Steve! Thanks for the massive amount of work on the implementation and the thinking that went into the design. Thanks everyone else for their feedback. --Guido PS. I have one small inline response to Nick below. On Sun, Sep 4, 2016 at 11:58 PM, Nick Coghlan wrote: > On 5 September 2016 at 15:59, Steve Dower wrote: >> +continue to default to ``locale.getpreferredencoding()`` (for text files) or >> +plain bytes (for binary files). This only affects the encoding used when users >> +pass a bytes object to Python where it is then passed to the operating system as >> +a path name. 
> > For the three non-filesystem cases: > > I checked the situation for os.environb, and that's already > unavailable on Windows (since os.supports_bytes_environ is False > there), while sys.argv is apparently already handled correctly (i.e. > always using the *W APIs). > > That means my only open question would be the handling of subprocess > module calls (both with and without shell=True), since that currently > works with binary arguments on *nix: > >>>> subprocess.call([b"python", b"-c", "print('??????')".encode("utf-8")]) > ?????? > 0 >>>> subprocess.call(b"python -c '%s'" % 'print("??????")'.encode("utf-8"), shell=True) > ?????? > 0 > > While calling system native apps that way will still have many > portability challenges, there are also plenty of cases where folks use > sys.executable to launch new Python processes in a separate instance > of the currently running interpreter, and it would be good if these > changes brought cross-platform consistency to the handling of binary > arguments here as well. I checked with Steve and this is not supported anyway -- bytes arguments (regardless of the value of shell) fail early with a TypeError. That may be a bug but there's no backwards compatibility to preserve here. (And apart from Python, few shell commands that work on Unix make much sense on Windows, so Im also not particularly worried about that particular example being non-portable -- it doesn't represent a realistic concern.) -- --Guido van Rossum (python.org/~guido) From guido at python.org Wed Sep 7 13:52:39 2016 From: guido at python.org (Guido van Rossum) Date: Wed, 7 Sep 2016 10:52:39 -0700 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: <1473168175.1294969.717131833.678CEDF7@webmail.messagingengine.com> References: <5cfa8c73-14b7-8795-06c5-3266940da4b1@python.org> <1473168175.1294969.717131833.678CEDF7@webmail.messagingengine.com> Message-ID: Congrats Steve! I'm provisionally accepting PEP 528. 
You can mark it as provisionally accepted in the PEP, preferably with a link to the mail.python.org archival copy of this message. Good luck with the implementation. -- --Guido van Rossum (python.org/~guido) From benjamin at python.org Wed Sep 7 13:56:12 2016 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 07 Sep 2016 10:56:12 -0700 Subject: [Python-Dev] (some) C99 added to PEP 7 Message-ID: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> To conclude our discussion about using C99 features, I've updated PEP 7 to allow the following features: - Standard integer types in ``<stdint.h>`` and ``<inttypes.h>`` - ``static inline`` functions - designated initializers - intermingled declarations - booleans I've been adding examples of these to 3.6 over the last few days to make sure the buildbots will like it. https://github.com/python/peps/commit/b6efe6e06fa70e8933440da26474a804fb3edb6e Enjoy. From victor.stinner at gmail.com Wed Sep 7 14:07:56 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 7 Sep 2016 11:07:56 -0700 Subject: [Python-Dev] (some) C99 added to PEP 7 In-Reply-To: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> References: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> Message-ID: 2016-09-07 10:56 GMT-07:00 Benjamin Peterson : > To conclude our discussion about using C99 features, I've updated PEP 7 > to allow the following features: > - Standard integer types in ``<stdint.h>`` and ``<inttypes.h>`` > - ``static inline`` functions > - designated initializers > - intermingled declarations > - booleans Welcome to the future! Victor From steve.dower at python.org Wed Sep 7 14:09:34 2016 From: steve.dower at python.org (Steve Dower) Date: Wed, 7 Sep 2016 11:09:34 -0700 Subject: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8 In-Reply-To: References: Message-ID: On 07Sep2016 1037, Guido van Rossum wrote: > I'm hijacking this thread to provisionally accept PEP 529.
(I'll also > do this for PEP 528, in its own thread.) > > I've talked things over with Steve and Victor and we're going to do an > experiment (as now written up in the PEP: > https://www.python.org/dev/peps/pep-0529/#beta-experiment) to tease > out any issues with this change during the beta. If serious problems > crop up we may have to roll back the changes and reject the PEP -- we > won't get another chance at getting this right. (That would also mean > that using the binary filesystem APIs will remain deprecated and will > eventually be disallowed; as long as the PEP remains accepted they are > undeprecated.) > > Congrats Steve! Thanks for the massive amount of work on the > implementation and the thinking that went into the design. Thanks > everyone else for their feedback. > > --Guido Thanks! I've updated the status. Now the process of bartering for code reviews begins :) Patches are at: PEP 528: http://bugs.python.org/issue1602 PEP 529: http://bugs.python.org/issue27781 Cheers, Steve From guido at python.org Wed Sep 7 14:18:34 2016 From: guido at python.org (Guido van Rossum) Date: Wed, 7 Sep 2016 11:18:34 -0700 Subject: [Python-Dev] PEP 526 (variable annotations) accepted provisionally Message-ID: I'm accepting PEP 526 provisionally. I am personally confident that this PEP is adding a useful new feature to the language: annotations that can be used by a wide variety of tools, whether off-line type checkers or frameworks that add runtime checking (e.g. traits or traitlets). The provisional status reflects the understanding that minor details of the proposed syntax and its runtime effects may still have to change based on experience during the 3.6 life cycle. (For example, maybe we end up not liking ClassVar, or maybe we'll decide we'll want to support `x, y, z: T` after all.) 
There's been some quite contentious discussion about the PEP, on and off python-dev, regarding how the mere presence of annotation syntax in the language will change the way people will see the language. My own experience using mypy and PyCharm has been quite different: annotations are a valuable addition for large code bases, and it's worth the effort to add them to large legacy code bases (think millions of lines of Python 2.7 code that needs to move to Python 3 by 2020). The effect of this has been that engineers using Python are happier and more confident that their code works than before, have an easier time spelunking code they don't know, and are less afraid of big refactorings (where conversion to Python 3 can be seen as the ultimate refactoring). I should blog about our experience at Dropbox; I hope the Zulip open source folks (not at Dropbox) will also blog about their experience. In the meantime you can read Daniel F. Moisset's three-part blog about adding annotations to pycodestyle (formerly pep8) here: http://www.machinalis.com/blog/a-day-with-mypy-part-1/ If you want to see a large open source code base that's annotated for mypy (with 97% coverage), I recommend looking at Zulip: https://github.com/zulip/zulip Finally, some of us are starting a new (informational) PEP to set expectations for how type checkers should make use of the annotation syntax standardized by PEP 484 and PEP 526. This is going to take more time, and new collaborators are welcome here: https://github.com/ilevkivskyi/peps/blob/new-pep/pep-0555.txt. (Mark, I really hope you'll accept the invitation to participate. Your experience would be most welcome.) 
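For readers who have not followed the PEP, the accepted syntax looks roughly like this (a minimal sketch adapted from the PEP's Starship example; the names are illustrative):

```python
from typing import ClassVar

class Starship:
    # PEP 526 variable annotations: attributes declared up front,
    # before __init__ is written.
    captain: str                  # instance attribute, no default
    damage: int = 0               # instance attribute with a default
    stats: ClassVar[dict] = {}    # explicitly a class-level attribute

    def __init__(self, captain: str) -> None:
        self.captain = captain

ship = Starship("Picard")
print(ship.damage)  # 0
# The annotations remain available to tools at runtime:
print(Starship.__annotations__["captain"])  # <class 'str'>
```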
-- --Guido van Rossum (python.org/~guido) From guido at python.org Wed Sep 7 14:31:16 2016 From: guido at python.org (Guido van Rossum) Date: Wed, 7 Sep 2016 11:31:16 -0700 Subject: [Python-Dev] Making PEP 3156 (asyncio) non-provisional Message-ID: PEP 3156 and the asyncio module it defines have been provisional for the lifetime of Python 3.4 and 3.5. The module is now quite mature. I propose that we end the provisional period and make asyncio subject to the usual backwards compatibility rules: new features only appear in "minor" releases (e.g. 3.6, 3.7) and all changes must be backward compatible. There's some wiggle room though: in some cases we may decide that a given "feature" was really "undefined/undocumented behavior" and then we can treat it as a bug and fix it (== change the behavior) in a bugfix release (or during the 3.6 beta period). There are some worries that Twisted might request some incompatible changes in order to obtain better interoperability. I've sent an email to Amber Brown asking for a clarification. There's also the issue of starttls, a feature that we know we'd like to add but don't have ready for 3.6b1. I think the right approach there is to provide an add-on package on PyPI that implements a starttls-capable Transport class, and when that code is sufficiently battle-tested we can add it to the stdlib (hopefully by 3.7). Such a package might end up having to copy portions of the asyncio implementation and/or use internal/undocumented APIs; that's fine because it is only meant as a temporary measure, and we can make it clear that just because the starttls package uses a certain internal API that doesn't mean that API is now public. A big advantage of having the initial starttls implementation outside the stdlib is that its release schedule can be much more frequent than that of the stdlib (== every 6 months), and a security issue in the starttls package won't require all the heavy guns of doing a security release of all of CPython. 
-- --Guido van Rossum (python.org/~guido) From guido at python.org Wed Sep 7 14:53:14 2016 From: guido at python.org (Guido van Rossum) Date: Wed, 7 Sep 2016 11:53:14 -0700 Subject: [Python-Dev] (some) C99 added to PEP 7 In-Reply-To: References: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> Message-ID: W00t! I will have to rewrite my brain. :-) On Wed, Sep 7, 2016 at 11:07 AM, Victor Stinner wrote: > 2016-09-07 10:56 GMT-07:00 Benjamin Peterson : >> To conclude our discussion about using C99 features, I've updated PEP 7 >> to allow the following features: >> - Standard integer types in ``<stdint.h>`` and ``<inttypes.h>`` >> - ``static inline`` functions >> - designated initializers >> - intermingled declarations >> - booleans > > Welcome to the future! > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From solipsis at pitrou.net Wed Sep 7 15:01:32 2016 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 7 Sep 2016 21:01:32 +0200 Subject: [Python-Dev] (some) C99 added to PEP 7 References: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> Message-ID: <20160907210132.4c7b865c@fsol> On Wed, 7 Sep 2016 11:53:14 -0700 Guido van Rossum wrote: > W00t! I will have to rewrite my brain. :-) ... Is your brain coded in C89? > > On Wed, Sep 7, 2016 at 11:07 AM, Victor Stinner > wrote: > > 2016-09-07 10:56 GMT-07:00 Benjamin Peterson : > >> To conclude our discussion about using C99 features, I've updated PEP 7 > >> to allow the following features: > >> - Standard integer types in ``<stdint.h>`` and ``<inttypes.h>`` > >> - ``static inline`` functions > >> - designated initializers > >> - intermingled declarations > >> - booleans > > > > Welcome to the future!
> > > Victor > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org > > > From rymg19 at gmail.com Wed Sep 7 15:07:07 2016 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Wed, 7 Sep 2016 14:07:07 -0500 Subject: [Python-Dev] (some) C99 added to PEP 7 In-Reply-To: <20160907210132.4c7b865c@fsol> References: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> <20160907210132.4c7b865c@fsol> Message-ID: Wonder if it's ever segfaulted... ...hey, I just figured out why we got Python 3!!!!! ;) -- Ryan [ERROR]: Your autotools build scripts are 200 lines longer than your program. Something's wrong. http://kirbyfan64.github.io/ On Sep 7, 2016 2:02 PM, "Antoine Pitrou" wrote: > On Wed, 7 Sep 2016 11:53:14 -0700 > Guido van Rossum wrote: > > W00t! I will have to rewrite my brain. :-) > > ... Is your brain coded in C89? > > > > > > On Wed, Sep 7, 2016 at 11:07 AM, Victor Stinner > > wrote: > > > 2016-09-07 10:56 GMT-07:00 Benjamin Peterson : > > >> To conclude our discussion about using C99 features, I've updated PEP 7 > > >> to allow the following features: > > >> - Standard integer types in ``<stdint.h>`` and ``<inttypes.h>`` > > >> - ``static inline`` functions > > >> - designated initializers > > >> - intermingled declarations > > >> - booleans > > > > > > Welcome to the future!
> > > > > > Victor > > > _______________________________________________ > > > Python-Dev mailing list > > > Python-Dev at python.org > > > https://mail.python.org/mailman/listinfo/python-dev > > > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > > > > > > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > rymg19%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srkunze at mail.de Wed Sep 7 15:18:20 2016 From: srkunze at mail.de (Sven R. Kunze) Date: Wed, 7 Sep 2016 21:18:20 +0200 Subject: [Python-Dev] A Pseudo-Post-Mortem (not dead yet) on my Multi-Core Python Project. In-Reply-To: References: Message-ID: <66380cb6-c612-bdfe-64ff-d257334c1e24@mail.de> Thanks for the post. :) There's some typo in the title and url. :/ :D On 07.09.2016 01:56, Eric Snow wrote: > I'm not anticipating much discussion on this, but wanted to present a > summary of my notes from the project I proposed last year and have > since tabled. > > http://ericsnowcurrently.blogspot.com/2016/09/solving-mutli-core-python.html > > -eric > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/srkunze%40mail.de From levkivskyi at gmail.com Wed Sep 7 16:40:43 2016 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Wed, 7 Sep 2016 22:40:43 +0200 Subject: [Python-Dev] PEP 526 (variable annotations) accepted provisionally In-Reply-To: References: Message-ID: Thank you Guido! :-) -- Ivan On 7 September 2016 at 20:18, Guido van Rossum wrote: > I'm accepting PEP 526 provisionally. 
> > I am personally confident that this PEP is adding a useful new feature > to the language: annotations that can be used by a wide variety of > tools, whether off-line type checkers or frameworks that add runtime > checking (e.g. traits or traitlets). > > The provisional status reflects the understanding that minor details > of the proposed syntax and its runtime effects may still have to > change based on experience during the 3.6 life cycle. (For example, > maybe we end up not liking ClassVar, or maybe we'll decide we'll want > to support `x, y, z: T` after all.) > > There's been some quite contentious discussion about the PEP, on and > off python-dev, regarding how the mere presence of annotation syntax > in the language will change the way people will see the language. My > own experience using mypy and PyCharm has been quite different: > annotations are a valuable addition for large code bases, and it's > worth the effort to add them to large legacy code bases (think > millions of lines of Python 2.7 code that needs to move to Python 3 by > 2020). The effect of this has been that engineers using Python are > happier and more confident that their code works than before, have an > easier time spelunking code they don't know, and are less afraid of > big refactorings (where conversion to Python 3 can be seen as the > ultimate refactoring). > > I should blog about our experience at Dropbox; I hope the Zulip open > source folks (not at Dropbox) will also blog about their experience. > In the meantime you can read Daniel F. 
Moisset's three-part blog about > adding annotations to pycodestyle (formerly pep8) here: > > http://www.machinalis.com/blog/a-day-with-mypy-part-1/ > > If you want to see a large open source code base that's annotated for > mypy (with 97% coverage), I recommend looking at Zulip: > https://github.com/zulip/zulip > > Finally, some of us are starting a new (informational) PEP to set > expectations for how type checkers should make use of the annotation > syntax standardized by PEP 484 and PEP 526. This is going to take more > time, and new collaborators are welcome here: > https://github.com/ilevkivskyi/peps/blob/new-pep/pep-0555.txt. (Mark, > I really hope you'll accept the invitation to participate. Your > experience would be most welcome.) > > -- > --Guido van Rossum (python.org/~guido) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rymg19 at gmail.com Wed Sep 7 16:52:49 2016 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Wed, 7 Sep 2016 15:52:49 -0500 Subject: [Python-Dev] PEP 526 (variable annotations) accepted provisionally In-Reply-To: References: Message-ID: :D -- Ryan [ERROR]: Your autotools build scripts are 200 lines longer than your program. Something's wrong. http://kirbyfan64.github.io/ On Sep 7, 2016 1:20 PM, "Guido van Rossum" wrote: > I'm accepting PEP 526 provisionally. > > I am personally confident that this PEP is adding a useful new feature > to the language: annotations that can be used by a wide variety of > tools, whether off-line type checkers or frameworks that add runtime > checking (e.g. traits or traitlets). > > The provisional status reflects the understanding that minor details > of the proposed syntax and its runtime effects may still have to > change based on experience during the 3.6 life cycle. (For example, > maybe we end up not liking ClassVar, or maybe we'll decide we'll want > to support `x, y, z: T` after all.)
> > There's been some quite contentious discussion about the PEP, on and > off python-dev, regarding how the mere presence of annotation syntax > in the language will change the way people will see the language. My > own experience using mypy and PyCharm has been quite different: > annotations are a valuable addition for large code bases, and it's > worth the effort to add them to large legacy code bases (think > millions of lines of Python 2.7 code that needs to move to Python 3 by > 2020). The effect of this has been that engineers using Python are > happier and more confident that their code works than before, have an > easier time spelunking code they don't know, and are less afraid of > big refactorings (where conversion to Python 3 can be seen as the > ultimate refactoring). > > I should blog about our experience at Dropbox; I hope the Zulip open > source folks (not at Dropbox) will also blog about their experience. > In the meantime you can read Daniel F. Moisset's three-part blog about > adding annotations to pycodestyle (formerly pep8) here: > > http://www.machinalis.com/blog/a-day-with-mypy-part-1/ > > If you want to see a large open source code base that's annotated for > mypy (with 97% coverage), I recommend looking at Zulip: > https://github.com/zulip/zulip > > Finally, some of us are starting a new (informational) PEP to set > expectations for how type checkers should make use of the annotation > syntax standardized by PEP 484 and PEP 526. This is going to take more > time, and new collaborators are welcome here: > https://github.com/ilevkivskyi/peps/blob/new-pep/pep-0555.txt. (Mark, > I really hope you'll accept the invitation to participate. Your > experience would be most welcome.) 
> > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > rymg19%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vgr255 at live.ca Wed Sep 7 17:23:57 2016 From: vgr255 at live.ca (Emanuel Barry) Date: Wed, 7 Sep 2016 21:23:57 +0000 Subject: [Python-Dev] Commits to migrated repos no longer sent to Python-checkins Message-ID: The repos which used to send to Python-checkins no longer do so since their respective migrations (devguide, peps). I don't know who's responsible for that, so I figured I'd post here. -Emanuel From brett at python.org Wed Sep 7 17:40:15 2016 From: brett at python.org (Brett Cannon) Date: Wed, 07 Sep 2016 21:40:15 +0000 Subject: [Python-Dev] Commits to migrated repos no longer sent to Python-checkins In-Reply-To: References: Message-ID: On Wed, 7 Sep 2016 at 14:24 Emanuel Barry wrote: > The repos which used to send to Python-checkins no longer do so since their > respective migrations (devguide, peps). I don't know who's responsible for > that, so I figured I'd post here. > If people want those back on then that could be arranged. I'm not sure, though, if it still makes sense having emails for every commit from three separate repositories going to the same mailing list. You can follow the commits through an atom feed, e.g https://github.com/python/peps/commits.atom. That means you could use something like IFTTT on your own to send you an email for each commit so you can track only the repositories you care about. That makes me think that it's worth even less for peps since those all have to be posted here anyway and the devguide doesn't affect people's future production deployments. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From vgr255 at live.ca Wed Sep 7 17:46:06 2016 From: vgr255 at live.ca (Emanuel Barry) Date: Wed, 7 Sep 2016 21:46:06 +0000 Subject: [Python-Dev] Commits to migrated repos no longer sent to Python-checkins In-Reply-To: References: Message-ID: Fair enough. I never really bothered to set up any complicated design to get commits, and my emails all get automatically sorted into folders so it doesn't matter which list it goes to. Although now that you mention it, I could simply subscribe to the GitHub repos and get the notifications for free :) -Emanuel From: Brett Cannon [mailto:brett at python.org] Sent: Wednesday, September 07, 2016 5:40 PM To: Emanuel Barry; python-dev at python.org Subject: Re: [Python-Dev] Commits to migrated repos no longer sent to Python-checkins On Wed, 7 Sep 2016 at 14:24 Emanuel Barry wrote: The repos which used to send to Python-checkins no longer do so since their respective migrations (devguide, peps). I don't know who's responsible for that, so I figured I'd post here. If people want those back on then that could be arranged. I'm not sure, though, if it still makes sense having emails for every commit from three separate repositories going to the same mailing list. You can follow the commits through an atom feed, e.g https://github.com/python/peps/commits.atom. That means you could use something like IFTTT on your own to send you an email for each commit so you can track only the repositories you care about. That makes me think that it's worth even less for peps since those all have to be posted here anyway and the devguide doesn't affect people's future production deployments. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From benjamin at python.org Wed Sep 7 17:50:44 2016 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 07 Sep 2016 14:50:44 -0700 Subject: [Python-Dev] (some) C99 added to PEP 7 In-Reply-To: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> References: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> Message-ID: <1473285044.1011171.718926313.5B1CAEC6@webmail.messagingengine.com> One more thing I forgot: C++-style line comments are kosher, too. On Wed, Sep 7, 2016, at 10:56, Benjamin Peterson wrote: > To conclude our discussion about using C99 features, I've updated PEP 7 > to allow the following features: > - Standard integer types in ```` and ```` > - ``static inline`` functions > - designated initializers > - intermingled declarations > - booleans > > I've been adding examples of these to 3.6 over the last few days to make > sure the buildbots will like it. > > https://github.com/python/peps/commit/b6efe6e06fa70e8933440da26474a804fb3edb6e > > Enjoy. From brett at python.org Wed Sep 7 18:43:46 2016 From: brett at python.org (Brett Cannon) Date: Wed, 07 Sep 2016 22:43:46 +0000 Subject: [Python-Dev] Commits to migrated repos no longer sent to Python-checkins In-Reply-To: References: Message-ID: On Wed, 7 Sep 2016 at 14:46 Emanuel Barry wrote: > Fair enough. I never really bothered to set up any complicated design to > get commits, and my emails all get automatically sorted into folders so it > doesn?t matter which list it goes to. Although now that you mention it, I > could simply subscribe to the GitHub repos and get the notifications for > free :) > Yep, you can always watch the projects as well. I just didn't suggest it as people have so far told me they viewed it as overkill when they just wanted commits. 
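Brett's atom-feed suggestion above is easy to script against with the stdlib alone. A minimal sketch follows; the feed XML is inlined as a hypothetical sample so the snippet runs offline, and in real use the string would instead come from ``urllib.request.urlopen()`` on a URL such as the commits.atom link Brett gives:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample mirroring the shape of an Atom commits feed; real data
# would come from e.g.
#   urllib.request.urlopen("https://github.com/python/peps/commits.atom").read()
SAMPLE_FEED = """\
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Recent Commits to peps:master</title>
  <entry><title>First sample commit</title></entry>
  <entry><title>Second sample commit</title></entry>
</feed>
"""

ATOM = "{http://www.w3.org/2005/Atom}"
root = ET.fromstring(SAMPLE_FEED)
# Collect the commit titles from every <entry> element in the feed:
titles = [entry.findtext(ATOM + "title") for entry in root.iter(ATOM + "entry")]
print(titles)  # ['First sample commit', 'Second sample commit']
```

From there, emailing yourself per commit (the IFTTT use case Brett mentions) is just a matter of polling and diffing the entry list.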
-Brett > > > -Emanuel > > > > *From:* Brett Cannon [mailto:brett at python.org] > *Sent:* Wednesday, September 07, 2016 5:40 PM > *To:* Emanuel Barry; python-dev at python.org > *Subject:* Re: [Python-Dev] Commits to migrated repos no longer sent to > Python-checkins > > > > > > On Wed, 7 Sep 2016 at 14:24 Emanuel Barry wrote: > > The repos which used to send to Python-checkins no longer do so since their > respective migrations (devguide, peps). I don't know who's responsible for > that, so I figured I'd post here. > > > > If people want those back on then that could be arranged. I'm not sure, > though, if it still makes sense having emails for every commit from three > separate repositories going to the same mailing list. > > > > You can follow the commits through an atom feed, e.g > https://github.com/python/peps/commits.atom. That means you could use > something like IFTTT on your own to send you an email for each commit so > you can track only the repositories you care about. That makes me think > that it's worth even less for peps since those all have to be posted here > anyway and the devguide doesn't affect people's future production > deployments. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vadmium+py at gmail.com Wed Sep 7 18:58:28 2016 From: vadmium+py at gmail.com (Martin Panter) Date: Wed, 7 Sep 2016 22:58:28 +0000 Subject: [Python-Dev] (some) C99 added to PEP 7 In-Reply-To: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> References: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> Message-ID: Thank you very much Benjamin. On 7 September 2016 at 17:56, Benjamin Peterson wrote: > To conclude our discussion about using C99 features, I've updated PEP 7 > to allow the following features: > - Standard integer types in ```` and ```` Perhaps PEP 7 should clarify if the optional types like uint32_t are allowed, or only C99 mandatory types like uint_fast32_t etc. 
I think more people will be familiar with the fixed-width uint32_t etc. I know they are mandatory in Posix, and presumably also Windows, so they may be okay. > - ``static inline`` functions > - designated initializers > - intermingled declarations > - booleans > > I've been adding examples of these to 3.6 over the last few days to make > sure the buildbots will like it. > > https://github.com/python/peps/commit/b6efe6e06fa70e8933440da26474a804fb3edb6e > > Enjoy. From benjamin at python.org Wed Sep 7 19:14:42 2016 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 07 Sep 2016 16:14:42 -0700 Subject: [Python-Dev] (some) C99 added to PEP 7 In-Reply-To: References: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> Message-ID: <1473290082.1711818.718987569.7F3C92FD@webmail.messagingengine.com> On Wed, Sep 7, 2016, at 15:58, Martin Panter wrote: > Thank you very much Benjamin. > > On 7 September 2016 at 17:56, Benjamin Peterson > wrote: > > To conclude our discussion about using C99 features, I've updated PEP 7 > > to allow the following features: > > - Standard integer types in ```` and ```` > > Perhaps PEP 7 should clarify if the optional types like uint32_t are > allowed, or only C99 mandatory types like uint_fast32_t etc. I think > more people will be familiar with the fixed-width uint32_t etc. I know > they are mandatory in Posix, and presumably also Windows, so they may > be okay. Yes, I will clarify we require the fixed-width types. From guido at python.org Wed Sep 7 19:16:31 2016 From: guido at python.org (Guido van Rossum) Date: Wed, 7 Sep 2016 16:16:31 -0700 Subject: [Python-Dev] Commits to migrated repos no longer sent to Python-checkins In-Reply-To: References: Message-ID: Let's see if watching the git repo (and filtering if necessary) covers this use case before we build more custom infrastructure. On Wed, Sep 7, 2016 at 2:46 PM, Emanuel Barry wrote: > Fair enough. 
I never really bothered to set up any complicated design to get > commits, and my emails all get automatically sorted into folders so it > doesn't matter which list it goes to. Although now that you mention it, I > could simply subscribe to the GitHub repos and get the notifications for > free :) > > > > -Emanuel > > > > From: Brett Cannon [mailto:brett at python.org] > Sent: Wednesday, September 07, 2016 5:40 PM > To: Emanuel Barry; python-dev at python.org > Subject: Re: [Python-Dev] Commits to migrated repos no longer sent to > Python-checkins > > > > > > On Wed, 7 Sep 2016 at 14:24 Emanuel Barry wrote: > > The repos which used to send to Python-checkins no longer do so since their > respective migrations (devguide, peps). I don't know who's responsible for > that, so I figured I'd post here. > > > > If people want those back on then that could be arranged. I'm not sure, > though, if it still makes sense having emails for every commit from three > separate repositories going to the same mailing list. > > > > You can follow the commits through an atom feed, e.g > https://github.com/python/peps/commits.atom. That means you could use > something like IFTTT on your own to send you an email for each commit so you > can track only the repositories you care about. That makes me think that > it's worth even less for peps since those all have to be posted here anyway > and the devguide doesn't affect people's future production deployments. 
> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) From guido at python.org Wed Sep 7 19:24:12 2016 From: guido at python.org (Guido van Rossum) Date: Wed, 7 Sep 2016 16:24:12 -0700 Subject: [Python-Dev] RFC: PEP 509: Add a private version to dict In-Reply-To: References: Message-ID: Folks, At the sprint both Victor and Yury have petitioned me to accept this PEP. I now agree. Let's do it! PEP 509 is hereby officially accepted. (Some implementation details have to be sorted out, but I need to unblock Victor before the sprint is over.) -- --Guido van Rossum (python.org/~guido) From rob.cliffe at btinternet.com Wed Sep 7 19:33:53 2016 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Thu, 8 Sep 2016 00:33:53 +0100 Subject: [Python-Dev] Make "global after use" a SyntaxError In-Reply-To: References: Message-ID: <397db523-ec7b-e6fd-89b9-3639adac6547@btinternet.com> I don't know if feedback from a single, humble Python programmer is of any value, but: +1 I do sometimes have global statements at the start of the bit of code to which they apply (rather than having all global statements agglomerated at the start of the function they are in). This seems to me consistent with good practice, whether for clarity or to make code cut-and-pasting easier. I cannot imagine ever wanting a global statement to be AFTER the first reference to one of the global variables it mentions. Best wishes. 
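The behaviour Rob endorses can be demonstrated directly. On interpreters where the patch from this thread's issue (bpo-27999, Python 3.6+) has landed, the sketch below shows the old SyntaxWarning has indeed become a SyntaxError; on older interpreters it would merely warn:

```python
# "global after use": assigning to a name before declaring it global.
bad_source = """
def f():
    x = 1
    global x
"""

try:
    compile(bad_source, "<example>", "exec")
    outcome = "compiled"
except SyntaxError as exc:
    outcome = "SyntaxError"
    print(exc.msg)  # e.g. "name 'x' is assigned to before global declaration"

print(outcome)  # SyntaxError on Python 3.6+
```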
Rob Cliffe On 07/09/2016 17:59, Guido van Rossum wrote: > +1 > > On Wed, Sep 7, 2016 at 7:10 AM, Ivan Levkivskyi wrote: >> Hi all, >> >> The documentation at https://docs.python.org/3/reference/simple_stmts.html >> says that: >> >> "Names listed in a global statement must not be used in the same code block >> textually preceding that global statement" >> >> But then later: >> >> "CPython implementation detail: The current implementation does not enforce >> the two restrictions, >> but programs should not abuse this freedom, as future implementations may >> enforce them..." >> >> Code like this >> >> def f(): >> x = 1 >> global x >> >> gives SyntaxWarning for several releases, maybe it is time to make it a >> SyntaxError? >> >> (I have opened an issue for this http://bugs.python.org/issue27999 I will >> submit a patch soon). >> >> -- >> Ivan >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/guido%40python.org >> > > From ncoghlan at gmail.com Mon Sep 5 02:58:14 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 5 Sep 2016 00:58:14 -0600 Subject: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8 In-Reply-To: References: Message-ID: On 5 September 2016 at 15:59, Steve Dower wrote: > +continue to default to ``locale.getpreferredencoding()`` (for text files) or > +plain bytes (for binary files). This only affects the encoding used when users > +pass a bytes object to Python where it is then passed to the operating system as > +a path name. For the three non-filesystem cases: I checked the situation for os.environb, and that's already unavailable on Windows (since os.supports_bytes_environ is False there), while sys.argv is apparently already handled correctly (i.e. always using the *W APIs). 
That means my only open question would be the handling of subprocess module calls (both with and without shell=True), since that currently works with binary arguments on *nix: >>> subprocess.call([b"python", b"-c", "print('??????')".encode("utf-8")]) ?????? 0 >>> subprocess.call(b"python -c '%s'" % 'print("??????")'.encode("utf-8"), shell=True) ?????? 0 While calling system native apps that way will still have many portability challenges, there are also plenty of cases where folks use sys.executable to launch new Python processes in a separate instance of the currently running interpreter, and it would be good if these changes brought cross-platform consistency to the handling of binary arguments here as well. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/michaelj.voss%40intel.com 
From ncoghlan at gmail.com Wed Sep 7 20:22:28 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 8 Sep 2016 10:22:28 +1000 Subject: [Python-Dev] PEP 526 (variable annotations) accepted provisionally In-Reply-To: References: Message-ID: On 8 September 2016 at 04:18, Guido van Rossum wrote: > There's been some quite contentious discussion about the PEP, on and > off python-dev, regarding how the mere presence of annotation syntax > in the language will change the way people will see the language. My > own experience using mypy and PyCharm has been quite different: > annotations are a valuable addition for large code bases, and it's > worth the effort to add them to large legacy code bases (think > millions of lines of Python 2.7 code that needs to move to Python 3 by > 2020). 
The effect of this has been that engineers using Python are > happier and more confident that their code works than before, have an > easier time spelunking code they don't know, and are less afraid of > big refactorings (where conversion to Python 3 can be seen as the > ultimate refactoring). I also don't think it hurts to make the language migration easier for folks coming from a C/C++/C#/Java background, and even if they initially use explicit hints more heavily than they need to given the inferencing engines in typecheckers, those same hints have the potential to enable more automated refactorings that simplify their code. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Sep 7 23:43:44 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 8 Sep 2016 13:43:44 +1000 Subject: [Python-Dev] Making PEP 3156 (asyncio) non-provisional In-Reply-To: References: Message-ID: On 8 September 2016 at 04:31, Guido van Rossum wrote: > There's also the issue of starttls, a feature that we know we'd like > to add but don't have ready for 3.6b1. I think the right approach > there is to provide an add-on package on PyPI that implements a > starttls-capable Transport class, and when that code is sufficiently > battle-tested we can add it to the stdlib (hopefully by 3.7). Such a > package might end up having to copy portions of the asyncio > implementation and/or use internal/undocumented APIs; that's fine > because it is only meant as a temporary measure, and we can make it > clear that just because the starttls package uses a certain internal > API that doesn't mean that API is now public. A big advantage of > having the initial starttls implementation outside the stdlib is that > its release schedule can be much more frequent than that of the stdlib > (== every 6 months), and a security issue in the starttls package > won't require all the heavy guns of doing a security release of all of > CPython. 
This could also be useful in general in terms of defining more clearly what kinds of access to asyncio internals are currently needed to implement 3rd party Transport classes, and perhaps lead to related future additions to the public API. Pending Amber's response, a definite thumbs up from me for removing the provisional caveat, and congratulations on a provisional experiment proving successful :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Thu Sep 8 00:08:11 2016 From: guido at python.org (Guido van Rossum) Date: Wed, 7 Sep 2016 21:08:11 -0700 Subject: [Python-Dev] Making PEP 3156 (asyncio) non-provisional In-Reply-To: References: Message-ID: On Wed, Sep 7, 2016 at 8:43 PM, Nick Coghlan wrote: > On 8 September 2016 at 04:31, Guido van Rossum wrote: >> There's also the issue of starttls, a feature that we know we'd like >> to add but don't have ready for 3.6b1. I think the right approach >> there is to provide an add-on package on PyPI that implements a >> starttls-capable Transport class, and when that code is sufficiently >> battle-tested we can add it to the stdlib (hopefully by 3.7). Such a >> package might end up having to copy portions of the asyncio >> implementation and/or use internal/undocumented APIs; that's fine >> because it is only meant as a temporary measure, and we can make it >> clear that just because the starttls package uses a certain internal >> API that doesn't mean that API is now public. A big advantage of >> having the initial starttls implementation outside the stdlib is that >> its release schedule can be much more frequent than that of the stdlib >> (== every 6 months), and a security issue in the starttls package >> won't require all the heavy guns of doing a security release of all of >> CPython. 
> > This could also be useful in general in terms of defining more clearly > what kinds of access to asyncio internals are currently needed to > implement 3rd party Transport classes, and perhaps lead to related > future additions to the public API. Well, the thing is, I don't ever want third party code to subclass any of the implementation classes in asyncio. Even with the best intentions, the implementation details just move around too much and having to worry about subclasses using a "protected" API would stifle improvements completely. A 3rd party Transport class will have to reimplement a bunch of Transport logic that already exists in the asyncio library, but with one exception (in _SelectorTransport.__repr__(), self._loop._selector is used to render the polling state) it doesn't use any internals from the event loop. I expect it would be a major design exercise to create a set of helper APIs or a standard base class that we feel comfortable with providing to transports; especially since creating a new transport often involves exploring new territory in some other domain as well (e.g. I remember that designing the subprocess transports was a complex task). For the add-on starttls package I propose to cheat, because it is on its way to become a stdlib API -- it just needs time to mature and I don't trust that the 3.6 beta period is enough for that. I want at least two independent developers (not Yury or myself) to build a protocol implementation based on the 3rd party starttls package before I'll feel comfortable that the API is right. For example -- do streams need starttls capability? It's somewhat scary because of the buffering, but maybe streams are the right abstraction for protocol implementations. Or maybe not. Nobody knows! > Pending Amber's response, a definite thumbs up from me for removing > the provisional caveat, and congratulations on a provisional > experiment proving successful :) Yup. And many new experiments are currently starting! 
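As a concrete picture of the streams API Guido is weighing for protocol implementations, here is a minimal echo round-trip. This is a sketch only, unrelated to the starttls design itself, and it uses the modern ``asyncio.run()`` entry point rather than the ``loop.run_until_complete()`` spelling current in 2016:

```python
import asyncio

async def handle(reader, writer):
    # Server side: read a request, write a transformed reply.
    data = await reader.read(100)
    writer.write(data.upper())
    await writer.drain()
    writer.close()

async def main():
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    # Client side: connect, send, signal EOF, read the reply back.
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"spam")
    await writer.drain()
    writer.write_eof()
    reply = await reader.read()
    writer.close()

    server.close()
    await server.wait_closed()
    return reply

print(asyncio.run(main()))  # b'SPAM'
```

The buffering Guido calls "somewhat scary" lives in those reader/writer objects, which is precisely why retrofitting starttls under them is the open design question.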
-- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Wed Sep 7 23:38:30 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 8 Sep 2016 13:38:30 +1000 Subject: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8 In-Reply-To: References: Message-ID: On 8 September 2016 at 03:37, Guido van Rossum wrote: > On Sun, Sep 4, 2016 at 11:58 PM, Nick Coghlan wrote: >> While calling system native apps that way will still have many >> portability challenges, there are also plenty of cases where folks use >> sys.executable to launch new Python processes in a separate instance >> of the currently running interpreter, and it would be good if these >> changes brought cross-platform consistency to the handling of binary >> arguments here as well. > > I checked with Steve and this is not supported anyway -- bytes > arguments (regardless of the value of shell) fail early with a > TypeError. That may be a bug but there's no backwards compatibility to > preserve here. (And apart from Python, few shell commands that work on > Unix make much sense on Windows, so Im also not particularly worried > about that particular example being non-portable -- it doesn't > represent a realistic concern.) Cool, I suspected "That already doesn't work, so you just have to use strings for cross-platform compatibility in those cases" would be the answer, and I think that's a sensible way to go. Even on *nix passing bytes arguments to subprocess is unusual, since anyone with Python 2 based habits will omit the "b" prefix from literals, and anything coming from the command line, environment, or other user input is supplied as text by default. Cheers, Nick. 
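Following Nick's conclusion, the portable spelling of the earlier bytes examples uses str arguments throughout. A sketch of the cross-platform pattern (``capture_output`` and ``text`` are the Python 3.7+ parameter names; this is illustrative code, not code from the thread):

```python
import subprocess
import sys

# str arguments work on both Windows and *nix; the bytes form from the
# earlier examples raises TypeError on Windows.
result = subprocess.run(
    [sys.executable, "-c", "print('spam')"],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())  # spam
```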
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From christian at python.org Thu Sep 8 05:10:29 2016 From: christian at python.org (Christian Heimes) Date: Thu, 8 Sep 2016 11:10:29 +0200 Subject: [Python-Dev] hg push segfault Message-ID: Hi, About 10 minutes ago I got a couple of remote segfaults from hg.python.org. They occurred during push and pull operations: $ hg push pushing to ssh://hg at hg.python.org/cpython remote: bash: line 1: 25019 Segmentation fault HGPUSHER=christian.heimes /srv/hg/bin/hg-ssh /srv/hg/repos/* abort: no suitable response from remote hg! It's fine again now. Can somebody look into the matter, please? Christian From christian at python.org Thu Sep 8 07:09:48 2016 From: christian at python.org (Christian Heimes) Date: Thu, 8 Sep 2016 13:09:48 +0200 Subject: [Python-Dev] cpython (3.5): supress coroutine warning when an exception is pending (#27968) In-Reply-To: <20160907154742.13824.97647.1274D2C5@psf.io> References: <20160907154742.13824.97647.1274D2C5@psf.io> Message-ID: On 2016-09-07 17:47, benjamin.peterson wrote: > https://hg.python.org/cpython/rev/234f758449f8 > changeset: 103223:234f758449f8 > branch: 3.5 > parent: 103213:7537ca1c2aaf > user: Benjamin Peterson > date: Wed Sep 07 08:46:59 2016 -0700 > summary: > supress coroutine warning when an exception is pending (#27968) > > files: > Objects/genobject.c | 27 +++++++++++++++------------ > 1 files changed, 15 insertions(+), 12 deletions(-) > > > diff --git a/Objects/genobject.c b/Objects/genobject.c > --- a/Objects/genobject.c > +++ b/Objects/genobject.c > @@ -21,7 +21,7 @@ > _PyGen_Finalize(PyObject *self) > { > PyGenObject *gen = (PyGenObject *)self; > - PyObject *res; > + PyObject *res = NULL; > PyObject *error_type, *error_value, *error_traceback; > > if (gen->gi_frame == NULL || gen->gi_frame->f_stacktop == NULL) > @@ -33,23 +33,26 @@ > > /* If `gen` is a coroutine, and if it was never awaited on, > issue a RuntimeWarning. 
*/ > - if (gen->gi_code != NULL > - && ((PyCodeObject *)gen->gi_code)->co_flags & CO_COROUTINE > - && gen->gi_frame->f_lasti == -1 > - && !PyErr_Occurred() > - && PyErr_WarnFormat(PyExc_RuntimeWarning, 1, > - "coroutine '%.50S' was never awaited", > - gen->gi_qualname)) { > - res = NULL; /* oops, exception */ > + if (gen->gi_code != NULL && > + ((PyCodeObject *)gen->gi_code)->co_flags & CO_COROUTINE && > + gen->gi_frame->f_lasti == -1) { > + if (!error_value) { > + PyErr_WarnFormat(PyExc_RuntimeWarning, 1, > + "coroutine '%.50S' was never awaited", > + gen->gi_qualname); > + } You don't check the return value of PyErr_WarnFormat(). It does not signal an exception in case warnings are turned into exceptions. Christian From benjamin at python.org Thu Sep 8 11:38:00 2016 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 08 Sep 2016 08:38:00 -0700 Subject: [Python-Dev] cpython (3.5): supress coroutine warning when an exception is pending (#27968) In-Reply-To: References: <20160907154742.13824.97647.1274D2C5@psf.io> Message-ID: <1473349080.136817.719715793.231990AD@webmail.messagingengine.com> On Thu, Sep 8, 2016, at 04:09, Christian Heimes wrote: > On 2016-09-07 17:47, benjamin.peterson wrote: > > https://hg.python.org/cpython/rev/234f758449f8 > > changeset: 103223:234f758449f8 > > branch: 3.5 > > parent: 103213:7537ca1c2aaf > > user: Benjamin Peterson > > date: Wed Sep 07 08:46:59 2016 -0700 > > summary: > > supress coroutine warning when an exception is pending (#27968) > > > > files: > > Objects/genobject.c | 27 +++++++++++++++------------ > > 1 files changed, 15 insertions(+), 12 deletions(-) > > > > > > diff --git a/Objects/genobject.c b/Objects/genobject.c > > --- a/Objects/genobject.c > > +++ b/Objects/genobject.c > > @@ -21,7 +21,7 @@ > > _PyGen_Finalize(PyObject *self) > > { > > PyGenObject *gen = (PyGenObject *)self; > > - PyObject *res; > > + PyObject *res = NULL; > > PyObject *error_type, *error_value, *error_traceback; > > > > if (gen->gi_frame 
== NULL || gen->gi_frame->f_stacktop == NULL) > > @@ -33,23 +33,26 @@ > > > > /* If `gen` is a coroutine, and if it was never awaited on, > > issue a RuntimeWarning. */ > > - if (gen->gi_code != NULL > > - && ((PyCodeObject *)gen->gi_code)->co_flags & CO_COROUTINE > > - && gen->gi_frame->f_lasti == -1 > > - && !PyErr_Occurred() > > - && PyErr_WarnFormat(PyExc_RuntimeWarning, 1, > > - "coroutine '%.50S' was never awaited", > > - gen->gi_qualname)) { > > - res = NULL; /* oops, exception */ > > + if (gen->gi_code != NULL && > > + ((PyCodeObject *)gen->gi_code)->co_flags & CO_COROUTINE && > > + gen->gi_frame->f_lasti == -1) { > > + if (!error_value) { > > + PyErr_WarnFormat(PyExc_RuntimeWarning, 1, > > + "coroutine '%.50S' was never awaited", > > + gen->gi_qualname); > > + } > > You don't check the return value of PyErr_WarnFormat(). It does not > signal an exception in case warnings are turned into exceptions. It's checked by PyErr_Occurred() several lines later. From benjamin at python.org Thu Sep 8 11:38:57 2016 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 08 Sep 2016 08:38:57 -0700 Subject: [Python-Dev] hg push segfault In-Reply-To: References: Message-ID: <1473349137.136951.719716281.081EA430@webmail.messagingengine.com> On Thu, Sep 8, 2016, at 02:10, Christian Heimes wrote: > Hi, > > About 10 minutes ago I got a couple of remote segfaults from > hg.python.org. They occurred during push and pull operations: > > $ hg push > pushing to ssh://hg at hg.python.org/cpython > remote: bash: line 1: 25019 Segmentation fault > HGPUSHER=christian.heimes /srv/hg/bin/hg-ssh /srv/hg/repos/* > abort: no suitable response from remote hg! > > It's fine again now. Can somebody look into the matter, please? A little bit after this the OOM killer started running, so it's probably related. 
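The warning that this genobject.c patch adjusts can be observed from pure Python. A minimal sketch — relying on CPython's reference counting to finalize the discarded coroutine promptly (``gc.collect()`` is added as a belt-and-braces step):

```python
import gc
import warnings

async def coro():
    return 1

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    coro()        # coroutine object created but never awaited
    gc.collect()  # ensure the discarded coroutine is finalized

# _PyGen_Finalize issued "coroutine 'coro' was never awaited" at finalization
assert any("never awaited" in str(w.message) for w in caught)
```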
From chris.barker at noaa.gov Thu Sep 8 12:02:35 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 8 Sep 2016 09:02:35 -0700 Subject: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8 In-Reply-To: References: Message-ID: On Wed, Sep 7, 2016 at 10:37 AM, Guido van Rossum wrote: > And apart from Python, few shell commands that work on > Unix make much sense on Windows, Does the (optional) addition of bash to Windows 10 have any impact on this? It'll be something that Windows developers can't count on their users having for a good while, if ever, but if you can control the deployment environment, then you might. And it would be VERY tempting for "posix-focused" developers that want to run their code on Windows. So it would be nice if the "new" approach worked well with bash on Windows. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Sep 8 12:09:20 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 8 Sep 2016 09:09:20 -0700 Subject: [Python-Dev] (some) C99 added to PEP 7 In-Reply-To: <1473290082.1711818.718987569.7F3C92FD@webmail.messagingengine.com> References: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> <1473290082.1711818.718987569.7F3C92FD@webmail.messagingengine.com> Message-ID: > > - Standard integer types in ```` and ```` > > Yes, I will clarify we require the fixed-width types. Does this mean that we might be able to have the built-in integer be based on int64_t now? so Windows64 and *nix64 will be the same? - CHB -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Thu Sep 8 12:17:08 2016 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 08 Sep 2016 09:17:08 -0700 Subject: [Python-Dev] (some) C99 added to PEP 7 In-Reply-To: References: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> <1473290082.1711818.718987569.7F3C92FD@webmail.messagingengine.com> Message-ID: <1473351428.1044442.719761769.61A1DA9E@webmail.messagingengine.com> On Thu, Sep 8, 2016, at 09:09, Chris Barker wrote: > > > - Standard integer types in ```` and ```` > > > > > > Yes, I will clarify we require the fixed-width types. > > > Does this mean that we might be able to have the built-in integer be > based > on int64_t now? so Windows64 and *nix64 will be the same? The builtin integer type (in Python 3) is variable length. From chris.barker at noaa.gov Thu Sep 8 12:30:26 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 8 Sep 2016 09:30:26 -0700 Subject: [Python-Dev] (some) C99 added to PEP 7 In-Reply-To: <1473351428.1044442.719761769.61A1DA9E@webmail.messagingengine.com> References: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> <1473290082.1711818.718987569.7F3C92FD@webmail.messagingengine.com> <1473351428.1044442.719761769.61A1DA9E@webmail.messagingengine.com> Message-ID: On Thu, Sep 8, 2016 at 9:17 AM, Benjamin Peterson wrote: > > Does this mean that we might be able to have the built-in integer be > > based > > on int64_t now? so Windows64 and *nix64 will be the same? > > The builtin integer type (in Python 3) is variable length. > indeed it is -- py2.7 also?? That's why I said "based on" -- under the hood, a C type is used, and IIUC, that type has been "long" for ages. 
And a long on Windows 64 (with the MS compiler anyway) is 32 bit, and a long on *nix (with the gnu compilers, at least) is 64 bits. This doesn't expose itself to pure python (and sys.maxint is now gone) but it does get exposed in the C API, and in particular, when passing data back and forth between numpy and pure python (numpy doesn't support an unlimited integer like python), or working with buffers or bytearrays, or whatever in Cython. Perhaps this is now a non-issue in py3 -- I honestly have not done any "real" computation work with py3 yet, but it sure is in 2.7 -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.com Thu Sep 8 12:39:10 2016 From: random832 at fastmail.com (Random832) Date: Thu, 08 Sep 2016 12:39:10 -0400 Subject: [Python-Dev] (some) C99 added to PEP 7 In-Reply-To: References: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> <1473290082.1711818.718987569.7F3C92FD@webmail.messagingengine.com> <1473351428.1044442.719761769.61A1DA9E@webmail.messagingengine.com> Message-ID: <1473352750.55334.719785993.5F1CB6DF@webmail.messagingengine.com> On Thu, Sep 8, 2016, at 12:30, Chris Barker wrote: > That's why I said "based on" -- under the hood, a C type is used, and > IIUC, that type has been "long" for ages. And a long on Windows 64 > (with the MS compiler anyway) is 32 bit, and a long on *nix (with the > gnu compilers, at least) is 64 bits. > > This doesn't expose itself to pure python (and sys.maxint is now gone) > but it does get exposed in the C API, and in particular, when passing > data back and forth between numpy and pure python (numpy doesn't > support an unlimited integer like python), or working with buffers or > bytearrays, or whatever in Cython. 
I'm not sure "the builtin integer type" was the right term for what you're referring to. You're talking about changing Py_ssize_t, right? From brett at python.org Thu Sep 8 12:57:04 2016 From: brett at python.org (Brett Cannon) Date: Thu, 08 Sep 2016 16:57:04 +0000 Subject: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8 In-Reply-To: References: Message-ID: On Thu, 8 Sep 2016 at 09:06 Chris Barker wrote: > On Wed, Sep 7, 2016 at 10:37 AM, Guido van Rossum > wrote: > >> And apart from Python, few shell commands that work on >> Unix make much sense on Windows, > > > Does the (optional) addition of bash to Windows 10 have any impact on this? > > It'll be something that Windows developers can't count on their users > having for a good while, if ever, but if you can control the deployment > environment, then you might. And it would be VERY tempting for > "posix-focused" developers that want to run their code on Windows. > > So it would be nice if the "new" approach worked well with bash on Windows. > Bash on Windows is just Linux, so it isn't affected by any of this. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Sep 8 13:00:50 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 9 Sep 2016 03:00:50 +1000 Subject: [Python-Dev] (some) C99 added to PEP 7 In-Reply-To: <1473352750.55334.719785993.5F1CB6DF@webmail.messagingengine.com> References: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> <1473290082.1711818.718987569.7F3C92FD@webmail.messagingengine.com> <1473351428.1044442.719761769.61A1DA9E@webmail.messagingengine.com> <1473352750.55334.719785993.5F1CB6DF@webmail.messagingengine.com> Message-ID: On Fri, Sep 9, 2016 at 2:39 AM, Random832 wrote: > On Thu, Sep 8, 2016, at 12:30, Chris Barker wrote: >> That's why I said "based on" -- under the hood, a C type is used, and >> IIUC, that type has been "long" for ages. 
And a long on Windows 64 >> (with the MS compiler anyway) is 32 bit, and a long on *nix (with the >> gnu compilers, at least) is 64 bits. >> >> This doesn't expose itself to pure python (and sys.maxint is now gone) >> but it does get exposed in the C API, and in particular, when passing >> data back and forth between numpy and pure python (numpy doesn't >> support an unlimited integer like python), or working with buffers or >> bytearrays, or whatever in Cython. > > I'm not sure "the builtin integer type" was the right term for what > you're referring to. > > You're talking about changing Py_ssize_t, right? There are a few places where the size of ssize_t becomes visible to a Python script. Python 3.6.0a4+ (default:4b64a049f451+, Aug 19 2016, 23:41:43) [GCC 6.1.1 20160802] on linux Type "help", "copyright", "credits" or "license" for more information. >>> x=1<<(1<<30) >>> x=1<<(1<<34) >>> x=1<<(1<<62) Traceback (most recent call last): File "", line 1, in MemoryError >>> x=1<<(1<<66) Traceback (most recent call last): File "", line 1, in OverflowError: Python int too large to convert to C ssize_t But I got the same result on 3.5.2 on Win 7 64-bit, so I'm not seeing a difference here - it seems that PyLong_AsSsize_t has the same limits on both platforms. ChrisA From guido at python.org Thu Sep 8 13:10:05 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 8 Sep 2016 10:10:05 -0700 Subject: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8 In-Reply-To: References: Message-ID: On Thu, Sep 8, 2016 at 9:57 AM, Brett Cannon wrote: > > > On Thu, 8 Sep 2016 at 09:06 Chris Barker wrote: >> >> On Wed, Sep 7, 2016 at 10:37 AM, Guido van Rossum >> wrote: >>> >>> And apart from Python, few shell commands that work on >>> Unix make much sense on Windows, >> >> >> Does the (optional) addition of bash to Windows 10 have any impact on >> this? 
>> >> It'll be something that Windows developers can't count on their users >> having for a good while, if ever, but if you can control the deployment >> environment, then you might. And it would be VERY tempting for >> "posix-focused" developers that want to run their code on Windows. >> >> So it would be nice if the "new" approach worked well with bash on >> Windows. > > > Bash on Windows is just Linux, so it isn't affected by any of this. I don't know what that sentence means. But anyways, if someone wants to try making subprocess work with bytes arguments on Windows work, that's just a bugfix, and you're not constrained by how it works on previous Python versions (since it doesn't work there at all). It might be wise to choose an interpretation that's consistent with other uses of command line arguments by Python on Windows though (rather than choosing to favor making just bash work the same as it works on Linux). -- --Guido van Rossum (python.org/~guido) From random832 at fastmail.com Thu Sep 8 13:35:14 2016 From: random832 at fastmail.com (Random832) Date: Thu, 08 Sep 2016 13:35:14 -0400 Subject: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8 In-Reply-To: References: Message-ID: <1473356114.67870.719829665.41852041@webmail.messagingengine.com> On Thu, Sep 8, 2016, at 13:10, Guido van Rossum wrote: > On Thu, Sep 8, 2016 at 9:57 AM, Brett Cannon wrote: > > Bash on Windows is just Linux, so it isn't affected by any of this. > > I don't know what that sentence means. It means that the so-called "bash" on windows 10 is actually a full Ubuntu system (running on, AIUI, a simulation of Linux kernel system calls), which will presumably also have its own python installation and use a UTF-8 locale, rather than one that runs "natively" on win32. 
If it's possible for a win32 version of python to call it as a subprocess, this may be an argument in favor of using UTF-8 - subject to finding out whether WSL does use UTF-8, whether it supports non-ASCII arguments from a Win32 CreateProcess at all, whether there's any way to pass non-UTF-8 arguments to it, etc. Incidentally, according to https://github.com/Microsoft/BashOnWindows/issues/2, pipes didn't work at all between WSL processes and Win32 processes until two weeks ago, so it's clear that these features are still evolving. > But anyways, if someone wants > to try making subprocess work with bytes arguments on Windows work, > that's just a bugfix, and you're not constrained by how it works on > previous Python versions (since it doesn't work there at all). It > might be wise to choose an interpretation that's consistent with other > uses of command line arguments by Python on Windows though (rather > than choosing to favor making just bash work the same as it works on > Linux). From chris.barker at noaa.gov Thu Sep 8 16:01:08 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 8 Sep 2016 13:01:08 -0700 Subject: [Python-Dev] (some) C99 added to PEP 7 In-Reply-To: <1473352750.55334.719785993.5F1CB6DF@webmail.messagingengine.com> References: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> <1473290082.1711818.718987569.7F3C92FD@webmail.messagingengine.com> <1473351428.1044442.719761769.61A1DA9E@webmail.messagingengine.com> <1473352750.55334.719785993.5F1CB6DF@webmail.messagingengine.com> Message-ID: On Thu, Sep 8, 2016 at 9:39 AM, Random832 wrote: > You're talking about changing Py_ssize_t, right? > wouldn't that be the pointer size? Is there a "long" in there anywhere in the integer implementation? 
My example is this: on OS-X, py3.5: import numpy as np In [9]: arr = np.array([1,2,3]) Out[10]: array([1, 2, 3]) In [11]: arr.dtype Out[11]: dtype('int64') I don't have py3 running on win64 anywhere right now, but in win64 py2, that would give you: dtype('int32') as it's a "long" under the hood (and I'm pretty sure that is not because of numpy code itself, but rather how Cpython is written/compiled) Does py3 already use int64? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Sep 8 16:05:17 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 8 Sep 2016 13:05:17 -0700 Subject: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8 In-Reply-To: <1473356114.67870.719829665.41852041@webmail.messagingengine.com> References: <1473356114.67870.719829665.41852041@webmail.messagingengine.com> Message-ID: On Thu, Sep 8, 2016 at 10:35 AM, Random832 wrote: > > It means that the so-called "bash" on windows 10 is actually a full > Ubuntu system (running on, AIUI, a simulation of Linux kernel system > calls), which will presumably also have its own python installation and > use a UTF-8 locale, rather than one that runs "natively" on win32. > yes -- it looks like one could run a "linux" build of python under the whole subsystem, which would presumably "look" jsu tlike LInux to Python. > If it's possible for a win32 version of python to call it as a > subprocess, But this is what I was referring too -- it may be way to early to know what the capabilities or implications are, but I'm hoping that "regular" windows programs can interact with the subsystem. So if we're making changes now, it would be nice to consider it if we can. 
> Incidentally, according to > https://github.com/Microsoft/BashOnWindows/issues/2, pipes didn't work > at all between WSL processes and Win32 processes until two weeks ago, so > it's clear that these features are still evolving. so it may indeed be way to early -- but if they DO work now -- pretty cool! Thanks, -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Sep 8 16:17:52 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 8 Sep 2016 13:17:52 -0700 Subject: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8 In-Reply-To: References: <1473356114.67870.719829665.41852041@webmail.messagingengine.com> Message-ID: On Thu, Sep 8, 2016 at 1:14 PM, Guido van Rossum wrote: > Please no. Let's not add unrelated new functionality in with this > already large change with not entirely understood consequences. > Fair enough -- this is clearly a really raw API so far. -CHB > > On Thu, Sep 8, 2016 at 1:05 PM, Chris Barker > wrote: > > On Thu, Sep 8, 2016 at 10:35 AM, Random832 > wrote: > >> > >> > >> It means that the so-called "bash" on windows 10 is actually a full > >> Ubuntu system (running on, AIUI, a simulation of Linux kernel system > >> calls), which will presumably also have its own python installation and > >> use a UTF-8 locale, rather than one that runs "natively" on win32. > > > > > > yes -- it looks like one could run a "linux" build of python under the > whole > > subsystem, which would presumably "look" jsu tlike LInux to Python. 
> > > > > >> > >> If it's possible for a win32 version of python to call it as a > >> subprocess, > > > > > > But this is what I was referring too -- it may be way to early to know > what > > the capabilities or implications are, but I'm hoping that "regular" > windows > > programs can interact with the subsystem. So if we're making changes > now, it > > would be nice to consider it if we can. > > > >> > >> Incidentally, according to > >> > >> https://github.com/Microsoft/BashOnWindows/issues/2, pipes didn't work > >> at all between WSL processes and Win32 processes until two weeks ago, so > >> it's clear that these features are still evolving. > > > > > > so it may indeed be way to early -- but if they DO work now -- pretty > cool! > > > > Thanks, > > > > -CHB > > > > > > -- > > > > Christopher Barker, Ph.D. > > Oceanographer > > > > Emergency Response Division > > NOAA/NOS/OR&R (206) 526-6959 voice > > 7600 Sand Point Way NE (206) 526-6329 fax > > Seattle, WA 98115 (206) 526-6317 main reception > > > > Chris.Barker at noaa.gov > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > > https://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > > > -- > --Guido van Rossum (python.org/~guido) > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Thu Sep 8 16:20:27 2016 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 8 Sep 2016 13:20:27 -0700 Subject: [Python-Dev] PEP 468 ready for pronouncement. 
Message-ID: see: https://github.com/python/peps/blob/master/pep-0468.txt With the introduction of the compact dict implementation for CPython 3.6, PEP 468 becomes no more than a change to the language reference. I've adjusted the PEP to specify use of an ordered mapping rather than exactly OrderedDict. Thanks! -eric From victor.stinner at gmail.com Thu Sep 8 16:22:46 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 8 Sep 2016 13:22:46 -0700 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered Message-ID: Hi, I pushed INADA Naoki's implementation of the "compact dict". The hash table now stores indices pointing to a new second table which contains keys and values: it adds one new level of indirection. The table of indices is "compact": use 1, 2, 4 or 8 bytes per indice depending on the size of the dictionary. Moreover, the keys/values table is also more compact: its size is 2/3 of the indices table. A nice "side effect" of compact dict is that the dictionary now preserves the insertion order. It means that keyword arguments can now be iterated by their creation order: Python 3.5.1 (default, Jun 20 2016, 14:48:22) >>> def func(**kw): print(kw.keys()) ... >>> func(a=1, b=2, c=3, d=4, e=5) dict_keys(['c', 'd', 'e', 'b', 'a']) # random order vs Python 3.6.0a4+ (default:d43f819caea7, Sep 8 2016, 13:05:34) >>> def func(**kw): print(kw.keys()) ... >>> func(a=1, b=2, c=3, d=4, e=5) dict_keys(['a', 'b', 'c', 'd', 'e']) # expected order It means that the main goal of the PEP 468 "Preserving the order of **kwargs in a function" is now implemented in Python 3.6: https://www.python.org/dev/peps/pep-0468/ But Eric Snow still wants to rephrase the PEP 468 to replace "OrderedDict" with "ordered mapping". 
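The "ordered mapping" vs. OrderedDict distinction matters because the two differ in equality semantics. A quick sketch (requires CPython 3.6+ for the plain-dict ordering behaviour shown):

```python
from collections import OrderedDict

d1 = dict([("a", 1), ("b", 2)])
d2 = dict([("b", 2), ("a", 1)])

assert list(d2) == ["b", "a"]              # insertion order is preserved
assert d1 == d2                            # plain dict equality ignores order
assert OrderedDict(d1) != OrderedDict(d2)  # OrderedDict equality does not
```

This is one reason the PEP text can promise an "ordered mapping" for **kwargs without promising OrderedDict exactly.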
For more information on compact dict, see: * http://bugs.python.org/issue27350 * https://mail.python.org/pipermail/python-dev/2016-June/145299.html * https://morepypy.blogspot.jp/2015/01/faster-more-memory-efficient-and-more.html *https://mail.python.org/pipermail/python-dev/2012-December/123028.html PyPy also implements the "compact dict", but it uses further "tricks" to preserve the order even if items are removed and then others are added. We might also implement these tricks in CPython, so dict will be ordered as well! -- Moreover, since Guido approved the PEP 509 "Add a private version to dict", I just pushed the implementation. The PEP 509 provides a C API (a dict version field) to implement efficient caches on namespaces. It might be used to implement a cache on builtins in Python 3.6 using Yury's opcode cache (stay tuned!). Victor From guido at python.org Thu Sep 8 16:14:46 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 8 Sep 2016 13:14:46 -0700 Subject: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8 In-Reply-To: References: <1473356114.67870.719829665.41852041@webmail.messagingengine.com> Message-ID: Please no. Let's not add unrelated new functionality in with this already large change with not entirely understood consequences. On Thu, Sep 8, 2016 at 1:05 PM, Chris Barker wrote: > On Thu, Sep 8, 2016 at 10:35 AM, Random832 wrote: >> >> >> It means that the so-called "bash" on windows 10 is actually a full >> Ubuntu system (running on, AIUI, a simulation of Linux kernel system >> calls), which will presumably also have its own python installation and >> use a UTF-8 locale, rather than one that runs "natively" on win32. > > > yes -- it looks like one could run a "linux" build of python under the whole > subsystem, which would presumably "look" jsu tlike LInux to Python. 
> > >> >> If it's possible for a win32 version of python to call it as a >> subprocess, > > > But this is what I was referring too -- it may be way to early to know what > the capabilities or implications are, but I'm hoping that "regular" windows > programs can interact with the subsystem. So if we're making changes now, it > would be nice to consider it if we can. > >> >> Incidentally, according to >> >> https://github.com/Microsoft/BashOnWindows/issues/2, pipes didn't work >> at all between WSL processes and Win32 processes until two weeks ago, so >> it's clear that these features are still evolving. > > > so it may indeed be way to early -- but if they DO work now -- pretty cool! > > Thanks, > > -CHB > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) From guido at python.org Thu Sep 8 16:33:37 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 8 Sep 2016 13:33:37 -0700 Subject: [Python-Dev] PEP 468 ready for pronouncement. In-Reply-To: References: Message-ID: Thanks Eric! The synergy between this PEP and the compact dict is amazing BTW. Clearly its time has come. Therefore: PEP 468 is now accepted. You may as well call it Final, since all we need to do now is update the docs. Congrats!! --Guido On Thu, Sep 8, 2016 at 1:20 PM, Eric Snow wrote: > see: https://github.com/python/peps/blob/master/pep-0468.txt > > With the introduction of the compact dict implementation for CPython > 3.6, PEP 468 becomes no more than a change to the language reference. 
> I've adjusted the PEP to specify use of an ordered mapping rather than > exactly OrderedDict. Thanks! > > -eric > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From steve.dower at python.org Thu Sep 8 16:35:21 2016 From: steve.dower at python.org (Steve Dower) Date: Thu, 8 Sep 2016 13:35:21 -0700 Subject: [Python-Dev] (some) C99 added to PEP 7 In-Reply-To: References: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> <1473290082.1711818.718987569.7F3C92FD@webmail.messagingengine.com> <1473351428.1044442.719761769.61A1DA9E@webmail.messagingengine.com> <1473352750.55334.719785993.5F1CB6DF@webmail.messagingengine.com> Message-ID: On 08Sep2016 1301, Chris Barker wrote: > On Thu, Sep 8, 2016 at 9:39 AM, Random832 > wrote: > > You're talking about changing Py_ssize_t, right? > > > wouldn't that be the pointer size? > > Is there a "long" in there anywhere in the integer implementation? > [SNIP] > Does py3 already use int64? Py3 has used a variable-length int representation for its entire existence. Python 3.0.1 (r301:69561, Feb 13 2009, 20:04:18) [MSC v.1500 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. 
>>> 2**1000 10715086071862673209484250490600018105614048117055336074437503883703510511249361224931983788156958581275946729175531468251871452856923140435984577574698574803934567774824230985421074605062371141877954182153046474983581941267398767559165543946077062914571196477686542167660429831652624386837205668069376 Cheers, Steve From guido at python.org Thu Sep 8 16:36:42 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 8 Sep 2016 13:36:42 -0700 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: Message-ID: Thanks Victor for the review and commit, and thanks Naoki for your truly amazing implementation work!! I've also accepted Eric's PEP 468. IIUC there's one small thing we might still want to change somewhere after 3.6b1 but before 3.6rc1: the order is not preserved when you delete some keys and then add some other keys. Apparently PyPy has come up with a clever solution for this, and we should probably adopt it, but it's probably best not to hurry that for 3.6b1. --Guido On Thu, Sep 8, 2016 at 1:22 PM, Victor Stinner wrote: > Hi, > > I pushed INADA Naoki's implementation of the "compact dict". The hash > table now stores indices pointing to a new second table which contains > keys and values: it adds one new level of indirection. The table of > indices is "compact": use 1, 2, 4 or 8 bytes per indice depending on > the size of the dictionary. Moreover, the keys/values table is also > more compact: its size is 2/3 of the indices table. > > A nice "side effect" of compact dict is that the dictionary now > preserves the insertion order. It means that keyword arguments can now > be iterated by their creation order: > > Python 3.5.1 (default, Jun 20 2016, 14:48:22) >>>> def func(**kw): print(kw.keys()) > ... 
>>>> func(a=1, b=2, c=3, d=4, e=5) > dict_keys(['c', 'd', 'e', 'b', 'a']) # random order > > vs > > Python 3.6.0a4+ (default:d43f819caea7, Sep 8 2016, 13:05:34) >>>> def func(**kw): print(kw.keys()) > ... >>>> func(a=1, b=2, c=3, d=4, e=5) > dict_keys(['a', 'b', 'c', 'd', 'e']) # expected order > > > It means that the main goal of the PEP 468 "Preserving the order of > **kwargs in a function" is now implemented in Python 3.6: > https://www.python.org/dev/peps/pep-0468/ > > But Eric Snow still wants to rephrase the PEP 468 to replace > "OrderedDict" with "ordered mapping". > > > For more information on compact dict, see: > > * http://bugs.python.org/issue27350 > * https://mail.python.org/pipermail/python-dev/2016-June/145299.html > * https://morepypy.blogspot.jp/2015/01/faster-more-memory-efficient-and-more.html > *https://mail.python.org/pipermail/python-dev/2012-December/123028.html > > > PyPy also implements the "compact dict", but it uses further "tricks" > to preserve the order even if items are removed and then others are > added. We might also implement these tricks in CPython, so dict will > be ordered as well! > > -- > > Moreover, since Guido approved the PEP 509 "Add a private version to > dict", I just pushed the implementation. > > The PEP 509 provides a C API (a dict version field) to implement > efficient caches on namespaces. It might be used to implement a cache > on builtins in Python 3.6 using Yury's opcode cache (stay tuned!). 
> > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From guido at python.org Thu Sep 8 16:53:27 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 8 Sep 2016 13:53:27 -0700 Subject: [Python-Dev] (some) C99 added to PEP 7 In-Reply-To: References: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> <1473290082.1711818.718987569.7F3C92FD@webmail.messagingengine.com> <1473351428.1044442.719761769.61A1DA9E@webmail.messagingengine.com> <1473352750.55334.719785993.5F1CB6DF@webmail.messagingengine.com> Message-ID: Can you guys get a room? There is absolutely no reason that all of python-dev needs to hear this. On Thu, Sep 8, 2016 at 1:35 PM, Steve Dower wrote: > On 08Sep2016 1301, Chris Barker wrote: >> >> On Thu, Sep 8, 2016 at 9:39 AM, Random832 > > wrote: >> >> You're talking about changing Py_ssize_t, right? >> >> >> wouldn't that be the pointer size? >> >> Is there a "long" in there anywhere in the integer implementation? >> [SNIP] >> Does py3 already use int64? > > > Py3 has used a variable-length int representation for its entire existence. > > Python 3.0.1 (r301:69561, Feb 13 2009, 20:04:18) [MSC v.1500 32 bit (Intel)] > on win32 > Type "help", "copyright", "credits" or "license" for more information. 
>>>> 2**1000 > 10715086071862673209484250490600018105614048117055336074437503883703510511249361224931983788156958581275946729175531468251871452856923140435984577574698574803934567774824230985421074605062371141877954182153046474983581941267398767559165543946077062914571196477686542167660429831652624386837205668069376 > > Cheers, > Steve > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From random832 at fastmail.com Thu Sep 8 17:16:48 2016 From: random832 at fastmail.com (Random832) Date: Thu, 08 Sep 2016 17:16:48 -0400 Subject: [Python-Dev] (some) C99 added to PEP 7 In-Reply-To: References: <1473270972.864925.718690353.0B268FC8@webmail.messagingengine.com> <1473290082.1711818.718987569.7F3C92FD@webmail.messagingengine.com> <1473351428.1044442.719761769.61A1DA9E@webmail.messagingengine.com> <1473352750.55334.719785993.5F1CB6DF@webmail.messagingengine.com> Message-ID: <1473369408.115797.720049297.032A315B@webmail.messagingengine.com> On Thu, Sep 8, 2016, at 16:01, Chris Barker wrote: > Is there a "long" in there anywhere in the integer implementation? The python 2 long type is the python 3 int type. The python 2 int type is gone. > I don't have py3 running on win64 anywhere right now, but in win64 py2, > that would give you: > > dtype('int32') > > as it's a "long" under the hood That's numpy's decision, there's nothing "built-in" about it. > (and I'm pretty sure that is not because of numpy code itself, but rather > how Cpython is written/compiled) Nope. https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/common.c#L105 Note that PyInt_Check doesn't exist anymore in Python 3. 
NumPy provides its own definition: https://github.com/numpy/numpy/blob/master/numpy/core/include/numpy/npy_3kcompat.h#L35 From victor.stinner at gmail.com Thu Sep 8 17:20:53 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 8 Sep 2016 14:20:53 -0700 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: Message-ID: 2016-09-08 13:36 GMT-07:00 Guido van Rossum : > IIUC there's one small thing we might still want to change somewhere > after 3.6b1 but before 3.6rc1: the order is not preserved when you > delete some keys and then add some other keys. Apparently PyPy has > come up with a clever solution for this, and we should probably adopt > it, but it's probably best not to hurry that for 3.6b1. Very good news: I was wrong, Raymond Hettinger confirmed that the Python 3.6 dict *already* preserves the items order in all cases. In short, Python 3.6 dict = Python 3.5 OrderedDict (in fact, OrderedDict has a few more methods). Victor From guido at python.org Thu Sep 8 17:25:07 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 8 Sep 2016 14:25:07 -0700 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: Message-ID: On Thu, Sep 8, 2016 at 1:36 PM, Guido van Rossum wrote: > IIUC there's one small thing we might still want to change somewhere > after 3.6b1 but before 3.6rc1: the order is not preserved when you > delete some keys and then add some other keys. Apparently PyPy has > come up with a clever solution for this, and we should probably adopt > it, but it's probably best not to hurry that for 3.6b1. It turns out I was mistaken. Naoki's implementation *does* preserve order across deletions. So we are already up to the standard set by PyPy. Go Naoki!! PS. As a consequence we're also going to change 520. Sit tight! 
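[Editor's note] The behaviour Guido describes — insertion order surviving deletions — is easy to check directly on a 3.6 interpreter (and from 3.7 on it is a language guarantee):

```python
# Insertion order survives deletion and later insertion in the
# compact dict discussed above.
d = dict.fromkeys("abcde")
del d["b"]
del d["d"]
d["f"] = None   # a new key goes to the end
d["b"] = None   # a re-added key also goes to the end, not its old slot
print(list(d))  # ['a', 'c', 'e', 'f', 'b']
```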
-- --Guido van Rossum (python.org/~guido) From guido at python.org Thu Sep 8 17:42:47 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 8 Sep 2016 14:42:47 -0700 Subject: [Python-Dev] Review request: issue 27350, compact ordered dict In-Reply-To: References: Message-ID: It's in! Congrats, and thanks for your great work! See longer post by Victor. On Sun, Aug 28, 2016 at 12:16 AM, INADA Naoki wrote: > On Sun, Aug 28, 2016 at 2:05 PM, Guido van Rossum wrote: >> Hopefully some core dev(s) can work on this during the core sprint, which is >> from Sept 5-9. >> > > OK. While I'm in Japan (UTC+9) and cannot join the sprint, I'll be > active as possible > while the sprint. > > Thank you! > > >> >> -- >> --Guido van Rossum (python.org/~guido) > > -- > INADA Naoki -- --Guido van Rossum (python.org/~guido) From rosuav at gmail.com Thu Sep 8 17:45:27 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 9 Sep 2016 07:45:27 +1000 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: Message-ID: On Fri, Sep 9, 2016 at 6:22 AM, Victor Stinner wrote: > A nice "side effect" of compact dict is that the dictionary now > preserves the insertion order. It means that keyword arguments can now > be iterated by their creation order: > This is pretty sweet! Of course, there are going to be 1172 complaints from people who's doctests have been broken, same as when hash randomization came in, but personally, I don't care. Thank you for landing this! ChrisA From nad at python.org Thu Sep 8 18:07:22 2016 From: nad at python.org (Ned Deily) Date: Thu, 8 Sep 2016 15:07:22 -0700 Subject: [Python-Dev] IMPORTANT: 3.6.0b1 and feature code cutoff 2016-09-12 12:00 UTC Message-ID: <004BF553-CD9D-48EA-B54A-0672F21678DF@python.org> Happy end of summer (northern hemisphere) or winter (southern)! Along with the changing of the seasons, the time has come to finish feature development for Python 3.6. 
As previously announced, this coming Monday marks the end of the alpha phase of the release cycle and the beginning of the beta phase. Up through the alpha phase, there has been unrestricted feature development; that ends as of beta 1. All feature code for 3.6.0 must be checked in by the b1 cutoff on Monday (unless you have contacted me and we have agreed on an extension). As was done during the 3.5 release cycle, we will create the 3.6 branch at b1 time. During the beta phase, the emphasis is on fixes for new features, fixes for all categories of bugs and regressions, and documentation fixes/updates. I will send out specific information for core committers next week after the creation of the b1 tag and the 3.6 branch. Beta releases are intended to give the wider community the opportunity to test new features and bug fixes and to prepare their projects to support the new feature release. We strongly encourage maintainers of third-party Python projects to test with 3.6 during the beta phase and report issues found to bugs.python.org as soon as possible. While the release will be feature complete entering the beta phase, it is possible that features may be modified or, in rare cases, deleted up until the start of the release candidate phase. Our goal is to have no changes after rc1. To achieve that, it will be extremely important to get as much exposure for 3.6 as possible during the beta phase. To recap: 2016-09-12 ~12:00 UTC: code snapshot for 3.6.0 beta 1 (feature code freeze, no new features) 2016-09-12 3.6 branch opens for 3.6.0; 3.7.0 feature development begins 2016-09-12 to 2016-12-04: 3.6.0 beta phase (bug, regression, and doc fixes, no new features) 2016-12-04 3.6.0 release candidate 1 (3.6.0 code freeze) 2016-12-16 3.6.0 release (3.6.0rc1 plus, if necessary, any dire emergency fixes) 2018-06 3.7.0 release (3.6.0 release + 18 months, details TBD) Thank you all for your great efforts so far on 3.6; it should be a great release!
--Ned https://www.python.org/dev/peps/pep-0494/ -- Ned Deily nad at python.org -- [] From songofacandy at gmail.com Thu Sep 8 23:59:42 2016 From: songofacandy at gmail.com (INADA Naoki) Date: Fri, 9 Sep 2016 12:59:42 +0900 Subject: [Python-Dev] Review request: issue 27350, compact ordered dict In-Reply-To: References: Message-ID: Thank you for all core devs! I'll polish the implementation until 3.6b2. On Fri, Sep 9, 2016 at 6:42 AM, Guido van Rossum wrote: > It's in! Congrats, and thanks for your great work! See longer post by Victor. > > On Sun, Aug 28, 2016 at 12:16 AM, INADA Naoki wrote: >> On Sun, Aug 28, 2016 at 2:05 PM, Guido van Rossum wrote: >>> Hopefully some core dev(s) can work on this during the core sprint, which is >>> from Sept 5-9. >>> >> >> OK. While I'm in Japan (UTC+9) and cannot join the sprint, I'll be >> active as possible >> while the sprint. >> >> Thank you! >> >> >>> >>> -- >>> --Guido van Rossum (python.org/~guido) >> >> -- >> INADA Naoki > > > > -- > --Guido van Rossum (python.org/~guido) -- INADA Naoki From timothy.c.delaney at gmail.com Fri Sep 9 01:33:21 2016 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Fri, 9 Sep 2016 15:33:21 +1000 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: Message-ID: On 9 September 2016 at 07:45, Chris Angelico wrote: > On Fri, Sep 9, 2016 at 6:22 AM, Victor Stinner > wrote: > > A nice "side effect" of compact dict is that the dictionary now > > preserves the insertion order. It means that keyword arguments can now > > be iterated by their creation order: > > > > This is pretty sweet! Of course, there are going to be 1172 complaints > from people who's doctests have been broken, same as when hash > randomization came in, but personally, I don't care. Thank you for > landing this! > Are sets also ordered by default now? None of the PEPs appear to mention it. 
Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Fri Sep 9 01:34:04 2016 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 08 Sep 2016 22:34:04 -0700 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: Message-ID: <1473399244.225934.720369745.6587BC20@webmail.messagingengine.com> On Thu, Sep 8, 2016, at 22:33, Tim Delaney wrote: > On 9 September 2016 at 07:45, Chris Angelico wrote: > > > On Fri, Sep 9, 2016 at 6:22 AM, Victor Stinner > > wrote: > > > A nice "side effect" of compact dict is that the dictionary now > > > preserves the insertion order. It means that keyword arguments can now > > > be iterated by their creation order: > > > > > > > This is pretty sweet! Of course, there are going to be 1172 complaints > > from people who's doctests have been broken, same as when hash > > randomization came in, but personally, I don't care. Thank you for > > landing this! > > > > Are sets also ordered by default now? None of the PEPs appear to mention > it. No. From timothy.c.delaney at gmail.com Fri Sep 9 01:38:03 2016 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Fri, 9 Sep 2016 15:38:03 +1000 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: <1473399244.225934.720369745.6587BC20@webmail.messagingengine.com> References: <1473399244.225934.720369745.6587BC20@webmail.messagingengine.com> Message-ID: On 9 September 2016 at 15:34, Benjamin Peterson wrote: > On Thu, Sep 8, 2016, at 22:33, Tim Delaney wrote: > > On 9 September 2016 at 07:45, Chris Angelico wrote: > > > > > On Fri, Sep 9, 2016 at 6:22 AM, Victor Stinner < > victor.stinner at gmail.com> > > > wrote: > > > > A nice "side effect" of compact dict is that the dictionary now > > > > preserves the insertion order. 
It means that keyword arguments can > now > > > > be iterated by their creation order: > > > > > > > > > > This is pretty sweet! Of course, there are going to be 1172 complaints > > > from people who's doctests have been broken, same as when hash > > > randomization came in, but personally, I don't care. Thank you for > > > landing this! > > > > > > > Are sets also ordered by default now? None of the PEPs appear to mention > > it. > > No. > That's an unfortunate inconsistency - I can imagine a lot of people making the assumption that if dict is ordered (esp. if documented as such) then sets would be as well. Might need a big red warning in the docs that it's not the case. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Fri Sep 9 04:55:41 2016 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 9 Sep 2016 10:55:41 +0200 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered References: Message-ID: <20160909105541.5b8a7ec8@fsol> On Thu, 8 Sep 2016 14:20:53 -0700 Victor Stinner wrote: > 2016-09-08 13:36 GMT-07:00 Guido van Rossum : > > IIUC there's one small thing we might still want to change somewhere > > after 3.6b1 but before 3.6rc1: the order is not preserved when you > > delete some keys and then add some other keys. Apparently PyPy has > > come up with a clever solution for this, and we should probably adopt > > it, but it's probably best not to hurry that for 3.6b1. > > Very good news: I was wrong, Raymond Hettinger confirmed that the > Python 3.6 dict *already* preserves the items order in all cases. In > short, Python 3.6 dict = Python 3.5 OrderedDict (in fact, OrderedDict > has a few more methods). Is it an official feature of the language or an implementation detail? Regards Antoine. 
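[Editor's note] Victor's remark quoted above — that the 3.6 dict equals the 3.5 OrderedDict except that "OrderedDict has a few more methods" — can be made concrete: once the underlying dict keeps insertion order, the extra methods are thin wrappers. The sketch below is illustrative only (the class name is invented, and `reversed()` on a plain dict needs Python 3.8+); it is not PyPy's or CPython's actual code:

```python
class SimpleOrderedDict(dict):
    """Rough sketch of OrderedDict's extra methods on top of an
    insertion-ordered dict. Illustrative only, not real stdlib code."""

    def move_to_end(self, key, last=True):
        value = self.pop(key)   # removing and re-adding moves a key to the end
        if last:
            self[key] = value
        else:
            # Moving to the front needs a rebuild with this simple model.
            items = list(self.items())
            self.clear()
            self[key] = value
            self.update(items)

    def popitem(self, last=True):
        # reversed() over a dict iterates keys in reverse insertion order.
        key = next(reversed(self)) if last else next(iter(self))
        return key, self.pop(key)

d = SimpleOrderedDict(a=1, b=2, c=3)
d.move_to_end("a")
print(list(d))   # ['b', 'c', 'a']
```

Real OrderedDict keeps a doubly linked list precisely so that `move_to_end(last=False)` and deletion-heavy workloads stay O(1); the point here is only that ordering itself no longer has to come from that list.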
From fijall at gmail.com Fri Sep 9 05:28:46 2016 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 9 Sep 2016 11:28:46 +0200 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: <20160909105541.5b8a7ec8@fsol> References: <20160909105541.5b8a7ec8@fsol> Message-ID: On Fri, Sep 9, 2016 at 10:55 AM, Antoine Pitrou wrote: > On Thu, 8 Sep 2016 14:20:53 -0700 > Victor Stinner wrote: >> 2016-09-08 13:36 GMT-07:00 Guido van Rossum : >> > IIUC there's one small thing we might still want to change somewhere >> > after 3.6b1 but before 3.6rc1: the order is not preserved when you >> > delete some keys and then add some other keys. Apparently PyPy has >> > come up with a clever solution for this, and we should probably adopt >> > it, but it's probably best not to hurry that for 3.6b1. >> >> Very good news: I was wrong, Raymond Hettinger confirmed that the >> Python 3.6 dict *already* preserves the items order in all cases. In >> short, Python 3.6 dict = Python 3.5 OrderedDict (in fact, OrderedDict >> has a few more methods). > > Is it an official feature of the language or an implementation detail? > > Regards > > Antoine. I think an implementation detail (although I'm not opposed to having it mentioned in the spec), but using the same/similar approach for sets should be relatively simple, no? PyPy has a pure-Python OrderedDict which is a wrapper around dict. For 3.6 it needs an adjustment since new methods showed up. From status at bugs.python.org Fri Sep 9 12:08:43 2016 From: status at bugs.python.org (Python tracker) Date: Fri, 9 Sep 2016 18:08:43 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20160909160843.7D5B656916@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2016-09-02 - 2016-09-09) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message.
Issues counts and deltas: open 5568 (-62) closed 34233 (+166) total 39801 (+104) Open issues with patches: 2402 Issues opened (64) ================== #17602: mingw: default sys.path calculations for windows platforms http://bugs.python.org/issue17602 reopened by martin.panter #26513: platform.win32_ver() broken in 2.7.11 http://bugs.python.org/issue26513 reopened by steve.dower #27778: PEP 524: Add os.getrandom() http://bugs.python.org/issue27778 reopened by martin.panter #27938: PyUnicode_AsEncodedString, PyUnicode_Decode: add fast-path for http://bugs.python.org/issue27938 reopened by haypo #27942: Default value identity regression http://bugs.python.org/issue27942 opened by Kay.Hayen #27943: pstats.Stats: missing the source OS setting argument in strip_ http://bugs.python.org/issue27943 opened by Jaroslav #27945: Various segfaults with dict http://bugs.python.org/issue27945 opened by tehybel #27946: issues in elementtree and elsewhere due to PyDict_GetItem http://bugs.python.org/issue27946 opened by tehybel #27948: f-strings: allow backslashes only in the string parts, not in http://bugs.python.org/issue27948 opened by eric.smith #27950: Superfluous messages when running make http://bugs.python.org/issue27950 opened by xiang.zhang #27951: the reply's additional "Re:" is ok http://bugs.python.org/issue27951 opened by saifmega #27952: Finish converting fixcid.py from regex to re http://bugs.python.org/issue27952 opened by serhiy.storchaka #27954: makesetup does not take into account subdirectories http://bugs.python.org/issue27954 opened by David D #27955: getrandom() syscall returning EPERM make the system unusable. 
http://bugs.python.org/issue27955 opened by iwings #27963: null poiter dereference in set_conversion_mode due uncheck _ct http://bugs.python.org/issue27963 opened by minhrau #27965: Automatic .py extension when saving with IDLE on OSX http://bugs.python.org/issue27965 opened by InfiniteHybrid #27966: PEP-397 documents incorrect registry path http://bugs.python.org/issue27966 opened by mhammond #27971: utf-16 decoding can't handle lone surrogates http://bugs.python.org/issue27971 opened by lazka #27972: Confusing error during cyclic yield http://bugs.python.org/issue27972 opened by Max von Tettenborn #27973: urllib.urlretrieve() fails on second ftp transfer http://bugs.python.org/issue27973 opened by Sohaib Ahmad #27976: Deprecate building with bundled copy of libffi on non-Darwin P http://bugs.python.org/issue27976 opened by zach.ware #27977: smtplib send_message does not correctly handle unicode address http://bugs.python.org/issue27977 opened by r.david.murray #27978: Executor#shutdown with timeout http://bugs.python.org/issue27978 opened by Patrik Dufresne #27979: Remove bundled libffi http://bugs.python.org/issue27979 opened by zach.ware #27981: Reference leak in fp_setreadl() of Parser/tokenizer.c http://bugs.python.org/issue27981 opened by haypo #27984: singledispatch register should typecheck its argument http://bugs.python.org/issue27984 opened by amogorkon #27986: make distclean clobbers Lib/plat-darwin/* http://bugs.python.org/issue27986 opened by zach.ware #27987: obmalloc's 8-byte alignment causes undefined behavior http://bugs.python.org/issue27987 opened by benjamin.peterson #27989: incomplete signature with help function using typing http://bugs.python.org/issue27989 opened by David E. Franco G. 
#27990: Provide a way to enable getrandom on Linux even when build sys http://bugs.python.org/issue27990 opened by ncoghlan #27991: In the argparse howto there is a misleading sentence about sto http://bugs.python.org/issue27991 opened by py.user #27992: In the argparse there is a misleading words about %(prog)s val http://bugs.python.org/issue27992 opened by py.user #27994: In the argparse help(argparse) prints weird comments instead o http://bugs.python.org/issue27994 opened by py.user #27995: Upgrade Python 3.4 to OpenSSL 1.0.2h on Windows http://bugs.python.org/issue27995 opened by scw #27997: ImportError should be raised consistently from import machiner http://bugs.python.org/issue27997 opened by eric.snow #27998: Remove support of bytes paths in os.scandir() http://bugs.python.org/issue27998 opened by serhiy.storchaka #27999: Make "global after use" a SyntaxError http://bugs.python.org/issue27999 opened by levkivskyi #28000: Build fails on AIX with _LINUX_SOURCE_COMPAT flag http://bugs.python.org/issue28000 opened by sarterm #28001: test.support.open_urlresource should work from an installed Py http://bugs.python.org/issue28001 opened by zach.ware #28002: Some f-strings do not round trip through Tools/parser/test_unp http://bugs.python.org/issue28002 opened by eric.smith #28004: Optimize bytes.join(sequence) http://bugs.python.org/issue28004 opened by haypo #28007: Bad .pyc files prevent import of otherwise valid .py files. 
http://bugs.python.org/issue28007 opened by eric.snow #28008: PEP 530, asynchronous comprehensions implementation http://bugs.python.org/issue28008 opened by yselivanov #28009: core logic of uuid.getnode() is broken for AIX - all versions http://bugs.python.org/issue28009 opened by Michael.Felt #28015: configure --with-lto builds fail when CC=clang on Linux, requi http://bugs.python.org/issue28015 opened by gregory.p.smith #28016: test_fileio fails on AIX http://bugs.python.org/issue28016 opened by sarterm #28018: Cross compilation fails in regen http://bugs.python.org/issue28018 opened by Chi Hsuan Yen #28019: itertools.count() falls back to fast (integer) mode when step http://bugs.python.org/issue28019 opened by StyXman #28022: SSL releated deprecation for 3.6 http://bugs.python.org/issue28022 opened by christian.heimes #28023: python-gdb.py must be updated for the new Python 3.6 compact d http://bugs.python.org/issue28023 opened by haypo #28024: fileinput causes RecursionErrors when dealing with large numbe http://bugs.python.org/issue28024 opened by josh.r #28025: Use IntEnum and IntFlags in ssl module http://bugs.python.org/issue28025 opened by christian.heimes #28028: Convert warnings to SyntaxWarning in parser http://bugs.python.org/issue28028 opened by serhiy.storchaka #28029: Replace and empty strings http://bugs.python.org/issue28029 opened by St??phane Henriot #28035: make buildbottest when configured --with-optimizations can cau http://bugs.python.org/issue28035 opened by gregory.p.smith #28036: Remove unused pysqlite_flush_statement_cache function http://bugs.python.org/issue28036 opened by berker.peksag #28037: Use sqlite3_get_autocommit() instead of setting Connection->in http://bugs.python.org/issue28037 opened by berker.peksag #28038: Remove com2ann script (will be in separate repo) http://bugs.python.org/issue28038 opened by levkivskyi #28039: x86 Tiger buildbot needs __future__ with_statement http://bugs.python.org/issue28039 opened by 
martin.panter #28040: compact dict : SystemError: returned NULL without setting an e http://bugs.python.org/issue28040 opened by mbussonn #28041: Inconsistent behavior: Get st_nlink from os.stat() and os.scan http://bugs.python.org/issue28041 opened by Mohanson Leaf #28042: Coverity Scan defects in new dict code http://bugs.python.org/issue28042 opened by christian.heimes #28043: Sane defaults for SSLContext options and ciphers http://bugs.python.org/issue28043 opened by christian.heimes #28044: Make the sidebar in the documentation follow the section autom http://bugs.python.org/issue28044 opened by batiste Most recent 15 issues with no replies (15) ========================================== #28043: Sane defaults for SSLContext options and ciphers http://bugs.python.org/issue28043 #28042: Coverity Scan defects in new dict code http://bugs.python.org/issue28042 #28041: Inconsistent behavior: Get st_nlink from os.stat() and os.scan http://bugs.python.org/issue28041 #28039: x86 Tiger buildbot needs __future__ with_statement http://bugs.python.org/issue28039 #28037: Use sqlite3_get_autocommit() instead of setting Connection->in http://bugs.python.org/issue28037 #28036: Remove unused pysqlite_flush_statement_cache function http://bugs.python.org/issue28036 #28035: make buildbottest when configured --with-optimizations can cau http://bugs.python.org/issue28035 #28028: Convert warnings to SyntaxWarning in parser http://bugs.python.org/issue28028 #28024: fileinput causes RecursionErrors when dealing with large numbe http://bugs.python.org/issue28024 #28018: Cross compilation fails in regen http://bugs.python.org/issue28018 #28015: configure --with-lto builds fail when CC=clang on Linux, requi http://bugs.python.org/issue28015 #28009: core logic of uuid.getnode() is broken for AIX - all versions http://bugs.python.org/issue28009 #28008: PEP 530, asynchronous comprehensions implementation http://bugs.python.org/issue28008 #28004: Optimize bytes.join(sequence) 
http://bugs.python.org/issue28004 #28002: Some f-strings do not round trip through Tools/parser/test_unp http://bugs.python.org/issue28002 Most recent 15 issues waiting for review (15) ============================================= #28044: Make the sidebar in the documentation follow the section autom http://bugs.python.org/issue28044 #28043: Sane defaults for SSLContext options and ciphers http://bugs.python.org/issue28043 #28040: compact dict : SystemError: returned NULL without setting an e http://bugs.python.org/issue28040 #28038: Remove com2ann script (will be in separate repo) http://bugs.python.org/issue28038 #28037: Use sqlite3_get_autocommit() instead of setting Connection->in http://bugs.python.org/issue28037 #28036: Remove unused pysqlite_flush_statement_cache function http://bugs.python.org/issue28036 #28029: Replace and empty strings http://bugs.python.org/issue28029 #28025: Use IntEnum and IntFlags in ssl module http://bugs.python.org/issue28025 #28022: SSL releated deprecation for 3.6 http://bugs.python.org/issue28022 #28019: itertools.count() falls back to fast (integer) mode when step http://bugs.python.org/issue28019 #28018: Cross compilation fails in regen http://bugs.python.org/issue28018 #28016: test_fileio fails on AIX http://bugs.python.org/issue28016 #28008: PEP 530, asynchronous comprehensions implementation http://bugs.python.org/issue28008 #28004: Optimize bytes.join(sequence) http://bugs.python.org/issue28004 #28000: Build fails on AIX with _LINUX_SOURCE_COMPAT flag http://bugs.python.org/issue28000 Top 10 most discussed issues (10) ================================= #23591: enum: Add Flags and IntFlags http://bugs.python.org/issue23591 22 msgs #27850: Remove 3DES from cipher list (sweet32 CVE-2016-2183) http://bugs.python.org/issue27850 20 msgs #27928: Add hashlib.scrypt http://bugs.python.org/issue27928 12 msgs #28022: SSL releated deprecation for 3.6 http://bugs.python.org/issue28022 12 msgs #27350: Compact and ordered dict 
http://bugs.python.org/issue27350 11 msgs #27744: Add AF_ALG (Linux Kernel crypto) to socket module http://bugs.python.org/issue27744 11 msgs #27781: Change sys.getfilesystemencoding() on Windows to UTF-8 http://bugs.python.org/issue27781 11 msgs #1602: windows console doesn't print or input Unicode http://bugs.python.org/issue1602 10 msgs #25856: The __module__ attribute of non-heap classes is not interned http://bugs.python.org/issue25856 10 msgs #27137: Python implementation of `functools.partial` is not a class http://bugs.python.org/issue27137 10 msgs Issues closed (159) =================== #5575: Add env vars for controlling building sqlite, hashlib and ssl http://bugs.python.org/issue5575 closed by christian.heimes #6135: subprocess seems to use local encoding and give no choice http://bugs.python.org/issue6135 closed by steve.dower #6766: Cannot modify dictionaries inside dictionaries using Managers http://bugs.python.org/issue6766 closed by berker.peksag #7672: _ssl module overwrites existing thread safety callbacks http://bugs.python.org/issue7672 closed by christian.heimes #7836: Add /usr/sfw/lib to OpenSSL search path for Solaris. 
    http://bugs.python.org/issue7836  closed by  christian.heimes
#8106: SSL session management
    http://bugs.python.org/issue8106  closed by  christian.heimes
#9423: Error in urllib2.do_open(self, http_class, req)
    http://bugs.python.org/issue9423  closed by  christian.heimes
#9743: __call__.__call__ chain cause crash when long enough
    http://bugs.python.org/issue9743  closed by  christian.heimes
#10274: imaplib should provide a means to validate a remote server ssl
    http://bugs.python.org/issue10274  closed by  christian.heimes
#11551: test_dummy_thread.py test coverage improvement
    http://bugs.python.org/issue11551  closed by  orsenthil
#11620: winsound.PlaySound() with SND_MEMORY should accept bytes inste
    http://bugs.python.org/issue11620  closed by  python-dev
#11734: Add half-float (16-bit) support to struct module
    http://bugs.python.org/issue11734  closed by  mark.dickinson
#12553: Add support for using a default CTE of '8bit' to MIMEText
    http://bugs.python.org/issue12553  closed by  r.david.murray
#12754: Add alternative random number generators
    http://bugs.python.org/issue12754  closed by  haypo
#13856: xmlrpc / httplib changes to allow for certificate verification
    http://bugs.python.org/issue13856  closed by  christian.heimes
#15016: Add special case for latin messages in email.mime.text
    http://bugs.python.org/issue15016  closed by  r.david.murray
#15272: pkgutil.find_loader accepts invalid module names
    http://bugs.python.org/issue15272  closed by  eric.snow
#15352: importlib.h should be regenerated when the marshaling code cha
    http://bugs.python.org/issue15352  closed by  eric.snow
#15578: Crash when modifying sys.modules during import
    http://bugs.python.org/issue15578  closed by  eric.snow
#15631: Python 3.3/3.4 installation issue on OpenSUSE lib/lib64 folder
    http://bugs.python.org/issue15631  closed by  christian.heimes
#16334: Faster unicode-escape and raw-unicode-escape codecs
    http://bugs.python.org/issue16334  closed by  haypo
#16763: test_ssl with connect_ex don't handle unreachable server corre
    http://bugs.python.org/issue16763  closed by  christian.heimes
#16764: Make zlib accept keyword-arguments
    http://bugs.python.org/issue16764  closed by  martin.panter
#17096: the system keyring should be used instead of ~/.pypirc
    http://bugs.python.org/issue17096  closed by  christian.heimes
#17121: SSH upload for distutils
    http://bugs.python.org/issue17121  closed by  christian.heimes
#17211: pkgutil.iter_modules and walk_packages should return a namedtu
    http://bugs.python.org/issue17211  closed by  eric.snow
#17884: Try to reuse stdint.h types like int32_t
    http://bugs.python.org/issue17884  closed by  benjamin.peterson
#18029: Python SSL support is missing from SPARC build
    http://bugs.python.org/issue18029  closed by  christian.heimes
#18550: internal_setblocking() doesn't check return value of fcntl()
    http://bugs.python.org/issue18550  closed by  christian.heimes
#18844: allow weights in random.choice
    http://bugs.python.org/issue18844  closed by  rhettinger
#19057: Sometimes urllib2 raises URLError when trying POST with httpS
    http://bugs.python.org/issue19057  closed by  christian.heimes
#19108: Benchmark runner tries to execute external Python command and
    http://bugs.python.org/issue19108  closed by  scoder
#20050: distutils should check PyPI certs when connecting to it
    http://bugs.python.org/issue20050  closed by  christian.heimes
#20328: mailbox: add method to delete mailbox
    http://bugs.python.org/issue20328  closed by  r.david.murray
#20469: ssl.getpeercert() should include extensions
    http://bugs.python.org/issue20469  closed by  christian.heimes
#20784: 'collections.abc' is no longer defined when collections is imp
    http://bugs.python.org/issue20784  closed by  christian.heimes
#20842: pkgutil docs should reference glossary terms not PEP 302
    http://bugs.python.org/issue20842  closed by  orsenthil
#20924: openssl init 100% CPU utilization on Windows
    http://bugs.python.org/issue20924  closed by  christian.heimes
#21062: Evalute all import-related modules for best practices
    http://bugs.python.org/issue21062  closed by  brett.cannon
#21201: Uninformative error message in multiprocessing.Manager()
    http://bugs.python.org/issue21201  closed by  davin
#21250: sqlite3 doesn't have unit tests for 'insert or [algorithm]' fu
    http://bugs.python.org/issue21250  closed by  berker.peksag
#21324: dbhash/bsddb leaks random memory fragments to a database
    http://bugs.python.org/issue21324  closed by  christian.heimes
#21830: ssl.wrap_socket fails on Windows 7 when specifying ca_certs
    http://bugs.python.org/issue21830  closed by  christian.heimes
#22233: http.client splits headers on non-\r\n characters
    http://bugs.python.org/issue22233  closed by  r.david.murray
#22252: ssl blocking IO errors should inherit BlockingIOError
    http://bugs.python.org/issue22252  closed by  christian.heimes
#22301: smtplib.SMTP.starttls' documentation is just confusing
    http://bugs.python.org/issue22301  closed by  christian.heimes
#23065: Pyhton27.dll at SysWOW64 not updated when updating Python 2.7.
    http://bugs.python.org/issue23065  closed by  christian.heimes
#23085: update internal libffi copy to 3.2.1
    http://bugs.python.org/issue23085  closed by  zach.ware
#23177: test_ssl: failures on OpenBSD with LibreSSL
    http://bugs.python.org/issue23177  closed by  christian.heimes
#23226: Add float linspace recipe to docs
    http://bugs.python.org/issue23226  closed by  rhettinger
#23274: make_ssl_data.py in Python 2.7.9 needs Python 3 to run
    http://bugs.python.org/issue23274  closed by  christian.heimes
#23531: SSL operations cause entire process to hang
    http://bugs.python.org/issue23531  closed by  christian.heimes
#23843: ssl.wrap_socket doesn't handle virtual TLS hosts
    http://bugs.python.org/issue23843  closed by  christian.heimes
#23845: test_ssl: fails on recent libressl with SSLV3_ALERT_HANDSHAKE_
    http://bugs.python.org/issue23845  closed by  christian.heimes
#24254: Make class definition namespace ordered by default
    http://bugs.python.org/issue24254  closed by  eric.snow
#24277: Take the new email package features out of provisional status
    http://bugs.python.org/issue24277  closed by  r.david.murray
#24542: ssl - SSL_OP_NO_TICKET not reimplemented
    http://bugs.python.org/issue24542  closed by  christian.heimes
#24545: Issue with ssl package
    http://bugs.python.org/issue24545  closed by  christian.heimes
#24930: test_ssl: try more protocols in test_options()
    http://bugs.python.org/issue24930  closed by  christian.heimes
#25158: Python 3.2.2 and 3.5.0 Do not seem compatible with OpenSSL 1.0
    http://bugs.python.org/issue25158  closed by  christian.heimes
#25387: sound_msgbeep doesn't check the return value of MessageBeep
    http://bugs.python.org/issue25387  closed by  zach.ware
#25405: User install of 3.5 removes py.exe from C:\Windows
    http://bugs.python.org/issue25405  closed by  steve.dower
#25437: Issue with ftplib.FTP_TLS and server forcing SSL connection re
    http://bugs.python.org/issue25437  closed by  christian.heimes
#25596: Use scandir() to speed up the glob module
    http://bugs.python.org/issue25596  closed by  serhiy.storchaka
#25761: Improve unpickling errors handling
    http://bugs.python.org/issue25761  closed by  serhiy.storchaka
#25825: AIX shared library extension modules installation broken
    http://bugs.python.org/issue25825  closed by  martin.panter
#25883: python 2.7.11 mod_wsgi regression on windows
    http://bugs.python.org/issue25883  closed by  christian.heimes
#26020: set_display evaluation order doesn't match documented behaviou
    http://bugs.python.org/issue26020  closed by  rhettinger
#26032: Use scandir() to speed up pathlib globbing
    http://bugs.python.org/issue26032  closed by  serhiy.storchaka
#26040: Improve coverage and rigour of test.test_math
    http://bugs.python.org/issue26040  closed by  mark.dickinson
#26058: PEP 509: Add ma_version to PyDictObject
    http://bugs.python.org/issue26058  closed by  haypo
#26209: TypeError in smtpd module with string arguments
    http://bugs.python.org/issue26209  closed by  r.david.murray
#26307: no PGO for built-in modules with `make profile-opt`
    http://bugs.python.org/issue26307  closed by  gregory.p.smith
#26359: CPython build options for out-of-the box performance
    http://bugs.python.org/issue26359  closed by  gregory.p.smith
#26470: Make OpenSSL module compatible with OpenSSL 1.1.0
    http://bugs.python.org/issue26470  closed by  christian.heimes
#26667: Update importlib to accept pathlib.Path objects
    http://bugs.python.org/issue26667  closed by  brett.cannon
#26798: add BLAKE2 to hashlib
    http://bugs.python.org/issue26798  closed by  christian.heimes
#26982: Clarify forward annotations in PEP 484
    http://bugs.python.org/issue26982  closed by  gvanrossum
#27078: Make f'' strings faster than .format: BUILD_STRING opcode?
    http://bugs.python.org/issue27078  closed by  serhiy.storchaka
#27106: configparser.__all__ is incomplete
    http://bugs.python.org/issue27106  closed by  martin.panter
#27179: subprocess uses wrong encoding on Windows
    http://bugs.python.org/issue27179  closed by  steve.dower
#27279: Add random.cryptorandom() and random.pseudorandom, deprecate o
    http://bugs.python.org/issue27279  closed by  ncoghlan
#27288: secrets should use getrandom() on Linux
    http://bugs.python.org/issue27288  closed by  ncoghlan
#27293: Summarize issues related to urandom, getrandom etc in secrets
    http://bugs.python.org/issue27293  closed by  ncoghlan
#27331: Add a policy argument to email.mime.MIMEBase
    http://bugs.python.org/issue27331  closed by  r.david.murray
#27355: Strip out the last lingering vestiges of Windows CE support
    http://bugs.python.org/issue27355  closed by  larry
#27364: Deprecate invalid escape sequences in str/bytes
    http://bugs.python.org/issue27364  closed by  ebarry
#27407: prepare_ssl.py missing in PCBuild folder
    http://bugs.python.org/issue27407  closed by  python-dev
#27427: Add new math module tests
    http://bugs.python.org/issue27427  closed by  mark.dickinson
#27445: Charset instance not passed to set_payload()
    http://bugs.python.org/issue27445  closed by  berker.peksag
#27570: Avoid memcpy(..., NULL, 0) etc calls
    http://bugs.python.org/issue27570  closed by  martin.panter
#27630: Generator._encoded_EMTPY misspelling in email package
    http://bugs.python.org/issue27630  closed by  r.david.murray
#27691: X509 cert with GEN_RID subject alt name causes SytemError
    http://bugs.python.org/issue27691  closed by  christian.heimes
#27731: Opt-out of MAX_PATH on Windows 10
    http://bugs.python.org/issue27731  closed by  steve.dower
#27748: Simplify test_winsound
    http://bugs.python.org/issue27748  closed by  python-dev
#27756: Add pyd icon for 3.6
    http://bugs.python.org/issue27756  closed by  steve.dower
#27776: PEP 524: Make os.urandom() blocking on Linux
    http://bugs.python.org/issue27776  closed by  haypo
#27811: _PyGen_Finalize() should not fail with an exception
    http://bugs.python.org/issue27811  closed by  python-dev
#27812: PyFrameObject.f_gen can be left pointing to a dangling generat
    http://bugs.python.org/issue27812  closed by  python-dev
#27853: Add title to examples in importlib docs
    http://bugs.python.org/issue27853  closed by  brett.cannon
#27866: ssl: get list of enabled ciphers
    http://bugs.python.org/issue27866  closed by  berker.peksag
#27868: Unconditionally state when a build succeeds
    http://bugs.python.org/issue27868  closed by  brett.cannon
#27872: Update os/os.path docs to mention path-like object support
    http://bugs.python.org/issue27872  closed by  brett.cannon
#27877: Add recipe for "valueless" Enums to docs
    http://bugs.python.org/issue27877  closed by  berker.peksag
#27881: Fix possible bugs when setting sqlite3.Connection.isolation_le
    http://bugs.python.org/issue27881  closed by  berker.peksag
#27883: Update sqlite version for Windows build
    http://bugs.python.org/issue27883  closed by  zach.ware
#27905: Add documentation for typing.Type
    http://bugs.python.org/issue27905  closed by  gvanrossum
#27911: Unnecessary error checks in exec_builtin_or_dynamic
    http://bugs.python.org/issue27911  closed by  brett.cannon
#27915: Use 'ascii' instead of 'us-ascii' to bypass lookup machinery
    http://bugs.python.org/issue27915  closed by  haypo
#27918: Running test suites without gui but still having windows flash
    http://bugs.python.org/issue27918  closed by  terry.reedy
#27921: f-strings: do not allow backslashes
    http://bugs.python.org/issue27921  closed by  python-dev
#27930: logging's QueueListener drops log messages
    http://bugs.python.org/issue27930  closed by  python-dev
#27935: logging level FATAL missing in _nameToLevel
    http://bugs.python.org/issue27935  closed by  python-dev
#27936: Inconsistent round behavior between float and int
    http://bugs.python.org/issue27936  closed by  rhettinger
#27937: logging.getLevelName microoptimization
    http://bugs.python.org/issue27937  closed by  python-dev
#27941: Bad error message from Decimal('garbage') across the py3 range
    http://bugs.python.org/issue27941  closed by  skrah
#27944: two hotshot module issues
    http://bugs.python.org/issue27944  closed by  python-dev
#27947: Trailing backslash in raw string format causes EOL
    http://bugs.python.org/issue27947  closed by  tim.peters
#27949: Fix description in bytes literal doc
    http://bugs.python.org/issue27949  closed by  xiang.zhang
#27953: math.tan has poor accuracy near pi/2 on OS X Tiger
    http://bugs.python.org/issue27953  closed by  mark.dickinson
#27956: optimize dict_traverse a bit
    http://bugs.python.org/issue27956  closed by  python-dev
#27957: minor typo in importlib docs
    http://bugs.python.org/issue27957  closed by  python-dev
#27958: 'zlib compression' not found in set(['RLE', 'ZLIB', None])
    http://bugs.python.org/issue27958  closed by  christian.heimes
#27959: Add 'oem' encoding
    http://bugs.python.org/issue27959  closed by  steve.dower
#27960: Distutils tests are broken in 3.4
    http://bugs.python.org/issue27960  closed by  jason.coombs
#27961: remove support for platforms without "long long"
    http://bugs.python.org/issue27961  closed by  python-dev
#27962: null poiter dereference in set_conversion_mode due uncheck _ct
    http://bugs.python.org/issue27962  closed by  eryksun
#27964: Add random.shuffled
    http://bugs.python.org/issue27964  closed by  rhettinger
#27967: Remove unused variables causing compile warnings in sqlite3 mo
    http://bugs.python.org/issue27967  closed by  python-dev
#27968: test_coroutines generates some warnings
    http://bugs.python.org/issue27968  closed by  python-dev
#27969: Suppress unnecessary message when running test_gdb
    http://bugs.python.org/issue27969  closed by  python-dev
#27970: ssl: can't verify a trusted site with imcomplete certificate c
    http://bugs.python.org/issue27970  closed by  christian.heimes
#27974: Remove dead code in importlib._bootstrap
    http://bugs.python.org/issue27974  closed by  brett.cannon
#27975: math.isnan(int) and math.isinf(int) should not raise OverflowE
    http://bugs.python.org/issue27975  closed by  mark.dickinson
#27980: Add better pythonw support to py launcher
    http://bugs.python.org/issue27980  closed by  eryksun
#27982: Allow keyword arguments in winsound
    http://bugs.python.org/issue27982  closed by  python-dev
#27983: "Cannot perform PGO build because llvm-profdata was not found
    http://bugs.python.org/issue27983  closed by  gregory.p.smith
#27985: Implement PEP 526
    http://bugs.python.org/issue27985  closed by  yselivanov
#27988: email iter_attachments can mutate the payload
    http://bugs.python.org/issue27988  closed by  r.david.murray
#27993: In the argparse there are typos with endings in plural words
    http://bugs.python.org/issue27993  closed by  martin.panter
#27996: Python 3 ssl module can't use a fileno to create a SSLSocket
    http://bugs.python.org/issue27996  closed by  berker.peksag
#28003: PEP 525 asynchronous generators implementation
    http://bugs.python.org/issue28003  closed by  yselivanov
#28005: Broken encoding modules are silently skipped.
    http://bugs.python.org/issue28005  closed by  steve.dower
#28006: Remove tracing overhead from the fine-grained fast opcodes
    http://bugs.python.org/issue28006  closed by  rhettinger
#28010: http.client.HTTPConnection.putrequest incorrect arguments
    http://bugs.python.org/issue28010  closed by  orsenthil
#28011: winreg KEY_READ also fails for some keys
    http://bugs.python.org/issue28011  closed by  eryksun
#28012: Spam
    http://bugs.python.org/issue28012  closed by  berker.peksag
#28013: PPC64 Fedora socket and ssl compile failure
    http://bugs.python.org/issue28013  closed by  christian.heimes
#28014: Strange interaction between methods in subclass of C OrderedDi
    http://bugs.python.org/issue28014  closed by  zach.ware
#28017: bluetooth.h on big endian needs GNU C extensions
    http://bugs.python.org/issue28017  closed by  christian.heimes
#28020: Python 3 logging HTTPHandler doesn't implement a standard http
    http://bugs.python.org/issue28020  closed by  SilentGhost
#28021: Calculating wrong modulus manually
    http://bugs.python.org/issue28021  closed by  steven.daprano
#28026: module_from_spec() should raise an error in 3.6
    http://bugs.python.org/issue28026  closed by  eric.snow
#28027: Remove Lib/plat-*/* files
    http://bugs.python.org/issue28027  closed by  zach.ware
#28030: Update the language reference for PEP 468.
    http://bugs.python.org/issue28030  closed by  eric.snow
#28031: Update pathlib.resolve() to match os.path.realpath()
    http://bugs.python.org/issue28031  closed by  brett.cannon
#28032: --with-lto builds segfault in many situations
    http://bugs.python.org/issue28032  closed by  gregory.p.smith
#28033: dictobject.c comment misspelling
    http://bugs.python.org/issue28033  closed by  berker.peksag
#28034: local var in "for v in iter" modify the uplevel var value.
    http://bugs.python.org/issue28034  closed by  r.david.murray

From brett at python.org Fri Sep 9 12:52:21 2016
From: brett at python.org (Brett Cannon)
Date: Fri, 09 Sep 2016 16:52:21 +0000
Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered
In-Reply-To: <20160909105541.5b8a7ec8@fsol>
References: <20160909105541.5b8a7ec8@fsol>
Message-ID:

On Fri, 9 Sep 2016 at 01:58 Antoine Pitrou wrote:

> On Thu, 8 Sep 2016 14:20:53 -0700
> Victor Stinner wrote:
> > 2016-09-08 13:36 GMT-07:00 Guido van Rossum :
> > > IIUC there's one small thing we might still want to change somewhere
> > > after 3.6b1 but before 3.6rc1: the order is not preserved when you
> > > delete some keys and then add some other keys. Apparently PyPy has
> > > come up with a clever solution for this, and we should probably adopt
> > > it, but it's probably best not to hurry that for 3.6b1.
> >
> > Very good news: I was wrong, Raymond Hettinger confirmed that the
> > Python 3.6 dict *already* preserves the items order in all cases. In
> > short, Python 3.6 dict = Python 3.5 OrderedDict (in fact, OrderedDict
> > has a few more methods).
>
> Is it an official feature of the language or an implementation detail?
>

It depends on the context. **kwargs is now defined to be an ordered mapping
and PEP 520 has been updated to drop __definition_order__ and to say that
cls.__dict__ is an ordered mapping. Otherwise we have not made dict itself
ordered everywhere.

And there has been discussion to rip out the C code for OrderedDict and
change the Python code to subclass dict so it only has to provide its
additions to the dict API.
-------------- next part --------------
An HTML attachment was scrubbed...
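Brett's statement that **kwargs is now defined to be an ordered mapping can be checked directly. A minimal sketch (Python 3.6+; the function name is invented for illustration, not part of the thread):

```python
# PEP 468: keyword arguments are collected into an order-preserving
# mapping, so **kwargs reflects the order used at the call site.
def capture(**kwargs):
    return list(kwargs.items())

print(capture(first=1, second=2, third=3))
# [('first', 1), ('second', 2), ('third', 3)]
```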
URL: From elprans at gmail.com Fri Sep 9 13:08:03 2016 From: elprans at gmail.com (Elvis Pranskevichus) Date: Fri, 09 Sep 2016 13:08:03 -0400 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> Message-ID: <1890206.yRDVlEz0si@klinga.prans.org> On Friday, September 9, 2016 4:52:21 PM EDT Brett Cannon wrote: > On Fri, 9 Sep 2016 at 01:58 Antoine Pitrou wrote: > > On Thu, 8 Sep 2016 14:20:53 -0700 > > > > Victor Stinner wrote: > > > 2016-09-08 13:36 GMT-07:00 Guido van Rossum : > > > > IIUC there's one small thing we might still want to change somewhere > > > > after 3.6b1 but before 3.6rc1: the order is not preserved when you > > > > delete some keys and then add some other keys. Apparently PyPy has > > > > come up with a clever solution for this, and we should probably adopt > > > > it, but it's probably best not to hurry that for 3.6b1. > > > > > > Very good news: I was wrong, Raymond Hettinger confirmed that the > > > Python 3.6 dict *already* preserves the items order in all cases. In > > > short, Python 3.6 dict = Python 3.5 OrderedDict (in fact, OrderedDict > > > has a few more methods). > > > > Is it an official feature of the language or an implementation detail? > > It depends on the context. **kwargs is now defined to be an ordered mapping > and PEP 520 has been updated to drop __definition_order__ and to say that > cls.__dict__ is an ordered mapping. Otherwise we have not made dict itself > ordered everywhere. > > And there has been discussion to rip out the C code for OrderedDict and > change the Python code to subclass dict so it only has to provide its > additions to the dict API. Are there any downsides to explicitly specifying that all dicts are ordered? People will inevitably start relying on this behaviour, and this will essentially become the *de-facto* spec, so alternative Python implementations will have to follow suit anyway. 
Elvis From guido at python.org Fri Sep 9 13:17:06 2016 From: guido at python.org (Guido van Rossum) Date: Fri, 9 Sep 2016 10:17:06 -0700 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> Message-ID: I've been asked about this. Here's my opinion on the letter of the law in 3.6: - keyword args are ordered - the namespace passed to a metaclass is ordered by definition order - ditto for the class __dict__ A compliant implementation may ensure the above three requirements either by making all dicts ordered, or by providing a custom dict subclass (e.g. OrderedDict) in those three cases. I'd like to handwave on the ordering of all other dicts. Yes, in CPython 3.6 and in PyPy they are all ordered, but it's an implementation detail. I don't want to *force* all other implementations to follow suit. I also don't want too many people start depending on this, since their code will break in 3.5. (Code that needs to depend on the ordering of keyword args or class attributes should be relatively uncommon; but people will start to depend on the ordering of all dicts all too easily. I want to remind them that they are taking a risk, and their code won't be backwards compatible.) --Guido On Fri, Sep 9, 2016 at 9:52 AM, Brett Cannon wrote: > > > On Fri, 9 Sep 2016 at 01:58 Antoine Pitrou wrote: >> >> On Thu, 8 Sep 2016 14:20:53 -0700 >> Victor Stinner wrote: >> > 2016-09-08 13:36 GMT-07:00 Guido van Rossum : >> > > IIUC there's one small thing we might still want to change somewhere >> > > after 3.6b1 but before 3.6rc1: the order is not preserved when you >> > > delete some keys and then add some other keys. Apparently PyPy has >> > > come up with a clever solution for this, and we should probably adopt >> > > it, but it's probably best not to hurry that for 3.6b1. 
>> >
>> > Very good news: I was wrong, Raymond Hettinger confirmed that the
>> > Python 3.6 dict *already* preserves the items order in all cases. In
>> > short, Python 3.6 dict = Python 3.5 OrderedDict (in fact, OrderedDict
>> > has a few more methods).
>>
>> Is it an official feature of the language or an implementation detail?
>
>
> It depends on the context. **kwargs is now defined to be an ordered mapping
> and PEP 520 has been updated to drop __definition_order__ and to say that
> cls.__dict__ is an ordered mapping. Otherwise we have not made dict itself
> ordered everywhere.
>
> And there has been discussion to rip out the C code for OrderedDict and
> change the Python code to subclass dict so it only has to provide its
> additions to the dict API.
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/guido%40python.org

--
--Guido van Rossum (python.org/~guido)

From victor.stinner at gmail.com Fri Sep 9 13:28:10 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 9 Sep 2016 10:28:10 -0700
Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered
In-Reply-To:
References: <20160909105541.5b8a7ec8@fsol>
Message-ID:

2016-09-09 10:17 GMT-07:00 Guido van Rossum :
> - keyword args are ordered
> - the namespace passed to a metaclass is ordered by definition order
> - ditto for the class __dict__

Maybe we should define exactly "ordered" somewhere in the language reference:
https://docs.python.org/dev/reference/index.html

I expect:

* a mapping: mapping ABC,
  https://docs.python.org/dev/library/collections.abc.html#collections-abstract-base-classes
* ordered by definition order
* no more

I mean: OrderedDict has extra methods, __reversed__() and
move_to_end(). Users should not rely on them.
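Victor's distinction — rely on the mapping interface plus ordering, not on OrderedDict's extras — can be made concrete. A small sketch (Python 3.6+, not part of the original thread):

```python
from collections import OrderedDict

# Relying only on the guaranteed part: a mapping, ordered by insertion.
pairs = [('a', 1), ('b', 2), ('c', 3)]
od = OrderedDict(pairs)
assert list(od) == ['a', 'b', 'c']

# These two are OrderedDict extras; per the thread, code written against
# a generic "ordered mapping" should not assume they exist.
od.move_to_end('a')                           # reorders to b, c, a
assert list(od) == ['b', 'c', 'a']
assert list(reversed(od)) == ['a', 'c', 'b']  # __reversed__()
```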
Victor

From brett at python.org Fri Sep 9 13:32:21 2016
From: brett at python.org (Brett Cannon)
Date: Fri, 09 Sep 2016 17:32:21 +0000
Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered
In-Reply-To:
References: <20160909105541.5b8a7ec8@fsol>
Message-ID:

On Fri, 9 Sep 2016 at 10:28 Victor Stinner wrote:

> 2016-09-09 10:17 GMT-07:00 Guido van Rossum :
> > - keyword args are ordered
> > - the namespace passed to a metaclass is ordered by definition order
> > - ditto for the class __dict__
>
> Maybe we should define exactly "ordered" somewhere in the language reference:
> https://docs.python.org/dev/reference/index.html
>
> I expect:
>
> * a mapping: mapping ABC,
>   https://docs.python.org/dev/library/collections.abc.html#collections-abstract-base-classes
> * ordered by definition order
> * no more
>
> I mean: OrderedDict has extra methods, __reversed__() and
> move_to_end(). Users should not rely on them.
>

Adding "ordered mapping" to the glossary and linking to the term from
the language spec should cover that. Maybe Eric can add it since he made
the spec updates earlier?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From barry at python.org Fri Sep 9 14:39:27 2016
From: barry at python.org (Barry Warsaw)
Date: Sat, 10 Sep 2016 06:39:27 +1200
Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered
In-Reply-To: <1890206.yRDVlEz0si@klinga.prans.org>
References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org>
Message-ID: <20160910063927.723661ea.barry@wooz.org>

On Sep 09, 2016, at 01:08 PM, Elvis Pranskevichus wrote:

>Are there any downsides to explicitly specifying that all dicts are ordered?
>People will inevitably start relying on this behaviour, and this will
>essentially become the *de-facto* spec, so alternative Python implementations
>will have to follow suit anyway.
It *might* make sense to revisit this once 3.5 is no longer maintained at all, but I think Guido's exactly right in his analysis. If people start relying on all dicts being ordered now, their code won't be compatible with both 3.5 and 3.6, and I think it's important to emphasize this to developers. Cheers, -Barry From mertz at gnosis.cx Fri Sep 9 15:01:08 2016 From: mertz at gnosis.cx (David Mertz) Date: Fri, 9 Sep 2016 14:01:08 -0500 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: <20160910063927.723661ea.barry@wooz.org> References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> Message-ID: It seems unlikely, but not inconceivable, that someday in the future someone will implement a dictionary that is faster than current versions but at the cost of losing inherent ordering. It feels best to me only to promise order in specific cases like kwargs, but say nothing (even in 3.6 or 3.7) about the requirement for how dict itself is implemented. On Sep 9, 2016 11:39 AM, "Barry Warsaw" wrote: > On Sep 09, 2016, at 01:08 PM, Elvis Pranskevichus wrote: > > >Are there any downsides to explicitly specifying that all dicts are > ordered? > >People will inevitably start relying on this behaviour, and this will > >essentially become the *de-facto* spec, so alternative Python > implementations > >will have to follow suit anyway. > > It *might* make sense to revisit this once 3.5 is no longer maintained at > all, > but I think Guido's exactly right in his analysis. If people start > relying on > all dicts being ordered now, their code won't be compatible with both 3.5 > and > 3.6, and I think it's important to emphasize this to developers. 
> > Cheers, > -Barry > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > mertz%40gnosis.cx > -------------- next part -------------- An HTML attachment was scrubbed... URL: From elprans at gmail.com Fri Sep 9 15:40:32 2016 From: elprans at gmail.com (Elvis Pranskevichus) Date: Fri, 09 Sep 2016 15:40:32 -0400 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160910063927.723661ea.barry@wooz.org> Message-ID: <1565352.Wd7nh4Uc3l@klinga.prans.org> On Friday, September 9, 2016 2:01:08 PM EDT David Mertz wrote: > It feels best to me only to promise order in specific cases like kwargs, > but say nothing (even in 3.6 or 3.7) about the requirement for how dict > itself is implemented. On Saturday, September 10, 2016 6:39:27 AM EDT Barry Warsaw wrote: > If people start relying on all dicts being ordered now, their code won't > be compatible with both 3.5 and 3.6, and I think it's important to > emphasize this to developers. OK, that makes sense. Putting an explicit note in the documentation that one should not rely on the key order will probably be enough to reduce the concern. Elvis From rodrigc at freebsd.org Fri Sep 9 15:53:22 2016 From: rodrigc at freebsd.org (Craig Rodrigues) Date: Fri, 9 Sep 2016 12:53:22 -0700 Subject: [Python-Dev] Porting buildbot to Python 3 Message-ID: Hi, It's not essential, but I thought it would be nice to port buildbot to Python 3. 
I've managed to submit multiple simple patches to buildbot, which were quickly accepted: https://github.com/buildbot/buildbot/pulls/rodrigc?q=is%3Apr+is%3Aclosed Now things are more slow going as the easy stuff is out of the way and I am submitting more complicated things: https://github.com/buildbot/buildbot/pulls/rodrigc Is there anyone on python-dev who has the interest and free cycles to push Python 3 fixes to the buildbot team? Thanks. -- Craig -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Fri Sep 9 22:04:06 2016 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 9 Sep 2016 22:04:06 -0400 Subject: [Python-Dev] Changes to PEP 498 (f-strings) In-Reply-To: <119e0950-845f-2fc8-f1b4-30b9f4c00801@trueblade.com> References: <119e0950-845f-2fc8-f1b4-30b9f4c00801@trueblade.com> Message-ID: <2d7e81ac-5be6-0df1-79ec-95711ade3d0e@trueblade.com> I found some time before beta 1 to modify the f-string code to implement the desired behavior: no backslashes inside the curly braces, but they're allowed in the literal string portions. I just checked it in. I still need to update PEP 498, and the documentation needs updating. I'll create an issue for the docs once I've updated the PEP. This is a fairly large change, because now I need to parse the f-strings in UTF-8, and do the decoding myself in pieces, instead of the previous behavior of decoding the string first and then parsing it. I think I have tests for all of the backslash scenarios, but I'll watch the buildbots. Eric. On 8/30/2016 1:55 PM, Eric V. Smith wrote: > After a long discussion on python-ideas (starting at > https://mail.python.org/pipermail/python-ideas/2016-August/041727.html) > I'm proposing the following change to PEP 498: backslashes inside > brackets will be disallowed. 
> The point of this is to disallow convoluted code like:
>
>>>> d = {'a': 4}
>>>> f'{d[\'a\']}'
> '4'
>
> In addition, I'll disallow escapes to be used for brackets, as in:
>
>>>> f'\x7bd["a"]}'
> '4'
>
> (where chr(0x7b) == "{").
>
> Because we're so close to 3.6 beta 1, my plan is to:
>
> 1. Modify the PEP to reflect these restrictions.
> 2. Modify the code to prevent _any_ backslashes inside f-strings.
>
> This is a more restrictive change than the PEP will describe, but it's
> much easier to implement. After beta 1, and hopefully before beta 2, I
> will implement the restrictions as I've outlined above (and as they will
> be documented in the PEP). The net effects are:
>
> a. Some code that works in the alphas won't work in beta 1. I'll
> document this.
> b. All code that's valid in beta 1 will work in beta 2, and some
> f-strings that are syntax errors in beta 1 will work in beta 2.
>
> I've discussed this issue with Ned and Guido, who are okay with these
> changes.
>
> The python-ideas thread I referenced above has some discussion about
> further changes to f-strings. Those proposals are outside the scope of
> 3.6, but the changes I'm putting forth here will allow for those
> additional changes, should we decide to make them. That's a discussion
> for 3.7, however.
>
> I'm sending this email out just to notify people of this upcoming
> change. I hope this won't generate much discussion. If you feel the need
> to discuss this issue further, please use the python-ideas thread (where
> some people are already ignoring it!).
>
> Eric.
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/eric%2Ba-python-dev%40trueblade.com
>

From eric at trueblade.com Fri Sep 9 23:31:43 2016
From: eric at trueblade.com (Eric V. Smith)
Date: Fri, 9 Sep 2016 23:31:43 -0400
Subject: [Python-Dev] [Python-checkins] cpython: make invalid_comma_and_underscore a real prototype
In-Reply-To: <20160910031554.36195.16737.ED9DDCC8@psf.io>
References: <20160910031554.36195.16737.ED9DDCC8@psf.io>
Message-ID: <0e77b786-4989-e0ca-d803-7a696991a8db@trueblade.com>

Oops, thanks Benjamin. That was a copy and paste error.

Eric.

On 9/9/2016 11:15 PM, benjamin.peterson wrote:
> https://hg.python.org/cpython/rev/1e7b636b6009
> changeset: 103539:1e7b636b6009
> user: Benjamin Peterson
> date: Fri Sep 09 20:14:05 2016 -0700
> summary:
> make invalid_comma_and_underscore a real prototype
>
> files:
> Python/formatter_unicode.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
>
> diff --git a/Python/formatter_unicode.c b/Python/formatter_unicode.c
> --- a/Python/formatter_unicode.c
> +++ b/Python/formatter_unicode.c
> @@ -41,7 +41,7 @@
> }
>
> static void
> -invalid_comma_and_underscore()
> +invalid_comma_and_underscore(void)
> {
>     PyErr_Format(PyExc_ValueError, "Cannot specify both ',' and '_'.");
> }
>
>
>
> _______________________________________________
> Python-checkins mailing list
> Python-checkins at python.org
> https://mail.python.org/mailman/listinfo/python-checkins
>

From guido at python.org Sat Sep 10 00:02:06 2016
From: guido at python.org (Guido van Rossum)
Date: Fri, 9 Sep 2016 21:02:06 -0700
Subject: [Python-Dev] Changes to PEP 498 (f-strings)
In-Reply-To: <2d7e81ac-5be6-0df1-79ec-95711ade3d0e@trueblade.com>
References: <119e0950-845f-2fc8-f1b4-30b9f4c00801@trueblade.com>
 <2d7e81ac-5be6-0df1-79ec-95711ade3d0e@trueblade.com>
Message-ID:

Very happy to hear it. It's almost like you were present at the sprint!

On Fri, Sep 9, 2016 at 7:04 PM, Eric V. Smith wrote:
> I found some time before beta 1 to modify the f-string code to implement the
> desired behavior: no backslashes inside the curly braces, but they're
> allowed in the literal string portions. I just checked it in.
> > I still need to update PEP 498, and the documentation needs updating. I'll > create an issue for the docs once I've updated the PEP. > > This is a fairly large change, because now I need to parse the f-strings in > UTF-8, and do the decoding myself in pieces, instead of the previous > behavior of decoding the string first and then parsing it. > > I think I have tests for all of the backslash scenarios, but I'll watch the > buildbots. > > Eric. > > > On 8/30/2016 1:55 PM, Eric V. Smith wrote: >> >> After a long discussion on python-ideas (starting at >> https://mail.python.org/pipermail/python-ideas/2016-August/041727.html) >> I'm proposing the following change to PEP 498: backslashes inside >> brackets will be disallowed. The point of this is to disallow convoluted >> code like: >> >>>>> d = {'a': 4} >>>>> f'{d[\'a\']}' >> >> '4' >> >> In addition, I'll disallow escapes to be used for brackets, as in: >> >>>>> f'\x7bd["a"]}' >> >> '4' >> >> (where chr(0x7b) == "{"). >> >> Because we're so close to 3.6 beta 1, my plan is to: >> >> 1. Modify the PEP to reflect these restrictions. >> 2. Modify the code to prevent _any_ backslashes inside f-strings. >> >> This is a more restrictive change than the PEP will describe, but it's >> much easier to implement. After beta 1, and hopefully before beta 2, I >> will implement the restrictions as I've outlined above (and as they will >> be documented in the PEP). The net effects are: >> >> a. Some code that works in the alphas won't work in beta 1. I'll >> document this. >> b. All code that's valid in beta 1 will work in beta 2, and some >> f-strings that are syntax errors in beta 1 will work in beta 2. >> >> I've discussed this issue with Ned and Guido, who are okay with these >> changes. >> >> The python-ideas thread I referenced above has some discussion about >> further changes to f-strings. 
Those proposals are outside the scope of >> 3.6, but the changes I'm putting forth here will allow for those >> additional changes, should we decide to make them. That's a discussion >> for 3.7, however. >> >> I'm sending this email out just to notify people of this upcoming >> change. I hope this won't generate much discussion. If you feel the need >> to discuss this issue further, please use the python-ideas thread (where >> some people are already ignoring it!). >> >> Eric. >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/eric%2Ba-python-dev%40trueblade.com >> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From ethan at stoneleaf.us Sat Sep 10 03:49:43 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 10 Sep 2016 00:49:43 -0700 Subject: [Python-Dev] PEP520 and absence of __definition_order__ Message-ID: <57D3BB17.4000704@stoneleaf.us> Per Victor's advice I'm posting this here. PEP 520 has been accepted, but without the __definition_order__ attribute. The accompanying comment: > "Note: Since compact dict has landed in 3.6, __definition_order__ has > been removed. cls.__dict__ now mostly accomplishes the same thing > instead." The "mostly" is what concerns me. Much like having a custom __dir__ lets a class fine-tune what is of interest, a custom __definition_order__ allows a class to present a unified view of the class creation process. This could be important to classes that employ __getattr__ (or __getattribute__) to provide virtual attributes, such as Enum or proxy classes. 
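A tiny sketch (a hypothetical Proxy class, not taken from the PEP) shows why: attributes served via __getattr__ never appear in the class dict at all, so no inspection of __dict__ order can describe them:

```python
class Proxy:
    """Delegate attribute access to a wrapped object."""
    def __init__(self, target):
        object.__setattr__(self, '_target', target)

    def __getattr__(self, name):
        # Called only for attributes not found normally -- these
        # "virtual" attributes never show up in Proxy.__dict__.
        return getattr(self._target, name)

class Point:
    def __init__(self):
        self.x, self.y = 1, 2

p = Proxy(Point())
assert p.x == 1 and p.y == 2     # served by __getattr__
assert 'x' not in vars(type(p))  # but invisible in the class dict
```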
With __definition_order__ Enum can display the actual creation order of enum members and methods, while relying on Enum.__dict__.keys() presents a jumbled mess with many attributes the user never wrote, the enum members either appearing /after/ all the methods (even if actually written before), or entirely absent. For example, this class:

>>> class PassBy(Enum):
...     value = 1
...     reference = 2
...     name = 3
...     object = name
...     def used_by_python(self):
...         return self.name == 'name'
...

shows this:

>>> PassBy.__dict__.keys()
dict_keys([
    '_generate_next_value_', '__module__', 'used_by_python', '__doc__',
    '_member_names_', '_member_map_', '_member_type_',
    '_value2member_map_', 'reference', 'object', '__new__',
    ])

Notice that two of the members are missing, and all are after the method. If __definition_order__ existed it would be this:

>>> PassBy.__definition_order__
['value', 'reference', 'name', 'object', 'used_by_python']

Which is a much more accurate picture of the user's class. -- ~Ethan~ From tds333 at mailbox.org Sat Sep 10 04:37:24 2016 From: tds333 at mailbox.org (Wolfgang) Date: Sat, 10 Sep 2016 10:37:24 +0200 Subject: [Python-Dev] sys.path file feature Message-ID: <40aca321-c83b-d184-aa75-e356258a9202@mailbox.org> Hi, tracking the commit log, I noticed that a new feature was added for Windows which is very interesting and could also be useful on other platforms. If I read it right, it supports adding a sys.path text file near the executable to specify the Python sys.path variable, overriding the default behavior. https://hg.python.org/cpython/rev/03517dd54977 This change is only for Windows (if I read right), but I think it would be valuable to adopt it as a general rule across platforms. That would also simplify and unify virtual environment creation and standalone redistribution. Also I have one remaining question: is the "*.pth" file handling then disabled by this feature?

If yes, can this be a problem in a virtual environment if a package uses a pth file installed in the virtual environment site-packages directory? Overall I think this is a great addition and the start to unify sys.path handling. And a good feature for redistribution of a Python interpreter without an installation. (Embedding, virtual environments, fat virtual environments, ...) Regards, Wolfgang From ncoghlan at gmail.com Sat Sep 10 05:27:59 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 10 Sep 2016 19:27:59 +1000 Subject: [Python-Dev] PEP520 and absence of __definition_order__ In-Reply-To: <57D3BB17.4000704@stoneleaf.us> References: <57D3BB17.4000704@stoneleaf.us> Message-ID: On 10 September 2016 at 17:49, Ethan Furman wrote: > Per Victor's advice I'm posting this here. > > PEP 520 has been accepted, but without the __definition_order__ attribute. > The accompanying comment: > >> "Note: Since compact dict has landed in 3.6, __definition_order__ has >> been removed. cls.__dict__ now mostly accomplishes the same thing >> instead." > > > The "mostly" is what concerns me. Much like having a custom __dir__ lets > a class fine-tune what is of interest, a custom __definition_order__ allows > a class to present a unified view of the class creation process. This could > be important to classes that employ __getattr__ (or __getattribute__) to > provide virtual attributes, such as Enum or proxy classes. +1 The reasoning for modifying the PEP post-acceptance is faulty - __definition_order__ wasn't just there as a CPython implementation detail, it was there as a way to allow class and metaclass developers to hide their *own* irrelevant implementation details. Since __definition_order__ was already accepted, and the rationale for removing it is incorrect, could we please have it back for beta 1? Regards, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Sep 10 05:40:53 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 10 Sep 2016 19:40:53 +1000 Subject: [Python-Dev] sys.path file feature In-Reply-To: <40aca321-c83b-d184-aa75-e356258a9202@mailbox.org> References: <40aca321-c83b-d184-aa75-e356258a9202@mailbox.org> Message-ID: On 10 September 2016 at 18:37, Wolfgang wrote: > Hi, > > tracking the commit log I have noticed for Windows there was added a new > feature which is very interesting and can also be useful for other > platforms. > > If I read it right it supports adding a sys.path text file near the > executable to specify the Python sys.path variable and overwriting the > default behavior. > > https://hg.python.org/cpython/rev/03517dd54977 While I'm all for adding ways to simplify CPython sys.path configuration, they shouldn't be added as implicit side effects of other changes without at least some discussion of the chosen approach. If there isn't time for that, and it's needed to solve a particular problem, then the underscore-prefix naming convention indicating "this is not a standardised and supported interface" works just as well for config files as it does for module and attribute names. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tjreedy at udel.edu Sat Sep 10 05:47:36 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 10 Sep 2016 05:47:36 -0400 Subject: [Python-Dev] PEP520 and absence of __definition_order__ In-Reply-To: References: <57D3BB17.4000704@stoneleaf.us> Message-ID: On 9/10/2016 5:27 AM, Nick Coghlan wrote: > On 10 September 2016 at 17:49, Ethan Furman wrote: >> Per Victor's advice I'm posting this here. >> >> PEP 520 has been accepted, but without the __definition_order__ attribute. >> The accompanying comment: >> >>> "Note: Since compact dict has landed in 3.6, __definition_order__ has >>> been removed. 
cls.__dict__ now mostly accomplishes the same thing >>> instead." >> >> >> The "mostly" is what concerns me. Much like having a custom __dir__ lets >> a class fine-tune what is of interest, a custom __definition_order__ allows >> a class to present a unified view of the class creation process. This could >> be important to classes that employ __getattr__ (or __getattribute__) to >> provide virtual attributes, such as Enum or proxy classes. > > +1 > > The reasoning for modifying the PEP post-acceptance is faulty - > __definition_order__ wasn't just there as a CPython implementation > detail, it was there as a way to allow class and metaclass developers > to hide their *own* irrelevant implementation details. > > Since __definition_order__ was already accepted, and the rationale for > removing it is incorrect, could we please have it back for beta 1? Someone (Ethan?) should ask that this be a release blocker on some issue. -- Terry Jan Reedy From storchaka at gmail.com Sat Sep 10 06:52:13 2016 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 10 Sep 2016 13:52:13 +0300 Subject: [Python-Dev] PEP 467: last round (?) In-Reply-To: <57C88355.9000302@stoneleaf.us> References: <57C88355.9000302@stoneleaf.us> Message-ID: On 01.09.16 22:36, Ethan Furman wrote: > * Add ``bytes.iterbytes`` and ``bytearray.iterbytes`` alternative iterators Could you please add a mention of an alternative: seqtools.chunks()? seqtools.chunks(bytes, 1) and seqtools.chunks(bytearray, 1) should be equivalent to bytes.iterbytes() and bytearray.iterbytes() (but this function is applicable to arbitrary sequences, including memoryview and array). Is there a need for a PEP for a new seqtools module (currently two classes are planned), or would just providing a sample implementation on the bugtracker be enough?
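seqtools does not exist yet, so its exact semantics are only a guess; a minimal slice-based sketch matching the equivalences described above (all names here are hypothetical):

```python
def chunks(seq, size):
    """Yield successive slices of length `size` from any sequence
    (bytes, bytearray, memoryview, array.array, list, ...)."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

# chunks(b, 1) behaves like the proposed bytes.iterbytes():
assert list(chunks(b'abc', 1)) == [b'a', b'b', b'c']
# ...and works unchanged on other sequence types:
assert list(chunks(bytearray(b'abcd'), 3)) == [bytearray(b'abc'), bytearray(b'd')]
```

Because it only uses len() and slicing, the same code covers every sequence type, which is the advantage over a bytes-specific iterbytes method.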
From ncoghlan at gmail.com Sat Sep 10 07:19:31 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 10 Sep 2016 21:19:31 +1000 Subject: [Python-Dev] PEP520 and absence of __definition_order__ In-Reply-To: References: <57D3BB17.4000704@stoneleaf.us> Message-ID: On 10 September 2016 at 19:27, Nick Coghlan wrote: > On 10 September 2016 at 17:49, Ethan Furman wrote: >> The "mostly" is what concerns me. Much like having a custom __dir__ lets >> a class fine-tune what is of interest, a custom __definition_order__ allows >> a class to present a unified view of the class creation process. This could >> be important to classes that employ __getattr__ (or __getattribute__) to >> provide virtual attributes, such as Enum or proxy classes. > > +1 > > The reasoning for modifying the PEP post-acceptance is faulty - > __definition_order__ wasn't just there as a CPython implementation > detail, it was there as a way to allow class and metaclass developers > to hide their *own* irrelevant implementation details. > > Since __definition_order__ was already accepted, and the rationale for > removing it is incorrect, could we please have it back for beta 1? After posting this, I realised I should give a bit more detail on why I see PEP 520 without __definition_order__ as potentially problematic. Specifically, it relates to these two sections in the PEP about having __definition_order__ be writable and about whether or not to set it for classes that aren't created via the class syntax: * https://www.python.org/dev/peps/pep-0520/#why-not-a-read-only-attribute * https://www.python.org/dev/peps/pep-0520/#support-for-c-api-types >From the first section: "Also, note that a writeable __definition_order__ allows dynamically created classes (e.g. by Cython) to still have __definition_order__ properly set. That could certainly be handled through specific class- creation tools, such as type() or the C-API, without the need to lose the semantics of a read-only attribute. 
However, with a writeable attribute it's a moot point. " >From the second: "However, since __definition_order__ can be set at any time through normal attribute assignment, it does not need any special treatment in the C-API." Unlike the __definition_order__ tuple, making "list(cls.__dict__)" the official way of accessing the definition order exposes an implementation detail that's somewhat specific to the way Python class statements work, rather than being universal across all the different techniques that exist for putting together Python class objects. As Terry suggested, I've reopened and elevated the priority of http://bugs.python.org/issue24254, but only to deferred blocker - while I do think we need to reconsider the decision to remove __definition_order__ based on a proper update to the PEP that accounts for all the points that came up in the original discussions, I also don't see any major problem with leaving it out in beta 1, and then restoring it in beta 2. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve.dower at python.org Sat Sep 10 09:01:49 2016 From: steve.dower at python.org (Steve Dower) Date: Sat, 10 Sep 2016 06:01:49 -0700 Subject: [Python-Dev] sys.path file feature In-Reply-To: References: <40aca321-c83b-d184-aa75-e356258a9202@mailbox.org> Message-ID: The underscore is an appropriate rename here, but calling the file sys.path was too juicy :) It's intended only for embedding on Windows and does not exist on Linux/Mac yet (more precisely, implementation is only in PC/getpathp.c). I chatted with some people about spreading it and there wasn't really enough interest yet - theoretical uses but not actual ones, whereas on Windows there are actual uses. If you have actual uses we can look more seriously at it, but right now it's more of a secret registry key that disables the registry. As it is totally outside the language and very specific to a particular installation, support can easily be added at any time. 
Find my various write-ups on the embeddable distro for details on the use cases, but none of them affect regular Python developers. Cheers, Steve Top-posted from my Windows Phone -----Original Message----- From: "Nick Coghlan" Sent: 9/10/2016 2:43 To: "Wolfgang" Cc: "Python Dev" Subject: Re: [Python-Dev] sys.path file feature On 10 September 2016 at 18:37, Wolfgang wrote: > Hi, > > tracking the commit log I have noticed for Windows there was added a new > feature which is very interesting and can also be useful for other > platforms. > > If I read it right it supports adding a sys.path text file near the > executable to specify the Python sys.path variable and overwriting the > default behavior. > > https://hg.python.org/cpython/rev/03517dd54977 While I'm all for adding ways to simplify CPython sys.path configuration, they shouldn't be added as implicit side effects of other changes without at least some discussion of the chosen approach. If there isn't time for that, and it's needed to solve a particular problem, then the underscore-prefix naming convention indicating "this is not a standardised and supported interface" works just as well for config files as it does for module and attribute names. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/steve.dower%40python.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian at python.org Sat Sep 10 10:22:57 2016 From: christian at python.org (Christian Heimes) Date: Sat, 10 Sep 2016 16:22:57 +0200 Subject: [Python-Dev] Let's make the SSL module sane Message-ID: <3f43848e-b53b-582c-2bbe-dab7d1e1f6b0@python.org> Hi, (CC TLS gurus) For 3.6 I'd like to make the SSL module saner and more secure by default.
Yes, I'm a bit late, but all my proposals are implemented, documented, partly tested, and existing tests are passing. I'm going to write more tests and documentation after beta 1. First I'd like to deprecate some old APIs in favor of SSLContext. We have multiple ways to create an SSL socket or to configure libraries like urllib. The general idea is to make SSLContext the central object for TLS/SSL configuration. My patch deprecates ssl.wrap_socket() and the SSLSocket constructor in favor of SSLContext.wrap_socket(). The patch also deprecates certfile, keyfile, and similar arguments in network protocol libraries. I also considered making cert validation enabled by default for all protocols in 3.6, but Victor has raised some concerns. How about we change the behavior in 3.7 and just add a warning to 3.6? http://bugs.python.org/issue28022 https://github.com/tiran/cpython/commits/feature/feature/ssl_deprecation -------- Next up: SSLContext default configuration. A bare SSLContext comes with insecure default settings. I'd like to make SSLContext(PROTOCOL_SSLv23) secure by default. Changelog: The context is created with more secure default values. The options OP_NO_COMPRESSION, OP_CIPHER_SERVER_PREFERENCE, OP_SINGLE_DH_USE, OP_SINGLE_ECDH_USE, OP_NO_SSLv2 (except for PROTOCOL_SSLv2), and OP_NO_SSLv3 (except for PROTOCOL_SSLv3) are set by default. The initial cipher suite list contains only HIGH ciphers, no NULL ciphers, and no MD5 ciphers (except for PROTOCOL_SSLv2). http://bugs.python.org/issue28043 https://github.com/tiran/cpython/commits/feature/ssl_sane_defaults -------- Finally (and this is the biggest) I'd like to change how the protocols work. OpenSSL 1.1.0 has deprecated all version-specific protocols. Soon OpenSSL will only support auto-negotiation (formerly known as PROTOCOL_SSLv23). My patch #26470 added PROTOCOL_TLS as an alias for PROTOCOL_SSLv23. If the last idea is accepted, I will remove PROTOCOL_TLS again. It hasn't been released yet.
Instead I'm going to add PROTOCOL_TLS_CLIENT and PROTOCOL_TLS_SERVER (see https://www.openssl.org/docs/manmaster/ssl/SSL_CTX_new.html TLS_server_method(), TLS_client_method()). PROTOCOL_TLS_CLIENT is like PROTOCOL_SSLv23 but only supports client-side sockets, and PROTOCOL_TLS_SERVER just server-side sockets. In my experience we can't have an SSLContext with sensible and secure settings for client and server at the same time. Hostname checking and cert validation are only sensible for client-side sockets. Starting in 3.8 (or 3.7?) there will be only PROTOCOL_TLS_CLIENT and PROTOCOL_TLS_SERVER. I haven't created a ticket yet, code is at https://github.com/tiran/cpython/commits/feature/openssl_client_server

--------

How will my proposals change TLS/SSL code? Applications must create an SSLContext object. Applications are recommended to keep the context around to benefit from session reuse and to reduce the overhead of cert parsing.

Client side, ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT):

* works with TLSv1.0, TLSv1.1, TLSv1.2 and new protocols
* Options OP_NO_SSLv2, OP_NO_SSLv3, OP_NO_COMPRESSION are set
* Only HIGH cipher suites are enabled, MD5 and NULL are disabled
* all other ciphers are still enabled, MD5 for SSLv2
* cert_required = CERT_REQUIRED
* check_hostname = True
* ctx.wrap_socket() creates a client-side socket
* ctx.wrap_socket(server_side=True) will not work
* root certs are *not* loaded

I don't load any certs because it is not possible to remove a cert or X509 lookup once it is loaded. create_default_context() just has to load the certs and set more secure cipher suites.
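For the client side, today's ssl.create_default_context() (available since 3.4) already gives roughly the settings listed above plus loaded root certs; a short sketch of the recommended pattern, with the context created once and reused per connection:

```python
import socket
import ssl

# One shared context: secure client defaults plus system root certs.
ctx = ssl.create_default_context()
assert ctx.check_hostname is True
assert ctx.verify_mode == ssl.CERT_REQUIRED

def https_connect(host, port=443):
    """Open a TLS client connection that verifies the peer certificate
    and checks the hostname against it."""
    sock = socket.create_connection((host, port))
    return ctx.wrap_socket(sock, server_hostname=host)
```

Reusing one context across connections also enables session reuse and avoids re-parsing the CA certificates for every socket.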
Server side, ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER):

* works with TLSv1.0, TLSv1.1, TLSv1.2 and new protocols
* Options OP_NO_SSLv2, OP_NO_SSLv3, OP_NO_COMPRESSION are set
* OP_CIPHER_SERVER_PREFERENCE, OP_SINGLE_DH_USE, OP_SINGLE_ECDH_USE are set
* Only HIGH cipher suites are enabled, MD5 and NULL are disabled
* all other ciphers are still enabled, MD5 for SSLv2
* cert_required = CERT_NONE (no client cert validation)
* check_hostname = False
* no root CA certs are loaded
* only ctx.wrap_socket(server_side=True) works

I hope this mail makes sense. Christian From ncoghlan at gmail.com Sat Sep 10 11:24:13 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 11 Sep 2016 01:24:13 +1000 Subject: [Python-Dev] Let's make the SSL module sane In-Reply-To: <3f43848e-b53b-582c-2bbe-dab7d1e1f6b0@python.org> References: <3f43848e-b53b-582c-2bbe-dab7d1e1f6b0@python.org> Message-ID: On 11 September 2016 at 00:22, Christian Heimes wrote: > First I like to deprecated some old APIs and favor of SSLCotext. We have > multiple ways to create a SSL socket or to configure libraries like > urllib. The general idea is to make SSLContext the central object for > TLS/SSL configuration. My patch deprecates ssl.wrap_socket() I'll bring over my question from the tracker issue to here: there's a subset of ssl.wrap_socket() arguments which actually make sense as arguments to ssl.get_default_context().wrap_socket(). Accordingly, we can pick a subset of code (e.g. SSL/TLS clients) that we bless with not needing to change, leaving only code using deprecated parameters or creating server sockets that needs to be updated. As with past network security changes, a major factor we need to account for is that no matter how valuable a particular goal is from a broader industry perspective, people don't tend to react to API breaks by fixing their code - they react by not upgrading at all. Regards, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at holdenweb.com Sat Sep 10 11:27:49 2016 From: steve at holdenweb.com (Steve Holden) Date: Sat, 10 Sep 2016 17:27:49 +0200 Subject: [Python-Dev] [Webmaster] A broken link! In-Reply-To: <3434097.or_mail@whoishostingthismail.com> References: <3434097.or_mail@whoishostingthismail.com> Message-ID: Hi Karen, Thanks for your note. I just checked the source of the document in question, and it appears that link has been changed to reference https://www.mercurial-scm.org/guide, so it appears that we may be publishing an out-of-date document there. I'm copying this reply to the python-dev list, and the release manager may or may not choose to update the published version. regards Steve Steve Holden On Fri, Sep 9, 2016 at 3:37 PM, Karen Little < karen.little at whoishostingthismail.com> wrote: > Hi, > > Just wanted to let you know about a link that seems to be broken on this > page https://docs.python.org/3.2/whatsnew/3.2.html. > > It is this link http://mercurial.selenic.com/guide/, but the page doesn?t > seem to be active any more. I thought you might want to update. > > If you are looking for an alternative please check out > http://wiht.link/Mercurial-intro, it may make a suitable replacement. > > Kind Regards, > Karen > > > > Don't want emails from us anymore? Reply to this email with the word > "UNSUBSCRIBE" in the subject line. > WhoIsHostingThis, BM Box 3667, Old Gloucester Street London, WC1N 3XX, > United Kingdom > _______________________________________________ > Webmaster mailing list > Webmaster at python.org > https://mail.python.org/mailman/listinfo/webmaster > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From donald at stufft.io Sat Sep 10 12:24:13 2016 From: donald at stufft.io (Donald Stufft) Date: Sat, 10 Sep 2016 12:24:13 -0400 Subject: [Python-Dev] Let's make the SSL module sane In-Reply-To: <3f43848e-b53b-582c-2bbe-dab7d1e1f6b0@python.org> References: <3f43848e-b53b-582c-2bbe-dab7d1e1f6b0@python.org> Message-ID: <5F5D3263-3AD7-42C2-8F0F-E025C1938598@stufft.io> > On Sep 10, 2016, at 10:22 AM, Christian Heimes wrote: > > I don't load any certs because it is not possible to remove a cert or > X509 lookup once it is loaded. create_default_context() just have to > load the certs and set more secure ciper suites. This part is the most concerning to me, though I understand why it's the case. Perhaps we can do something a little tricky to allow both things to happen? IOW, do sort of a late binding of the call to load the default certificates if no other certificates have been loaded when the call to SSLContext().wrap_socket() is made. So we'd do something like:

class SSLContext:

    def __init__(self, ...):
        self._loaded_certificates = False
        ...  # Do Other Stuff

    def load_default_certs(self, ...):
        self._loaded_certificates = True
        ...  # Do Other Stuff

    def load_verify_locations(self, ...):
        self._loaded_certificates = True
        ...  # Do Other Stuff

    def wrap_socket(self, ...):
        if not self._loaded_certificates:
            self.load_default_certs()
        ...  # Do Other Stuff

That way if someone does something like:

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.load_verify_locations(cafile=...)
ctx.wrap_socket(...)

Then they don't get any default certificates added, HOWEVER if they do:

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.wrap_socket(...)

Then they do. The main drawback I can see with this is that you can't wrap a socket and then add certificates after the fact... but I don't even know if that makes sense to do. --
Donald Stufft From ncoghlan at gmail.com Sat Sep 10 12:57:25 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 11 Sep 2016 02:57:25 +1000 Subject: [Python-Dev] [Webmaster] A broken link! In-Reply-To: References: <3434097.or_mail@whoishostingthismail.com> Message-ID: On 11 September 2016 at 01:27, Steve Holden wrote: > Hi Karen, > > Thanks for your note. I just checked the source of the document in question, > and it appears that link has been changed to reference > https://www.mercurial-scm.org/guide, so it appears that we may be publishing > an out-of-date document there. > > I'm copying this reply to the python-dev list, and the release manager may > or may not choose to update the published version. There's a problem with the way we're publishing our docs, but it's probably more that we're not emitting canonical URL tags that tell search engines to drop the major version qualifier from the links they present in search results: https://bugs.python.org/issue26355 This means that even folks using newer versions of Python may land on older versions of the docs if that's what a search engine happens to present for their particular query. I haven't personally found the time to follow up on that idea with an actual implementation, but it would presumably be a matter of tinkering with the Sphinx theme and/or conf.py file (even for the no longer supported versions of the docs). Cheers, Nick. P.S. 
Although in this case, it may have just been a direct link to the 3.2 version of the 3.2 What's New - there isn't a lot we can do about that, as when a branch goes unsupported, we usually stop updating the docs as well (even when external links break) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Sat Sep 10 13:08:19 2016 From: guido at python.org (Guido van Rossum) Date: Sat, 10 Sep 2016 10:08:19 -0700 Subject: [Python-Dev] PEP520 and absence of __definition_order__ In-Reply-To: References: <57D3BB17.4000704@stoneleaf.us> Message-ID: Thanks for bringing this up. I think it's definitely possible to argue either way. I think what happened before was that I approved __definition_order__ because I wasn't expecting dict to be ordered by default. Now that Inada Naoki's patch has landed things have changed. Here's my reason for agreeing with (or convincing?) Eric to drop __definition_order__, as I remember it from the (lively) discussion at the sprint. - There are only a few use cases for definition order. Without trying to be complete, the use cases I am aware of are all similar to the django Forms interface (https://docs.djangoproject.com/en/1.10/ref/forms/api/#module-django.forms) where the order in which fields are defined determines the order in which they are rendered. Possibly the traits or traitlets libraries also have similar functionality; I know I added it to a database API I wrote at Google as well. I have another idea for a use case where you can define a named tuple with typed fields using PEP 526 syntax. - I like sparsity of interfaces. A class already has dunder attributes for bases, mro, docstring, name, qualified name, module, and probably a few others that I've forgotten. Cruft inevitably accumulates, but I still feel I have to fight it. If we can get the functionality needed for those use cases without a new dunder attribute, so much the better. 
- The Forms trait[let]s use cases and named tuples can clearly be dealt with by using list(cls.__dict__), since they all involve a user-defined class. - If we had had ordered dicts from the start, those use cases would have been built upon that happily. What would Cython do? I don't know, but I imagine they'd come up with it -- they certainly ought to be able to come up with a way to construct the class __dict__ in the desired order. Is there even a situation where Cython would need to support the construction of forms, trait[let]s, or named tuples using Cython code in a way that the order is discoverable afterwards? (I imagine that Cython would love the named tuple idea, but they'd know the field definition order at compile time, so why would they also need it at runtime?) So I'm happy to continue thinking about this, but I expect this is not as big a deal as you fear. Anyway, let's see if someone comes up with a more convincing argument by beta 2! --Guido On Sat, Sep 10, 2016 at 4:19 AM, Nick Coghlan wrote: > On 10 September 2016 at 19:27, Nick Coghlan wrote: >> On 10 September 2016 at 17:49, Ethan Furman wrote: >>> The "mostly" is what concerns me. Much like having a custom __dir__ lets >>> a class fine-tune what is of interest, a custom __definition_order__ allows >>> a class to present a unified view of the class creation process. This could >>> be important to classes that employ __getattr__ (or __getattribute__) to >>> provide virtual attributes, such as Enum or proxy classes. >> >> +1 >> >> The reasoning for modifying the PEP post-acceptance is faulty - >> __definition_order__ wasn't just there as a CPython implementation >> detail, it was there as a way to allow class and metaclass developers >> to hide their *own* irrelevant implementation details. >> >> Since __definition_order__ was already accepted, and the rationale for >> removing it is incorrect, could we please have it back for beta 1?
> > After posting this, I realised I should give a bit more detail on why > I see PEP 520 without __definition_order__ as potentially problematic. > Specifically, it relates to these two sections in the PEP about having > __definition_order__ be writable and about whether or not to set it > for classes that aren't created via the class syntax: > > * https://www.python.org/dev/peps/pep-0520/#why-not-a-read-only-attribute > * https://www.python.org/dev/peps/pep-0520/#support-for-c-api-types > > From the first section: "Also, note that a writeable > __definition_order__ allows dynamically created classes (e.g. by > Cython) to still have __definition_order__ properly set. That could > certainly be handled through specific class- creation tools, such as > type() or the C-API, without the need to lose the semantics of a > read-only attribute. However, with a writeable attribute it's a moot > point. " > > From the second: "However, since __definition_order__ can be set at > any time through normal attribute assignment, it does not need any > special treatment in the C-API." > > Unlike the __definition_order__ tuple, making "list(cls.__dict__)" the > official way of accessing the definition order exposes an > implementation detail that's somewhat specific to the way Python class > statements work, rather than being universal across all the different > techniques that exist for putting together Python class objects. > > As Terry suggested, I've reopened and elevated the priority of > http://bugs.python.org/issue24254, but only to deferred blocker - > while I do think we need to reconsider the decision to remove > __definition_order__ based on a proper update to the PEP that accounts > for all the points that came up in the original discussions, I also > don't see any major problem with leaving it out in beta 1, and then > restoring it in beta 2. > > Cheers, > Nick. 
> > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Sat Sep 10 13:57:35 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 11 Sep 2016 03:57:35 +1000 Subject: [Python-Dev] PEP520 and absence of __definition_order__ In-Reply-To: References: <57D3BB17.4000704@stoneleaf.us> Message-ID: On 11 September 2016 at 03:08, Guido van Rossum wrote: > So I'm happy to continue thinking about this, but I expect this is not > such a big deal as you fear. Anyway, let's see if someone comes up > with a more convincing argument by beta 2! For CPython specifically, I don't have anything more convincing than Ethan's Enum example (where the way the metaclass works means most of the interesting attributes don't live directly in the class dict, they live in private data structures stored in the class dict, making "list(MyEnum.__dict__)" inherently uninteresting, regardless of whether it's ordered or not). The proxy use cases I'm aware of (wrapt, weakref.proxy) tend to be used to wrap normal instances rather than class objects themselves, so they shouldn't be affected. With ordered-by-default class namespaces, both heap types and non-heap types should also mostly be populated in the "logical order" (i.e. the order names appear in the relevant C arrays), but that would formally be an implementation detail at this point, rather than something we commit to providing. 
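The contrast Nick describes can be sketched concretely (CPython 3.6 behaviour; the class names are purely illustrative): for a plain class, list(cls.__dict__) already reflects definition order, while for an Enum the interesting order lives in __members__ rather than in the raw class dict:

```python
import enum

# A plain class: with order-preserving class namespaces (CPython 3.6+),
# the class __dict__ already exposes definition order.
class Config:
    host = "localhost"
    port = 8080
    def connect(self): ...

visible = [name for name in Config.__dict__ if not name.startswith("__")]
# visible preserves the body order: host, port, connect

# An Enum: the metaclass keeps the interesting order in __members__,
# so iterating the raw class __dict__ is far less useful here.
class Color(enum.Enum):
    RED = 1
    GREEN = 2

member_order = list(Color.__members__)
```
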
The only other argument that occurs to me is one that didn't come up in the earlier PEP 520 discussions: how a not-quite-Python implementation (or a Python 3.5 compatible implementation that doesn't offer order-preserving behaviour the way PyPy does) can make sure that code that relies on ordered class namespaces *fails* in an informative way when run on that implementation. With __definition_order__, that's straightforward - the code that needs it will fail with AttributeError, and searching for the attribute named in the exception will lead affected users directly to PEP 520 and the Python 3.6 What's New guide. With implicitly ordered class namespaces, you don't get an exception if the namespace isn't actually order preserving - you get attributes in an arbitrary order instead. Interpreters can't detect that the user specifically wanted order preserving behaviour, and library and application authors can't readily detect whether or not the runtime offers order preserving behaviour (since they may just get lucky on that particular run). Even if we added a new flag to sys.implementation that indicated the use of order preserving class namespaces, there'd still be plenty of scope for subtle bugs where libraries and frameworks weren't checking that flag before relying on the new behaviour. Cheers, Nick. P.S. I'd actually love it if we could skip __definition_order__ - there really is a whole lot of runtime clutter on class objects, and we're adding __annotations__ as well. 
Unfortunately, I also think we made the right call the first time around in thinking it would still be necessary even if class namespaces became order preserving :) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From christian at python.org Sat Sep 10 14:23:13 2016 From: christian at python.org (Christian Heimes) Date: Sat, 10 Sep 2016 20:23:13 +0200 Subject: [Python-Dev] Let's make the SSL module sane In-Reply-To: <5F5D3263-3AD7-42C2-8F0F-E025C1938598@stufft.io> References: <3f43848e-b53b-582c-2bbe-dab7d1e1f6b0@python.org> <5F5D3263-3AD7-42C2-8F0F-E025C1938598@stufft.io> Message-ID: <25b0cd47-5833-71af-0b51-07bd07287731@python.org> On 2016-09-10 18:24, Donald Stufft wrote: > >> On Sep 10, 2016, at 10:22 AM, Christian Heimes wrote: >> >> I don't load any certs because it is not possible to remove a cert or >> X509 lookup once it is loaded. create_default_context() just has to >> load the certs and set more secure cipher suites. > > > This part is the most concerning to me, though I understand why it's the case. Perhaps we can do something a little tricky to allow both things to happen? IOW do sort of a late binding of a call to loading the default certificates if no other certificates have been loaded when the call to SSLContext().wrap_socket() has been made. > > So we'd do something like: > > > class SSLContext: > def __init__(self, ...): > self._loaded_certificates = False > ... # Do Other Stuff > > def load_default_certs(self, ...): > self._loaded_certificates = True > ... # Do Other Stuff > > def load_verify_locations(self, ...): > self._loaded_certificates = True > ... # Do Other Stuff > > def wrap_socket(self, ...): > if not self._loaded_certificates: > self.load_default_certs() > > ... # Do Other Stuff > > > That way if someone does something like: > > ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT) > ctx.load_verify_locations(cafile="...") > ctx.wrap_socket(...) > > Then they don't get any default certificates added, HOWEVER if they do: > > ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT) > ctx.wrap_socket(...) > > Then they do. > > The main drawback I can see with this is that you can't wrap a socket and then add certificates after the fact... but I don't even know if that makes sense to do? It's a bit too clever and tricky for my taste. I prefer 'explicit is better than implicit' for trust anchors. My main concern is secure default settings. An SSLContext should be secure w/o further settings in order to prevent developers from shooting themselves in the foot. Missing root certs are not a direct security issue with CERT_REQUIRED. The connection will simply fail. I'd rather improve the error message than to auto-load certs. Christian From ericsnowcurrently at gmail.com Sat Sep 10 14:52:51 2016 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 10 Sep 2016 11:52:51 -0700 Subject: [Python-Dev] PEP520 and absence of __definition_order__ In-Reply-To: References: <57D3BB17.4000704@stoneleaf.us> Message-ID: On Sep 10, 2016 11:00, "Nick Coghlan" wrote: > > On 11 September 2016 at 03:08, Guido van Rossum wrote: > > So I'm happy to continue thinking about this, but I expect this is not > > such a big deal as you fear. Anyway, let's see if someone comes up > > with a more convincing argument by beta 2! > Nick. > > P.S. I'd actually love it if we could skip __definition_order__ - > there really is a whole lot of runtime clutter on class objects, and > we're adding __annotations__ as well.
Unfortunately, I also think we > made the right call the first time around in thinking it would still > be necessary even if class namespaces became order preserving :) > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ericsnowcurrently%40gmail.com -eric (phone) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Sat Sep 10 15:02:29 2016 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 10 Sep 2016 12:02:29 -0700 Subject: [Python-Dev] PEP520 and absence of __definition_order__ In-Reply-To: References: <57D3BB17.4000704@stoneleaf.us> Message-ID: On Sep 10, 2016 10:11, "Guido van Rossum" wrote: > > Thanks for bringing this up. I think it's definitely possible to argue > either way. I think what happened before was that I approved > __definition_order__ because I wasn't expecting dict to be ordered by > default. Now that Inada Naoki's patch has landed things have changed. > > Here's my reason for agreeing with (or convincing?) Eric to drop > __definition_order__, as I remember it from the (lively) discussion at > the sprint. > FWIW, my position was to leave __definition_order__ in place. However, once it became *mostly* redundant, I didn't consider the remaining benefits to be sufficient justification for the extra complexity in the code to the point that it was worth debating. So I didn't object very strenuously when Benjamin suggested removing it. Regardless, I'm still in favor of keeping __definition_order__. :) -eric (phone) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From christian at python.org Sat Sep 10 15:20:17 2016 From: christian at python.org (Christian Heimes) Date: Sat, 10 Sep 2016 21:20:17 +0200 Subject: [Python-Dev] Let's make the SSL module sane In-Reply-To: References: <3f43848e-b53b-582c-2bbe-dab7d1e1f6b0@python.org> Message-ID: On 2016-09-10 17:24, Nick Coghlan wrote: > On 11 September 2016 at 00:22, Christian Heimes wrote: >> First I'd like to deprecate some old APIs in favor of SSLContext. We have >> multiple ways to create an SSL socket or to configure libraries like >> urllib. The general idea is to make SSLContext the central object for >> TLS/SSL configuration. My patch deprecates ssl.wrap_socket() > > I'll bring over my question from the tracker issue to here: there's a > subset of ssl.wrap_socket() arguments which actually make sense as > arguments to ssl.get_default_context().wrap_socket(). > > Accordingly, we can pick a subset of code (e.g. SSL/TLS clients) that > we bless with not needing to change, leaving only code using > deprecated parameters or creating server sockets that needs to be > updated. Do you consider ssl.wrap_socket() relevant for so many projects? The function hurts performance and is no longer best practice. The deprecation of ssl.wrap_socket() is a friendly nudge. I don't mind keeping it around for another four or six years. There is one other use case not covered by SSLContext.wrap_socket() but by SSLSocket.__init__(). The SSLSocket constructor takes a fileno argument. But it's an undocumented feature and it has been broken since at least 3.3. https://bugs.python.org/issue27629 > As with past network security changes, a major factor we need to > account for is that no matter how valuable a particular goal is from a > broader industry perspective, people don't tend to react to API breaks > by fixing their code - they react by not upgrading at all. I totally agree and have been very careful to keep backwards compatibility.
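In today's terms, the context-based replacement for ssl.wrap_socket() that Christian refers to as best practice looks like this (a minimal sketch; the wrapping step is shown but not executed, and the hostname is illustrative):

```python
import ssl

# Context-based client setup: create_default_context() loads the system
# trust anchors and applies secure defaults, so certificate verification
# is on by default instead of being opt-in as with ssl.wrap_socket().
ctx = ssl.create_default_context()
assert ctx.verify_mode == ssl.CERT_REQUIRED
assert ctx.check_hostname is True

# Wrapping a connection would then look like this (not executed here):
# with socket.create_connection(("example.org", 443)) as sock:
#     with ctx.wrap_socket(sock, server_hostname="example.org") as tls:
#         print(tls.version())
```
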
My third patch breaks just one scenario: ssl.create_default_context(purpose=Purpose.SERVER_AUTH) no longer supports server-side connections, and a CLIENT_AUTH context no longer supports client-side connections. It's the good kind of incompatibility because it reveals API misuse. Applications should never have used a SERVER_AUTH context to create server sockets. Christian From guido at python.org Sat Sep 10 17:26:58 2016 From: guido at python.org (Guido van Rossum) Date: Sat, 10 Sep 2016 14:26:58 -0700 Subject: [Python-Dev] PEP520 and absence of __definition_order__ In-Reply-To: References: <57D3BB17.4000704@stoneleaf.us> Message-ID: On Sat, Sep 10, 2016 at 10:57 AM, Nick Coghlan wrote: > On 11 September 2016 at 03:08, Guido van Rossum wrote: >> So I'm happy to continue thinking about this, but I expect this is not >> such a big deal as you fear. Anyway, let's see if someone comes up >> with a more convincing argument by beta 2! > > For CPython specifically, I don't have anything more convincing than > Ethan's Enum example (where the way the metaclass works means most of > the interesting attributes don't live directly in the class dict, they > live in private data structures stored in the class dict, making > "list(MyEnum.__dict__)" inherently uninteresting, regardless of > whether it's ordered or not). But that would only matter if we also defined a helper utility that used __definition_order__. I expect that the implementation of Enum could be simplified somewhat in Python 3.6 since it can trust that the namespace passed into __new__ is ordered (so it doesn't have to switch it to an OrderedDict in __prepare__, perhaps). In any case the most likely way to use __definition_order__ in general was always to filter its contents through some other condition (e.g. "isn't a method and doesn't start with underscore") -- you can do the same with keys().
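Guido's keys()-based filtering idiom can be sketched directly against the plain class namespace (assumes the order-preserving class namespaces of CPython 3.6; the class is purely illustrative):

```python
# Filter the plain class namespace instead of __definition_order__:
# keep only non-underscore, non-callable attributes, in body order.
class Record:
    first = "a"
    second = "b"
    third = "c"
    def describe(self): ...

fields = [name for name, value in Record.__dict__.items()
          if not name.startswith("_") and not callable(value)]
# fields lists the data attributes in definition order
```
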
Classes that want to provide a custom list of "interesting" attributes can provide that using whatever class method or attribute they want -- it's just easier to keep those attributes ordered because the namespace is always ordered. > The proxy use cases I'm aware of (wrapt, weakref.proxy) tend to be > used to wrap normal instances rather than class objects themselves, so > they shouldn't be affected. > > With ordered-by-default class namespaces, both heap types and non-heap > types should also mostly be populated in the "logical order" (i.e. the > order names appear in the relevant C arrays), but that would formally > be an implementation detail at this point, rather than something we > commit to providing. > > The only other argument that occurs to me is one that didn't come up > in the earlier PEP 520 discussions: how a not-quite-Python > implementation (or a Python 3.5 compatible implementation that doesn't > offer order-preserving behaviour the way PyPy does) can make sure that > code that relies on ordered class namespaces *fails* in an informative > way when run on that implementation. Is that a real use case? It sounds like you're just constructing an artificial example that would be less convenient without __definition_order__. > With __definition_order__, that's straightforward - the code that > needs it will fail with AttributeError, and searching for the > attribute named in the exception will lead affected users directly to > PEP 520 and the Python 3.6 What's New guide. But that code would have to be written to use __definition_order__. It could just as easily be written to assert that sys.version_info >= (3, 6). > With implicitly ordered class namespaces, you don't get an exception > if the namespace isn't actually order preserving - you get attributes > in an arbitrary order instead.
Interpreters can't detect that the user > specifically wanted order preserving behaviour, and library and > application authors can't readily detect whether or not the runtime > offers order preserving behaviour (since they may just get lucky on > that particular run). That sounds very philosophical. You still can't check whether *dict* is order-preserving -- all you can do is checking whether a *class* preserves its order. Since PEP 520 is accepted only for Python 3.6, checking for the presence of __definition_order__ is no different than checking the version. > Even if we added a new flag to sys.implementation that indicated the > use of order preserving class namespaces, there'd still be plenty of > scope for subtle bugs where libraries and frameworks weren't checking > that flag before relying on the new behaviour. OK, I'm beginning to see the argument here. You want all code that relies on the order to be explicitly declaring that it does so by using a new API. Unfortunately the mere presence of __definition_order__ doesn't really help here -- since all dicts are order-preserving, there's still nothing (apart from documentation) to stop apps from relying on the ordering of the class __dict__ directly. > Cheers, > Nick. > > P.S. I'd actually love it if we could skip __definition_order__ - > there really is a whole lot of runtime clutter on class objects, and > we're adding __annotations__ as well. Unfortunately, I also think we > made the right call the first time around in thinking it would still > be necessary even if class namespaces became order preserving :) Note that __annotations__ is only added when there are annotations, so its presence could be used as a flag of sorts. (However you shouldn't use it directly -- each class in the MRO has its own __annotations__, and you should use typing.get_type_hints(cls) to coalesce all of them.) 
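Guido's closing point about __annotations__ and typing.get_type_hints() can be sketched in a few lines (Python 3.6+; the classes are illustrative):

```python
import typing

# Each class in the MRO carries only its own __annotations__ ...
class Base:
    x: int

class Derived(Base):
    y: str

own = list(Derived.__annotations__)        # only Derived's annotations
merged = typing.get_type_hints(Derived)    # coalesced across the MRO
```
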
-- --Guido van Rossum (python.org/~guido) From njs at pobox.com Sat Sep 10 19:41:28 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 10 Sep 2016 16:41:28 -0700 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: <20160910063927.723661ea.barry@wooz.org> References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> Message-ID: On Fri, Sep 9, 2016 at 11:39 AM, Barry Warsaw wrote: > On Sep 09, 2016, at 01:08 PM, Elvis Pranskevichus wrote: > >>Are there any downsides to explicitly specifying that all dicts are ordered? >>People will inevitably start relying on this behaviour, and this will >>essentially become the *de-facto* spec, so alternative Python implementations >>will have to follow suit anyway. > > It *might* make sense to revisit this once 3.5 is no longer maintained at all, > but I think Guido's exactly right in his analysis. If people start relying on > all dicts being ordered now, their code won't be compatible with both 3.5 and > 3.6, and I think it's important to emphasize this to developers. I feel like I'm missing something here... by this reasoning, we should *never* change the language spec when new features are added. E.g. if people use async/await in 3.5 then their code won't be compatible with 3.4, but async/await are still part of the language spec. And in any case, the distinction between "CPython feature" and "Python language-spec-guaranteed feature" is *extremely* arcane and inside-basebally -- it seems really unlikely that most users will even understand what this distinction means, never mind let it stop them from writing CPython-and-PyPy-specific code. Emphasizing that this is a new feature that only exists in 3.6+ of course makes sense, I just don't understand why that affects the language spec bit. (OTOH it doesn't matter that much anyway... 
the language spec is definitely a useful thing, but it's largely aspirational in practice -- other implementations target CPython compatibility more than they target language spec compatibility.) -n -- Nathaniel J. Smith -- https://vorpus.org From tbizzle at pvlearners.net Sat Sep 10 19:44:10 2016 From: tbizzle at pvlearners.net (Trevon Bizzle) Date: Sat, 10 Sep 2016 16:44:10 -0700 Subject: [Python-Dev] Installation Error Message-ID: Good evening! I tried downloading Python yesterday and was met with some success. I have been searching for solutions but cannot seem to find one. Each time I try to run python an error occurs saying: python.exe - System Error The program can't start because api-ms-win-crt-runtime-l1-1-0.dll is missing from your computer. Try reinstalling the program to fix this problem. My friend and I tried at our school and were met with no success in obtaining the program. I am attempting to run python on a CQ60-615DX Notebook. Any help would be appreciated. Thanks! -- Emails and other correspondence to and from Paradise Valley School District are subject to public disclosure, upon request, according to Arizona Public Records Law (A.R.S. §39-121, et seq.) and corresponding State records retention requirements, unless the content is specifically exempt from disclosure by a state or federal law. -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Sat Sep 10 19:56:47 2016 From: python at mrabarnett.plus.com (MRAB) Date: Sun, 11 Sep 2016 00:56:47 +0100 Subject: [Python-Dev] Installation Error In-Reply-To: References: Message-ID: <1ad4b416-d042-7ab7-a5bc-939543f22053@mrabarnett.plus.com> On 2016-09-11 00:44, Trevon Bizzle wrote: > Good evening! I tried downloading Python yesterday and was met with some > success. I have been searching for solutions but cannot seem to find > one.
> Each time I try to run python an error occurs saying: > python.exe - System Error > The program can't start because api-ms-win-crt-runtime-l1-1-0.dll is > missing from your computer. Try reinstalling the program to fix this > problem. > My friend and I tried at our school and were met with no success in obtaining > the program. I am attempting to run python on a CQ60-615DX Notebook. Any > help would be appreciated. Thanks! > Your computer needs the Universal C Runtime. An up-to-date system should already have it. Read here: Update for Universal C Runtime in Windows https://support.microsoft.com/en-us/kb/2999226 By the way, this list is for the development *of* the Python language. The list you should really be using is python-list at python.org. From ncoghlan at gmail.com Sat Sep 10 22:34:40 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 11 Sep 2016 12:34:40 +1000 Subject: [Python-Dev] Let's make the SSL module sane In-Reply-To: References: <3f43848e-b53b-582c-2bbe-dab7d1e1f6b0@python.org> Message-ID: On 11 September 2016 at 05:20, Christian Heimes wrote: > On 2016-09-10 17:24, Nick Coghlan wrote: >> On 11 September 2016 at 00:22, Christian Heimes wrote: >>> First I'd like to deprecate some old APIs in favor of SSLContext. We have >>> multiple ways to create an SSL socket or to configure libraries like >>> urllib. The general idea is to make SSLContext the central object for >>> TLS/SSL configuration. My patch deprecates ssl.wrap_socket() >> >> I'll bring over my question from the tracker issue to here: there's a >> subset of ssl.wrap_socket() arguments which actually make sense as >> arguments to ssl.get_default_context().wrap_socket(). >> >> Accordingly, we can pick a subset of code (e.g. SSL/TLS clients) that >> we bless with not needing to change, leaving only code using >> deprecated parameters or creating server sockets that needs to be >> updated. > > Do you consider ssl.wrap_socket() relevant for so many projects?
> The > function hurts performance and is no longer best practice. The > deprecation of ssl.wrap_socket() is a friendly nudge. I don't mind > keeping it around for another four or six years. I have no problem with ripping out and replacing the internals of ssl.wrap_socket(), and doing whatever is needed to improve its performance. What I'm mainly looking for is a decision tree in the overall API design that minimises the amount of fresh information a developer needs to supply, and that makes the purpose of their code relatively self-evident to someone that is reading low(ish) level Python SSL/TLS code for the first time. For example, I think this would be a desirably simple design from a usage perspective: # Client sockets as default, settings may change in maintenance releases my_context = ssl.get_default_context() my_tls_socket = ssl.wrap_socket(my_uncovered_socket) # Server sockets by request, settings may change in maintenance releases my_context = ssl.get_default_server_context() my_tls_socket = ssl.wrap_server_socket(my_uncovered_socket) # More control with more responsibility, defaults only change in feature releases my_context = ssl.SSLContext() my_context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER) With that approach, an API user only has to make two forced decisions: - am I securing a client connection or a server connection? - do I want to implicitly pick up modernised defaults in maintenance releases? And we can make the second one a non-decision in most cases by presenting the higher level convenience API as the preferred approach. There would be a third hidden decision implied by the convenience APIs (using the default system certificate store rather than loading a custom one), but most users wouldn't need to worry about that. Cheers, Nick.
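Nick's get_default_context() and get_default_server_context() names are hypothetical; a sketch of how such convenience helpers could be layered on the ssl API that exists today:

```python
import ssl

# Hypothetical convenience helpers mirroring the sketch above; the
# names do not exist in the ssl module, only create_default_context() does.
def get_default_context():
    # client side: verify the peer against the system trust store
    return ssl.create_default_context(ssl.Purpose.SERVER_AUTH)

def get_default_server_context():
    # server side: no client-certificate verification by default
    return ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)

client_ctx = get_default_context()
server_ctx = get_default_server_context()
```
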
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Sep 10 23:05:59 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 11 Sep 2016 13:05:59 +1000 Subject: [Python-Dev] PEP520 and absence of __definition_order__ In-Reply-To: References: <57D3BB17.4000704@stoneleaf.us> Message-ID: On 11 September 2016 at 07:26, Guido van Rossum wrote: > On Sat, Sep 10, 2016 at 10:57 AM, Nick Coghlan wrote: >> On 11 September 2016 at 03:08, Guido van Rossum wrote: >>> So I'm happy to continue thinking about this, but I expect this is not >>> such a big deal as you fear. Anyway, let's see if someone comes up >>> with a more convincing argument by beta 2! >> >> For CPython specifically, I don't have anything more convincing than >> Ethan's Enum example (where the way the metaclass works means most of >> the interesting attributes don't live directly in the class dict, they >> live in private data structures stored in the class dict, making >> "list(MyEnum.__dict__)" inherently uninteresting, regardless of >> whether it's ordered or not). > > But that would only matter if we also defined a helper utility that > used __definition_order__. I expect that the implementation of Enum > could be simplified somewhat in Python 3.6 since it can trust that the > namespace passed into __new__ is ordered (so it doesn't have to switch > it to an OrderedDict in __prepare__, perhaps). > > In any case the most likely way to use __definition_order__ in general > was always to filter its contents through some other condition (e.g. > "isn't a method and doesn't start with underscore") -- you can do the > same with keys(). Classes that want to provide a custom list of > "interesting" attributes can provide that using whatever class method > or attribute they want -- it's just easier to keep those attributes > ordered because the namespace is always ordered. 
For example, it's already possible to expose order information via __dir__; consumers of the information just have to bypass the implicit sorting applied by the dir() builtin: >>> class Example: ... def __dir__(self): ... return "first second third fourth".split() ... >>> dir(Example()) ['first', 'fourth', 'second', 'third'] >>> Example().__dir__() ['first', 'second', 'third', 'fourth'] You've persuaded me that omitting __definition_order__ is the right thing to do for now, so the last thing I'm going to do is to explicitly double-check with the creators of a few interesting alternate implementations (MicroPython, VOC for JVM environments, Batavia for JavaScript environments) to see if this may cause them problems in officially implementing 3.6 (we know PyPy will be OK, since they did it first).
>>>People will inevitably start relying on this behaviour, and this will >>>essentially become the *de-facto* spec, so alternative Python implementations >>>will have to follow suit anyway. >> >> It *might* make sense to revisit this once 3.5 is no longer maintained at all, >> but I think Guido's exactly right in his analysis. If people start relying on >> all dicts being ordered now, their code won't be compatible with both 3.5 and >> 3.6, and I think it's important to emphasize this to developers. > > I feel like I'm missing something here... by this reasoning, we should > *never* change the language spec when new features are added. E.g. if > people use async/await in 3.5 then their code won't be compatible with > 3.4, but async/await are still part of the language spec. And in any > case, the distinction between "CPython feature" and "Python > language-spec-guaranteed feature" is *extremely* arcane and > inside-basebally -- it seems really unlikely that most users will even > understand what this distinction means, never mind let it stop them > from writing CPython-and-PyPy-specific code. Emphasizing that this is > a new feature that only exists in 3.6+ of course makes sense, I just > don't understand why that affects the language spec bit. To conform with the updated language spec, implementations just need to use collections.OrderedDict in 3 places: - default return value of __prepare__ - underlying storage type for __dict__ attributes - storage type for passing kwargs to functions They don't *necessarily* have to change their builtin dict type to be order-preserving, as we're not deprecating collections.OrderedDict, and we're not adding the additional methods offered by OrderedDict to the base type. 
So for normal development, the guidance is still "use collections.OrderedDict explicitly if you need to preserve insertion order", as being more explicit gives compatibility with CPython < 3.6, and with any alternate implementations that take the path of using collections.OrderedDict selectively rather than changing the behaviour of their dict builtin (which was the original plan for CPython). > (OTOH it doesn't matter that much anyway... the language spec is > definitely a useful thing, but it's largely aspirational in practice > -- other implementations target CPython compatibility more than they > target language spec compatibility.) The distinction is that there are cases where we *do* convince library and framework authors to change their code for cross-version and cross-implementation compatibility - the popularity of explicit context management being one of the most significant examples of that, as it's far more necessary on implementations that don't use automatic reference counting the way CPython does. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From russell at keith-magee.com Sun Sep 11 00:15:36 2016 From: russell at keith-magee.com (Russell Keith-Magee) Date: Sun, 11 Sep 2016 12:15:36 +0800 Subject: [Python-Dev] PEP520 and absence of __definition_order__ In-Reply-To: References: <57D3BB17.4000704@stoneleaf.us> Message-ID: On Sun, Sep 11, 2016 at 11:05 AM, Nick Coghlan wrote: > On 11 September 2016 at 07:26, Guido van Rossum wrote: > > On Sat, Sep 10, 2016 at 10:57 AM, Nick Coghlan > wrote: > >> On 11 September 2016 at 03:08, Guido van Rossum > wrote: > >>> So I'm happy to continue thinking about this, but I expect this is not > >>> such a big deal as you fear. Anyway, let's see if someone comes up > >>> with a more convincing argument by beta 2! 
> >> > >> For CPython specifically, I don't have anything more convincing than > >> Ethan's Enum example (where the way the metaclass works means most of > >> the interesting attributes don't live directly in the class dict, they > >> live in private data structures stored in the class dict, making > >> "list(MyEnum.__dict__)" inherently uninteresting, regardless of > >> whether it's ordered or not). > > > > But that would only matter if we also defined a helper utility that > > used __definition_order__. I expect that the implementation of Enum > > could be simplified somewhat in Python 3.6 since it can trust that the > > namespace passed into __new__ is ordered (so it doesn't have to switch > > it to an OrderedDict in __prepare__, perhaps). > > > > In any case the most likely way to use __definition_order__ in general > > was always to filter its contents through some other condition (e.g. > > "isn't a method and doesn't start with underscore") -- you can do the > > same with keys(). Classes that want to provide a custom list of > > "interesting" attributes can provide that using whatever class method > > or attribute they want -- it's just easier to keep those attributes > > ordered because the namespace is always ordered. > > For example,it's already possible to expose order information via > __dir__, consumers of the information just have to bypass the implicit > sorting applied by the dir() builtin: > > >>> class Example: > ... def __dir__(self): > ... return "first second third fourth".split() > ... 
> >>> dir(Example()) > ['first', 'fourth', 'second', 'third'] > >>> Example().__dir__() > ['first', 'second', 'third', 'fourth'] > > You've persuaded me that omitting __definition_order__ is the right > thing to do for now, so the last thing I'm going to do is to > explicitly double check with the creators of a few interesting > alternate implementations (MicroPython, VOC for JVM environments, > Batavia for JavaScript environments) to see if this may cause them > problems in officially implementing 3.6 (we know PyPy will be OK, > since they did it first). > > VOC & Batavia *should* be OK (worst case, they return > collections.OrderedDict from __prepare__ and also use it for __dict__ > attributes), but I'm less certain about MicroPython (since I don't > know enough about how its current dict implementation works to know > whether or not they'll be able to make the same change PyPy and > CPython did) > From the perspective of VOC and Batavia: As Nick notes, there may be some changes needed to use OrderedDict (or a native analog) in a couple of places, but other than that, it doesn't strike me as a change that will pose any significant difficulty. Yours, Russ Magee %-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at holdenweb.com Sun Sep 11 03:57:16 2016 From: steve at holdenweb.com (Steve Holden) Date: Sun, 11 Sep 2016 09:57:16 +0200 Subject: [Python-Dev] [Webmaster] A broken link! In-Reply-To: References: <3434097.or_mail@whoishostingthismail.com> Message-ID: On Sat, Sep 10, 2016 at 6:57 PM, Nick Coghlan wrote: > P.S. Although in this case, it may have just been a direct link to the > 3.2 version of the 3.2 What's New - there isn't a lot we can do about > that, as when a branch goes unsupported, we usually stop updating the > docs as well (even when external links break) > Thanks for the note, Nick. I assumed that it would be something like that.
I know the devs go to great lengths to keep the documentation accurate, but I personally would rather see efforts put towards current versions. regards Steve -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Sun Sep 11 04:37:58 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 11 Sep 2016 04:37:58 -0400 Subject: [Python-Dev] Python 3.7: remove all private C functions from the Python C API? Message-ID: Hi, Currently, Python has 3 C APIs: * Python core API * regular API: subset of the core API * stable API (ABI?), the Py_LIMITED_API thing: subset of the regular API For practical purposes, all functions are declared in Include/*.h. Basically, Python exposes "everything". There are private functions which are exported using PyAPI_FUNC(), whereas they should only be used inside the Python "core". Technically, I'm not sure that we can get rid of PyAPI_FUNC() because the stdlib also has extensions which use a few private functions. For Python 3.7, I propose that we move all these private functions into separate header files, maybe Include/private/ or Include/core/, and not export them as part of the "regular API". The risk is that too many C extensions rely on all these tiny "private" functions. Maybe for performance. I don't know. What do you think?
See also issue #26900, "Exclude the private API from the stable API": http://bugs.python.org/issue26900 Victor From victor.stinner at gmail.com Sun Sep 11 04:42:28 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 11 Sep 2016 04:42:28 -0400 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> Message-ID: 2016-09-10 23:24 GMT-04:00 Nick Coghlan : > To conform with the updated language spec, implementations just need > to use collections.OrderedDict in 3 places: > > (...) > - storage type for passing kwargs to functions I'm not sure about the "just need" for this one, especially if you care about performance ;-) I mean, it's not easy to write an *efficient* hash table preserving the insertion order. Otherwise, CPython would have one since Python 1.5 :-) Victor From victor.stinner at gmail.com Sun Sep 11 04:55:00 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 11 Sep 2016 04:55:00 -0400 Subject: [Python-Dev] PEP520 and absence of __definition_order__ In-Reply-To: <57D3BB17.4000704@stoneleaf.us> References: <57D3BB17.4000704@stoneleaf.us> Message-ID: 2016-09-10 3:49 GMT-04:00 Ethan Furman : > With __definition_order__ Enum can display the actual creation order of enum > members and methods, while relying on Enum.__dict__.keys() presents a > jumbled mess with many attributes the user never wrote, the enum members either > appearing /after/ all the methods (even if actually written before), or > entirely absent. Python 3.5 also returns methods in Enum.__dict__. So it would be a new feature, right? The use case seems to be specific to Enum. Can't you add a new method which only returns members (ordered by insertion order)? list(myenum._member_map_.keys()) returns members, sorted by insertion order. Is that what you want?
Code: --- import enum class Color(enum.Enum): red = 1 blue = red green = 2 print(Color.__dict__.keys()) print(list(Color._member_map_.keys())) --- Python 3.5: --- dict_keys(['__module__', '_member_names_', 'green', '_member_type_', 'blue', '_value2member_map_', '_member_map_', '__new__', 'red', '__doc__']) ['red', 'blue', 'green'] --- Python 3.6: --- dict_keys(['_generate_next_value_', '__module__', '__doc__', '_member_names_', '_member_map_', '_member_type_', '_value2member_map_', 'red', 'blue', 'green', '__new__']) ['red', 'blue', 'green'] --- Note: It seems like dir(myenum) ignores "aliases" like blue=red in my example. Victor From rosuav at gmail.com Sun Sep 11 05:46:56 2016 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 11 Sep 2016 19:46:56 +1000 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> Message-ID: On Sun, Sep 11, 2016 at 6:42 PM, Victor Stinner wrote: > 2016-09-10 23:24 GMT-04:00 Nick Coghlan : >> To conform with the updated language spec, implementations just need >> to use collections.OrderedDict in 3 places: >> >> (...) >> - storage type for passing kwargs to functions > > I'm not sure about the "just need" for this one, especially if you > care of performances ;-) > > I mean, it's not easy to write an *efficient* hash table preserving > the insertion order. Otherwise, CPython would have one since Python > 1.5 :-) Can the requirement for kwargs be weakened to "preserves insertion order as long as it is not mutated"? That might make it easier on implementations. ChrisA From eric at trueblade.com Sun Sep 11 08:58:21 2016 From: eric at trueblade.com (Eric V. 
Smith) Date: Sun, 11 Sep 2016 08:58:21 -0400 Subject: [Python-Dev] [Python-checkins] cpython: Use HTTP in testPythonOrg In-Reply-To: <20160911124611.48720.70550.8327B58D@psf.io> References: <20160911124611.48720.70550.8327B58D@psf.io> Message-ID: Hi, Berker. Could you add a comment to the test on why this should use http? I can see this bouncing back and forth between http and https, as people clean up all http usages to be https. Thanks. Eric. On 9/11/2016 8:46 AM, berker.peksag wrote: > https://hg.python.org/cpython/rev/bc085b7e8fd8 > changeset: 103634:bc085b7e8fd8 > user: Berker Peksag > date: Sun Sep 11 15:46:47 2016 +0300 > summary: > Use HTTP in testPythonOrg > > files: > Lib/test/test_robotparser.py | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > > diff --git a/Lib/test/test_robotparser.py b/Lib/test/test_robotparser.py > --- a/Lib/test/test_robotparser.py > +++ b/Lib/test/test_robotparser.py > @@ -276,7 +276,7 @@ > support.requires('network') > with support.transient_internet('www.python.org'): > parser = urllib.robotparser.RobotFileParser( > - "https://www.python.org/robots.txt") > + "http://www.python.org/robots.txt") > parser.read() > self.assertTrue( > parser.can_fetch("*", "http://www.python.org/robots.txt")) > > > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > https://mail.python.org/mailman/listinfo/python-checkins > From elgo8537 at colorado.edu Sun Sep 11 03:45:24 2016 From: elgo8537 at colorado.edu (Elliot Gorokhovsky) Date: Sun, 11 Sep 2016 01:45:24 -0600 Subject: [Python-Dev] Drastically improving list.sort() for lists of strings/ints Message-ID: Hello all, I am interested in making a non-trivial improvement to list.sort(), but before I put in the work, I want to test the waters and see if this is something the community would accept. Basically, I want to implement radix sort for lists of strings.
So list.sort() would detect if it is sorting a list of strings (which is one of the more common things you sort in Python) and, if so, use in-place radix sort (see https://xlinux.nist.gov/dads/HTML/americanFlagSort.html). In-place radix sort is significantly faster for lexicographic sorting than Timsort (or in general any comparison-based sort, since radix can beat the nlogn barrier). If you don't believe that last claim, suppose for the sake of the argument that it's true (because if I actually implemented this I could supply benchmarks to prove it). The idea is the following: in list.sort(), if using the default comparison operator, test the type of the first, middle, and last elements (or something along those lines). If they are all strings, in practice this means the list is very likely a list of strings, so it's probably worth the investment to check and see. So we iterate through and see if they really are all strings (again, in practice it is very unlikely this test would fail). Luckily, this is very, very nearly free (i.e. no memory cost) since we have to iterate through anyway as part of the in-place radix sort (first step is to count how many elements go in each bucket, you iterate through to count. So we would just put a test in the loop to break if it finds a non-string). Additionally, since often one is sorting objects with string or int fields instead of strings or ints directly, one could check the type of the field extracted by the key function or something. Is this something the community would be interested in? Again, supposing I benchmark it and show it gives non-trivial improvements on existing codebases. Like for example I could benchmark the top 10 packages on PyPI and show they run faster using my list.sort(). TL;DR: Should I spend time making list.sort() detect if it is sorting lexicographically and, if so, use a much faster algorithm? Thanks for reading, Elliot -------------- next part -------------- An HTML attachment was scrubbed...
URL: From rosuav at gmail.com Sun Sep 11 12:14:06 2016 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 12 Sep 2016 02:14:06 +1000 Subject: [Python-Dev] Drastically improving list.sort() for lists of strings/ints In-Reply-To: References: Message-ID: On Sun, Sep 11, 2016 at 5:45 PM, Elliot Gorokhovsky wrote: > I am interested in making a non-trivial improvement to list.sort(), but > before I put in the work, I want to test the waters and see if this is > something the community would accept. Basically, I want to implement radix > sort for lists of strings. So list.sort() would detect if it is sorting a > list of strings (which is one of the more common things you sort in python) > and, if so, use in-place radix sort (see > https://xlinux.nist.gov/dads/HTML/americanFlagSort.html). In-place radix > sort is significantly faster for lexicographic sorting than Timsort (or in > general any comparison-based sort, since radix can beat the nlogn barrier). > If you don't believe that last claim, suppose for the sake of the argument > that it's true (because if I actually implemented this I could supply > benchmarks to prove it). I'd like to see these benchmarks, actually. Sounds interesting. How does it fare on different distributions of data - for instance, strings consisting exclusively of ASCII digits and punctuation eg "01:12:35,726 --> 01:12:36,810", or strings consisting primarily of ASCII but with occasional BMP or astral characters, or strings primarily of Cyrillic text, etc? What if every single string begins with a slash character (eg if you're sorting a bunch of path names)? At what list size does it become visibly faster? Could this be put onto PyPI and then benchmarked as lst.sort() vs flagsort(lst) ? 
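Chris's flagsort(lst) doesn't exist yet, but a first cut at the kind of benchmark he describes could look like the sketch below. Note that lsd_radix_sort here is only a stand-in: a simple stable LSD radix sort for equal-length ASCII strings, not the in-place American flag sort from the paper, so the timings only sanity-check the shape of the comparison.

```python
import random
import string
from timeit import timeit

def lsd_radix_sort(lst, width):
    # Stable LSD radix sort for equal-length ASCII strings: one stable
    # bucketing pass per character position, last position first.
    for pos in range(width - 1, -1, -1):
        buckets = [[] for _ in range(128)]
        for s in lst:
            buckets[ord(s[pos])].append(s)
        lst = [s for b in buckets for s in b]
    return lst

rng = random.Random(42)
data = [''.join(rng.choice(string.ascii_lowercase) for _ in range(8))
        for _ in range(10_000)]

# correctness first, then a rough timing comparison
assert lsd_radix_sort(data, 8) == sorted(data)
print("timsort:", timeit(lambda: sorted(data), number=10))
print("radix:  ", timeit(lambda: lsd_radix_sort(data, 8), number=10))
```

A real benchmark would also have to cover the distributions Chris lists above (paths, timestamps, mixed scripts, astral characters) and a range of list sizes.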
ChrisA From raymond.hettinger at gmail.com Sun Sep 11 13:30:12 2016 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 11 Sep 2016 10:30:12 -0700 Subject: [Python-Dev] Drastically improving list.sort() for lists of strings/ints In-Reply-To: References: Message-ID: > On Sep 11, 2016, at 12:45 AM, Elliot Gorokhovsky wrote: > > I am interested in making a non-trivial improvement to list.sort(), but before I put in the work, I want to test the waters and see if this is something the community would accept. Basically, I want to implement radix sort for lists of strings. So list.sort() would detect if it is sorting a list of strings (which is one of the more common things you sort in python) and, if so, use in-place radix sort (see https://xlinux.nist.gov/dads/HTML/americanFlagSort.html). For those who are interested, here is a direct link to the PDF that describes the algorithm. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.22.6990&rep=rep1&type=pdf Raymond From elliot.gorokhovsky at gmail.com Sun Sep 11 13:34:44 2016 From: elliot.gorokhovsky at gmail.com (Elliot Gorokhovsky) Date: Sun, 11 Sep 2016 17:34:44 +0000 Subject: [Python-Dev] Drastically improving list.sort() for lists of strings/ints In-Reply-To: References: Message-ID: Thanks for the link. If you look at the conclusion it says "We recommend American flag sort as an all-round algorithm for sorting strings." On Sun, Sep 11, 2016, 11:30 AM Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > > > On Sep 11, 2016, at 12:45 AM, Elliot Gorokhovsky > wrote: > > > > I am interested in making a non-trivial improvement to list.sort(), but > before I put in the work, I want to test the waters and see if this is > something the community would accept. Basically, I want to implement radix > sort for lists of strings. 
So list.sort() would detect if it is sorting a > list of strings (which is one of the more common things you sort in python) > and, if so, use in-place radix sort (see > https://xlinux.nist.gov/dads/HTML/americanFlagSort.html). > > For those who are interested, here is a direct link to the PDF that > describes the algorithm. > > > http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.22.6990&rep=rep1&type=pdf > > > Raymond > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dickinsm at gmail.com Sun Sep 11 14:15:37 2016 From: dickinsm at gmail.com (Mark Dickinson) Date: Sun, 11 Sep 2016 19:15:37 +0100 Subject: [Python-Dev] Drastically improving list.sort() for lists of strings/ints In-Reply-To: References: Message-ID: > I am interested in making a non-trivial improvement to list.sort() [...] Would your proposed new sorting algorithm be stable? The language currently guarantees stability for `list.sort` and `sorted`. -- Mark From elliot.gorokhovsky at gmail.com Sun Sep 11 14:43:54 2016 From: elliot.gorokhovsky at gmail.com (Elliot Gorokhovsky) Date: Sun, 11 Sep 2016 18:43:54 +0000 Subject: [Python-Dev] Drastically improving list.sort() for lists of strings/ints In-Reply-To: References: Message-ID: The sort can be made stable, but that requires the allocation of an equal-sized auxiliary array. To quote from the paper: "Both list-based and two-array sorts entail Θ(n) space overhead. That overhead shrinks to Θ(log n) in American flag sort, which, like quicksort, trades off stability for space efficiency." So there are two options: follow C++ in providing a stable and unstable sort, or just use stable radix sort at the cost of allocating a scratch array. I understand why the first approach is essentially impossible, since it could break code written under the assumption that list.sort() is stable.
But I think that in Python, since the list just holds pointers to objects instead of objects themselves, being in-place isn't that important: we're missing the cache all the time anyway since our objects are stored all over the place in memory. So I suppose the thing to do is to benchmark stable radix sort against timsort and see if it's still worth it. Again, I really don't think the auxiliary array would make that much of a difference. Note that in timsort we also use an auxiliary array. On Sun, Sep 11, 2016, 12:15 PM Mark Dickinson wrote: > > I am interested in making a non-trivial improvement to list.sort() [...] > > Would your proposed new sorting algorithm be stable? The language > currently guarantees stability for `list.sort` and `sorted`. > > -- > Mark > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dickinsm at gmail.com Sun Sep 11 14:58:08 2016 From: dickinsm at gmail.com (Mark Dickinson) Date: Sun, 11 Sep 2016 19:58:08 +0100 Subject: [Python-Dev] Drastically improving list.sort() for lists of strings/ints In-Reply-To: References: Message-ID: On Sun, Sep 11, 2016 at 7:43 PM, Elliot Gorokhovsky wrote: > So I suppose the thing to do is to benchmark stable radix sort against timsort and see if it's still worth it. Agreed; it would definitely be interesting to see benchmarks for the two-array stable sort as well as the American Flag unstable sort. (Indeed, I think it would be hard to move the proposal forward without such benchmarks.) Apart from the cases already mentioned by Chris, one of the situations you'll want to include in the benchmarks is the case of a list that's already almost sorted (e.g., an already sorted list with a few extra unsorted elements appended). This is a case that does arise in practice, and that Timsort performs particularly well on. 
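One hedged sketch of how that almost-sorted benchmark input could be generated (function name and parameters here are invented for illustration):

```python
import random
import string

def almost_sorted_strings(n, extra=16, seed=0):
    # a sorted body of random 8-letter words with a short unsorted tail
    # appended -- the "already sorted list with a few extra unsorted
    # elements" case described above
    rng = random.Random(seed)
    def word():
        return ''.join(rng.choice(string.ascii_lowercase) for _ in range(8))
    body = sorted(word() for _ in range(n - extra))
    return body + [word() for _ in range(extra)]

data = almost_sorted_strings(100_000)
# timsort detects the long ascending run, sorts only the short tail, and
# merges; a radix sort would still touch every byte of every string
data.sort()
assert data == sorted(data)
```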
-- Mark From ethan at stoneleaf.us Sun Sep 11 15:12:48 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 11 Sep 2016 12:12:48 -0700 Subject: [Python-Dev] PEP520 and absence of __definition_order__ In-Reply-To: References: <57D3BB17.4000704@stoneleaf.us> Message-ID: <57D5ACB0.2080005@stoneleaf.us> On 09/11/2016 01:55 AM, Victor Stinner wrote: > 2016-09-10 3:49 GMT-04:00 Ethan Furman wrote: >> With __definition_order__ Enum can display the actual creation order of enum >> members and methods, while relying on Enum.__dict__.keys() presents a >> jumbled mess with many attributes the user never wrote, the enum members either >> appearing /after/ all the methods (even if actually written before), or >> entirely absent. > > Python 3.5 also returns methods in Enum.__dict__(). So it would be a > new feature, right? __definition_order__ is (would be) a new feature, yes. > The use case seems to be specific to Enum. Can't you add a new method > which only returns members (ordered by insertion order)? The use case is specific to any custom metaclass that does more than enhance the attributes and/or methods already in the class body. > list(myenum._member_maps.keys()) returns members, sorted by insertion > order. Is it what you want? That only includes members, not other attributes nor methods. What I want is to make sure the other points of PEP 520 are not forgotten about, and that Enum conforms to the accepted PEP. 
> Code: > --- > import enum > > class Color(enum.Enum): > red = 1 > blue = red > green = 2 > > print(Color.__dict__.keys()) > print(list(Color._member_map_.keys())) > --- > > Python 3.5: > --- > dict_keys(['__module__', '_member_names_', 'green', '_member_type_', > 'blue', '_value2member_map_', '_member_map_', '__new__', 'red', > '__doc__']) > ['red', 'blue', 'green'] > --- > > Python 3.6: > --- > dict_keys(['_generate_next_value_', '__module__', '__doc__', > '_member_names_', '_member_map_', '_member_type_', > '_value2member_map_', 'red', 'blue', 'green', '__new__']) > ['red', 'blue', 'green'] > --- > > Note: It seems like dir(myenum) ignores "aliases" like blue=red in my example. That is intentional. -- ~Ethan~ From raymond.hettinger at gmail.com Sun Sep 11 15:16:22 2016 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 11 Sep 2016 12:16:22 -0700 Subject: [Python-Dev] Drastically improving list.sort() for lists of strings/ints In-Reply-To: References: Message-ID: <025119A4-531E-435E-AA38-F82FBC67E9B8@gmail.com> > On Sep 11, 2016, at 11:58 AM, Mark Dickinson wrote: > >> So I suppose the thing to do is to benchmark stable radix sort against timsort and see if it's still worth it. > > Agreed; it would definitely be interesting to see benchmarks for the > two-array stable sort as well as the American Flag unstable sort. > (Indeed, I think it would be hard to move the proposal forward without > such benchmarks.) In the meantime, can I suggest moving this discussion to python-ideas. There are many practical issues to be addressed: * sort stability * detecting whether you're dealing with a list of strings * working with unicode rather than inputs limited to a one-byte alphabet * dealing with the multiple compact forms of unicode strings (i.e. complex internal representation) * avoiding degenerate cases * cache performance The referenced article tells us that "troubles with radix sort are in implementation, not in conception". 
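For the second bullet, the detection step Elliot sketched earlier in the thread (sample a few elements cheaply, then confirm with a full pass) might look like this in pure Python (names invented here; a real implementation would live in C inside list.sort()):

```python
def looks_like_str_list(lst):
    # cheap sample first: first, middle, and last elements
    if not lst:
        return False
    sample = (lst[0], lst[len(lst) // 2], lst[-1])
    if any(type(x) is not str for x in sample):
        return False
    # full confirmation scan; in the C version this could be fused into
    # the bucket-counting pass of the radix sort, making it nearly free
    return all(type(x) is str for x in lst)

assert looks_like_str_list(["spam", "eggs", "ham"])
assert not looks_like_str_list(["spam", 42, "ham"])
assert not looks_like_str_list([])
```

The `type(x) is str` check (rather than isinstance) deliberately excludes str subclasses, since a subclass could override comparison and make a byte-wise radix pass give the wrong answer.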
Raymond From brett at python.org Sun Sep 11 15:21:39 2016 From: brett at python.org (Brett Cannon) Date: Sun, 11 Sep 2016 19:21:39 +0000 Subject: [Python-Dev] Python 3.7: remove all private C functions from the Python C API? In-Reply-To: References: Message-ID: In general I support cleaning up our C API to more clearly delineate the boundaries of what people can rely on and what they shouldn't. Could we go farther and do the same separation of the base and limited API at the header level instead of interleaving through #ifndef? On Sun, Sep 11, 2016, 01:38 Victor Stinner wrote: > Hi, > > Currently, Python has 3 C API: > > * python core API > * regular API: subset of the core API > * stable API (ABI?), the Py_LIMITED_API thing: subset of the regular API > > For practical purpose, all functions are declared in Include/*.h. > Basically, Python exposes "everything". There are private functions > which are exported using PyAPI_FUNC(), whereas they should only be > used inside Python "core". Technically, I'm not sure that we can get > ride of PyAPI_FUNC() because the stdlib also has extensions which use > a few private functions. > > For Python 3.7, I propose that we move all these private functions in > separated header files, maybe Include/private/ or Include/core/, and > not export them as part of the "regular API". > > The risk is that too many C extensions rely on all these tiny > "private" functions. Maybe for performance. I don't know. > > What do you think? > > See also the issue #26900, "Exclude the private API from the stable API": > http://bugs.python.org/issue26900 > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From raymond.hettinger at gmail.com Sun Sep 11 16:07:29 2016 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 11 Sep 2016 13:07:29 -0700 Subject: [Python-Dev] Python 3.7: remove all private C functions from the Python C API? In-Reply-To: References: Message-ID: > On Sep 11, 2016, at 1:37 AM, Victor Stinner wrote: > > For Python 3.7, I propose that we move all these private functions in > separated header files, maybe Include/private/ or Include/core/, and > not export them as part of the "regular API". > > The risk is that too many C extensions rely on all these tiny > "private" functions. Maybe for performance. I don't know. > > What do you think? I think the risk is limited and inconsequential. We already document what is public, have specifically set aside a "limited" api, and the leading underscore private naming convention is both well established and well-understood. Even with pure python code, we make the claim that Python is a consenting adults language and that mostly works out just fine. The downside of the proposal is code churn and an increased maintenance burden. Having more include files to search through doesn't make it easier to learn the C code or to maintain it. Over time, the C code has gotten harder to read with cascades of macros and from breaking single concept files into multiple inter-dependent files. Raymond From tjreedy at udel.edu Sun Sep 11 16:48:50 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 11 Sep 2016 16:48:50 -0400 Subject: [Python-Dev] Drastically improving list.sort() for lists of strings/ints In-Reply-To: References: Message-ID: On 9/11/2016 3:45 AM, Elliot Gorokhovsky wrote: > Hello all, > > I am interested in making a non-trivial improvement to list.sort(), This is non-trivial, and harder than you seem to think it is. > but > before I put in the work, I want to test the waters and see if this is > something the community would accept. 
The debate on proposed enhancements is usually whether they are really enhancements, all things considered. For special-case speedups, the 'all things' include the frequency of the special cases, the ease of detecting them, the thoroughness of testing, and the maintainability of the proposed, and likely more complicated, code. > Basically, I want to implement radix sort for lists of strings. Radix sort was invented for fixed-length strings of digits, as in all-digit id 'numbers', so 10 bins. ASCII strings need 128 bins, general byte strings need 256, still manageable. General unicode strings require 1,114,112 bins, most of which will be empty for most character positions. This is harder to manage. So are variable-length strings. In CPython 3.3+, strings at the C level are not just strings but are 1, 2, or 4 bytes per char strings. So you could specifically target lists of bytes (and bytearrays) and lists of strings limited to 1-byte characters. The same code should pretty much work for both. > ... > In-place radix sort is significantly faster for lexicographic sorting than > Timsort (or in general any comparison-based sort, since radix can beat > the nlogn barrier). This unqualified statement is doubly wrong. First, with respect to sorting in general: 'asymptotically faster' only means faster for 'large enough' tasks. Many real world tasks are small, and big tasks get broken down into multiple small tasks. 'Asymptotically slower' algorithms may be faster for small tasks. Tim Peters investigated and empirically determined that an O(n*n) binary insort, as he optimized it on real machines, is faster than O(n*logn) sorting for up to around 64 items. So timsort uses binary insertion to sort up to 64 items. Similarly, you would have to investigate whether there is a size below which timsort is faster. Second, with respect to timsort in particular: timsort is designed to exploit structure and run faster than O(n*logn) in special cases. If a list is already sorted, timsort will do one O(n) scan and stop. Any radix sort will take several times longer.
If a list is already sorted, timsort will do one O(n) scan and stop. Any radix sort will take several times longer. If a list is reverse sorted, timsort will do one O(n) scan and do an O(n) reverse. If a list is the concatenation of two sorted lists, timsort will find the two sorted sublists and merge them. If a sorted list has unsorted items appended to the end, timsort will sort the appended items and then do a merge. I expect any radix sort to be slower for all these cases. Tim Peters somewhere documented his experiments and results with various special but plausible cases of non-randomness. What you might propose is this: if the initial up or down sequence already determined by timsort is less than, say, 64 items, and the length of the list is more than some empirically determined value, and all items are bytes or byte-char strings and some further checks determine the same, then switch to rsort. But you should start by putting a module on PyPI. -- Terry Jan Reedy From srkunze at mail.de Sun Sep 11 16:50:12 2016 From: srkunze at mail.de (Sven R. Kunze) Date: Sun, 11 Sep 2016 22:50:12 +0200 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> Message-ID: On 11.09.2016 01:41, Nathaniel Smith wrote: > I feel like I'm missing something here... by this reasoning, we should > *never* change the language spec when new features are added. E.g. if > people use async/await in 3.5 then their code won't be compatible with > 3.4, but async/await are still part of the language spec. 
And in any > case, the distinction between "CPython feature" and "Python > language-spec-guaranteed feature" is *extremely* arcane and > inside-basebally -- it seems really unlikely that most users will even > understand what this distinction means, never mind let it stop them > from writing CPython-and-PyPy-specific code. Emphasizing that this is > a new feature that only exists in 3.6+ of course makes sense, I just > don't understand why that affects the language spec bit. > > (OTOH it doesn't matter that much anyway... the language spec is > definitely a useful thing, but it's largely aspirational in practice > -- other implementations target CPython compatibility more than they > target language spec compatibility.) The new dict has a thousand and one advantages: no need to import OrderedDict anymore, standard syntax for OrderedDict, etc. People will love it. But is it legal to use? I tend to agree with you here and say "CPython mostly is the living spec" but I'm not 100% sure (I even refrain from writing a blog post about it although it is so wonderful). Cheers, Sven From tim.peters at gmail.com Sun Sep 11 17:42:22 2016 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 11 Sep 2016 16:42:22 -0500 Subject: [Python-Dev] Drastically improving list.sort() for lists of strings/ints In-Reply-To: References: Message-ID: [redirected from python-dev, to python-ideas; please send followups only to python-ideas] [Elliot Gorokhovsky ] > ... > TL;DR: Should I spend time making list.sort() detect if it is sorting > lexicographically and, if so, use a much faster algorithm? It will be fun to find out ;-) As Mark, and especially Terry, pointed out, a major feature of the current sort is that it can exploit many kinds of pre-existing order. As the paper you referenced says, "Realistic sorting problems are usually far from random." But, although they did run some tests against data with significant order, they didn't test against any algorithms _aiming_ at exploiting uniformity.
Just against their radix sort variants, and against a quicksort. That's where it's likely next to impossible to guess in advance whether radix sort _will_ have a real advantage. All the kinds of order the current sort can exploit are far from obvious, because the mechanisms it employs are low-level & very general. For example, consider arrays created by this function, for even `n`: def bn(n): r = [None] * n r[0::2] = range(0, n//2) r[1::2] = range(n//2, n) return r Then, e.g., >>> bn(16) [0, 8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6, 14, 7, 15] This effectively takes range(n), cuts it in half, and does a "perfect shuffle" on the two halves. You'll find nothing in the current code looking for this as a special case, but it nevertheless sorts such arrays in "close to" O(n) time, and despite that there's no natural run in the input longer than 2 elements. That said, I'd encourage you to write your code as a new list method at first, to make it easiest to run benchmarks. If that proves promising, then you can worry about how to make a single method auto-decide which algorithm to use. Also use the two-array version. It's easier to understand and to code, and stability is crucial now. The extra memory burden doesn't bother me - an array of C pointers consumes little memory compared to the memory consumed by the Python objects they point at. Most of all, have fun! :-) From encukou at gmail.com Sun Sep 11 18:58:45 2016 From: encukou at gmail.com (Petr Viktorin) Date: Mon, 12 Sep 2016 00:58:45 +0200 Subject: [Python-Dev] Drastically improving list.sort() for lists of strings/ints In-Reply-To: References: Message-ID: On 09/11/2016 10:48 PM, Terry Reedy wrote: [...] > Second, with respect to timsort in particular: timsort is designed to > exploit structure and run faster than O(n*logn) in special cases. If a > list is already sorted, timsort will do one O(n) scan and stop. Any > radix sort will take several times longer. 
If a list is reverse sorted, > timsort will do one O(n) scan and do an O(n) reverse. If a list is the > concatenation of two sorted lists, timsort will find the two sorted > sublists and merge them. If a sorted list has unsorted items appended > to the end, timsort will sort the appended items and then do a merge. I > expect any radix sort to be slower for all these cases. Tim Peters > somewhere documented his experiments and results with various special > but plausible cases of non-randomness. That write-up is included in Python source: https://github.com/python/cpython/blob/master/Objects/listsort.txt A good read if you want to know what sort of thinking, benchmarking, and justification should go into a new sorting algorithm. From elliot.gorokhovsky at gmail.com Sun Sep 11 20:01:37 2016 From: elliot.gorokhovsky at gmail.com (Elliot Gorokhovsky) Date: Mon, 12 Sep 2016 00:01:37 +0000 Subject: [Python-Dev] Drastically improving list.sort() for lists of strings/ints In-Reply-To: References: Message-ID: Wow, Tim himself! Regarding performance on semi-ordered data: we'll have to benchmark to see, but intuitively I imagine radix would meet Timsort because verifying that a list of strings is sorted takes Omega(nw) (which gives a lower bound on Timsort), where w is the word length. Radix sort is Theta(nw). So at least asymptotically it checks out. I think if one uses the two-array algorithm, other semi-sortings can also be exploited, since the items get placed into their respective buckets in the order in which they appear in the list. So, for the example you gave, one pass would sort it correctly (since the list has the property that if x1 and x2 are in bucket b, x1 comes before x2 in the list, so x1 will also come before x2 in the bucket. Except possibly for one "border bucket" that includes n/2). And then it would just be Theta(nw/b) in each bucket to verify sorted.
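Elliot's stability argument (items land in their buckets in list order, so within-bucket order is exactly the input order) can be sketched with a single hypothetical bucketing pass. This is a toy illustration of the property, not his implementation:

```python
def bucket_pass(strings):
    # One MSD-style pass: bucket on the first character. Items are
    # appended in list order, so each bucket preserves input order.
    buckets = {}
    for s in strings:
        buckets.setdefault(s[:1], []).append(s)
    out = []
    for key in sorted(buckets):
        out.extend(buckets[key])
    return out

# Within-bucket order equals input order (the stability relied on above):
print(bucket_pass(["ba", "ab", "bb", "aa"]))  # ['ab', 'aa', 'ba', 'bb']
# So if the input is already sorted, one pass leaves it sorted:
print(bucket_pass(["aa", "ab", "ba", "bb"]))  # ['aa', 'ab', 'ba', 'bb']
```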
I mean honestly the cool thing about radix is that the best case for Timsort on strings, Omega(nw), is the worst case for radix! So the point is I think the two array version, at least, preserves a lot of structure. Anyway, I hope to have benchmarks (relatively) soon! (I'm a senior in high school so I'm pretty busy...but I'll try to get on this as soon as I can). On Sun, Sep 11, 2016 at 3:42 PM Tim Peters wrote: > [redirected from python-dev, to python-ideas; > please send followups only to python-ideas] > > [Elliot Gorokhovsky ] > > ... > > TL;DR: Should I spend time making list.sort() detect if it is sorting > > lexicographically and, if so, use a much faster algorithm? > > It will be fun to find out ;-) > > As Mark, and especially Terry, pointed out, a major feature of the > current sort is that it can exploit many kinds of pre-existing order. > As the paper you referenced says, "Realistic sorting problems are > usually far from random." But, although they did run some tests > against data with significant order, they didn't test against any > algorithms _aiming_ at exploiting uniformity. Just against their > radix sort variants, and against a quicksort. > > That's where it's likely next to impossible to guess in advance > whether radix sort _will_ have a real advantage. All the kinds of > order the current sort can exploit are far from obvious, because the > mechanisms it employs are low-level & very general. For example, > consider arrays created by this function, for even `n`: > > def bn(n): > r = [None] * n > r[0::2] = range(0, n//2) > r[1::2] = range(n//2, n) > return r > > Then, e.g., > > >>> bn(16) > [0, 8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6, 14, 7, 15] > > This effectively takes range(n), cuts it in half, and does a "perfect > shuffle" on the two halves. 
> > You'll find nothing in the current code looking for this as a special > case, but it nevertheless sorts such arrays in "close to" O(n) time, > and despite that there's no natural run in the input longer than 2 > elements. > > That said, I'd encourage you to write your code as a new list method > at first, to make it easiest to run benchmarks. If that proves > promising, then you can worry about how to make a single method > auto-decide which algorithm to use. > > Also use the two-array version. It's easier to understand and to > code, and stability is crucial now. The extra memory burden doesn't > bother me - an array of C pointers consumes little memory compared to > the memory consumed by the Python objects they point at. > > Most of all, have fun! :-) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve.dower at python.org Sun Sep 11 22:48:15 2016 From: steve.dower at python.org (Steve Dower) Date: Sun, 11 Sep 2016 19:48:15 -0700 Subject: [Python-Dev] [Python-checkins] cpython: Fixes test_getargs2 to get the buildbots working again. In-Reply-To: <20160912024411.21095.89458.35BA1187@psf.io> References: <20160912024411.21095.89458.35BA1187@psf.io> Message-ID: On 11Sep2016 1944, steve.dower wrote: > https://hg.python.org/cpython/rev/7793d34609cb > changeset: 103679:7793d34609cb > user: Steve Dower > date: Sun Sep 11 19:43:51 2016 -0700 > summary: > Fixes test_getargs2 to get the buildbots working again. 
> > files: > Lib/test/test_getargs2.py | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > > diff --git a/Lib/test/test_getargs2.py b/Lib/test/test_getargs2.py > --- a/Lib/test/test_getargs2.py > +++ b/Lib/test/test_getargs2.py > @@ -471,7 +471,7 @@ > > ret = get_args(*TupleSubclass([1, 2])) > self.assertEqual(ret, (1, 2)) > - self.assertIs(type(ret), tuple) > + self.assertIsInstance(ret, tuple) > > ret = get_args() > self.assertIn(ret, ((), None)) I'm not sure this is the fix we want to keep here, but it was sufficient to get the test going and unblock all the buildbots. I'm not entirely sure when the break appeared (essentially we seem to not be copying *args into a new tuple), but I'd guess it's to do with the fast calling improvements. Cheers, Steve From ncoghlan at gmail.com Sun Sep 11 22:54:07 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 12 Sep 2016 12:54:07 +1000 Subject: [Python-Dev] PEP520 and absence of __definition_order__ In-Reply-To: References: <57D3BB17.4000704@stoneleaf.us> Message-ID: On 11 September 2016 at 13:05, Nick Coghlan wrote: > VOC & Batavia *should* be OK (worst case, they return > collections.OrderedDict from __prepare__ and also use it for __dict__ > attributes), but I'm less certain about MicroPython (since I don't > know enough about how its current dict implementation works to know > whether or not they'll be able to make the same change PyPy and > CPython did) MicroPython's Damien George got back to me and indicated that once they get around to working on Python 3.6 compatibility (they're currently still working on 3.5), they'd likely also need to go down the path of using collections.OrderedDict in the situations where the 3.6 language spec calls for it (MicroPython's default dict implementation is less sparse than the CPython one, trading greater memory usage efficiency for an increased risk of hash collisions, so it's unlikely the new implementation would count as "compact" from that perspective).
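The fallback Nick mentions, returning collections.OrderedDict from __prepare__, can be sketched with a minimal metaclass. The OrderedMeta name and the way it records __definition_order__ here are illustrative only, not the PEP 520 implementation:

```python
from collections import OrderedDict

class OrderedMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        # Implementations without an insertion-ordered dict can use
        # OrderedDict as the class-body namespace.
        return OrderedDict()

    def __new__(mcls, name, bases, ns, **kwds):
        cls = super().__new__(mcls, name, bases, dict(ns))
        # Record the definition order, PEP 520-style.
        cls.__definition_order__ = tuple(ns)
        return cls

class C(metaclass=OrderedMeta):
    x = 1
    y = 2

order = C.__definition_order__
print(order.index('x') < order.index('y'))  # True: 'x' was defined first
```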
Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From vadmium+py at gmail.com Sun Sep 11 22:59:43 2016 From: vadmium+py at gmail.com (Martin Panter) Date: Mon, 12 Sep 2016 02:59:43 +0000 Subject: [Python-Dev] [Python-checkins] cpython: Fixes test_getargs2 to get the buildbots working again. In-Reply-To: References: <20160912024411.21095.89458.35BA1187@psf.io> Message-ID: On 12 September 2016 at 02:48, Steve Dower wrote: >> Fixes test_getargs2 to get the buildbots working again. > > I'm not sure this is the fix we want to keep here, but it was sufficient to > get the test going and unblock all the buildbots. > > I'm not entirely sure when the break appeared (essentially we seem to not be > copying *args into a new tuple), but I'd guess it's to do with the fast > calling improvements. That seems to be everyone else's guess too. See https://bugs.python.org/issue28086 (bug about this failure) https://bugs.python.org/issue27213 (bisected cause) From steve.dower at python.org Sun Sep 11 23:16:42 2016 From: steve.dower at python.org (Steve Dower) Date: Sun, 11 Sep 2016 20:16:42 -0700 Subject: [Python-Dev] [Python-checkins] cpython: Fixes test_getargs2 to get the buildbots working again. In-Reply-To: References: <20160912024411.21095.89458.35BA1187@psf.io> Message-ID: On 11Sep2016 1959, Martin Panter wrote: > On 12 September 2016 at 02:48, Steve Dower wrote: >>> Fixes test_getargs2 to get the buildbots working again. >> >> I'm not sure this is the fix we want to keep here, but it was sufficient to >> get the test going and unblock all the buildbots. >> >> I'm not entirely sure when the break appeared (essentially we seem to not be >> copying *args into a new tuple), but I'd guess it's to do with the fast >> calling improvements. > > That seems to be everyone else's guess too. See > https://bugs.python.org/issue28086 (bug about this failure) > https://bugs.python.org/issue27213 (bisected cause) > Huh, I searched and didn't find anything.
Maybe I typo'd my search query? Looking at the bisected cause, it seems like the intent is to allow subclasses of tuple to pass through. Considering this is seriously going to hold up beta 1, I'd rather assume that's the intent and unblock the release. Cheers, Steve From benjamin at python.org Mon Sep 12 02:23:53 2016 From: benjamin at python.org (Benjamin Peterson) Date: Sun, 11 Sep 2016 23:23:53 -0700 Subject: [Python-Dev] Python 3.7: remove all private C functions from the Python C API? In-Reply-To: References: Message-ID: <1473661433.351996.722715065.0E979EA4@webmail.messagingengine.com> That seems like a good idea in abstract. However, the boundaries will have to be delineated. Many functions beginning _Py are effectively part of the public API even for "well-behaved" 3rd-party extensions because they are used by magic macros. For example, _Py_Dealloc is used by Py_DECREF. Ideally, we would set the linkage of functions we really didn't want used externally to "hidden". On Sun, Sep 11, 2016, at 01:37, Victor Stinner wrote: > Hi, > > Currently, Python has 3 C API: > > * python core API > * regular API: subset of the core API > * stable API (ABI?), the Py_LIMITED_API thing: subset of the regular API > > For practical purpose, all functions are declared in Include/*.h. > Basically, Python exposes "everything". There are private functions > which are exported using PyAPI_FUNC(), whereas they should only be > used inside Python "core". Technically, I'm not sure that we can get > ride of PyAPI_FUNC() because the stdlib also has extensions which use > a few private functions. > > For Python 3.7, I propose that we move all these private functions in > separated header files, maybe Include/private/ or Include/core/, and > not export them as part of the "regular API". > > The risk is that too many C extensions rely on all these tiny > "private" functions. Maybe for performance. I don't know. > > What do you think? 
> > See also the issue #26900, "Exclude the private API from the stable API": > http://bugs.python.org/issue26900 > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/benjamin%40python.org From victor.stinner at gmail.com Mon Sep 12 04:41:52 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 12 Sep 2016 10:41:52 +0200 Subject: [Python-Dev] Python 3.7: remove all private C functions from the Python C API? In-Reply-To: <1473661433.351996.722715065.0E979EA4@webmail.messagingengine.com> References: <1473661433.351996.722715065.0E979EA4@webmail.messagingengine.com> Message-ID: 2016-09-12 8:23 GMT+02:00 Benjamin Peterson : > That seems like a good idea in abstract. However, the boundaries will > have to be delineated. Many functions beginning _Py are effectively part > of the public API even for "well-behaved" 3rd-party extensions Oh ok, that's also what I expected. So we should be very careful. Maybe we can experiment building a few major C extensions like numpy to find such issues? I already know that some C extensions have to access low-level internals, like debuggers or profilers. Maybe we need to add something to allow these extensions being compiled with the "private API"? > because they are used by magic macros. For example, _Py_Dealloc is used by Py_DECREF. I suggest to make _Py_Dealloc() public, but explain in its documentation that you should not use it directly :-) In some cases, we should define a function for the public API/ABI, but use a macro for the Python core. We already do that in some cases. 
Example: --- PyAPI_FUNC(PyThreadState *) PyThreadState_Get(void); #ifdef Py_BUILD_CORE PyAPI_DATA(_Py_atomic_address) _PyThreadState_Current; # define PyThreadState_GET() \ ((PyThreadState*)_Py_atomic_load_relaxed(&_PyThreadState_Current)) #else # define PyThreadState_GET() PyThreadState_Get() #endif --- For Py_DECREF, I prefer to keep a macro because this one is performance critical. Victor From storchaka at gmail.com Mon Sep 12 04:59:57 2016 From: storchaka at gmail.com (Serhiy Storchaka) Date: Mon, 12 Sep 2016 11:59:57 +0300 Subject: [Python-Dev] Python 3.7: remove all private C functions from the Python C API? In-Reply-To: References: <1473661433.351996.722715065.0E979EA4@webmail.messagingengine.com> Message-ID: On 12.09.16 11:41, Victor Stinner wrote: > 2016-09-12 8:23 GMT+02:00 Benjamin Peterson : >> That seems like a good idea in abstract. However, the boundaries will >> have to be delineated. Many functions beginning _Py are effectively part >> of the public API even for "well-behaved" 3rd-party extensions > > Oh ok, that's also what I expected. > > So we should be very careful. Maybe we can experiment building a few > major C extensions like numpy to find such issues? I think it would be nice to create a test extension that uses *all* stable functions, build it with old Python versions and test if it works with new Python versions. From christian at python.org Mon Sep 12 05:37:36 2016 From: christian at python.org (Christian Heimes) Date: Mon, 12 Sep 2016 11:37:36 +0200 Subject: [Python-Dev] cpython: Issue #27999: Make "global after use" a SyntaxError, and ditto for nonlocal.
In-Reply-To: <20160909165315.21136.26750.46DADC30@psf.io> References: <20160909165315.21136.26750.46DADC30@psf.io> Message-ID: On 2016-09-09 18:53, guido.van.rossum wrote: > https://hg.python.org/cpython/rev/804b71d43c85 > changeset: 103415:804b71d43c85 > user: Guido van Rossum > date: Fri Sep 09 09:36:26 2016 -0700 > summary: > Issue #27999: Make "global after use" a SyntaxError, and ditto for nonlocal. > Patch by Ivan Levkivskyi. > > files: > Doc/reference/simple_stmts.rst | 5 +- > Lib/test/test_syntax.py | 18 +++- > Misc/NEWS | 3 + > Python/symtable.c | 104 +++++++------------- > 4 files changed, 59 insertions(+), 71 deletions(-) > [...] > @@ -1337,31 +1313,23 @@ > long cur = symtable_lookup(st, name); > if (cur < 0) > VISIT_QUIT(st, 0); > - if (cur & DEF_ANNOT) { > - PyErr_Format(PyExc_SyntaxError, > - "annotated name '%U' can't be nonlocal", > - name); > + if (cur & (DEF_LOCAL | USE | DEF_ANNOT)) { > + char* msg; > + if (cur & DEF_ANNOT) { > + msg = NONLOCAL_ANNOT; > + } > + if (cur & DEF_LOCAL) { > + msg = NONLOCAL_AFTER_ASSIGN; > + } > + else { > + msg = NONLOCAL_AFTER_USE; > + } > + PyErr_Format(PyExc_SyntaxError, msg, name); Hi Guido, did you mean if / else if / else here? It's not completely clear if the code means to set msg a second time if both cur & DEF_ANNOT and cur & DEF_LOCAL are true. Christian From levkivskyi at gmail.com Mon Sep 12 05:46:12 2016 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Mon, 12 Sep 2016 11:46:12 +0200 Subject: [Python-Dev] cpython: Issue #27999: Make "global after use" a SyntaxError, and ditto for nonlocal. In-Reply-To: References: <20160909165315.21136.26750.46DADC30@psf.io> Message-ID: Christian, When I wrote this, my intention was like: cur & DEF_LOCAL is a "more serious" error, so that if both errors are made in the same statement: def f(): x: int = 5 global x "SyntaxError: global after assignment" will be reported. The same logic applies to nonlocal. 
-- Ivan On 12 September 2016 at 11:37, Christian Heimes wrote: > On 2016-09-09 18:53, guido.van.rossum wrote: > > https://hg.python.org/cpython/rev/804b71d43c85 > > changeset: 103415:804b71d43c85 > > user: Guido van Rossum > > date: Fri Sep 09 09:36:26 2016 -0700 > > summary: > > Issue #27999: Make "global after use" a SyntaxError, and ditto for > nonlocal. > > Patch by Ivan Levkivskyi. > > > > files: > > Doc/reference/simple_stmts.rst | 5 +- > > Lib/test/test_syntax.py | 18 +++- > > Misc/NEWS | 3 + > > Python/symtable.c | 104 +++++++------------- > > 4 files changed, 59 insertions(+), 71 deletions(-) > > > > [...] > > > @@ -1337,31 +1313,23 @@ > > long cur = symtable_lookup(st, name); > > if (cur < 0) > > VISIT_QUIT(st, 0); > > - if (cur & DEF_ANNOT) { > > - PyErr_Format(PyExc_SyntaxError, > > - "annotated name '%U' can't be nonlocal", > > - name); > > + if (cur & (DEF_LOCAL | USE | DEF_ANNOT)) { > > + char* msg; > > + if (cur & DEF_ANNOT) { > > + msg = NONLOCAL_ANNOT; > > + } > > + if (cur & DEF_LOCAL) { > > + msg = NONLOCAL_AFTER_ASSIGN; > > + } > > + else { > > + msg = NONLOCAL_AFTER_USE; > > + } > > + PyErr_Format(PyExc_SyntaxError, msg, name); > > Hi Guido, > > did you mean if / else if / else here? It's not completely clear if the > code means to set msg a second time if both cur & DEF_ANNOT and cur & > DEF_LOCAL are true. > > Christian > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > levkivskyi%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian at python.org Mon Sep 12 06:24:06 2016 From: christian at python.org (Christian Heimes) Date: Mon, 12 Sep 2016 12:24:06 +0200 Subject: [Python-Dev] cpython: Issue #27999: Make "global after use" a SyntaxError, and ditto for nonlocal. 
In-Reply-To: References: <20160909165315.21136.26750.46DADC30@psf.io> Message-ID: On 2016-09-12 11:46, Ivan Levkivskyi wrote: > Christian, > > When I wrote this, my intention was like: cur & DEF_LOCAL is a "more > serious" error, so that if both errors are made in the same statement: > def f(): > x: int = 5 > global x > > "SyntaxError: global after assignment" will be reported. The same logic > applies to nonlocal. Hi Ivan, thanks for your explanation. The code looks suspicious. Can you please provide a patch that makes it more obvious, e.g. either by using if / else if / else or a comment? Christian From victor.stinner at gmail.com Mon Sep 12 06:28:03 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 12 Sep 2016 12:28:03 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Use HTTP in testPythonOrg In-Reply-To: References: <20160911124611.48720.70550.8327B58D@psf.io> Message-ID: I just noticed a failure on a recent Windows build: http://buildbot.python.org/all/builders/x86%20Windows7%203.x/builds/11620/steps/test/logs/stdio "urllib.error.URLError: " So I guess that the change is to restrict the unit test on parsing the robot failed and not test the SSL module. Am I right? Victor 2016-09-11 14:58 GMT+02:00 Eric V. Smith : > Hi, Berker. > > Could you add a comment to the test on why this should use http? I can see > this bouncing back and forth between http and https, as people clean an up > all http usages to be https. > > Thanks. > Eric. 
> > On 9/11/2016 8:46 AM, berker.peksag wrote: >> >> https://hg.python.org/cpython/rev/bc085b7e8fd8 >> changeset: 103634:bc085b7e8fd8 >> user: Berker Peksag >> date: Sun Sep 11 15:46:47 2016 +0300 >> summary: >> Use HTTP in testPythonOrg >> >> files: >> Lib/test/test_robotparser.py | 2 +- >> 1 files changed, 1 insertions(+), 1 deletions(-) >> >> >> diff --git a/Lib/test/test_robotparser.py b/Lib/test/test_robotparser.py >> --- a/Lib/test/test_robotparser.py >> +++ b/Lib/test/test_robotparser.py >> @@ -276,7 +276,7 @@ >> support.requires('network') >> with support.transient_internet('www.python.org'): >> parser = urllib.robotparser.RobotFileParser( >> - "https://www.python.org/robots.txt") >> + "http://www.python.org/robots.txt") >> parser.read() >> self.assertTrue( >> parser.can_fetch("*", >> "http://www.python.org/robots.txt")) >> >> >> >> _______________________________________________ >> Python-checkins mailing list >> Python-checkins at python.org >> https://mail.python.org/mailman/listinfo/python-checkins >> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com From levkivskyi at gmail.com Mon Sep 12 06:29:42 2016 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Mon, 12 Sep 2016 12:29:42 +0200 Subject: [Python-Dev] cpython: Issue #27999: Make "global after use" a SyntaxError, and ditto for nonlocal. In-Reply-To: References: <20160909165315.21136.26750.46DADC30@psf.io> Message-ID: On 12 September 2016 at 12:24, Christian Heimes wrote: > The code looks suspicious. Can you please > provide a patch that makes it more obvious, e.g. either by using if / > else if / else or a comment? Sure, I will open an issue with a patch and will add you to nosy (cannot do this *right* now, sorry). -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From christian at python.org Mon Sep 12 06:35:13 2016 From: christian at python.org (Christian Heimes) Date: Mon, 12 Sep 2016 12:35:13 +0200 Subject: [Python-Dev] cpython: Issue #27999: Make "global after use" a SyntaxError, and ditto for nonlocal. In-Reply-To: References: <20160909165315.21136.26750.46DADC30@psf.io> Message-ID: On 2016-09-12 12:29, Ivan Levkivskyi wrote: > On 12 September 2016 at 12:24, Christian Heimes > wrote: > > The code looks suspicious. Can you please > provide a patch that makes it more obvious, e.g. either by using if / > else if / else or a comment? > > > Sure, I will open an issue with a patch and will add you to nosy (cannot > do this *right* now, sorry). Don't worry, it's not relevant for the beta release. My request is purely cosmetic to make the code a bit easier to understand. :) Christian From solipsis at pitrou.net Mon Sep 12 07:50:38 2016 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 12 Sep 2016 13:50:38 +0200 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> Message-ID: <20160912135038.4c2eb635@fsol> On Fri, 9 Sep 2016 14:01:08 -0500 David Mertz wrote: > It seems unlikely, but not inconceivable, that someday in the future > someone will implement a dictionary that is faster than current versions > but at the cost of losing inherent ordering. I agree with this. Since ordering is a constraint, in abstracto it is quite understandable that relaxing a constraint may enable more efficient algorithms or implementations. Besides, I don't think it has been proven that the compact-and-ordered dict implementation is actually *faster* than the legacy one. It is more compact, which can matter in some contexts (memory-heavy workloads with lots of objects, perhaps), but not necessarily others. Regards Antoine. 
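The compact layout Antoine refers to, a small sparse index table plus a dense insertion-ordered entry array, can be modeled in a few lines of Python. This is a toy sketch for intuition only: no resizing, no deletion, and plain linear probing instead of CPython's perturbed probing:

```python
class CompactDict:
    def __init__(self):
        self.indices = [None] * 8  # sparse hash table of entry indices
        self.entries = []          # dense, insertion-ordered (hash, key, value)

    def _slot(self, key):
        mask = len(self.indices) - 1
        i = hash(key) & mask
        while self.indices[i] is not None:
            if self.entries[self.indices[i]][1] == key:
                return i
            i = (i + 1) & mask  # linear probing; CPython perturbs instead
        return i

    def __setitem__(self, key, value):
        i = self._slot(key)
        if self.indices[i] is None:
            self.indices[i] = len(self.entries)
            self.entries.append((hash(key), key, value))
        else:
            self.entries[self.indices[i]] = (hash(key), key, value)

    def __getitem__(self, key):
        i = self._slot(key)
        if self.indices[i] is None:
            raise KeyError(key)
        return self.entries[self.indices[i]][2]

    def __iter__(self):
        return (k for _, k, _ in self.entries)

d = CompactDict()
d['b'] = 1; d['a'] = 2; d['b'] = 3
print(list(d))  # ['b', 'a'] -- insertion order survives the overwrite
```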
From solipsis at pitrou.net Mon Sep 12 07:57:57 2016 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 12 Sep 2016 13:57:57 +0200 Subject: [Python-Dev] Let's make the SSL module sane References: <3f43848e-b53b-582c-2bbe-dab7d1e1f6b0@python.org> Message-ID: <20160912135757.16ca713b@fsol> On Sat, 10 Sep 2016 16:22:57 +0200 Christian Heimes wrote: > > For 3.6 I like to make the SSL module more sane and more secure by default. > Yes, I'm a bit late but all my proposals are implemented, documented, > partly tested and existing tests are passing. I don't have time nor motivation to review most of them, but I trust you that the implementations are sane :-) > First I like to deprecate some old APIs in favor of SSLContext. This has always been the plan (to me), so a big +1. > The patch > also deprecates certfile, keyfile and similar arguments in network > protocol libraries. +1. > I also considered to make cert validation enabled by default for all > protocols in 3.6, Victor has raised some concerns. I assume you mean "in client mode". I think that sounds fine nowadays. If people haven't configured a set of trusted CAs properly, this should error out immediately, so they would notice it quickly IMHO. (in other words, +0.5) > How about we change > the behavior in 3.7 and just add a warning to 3.6? As you (or others) prefer :-) > Next up SSLContext default configuration. A bare SSLContext comes with > insecure default settings. I'd like to make SSLContext(PROTOCOL_SSLv23) > secure by default. Changelog: The context is created with more secure > default values. The options OP_NO_COMPRESSION, > OP_CIPHER_SERVER_PREFERENCE, OP_SINGLE_DH_USE, OP_SINGLE_ECDH_USE, > OP_NO_SSLv2 (except for PROTOCOL_SSLv2), and OP_NO_SSLv3 (except for > PROTOCOL_SSLv3) are set by default. > The initial cipher suite list > contains only HIGH ciphers, no NULL ciphers and MD5 ciphers (except for > PROTOCOL_SSLv2). +1 to all this from me. The ship has sailed on most of this stuff already.
> Finally (and this is the biggest) I like to change how the protocols > work. OpenSSL 1.1.0 has deprecated all version specific protocols. Soon > OpenSSL will only support auto-negotiation (formerly known as > PROTOCOL_SSLv23). My patch #26470 added PROTOCOL_TLS as alias for > PROTOCOL_SSLv23. If the last idea is accepted I will remove PROTOCOL_TLS > again. It hasn't been released yet. Instead I'm going to add > PROTOCOL_TLS_CLIENT and PROTOCOL_TLS_SERVER (see > https://www.openssl.org/docs/manmaster/ssl/SSL_CTX_new.html > TLS_server_method(), TLS_client_method()). PROTOCOL_TLS_CLIENT is like > PROTOCOL_SSLv23 but only supports client-side sockets and > PROTOCOL_TLS_SERVER just server-side sockets. In my experience we can't > have a SSLContext with sensible and secure settings for client and > server at the same time. Hostname checking and cert validation is only > sensible for client-side sockets. This sounds reasonable. No strong opinion from me but +0.5 as well. > Starting in 3.8 (or 3.7?) there will be only PROTOCOL_TLS_CLIENT and > PROTOCOL_TLS_SERVER. You *may* provide the old constants for compatibility, though (meaning "PROTOCOL_TLS", roughly). > How will my proposals change TLS/SSL code? > > Application must create a SSLContext object. Applications are > recommended to keep the context around to benefit from session reusage > and reduce overload of cert parsing. (well, most applications are advised to use an intermediate layer such as httplib ;-)) > I hope this mail makes sense. It does to me! Regards Antoine. From solipsis at pitrou.net Mon Sep 12 08:03:30 2016 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 12 Sep 2016 14:03:30 +0200 Subject: [Python-Dev] Python 3.7: remove all private C functions from the Python C API? 
References: Message-ID: <20160912140330.2bb0be28@fsol> On Sun, 11 Sep 2016 04:37:58 -0400 Victor Stinner wrote: > > For Python 3.7, I propose that we move all these private functions in > separated header files, maybe Include/private/ or Include/core/, and > not export them as part of the "regular API". -1 from me. There are reasons to rely on private stuff when necessary. As long as private APIs are underscore-prefixed, people know what they are risking by using them. Regards Antoine. From solipsis at pitrou.net Mon Sep 12 08:01:15 2016 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 12 Sep 2016 14:01:15 +0200 Subject: [Python-Dev] Let's make the SSL module sane References: <3f43848e-b53b-582c-2bbe-dab7d1e1f6b0@python.org> <5F5D3263-3AD7-42C2-8F0F-E025C1938598@stufft.io> <25b0cd47-5833-71af-0b51-07bd07287731@python.org> Message-ID: <20160912140115.3e8a3a0e@fsol> On Sat, 10 Sep 2016 20:23:13 +0200 Christian Heimes wrote: > > It's a bit too clever and tricky for my taste. I prefer 'explicit is > better than implicit' for trust anchors. My main concern are secure > default settings. A SSLContext should be secure w/o further settings in > order to prevent developers to shoot themselves in the knee. > > Missing root certs are not a direct security issue with CERT_REQUIRED. > The connection will simply fail. I'd rather improve the error message > than to auto-load certs. Agreed with all this. You don't want to have "magic" behaviour in a security-oriented module. Let people configure their contexts explicitly. As a reminder, people who don't want to configure TLS themselves should use an intermediate layer instead, such as ssl.create_default_context() or an application protocol implementation (httplib, etc.). Regards Antoine. 
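Antoine's advice to reach for an intermediate layer instead of hand-configuring a context looks like this in practice; ssl.create_default_context() is the stdlib helper, and the actual connection is only sketched in comments (no network access):

```python
import socket
import ssl

# The helper returns a client context with the secure defaults this
# thread wants: certificate validation and hostname checking enabled.
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True
print(ctx.check_hostname)                    # True

# Typical client-side use (not executed here):
# with socket.create_connection(("example.org", 443)) as sock:
#     with ctx.wrap_socket(sock, server_hostname="example.org") as tls:
#         print(tls.version())
```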
From victor.stinner at gmail.com Mon Sep 12 08:36:53 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 12 Sep 2016 14:36:53 +0200 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: <20160912135038.4c2eb635@fsol> References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> Message-ID: 2016-09-12 13:50 GMT+02:00 Antoine Pitrou : > Besides, I don't think it has been proven that the compact-and-ordered > dict implementation is actually *faster* than the legacy one. Python 3.6 dict is slower than Python 3.5 dict, at least for a simple lookup: http://bugs.python.org/issue27350#msg275581 But its memory usage is 25% smaller. I'm curious about the performance of the "compaction" needed after adding too many dummy entries (and to preserve insertion order), but I don't know how to benchmark this :-) Maybe add/remove many new keys? I expect bad performance on the compaction, but maybe not as bad as the "hash DoS". For regular Python code, I don't expect compaction to be a common operation, since it's rare to remove attributes. It's more common to modify attributes value, than to remove them and later add new attributes. Victor From greg at krypto.org Mon Sep 12 12:27:14 2016 From: greg at krypto.org (Gregory P. Smith) Date: Mon, 12 Sep 2016 16:27:14 +0000 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> Message-ID: For the regular dict (non kwargs or namespace __dict__) use case I would actually like to *see disorder preserved during iteration*. 
If we don't, we will eventually find ourselves in a similar state we were in pre hash-randomization: (1) Over time, code will come to depend on the order for no good reason. Especially true of tests. This greatly increases the engineering barrier when trying to move a codebase between Python versions or Python VMs. The underlying implementation is free to preserve order (as it now does, great work!) but I think the behavior of iteration when an ordered type was not explicitly requested or ordered iteration was not explicitly requested should be perturbed in order to maintain long term code health. Disorder for this purpose need not be a random shuffle (overkill). It just needs to be regularly inconsistent. A simple thing to do on top of 3.6's new dict implementation would be to pick a random starting point within the order array rather than offset 0 to start iteration from. That small change would be sufficient to guarantee that code depending on order must ask for order. It could even allow us to get people ready for iteration within the same process to become unstable. Maybe I worry too much. Having slogged through fixing problems to enable hash randomization on a code base of tens of millions of lines in 2012... there is a lot of value in enforcing disorder where none is intended to be guaranteed. Explicit is better than implicit. -gps On Mon, Sep 12, 2016 at 5:37 AM Victor Stinner wrote: > 2016-09-12 13:50 GMT+02:00 Antoine Pitrou : > > Besides, I don't think it has been proven that the compact-and-ordered > > dict implementation is actually *faster* than the legacy one. > > Python 3.6 dict is slower than Python 3.5 dict, at least for a simple > lookup: > http://bugs.python.org/issue27350#msg275581 > > But its memory usage is 25% smaller. > > I'm curious about the performance of the "compaction" needed after > adding too many dummy entries (and to preserve insertion order), but I > don't know how to benchmark this :-) Maybe add/remove many new keys?
I > expect bad performance on the compaction, but maybe not as bad as the > "hash DoS". > > For regular Python code, I don't expect compaction to be a common > operation, since it's rare to remove attributes. It's more common to > modify attributes value, than to remove them and later add new > attributes. > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/greg%40krypto.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Sep 12 12:35:30 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 12 Sep 2016 09:35:30 -0700 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> Message-ID: Couldn't we use the order in the actual hash table (which IIUC now contains just indexes into the ordered vector of key/value/hash structs)? That would probably simulate the pre-3.6 order quite effectively. But we'd have to add a new API to reveal the order (in effect just what Nick wanted). How much of the OrderedDict can be implemented just by adding new methods (IOW without changing the data structure)? On Mon, Sep 12, 2016 at 9:27 AM, Gregory P. Smith wrote: > For the regular dict (non kwargs or namespace __dict__) use case I would > actually like to see disorder preserved during iteration. > > If we don't, we will eventually to find ourselves in a similar state we were > in pre hash-randomization: > (1) Over time, code will come to depend on the order for no good reason. > Especially true of tests. This greatly increases the engineering barrier > when trying to move a codebase between Python versions or Python VMs. 
> > The underlying implementation is free to preserve order (as it now does, > great work!) but I think the behavior of iteration when an ordered type was > not explicitly requested or ordered iteration was not explicitly requested > should be perturbed in order to maintain long term code health. > > Disorder for this purpose need not be a random shuffle (overkill). It just > needs to be regularly inconsistent. A simple thing to do on top of 3.6's new > dict implementation would be to pick a random starting point within the > order array rather than offset 0 to start iteration from. That small change > would be sufficient to guarantee that code depending on order must ask for > order. It could even allow us to get people ready for iteration within the > same process to become unstable. > > Maybe I worry too much. Having slogged through fixing problems to enable > hash randomization on a code base of tens of millions of lines in 2012... > there is a lot of value in enforcing disorder where none is intended to be > guaranteed. Explicit is better than implicit. > > -gps > > On Mon, Sep 12, 2016 at 5:37 AM Victor Stinner > wrote: >> >> 2016-09-12 13:50 GMT+02:00 Antoine Pitrou : >> > Besides, I don't think it has been proven that the compact-and-ordered >> > dict implementation is actually *faster* than the legacy one. >> >> Python 3.6 dict is slower than Python 3.5 dict, at least for a simple >> lookup: >> http://bugs.python.org/issue27350#msg275581 >> >> But its memory usage is 25% smaller. >> >> I'm curious about the performance of the "compaction" needed after >> adding too many dummy entries (and to preserve insertion order), but I >> don't know how to benchmark this :-) Maybe add/remove many new keys? I >> expect bad performance on the compaction, but maybe not as bad as the >> "hash DoS". >> >> For regular Python code, I don't expect compaction to be a common >> operation, since it's rare to remove attributes. 
It's more common to >> modify attributes value, than to remove them and later add new >> attributes. >> >> Victor >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/greg%40krypto.org > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) From rosuav at gmail.com Mon Sep 12 12:51:29 2016 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 13 Sep 2016 02:51:29 +1000 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> Message-ID: On Tue, Sep 13, 2016 at 2:27 AM, Gregory P. Smith wrote: > Disorder for this purpose need not be a random shuffle (overkill). It just > needs to be regularly inconsistent. A simple thing to do on top of 3.6's new > dict implementation would be to pick a random starting point within the > order array rather than offset 0 to start iteration from. That small change > would be sufficient to guarantee that code depending on order must ask for > order. It could even allow us to get people ready for iteration within the > same process to become unstable. Don't forget that .items(), .keys(), and .values() are all synchronized, so you'd probably have to pick an offset at dict creation and run with it forever after. 
ChrisA From yselivanov.ml at gmail.com Mon Sep 12 12:52:42 2016 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 12 Sep 2016 12:52:42 -0400 Subject: [Python-Dev] Python 3.7: remove all private C functions from the Python C API? In-Reply-To: References: Message-ID: Some of the functions we have are really intended to be used *only* by the interpreter itself. For those it would be cool to have them in private headers (AFAIK we already do this, see dict-common.h for instance). Other than that, I think that using the underscore convention is fine. Yury On 2016-09-11 4:37 AM, Victor Stinner wrote: > Hi, > > Currently, Python has 3 C API: > > * python core API > * regular API: subset of the core API > * stable API (ABI?), the Py_LIMITED_API thing: subset of the regular API > > For practical purpose, all functions are declared in Include/*.h. > Basically, Python exposes "everything". There are private functions > which are exported using PyAPI_FUNC(), whereas they should only be > used inside Python "core". Technically, I'm not sure that we can get > rid of PyAPI_FUNC() because the stdlib also has extensions which use > a few private functions. > > For Python 3.7, I propose that we move all these private functions in > separated header files, maybe Include/private/ or Include/core/, and > not export them as part of the "regular API". > > The risk is that too many C extensions rely on all these tiny > "private" functions. Maybe for performance. I don't know. > > What do you think?
> > See also the issue #26900, "Exclude the private API from the stable API": > http://bugs.python.org/issue26900 > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/yselivanov.ml%40gmail.com From songofacandy at gmail.com Mon Sep 12 12:56:05 2016 From: songofacandy at gmail.com (INADA Naoki) Date: Tue, 13 Sep 2016 01:56:05 +0900 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> Message-ID: On Tue, Sep 13, 2016 at 1:35 AM, Guido van Rossum wrote: > Couldn't we use the order in the actual hash table (which IIUC now > contains just indexes into the ordered vector of key/value/hash > structs)? That would probably simulate the pre-3.6 order quite > effectively. Maybe it can. But the current implementation may be faster on iteration, thanks to the CPU's hardware prefetching. When sizeof(entry) is 24 (amd64), only 2.66... entries fit on a cache line. > But we'd have to add a new API to reveal the order (in effect just > what Nick wanted). How much of the OrderedDict can be implemented just > by adding new methods (IOW without changing the data structure)? The current data structure uses a fixed-capacity, mostly append-only array for entries. To implement `OrderedDict.move_to_end(last=False)`, OrderedDict would need more hacks (e.g. using the array as a ring buffer).
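[The two-table design discussed in this thread — a sparse index table pointing into a dense, append-only entries array — can be sketched in pure Python. This is an illustrative toy model only: the class name, linear probing, and sentinel values are simplifications for exposition, not CPython's actual C implementation, which uses perturbed probing, table resizing, and different sentinels.]

```python
# Toy model of the "compact dict" layout: a sparse index table (the hash
# table) pointing into a dense, append-only entries array. Illustrative
# sketch only -- NOT CPython's actual C implementation.

FREE, DUMMY = -1, -2  # sentinels stored in the sparse index table

class CompactDict:
    def __init__(self, size=8):
        self.indices = [FREE] * size   # sparse hash table of indices
        self.entries = []              # dense [hash, key, value] records

    def _slot(self, key):
        # Linear probing for simplicity; DUMMY slots are skipped so that
        # lookups keep probing past deleted keys.
        i = hash(key) % len(self.indices)
        while True:
            ix = self.indices[i]
            if ix == FREE or (ix >= 0 and self.entries[ix] is not None
                              and self.entries[ix][1] == key):
                return i
            i = (i + 1) % len(self.indices)

    def __setitem__(self, key, value):
        i = self._slot(key)
        if self.indices[i] == FREE:
            # New keys are only ever appended => iteration order is
            # insertion order, for free.
            self.indices[i] = len(self.entries)
            self.entries.append([hash(key), key, value])
        else:
            self.entries[self.indices[i]][2] = value

    def __delitem__(self, key):
        i = self._slot(key)
        ix = self.indices[i]
        if ix < 0:
            raise KeyError(key)
        self.entries[ix] = None        # entry slot becomes NULL
        self.indices[i] = DUMMY        # dummy key stays in the hash table

    def __iter__(self):
        # Iteration walks the dense array, so the hash table (and hash
        # randomization) no longer affects iteration order.
        return (e[1] for e in self.entries if e is not None)

d = CompactDict()
for k in "abcd":
    d[k] = k.upper()
del d["b"]
print(list(d))  # ['a', 'c', 'd'] -- insertion order, minus deleted keys
```

This also makes the trade-offs in the thread concrete: deleting leaves a NULL hole in the dense array and a DUMMY in the index table, so a real implementation must periodically compact both, and `move_to_end(last=False)` has no cheap implementation on a plain append-only array.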
From victor.stinner at gmail.com Mon Sep 12 13:00:46 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 12 Sep 2016 19:00:46 +0200 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> Message-ID: 2016-09-12 18:35 GMT+02:00 Guido van Rossum : > Couldn't we use the order in the actual hash table (which IIUC now > contains just indexes into the ordered vector of key/value/hash > structs)? That would probably simulate the pre-3.6 order quite > effectively. From what I understood, Python 3.6 dict got two *different* changes: * modify the dict structure to use two tables instead of only one: an "index" table (the hash table) and a second key/value table * tune the dict implementation to only append to the key/value table The second change depends on the first change. When a key is deleted, the entry is marked as DUMMY. When we add a new item, DUMMY entries are skipped and we only append at the end of the key/value table. Sometimes, the key/value table is compacted to free memory: all DUMMY entries are removed. It would be possible to add a flag to allow to reuse DUMMY entries, which means losing the order. The order would only be lost when we add the first item after we removed at least one entry (when the first DUMMY entry is reused). The OrderedDict would set the flag to preserve the order. So technically, it is possible. The question is more what should be the "default" dict :-) Ordered or not?
:-) Victor From guido at python.org Mon Sep 12 13:04:18 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 12 Sep 2016 10:04:18 -0700 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> Message-ID: Wouldn't attempting to reuse DUMMY entries be expensive? You'd have to search forward in the array. Just keeping a count of DUMMY entries and compacting when there are too many seems better somehow. On Mon, Sep 12, 2016 at 10:00 AM, Victor Stinner wrote: > 2016-09-12 18:35 GMT+02:00 Guido van Rossum : >> Couldn't we use the order in the actual hash table (which IIUC now >> contains just indexes into the ordered vector of key/value/hash >> structs)? That would probably simulate the pre-3.6 order quite >> effectively. > > From what I understood, Python 3.6 dict got two *different* changes: > > * modify the dict structure to use two tables instead of only one: an > "index" table (the hash table) and a second key/value table > * tune the dict implementation to only append to the key/value table > > The second change depends on the first change. > > When a key is deleted, the entry is marked as DUMMY. When we add a new > item, DUMMY entries are skipped and we only append at the end of the > key/value table. Sometimes, the key/value table is compacted to free > memory: all DUMMY entries are removed. > > It would be possible to add a flag to allow to reuse DUMMY entries, > which means loosing the order. The order would only be lost when we > add the first item after we removed at least one entry (when the first > DUMMY entry is reused). > > The OrderedDict would set the flag to preserve the order. > > So technically, it is possible. The question is more what should be > the "default" dict :-) Ordered or not? 
:-) > > Victor -- --Guido van Rossum (python.org/~guido) From tim.peters at gmail.com Mon Sep 12 13:16:35 2016 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 12 Sep 2016 12:16:35 -0500 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> Message-ID: [Guido] > Wouldn't attempting to reuse DUMMY entries be expensive? You'd have to > search forward in the array. Just keeping a count of DUMMY entries and > compacting when there are too many seems better somehow. I haven't looked at the code, but presumably one of the members of a DUMMY key/value struct could be (ab)used to hold the index of "the next" DUMMY (i.e., treating DUMMYs as a stack implemented by a singly-linked list). In which case no search is needed, but the dict would need a word to hold the index of the DUMMY stack top (or, e.g., -1 when no DUMMY exists) - or dedicate "the first" key/value slot to holding the stack top - or ... It's just code, so it can do anything ;-) From songofacandy at gmail.com Mon Sep 12 13:24:39 2016 From: songofacandy at gmail.com (INADA Naoki) Date: Tue, 13 Sep 2016 02:24:39 +0900 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> Message-ID: > From what I understood, Python 3.6 dict got two *different* changes: > > * modify the dict structure to use two tables instead of only one: an > "index" table (the hash table) and a second key/value table > * tune the dict implementation to only append to the key/value table > > The second change depends on the first change. > > When a key is deleted, the entry is marked as DUMMY. 
When we add a new > item, DUMMY entries are skipped and we only append at the end of the > key/value table. Sometimes, the key/value table is compacted to free > memory: all DUMMY entries are removed. Minor correction: the dummy key is put in the *hash* table. The purpose of the dummy key is the same as in the previous dict implementation. The deleted entry is filled with NULL. > It would be possible to add a flag to allow to reuse DUMMY entries, > which means losing the order. The order would only be lost when we > add the first item after we removed at least one entry (when the first > DUMMY entry is reused). Reusing a NULL entry is possible, like the original compact dict idea by Raymond. But we should rebuild the hash table before it is filled with dummy keys. Otherwise, hash lookup may become very slow, or never stop. Sparseness of the hash table is very important. Compaction in the current implementation is not only for packing key/value entries, but also for removing dummy keys from the hash table. > > The OrderedDict would set the flag to preserve the order. > > So technically, it is possible. The question is more what should be > the "default" dict :-) Ordered or not? :-) Even if dict doesn't preserve insertion order across deletions, people may come to depend on "preserving insertion order unless there is a deletion". So the fundamental question is: Is it so bad that some people write code depending on the CPython and PyPy implementation? I think cross-interpreter libraries can use OrderedDict correctly when they should use it. (They may run tests on micropython, Jython and IronPython). And I think there are many use cases where keeping insertion order is not required, but it's very nice if it is nearly zero cost. For example, when logging with JSON lines: log.write(json.dumps( { "msg": "hello", "foo": foo, "bar": bar } )) Stable key order may not be required, but it makes the log more readable. From greg at krypto.org Mon Sep 12 18:25:40 2016 From: greg at krypto.org (Gregory P.
Smith) Date: Mon, 12 Sep 2016 22:25:40 +0000 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> Message-ID: On Mon, Sep 12, 2016 at 10:25 AM INADA Naoki wrote: > > So fundamental question is: Is it to so bad thing that some people > write code depending on CPython and PyPy implementation? > Yes. See below. I think cross-interpreter libraries can use OrederedDict correctly > when they should use it. (They may run test on micropython, Jython and > IronPython). > The problem is that libraries which could otherwise be cross-VM compatible are not because they depend upon an implementation detail. So it becomes an additional porting burden on people trying to use the library on another VM that could've been avoided if we required people to be explicit about their needs. BUT... At this point I think coding up an example patch against beta1 offering a choice of disordered iteration capability that does not increase memory or iteration overhead in any significant way is needed. The problem is... I don't know how to express this as an API. Which sinks my whole though process and tables the idea. A parameter to .items(), .keys() and .values() is undesirable as it isn't backwards compatible [meaning it'll never be used] and .keys() needs to match __iter__ which can't have one anyways. A parameter on dict construction is similarly infeasible. Requiring the use of an orderdict like type in order to get the behavior is undesirable. Effectively I'm asking for some boolean state in each dict as to if it should iterate in order or not and a way to expose that to pure Python code in a way that namespace dicts iterate in order by default and others do not unless explicitly configured to do so. oh well. end thought process on my end. it was good while it lasted. 
Thanks for the compact and ordered dicts! People will love them. :) -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Mon Sep 12 18:28:18 2016 From: greg at krypto.org (Gregory P. Smith) Date: Mon, 12 Sep 2016 22:28:18 +0000 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> Message-ID: On Mon, Sep 12, 2016 at 9:51 AM Chris Angelico wrote: > On Tue, Sep 13, 2016 at 2:27 AM, Gregory P. Smith wrote: > > Disorder for this purpose need not be a random shuffle (overkill). It > just > > needs to be regularly inconsistent. A simple thing to do on top of 3.6's > new > > dict implementation would be to pick a random starting point within the > > order array rather than offset 0 to start iteration from. That small > change > > would be sufficient to guarantee that code depending on order must ask > for > > order. It could even allow us to get people ready for iteration within > the > > same process to become unstable. > > Don't forget that .items(), .keys(), and .values() are all > synchronized, so you'd probably have to pick an offset at dict > creation and run with it forever after. > Indeed. We could "cheat" and match existing 2.7 and 3.5 behavior by using the hash randomization seed to determine a "consistent within the life of a process" dict iteration order randomization without storing anything per dict. That has the added bonus/drawback (POV) of allowing people to fix a specific behavior via the existing environment variable as they already expect. But given that my previous message concluded that implementing disordered iteration by default is infeasible, it's moot. :) -gps -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ethan at stoneleaf.us Mon Sep 12 18:46:00 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 12 Sep 2016 15:46:00 -0700 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> Message-ID: <57D73028.8020608@stoneleaf.us> On 09/12/2016 09:27 AM, Gregory P. Smith wrote: > For the regular dict (non kwargs or namespace __dict__) use case I would actually like to /see disorder preserved during iteration/. > > If we don't, we will eventually to find ourselves in a similar state we were in pre hash-randomization: Does anyone have a short explanation of the interaction between hash randomization and this new always ordered dict? Why doesn't one make the other useless? -- ~Ethan~ From brett at python.org Mon Sep 12 18:56:39 2016 From: brett at python.org (Brett Cannon) Date: Mon, 12 Sep 2016 22:56:39 +0000 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: <57D73028.8020608@stoneleaf.us> References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> <57D73028.8020608@stoneleaf.us> Message-ID: On Mon, 12 Sep 2016 at 15:46 Ethan Furman wrote: > On 09/12/2016 09:27 AM, Gregory P. Smith wrote: > > > For the regular dict (non kwargs or namespace __dict__) use case I would > actually like to /see disorder preserved during iteration/. > > > > If we don't, we will eventually to find ourselves in a similar state we > were in pre hash-randomization: > > Does anyone have a short explanation of the interaction between hash > randomization and this new always ordered dict? Why doesn't one make the > other useless? 
> There is still a hash table that stores a pointer into an array that stores the keys/values that are kept in an ordered array. So that first-level hash table still uses hash randomization. -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Mon Sep 12 18:59:47 2016 From: greg at krypto.org (Gregory P. Smith) Date: Mon, 12 Sep 2016 22:59:47 +0000 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> <57D73028.8020608@stoneleaf.us> Message-ID: On Mon, Sep 12, 2016 at 3:57 PM Brett Cannon wrote: > On Mon, 12 Sep 2016 at 15:46 Ethan Furman wrote: > > On 09/12/2016 09:27 AM, Gregory P. Smith wrote: > > > For the regular dict (non kwargs or namespace __dict__) use case I would > actually like to /see disorder preserved during iteration/. > > > > If we don't, we will eventually to find ourselves in a similar state we > were in pre hash-randomization: > > Does anyone have a short explanation of the interaction between hash > randomization and this new always ordered dict? Why doesn't one make the > other useless? > > > There is still a hash table that stores a pointer into an array that > stores the keys/values that are kept in an ordered array. So that > first-level hash table still uses hash randomization. > More specifically: If the goal of hash randomization is to reduce DDOS hash table stuffing attacks, that is still true. The hashing is randomized. Dict ordering may actually _help_ DDOS protection. It no longer leaks information potentially revealing details about the hash seed via the iteration order. -gps -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ericsnowcurrently at gmail.com Mon Sep 12 19:01:25 2016 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 12 Sep 2016 17:01:25 -0600 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: <57D73028.8020608@stoneleaf.us> References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> <57D73028.8020608@stoneleaf.us> Message-ID: On Mon, Sep 12, 2016 at 4:46 PM, Ethan Furman wrote: > Does anyone have a short explanation of the interaction between hash > randomization and this new always ordered dict? Why doesn't one make the > other useless? Before 3.6, dict iteration was based on the hash table, which varies based on the hash seed. The compact dict implementation separates the hash table from the keys table (which preserves insertion order), and iterates over the keys table. So the hash table uses the same hash randomization as before, but it no longer impacts iteration. -eric From yselivanov.ml at gmail.com Mon Sep 12 19:21:53 2016 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 12 Sep 2016 19:21:53 -0400 Subject: [Python-Dev] Python 3.6 what's new Message-ID: <17fa2d03-bf35-1ff5-1af6-6ceffcab8113@gmail.com> Hi, Elvis and I authored What's New in Python 3.5. We'd like to volunteer to do the same for 3.6. If there are no objections, we can make the first editing pass in a couple of weeks. Yury From nad at python.org Mon Sep 12 19:35:19 2016 From: nad at python.org (Ned Deily) Date: Mon, 12 Sep 2016 19:35:19 -0400 Subject: [Python-Dev] [RELEASE] Python 3.6.0b1 is now available Message-ID: <942D57F5-BA76-49CB-B3DB-18E2D6F12AC4@python.org> On behalf of the Python development community and the Python 3.6 release team, I'm happy to announce the availability of Python 3.6.0b1. 
3.6.0b1 is the first of four planned beta releases of Python 3.6, the next major release of Python, and marks the end of the feature development phase for 3.6. Among the major new features in Python 3.6 are: * PEP 468 - Preserving the order of **kwargs in a function * PEP 487 - Simpler customization of class creation * PEP 495 - Local Time Disambiguation * PEP 498 - Literal String Formatting * PEP 506 - Adding A Secrets Module To The Standard Library * PEP 509 - Add a private version to dict * PEP 515 - Underscores in Numeric Literals * PEP 519 - Adding a file system path protocol * PEP 520 - Preserving Class Attribute Definition Order * PEP 523 - Adding a frame evaluation API to CPython * PEP 524 - Make os.urandom() blocking on Linux (during system startup) * PEP 525 - Asynchronous Generators (provisional) * PEP 526 - Syntax for Variable Annotations (provisional) * PEP 528 - Change Windows console encoding to UTF-8 (provisional) * PEP 529 - Change Windows filesystem encoding to UTF-8 (provisional) * PEP 530 - Asynchronous Comprehensions Please see "What's New In Python 3.6" for more information: https://docs.python.org/3.6/whatsnew/3.6.html You can find Python 3.6.0b1 here: https://www.python.org/downloads/release/python-360b1/ Beta releases are intended to give the wider community the opportunity to test new features and bug fixes and to prepare their projects to support the new feature release. We strongly encourage maintainers of third-party Python projects to test with 3.6 during the beta phase and report issues found to bugs.python.org as soon as possible. While the release is feature complete entering the beta phase, it is possible that features may be modified or, in rare cases, deleted up until the start of the release candidate phase (2016-12-05). Our goal is to have no changes after rc1. To achieve that, it will be extremely important to get as much exposure for 3.6 as possible during the beta phase.
Please keep in mind that this is a preview release and its use is not recommended for production environments. The next planned release of Python 3.6 will be 3.6.0b2, currently scheduled for 2016-10-03. More information about the release schedule can be found here: https://www.python.org/dev/peps/pep-0494/ -- Ned Deily nad at python.org -- [] From guido at python.org Mon Sep 12 19:50:27 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 12 Sep 2016 16:46:38 -0700 Subject: [Python-Dev] Python 3.6 what's new In-Reply-To: <17fa2d03-bf35-1ff5-1af6-6ceffcab8113@gmail.com> References: <17fa2d03-bf35-1ff5-1af6-6ceffcab8113@gmail.com> Message-ID: No objection! On Mon, Sep 12, 2016 at 4:21 PM, Yury Selivanov wrote: > Hi, > > Elvis and I authored What's New in Python 3.5. We'd like to volunteer to do > the same for 3.6. If there are no objections, we can make the first editing > pass in a couple of weeks. > > Yury > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From timothy.c.delaney at gmail.com Mon Sep 12 19:50:58 2016 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Tue, 13 Sep 2016 09:50:58 +1000 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> Message-ID: On 10 September 2016 at 03:17, Guido van Rossum wrote: > I've been asked about this. Here's my opinion on the letter of the law in > 3.6: > > - keyword args are ordered > - the namespace passed to a metaclass is ordered by definition order > - ditto for the class __dict__ > > A compliant implementation may ensure the above three requirements > either by making all dicts ordered, or by providing a custom dict > subclass (e.g.
OrderedDict) in those three cases. > I'd like to add one more documented constraint - that dict literals maintain definition order (so long as the dict is not further modified). This allows defining a dict literal and then passing it as **kwargs. Hmm - again, there's no mention of dict literals in the PEPs. I'm assuming that dict literals will preserve their definition order with the new implementation, but is that a valid assumption? Guess I can test it now 3.6.0b1 is out. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Mon Sep 12 19:57:53 2016 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 12 Sep 2016 16:57:53 -0700 Subject: [Python-Dev] Python 3.6 what's new In-Reply-To: <17fa2d03-bf35-1ff5-1af6-6ceffcab8113@gmail.com> References: <17fa2d03-bf35-1ff5-1af6-6ceffcab8113@gmail.com> Message-ID: <1473724673.2731822.723713297.0ABD8F1F@webmail.messagingengine.com> Thank you. On Mon, Sep 12, 2016, at 16:21, Yury Selivanov wrote: > Hi, > > Elvis and I authored What's New in Python 3.5. We'd like to volunteer > to do the same for 3.6. If there are no objections, we can make the > first editing pass in a couple of weeks. > > Yury > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/benjamin%40python.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From nad at python.org Mon Sep 12 20:15:06 2016 From: nad at python.org (Ned Deily) Date: Mon, 12 Sep 2016 20:15:06 -0400 Subject: [Python-Dev] 3.6.0 Beta Phase Development Message-ID: <092D85C9-5853-403F-B1E1-DF939C5388C0@python.org> Wow! What a busy and productive couple of weeks it has been leading up to 3.6.0b1 and feature code freeze! 
Congratulations and thanks to all of you who've contributed to the amazing number of PEPs, features, bug fixes, and doc changes that have gone into 3.6.0b1! Now that feature development for 3.6 is over, the challenge is to put the finishing touches on the features and documentation, squash bugs, and test test test. The next preview release will be 3.6.0b2 scheduled for 2016-10-03. In the cpython repo, there is now a 3.6 branch. Starting now, all changes for 3.6.0 should get pushed to the 3.6 branch and then merged to default for 3.7. New features may continue to be pushed to the default branch for release in 3.7; no new features are now permitted in 3.6 (unless you have contacted me and we have agreed on an extension). Bug fixes appropriate for 3.5.x should get pushed to the 3.5 branch and then merged to 3.6 and then to default. I've updated the Developer's Guide to reflect the now current workflow. Let me know if you find any bugs in it. Likewise, please contact me if you have any questions about the workflow or about whether a change is appropriate for 3.6 beta. To recap: 2016-09-12 3.6 branch open for 3.6.0; 3.7.0 feature development begins 2016-09-12 to 2016-12-04: 3.6.0 beta phase (no new features) - push code for 3.6.0 (bug/regression/doc fixes) to the new 3.6 branch - push code for new features to the default branch for release in 3.7 2016-10-03: 3.6.0 beta 2 2016-12-04 3.6.0 release candidate 1 (3.6.0 code freeze) 2016-12-16 3.6.0 release (3.6.0rc1 plus, if necessary, any dire emergency fixes) 2018-06 3.7.0 release (3.6.0 release + 18 months, details TBD) Thank you all again for your great efforts so far on 3.6!
--Ned http://cpython-devguide.readthedocs.io/en/latest/ https://www.python.org/dev/peps/pep-0494/ -- Ned Deily nad at python.org -- [] From nad at python.org Mon Sep 12 20:17:05 2016 From: nad at python.org (Ned Deily) Date: Mon, 12 Sep 2016 20:17:05 -0400 Subject: [Python-Dev] Python 3.6 what's new In-Reply-To: <1473724673.2731822.723713297.0ABD8F1F@webmail.messagingengine.com> References: <17fa2d03-bf35-1ff5-1af6-6ceffcab8113@gmail.com> <1473724673.2731822.723713297.0ABD8F1F@webmail.messagingengine.com> Message-ID: <78355178-4D34-48BF-A21D-703C6EFECF08@python.org> On Sep 12, 2016, at 19:57, Benjamin Peterson wrote: > Thank you. Ditto! Many thanks, Yury! -- Ned Deily nad at python.org -- [] From brett at python.org Mon Sep 12 20:28:00 2016 From: brett at python.org (Brett Cannon) Date: Tue, 13 Sep 2016 00:28:00 +0000 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> Message-ID: On Mon, 12 Sep 2016 at 16:52 Tim Delaney wrote: > On 10 September 2016 at 03:17, Guido van Rossum wrote: > >> I've been asked about this. Here's my opinion on the letter of the law in >> 3.6: >> >> - keyword args are ordered >> - the namespace passed to a metaclass is ordered by definition order >> - ditto for the class __dict__ >> >> A compliant implementation may ensure the above three requirements >> either by making all dicts ordered, or by providing a custom dict >> subclass (e.g. OrderedDict) in those three cases. >> > > I'd like to add one more documented constraint - that dict literals > maintain definition order (so long as the dict is not further modified). > This allows defining a dict literal and then passing it as **kwargs. > That would require all dictionaries keep their insertion order which we are explicitly not doing (at least yet). If you look at the PEPs that are asking for definition order they specify an "ordered mapping", not a dict. 
Making dict literals do this means dict literals become "order mapping literals" which isn't what they are; they are dict literals. I don't think we should extend this guarantee to literals any more than any other dictionary. > > Hmm - again, there's no mention of dict literals in the PEPs. I'm assuming > that dict literals will preserve their definition order with the new > implementation, but is that a valid assumption? Guess I can test it now > 3.6.0b1 is out. > They will as an implementation detail, not because the language spec requires it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothy.c.delaney at gmail.com Mon Sep 12 22:37:16 2016 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Tue, 13 Sep 2016 12:37:16 +1000 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> Message-ID: On 13 September 2016 at 10:28, Brett Cannon wrote: > >> I'd like to add one more documented constraint - that dict literals >> maintain definition order (so long as the dict is not further modified). >> This allows defining a dict literal and then passing it as **kwargs. >> > > That would require all dictionaries keep their insertion order which we > are explicitly not doing (at least yet). If you look at the PEPs that are > asking for definition order they specify an "ordered mapping", not a dict. > Making dict literals do this means dict literals become "order mapping > literals" which isn't what they are; they are dict literals. I don't think > we should extend this guarantee to literals any more than any other > dictionary. > I'm not sure I agree with you, but I'm not going to argue too strongly either (it can always be revisited later). I will note that a conforming implementation could be that the result of evaluating a dict literal is a frozen ordered dict which transparently changes to be a mutable dict as required. 
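The behaviour under discussion is easy to observe (a sketch; on CPython 3.6 the dict literal's ordering is only an implementation detail, while the **kwargs ordering is the part guaranteed by PEP 468):

```python
# Sketch: dict literal order and **kwargs order on CPython 3.6.
# Literal iteration order is an implementation detail here, not a
# language guarantee; **kwargs order is guaranteed by PEP 468.
def keyword_order(**kwargs):
    return list(kwargs)

d = {'md5': 1, 'sha1': 2, 'sha256': 3}

# The literal's definition order survives iteration on CPython 3.6...
assert list(d) == ['md5', 'sha1', 'sha256']
# ...and expanding it as **kwargs preserves that order as well.
assert keyword_order(**d) == ['md5', 'sha1', 'sha256']
```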
There could well be performance and/or memory benefits from such a dict implementation. Personally I expect all Python 3.6 implementations will have order-preserving dict as that's the easiest way to achieve the existing guarantees. And that enough code will come to depend on an order-preserving dict that eventually the decision will be made to retrospectively guarantee the semantics. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From breamoreboy at yahoo.co.uk Tue Sep 13 02:57:03 2016 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Tue, 13 Sep 2016 07:57:03 +0100 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> Message-ID: On 12/09/2016 23:25, Gregory P. Smith wrote: > On Mon, Sep 12, 2016 at 10:25 AM INADA Naoki wrote: > > > So fundamental question is: Is it to so bad thing that some people > write code depending on CPython and PyPy implementation? > > > Yes. See below. > > I think cross-interpreter libraries can use OrederedDict correctly > when they should use it. (They may run test on micropython, Jython > and IronPython). > > > The problem is that libraries which could otherwise be cross-VM > compatible are not because they depend upon an implementation detail. So > it becomes an additional porting burden on people trying to use the > library on another VM that could've been avoided if we required people > to be explicit about their needs. > > BUT... > > At this point I think coding up an example patch against beta1 offering > a choice of disordered iteration capability that does not increase > memory or iteration overhead in any significant way is needed. > > The problem is... I don't know how to express this as an API. Which > sinks my whole though process and tables the idea. 
> "tables the idea" has the US meaning of close it down, not the UK meaning of open it up? :) -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence From ncoghlan at gmail.com Tue Sep 13 06:23:56 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 13 Sep 2016 20:23:56 +1000 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> Message-ID: On 13 September 2016 at 12:37, Tim Delaney wrote: > Personally I expect all Python 3.6 implementations will have > order-preserving dict as that's the easiest way to achieve the existing > guarantees. Not all Python 3 implementation will be able to afford the memory hit that comes from doing that relative to their current approaches (e.g. MicroPython), and others may be relying on a 3rd party VM for their core data structures which may not offer a hash map with these characteristics (VOC and the JVM, Batavia and JavaScript runtimes) Using collections.OrderedDict selectively may not impose too large a memory or performance hit, but using it pervasively likely would. > And that enough code will come to depend on an order-preserving > dict that eventually the decision will be made to retrospectively guarantee > the semantics. We explicitly want to discourage that though, as one of the "alternate deployment targets" we'd like folks to retain compatibility with at the library and framework level is single-source 2/3 deployments. Most incompatibilities are splashy ones that can be detected easily just by testing on older versions, but this one can be a bit hard to pick up if you don't already know to check for it. 
The benefit of making the official stance be that dict-ordering-as-the-default-behaviour is an implementation detail, is that it puts the burden of maintaining compatibility on library and framework developers, and application developers that support "bring your own Python runtime" deployments, *not* on interpreter implementers. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Tue Sep 13 06:44:45 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 13 Sep 2016 20:44:45 +1000 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> Message-ID: On 13 September 2016 at 08:25, Gregory P. Smith wrote: > At this point I think coding up an example patch against beta1 offering a > choice of disordered iteration capability that does not increase memory or > iteration overhead in any significant way is needed. > > The problem is... I don't know how to express this as an API. Which sinks my > whole though process and tables the idea. > > A parameter to .items(), .keys() and .values() is undesirable as it isn't > backwards compatible [meaning it'll never be used] and .keys() needs to > match __iter__ which can't have one anyways. A parameter on dict > construction is similarly infeasible. > > Requiring the use of an orderdict like type in order to get the behavior is > undesirable. Effectively I'm asking for some boolean state in each dict as > to if it should iterate in order or not and a way to expose that to pure > Python code in a way that namespace dicts iterate in order by default and > others do not unless explicitly configured to do so. > > oh well. end thought process on my end. it was good while it lasted. 
I think this is looking at the compatibility testing problem from the wrong direction anyway, as rather than making it difficult for people to implicitly depend on the default key ordering, the scenario we would want to help with is this one: 1. Library developer inadvertently depends on the dicts-are-ordered-by-default implementation detail 2. Library user reports "your library isn't working for me on " 3. Library developer figures out the problem, and would like to update their test suite to deliberately provoke the misbehaviour 4. ??? That is, it falls into the same category as folks depending on CPython's reference counting for prompt resource cleanup, where we offer ResourceWarning to detect such cases, and context managers to clean them up more explicitly. For dict ordering dependence, anyone regularly testing against CPython 2.7 and CPython 3.5 will already have a good chance of detecting key order reliance just through hash randomisation (e.g. I hit an "inconsistent key order in generated JSON makes line-based diffing unreadable" one myself last week with a 3-entry dict for md5, sha1 and sha256 hashes - it was relatively rare to get the same key order two runs in a row) That means the only problematic case is the one where the only CPython version a project supports is 3.6+ *and* they want to support alternate implementations that don't preserve order in their default dict implementation. Given that current alternate implementations are still in the process of catching up to *3.5* (or Python 3 at all in the case of Jython and IronPython), I think we still have a good few years to ponder the question before this particular concern starts cropping up in practise :) Cheers, Nick. 
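For the JSON-diffing case mentioned above, the usual fix is to ask the serializer for a canonical key order rather than depending on dict iteration order (a minimal sketch using the stdlib json module):

```python
import json

hashes = {'sha256': 'c', 'md5': 'a', 'sha1': 'b'}

# Without sort_keys, output key order follows the dict's iteration
# order, which varied run to run under hash randomisation on older
# CPython versions -- exactly the unreadable-diff problem described.
stable = json.dumps(hashes, sort_keys=True)
assert stable == '{"md5": "a", "sha1": "b", "sha256": "c"}'
```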
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Tue Sep 13 12:36:38 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 13 Sep 2016 18:36:38 +0200 Subject: [Python-Dev] [python-committers] [RELEASE] Python 3.6.0b1 is now available In-Reply-To: <942D57F5-BA76-49CB-B3DB-18E2D6F12AC4@python.org> References: <942D57F5-BA76-49CB-B3DB-18E2D6F12AC4@python.org> Message-ID: Ok, let's start listing regressions/major issues :-) * Bug in _PyDict_Pop() on a split table: http://bugs.python.org/issue28120 -- bug in the new compact dict implementation Victor 2016-09-13 1:35 GMT+02:00 Ned Deily : > On behalf of the Python development community and the Python 3.6 release > team, I'm happy to announce the availability of Python 3.6.0b1. 3.6.0b1 > is the first of four planned beta releases of Python 3.6, the next major > release of Python, and marks the end of the feature development phase > for 3.6. > > Among the major new features in Python 3.6 are: > > * PEP 468 - Preserving the order of **kwargs in a function > * PEP 487 - Simpler customization of class creation > * PEP 495 - Local Time Disambiguation > * PEP 498 - Literal String Formatting > * PEP 506 - Adding A Secrets Module To The Standard Library > * PEP 509 - Add a private version to dict > * PEP 515 - Underscores in Numeric Literals > * PEP 519 - Adding a file system path protocol > * PEP 520 - Preserving Class Attribute Definition Order > * PEP 523 - Adding a frame evaluation API to CPython > * PEP 524 - Make os.urandom() blocking on Linux (during system startup) > * PEP 525 - Asynchronous Generators (provisional) > * PEP 526 - Syntax for Variable Annotations (provisional) > * PEP 528 - Change Windows console encoding to UTF-8 (provisional) > * PEP 529 - Change Windows filesystem encoding to UTF-8 (provisional) > * PEP 530 - Asynchronous Comprehensions > > Please see "What's New In Python 3.6" for more information: > >
https://docs.python.org/3.6/whatsnew/3.6.html > > You can find Python 3.6.0b1 here: > > https://www.python.org/downloads/release/python-360b1/ > > Beta releases are intended to give the wider community the opportunity > to test new features and bug fixes and to prepare their projects to > support the new feature release. We strongly encourage maintainers of > third-party Python projects to test with 3.6 during the beta phase and > report issues found to bugs.python.org as soon as possible. While the > release is feature complete entering the beta phase, it is possible that > features may be modified or, in rare cases, deleted up until the start > of the release candidate phase (2016-12-05). Our goal is have no changes > after rc1. To achieve that, it will be extremely important to get as > much exposure for 3.6 as possible during the beta phase. Please keep in > mind that this is a preview release and its use is not recommended for > production environments > > The next planned release of Python 3.6 will be 3.6.0b2, currently > scheduled for 2016-10-03. More information about the release schedule > can be found here: > > https://www.python.org/dev/peps/pep-0494/ > > -- > Ned Deily > nad at python.org -- [] > > _______________________________________________ > python-committers mailing list > python-committers at python.org > https://mail.python.org/mailman/listinfo/python-committers > Code of Conduct: https://www.python.org/psf/codeofconduct/ From p.f.moore at gmail.com Tue Sep 13 12:55:53 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 13 Sep 2016 17:55:53 +0100 Subject: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8 In-Reply-To: References: Message-ID: On 5 September 2016 at 21:19, Paul Moore wrote: > > The code I'm looking at doesn't use the raw stream (I think). 
The > problem I had (and the reason I was concerned) is that the code does > some rather messy things, and without tracing back through the full > code path, I'm not 100% sure *what* level of stream it's using. > However, now that I know that the buffered layer won't ever error > because 1 byte isn't enough to return a full character, if I need to > change the code I can do so by switching to the buffered layer and > fixing the issue that way (although with Steve's new proposal even > that won't be necessary). Just as a follow-up, I did a quick test of pyinvoke with the new 3.6b1, and it works fine. So it looks like the final version of the code doesn't cause any problems for this use case, which is a good sign. Also, behaviour with other console utilities like prompt_toolkit and ipython console seems uniformly better (my standard check using euro signs works perfectly, I can't say I've gone any further to Asian scripts or anything like that). Overall, this looks really cool - thanks Steve for getting this in! Paul From Nikolaus at rath.org Tue Sep 13 13:24:18 2016 From: Nikolaus at rath.org (Nikolaus Rath) Date: Tue, 13 Sep 2016 10:24:18 -0700 Subject: [Python-Dev] Drastically improving list.sort() for lists of strings/ints In-Reply-To: (Terry Reedy's message of "Sun, 11 Sep 2016 16:48:50 -0400") References: Message-ID: <87fup3n265.fsf@thinkpad.rath.org> On Sep 11 2016, Terry Reedy wrote: > Tim Peters investigated and empirically determined that an > O(n*n) binary insort, as he optimized it on real machines, is faster > than O(n*logn) sorting for up to around 64 items. Out of curiosity: is this test repeated periodically on different architectures? Or could it be that it only ever was true 10 years ago on Tim's Power Mac G5 (or whatever he used)? Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F "Time flies like an arrow, fruit flies like a Banana."
From random832 at fastmail.com Tue Sep 13 13:35:10 2016 From: random832 at fastmail.com (Random832) Date: Tue, 13 Sep 2016 13:35:10 -0400 Subject: [Python-Dev] Drastically improving list.sort() for lists of strings/ints In-Reply-To: <87fup3n265.fsf@thinkpad.rath.org> References: <87fup3n265.fsf@thinkpad.rath.org> Message-ID: <1473788110.2455594.724566521.69B942FC@webmail.messagingengine.com> On Tue, Sep 13, 2016, at 13:24, Nikolaus Rath wrote: > On Sep 11 2016, Terry Reedy wrote: > > Tim Peters investigated and empirically determined that an > > O(n*n) binary insort, as he optimized it on real machines, is faster > > than O(n*logn) sorting for up to around 64 items. > > Out of curiosity: is this test repeated periodically on different > architectures? Or could it be that it only ever was true 10 years ago on > Tim's Power Mac G5 (or whatever he used)? Binary insertion sort is still O(n*logn) in comparisons, so it's likely that this is due to short memmoves being sufficiently fast due to cache effects as not to matter. The number might have gotten larger or smaller, though. I wonder if it's something that could be tuned dynamically, at compile time or install time. From python at mrabarnett.plus.com Tue Sep 13 13:52:34 2016 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 13 Sep 2016 18:52:34 +0100 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> Message-ID: On 2016-09-13 07:57, Mark Lawrence via Python-Dev wrote: > On 12/09/2016 23:25, Gregory P. Smith wrote: [snip] >> The problem is... I don't know how to express this as an API. Which >> sinks my whole though process and tables the idea. >> > > "tables the idea" has the US meaning of close it down, not the UK > meaning of open it up? :) > Indeed. 
The US usage differs from the rest of the English-speaking world. A better phrase would've been "shelves the idea". There's even a module in Python called "shelve", which makes it Pythonic. :-) From python at mrabarnett.plus.com Tue Sep 13 13:59:37 2016 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 13 Sep 2016 18:59:37 +0100 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> Message-ID: <751ffaf8-3113-770e-2002-9b7cf02db7f2@mrabarnett.plus.com> On 2016-09-13 11:44, Nick Coghlan wrote: > On 13 September 2016 at 08:25, Gregory P. Smith wrote: >> At this point I think coding up an example patch against beta1 offering a >> choice of disordered iteration capability that does not increase memory or >> iteration overhead in any significant way is needed. >> >> The problem is... I don't know how to express this as an API. Which sinks my >> whole though process and tables the idea. >> >> A parameter to .items(), .keys() and .values() is undesirable as it isn't >> backwards compatible [meaning it'll never be used] and .keys() needs to >> match __iter__ which can't have one anyways. A parameter on dict >> construction is similarly infeasible. >> >> Requiring the use of an orderdict like type in order to get the behavior is >> undesirable. Effectively I'm asking for some boolean state in each dict as >> to if it should iterate in order or not and a way to expose that to pure >> Python code in a way that namespace dicts iterate in order by default and >> others do not unless explicitly configured to do so. >> >> oh well. end thought process on my end. it was good while it lasted. 
> > I think this is looking at the compatibility testing problem from the > wrong direction anyway, as rather than making it difficult for people > to implicitly depend on the default key ordering, the scenario we > would want to help with is this one: > > 1. Library developer inadvertently depends on the > dicts-are-ordered-by-default implementation detail > 2. Library user reports "your library isn't working for me on implementation without that behaviour>" > 3. Library developer figures out the problem, and would like to update > their test suite to deliberately provoke the misbehaviour > 4. ??? > > That is, it falls into the same category as folks depending on > CPython's reference counting for prompt resource cleanup, where we > offer ResourceWarning to detect such cases, and context managers to > clean them up more explicitly. > > For dict ordering dependence, anyone regularly testing against CPython > 2.7 and CPython 3.5 will already have a good chance of detecting key > order reliance just through hash randomisation (e.g. I hit an > "inconsistent key order in generated JSON makes line-based diffing > unreadable" one myself last week with a 3-entry dict for md5, sha1 and > sha256 hashes - it was relatively rare to get the same key order two > runs in a row) > > That means the only problematic case is the one where the only CPython > version a project supports is 3.6+ *and* they want to support > alternate implementations that don't preserve order in their default > dict implementation. 
> > Given that current alternate implementations are still in the process > of catching up to *3.5* (or Python 3 at all in the case of Jython and > IronPython), I think we still have a good few years to ponder the > question before this particular concern starts cropping up in practise > :) > The recommended way of dealing with features across different versions of Python is to check for them and see if they raise NameError or whatever, but I wonder if there would be any benefit to recording such things somewhere, e.g. sys.features['ordered_args'] returns True if arguments are passed in an ordered dict. From tim.peters at gmail.com Tue Sep 13 14:08:35 2016 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 13 Sep 2016 13:08:35 -0500 Subject: [Python-Dev] Drastically improving list.sort() for lists of strings/ints In-Reply-To: <87fup3n265.fsf@thinkpad.rath.org> References: <87fup3n265.fsf@thinkpad.rath.org> Message-ID: [Terry Reedy ] >> Tim Peters investigated and empirically determined that an >> O(n*n) binary insort, as he optimized it on real machines, is faster >> than O(n*logn) sorting for up to around 64 items. [Nikolaus Rath ] > Out of curiosity: is this test repeated periodically on different > architectures? Or could it be that it only ever was true 10 years ago on > Tim's Power Mac G5 (or whatever he used)? It has little to do with architecture, but much to do with the relative cost of comparisons versus pointer-copying. Near the end of https://github.com/python/cpython/blob/master/Objects/listsort.txt """ BINSORT A "binary insertion sort" is just like a textbook insertion sort, but instead of locating the correct position of the next item via linear (one at a time) search, an equivalent to Python's bisect.bisect_right is used to find the correct position in logarithmic time. Most texts don't mention this variation, and those that do usually say it's not worth the bother: insertion sort remains quadratic (expected and worst cases) either way. 
Speeding the search doesn't reduce the quadratic data movement costs. But in CPython's case, comparisons are extraordinarily expensive compared to moving data, and the details matter. Moving objects is just copying pointers. Comparisons can be arbitrarily expensive (can invoke arbitrary user-supplied Python code), but even in simple cases (like 3 < 4) _all_ decisions are made at runtime: what's the type of the left comparand? the type of the right? do they need to be coerced to a common type? where's the code to compare these types? And so on. Even the simplest Python comparison triggers a large pile of C-level pointer dereferences, conditionals, and function calls. So cutting the number of compares is almost always measurably helpful in CPython, and the savings swamp the quadratic-time data movement costs for reasonable minrun values. """ Binsort does a close to optimal number of comparisons on randomly ordered data, and that's the point. Also, in the context of the overall sorting algorithm, binsort is used to _extend_ the length of a naturally occurring "too short" run. There's no need to sort the whole thing from scratch, because we already know the prefix is sorted. That makes binsort a more-than-less obvious choice.(it takes full advantage of knowing that the prefix is already ordered). As that doc also says: """ When N is a power of 2, testing on random data showed that minrun values of 16, 32, 64 and 128 worked about equally well. At 256 the data-movement cost in binary insertion sort clearly hurt, and at 8 the increase in the number of function calls clearly hurt. """ So it settled on forcing minrun into the range 32 <= minrun <= 64 (the precise value depends on the number of elements in the entire array, for reasons also explained in that doc). That's far from either end where the value clearly mattered. 
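Rendered as a pure-Python sketch, the binary insertion sort described in listsort.txt looks like this (illustrative only; the real implementation is C code inside CPython's list sort, and `binsort`/`k` here are names chosen for the example):

```python
# Pure-Python sketch of "binsort": the already-sorted prefix a[:k] is
# extended one element at a time, locating each insertion point with
# bisect_right (O(log n) comparisons) instead of a linear scan.  Data
# movement stays quadratic; only the number of comparisons is reduced,
# which is what matters given CPython's expensive comparisons.
from bisect import bisect_right

def binsort(a, k=1):
    for i in range(k, len(a)):
        x = a[i]
        # Binary search within the sorted prefix a[:i]; bisect_right
        # keeps the sort stable for equal elements.
        pos = bisect_right(a, x, 0, i)
        # Shift the tail right by one slot (pointer copies in CPython).
        a[pos + 1:i + 1] = a[pos:i]
        a[pos] = x
    return a

assert binsort([5, 2, 9, 1, 5, 6]) == [1, 2, 5, 5, 6, 9]
```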
If the full path through Python's expensive PyObject_RichCompareBool(X, Y, Py_LT) has gotten significantly faster, a smaller minrun range may make more sense now; or if it's gotten significantly slower, a larger minrun range. But, no, I don't believe anyone retests it. IIRC, when the algorithm was adopted in Java, they found a minrun range of 16 through 32 worked marginally better for them, because _their_ spelling of PyObject_RichCompareBool (for Java object comparison methods) is faster than CPython's. From srkunze at mail.de Tue Sep 13 14:11:10 2016 From: srkunze at mail.de (Sven R. Kunze) Date: Tue, 13 Sep 2016 20:11:10 +0200 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: <751ffaf8-3113-770e-2002-9b7cf02db7f2@mrabarnett.plus.com> References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> <751ffaf8-3113-770e-2002-9b7cf02db7f2@mrabarnett.plus.com> Message-ID: <2f6f81e7-fa71-2ae8-a660-57f672bb04e1@mail.de> On 13.09.2016 19:59, MRAB wrote: > The recommended way of dealing with features across different versions > of Python is to check for them and see if they raise NameError or > whatever, but I wonder if there would be any benefit to recording such > things somewhere, e.g. sys.features['ordered_args'] returns True if > arguments are passed in an ordered dict. Just to check: do people really that often change between Python implementations? My personal experience with this kind of compatibility is that it is rarely needed for large and complex programs. That is due to deployment and testing issues (at least in our environment as we run multiple Python services on a multitude of servers). 
Best, Sven From tseaver at palladion.com Tue Sep 13 14:21:25 2016 From: tseaver at palladion.com (Tres Seaver) Date: Tue, 13 Sep 2016 14:21:25 -0400 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: <2f6f81e7-fa71-2ae8-a660-57f672bb04e1@mail.de> References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> <751ffaf8-3113-770e-2002-9b7cf02db7f2@mrabarnett.plus.com> <2f6f81e7-fa71-2ae8-a660-57f672bb04e1@mail.de> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 09/13/2016 02:11 PM, Sven R. Kunze wrote: > On 13.09.2016 19:59, MRAB wrote: >> The recommended way of dealing with features across different >> versions of Python is to check for them and see if they raise >> NameError or whatever, but I wonder if there would be any benefit to >> recording such things somewhere, e.g. sys.features['ordered_args'] >> returns True if arguments are passed in an ordered dict. > > Just to check: do people really that often change between Python > implementations? > > My personal experience with this kind of compatibility is that it is > rarely needed for large and complex programs. That is due to > deployment and testing issues (at least in our environment as we run > multiple Python services on a multitude of servers). *Lots* of library authors have to straddle Python versions: consumers of those libraries only get to pick and choose when their code is at the "leaf" of the dependency tree (the application). Tres. 
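The probing that version-straddling libraries do today usually looks something like this (a hedged sketch; the `sys.features` registry floated above is hypothetical and does not exist, and `kwargs_are_ordered` is an illustrative name):

```python
# Feature probing as libraries do it now: try the feature and fall
# back, rather than consulting a (hypothetical) sys.features registry.
try:
    from collections import OrderedDict  # Python 2.7+ / 3.1+
except ImportError:
    OrderedDict = dict  # fallback on very old interpreters; note this
                        # loses the ordering guarantee

def kwargs_are_ordered():
    """Behavioural probe: PEP 468 (Python 3.6) guarantees **kwargs order."""
    def probe(**kwargs):
        return list(kwargs)
    # If keyword order is preserved, 'b' must come back first.
    return probe(b=0, a=1) == ['b', 'a']
```

On CPython 3.6+ `kwargs_are_ordered()` returns True; on an implementation without PEP 468 semantics it can return False, and a library would select its code path accordingly.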
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBAgAGBQJX2EOfAAoJEPKpaDSJE9HYkokP/j74MGBGt+JjcalETp54yJ5n zgun42oE8c+8rTl2gsnn+E7lipTZ9XW4e+/+XDAOBsb3VK3X344l4Wn1i1pfi9/n 1DXEJkO4rbvIOOI2pcsuVCHTLxcpafvKo0+sjVuXdbuBwWFS1OcSTXGoJ7UKi9yI NtmY16qIYLgNhbxRj5dysnFHtnBD9dnQTxs77QFGnu59nT8i+EI0BRqASMXTNhF3 3IZ13BqIIc0megaaSjfNt3BXaMSHEOpAjhes5ni6OEPPVuDk6XRQf705WcjY2S4H EKaArqJIwWHoLOO4gLiaFAa8x0+Vsl8nfGxgWFZFIPiZ0ALqcZ2YHg0GclUs8J4p eOPuLodc9GqtuyhbPctZLU2EbiGDexGS6GkIa3ESh0/WFaOKB5rt/26szHq/WWXE CGSq7QJssoiKfmdniSY1oa4n/1Q3N1PxZfv54YwnAPGy5SOYspFaWnCwORPRlH9s U2p8X61T5SGFouK3XNv8ZgswpH9bF51JBCJuXl9F1reL+4TpfD/0gHIUQLu34Ot/ 54zxtBB0h+FgnMZ62g+vp04d//0sw/BfsVElkjHi5ptcb+A9IAgjIfOWRDRtSzEx yOQ80dY3BPmknbYecdkYgJhlWke0FT6TOMYA/SVFd6IMol4hxPuDvgfvljRrZeJp Y3ilNxoz72TG5kHfEDbS =Xa3X -----END PGP SIGNATURE----- From turnbull.stephen.fw at u.tsukuba.ac.jp Tue Sep 13 14:31:28 2016 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Wed, 14 Sep 2016 03:31:28 +0900 Subject: [Python-Dev] Drastically improving list.sort() for lists of strings/ints In-Reply-To: <87fup3n265.fsf@thinkpad.rath.org> References: <87fup3n265.fsf@thinkpad.rath.org> Message-ID: <22488.17920.130056.863263@turnbull.sk.tsukuba.ac.jp> Nikolaus Rath writes: > Out of curiosity: is this test repeated periodically on different > architectures? Or could it be that it only ever was true 10 years > ago on Tim's Power Mac G5 (or whatever he used)? This is the same Tim Peters of the eponymous test for readability of syntax: "Syntax shall not look like grit on Tim's screen." I don't know if that will help you decide whether to trust his analysis, but it comforts me.<0.5 wink/> From srkunze at mail.de Tue Sep 13 14:42:08 2016 From: srkunze at mail.de (Sven R. 
Kunze) Date: Tue, 13 Sep 2016 20:42:08 +0200 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> <751ffaf8-3113-770e-2002-9b7cf02db7f2@mrabarnett.plus.com> <2f6f81e7-fa71-2ae8-a660-57f672bb04e1@mail.de> Message-ID: <8e14a981-985d-6425-792a-5c99fe1d6bd0@mail.de> On 13.09.2016 20:21, Tres Seaver wrote: > *Lots* of library authors have to straddle Python versions: consumers of > those libraries only get to pick and choose when their code is at the > "leaf" of the dependency tree (the application). Maybe I didn't express myself well, but this was not my intended question. Using this argument against driving the evolution of the language spec doesn't seem reasonable. Changes are necessary from time to time, and this one in particular does not break compatibility with older versions. So, existing libs are okay. But why shouldn't the ordering of dicts be an advertisable feature for application developers or developers of future libs? My reasoning so far is that in those circumstances people **won't switch** from CPython 3.6 to Cython to PyPy back to CPython 2.7 once a week (drawn from my experience at least). But maybe I'm wrong here. Cheers, Sven From windowod at gmail.com Tue Sep 13 15:01:12 2016 From: windowod at gmail.com (Yurij Alexandrovich) Date: Tue, 13 Sep 2016 22:01:12 +0300 Subject: [Python-Dev] Fwd: Cssdbpy is a simple SSDB client written on Cython. Faster standart SSDB client. In-Reply-To: References: Message-ID: Hello! Posting my open source client here for feedback. Cssdbpy is a simple SSDB client written in Cython. Faster than the standard SSDB client. https://github.com/deslum/cssdbpy Thanks. -------------- next part -------------- An HTML attachment was scrubbed...
URL: 
From senthil at uthcode.com Tue Sep 13 16:05:02 2016 From: senthil at uthcode.com (Senthil Kumaran) Date: Tue, 13 Sep 2016 13:05:02 -0700 Subject: [Python-Dev] Fwd: Cssdbpy is a simple SSDB client written on Cython. Faster standart SSDB client. In-Reply-To: References: Message-ID: On Tue, Sep 13, 2016 at 12:01 PM, Yurij Alexandrovich wrote: > > Cssdbpy is a simple SSDB client written on Cython. Faster standart SSDB > client. > > https://github.com/deslum/cssdbpy > Congrats. You should post this to the python-announce@ list. This list, python-dev, is about CPython development. Thank you, Senthil -------------- next part -------------- An HTML attachment was scrubbed... URL: 
From benhoyt at gmail.com Tue Sep 13 16:56:13 2016 From: benhoyt at gmail.com (Ben Hoyt) Date: Tue, 13 Sep 2016 16:56:13 -0400 Subject: [Python-Dev] Can CPython on GitHub use the "Merge" button on pull requests (now that they support "squash and merge")? Message-ID: I noticed in [PEP 512 - Document steps to commit a pull request]( https://www.python.org/dev/peps/pep-0512/#document-steps-to-commit-a-pull-request) it says that CPython on GitHub won't be able to use GitHub's "Merge" button on pull requests, because we want a linear history with one commit per change/issue. However, GitHub recently (actually on April 1, 2016 -- but it's not a joke :-) added support for "commit squashing". See https://github.com/blog/2141-squash-your-commits and https://help.github.com/articles/about-pull-request-merges/ ... basically you can do "old-GitHub-style merge" commits or "squash and merge" commits, and you can even set a repo to only allow "squash and merge" commits on that repo. Will CPython be able to use this? I think that using GitHub's integrated pull request and merge features will make it much easier for contributors (and core developers for that matter). And from personal experience, pressing that big green button is very satisfying. :-) P.S.
While I'm here: is there a timeline for the various stages of PEP 512? -Ben -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Tue Sep 13 17:31:30 2016 From: brett at python.org (Brett Cannon) Date: Tue, 13 Sep 2016 21:31:30 +0000 Subject: [Python-Dev] Can CPython on GitHub use the "Merge" button on pull requests (now that they support "squash and merge")? In-Reply-To: References: Message-ID: On Tue, 13 Sep 2016 at 13:56 Ben Hoyt wrote: > I noticed in [PEP 512 - Document steps to commit a pull request]( > https://www.python.org/dev/peps/pep-0512/#document-steps-to-commit-a-pull-request) > it says that CPython on GitHub won't be able to use GitHub's "Merge" button > on pull requests, because we want a linear history with one commit per > change/issue. > > However, GitHub recently (actually on April 1, 2016 -- but it's not a joke > :-) added support for "commit squashing". See > https://github.com/blog/2141-squash-your-commits and > https://help.github.com/articles/about-pull-request-merges/ ... basically > you can do "old-GitHub-style merge" commits or "squash and merge" commits, > and you can even set a repo to only allow "squash and merge" commits on > that repo. > > Will CPython be able to use this? > Yes. That part of the PEP is outdated because I've been focusing on moving the other repos first (which are now done). > I think that using GitHub's integrated pull request and merge features > will make it much easier for contributors (and core developers for that > matter). And from personal experience, pressing that big green button is > very satisfying. :-) > > P.S. While I'm here: is there a timeline for the various stages of PEP 512? > The hope is by the end of the year, but no sooner than the release of Python 3.6.0. And FYI the core-workflow mailing list is the best place to ask about the GitHub migration. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From greg.ewing at canterbury.ac.nz Tue Sep 13 19:42:35 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 14 Sep 2016 11:42:35 +1200 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> Message-ID: <57D88EEB.80105@canterbury.ac.nz> MRAB wrote: > On 2016-09-13 07:57, Mark Lawrence via Python-Dev wrote: > >> "tables the idea" has the US meaning of close it down, not the UK >> meaning of open it up? :) > > A better phrase would've been "shelves the idea". There's even a module > in Python called "shelve", which makes it Pythonic. :-) So does that mean we should have a "table" module for managing objects we want to currently work on? -- Greg From python at mrabarnett.plus.com Tue Sep 13 20:02:37 2016 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 14 Sep 2016 01:02:37 +0100 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: <57D88EEB.80105@canterbury.ac.nz> References: <20160909105541.5b8a7ec8@fsol> <1890206.yRDVlEz0si@klinga.prans.org> <20160910063927.723661ea.barry@wooz.org> <20160912135038.4c2eb635@fsol> <57D88EEB.80105@canterbury.ac.nz> Message-ID: On 2016-09-14 00:42, Greg Ewing wrote: > MRAB wrote: >> On 2016-09-13 07:57, Mark Lawrence via Python-Dev wrote: >> >>> "tables the idea" has the US meaning of close it down, not the UK >>> meaning of open it up? :) >> >> A better phrase would've been "shelves the idea". There's even a module >> in Python called "shelve", which makes it Pythonic. :-) > > So does that mean we should have a "table" module for managing > objects we want to currently work on? > Yes, although in the US locale it would be an alias for the 'shelve' module. 
:-) From benhoyt at gmail.com Tue Sep 13 20:17:30 2016 From: benhoyt at gmail.com (Ben Hoyt) Date: Tue, 13 Sep 2016 20:17:30 -0400 Subject: [Python-Dev] Can CPython on GitHub use the "Merge" button on pull requests (now that they support "squash and merge")? In-Reply-To: References: Message-ID: Great, and thanks for the info! -Ben On Sep 13, 2016 5:31 PM, "Brett Cannon" wrote: > > > On Tue, 13 Sep 2016 at 13:56 Ben Hoyt wrote: > >> I noticed in [PEP 512 - Document steps to commit a pull request]( >> https://www.python.org/dev/peps/pep-0512/#document-steps-to-commit-a- >> pull-request) it says that CPython on GitHub won't be able to use >> GitHub's "Merge" button on pull requests, because we want a linear history >> with one commit per change/issue. >> >> However, GitHub recently (actually on April 1, 2016 -- but it's not a >> joke :-) added support for "commit squashing". See >> https://github.com/blog/2141-squash-your-commits and >> https://help.github.com/articles/about-pull-request-merges/ ... >> basically you can do "old-GitHub-style merge" commits or "squash and merge" >> commits, and you can even set a repo to only allow "squash and merge" >> commits on that repo. >> >> Will CPython be able to use this? >> > > Yes. That part of the PEP is outdated because I've been focusing on moving > the other repos first (which are now done). > > >> I think that using GitHub's integrated pull request and merge features >> will make it much easier for contributors (and core developers for that >> matter). And from personal experience, pressing that big green button is >> very satisfying. :-) >> >> P.S. While I'm here: is there a timeline for the various stages of PEP >> 512? >> > > The hope is by the end of the year, but no sooner than the release of > Python 3.6.0. > > And FYI the core-workflow mailing list is the best place to ask about the > GitHub migration. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From p.f.moore at gmail.com Wed Sep 14 06:41:04 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 14 Sep 2016 11:41:04 +0100 Subject: [Python-Dev] [python-committers] [RELEASE] Python 3.6.0b1 is now available In-Reply-To: References: <942D57F5-BA76-49CB-B3DB-18E2D6F12AC4@python.org> Message-ID: On 14 September 2016 at 11:32, Serhiy Storchaka wrote: > On 13.09.16 02:35, Ned Deily wrote: >> >> On behalf of the Python development community and the Python 3.6 release >> team, I'm happy to announce the availability of Python 3.6.0b1. 3.6.0b1 >> is the first of four planned beta releases of Python 3.6, the next major >> release of Python, and marks the end of the feature development phase >> for 3.6. > > > There is no mention on https://www.python.org/news/. The last release mentioned there is 3.4.0rc1... Paul From leewangzhong+python at gmail.com Wed Sep 14 08:18:49 2016 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Wed, 14 Sep 2016 08:18:49 -0400 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: <1473399244.225934.720369745.6587BC20@webmail.messagingengine.com> References: <1473399244.225934.720369745.6587BC20@webmail.messagingengine.com> Message-ID: On Sep 9, 2016 1:35 AM, "Benjamin Peterson" wrote: > On Thu, Sep 8, 2016, at 22:33, Tim Delaney wrote: > > Are sets also ordered by default now? None of the PEPs appear to mention > > it. > > No. Is there anyone working to move sets in the same direction for 3.6? -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Wed Sep 14 08:29:53 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 14 Sep 2016 13:29:53 +0100 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <1473399244.225934.720369745.6587BC20@webmail.messagingengine.com> Message-ID: On 14 September 2016 at 13:18, Franklin? 
Lee wrote: > On Sep 9, 2016 1:35 AM, "Benjamin Peterson" wrote: >> On Thu, Sep 8, 2016, at 22:33, Tim Delaney wrote: >> > Are sets also ordered by default now? None of the PEPs appear to mention >> > it. >> >> No. > > Is there anyone working to move sets in the same direction for 3.6? It won't happen for 3.6, as we're now in feature freeze. So it'd be 3.7 at the earliest. What exactly do you mean by "in the same direction" anyway? Remember that ordering isn't guaranteed (it's an implementation detail), so are you just saying "can sets benefit from the improvements this change provided to dicts"? If you *are* hoping for ordered sets, what's your use case? (Dictionaries had particular use cases - retaining ordering of keyword arguments and class definitions, the existence of OrderedDict, ...) Paul From leewangzhong+python at gmail.com Wed Sep 14 08:39:47 2016 From: leewangzhong+python at gmail.com (Franklin? Lee) Date: Wed, 14 Sep 2016 08:39:47 -0400 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <1473399244.225934.720369745.6587BC20@webmail.messagingengine.com> Message-ID: On Sep 14, 2016 8:29 AM, "Paul Moore" wrote: > > On 14 September 2016 at 13:18, Franklin? Lee > wrote: > > On Sep 9, 2016 1:35 AM, "Benjamin Peterson" wrote: > >> On Thu, Sep 8, 2016, at 22:33, Tim Delaney wrote: > >> > Are sets also ordered by default now? None of the PEPs appear to mention > >> > it. > >> > >> No. > > > > Is there anyone working to move sets in the same direction for 3.6? > > It won't happen for 3.6, as we're now in feature freeze. So it'd be > 3.7 at the earliest. > > What exactly do you mean by "in the same direction" anyway? Remember > that ordering isn't guaranteed (it's an implementation detail), so are > you just saying "can sets benefit from the improvements this change > provided to dicts"? If you *are* hoping for ordered sets, what's your > use case? 
(Dictionaries had particular use cases - retaining ordering > of keyword arguments and class definitions, the existence of > OrderedDict, ...) > > Paul I mean using a compact representation, if not an ordered one. I have no particular usecase in mind. As far as I understand the compact implementation, sets can do it just as well. The original discussion proposed trying to implement it for sets first. Like dict, they would (probably) use less memory, and would usually have a more readable (i.e. less jarring to read) print order. -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Wed Sep 14 09:33:41 2016 From: songofacandy at gmail.com (INADA Naoki) Date: Wed, 14 Sep 2016 22:33:41 +0900 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <1473399244.225934.720369745.6587BC20@webmail.messagingengine.com> Message-ID: > > I mean using a compact representation, if not an ordered one. > > I have no particular usecase in mind. As far as I understand the compact > implementation, sets can do it just as well. The original discussion > proposed trying to implement it for sets first. > > Like dict, they would (probably) use less memory, and would usually have a > more readable (i.e. less jarring to read) print order. > I'll improve OrderedDict after dict in 3.6 is stable enough. Then, I'll do same to sets. While compact ordered split dict is very hard to implement right, OrderedDict and set must be easier than dict. From guido at python.org Wed Sep 14 10:36:30 2016 From: guido at python.org (Guido van Rossum) Date: Wed, 14 Sep 2016 07:36:30 -0700 Subject: [Python-Dev] [python-committers] [RELEASE] Python 3.6.0b1 is now available In-Reply-To: References: <942D57F5-BA76-49CB-B3DB-18E2D6F12AC4@python.org> Message-ID: Fortunately that page isn't linked from anywhere on the home page AFAIK. 
If it is, could someone file an issue in the pydotorg tracker? The url is at the bottom of every page. On Wed, Sep 14, 2016 at 3:41 AM, Paul Moore wrote: > On 14 September 2016 at 11:32, Serhiy Storchaka wrote: >> On 13.09.16 02:35, Ned Deily wrote: >>> >>> On behalf of the Python development community and the Python 3.6 release >>> team, I'm happy to announce the availability of Python 3.6.0b1. 3.6.0b1 >>> is the first of four planned beta releases of Python 3.6, the next major >>> release of Python, and marks the end of the feature development phase >>> for 3.6. >> >> >> There is no mention on https://www.python.org/news/. > > The last release mentioned there is 3.4.0rc1... > > Paul > _______________________________________________ > python-committers mailing list > python-committers at python.org > https://mail.python.org/mailman/listinfo/python-committers > Code of Conduct: https://www.python.org/psf/codeofconduct/ -- --Guido van Rossum (python.org/~guido) 
From VARD.ANTINYAN at cse.gu.se Wed Sep 14 10:05:46 2016 From: VARD.ANTINYAN at cse.gu.se (Vard Antinyan) Date: Wed, 14 Sep 2016 14:05:46 +0000 Subject: [Python-Dev] Code Complexity Survey Message-ID: Dear Python developers, We have undertaken a task to assess code complexity triggers and generate recommendations for developing simple and understandable code. Our intention is to share the results with you, developers, so everyone can learn the triggers behind complex software. We need your help for rigorous results. My request to you is: if you have 5-10 minutes, would you please consider answering the questions of this survey? https://goo.gl/forms/h9WXZ8VSEw7BUyHg1 You are welcome to see preliminary results through this link: https://www.facebook.com/SoftwareCodeQuality/photos/?tab=album&album_id=1639816749664288 The results will be shared on a public webpage and everyone will be invited to learn and discuss them. 
Your knowledge and experience are vital for achieving substantial and generalizable results, and your effort is much appreciated! Sincerely Vard Antinyan PhD candidate at the University of Gothenburg, Sweden Tel: 0046317725707 -------------- next part -------------- An HTML attachment was scrubbed... URL: 
From tjreedy at udel.edu Wed Sep 14 15:02:55 2016 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 14 Sep 2016 15:02:55 -0400 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <1473399244.225934.720369745.6587BC20@webmail.messagingengine.com> Message-ID: On 9/14/2016 9:33 AM, INADA Naoki wrote: >> >> I mean using a compact representation, if not an ordered one. >> >> I have no particular usecase in mind. As far as I understand the compact >> implementation, sets can do it just as well. The original discussion >> proposed trying to implement it for sets first. >> >> Like dict, they would (probably) use less memory, and would usually have a >> more readable (i.e. less jarring to read) print order. >> > > I'll improve OrderedDict after dict in 3.6 is stable enough. > Then, I'll do same to sets. > > While compact ordered split dict is very hard to implement right, > OrderedDict and set > must be easier than dict. Frozensets, and even more, sets, have lots of operations that dicts do not. Making sets more compact without slowing down the various operations should be an interesting challenge. Insert order is not meaningful for at least some of the operations. Moreover, I believe repeated inserts and deletions are much more common for sets than dicts. So I think that any initial ordering should be considered a side-effect and documented as such. The operations would then be optimized for speed and compactness without regard to ordering. This might mean keeping a linked list of free slots so slots can be reused and costly compaction avoided as much and for as long as possible. 
We already have compact mutable collection types that can be kept insert-ordered if one chooses -- lists and collections.deque -- and they are not limited to hashables. Before sets were added, either lists or dicts with None values were used as sets. The latter is obsolete but lists are still sometimes used for their generality, as in a set of lists. We now also have enums for certain small frozensets where the set operations are not needed. -- Terry Jan Reedy 
From ericsnowcurrently at gmail.com Wed Sep 14 18:50:42 2016 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Wed, 14 Sep 2016 16:50:42 -0600 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <1473399244.225934.720369745.6587BC20@webmail.messagingengine.com> Message-ID: On Wed, Sep 14, 2016 at 7:33 AM, INADA Naoki wrote: > I'll improve OrderedDict after dict in 3.6 is stable enough. +1 and if it's done carefully we could even utilize the pure Python OrderedDict and get rid of odictobject.c (and fold dict-common.h back into dictobject.c). We'd need to leave the current implementation as the fallback for implementations that don't have an ordered dict. However, we'd first try a compact-dict-based variant. Doing so would probably require a new field in sys.implementation that indicates dict is ordered. > Then, I'll do same to sets. Unless I've misunderstood, Raymond was opposed to making a similar change to set. 
-eric 
From timothy.c.delaney at gmail.com Wed Sep 14 18:54:17 2016 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Thu, 15 Sep 2016 08:54:17 +1000 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <1473399244.225934.720369745.6587BC20@webmail.messagingengine.com> Message-ID: On 15 September 2016 at 05:02, Terry Reedy wrote: > > We already have compact mutable collection types that can be kept > insert-ordered if one chooses -- lists and collections.deque -- and they > are not limited to hashables. Before sets were added, either lists or > dicts with None values were used as sets. The latter is obsolete but lists > are still sometimes used for their generality, as in a set of lists. We > now also have enums for certain small frozensets where the set operations > are not needed. One use case that isn't covered by any of the above is removing duplicates whilst retaining order (of the first of the matching elements). With an OrderedSet (or ordered by default sets) it would be as simple as: a = OrderedSet(iterable) Probably the best current option would be: a = list(OrderedDict.fromkeys(iterable)) The other use I have for an ordered set is to build up an iterable of unique values whilst retaining order. It's a lot more efficient than doing a linear search on a list when adding each element to see if it's already present. In many cases the order is primarily important for debugging purposes, but I can definitely find cases in my current Java codebase where I've used the pattern (LinkedHashSet) and the order is important to the semantics of the code. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 
From raymond.hettinger at gmail.com Thu Sep 15 01:42:02 2016 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Wed, 14 Sep 2016 22:42:02 -0700 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <1473399244.225934.720369745.6587BC20@webmail.messagingengine.com> Message-ID: > On Sep 14, 2016, at 3:50 PM, Eric Snow wrote: > >> >> Then, I'll do same to sets. > > Unless I've misunderstood, Raymond was opposed to making a similar > change to set. That's right. Here are a few thoughts on the subject before people start running wild. * For the compact dict, the space savings were a net win, with the additional space consumed by the indices and the overallocation for the key/value/hash arrays being more than offset by the improved density of the key/value/hash arrays. However for sets, the net was much less favorable because we still need the indices and overallocation but can only offset the space cost by densifying two of the three arrays. In other words, compacting makes more sense when you have wasted space for keys, values, and hashes. If you lose one of those three, it stops being compelling. * The use pattern for sets is different from dicts. The former has more hit-or-miss lookups. The latter tends to have fewer missing key lookups. Also, some of the optimizations for the set-to-set operations make it difficult to retain set ordering without impacting performance. * I pursued an alternative path to improve set performance. Instead of compacting (which wasn't much of a space win and incurred the cost of an additional indirection), I added linear probing to reduce the cost of collisions and improve cache performance. This improvement is incompatible with the compacting approach I advocated for dictionaries. * For now, the ordering side-effect on dictionaries is non-guaranteed, so it is premature to start insisting the sets become ordered as well. 
The docs already link to a recipe for creating an OrderedSet ( https://code.activestate.com/recipes/576694/ ) but it seems like the uptake has been nearly zero. Also, now that Eric Snow has given us a fast OrderedDict, it is easier than ever to build an OrderedSet from MutableSet and OrderedDict, but again I haven't observed any real interest because typical set-to-set data analytics don't really need or care about ordering. Likewise, the primary use of fast membership testing is order agnostic. * That said, I do think there is room to add alternative set implementations to PyPI. In particular, there are some interesting special cases for orderable data where set-to-set operations can be sped-up by comparing entire ranges of keys (see https://code.activestate.com/recipes/230113-implementation-of-sets-using-sorted-lists for a starting point). IIRC, PyPI already has code for set-like bloom filters and cuckoo hashing. * I understand that it is exciting to have a major block of code accepted into the Python core, but that shouldn't open the floodgates to engaging in more major rewrites of other datatypes unless we're sure that it is warranted. Raymond Hettinger 
From nad at python.org Thu Sep 15 01:52:21 2016 From: nad at python.org (Ned Deily) Date: Thu, 15 Sep 2016 01:52:21 -0400 Subject: [Python-Dev] [python-committers] [RELEASE] Python 3.6.0b1 is now available In-Reply-To: References: <942D57F5-BA76-49CB-B3DB-18E2D6F12AC4@python.org> Message-ID: On Sep 14, 2016, at 10:36, Guido van Rossum wrote: > Fortunately that page isn't linked from anywhere on the home page > AFAIK. If it is, could someone file an issue in the pydotorg tracker? > The url is at the bottom of every page.
> > On Wed, Sep 14, 2016 at 3:41 AM, Paul Moore wrote: >> On 14 September 2016 at 11:32, Serhiy Storchaka wrote: >>> On 13.09.16 02:35, Ned Deily wrote: >>>> >>>> On behalf of the Python development community and the Python 3.6 release >>>> team, I'm happy to announce the availability of Python 3.6.0b1. 3.6.0b1 >>>> is the first of four planned beta releases of Python 3.6, the next major >>>> release of Python, and marks the end of the feature development phase >>>> for 3.6. >>> >>> >>> There is no mention on https://www.python.org/news/. >> >> The last release mentioned there is 3.4.0rc1... https://github.com/python/pythondotorg/issues/1008 [closed] -> duplicate of https://github.com/python/pythondotorg/issues/807 [open] Also, https://www.python.org/news/ has been manually updated by Ewa. -- Ned Deily nad at python.org -- [] 
From storchaka at gmail.com Thu Sep 15 02:31:54 2016 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 15 Sep 2016 09:31:54 +0300 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: Message-ID: On 08.09.16 23:22, Victor Stinner wrote: > I pushed INADA Naoki's implementation of the "compact dict". The hash > table now stores indices pointing to a new second table which contains > keys and values: it adds one new level of indirection. The table of > indices is "compact": it uses 1, 2, 4 or 8 bytes per index depending on > the size of the dictionary. Moreover, the keys/values table is also > more compact: its size is 2/3 of the indices table. > > A nice "side effect" of compact dict is that the dictionary now > preserves the insertion order. It means that keyword arguments can now > be iterated by their creation order: Note that this comes at the expense of a 20% slowdown in iteration. 
$ ./python -m timeit -s "d = dict.fromkeys(range(10**6))" -- "list(d)" Python 3.5: 66.1 msec per loop Python 3.6: 82.5 msec per loop Fortunately the cost of the lookup (the most critical operation for dicts) seems to be left the same. But this can be an argument against using this technique in sets. 
From storchaka at gmail.com Thu Sep 15 02:35:14 2016 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 15 Sep 2016 09:35:14 +0300 Subject: [Python-Dev] [RELEASE] Python 3.6.0b1 is now available In-Reply-To: References: <942D57F5-BA76-49CB-B3DB-18E2D6F12AC4@python.org> Message-ID: On 14.09.16 17:36, Guido van Rossum wrote: > Fortunately that page isn't linked from anywhere on the home page > AFAIK. If it is, could someone file an issue in the pydotorg tracker? > The url is at the bottom of every page. This is one of the first results (actually the first besides manually edited news) of googling "python news". 
From berker.peksag at gmail.com Thu Sep 15 02:48:19 2016 From: berker.peksag at gmail.com (Berker Peksağ) Date: Thu, 15 Sep 2016 09:48:19 +0300 Subject: [Python-Dev] [python-committers] [RELEASE] Python 3.6.0b1 is now available In-Reply-To: References: <942D57F5-BA76-49CB-B3DB-18E2D6F12AC4@python.org> Message-ID: On Thu, Sep 15, 2016 at 9:35 AM, Serhiy Storchaka wrote: > On 14.09.16 17:36, Guido van Rossum wrote: >> >> Fortunately that page isn't linked from anywhere on the home page >> AFAIK. If it is, could someone file an issue in the pydotorg tracker? >> The url is at the bottom of every page. > > > This is one of the first results (actually the first besides manually edited > news) of googling "python news". Fixed, it should redirect to https://www.python.org/blogs/ now. Thanks for noticing this! 
--Berker 
From victor.stinner at gmail.com Thu Sep 15 03:20:17 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 15 Sep 2016 09:20:17 +0200 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: Message-ID: 2016-09-15 8:31 GMT+02:00 Serhiy Storchaka : > Note that this comes at the expense of a 20% slowdown in iteration. > > $ ./python -m timeit -s "d = dict.fromkeys(range(10**6))" -- "list(d)" > Python 3.5: 66.1 msec per loop > Python 3.6: 82.5 msec per loop > > Fortunately the cost of the lookup (the most critical operation for dicts) > seems to be left the same. > > But this can be an argument against using this technique in sets. My small benchmarks on dict memory usage and dict lookup: http://bugs.python.org/issue27350#msg275581 It seems like the memory usage is between 20% and 25% smaller. Great job! Memory usage, Python 3.5 => Python 3.6 on Linux x86_64: ./python -c 'import sys; print(sys.getsizeof({str(i):i for i in range(10)}))' * 10 items: 480 B => 384 B (-20%) * 100 items: 6240 B => 4720 B (-24%) * 1000 items: 49248 B => 36984 B (-25%) Note: the size is the size of the container itself, not of keys nor values. 
http://bugs.python.org/issue27350#msg275587 As I expected, a dictionary lookup is a _little bit_ slower (3%) between Python 3.5 and Python 3.6: $ ./python -m perf timeit -s 'd={str(i):i for i in range(100)}' 'd["10"]; d["20"]; d["30"]; d["40"]; d["50"]; d["10"]; d["20"]; d["30"]; d["40"]; d["50"]' --rigorous Median +- std dev: [lookup35] 309 ns +- 10 ns -> [lookup36] 320 ns +- 8 ns: 1.03x slower Victor 
From songofacandy at gmail.com Thu Sep 15 04:02:39 2016 From: songofacandy at gmail.com (INADA Naoki) Date: Thu, 15 Sep 2016 08:02:39 +0000 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: Message-ID: > > > Note that this comes at the expense of a 20% slowdown in iteration. > > $ ./python -m timeit -s "d = dict.fromkeys(range(10**6))" -- "list(d)" > Python 3.5: 66.1 msec per loop > Python 3.6: 82.5 msec per loop > > Are the two Pythons built with the same options? In my environ: ~/local/python-master/bin/python3 -m timeit -s "d = dict.fromkeys(range(10**6))" 'list(d)' Python master (8cd9c) 100 loops, best of 3: 11 msec per loop Python 3.5.2 100 loops, best of 3: 11.6 msec per loop And dict creation time is: ~/local/python-master/bin/python3 -m timeit "d = dict.fromkeys(range(10**6))" Python master 10 loops, best of 3: 70.1 msec per loop Python 3.5.2 10 loops, best of 3: 78.2 msec per loop Both Pythons are built without `--with-optimizations` or `make profile-opt`. -------------- next part -------------- An HTML attachment was scrubbed... 
> > $ ./python -m timeit -s "d = dict.fromkeys(range(10**6))" -- "list(d)" > Python 3.5: 66.1 msec per loop > Python 3.6: 82.5 msec per loop On my Windows 7 PC with 3.5.2 and 3.6.0b1 installed from the standard python.org builds: >py -3.5 -m timeit -s "d = dict.fromkeys(range(10**6))" -- "list(d)" 10 loops, best of 3: 21.7 msec per loop >py -3.6 -m timeit -s "d = dict.fromkeys(range(10**6))" -- "list(d)" 100 loops, best of 3: 19.6 msec per loop So 3.6 is faster for me. Paul From victor.stinner at gmail.com Thu Sep 15 04:57:07 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 15 Sep 2016 10:57:07 +0200 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: Message-ID: 2016-09-15 10:02 GMT+02:00 INADA Naoki : > In my environ: > > ~/local/python-master/bin/python3 -m timeit -s "d = > dict.fromkeys(range(10**6))" 'list(d)' Stooooop! Please stop using timeit, it's lying! * You must not use the minimum but average or median * You must run a microbenchmark in multiple processes to test different randomized hash functions and different memory layouts In short: you should use my perf module. http://perf.readthedocs.io/en/latest/cli.html#timeit The memory layout and the hash function have a major important on such microbenchmark: https://haypo.github.io/journey-to-stable-benchmark-average.html > Both Python is built without neither `--with-optimizations` or `make > profile-opt`. That's bad :-) For most reliable benchmarks, it's better to use LTO+PGO compilation. 
Victor From songofacandy at gmail.com Thu Sep 15 05:23:02 2016 From: songofacandy at gmail.com (INADA Naoki) Date: Thu, 15 Sep 2016 09:23:02 +0000 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: Message-ID: On Thu, Sep 15, 2016 at 5:57 PM Victor Stinner wrote: > 2016-09-15 10:02 GMT+02:00 INADA Naoki : > > In my environ: > > > > ~/local/python-master/bin/python3 -m timeit -s "d = > > dict.fromkeys(range(10**6))" 'list(d)' > > Stooooop! Please stop using timeit, it's lying! > > * You must not use the minimum but average or median > * You must run a microbenchmark in multiple processes to test > different randomized hash functions and different memory layouts > > In short: you should use my perf module. > http://perf.readthedocs.io/en/latest/cli.html#timeit > > I'm sorry. Changing habits is a bit difficult. I'll use it next time. I ran the microbenchmark 3~5 times and confirmed the result was stable before posting it. And when the difference is smaller than 10%, I don't believe the result. > The memory layout and the hash function have a major important on such > microbenchmark: > https://haypo.github.io/journey-to-stable-benchmark-average.html > > In this microbench, hash randomization is not important, because the keys of the dict are ints. (It means iterating the dict doesn't cause random memory access in the old dict implementation either.) > > > Both Python is built without neither `--with-optimizations` or `make > profile-opt`. > > That's bad :-) For most reliable benchmarks, it's better to use > LTO+PGO compilation. > LTO+PGO may make the performance of `git pull && make` unstable. A PGO clean build takes tooo long for such a quick benchmark. So I don't want to use PGO in such a quick benchmark. And Python doesn't provide a way to use LTO without PGO....
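The repeated-sample methodology Victor advocates can be approximated in pure Python with `timeit.repeat` plus the `statistics` module. This is only a hedged sketch of the idea: it stays in a single process, whereas perf additionally spawns multiple worker processes to vary hash randomization and memory layout.

```python
import statistics
import timeit

# Single-process sketch of the reporting style discussed in this thread:
# collect several samples and report median +- standard deviation instead
# of trusting the minimum of a handful of runs. (perf goes further and
# runs each sample in a separate worker process.)
def bench(stmt, setup="pass", repeat=7, number=1000):
    raw = timeit.Timer(stmt, setup=setup).repeat(repeat=repeat, number=number)
    per_loop = [dt / number for dt in raw]  # seconds per statement execution
    return statistics.median(per_loop), statistics.stdev(per_loop)

median, stdev = bench("list(d)", setup="d = dict.fromkeys(map(str, range(10**3)))")
print("Median +- std dev: %.2f us +- %.2f us" % (median * 1e6, stdev * 1e6))
```

The key size here is scaled down to 10**3 so the sketch runs in a fraction of a second; the reporting format mimics perf's output.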
From p.f.moore at gmail.com Thu Sep 15 05:26:05 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 15 Sep 2016 10:26:05 +0100 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: Message-ID: On 15 September 2016 at 09:57, Victor Stinner wrote: > 2016-09-15 10:02 GMT+02:00 INADA Naoki : >> In my environ: >> >> ~/local/python-master/bin/python3 -m timeit -s "d = >> dict.fromkeys(range(10**6))" 'list(d)' > > Stooooop! Please stop using timeit, it's lying! > > * You must not use the minimum but average or median > * You must run a microbenchmark in multiple processes to test > different randomized hash functions and different memory layouts > > In short: you should use my perf module. > http://perf.readthedocs.io/en/latest/cli.html#timeit Made essentially no difference to the results I posted: >py -3.5 -m perf timeit -s "d = dict.fromkeys(range(10**6))" -- "list(d)" .................... Median +- std dev: 21.4 ms +- 0.7 ms >py -3.6 -m perf timeit -s "d = dict.fromkeys(range(10**6))" -- "list(d)" .................... Median +- std dev: 20.0 ms +- 1.1 ms 3.6 remains faster, by very little (barely one standard deviation). I would consider that the same result as timeit (to the level that it's reasonable to assign any meaning to a microbenchmark). Paul From solipsis at pitrou.net Thu Sep 15 05:29:51 2016 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 15 Sep 2016 11:29:51 +0200 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered References: Message-ID: <20160915112951.6a9e91e1@fsol> On Thu, 15 Sep 2016 10:57:07 +0200 Victor Stinner wrote: > > > Both Python is built without neither `--with-optimizations` or `make > > profile-opt`. > > That's bad :-) For most reliable benchmarks, it's better to use > LTO+PGO compilation. That sounds irrelevant. LTO+PGO improves performance, it does nothing for benchmarking per se.
That said, it's probably more useful to benchmark an optimized Python build than an unoptimized one... Regards Antoine. From raymond.hettinger at gmail.com Thu Sep 15 05:43:46 2016 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Thu, 15 Sep 2016 02:43:46 -0700 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: Message-ID: <842DEA6E-1A77-48B0-9AF5-FAA6EBF8A599@gmail.com> > On Sep 14, 2016, at 11:31 PM, Serhiy Storchaka wrote: > > Note that this is made at the expense of the 20% slowing down an iteration. > > $ ./python -m timeit -s "d = dict.fromkeys(range(10**6))" -- "list(d)" > Python 3.5: 66.1 msec per loop > Python 3.6: 82.5 msec per loop A range of consecutive integers which have consecutive hash values is a really weak and non-representative basis for comparison. Something like this will reveal the true and massive improvement in iteration speed: $ ./python.exe -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" There are two reasons for the significant improvement in iteration speed: 1) The dense key table is smaller (no intervening NULL entries) so we do fewer total memory fetches to loop over the keys, values, or items. 2) The loop over the dense table no longer makes frequent, unpredictable tests for NULL entries. (To better understand why this matters and how major the impact is, see http://stackoverflow.com/questions/11227809 ). Your mileage will vary depending on the size of the dictionary and whether the old dictionary would have densely packed the keys (as in Serhiy's non-representative example). Raymond P.S. Algorithmically, the compact dict seems to be mostly where it needs to be (modulo some implementation bugs that are being ironed out). However, the code hasn't been tuned and polished as much as the old implementation, so there is still room for its timings to improve.
Dict copies should end up being faster (fewer bytes copied and a predictable test for NULLs). Resizes should be much faster (only the small index table needs to be updated, while the keys/values/hashes don't get moved). In complex apps, the memory savings ought to translate into better cache performance (that doesn't show up much in tight benchmark loops but tends to make a difference in real code). From victor.stinner at gmail.com Thu Sep 15 05:47:25 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 15 Sep 2016 11:47:25 +0200 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: <20160915112951.6a9e91e1@fsol> References: <20160915112951.6a9e91e1@fsol> Message-ID: 2016-09-15 11:29 GMT+02:00 Antoine Pitrou : > That sounds irrelevant. LTO+PGO improves performance, it does > nothing for benchmarking per se. In the past, I had bad surprises when running benchmarks without PGO: https://haypo.github.io/journey-to-stable-benchmark-deadcode.html I don't recall if ASLR was enabled or not. But I don't think that I used multiple processes when I ran these benchmarks because I didn't write the code yet :-) I should probably redo the same benchmark using the new shiny benchmarking tools (which are expected to be more reliable and stable).
Victor From p.f.moore at gmail.com Thu Sep 15 07:27:51 2016 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 15 Sep 2016 12:27:51 +0100 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: <842DEA6E-1A77-48B0-9AF5-FAA6EBF8A599@gmail.com> References: <842DEA6E-1A77-48B0-9AF5-FAA6EBF8A599@gmail.com> Message-ID: On 15 September 2016 at 10:43, Raymond Hettinger wrote: > Something like this will reveal the true and massive improvement in iteration speed: > > $ ./python.exe -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" >py -3.5 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" 10 loops, best of 3: 66.2 msec per loop >py -3.6 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" 10 loops, best of 3: 27.8 msec per loop And for Victor: >py -3.5 -m perf timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" .................... Median +- std dev: 65.7 ms +- 3.8 ms >py -3.6 -m perf timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" .................... Median +- std dev: 27.9 ms +- 1.2 ms Just as a side point, perf provided essentially identical results but took 2 minutes as opposed to 8 seconds for timeit to do so. I understand why perf is better, and I appreciate all the work Victor did to create it, and analyze the results, but for getting a quick impression of how a microbenchmark performs, I don't see timeit as being *quite* as bad as Victor is claiming. I will tend to use perf now that I have it installed, and now that I know how to run a published timeit invocation using perf. It's a really cool tool. But I certainly won't object to seeing people publish timeit results (any more than I'd object to *any* microbenchmark).
Paul From storchaka at gmail.com Thu Sep 15 08:04:04 2016 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 15 Sep 2016 15:04:04 +0300 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: Message-ID: On 15.09.16 11:02, INADA Naoki wrote: > Are two Pythons built with same options? Both are built from a clean checkout with default options (hg update -C 3.x; ./configure; make -s). The only difference is -std=c99 and additional warnings in 3.6: Python 3.5: gcc -pthread -c -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -Werror=declaration-after-statement -I. -I./Include -DPy_BUILD_CORE -o Objects/dictobject.o Objects/dictobject.c Python 3.6: gcc -pthread -c -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -std=c99 -Wextra -Wno-unused-result -Wno-unused-parameter -Wno-missing-field-initializers -I. -I./Include -DPy_BUILD_CORE -o Objects/dictobject.o Objects/dictobject.c Usually I run a microbenchmark 3-5 times and choose the median. Results were stable enough (the variation is about 1%); it is unlikely that the perf tool will give a significantly different result. I repeated the measurements on a different computer, and the difference is the same: Python 3.5: 10 loops, best of 3: 33.5 msec per loop Python 3.6: 10 loops, best of 3: 37.5 msec per loop These results look surprising and inexplicable to me. I expected that even if there is some performance regression in the lookup or modifying operation, the iteration should not be slower. CPUs on both computers work in 32-bit mode. Maybe this affects the results. For string keys Python 3.6 is 4 times faster!
$ ./python -m timeit -s "d = dict.fromkeys(map(str, range(10**6)))" -- "list(d)" On one computer: Python 3.5: 10 loops, best of 3: 384 msec per loop Python 3.6: 10 loops, best of 3: 94.6 msec per loop On other computer: Python 3.5: 10 loops, best of 3: 179 msec per loop Python 3.6: 10 loops, best of 3: 46 msec per loop From storchaka at gmail.com Thu Sep 15 08:17:14 2016 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 15 Sep 2016 15:17:14 +0300 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: Message-ID: On 15.09.16 11:57, Victor Stinner wrote: > Stooooop! Please stop using timeit, it's lying! > > * You must not use the minimum but average or median > * You must run a microbenchmark in multiple processes to test > different randomized hash functions and different memory layouts > > In short: you should use my perf module. > http://perf.readthedocs.io/en/latest/cli.html#timeit > > The memory layout and the hash function have a major important on such > microbenchmark: > https://haypo.github.io/journey-to-stable-benchmark-average.html $ ./python -m perf timeit -s "d = dict.fromkeys(range(10**6))" -- "list(d)" Python 3.5: Median +- std dev: 65.1 ms +- 4.9 ms Python 3.6: Median +- std dev: 79.4 ms +- 3.9 ms Other computer: Python 3.5: Median +- std dev: 33.6 ms +- 0.3 ms Python 3.6: Median +- std dev: 37.5 ms +- 0.2 ms From storchaka at gmail.com Thu Sep 15 09:06:36 2016 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 15 Sep 2016 16:06:36 +0300 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: <842DEA6E-1A77-48B0-9AF5-FAA6EBF8A599@gmail.com> References: <842DEA6E-1A77-48B0-9AF5-FAA6EBF8A599@gmail.com> Message-ID: On 15.09.16 12:43, Raymond Hettinger wrote: >> On Sep 14, 2016, at 11:31 PM, Serhiy Storchaka wrote: >> >> Note that this is made at the expense of the 20% slowing down an 
iteration. >> >> $ ./python -m timeit -s "d = dict.fromkeys(range(10**6))" -- "list(d)" >> Python 3.5: 66.1 msec per loop >> Python 3.6: 82.5 msec per loop > > A range of consecutive integers which have consecutive hash values is a really weak and non-representative basis for comparison. With randomized integers the result is even worse. $ ./python -m timeit -s "import random; a = list(range(10**6)); random.seed(0); random.shuffle(a); d = dict.fromkeys(a)" -- "list(d)" Python 3.5: 10 loops, best of 3: 33.6 msec per loop Python 3.6: 10 loops, best of 3: 166 msec per loop From ericsnowcurrently at gmail.com Thu Sep 15 09:08:50 2016 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 15 Sep 2016 07:08:50 -0600 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: Message-ID: On Sep 15, 2016 06:06, "Serhiy Storchaka" wrote: > Python 3.5: 10 loops, best of 3: 33.5 msec per loop > Python 3.6: 10 loops, best of 3: 37.5 msec per loop > > These results look surprisingly and inexplicably to me. I expected that even if there is some performance regression in the lookup or modifying operation, the iteration should not be slower. My understanding is that the all-int-keys case is an outlier. This is due to how ints hash, resulting in fewer collisions and a mostly insertion-ordered hash table. Consequently, I'd expect the above microbenchmark to give roughly the same result between 3.5 and 3.6, which it did. -eric
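To make the layout being discussed concrete, here is a minimal sketch of a compact dict: a sparse index table plus a dense entries array. This is an illustration only, not CPython's actual C implementation — it omits resizing, deletion, and CPython's perturbed probing. Iteration scans only the dense array, which is why insertion order is preserved and no NULL-slot checks are needed; and because CPython hashes small ints to themselves (`hash(7) == 7`), `range` keys land in consecutive slots even in the pre-3.6 design, as Eric describes.

```python
# Minimal sketch (not CPython's code) of the compact-dict layout:
# a small sparse index table plus a dense entries array. Only works
# for a handful of keys, since it never resizes.
class CompactDict:
    FREE = -1

    def __init__(self, size=8):
        self.indices = [self.FREE] * size   # sparse: slot -> entry index
        self.entries = []                   # dense: (hash, key, value)

    def _slot(self, key):
        mask = len(self.indices) - 1
        i = hash(key) & mask
        while True:
            idx = self.indices[i]
            if idx == self.FREE or self.entries[idx][1] == key:
                return i
            i = (i + 1) & mask              # plain linear probing for the sketch

    def __setitem__(self, key, value):
        i = self._slot(key)
        if self.indices[i] == self.FREE:
            self.indices[i] = len(self.entries)
            self.entries.append((hash(key), key, value))
        else:
            self.entries[self.indices[i]] = (hash(key), key, value)

    def __getitem__(self, key):
        idx = self.indices[self._slot(key)]
        if idx == self.FREE:
            raise KeyError(key)
        return self.entries[idx][2]

    def __iter__(self):                     # dense scan: no NULL checks
        return (key for _, key, _ in self.entries)

d = CompactDict()
for k in ("b", "a", "c"):
    d[k] = ord(k)
print(list(d))  # iteration order == insertion order: ['b', 'a', 'c']
```

The dense `entries` list is what makes both the memory savings and the predictable iteration loop possible; the sparse `indices` table holds only small integers.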
From solipsis at pitrou.net Thu Sep 15 09:46:40 2016 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 15 Sep 2016 15:46:40 +0200 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered References: Message-ID: <20160915154640.68df4399@fsol> On Thu, 15 Sep 2016 07:08:50 -0600 Eric Snow wrote: > On Sep 15, 2016 06:06, "Serhiy Storchaka" wrote: > > Python 3.5: 10 loops, best of 3: 33.5 msec per loop > > Python 3.6: 10 loops, best of 3: 37.5 msec per loop > > > > These results look surprisingly and inexplicably to me. I expected that > even if there is some performance regression in the lookup or modifying > operation, the iteration should not be slower. > > My understanding is that the all-int-keys case is an outlier. This is due > to how ints hash, resulting in fewer collisions and a mostly > insertion-ordered hash table. Consequently, I'd expect the above > microbenchmark to give roughly the same result between 3.5 and 3.6, which > it did. Dict iteration shouldn't have any dependence on collisions or insertion order. It's just a table scan, both in 3.5 and 3.6. Regards Antoine. From raymond.hettinger at gmail.com Thu Sep 15 11:02:10 2016 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Thu, 15 Sep 2016 08:02:10 -0700 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: <20160915154640.68df4399@fsol> References: <20160915154640.68df4399@fsol> Message-ID: <7BDDAF13-0C33-44A2-BBA8-82C5D8C7587E@gmail.com> [Eric] >> My understanding is that the all-int-keys case is an outlier. This is due >> to how ints hash, resulting in fewer collisions and a mostly >> insertion-ordered hash table. Consequently, I'd expect the above >> microbenchmark to give roughly the same result between 3.5 and 3.6, which >> it did. [Antoine] > Dict iteration shouldn't have any dependence on collisions or insertion > order.
It's just a table scan, both in 3.5 and 3.6. Eric is correct on this one. The consecutive hashes make a huge difference for Python 3.5. While there is still a full table scan, the check for NULL entries becomes a predictable branch when all the keys are in consecutive positions. There is an astonishingly well-written Stack Overflow post that explains this effect clearly: http://stackoverflow.com/questions/11227809 With normal randomized keys, the Python 3.6 loop is dramatically better than Python 3.5: ~/py36 $ ./python.exe -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" 100 loops, best of 3: 12.3 msec per loop ~/py35 $ ./python.exe -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" 10 loops, best of 3: 54.7 msec per loop Repeating the timings, I get consistent results: 12.0 vs 46.7 and 12.0 vs 52.2 and 11.5 vs 44.8. Raymond P.S. Timings are from fresh builds on Mac OS X 10.11.6 running on a 2.6 GHz Haswell i7 with 16 GB of 1600 MHz RAM: $ ./configure CC=gcc-6 && make From Artyom.Skrobov at arm.com Thu Sep 15 08:02:47 2016 From: Artyom.Skrobov at arm.com (Artyom Skrobov) Date: Thu, 15 Sep 2016 12:02:47 +0000 Subject: [Python-Dev] Python parser performance optimizations In-Reply-To: References: Message-ID: Hello, This is a monthly ping to get a review on http://bugs.python.org/issue26415 -- "Excessive peak memory consumption by the Python parser". Following the comments from August, the patches now include a more detailed comment for Init_ValidationGrammar(). The code change itself is still the same as two months ago. From: Artyom Skrobov Sent: 07 July 2016 15:44 To: python-dev at python.org; steve at pearwood.info; mafagafogigante at gmail.com; greg.ewing at canterbury.ac.nz Cc: nd Subject: RE: Python parser performance optimizations Hello, This is a monthly ping to get a review on http://bugs.python.org/issue26415 -- "Excessive peak memory consumption by the Python parser".
The first patch of the series (an NFC refactoring) was successfully committed earlier in June, so the next step is to get the second patch, "the payload", reviewed and committed. To address the concerns raised by the commenters back in May: the patch doesn't lead to negative memory consumption, of course. The base for calculating percentages is the smaller number of the two; this is the same style of reporting that perf.py uses. In other words, "200% less memory usage" is a threefold shrink. The absolute values, and the way they were produced, are all reported under the ticket. From: Artyom Skrobov Sent: 26 May 2016 11:19 To: 'python-dev at python.org' Subject: Python parser performance optimizations Hello, Back in March, I've posted a patch at http://bugs.python.org/issue26526 -- "In parsermodule.c, replace over 2KLOC of hand-crafted validation code, with a DFA". The motivation for this patch was to enable a memory footprint optimization, discussed at http://bugs.python.org/issue26415 My proposed optimization reduces the memory footprint by up to 30% on the standard benchmarks, and by 200% on a degenerate case which sparked the discussion. The run time stays unaffected by this optimization. Python Developer's Guide says: "If you don't get a response within a few days after pinging the issue, then you can try emailing python-dev at python.org asking for someone to review your patch." So, here I am.
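The percentage convention described above — the smaller of the two numbers as the base, so that "200% less memory usage" means a threefold shrink — can be sketched as a tiny helper. This is a hypothetical illustration of the convention only, not code taken from perf.py.

```python
# Hypothetical helper illustrating the percentage convention described
# above: the smaller value is the base, so shrinking 30 MB down to 10 MB
# is reported as "200% less" (a threefold shrink).
def report_change(old, new):
    base = min(old, new)
    pct = abs(old - new) / base * 100.0
    direction = "less" if new < old else "more"
    return "%.0f%% %s" % (pct, direction)

print(report_change(30.0, 10.0))  # -> 200% less
print(report_change(10.0, 13.0))  # -> 30% more
```

Using the smaller number as the base keeps shrink and growth symmetric: a threefold shrink and a threefold growth both read as "200%".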
From ethan at stoneleaf.us Thu Sep 15 11:36:43 2016 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 15 Sep 2016 08:36:43 -0700 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: <7BDDAF13-0C33-44A2-BBA8-82C5D8C7587E@gmail.com> References: <20160915154640.68df4399@fsol> <7BDDAF13-0C33-44A2-BBA8-82C5D8C7587E@gmail.com> Message-ID: <57DAC00B.30604@stoneleaf.us> On 09/15/2016 08:02 AM, Raymond Hettinger wrote: > Eric is correct on this one. The consecutive hashes make a huge difference for Python 3.5. While there is a table full table scan, the check for NULL entries becomes a predictable branch when all the keys are in consecutive positions. There is an astonishingly well written stack overflow post that explains this effect clearly: http://stackoverflow.com/questions/11227809 Thanks for that. Very good answer. -- ~Ethan~ From guido at python.org Thu Sep 15 12:01:15 2016 From: guido at python.org (Guido van Rossum) Date: Thu, 15 Sep 2016 09:01:15 -0700 Subject: [Python-Dev] Python parser performance optimizations In-Reply-To: References: Message-ID: I wonder if this patch could just be rejected instead of lingering forever? It clearly has no champion among the current core devs and therefore it won't be included in Python 3.6 (we're all volunteers so that's how it goes). The use case for the patch is also debatable: Python's parser wasn't designed to *efficiently* parse huge data tables like that, and if you have that much data, using JSON is the right answer. So this doesn't really scratch anyone's itch except that of the patch author (Artyom). From a quick look it seems the patch is very disruptive in terms of what it changes, so it's not easy to review. I recommend giving up, closing the issue as "won't fix", recommending to use JSON, and moving on. Sometimes a change is just not worth the effort.
--Guido On Tue, Aug 9, 2016 at 1:59 AM, Artyom Skrobov wrote: > Hello, > > > > This is a monthly ping to get a review on http://bugs.python.org/issue26415 > -- "Excessive peak memory consumption by the Python parser". > > > > Following the comments from July, the patches now include updating Misc/NEWS > and compiler.rst to describe the change. > > > > The code change itself is still the same as a month ago. > > > > > > From: Artyom Skrobov > Sent: 07 July 2016 15:44 > To: python-dev at python.org; steve at pearwood.info; mafagafogigante at gmail.com; > greg.ewing at canterbury.ac.nz > Cc: nd > Subject: RE: Python parser performance optimizations > > > > Hello, > > > > This is a monthly ping to get a review on http://bugs.python.org/issue26415 > -- "Excessive peak memory consumption by the Python parser". > > The first patch of the series (an NFC refactoring) was successfully > committed earlier in June, so the next step is to get the second patch, "the > payload", reviewed and committed. > > > > To address the concerns raised by the commenters back in May: the patch > doesn't lead to negative memory consumption, of course. The base for > calculating percentages is the smaller number of the two; this is the same > style of reporting that perf.py uses. In other words, "200% less memory > usage" is a threefold shrink. > > > > The absolute values, and the way they were produced, are all reported under > the ticket. > > > > > > From: Artyom Skrobov > Sent: 26 May 2016 11:19 > To: 'python-dev at python.org' > Subject: Python parser performance optimizations > > > > Hello, > > > > Back in March, I've posted a patch at http://bugs.python.org/issue26526 -- > "In parsermodule.c, replace over 2KLOC of hand-crafted validation code, with > a DFA".
> > > The motivation for this patch was to enable a memory footprint optimization, > discussed at http://bugs.python.org/issue26415 > > My proposed optimization reduces the memory footprint by up to 30% on the > standard benchmarks, and by 200% on a degenerate case which sparked the > discussion. > > The run time stays unaffected by this optimization. > > > > Python Developer's Guide says: "If you don't get a response within a few > days after pinging the issue, then you can try emailing > python-dev at python.org asking for someone to review your patch." > > > > So, here I am. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) From solipsis at pitrou.net Thu Sep 15 12:13:54 2016 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 15 Sep 2016 18:13:54 +0200 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: <7BDDAF13-0C33-44A2-BBA8-82C5D8C7587E@gmail.com> References: <20160915154640.68df4399@fsol> <7BDDAF13-0C33-44A2-BBA8-82C5D8C7587E@gmail.com> Message-ID: <20160915181354.01b34ce3@fsol> On Thu, 15 Sep 2016 08:02:10 -0700 Raymond Hettinger wrote: > > Eric is correct on this one. The consecutive hashes make a huge difference for Python 3.5. While there is a table full table scan, the check for NULL entries becomes a predictable branch when all the keys are in consecutive positions. There is an astonishingly well written stack overflow post that explains this effect clearly: http://stackoverflow.com/questions/11227809 > > With normal randomized keys, Python 3.6 loop is dramatically better that Python 3.5: [...] You're jumping to conclusions. While there is a difference, there is no evidence that the difference is due to better branch prediction.
Actually, let's do a quick back-of-the-envelope calculation and show that it can't be attributed mostly to branch prediction: > ~/py36 $ ./python.exe -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" > 100 loops, best of 3: 12.3 msec per loop > ~/py35 $ ./python.exe -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" > 10 loops, best of 3: 54.7 msec per loop For 10**6 elements, this is a 42ns difference per dict element. A 2.6 Ghz Haswell doesn't stall for 42ns when there's a branch mispredict. According to the Internet, the branch mispredict penalty for a Haswell CPU is 15 cycles, which is 5.7ns at 2.6 GHz. Far from the observed 42ns. 42ns, however, is congruent with another possible effect: a main memory access following a last-level cache miss. And indeed, Serhiy showed that this micro-benchmark is actually dependent on insertion order *on Python 3.6*: $ ./python -m timeit -s "l = [str(i) for i in range(10**6)]; d=dict.fromkeys(l)" "list(d)" -> 100 loops, best of 3: 20 msec per loop $ ./python -m timeit -s "import random; l = [str(i) for i in range(10**6)]; random.shuffle(l); d=dict.fromkeys(l)" "list(d)" -> 10 loops, best of 3: 55.8 msec per loop The only way the table scan (without NULL checks, since this is Python 3.6) can be dependent on insertion order is because iterating the table elements needs to INCREF each element, that is: this benchmark doesn't only scan the table in a nice prefetcher-friendly linear sequence, it also accesses object headers at arbitrary places in Python's heap memory. Since this micro-benchmark creates the keys in order just before filling the dict with them, randomizing the insertion order destroys the temporal locality of object header accesses when iterating over the dict keys. *This* looks like the right explanation, not branch mispredicts due to NULL checks. This also shows that a micro-benchmark that merely looks ok can actually be a terrible proxy of actual performance. 
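The locality effect is straightforward to reproduce outside timeit. The standalone sketch below — sizes scaled down from the thread's 10**6, and the exact numbers machine-dependent — builds identical dict contents with the insertion order either matching or decorrelated from the allocation order of the key objects:

```python
import random
import time

# Standalone sketch of the locality experiment: identical dict contents,
# but the insertion order either matches or is decorrelated from the
# order in which the key strings were allocated on the heap.
def best_iteration_time(keys, loops=20):
    d = dict.fromkeys(keys)
    samples = []
    for _ in range(loops):
        t0 = time.perf_counter()
        list(d)
        samples.append(time.perf_counter() - t0)
    return min(samples)

ordered = [str(i) for i in range(10**5)]
shuffled = ordered[:]            # the very same key objects...
random.shuffle(shuffled)         # ...inserted in a random order

print("insertion order == allocation order:", best_iteration_time(ordered))
print("randomized insertion order:         ", best_iteration_time(shuffled))
```

On a working set larger than the CPU cache, the second timing is expected to be noticeably worse, since the per-key INCREF touches object headers scattered across the heap.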
As a further validation of this theory, let's dramatically decrease the working set size on the initial benchmark: $ ./python -m timeit -s "d=dict.fromkeys(map(str,range(10**3)))" "list(d)" -> Python 3.5: 100000 loops, best of 3: 10.9 usec per loop -> Python 3.6: 100000 loops, best of 3: 9.72 usec per loop When the working set fits in the cache, this micro-benchmark is only 12% slower on 3.5 compared to 3.6. *This* much smaller difference (a mere 1.2ns difference per dict element) could be attributed to eliminating the NULL checks, or to any other streamlining of the core iteration logic. Regards Antoine. From solipsis at pitrou.net Thu Sep 15 12:33:25 2016 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 15 Sep 2016 18:33:25 +0200 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered References: <20160915154640.68df4399@fsol> <7BDDAF13-0C33-44A2-BBA8-82C5D8C7587E@gmail.com> <20160915181354.01b34ce3@fsol> Message-ID: <20160915183325.3ae957bf@fsol> On Thu, 15 Sep 2016 18:13:54 +0200 Antoine Pitrou wrote: > > This also shows that a micro-benchmark that merely looks ok can actually > be a terrible proxy of actual performance. ... unless all your dicts have their key objects nicely arranged sequentially in heap memory, of course. Regards Antoine. From Nikolaus at rath.org Thu Sep 15 13:42:50 2016 From: Nikolaus at rath.org (Nikolaus Rath) Date: Thu, 15 Sep 2016 10:42:50 -0700 Subject: [Python-Dev] Drastically improving list.sort() for lists of strings/ints In-Reply-To: (Tim Peters's message of "Tue, 13 Sep 2016 13:08:35 -0500") References: <87fup3n265.fsf@thinkpad.rath.org> Message-ID: <87oa3pxdnp.fsf@thinkpad.rath.org> On Sep 13 2016, Tim Peters wrote: > [Terry Reedy ] >>> Tim Peters investigated and empirically determined that an >>> O(n*n) binary insort, as he optimized it on real machines, is faster >>> than O(n*logn) sorting for up to around 64 items. 
> [Nikolaus Rath ] >> Out of curiosity: is this test repeated periodically on different >> architectures? Or could it be that it only ever was true 10 years ago on >> Tim's Power Mac G5 (or whatever he used)? > > It has little to do with architecture, but much to do with the > relative cost of comparisons versus pointer-copying. Near the end of > > https://github.com/python/cpython/blob/master/Objects/listsort.txt [...] Ah, that makes sense, thanks! Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F "Time flies like an arrow, fruit flies like a Banana." From storchaka at gmail.com Thu Sep 15 14:30:37 2016 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 15 Sep 2016 21:30:37 +0300 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: <20160915181354.01b34ce3@fsol> References: <20160915154640.68df4399@fsol> <7BDDAF13-0C33-44A2-BBA8-82C5D8C7587E@gmail.com> <20160915181354.01b34ce3@fsol> Message-ID: On 15.09.16 19:13, Antoine Pitrou wrote: > Since this micro-benchmark creates the keys in order just before > filling the dict with them, randomizing the insertion order destroys > the temporal locality of object header accesses when iterating over the > dict keys. *This* looks like the right explanation, not branch > mispredicts due to NULL checks. > > This also shows that a micro-benchmark that merely looks ok can actually > be a terrible proxy of actual performance. Thank you for the great explanation, Antoine! I came to the same conclusions about the randomized-integers example, but didn't notice that this is also the main cause of the speed-up in the strings example.
> As a further validation of this theory, let's dramatically decrease the > working set size on the initial benchmark: > > $ ./python -m timeit -s "d=dict.fromkeys(map(str,range(10**3)))" > "list(d)" > > -> Python 3.5: 100000 loops, best of 3: 10.9 usec per loop > -> Python 3.6: 100000 loops, best of 3: 9.72 usec per loop > > When the working set fits in the cache, this micro-benchmark is > only 12% slower on 3.5 compared to 3.6. > *This* much smaller difference (a mere 1.2ns difference per dict > element) could be attributed to eliminating the NULL checks, or to any > other streamlining of the core iteration logic. Yet another example, with random hashes and an insertion order independent of the creation order. $ ./python -m timeit -s "import random; a = list(map(str, range(10**6))); random.shuffle(a); d = dict.fromkeys(a)" -- "list(d)" Python 3.5: 180, 180, 180 msec per loop Python 3.6: 171, 172, 171 msec per loop Python 3.6 is 5% faster and this looks closer to the actual performance. From victor.stinner at gmail.com Thu Sep 15 15:33:28 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 15 Sep 2016 21:33:28 +0200 Subject: [Python-Dev] Microbenchmarks Message-ID: The discussion on benchmarking is no longer related to the compact dict, so I'm starting a new thread. 2016-09-15 13:27 GMT+02:00 Paul Moore : > Just as a side point, perf provided essentially identical results but > took 2 minutes as opposed to 8 seconds for timeit to do so. I > understand why perf is better, and I appreciate all the work Victor > did to create it, and analyze the results, but for getting a quick > impression of how a microbenchmark performs, I don't see timeit as > being *quite* as bad as Victor is claiming.
Let's see a small session: $ python3 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" 10 loops, best of 3: 46.7 msec per loop $ python3 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" 10 loops, best of 3: 46.9 msec per loop $ python3 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" 10 loops, best of 3: 46.9 msec per loop $ python3 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" 10 loops, best of 3: 47 msec per loop $ python2 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" 10 loops, best of 3: 36.3 msec per loop $ python2 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" 10 loops, best of 3: 36.1 msec per loop $ python2 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" 10 loops, best of 3: 36.5 msec per loop $ python3 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" 10 loops, best of 3: 48.3 msec per loop $ python3 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" 10 loops, best of 3: 48.4 msec per loop $ python3 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" 10 loops, best of 3: 48.8 msec per loop I ran timeit 7 times on Python 3 and 3 times on Python 2. Please ignore Python 2, it's just a quick command to interfere with the Python 3 tests. Now the question is: what is the "correct" result for Python 3? Let's take the minimum of the minimums: 46.7 ms. Now imagine that you only had the first 4 runs. What is the "good" result now? Min is still 46.7 ms. And what if you only had the last 3 runs? What is the "good" result now? Min becomes 48.3 ms. On such a microbenchmark, the difference between 46.7 ms and 48.3 ms is large :-( How do you know that you ran timeit enough times to make sure that the result is the good one? For me, the timeit tool is broken because you *must* run it many times to work around its limits. In short, I wrote the perf module to answer these questions.
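The same instability shows up even within a single process. Here is a quick illustrative sketch (not part of perf itself) using timeit.repeat() and the statistics module, scaled down to 10**4 keys so it finishes quickly:

```python
import statistics
import timeit

# Repeat the micro-benchmark several times in a single process.
runs = timeit.repeat("list(d)",
                     setup="d = dict.fromkeys(map(str, range(10**4)))",
                     repeat=7, number=100)

# The minimum depends on which runs you happened to collect; the
# median together with the standard deviation is a more honest summary.
print("min:    %.6f s" % min(runs))
print("median: %.6f s" % statistics.median(runs))
print("stdev:  %.6f s" % statistics.stdev(runs))
```

perf applies the same idea, but across multiple worker processes, and reports the median with the standard deviation rather than the minimum: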
* perf uses multiple processes to test multiple memory layouts and multiple randomized hash functions * perf ignores the first run, used to "warm up" the benchmark (--warmups command line option) * perf provides many tools to analyze the distribution of results: minimum, maximum, standard deviation, histogram, number of samples, median, etc. * perf displays the median +- standard deviation: the median is more reproducible, and the standard deviation gives an idea of the stability of the benchmark * etc. > I will tend to use perf now that I have it installed, and now that I > know how to run a published timeit invocation using perf. It's a > really cool tool. But I certainly won't object to seeing people > publish timeit results (any more than I'd object to *any* > microbenchmark). I consider that timeit results are not reliable at all. There is no standard deviation, and it's hard to guess how many times the user ran timeit, or how he/she computed the "good result". perf takes ~60 seconds by default. If you don't care about accuracy, use --fast and it now only takes 20 seconds ;-) Victor From victor.stinner at gmail.com Thu Sep 15 17:04:03 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 15 Sep 2016 23:04:03 +0200 Subject: [Python-Dev] Microbenchmarks In-Reply-To: References: Message-ID: 2016-09-15 21:33 GMT+02:00 Victor Stinner : > perf takes ~60 seconds by default. If you don't care about accuracy, > use --fast and it now only takes 20 seconds ;-) Oops, I'm wrong. By default, a "single dot" (one worker process) takes less than 1 second, so 20 dots (the default) take less than 20 seconds. In the following example, the setup statement is quite expensive: $ python3 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" The statement "list(d)" takes around 47 ms, but the setup statement takes 377 ms. It seems that timeit executes the setup statement before each run.
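The setup cost is easy to measure on its own. An illustrative sketch, scaled down to 10**5 keys so it finishes quickly (the session above used 10**6):

```python
import timeit

SETUP = "d = dict.fromkeys(map(str, range(10**5)))"

# Time the setup statement by itself: it builds 10**5 strings and a
# 10**5-entry dict, which dwarfs the statement actually under test.
t_setup = timeit.timeit(SETUP, number=5) / 5

# Time the statement under test with the usual setup.
t_stmt = min(timeit.repeat("list(d)", setup=SETUP,
                           repeat=3, number=10)) / 10

print("setup statement:      %.2f ms" % (t_setup * 1e3))
print("statement under test: %.2f ms" % (t_stmt * 1e3))
```

On a typical machine the setup statement is an order of magnitude more expensive than the statement being benchmarked, which is why it dominates the total wall-clock time of the benchmark.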
Perf is based on timeit, and so each process runs the setup statement at least 4 times (1 warmup + 3 samples): 4*377 ms ~= 1.5 sec per process. Replace range(10**6) with range(10**5) and the benchmark itself becomes much faster: 57 seconds => 15 seconds. Victor From victor.stinner at gmail.com Fri Sep 16 09:22:41 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 16 Sep 2016 15:22:41 +0200 Subject: [Python-Dev] [python-committers] [RELEASE] Python 3.6.0b1 is now available In-Reply-To: References: <942D57F5-BA76-49CB-B3DB-18E2D6F12AC4@python.org> Message-ID: 2016-09-13 18:36 GMT+02:00 Victor Stinner : > Ok, let's start listing regressions/major issues :-) > > * Bug in _PyDict_Pop() on a split table: > http://bugs.python.org/issue28120 -- bug in the new compact dict > implementation A new one: a crash in os.execve() and os.spawnve(): http://bugs.python.org/issue28114 It was reported by Windows users but also Twisted (test suite) on Linux. Victor From guido at python.org Fri Sep 16 10:44:44 2016 From: guido at python.org (Guido van Rossum) Date: Fri, 16 Sep 2016 07:44:44 -0700 Subject: [Python-Dev] Python parser performance optimizations In-Reply-To: References: Message-ID: OK, but if nobody responds within a week we should close it. IMO there's no value in keeping things around that nobody is going to apply. I don't expect that a year from now we'll suddenly see a surge of interest in this patch, sorry. On Fri, Sep 16, 2016 at 4:25 AM, Artyom Skrobov wrote: > Thank you very much for your comments, > > I appreciate that we're all volunteers, and that if nobody fancies > reviewing a big invasive patch, then it won't get reviewed. > > Still, I want to note that the suggested optimization has a noticeable > positive effect on many benchmarks -- even though the effect may only > become of practical value in such uncommon use cases as parsing huge data > tables.
> > As I found out later, JSON wasn't a viable option for storing dozens of > megabytes of deeply-nested data, either. To get acceptable deserialization > performance, I eventually had to resort to pickled files. > > > -----Original Message----- > From: gvanrossum at gmail.com [mailto:gvanrossum at gmail.com] On Behalf Of > Guido van Rossum > Sent: 15 September 2016 17:01 > To: Artyom Skrobov > Cc: python-dev at python.org; brett at python.org; jimjjewett at gmail.com; nd > Subject: Re: [Python-Dev] Python parser performance optimizations > > I wonder if this patch could just be rejected instead of lingering > forever? It clearly has no champion among the current core devs and > therefore it won't be included in Python 3.6 (we're all volunteers so > that's how it goes). > > The use case for the patch is also debatable: Python's parser wasn't > designed to *efficiently* parse huge data tables like that, and if you > have that much data, using JSON is the right answer. So this doesn't > really scratch anyone's itch except the patch author's (Artyom's). > > From a quick look it seems the patch is very disruptive in terms of > what it changes, so it's not easy to review. > > I recommend giving up, closing the issue as "won't fix", recommending > the use of JSON, and moving on. Sometimes a change is just not worth the > effort. > > --Guido > > On Tue, Aug 9, 2016 at 1:59 AM, Artyom Skrobov > wrote: > > Hello, > > > > > > > > This is a monthly ping to get a review on http://bugs.python.org/issue26415 > > -- "Excessive peak memory consumption by the Python parser". > > > > > > > > Following the comments from July, the patches now include updating Misc/NEWS > > and compiler.rst to describe the change. > > > > > > > > The code change itself is still the same as a month ago.
> > > > > > > > > > From: Artyom Skrobov > > Sent: 07 July 2016 15:44 > > To: python-dev at python.org; steve at pearwood.info; > mafagafogigante at gmail.com; > > greg.ewing at canterbury.ac.nz > > Cc: nd > > Subject: RE: Python parser performance optimizations > > > > > > > > Hello, > > > > > > > > This is a monthly ping to get a review on http://bugs.python.org/issue26415 > > -- "Excessive peak memory consumption by the Python parser". > > > > The first patch of the series (an NFC refactoring) was successfully > > committed earlier in June, so the next step is to get the second patch, "the > > payload", reviewed and committed. > > > > > > > > To address the concerns raised by the commenters back in May: the patch > > doesn't lead to negative memory consumption, of course. The base for > > calculating percentages is the smaller number of the two; this is the same > > style of reporting that perf.py uses. In other words, "200% less memory > > usage" is a threefold shrink. > > > > > > > > The absolute values, and the way they were produced, are all reported > under > > the ticket. > > > > > > > > > > > > From: Artyom Skrobov > > Sent: 26 May 2016 11:19 > > To: 'python-dev at python.org' > > Subject: Python parser performance optimizations > > > > > > > > Hello, > > > > > > > > Back in March, I've posted a patch at http://bugs.python.org/issue26526 > -- > > "In parsermodule.c, replace over 2KLOC of hand-crafted validation code, > with > > a DFA". > > > > > > > > The motivation for this patch was to enable a memory footprint > optimization, > > discussed at http://bugs.python.org/issue26415 > > > > My proposed optimization reduces the memory footprint by up to 30% on the > > standard benchmarks, and by 200% on a degenerate case which sparked the > > discussion. > > > > The run time stays unaffected by this optimization.
> > > > > > Python Developer's Guide says: "If you don't get a response within a few > > days after pinging the issue, then you can try emailing > > python-dev at python.org asking for someone to review your patch." > > > > > > > > So, here I am. > > > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > > https://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > > > -- > --Guido van Rossum (python.org/~guido) > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From Artyom.Skrobov at arm.com Fri Sep 16 07:25:38 2016 From: Artyom.Skrobov at arm.com (Artyom Skrobov) Date: Fri, 16 Sep 2016 11:25:38 +0000 Subject: [Python-Dev] Python parser performance optimizations In-Reply-To: References: Message-ID: Thank you very much for your comments, I appreciate that we're all volunteers, and that if nobody fancies reviewing a big invasive patch, then it won't get reviewed. Still, I want to note that the suggested optimization has a noticeable positive effect on many benchmarks -- even though the effect may only become of practical value in such uncommon use cases as parsing huge data tables. As I found out later, JSON wasn't a viable option for storing dozens of megabytes of deeply-nested data, either. To get acceptable deserialization performance, I eventually had to resort to pickled files. -----Original Message----- From: gvanrossum at gmail.com [mailto:gvanrossum at gmail.com] On Behalf Of Guido van Rossum Sent: 15 September 2016 17:01 To: Artyom Skrobov Cc: python-dev at python.org; brett at python.org; jimjjewett at gmail.com; nd Subject: Re: [Python-Dev] Python parser performance optimizations I wonder if this patch could just be rejected instead of lingering forever?
It clearly has no champion among the current core devs and therefore it won't be included in Python 3.6 (we're all volunteers so that's how it goes). The use case for the patch is also debatable: Python's parser wasn't designed to *efficiently* parse huge data tables like that, and if you have that much data, using JSON is the right answer. So this doesn't really scratch anyone's itch except the patch author's (Artyom's). From a quick look it seems the patch is very disruptive in terms of what it changes, so it's not easy to review. I recommend giving up, closing the issue as "won't fix", recommending the use of JSON, and moving on. Sometimes a change is just not worth the effort. --Guido On Tue, Aug 9, 2016 at 1:59 AM, Artyom Skrobov wrote: > Hello, > > > > This is a monthly ping to get a review on http://bugs.python.org/issue26415 > -- "Excessive peak memory consumption by the Python parser". > > > > Following the comments from July, the patches now include updating Misc/NEWS > and compiler.rst to describe the change. > > > > The code change itself is still the same as a month ago. > > > > > > From: Artyom Skrobov > Sent: 07 July 2016 15:44 > To: python-dev at python.org; steve at pearwood.info; mafagafogigante at gmail.com; > greg.ewing at canterbury.ac.nz > Cc: nd > Subject: RE: Python parser performance optimizations > > > > Hello, > > > > This is a monthly ping to get a review on http://bugs.python.org/issue26415 > -- "Excessive peak memory consumption by the Python parser". > > The first patch of the series (an NFC refactoring) was successfully > committed earlier in June, so the next step is to get the second patch, "the > payload", reviewed and committed. > > > > To address the concerns raised by the commenters back in May: the patch > doesn't lead to negative memory consumption, of course. The base for > calculating percentages is the smaller number of the two; this is the same > style of reporting that perf.py uses. In other words, "200% less memory > usage"
is a threefold shrink. > > > > The absolute values, and the way they were produced, are all reported under > the ticket. > > > > > > From: Artyom Skrobov > Sent: 26 May 2016 11:19 > To: 'python-dev at python.org' > Subject: Python parser performance optimizations > > > > Hello, > > > > Back in March, I've posted a patch at http://bugs.python.org/issue26526 -- > "In parsermodule.c, replace over 2KLOC of hand-crafted validation code, with > a DFA". > > > > The motivation for this patch was to enable a memory footprint optimization, > discussed at http://bugs.python.org/issue26415 > > My proposed optimization reduces the memory footprint by up to 30% on the > standard benchmarks, and by 200% on a degenerate case which sparked the > discussion. > > The run time stays unaffected by this optimization. > > > > Python Developer's Guide says: "If you don't get a response within a few > days after pinging the issue, then you can try emailing > python-dev at python.org asking for someone to review your patch." > > > > So, here I am. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) From status at bugs.python.org Fri Sep 16 12:08:50 2016 From: status at bugs.python.org (Python tracker) Date: Fri, 16 Sep 2016 18:08:50 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20160916160850.25D84568AB@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2016-09-09 - 2016-09-16) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message.
Issues counts and deltas: open 5507 (-61) closed 34432 (+199) total 39939 (+138) Open issues with patches: 2367 Issues opened (82) ================== #15369: pybench and test.pystone poorly documented http://bugs.python.org/issue15369 reopened by haypo #21009: Potential deadlock in concurrent futures when garbage collecti http://bugs.python.org/issue21009 reopened by davin #22493: Deprecate the use of flags not at the start of regular express http://bugs.python.org/issue22493 reopened by serhiy.storchaka #24254: Make class definition namespace ordered by default http://bugs.python.org/issue24254 reopened by ncoghlan #25573: FrameSummary repr() does not support previously working uses o http://bugs.python.org/issue25573 reopened by r.david.murray #26502: traceback.extract_tb breaks compatibility by returning FrameSu http://bugs.python.org/issue26502 reopened by r.david.murray #28027: Remove Lib/plat-*/* files http://bugs.python.org/issue28027 reopened by doko #28046: Remove the concept of platform-specific directories http://bugs.python.org/issue28046 opened by zach.ware #28050: test_traceback is broken by new CALL_FUNCTION* opcodes http://bugs.python.org/issue28050 opened by haypo #28052: clarify concurrent.futures docs to not refer to async Futures http://bugs.python.org/issue28052 opened by davin #28053: parameterize what serialization is used in multiprocessing http://bugs.python.org/issue28053 opened by davin #28054: Diff for visually comparing actual with expected in mock.asser http://bugs.python.org/issue28054 opened by Eli Rose #28055: pyhash's siphash24 assumes alignment of the data pointer http://bugs.python.org/issue28055 opened by doko #28058: [Patch] Don't use st_uid and st_gid on CloudABI http://bugs.python.org/issue28058 opened by EdSchouten #28060: Clean up division fast paths in Objects/longobject.c http://bugs.python.org/issue28060 opened by mark.dickinson #28062: Streamline repr(partial object) http://bugs.python.org/issue28062 opened by ebarry 
#28068: Error in freeze.py due to unguarded sys.abiflags usage under W http://bugs.python.org/issue28068 opened by gevorg #28069: signalmodule.c does "is" comparisons for SIG_IGN and SIG_DFL http://bugs.python.org/issue28069 opened by mark.dickinson #28074: Add Configuration file parser action http://bugs.python.org/issue28074 opened by Chris Nyland #28075: os.stat fails when access is denied http://bugs.python.org/issue28075 opened by ramson #28080: Allow reading member names with bogus encodings in zipfile http://bugs.python.org/issue28080 opened by sjt #28082: re: convert re flags to (much friendlier) IntFlag constants http://bugs.python.org/issue28082 opened by ethan.furman #28083: socket: finish constant to Enum/Flag conversion http://bugs.python.org/issue28083 opened by ethan.furman #28085: SSL: Add client and server protocols for SSLContext http://bugs.python.org/issue28085 opened by christian.heimes #28086: test.test_getargs2.TupleSubclass test failure http://bugs.python.org/issue28086 opened by rbcollins #28087: macOS 12 poll syscall returns prematurely http://bugs.python.org/issue28087 opened by MicroTransactionsMatterToo #28088: Document Transport.set_protocol and get_protocol http://bugs.python.org/issue28088 opened by yselivanov #28089: Document TCP_NODELAY by default http://bugs.python.org/issue28089 opened by yselivanov #28090: Document PEP 530 http://bugs.python.org/issue28090 opened by yselivanov #28091: Document PEP 525 http://bugs.python.org/issue28091 opened by yselivanov #28092: Build failure for 3.6 on Centos 5.11 http://bugs.python.org/issue28092 opened by steven.daprano #28095: test_startup_imports of test_site fails on OS X due to new imp http://bugs.python.org/issue28095 opened by ned.deily #28097: IDLE: document all key bindings, add menu items for more. 
http://bugs.python.org/issue28097 opened by terry.reedy #28099: Drop Mac OS X Tiger support in Python 3.6 http://bugs.python.org/issue28099 opened by haypo #28100: Refactor error messages in symtable.c http://bugs.python.org/issue28100 opened by levkivskyi #28107: Update typing module dicumentation for NamedTuple http://bugs.python.org/issue28107 opened by levkivskyi #28108: Python configure fails to detect tzname on platforms that have http://bugs.python.org/issue28108 opened by belopolsky #28110: launcher.msi has different product codes between 32 and 64-bit http://bugs.python.org/issue28110 opened by steve.dower #28111: geometric_mean can raise OverflowError when checking for inf http://bugs.python.org/issue28111 opened by steven.daprano #28113: Remove Py_CreateSymbolicLinkW http://bugs.python.org/issue28113 opened by eryksun #28115: Use argparse for the zipfile module http://bugs.python.org/issue28115 opened by serhiy.storchaka #28117: warning: dereferencing type-punned pointer will break strict-a http://bugs.python.org/issue28117 opened by serhiy.storchaka #28121: If module starts with comment or empty line then frame.f_code. http://bugs.python.org/issue28121 opened by Aivar.Annamaa #28123: _PyDict_GetItem_KnownHash ignores DKIX_ERROR return http://bugs.python.org/issue28123 opened by xiang.zhang #28124: Rework SSL module documentation http://bugs.python.org/issue28124 opened by christian.heimes #28125: identify cross builds by a more generic environment setting. 
http://bugs.python.org/issue28125 opened by doko #28128: Improve the warning message for invalid escape sequences http://bugs.python.org/issue28128 opened by Chi Hsuan Yen #28129: assertion failures in ctypes http://bugs.python.org/issue28129 opened by Oren Milman #28130: Document that time.tzset updates time module globals http://bugs.python.org/issue28130 opened by belopolsky #28132: impossible to uninstall python3.6.0b1-amd64 from windows 10 http://bugs.python.org/issue28132 opened by Big Stone #28134: socket.socket(fileno=fd) does not work as documented http://bugs.python.org/issue28134 opened by christian.heimes #28137: Windows sys.path file should be renamed http://bugs.python.org/issue28137 opened by steve.dower #28138: Windows _sys.path file should allow import site http://bugs.python.org/issue28138 opened by steve.dower #28139: Misleading Indentation in C source code http://bugs.python.org/issue28139 opened by franciscouzo #28140: Attempt to give better errors for pip commands typed into the http://bugs.python.org/issue28140 opened by ncoghlan #28141: shutil.copystat utime lookup fails on certain Android file sys http://bugs.python.org/issue28141 opened by Jerry A #28143: ASDL compatibility with Python 3 system interpreter http://bugs.python.org/issue28143 opened by malthe #28144: Decrease empty_keys_struct's dk_refcnt http://bugs.python.org/issue28144 opened by xiang.zhang #28145: Fix whitespace in C source code http://bugs.python.org/issue28145 opened by franciscouzo #28146: Confusing error messages in str.format() http://bugs.python.org/issue28146 opened by serhiy.storchaka #28147: Unbounded memory growth resizing split-table dicts http://bugs.python.org/issue28147 opened by minrk #28148: [Patch] Also stop using localtime() in timemodule http://bugs.python.org/issue28148 opened by EdSchouten #28149: Incorrect indentation under ???else??? 
in _bsddb.c http://bugs.python.org/issue28149 opened by martin.panter #28151: testPythonOrg() of test_robotparser fails on validating python http://bugs.python.org/issue28151 opened by haypo #28152: Clang warnings: code will never be executed http://bugs.python.org/issue28152 opened by haypo #28157: Document time module constants (timezone, tzname, etc.) as dep http://bugs.python.org/issue28157 opened by belopolsky #28158: Implement LOAD_GLOBAL opcode cache http://bugs.python.org/issue28158 opened by yselivanov #28159: Deprecate isdst argument in email.utils.localtime http://bugs.python.org/issue28159 opened by belopolsky #28161: Opening CON for write access fails http://bugs.python.org/issue28161 opened by eryksun #28162: WindowsConsoleIO readall() fails if first line starts with Ctr http://bugs.python.org/issue28162 opened by eryksun #28163: WindowsConsoleIO fileno() passes wrong flags to _open_osfhandl http://bugs.python.org/issue28163 opened by eryksun #28164: _PyIO_get_console_type fails for various paths http://bugs.python.org/issue28164 opened by eryksun #28165: The 'subprocess' module leaks roughly 4 KiB of memory per call http://bugs.python.org/issue28165 opened by Xavion #28166: WindowsConsoleIO misbehavior when Ctrl+C is ignored http://bugs.python.org/issue28166 opened by eryksun #28167: remove platform.linux_distribution() http://bugs.python.org/issue28167 opened by doko #28168: Use _winapi.WaitForMultipleObjects in Popen.wait() http://bugs.python.org/issue28168 opened by eryksun #28169: shift exponent overflow http://bugs.python.org/issue28169 opened by franciscouzo #28172: Upper-case all example enum members http://bugs.python.org/issue28172 opened by Rosuav #28178: allow to cache_clear(some_key) in lru_cache http://bugs.python.org/issue28178 opened by Sébastien de Menten #28179: Segfault in test_recursionlimit_fatalerror http://bugs.python.org/issue28179 opened by berker.peksag #28180: sys.getfilesystemencoding() should default to utf-8
http://bugs.python.org/issue28180 opened by Jan Niklas Hasse #28182: Expose OpenSSL verification results in SSLError http://bugs.python.org/issue28182 opened by Chi Hsuan Yen Most recent 15 issues with no replies (15) ========================================== #28182: Expose OpenSSL verification results in SSLError http://bugs.python.org/issue28182 #28179: Segfault in test_recursionlimit_fatalerror http://bugs.python.org/issue28179 #28172: Upper-case all example enum members http://bugs.python.org/issue28172 #28166: WindowsConsoleIO misbehavior when Ctrl+C is ignored http://bugs.python.org/issue28166 #28164: _PyIO_get_console_type fails for various paths http://bugs.python.org/issue28164 #28163: WindowsConsoleIO fileno() passes wrong flags to _open_osfhandl http://bugs.python.org/issue28163 #28162: WindowsConsoleIO readall() fails if first line starts with Ctr http://bugs.python.org/issue28162 #28161: Opening CON for write access fails http://bugs.python.org/issue28161 #28159: Deprecate isdst argument in email.utils.localtime http://bugs.python.org/issue28159 #28157: Document time module constants (timezone, tzname, etc.) as dep http://bugs.python.org/issue28157 #28152: Clang warnings: code will never be executed http://bugs.python.org/issue28152 #28149: Incorrect indentation under "else"
in _bsddb.c http://bugs.python.org/issue28149 #28141: shutil.copystat utime lookup fails on certain Android file sys http://bugs.python.org/issue28141 #28130: Document that time.tzset updates time module globals http://bugs.python.org/issue28130 #28129: assertion failures in ctypes http://bugs.python.org/issue28129 Most recent 15 issues waiting for review (15) ============================================= #28172: Upper-case all example enum members http://bugs.python.org/issue28172 #28168: Use _winapi.WaitForMultipleObjects in Popen.wait() http://bugs.python.org/issue28168 #28158: Implement LOAD_GLOBAL opcode cache http://bugs.python.org/issue28158 #28151: testPythonOrg() of test_robotparser fails on validating python http://bugs.python.org/issue28151 #28148: [Patch] Also stop using localtime() in timemodule http://bugs.python.org/issue28148 #28147: Unbounded memory growth resizing split-table dicts http://bugs.python.org/issue28147 #28145: Fix whitespace in C source code http://bugs.python.org/issue28145 #28144: Decrease empty_keys_struct's dk_refcnt http://bugs.python.org/issue28144 #28143: ASDL compatibility with Python 3 system interpreter http://bugs.python.org/issue28143 #28139: Misleading Indentation in C source code http://bugs.python.org/issue28139 #28134: socket.socket(fileno=fd) does not work as documented http://bugs.python.org/issue28134 #28129: assertion failures in ctypes http://bugs.python.org/issue28129 #28128: Improve the warning message for invalid escape sequences http://bugs.python.org/issue28128 #28123: _PyDict_GetItem_KnownHash ignores DKIX_ERROR return http://bugs.python.org/issue28123 #28113: Remove Py_CreateSymbolicLinkW http://bugs.python.org/issue28113 Top 10 most discussed issues (10) ================================= #27213: Rework CALL_FUNCTION* opcodes http://bugs.python.org/issue27213 30 msgs #28046: Remove the concept of platform-specific directories http://bugs.python.org/issue28046 28 msgs #28055: pyhash's siphash24 assumes 
alignment of the data pointer http://bugs.python.org/issue28055 28 msgs #28128: Improve the warning message for invalid escape sequences http://bugs.python.org/issue28128 21 msgs #28027: Remove Lib/plat-*/* files http://bugs.python.org/issue28027 20 msgs #28147: Unbounded memory growth resizing split-table dicts http://bugs.python.org/issue28147 16 msgs #28022: SSL releated deprecation for 3.6 http://bugs.python.org/issue28022 14 msgs #28092: Build failure for 3.6 on Centos 5.11 http://bugs.python.org/issue28092 14 msgs #26081: Implement asyncio Future in C to improve performance http://bugs.python.org/issue26081 12 msgs #28080: Allow reading member names with bogus encodings in zipfile http://bugs.python.org/issue28080 12 msgs Issues closed (186) =================== #3990: The Linux2 platform definition is incorrect for alpha, hppa, m http://bugs.python.org/issue3990 closed by zach.ware #4558: ./configure --with-stdc89 to test ANSI C conformity http://bugs.python.org/issue4558 closed by christian.heimes #10213: tests shouldn't fail with unset timezone http://bugs.python.org/issue10213 closed by belopolsky #10740: sqlite3 module breaks transactions and potentially corrupts da http://bugs.python.org/issue10740 closed by berker.peksag #10765: Build regression from automation changes on windows http://bugs.python.org/issue10765 closed by zach.ware #10976: accept bytes in json.loads() http://bugs.python.org/issue10976 closed by ncoghlan #11640: Shelve references globals in its __del__ method http://bugs.python.org/issue11640 closed by berker.peksag #12619: Automatically regenerate platform-specific modules http://bugs.python.org/issue12619 closed by zach.ware #12643: code.InteractiveConsole ignores sys.excepthook http://bugs.python.org/issue12643 closed by berker.peksag #13405: Add DTrace probes http://bugs.python.org/issue13405 closed by lukasz.langa #13924: Mercurial robots.txt should let robots crawl landing pages. 
     http://bugs.python.org/issue13924  closed by  barry
#14776: Add SystemTap static markers
     http://bugs.python.org/issue14776  closed by  lukasz.langa
#14976: queue.Queue() is not reentrant, so signals and GC can cause de
     http://bugs.python.org/issue14976  closed by  rhettinger
#14977: mailcap does not respect precedence in the presence of wildcar
     http://bugs.python.org/issue14977  closed by  r.david.murray
#15941: Time module: effect of time.timezone change
     http://bugs.python.org/issue15941  closed by  belopolsky
#16189: ld_so_aix not found
     http://bugs.python.org/issue16189  closed by  martin.panter
#16193: display full e-mail name in hg.python.org annotate pages
     http://bugs.python.org/issue16193  closed by  berker.peksag
#16384: import.c doesn't handle EOFError from PyMarshal_Read*
     http://bugs.python.org/issue16384  closed by  eric.snow
#16700: Document that bytes OS API can returns unusable results on Win
     http://bugs.python.org/issue16700  closed by  barry
#17394: Add slicing support to collections.deque
     http://bugs.python.org/issue17394  closed by  rhettinger
#17512: backport of the _sysconfigdata.py module (issue 13150) breaks
     http://bugs.python.org/issue17512  closed by  berker.peksag
#17582: xml.etree.ElementTree does not preserve whitespaces in attribu
     http://bugs.python.org/issue17582  closed by  rhettinger
#17909: Autodetecting JSON encoding
     http://bugs.python.org/issue17909  closed by  ncoghlan
#17941: namedtuple should support fully qualified name for more portab
     http://bugs.python.org/issue17941  closed by  rhettinger
#18199: Windows: support path longer than 260 bytes using "\\?\" prefi
     http://bugs.python.org/issue18199  closed by  steve.dower
#18401: Tests for pdb import ~/.pdbrc
     http://bugs.python.org/issue18401  closed by  lukasz.langa
#18546: ssl.get_server_certificate like addition for cert chain
     http://bugs.python.org/issue18546  closed by  christian.heimes
#19003: email.generator.BytesGenerator corrupts data by changing line
     http://bugs.python.org/issue19003  closed by  r.david.murray
#19489: move quick search box above TOC
     http://bugs.python.org/issue19489  closed by  python-dev
#19502: Wrong time zone offset, when using time.strftime() with a give
     http://bugs.python.org/issue19502  closed by  belopolsky
#19763: Make it easier to backport statistics to 2.7
     http://bugs.python.org/issue19763  closed by  christian.heimes
#20476: If new email policies are used, default message factory should
     http://bugs.python.org/issue20476  closed by  r.david.murray
#20483: Missing network resource checks in test_urllib2 & test_smtplib
     http://bugs.python.org/issue20483  closed by  berker.peksag
#20885: Little Endian PowerPC64 Linux
     http://bugs.python.org/issue20885  closed by  berker.peksag
#21337: Add tests for Tix
     http://bugs.python.org/issue21337  closed by  zach.ware
#22450: urllib doesn't put Accept: */* in the headers
     http://bugs.python.org/issue22450  closed by  rhettinger
#22458: Add fractions benchmark
     http://bugs.python.org/issue22458  closed by  haypo
#22544: Inconsistent cmath.log behaviour
     http://bugs.python.org/issue22544  closed by  mark.dickinson
#22799: wrong time.timezone
     http://bugs.python.org/issue22799  closed by  belopolsky
#23105: os.O_SHLOCK and os.O_EXLOCK are not available on Linux
     http://bugs.python.org/issue23105  closed by  python-dev
#23403: Use pickle protocol 4 by default?
     http://bugs.python.org/issue23403  closed by  davin
#23545: Turn on extra warnings on GCC
     http://bugs.python.org/issue23545  closed by  serhiy.storchaka
#23722: During metaclass.__init__, super() of the constructed class do
     http://bugs.python.org/issue23722  closed by  ncoghlan
#24168: Unittest discover fails with namespace package if the path con
     http://bugs.python.org/issue24168  closed by  barry
#24186: OpenSSL causes buffer overrun exception
     http://bugs.python.org/issue24186  closed by  steve.dower
#24320: Remove a now-unnecessary workaround from importlib._bootstrap.
     http://bugs.python.org/issue24320  closed by  eric.snow
#24454: Improve the usability of the match object named group API
     http://bugs.python.org/issue24454  closed by  eric.smith
#24510: Make _PyCoro_GetAwaitableIter a public API
     http://bugs.python.org/issue24510  closed by  yselivanov
#24511: Add methods for async protocols
     http://bugs.python.org/issue24511  closed by  yselivanov
#24594: msilib.OpenDatabase Type Confusion
     http://bugs.python.org/issue24594  closed by  steve.dower
#24693: zipfile: change RuntimeError to more appropriate exception typ
     http://bugs.python.org/issue24693  closed by  serhiy.storchaka
#25144: 3.5 Win install fails with "TARGETDIR"
     http://bugs.python.org/issue25144  closed by  steve.dower
#25221: PyLong_FromLong() potentially returns irregular object when sm
     http://bugs.python.org/issue25221  closed by  mark.dickinson
#25270: codecs.escape_encode systemerror on empty byte string
     http://bugs.python.org/issue25270  closed by  berker.peksag
#25283: Make tm_gmtoff and tm_zone available on all platforms
     http://bugs.python.org/issue25283  closed by  python-dev
#25497: Rewrite test_robotparser
     http://bugs.python.org/issue25497  closed by  berker.peksag
#25671: Fix venv activate.fish to maintain $status
     http://bugs.python.org/issue25671  closed by  python-dev
#25758: ensurepip/venv broken on Windows if path includes unicode
     http://bugs.python.org/issue25758  closed by  steve.dower
#25776: More compact pickle of iterators etc
     http://bugs.python.org/issue25776  closed by  rhettinger
#25856: The __module__ attribute of non-heap classes is not interned
     http://bugs.python.org/issue25856  closed by  serhiy.storchaka
#25895: urllib.parse.urljoin does not handle WebSocket URLs
     http://bugs.python.org/issue25895  closed by  berker.peksag
#25969: Update lib2to3 grammar to include missing unpacking generaliza
     http://bugs.python.org/issue25969  closed by  gregory.p.smith
#26132: 2.7.11 Windows Installer issues on Win2008R2
     http://bugs.python.org/issue26132  closed by  steve.dower
#26141: typing module documentation incomplete
     http://bugs.python.org/issue26141  closed by  gvanrossum
#26182: Deprecation warnings for the future async and await keywords i
     http://bugs.python.org/issue26182  closed by  yselivanov
#26284: Fix telco benchmark
     http://bugs.python.org/issue26284  closed by  haypo
#26331: PEP 515: Tokenizer: allow underscores for grouping in numeric
     http://bugs.python.org/issue26331  closed by  brett.cannon
#26383: benchmarks (perf.py): number of decimal places in csv output
     http://bugs.python.org/issue26383  closed by  haypo
#26455: Inconsistent behavior with KeyboardInterrupt and asyncio futur
     http://bugs.python.org/issue26455  closed by  gvanrossum
#26496: Exhausted deque iterator should free a reference to a deque
     http://bugs.python.org/issue26496  closed by  rhettinger
#26507: Use highest pickle protocol in multiprocessing
     http://bugs.python.org/issue26507  closed by  davin
#26511: Add link to id() built-in in comparison operator documentation
     http://bugs.python.org/issue26511  closed by  rhettinger
#26533: logging.config does not allow disable_existing_loggers=True
     http://bugs.python.org/issue26533  closed by  python-dev
#26557: dictviews methods not present on shelve objects
     http://bugs.python.org/issue26557  closed by  rhettinger
#26619: 3.5.1 install fails on Windows Server 2008 R2 64-bit
     http://bugs.python.org/issue26619  closed by  steve.dower
#26654: asyncio is not inspecting keyword arguments of functools.parti
     http://bugs.python.org/issue26654  closed by  yselivanov
#26797: Segafault in _PyObject_Alloc
     http://bugs.python.org/issue26797  closed by  yselivanov
#26815: SIGBUS in test_ssl.test_dealloc_warn() on "AMD64 FreeBSD 10.0
     http://bugs.python.org/issue26815  closed by  christian.heimes
#26830: Refactor Tools/scripts/google.py
     http://bugs.python.org/issue26830  closed by  berker.peksag
#26858: android: setting SO_REUSEPORT fails
     http://bugs.python.org/issue26858  closed by  yselivanov
#26885: Add parsing support for more types in xmlrpc
     http://bugs.python.org/issue26885  closed by  serhiy.storchaka
#26900: Exclude the private API from the stable API
     http://bugs.python.org/issue26900  closed by  serhiy.storchaka
#26909: Asyncio: Pipes and socket IO is very slow
     http://bugs.python.org/issue26909  closed by  yselivanov
#27080: Implement the formatting part of PEP 515, '_' in numeric liter
     http://bugs.python.org/issue27080  closed by  eric.smith
#27137: Python implementation of `functools.partial` is not a class
     http://bugs.python.org/issue27137  closed by  ncoghlan
#27199: TarFile expose copyfileobj bufsize to improve throughput
     http://bugs.python.org/issue27199  closed by  lukasz.langa
#27314: Cannot install 3.5.2 with 3.6.0a1 installed
     http://bugs.python.org/issue27314  closed by  python-dev
#27350: Compact and ordered dict
     http://bugs.python.org/issue27350  closed by  benjamin.peterson
#27415: regression: BaseEventLoop.create_server does not accept port=N
     http://bugs.python.org/issue27415  closed by  yselivanov
#27456: asyncio: set TCP_NODELAY flag by default
     http://bugs.python.org/issue27456  closed by  yselivanov
#27516: Wrong initialization of python path with embeddable distributi
     http://bugs.python.org/issue27516  closed by  steve.dower
#27520: Issue when building PGO
     http://bugs.python.org/issue27520  closed by  steve.dower
#27564: 2.7.12 Windows Installer package broken.
     http://bugs.python.org/issue27564  closed by  steve.dower
#27566: Tools/freeze/winmakemakefile.py clean target should use 'del'
     http://bugs.python.org/issue27566  closed by  steve.dower
#27569: Windows install problems
     http://bugs.python.org/issue27569  closed by  steve.dower
#27576: An unexpected difference between dict and OrderedDict
     http://bugs.python.org/issue27576  closed by  eric.snow
#27599: Buffer overrun in binascii
     http://bugs.python.org/issue27599  closed by  serhiy.storchaka
#27604: More details about `-O` flag
     http://bugs.python.org/issue27604  closed by  berker.peksag
#27665: Make create_server able to listen on several ports
     http://bugs.python.org/issue27665  closed by  yselivanov
#27680: Reduce Github pull request rate
     http://bugs.python.org/issue27680  closed by  berker.peksag
#27705: Updating old C:/Windows/System32/ucrtbased.dll
     http://bugs.python.org/issue27705  closed by  terry.reedy
#27759: selectors incorrectly retain invalid file descriptors
     http://bugs.python.org/issue27759  closed by  yselivanov
#27810: Add METH_FASTCALL: new calling convention for C functions
     http://bugs.python.org/issue27810  closed by  haypo
#27830: Add _PyObject_FastCallKeywords(): avoid the creation of a temp
     http://bugs.python.org/issue27830  closed by  haypo
#27890: platform.release() incorrect in Python 3.5.2 on Windows 2008Se
     http://bugs.python.org/issue27890  closed by  steve.dower
#27948: f-strings: allow backslashes only in the string parts, not in
     http://bugs.python.org/issue27948  closed by  eric.smith
#27952: Finish converting fixcid.py from regex to re
     http://bugs.python.org/issue27952  closed by  martin.panter
#27976: Deprecate building with bundled copy of libffi on non-Darwin P
     http://bugs.python.org/issue27976  closed by  python-dev
#27981: Reference leak in fp_setreadl() of Parser/tokenizer.c
     http://bugs.python.org/issue27981  closed by  berker.peksag
#27986: make distclean clobbers Lib/plat-darwin/*
     http://bugs.python.org/issue27986  closed by  zach.ware
#27991: In the argparse howto there is a misleading sentence about sto
     http://bugs.python.org/issue27991  closed by  berker.peksag
#27999: Make "global after use" a SyntaxError
     http://bugs.python.org/issue27999  closed by  gvanrossum
#28008: PEP 530, asynchronous comprehensions implementation
     http://bugs.python.org/issue28008  closed by  yselivanov
#28018: Cross compilation fails in regen
     http://bugs.python.org/issue28018  closed by  Chi Hsuan Yen
#28019: itertools.count() falls back to fast (integer) mode when step
     http://bugs.python.org/issue28019  closed by  serhiy.storchaka
#28024: fileinput causes RecursionErrors when dealing with large numbe
     http://bugs.python.org/issue28024  closed by  berker.peksag
#28035: make buildbottest when configured --with-optimizations can cau
     http://bugs.python.org/issue28035  closed by  gregory.p.smith
#28036: Remove unused pysqlite_flush_statement_cache function
     http://bugs.python.org/issue28036  closed by  berker.peksag
#28037: Use sqlite3_get_autocommit() instead of setting Connection->in
     http://bugs.python.org/issue28037  closed by  berker.peksag
#28038: Remove com2ann script (will be in separate repo)
     http://bugs.python.org/issue28038  closed by  gvanrossum
#28039: x86 Tiger buildbot needs __future__ with_statement
     http://bugs.python.org/issue28039  closed by  martin.panter
#28040: compact dict : SystemError: returned NULL without setting an e
     http://bugs.python.org/issue28040  closed by  haypo
#28045: minor inaccuracy in range_contains_long
     http://bugs.python.org/issue28045  closed by  berker.peksag
#28047: email set_content does not always use the correct line length
     http://bugs.python.org/issue28047  closed by  r.david.murray
#28048: Adjust class-build method of Enum so final ordered dict more c
     http://bugs.python.org/issue28048  closed by  ethan.furman
#28049: Add documentation for typing.Awaitable and friends
     http://bugs.python.org/issue28049  closed by  gvanrossum
#28051: Typo and broken links in page "What's New In Python 3.5"
     http://bugs.python.org/issue28051  closed by  benjamin.peterson
#28056: sizeof unit tests fail on ARMv7
     http://bugs.python.org/issue28056  closed by  haypo
#28057: Warnings (45) building Doc/library/email.*.rst
     http://bugs.python.org/issue28057  closed by  terry.reedy
#28059: Windows: test_platform.test_architecture_via_symlink() regress
     http://bugs.python.org/issue28059  closed by  python-dev
#28061: Compact dict bug on Windows (Visual Studio): if (mp->ma_keys->
     http://bugs.python.org/issue28061  closed by  ebarry
#28063: Adding a mutually exclusive group to an argument group results
     http://bugs.python.org/issue28063  closed by  John.Didion
#28064: String executed inside a function ignores global statements
     http://bugs.python.org/issue28064  closed by  xiang.zhang
#28065: Update Windows build to xz-5.2.2
     http://bugs.python.org/issue28065  closed by  zach.ware
#28066: [Patch] Fix the ability to cross compile Python when doing a r
     http://bugs.python.org/issue28066  closed by  martin.panter
#28067: Do not call localtime (gmtime) in datetime module
     http://bugs.python.org/issue28067  closed by  python-dev
#28070: 3.6 regression: re.compile not handling flags with X correctly
     http://bugs.python.org/issue28070  closed by  serhiy.storchaka
#28071: Stop set.difference when set is empty
     http://bugs.python.org/issue28071  closed by  rhettinger
#28072: Empty Strings are not parsed to None.
     http://bugs.python.org/issue28072  closed by  ethan.furman
#28073: Update documentation about None vs type(None) in typing
     http://bugs.python.org/issue28073  closed by  gvanrossum
#28076: Variable annotations should be mangled for private names
     http://bugs.python.org/issue28076  closed by  gvanrossum
#28077: Fix find_empty_slot in dictobject
     http://bugs.python.org/issue28077  closed by  haypo
#28078: Silence resource warnings in test_socket
     http://bugs.python.org/issue28078  closed by  christian.heimes
#28079: Update typing and test typing from python/typing repo
     http://bugs.python.org/issue28079  closed by  gvanrossum
#28081: [Patch] timemodule: Complete Autoconf bits for clock_*() funct
     http://bugs.python.org/issue28081  closed by  python-dev
#28084: Fatal Python error: Py_EndInterpreter: not the last thread
     http://bugs.python.org/issue28084  closed by  martin.panter
#28093: ResourceWarning in test_ssl
     http://bugs.python.org/issue28093  closed by  xiang.zhang
#28094: Document behaviour of Process.join() in multiprocessing
     http://bugs.python.org/issue28094  closed by  berker.peksag
#28096: set.difference() is not interruptible
     http://bugs.python.org/issue28096  closed by  rhettinger
#28098: sys.getsizeof(0) is incorrect
     http://bugs.python.org/issue28098  closed by  mark.dickinson
#28101: Add utf-8 alias to aliases.py dictionary
     http://bugs.python.org/issue28101  closed by  haypo
#28102: zipfile.py script should print usage to stderr
     http://bugs.python.org/issue28102  closed by  serhiy.storchaka
#28103: Style fix in zipfile.rst
     http://bugs.python.org/issue28103  closed by  berker.peksag
#28104: Set documentation is incorrect
     http://bugs.python.org/issue28104  closed by  rhettinger
#28105: warning: 'nkwargs' may be used uninitialized
     http://bugs.python.org/issue28105  closed by  haypo
#28106: [Benchmarks] Add --testonly argument to perf.py to run benchma
     http://bugs.python.org/issue28106  closed by  haypo
#28109: What's new item for PEP 526 -- Variable annotations
     http://bugs.python.org/issue28109  closed by  gvanrossum
#28112: Add callback to functools.lru_cache
     http://bugs.python.org/issue28112  closed by  rhettinger
#28114: parse_envlist(): os.execve(), os.spawnve(), etc. crash in Pyth
     http://bugs.python.org/issue28114  closed by  berker.peksag
#28116: Error in what's new - PEP 515
     http://bugs.python.org/issue28116  closed by  brett.cannon
#28118: type-limits warning in PyMem_New() _ssl_locks_count
     http://bugs.python.org/issue28118  closed by  christian.heimes
#28119: Explicit null dereferenced in formatter_unicode.c
     http://bugs.python.org/issue28119  closed by  christian.heimes
#28120: Bug in _PyDict_Pop() on a splitted table
     http://bugs.python.org/issue28120  closed by  haypo
#28122: email.header.decode_header can not decode string with quotatio
     http://bugs.python.org/issue28122  closed by  r.david.murray
#28126: Py_MEMCPY: Use memcpy on Windows?
     http://bugs.python.org/issue28126  closed by  christian.heimes
#28127: Add _PyDict_CheckConsistency()
     http://bugs.python.org/issue28127  closed by  haypo
#28131: assert statements missed when loaded by zipimporter
     http://bugs.python.org/issue28131  closed by  berker.peksag
#28133: spam
     http://bugs.python.org/issue28133  closed by  SilentGhost
#28135: assertRaises should return the exception in its simple form
     http://bugs.python.org/issue28135  closed by  r.david.murray
#28136: RegEx documentation error
     http://bugs.python.org/issue28136  closed by  SilentGhost
#28142: windows installer not adding PYTHONHOME
     http://bugs.python.org/issue28142  closed by  eryksun
#28150: Error CERTIFICATE_VERIFY_FAILED in macOS
     http://bugs.python.org/issue28150  closed by  ned.deily
#28153: [Patch] selectmodule: Make kqueue()'s event filters optional
     http://bugs.python.org/issue28153  closed by  berker.peksag
#28154: Core dump after importing lxml in Python3.6b
     http://bugs.python.org/issue28154  closed by  philip.stefanov
#28155: Small typo in Json docs
     http://bugs.python.org/issue28155  closed by  SilentGhost
#28156: [Patch] posixmodule: Make the presence of os.getpid() optional
     http://bugs.python.org/issue28156  closed by  berker.peksag
#28160: Python -V and --version output to stderr instead of stdout
     http://bugs.python.org/issue28160  closed by  berker.peksag
#28170: SystemError:

References: <942D57F5-BA76-49CB-B3DB-18E2D6F12AC4@python.org>
Message-ID: 

On 15.09.16 09:48, Berker Peksağ wrote:
> Fixed, it should redirect to https://www.python.org/blogs/ now. Thanks
> for noticing this!

Thanks Berker!

From eric at trueblade.com  Mon Sep 19 04:35:41 2016
From: eric at trueblade.com (Eric V.
Smith)
Date: Mon, 19 Sep 2016 04:35:41 -0400
Subject: [Python-Dev] [Python-checkins] cpython (2.7): properly handle the
 single null-byte file (closes #24022)
In-Reply-To: <20160919064424.81528.32543.0663C2E4@psf.io>
References: <20160919064424.81528.32543.0663C2E4@psf.io>
Message-ID: <52339026-F471-41EF-BAC2-DB948907D6E7@trueblade.com>

Shouldn't there be a test added for this?

--
Eric.

> On Sep 19, 2016, at 2:44 AM, benjamin.peterson wrote:
>
> https://hg.python.org/cpython/rev/c6438a3df7a4
> changeset:   103950:c6438a3df7a4
> branch:      2.7
> parent:      103927:a8771f230c06
> user:        Benjamin Peterson
> date:        Sun Sep 18 23:41:11 2016 -0700
> summary:
>   properly handle the single null-byte file (closes #24022)
>
> files:
>   Parser/tokenizer.c | 2 +-
>   1 files changed, 1 insertions(+), 1 deletions(-)
>
>
> diff --git a/Parser/tokenizer.c b/Parser/tokenizer.c
> --- a/Parser/tokenizer.c
> +++ b/Parser/tokenizer.c
> @@ -951,7 +951,7 @@
>          else {
>              tok->done = E_OK;
>              tok->inp = strchr(tok->buf, '\0');
> -            done = tok->inp[-1] == '\n';
> +            done = tok->inp == tok->buf || tok->inp[-1] == '\n';
>          }
>      }
>      else {
>
> --
> Repository URL: https://hg.python.org/cpython
> _______________________________________________
> Python-checkins mailing list
> Python-checkins at python.org
> https://mail.python.org/mailman/listinfo/python-checkins
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From berker.peksag at gmail.com  Mon Sep 19 04:56:47 2016
From: berker.peksag at gmail.com (=?UTF-8?Q?Berker_Peksa=C4=9F?=)
Date: Mon, 19 Sep 2016 11:56:47 +0300
Subject: [Python-Dev] [Python-checkins] cpython: Use HTTP in testPythonOrg
In-Reply-To: 
References: <20160911124611.48720.70550.8327B58D@psf.io>
Message-ID: 

On Sun, Sep 11, 2016 at 3:58 PM, Eric V. Smith wrote:
> Hi, Berker.
>
> Could you add a comment to the test on why this should use http? I can see
> this bouncing back and forth between http and https, as people clean up
> all http usages to be https.
Hi Eric, Sorry, I missed your email. Victor's analysis is correct. I've changed the test to use pythontest.net and increased the test coverage in http://bugs.python.org/issue28151. Thank you for the review! --Berker From wolfgang.maier at biologie.uni-freiburg.de Mon Sep 19 13:29:08 2016 From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier) Date: Mon, 19 Sep 2016 19:29:08 +0200 Subject: [Python-Dev] docs.python.org problem Message-ID: Dear all, FYI, https://docs.python.org/3.6/ is currently pointing to the Python 3.7.0a0 documentation Best, Wolfgang From nad at python.org Mon Sep 19 14:07:06 2016 From: nad at python.org (Ned Deily) Date: Mon, 19 Sep 2016 14:07:06 -0400 Subject: [Python-Dev] docs.python.org problem In-Reply-To: References: Message-ID: <17E5F28B-B280-4652-85BC-D61BDBB7FFA5@python.org> On Sep 19, 2016, at 13:29, Wolfgang Maier wrote: > FYI, https://docs.python.org/3.6/ is currently pointing to the Python 3.7.0a0 documentation Thanks for the report. Working on it. --Ned -- Ned Deily nad at python.org -- [] From guido at python.org Mon Sep 19 14:07:13 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 19 Sep 2016 11:07:13 -0700 Subject: [Python-Dev] docs.python.org problem In-Reply-To: References: Message-ID: I've filed https://github.com/python/pythondotorg/issues/1014, not sure if that's the right tracker though. On Mon, Sep 19, 2016 at 10:29 AM, Wolfgang Maier < wolfgang.maier at biologie.uni-freiburg.de> wrote: > Dear all, > > FYI, https://docs.python.org/3.6/ is currently pointing to the Python > 3.7.0a0 documentation > > Best, > Wolfgang > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido% > 40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From benjamin at python.org  Tue Sep 20 01:07:37 2016
From: benjamin at python.org (Benjamin Peterson)
Date: Mon, 19 Sep 2016 22:07:37 -0700
Subject: [Python-Dev] [Python-checkins] cpython (2.7): properly handle the
 single null-byte file (closes #24022)
In-Reply-To: <52339026-F471-41EF-BAC2-DB948907D6E7@trueblade.com>
References: <20160919064424.81528.32543.0663C2E4@psf.io>
 <52339026-F471-41EF-BAC2-DB948907D6E7@trueblade.com>
Message-ID: <1474348057.605572.730999473.612D84D6@webmail.messagingengine.com>

On Mon, Sep 19, 2016, at 01:35, Eric V. Smith wrote:
> Shouldn't there be a test added for this?

In fact, there is one: test_particularly_evil_undecodable in
test_compile.py. No one has managed to make Python crash by exploiting
this particular problem; it's just ASan complaints.

From dimaqq at gmail.com  Tue Sep 20 05:56:26 2016
From: dimaqq at gmail.com (Dima Tisnek)
Date: Tue, 20 Sep 2016 11:56:26 +0200
Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private
 version; and keywords become ordered
In-Reply-To: 
References: <20160915154640.68df4399@fsol>
 <7BDDAF13-0C33-44A2-BBA8-82C5D8C7587E@gmail.com>
 <20160915181354.01b34ce3@fsol>
Message-ID: 

Totally random thought:

Can lru_cache be simplified to use an ordered dict instead of dict +
linked list?

On 15 September 2016 at 20:30, Serhiy Storchaka wrote:
> On 15.09.16 19:13, Antoine Pitrou wrote:
>>
>> Since this micro-benchmark creates the keys in order just before
>> filling the dict with them, randomizing the insertion order destroys
>> the temporal locality of object header accesses when iterating over the
>> dict keys. *This* looks like the right explanation, not branch
>> mispredicts due to NULL checks.
>>
>> This also shows that a micro-benchmark that merely looks ok can actually
>> be a terrible proxy of actual performance.
>
>
> Thank you for the great explanation, Antoine!
> I came to the same conclusions
> about the randomized integers example, but didn't notice that this is also a
> main cause of the speed-up in the strings example.
>
>> As a further validation of this theory, let's dramatically decrease the
>> working set size on the initial benchmark:
>>
>> $ ./python -m timeit -s "d=dict.fromkeys(map(str,range(10**3)))"
>> "list(d)"
>>
>> -> Python 3.5: 100000 loops, best of 3: 10.9 usec per loop
>> -> Python 3.6: 100000 loops, best of 3: 9.72 usec per loop
>>
>> When the working set fits in the cache, this micro-benchmark is
>> only 12% slower on 3.5 compared to 3.6.
>> *This* much smaller difference (a mere 1.2ns difference per dict
>> element) could be attributed to eliminating the NULL checks, or to any
>> other streamlining of the core iteration logic.
>
> Yet one more example, with random hashes and an insertion order independent
> of the creation order:
>
> $ ./python -m timeit -s "import random; a = list(map(str, range(10**6)));
> random.shuffle(a); d = dict.fromkeys(a)" -- "list(d)"
>
> Python 3.5: 180, 180, 180 msec per loop
> Python 3.6: 171, 172, 171 msec per loop
>
> Python 3.6 is 5% faster, and this looks closer to the actual performance.
From songofacandy at gmail.com  Tue Sep 20 07:11:52 2016
From: songofacandy at gmail.com (INADA Naoki)
Date: Tue, 20 Sep 2016 20:11:52 +0900
Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private
 version; and keywords become ordered
In-Reply-To: 
References: <20160915154640.68df4399@fsol>
 <7BDDAF13-0C33-44A2-BBA8-82C5D8C7587E@gmail.com>
 <20160915181354.01b34ce3@fsol>
Message-ID: 

On Tue, Sep 20, 2016 at 7:02 PM, INADA Naoki wrote:
> On Tue, Sep 20, 2016 at 6:56 PM, Dima Tisnek wrote:
>> Totally random thought:
>>
>> Can lru_cache be simplified to use an ordered dict instead of dict +
>> linked list?
>>
>
> I think so.
> See also: http://bugs.python.org/issue28199#msg276938
>

FYI, the current dict implementation is not optimized for removing the
first item like this:

```
// When the cache hits max_size: drop the oldest (first-inserted) key.
Py_ssize_t pos = 0;
PyObject *key;
if (PyDict_Next(d, &pos, &key, NULL)) {
    if (PyDict_DelItem(d, key) < 0) {
        // error.
    }
}
```

So, before changing the lru_cache implementation, I (or someone else)
should rewrite OrderedDict so that it has an O(1) "remove first item"
method (at least when max_size is not None).

But the OrderedDict and lru_cache improvements can't both be in 3.6,
since 3.6 is in beta now. I'll try it after 3.6rc1.
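[Editor's note: to make the suggestion in this thread concrete, here is a
minimal, hypothetical sketch of an LRU cache built on
collections.OrderedDict. It is an illustration only, not the actual
functools.lru_cache implementation (which uses a plain dict plus a
doubly-linked list) and not code from this thread: move_to_end() handles
the "hit" case, and popitem(last=False) is the "remove first item"
operation discussed above.]

```python
from collections import OrderedDict

class SimpleLRU:
    """Illustrative LRU cache on top of OrderedDict (not functools.lru_cache)."""

    def __init__(self, maxsize=128):
        self.maxsize = maxsize
        self._data = OrderedDict()

    def get(self, key, default=None):
        try:
            value = self._data[key]
        except KeyError:
            return default
        # On a hit, mark the entry as most recently used.
        self._data.move_to_end(key)
        return value

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.maxsize:
            # Evict the least recently used entry (the first item).
            self._data.popitem(last=False)

cache = SimpleLRU(maxsize=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")            # touch "a" so "b" becomes least recently used
cache.put("c", 3)         # evicts "b"
print(list(cache._data))  # prints ['a', 'c']
```

With this layout both the hit path and the eviction path are O(1), which is
the property the linked list in today's functools.lru_cache provides.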
-- INADA Naoki From ericsnowcurrently at gmail.com Tue Sep 20 11:17:01 2016 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 20 Sep 2016 09:17:01 -0600 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160915154640.68df4399@fsol> <7BDDAF13-0C33-44A2-BBA8-82C5D8C7587E@gmail.com> <20160915181354.01b34ce3@fsol> Message-ID: On Tue, Sep 20, 2016 at 5:11 AM, INADA Naoki wrote: > But both of OrderedDict and lru_cache improvements can't be in 3.6 > since 3.6 is beta now. > I'll try it after 3.6rc1. When you do, make sure you keep in mind the performance constraints of *all* the OrderedDict methods. The constraints are discussed somewhat at the top of https://hg.python.org/cpython/file/tip/Objects/odictobject.c. -eric From fijall at gmail.com Tue Sep 20 15:01:16 2016 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 20 Sep 2016 21:01:16 +0200 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <842DEA6E-1A77-48B0-9AF5-FAA6EBF8A599@gmail.com> Message-ID: On Thu, Sep 15, 2016 at 1:27 PM, Paul Moore wrote: > On 15 September 2016 at 10:43, Raymond Hettinger > wrote: >> Something like this will reveal the true and massive improvement in iteration speed: >> >> $ ./python.exe -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" > >>py -3.5 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" > 10 loops, best of 3: 66.2 msec per loop >>py -3.6 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" > 10 loops, best of 3: 27.8 msec per loop > > And for Victor: > >>py -3.5 -m perf timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" > .................... > Median +- std dev: 65.7 ms +- 3.8 ms >>py -3.6 -m perf timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)" > .................... 
> Median +- std dev: 27.9 ms +- 1.2 ms > > Just as a side point, perf provided essentially identical results but > took 2 minutes as opposed to 8 seconds for timeit to do so. I > understand why perf is better, and I appreciate all the work Victor > did to create it, and analyze the results, but for getting a quick > impression of how a microbenchmark performs, I don't see timeit as > being *quite* as bad as Victor is claiming. > > I will tend to use perf now that I have it installed, and now that I > know how to run a published timeit invocation using perf. It's a > really cool tool. But I certainly won't object to seeing people > publish timeit results (any more than I'd object to *any* > mirobenchmark). > > Paul How about we just make timeit show average and not disable the GC then (two of the complaints that will not change the execution time)? From christian at python.org Wed Sep 21 05:06:27 2016 From: christian at python.org (Christian Heimes) Date: Wed, 21 Sep 2016 11:06:27 +0200 Subject: [Python-Dev] cpython (3.6): replace usage of Py_VA_COPY with the (C99) standard va_copy In-Reply-To: <20160921033951.19751.13445.DE725F99@psf.io> References: <20160921033951.19751.13445.DE725F99@psf.io> Message-ID: <4f0bb1cc-9e6e-feb9-f901-69dd1206ccf0@python.org> On 2016-09-21 05:39, benjamin.peterson wrote: > https://hg.python.org/cpython/rev/278b21d8e86e > changeset: 103977:278b21d8e86e > branch: 3.6 > parent: 103975:d31b4de433b7 > user: Benjamin Peterson > date: Tue Sep 20 20:39:33 2016 -0700 > summary: > replace usage of Py_VA_COPY with the (C99) standard va_copy Thanks! Coverity has been complaining about Py_VA_COPY() for a long time. Your change may cause a memory leak on some platforms. You must va_end() a va_copy() region: Each invocation of va_copy() must be matched by a corresponding invocation of va_end() in the same function. 
https://linux.die.net/man/3/va_copy From dimaqq at gmail.com Wed Sep 21 06:10:33 2016 From: dimaqq at gmail.com (Dima Tisnek) Date: Wed, 21 Sep 2016 12:10:33 +0200 Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered In-Reply-To: References: <20160915154640.68df4399@fsol> <7BDDAF13-0C33-44A2-BBA8-82C5D8C7587E@gmail.com> <20160915181354.01b34ce3@fsol> Message-ID: I guess what `lru_cache` needs is atomic push-pop: on hit: pop(this) + push_back(this) on miss: pop_front() + push_back(this) I reckon, if flat array is lazy (i.e. can grow larger than no. of keys), then *amortised* push-pop performance is not hard to achieve. Overall, it sounds more like heap queue; And it's a great example of feature creep -- once ordered dicts are builtin, every one and their niece wants to use them, not necessarily what they were originally envisioned for. By comparison, **kwargs and **meta are statistically mostly immutable. Perhaps distinct specialisations are better? On 20 September 2016 at 13:11, INADA Naoki wrote: > On Tue, Sep 20, 2016 at 7:02 PM, INADA Naoki wrote: >> On Tue, Sep 20, 2016 at 6:56 PM, Dima Tisnek wrote: >>> Totally random thought: >>> >>> Can lru_cache be simplified to use an ordered dict instead of dict + >>> linked list? >>> >> >> I think so. >> See also: http://bugs.python.org/issue28199#msg276938 >> > > FYI, current dict implementation is not optimized for removing first > item like this: > > ``` > // When hit max_size > Py_ssize_t pos; > PyObject *key; > if (PyDict_Next(d, &pos, &key, NULL)) { > if (PyDict_DelItem(key) < 0) { > // error. > } > } > ``` > > So, before changing lru_cache implementation, I (or someone else) should rewrite > OrderedDict which has O(1) "remove first item" method. (At least > max_size is not None). > > But both of OrderedDict and lru_cache improvements can't be in 3.6 > since 3.6 is beta now. > I'll try it after 3.6rc1. 
> --
> INADA Naoki

From victor.stinner at gmail.com  Wed Sep 21 06:42:22 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 21 Sep 2016 12:42:22 +0200
Subject: [Python-Dev] cpython (3.6): replace usage of Py_VA_COPY with the
 (C99) standard va_copy
In-Reply-To: 
References: <20160921033951.19751.13445.DE725F99@psf.io>
 <4f0bb1cc-9e6e-feb9-f901-69dd1206ccf0@python.org>
Message-ID: 

I see that the old macro is now an alias to va_copy(). A similar change
was done for Py_MEMCPY(). Would it make sense to put these old macros in
a new backward_compat.h header, so maybe one day we can remove them? :-)

Maybe we need at least a comment mentioning when (i.e. in which Python
version) the macro became an alias.

Victor
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From victor.stinner at gmail.com  Wed Sep 21 10:11:34 2016
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 21 Sep 2016 16:11:34 +0200
Subject: [Python-Dev] Python 3.6 dict becomes compact and gets a private
 version; and keywords become ordered
In-Reply-To: 
References: <842DEA6E-1A77-48B0-9AF5-FAA6EBF8A599@gmail.com>
Message-ID: 

2016-09-20 21:01 GMT+02:00 Maciej Fijalkowski :
> How about we just make timeit show average and not disable the GC then
> (two of the complaints that will not change the execution time)?

Thanks for the reminder. The first part of my plan was to write a new
module to experiment with changes. This part is done: it's the new perf
module available on PyPI (it works on Python 2.7-3.7). The second part
of my plan was to enhance the stdlib, so here it is:
"Enhance the timeit module: display average +- std dev instead of minimum"
http://bugs.python.org/issue28240

Victor

From tchappui at gmail.com  Wed Sep 21 07:22:58 2016
From: tchappui at gmail.com (Thierry Chappuis)
Date: Wed, 21 Sep 2016 13:22:58 +0200
Subject: [Python-Dev] cpython (3.6): replace usage of Py_VA_COPY with the
 (C99) standard va_copy
In-Reply-To: 
References: <20160921033951.19751.13445.DE725F99@psf.io>
 <4f0bb1cc-9e6e-feb9-f901-69dd1206ccf0@python.org>
Message-ID: <6981ad96-7730-4422-b5f5-5c51ae16880f@gmail.com>

Hello,

C99 has seen slow adoption by Microsoft compilers on Windows. On this
platform, support for va_copy() is recent and started with Visual Studio
2013. Therefore, starting from Python 3.5, Py_VA_COPY can now be mapped
directly to the native implementation of va_copy(). Hence, the proposed
change might be justified.

Best wishes
Thierry

On Wed, Sep 21, 2016 at 12:42pm, Victor Stinner
<victor.stinner at gmail.com> wrote:

I see that the old macro is now an alias to va_copy(). A similar change
was done for Py_MEMCPY(). Would it make sense to put these old macros in
a new backward_compat.h header, so maybe one day we can remove them? :-)

Maybe we need at least a comment mentioning when (i.e. in which Python
version) the macro became an alias.

Victor
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From benjamin at python.org Thu Sep 22 02:01:01 2016 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 21 Sep 2016 23:01:01 -0700 Subject: [Python-Dev] cpython (3.6): replace usage of Py_VA_COPY with the (C99) standard va_copy In-Reply-To: <4f0bb1cc-9e6e-feb9-f901-69dd1206ccf0@python.org> References: <20160921033951.19751.13445.DE725F99@psf.io> <4f0bb1cc-9e6e-feb9-f901-69dd1206ccf0@python.org> Message-ID: <1474524061.1330260.733416233.6E6B9906@webmail.messagingengine.com> On Wed, Sep 21, 2016, at 02:06, Christian Heimes wrote: > On 2016-09-21 05:39, benjamin.peterson wrote: > > https://hg.python.org/cpython/rev/278b21d8e86e > > changeset: 103977:278b21d8e86e > > branch: 3.6 > > parent: 103975:d31b4de433b7 > > user: Benjamin Peterson > > date: Tue Sep 20 20:39:33 2016 -0700 > > summary: > > replace usage of Py_VA_COPY with the (C99) standard va_copy > > Thanks! Coverity has been complaining about Py_VA_COPY() for a long > time. Your change may cause a memory leak on some platforms. You must > va_end() a va_copy() region: Yep. Thanks for fixing that. I'm not actually aware of any platform where va_end() frees anything, but it's the right thing to do. From benjamin at python.org Thu Sep 22 02:02:40 2016 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 21 Sep 2016 23:02:40 -0700 Subject: [Python-Dev] cpython (3.6): replace usage of Py_VA_COPY with the (C99) standard va_copy In-Reply-To: References: <20160921033951.19751.13445.DE725F99@psf.io> <4f0bb1cc-9e6e-feb9-f901-69dd1206ccf0@python.org> Message-ID: <1474524160.1330752.733418297.54E48F04@webmail.messagingengine.com> On Wed, Sep 21, 2016, at 03:42, Victor Stinner wrote: > I see that the old macro is now an alias to va_copy(). A similar change > was > done for Py_MEMCPY(). Would it make sense to put these old macros in a > new > backward_compat.h header, so maybe one day we can remove them? :-) That's fine with me, though, the maintenance burden of them is precisely one line. 
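Christian's point about va_end() can be illustrated with a minimal sketch (an illustrative helper, not CPython code): C99 requires every va_copy() to be paired with its own va_end(), even on the many platforms where va_end() expands to nothing.

```c
#include <stdarg.h>

/* Illustrative only -- not CPython code. A variadic helper that
 * copies its argument list before consuming it. The copy made by
 * va_copy() must be released with va_end(), which is exactly the
 * cleanup Coverity complains about when it is missing. */
static int sum_ints(int count, ...)
{
    va_list args, args_copy;
    int total = 0;

    va_start(args, count);
    va_copy(args_copy, args);   /* the C99 standard form, ex-Py_VA_COPY */

    for (int i = 0; i < count; i++)
        total += va_arg(args_copy, int);

    va_end(args_copy);          /* required cleanup for the copy */
    va_end(args);
    return total;
}
```

Here sum_ints(3, 1, 2, 3) returns 6; forgetting the va_end() on the copy is harmless on most platforms but a real leak on those where a va_list allocates.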
Just dump the compat macros in Python 4.0 I think. From victor.stinner at gmail.com Thu Sep 22 07:44:00 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 22 Sep 2016 13:44:00 +0200 Subject: [Python-Dev] cpython (3.6): replace usage of Py_VA_COPY with the (C99) standard va_copy In-Reply-To: <1474524160.1330752.733418297.54E48F04@webmail.messagingengine.com> References: <20160921033951.19751.13445.DE725F99@psf.io> <4f0bb1cc-9e6e-feb9-f901-69dd1206ccf0@python.org> <1474524160.1330752.733418297.54E48F04@webmail.messagingengine.com> Message-ID: 2016-09-22 8:02 GMT+02:00 Benjamin Peterson : > Just dump the compat macros in Python 4.0 I think. Please don't. Python 3 was so painful because we decided to make millions of tiny backward incompatible changes. To have a smooth Python 4.0 release, we should only remove things which were already deprecated since at least 2 cycles, and well documented as deprecated. Note: The Gtk project has similar questions on backward compatibility ;-) https://blogs.gnome.org/desrt/2016/06/13/gtk-4-0-is-not-gtk-4/ (Migration to Gtk3 was also painful for developers, no?) Victor From benjamin at python.org Fri Sep 23 02:47:20 2016 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 22 Sep 2016 23:47:20 -0700 Subject: [Python-Dev] cpython (3.6): replace usage of Py_VA_COPY with the (C99) standard va_copy In-Reply-To: References: <20160921033951.19751.13445.DE725F99@psf.io> <4f0bb1cc-9e6e-feb9-f901-69dd1206ccf0@python.org> <1474524160.1330752.733418297.54E48F04@webmail.messagingengine.com> Message-ID: <1474613240.146291.734573889.64CB4FDB@webmail.messagingengine.com> On Thu, Sep 22, 2016, at 04:44, Victor Stinner wrote: > 2016-09-22 8:02 GMT+02:00 Benjamin Peterson : > > Just dump the compat macros in Python 4.0 I think. > > Please don't. Python 3 was so painful because we decided to make > millions of tiny backward incompatible changes. 
To have a smooth > Python 4.0 release, we should only remove things which were already > deprecated since at least 2 cycles, and well documented as deprecated. I'm being flippant here because of the triviality of the change. Anyone using Py_VA_COPY or Py_MEMCPY can fix their code in a backwards and forwards compatible manner in 7 seconds with a sed command. From rosuav at gmail.com Fri Sep 23 03:03:20 2016 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 23 Sep 2016 17:03:20 +1000 Subject: [Python-Dev] cpython (3.6): replace usage of Py_VA_COPY with the (C99) standard va_copy In-Reply-To: <1474613240.146291.734573889.64CB4FDB@webmail.messagingengine.com> References: <20160921033951.19751.13445.DE725F99@psf.io> <4f0bb1cc-9e6e-feb9-f901-69dd1206ccf0@python.org> <1474524160.1330752.733418297.54E48F04@webmail.messagingengine.com> <1474613240.146291.734573889.64CB4FDB@webmail.messagingengine.com> Message-ID: On Fri, Sep 23, 2016 at 4:47 PM, Benjamin Peterson wrote: > On Thu, Sep 22, 2016, at 04:44, Victor Stinner wrote: >> 2016-09-22 8:02 GMT+02:00 Benjamin Peterson : >> > Just dump the compat macros in Python 4.0 I think. >> >> Please don't. Python 3 was so painful because we decided to make >> millions of tiny backward incompatible changes. To have a smooth >> Python 4.0 release, we should only remove things which were already >> deprecated since at least 2 cycles, and well documented as deprecated. > > I'm being flippant here because of the triviality of the change. Anyone > using Py_VA_COPY or Py_MEMCPY can fix their code in a backwards and > forwards compatible manner in 7 seconds with a sed command. In fact, this kind of thing would be perfect for Python 4.0 - it's technically backward incompatible (thus justifying the 4.0 number), but removes only things that have been deprecated for some time, and have simple and direct translations. 
ChrisA From victor.stinner at gmail.com Fri Sep 23 03:04:03 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 23 Sep 2016 09:04:03 +0200 Subject: [Python-Dev] cpython (3.6): replace usage of Py_VA_COPY with the (C99) standard va_copy In-Reply-To: <1474613240.146291.734573889.64CB4FDB@webmail.messagingengine.com> References: <20160921033951.19751.13445.DE725F99@psf.io> <4f0bb1cc-9e6e-feb9-f901-69dd1206ccf0@python.org> <1474524160.1330752.733418297.54E48F04@webmail.messagingengine.com> <1474613240.146291.734573889.64CB4FDB@webmail.messagingengine.com> Message-ID: 2016-09-23 8:47 GMT+02:00 Benjamin Peterson : > I'm being flippant here because of the triviality of the change. Anyone > using Py_VA_COPY or Py_MEMCPY can fix their code in a backwards and > forwards compatible manner in 7 seconds with a sed command. Python 3 had the same argument with 2to3: run 2to3 once, and you are done. C99 is a new thing for Python >= 3.6, but when you want to support Python 2.7 and 3.5, you are stuck at Visual Studio 2010 which is less happy with C99 than VS 2015... Hum, I don't recall if Python 2.7 requires VS 2010 or 2008? 
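As an aside, an extension that wants to keep building against both old and new headers does not even need a sed run; a fallback definition works in both directions (a sketch, not something proposed in the thread -- whether the headers still provide the macros is settled at preprocessing time):

```c
#include <string.h>   /* memcpy */
#include <stdarg.h>   /* va_copy */

/* Sketch: if the CPython headers no longer provide the old compat
 * macros, define them in terms of their standard equivalents so
 * that existing source code keeps compiling unchanged. */
#ifndef Py_MEMCPY
#define Py_MEMCPY memcpy
#endif

#ifndef Py_VA_COPY
#define Py_VA_COPY va_copy
#endif
```

With this in a private header, code written against the old macro names builds whether or not a future release removes them.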
Python 2.7 doesn't seem to be mentioned in the dev guide :-/ https://docs.python.org/devguide/setup.html#windows-compiling Victor From tchappui at gmail.com Fri Sep 23 03:23:33 2016 From: tchappui at gmail.com (tchappui at gmail.com) Date: Fri, 23 Sep 2016 07:23:33 +0000 (UTC) Subject: [Python-Dev] cpython (3.6): replace usage of Py_VA_COPY with the (C99) standard va_copy In-Reply-To: References: <20160921033951.19751.13445.DE725F99@psf.io> <4f0bb1cc-9e6e-feb9-f901-69dd1206ccf0@python.org> <1474524160.1330752.733418297.54E48F04@webmail.messagingengine.com> <1474613240.146291.734573889.64CB4FDB@webmail.messagingengine.com> Message-ID: <8525F8D32715CFD1.909b2e83-f183-433e-88c1-42209f64ee52@mail.outlook.com> Hi, Python 2.7 requires VS 2008 as Microsoft provides a specific bundle https://www.microsoft.com/en-us/download/details.aspx?id=44266 Kind regards Thierry On Fri, Sep 23, 2016 at 9:05 AM +0200, "Victor Stinner" wrote: 2016-09-23 8:47 GMT+02:00 Benjamin Peterson : > I'm being flippant here because of the triviality of the change. Anyone > using Py_VA_COPY or Py_MEMCPY can fix their code in a backwards and > forwards compatible manner in 7 seconds with a sed command. Python 3 had the same argument with 2to3: run 2to3 once, and you are done. C99 is a new thing for Python >= 3.6, but when you want to support Python 2.7 and 3.5, you are stuck at Visual Studio 2010 which is less happy with C99 than VS 2015... Hum, I don't recall if Python 2.7 requires VS 2010 or 2008? Python 2.7 doesn't seem to be mentioned in the dev guide :-/ https://docs.python.org/devguide/setup.html#windows-compiling Victor -------------- next part -------------- An HTML attachment was scrubbed...
URL: From victor.stinner at gmail.com Fri Sep 23 04:08:15 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 23 Sep 2016 10:08:15 +0200 Subject: [Python-Dev] cpython (3.6): replace usage of Py_VA_COPY with the (C99) standard va_copy In-Reply-To: References: <20160921033951.19751.13445.DE725F99@psf.io> <4f0bb1cc-9e6e-feb9-f901-69dd1206ccf0@python.org> <1474524160.1330752.733418297.54E48F04@webmail.messagingengine.com> <1474613240.146291.734573889.64CB4FDB@webmail.messagingengine.com> Message-ID: 2016-09-23 9:03 GMT+02:00 Chris Angelico : > In fact, this kind of thing would be perfect for Python 4.0 - it's > technically backward incompatible (thus justifying the 4.0 number), > but removes only things that have been deprecated for some time, and > have simple and direct translations. Sorry, you missed my point. No, I'm strongly opposed to yet another "break the world" major release. Python 4 must not introduce deliberate backward-incompatible changes for the purity of the code. The main lesson learnt from Python 3.0 is that practicality beats purity. For me, it's fine to remove deprecated things, but only if we respect the usual smooth deprecation planning: * pending deprecation * deprecation * removal The strict minimum is a deprecation in one cycle, but it's better to use 3 cycles (Python 3.x releases) for a smooth transition. For Py_MEMCPY and Py_VA_COPY, the maintenance burden is gone: these macros are now dumb aliases. I would prefer to wait until Python 2.7 is dead 7 times over before removing them. But I'm proposing to keep track of such backward-compatibility macros as a reminder that some day we should think about how to remove them. By the way, GCC has a neat "deprecated" attribute. Maybe we should start to use it? https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes Currently, we don't technically warn developers of C extensions.
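For illustration, the attribute Victor refers to could be wired up roughly like this (a sketch with made-up macro and function names, not an actual CPython patch; Clang accepts the same attribute, and the message form needs GCC 4.5 or later):

```c
/* Sketch: emit a compile-time warning whenever a legacy API is
 * used. GCC and Clang understand __attribute__((deprecated));
 * other compilers get a no-op fallback. */
#if defined(__GNUC__)
#  define Py_DEPRECATED_API(msg) __attribute__((deprecated(msg)))
#else
#  define Py_DEPRECATED_API(msg)
#endif

/* Hypothetical legacy helper kept only for backward compatibility.
 * Any call site now triggers something like:
 *   warning: 'legacy_helper' is deprecated: use the C99 facility directly
 * while still compiling and running normally. */
Py_DEPRECATED_API("use the C99 facility directly")
static int legacy_helper(int x)
{
    return x + 1;
}
```

The warning is advisory: existing extensions keep building, but maintainers who read their build logs get the hint that Victor describes.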
The best we can do is to document C API deprecations in What's New in Python 3.x and in the C API documentation... but who still reads that documentation once their C extension is stable and "just works"? A warning during compilation would be a nice hint that something is wrong and should be fixed. Oh... I already proposed to use __attribute__((deprecated)) in November 2013 :-) http://bugs.python.org/issue19569 Victor From victor.stinner at gmail.com Fri Sep 23 04:33:12 2016 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 23 Sep 2016 10:33:12 +0200 Subject: [Python-Dev] OpenIndiana and Solaris support Message-ID: Hi, My question is simple: do we officially support Solaris and/or OpenIndiana? Jesus Cea runs an OpenIndiana buildbot slave: http://buildbot.python.org/all/buildslaves/cea-indiana-x86 "Open Indiana 32 bits" The platform module of Python says "Solaris-2.11"; I don't know the exact OpenIndiana version. A lot of unit tests fail on this buildbot with MemoryError. I guess it's related to Solaris not allowing overcommit (allocating more memory than is available on the system), or more simply because the slave does not have enough memory. There is now an issue which seems specific to OpenIndiana: http://bugs.python.org/issue27847 It might impact Solaris as well, but the Solaris buildbot has been offline for "684 builds". Five years ago, I reported a bug because the curses module of Python 3 no longer builds on Solaris or OpenIndiana. It seems like the bug was never fixed, and the issue is still open: http://bugs.python.org/issue13552 So my question is whether we officially support Solaris and/or OpenIndiana. If yes, how can we fix issues when our only buildbot slave has memory errors and we have no SSH access to the server? Solaris doesn't seem to be officially supported in Python, so I suggest dropping the OpenIndiana buildbot (which has been failing for at least 2 years) and closing all Solaris issues as "WONTFIX".
Victor From status at bugs.python.org Fri Sep 23 12:08:50 2016 From: status at bugs.python.org (Python tracker) Date: Fri, 23 Sep 2016 18:08:50 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20160923160850.A2F5556A28@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2016-09-16 - 2016-09-23) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 5530 (+23) closed 34485 (+53) total 40015 (+76) Open issues with patches: 2381 Issues opened (63) ================== #16700: Document that bytes OS API can returns unusable results on Win http://bugs.python.org/issue16700 reopened by serhiy.storchaka #27922: Make IDLE tests less flashy http://bugs.python.org/issue27922 reopened by terry.reedy #28162: WindowsConsoleIO readall() fails if first line starts with Ctr http://bugs.python.org/issue28162 reopened by eryksun #28164: _PyIO_get_console_type fails for various paths http://bugs.python.org/issue28164 reopened by eryksun #28165: The 'subprocess' module leaks memory when called in certain wa http://bugs.python.org/issue28165 reopened by Xavion #28176: Fix callbacks race in asyncio.SelectorLoop.sock_connect http://bugs.python.org/issue28176 reopened by haypo #28183: Clean up and speed up dict iteration http://bugs.python.org/issue28183 opened by serhiy.storchaka #28185: Tabs in C source code http://bugs.python.org/issue28185 opened by franciscouzo #28186: Autogenerated tabs / trailing whitespace http://bugs.python.org/issue28186 opened by franciscouzo #28188: os.putenv should support bytes arguments on Windows http://bugs.python.org/issue28188 opened by eryksun #28190: Cross-build _curses failed if host ncurses headers and target http://bugs.python.org/issue28190 opened by Chi Hsuan Yen #28191: Support RFC4985 SRVName in SAN name http://bugs.python.org/issue28191 opened by christian.heimes #28194: Clean up some checks in dict 
implementation http://bugs.python.org/issue28194 opened by xiang.zhang #28196: ssl.match_hostname() should check for SRV-ID and URI-ID http://bugs.python.org/issue28196 opened by christian.heimes #28197: range.index mismatch with documentation http://bugs.python.org/issue28197 opened by veky #28199: Compact dict resizing is doing too much work http://bugs.python.org/issue28199 opened by rhettinger #28201: dict: perturb shift should be done when first conflict http://bugs.python.org/issue28201 opened by inada.naoki #28202: Python 3.5.1 C API, the global variable is not destroyed when http://bugs.python.org/issue28202 opened by Jack Liu #28203: complex() gives wrong error when the second argument has an in http://bugs.python.org/issue28203 opened by manishearth #28206: signal.Signals not documented http://bugs.python.org/issue28206 opened by Samuel Colvin #28207: SQLite headers are not searched in custom locations http://bugs.python.org/issue28207 opened by Santiago Castro #28208: update sqlite to 3.14.2 http://bugs.python.org/issue28208 opened by Big Stone #28209: Exe or MSI unable to find Py3.5 http://bugs.python.org/issue28209 opened by jcrmatos #28210: argparse with subcommands difference in python 2.7 / 3.5 http://bugs.python.org/issue28210 opened by stephan #28211: Wrong return value type in the doc of PyMapping_Keys/Values/It http://bugs.python.org/issue28211 opened by xiang.zhang #28212: Closing server in asyncio is not efficient http://bugs.python.org/issue28212 opened by ???????????????????? ???????????? #28213: asyncio SSLProtocol _app_transport is private http://bugs.python.org/issue28213 opened by ???????????????????? ???????????? 
#28214: Improve exception reporting for problematic __set_name__ attri http://bugs.python.org/issue28214 opened by Tim.Graham #28215: PyModule_AddIntConstant() wraps >=2^31 values when long is 4 b http://bugs.python.org/issue28215 opened by altendky #28217: Add interactive console tests http://bugs.python.org/issue28217 opened by steve.dower #28218: Windows docs have wrong versionadded description http://bugs.python.org/issue28218 opened by steve.dower #28219: Is order of argparse --help output officially defined? http://bugs.python.org/issue28219 opened by barry #28221: Unused indata in test_ssl.ThreadedTests.test_asyncore_server http://bugs.python.org/issue28221 opened by martin.panter #28222: test_distutils fails http://bugs.python.org/issue28222 opened by xiang.zhang #28223: test_tools fails with timeout on AMD64 Snow Leop 3.x buildbot http://bugs.python.org/issue28223 opened by haypo #28224: Compilation warnings on Windows: export 'PyInit_xx' specified http://bugs.python.org/issue28224 opened by haypo #28225: bz2 does not support pathlib http://bugs.python.org/issue28225 opened by ethan.furman #28226: compileall does not support pathlib http://bugs.python.org/issue28226 opened by ethan.furman #28227: gzip does not support pathlib http://bugs.python.org/issue28227 opened by ethan.furman #28228: imghdr does not support pathlib http://bugs.python.org/issue28228 opened by ethan.furman #28229: lzma does not support pathlib http://bugs.python.org/issue28229 opened by ethan.furman #28230: tarfile does not support pathlib http://bugs.python.org/issue28230 opened by ethan.furman #28231: zipfile does not support pathlib http://bugs.python.org/issue28231 opened by ethan.furman #28232: asyncio: wrap_future() doesn't handle cancellation correctly http://bugs.python.org/issue28232 opened by haypo #28234: In xml.etree.ElementTree docs there are many absent Element cl http://bugs.python.org/issue28234 opened by py.user #28235: In xml.etree.ElementTree docs there is no parser 
argument in f http://bugs.python.org/issue28235 opened by py.user #28236: In xml.etree.ElementTree Element can be created with empty and http://bugs.python.org/issue28236 opened by py.user #28237: In xml.etree.ElementTree bytes tag or attributes raises on ser http://bugs.python.org/issue28237 opened by py.user #28238: In xml.etree.ElementTree findall() can't search all elements i http://bugs.python.org/issue28238 opened by py.user #28240: Enhance the timeit module: display average +- std dev instead http://bugs.python.org/issue28240 opened by haypo #28243: Performance regression in functools.partial() http://bugs.python.org/issue28243 opened by serhiy.storchaka #28247: Add an option to zipapp to produce a Windows executable http://bugs.python.org/issue28247 opened by paul.moore #28248: Upgrade installers to OpenSSL 1.0.2i http://bugs.python.org/issue28248 opened by alex #28249: doctest.DocTestFinder reports incorrect line numbers with excl http://bugs.python.org/issue28249 opened by cpitclaudel #28250: typing.NamedTuple instances are not picklable Two http://bugs.python.org/issue28250 opened by Kurt #28251: Help manuals do not appear in Windows search http://bugs.python.org/issue28251 opened by steve.dower #28252: Tuples used before introduction to tuple in tutorial http://bugs.python.org/issue28252 opened by Eswar Yaganti #28253: calendar.prcal(9999) output has a problem http://bugs.python.org/issue28253 opened by jiangping.li #28254: Add C API for gc.enable, gc.disable, and gc.isenabled http://bugs.python.org/issue28254 opened by llllllllll #28255: TextCalendar.prweek/month/year outputs an extra whitespace cha http://bugs.python.org/issue28255 opened by xiang.zhang #28256: Cleanup Modules/_math.c http://bugs.python.org/issue28256 opened by haypo #28257: Regression for star argument parameter error messages http://bugs.python.org/issue28257 opened by kayhayen #28258: Broken python-config generated with Estonian locale http://bugs.python.org/issue28258 opened by 
Arfrever Most recent 15 issues with no replies (15) ========================================== #28254: Add C API for gc.enable, gc.disable, and gc.isenabled http://bugs.python.org/issue28254 #28252: Tuples used before introduction to tuple in tutorial http://bugs.python.org/issue28252 #28250: typing.NamedTuple instances are not picklable Two http://bugs.python.org/issue28250 #28249: doctest.DocTestFinder reports incorrect line numbers with excl http://bugs.python.org/issue28249 #28238: In xml.etree.ElementTree findall() can't search all elements i http://bugs.python.org/issue28238 #28237: In xml.etree.ElementTree bytes tag or attributes raises on ser http://bugs.python.org/issue28237 #28236: In xml.etree.ElementTree Element can be created with empty and http://bugs.python.org/issue28236 #28232: asyncio: wrap_future() doesn't handle cancellation correctly http://bugs.python.org/issue28232 #28231: zipfile does not support pathlib http://bugs.python.org/issue28231 #28230: tarfile does not support pathlib http://bugs.python.org/issue28230 #28229: lzma does not support pathlib http://bugs.python.org/issue28229 #28227: gzip does not support pathlib http://bugs.python.org/issue28227 #28226: compileall does not support pathlib http://bugs.python.org/issue28226 #28225: bz2 does not support pathlib http://bugs.python.org/issue28225 #28219: Is order of argparse --help output officially defined? 
http://bugs.python.org/issue28219 Most recent 15 issues waiting for review (15) ============================================= #28258: Broken python-config generated with Estonian locale http://bugs.python.org/issue28258 #28256: Cleanup Modules/_math.c http://bugs.python.org/issue28256 #28255: TextCalendar.prweek/month/year outputs an extra whitespace cha http://bugs.python.org/issue28255 #28254: Add C API for gc.enable, gc.disable, and gc.isenabled http://bugs.python.org/issue28254 #28253: calendar.prcal(9999) output has a problem http://bugs.python.org/issue28253 #28240: Enhance the timeit module: display average +- std dev instead http://bugs.python.org/issue28240 #28235: In xml.etree.ElementTree docs there is no parser argument in f http://bugs.python.org/issue28235 #28234: In xml.etree.ElementTree docs there are many absent Element cl http://bugs.python.org/issue28234 #28231: zipfile does not support pathlib http://bugs.python.org/issue28231 #28230: tarfile does not support pathlib http://bugs.python.org/issue28230 #28229: lzma does not support pathlib http://bugs.python.org/issue28229 #28228: imghdr does not support pathlib http://bugs.python.org/issue28228 #28227: gzip does not support pathlib http://bugs.python.org/issue28227 #28226: compileall does not support pathlib http://bugs.python.org/issue28226 #28225: bz2 does not support pathlib http://bugs.python.org/issue28225 Top 10 most discussed issues (10) ================================= #28165: The 'subprocess' module leaks memory when called in certain wa http://bugs.python.org/issue28165 14 msgs #28183: Clean up and speed up dict iteration http://bugs.python.org/issue28183 14 msgs #28240: Enhance the timeit module: display average +- std dev instead http://bugs.python.org/issue28240 14 msgs #27761: Private _nth_root function loses accuracy http://bugs.python.org/issue27761 12 msgs #28202: Python 3.5.1 C API, the global variable is not destroyed when http://bugs.python.org/issue28202 12 msgs #28203: 
complex() gives wrong error when the second argument has an in http://bugs.python.org/issue28203 11 msgs #28182: Expose OpenSSL verification results in SSLError http://bugs.python.org/issue28182 10 msgs #26351: Occasionally check for Ctrl-C in long-running operations like http://bugs.python.org/issue26351 9 msgs #28197: range.index mismatch with documentation http://bugs.python.org/issue28197 9 msgs #28214: Improve exception reporting for problematic __set_name__ attri http://bugs.python.org/issue28214 9 msgs Issues closed (53) ================== #16293: curses.ungetch raises OverflowError when given -1 http://bugs.python.org/issue16293 closed by berker.peksag #20173: Derby #4: Convert 53 sites to Argument Clinic across 5 files http://bugs.python.org/issue20173 closed by zach.ware #21516: pathlib.Path(...).is_dir() crashes on some directories (Window http://bugs.python.org/issue21516 closed by berker.peksag #23372: defaultdict.fromkeys should accept a callable factory http://bugs.python.org/issue23372 closed by rhettinger #24022: Python heap corruption issue http://bugs.python.org/issue24022 closed by python-dev #25400: robotparser doesn't return crawl delay for default entry http://bugs.python.org/issue25400 closed by berker.peksag #25470: Random Malloc error raised http://bugs.python.org/issue25470 closed by haypo #25651: Confusing output for TestCase.subTest(0) http://bugs.python.org/issue25651 closed by berker.peksag #26384: UnboundLocalError in socket._sendfile_use_sendfile http://bugs.python.org/issue26384 closed by berker.peksag #26661: python fails to locate system libffi http://bugs.python.org/issue26661 closed by christian.heimes #27111: redundant variables in long_add and long_sub http://bugs.python.org/issue27111 closed by mark.dickinson #27213: Rework CALL_FUNCTION* opcodes http://bugs.python.org/issue27213 closed by serhiy.storchaka #27222: redundant checks and a weird use of goto statements in long_rs http://bugs.python.org/issue27222 closed by 
mark.dickinson #27282: Raise BlockingIOError in os.urandom if kernel is not ready http://bugs.python.org/issue27282 closed by haypo #27348: traceback (and threading) drops exception message http://bugs.python.org/issue27348 closed by martin.panter #27441: redundant assignments to ob_size of new ints that _PyLong_New http://bugs.python.org/issue27441 closed by mark.dickinson #27482: heap-buffer-overflow on address 0x6250000078ff http://bugs.python.org/issue27482 closed by benjamin.peterson #27806: 2.7 32-bit builds fail on macOS 10.12 Sierra due to dependency http://bugs.python.org/issue27806 closed by ned.deily #27932: platform.win32_ver() leaks in 2.7.12 http://bugs.python.org/issue27932 closed by steve.dower #27950: Superfluous messages when running make http://bugs.python.org/issue27950 closed by martin.panter #27955: getrandom() syscall returning EPERM make the system unusable. http://bugs.python.org/issue27955 closed by haypo #27979: Remove bundled libffi http://bugs.python.org/issue27979 closed by python-dev #27990: Provide a way to enable getrandom on Linux even when build sys http://bugs.python.org/issue27990 closed by ncoghlan #28042: Coverity Scan defects in new dict code http://bugs.python.org/issue28042 closed by christian.heimes #28075: os.stat fails when access is denied http://bugs.python.org/issue28075 closed by berker.peksag #28086: test.test_getargs2.TupleSubclass test failure http://bugs.python.org/issue28086 closed by serhiy.storchaka #28110: launcher.msi has different product codes between 32 and 64-bit http://bugs.python.org/issue28110 closed by steve.dower #28137: Windows sys.path file should be renamed http://bugs.python.org/issue28137 closed by steve.dower #28138: Windows _sys.path file should allow import site http://bugs.python.org/issue28138 closed by steve.dower #28145: Fix whitespace in C source code http://bugs.python.org/issue28145 closed by franciscouzo #28151: testPythonOrg() of test_robotparser fails on validating python 
http://bugs.python.org/issue28151 closed by berker.peksag #28161: Opening CON for write access fails http://bugs.python.org/issue28161 closed by steve.dower #28163: WindowsConsoleIO fileno() passes wrong flags to _open_osfhandl http://bugs.python.org/issue28163 closed by steve.dower #28178: allow to cache_clear(some_key) in lru_cache http://bugs.python.org/issue28178 closed by rhettinger #28184: Trailing whitespace in C source code http://bugs.python.org/issue28184 closed by python-dev #28187: Check return value of _PyBytes_Resize http://bugs.python.org/issue28187 closed by berker.peksag #28189: dictitems_contains swallows compare errors http://bugs.python.org/issue28189 closed by rhettinger #28192: Don't import readline in isolated mode http://bugs.python.org/issue28192 closed by steve.dower #28193: Consider using lru_cache for the re.py caches http://bugs.python.org/issue28193 closed by rhettinger #28195: test_huntrleaks_fd_leak fails on Windows http://bugs.python.org/issue28195 closed by haypo #28198: heap-buffer-overflow in tok_nextc (Parser/tokenizer.c:954) http://bugs.python.org/issue28198 closed by berker.peksag #28200: Windows: path_converter() leaks memory for Unicode filenames http://bugs.python.org/issue28200 closed by haypo #28204: Spam http://bugs.python.org/issue28204 closed by xiang.zhang #28205: Add optional suffix to str.join http://bugs.python.org/issue28205 closed by rhettinger #28216: micro optimization for import_all_from http://bugs.python.org/issue28216 closed by xiang.zhang #28220: argparse's add_mutually_exclusive_group() should accept title http://bugs.python.org/issue28220 closed by berker.peksag #28233: PyUnicode_FromFormatV can leak PyUnicodeWriter http://bugs.python.org/issue28233 closed by haypo #28239: Implement functools.lru_cache() using ordered dict http://bugs.python.org/issue28239 closed by serhiy.storchaka #28241: Nested fuctions Unexpected behaviour when stored in a list and http://bugs.python.org/issue28241 closed by 
zach.ware #28242: os.environ.get documentation missing http://bugs.python.org/issue28242 closed by ned.deily #28244: Incorrect Example in itertools.product description http://bugs.python.org/issue28244 closed by rhettinger #28245: Embeddable Python does not use PYTHONPATH. http://bugs.python.org/issue28245 closed by steve.dower #28246: Unable to read simple text file http://bugs.python.org/issue28246 closed by eryksun From steve at pearwood.info Fri Sep 23 12:32:40 2016 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 24 Sep 2016 02:32:40 +1000 Subject: [Python-Dev] cpython (3.6): replace usage of Py_VA_COPY with the (C99) standard va_copy In-Reply-To: <1474613240.146291.734573889.64CB4FDB@webmail.messagingengine.com> References: <20160921033951.19751.13445.DE725F99@psf.io> <4f0bb1cc-9e6e-feb9-f901-69dd1206ccf0@python.org> <1474524160.1330752.733418297.54E48F04@webmail.messagingengine.com> <1474613240.146291.734573889.64CB4FDB@webmail.messagingengine.com> Message-ID: <20160923163239.GF22471@ando.pearwood.info> On Thu, Sep 22, 2016 at 11:47:20PM -0700, Benjamin Peterson wrote: > > On Thu, Sep 22, 2016, at 04:44, Victor Stinner wrote: > > 2016-09-22 8:02 GMT+02:00 Benjamin Peterson : > > > Just dump the compat macros in Python 4.0 I think. > > > > Please don't. Python 3 was so painful because we decided to make > > millions of tiny backward incompatible changes. To have a smooth > > Python 4.0 release, we should only remove things which were already > > deprecated since at least 2 cycles, and well documented as deprecated. > > I'm being flippant here because of the triviality of the change. Anyone > using Py_VA_COPY or Py_MEMCPY can fix their code in a backwards and > forwards compatible manner in 7 seconds with a sed command. Sorry, I haven't been following this thread in detail, so perhaps I've misunderstood. 
Are you assuming that anyone who is building Python from source is automatically able to diagnose C level build failures and knows how to fix them using sed? -- Steve From guido at python.org Fri Sep 23 13:49:35 2016 From: guido at python.org (Guido van Rossum) Date: Fri, 23 Sep 2016 10:49:35 -0700 Subject: [Python-Dev] OpenIndiana and Solaris support In-Reply-To: References: Message-ID: What on earth is OpenIndiana? Its website is a mystery of buzzwords and PR vagueness: "openindiana Community-driven Illumos Distribution" "What is illumos ?" From the illumos developer's guide: "illumos is a consolidation of software that forms the core of an Operating System. It includes the kernel, device drivers, core system libraries, and utilities. It is the home of many technologies include ZFS, DTrace, Zones, ctf, [...]" "The 'Hipster' branch Hipster is a codename for rapidly moving development branch of OpenIndiana and users might experience occasional breakages or problems. Hipster is using rolling-release model and only publishes installation ISOs once in a while. Every ISO release will be announced via mailing list and ..." That didn't exactly answer my questions. Clearly they don't care about anyone who isn't already a user of openindiana or illumos. So I propose that we shouldn't care about them either. On Fri, Sep 23, 2016 at 1:33 AM, Victor Stinner wrote: > Hi, > > My question is simple: do we officially support Solaris and/or OpenIndiana? > > Jesus Cea runs an OpenIndiana buildbot slave: > http://buildbot.python.org/all/buildslaves/cea-indiana-x86 > "Open Indiana 32 bits" > > The platform module of Python says "Solaris-2.11", I don't know the > exact OpenIndiana version. > > A lot of unit tests fail on this buildbot with MemoryError. I guess > that it's related to Solaris which doesn't allow overcommit > (allocating more memory than available memory on the system), or more > simply because the slave has not enough memory.
> > There is now an issue which seems specific to OpenIndiana: > http://bugs.python.org/issue27847 > > It might impact Solaris as well, but the Solaris buildbot is offline > since "684 builds". > > Five years ago, I reported a bug because the curses module of Python 3 > doesn't build on Solaris nor OpenIndiana anymore. It seems like the > bug was not fixed, and the issue is still open: > http://bugs.python.org/issue13552 > > So my question is if we officially support Solaris and/or OpenIndiana. > If yes, how can we fix issues when we only have buildbot slave which > has memory errors, and no SSH access to this server? > > Solaris doesn't seem to be officially supported in Python, so I > suggest to drop the OpenIndiana buildbot (which is failing since at > least 2 years) and close all Solaris issues as "WONTFIX". > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From david.c.stewart at intel.com Fri Sep 23 13:55:08 2016 From: david.c.stewart at intel.com (Stewart, David C) Date: Fri, 23 Sep 2016 17:55:08 +0000 Subject: [Python-Dev] OpenIndiana and Solaris support In-Reply-To: References: Message-ID: Illumos, OpenIndiana et al are open source forks of Solaris. Back before the acquisition by Oracle, Sun open sourced the Solaris OS, called it OpenSolaris and encouraged projects to use it as an OS for x86 and other architectures. But after the acquisition, the OpenSolaris project seemed to end (I don't know specifics) but several organizations carried on with distributions / forks of OpenSolaris. On 9/23/16, 10:49 AM, "Python-Dev on behalf of Guido van Rossum" wrote: What on earth is OpenIndiana? Its website is a mystery of buzzwords and PR vagueness: "openindiana Community-driven Illumos Distribution" "What is illumos ?
From the illumos developer?s guide: ?illumos is a consolidation of software that forms the core of an Operating System. It includes the kernel, device drivers, core system libraries, and utilities. It is the home of many technologies include ZFS, DTrace, Zones, ctf, [...]" "The ?Hipster? branch Hipster is a codename for rapidly moving development branch of OpenIndiana and users might experience occasional breakages or problems. Hipster is using rolling-release model and only publishes installation ISOs once in a while. Every ISO release will be announced via mailing list and ..." That didn't exactly answer my questions. Clearly they don't care about anyone who isn't already a user of openindiana or illumos. So I propose that we shouldn't care about them either. On Fri, Sep 23, 2016 at 1:33 AM, Victor Stinner wrote: > Hi, > > My question is simple: do we officially support Solaris and/or OpenIndiana? > > Jesus Cea runs an OpenIndiana buildbot slave: > http://buildbot.python.org/all/buildslaves/cea-indiana-x86 > "Open Indiana 32 bits" > > The platform module of Python says "Solaris-2.11", I don't know the > exact OpenIndiana version. > > A lot of unit tests fail on this buildbot with MemoryError. I guess > that it's related to Solaris which doesn't allow overcommit > (allocating more memory than available memory on the system), or more > simply because the slave has not enough memory. > > There is now an issue which seems specific to OpenIndiana: > http://bugs.python.org/issue27847 > > It might impact Solaris as well, but the Solaris buildbot is offline > since "684 builds". > > Five years ago, I reported a bug because the curses module of Python 3 > doesn't build on Solaris nor OpenIndiana anymore. It seems like the > bug was not fixed, and the issue is still open: > http://bugs.python.org/issue13552 > > So my question is if we officially support Solaris and/or OpenIndiana. 
> If yes, how can we fix issues when we only have buildbot slave which > has memory errors, and no SSH access to this server? > > Solaris doesn't seem to be officially supported in Python, so I > suggest to drop the OpenIndiana buildbot (which is failing since at > least 2 years) and close all Solaris issues as "WONTFIX". > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/david.c.stewart%40intel.com From guido at python.org Fri Sep 23 14:04:18 2016 From: guido at python.org (Guido van Rossum) Date: Fri, 23 Sep 2016 11:04:18 -0700 Subject: [Python-Dev] OpenIndiana and Solaris support In-Reply-To: References: Message-ID: My guess is that Oracle owns the brand "Solaris" and its awful lawyers have done this. I don't think it's worth our time to support either Solaris or its descendants unless Oracle pays for it. It's too bad for the open source participants in OpenIndiana but realistically we just can't afford the distraction. On Fri, Sep 23, 2016 at 10:55 AM, Stewart, David C wrote: > Illumos, OpenIndiana et al are open source forks of Solaris. > Back before the acquisition by Oracle, Sun open sourced the Solaris OS, called it OpenSolaris and encouraged projects to use it as an OS for x86 and other architectures. But after the acquisition, the OpenSolaris project seemed to end (I don?t know specifics) but several organizations carried on with distributions / forks of OpenSolaris. > > On 9/23/16, 10:49 AM, "Python-Dev on behalf of Guido van Rossum" wrote: > > What on earth is OpenIndiana? 
Its website is a mystery of buzzwords > and PR vagueness: > > "openindiana > Community-driven Illumos Distribution" > > "What is illumos ? > > From the illumos developer?s guide: ?illumos is a consolidation of > software that forms the core of an Operating System. It includes the > kernel, device drivers, core system libraries, and utilities. It is > the home of many technologies include ZFS, DTrace, Zones, ctf, [...]" > > "The ?Hipster? branch > > Hipster is a codename for rapidly moving development branch of > OpenIndiana and users might experience occasional breakages or > problems. Hipster is using rolling-release model and only publishes > installation ISOs once in a while. Every ISO release will be announced > via mailing list and ..." > > > That didn't exactly answer my questions. Clearly they don't care about > anyone who isn't already a user of openindiana or illumos. So I > propose that we shouldn't care about them either. > > On Fri, Sep 23, 2016 at 1:33 AM, Victor Stinner > wrote: > > Hi, > > > > My question is simple: do we officially support Solaris and/or OpenIndiana? > > > > Jesus Cea runs an OpenIndiana buildbot slave: > > http://buildbot.python.org/all/buildslaves/cea-indiana-x86 > > "Open Indiana 32 bits" > > > > The platform module of Python says "Solaris-2.11", I don't know the > > exact OpenIndiana version. > > > > A lot of unit tests fail on this buildbot with MemoryError. I guess > > that it's related to Solaris which doesn't allow overcommit > > (allocating more memory than available memory on the system), or more > > simply because the slave has not enough memory. > > > > There is now an issue which seems specific to OpenIndiana: > > http://bugs.python.org/issue27847 > > > > It might impact Solaris as well, but the Solaris buildbot is offline > > since "684 builds". > > > > Five years ago, I reported a bug because the curses module of Python 3 > > doesn't build on Solaris nor OpenIndiana anymore. 
It seems like the > > bug was not fixed, and the issue is still open: > > http://bugs.python.org/issue13552 > > > > So my question is if we officially support Solaris and/or OpenIndiana. > > If yes, how can we fix issues when we only have buildbot slave which > > has memory errors, and no SSH access to this server? > > > > Solaris doesn't seem to be officially supported in Python, so I > > suggest to drop the OpenIndiana buildbot (which is failing since at > > least 2 years) and close all Solaris issues as "WONTFIX". > > > > Victor > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/david.c.stewart%40intel.com > > -- --Guido van Rossum (python.org/~guido) From benjamin at python.org Sat Sep 24 04:07:21 2016 From: benjamin at python.org (Benjamin Peterson) Date: Sat, 24 Sep 2016 01:07:21 -0700 Subject: [Python-Dev] cpython (3.6): replace usage of Py_VA_COPY with the (C99) standard va_copy In-Reply-To: <20160923163239.GF22471@ando.pearwood.info> References: <20160921033951.19751.13445.DE725F99@psf.io> <4f0bb1cc-9e6e-feb9-f901-69dd1206ccf0@python.org> <1474524160.1330752.733418297.54E48F04@webmail.messagingengine.com> <1474613240.146291.734573889.64CB4FDB@webmail.messagingengine.com> <20160923163239.GF22471@ando.pearwood.info> Message-ID: <1474704441.2703463.735584049.1990AF68@webmail.messagingengine.com> On Fri, Sep 23, 2016, at 09:32, Steven D'Aprano wrote: > On Thu, Sep 22, 2016 at 11:47:20PM -0700, Benjamin Peterson wrote: > > > > On Thu, Sep 22, 2016, at 04:44, Victor Stinner wrote: > > > 
2016-09-22 8:02 GMT+02:00 Benjamin Peterson : > > > > Just dump the compat macros in Python 4.0 I think. > > > > > > Please don't. Python 3 was so painful because we decided to make > > > millions of tiny backward incompatible changes. To have a smooth > > > Python 4.0 release, we should only remove things which were already > > > deprecated since at least 2 cycles, and well documented as deprecated. > > > > I'm being flippant here because of the triviality of the change. Anyone > > using Py_VA_COPY or Py_MEMCPY can fix their code in a backwards and > > forwards compatible manner in 7 seconds with a sed command. > > Sorry, I haven't been following this thread in detail, so perhaps I've > misunderstood. Are you assuming that anyone who is building Python from > source is automatically able to diagnose C level build failures and > known how to fix them using sed? I am assuming authors of CPython extensions possess those skills. From benjamin at python.org Sat Sep 24 04:10:26 2016 From: benjamin at python.org (Benjamin Peterson) Date: Sat, 24 Sep 2016 01:10:26 -0700 Subject: [Python-Dev] cpython (3.6): replace usage of Py_VA_COPY with the (C99) standard va_copy In-Reply-To: References: <20160921033951.19751.13445.DE725F99@psf.io> <4f0bb1cc-9e6e-feb9-f901-69dd1206ccf0@python.org> <1474524160.1330752.733418297.54E48F04@webmail.messagingengine.com> <1474613240.146291.734573889.64CB4FDB@webmail.messagingengine.com> Message-ID: <1474704626.2703951.735585505.320CDE21@webmail.messagingengine.com> On Fri, Sep 23, 2016, at 00:04, Victor Stinner wrote: > 2016-09-23 8:47 GMT+02:00 Benjamin Peterson : > > I'm being flippant here because of the triviality of the change. Anyone > > using Py_VA_COPY or Py_MEMCPY can fix their code in a backwards and > > forwards compatible manner in 7 seconds with a sed command. > > Python 3 had the same argument with 2to3: run 2to3 once, and you are > done. 
C99 is a new thing for Python >= 3.6, but when you want to > support Python 2.7 and 3.5, you are stuck at Visual Studio 2010 which > is less happy with C99 than VS 2015... Python 2.7 doesn't provide Py_VA_COPY, so using it wouldn't do you much good anyway in term of Python 2/3 compatibility. This is not like 2to3 because the automatic transform is correct in all cases. From christian at python.org Sat Sep 24 09:05:19 2016 From: christian at python.org (Christian Heimes) Date: Sat, 24 Sep 2016 15:05:19 +0200 Subject: [Python-Dev] Code quality report Message-ID: <3ef38a9d-8796-93da-790b-c10a1b4822df@python.org> Hi, here is a short code quality report. Overall we are in a good shape for Python 3.6.0. I'm a bit worried about the amount of security bugs, though. Some haven't progressed in more than a year. Coverity Scan ------------- 3.6.0b1 added a bunch of new defects, most of them were false positives. Python is down again to zero open defects (default branch on Linux X86_64). total defects: 1,115 outstanding defects: 0 dismissed: 169 fixed: 946 https://scan.coverity.com/projects/python C code coverage --------------- I have updated my LCOV report (GCC on Linux X86_64). Our test coverage is quite good. line coverage: 81.9 % function coverage: 92.5 % https://tiran.bitbucket.io/python-lcov/ security bugs ------------- I'm seeing 46 open security bugs on our bug tracker, http://bit.ly/2cYWZy0 . configure / compile warnings ---------------------------- Python configures and compiles without warnings with GCC on Linux X86_64. Clang emits four warnings for unreachable code. All warnings are harmless. On i686 I'm still getting four warnings in the KeccakCodePackage (sha3), https://bugs.python.org/issue28117. Regards, Christian -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 455 bytes Desc: OpenPGP digital signature URL: From guido at python.org Sat Sep 24 12:26:43 2016 From: guido at python.org (Guido van Rossum) Date: Sat, 24 Sep 2016 09:26:43 -0700 Subject: [Python-Dev] Code quality report In-Reply-To: <3ef38a9d-8796-93da-790b-c10a1b4822df@python.org> References: <3ef38a9d-8796-93da-790b-c10a1b4822df@python.org> Message-ID: Thanks for watching our back, Christian! Regarding the security bugs, what would be most helpful? Code reviews? Patches? Testing? Just commits? Hopefully there are some people here who want to help making Python 3.6 more secure (I hear this list has thousands of lurkers :-). On Sat, Sep 24, 2016 at 6:05 AM, Christian Heimes wrote: > Hi, > > here is a short code quality report. Overall we are in a good shape for > Python 3.6.0. I'm a bit worried about the amount of security bugs, > though. Some haven't progressed in more than a year. > > > Coverity Scan > ------------- > > 3.6.0b1 added a bunch of new defects, most of them were false positives. > Python is down again to zero open defects (default branch on Linux X86_64). > > total defects: 1,115 > outstanding defects: 0 > dismissed: 169 > fixed: 946 > https://scan.coverity.com/projects/python > > > C code coverage > --------------- > > I have updated my LCOV report (GCC on Linux X86_64). Our test coverage > is quite good. > > line coverage: 81.9 % > function coverage: 92.5 % > https://tiran.bitbucket.io/python-lcov/ > > > security bugs > ------------- > > I'm seeing 46 open security bugs on our bug tracker, > http://bit.ly/2cYWZy0 . > > > configure / compile warnings > ---------------------------- > > Python configures and compiles without warnings with GCC on Linux > X86_64. Clang emits four warnings for unreachable code. All warnings are > harmless. > > On i686 I'm still getting four warnings in the KeccakCodePackage (sha3), > https://bugs.python.org/issue28117. 
> > Regards, > Christian > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Sun Sep 25 10:34:03 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 26 Sep 2016 00:34:03 +1000 Subject: [Python-Dev] cpython (3.6): replace usage of Py_VA_COPY with the (C99) standard va_copy In-Reply-To: <1474704441.2703463.735584049.1990AF68@webmail.messagingengine.com> References: <20160921033951.19751.13445.DE725F99@psf.io> <4f0bb1cc-9e6e-feb9-f901-69dd1206ccf0@python.org> <1474524160.1330752.733418297.54E48F04@webmail.messagingengine.com> <1474613240.146291.734573889.64CB4FDB@webmail.messagingengine.com> <20160923163239.GF22471@ando.pearwood.info> <1474704441.2703463.735584049.1990AF68@webmail.messagingengine.com> Message-ID: On 24 September 2016 at 18:07, Benjamin Peterson wrote: > > > On Fri, Sep 23, 2016, at 09:32, Steven D'Aprano wrote: >> On Thu, Sep 22, 2016 at 11:47:20PM -0700, Benjamin Peterson wrote: >> > >> > On Thu, Sep 22, 2016, at 04:44, Victor Stinner wrote: >> > > 2016-09-22 8:02 GMT+02:00 Benjamin Peterson : >> > > > Just dump the compat macros in Python 4.0 I think. >> > > >> > > Please don't. Python 3 was so painful because we decided to make >> > > millions of tiny backward incompatible changes. To have a smooth >> > > Python 4.0 release, we should only remove things which were already >> > > deprecated since at least 2 cycles, and well documented as deprecated. >> > >> > I'm being flippant here because of the triviality of the change. Anyone >> > using Py_VA_COPY or Py_MEMCPY can fix their code in a backwards and >> > forwards compatible manner in 7 seconds with a sed command. >> >> Sorry, I haven't been following this thread in detail, so perhaps I've >> misunderstood. 
Are you assuming that anyone who is building Python from >> source is automatically able to diagnose C level build failures and >> known how to fix them using sed? > > I am assuming authors of CPython extensions possess those skills. Not all projects on PyPI have active maintainers though, and on the project user end, there's a significant difference between "Can set up a C build environment well enough to let distutils build simple C extensions for a new Python release" and "Is the maintainer of the C extension". It's often useful to think of *any* backwards incompatible change we make as a pruning filter on PyPI: projects that don't have active maintainers that are affected by the change won't be updated as a matter of course. The end result is then usually going to be one of: - the original author returns to active maintenance for long enough to release an update - an interested user contacts the original author and takes over maintenance - affected users migrate away to a new actively maintained fork of the project - affected users migrate away to another existing project addressing the same need There are some cases where a lack of active maintenance is inherently a problem (e.g. network security), so we're happy to trigger those ripple effects. In other cases, the pay-off might be in ease of maintenance for the core development team, or in ease of future learning for new Python developers. But it doesn't matter how trivial the specific change needed is if getting it resolved and a new version published turns out to require a transfer of project ownership - the cost is in the ownership change rather than the software change itself. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From benno at benno.id.au Sun Sep 25 19:21:56 2016 From: benno at benno.id.au (Ben Leslie) Date: Sun, 25 Sep 2016 16:21:56 -0700 Subject: [Python-Dev] TextIO seek and tell cookies Message-ID: Hi all, I recently shot myself in the foot by assuming that TextIO.tell returned integers rather than opaque cookies. Specifically I was adding an offset to the value returned by TextIO.tell. In retrospect this doesn't make sense. Now, I don't want to drive change simply because I failed to read the documentation carefully, but I think the current API is very easy to misuse. Most of the time TextIO.tell returns a cookie that is actually an integer and adding an offset to it and seek-ing works fine. The only indication you get that you are mis-using the API is that sometimes tell returns a cookie that when you add an integer offset to it will cause seek() to fail with an OverflowError. Would it be possible to change the API to return something more opaque? E.g.: rather than converting the C cookie structure to a long, could it instead be converted to a bytes() object. (I.e.: Change textiowrapper_build_cookie to use PyBytes_FromStringAndSize rather than _PyLong_FromByteArray and equivalent for textiowrapper_parse_cookie). This would ensure the return value is never mis-used and is probably also faster using bytes objects than converting to/from an integer. Are there any downsides to this? I've made some progress developing a patch to change this functionality. Is it worth polishing and submitting? Cheers, Ben From python at mrabarnett.plus.com Sun Sep 25 20:21:57 2016 From: python at mrabarnett.plus.com (MRAB) Date: Mon, 26 Sep 2016 01:21:57 +0100 Subject: [Python-Dev] TextIO seek and tell cookies In-Reply-To: References: Message-ID: On 2016-09-26 00:21, Ben Leslie wrote: > Hi all, > > I recently shot myself in the foot by assuming that TextIO.tell > returned integers rather than opaque cookies.
Specifically I was > adding an offset to the value returned by TextIO.tell. In retrospect > this doesn't make sense/ > > Now, I don't want to drive change simply because I failed to read the > documentation carefully, but I think the current API is very easy to > misuse. Most of the time TextIO.tell returns a cookie that is actually > an integer and adding an offset to it and seek-ing works fine. > > The only indication you get that you are mis-using the API is that > sometimes tell returns a cookie that when you add an integer offset to > it will cause seek() to fail with an OverflowError. > > Would it be possible to change the API to return something more > opaque? E.g.: rather than converting the C cookie structure to a long, > could it instead be converted to a bytes() object. > > (I.e.: Change textiowrapper_build_cookie to use > PyBytes_FromStringAndSize rather than _PyLong_FromByteArray and > equivalent for textiowrapper_parse_cookie). > > This would ensure the return value is never mis-used and is probably > also faster using bytes objects than converting to/from an integer. > why would it be faster? It's an integer internally. > Are there any downsides to this? I've made some progress developing a > patch to change this functionality. Is it worth polishing and > submitting? > An alternative might be a subclass of int. From gordon at parasamgate.com Sun Sep 25 20:25:31 2016 From: gordon at parasamgate.com (Gordon R. Burgess) Date: Sun, 25 Sep 2016 20:25:31 -0400 Subject: [Python-Dev] Possibly inconsistent behavior in re groupdict Message-ID: <1474849531.16933.4.camel@parasamgate.com> I've been lurking for a couple of months, working up the confidence to ask the list about this behavior - I've searched through the PEPs but couldn't find any specific reference to it. In a nutshell, in the Python 3.5 library re patterns and search buffers both need to be either unicode or byte strings - but the keys in the groupdict are always returned as str in either case. 
I don't know whether or not this is by design, but it would make more sense to me if when searching a bytes object with a bytes pattern the keys returned in the groupdict were bytes as well. I reworked the example a little just now so it would run on 2.7 as well; on 2.7 the keys in the dictionary correspond to the mode of the pattern as expected (and bytes and unicode are interconverted silently) - code and output are inline below. Thanks for your time, Gordon

[Code]

import sys
import re
from datetime import datetime

data = (u"first string (unicode)",
        b"second string (bytes)")

pattern = [re.compile(u"(?P<ordinal>\\w+) .*\\((?P<type>\\w+)\\)"),
           re.compile(b"(?P<ordinal>\\w+) .*\\((?P<type>\\w+)\\)")]

print("*** re consistency check ***\nRun: %s\nVersion: Python %s\n" %
      (datetime.now(), sys.version))
for p in pattern:
    for d in data:
        try:
            result = "groupdict: %s" % (p.match(d) and p.match(d).groupdict())
        except Exception as e:
            result = "error: %s" % e.args[0]
        print("mode: %s\npattern: %s\ndata: %s\n%s\n" %
              (type(p.pattern).__name__, p.pattern, d, result))

[Output]

gordon at w540:~/workspace/regex_demo$ python3 regex_demo.py
*** re consistency check ***
Run: 2016-09-25 20:06:29.472332
Version: Python 3.5.2+ (default, Sep 10 2016, 10:24:58)
[GCC 6.2.0 20160901]

mode: str
pattern: (?P<ordinal>\w+) .*\((?P<type>\w+)\)
data: first string (unicode)
groupdict: {'ordinal': 'first', 'type': 'unicode'}

mode: str
pattern: (?P<ordinal>\w+) .*\((?P<type>\w+)\)
data: b'second string (bytes)'
error: cannot use a string pattern on a bytes-like object

mode: bytes
pattern: b'(?P<ordinal>\\w+) .*\\((?P<type>\\w+)\\)'
data: first string (unicode)
error: cannot use a bytes pattern on a string-like object

mode: bytes
pattern: b'(?P<ordinal>\\w+) .*\\((?P<type>\\w+)\\)'
data: b'second string (bytes)'
groupdict: {'ordinal': b'second', 'type': b'bytes'}

gordon at w540:~/workspace/regex_demo$ python regex_demo.py
*** re consistency check ***
Run: 2016-09-25 20:06:23.375322
Version: Python 2.7.12+ (default, Sep  1 2016, 20:27:38)
[GCC 6.2.0 20160822]

mode: unicode
pattern: (?P<ordinal>\w+) .*\((?P<type>\w+)\)
data: first string (unicode)
groupdict: {u'ordinal': u'first', u'type': u'unicode'}

mode: unicode
pattern: (?P<ordinal>\w+) .*\((?P<type>\w+)\)
data: second string (bytes)
groupdict: {u'ordinal': 'second', u'type': 'bytes'}

mode: str
pattern: (?P<ordinal>\w+) .*\((?P<type>\w+)\)
data: first string (unicode)
groupdict: {'ordinal': u'first', 'type': u'unicode'}

mode: str
pattern: (?P<ordinal>\w+) .*\((?P<type>\w+)\)
data: second string (bytes)
groupdict: {'ordinal': 'second', 'type': 'bytes'}

From ncoghlan at gmail.com Mon Sep 26 00:05:51 2016 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 26 Sep 2016 14:05:51 +1000 Subject: [Python-Dev] TextIO seek and tell cookies In-Reply-To: References: Message-ID: On 26 September 2016 at 10:21, MRAB wrote: > On 2016-09-26 00:21, Ben Leslie wrote: >> Are there any downsides to this? I've made some progress developing a >> patch to change this functionality. Is it worth polishing and >> submitting? >> > An alternative might be a subclass of int. It could make sense to use a subclass of int that emitted deprecation warnings for integer arithmetic, and then eventually disallowed it entirely. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Mon Sep 26 00:18:51 2016 From: guido at python.org (Guido van Rossum) Date: Sun, 25 Sep 2016 21:18:51 -0700 Subject: [Python-Dev] TextIO seek and tell cookies In-Reply-To: References: Message-ID: Be careful though, comparing these to plain integers should probably be allowed, and we also should make sure that things like serialization via JSON or storing in an SQL database don't break. I personally think it's one of those "learn not to touch the stove" cases and there's limited value in making this API idiot proof.
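A minimal runnable sketch of the usage pattern under discussion, for readers following along: treat the value returned by tell() as an opaque token and only ever hand it back to seek() unchanged. Here io.StringIO stands in for a real text file; its positions happen to be small ints, while a TextIOWrapper over an encoded stream can return a large encoded cookie instead.

```python
import io

# io.StringIO stands in for a real text file; with a TextIOWrapper the
# value returned by tell() can be a large encoded cookie rather than a
# simple offset, so never do arithmetic on it.
f = io.StringIO("alpha\nbeta\ngamma\n")
f.readline()
mark = f.tell()   # opaque token marking the current position
f.readline()
f.seek(mark)      # round-tripping the token unchanged is the supported use
assert f.readline() == "beta\n"
```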
On Sun, Sep 25, 2016 at 9:05 PM, Nick Coghlan wrote: > On 26 September 2016 at 10:21, MRAB wrote: >> On 2016-09-26 00:21, Ben Leslie wrote: >>> Are there any downsides to this? I've made some progress developing a >>> patch to change this functionality. Is it worth polishing and >>> submitting? >>> >> An alternative might be a subclass of int. > > It could make sense to use a subclass of int that emitted deprecation > warnings for integer arithmetic, and then eventually disallowed it > entirely. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From guido at python.org Mon Sep 26 00:36:20 2016 From: guido at python.org (Guido van Rossum) Date: Sun, 25 Sep 2016 21:36:20 -0700 Subject: [Python-Dev] Possibly inconsistent behavior in re groupdict In-Reply-To: <1474849531.16933.4.camel@parasamgate.com> References: <1474849531.16933.4.camel@parasamgate.com> Message-ID: Hi Gordon, You pose an interesting question that I don't think anyone has posed before. Having thought about it, I think that the keys in the group dict are similar to the names of variables or attributes, and I think treating them always as strings makes sense. For example, I might write a function that allows passing in a pattern and a search string, both either str or bytes, where the function would expect fixed keys in the group dict:

def extract_key_value(pattern, target):
    m = re.match(pattern, target)
    return m and (m.groupdict()['key'], m.groupdict()['value'])

There might be a problem with decoding the group name from the pattern, so sticking to ASCII group names would be wise.
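A runnable variant of the sketch above, with the groupdict() calls spelled out; the pattern and sample data here are invented for illustration. It shows that the group *names* stay str for both str and bytes patterns, while the matched *values* follow the type of the target.

```python
import re

# Runnable variant of the sketch above; the pattern and sample data
# are invented for illustration, not taken from the original mail.
def extract_key_value(pattern, target):
    m = re.match(pattern, target)
    return m and (m.groupdict()['key'], m.groupdict()['value'])

assert extract_key_value(r"(?P<key>\w+)=(?P<value>\w+)", "spam=eggs") == ('spam', 'eggs')
assert extract_key_value(rb"(?P<key>\w+)=(?P<value>\w+)", b"spam=eggs") == (b'spam', b'eggs')
# The groupdict() keys are str in both cases:
assert sorted(re.match(rb"(?P<key>\w+)=(?P<value>\w+)", b"a=b").groupdict()) == ['key', 'value']
```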
There's also the backwards compatibility concern: even if we did want to change this, would we want to break existing code (like the above) that might currently work? --Guido On Sun, Sep 25, 2016 at 5:25 PM, Gordon R. Burgess wrote: > I've been lurking for a couple of months, working up the confidence to > ask the list about this behavior - I've searched through the PEPs but > couldn't find any specific reference to it. > > In a nutshell, in the Python 3.5 library re patterns and search buffers > both need to be either unicode or byte strings - but the keys in the > groupdict are always returned as str in either case. > > I don't know whether or not this is by design, but it would make more > sense to me if when searching a bytes object with a bytes pattern the > keys returned in the groupdict were bytes as well. > > I reworked the example a little just now so it would run it on 2.7 as > well; on 2.7 the keys in the dictionary correspond to the mode of the > pattern as expected (and bytes and unicode are interconverted silently) > - code and output are inline below. 
> > Thanks for your time, > > Gordon > > [Code] > > import sys > import re > from datetime import datetime > > data = (u"first string (unicode)", > b"second string (bytes)") > > pattern = [re.compile(u"(?P<ordinal>\\w+) .*\\((?P<type>\\w+)\\)"), > re.compile(b"(?P<ordinal>\\w+) .*\\((?P<type>\\w+)\\)")] > > print("*** re consistency check ***\nRun: %s\nVersion: Python %s\n" % > (datetime.now(), sys.version)) > for p in pattern: > for d in data: > try: > result = "groupdict: %s" % (p.match(d) and > p.match(d).groupdict()) > except Exception as e: > result = "error: %s" % e.args[0] > print("mode: %s\npattern: %s\ndata: %s\n%s\n" % > (type(p.pattern).__name__, p.pattern, d, result)) > > [Output] > > gordon at w540:~/workspace/regex_demo$ python3 regex_demo.py > *** re consistency check *** > Run: 2016-09-25 20:06:29.472332 > Version: Python 3.5.2+ (default, Sep 10 2016, 10:24:58) > [GCC 6.2.0 20160901] > > mode: str > pattern: (?P<ordinal>\w+) .*\((?P<type>\w+)\) > data: first string (unicode) > groupdict: {'ordinal': 'first', 'type': 'unicode'} > > mode: str > pattern: (?P<ordinal>\w+) .*\((?P<type>\w+)\) > data: b'second string (bytes)' > error: cannot use a string pattern on a bytes-like object > > mode: bytes > pattern: b'(?P<ordinal>\\w+) .*\\((?P<type>\\w+)\\)' > data: first string (unicode) > error: cannot use a bytes pattern on a string-like object > > mode: bytes > pattern: b'(?P<ordinal>\\w+) .*\\((?P<type>\\w+)\\)' > data: b'second string (bytes)' > groupdict: {'ordinal': b'second', 'type': b'bytes'} > > gordon at w540:~/workspace/regex_demo$ python regex_demo.py > *** re > consistency check *** > Run: 2016-09-25 20:06:23.375322 > Version: Python > 2.7.12+ (default, Sep 1 2016, 20:27:38) > [GCC 6.2.0 20160822] > > mode: unicode > pattern: (?P<ordinal>\w+) .*\((?P<type>\w+)\) > data: first string (unicode) > groupdict: {u'ordinal': u'first', u'type': u'unicode'} > > mode: unicode > pattern: (?P<ordinal>\w+) .*\((?P<type>\w+)\) > data: second string (bytes) > groupdict: {u'ordinal': 'second', u'type': 'bytes'} > > mode: str > pattern: (?P<ordinal>\w+) .*\((?P<type>\w+)\) > data: first string
(unicode) > groupdict: {'ordinal': u'first', 'type': u'unicode'} > > mode: str > pattern: (?P<ordinal>\w+) .*\((?P<type>\w+)\) > data: second string (bytes) > groupdict: {'ordinal': 'second', 'type': 'bytes'} > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From pludemann at google.com Mon Sep 26 01:48:19 2016 From: pludemann at google.com (Peter Ludemann) Date: Sun, 25 Sep 2016 22:48:19 -0700 Subject: [Python-Dev] TextIO seek and tell cookies In-Reply-To: References: Message-ID: On 25 September 2016 at 21:18, Guido van Rossum wrote: > Be careful though, comparing these to plain integers should probably > be allowed, There's a good reason why it's "opaque" ... why would you want to make it less opaque? And I'm curious why Python didn't adopt the fgetpos/fsetpos style that makes the data structure completely opaque (fpos_t). IIRC, this was added to C when the ANSI standard was first written, to allow cross-platform compatibility in cases where ftell/fseek was difficult (or impossible) to fully implement. Maybe those reasons don't matter any more (e.g., dealing with record-oriented or keyed file systems) ...
> > > > It could make sense to use a subclass of int that emitted deprecation > > warnings for integer arithmetic, and then eventually disallowed it > > entirely. > > > > Cheers, > > Nick. > > > > -- > > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > guido%40python.org > > > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > pludemann%40google.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From benno at benno.id.au Mon Sep 26 05:26:38 2016 From: benno at benno.id.au (Ben Leslie) Date: Mon, 26 Sep 2016 02:26:38 -0700 Subject: [Python-Dev] TextIO seek and tell cookies In-Reply-To: References: Message-ID: On 25 September 2016 at 17:21, MRAB wrote: > On 2016-09-26 00:21, Ben Leslie wrote: >> >> Hi all, >> >> I recently shot myself in the foot by assuming that TextIO.tell >> returned integers rather than opaque cookies. Specifically I was >> adding an offset to the value returned by TextIO.tell. In retrospect >> this doesn't make sense. >> >> Now, I don't want to drive change simply because I failed to read the >> documentation carefully, but I think the current API is very easy to >> misuse. Most of the time TextIO.tell returns a cookie that is actually >> an integer and adding an offset to it and seek-ing works fine. >> >> The only indication you get that you are mis-using the API is that >> sometimes tell returns a cookie that when you add an integer offset to >> it will cause seek() to fail with an OverflowError. 
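[Editor's note: the contract being described can be sketched like this; the file name and contents are invented for illustration. Saving a tell() cookie and handing it back to seek() unchanged is the supported usage, while arithmetic on the cookie is not:]

```python
import os
import tempfile

# Supported usage: treat the tell() result as an opaque token.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("first line\nsecond line\nthird line\n")

with open(path, encoding="utf-8") as f:
    f.readline()               # consume "first line\n"
    cookie = f.tell()          # opaque cookie (happens to be an int)
    second = f.readline()
    f.seek(cookie)             # supported: restore the saved position
    assert f.readline() == second
    # Unsupported: f.seek(cookie + 5).  It often appears to work for
    # simple encodings, but can blow up (e.g. with OverflowError) once
    # the cookie carries decoder state in its high bits.
```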
>> >> Would it be possible to change the API to return something more >> opaque? E.g.: rather than converting the C cookie structure to a long, >> could it instead be converted to a bytes() object. >> >> (I.e.: Change textiowrapper_build_cookie to use >> PyBytes_FromStringAndSize rather than _PyLong_FromByteArray and >> equivalent for textiowrapper_parse_cookie). >> >> This would ensure the return value is never mis-used and is probably >> also faster using bytes objects than converting to/from an integer. >> > why would it be faster? It's an integer internally. It isn't an integer internally though, it is a cookie: typedef struct { Py_off_t start_pos; int dec_flags; int bytes_to_feed; int chars_to_skip; char need_eof; } cookie_type; The memory view of this structure is then converted to a long. Surely converting to a PyLong is more work than converting to bytes? In any case, performance really isn't the motivation here. Cheers, Ben From benno at benno.id.au Mon Sep 26 05:30:17 2016 From: benno at benno.id.au (Ben Leslie) Date: Mon, 26 Sep 2016 02:30:17 -0700 Subject: [Python-Dev] TextIO seek and tell cookies In-Reply-To: References: Message-ID: I think the case of JSON or SQL database is even more important though. tell/seek can return 129-bit integers (maybe even more? my maths might be off here). The very large integers that can be returned by tell() will break serialization to JSON, and storing in a SQL database (at least for most database types). What is the value of comparing these to plain integers? Unless you happen to know the magic encoding it isn't going to be very useful I think? Cheers, Ben On 25 September 2016 at 21:18, Guido van Rossum wrote: > Be careful though, comparing these to plain integers should probably > be allowed, and we also should make sure that things like > serialization via JSON or storing in an SQL database don't break. 
I > personally think it's one of those "learn not to touch the stove" > cases and there's limited value in making this API idiot proof. > > On Sun, Sep 25, 2016 at 9:05 PM, Nick Coghlan wrote: >> On 26 September 2016 at 10:21, MRAB wrote: >>> On 2016-09-26 00:21, Ben Leslie wrote: >>>> Are there any downsides to this? I've made some progress developing a >>>> patch to change this functionality. Is it worth polishing and >>>> submitting? >>>> >>> An alternative might be a subclass of int. >> >> It could make sense to use a subclass of int that emitted deprecation >> warnings for integer arithmetic, and then eventually disallowed it >> entirely. >> >> Cheers, >> Nick. >> >> -- >> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/benno%40benno.id.au From benno at benno.id.au Mon Sep 26 09:10:58 2016 From: benno at benno.id.au (Ben Leslie) Date: Mon, 26 Sep 2016 06:10:58 -0700 Subject: [Python-Dev] TextIO seek and tell cookies In-Reply-To: References: Message-ID: It was pointed out in private email that technically JSON can represent very large integers even if ECMAScript itself can't. But the idea of transmitting these offsets outside of a running process is not something that I had anticipated. It got me thinking: is there a guarantee that these opaque values returned from tell() are stable across different versions of Python? My reading of opaque is that it could be subject to change, but that possibly isn't the intent. 
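[Editor's note: the JSON half of this is easy to check. Python's own json module round-trips integers of cookie size without complaint, so the hazard is on the consumer side (ECMAScript numbers, fixed-width SQL integer columns). A quick sketch, with the 257-bit width taken from the packed cookie layout discussed in this thread:]

```python
import json

# A cookie-sized integer (position in the low 64 bits, a flag at bit
# 256, mirroring the packed layout quoted later in the thread) survives
# a Python-to-Python JSON round trip just fine.
cookie = (1 << 256) | 1000
assert json.loads(json.dumps(cookie)) == cookie

# A 64-bit consumer, by contrast, cannot hold it:
assert cookie.bit_length() == 257
assert cookie > 2**63 - 1   # overflows e.g. a SQL BIGINT column
```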
It seems that since the sizeof(int) and sizeof(Py_off_t) could be different in different builds of Python even of the same version, then the opaque value returned is necessarily going to be different between builds of even the same version of Python. It seems like it would be prudent to discourage the sharing of these opaque cookies (such as via a database or interchange formats) as you'd have to be very sure that they would be interpreted correctly in any receiving instance. Cheers, Ben On 26 September 2016 at 02:30, Ben Leslie wrote: > I think the case of JSON or SQL database is even more important though. > > tell/seek can return 129-bit integers (maybe even more? my maths might > be off here). > > The very large integers that can be returned by tell() will break > serialization to JSON, and storing in a SQL database (at least for > most database types). > > What is the value of comparing these to plain integers? Unless you > happen to know the magic encoding it isn't going to be very useful I > think? > > Cheers, > > Ben > > On 25 September 2016 at 21:18, Guido van Rossum wrote: >> Be careful though, comparing these to plain integers should probably >> be allowed, and we also should make sure that things like >> serialization via JSON or storing in an SQL database don't break. I >> personally think it's one of those "learn not to touch the stove" >> cases and there's limited value in making this API idiot proof. >> >> On Sun, Sep 25, 2016 at 9:05 PM, Nick Coghlan wrote: >>> On 26 September 2016 at 10:21, MRAB wrote: >>>> On 2016-09-26 00:21, Ben Leslie wrote: >>>>> Are there any downsides to this? I've made some progress developing a >>>>> patch to change this functionality. Is it worth polishing and >>>>> submitting? >>>>> >>>> An alternative might be a subclass of int. >>> >>> It could make sense to use a subclass of int that emitted deprecation >>> warnings for integer arithmetic, and then eventually disallowed it >>> entirely. >>> >>> Cheers, >>> Nick. 
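[Editor's note: Nick's subclass-of-int suggestion quoted above might look something like this sketch. The class name and warning text are invented for illustration; nothing like this exists in CPython:]

```python
import warnings

class TextIOCookie(int):
    """Hypothetical cookie type: compares as an int, warns on arithmetic."""

    def _warn(self):
        warnings.warn(
            "arithmetic on TextIO seek/tell cookies is unsupported",
            DeprecationWarning, stacklevel=3)

    def __add__(self, other):
        self._warn()
        return int(self) + other
    __radd__ = __add__

    def __sub__(self, other):
        self._warn()
        return int(self) - other

cookie = TextIOCookie(1000)
assert cookie == 1000                  # plain comparison still works
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    assert cookie + 5 == 1005          # arithmetic works, but is flagged
assert caught and caught[0].category is DeprecationWarning
```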
>>> >>> -- >>> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >>> _______________________________________________ >>> Python-Dev mailing list >>> Python-Dev at python.org >>> https://mail.python.org/mailman/listinfo/python-dev >>> Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/benno%40benno.id.au From random832 at fastmail.com Mon Sep 26 10:32:53 2016 From: random832 at fastmail.com (Random832) Date: Mon, 26 Sep 2016 10:32:53 -0400 Subject: [Python-Dev] TextIO seek and tell cookies In-Reply-To: References: Message-ID: <1474900373.2750772.737286425.124555B8@webmail.messagingengine.com> On Mon, Sep 26, 2016, at 05:30, Ben Leslie wrote: > I think the case of JSON or SQL database is even more important though. > > tell/seek can return 129-bit integers (maybe even more? my maths might > be off here). > > The very large integers that can be returned by tell() will break > serialization to JSON, and storing in a SQL database (at least for > most database types). > > What is the value of comparing these to plain integers? Unless you > happen to know the magic encoding it isn't going to be very useful I > think? I assume the value is that in the circumstances in which all of the flags and other bits are zero, they can be used as offsets in precisely the way that you used them. It may also be possible that in some cases where they are not zero, doing arithmetic with them is still "safe" since the real offset is still in the low-order bits. I don't know if those circumstances are predictable enough for it to be worthwhile. 
Changing it would obviously break code that does this (unless, perhaps, it were changed to be a class with arithmetic operators); the question is whether such code "deserves" to be broken. In my own tests, even a UTF-8-sig file with DOS line endings "worked". Does anyone have information about what circumstances can reliably cause tell() to return values that are *not* simple integers? Maybe it has something to do with working with stateful encodings like iso-2022 or UTF-7? What was the situation that caused your problem? From michael at felt.demon.nl Mon Sep 26 08:25:09 2016 From: michael at felt.demon.nl (Michael Felt) Date: Mon, 26 Sep 2016 14:25:09 +0200 Subject: [Python-Dev] 3.6.0 Beta Phase Development In-Reply-To: <092D85C9-5853-403F-B1E1-DF939C5388C0@python.org> References: <092D85C9-5853-403F-B1E1-DF939C5388C0@python.org> Message-ID: <605124a2-8d37-d1f2-5865-bd0fd23b7003@felt.demon.nl> On 13/09/2016 02:15, Ned Deily wrote: > the challenge is to put the finishing touches on the features and documentation, squash bugs, and test test test. The next preview release will be 3.6.0b2 Found one typo in Modules/_io/_iomodule.h on line 156 - #endif^L rather than #endif (posted as an issue, but I suppose just a note here would have been enough). I have a longish list of messages to stderr from the compiler (IBM xlc) on AIX. Rather than spam everyone with those - would opening an issue be the way forward, or just sending the file to a person - rather than the list. From guido at python.org Mon Sep 26 11:41:15 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 26 Sep 2016 08:41:15 -0700 Subject: [Python-Dev] 3.6.0 Beta Phase Development In-Reply-To: <605124a2-8d37-d1f2-5865-bd0fd23b7003@felt.demon.nl> References: <092D85C9-5853-403F-B1E1-DF939C5388C0@python.org> <605124a2-8d37-d1f2-5865-bd0fd23b7003@felt.demon.nl> Message-ID: The issue tracker is your friend! 
On Mon, Sep 26, 2016 at 5:25 AM, Michael Felt wrote: > > On 13/09/2016 02:15, Ned Deily wrote: >> >> the challenge is to put the finishing touches on the features and >> documentation, squash bugs, and test test test. The next preview release >> will be 3.6.0b2 > > > Found one typo in Modules/_io/_iomodule.h on line 156 - #endif^L rather than > #endif (posted as an issue, but I suppose just a note here would have been > enough) > > I have a longish list of messages to stderr from the compiler (IBM xlc) on > AIX. Rather than spam everyone with those - would opening an issue be the > way forward, or just sending the file to a person - rather than the list. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From trentm at gmail.com Mon Sep 26 18:32:01 2016 From: trentm at gmail.com (Trent Mick) Date: Mon, 26 Sep 2016 15:32:01 -0700 Subject: [Python-Dev] OpenIndiana and Solaris support In-Reply-To: References: Message-ID: I work for Joyent (joyent.com) now, which employs a number of devs that work on illumos (illumos.org). We also provide cloud infrastructure. Would it help if we offered one or more instances (VMs) on which to run buildbot slaves (and on which volunteers for bug fixing could hack)? I know a lot of people in the illumos community would be quite sad to have it dropped as a core Python plat. Guido, Yes you are correct that Oracle owns the Solaris brand. tl;dr history if you care: - sunos -> Solaris - Sun open sources Solaris, called OpenSolaris (2005) - Oracle acquires Sun and closes Solaris (Aug 2010). Shortly after, the community forks OpenSolaris and calls it illumos (Sep 2010) - OpenIndiana is a distro of illumos (somewhat similar to how Ubuntu is a distro of Linux). 
Other distros are SmartOS (the one Joyent works on), and OmniOS. - Oracle continues work on Solaris, releasing "Solaris 11 Express". I've no real numbers of usage of illumos vs Solaris 11 vs others. Cheers, Trent p.s. I hear that Jesus is also in contact with some of the illumos-devs on IRC (and perhaps email). I hope we can help there. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Sep 26 18:37:45 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 26 Sep 2016 15:37:45 -0700 Subject: [Python-Dev] OpenIndiana and Solaris support In-Reply-To: References: Message-ID: Thanks for the reality check Trent! I think if enough people with core committer bits want to keep supporting Solaris / Illumos / OpenIndiana / other variants that's fine, but I don't think that just having some VMs to test on is enough -- we also need people who can fix problems if those buildbots start failing, and that requires pretty specialized knowledge. Plus of course we won't know if fixing it for OpenIndiana will also fix it for Solaris 11 Express or for other Illumos forks. (For Linux it's easier to assess these things because so many people in open source use Linux and its many forks.) On Mon, Sep 26, 2016 at 3:32 PM, Trent Mick wrote: > I work for Joyent (joyent.com) now, which employs a number of devs that work > on illumos (illumos.org). We also provide cloud infrastructure. Would it > help if we offered one or more instances (VMs) on which to run buildbot > slaves (and on which volunteers for bug fixing could hack)? I know a lot of > people in the illumos community would be quite sad to have it dropped as a > core Python plat. > > Guido, > Yes you are correct that Oracle owns the Solaris brand. > > tl;dr history if you care: > - sunos -> Solaris > - Sun open sources Solaris, called OpenSolaris (2005) > - Oracle acquires Sun and closes Solaris (Aug 2010). 
Shortly after, the > community forks OpenSolaris and calls it illumos (Sep 2010) > - OpenIndiana is a distro of illumos (somewhat similar to how Ubuntu is a > distro of Linux). Other distros are SmartOS (the one Joyent works on), and > OmniOS. > - Oracle continues work on Solaris, releasing "Solaris 11 Express". > > I've no real numbers of usage of illumos vs Solaris 11 vs others. > > Cheers, > Trent > > p.s. I hear that Jesus is also in contact with some of the illumos-devs on > IRC (and perhaps email). I hope we can help there. -- --Guido van Rossum (python.org/~guido) From greg.ewing at canterbury.ac.nz Mon Sep 26 18:43:38 2016 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 27 Sep 2016 11:43:38 +1300 Subject: [Python-Dev] TextIO seek and tell cookies In-Reply-To: References: Message-ID: <57E9A49A.2060300@canterbury.ac.nz> Ben Leslie wrote: > But the idea of transmitting these offsets outside of a running > process is not something that I had anticipated. It got me thinking: > is there a guarantee that these opaque values returned from tell() is > stable across different versions of Python? Are they even guaranteed to work on a different file object in the same process? I.e. if you read some stuff from a file, do tell() on it, then close it, open it again and seek() with that token, are you guaranteed to end up at the same place in the file? -- Greg From guido at python.org Mon Sep 26 18:51:47 2016 From: guido at python.org (Guido van Rossum) Date: Mon, 26 Sep 2016 15:51:47 -0700 Subject: [Python-Dev] TextIO seek and tell cookies In-Reply-To: <57E9A49A.2060300@canterbury.ac.nz> References: <57E9A49A.2060300@canterbury.ac.nz> Message-ID: Yeah, that should work. The implementation is something like a byte offset to the start of a line plus a character count, plus some misc flags. 
I found this implementation in the 2.6 code (the last version where it was pure Python code): def _pack_cookie(self, position, dec_flags=0, bytes_to_feed=0, need_eof=0, chars_to_skip=0): # The meaning of a tell() cookie is: seek to position, set the # decoder flags to dec_flags, read bytes_to_feed bytes, feed them # into the decoder with need_eof as the EOF flag, then skip # chars_to_skip characters of the decoded result. For most simple # decoders, tell() will often just give a byte offset in the file. return (position | (dec_flags<<64) | (bytes_to_feed<<128) | (chars_to_skip<<192) | bool(need_eof)<<256) def _unpack_cookie(self, bigint): rest, position = divmod(bigint, 1<<64) rest, dec_flags = divmod(rest, 1<<64) rest, bytes_to_feed = divmod(rest, 1<<64) need_eof, chars_to_skip = divmod(rest, 1<<64) return position, dec_flags, bytes_to_feed, need_eof, chars_to_skip On Mon, Sep 26, 2016 at 3:43 PM, Greg Ewing wrote: > Ben Leslie wrote: >> >> But the idea of transmitting these offsets outside of a running >> process is not something that I had anticipated. It got me thinking: >> is there a guarantee that these opaque values returned from tell() is >> stable across different versions of Python? > > > Are they even guaranteed to work on a different file > object in the same process? I.e. if you read some stuff > from a file, do tell() on it, then close it, open it > again and seek() with that token, are you guaranteed to > end up at the same place in the file? 
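[Editor's note: the quoted 2.6 methods can be exercised standalone. Here they are transcribed as module-level functions (dropping `self`, field layout taken verbatim from the code above), with a round trip showing how any recorded decoder state pushes the cookie past 64 bits, which is why adding a byte offset to a tell() result can produce a value seek() rejects:]

```python
# Transcription of the pure-Python 2.6 cookie packing quoted above.

def pack_cookie(position, dec_flags=0, bytes_to_feed=0,
                need_eof=0, chars_to_skip=0):
    return (position | (dec_flags << 64) | (bytes_to_feed << 128)
            | (chars_to_skip << 192) | bool(need_eof) << 256)

def unpack_cookie(bigint):
    rest, position = divmod(bigint, 1 << 64)
    rest, dec_flags = divmod(rest, 1 << 64)
    rest, bytes_to_feed = divmod(rest, 1 << 64)
    need_eof, chars_to_skip = divmod(rest, 1 << 64)
    return position, dec_flags, bytes_to_feed, need_eof, chars_to_skip

# For a simple decoder every field but position is zero, so the cookie
# really is just the byte offset:
assert unpack_cookie(pack_cookie(1000)) == (1000, 0, 0, 0, 0)

# As soon as any decoder state is recorded, the integer grows far past
# 64 bits; naive arithmetic on it would corrupt the low-order field.
big = pack_cookie(1000, dec_flags=1, need_eof=1, chars_to_skip=2)
assert big.bit_length() == 257
assert unpack_cookie(big) == (1000, 1, 0, 1, 2)
```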
> > -- > Greg > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From brett at python.org Mon Sep 26 18:54:05 2016 From: brett at python.org (Brett Cannon) Date: Mon, 26 Sep 2016 22:54:05 +0000 Subject: [Python-Dev] OpenIndiana and Solaris support In-Reply-To: References: Message-ID: On Mon, 26 Sep 2016 at 15:38 Guido van Rossum wrote: > Thanks for the reality check Trent! I think if enough people with core > committer bits want to keep supporting Solaris / Illumos / OpenIndiana > / other variants that's fine, but I don't think that just having some > VMs to test on is enough -- we also need people who can fix problems > if those buildbots start failing, and that requires pretty specialized > knowledge. Plus of course we won't know if fixing it for OpenIndiana > will also fix it for Solaris 11 Express or for other Illumos forks. > (For Linux it's easier to assess these things because so many people > in open source use Linux and its many forks.) > The official requirement to support a platform is a stable buildbot and a core dev to keep the support up: https://www.python.org/dev/peps/pep-0011/#supporting-platforms. Victor has asked that the OpenIndiana buildbot be removed from the stable pool as it consistently throws MemoryError which means its support is not improving. If Trent is willing to maintain a buildbot in a Joyent VM that at least takes care of that part, but it still requires Jesus to volunteer to keep the support up if it's going to be supported for free. Otherwise Joyent could consider contracting with one of the various core devs who happen to be consultants to help maintain the support. 
At minimum, though, a new buildbot could go into the unstable pool so illumos devs can keep an eye on when things break to try and get platform-independent changes upstreamed that happen to help illumos (e.g. no #ifdef changes specific to illumos, but if something just needed to be made more robust and it happens to help illumos that's typically fine). > > On Mon, Sep 26, 2016 at 3:32 PM, Trent Mick wrote: > > I work for Joyent (joyent.com) now, which employs a number of devs that > work > > on illumos (illumos.org). We also provide cloud infrastructure. Would it > > help if we offered one or more instances (VMs) on which to run buildbot > > slaves (and on which volunteers for bug fixing could hack)? I know a > lot of > > people in the illumos community would be quite sad to have it dropped as > a > > core Python plat. > > > > Guido, > > Yes you are correct that Oracle owns the Solaris brand. > > > > tl;dr history if you care: > > - sunos -> Solaris > > - Sun open sources Solaris, called OpenSolaris (2005) > > - Oracle acquires Sun and closes Solaris (Aug 2010). Shortly after, the > > community forks OpenSolaris and calls it illumos (Sep 2010) > > - OpenIndiana is a distro of illumos (somewhat similar to how Ubuntu is a > > distro of Linux). Other distros are SmartOS (the one Joyent works on), > and > > OmniOS. > > - Oracle continues work on Solaris, releasing "Solaris 11 Express". > > > > I've no real numbers of usage of illumos vs Solaris 11 vs others. > > > > Cheers, > > Trent > > > > p.s. I hear that Jesus is also in contact with some of the illumos-devs > on > > IRC (and perhaps email). I hope we can help there. 
> > > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael at felt.demon.nl Tue Sep 27 15:01:24 2016 From: michael at felt.demon.nl (Michael Felt) Date: Tue, 27 Sep 2016 21:01:24 +0200 Subject: [Python-Dev] 3.6.0 Beta Phase Development In-Reply-To: References: <092D85C9-5853-403F-B1E1-DF939C5388C0@python.org> <605124a2-8d37-d1f2-5865-bd0fd23b7003@felt.demon.nl> Message-ID: <803fe9cb-57b1-130b-a105-7cc30a9bbfdb@felt.demon.nl> On 26/09/2016 17:41, Guido van Rossum wrote: > The issue tracker is your friend! I shall remember this for future reference. As you probably noticed - new "issue" https://bugs.python.org/issue28290 -- BETA report: Python-3.6 build messages to stderr: AIX and "not GCC" From gordon at parasamgate.com Wed Sep 28 18:03:38 2016 From: gordon at parasamgate.com (Gordon R. Burgess) Date: Wed, 28 Sep 2016 18:03:38 -0400 Subject: [Python-Dev] Possibly inconsistent behavior in re groupdict In-Reply-To: References: <1474849531.16933.4.camel@parasamgate.com> Message-ID: <1475100218.10920.5.camel@parasamgate.com> Hi Guido - thanks for your thoughts on this. This came up for me when writing an HL7 library, where the raw data is all bytes - it seemed a little odd that the names went in as bytes and came out as str - especially given the way the re library expects consistency between the patterns and targets - but I also appreciate the point about breaking code. (Including mine, which has a comment on it that says, "match.groupdict returns a dict with str keys in Python 3.5" :D) Cheers, Gordon -----Original Message----- From: Guido van Rossum Reply-to: guido at python.org To: Gordon R. 
Burgess Cc: Python-Dev Subject: Re: [Python-Dev] Possibly inconsistent behavior in re groupdict Date: Sun, 25 Sep 2016 21:36:20 -0700 Hi Gordon, You pose an interesting question that I don't think anyone has posed before. Having thought about it, I think that the keys in the group dict are similar to the names of variables or attributes, and I think treating them always as strings makes sense. For example, I might write a function that allows passing in a pattern and a search string, both either str or bytes, where the function would expect fixed keys in the group dict: def extract_key_value(pattern, target): m = re.match(pattern, target) return m and (m.groupdict()['key'], m.groupdict()['value']) There might be a problem with decoding the group name from the pattern, so sticking to ASCII group names would be wise. There's also the backwards compatibility concern: even if we did want to change this, would we want to break existing code (like the above) that might currently work? --Guido On Sun, Sep 25, 2016 at 5:25 PM, Gordon R. Burgess wrote: > > I've been lurking for a couple of months, working up the confidence > to > ask the list about this behavior - I've searched through the PEPs but > couldn't find any specific reference to it. > > In a nutshell, in the Python 3.5 library re patterns and search > buffers > both need to be either unicode or byte strings - but the keys in the > groupdict are always returned as str in either case. > > I don't know whether or not this is by design, but it would make more > sense to me if when searching a bytes object with a bytes pattern the > keys returned in the groupdict were bytes as well. > > I reworked the example a little just now so it would run on 2.7 as > well; on 2.7 the keys in the dictionary correspond to the mode of the > pattern as expected (and bytes and unicode are interconverted > silently) > - code and output are inline below. 
> > Thanks for your time, > > Gordon > > [Code] > > import sys > import re > from datetime import datetime > > data = (u"first string (unicode)", >         b"second string (bytes)") > > pattern = [re.compile(u"(?P<ordinal>\\w+) .*\\((?P<type>\\w+)\\)"), >           re.compile(b"(?P<ordinal>\\w+) .*\\((?P<type>\\w+)\\)")] > > print("*** re consistency check ***\nRun: %s\nVersion: Python %s\n" % >      (datetime.now(), sys.version)) > for p in pattern: >     for d in data: >         try: >             result = "groupdict: %s" % (p.match(d) and > p.match(d).groupdict()) >         except Exception as e: >             result = "error: %s" % e.args[0] >         print("mode: %s\npattern: %s\ndata: %s\n%s\n" % >               (type(p.pattern).__name__, p.pattern, d, result)) > > [Output] > > gordon at w540:~/workspace/regex_demo$ python3 regex_demo.py > *** re consistency check *** > Run: 2016-09-25 20:06:29.472332 > Version: Python 3.5.2+ (default, Sep 10 2016, 10:24:58) > [GCC 6.2.0 20160901] > > mode: str > pattern: (?P<ordinal>\w+) .*\((?P<type>\w+)\) > data: first string (unicode) > groupdict: {'ordinal': 'first', 'type': 'unicode'} > > mode: str > pattern: (?P<ordinal>\w+) .*\((?P<type>\w+)\) > data: b'second string (bytes)' > error: cannot use a string pattern on a bytes-like object > > mode: bytes > pattern: b'(?P<ordinal>\\w+) .*\\((?P<type>\\w+)\\)' > data: first string (unicode) > error: cannot use a bytes pattern on a string-like object > > mode: bytes > pattern: b'(?P<ordinal>\\w+) .*\\((?P<type>\\w+)\\)' > data: b'second string (bytes)' > groupdict: {'ordinal': b'second', 'type': b'bytes'} > > gordon at w540:~/workspace/regex_demo$ python regex_demo.py > *** re > consistency check *** > Run: 2016-09-25 20:06:23.375322 > Version: Python > 2.7.12+ (default, Sep  1 2016, 20:27:38) > [GCC 6.2.0 20160822] > > mode: unicode > pattern: (?P<ordinal>\w+) .*\((?P<type>\w+)\) > data: first string (unicode) > groupdict: {u'ordinal': u'first', u'type': u'unicode'} > > mode: unicode > pattern: (?P<ordinal>\w+) .*\((?P<type>\w+)\) > data: second string (bytes) > groupdict: {u'ordinal': 
'second', u'type': 'bytes'} > > mode: str > pattern: (?P<ordinal>\w+) .*\((?P<type>\w+)\) > data: first string (unicode) > groupdict: {'ordinal': u'first', 'type': u'unicode'} > > mode: str > pattern: (?P<ordinal>\w+) .*\((?P<type>\w+)\) > data: second string (bytes) > groupdict: {'ordinal': 'second', 'type': 'bytes'} > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org From vadmium+py at gmail.com Thu Sep 29 00:38:55 2016 From: vadmium+py at gmail.com (Martin Panter) Date: Thu, 29 Sep 2016 04:38:55 +0000 Subject: [Python-Dev] [Python-checkins] cpython (merge 3.5 -> default): Null merge. In-Reply-To: <82b4f7c7-a39f-3a21-d554-4d8a6a8af7d5@udel.edu> References: <20160929015741.12318.48541.A617579D@psf.io> Message-ID: On 29 September 2016 at 03:04, Terry Reedy wrote: > On 9/28/2016 9:57 PM, terry.reedy wrote: >> https://hg.python.org/cpython/rev/02eb35b79af0 > (2nd try) I mistakenly null merged from 3.5 to default. > Should I now do a proper null merge from 3.5 to 3.6 to default? Yes, I think 3.5 needs to be merged into 3.6, and the result needs to be merged into default. I guess they are null merges because the entries were already present in 3.6b1. > Should I revert this null merge? I don't think there is much point in reverting a null merge, if there are no actual changes to revert. From vadmium+py at gmail.com Thu Sep 29 01:07:16 2016 From: vadmium+py at gmail.com (Martin Panter) Date: Thu, 29 Sep 2016 05:07:16 +0000 Subject: [Python-Dev] [Python-checkins] cpython (merge 3.5 -> default): Null merge. 
In-Reply-To: References: <20160929015741.12318.48541.A617579D@psf.io> <82b4f7c7-a39f-3a21-d554-4d8a6a8af7d5@udel.edu> Message-ID: On 29 September 2016 at 04:26, Zachary Ware wrote: > On Wed, Sep 28, 2016 at 10:04 PM, Terry Reedy wrote: >> On 9/28/2016 9:57 PM, terry.reedy wrote: >>> https://hg.python.org/cpython/rev/02eb35b79af0 >> >> >> (2nd try) I mistakenly null merged from 3.5 to default. >> Should I now do a proper null merge from 3.5 to 3.6 to default? >> Should I revert this null merge? FYI I committed some merges (04060fa4428d and ae0c983d3c65) which should have fixed this all up. > You aren't the only one who's missed 3.6 since it was branched :). If > there are changes in 3.5 that should not be in 3.6, you should go > ahead and do a null merge from 3.5 -> 3.6 -> default. If the changes > in 3.5 are already in 3.6, I'd just leave it as is; it will clear up > when somebody next merges something. In this case, my automatic merge process gave conflicts and spooky "ambiguous merge" warnings, so in this case I think it was good to deal with it sooner rather than later. From status at bugs.python.org Fri Sep 30 12:08:51 2016 From: status at bugs.python.org (Python tracker) Date: Fri, 30 Sep 2016 18:08:51 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20160930160851.2A3B15618B@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2016-09-23 - 2016-09-30) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. 
Issues counts and deltas: open 5519 (-11) closed 34555 (+70) total 40074 (+59) Open issues with patches: 2381 Issues opened (39) ================== #18235: _sysconfigdata.py wrong on AIX installations http://bugs.python.org/issue18235 reopened by martin.panter #19795: Formatting of True/False/None in docs http://bugs.python.org/issue19795 reopened by serhiy.storchaka #27665: Make create_server able to listen on several ports http://bugs.python.org/issue27665 reopened by bayandin #28259: Ctypes bug windows http://bugs.python.org/issue28259 opened by PlatonAdCo #28261: wrong error messages when using PyArg_ParseTuple to parse norm http://bugs.python.org/issue28261 opened by Oren Milman #28262: Header folder folds incorrectly causing MissingHeaderBodySepar http://bugs.python.org/issue28262 opened by vincenttc #28264: Python 3.4.4 Turtle library - Turtle.onclick events blocked by http://bugs.python.org/issue28264 opened by George Fagin #28267: [MinGW] Crash at start when compiled by MinGW for 64-bit Windo http://bugs.python.org/issue28267 opened by vmurashev #28269: [MinGW] Can't compile Python/dynload_win.c due to static strca http://bugs.python.org/issue28269 opened by vmurashev #28270: [MinGW] Can't compile Modules/posixmodule.c by MinGW - several http://bugs.python.org/issue28270 opened by vmurashev #28271: [MinGW] Can't compile _ctypes/callproc.c - SEH not supported b http://bugs.python.org/issue28271 opened by vmurashev #28272: a redundant check in maybe_small_long http://bugs.python.org/issue28272 opened by Oren Milman #28273: Make os.waitpid() option parameter optional. 
http://bugs.python.org/issue28273 opened by StyXman #28275: LZMADecompressor.decompress Use After Free http://bugs.python.org/issue28275 opened by JohnLeitch #28276: test_loading.py - false positive result for "def test_find" wh http://bugs.python.org/issue28276 opened by Michael.Felt #28277: ./Modules/_io/_iomodule.c build failure on AIX (beta1) while ( http://bugs.python.org/issue28277 opened by Michael.Felt #28278: Make `weakref.WeakKeyDictionary.__repr__` meaningful http://bugs.python.org/issue28278 opened by cool-RR #28279: setuptools failing to read from setup.cfg only in Python 3.6 http://bugs.python.org/issue28279 opened by Roy Williams #28280: Always return a list from PyMapping_Keys/PyMapping_Values/PyMa http://bugs.python.org/issue28280 opened by serhiy.storchaka #28281: Remove year limits from calendar http://bugs.python.org/issue28281 opened by belopolsky #28282: find_library("c") defers to find_msvcrt() http://bugs.python.org/issue28282 opened by martin.panter #28286: gzip guessing of mode is ambiguilous http://bugs.python.org/issue28286 opened by serhiy.storchaka #28287: Refactor subprocess.Popen to let a subclass handle IO asynchro http://bugs.python.org/issue28287 opened by martius #28288: Expose environment variable for Py_Py3kWarningFlag http://bugs.python.org/issue28288 opened by Roy Williams #28290: BETA report: Python-3.6 build messages to stderr: AIX and "not http://bugs.python.org/issue28290 opened by Michael.Felt #28291: urllib/urllib2 AbstractDigestAuthHandler locked to retried cou http://bugs.python.org/issue28291 opened by secynic #28292: Make Calendar.itermonthdates() behave consistently in edge cas http://bugs.python.org/issue28292 opened by belopolsky #28293: Don't completely dump the regex cache when full http://bugs.python.org/issue28293 opened by rhettinger #28294: HTTPServer server.py assumes sys.stderr != None http://bugs.python.org/issue28294 opened by grismar #28295: PyUnicode_AsUCS4 doc and impl conflict on exception 
http://bugs.python.org/issue28295 opened by xiang.zhang #28298: can't set big int-like objects to items in array 'Q', 'L' and http://bugs.python.org/issue28298 opened by Oren Milman #28301: python3.4-config --extension-suffix reports '.cpython-34m.so' http://bugs.python.org/issue28301 opened by DrLou #28307: Accelerate 'string' % (value, ...) by using formatted string l http://bugs.python.org/issue28307 opened by serhiy.storchaka #28308: Accelerate 'string'.format(value, ...) by using formatted stri http://bugs.python.org/issue28308 opened by serhiy.storchaka #28309: Accelerate string.Template by using formatted string literals http://bugs.python.org/issue28309 opened by serhiy.storchaka #28312: Minor change - more direct hint re: multiple machine sizes and http://bugs.python.org/issue28312 opened by Michael.Felt #28314: ElementTree: Element.getiterator(tag) broken in 3.6 http://bugs.python.org/issue28314 opened by mitya57 #28315: incorrect "in ?" output in 'divide' example at "Defining Clean http://bugs.python.org/issue28315 opened by viorel #28317: Improve support of FORMAT_VALUE in dis http://bugs.python.org/issue28317 opened by serhiy.storchaka Most recent 15 issues with no replies (15) ========================================== #28317: Improve support of FORMAT_VALUE in dis http://bugs.python.org/issue28317 #28312: Minor change - more direct hint re: multiple machine sizes and http://bugs.python.org/issue28312 #28309: Accelerate string.Template by using formatted string literals http://bugs.python.org/issue28309 #28287: Refactor subprocess.Popen to let a subclass handle IO asynchro http://bugs.python.org/issue28287 #28286: gzip guessing of mode is ambiguilous http://bugs.python.org/issue28286 #28282: find_library("c") defers to find_msvcrt() http://bugs.python.org/issue28282 #28280: Always return a list from PyMapping_Keys/PyMapping_Values/PyMa http://bugs.python.org/issue28280 #28279: setuptools failing to read from setup.cfg only in Python 3.6 
http://bugs.python.org/issue28279 #28273: Make os.waitpid() option parameter optional. http://bugs.python.org/issue28273 #28271: [MinGW] Can't compile _ctypes/callproc.c - SEH not supported b http://bugs.python.org/issue28271 #28269: [MinGW] Can't compile Python/dynload_win.c due to static strca http://bugs.python.org/issue28269 #28264: Python 3.4.4 Turtle library - Turtle.onclick events blocked by http://bugs.python.org/issue28264 #28261: wrong error messages when using PyArg_ParseTuple to parse norm http://bugs.python.org/issue28261 #28259: Ctypes bug windows http://bugs.python.org/issue28259 #28249: doctest.DocTestFinder reports incorrect line numbers with excl http://bugs.python.org/issue28249 Most recent 15 issues waiting for review (15) ============================================= #28317: Improve support of FORMAT_VALUE in dis http://bugs.python.org/issue28317 #28315: incorrect "in ?" output in 'divide' example at "Defining Clean http://bugs.python.org/issue28315 #28314: ElementTree: Element.getiterator(tag) broken in 3.6 http://bugs.python.org/issue28314 #28309: Accelerate string.Template by using formatted string literals http://bugs.python.org/issue28309 #28298: can't set big int-like objects to items in array 'Q', 'L' and http://bugs.python.org/issue28298 #28295: PyUnicode_AsUCS4 doc and impl conflict on exception http://bugs.python.org/issue28295 #28294: HTTPServer server.py assumes sys.stderr != None http://bugs.python.org/issue28294 #28293: Don't completely dump the regex cache when full http://bugs.python.org/issue28293 #28291: urllib/urllib2 AbstractDigestAuthHandler locked to retried cou http://bugs.python.org/issue28291 #28287: Refactor subprocess.Popen to let a subclass handle IO asynchro http://bugs.python.org/issue28287 #28286: gzip guessing of mode is ambiguilous http://bugs.python.org/issue28286 #28275: LZMADecompressor.decompress Use After Free http://bugs.python.org/issue28275 #28273: Make os.waitpid() option parameter optional. 
http://bugs.python.org/issue28273 #28272: a redundant check in maybe_small_long http://bugs.python.org/issue28272 #28271: [MinGW] Can't compile _ctypes/callproc.c - SEH not supported b http://bugs.python.org/issue28271 Top 10 most discussed issues (10) ================================= #28293: Don't completely dump the regex cache when full http://bugs.python.org/issue28293 13 msgs #28207: Use pkg-config to find dependencies http://bugs.python.org/issue28207 12 msgs #28183: Clean up and speed up dict iteration http://bugs.python.org/issue28183 11 msgs #27873: multiprocessing.pool.Pool.map should take more than one iterab http://bugs.python.org/issue27873 7 msgs #28281: Remove year limits from calendar http://bugs.python.org/issue28281 7 msgs #28275: LZMADecompressor.decompress Use After Free http://bugs.python.org/issue28275 6 msgs #28314: ElementTree: Element.getiterator(tag) broken in 3.6 http://bugs.python.org/issue28314 6 msgs #27386: Asyncio server hang when clients connect and immediately disco http://bugs.python.org/issue27386 5 msgs #28199: Compact dict resizing is doing too much work http://bugs.python.org/issue28199 5 msgs #28267: [MinGW] Crash at start when compiled by MinGW for 64-bit Windo http://bugs.python.org/issue28267 5 msgs Issues closed (66) ================== #5895: socketmodule.c on HPUX ia64 without _XOPEN_SOURCE_EXTENDED com http://bugs.python.org/issue5895 closed by christian.heimes #10673: multiprocess.Process join method - timeout indistinguishable f http://bugs.python.org/issue10673 closed by berker.peksag #18893: invalid exception handling in Lib/ctypes/macholib/dyld.py http://bugs.python.org/issue18893 closed by berker.peksag #20100: epoll docs are not clear with regards to CLOEXEC. 
http://bugs.python.org/issue20100 closed by berker.peksag #20754: distutils should use SafeConfigParser http://bugs.python.org/issue20754 closed by berker.peksag #20947: -Wstrict-overflow findings http://bugs.python.org/issue20947 closed by christian.heimes #21578: Misleading error message when ImportError called with invalid http://bugs.python.org/issue21578 closed by serhiy.storchaka #21903: ctypes documentation MessageBoxA example produces error http://bugs.python.org/issue21903 closed by berker.peksag #22969: Compile fails with --without-signal-module http://bugs.python.org/issue22969 closed by berker.peksag #23155: unittest: object has no attribute '_removed_tests' http://bugs.python.org/issue23155 closed by berker.peksag #23520: test_os failed (python-3.4.3, Linux SuSE) http://bugs.python.org/issue23520 closed by berker.peksag #23701: Drop extraneous comment from winreg.QueryValue's docstring http://bugs.python.org/issue23701 closed by berker.peksag #24201: _winreg PyHKEY Type Confusion http://bugs.python.org/issue24201 closed by steve.dower #25830: _TypeAlias: Discrepancy between docstring and behavior http://bugs.python.org/issue25830 closed by gvanrossum #26075: typing.Union unifies types too broadly http://bugs.python.org/issue26075 closed by gvanrossum #26148: String literals are not interned if in a tuple http://bugs.python.org/issue26148 closed by serhiy.storchaka #26224: Add "version added" for documentation of asyncio.timeout for d http://bugs.python.org/issue26224 closed by berker.peksag #26477: typing forward references and module attributes http://bugs.python.org/issue26477 closed by gvanrossum #26550: documentation minor issue : "Step back: WSGI" section from "HO http://bugs.python.org/issue26550 closed by berker.peksag #26650: calendar: OverflowErrors for year == 1 and firstweekday > 0 http://bugs.python.org/issue26650 closed by belopolsky #27322: test_compile_path fails when python has been installed http://bugs.python.org/issue27322 closed by 
berker.peksag #27565: Offer error context manager for code.interact http://bugs.python.org/issue27565 closed by berker.peksag #27703: Replace two Py_XDECREFs with Py_DECREFs in do_raise http://bugs.python.org/issue27703 closed by serhiy.storchaka #27740: Fix doc of Py_CompileStringExFlags http://bugs.python.org/issue27740 closed by berker.peksag #27766: Add ChaCha20 Poly1305 to SSL ciphers http://bugs.python.org/issue27766 closed by christian.heimes #27845: Optimize update_keyword_args() function http://bugs.python.org/issue27845 closed by serhiy.storchaka #27897: Avoid possible crash in pysqlite_connection_create_collation http://bugs.python.org/issue27897 closed by serhiy.storchaka #27914: Incorrect comment in PyModule_ExcDef http://bugs.python.org/issue27914 closed by serhiy.storchaka #27942: Default value identity regression http://bugs.python.org/issue27942 closed by serhiy.storchaka #27963: null poiter dereference in set_conversion_mode due uncheck _ct http://bugs.python.org/issue27963 closed by serhiy.storchaka #27995: Upgrade Python 3.4 to OpenSSL 1.0.2h on Windows http://bugs.python.org/issue27995 closed by christian.heimes #28144: Decrease empty_keys_struct's dk_refcnt http://bugs.python.org/issue28144 closed by serhiy.storchaka #28148: [Patch] Also stop using localtime() in timemodule http://bugs.python.org/issue28148 closed by belopolsky #28194: Clean up some checks in dict implementation http://bugs.python.org/issue28194 closed by serhiy.storchaka #28203: complex() gives wrong error when the second argument has an in http://bugs.python.org/issue28203 closed by mark.dickinson #28211: Wrong return value type in the doc of PyMapping_Keys/Values/It http://bugs.python.org/issue28211 closed by serhiy.storchaka #28221: Unused indata in test_ssl.ThreadedTests.test_asyncore_server http://bugs.python.org/issue28221 closed by martin.panter #28250: typing.NamedTuple instances are not picklable Two http://bugs.python.org/issue28250 closed by mark.dickinson #28252: 
Tuples used before introduction to tuple in tutorial http://bugs.python.org/issue28252 closed by rhettinger #28253: calendar.prcal(9999) output has a problem http://bugs.python.org/issue28253 closed by belopolsky #28254: Add C API for gc.enable, gc.disable, and gc.isenabled http://bugs.python.org/issue28254 closed by rhettinger #28258: Broken python-config generated with Estonian locale http://bugs.python.org/issue28258 closed by serhiy.storchaka #28260: mock._Any and mock._Call implement __eq__ but not __hash__ http://bugs.python.org/issue28260 closed by berker.peksag #28263: Python 2.7's `-3` flag warns about __eq__ being implemented wi http://bugs.python.org/issue28263 closed by christian.heimes #28265: builtin_function_or_method's __getattribute__ not applicable t http://bugs.python.org/issue28265 closed by eric.snow #28266: setup.py uses build Python's configuration when cross-compilin http://bugs.python.org/issue28266 closed by Rouslan Korneychuk #28268: bz2.open does not use __fspath__ (PEP 519) http://bugs.python.org/issue28268 closed by serhiy.storchaka #28274: asyncio does not call exception handler if task stored http://bugs.python.org/issue28274 closed by r.david.murray #28283: test_sock_connect_sock_write_race() of test.test_asyncio.test_ http://bugs.python.org/issue28283 closed by berker.peksag #28284: Memory corruption due to size expansion (overflow) in _json.en http://bugs.python.org/issue28284 closed by python-dev #28285: 35.5 - 29.58 = 5.920000000000002 (it's false !) 
http://bugs.python.org/issue28285 closed by serhiy.storchaka #28289: ImportError.__init__ doesn't reset not specified exception att http://bugs.python.org/issue28289 closed by serhiy.storchaka #28296: Add __le__ and __ge__ to collections.Counter http://bugs.python.org/issue28296 closed by r.david.murray #28297: sched module example has wrong output http://bugs.python.org/issue28297 closed by xiang.zhang #28299: DirEntry.is_dir() evaluates True for a file on Windows http://bugs.python.org/issue28299 closed by paul.moore #28300: [PATCH] Fix misspelled "implemented" word http://bugs.python.org/issue28300 closed by berker.peksag #28302: Unpacking numpy array give list http://bugs.python.org/issue28302 closed by SilentGhost #28303: [PATCH] Fix broken grammar in "pydoc3 unittest" http://bugs.python.org/issue28303 closed by berker.peksag #28304: Condition.wait() doesn't raise KeyboardInterrupt http://bugs.python.org/issue28304 closed by berker.peksag #28305: Make error for Python3.6 on Cygwin http://bugs.python.org/issue28305 closed by r.david.murray #28306: incorrect output "int division or modulo by zero" in Handling http://bugs.python.org/issue28306 closed by berker.peksag #28310: Mixing yield and return with value is allowed http://bugs.python.org/issue28310 closed by xiang.zhang #28311: AIX shared library extension modules installation broken - Pyt http://bugs.python.org/issue28311 closed by martin.panter #28313: ttk Style().configure() overwrites Tk().option_add() Button bu http://bugs.python.org/issue28313 closed by serhiy.storchaka #28316: descriptor and repr get into conflict http://bugs.python.org/issue28316 closed by benjamin.peterson #1175984: Make subprocess.Popen support file-like objects (win) http://bugs.python.org/issue1175984 closed by christian.heimes From nad at python.org Fri Sep 30 16:18:16 2016 From: nad at python.org (Ned Deily) Date: Fri, 30 Sep 2016 16:18:16 -0400 Subject: [Python-Dev] IMPORTANT: An extra week until 3.6.0b2, now 2016-10-10 
Message-ID: <52A1C0D6-3AAE-4214-95C9-16A0251EFC31@python.org> Thanks for all of your efforts in getting us to the beta phase of 3.6.0! A large number of important features and a huge amount of code were committed just prior to the b1 feature freeze 3 weekends ago. Not surprisingly, there were a number of bugs found and loose ends identified and, as a result, we've negotiated some extensions to get things in before b2. Under the current schedule there were only 3 weeks between b1 and b2 and then 4 weeks between b2 and b3; that was mainly because we pushed b1 back a week due to the development sprint. I would *really* like for us to get those remaining pieces which were granted extensions into b2 as planned. The longer they are delayed, the more risk it puts on the final steps of the release, and it's really important to have a stable base for our testing efforts and those of our downstream users, like third-party developers and distributors. So I think it makes sense to move b2 back a week, giving us all an extra week to get things in for b2. Without changing the date for b3, we will now have 4 weeks between b1 and b2 and 3 weeks between b2 and b3. That gives us about 10 days from now until b2. It would be great if you could update the issue tracker for any exempted items you have. I will try to follow up with you, as needed, over the next few days on their status. Please contact me if you have any questions about the 3.6.0 schedule or about whether a change is appropriate for the beta phase. To recap, the remaining milestones for 3.6.0: 2016-10-10, 1200 UTC: 3.6.0 beta 2 (was 10-03, remaining exempted features, bug and doc fixes) 2016-10-31: 3.6.0 beta 3 (bug and doc fixes) 2016-11-21: 3.6.0 beta 4 (important bug fixes and doc fixes) 2016-12-05: 3.6.0 release candidate 1 (3.6.0 code freeze, critical bug fixes, doc fixes) 2016-12-16: 3.6.0 release (3.6.0rc1 plus any necessary emergency fixes) Thank you all again for your great efforts so far on 3.6!
--Ned https://www.python.org/dev/peps/pep-0494/ -- Ned Deily nad at python.org