From trent at snakebite.org Thu Jun 5 14:02:14 2014 From: trent at snakebite.org (Trent Nelson) Date: Thu, 5 Jun 2014 05:02:14 -0700 Subject: [Python-ideas] Make Python code read-only In-Reply-To: References: Message-ID: <9C2B8DBB-80EC-40DE-847E-C9AF421CDDBE@snakebite.org> On May 20, 2014, at 12:57 PM, Victor Stinner wrote: > Hi, > > I'm trying to find the best option to make CPython faster. I would > like to discuss here a first idea of making the Python code read-only > to allow new optimizations. I did two passes on read-only functionality for PyParallel. First attempt was similar to yours; I instrumented various core Python objects such that mutations could be detected against read-only objects (and subsequently raised as an exception). That didn't pan out the way I wanted it to, especially in the PyParallel multiple-interpreter-threads-running-in-parallel environment. Second attempt: use memory protection. CPUs and OSes are really good at enforcing memory protection -- leverage that. Don't try to do it yourself in userspace. This worked much better. That work is described starting here: https://speakerdeck.com/trent/pyparallel-how-we-removed-the-gil-and-exploited-all-cores?slide=138 Relevant bits of implementation: obmalloc.c: http://hg.python.org/sandbox/trent/rev/0e70a0caa1c0#l6.299 ceval.c: http://hg.python.org/sandbox/trent/rev/0e70a0caa1c0#l9.30 On POSIX you'd achieve the same effect via mprotect and a SIGSEGV trap. Just FYI. Regards, Trent. From victor.stinner at gmail.com Thu Jun 5 15:05:42 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 5 Jun 2014 15:05:42 +0200 Subject: [Python-ideas] Make Python code read-only In-Reply-To: <9C2B8DBB-80EC-40DE-847E-C9AF421CDDBE@snakebite.org> References: <9C2B8DBB-80EC-40DE-847E-C9AF421CDDBE@snakebite.org> Message-ID: 2014-06-05 14:02 GMT+02:00 Trent Nelson : > On May 20, 2014, at 12:57 PM, Victor Stinner wrote: >> I'm trying to find the best option to make CPython faster.
I would >> like to discuss here a first idea of making the Python code read-only >> to allow new optimizations. > > I did two passes on read-only functionality for PyParallel. First attempt was similar to yours; I instrumented various core Python objects such that mutations could be detected against read-only objects (and subsequently raised as an exception). That didn't pan out the way I wanted it to, especially in the PyParallel multiple-interpreter-threads-running-in-parallel environment. > > Second attempt: use memory protection. CPUs and OSes are really good at enforcing memory protection -- leverage that. Don't try to do it yourself in userspace. This worked much better. My first attempt to "make the code read-only" was a big failure: lots of errors and complaints :-) I'm now moving to a different approach: "notify changes of the code". In PyParallel, you raise an error if something is modified. I don't need such a restriction, I "just" want to disable optimizations if the code changed. > On POSIX you'd achieve the same effect via mprotect and a SIGSEGV trap. I don't think that relying on SIGSEGV is reliable :-( Such a signal can be emitted for various reasons and you have to use sigsetjmp/siglongjmp, which is unsafe: you cannot clean up state when an error occurs. Or did you implement it differently? Victor From sturla.molden at gmail.com Thu Jun 5 16:42:32 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Thu, 05 Jun 2014 16:42:32 +0200 Subject: [Python-ideas] Make Python code read-only In-Reply-To: References: Message-ID: On 21/05/14 02:16, Victor Stinner wrote: > I don't want to optimize a single function, I want to optimize a whole > application. Right. Even Java does not do that. (Hence the name 'HotSpot'.) > If possible, I would prefer to not have to modify the application to > run it faster. > > Numba plays very well with numbers and arrays, but I'm not sure that > it is able to inline an arbitrary Python function, for example.
Numba will compile the Python overhead out of function calls, if that is what you mean. Numba will also accelerate Python objects (method calls and attribute access). LLVM knows how to do simple optimisations like function inlining. When a Python function is JIT compiled to LLVM bytecode by Numba, LLVM knows what to do with it. If the function body is small enough, LLVM will inline it completely. Numba is still under development, so it might not be considered "production ready" yet. Currently it will give you performance comparable to -O2 in C for most algorithmic Python code. Sturla From ncoghlan at gmail.com Thu Jun 5 18:05:35 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 6 Jun 2014 02:05:35 +1000 Subject: [Python-ideas] String-like methods on StringIO objects? Message-ID: From the "idle speculation" files (inspired by the recent thread on python-dev): has anyone ever experimented with offering string methods like find() on StringIO objects? I don't work in any sufficiently memory constrained environments these days that that style of API would be worth the hassle relative to a normal string, it just struck me as a potentially interesting approach to the notion of a string manipulation type that didn't generally copy data around and could use different code point sizes internally for different parts of the text data. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dw+python-ideas at hmmz.org Thu Jun 5 18:39:25 2014 From: dw+python-ideas at hmmz.org (dw+python-ideas at hmmz.org) Date: Thu, 5 Jun 2014 16:39:25 +0000 Subject: [Python-ideas] String-like methods on StringIO objects? In-Reply-To: References: Message-ID: <20140605163925.GB17301@k2> On Fri, Jun 06, 2014 at 02:05:35AM +1000, Nick Coghlan wrote: > From the "idle speculation" files (inspired by the recent thread on > python-dev): has anyone ever experimented with offering string methods like > find() on StringIO objects?
> I don't work in any sufficiently memory constrained environments these days > that that style of API would be worth the hassle relative to a normal string, > it just struck me as a potentially interesting approach to the notion of a > string manipulation type that didn't generally copy data around and could use > different code point sizes internally for different parts of the text data. Thought about this quite a bit. There are a few ways StringIO/BytesIO/buffers could improve, not sure which approaches are interesting, though... 1) Not sure if it's the case in Python 3.x (pretty sure it isn't in 2.x), but cStringIO could optimize for the case where the IO is discarded after building a single string by using the CPython APIs for doing that (e.g. _PyString_Resize). In that case, getvalue() returns the built string, and sets an internal flag to cause it to be copied to a new private string if any further IO is invoked. This inverts the current behaviour, where the normal case of build-and-discard causes a copy. 2) Rather than implement string methods on the StringIO, it might be nicer if those methods could apply to a memoryview, and then make it possible e.g. for BytesIO to be exposed as a memoryview. Right now Python doesn't have much in the way of generic "type safe / memory safe" APIs for doing things to regular memory without first invoking copies/conversions of various sorts. This might be the more useful thing to fix. We have plenty of special cases, like bytearray(), array.array(), StringIO (to some degree), and so on, and various ways to manipulate that memory (ctypes and struct module for example), but they are all somewhat hodge-podges of each other and lack any "one way to do it". I had looked at building some kind of unified 'memory slice' type last year, since I keep bumping into the need for better Python-level support for this stuff when working on 'bit twiddling' projects of various kinds.
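[Editor's sketch, not part of the original mail: point 2 can be illustrated with what Python 3 already offers. BytesIO exposes its internal buffer as a zero-copy memoryview via getbuffer(), but memoryview has no find(), so a search still forces the very copy such an API would avoid. The example data is made up.]

```python
import io

buf = io.BytesIO(b"hello world, hello again")
view = buf.getbuffer()        # memoryview over the internal buffer, no copy
# memoryview has no find(), so we must copy back to bytes to search --
# exactly the copy that string methods on the buffer would eliminate
offset = bytes(view).find(b"world")
print(offset)                 # -> 6
```

Note that while the memoryview from getbuffer() is alive, resizing writes to the BytesIO raise BufferError, so the view should be released before further IO.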
It's mostly thinking aloud, but here is a rough sketch for the kind of module I had been considering last year, mostly while working with Python 2: https://github.com/dw/memsink/wiki/Memory-Module . The idea was to provide a common 'Slice' adaptor type whose memory could be interpreted using a couple of different abstractions (Vector and File being the obvious ones). David > > Cheers, > Nick. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From cf.natali at gmail.com Thu Jun 5 21:26:35 2014 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Thu, 5 Jun 2014 20:26:35 +0100 Subject: [Python-ideas] Make Python code read-only In-Reply-To: References: Message-ID: 2014-06-05 15:42 GMT+01:00 Sturla Molden : > Numba is still under development, so it might not be considered "production > ready" yet. Currently it will give you performance comparable to -O2 in C > for most algorithmic Python code. When you consider it production ready, don't hesitate to suggest it for inclusion on python-dev: FWIW, I think it's high time we had a JIT compiler in CPython... From mistersheik at gmail.com Sat Jun 7 03:53:30 2014 From: mistersheik at gmail.com (Neil Girdhar) Date: Fri, 6 Jun 2014 18:53:30 -0700 (PDT) Subject: [Python-ideas] =?utf-8?q?Put_default_setstate_and_getstate_on_obj?= =?utf-8?q?ect_for_use_in_co=C3=B6perative_inheritance=2E?= Message-ID: When implementing getstate in coöperative inheritance, the typical thing to do is to call super to get the dictionary and add the appropriate entries. Setstate is similar: you extract what you need out of the dictionary and call super with the remaining entries.
Unfortunately, object does not have a default implementation, so you need a base class like so:

class DefaultSetstateAndGetstate:
    """Define default getstate and setstate for use in coöperative inheritance."""

    def __getstate__(self):
        return self.__dict__.copy()

    def __setstate__(self, state):
        self.__dict__.update(state)

I suggest that this be added to object. Best, Neil -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Sat Jun 7 03:59:38 2014 From: mistersheik at gmail.com (Neil Girdhar) Date: Fri, 6 Jun 2014 18:59:38 -0700 (PDT) Subject: [Python-ideas] Expose `itertools.count.start` and implement `itertools.count.__eq__` based on it, like `range`. In-Reply-To: References: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com> Message-ID: That would be great. On Friday, May 16, 2014 12:16:52 AM UTC-4, Antony Lee wrote: > > Actually, a more reasonable solution would be to have range handle keyword > arguments and map "range(start=x)" to "count(x)". Or, perhaps more simply, > "range(x, None)" (so that no keyword arguments are needed). > > > 2014-05-15 13:04 GMT-07:00 Ram Rachum >: > >> Now that I think about it, I would ideally want `itertools.count` to be >> deprecated in favor of `range(float('inf'))`, but I know that would never >> happen. >> >> >> On Thursday, May 15, 2014 11:02:56 PM UTC+3, Ram Rachum wrote: >>> >>> I suggest exposing `itertools.count.start` and implementing >>> `itertools.count.__eq__` based on it. This'll provide the same benefits >>> that `range` got by exposing `range.start` and allowing `range.__eq__`. >>> >> >> _______________________________________________ >> Python-ideas mailing list >> Python... at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From steve at pearwood.info Sat Jun 7 07:14:57 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 7 Jun 2014 15:14:57 +1000 Subject: [Python-ideas] =?iso-8859-1?q?Put_default_setstate_and_getstate_o?= =?iso-8859-1?q?n_object_for_use_in_co=F6perative_inheritance=2E?= In-Reply-To: References: Message-ID: <20140607051457.GN10355@ando> On Fri, Jun 06, 2014 at 06:53:30PM -0700, Neil Girdhar wrote: > When implementing getstate in coöperative inheritance, the typical thing to > do is to call super to get the dictionary and add the appropriate entries. > Setstate is similar: you extract what you need out of the dictionary and > call super with the remaining entries. Unfortunately, object does not have > a default implementation, so you need a base class like so: I'm afraid you're going to need to explain in more detail what you're talking about. Even a link to a discussion elsewhere. I've used cooperative inheritance without needing to write a getstate or setstate method, so I have no idea why you think these are important enough to go into the base object. I presume you're not talking about serialization formats? That's where I would normally expect to find a getstate and setstate. It might also help if you can do a survey of other languages, like Java and Ruby, and tell us if they have such methods in the base object. -- Steven From mistersheik at gmail.com Sat Jun 7 08:10:15 2014 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 7 Jun 2014 02:10:15 -0400 Subject: [Python-ideas] =?utf-8?q?Put_default_setstate_and_getstate_on_obj?= =?utf-8?q?ect_for_use_in_co=C3=B6perative_inheritance=2E?= In-Reply-To: <20140607051457.GN10355@ando> References: <20140607051457.GN10355@ando> Message-ID: Hi Steven, If you don't know about getstate and setstate, I suggest you take a look at the documentation: https://docs.python.org/3.3/library/pickle.html#object.__getstate__.
Besides allowing objects to be pickled, providing these methods allows them to be copied with the copy module. Some of the pickling and copying support can be provided by getnewargs, but this was unfortunately almost useless for cooperative inheritance. Luckily, getnewargs_ex was recently added, which fills in this hole (each subclass fills in the keyword arguments it wants to pass to __new__ and calls super for the rest). Best, Neil On Sat, Jun 7, 2014 at 1:14 AM, Steven D'Aprano wrote: > On Fri, Jun 06, 2014 at 06:53:30PM -0700, Neil Girdhar wrote: > > When implementing getstate in coöperative inheritance, the typical thing > to > > do is to call super to get the dictionary and add the appropriate entries. > > Setstate is similar: you extract what you need out of the dictionary and > > call super with the remaining entries. Unfortunately, object does not > have > > a default implementation, so you need a base class like so: > > I'm afraid you're going to need to explain in more detail what you're > talking about. Even a link to a discussion elsewhere. I've used > cooperative inheritance without needing to write a getstate or setstate > method, so I have no idea why you think these are important enough to go > into the base object. I presume you're not talking about serialization > formats? That's where I would normally expect to find a getstate and > setstate. > > It might also help if you can do a survey of other languages, like Java > and Ruby, and tell us if they have such methods in the base object. > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group.
> To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/QkvOwa1-pHQ/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Jun 7 08:18:30 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 7 Jun 2014 16:18:30 +1000 Subject: [Python-ideas] =?utf-8?q?Put_default_setstate_and_getstate_on_obj?= =?utf-8?q?ect_for_use_in_co=C3=B6perative_inheritance=2E?= In-Reply-To: References: Message-ID: On 7 June 2014 16:05, Neil Girdhar wrote: > I use cooperative multiple inheritance throughout my (large-ish) project, > and I find it very comfortable and powerful. I am currently using the class > below to serve as an anchor point. The thing is that this behavior is > already implemented somewhere in Python (where?) since it is the default > behaviour if getstate or setstate don't exist. Why not explicitly make it > available to call super? There is fallback behaviour in the pickle and copy modules that doesn't rely on the getstate/setstate APIs. Those fallbacks are defined by the protocols, not by the object model. https://docs.python.org/3/library/pickle.html#pickle-inst covers the available protocols for instance pickling. https://docs.python.org/3/library/copy.html covers (towards the end) some of the options for making class instances copyable. https://docs.python.org/3/library/copyreg.html is an additional registry that allows third parties to make instances of classes defined elsewhere support pickling and copying without relying on monkeypatching. > I think I saw or got an email from Guido that I can't seem to find that > rightly points out that object doesn't have __dict__ so this can't be done. > I'm curious why object doesn't have __dict__? Where does the __dict__ come > into existence?
I assume that objects of type object and instantiated > objects of other types have the same metaclass; does the metaclass treat > them differently? Types defined in C extensions and those defined dynamically on the heap share a metaclass at runtime, but their initialisation code is different. You can also define Python level types without a __dict__ by declaring a __slots__ attribute with no __dict__ entry (for example, collections.namedtuple uses that to ensure namedtuple instances are exactly the same size as ordinary tuples - the mapping from field names to tuple indices is maintained on the class). Cheers, Nick. P.S. Posting through Google Groups doesn't work properly - it messes up the reply headers completely. gmane does a better job of interoperating with the mailing list software (as far as I am aware, Google just don't care whether or not interaction with non-Google lists actually works) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mistersheik at gmail.com Sat Jun 7 08:36:40 2014 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 7 Jun 2014 02:36:40 -0400 Subject: [Python-ideas] =?utf-8?q?Put_default_setstate_and_getstate_on_obj?= =?utf-8?q?ect_for_use_in_co=C3=B6perative_inheritance=2E?= In-Reply-To: References: Message-ID: On Sat, Jun 7, 2014 at 2:18 AM, Nick Coghlan wrote: > On 7 June 2014 16:05, Neil Girdhar wrote: > > I use cooperative multiple inheritance throughout my (large-ish) project, > > and I find it very comfortable and powerful. I am currently using the > class > > below to serve as an anchor point. The thing is that this behavior is > > already implemented somewhere in Python (where?) since it is the default > > behaviour if getstate or setstate don't exist. Why not explicitly make > it > > available to call super? > > There is fallback behaviour in the pickle and copy modules that > doesn't rely on the getstate/setstate APIs. Those fallbacks are > defined by the protocols, not by the object model. 
> Those fallbacks are essentially default implementations of setstate and getstate. It seems to me like it would make sense to implement those fallbacks once rather than twice in the various places that you mention. > > https://docs.python.org/3/library/pickle.html#pickle-inst covers the > available protocols for instance pickling. > https://docs.python.org/3/library/copy.html covers (towards the end) > some of the options for making class instances copyable > Yes, personally, I prefer writing setstate and getstate and getting copy for free rather than writing a separate __copy__ method. > https://docs.python.org/3/library/copyreg.html is an additional > registry that allows third parties to make instances of classes > defined elsewhere support pickling and copying without relying on > monkeypatching. > copyreg is unfortunately no use for cooperative inheritance as far as I can see. The whole point is for each class to pickle what it needs to and delegate the rest of the pickling to super. > > > I think I saw or got an email from Guido that I can't seem to find that > > rightly points out that object doesn't have __dict__ so this can't be > done. > > I'm curious why object doesn't have __dict__? Where does the __dict__ > comes > > into existence? I assume that objects of type object and instantiated > > objects of other types have the same metaclass; does the metaclass treat > > them differently? > > Types defined in C extensions and those defined dynamically on the > heap share a metaclass at runtime, but their initialisation code is > different. You can also define Python level types without a __dict__ > by declaring a __slots__ attribute with no __dict__ entry (for > example, collections.namedtuple uses that to ensure namedtuple > instances are exactly the same size as ordinary tuples - the mapping > from field names to tuple indices is maintained on the class). > Very interesting, thanks for explaining what is happening. 
I don't see why __dict__ isn't just in object though. Is it just for the (minor) efficiency of saving an empty dict reference? > > Cheers, > Nick. > > P.S. Posting through Google Groups doesn't work properly - it messes > up the reply headers completely. gmane does a better job of > interoperating with the mailing list software (as far as I am aware, > Google just don't care whether or not interaction with non-Google > lists actually works) > Sorry, I'm just answering via email. I don't know anything about gmane. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Jun 7 10:41:11 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 7 Jun 2014 18:41:11 +1000 Subject: [Python-ideas] =?utf-8?q?Put_default_setstate_and_getstate_on_obj?= =?utf-8?q?ect_for_use_in_co=C3=B6perative_inheritance=2E?= In-Reply-To: References: Message-ID: On 7 Jun 2014 16:37, "Neil Girdhar" wrote: > On Sat, Jun 7, 2014 at 2:18 AM, Nick Coghlan wrote: >> >> On 7 June 2014 16:05, Neil Girdhar wrote: >> > I use cooperative multiple inheritance throughout my (large-ish) project, >> > and I find it very comfortable and powerful. I am currently using the class >> > below to serve as an anchor point. The thing is that this behavior is >> > already implemented somewhere in Python (where?) since it is the default >> > behaviour if getstate or setstate don't exist. Why not explicitly make it >> > available to call super? >> >> There is fallback behaviour in the pickle and copy modules that >> doesn't rely on the getstate/setstate APIs. Those fallbacks are >> defined by the protocols, not by the object model. > > > Those fallbacks are essentially default implementations of setstate and getstate. It seems to me like it would make sense to implement those fallbacks once rather than twice in the various places that you mention. 
As far as I am aware, it's not implemented in two places - I believe copy falls back to pickling & unpickling if there's no other copy operation defined. We don't try to jam everything into the base object, as library protocols are easier to evolve without breaking backwards compatibility. (For CPython, there's also the practical consideration that "object" methods have to be implemented in C, so having protocol fallbacks in the standard library sometimes makes them easier to work on). >> > I think I saw or got an email from Guido that I can't seem to find that >> > rightly points out that object doesn't have __dict__ so this can't be done. >> > I'm curious why object doesn't have __dict__? Where does the __dict__ come >> > into existence? I assume that objects of type object and instantiated >> > objects of other types have the same metaclass; does the metaclass treat >> > them differently? >> >> Types defined in C extensions and those defined dynamically on the >> heap share a metaclass at runtime, but their initialisation code is >> different. You can also define Python level types without a __dict__ >> by declaring a __slots__ attribute with no __dict__ entry (for >> example, collections.namedtuple uses that to ensure namedtuple >> instances are exactly the same size as ordinary tuples - the mapping >> from field names to tuple indices is maintained on the class). > > Very interesting, thanks for explaining what is happening. I don't see why __dict__ isn't just in object though. Is it just for the (minor) efficiency of saving an empty dict reference? A reference is a 64-bit pointer. That would be additional overhead on *every single object*. All ints, all strings, all tuples, all dicts(!), etc. Saving 8 bytes per object adds up fast, which is why a lot of the core types (including object itself) don't have a per-instance __dict__ attribute.
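[Editor's sketch, not part of the original mail: the __slots__ trade-off Nick describes is observable from pure Python. The class names here are made up; exact getsizeof numbers vary by CPython version and build.]

```python
import sys

class Plain:
    pass                     # instances lazily grow a per-instance __dict__

class Slotted:
    __slots__ = ('x',)       # fixed slots: no per-instance __dict__ at all

p, s = Plain(), Slotted()
print(hasattr(p, '__dict__'), hasattr(s, '__dict__'))   # True False
print(sys.getsizeof(p), sys.getsizeof(s))  # exact sizes vary by build
```

Assigning an undeclared attribute on a Slotted instance raises AttributeError, since there is no instance dict to put it in.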
Keeping objects as small as possible also impacts how many will fit in the CPU cache, so this approach can end up providing a speed increase as well. Cheers, Nick. > >> >> >> Cheers, >> Nick. >> >> P.S. Posting through Google Groups doesn't work properly - it messes >> up the reply headers completely. gmane does a better job of >> interoperating with the mailing list software (as far as I am aware, >> Google just don't care whether or not interaction with non-Google >> lists actually works) > > > Sorry, I'm just answering via email. I don't know anything about gmane. >> >> >> -- >> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Sat Jun 7 10:46:30 2014 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 7 Jun 2014 04:46:30 -0400 Subject: [Python-ideas] =?utf-8?q?Put_default_setstate_and_getstate_on_obj?= =?utf-8?q?ect_for_use_in_co=C3=B6perative_inheritance=2E?= In-Reply-To: References: Message-ID: On Sat, Jun 7, 2014 at 4:41 AM, Nick Coghlan wrote: > > On 7 Jun 2014 16:37, "Neil Girdhar" wrote: > > On Sat, Jun 7, 2014 at 2:18 AM, Nick Coghlan wrote: > >> > >> On 7 June 2014 16:05, Neil Girdhar wrote: > >> > I use cooperative multiple inheritance throughout my (large-ish) > project, > >> > and I find it very comfortable and powerful. I am currently using > the class > >> > below to serve as an anchor point. The thing is that this behavior is > >> > already implemented somewhere in Python (where?) since it is the > default > >> > behaviour if getstate or setstate don't exist. Why not explicitly > make it > >> > available to call super? > >> > >> There is fallback behaviour in the pickle and copy modules that > >> doesn't rely on the getstate/setstate APIs. Those fallbacks are > >> defined by the protocols, not by the object model. > > > > > > Those fallbacks are essentially default implementations of setstate and > getstate. 
It seems to me like it would make sense to implement those > fallbacks once rather than twice in the various places that you mention. > > As far as I am aware, it's not implemented in two places - I believe copy > falls back pickling & unpickling if there's no other copy operation defined. > > We don't try to jam everything into the base object, as library protocols > are easier to evolve without breaking backwards compatibility. (For > CPython, there's also the practical consideration that "object" methods > have to be implemented in C, so having protocol fallbacks in the standard > library sometimes makes them easier to work on). > I see your point. > >> > I think I saw or got an email from Guido that I can't seem to find > that > >> > rightly points out that object doesn't have __dict__ so this can't be > done. > >> > I'm curious why object doesn't have __dict__? Where does the > __dict__ comes > >> > into existence? I assume that objects of type object and instantiated > >> > objects of other types have the same metaclass; does the metaclass > treat > >> > them differently? > >> > >> Types defined in C extensions and those defined dynamically on the > >> heap share a metaclass at runtime, but their initialisation code is > >> different. You can also define Python level types without a __dict__ > >> by declaring a __slots__ attribute with no __dict__ entry (for > >> example, collections.namedtuple uses that to ensure namedtuple > >> instances are exactly the same size as ordinary tuples - the mapping > >> from field names to tuple indices is maintained on the class). > > > > > > Very interesting, thanks for explaining what is happening. I don't see > why __dict__ isn't just in object though. Is it just for the (minor) > efficiency of saving an empty dict reference? > > A reference is a 64-bit pointer. That would be additional overhead on > *every single object*. All ints, all strings, all tuples, all dicts(!), > etc. 
Saving 8 bytes per object adds up fast, which is why a lot of the core > types (including object itself) don't have a per-instance __dict__ > attribute. > > Keeping objects as small as possible also impacts how many will fit in the > CPU cache, so this approach can end up providing a speed increase as well. > Right, that makes sense. I think the flyweight pattern would eliminate this: use a special representation for the common case and then switch to a real representation as soon as things become weird. (I can see how that would be extra development time unless it could be done automatically by a clever JIT.) Best, Neil > Cheers, > Nick. > > > > >> > >> > >> Cheers, > >> Nick. > >> > >> P.S. Posting through Google Groups doesn't work properly - it messes > >> up the reply headers completely. gmane does a better job of > >> interoperating with the mailing list software (as far as I am aware, > >> Google just don't care whether or not interaction with non-Google > >> lists actually works) > > > > > > Sorry, I'm just answering via email. I don't know anything about gmane. > >> > >> > >> -- > >> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sat Jun 7 11:10:50 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 7 Jun 2014 19:10:50 +1000 Subject: [Python-ideas] =?iso-8859-1?q?Put_default_setstate_and_getstate_o?= =?iso-8859-1?q?n_object_for_use_in_co=F6perative_inheritance=2E?= In-Reply-To: References: <20140607051457.GN10355@ando> Message-ID: <20140607091050.GO10355@ando> On Sat, Jun 07, 2014 at 02:10:15AM -0400, Neil Girdhar wrote: > Hi Steven, > > If you don't know about getstate and setstate, I suggest you take a look at > the documentation: > https://docs.python.org/3.3/library/pickle.html#object.__getstate__. I know about getstate as it regards to pickle, that's why I asked if you were talking about serialization. 
Unfortunately you never mentioned pickle, or copy, you talked about cooperative inheritance which is a generic concept that applies much more broadly than just copying or serializing instances. > Besides allowing objects to be pickled, providing these methods allows > them to be copied with the copy module. objects can already be copied and pickled:

py> import copy, pickle
py> x = object()
py> copy.copy(x)

py> pickle.dumps(x)
b'\x80\x03cbuiltins\nobject\nq\x00)\x81q\x01.'

Copying and pickling are defined by protocols, not inheritance, so there's no need for a single root method. As the documentation states, you only need to define a __getstate__ and __setstate__ method when the default protocol behaviour is not sufficient for your class, so adding these methods to object is unnecessary. There's a historical reason for doing it this way: in Python 2, not everything inherits from object. -- Steven From turnbull at sk.tsukuba.ac.jp Sat Jun 7 11:16:39 2014 From: turnbull at sk.tsukuba.ac.jp (Stephen J. Turnbull) Date: Sat, 07 Jun 2014 18:16:39 +0900 Subject: [Python-ideas] =?utf-8?q?Put_default_setstate_and_getstate_on_obj?= =?utf-8?q?ect_for_use_in_co=C3=B6perative_inheritance=2E?= In-Reply-To: References: Message-ID: <87wqct13m0.fsf@uwakimon.sk.tsukuba.ac.jp> Neil Girdhar writes: > Sorry, I'm just answering via email. I don't know anything about gmane. Then please change the To: from @googlegroups to @python.org by hand. If that's too annoying to do every time, learn about GMane once. :-) From ncoghlan at gmail.com Sat Jun 7 11:34:03 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 7 Jun 2014 19:34:03 +1000 Subject: [Python-ideas] =?utf-8?q?Put_default_setstate_and_getstate_on_obj?= =?utf-8?q?ect_for_use_in_co=C3=B6perative_inheritance=2E?= In-Reply-To: References: Message-ID: On 7 June 2014 18:46, Neil Girdhar wrote: >
I think the flyweight pattern would eliminate > this: use a special representation for the common case and then switch to a > real representation as soon as things become weird. (I can see how that > would be extra development time unless it could be done automatically by a > clever JIT.) The flyweight pattern imposes its own costs in terms of additional levels of indirection and even more pointers to carry around. The approach we take is that object instances get a __dict__ attribute by default, unless the creator of the class decides "there are going to be enough of these for it to be worth skipping the space not only for the attribute dicts themselves, but also for the attribute dict reference on each instance". We do the same with weakref support. The other thing to keep in mind is that many of CPython's "internal" representations aren't actually internal: many of them are exposed in various ways through the CPython C API. As other implementations have discovered, preserving full compatibility with that API places some pretty significant constraints on the implementation techniques you use (or else means putting a lot of work into a compatibility shim layer like IronClad, JyNI or cpyext). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mistersheik at gmail.com Sat Jun 7 20:25:18 2014 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 7 Jun 2014 14:25:18 -0400 Subject: [Python-ideas] =?utf-8?q?Put_default_setstate_and_getstate_on_obj?= =?utf-8?q?ect_for_use_in_co=C3=B6perative_inheritance=2E?= In-Reply-To: <20140607091050.GO10355@ando> References: <20140607051457.GN10355@ando> <20140607091050.GO10355@ando> Message-ID: Hi Steven, Have you tried implementing getstate and setstate with cooperatively inherited classes? You'll need to call super().__getstate__(), which won't exist, but it really should. 
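A minimal sketch of the kind of root class I mean (the names are illustrative only, not an existing API -- this is roughly what the proposal would fold into object itself):

```python
import copy

class StateMixin:
    """Hypothetical root class that terminates the cooperative
    __getstate__/__setstate__ chain."""

    def __getstate__(self):
        return dict(getattr(self, '__dict__', {}))

    def __setstate__(self, state):
        self.__dict__.update(state)

class Cached(StateMixin):
    def __init__(self):
        self._cache = object()          # unpicklable scratch data

    def __getstate__(self):
        state = super().__getstate__()  # safe: the mixin terminates the chain
        state.pop('_cache', None)       # drop what shouldn't be serialized
        return state

    def __setstate__(self, state):
        super().__setstate__(state)
        self._cache = object()          # rebuild the scratch data

obj = Cached()
obj.value = 42
clone = copy.copy(obj)                  # the copy module uses the same protocol
print(clone.value)                      # -> 42
```

With the mixin in place, every class in the MRO can call super() unconditionally, and serializing and copying stay consistent.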
What I'm proposing is to move the default behaviour out of the pickle and copy internals to an object-level implementation of getstate and setstate, where I think it belongs. Regarding Guido's point that object doesn't have a dict, as weird as that is, I think a default getstate could just check for that with hasattr and, if it's missing, return the empty dict.

Best, Neil

On Sat, Jun 7, 2014 at 5:10 AM, Steven D'Aprano wrote:
> On Sat, Jun 07, 2014 at 02:10:15AM -0400, Neil Girdhar wrote:
> > Hi Steven,
> >
> > If you don't know about getstate and setstate, I suggest you take a look at
> > the documentation:
> > https://docs.python.org/3.3/library/pickle.html#object.__getstate__.
>
> I know about getstate as it regards to pickle, that's why I asked if you
> were talking about serialization. Unfortunately you never mentioned
> pickle or copy; you talked about cooperative inheritance, which is a
> generic concept that applies much more broadly than just copying or
> serializing instances.
>
> > Besides allowing objects to be pickled, providing these methods allows
> > them to be copied with the copy module.
>
> Objects can already be copied and pickled:
>
> py> import copy, pickle
> py> x = object()
> py> copy.copy(x)
> py> pickle.dumps(x)
> b'\x80\x03cbuiltins\nobject\nq\x00)\x81q\x01.'
>
> Copying and pickling are defined by protocols, not inheritance, so
> there's no need for a single root method. As the documentation states,
> you only need to define a __getstate__ and __setstate__ method when the
> default protocol behaviour is not sufficient for your class, so adding
> these methods to object is unnecessary.
>
> There's a historical reason for doing it this way: in Python 2, not
> everything inherits from object.
> > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/QkvOwa1-pHQ/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Sat Jun 7 20:34:48 2014 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 7 Jun 2014 14:34:48 -0400 Subject: [Python-ideas] =?utf-8?q?Put_default_setstate_and_getstate_on_obj?= =?utf-8?q?ect_for_use_in_co=C3=B6perative_inheritance=2E?= In-Reply-To: References: Message-ID: I understand your concern for cpython, but I don't think it will be the future of Python. I think every object should have a dict and then the JIT should just make it fast. I think that's possible. Anyway, this is a separate discussion. My new proposal is for setstate and getstate to have default implementations that first check for the __dict__ attribute and do the normal thing (getstate returns {}, setstate does nothing) if it doesn't exist. Best, Neil On Sat, Jun 7, 2014 at 5:34 AM, Nick Coghlan wrote: > On 7 June 2014 18:46, Neil Girdhar wrote: > > > > Right, that makes sense. I think the flyweight pattern would eliminate > > this: use a special representation for the common case and then switch > to a > > real representation as soon as things become weird. (I can see how that > > would be extra development time unless it could be done automatically by > a > > clever JIT.) 
> > The flyweight pattern imposes its own costs in terms of additional > levels of indirection and even more pointers to carry around. The > approach we take is that object instances get a __dict__ attribute by > default, unless the creator of the class decides "there are going to > be enough of these for it to be worth skipping the space not only for > the attribute dicts themselves, but also for the attribute dict > reference on each instance". We do the same with weakref support. > > The other thing to keep in mind is that many of CPython's "internal" > representations aren't actually internal: many of them are exposed in > various ways through the CPython C API. As other implementations have > discovered, preserving full compatibility with that API places some > pretty significant constraints on the implementation techniques you > use (or else means putting a lot of work into a compatibility shim > layer like IronClad, JyNI or cpyext). > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Sat Jun 7 21:07:32 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 07 Jun 2014 15:07:32 -0400 Subject: [Python-ideas] Expose `itertools.count.start` and implement `itertools.count.__eq__` based on it, like `range`. In-Reply-To: References: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com> Message-ID: On 6/6/2014 9:59 PM, Neil Girdhar wrote: > That would be great. What does 'that' refer too? Ram's original proposal, which you quoted, or Antony's counter-proposal, which you also quoted? Ambiguity is the cost of top-posting combined with over-quoting. Since I already explained what is wrong with Ram's proposal, I will delete it and assume you mean Antony's, which seems to not have arrived on my machine. 
> On Friday, May 16, 2014 12:16:52 AM UTC-4, Antony Lee wrote: > > Actually, a more reasonable solution would be to have range handle > keyword arguments and map "range(start=x)" to "count(x)". Having a parameter like 'start' mean two different things when passed by position and name is the sort of oddity we try to avoid. Since "a range object is an immutable, constant attribute, reiterable sequence object" (my earlier post), while count is an iterator, that does not literally work. So I will assume that you mean (looking ahead) 'an iterable sis_range such that iter(sis_range(n, None, step)) is the same as count(n, step)'. > Or, perhaps more simply, "range(x, None)" As an expression, stop=None is literally what you mean. The problem is that range is a finite sequence with a finite length and an indexable end. For instance, range(10)[-1] == 9. What is needed is a new semi_infinite_sequence base class 'SemiInfSeq' that allows (but not requires) infinite length: float('inf') or a new int('inf'). It would also have to disallow negative ints for indexing and slicing. Or perhaps a class factory is needed. Many infinite iterators whose items can be calculated from index n could be the iterator for a SIS subclass. A geometric series and the sequence of squares are other examples. This could be a PyPI package. -- Terry Jan Reedy From guido at python.org Sat Jun 7 21:12:58 2014 From: guido at python.org (Guido van Rossum) Date: Sat, 7 Jun 2014 12:12:58 -0700 Subject: [Python-ideas] =?utf-8?q?Put_default_setstate_and_getstate_on_obj?= =?utf-8?q?ect_for_use_in_co=C3=B6perative_inheritance=2E?= In-Reply-To: References: Message-ID: You haven't explained why you need this. You just stated a proposal. On Jun 7, 2014 12:06 PM, "Neil Girdhar" wrote: > I understand your concern for cpython, but I don't think it will be the > future of Python. I think every object should have a dict and then the JIT > should just make it fast. I think that's possible. 
> > Anyway, this is a separate discussion. My new proposal is for setstate > and getstate to have default implementations that first check for the > __dict__ attribute and do the normal thing (getstate returns {}, setstate > does nothing) if it doesn't exist. > > Best, Neil > > > On Sat, Jun 7, 2014 at 5:34 AM, Nick Coghlan wrote: > >> On 7 June 2014 18:46, Neil Girdhar wrote: >> > >> > Right, that makes sense. I think the flyweight pattern would eliminate >> > this: use a special representation for the common case and then switch >> to a >> > real representation as soon as things become weird. (I can see how that >> > would be extra development time unless it could be done automatically >> by a >> > clever JIT.) >> >> The flyweight pattern imposes its own costs in terms of additional >> levels of indirection and even more pointers to carry around. The >> approach we take is that object instances get a __dict__ attribute by >> default, unless the creator of the class decides "there are going to >> be enough of these for it to be worth skipping the space not only for >> the attribute dicts themselves, but also for the attribute dict >> reference on each instance". We do the same with weakref support. >> >> The other thing to keep in mind is that many of CPython's "internal" >> representations aren't actually internal: many of them are exposed in >> various ways through the CPython C API. As other implementations have >> discovered, preserving full compatibility with that API places some >> pretty significant constraints on the implementation techniques you >> use (or else means putting a lot of work into a compatibility shim >> layer like IronClad, JyNI or cpyext). >> >> Cheers, >> Nick. 
>> >> -- >> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Sat Jun 7 21:26:00 2014 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 7 Jun 2014 15:26:00 -0400 Subject: [Python-ideas] =?utf-8?q?Put_default_setstate_and_getstate_on_obj?= =?utf-8?q?ect_for_use_in_co=C3=B6perative_inheritance=2E?= In-Reply-To: References: Message-ID: Hi, Okay. In my project I have many classes multiply inheriting from each other. Most of these classes derive from "NetworkElement" and objects of this type are stored in a tree. I would now like to serialize the tree of objects so that I can save the state of the network. I would also like to instantiate copies of the tree so that I can rewind the state of the network to given checkpoints and play back the simulation. The easiest way to implement both serialize and copy in such a way that they are consistent (serialization and deserialization is equivalent to copy) is to implement setstate and getstate. In cooperative inheritance, the general pattern is to call super and do whatever is particular to your class around that. I needed to inherit from the mixin I displayed at the top of this message in order to provide a default setstate and getstate as these are not present in object. Intuitively, I think that it would be better for these to exist on object. I don't think I should have to provide these methods using a mixin. It's not a big deal, but I think it's a small wrinkle in Python not to have default implementations of these methods given that that default behavior is being done anyway. Are there any drawbacks to providing these default methods? 
Best, Neil On Sat, Jun 7, 2014 at 3:12 PM, Guido van Rossum wrote: > You haven't explained why you need this. You just stated a proposal. > On Jun 7, 2014 12:06 PM, "Neil Girdhar" wrote: > >> I understand your concern for cpython, but I don't think it will be the >> future of Python. I think every object should have a dict and then the JIT >> should just make it fast. I think that's possible. >> >> Anyway, this is a separate discussion. My new proposal is for setstate >> and getstate to have default implementations that first check for the >> __dict__ attribute and do the normal thing (getstate returns {}, setstate >> does nothing) if it doesn't exist. >> >> Best, Neil >> >> >> On Sat, Jun 7, 2014 at 5:34 AM, Nick Coghlan wrote: >> >>> On 7 June 2014 18:46, Neil Girdhar wrote: >>> > >>> > Right, that makes sense. I think the flyweight pattern would eliminate >>> > this: use a special representation for the common case and then switch >>> to a >>> > real representation as soon as things become weird. (I can see how >>> that >>> > would be extra development time unless it could be done automatically >>> by a >>> > clever JIT.) >>> >>> The flyweight pattern imposes its own costs in terms of additional >>> levels of indirection and even more pointers to carry around. The >>> approach we take is that object instances get a __dict__ attribute by >>> default, unless the creator of the class decides "there are going to >>> be enough of these for it to be worth skipping the space not only for >>> the attribute dicts themselves, but also for the attribute dict >>> reference on each instance". We do the same with weakref support. >>> >>> The other thing to keep in mind is that many of CPython's "internal" >>> representations aren't actually internal: many of them are exposed in >>> various ways through the CPython C API. 
As other implementations have >>> discovered, preserving full compatibility with that API places some >>> pretty significant constraints on the implementation techniques you >>> use (or else means putting a lot of work into a compatibility shim >>> layer like IronClad, JyNI or cpyext). >>> >>> Cheers, >>> Nick. >>> >>> -- >>> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >>> >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Sat Jun 7 21:39:31 2014 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 7 Jun 2014 12:39:31 -0700 (PDT) Subject: [Python-ideas] Make __reduce__ to correspond to __getnewargs_ex__ In-Reply-To: References: Message-ID: <7f41d45f-b1a9-413e-b57c-b2a83c01ff0c@googlegroups.com> Any comments on this? I ended up making reduce work using a metaclass: class KwargsNewMetaclass(type): """ This metaclass reimplements __reduce__ so that it tries to call __getnewargs_ex__. If that doesn't work, it falls back to __getnewargs__. In the first case, it will pass the keyword arguments to object.__new__. It also exposes a kwargs_new static method that can be overridden for use by __reduce__. 
""" @staticmethod def kwargs_new(cls, new_kwargs, *new_args): retval = cls.__new__(cls, *new_args, **new_kwargs) retval.__init__(*new_args, **new_kwargs) return retval def __new__(cls, name, bases, classdict): result = super().__new__(cls, name, bases, classdict) def __reduce__(self): try: getnewargs_ex = self.__getnewargs_ex__ except AttributeError: new_args, new_kwargs = (self.__getnewargs__(), {}) else: new_args, new_kwargs = getnewargs_ex() return (self.kwargs_new(cls), (type(self), new_kwargs,) + tuple(new_args), self.__getstate__()) result.__reduce__ = __reduce__ return result On Sunday, March 23, 2014 6:20:41 PM UTC-4, Neil Girdhar wrote: > > Currently __reduce__ > > returns up to five things: > > (1) self.__new__ (or a substitute) > (2) the result of __getnewargs__ > , > which returns a tuple of positional arguments for __new__ > , > (3) the result of __getstate__ > , > which returns an object to be passed to __setstate__ > > (4) an iterator of values for appending to a sequence > (5) an iterator of key-value pairs for setting on a string. > > Python 3.4 added the very useful (for me) __getnewargs_ex__ > , > which returns a pair: > (1) a tuple of positional arguments for __new__ > > (2) a dict of keyword arguments for __new__ > > > Therefore, I am proposing that __reduce__ return somehow these keyword > arguments for __new__. > > Best, > Neil > -------------- next part -------------- An HTML attachment was scrubbed... URL: From antony.lee at berkeley.edu Sat Jun 7 22:33:51 2014 From: antony.lee at berkeley.edu (Antony Lee) Date: Sat, 7 Jun 2014 13:33:51 -0700 Subject: [Python-ideas] Expose `itertools.count.start` and implement `itertools.count.__eq__` based on it, like `range`. 
In-Reply-To: References: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com> Message-ID: I agree that "range(start=x)" is awkward due to the unusual argument handling of "range", and you are also correct that I should have said "an iterable for which iter(range(x, None, step)) behaves as count(n, step)". I don't see an issue with len and negative indexing raising ValueError and IndexError, respectively. After all, a negative index on an infinite sequence is just as undefined. Antony 2014-06-07 12:07 GMT-07:00 Terry Reedy : > On 6/6/2014 9:59 PM, Neil Girdhar wrote: > >> That would be great. >> > > What does 'that' refer too? Ram's original proposal, which you quoted, or > Antony's counter-proposal, which you also quoted? Ambiguity is the cost of > top-posting combined with over-quoting. > > Since I already explained what is wrong with Ram's proposal, I will delete > it and assume you mean Antony's, which seems to not have arrived on my > machine. > > > On Friday, May 16, 2014 12:16:52 AM UTC-4, Antony Lee wrote: >> >> Actually, a more reasonable solution would be to have range handle >> keyword arguments and map "range(start=x)" to "count(x)". >> > > Having a parameter like 'start' mean two different things when passed by > position and name is the sort of oddity we try to avoid. > > Since "a range object is an immutable, constant attribute, reiterable > sequence object" (my earlier post), while count is an iterator, that does > not literally work. So I will assume that you mean (looking ahead) 'an > iterable sis_range such that iter(sis_range(n, None, step)) is the same as > count(n, step)'. > > > Or, perhaps more simply, "range(x, None)" >> > > As an expression, stop=None is literally what you mean. > > The problem is that range is a finite sequence with a finite length and an > indexable end. For instance, range(10)[-1] == 9. 
What is needed is a new > semi_infinite_sequence base class 'SemiInfSeq' that allows (but not > requires) infinite length: float('inf') or a new int('inf'). It would also > have to disallow negative ints for indexing and slicing. Or perhaps a class > factory is needed. > > Many infinite iterators whose items can be calculated from index n could > be the iterator for a SIS subclass. A geometric series and the sequence of > squares are other examples. This could be a PyPI package. > > -- > Terry Jan Reedy > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Jun 8 00:57:02 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 8 Jun 2014 08:57:02 +1000 Subject: [Python-ideas] =?utf-8?q?Put_default_setstate_and_getstate_on_obj?= =?utf-8?q?ect_for_use_in_co=C3=B6perative_inheritance=2E?= In-Reply-To: References: Message-ID: On 8 Jun 2014 05:26, "Neil Girdhar" wrote: > > In cooperative inheritance, the general pattern is to call super and do whatever is particular to your class around that. I needed to inherit from the mixin I displayed at the top of this message in order to provide a default setstate and getstate as these are not present in object. Intuitively, I think that it would be better for these to exist on object. I don't think I should have to provide these methods using a mixin. You haven't explained why you're trying to do cooperative multiple inheritance without a common base class to define the rules for your type system. Leaving that element out of a cooperative multiple inheritance design is generally a really bad idea. 
>It's not a big deal, but I think it's a small wrinkle in Python not to have default implementations of these methods given that that default behavior is being done anyway. Are there any drawbacks to providing these default methods? Yes - increased complexity in the language core. Currently, pickling is completely independent of the language core, so implementations can reuse the same pickling library (although they may want to write an accelerated version eventually). Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Sun Jun 8 01:09:30 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 08 Jun 2014 11:09:30 +1200 Subject: [Python-ideas] =?iso-8859-1?q?Put_default_setstate_and_getstate_o?= =?iso-8859-1?q?n_object_for_use_in_co=F6perative_inheritance=2E?= In-Reply-To: References: <20140607051457.GN10355@ando> <20140607091050.GO10355@ando> Message-ID: <53939BAA.5010602@canterbury.ac.nz> Neil Girdhar wrote: > Have you tried implementing getstate and setstate with cooperatively > inherited classes? You'll need to call super().__getstate__(), which > won't exist, but it really should. The same issue exists with *any* method that you use in a cooperative super call. You need to ensure that there is a class at the end of the MRO with a method that terminates the super call chain. It's obviously infeasible to add all such possible methods to class object. You will have to provide a *very* strong reason why __getstate__ and __setstate__ should be singled out for special treatment in this regard. -- Greg From ncoghlan at gmail.com Sun Jun 8 01:10:19 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 8 Jun 2014 09:10:19 +1000 Subject: [Python-ideas] Expose `itertools.count.start` and implement `itertools.count.__eq__` based on it, like `range`. 
In-Reply-To: References: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com> Message-ID:

On 8 Jun 2014 06:34, "Antony Lee" wrote:
>
> I agree that "range(start=x)" is awkward due to the unusual argument handling of "range", and you are also correct that I should have said "an iterable for which iter(range(x, None, step)) behaves as count(n, step)".
> I don't see an issue with len and negative indexing raising ValueError and IndexError, respectively. After all, a negative index on an infinite sequence is just as undefined.

This is missing the point. We already have a better abstraction for data sources of unknown (potentially infinite) length: the iterator protocol. Yes we *could* define an infinite sequence as "like a sequence, but almost all the operations that distinguish it from an iterator throw an exception", but that's a long way from making a case for why we *should*. Nobody in this thread has addressed key questions like the following:

* What is the actual use case for "infinite sequences"?
* How is the "infinite sequence" concept easier to teach, write & read than existing approaches based on the itertools module?
* How does claiming to provide a particular interface, and then throwing exceptions when you actually try to use it, provide a better API user experience than continuing with the status quo?

The bar for new syntax in Python is high, but the bar for new semantic concepts is even higher.

Regards,
Nick.

> Antony
>
> 2014-06-07 12:07 GMT-07:00 Terry Reedy :
>> On 6/6/2014 9:59 PM, Neil Girdhar wrote:
>>> That would be great.
>>
>> What does 'that' refer to? Ram's original proposal, which you quoted, or Antony's counter-proposal, which you also quoted? Ambiguity is the cost of top-posting combined with over-quoting.
>>
>> Since I already explained what is wrong with Ram's proposal, I will delete it and assume you mean Antony's, which seems to not have arrived on my machine.
>> >> >>> On Friday, May 16, 2014 12:16:52 AM UTC-4, Antony Lee wrote: >>> >>> Actually, a more reasonable solution would be to have range handle >>> keyword arguments and map "range(start=x)" to "count(x)". >> >> >> Having a parameter like 'start' mean two different things when passed by position and name is the sort of oddity we try to avoid. >> >> Since "a range object is an immutable, constant attribute, reiterable sequence object" (my earlier post), while count is an iterator, that does not literally work. So I will assume that you mean (looking ahead) 'an iterable sis_range such that iter(sis_range(n, None, step)) is the same as count(n, step)'. >> >> >>> Or, perhaps more simply, "range(x, None)" >> >> >> As an expression, stop=None is literally what you mean. >> >> The problem is that range is a finite sequence with a finite length and an indexable end. For instance, range(10)[-1] == 9. What is needed is a new semi_infinite_sequence base class 'SemiInfSeq' that allows (but not requires) infinite length: float('inf') or a new int('inf'). It would also have to disallow negative ints for indexing and slicing. Or perhaps a class factory is needed. >> >> Many infinite iterators whose items can be calculated from index n could be the iterator for a SIS subclass. A geometric series and the sequence of squares are other examples. This could be a PyPI package. >> >> -- >> Terry Jan Reedy >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From benjamin at python.org Sun Jun 8 01:35:44 2014 From: benjamin at python.org (Benjamin Peterson) Date: Sat, 7 Jun 2014 23:35:44 +0000 (UTC) Subject: [Python-ideas] =?utf-8?q?Put_default_setstate_and_getstate_on_obj?= =?utf-8?q?ect_for_use_in_co=C3=B6perative_inheritance=2E?= References: Message-ID: Nick Coghlan writes: > Yes - increased complexity in the language core. Currently, pickling is completely independent of the language core, so implementations can reuse the same pickling library (although they may want to write an accelerated version eventually). That's not really true considering the amount of pickle goop in typeobject.c and the fact that many builtin types implement their own pickling. From tjreedy at udel.edu Sun Jun 8 04:12:04 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 07 Jun 2014 22:12:04 -0400 Subject: [Python-ideas] Expose `itertools.count.start` and implement `itertools.count.__eq__` based on it, like `range`. In-Reply-To: References: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com> Message-ID: On 6/7/2014 4:33 PM, Antony Lee wrote: > I agree that "range(start=x)" is awkward due to the unusual argument > handling of "range", and you are also correct that I should have said > "an iterable for which iter(range(x, None, step)) behaves as count(n, > step)". > I don't see an issue with len and negative indexing raising ValueError > and IndexError, respectively. After all, a negative index on an > infinite sequence is just as undefined. The issue is doing that with range, which is defined to a sequence, and is registered as a collections.abc.Sequence. Break the promise of that definition, break code, people scream. Many functions properly require a sequence rather than just any iterable. 
from collections.abc import Sequence

def cross(seq):
    if not isinstance(seq, Sequence):
        raise TypeError("%r is not a collections.abc.Sequence" % type(seq))
    for first in seq:
        for second in seq:
            yield (first, second)

There is another issue with merely extending range -- its weird signature. Range(10) stops at 10; count(10) begins at 10. The signature of an iterable based on count should be based on count, not range.

Semi-infinite-sequence could be a well-defined category of classes. But it should start outside of the stdlib. It could be fun, and have niche uses, like teaching. But like Nick, I am dubious that it would add enough beyond having infinite iterators to warrant being in the stdlib. In any case, that would need to be proven with field experience.

-- Terry Jan Reedy

From mistersheik at gmail.com Sun Jun 8 04:47:06 2014
From: mistersheik at gmail.com (Neil Girdhar)
Date: Sat, 7 Jun 2014 22:47:06 -0400
Subject: [Python-ideas] =?utf-8?q?Put_default_setstate_and_getstate_on_obj?= =?utf-8?q?ect_for_use_in_co=C3=B6perative_inheritance=2E?=
In-Reply-To: <53939BAA.5010602@canterbury.ac.nz>
References: <20140607051457.GN10355@ando> <20140607091050.GO10355@ando> <53939BAA.5010602@canterbury.ac.nz>
Message-ID:

Good point. I would like to kindly retract my suggestion and thank everyone for their input. In implementing further, I realize that the default getstate I want returns {} while the current default getstate used when getstate doesn't exist returns self.__dict__. Therefore I need a superclass anyway.

Any comments on my other suggestion to modify __reduce__ so that it takes into account the __getnewargs_ex__ that was added in Python 3.4 would be much appreciated.

Best, Neil

On Sat, Jun 7, 2014 at 7:09 PM, Greg Ewing wrote:
> Neil Girdhar wrote:
>> Have you tried implementing getstate and setstate with cooperatively
>> inherited classes? You'll need to call super().__getstate__(), which won't
>> exist, but it really should.
>> > > The same issue exists with *any* method that you use in > a cooperative super call. You need to ensure that there is > a class at the end of the MRO with a method that terminates > the super call chain. > > It's obviously infeasible to add all such possible methods > to class object. You will have to provide a *very* strong > reason why __getstate__ and __setstate__ should be singled > out for special treatment in this regard. > > -- > Greg > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit https://groups.google.com/d/ > topic/python-ideas/QkvOwa1-pHQ/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Sun Jun 8 05:00:10 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 07 Jun 2014 20:00:10 -0700 Subject: [Python-ideas] Make __reduce__ to correspond to __getnewargs_ex__ In-Reply-To: <7f41d45f-b1a9-413e-b57c-b2a83c01ff0c@googlegroups.com> References: <7f41d45f-b1a9-413e-b57c-b2a83c01ff0c@googlegroups.com> Message-ID: <5393D1BA.6040008@stoneleaf.us> On 06/07/2014 12:39 PM, Neil Girdhar wrote: > > Any comments on this? __reduce__ is already a well-defined part of pickle. We also have __reduce_ex__, __getnewargs__, and now __getnewargs_ex__ -- why do we need __reduce__ to do the same thing as __getnewargs_ex__? 
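For reference, __getnewargs_ex__ on its own already handles keyword arguments to __new__. A sketch, assuming a Python version whose default pickle protocol honours __getnewargs_ex__ (the Point class is purely illustrative):

```python
import pickle

class Point:
    # __new__ takes a keyword-only argument, so the positional-only
    # __getnewargs__ would not be enough to recreate instances.
    def __new__(cls, x, *, y):
        self = super().__new__(cls)
        self.x, self.y = x, y
        return self

    def __getnewargs_ex__(self):
        # (args, kwargs) that pickle passes to __new__ on unpickling
        return (self.x,), {'y': self.y}

p = pickle.loads(pickle.dumps(Point(1, y=2)))
print(p.x, p.y)  # -> 1 2
```

No __reduce__ override or metaclass fiddling is needed for this case, which is the point of Ethan's question.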
-- ~Ethan~ From mistersheik at gmail.com Sun Jun 8 08:52:29 2014 From: mistersheik at gmail.com (Neil Girdhar) Date: Sun, 8 Jun 2014 02:52:29 -0400 Subject: [Python-ideas] Make __reduce__ to correspond to __getnewargs_ex__ In-Reply-To: <5393D1BA.6040008@stoneleaf.us> References: <7f41d45f-b1a9-413e-b57c-b2a83c01ff0c@googlegroups.com> <5393D1BA.6040008@stoneleaf.us> Message-ID: Was my proposal clear? If reduce isn't updated to return the keyword arguments then it is not compatible with classes that require keyword arguments without some serious metaclass fiddling as far as I can tell. On Sat, Jun 7, 2014 at 11:00 PM, Ethan Furman wrote: > On 06/07/2014 12:39 PM, Neil Girdhar wrote: > >> >> Any comments on this? >> > > __reduce__ is already a well-defined part of pickle. > > We also have __reduce_ex__, __getnewargs__, and now __getnewargs_ex__ -- > why do we need __reduce__ to do the same thing as __getnewargs_ex__? > > -- > ~Ethan~ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit https://groups.google.com/d/ > topic/python-ideas/zohH2BCtYzY/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Sun Jun 8 12:03:20 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 8 Jun 2014 20:03:20 +1000 Subject: [Python-ideas] Make __reduce__ to correspond to __getnewargs_ex__ In-Reply-To: References: <7f41d45f-b1a9-413e-b57c-b2a83c01ff0c@googlegroups.com> <5393D1BA.6040008@stoneleaf.us> Message-ID: On 8 Jun 2014 16:53, "Neil Girdhar" wrote: > > Was my proposal clear? If reduce isn't updated to return the keyword arguments then it is not compatible with classes that require keyword arguments without some serious metaclass fiddling as far as I can tell. Those classes shouldn't use reduce for their pickling support. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ned at nedbatchelder.com Sun Jun 8 13:45:41 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Sun, 08 Jun 2014 07:45:41 -0400 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com> <5383DBE8.6020309@nedbatchelder.com> Message-ID: <53944CE5.50009@nedbatchelder.com> On 5/26/14 10:40 PM, Nick Coghlan wrote: > > > On 27 May 2014 10:28, "Ned Batchelder" > wrote: > > > > On 5/23/14 1:22 PM, Guido van Rossum wrote: > >> > >> On Fri, May 23, 2014 at 10:17 AM, Eric Snow > > wrote: > >>> > >>> On Fri, May 23, 2014 at 10:49 AM, Guido van Rossum > > wrote: > > >>> > >>> Would it be a problem if .pyc files weren't generated or used (a la -B > >>> or PYTHONDONTWRITEBYTECODE) when you ran coverage? > >> > >> > >> In first approximation that would probably be okay, although it > would make coverage even slower. I was envisioning something where it > would still use, but not write, pyc files for the stdlib or > site-packages, because the code in whose coverage I am interested is > puny compared to the stdlib code it imports. 
> > > > I was concerned about losing any time in test suites that are > already considered too slow. But I tried to do some controlled > measurements of these scenarios, and found the worst case (no .pyc > available, and none written) was only 2.8% slower than full .pyc files > available. When I tried to measure stdlib .pyc's available, and no > .pyc's for my code, the results were actually very slightly faster > than the typical case. I think this points to the difficulty in > controlling all the variables! > > > > In any case, it seems that the penalty for avoiding the .pyc files > is not burdensome. > > Along these lines, how about making the environment variable something > like "PYTHONANALYSINGSOURCE" with the effects: > > - bytecode files are neither read nor written > - all bytecode and AST optimisations are disabled > > A use case oriented flag like that lets us tweak the definition as > needed in the future, unlike an option that is specific to turning off > the CPython peephole optimiser (e.g. we don't have an AST optimiser > yet, but turning it off would still be covered by an "analysing > source" flag). > My inclination would still be to provide separate controls like "DISABLE_OPTIMIZATIONS" and "DISABLE_BYTECODE"; these are power tools in any case. What is the process from this point forward? A patch? A PEP? --Ned. > > Cheers, > Nick. > > >> > >> > >> -- > >> --Guido van Rossum (python.org/~guido ) > >> > >> > >> _______________________________________________ > >> Python-ideas mailing list > >> Python-ideas at python.org > >> https://mail.python.org/mailman/listinfo/python-ideas > >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Sun Jun 8 16:18:58 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 9 Jun 2014 00:18:58 +1000 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: <53944CE5.50009@nedbatchelder.com> References: <537C888D.7060903@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com> <5383DBE8.6020309@nedbatchelder.com> <53944CE5.50009@nedbatchelder.com> Message-ID: On 8 Jun 2014 21:45, "Ned Batchelder" wrote: > > On 5/26/14 10:40 PM, Nick Coghlan wrote: >> >> >> On 27 May 2014 10:28, "Ned Batchelder" wrote: >> > >> > On 5/23/14 1:22 PM, Guido van Rossum wrote: >> >> >> >> On Fri, May 23, 2014 at 10:17 AM, Eric Snow < ericsnowcurrently at gmail.com> wrote: >> >>> >> >>> On Fri, May 23, 2014 at 10:49 AM, Guido van Rossum wrote: >> >> >>> >> >>> Would it be a problem if .pyc files weren't generated or used (a la -B >> >>> or PYTHONDONTWRITEBYTECODE) when you ran coverage? >> >> >> >> >> >> In first approximation that would probably be okay, although it would make coverage even slower. I was envisioning something where it would still use, but not write, pyc files for the stdlib or site-packages, because the code in whose coverage I am interested is puny compared to the stdlib code it imports. >> > >> > >> > I was concerned about losing any time in test suites that are already considered too slow. But I tried to do some controlled measurements of these scenarios, and found the worst case (no .pyc available, and none written) was only 2.8% slower than full .pyc files available. When I tried to measure stdlib .pyc's available, and no .pyc's for my code, the results were actually very slightly faster than the typical case. I think this points to the difficult in controlling all the variables! >> > >> > In any case, it seems that the penalty for avoiding the .pyc files is not burdensome. 
>> >> Along these lines, how about making the environment variable something like "PYTHONANALYSINGSOURCE" with the effects: >> >> - bytecode files are neither read nor written >> - all bytecode and AST optimisations are disabled >> >> A use case oriented flag like that lets us tweak the definition as needed in the future, unlike an option that is specific to turning off the CPython peephole optimiser (e.g. we don't have an AST optimiser yet, but turning it off would still be covered by an "analysing source" flag). > > > My inclination would still be to provide separate controls like "DISABLE_OPTIMIZATIONS" and "DISABLE_BYTECODE", these are power tools in any case. What is the process from this point forward? A patch? A PEP? A PEP would help ensure the use cases are clearly documented and properly covered by the chosen solution. It will also help cover all the incidental details (like the impact on cache tags). But either a patch or a PEP would get it moving - the main risk in going direct to a patch is the potential for needing to rework the design. Cheers, Nick. > > --Ned. > >> Cheers, >> Nick. >> >> >> >> >> >> >> -- >> >> --Guido van Rossum (python.org/~guido) >> >> >> >> >> >> _______________________________________________ >> >> Python-ideas mailing list >> >> Python-ideas at python.org >> >> https://mail.python.org/mailman/listinfo/python-ideas >> >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > >> > >> > >> > _______________________________________________ >> > Python-ideas mailing list >> > Python-ideas at python.org >> > https://mail.python.org/mailman/listinfo/python-ideas >> > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mistersheik at gmail.com Sun Jun 8 19:59:22 2014 From: mistersheik at gmail.com (Neil Girdhar) Date: Sun, 8 Jun 2014 13:59:22 -0400 Subject: [Python-ideas] Make __reduce__ to correspond to __getnewargs_ex__ In-Reply-To: References: <7f41d45f-b1a9-413e-b57c-b2a83c01ff0c@googlegroups.com> <5393D1BA.6040008@stoneleaf.us> Message-ID: Of course they should? What should they use? On Sun, Jun 8, 2014 at 6:03 AM, Nick Coghlan wrote: > > On 8 Jun 2014 16:53, "Neil Girdhar" wrote: > > > > Was my proposal clear? If reduce isn't updated to return the keyword > arguments then it is not compatible with classes that require keyword > arguments without some serious metaclass fiddling as far as I can tell. > > Those classes shouldn't use reduce for their pickling support. > > Cheers, > Nick. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Sun Jun 8 21:13:41 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 08 Jun 2014 12:13:41 -0700 Subject: [Python-ideas] Make __reduce__ to correspond to __getnewargs_ex__ In-Reply-To: References: <7f41d45f-b1a9-413e-b57c-b2a83c01ff0c@googlegroups.com> <5393D1BA.6040008@stoneleaf.us> Message-ID: <5394B5E5.3030508@stoneleaf.us> On 06/08/2014 10:59 AM, Neil Girdhar wrote: > On Sun, Jun 8, 2014 at 6:03 AM, Nick Coghlan wrote: >> >> Those classes shouldn't use reduce for their pickling support. > > Of course they should? What should they use? They should use the methods that make sense. Pickling is a protocol. It will use (in, I believe, this order):

    __getnewargs_ex__
    __getnewargs__
    __reduce_ex__
    __reduce__

I'm not sure where __getstate__ and __setstate__ fit in, and happily I don't need to unless I'm subclassing something that makes use of them. Anyway, back to the story. When you call pickle.dump, that code will look for the most advanced method available on the object you are trying to pickle, and use it. 
(Well, the most advanced method for the protocol version you have selected, that's available on the object.) So, if you selected protocol 2, then the pickle code will look for __getnewargs__, but not __getnewargs_ex__, as __getnewargs_ex__ isn't available until protocol 4. This is similar to iterating: first choice for iterating is to call an object's __iter__ method, but if there isn't one Python will fall back to using __getitem__ using integers from 0 until IndexError is raised. -- ~Ethan~ From mistersheik at gmail.com Mon Jun 9 00:18:00 2014 From: mistersheik at gmail.com (Neil Girdhar) Date: Sun, 8 Jun 2014 18:18:00 -0400 Subject: [Python-ideas] Make __reduce__ to correspond to __getnewargs_ex__ In-Reply-To: <5394B5E5.3030508@stoneleaf.us> References: <7f41d45f-b1a9-413e-b57c-b2a83c01ff0c@googlegroups.com> <5393D1BA.6040008@stoneleaf.us> <5394B5E5.3030508@stoneleaf.us> Message-ID: On Sun, Jun 8, 2014 at 3:13 PM, Ethan Furman wrote: > On 06/08/2014 10:59 AM, Neil Girdhar wrote: > >> On Sun, Jun 8, 2014 at 6:03 AM, Nick Coghlan wrote: >>> >>> Those classes shouldn't use reduce for their pickling support. >>> >> >> Of course they should? What should they use? >> > > They should use the methods that make sense. Pickling is a protocol. It > will use (in, I believe, this order): > > __getnewargs_ex__ > > __getnewargs__ > > __reduce_ex__ > > __reduce__ > In fact, the reduce functions come first. The order is:

    reduce_ex
    reduce
    getnewargs_ex and getstate (3.4+)
    getnewargs_ex (3.4+)
    getnewargs and getstate
    getnewargs
    getstate
    default code that copies/pickles the dict

Up until 3.3, a "default" reduce function might be:

    def __reduce__(self):
        return type(self), self.__getnewargs__(), self.__getstate__()

It's no longer possible to write a default reduce function in 3.4 given the addition of __getnewargs_ex__. What I am suggesting is to add a new protocol so that reduce returns the keyword arguments. 
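For what it's worth, the effect Neil is describing can be approximated today without a new __reduce__ protocol, by routing the keyword arguments through a module-level reconstructor. A minimal sketch (the names _reconstruct and Point are illustrative, not part of pickle):

```python
import pickle

def _reconstruct(cls, args, kwargs):
    # Module-level so pickle can locate it by qualified name.
    return cls(*args, **kwargs)

class Point:
    def __init__(self, *, x, y):  # keyword-only constructor
        self.x, self.y = x, y

    def __reduce__(self):
        # The (callable, args) form of __reduce__ is positional-only,
        # so tunnel the keyword arguments through the helper.
        return _reconstruct, (type(self), (), {"x": self.x, "y": self.y})

p = pickle.loads(pickle.dumps(Point(x=1, y=2)))
```

The cost is an extra global reference in every pickle, which is roughly why a first-class keyword-aware reduce protocol keeps coming up.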
Best, Neil > > I'm not sure where __getstate__ and __setstate__ fit in, and happily I > don't need to unless I'm subclassing something that makes use of them. > You really should find out where they fit in before you reply to a message about them :) > > Anyway, back to the story. > > When you call pickle.dump, that code will look for the most advanced > method available on the object you are trying to pickle, and use it. > (Well, the most advanced method for the protocol version you have > selected, that's available on the object.) So, if you selected protocol 2, > then the pickle code will look for __getnewargs__, but not > __getnewargs_ex__, as __getnewargs_ex__ isn't available until protocol 4. > > This is similar to iterating: first choice for iterating is to call an > object's __iter__ method, but if there isn't one Python will fall back to > using __getitem__ using integers from 0 until IndexError is raised. > > > -- > ~Ethan~ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit https://groups.google.com/d/ > topic/python-ideas/zohH2BCtYzY/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mistersheik at gmail.com Tue Jun 10 08:04:58 2014 From: mistersheik at gmail.com (Neil Girdhar) Date: Mon, 9 Jun 2014 23:04:58 -0700 (PDT) Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index` In-Reply-To: References: <20140505171538.GR4273@ando> <20140506023902.GV4273@ando> Message-ID: <18aa148c-f6cb-414c-9f6c-de5011568204@googlegroups.com> I really like this and hope that it eventually makes it into the stdlib. It's also a good argument for your other suggestion whereby some of the itertools would return Iterables rather than Iterators, as range does. Best, Neil On Wednesday, May 7, 2014 1:43:20 PM UTC-4, Ram Rachum wrote: > > I'm probably going to implement it in my python_toolbox package. I already > implemented 30% and it's really cool. It's at the point where I doubt that > I want it in the stdlib because I've gotten so much awesome functionality > into it and I'd hate to (a) have 80% of it stripped and (b) have the class > names changed to be non-Pythonic :) > > > On Wed, May 7, 2014 at 8:40 PM, Tal Einat > > wrote: > >> On Wed, May 7, 2014 at 8:21 PM, Ram Rachum > > wrote: >> > Hi Tal, >> > >> > I'm using it for a project of my own (optimizing keyboard layout) but I >> > can't make the case that it's useful for the stdlib. I'd understand if >> it >> > would be omitted for not being enough of a common need. >> >> At the least, this (a function for getting a specific permutation by >> lexicographical-order index) could make a nice cookbook recipe. >> >> - Tal >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
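The cookbook recipe Tal mentions might look roughly like this, using the factorial number system to pick each element (a sketch; nth_permutation is a hypothetical name, not an itertools API):

```python
from math import factorial

def nth_permutation(seq, index):
    """Return the permutation of seq at the given lexicographic index."""
    pool = list(seq)
    n = len(pool)
    if not 0 <= index < factorial(n):
        raise IndexError("permutation index out of range")
    result = []
    for i in range(n - 1, -1, -1):
        # The leading factorial "digit" selects the next element.
        digit, index = divmod(index, factorial(i))
        result.append(pool.pop(digit))
    return tuple(result)
```

This agrees with the ordering of itertools.permutations: index 0 gives the sorted arrangement and index factorial(n) - 1 the reversed one.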
URL: From mistersheik at gmail.com Tue Jun 10 08:15:45 2014 From: mistersheik at gmail.com (Neil Girdhar) Date: Mon, 9 Jun 2014 23:15:45 -0700 (PDT) Subject: [Python-ideas] Empty set, Empty dict In-Reply-To: References: Message-ID: <1e8947ea-2eb4-4226-ad80-2005e6f3e537@googlegroups.com> I've seen this proposed before, and I personally would love this, but my guess is that it breaks too much code for too little gain. On Wednesday, May 21, 2014 12:33:30 PM UTC-4, Frédéric Legembre wrote:
>
> Now   | Future |
> ----------------------------------------------------
> ()    | ()     | empty tuple ( 1, 2, 3 )
> []    | []     | empty list [ 1, 2, 3 ]
> set() | {}     | empty set { 1, 2, 3 }
> {}    | {:}    | empty dict { 1:a, 2:b, 3:c }
>
-------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Tue Jun 10 09:59:54 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 10 Jun 2014 09:59:54 +0200 Subject: [Python-ideas] Empty set, Empty dict In-Reply-To: <1e8947ea-2eb4-4226-ad80-2005e6f3e537@googlegroups.com> References: <1e8947ea-2eb4-4226-ad80-2005e6f3e537@googlegroups.com> Message-ID: 2014-06-10 8:15 GMT+02:00 Neil Girdhar : > I've seen this proposed before, and I personally would love this, but my > guess is that it breaks too much code for too little gain. > > On Wednesday, May 21, 2014 12:33:30 PM UTC-4, Frédéric Legembre wrote:
>>
>> Now   | Future |
>> ----------------------------------------------------
>> ()    | ()     | empty tuple ( 1, 2, 3 )
>> []    | []     | empty list [ 1, 2, 3 ]
>> set() | {}     | empty set { 1, 2, 3 }
>> {}    | {:}    | empty dict { 1:a, 2:b, 3:c }

Your guess is right. It will break all Python 2 and Python 3 in the world. Technically, set((1, 2)) is different than {1, 2}: the first creates a tuple and loads the global name "set" (which can be replaced at runtime!), whereas the latter uses bytecode and only stores values (numbers 1 and 2). It would be nice to have a syntax for empty set, but {} is a no-no. 
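Victor's bytecode point is easy to verify with compile() and the dis module. A small check (exact opcode names vary across CPython versions, so the assertions below only inspect co_names):

```python
import dis

# A set literal is built by a dedicated opcode and needs no name lookup,
# while set((1, 2)) must resolve the global name "set" at runtime.
literal = compile("{1, 2}", "<example>", "eval")
call = compile("set((1, 2))", "<example>", "eval")

dis.dis(literal)  # shows a BUILD_SET-style opcode, no name load
dis.dis(call)     # shows a name load for "set" before the call
```

Only the second form can be affected by rebinding the name set, which is the runtime-replacement hook Victor alludes to.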
Victor From wichert at wiggy.net Tue Jun 10 11:07:42 2014 From: wichert at wiggy.net (Wichert Akkerman) Date: Tue, 10 Jun 2014 11:07:42 +0200 Subject: [Python-ideas] Empty set, Empty dict Message-ID: Victor Stinner wrote: 2014-06-10 8:15 GMT+02:00 Neil Girdhar : > > I've seen this proposed before, and I personally would love this, but my > > guess is that it breaks too much code for too little gain. > > > > On Wednesday, May 21, 2014 12:33:30 PM UTC-4, Fr?d?ric Legembre wrote: > >> > >> > >> Now | Future | > >> ---------------------------------------------------- > >> () | () | empty tuple ( 1, 2, 3 ) > >> [] | [] | empty list [ 1, 2, 3 ] > >> set() | {} | empty set { 1, 2, 3 } > >> {} | {:} | empty dict { 1:a, 2:b, 3:c } > > > Your guess is right. It will break all Python 2 and Python 3 in the world. > > Technically, set((1, 2)) is different than {1, 2}: the first creates a > tuple and loads the global name "set" (which can be replaced at > runtime!), whereas the later uses bytecode and only store values > (numbers 1 and 2). > > It would be nice to have a syntax for empty set, but {} is a no-no. Perhaps {,} would be a possible spelling. For consistency you might want to allow (,) to create an empty tuple as well; personally I would find that more intuitive that (()). Wichert. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rymg19 at gmail.com Tue Jun 10 18:25:09 2014 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Tue, 10 Jun 2014 11:25:09 -0500 Subject: [Python-ideas] Empty set, Empty dict In-Reply-To: References: Message-ID: +1 for using {,}. On Tue, Jun 10, 2014 at 4:07 AM, Wichert Akkerman wrote: > Victor Stinner wrote: > > > 2014-06-10 8:15 GMT+02:00 Neil Girdhar >: > > >* I've seen this proposed before, and I personally would love this, but my > *>* guess is that it breaks too much code for too little gain. 
> >> On Wednesday, May 21, 2014 12:33:30 PM UTC-4, Frédéric Legembre wrote:
> >>
> >> Now   | Future |
> >> ----------------------------------------------------
> >> ()    | ()     | empty tuple ( 1, 2, 3 )
> >> []    | []     | empty list [ 1, 2, 3 ]
> >> set() | {}     | empty set { 1, 2, 3 }
> >> {}    | {:}    | empty dict { 1:a, 2:b, 3:c }
>
> Your guess is right. It will break all Python 2 and Python 3 in the world. > > Technically, set((1, 2)) is different than {1, 2}: the first creates a > tuple and loads the global name "set" (which can be replaced at > runtime!), whereas the latter uses bytecode and only stores values > (numbers 1 and 2). > > It would be nice to have a syntax for empty set, but {} is a no-no. > > > Perhaps {,} would be a possible spelling. For consistency you might want > to allow (,) to create an empty tuple as well; personally I would find that > more intuitive than (()). > > Wichert. > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Ryan If anybody ever asks me why I prefer C++ to C, my answer will be simple: "It's becauseslejfp23(@#Q*(E*EIdc-SEGFAULT. Wait, I don't think that was nul-terminated." -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mertz at gnosis.cx Tue Jun 10 18:39:56 2014 From: mertz at gnosis.cx (David Mertz) Date: Tue, 10 Jun 2014 09:39:56 -0700 Subject: [Python-ideas] Empty set, Empty dict In-Reply-To: <1e8947ea-2eb4-4226-ad80-2005e6f3e537@googlegroups.com> References: <1e8947ea-2eb4-4226-ad80-2005e6f3e537@googlegroups.com> Message-ID: > > On Wednesday, May 21, 2014 12:33:30 PM UTC-4, Frédéric Legembre wrote:
>>
>> Now   | Future |
>> ----------------------------------------------------
>> ()    | ()     | empty tuple ( 1, 2, 3 )
>> []    | []     | empty list [ 1, 2, 3 ]
>> set() | {}     | empty set { 1, 2, 3 }
>> {}    | {:}    | empty dict { 1:a, 2:b, 3:c }
> >
This is *exactly* what I would want if I were designing a language from scratch. It's obvious, readable, etc. However, it also breaks every single instance of 'newdict = {}' in Python code, which is a very common idiom. Unfortunately, I don't really like the empty-set literal proposed in the thread: '{,}'. It saves two characters over 'set()', but is not intuitive to me. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Jun 10 18:39:56 2014 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Jun 2014 09:39:56 -0700 Subject: [Python-ideas] Empty set, Empty dict In-Reply-To: References: Message-ID: No. Jeez. :-( On Tue, Jun 10, 2014 at 9:25 AM, Ryan Gonzalez wrote: > +1 for using {,}. > > > On Tue, Jun 10, 2014 at 4:07 AM, Wichert Akkerman > wrote: > >> Victor Stinner wrote: >> >> >> 2014-06-10 8:15 GMT+02:00 Neil Girdhar >: >> >> >> >> > I've seen this proposed before, and I personally would love this, but my >> > guess is that it breaks too much code for too little gain. 
>> >> On Wednesday, May 21, 2014 12:33:30 PM UTC-4, Frédéric Legembre wrote:
>> >>
>> >> Now   | Future |
>> >> ----------------------------------------------------
>> >> ()    | ()     | empty tuple ( 1, 2, 3 )
>> >> []    | []     | empty list [ 1, 2, 3 ]
>> >> set() | {}     | empty set { 1, 2, 3 }
>> >> {}    | {:}    | empty dict { 1:a, 2:b, 3:c }
>>
>> Your guess is right. It will break all Python 2 and Python 3 in the world. >> >> Technically, set((1, 2)) is different than {1, 2}: the first creates a >> tuple and loads the global name "set" (which can be replaced at >> runtime!), whereas the latter uses bytecode and only stores values >> (numbers 1 and 2). >> >> It would be nice to have a syntax for empty set, but {} is a no-no. >> >> >> Perhaps {,} would be a possible spelling. For consistency you might want >> to allow (,) to create an empty tuple as well; personally I would find that >> more intuitive than (()). >> >> Wichert. >> >> >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > > > -- > Ryan > If anybody ever asks me why I prefer C++ to C, my answer will be simple: > "It's becauseslejfp23(@#Q*(E*EIdc-SEGFAULT. Wait, I don't think that was > nul-terminated." > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mistersheik at gmail.com Fri Jun 13 06:07:56 2014 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 12 Jun 2014 21:07:56 -0700 (PDT) Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python Message-ID: I was wondering what work is being done on Python to make it faster. I understand that cpython is incrementally improved. I'm not sure, but I think that pypy acceleration works by compiling a restricted set of Python. And I think I heard something about Guido working on a different model for accelerating Python. I apologize in advance that I didn't look into these projects in a lot of detail. My number one dream about computer languages is for me to be able to write in a language as easy as Python and have it run as quickly as if it were written. I do believe that this is possible (since in theory someone could look at my Python code and port it to C++). Unfortunately, I don't have time to work on this goal, but I still wanted to get feedback about some ideas I have about reaching this goal. First, I don't think it's important for a "code block" (say, a small section of code with less coupling to statements outside the block than to within the block) to run quickly on its first iteration. What I'm suggesting instead is for every iteration of a "code block", the runtime stochastically decides whether to collect statistics about that iteration. Those statistics include the the time running the block, the time perform attribute accesses including type method lookups and so on. Basically, the runtime is trying to guess the potential savings of optimizing this block. If the block is run many times and the potential savings are large, then stochastically again, the block is promoted to a second-level statistics collection. This level collects statistics about all of the external couplings of the block, like the types and values of the passed-in and returned values. 
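The stochastic sampling Neil proposes in this thread can be sketched in miniature in plain Python (purely illustrative; CallSiteStats is a hypothetical name, and a real implementation would live inside the interpreter loop rather than in a wrapper):

```python
import random
from collections import Counter

class CallSiteStats:
    """Sample a fraction of calls and tally argument-type signatures."""

    def __init__(self, func, sample_rate=0.1, promote_after=100):
        self.func = func
        self.sample_rate = sample_rate      # probability a call is sampled
        self.promote_after = promote_after  # threshold for specialization
        self.type_counts = Counter()

    def __call__(self, *args):
        # Stochastically decide whether to record this iteration.
        if random.random() < self.sample_rate:
            self.type_counts[tuple(type(a).__name__ for a in args)] += 1
        return self.func(*args)

    def hot_signatures(self):
        # Signatures seen often enough to justify a precompiled version.
        return [sig for sig, n in self.type_counts.items()
                if n >= self.promote_after]

# Example: profile a min-like call site (sample_rate=1.0 for determinism).
profiled_min = CallSiteStats(min, sample_rate=1.0, promote_after=3)
for _ in range(5):
    profiled_min(1, 2, 3)
```

In the scheme described here, a signature appearing in hot_signatures() is what would trigger generation of a guarded, type-specialized version of the block.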
Using the second-level statistics, the runtime can now guess whether the block should be promoted to a third level whereby any consistencies are exploited. For example, if the passed-in parameter types and return value type of the "min" function are (int, int, int) for 40% of the statistics and (float, float, float) for 59%, and other types for the remaining 1%, then two precompiled versions of min are generated: one for int and one for float. These precompiled code blocks have different costs than regular Python blocks. They need to pay the following costs: * a check for the required invariants (parameter types above, but it could be parameter values, or other invariants) * they need to install hooks on objects that must remain invariant during the execution of the block; if the invariants are ever violated during the execution of the block, then all of the computations done during this execution of the block must be discarded * therefore a third cost is the probability of discarded the computation times the average cost of the doing the wasted computation. The saving is that the code block * can be transformed into a faster bytecode, which includes straight assembly instructions in some sections since types or values can now be assumed, * can use data structures that make type or access assumptions (for example a list that always contains ints can use a flattened representation; a large set that is repeatedly having membership checked with many negative results might benefit from an auxiliary bloom filter, etc.) In summary the runtime performs stochastic, incremental promotion of code blocks from first-level, to second-level, to multiple precompiled versions. It can also demote a code block. The difference between the costs of the different levels is statistically estimated. Examples of optimizations that can be commonly accomplished using such a system are: * global variables are folded into code as constants. 
(Even if they change rarely, you pay the discarding penalty described above plus the recompilation cost; the benefit of inline use of the constant (and any constant folding) might outweigh these costs.) * lookup of member functions, which almost never change * flattening of homogeneously-typed lists Best, Neil -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Fri Jun 13 07:52:28 2014 From: mertz at gnosis.cx (David Mertz) Date: Thu, 12 Jun 2014 22:52:28 -0700 Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python In-Reply-To: References: Message-ID: Other a sprinkling of the word "stochastic" around this post (why that word, not the more obvious "random"?), this basically exactly describes what PyPy does. On Thu, Jun 12, 2014 at 9:07 PM, Neil Girdhar wrote: > I was wondering what work is being done on Python to make it faster. I > understand that cpython is incrementally improved. I'm not sure, but I > think that pypy acceleration works by compiling a restricted set of Python. > And I think I heard something about Guido working on a different model for > accelerating Python. I apologize in advance that I didn't look into these > projects in a lot of detail. My number one dream about computer languages > is for me to be able to write in a language as easy as Python and have it > run as quickly as if it were written. I do believe that this is possible > (since in theory someone could look at my Python code and port it to C++). > > Unfortunately, I don't have time to work on this goal, but I still wanted > to get feedback about some ideas I have about reaching this goal. > > First, I don't think it's important for a "code block" (say, a small > section of code with less coupling to statements outside the block than to > within the block) to run quickly on its first iteration. 
> > What I'm suggesting instead is for every iteration of a "code block", the > runtime stochastically decides whether to collect statistics about that > iteration. Those statistics include the the time running the block, the > time perform attribute accesses including type method lookups and so on. > Basically, the runtime is trying to guess the potential savings of > optimizing this block. > > If the block is run many times and the potential savings are large, then > stochastically again, the block is promoted to a second-level statistics > collection. This level collects statistics about all of the external > couplings of the block, like the types and values of the passed-in and > returned values. > > Using the second-level statistics, the runtime can now guess whether the > block should be promoted to a third level whereby any consistencies are > exploited. For example, if the passed-in parameter types and return value > type of the "min" function are (int, int, int) for 40% of the statistics > and (float, float, float) for 59%, and other types for the remaining 1%, > then two precompiled versions of min are generated: one for int and one for > float. > > These precompiled code blocks have different costs than regular Python > blocks. They need to pay the following costs: > * a check for the required invariants (parameter types above, but it could > be parameter values, or other invariants) > * they need to install hooks on objects that must remain invariant during > the execution of the block; if the invariants are ever violated during the > execution of the block, then all of the computations done during this > execution of the block must be discarded > * therefore a third cost is the probability of discarded the computation > times the average cost of the doing the wasted computation. 
> > The saving is that the code block > * can be transformed into a faster bytecode, which includes straight > assembly instructions in some sections since types or values can now be > assumed, > * can use data structures that make type or access assumptions (for > example a list that always contains ints can use a flattened > representation; a large set that is repeatedly having membership checked > with many negative results might benefit from an auxiliary bloom filter, > etc.) > > In summary the runtime performs stochastic, incremental promotion of code > blocks from first-level, to second-level, to multiple precompiled versions. > It can also demote a code block. The difference between the costs of > the different levels is statistically estimated. > > Examples of optimizations that can be commonly accomplished using such a > system are: > * global variables are folded into code as constants. (Even if they > change rarely, you pay the discarding penalty described above plus the > recompilation cost; the benefit of inline use of the constant (and any > constant folding) might outweigh these costs.) > * lookup of member functions, which almost never change > * flattening of homogeneously-typed lists > > Best, > > Neil > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mistersheik at gmail.com Fri Jun 13 07:53:55 2014 From: mistersheik at gmail.com (Neil Girdhar) Date: Fri, 13 Jun 2014 01:53:55 -0400 Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python In-Reply-To: References: Message-ID: Well that's great to hear :) I thought pypy only worked on a restricted set of Python. Does pypy save the optimization statistics between runs? On Fri, Jun 13, 2014 at 1:52 AM, David Mertz wrote: > Other a sprinkling of the word "stochastic" around this post (why that > word, not the more obvious "random"?), this basically exactly describes > what PyPy does. > > > On Thu, Jun 12, 2014 at 9:07 PM, Neil Girdhar > wrote: > >> I was wondering what work is being done on Python to make it faster. I >> understand that cpython is incrementally improved. I'm not sure, but I >> think that pypy acceleration works by compiling a restricted set of Python. >> And I think I heard something about Guido working on a different model for >> accelerating Python. I apologize in advance that I didn't look into these >> projects in a lot of detail. My number one dream about computer languages >> is for me to be able to write in a language as easy as Python and have it >> run as quickly as if it were written. I do believe that this is possible >> (since in theory someone could look at my Python code and port it to C++). >> >> Unfortunately, I don't have time to work on this goal, but I still wanted >> to get feedback about some ideas I have about reaching this goal. >> >> First, I don't think it's important for a "code block" (say, a small >> section of code with less coupling to statements outside the block than to >> within the block) to run quickly on its first iteration. >> >> What I'm suggesting instead is for every iteration of a "code block", the >> runtime stochastically decides whether to collect statistics about that >> iteration. 
Those statistics include the the time running the block, the >> time perform attribute accesses including type method lookups and so on. >> Basically, the runtime is trying to guess the potential savings of >> optimizing this block. >> >> If the block is run many times and the potential savings are large, then >> stochastically again, the block is promoted to a second-level statistics >> collection. This level collects statistics about all of the external >> couplings of the block, like the types and values of the passed-in and >> returned values. >> >> Using the second-level statistics, the runtime can now guess whether the >> block should be promoted to a third level whereby any consistencies are >> exploited. For example, if the passed-in parameter types and return value >> type of the "min" function are (int, int, int) for 40% of the statistics >> and (float, float, float) for 59%, and other types for the remaining 1%, >> then two precompiled versions of min are generated: one for int and one for >> float. >> >> These precompiled code blocks have different costs than regular Python >> blocks. They need to pay the following costs: >> * a check for the required invariants (parameter types above, but it >> could be parameter values, or other invariants) >> * they need to install hooks on objects that must remain invariant during >> the execution of the block; if the invariants are ever violated during the >> execution of the block, then all of the computations done during this >> execution of the block must be discarded >> * therefore a third cost is the probability of discarded the computation >> times the average cost of the doing the wasted computation. 
>> >> The saving is that the code block >> * can be transformed into a faster bytecode, which includes straight >> assembly instructions in some sections since types or values can now be >> assumed, >> * can use data structures that make type or access assumptions (for >> example a list that always contains ints can use a flattened >> representation; a large set that is repeatedly having membership checked >> with many negative results might benefit from an auxiliary bloom filter, >> etc.) >> >> In summary the runtime performs stochastic, incremental promotion of code >> blocks from first-level, to second-level, to multiple precompiled versions. >> It can also demote a code block. The difference between the costs of >> the different levels is statistically estimated. >> >> Examples of optimizations that can be commonly accomplished using such a >> system are: >> * global variables are folded into code as constants. (Even if they >> change rarely, you pay the discarding penalty described above plus the >> recompilation cost; the benefit of inline use of the constant (and any >> constant folding) might outweigh these costs.) >> * lookup of member functions, which almost never change >> * flattening of homogeneously-typed lists >> >> Best, >> >> Neil >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mertz at gnosis.cx Fri Jun 13 08:21:39 2014 From: mertz at gnosis.cx (David Mertz) Date: Thu, 12 Jun 2014 23:21:39 -0700 Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python In-Reply-To: References: Message-ID: There are many people with a better knowledge of PyPy than me; I've only looked at it and read some general white papers from the team. But I do know with certainty that PyPy executes ALL Python code, not a restricted subset[*]. You might be thinking of Cython in terms of an "annotated, restricted, version of Python." However, PyPy itself is *written* in a restricted subset called RPython, which also might be what you are thinking of. Well, much of PyPy is written that way, I'm pretty sure some of it is regular unrestricted Python too. Other than the fact that PyPy isn't named, e.g. "PyRPy" this is just an implementation detail though--Jython is written in Java, Iron Python is written in C#, CPython is written in C, and PyPy is written in (R)Python. They all run user programs the same though[**] [*] Yes, possibly there is some weird buggy corner case where it does the wrong thing, but if so, file a ticket to get it fixed. [**] For a suitably fuzzy meaning of "the same"--obviously performance characteristics are going to differ, as are things like access to external library calls, etc. It's the same in terms of taking the same identical source files to run, in any case. On Thu, Jun 12, 2014 at 10:53 PM, Neil Girdhar wrote: > Well that's great to hear :) I thought pypy only worked on a restricted > set of Python. Does pypy save the optimization statistics between runs? > > > On Fri, Jun 13, 2014 at 1:52 AM, David Mertz wrote: > >> Other a sprinkling of the word "stochastic" around this post (why that >> word, not the more obvious "random"?), this basically exactly describes >> what PyPy does. 
>> >> >> On Thu, Jun 12, 2014 at 9:07 PM, Neil Girdhar >> wrote: >> >>> I was wondering what work is being done on Python to make it faster. I >>> understand that cpython is incrementally improved. I'm not sure, but I >>> think that pypy acceleration works by compiling a restricted set of Python. >>> And I think I heard something about Guido working on a different model for >>> accelerating Python. I apologize in advance that I didn't look into these >>> projects in a lot of detail. My number one dream about computer languages >>> is for me to be able to write in a language as easy as Python and have it >>> run as quickly as if it were written. I do believe that this is possible >>> (since in theory someone could look at my Python code and port it to C++). >>> >>> Unfortunately, I don't have time to work on this goal, but I still >>> wanted to get feedback about some ideas I have about reaching this goal. >>> >>> First, I don't think it's important for a "code block" (say, a small >>> section of code with less coupling to statements outside the block than to >>> within the block) to run quickly on its first iteration. >>> >>> What I'm suggesting instead is for every iteration of a "code block", >>> the runtime stochastically decides whether to collect statistics about that >>> iteration. Those statistics include the the time running the block, the >>> time perform attribute accesses including type method lookups and so on. >>> Basically, the runtime is trying to guess the potential savings of >>> optimizing this block. >>> >>> If the block is run many times and the potential savings are large, then >>> stochastically again, the block is promoted to a second-level statistics >>> collection. This level collects statistics about all of the external >>> couplings of the block, like the types and values of the passed-in and >>> returned values. 
>>> >>> Using the second-level statistics, the runtime can now guess whether the >>> block should be promoted to a third level whereby any consistencies are >>> exploited. For example, if the passed-in parameter types and return value >>> type of the "min" function are (int, int, int) for 40% of the statistics >>> and (float, float, float) for 59%, and other types for the remaining 1%, >>> then two precompiled versions of min are generated: one for int and one for >>> float. >>> >>> These precompiled code blocks have different costs than regular Python >>> blocks. They need to pay the following costs: >>> * a check for the required invariants (parameter types above, but it >>> could be parameter values, or other invariants) >>> * they need to install hooks on objects that must remain invariant >>> during the execution of the block; if the invariants are ever violated >>> during the execution of the block, then all of the computations done during >>> this execution of the block must be discarded >>> * therefore a third cost is the probability of discarded the computation >>> times the average cost of the doing the wasted computation. >>> >>> The saving is that the code block >>> * can be transformed into a faster bytecode, which includes straight >>> assembly instructions in some sections since types or values can now be >>> assumed, >>> * can use data structures that make type or access assumptions (for >>> example a list that always contains ints can use a flattened >>> representation; a large set that is repeatedly having membership checked >>> with many negative results might benefit from an auxiliary bloom filter, >>> etc.) >>> >>> In summary the runtime performs stochastic, incremental promotion of >>> code blocks from first-level, to second-level, to multiple precompiled >>> versions. It can also demote a code block. The difference between the >>> costs of the different levels is statistically estimated. 
>>> >>> Examples of optimizations that can be commonly accomplished using such a >>> system are: >>> * global variables are folded into code as constants. (Even if they >>> change rarely, you pay the discarding penalty described above plus the >>> recompilation cost; the benefit of inline use of the constant (and any >>> constant folding) might outweigh these costs.) >>> * lookup of member functions, which almost never change >>> * flattening of homogeneously-typed lists >>> >>> Best, >>> >>> Neil >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> >> >> >> -- >> Keeping medicines from the bloodstreams of the sick; food >> from the bellies of the hungry; books from the hands of the >> uneducated; technology from the underdeveloped; and putting >> advocates of freedom in prisons. Intellectual property is >> to the 21st century what the slave trade was to the 16th. >> > > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Jun 13 10:22:54 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 13 Jun 2014 18:22:54 +1000 Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python In-Reply-To: References: Message-ID: On 13 Jun 2014 16:22, "David Mertz" wrote: > > There are many people with a better knowledge of PyPy than me; I've only looked at it and read some general white papers from the team. But I do know with certainty that PyPy executes ALL Python code, not a restricted subset[*]. 
You might be thinking of Cython in terms of an "annotated, restricted, version of Python." There's also Numba, which allows particular functions to be flagged for JIT compilation with LLVM (and vectorisation if using NumPy). That technically only supports a Python subset, but it's in the context of the normal CPython interpreter, so it's generally possible to just skip accelerating the code that Numba can't handle. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Fri Jun 13 11:05:58 2014 From: mistersheik at gmail.com (Neil Girdhar) Date: Fri, 13 Jun 2014 05:05:58 -0400 Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python In-Reply-To: References: Message-ID: Thanks for the explanation. It's really good to know that PyPy executes all Python code. The idea of using a restricted subset is really annoying to me. The only thing I don't understand is why PyPy would be written in RPython. If it's doing a good job analyzing source code, why do people need to annotate source code instead of the runtime (by collecting statistics)? Best, Neil On Fri, Jun 13, 2014 at 2:21 AM, David Mertz wrote: > There are many people with a better knowledge of PyPy than me; I've only > looked at it and read some general white papers from the team. But I do > know with certainty that PyPy executes ALL Python code, not a restricted > subset[*]. You might be thinking of Cython in terms of an "annotated, > restricted, version of Python." > > However, PyPy itself is *written* in a restricted subset called RPython, > which also might be what you are thinking of. Well, much of PyPy is > written that way, I'm pretty sure some of it is regular unrestricted Python > too. Other than the fact that PyPy isn't named, e.g. "PyRPy" this is just > an implementation detail though--Jython is written in Java, Iron Python is > written in C#, CPython is written in C, and PyPy is written in (R)Python. 
> They all run user programs the same though[**] > > [*] Yes, possibly there is some weird buggy corner case where it does the > wrong thing, but if so, file a ticket to get it fixed. > > [**] For a suitably fuzzy meaning of "the same"--obviously performance > characteristics are going to differ, as are things like access to external > library calls, etc. It's the same in terms of taking the > same identical source files to run, in any case. > > > On Thu, Jun 12, 2014 at 10:53 PM, Neil Girdhar > wrote: > >> Well that's great to hear :) I thought pypy only worked on a restricted >> set of Python. Does pypy save the optimization statistics between runs? >> >> >> On Fri, Jun 13, 2014 at 1:52 AM, David Mertz wrote: >> >>> Other a sprinkling of the word "stochastic" around this post (why that >>> word, not the more obvious "random"?), this basically exactly describes >>> what PyPy does. >>> >>> >>> On Thu, Jun 12, 2014 at 9:07 PM, Neil Girdhar >>> wrote: >>> >>>> I was wondering what work is being done on Python to make it faster. I >>>> understand that cpython is incrementally improved. I'm not sure, but I >>>> think that pypy acceleration works by compiling a restricted set of Python. >>>> And I think I heard something about Guido working on a different model for >>>> accelerating Python. I apologize in advance that I didn't look into these >>>> projects in a lot of detail. My number one dream about computer languages >>>> is for me to be able to write in a language as easy as Python and have it >>>> run as quickly as if it were written. I do believe that this is possible >>>> (since in theory someone could look at my Python code and port it to C++). >>>> >>>> Unfortunately, I don't have time to work on this goal, but I still >>>> wanted to get feedback about some ideas I have about reaching this goal. 
>>>> >>>> First, I don't think it's important for a "code block" (say, a small >>>> section of code with less coupling to statements outside the block than to >>>> within the block) to run quickly on its first iteration. >>>> >>>> What I'm suggesting instead is for every iteration of a "code block", >>>> the runtime stochastically decides whether to collect statistics about that >>>> iteration. Those statistics include the the time running the block, the >>>> time perform attribute accesses including type method lookups and so on. >>>> Basically, the runtime is trying to guess the potential savings of >>>> optimizing this block. >>>> >>>> If the block is run many times and the potential savings are large, >>>> then stochastically again, the block is promoted to a second-level >>>> statistics collection. This level collects statistics about all of the >>>> external couplings of the block, like the types and values of the passed-in >>>> and returned values. >>>> >>>> Using the second-level statistics, the runtime can now guess whether >>>> the block should be promoted to a third level whereby any consistencies are >>>> exploited. For example, if the passed-in parameter types and return value >>>> type of the "min" function are (int, int, int) for 40% of the statistics >>>> and (float, float, float) for 59%, and other types for the remaining 1%, >>>> then two precompiled versions of min are generated: one for int and one for >>>> float. >>>> >>>> These precompiled code blocks have different costs than regular Python >>>> blocks. 
They need to pay the following costs: >>>> * a check for the required invariants (parameter types above, but it >>>> could be parameter values, or other invariants) >>>> * they need to install hooks on objects that must remain invariant >>>> during the execution of the block; if the invariants are ever violated >>>> during the execution of the block, then all of the computations done during >>>> this execution of the block must be discarded >>>> * therefore a third cost is the probability of discarded the >>>> computation times the average cost of the doing the wasted computation. >>>> >>>> The saving is that the code block >>>> * can be transformed into a faster bytecode, which includes straight >>>> assembly instructions in some sections since types or values can now be >>>> assumed, >>>> * can use data structures that make type or access assumptions (for >>>> example a list that always contains ints can use a flattened >>>> representation; a large set that is repeatedly having membership checked >>>> with many negative results might benefit from an auxiliary bloom filter, >>>> etc.) >>>> >>>> In summary the runtime performs stochastic, incremental promotion of >>>> code blocks from first-level, to second-level, to multiple precompiled >>>> versions. It can also demote a code block. The difference between >>>> the costs of the different levels is statistically estimated. >>>> >>>> Examples of optimizations that can be commonly accomplished using such >>>> a system are: >>>> * global variables are folded into code as constants. (Even if they >>>> change rarely, you pay the discarding penalty described above plus the >>>> recompilation cost; the benefit of inline use of the constant (and any >>>> constant folding) might outweigh these costs.) 
>>>> * lookup of member functions, which almost never change >>>> * flattening of homogeneously-typed lists >>>> >>>> Best, >>>> >>>> Neil >>>> >>>> _______________________________________________ >>>> Python-ideas mailing list >>>> Python-ideas at python.org >>>> https://mail.python.org/mailman/listinfo/python-ideas >>>> Code of Conduct: http://python.org/psf/codeofconduct/ >>>> >>> >>> >>> >>> -- >>> Keeping medicines from the bloodstreams of the sick; food >>> from the bellies of the hungry; books from the hands of the >>> uneducated; technology from the underdeveloped; and putting >>> advocates of freedom in prisons. Intellectual property is >>> to the 21st century what the slave trade was to the 16th. >>> >> >> > > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothy.c.delaney at gmail.com Fri Jun 13 11:15:00 2014 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Fri, 13 Jun 2014 19:15:00 +1000 Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python In-Reply-To: References: Message-ID: On 13 June 2014 19:05, Neil Girdhar wrote: > Thanks for the explanation. It's really good to know that PyPy executes > all Python code. The idea of using a restricted subset is really annoying > to me. > > The only thing I don't understand is why PyPy would be written in RPython. > If it's doing a good job analyzing source code, why do people need to > annotate source code instead of the runtime (by collecting statistics)? 
> Start reading from here:
> http://pypy.readthedocs.org/en/latest/coding-guide.html#our-runtime-interpreter-is-rpython

Tim Delaney
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From victor.stinner at gmail.com  Fri Jun 13 11:15:47 2014
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 13 Jun 2014 11:15:47 +0200
Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python
In-Reply-To: 
References: 
Message-ID: 

2014-06-13 11:05 GMT+02:00 Neil Girdhar :
> Thanks for the explanation.  It's really good to know that PyPy executes all
> Python code.  The idea of using a restricted subset is really annoying to
> me.

PyPy is 100% compatible with CPython; it implements very tricky
implementation details just to be 100% compatible. (PyPy's support for the
CPython C API is only partial, but the C API is not part of the "Python
language".)

In short, the JIT compiler is written in a different language called
RPython. This language can be compiled to C, but other backends are or were
available: Java, .NET, Javascript, etc. (You should check, I'm not sure.)

The whole "PyPy project" is more than just a fast Python interpreter.
There are also fast interpreters for Ruby, PHP, and some other languages.

Victor

From victor.stinner at gmail.com  Fri Jun 13 11:36:15 2014
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 13 Jun 2014 11:36:15 +0200
Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python
In-Reply-To: 
References: 
Message-ID: 

Hi,

2014-06-13 6:07 GMT+02:00 Neil Girdhar :
> I was wondering what work is being done on Python to make it faster.

PyPy is 100% compatible with CPython and it is much faster. Numba is also
fast, maybe faster than PyPy in some cases (I read that it can use a GPU),
but it's more specialized to numerical computation.
I'm not sure, but I > think that pypy acceleration works by compiling a restricted set of Python. > And I think I heard something about Guido working on a different model for > accelerating Python. I apologize in advance that I didn't look into these > projects in a lot of detail. My number one dream about computer languages > is for me to be able to write in a language as easy as Python and have it > run as quickly as if it were written. I started to take notes about how CPython can be made faster: http://haypo-notes.readthedocs.org/faster_cpython.html See for example my section "Why Python is slow?": http://haypo-notes.readthedocs.org/faster_cpython.html#why-python-is-slow In short: because Python is a dynamic language (the code can be modified at runtime, a single variable can store different types, almost everything can be modified at runtime), the compiler cannot do much assumption on the Python code and so it's very hard to emit fast code (bytecode). > I do believe that this is possible > (since in theory someone could look at my Python code and port it to C++). There are projects to compile Python to C++. See for example pythran: http://pythonhosted.org/pythran/ But these projects only support a subset of Python. The C++ language is less dynamic than Python. > What I'm suggesting instead is for every iteration of a "code block", the > runtime stochastically decides whether to collect statistics about that > iteration. Those statistics include the the time running the block, the > time perform attribute accesses including type method lookups and so on. > Basically, the runtime is trying to guess the potential savings of > optimizing this block. You should really take a look at PyPy. It implements a *very efficient* tracing JIT. The problem is to not make the program slower when you trace it. PyPy makes some compromises to avoid this overhead, it only optimizes loop with more than N iterations (1000?) for example. 
> If the block is run many times and the potential savings are large, then
> stochastically again, the block is promoted to a second-level statistics
> collection.  This level collects statistics about all of the external
> couplings of the block, like the types and values of the passed-in and
> returned values.

Sorry, this is not the real technical problem :-) No, the real problem is
to detect environment changes, remove the specialized code (optimized for
the old environment) and maybe re-optimize the code later.

Environment: modules, classes (types), functions, "constants", etc. If
anything is modified, the code must be regenerated.

Specialized code is a compiled version of your Python code which is based
on assumptions that allow it to run faster. For example, if your function
calls the builtin function "len", you can make the assumption that the len
function returns an int. But if the builtin "len" function is replaced by
something else, you must call the new len function.

With a JIT, you can detect changes of the environment and regenerate
optimized functions during the execution of the application. You can for
example add a "timestamp" (counter incremented at each change) in
dictionaries and check if the timestamp changed.
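The timestamp idea can be sketched as follows. The class and helper names are invented for illustration; a real per-dictionary version tag along these lines was later standardized for CPython dicts in PEP 509:

```python
class VersionedNamespace:
    """Toy namespace with a version counter: every mutation bumps the
    timestamp, and specialized code checks it before running."""

    def __init__(self, **bindings):
        self._d = dict(bindings)
        self.version = 0

    def __getitem__(self, name):
        return self._d[name]

    def __setitem__(self, name, value):
        self._d[name] = value
        self.version += 1  # any change invalidates cached specializations


def make_specialized(env):
    """Build a fast path compiled against the current environment, and
    record the version it assumed."""
    assumed_version = env.version

    def fast_count(seq):
        if env.version != assumed_version:
            return None        # guard failed: fall back / re-specialize
        return len(seq)        # fast path: the assumption still holds
    return fast_count


env = VersionedNamespace(len=len)
count = make_specialized(env)
print(count([1, 2, 3]))       # 3: the guard holds, fast path runs
env["len"] = lambda s: 0      # environment changed -> version bump
print(count([1, 2, 3]))       # None: specialized code must be discarded
```

The guard check is a single integer comparison, which is the whole point: detecting "did anything change?" stays cheap even though the set of things that *could* change is large.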
My notes about that:
http://haypo-notes.readthedocs.org/faster_cpython.html#learn-types

> The saving is that the code block
> * can be transformed into a faster bytecode, which includes straight
> assembly instructions in some sections since types or values can now be
> assumed,

My plan is to add the infrastructure to support specialized code in CPython:

- support multiple codes in a single function
- each code has an environment to decide if it can be used or not
- notify (or at least detect) changes of the environment (notify when the
Python code is changed: modules, classes, functions)

It should work well for functions, but I don't yet see how to implement
these things for instances of classes, because you can also override
methods in an instance.

> * can use data structures that make type or access assumptions (for example
> a list that always contains ints can use a flattened representation; a large
> set that is repeatedly having membership checked with many negative results
> might benefit from an auxiliary bloom filter, etc.)

Again, please see PyPy: it has very efficient data structures.

I don't think that such changes can be made in CPython. CPython code is
too old, and too many users rely on the current implementation, i.e. on
the "C API". There is, for example, a PyList_GET_ITEM() macro to directly
access an item of a list. This macro is not part of the stable API, but I
guess that most C modules use such macros (and so depend on C structures).

From joseph.martinot-lagarde at m4x.org  Sat Jun 14 08:54:00 2014
From: joseph.martinot-lagarde at m4x.org (Joseph Martinot-Lagarde)
Date: Sat, 14 Jun 2014 08:54:00 +0200
Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python
In-Reply-To: 
References: 
Message-ID: <539BF188.7040600@m4x.org>

On 13/06/2014 08:21, David Mertz wrote:
> There are many people with a better knowledge of PyPy than me; I've only
> looked at it and read some general white papers from the team.
But I do
> know with certainty that PyPy executes ALL Python code, not a restricted
> subset[*]. You might be thinking of Cython in terms of an "annotated,
> restricted, version of Python."

Cython compiles all Python; it is not restricted. On the other hand, PyPy
is not compatible with the C API of CPython, and thus cannot run compiled
modules as is.

>
> However, PyPy itself is *written* in a restricted subset called RPython,
> which also might be what you are thinking of.  Well, much of PyPy is
> written that way, I'm pretty sure some of it is regular unrestricted
> Python too.  Other than the fact that PyPy isn't named, e.g. "PyRPy"
> this is just an implementation detail though--Jython is written in Java,
> Iron Python is written in C#, CPython is written in C, and PyPy is
> written in (R)Python.  They all run user programs the same though[**]
>
> [*] Yes, possibly there is some weird buggy corner case where it does
> the wrong thing, but if so, file a ticket to get it fixed.
>
> [**] For a suitably fuzzy meaning of "the same"--obviously performance
> characteristics are going to differ, as are things like access to
> external library calls, etc.  It's the same in terms of taking the
> same identical source files to run, in any case.
>
>
> On Thu, Jun 12, 2014 at 10:53 PM, Neil Girdhar
> > wrote:
>
> Well that's great to hear :) I thought pypy only worked on a
> restricted set of Python.  Does pypy save the optimization
> statistics between runs?
>
>
> On Fri, Jun 13, 2014 at 1:52 AM, David Mertz
> > wrote:
>
> Other a sprinkling of the word "stochastic" around this post
> (why that word, not the more obvious "random"?), this basically
> exactly describes what PyPy does.
>
>
> On Thu, Jun 12, 2014 at 9:07 PM, Neil Girdhar
> > wrote:
>
> I was wondering what work is being done on Python to make it
> faster. I understand that cpython is incrementally
> improved. I'm not sure, but I think that pypy acceleration
> works by compiling a restricted set of Python.
And I think
> I heard something about Guido working on a different model
> for accelerating Python. I apologize in advance that I
> didn't look into these projects in a lot of detail. My
> number one dream about computer languages is for me to be
> able to write in a language as easy as Python and have it
> run as quickly as if it were written in C++. I do believe that
> this is possible (since in theory someone could look at my
> Python code and port it to C++).
>
> Unfortunately, I don't have time to work on this goal, but I
> still wanted to get feedback about some ideas I have about
> reaching this goal.
>
> First, I don't think it's important for a "code block" (say,
> a small section of code with less coupling to statements
> outside the block than to within the block) to run quickly
> on its first iteration.
>
> What I'm suggesting instead is for every iteration of a
> "code block", the runtime stochastically decides whether to
> collect statistics about that iteration. Those statistics
> include the time spent running the block, the time spent performing
> attribute accesses including type method lookups and so on.
> Basically, the runtime is trying to guess the potential
> savings of optimizing this block.
>
> If the block is run many times and the potential savings are
> large, then stochastically again, the block is promoted to a
> second-level statistics collection. This level collects
> statistics about all of the external couplings of the block,
> like the types and values of the passed-in and returned values.
>
> Using the second-level statistics, the runtime can now guess
> whether the block should be promoted to a third level
> whereby any consistencies are exploited. For example, if
> the passed-in parameter types and return value type of the
> "min" function are (int, int, int) for 40% of the statistics
> and (float, float, float) for 59%, and other types for the
> remaining 1%, then two precompiled versions of min are
> generated: one for int and one for float.
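(The specialised "min" described just above can be sketched in miniature in plain Python. This is a toy model with hypothetical names, editor-supplied rather than from the thread; a real JIT's guards, statistics sampling and code generation are far more involved:)

```python
# Toy model of guarded type specialisation: record the argument-type
# signature of each call, take a "fast path" when a guard on the types
# holds, and fall back to the generic implementation otherwise.
# All names here are hypothetical illustrations, not a real JIT.
from collections import Counter

class SpecialisingMin:
    def __init__(self):
        self.signatures = Counter()  # observed argument-type signatures

    def __call__(self, a, b, c):
        sig = (type(a), type(b), type(c))
        self.signatures[sig] += 1  # first-level statistics collection
        if sig in ((int, int, int), (float, float, float)):
            # "precompiled" fast path: the guard above guarantees the
            # types, so no per-operation dynamic dispatch is needed
            m = a if a < b else b
            return m if m < c else c
        # guard failed: generic fallback
        return min(a, b, c)

smin = SpecialisingMin()
assert smin(3, 1, 2) == 1          # int/int/int fast path
assert smin(2.5, 0.5, 1.5) == 0.5  # float/float/float fast path
assert smin("b", "a", "c") == "a"  # falls back to the builtin min()
```

In a real system the recorded signature frequencies would drive the decision to generate (or discard) specialised versions; here they are just counted.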
> These precompiled code blocks have different costs than
> regular Python blocks. They need to pay the following costs:
> * a check for the required invariants (parameter types
> above, but it could be parameter values, or other invariants)
> * they need to install hooks on objects that must remain
> invariant during the execution of the block; if the
> invariants are ever violated during the execution of the
> block, then all of the computations done during this
> execution of the block must be discarded
> * therefore a third cost is the probability of discarding the
> computation times the average cost of doing the wasted
> computation.
>
> The saving is that the code block
> * can be transformed into a faster bytecode, which includes
> straight assembly instructions in some sections since types
> or values can now be assumed,
> * can use data structures that make type or access
> assumptions (for example a list that always contains ints
> can use a flattened representation; a large set that is
> repeatedly having membership checked with many negative
> results might benefit from an auxiliary bloom filter, etc.)
>
> In summary the runtime performs stochastic, incremental
> promotion of code blocks from first-level, to second-level,
> to multiple precompiled versions. It can also demote a code
> block. The difference between the costs of the different
> levels is statistically estimated.
>
> Examples of optimizations that can be commonly accomplished
> using such a system are:
> * global variables are folded into code as constants. (Even
> if they change rarely, you pay the discarding penalty
> described above plus the recompilation cost; the benefit of
> inline use of the constant (and any constant folding) might
> outweigh these costs.)
> * lookup of member functions, which almost never change > * flattening of homogeneously-typed lists > > Best, > > Neil > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. > > > > > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > --- Ce courrier ?lectronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com From mertz at gnosis.cx Sat Jun 14 09:30:31 2014 From: mertz at gnosis.cx (David Mertz) Date: Sat, 14 Jun 2014 00:30:31 -0700 Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python In-Reply-To: <539BF188.7040600@m4x.org> References: <539BF188.7040600@m4x.org> Message-ID: On Fri, Jun 13, 2014 at 11:54 PM, Joseph Martinot-Lagarde < joseph.martinot-lagarde at m4x.org> wrote: > Cython compiles all python, it is not restricted. > Well, kinda yes and no. You are correct of course, that anything that you can execute with 'python someprog' you can compile with 'cython someprog'. 
However, there is an obvious sense in which adding an
annotation (which is, of course, a syntax error for Python itself)
"restricts" the code in Cython. E.g.:

    def silly():
        cdef int n, i
        for i in range(10):
            if i < 5:
                n = i + 1
            else:
                n = str(i)

This *silly* function isn't really Python code at all, of course. But
if you ignore the annotation, it would be--pointless code, but valid. As
soon as you add the annotation, you *restrict* the type of code you can
write in the scope of the annotation.

-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons. Intellectual property is
to the 21st century what the slave trade was to the 16th.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From joseph.martinot-lagarde at m4x.org  Sat Jun 14 09:38:00 2014
From: joseph.martinot-lagarde at m4x.org (Joseph Martinot-Lagarde)
Date: Sat, 14 Jun 2014 09:38:00 +0200
Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python
In-Reply-To: 
References: <539BF188.7040600@m4x.org>
Message-ID: <539BFBD8.1060506@m4x.org>

Le 14/06/2014 09:30, David Mertz a écrit :
> On Fri, Jun 13, 2014 at 11:54 PM, Joseph Martinot-Lagarde wrote:
>
>     Cython compiles all python, it is not restricted.
>
> Well, kinda yes and no. You are correct of course, that anything that
> you can execute with 'python someprog' you can compile with 'cython
> someprog'. However, there is an obvious sense in which adding an
> annotation (which is, of course, a syntax error for Python itself)
> "restricts" the code in Cython. E.g.:
>
>     def silly():
>         cdef int n, i
>         for i in range(10):
>             if i < 5:
>                 n = i + 1
>             else:
>                 n = str(i)
>
> This *silly* function isn't really Python code at all, of course. But
> if you ignore the annotation, it would be--pointless code, but valid.
As
> soon as you add the annotation, you *restrict* the type of code you can
> write in the scope of the annotation.
>

Yeah, the point is that *you* restrict, not cython. From your previous
post I understood that you meant "pypy runs all python but cython
doesn't, it is restricted".

I use numpy regularly, and in this case it is the other way around: I
can optimize my code using cython but I can't run it with pypy at all.

---
Ce courrier électronique ne contient aucun virus ou logiciel malveillant
parce que la protection avast! Antivirus est active.
http://www.avast.com

From mertz at gnosis.cx  Sat Jun 14 09:53:11 2014
From: mertz at gnosis.cx (David Mertz)
Date: Sat, 14 Jun 2014 00:53:11 -0700
Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python
In-Reply-To: <539BFBD8.1060506@m4x.org>
References: <539BF188.7040600@m4x.org> <539BFBD8.1060506@m4x.org>
Message-ID: 

Is there ever a case where removing all the type annotations from Cython
code does not produce code that can run in PyPy? I don't know Cython
well enough to be certain the answer is 'no', but I think so. So a
function a little like my 'silly()' function--but that did something
actually interesting in the loop--might run faster by removing the
annotation and running it in PyPy. Or it might NOT, of course; the
answer is not obvious without looking at the exact code in question, and
probably not without actually timing it.

But the idea is that let's say I have some code with a loop and some
numeric operations inside that loop that I'm currently running using
CPython. There are at least two ways I might speed up that code:

A) Edit the code to contain some type annotations, and compile it with
Cython. However, if I do this, I *might* have to modify some other
constructs in the overall code block to get it to compile (i.e. if
there's any polymorphism about variable types).

B) Run the unchanged code using PyPy.

Well, in this description, PyPy sounds better...
but it's not better if option (A) makes faster code in *your* specific code, of course. And moreover, (B) is not true if your existing code relies on C extensions, such as NumPy, which mostly aren't going to run on PyPy. However, I do know about https://bitbucket.org/pypy/numpy. At least some substantial part of NumPy has been ported to PyPy. This may or may not support the code *you* need to run. On Sat, Jun 14, 2014 at 12:38 AM, Joseph Martinot-Lagarde < joseph.martinot-lagarde at m4x.org> wrote: > Le 14/06/2014 09:30, David Mertz a ?crit : > >> On Fri, Jun 13, 2014 at 11:54 PM, Joseph Martinot-Lagarde >> > > wrote: >> >> Cython compiles all python, it is not restricted. >> >> >> Well, kinda yes and no. You are correct of course, that anything that >> you can execute with 'python someprog' you can compile with 'cython >> someprog'. However, there is an obvious sense in which adding an >> annotation (which is, of course, a syntax error for Python itself) >> "restricts" the code in Cython. E.g.: >> >> def silly(): >> cdef int n, i >> for i in range(10): >> if i < 5: >> n = i + 1 >> else: >> n = str(i) >> >> This *silly* function isn't really Python code at all, of course. But >> if you ignore the annotation, it would be--pointless code, but valid. As >> soon as you add the annotation, you *restrict* the type of code you can >> write in the scope of the annotation. >> >> > Yeah, the point is that *you* restrict, not cython. From your previous > post I understood that you meant "pypy runs all python but cython doesn't, > it is restricted". > > I use numpy regularely, and in this case it is the other way around: I can > optimize my code using cython but I can't run it with pypy at all. > > > --- > Ce courrier ?lectronique ne contient aucun virus ou logiciel malveillant > parce que la protection avast! Antivirus est active. 
> http://www.avast.com > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From joseph.martinot-lagarde at m4x.org Sat Jun 14 10:11:48 2014 From: joseph.martinot-lagarde at m4x.org (Joseph Martinot-Lagarde) Date: Sat, 14 Jun 2014 10:11:48 +0200 Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python In-Reply-To: References: <539BF188.7040600@m4x.org> <539BFBD8.1060506@m4x.org> Message-ID: <539C03C4.2040200@m4x.org> Le 14/06/2014 09:53, David Mertz a ?crit : > Is there ever a case where removing all the type annotations from Cython > code does not produce code that can run in PyPy? I don't know Cython > well enough to be certain the answer is 'no', but I think so. Cython without annotations is just python, so it can be run in pypy. > So a > function a little like my 'silly()' function--but that did something > actually interesting in the loop--might run faster by removing the > annotation and running it in PyPy. Or is might NOT, of course; the > answer is not obvious without looking at the exact code in question, and > probably not without actually timing it. > > But the idea is that let's say I have some code with a loop and some > numeric operations inside that loop that I'm currently running using > CPython. There are at least two ways I might speed up that code: > > A) Edit the code to contain some type annotations, and compile it with > Cython. 
However, if I do this, I *might* have to modify some other > constructs in the overall code block to get it to compile (i.e. if > there's any polymorphism about variable types). > > B) Run the unchanged code using PyPy. > > Well, in this description, PyPy sounds better... but it's not better if > option (A) makes faster code in *your* specific code, of course. And > moreover, (B) is not true if your existing code relies on C extensions, > such as NumPy, which mostly aren't going to run on PyPy. Actually I don't see it like pypy vs cython. The only common point between these projects is that they both try to optimize python code, but in such different ways that the outcome completely depends on the code. For my problems cython is an obvious choice, for others it could be useless. > > However, I do know about https://bitbucket.org/pypy/numpy. At least > some substantial part of NumPy has been ported to PyPy. This may or may > not support the code *you* need to run. Right now it doesn't ! ;) I have the same problem with numba and numexpr, it seems to rarely be compatible with my real world use cases. > > > On Sat, Jun 14, 2014 at 12:38 AM, Joseph Martinot-Lagarde > > wrote: > > Le 14/06/2014 09:30, David Mertz a ?crit : > > On Fri, Jun 13, 2014 at 11:54 PM, Joseph Martinot-Lagarde > > >> > wrote: > > Cython compiles all python, it is not restricted. > > > Well, kinda yes and no. You are correct of course, that > anything that > you can execute with 'python someprog' you can compile with 'cython > someprog'. However, there is an obvious sense in which adding an > annotation (which is, of course, a syntax error for Python itself) > "restricts" the code in Cython. E.g.: > > def silly(): > cdef int n, i > for i in range(10): > if i < 5: > n = i + 1 > else: > n = str(i) > > This *silly* function isn't really Python code at all, of > course. But > if you ignore the annotation, it would be--pointless code, but > valid. 
As > soon as you add the annotation, you *restrict* the type of code > you can > write in the scope of the annotation. > > > Yeah, the point is that *you* restrict, not cython. From your > previous post I understood that you meant "pypy runs all python but > cython doesn't, it is restricted". > > I use numpy regularely, and in this case it is the other way around: > I can optimize my code using cython but I can't run it with pypy at all. > > > --- > Ce courrier ?lectronique ne contient aucun virus ou logiciel > malveillant parce que la protection avast! Antivirus est active. > http://www.avast.com > > > _________________________________________________ > Python-ideas mailing list > Python-ideas at python.org > > https://mail.python.org/__mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/__codeofconduct/ > > > > > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > --- Ce courrier ?lectronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. 
http://www.avast.com From ncoghlan at gmail.com Sun Jun 15 07:51:02 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 15 Jun 2014 15:51:02 +1000 Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python In-Reply-To: References: <539BF188.7040600@m4x.org> <539BFBD8.1060506@m4x.org> Message-ID: On 14 June 2014 17:53, David Mertz wrote: > Is there ever a case where removing all the type annotations from Cython > code does not produce code that can run in PyPy? I don't know Cython well > enough to be certain the answer is 'no', but I think so. So a function a > little like my 'silly()' function--but that did something actually > interesting in the loop--might run faster by removing the annotation and > running it in PyPy. Or is might NOT, of course; the answer is not obvious > without looking at the exact code in question, and probably not without > actually timing it. > > But the idea is that let's say I have some code with a loop and some numeric > operations inside that loop that I'm currently running using CPython. There > are at least two ways I might speed up that code: > > A) Edit the code to contain some type annotations, and compile it with > Cython. However, if I do this, I *might* have to modify some other > constructs in the overall code block to get it to compile (i.e. if there's > any polymorphism about variable types). > > B) Run the unchanged code using PyPy. C) Compile the unchanged code with Cython There's a myth that Cython requires type annotations to speed up Python code. This is not accurate: just bypassing the main eval loop and some aspects of function call handling can provide a respectable speed-up, even when Cython is still performing all the operations through the abstract object API rather than being able to drop back to platform native types. 
The speed increases aren't as significant as those on offer in PyPy, but they're not trivial, and they don't come at the cost of potential incompatibility with other C extensions (See https://github.com/cython/cython/wiki/FAQ#is-cython-faster-than-cpython for more details) The difference I see between the PyPy approach and the Cython approach to optimisation is between a desire to "just make Python fast, even if it means breaking compatibility with C extensions" (you could call this the "Python as application programming language" perspective, which puts PyPy in a head-to-head contest with the JVM and the .NET CLR, moreso than with CPython) and the "make CPython CPU bottlenecks fast, even if doing so requires some additional static annotations" (you could call this the "CPython as orchestration platform" approach, which leverages the rich CPython C API and the ubiquity of C dynamic linking support at the operating system level to interoperate with existing software components, rather than treating reliance on "native" software as something to be avoided as the application programming runtimes tend to do). Those differences in perspective can then create significant barriers to productive communication between different communities of developers and users (for folks that use Python as an orchestration language, PyPy's poorer orchestration support is a dealbreaker, while for PyPy developers focused on applications programming use cases, the lack of interest from system integrators can be intensely frustrating). cffi is a potential path to improving PyPy's handling of the "orchestration platform" use case, but it still has a long way to go to catch up to CPython on that front. NumPyPy in particular still has a fair bit of work to do in catching up to NumPy (http://buildbot.pypy.org/numpy-status/latest.html is an excellent resource for checking in on the progress of that effort). 
In all areas of Python optimisation, though, there's a lot of work to be done in cracking the discoverability and distribution channel problem. I assume redistributors are still wary of offering PyPy support because it's a completely new way of building language runtimes and they aren't sure they understand it yet. Customer demand can overcome that wariness, but if existing Python users are using Python in an orchestration role rather than primarily as an applications programming language, then that demand may not be there. If customers aren't even aware that these optimisation tools exist in the first place, then that will also hinder the generation of demand. This is hinted at by the fact that even Cython (let alone PyPy) isn't as well supported by redistributors as Python 3, suggesting that customers and redistributors may not be looking far enough outside python.org and python-dev for opportunities to enhance their Python environments. To use a Red Hat specific example, CPython itself is available as a core supported part of the operating system (with 2.3, 2.4, 2.6 and 2.7 all still supported), while CPython 2.7 and 3.3 are also available as explicitly supported offerings through Red Hat Software Collections. Both PyPy and Cython are also available for Red Hat Enterprise Linux & derivatives, but only through the community provided "Extra Packages for Enterprise Linux" repositories. The newer Numba JIT compiler for CPython isn't even in EPEL - you have to build it from source yourself, or acquire it via other means (likely conda). >From a redistribution perspective, engineering staff can certainly suggest "Hey, these would be good things to offer our customers", but that's never going to be as compelling as customers coming to us (or other Python vendors) and asking "Hey, what can you do for me to make my Python code run faster?". 
Upstream has good answers for a lot of these problems, but commercial redistributors usually aren't going to bite unless they can see a clear business case for it. Lowering the *cost* of redistribution is also at the heart of a lot of the work going on around metadata 2.0 on distutils-sig - at the moment, the repackaging process (getting from PyPI formats to redistributor formats) can be incredibly manual (not only when the project is first repackaged, but sometimes even on subsequent updates), which is one of the reasons repackaged Python projects tend to be measured in the dozens, or at best hundreds, compared to the tens of thousands available upstream, and why there tend to be significant lag times between upstream updates and updates of repackaged versions. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun Jun 15 14:33:14 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 15 Jun 2014 22:33:14 +1000 Subject: [Python-ideas] A possible transition plan to bytes-based iteration and indexing for binary data Message-ID: At PyCon earlier this year, Guido (and others) persuaded me that the integer based indexing and iteration for bytes and bytearray in Python 3 was a genuine design mistake based on the initial Python 3 design which lacked an immutable bytes type entirely (so producing integers was originally the only reasonable choice). The earlier design discussions around PEP 467 (which proposes to clean up a few other bits and pieces of that original legacy which PEP 3137 left in place) all treated "bytes indexing returns an integer" as an unchangeable aspect of Python 3, since there wasn't an obvious way to migrate to instead returning length 1 bytes objects with a reasonable story to handle the incompatibility for Python 3 users, even if everyone was in favour of the end result. 
A few weeks ago I had an idea for a migration strategy that seemed
feasible, and I now have a very, very preliminary proof of concept up at
https://bitbucket.org/ncoghlan/cpython_sandbox/branch/bytes_migration_experiment

The general principle involved would be to return an integer *subtype*
from indexing and iteration operations on bytes, bytearray and
memoryview objects using the "default" format character. That subtype
would then be detected in various locations and handled the way a
length 1 bytes object would be handled, rather than the way an integer
would be handled. The current proof of concept adds such handling to
ord(), bytes() and bytearray() (with appropriate test cases in
test_bytes) giving the following results:

>>> b'hello'[0]
104
>>> ord(b'hello'[0])
104
>>> bytes(b'hello'[0])
b'h'
>>> bytearray(b'hello'[0])
bytearray(b'h')

(the subtype is currently visible at the Python level as "types._BytesInt")

The proof of concept doesn't override any normal integer behaviour,
but a more complete solution would be in a position to emit a warning
when the result of binary indexing is used as an integer (either
always, or controlled by a command line switch, depending on the
performance impact).

With this integer subtype in place for Python 3.5 to provide a
transition period where both existing integer-compatible operations
(like int() and arithmetic operations) and selected bytes-compatible
operations (like ord(), bytes() and bytearray()) are supported, these
operations could then be switched to producing a normal length 1 bytes
object in Python 3.6.

It wouldn't be pretty, and it would be a pain to document, but it
seems feasible. The alternative is for PEP 367 to add a separate bytes
iteration method, which strikes me as further entrenching a design we
aren't currently happy with.

Regards,
Nick.
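(A rough pure-Python model of the behaviour described above. `ByteInt` is an editor-supplied, hypothetical stand-in for the C-level _BytesInt; only the bytes() half can be modelled from Python via `__bytes__`, which is exactly why the real proof of concept patches ord(), bytes() and bytearray() in C:)

```python
# Pure-Python approximation of the proposed _BytesInt: an int subclass
# that also knows how to turn itself back into a length-1 bytes object.
# Builtin ord() cannot be taught about it from Python, and bytearray()
# does not consult __bytes__, so this models the bytes() case only.
class ByteInt(int):
    def __bytes__(self):
        return bytes([int(self)])

b = ByteInt(104)
assert b == 104          # still compares equal to the plain integer
assert b + 1 == 105      # ordinary int arithmetic keeps working
assert isinstance(b, int)
assert bytes(b) == b"h"  # bytes() calls __bytes__ before the int path
```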
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From antoine at python.org Sun Jun 15 15:08:31 2014 From: antoine at python.org (Antoine Pitrou) Date: Sun, 15 Jun 2014 09:08:31 -0400 Subject: [Python-ideas] A possible transition plan to bytes-based iteration and indexing for binary data In-Reply-To: References: Message-ID: Le 15/06/2014 08:33, Nick Coghlan a ?crit : > > The general principle involved would be to return an integer *subtype* > from indexing and iteration operations on bytes, bytearray and > memoryview objects using the "default" format character. That subtype > would then be detected in various locations and handled the way a > length 1 bytes object would be handled, rather than the way an integer > would be handled. The current proof of concept adds such handling to > ord(), bytes() and bytearray() (with appropriate test cases in > test_bytes) giving the following results: > >>>> b'hello'[0] > 104 >>>> ord(b'hello'[0]) > 104 >>>> bytes(b'hello'[0]) > b'h' >>>> bytearray(b'hello'[0]) > bytearray(b'h') That sounds terribly confusing to me. I'd rather live with the current behaviour. Regards Antoine. From steve at pearwood.info Sun Jun 15 17:24:29 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 16 Jun 2014 01:24:29 +1000 Subject: [Python-ideas] A possible transition plan to bytes-based iteration and indexing for binary data In-Reply-To: References: Message-ID: <20140615152428.GB7742@ando> On Sun, Jun 15, 2014 at 10:33:14PM +1000, Nick Coghlan wrote: > At PyCon earlier this year, Guido (and others) persuaded me that the > integer based indexing and iteration for bytes and bytearray in Python > 3 was a genuine design mistake based on the initial Python 3 design > which lacked an immutable bytes type entirely (so producing integers > was originally the only reasonable choice). [...] > The general principle involved would be to return an integer *subtype* Have you considered subclassing bytes, rather than int? 
for i in b"foo":
    assert isinstance(i, int)

for b in sensible_bytes(b"foo"):
    assert isinstance(b, bytes)

I'm not wedded to the name :-)

And then, perhaps some time in the distant future when porting from
Python 2.7 is no longer a priority, we can add

    from __future__ import bytes_iteration_yields_bytes

There are at least two obvious downsides: the b'' syntax will still
refer to the less useful type, and it will be a violation of the Liskov
substitution principle (but then I've always considered that to be a
guideline rather than a hard law).

> It wouldn't be pretty, and it would be a pain to document, but it
> seems feasible. The alternative is for PEP 367 to add a separate bytes
> iteration method, which strikes me as further entrenching a design we
> aren't currently happy with.

Unless you have a strategy to deprecate *and remove* the magic int
subclass some time in the foreseeable future, you're still entrenching
the design.

I think whatever we do, we're going to end up with something ugly in
the language. Possibly the least ugly, and certainly the least magic,
is a separate bytes iteration method.

Keeping-an-open-mind-but-leaning-towards-minus-one-on-the-idea-ly y'rs,

-- 
Steven

From steve at pearwood.info  Sun Jun 15 17:36:01 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 16 Jun 2014 01:36:01 +1000
Subject: [Python-ideas] A possible transition plan to bytes-based iteration and indexing for binary data
In-Reply-To: 
References: 
Message-ID: <20140615153601.GC7742@ando>

A further thought comes to mind...

On Sun, Jun 15, 2014 at 10:33:14PM +1000, Nick Coghlan wrote:
[...]
> The general principle involved would be to return an integer *subtype*

> >>> bytes(b'hello'[0])
> b'h'

Hmmm. This is, I think, worrying.
Now you have two sorts of ints:

a = b'hello'[0]
b = 104
assert a == b                # succeeds
assert bytes(a) == bytes(b)  # fails

I can see problems where one of these _ByteInts gets used where you're
expecting a regular int, or vice versa, and you're left with a silent
failure and perplexing, hard to diagnose behaviour.

-- 
Steven

From njs at pobox.com  Sun Jun 15 17:49:30 2014
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 15 Jun 2014 16:49:30 +0100
Subject: [Python-ideas] A possible transition plan to bytes-based iteration and indexing for binary data
In-Reply-To: <20140615152428.GB7742@ando>
References: <20140615152428.GB7742@ando>
Message-ID: 

On 15 Jun 2014 16:25, "Steven D'Aprano" wrote:
>
> On Sun, Jun 15, 2014 at 10:33:14PM +1000, Nick Coghlan wrote:
> > At PyCon earlier this year, Guido (and others) persuaded me that the
> > integer based indexing and iteration for bytes and bytearray in Python
> > 3 was a genuine design mistake based on the initial Python 3 design
> > which lacked an immutable bytes type entirely (so producing integers
> > was originally the only reasonable choice).
> [...]
> > The general principle involved would be to return an integer *subtype*
>
> Have you considered subclassing bytes, rather than int?

Isn't the obvious answer to subclass both?

This would require a bit of fiddling to ensure memory layout
compatibility, but seems feasible to me [1]. So b"abcd" would give a
bytes object, and b"abcd"[0] would be an inty_bytes object, which acts
like an int in int contexts and like a bytes in bytes contexts. E.g.,

inty_bytes + int -> int (and warns)
inty_bytes + bytes -> bytes

Bonus points if we can make isinstance(inty_bytes, int) warn too.

The main obstacle I see is that there are a small number of operations
that are well defined for both bytes and int objects with different
semantics:

inty_bytes * int -> ?
inty_bytes + inty_bytes -> ?

I suspect these will be a major challenge for any transition scheme.
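(The ambiguity is easy to see with the existing types: the int reading and the bytes reading of the same operators already give different answers today, so a hybrid would have to pick one:)

```python
# Why inty_bytes * int and inty_bytes + inty_bytes are ambiguous: the
# int and bytes interpretations of the same expressions disagree in
# current Python 3.
i = b"abcd"[0]    # 97, an int
b = b"abcd"[0:1]  # b"a", a length-1 bytes

assert i * 3 == 291       # int semantics: multiplication (97 * 3)
assert b * 3 == b"aaa"    # bytes semantics: sequence repetition

assert i + i == 194       # int semantics: addition (97 + 97)
assert b + b == b"aa"     # bytes semantics: concatenation
```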
(Is it even viable to make bytes method behaviour dependent on a __future__ import? I guess this would require stack frame inspection?) -n [1] specifically I envision adding an unexposed base class that has the struct fields required by int but no methods, making int and bytes both inherit from it, and the inty_bytes would inherit from both. This wastes a bit of memory in each bytes object, but only during the transition. -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Sun Jun 15 19:03:16 2014 From: greg at krypto.org (Gregory P. Smith) Date: Sun, 15 Jun 2014 10:03:16 -0700 Subject: [Python-ideas] A possible transition plan to bytes-based iteration and indexing for binary data In-Reply-To: References: Message-ID: On Sun, Jun 15, 2014 at 5:33 AM, Nick Coghlan wrote: > At PyCon earlier this year, Guido (and others) persuaded me that the > integer based indexing and iteration for bytes and bytearray in Python > 3 was a genuine design mistake based on the initial Python 3 design > which lacked an immutable bytes type entirely (so producing integers > was originally the only reasonable choice). > > The earlier design discussions around PEP 467 (which proposes to clean > up a few other bits and pieces of that original legacy which PEP 3137 > left in place) all treated "bytes indexing returns an integer" as an > unchangeable aspect of Python 3, since there wasn't an obvious way to > migrate to instead returning length 1 bytes objects with a reasonable > story to handle the incompatibility for Python 3 users, even if > everyone was in favour of the end result. 
>
> A few weeks ago I had an idea for a migration strategy that seemed
> feasible, and I now have a very, very preliminary proof of concept up
> at
> https://bitbucket.org/ncoghlan/cpython_sandbox/branch/bytes_migration_experiment
>
> The general principle involved would be to return an integer *subtype*
> from indexing and iteration operations on bytes, bytearray and
> memoryview objects using the "default" format character. That subtype
> would then be detected in various locations and handled the way a
> length 1 bytes object would be handled, rather than the way an integer
> would be handled. The current proof of concept adds such handling to
> ord(), bytes() and bytearray() (with appropriate test cases in
> test_bytes) giving the following results:
>
> >>> b'hello'[0]
> 104
> >>> ord(b'hello'[0])
> 104
> >>> bytes(b'hello'[0])
> b'h'
> >>> bytearray(b'hello'[0])
> bytearray(b'h')
>
> (the subtype is currently visible at the Python level as "types._BytesInt")
>
> The proof of concept doesn't override any normal integer behaviour,
> but a more complete solution would be in a position to emit a warning
> when the result of binary indexing is used as an integer (either
> always, or controlled by a command line switch, depending on the
> performance impact).
>
> With this integer subtype in place for Python 3.5 to provide a
> transition period where both existing integer-compatible operations
> (like int() and arithmetic operations) and selected bytes-compatible
> operations (like ord(), bytes() and bytearray()) are supported, these
> operations could then be switched to producing a normal length 1 bytes
> object in Python 3.6.
>
> It wouldn't be pretty, and it would be a pain to document, but it
> seems feasible. The alternative is for PEP 367 to add a separate bytes

I believe you mean PEP 467.

> iteration method, which strikes me as further entrenching a design we
> aren't currently happy with.
>
> Regards,
> Nick.
We just got rid of the mess of having multiple integer types (int vs long), it'd be a shame to recreate that problem in any form.

The ship has sailed. Python 3 means bytes indexing returns ints. It's well defined and code has started to depend on it. People who want a b'A' instead of 0x41 know to use slice notation [n:n+1] instead of [n] to get a one byte bytes() as that is what is required in code that works in 2.6 through 3.4 today. Anything we do to change it is going to be messier and more mysterious.

Entertaining the idea anyways: If there is going to be a new type for bytes indexing, it needs to multiply inherit from both int and bytes so that isinstance() checks work. We'd need to make sure all C API calls that check for a specific type actually work with the new one as well (at first glance I count 57 uses of PyBytes_CheckExact and PyLong_CheckExact in CPython). The ambiguous operator * and + cases and any similar that Nathaniel Smith pointed out would still be a problem and a potential source of confusion for users.

If anything, a new iteration method in PEP 467 that yields length 1 bytes() makes *some* sense for convenience, but I don't personally see much use for single byte iteration of any form in a high level language.

It is odd to me that str and bytes *ever* supported iteration. How many times have we each written code to check that a passed argument was "a sequence but, oh, wait, not a string, because you didn't *really* mean to do that". That was a Python 1 decision. Oops. :)

-gps
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From bcannon at gmail.com  Sun Jun 15 23:42:02 2014
From: bcannon at gmail.com (Dr. Brett Cannon)
Date: Sun, 15 Jun 2014 21:42:02 +0000
Subject: [Python-ideas] A possible transition plan to bytes-based
 iteration and indexing for binary data
References:
Message-ID:

Why do we need a fancy subtype when a future statement could get us the semantics we want without breaking anything?
I realize it won't work with 2.7 but at least it gives us some way forward that isn't quite so delicate. On Sun, Jun 15, 2014, 10:11, Gregory P. Smith wrote: > On Sun, Jun 15, 2014 at 5:33 AM, Nick Coghlan wrote: > >> At PyCon earlier this year, Guido (and others) persuaded me that the >> integer based indexing and iteration for bytes and bytearray in Python >> 3 was a genuine design mistake based on the initial Python 3 design >> which lacked an immutable bytes type entirely (so producing integers >> was originally the only reasonable choice). >> >> The earlier design discussions around PEP 467 (which proposes to clean >> up a few other bits and pieces of that original legacy which PEP 3137 >> left in place) all treated "bytes indexing returns an integer" as an >> unchangeable aspect of Python 3, since there wasn't an obvious way to >> migrate to instead returning length 1 bytes objects with a reasonable >> story to handle the incompatibility for Python 3 users, even if >> everyone was in favour of the end result. >> >> A few weeks ago I had an idea for a migration strategy that seemed >> feasible, and I now have a very, very preliminary proof of concept up >> at >> https://bitbucket.org/ncoghlan/cpython_sandbox/branch/bytes_migration_experiment >> >> The general principle involved would be to return an integer *subtype* >> from indexing and iteration operations on bytes, bytearray and >> memoryview objects using the "default" format character. That subtype >> would then be detected in various locations and handled the way a >> length 1 bytes object would be handled, rather than the way an integer >> would be handled. 
The current proof of concept adds such handling to >> ord(), bytes() and bytearray() (with appropriate test cases in >> test_bytes) giving the following results: >> >> >>> b'hello'[0] >> 104 >> >>> ord(b'hello'[0]) >> 104 >> >>> bytes(b'hello'[0]) >> b'h' >> >>> bytearray(b'hello'[0]) >> bytearray(b'h') >> >> (the subtype is currently visible at the Python level as >> "types._BytesInt") >> >> The proof of concept doesn't override any normal integer behaviour, >> but a more complete solution would be in a position to emit a warning >> when the result of binary indexing is used as an integer (either >> always, or controlled by a command line switch, depending on the >> performance impact). >> >> With this integer subtype in place for Python 3.5 to provide a >> transition period where both existing integer-compatible operations >> (like int() and arithmetic operations) and selected bytes-compatible >> operations (like ord(), bytes() and bytearray()) are supported, these >> operations could then be switched to producing a normal length 1 bytes >> object in Python 3.6. >> >> It wouldn't be pretty, and it would be a pain to document, but it >> seems feasible. The alternative is for PEP 367 to add a separate bytes >> > > I believe you mean PEP 467. > > >> iteration method, which strikes me as further entrenching a design we >> aren't currently happy with. >> >> Regards, >> Nick. > > > We just got rid of the mess of having multiple integer types (int vs > long), it'd be a shame to recreate that problem in any form. > > The ship has sailed. Python 3 means bytes indexing returns ints. It's well > defined and code has started to depend on it. People who want a b'A' > instead of 0x41 know to use slice notation [n:n+1] instead of [n] to get a > one byte bytes() as that is what is required in code that works in 2.6 > through 3.4 today. Anything we do to change it is going to be messier and > more mysterious. 
>
> Entertaining the idea anyways: If there is going to be a new type for
> bytes indexing, it needs to multiply inherit from both int and bytes so
> that isinstance() checks work. We'd need to make sure all C API calls that
> check for a specific type actually work with the new one as well (at first
> glance I count 57 uses of PyBytes_CheckExact and PyLong_CheckExact in
> CPython). The ambiguous operator * and + cases and any similar that
> Nathaniel Smith pointed out would still be a problem and a potential source
> of confusion for users.
>
> If anything, a new iteration method in PEP 467 that yields length 1
> bytes() makes *some* sense for convenience, but I don't personally see
> much use for single byte iteration of any form in a high level language.
>
> It is odd to me that str and bytes *ever* supported iteration. How many
> times have we each written code to check that a passed argument was "a
> sequence but, oh, wait, not a string, because you didn't *really* mean to
> do that". That was a Python 1 decision. Oops. :)
>
> -gps
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rosuav at gmail.com  Sun Jun 15 23:50:12 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 16 Jun 2014 07:50:12 +1000
Subject: [Python-ideas] A possible transition plan to bytes-based
 iteration and indexing for binary data
In-Reply-To: <20140615153601.GC7742@ando>
References: <20140615153601.GC7742@ando>
Message-ID:

On Mon, Jun 16, 2014 at 1:36 AM, Steven D'Aprano wrote:
> Hmmm. This is, I think, worrying. Now you have two sorts of ints:
>
> a = b'hello'[0]
> b = 104
> assert a == b                  # succeeds
> assert bytes(a) == bytes(b)    # fails

ISTM the problem here is the bytes(104) constructor, which is of marginal utility anyway.
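For reference, the integer form of the constructor only ever preallocates a zero-filled buffer, which is exactly why the quoted asserts diverge; slicing (or wrapping the value in a list) is what actually yields a one-byte bytes:

```python
n = 104
print(bytes(n)[:4])   # b'\x00\x00\x00\x00' -- bytes(104) is 104 zero bytes
print(bytes([n]))     # b'h' -- wrap in a list to get the byte value
print(b'hello'[0])    # 104  -- indexing gives an int in Python 3
print(b'hello'[0:1])  # b'h' -- slicing preserves the bytes type
```

The slicing form is the one Greg mentions earlier as working unchanged from Python 2.6 through 3.4.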
If that could be configured to produce a warning, that would solve the problem, right? You might get that assertion failing, but you'd get a warning that explains why. ChrisA From greg at krypto.org Sun Jun 15 23:57:12 2014 From: greg at krypto.org (Gregory P. Smith) Date: Sun, 15 Jun 2014 14:57:12 -0700 Subject: [Python-ideas] A possible transition plan to bytes-based iteration and indexing for binary data In-Reply-To: References: Message-ID: On Sun, Jun 15, 2014 at 2:42 PM, Dr. Brett Cannon wrote: > Why do we need a fancy subtype when a future statement could get us the > semantics we want without breaking anything? I realize it won't work with > 2.7 but at least it gives us some way forward that isn't quite so delicate. how could it? within a single file where such a statement applies there is no knowledge of what types are. In order for this to work you would need to have your __future__ statement alter the behavior of *all* [] and iteration done within the file to conditionally take a code path that does something different iff the type being operated on is determined at runtime to be bytes. -gps > > On Sun, Jun 15, 2014, 10:11, Gregory P. Smith wrote: > >> On Sun, Jun 15, 2014 at 5:33 AM, Nick Coghlan >> wrote: >> >>> At PyCon earlier this year, Guido (and others) persuaded me that the >>> integer based indexing and iteration for bytes and bytearray in Python >>> 3 was a genuine design mistake based on the initial Python 3 design >>> which lacked an immutable bytes type entirely (so producing integers >>> was originally the only reasonable choice). 
>>> >>> The earlier design discussions around PEP 467 (which proposes to clean >>> up a few other bits and pieces of that original legacy which PEP 3137 >>> left in place) all treated "bytes indexing returns an integer" as an >>> unchangeable aspect of Python 3, since there wasn't an obvious way to >>> migrate to instead returning length 1 bytes objects with a reasonable >>> story to handle the incompatibility for Python 3 users, even if >>> everyone was in favour of the end result. >>> >>> A few weeks ago I had an idea for a migration strategy that seemed >>> feasible, and I now have a very, very preliminary proof of concept up >>> at >>> https://bitbucket.org/ncoghlan/cpython_sandbox/branch/bytes_migration_experiment >>> >>> The general principle involved would be to return an integer *subtype* >>> from indexing and iteration operations on bytes, bytearray and >>> memoryview objects using the "default" format character. That subtype >>> would then be detected in various locations and handled the way a >>> length 1 bytes object would be handled, rather than the way an integer >>> would be handled. The current proof of concept adds such handling to >>> ord(), bytes() and bytearray() (with appropriate test cases in >>> test_bytes) giving the following results: >>> >>> >>> b'hello'[0] >>> 104 >>> >>> ord(b'hello'[0]) >>> 104 >>> >>> bytes(b'hello'[0]) >>> b'h' >>> >>> bytearray(b'hello'[0]) >>> bytearray(b'h') >>> >>> (the subtype is currently visible at the Python level as >>> "types._BytesInt") >>> >>> The proof of concept doesn't override any normal integer behaviour, >>> but a more complete solution would be in a position to emit a warning >>> when the result of binary indexing is used as an integer (either >>> always, or controlled by a command line switch, depending on the >>> performance impact). 
>>> >>> With this integer subtype in place for Python 3.5 to provide a >>> transition period where both existing integer-compatible operations >>> (like int() and arithmetic operations) and selected bytes-compatible >>> operations (like ord(), bytes() and bytearray()) are supported, these >>> operations could then be switched to producing a normal length 1 bytes >>> object in Python 3.6. >>> >>> It wouldn't be pretty, and it would be a pain to document, but it >>> seems feasible. The alternative is for PEP 367 to add a separate bytes >>> >> >> I believe you mean PEP 467. >> >> >>> iteration method, which strikes me as further entrenching a design we >>> aren't currently happy with. >>> >>> Regards, >>> Nick. >> >> >> We just got rid of the mess of having multiple integer types (int vs >> long), it'd be a shame to recreate that problem in any form. >> >> The ship has sailed. Python 3 means bytes indexing returns ints. It's >> well defined and code has started to depend on it. People who want a b'A' >> instead of 0x41 know to use slice notation [n:n+1] instead of [n] to get a >> one byte bytes() as that is what is required in code that works in 2.6 >> through 3.4 today. Anything we do to change it is going to be messier and >> more mysterious. >> >> Entertaining the idea anyways: If there is going to be a new type for >> bytes indexing, it needs to multiply inherit from both int and bytes so >> that isinstance() checks work. We'd need to make sure all C API calls that >> check for a specific type actually work with the new one as well (at first >> glance I count 57 uses of PyBytes_CheckExact and PyLong_CheckExact in >> CPython). The ambiguious operator * and + cases and any similar that >> Nathaniel Smith pointed out would still be a problem and a potential source >> of confusion for users. 
>> >> If anything, a new iteration method in PEP 467 that yields length 1 >> bytes() makes *some* sense for convenience, but I don't personally see >> much use for single byte iteration of any form in a high level language. >> >> It is odd to me that str and bytes *ever* supported iteration. How many >> times have we each written code to check that a passed argument was "a >> sequence but, oh, wait, not a string, because you didn't *really* mean >> to do that". That was a Python 1 decision. Oops. :) >> >> -gps >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Mon Jun 16 00:32:07 2014 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 16 Jun 2014 00:32:07 +0200 Subject: [Python-ideas] A possible transition plan to bytes-based iteration and indexing for binary data In-Reply-To: References: Message-ID: <539E1EE7.4060206@egenix.com> On 15.06.2014 23:42, Dr. Brett Cannon wrote: > Why do we need a fancy subtype when a future statement could get us the > semantics we want without breaking anything? I realize it won't work with > 2.7 but at least it gives us some way forward that isn't quite so delicate. Whatever the solution, +100 on making the change default in Python 3.6 :-) > On Sun, Jun 15, 2014, 10:11, Gregory P. Smith wrote: > >> On Sun, Jun 15, 2014 at 5:33 AM, Nick Coghlan wrote: >> >>> At PyCon earlier this year, Guido (and others) persuaded me that the >>> integer based indexing and iteration for bytes and bytearray in Python >>> 3 was a genuine design mistake based on the initial Python 3 design >>> which lacked an immutable bytes type entirely (so producing integers >>> was originally the only reasonable choice). 
>>> >>> The earlier design discussions around PEP 467 (which proposes to clean >>> up a few other bits and pieces of that original legacy which PEP 3137 >>> left in place) all treated "bytes indexing returns an integer" as an >>> unchangeable aspect of Python 3, since there wasn't an obvious way to >>> migrate to instead returning length 1 bytes objects with a reasonable >>> story to handle the incompatibility for Python 3 users, even if >>> everyone was in favour of the end result. >>> >>> A few weeks ago I had an idea for a migration strategy that seemed >>> feasible, and I now have a very, very preliminary proof of concept up >>> at >>> https://bitbucket.org/ncoghlan/cpython_sandbox/branch/bytes_migration_experiment >>> >>> The general principle involved would be to return an integer *subtype* >>> from indexing and iteration operations on bytes, bytearray and >>> memoryview objects using the "default" format character. That subtype >>> would then be detected in various locations and handled the way a >>> length 1 bytes object would be handled, rather than the way an integer >>> would be handled. The current proof of concept adds such handling to >>> ord(), bytes() and bytearray() (with appropriate test cases in >>> test_bytes) giving the following results: >>> >>>>>> b'hello'[0] >>> 104 >>>>>> ord(b'hello'[0]) >>> 104 >>>>>> bytes(b'hello'[0]) >>> b'h' >>>>>> bytearray(b'hello'[0]) >>> bytearray(b'h') >>> >>> (the subtype is currently visible at the Python level as >>> "types._BytesInt") >>> >>> The proof of concept doesn't override any normal integer behaviour, >>> but a more complete solution would be in a position to emit a warning >>> when the result of binary indexing is used as an integer (either >>> always, or controlled by a command line switch, depending on the >>> performance impact). 
>>> >>> With this integer subtype in place for Python 3.5 to provide a >>> transition period where both existing integer-compatible operations >>> (like int() and arithmetic operations) and selected bytes-compatible >>> operations (like ord(), bytes() and bytearray()) are supported, these >>> operations could then be switched to producing a normal length 1 bytes >>> object in Python 3.6. >>> >>> It wouldn't be pretty, and it would be a pain to document, but it >>> seems feasible. The alternative is for PEP 367 to add a separate bytes >>> >> >> I believe you mean PEP 467. >> >> >>> iteration method, which strikes me as further entrenching a design we >>> aren't currently happy with. >>> >>> Regards, >>> Nick. >> >> >> We just got rid of the mess of having multiple integer types (int vs >> long), it'd be a shame to recreate that problem in any form. >> >> The ship has sailed. Python 3 means bytes indexing returns ints. It's well >> defined and code has started to depend on it. People who want a b'A' >> instead of 0x41 know to use slice notation [n:n+1] instead of [n] to get a >> one byte bytes() as that is what is required in code that works in 2.6 >> through 3.4 today. Anything we do to change it is going to be messier and >> more mysterious. >> >> Entertaining the idea anyways: If there is going to be a new type for >> bytes indexing, it needs to multiply inherit from both int and bytes so >> that isinstance() checks work. We'd need to make sure all C API calls that >> check for a specific type actually work with the new one as well (at first >> glance I count 57 uses of PyBytes_CheckExact and PyLong_CheckExact in >> CPython). The ambiguious operator * and + cases and any similar that >> Nathaniel Smith pointed out would still be a problem and a potential source >> of confusion for users. 
>> >> If anything, a new iteration method in PEP 467 that yields length 1 >> bytes() makes *some* sense for convenience, but I don't personally see >> much use for single byte iteration of any form in a high level language. >> >> It is odd to me that str and bytes *ever* supported iteration. How many >> times have we each written code to check that a passed argument was "a >> sequence but, oh, wait, not a string, because you didn't *really* mean to >> do that". That was a Python 1 decision. Oops. :) >> >> -gps >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From guido at python.org Mon Jun 16 01:09:41 2014 From: guido at python.org (Guido van Rossum) Date: Sun, 15 Jun 2014 16:09:41 -0700 Subject: [Python-ideas] A possible transition plan to bytes-based iteration and indexing for binary data In-Reply-To: References: Message-ID: +1 on "the ship has sailed". Let's live with the consequences rather than introduce yet another change. 
The change will cause more friction than getting used to the current behavior. On Sun, Jun 15, 2014 at 10:03 AM, Gregory P. Smith wrote: > > On Sun, Jun 15, 2014 at 5:33 AM, Nick Coghlan wrote: > >> At PyCon earlier this year, Guido (and others) persuaded me that the >> integer based indexing and iteration for bytes and bytearray in Python >> 3 was a genuine design mistake based on the initial Python 3 design >> which lacked an immutable bytes type entirely (so producing integers >> was originally the only reasonable choice). >> >> The earlier design discussions around PEP 467 (which proposes to clean >> up a few other bits and pieces of that original legacy which PEP 3137 >> left in place) all treated "bytes indexing returns an integer" as an >> unchangeable aspect of Python 3, since there wasn't an obvious way to >> migrate to instead returning length 1 bytes objects with a reasonable >> story to handle the incompatibility for Python 3 users, even if >> everyone was in favour of the end result. >> >> A few weeks ago I had an idea for a migration strategy that seemed >> feasible, and I now have a very, very preliminary proof of concept up >> at >> https://bitbucket.org/ncoghlan/cpython_sandbox/branch/bytes_migration_experiment >> >> The general principle involved would be to return an integer *subtype* >> from indexing and iteration operations on bytes, bytearray and >> memoryview objects using the "default" format character. That subtype >> would then be detected in various locations and handled the way a >> length 1 bytes object would be handled, rather than the way an integer >> would be handled. 
The current proof of concept adds such handling to >> ord(), bytes() and bytearray() (with appropriate test cases in >> test_bytes) giving the following results: >> >> >>> b'hello'[0] >> 104 >> >>> ord(b'hello'[0]) >> 104 >> >>> bytes(b'hello'[0]) >> b'h' >> >>> bytearray(b'hello'[0]) >> bytearray(b'h') >> >> (the subtype is currently visible at the Python level as >> "types._BytesInt") >> >> The proof of concept doesn't override any normal integer behaviour, >> but a more complete solution would be in a position to emit a warning >> when the result of binary indexing is used as an integer (either >> always, or controlled by a command line switch, depending on the >> performance impact). >> >> With this integer subtype in place for Python 3.5 to provide a >> transition period where both existing integer-compatible operations >> (like int() and arithmetic operations) and selected bytes-compatible >> operations (like ord(), bytes() and bytearray()) are supported, these >> operations could then be switched to producing a normal length 1 bytes >> object in Python 3.6. >> >> It wouldn't be pretty, and it would be a pain to document, but it >> seems feasible. The alternative is for PEP 367 to add a separate bytes >> > > I believe you mean PEP 467. > > >> iteration method, which strikes me as further entrenching a design we >> aren't currently happy with. >> >> Regards, >> Nick. > > > We just got rid of the mess of having multiple integer types (int vs > long), it'd be a shame to recreate that problem in any form. > > The ship has sailed. Python 3 means bytes indexing returns ints. It's well > defined and code has started to depend on it. People who want a b'A' > instead of 0x41 know to use slice notation [n:n+1] instead of [n] to get a > one byte bytes() as that is what is required in code that works in 2.6 > through 3.4 today. Anything we do to change it is going to be messier and > more mysterious. 
> > Entertaining the idea anyways: If there is going to be a new type for > bytes indexing, it needs to multiply inherit from both int and bytes so > that isinstance() checks work. We'd need to make sure all C API calls that > check for a specific type actually work with the new one as well (at first > glance I count 57 uses of PyBytes_CheckExact and PyLong_CheckExact in > CPython). The ambiguious operator * and + cases and any similar that > Nathaniel Smith pointed out would still be a problem and a potential source > of confusion for users. > > If anything, a new iteration method in PEP 467 that yields length 1 > bytes() makes *some* sense for convenience, but I don't personally see > much use for single byte iteration of any form in a high level language. > > It is odd to me that str and bytes *ever* supported iteration. How many > times have we each written code to check that a passed argument was "a > sequence but, oh, wait, not a string, because you didn't *really* mean to > do that". That was a Python 1 decision. Oops. :) > > -gps > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Jun 16 01:17:56 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 16 Jun 2014 09:17:56 +1000 Subject: [Python-ideas] A possible transition plan to bytes-based iteration and indexing for binary data In-Reply-To: References: Message-ID: On 16 Jun 2014 09:10, "Guido van Rossum" wrote: > > +1 on "the ship has sailed". Let's live with the consequences rather than introduce yet another change. The change will cause more friction than getting used to the current behavior. 
OK by me - I thought your reaction might be along those lines, which is why I posted the idea for feedback as soon as the proof of concept was even vaguely functional. I'll go back to the approach of improving the Python 3 bytes & bytearray docs before updating PEP 467 again. Cheers, Nick. > > > On Sun, Jun 15, 2014 at 10:03 AM, Gregory P. Smith wrote: >> >> >> On Sun, Jun 15, 2014 at 5:33 AM, Nick Coghlan wrote: >>> >>> At PyCon earlier this year, Guido (and others) persuaded me that the >>> integer based indexing and iteration for bytes and bytearray in Python >>> 3 was a genuine design mistake based on the initial Python 3 design >>> which lacked an immutable bytes type entirely (so producing integers >>> was originally the only reasonable choice). >>> >>> The earlier design discussions around PEP 467 (which proposes to clean >>> up a few other bits and pieces of that original legacy which PEP 3137 >>> left in place) all treated "bytes indexing returns an integer" as an >>> unchangeable aspect of Python 3, since there wasn't an obvious way to >>> migrate to instead returning length 1 bytes objects with a reasonable >>> story to handle the incompatibility for Python 3 users, even if >>> everyone was in favour of the end result. >>> >>> A few weeks ago I had an idea for a migration strategy that seemed >>> feasible, and I now have a very, very preliminary proof of concept up >>> at https://bitbucket.org/ncoghlan/cpython_sandbox/branch/bytes_migration_experiment >>> >>> The general principle involved would be to return an integer *subtype* >>> from indexing and iteration operations on bytes, bytearray and >>> memoryview objects using the "default" format character. That subtype >>> would then be detected in various locations and handled the way a >>> length 1 bytes object would be handled, rather than the way an integer >>> would be handled. 
The current proof of concept adds such handling to >>> ord(), bytes() and bytearray() (with appropriate test cases in >>> test_bytes) giving the following results: >>> >>> >>> b'hello'[0] >>> 104 >>> >>> ord(b'hello'[0]) >>> 104 >>> >>> bytes(b'hello'[0]) >>> b'h' >>> >>> bytearray(b'hello'[0]) >>> bytearray(b'h') >>> >>> (the subtype is currently visible at the Python level as "types._BytesInt") >>> >>> The proof of concept doesn't override any normal integer behaviour, >>> but a more complete solution would be in a position to emit a warning >>> when the result of binary indexing is used as an integer (either >>> always, or controlled by a command line switch, depending on the >>> performance impact). >>> >>> With this integer subtype in place for Python 3.5 to provide a >>> transition period where both existing integer-compatible operations >>> (like int() and arithmetic operations) and selected bytes-compatible >>> operations (like ord(), bytes() and bytearray()) are supported, these >>> operations could then be switched to producing a normal length 1 bytes >>> object in Python 3.6. >>> >>> It wouldn't be pretty, and it would be a pain to document, but it >>> seems feasible. The alternative is for PEP 367 to add a separate bytes >> >> >> I believe you mean PEP 467. >> >>> >>> iteration method, which strikes me as further entrenching a design we >>> aren't currently happy with. >>> >>> Regards, >>> Nick. >> >> >> We just got rid of the mess of having multiple integer types (int vs long), it'd be a shame to recreate that problem in any form. >> >> The ship has sailed. Python 3 means bytes indexing returns ints. It's well defined and code has started to depend on it. People who want a b'A' instead of 0x41 know to use slice notation [n:n+1] instead of [n] to get a one byte bytes() as that is what is required in code that works in 2.6 through 3.4 today. Anything we do to change it is going to be messier and more mysterious. 
>> >> Entertaining the idea anyways: If there is going to be a new type for bytes indexing, it needs to multiply inherit from both int and bytes so that isinstance() checks work. We'd need to make sure all C API calls that check for a specific type actually work with the new one as well (at first glance I count 57 uses of PyBytes_CheckExact and PyLong_CheckExact in CPython). The ambiguious operator * and + cases and any similar that Nathaniel Smith pointed out would still be a problem and a potential source of confusion for users. >> >> If anything, a new iteration method in PEP 467 that yields length 1 bytes() makes some sense for convenience, but I don't personally see much use for single byte iteration of any form in a high level language. >> >> It is odd to me that str and bytes ever supported iteration. How many times have we each written code to check that a passed argument was "a sequence but, oh, wait, not a string, because you didn't really mean to do that". That was a Python 1 decision. Oops. :) >> >> -gps >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > -- > --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Mon Jun 16 02:03:56 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 16 Jun 2014 12:03:56 +1200 Subject: [Python-ideas] A possible transition plan to bytes-based iteration and indexing for binary data In-Reply-To: References: Message-ID: <539E346C.7050207@canterbury.ac.nz> Gregory P. 
Smith wrote: > In order for this to work you would need to have your __future__ > statement alter the behavior of *all* [] and iteration done within the > file to conditionally take a code path that does something different iff > the type being operated on is determined at runtime to be bytes. It *could* be done. When the future statement is in effect, different bytecodes could be generated for indexing and iteration that look out for bytes and work differently. -- Greg From stefan_ml at behnel.de Mon Jun 16 09:05:25 2014 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 16 Jun 2014 09:05:25 +0200 Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python In-Reply-To: References: <539BF188.7040600@m4x.org> Message-ID: David Mertz, 14.06.2014 09:30: > On Fri, Jun 13, 2014 at 11:54 PM, Joseph Martinot-Lagarde wrote: > >> Cython compiles all python, it is not restricted. > > Well, kinda yes and no. You are correct of course, that anything that you > can execute with 'python someprog' you can compile with 'cython someprog'. > However, there is an obvious sense in which adding an annotation (which > is, of course, a syntax error for Python itself) "restricts" the code in > Cython. E.g.: > > def silly(): > cdef int n, i You can rewrite this as import cython @cython.locals(n=int, i=int) def silly(): which makes it valid Python but has the same semantics as your cdef declaration when compiled in Cython. > for i in range(10): > if i < 5: > n = i + 1 > else: > n = str(i) > > This *silly* function isn't really Python code at all, of course. But if > you ignore the annotation, it would be--pointless code, but valid. As soon > as you add the annotation, you *restrict* the type of code you can write in > the scope of the annotation. When compiled with Cython, you will get a TypeError on i == 5 (because you said so), whereas it will run through the whole loop in Python. 
Stefan From stefan_ml at behnel.de Mon Jun 16 09:10:10 2014 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 16 Jun 2014 09:10:10 +0200 Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python In-Reply-To: References: <539BF188.7040600@m4x.org> <539BFBD8.1060506@m4x.org> Message-ID: David Mertz, 14.06.2014 09:53: > moreover, (B) is not true if your existing code relies on C extensions, > such as NumPy, which mostly aren't going to run on PyPy. > > However, I do know about https://bitbucket.org/pypy/numpy. At least some > substantial part of NumPy has been ported to PyPy. This may or may not > support the code *you* need to run. Usually, when people say "my code uses NumPy", what they mean is "NumPy and parts of the surrounding ecosystem", which often includes SciPy and other specialised number crunching libraries. Porting all of that to PyPy and its numpypy reimplementation would take a while. Stefan From rosuav at gmail.com Mon Jun 16 09:15:46 2014 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 16 Jun 2014 17:15:46 +1000 Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python In-Reply-To: References: <539BF188.7040600@m4x.org> Message-ID: On Mon, Jun 16, 2014 at 5:05 PM, Stefan Behnel wrote: > You can rewrite this as > > import cython > > @cython.locals(n=int, i=int) > def silly(): > > which makes it valid Python but has the same semantics as your cdef > declaration when compiled in Cython. Syntactically valid, yes. Is there a dummy decorator class cython.locals for the case where it's running under Python? 
ChrisA From stefan_ml at behnel.de Mon Jun 16 09:34:30 2014 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 16 Jun 2014 09:34:30 +0200 Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python In-Reply-To: References: <539BF188.7040600@m4x.org> Message-ID: Chris Angelico, 16.06.2014 09:15: > On Mon, Jun 16, 2014 at 5:05 PM, Stefan Behnel wrote: >> You can rewrite this as >> >> import cython >> >> @cython.locals(n=int, i=int) >> def silly(): >> >> which makes it valid Python but has the same semantics as your cdef >> declaration when compiled in Cython. > > Syntactically valid, yes. Is there a dummy decorator class > cython.locals for the case where it's running under Python? Wouldn't make much sense otherwise, would it? :) https://github.com/cython/cython/blob/master/Cython/Shadow.py Here are some details: http://docs.cython.org/src/tutorial/pure.html Stefan From rosuav at gmail.com Mon Jun 16 09:38:32 2014 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 16 Jun 2014 17:38:32 +1000 Subject: [Python-ideas] A General Outline for Just-in-Time Acceleration of Python In-Reply-To: References: <539BF188.7040600@m4x.org> Message-ID: On Mon, Jun 16, 2014 at 5:34 PM, Stefan Behnel wrote: > Chris Angelico, 16.06.2014 09:15: >> Syntactically valid, yes. Is there a dummy decorator class >> cython.locals for the case where it's running under Python? > > Wouldn't make much sense otherwise, would it? :) > > https://github.com/cython/cython/blob/master/Cython/Shadow.py > Heh, I kinda figured it'd have to exist. Incidentally, I flipped through that source file and didn't see it - had to actually search before I found this tiny two-line function that does the job. Naturally I assumed I was looking for a class, but a function that returns an identity function of course does just as well. 
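The "tiny two-line function" Chris describes is essentially a decorator factory that swallows the type declarations and hands back the function untouched. A minimal sketch of the pattern (an illustration, not the actual Cython source):

```python
def locals(**arg_types):
    # Stand-in for cython.locals when running under plain Python:
    # accept the type declarations, ignore them, and return the
    # decorated function unchanged.
    def identity(func):
        return func
    return identity

@locals(n=int, i=int)
def silly():
    n = 0
    for i in range(5):
        n = i + 1
    return n
```

Under the compiled Cython case the declarations take effect; under plain Python the decorator is a no-op and the function behaves as ordinary Python code.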
ChrisA From npmccallum at redhat.com Mon Jun 16 20:03:30 2014 From: npmccallum at redhat.com (Nathaniel McCallum) Date: Mon, 16 Jun 2014 14:03:30 -0400 Subject: [Python-ideas] Bitwise operations on bytes class Message-ID: <1402941810.4273.26.camel@ipa.example.com> I find myself, fairly often, needing to perform bitwise operations (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes and bytearray). I can't think of any other reasonable use for these operators. Is upstream Python interested in this kind of behavior by default? At the least, it would make many algorithms very easy to read and write. Nathaniel From tjreedy at udel.edu Mon Jun 16 21:20:33 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 16 Jun 2014 15:20:33 -0400 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: <1402941810.4273.26.camel@ipa.example.com> References: <1402941810.4273.26.camel@ipa.example.com> Message-ID: On 6/16/2014 2:03 PM, Nathaniel McCallum wrote: > I find myself, fairly often, needing to perform bitwise operations > (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes > and bytearray). If you are often doing and/or/xor on large arrays, as one might do for bitmap images, you should probably be using numpy or a derivative thereof. What use do you have for shifting bits across byte boundaries, where the bytes are really bytes? Why would you not turn multiple bytes considered together into an int? > I can't think of any other reasonable use for these operators. I don't understand this. They are routinely used on ints for various purposes. 
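Terry's suggestion of treating the bytes together as a single int is already expressible via int.from_bytes()/int.to_bytes(). A sketch of the XOR case (the helper name is illustrative, and it assumes equal-length inputs):

```python
def xor_bytes(a, b):
    # Treat each byte string as one big integer, XOR, and convert back.
    # Assumes len(a) == len(b); 'big' byte order keeps the bytes aligned.
    n = int.from_bytes(a, 'big') ^ int.from_bytes(b, 'big')
    return n.to_bytes(len(a), 'big')

assert xor_bytes(b'\x0f\xf0', b'\xff\x00') == b'\xf0\xf0'
```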
-- Terry Jan Reedy From stefan_ml at behnel.de Mon Jun 16 21:25:58 2014 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 16 Jun 2014 21:25:58 +0200 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: <1402941810.4273.26.camel@ipa.example.com> References: <1402941810.4273.26.camel@ipa.example.com> Message-ID: Nathaniel McCallum, 16.06.2014 20:03: > I find myself, fairly often, needing to perform bitwise operations > (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes > and bytearray). I can't think of any other reasonable use for these > operators. Is upstream Python interested in this kind of behavior by > default? At the least, it would make many algorithms very easy to read > and write. ISTM that what you're asking for is essentially a SIMD data type, which certainly has a lot of nice applications. However, restricting it to byte values seems to be a rather niche use case to me. IMHO, this seems much better suited for the array module than the "bytes as in string" general purpose bytes type. The array module has support for all sorts of C-ish integer types. Different ways to handle errors (e.g. overflows) across the array would be another reason to not push this into the bytes type. Stefan From ethan at stoneleaf.us Mon Jun 16 21:03:08 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 16 Jun 2014 12:03:08 -0700 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: <1402941810.4273.26.camel@ipa.example.com> References: <1402941810.4273.26.camel@ipa.example.com> Message-ID: <539F3F6C.1060507@stoneleaf.us> On 06/16/2014 11:03 AM, Nathaniel McCallum wrote: > > I find myself, fairly often, needing to perform bitwise operations > (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes > and bytearray). I can't think of any other reasonable use for these > operators. Is upstream Python interested in this kind of behavior by > default? 
At the least, it would make many algorithms very easy to read > and write. Could you give a couple examples? -- ~Ethan~ From npmccallum at redhat.com Mon Jun 16 21:43:33 2014 From: npmccallum at redhat.com (Nathaniel McCallum) Date: Mon, 16 Jun 2014 15:43:33 -0400 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: References: <1402941810.4273.26.camel@ipa.example.com> Message-ID: <1402947813.4273.28.camel@ipa.example.com> On Mon, 2014-06-16 at 15:20 -0400, Terry Reedy wrote: > On 6/16/2014 2:03 PM, Nathaniel McCallum wrote: > > I find myself, fairly often, needing to perform bitwise operations > > (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes > > and bytearray). > > If you are often doing and/or/xor on large arrays, as one might do for > bitmap images, you should probably be using numpy or a derivative thereof. > > What use do you have for shifting bits across byte boundaries, where the > bytes are really bytes? Why would you not turn multiple bytes > considered together into an int? There are many reasons. Anything relating to cryptography, key derivation, asn1 BitString, etc. Many network protocols have specialized algorithms which require bit rotations or bitwise operations on blocks. > > I can't think of any other reasonable use for these operators. > > I don't understand this. They are routinely used on ints for various > purposes. I meant that, for instance, I can't think of any other reasonable interpretation for what "bytes() ^ bytes()" would mean other than a bitwise xor of the bytes in the arrays. Yes, of course the operators have meanings in other contexts. But in this context, I think the meaning of the operators is self-evident and precise in meaning. Perhaps some code will clarify what I'm proposing. Attached is a class I have found continual reuse for over the last few years. It implements bitwise operators on a bytes subclass. Something similar could be done for bytearray. 
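The attached bbytes.py was scrubbed from the archive; a sketch in the same spirit — a bytes subclass wiring ^ to per-byte XOR — might look like this (names and error handling are illustrative, not the original code):

```python
class XBytes(bytes):
    # Illustrative bytes subclass supporting bitwise XOR, as proposed.
    def __xor__(self, other):
        if len(self) != len(other):
            raise ValueError("operands must be the same length")
        # bytes() accepts any iterable of ints in range(256)
        return XBytes(a ^ b for a, b in zip(self, other))

    __rxor__ = __xor__  # XOR is commutative

key = XBytes(b'\x00\xff\x0f')
assert (key ^ b'\xff\xff\x00') == b'\xff\x00\x0f'
```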
Nathaniel -------------- next part -------------- A non-text attachment was scrubbed... Name: bbytes.py Type: text/x-python Size: 1571 bytes Desc: not available URL: From stefan_ml at behnel.de Mon Jun 16 21:55:40 2014 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 16 Jun 2014 21:55:40 +0200 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: <1402947813.4273.28.camel@ipa.example.com> References: <1402941810.4273.26.camel@ipa.example.com> <1402947813.4273.28.camel@ipa.example.com> Message-ID: Nathaniel McCallum, 16.06.2014 21:43: > Perhaps some code will clarify what I'm proposing. Attached is a class I > have found continual reuse for over the last few years. It implements > bitwise operators on a bytes subclass. Something similar could be done > for bytearray. Ok, according to your code, you don't want a SIMD type but rather an arbitrary size integer type. Why don't you just use the "int" ("long" in Py2) type for that? It has way faster operations than your multiple copy implementation. Stefan From dholth at gmail.com Mon Jun 16 22:01:11 2014 From: dholth at gmail.com (Daniel Holth) Date: Mon, 16 Jun 2014 16:01:11 -0400 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: <1402947813.4273.28.camel@ipa.example.com> References: <1402941810.4273.26.camel@ipa.example.com> <1402947813.4273.28.camel@ipa.example.com> Message-ID: Interesting idea. I like it. I notice Python 3 has int.from_bytes() and int.to_bytes(). On Mon, Jun 16, 2014 at 3:43 PM, Nathaniel McCallum wrote: > On Mon, 2014-06-16 at 15:20 -0400, Terry Reedy wrote: >> On 6/16/2014 2:03 PM, Nathaniel McCallum wrote: >> > I find myself, fairly often, needing to perform bitwise operations >> > (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes >> > and bytearray). >> >> If you are often doing and/or/xor on large arrays, as one might do for >> bitmap images, you should probably be using numpy or a derivative thereof. 
>> >> What use do you have for shifting bits across byte boundaries, where the >> bytes are really bytes? Why would you not turn multiple bytes >> considered together into an int? > > There are many reasons. Anything relating to cryptography, key > derivation, asn1 BitString, etc. Many network protocols have specialized > algorithms which require bit rotations or bitwise operations on blocks. > >> > I can't think of any other reasonable use for these operators. >> >> I don't understand this. They are routinely used on ints for various >> purposes. > > I meant that, for instance, I can't think of any other reasonable > interpretation for what "bytes() ^ bytes()" would mean other than a > bitwise xor of the bytes in the arrays. Yes, of course the operators > have meanings in other contexts. But in this context, I think the > meaning of the operators is self-evident and precise in meaning. > > Perhaps some code will clarify what I'm proposing. Attached is a class I > have found continual reuse for over the last few years. It implements > bitwise operators on a bytes subclass. Something similar could be done > for bytearray. > > Nathaniel > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From npmccallum at redhat.com Mon Jun 16 22:16:13 2014 From: npmccallum at redhat.com (Nathaniel McCallum) Date: Mon, 16 Jun 2014 16:16:13 -0400 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: References: <1402941810.4273.26.camel@ipa.example.com> <1402947813.4273.28.camel@ipa.example.com> Message-ID: <1402949773.4273.30.camel@ipa.example.com> On Mon, 2014-06-16 at 21:55 +0200, Stefan Behnel wrote: > Nathaniel McCallum, 16.06.2014 21:43: > > Perhaps some code will clarify what I'm proposing. Attached is a class I > > have found continual reuse for over the last few years. 
It implements > > bitwise operators on a bytes subclass. Something similar could be done > > for bytearray. > > Ok, according to your code, you don't want a SIMD type but rather an > arbitrary size integer type. Why don't you just use the "int" ("long" in > Py2) type for that? It has way faster operations than your multiple copy > implementation. Of course my attached code is slow. This is precisely why I'm proposing native additions to the bytes class. However, in most algorithms, there is a single operation like this on a block of data which is otherwise not treated as an integer. This operation often takes the form of something like: blocks.append(blocks[-1] ^ block) In all the surrounding code, you are dealing with bytes *as* bytes. Converting into alternate types breaks up the readability of the algorithm. And given the security requirements of such algorithms, readability is extremely important. The above code example has both simplicity and obviousness. Currently, in py3k, this is AFAICS the best alternative for readability: blocks.append(bytes(a ^ b for a, b in zip(blocks[-1], block))) While this is infinitely better than Python 2.x, I think my proposal is still significantly more readable. When implemented natively, my proposal is also far more performant than this. Nathaniel From guido at python.org Mon Jun 16 22:21:51 2014 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Jun 2014 13:21:51 -0700 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: References: <1402941810.4273.26.camel@ipa.example.com> <1402947813.4273.28.camel@ipa.example.com> Message-ID: As additional input to this discussion I would like to remind you all that it's not a good idea to have every operator apply to every data type, as this increases the chances that bugs percolate up to a point where it's hard to figure out where an unexpected value was generated. IOW, just because there's no current meaning for e.g.
b^b, that doesn't necessarily make it a good idea to add one. (There are other arguments from language usability against adding new operations indiscriminately, but this in particular jumped out at me.) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From npmccallum at redhat.com Mon Jun 16 22:22:28 2014 From: npmccallum at redhat.com (Nathaniel McCallum) Date: Mon, 16 Jun 2014 16:22:28 -0400 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: <1402949773.4273.30.camel@ipa.example.com> References: <1402941810.4273.26.camel@ipa.example.com> <1402947813.4273.28.camel@ipa.example.com> <1402949773.4273.30.camel@ipa.example.com> Message-ID: <1402950148.4273.32.camel@ipa.example.com> On Mon, 2014-06-16 at 16:16 -0400, Nathaniel McCallum wrote: > On Mon, 2014-06-16 at 21:55 +0200, Stefan Behnel wrote: > > Nathaniel McCallum, 16.06.2014 21:43: > > > Perhaps some code will clarify what I'm proposing. Attached is a class I > > > have found continual reuse for over the last few years. It implements > > > bitwise operators on a bytes subclass. Something similar could be done > > > for bytearray. > > > > Ok, according to your code, you don't want a SIMD type but rather an > > arbitrary size integer type. Why don't you just use the "int" ("long" in > > Py2) type for that? It has way faster operations than your multiple copy > > implementation. > > Of course my attached code is slow. This is precisely why I'm proposing > native additions to the bytes class. > > However, in most algorithms, there is a single operation like this on a > block of data which is otherwise not treated as an integer. This > operation often takes the form of something like: > > blocks.append(blocks[-1] ^ block) > > In all the surrounding code, you are dealing with bytes *as* bytes. > Converting into alternate types breaks up the readability of the > algorithm. 
And given the security requirements of such algorithms, > readability is extremely important. > > The above code example has both simplicity and obviousness. Currently, > in py3k, this is AFAICS the best alternative for readability: > > blocks.append(bytes(a ^ b for a, b in zip(blocks[-1], block))) > > While this is infinitely better than Python 2.x, I think my proposal is > still significantly more readable. When implemented natively, my > proposal is also far more performant than this. Also, when implemented on bytearray, you can get things like this: cksum ^= block. This can be very fast as it can be done with no copies. It is also extremely readable. Nathaniel From npmccallum at redhat.com Mon Jun 16 22:28:00 2014 From: npmccallum at redhat.com (Nathaniel McCallum) Date: Mon, 16 Jun 2014 16:28:00 -0400 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: References: <1402941810.4273.26.camel@ipa.example.com> <1402947813.4273.28.camel@ipa.example.com> Message-ID: <1402950480.4273.34.camel@ipa.example.com> On Mon, 2014-06-16 at 13:21 -0700, Guido van Rossum wrote: > As additional input to this discussion I would like to remind you all > that it's not a good idea to have every operator apply to every data > type, as this increases the chances that bugs percolate up to a point > where it's hard to figure out where an unexpected value was generated. > IOW, just because there's no current meaning for e.g. b^b, that > doesn't necessarily make it a good idea to add one. (There are other > arguments from language usability against adding new operations > indiscriminately, but this in particular jumped out at me.) Agreed. My only thought here was that this addition seems to me to be extremely natural and emulates the precise grammar that is very often seen in algorithms in IETF RFCs (for instance). But the precise threshold of "too many operators" can be difficult to gauge. That is probably above my pay grade.
:) Nathaniel From antoine at python.org Mon Jun 16 22:38:00 2014 From: antoine at python.org (Antoine Pitrou) Date: Mon, 16 Jun 2014 16:38:00 -0400 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: <1402950480.4273.34.camel@ipa.example.com> References: <1402941810.4273.26.camel@ipa.example.com> <1402947813.4273.28.camel@ipa.example.com> <1402950480.4273.34.camel@ipa.example.com> Message-ID: There's a bitstring package on PyPI, perhaps it has the desired operations: https://pypi.python.org/pypi/bitstring/ Regards Antoine. On 16/06/2014 16:28, Nathaniel McCallum wrote: > On Mon, 2014-06-16 at 13:21 -0700, Guido van Rossum wrote: >> As additional input to this discussion I would like to remind you all >> that it's not a good idea to have every operator apply to every data >> type, as this increases the chances that bugs percolate up to a point >> where it's hard to figure out where an unexpected value was generated. >> IOW, just because there's no current meaning for e.g. b^b, that >> doesn't necessarily make it a good idea to add one. (There are other >> arguments from language usability against adding new operations >> indiscriminately, but this in particular jumped out at me.) > > Agreed. My only thought here was that this addition seems to me to be > extremely natural and emulates the precise grammar that is very often > seen in algorithms in IETF RFCs (for instance). But the precise > threshold of "too many operators" can be difficult to gauge. That is > probably above my pay grade.
:) > > Nathaniel > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From greg.ewing at canterbury.ac.nz Mon Jun 16 23:53:03 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Jun 2014 09:53:03 +1200 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: <1402949773.4273.30.camel@ipa.example.com> References: <1402941810.4273.26.camel@ipa.example.com> <1402947813.4273.28.camel@ipa.example.com> <1402949773.4273.30.camel@ipa.example.com> Message-ID: <539F673F.4000101@canterbury.ac.nz> Nathaniel McCallum wrote: > In all the surrounding code, you are dealing with bytes *as* bytes. > Converting into alternate types breaks up the readability of the > algorithm. And given the security requirements of such algorithms, > readability is extremely important. Not to mention needlessly inefficient. There's also the issue that you are usually dealing with a specific number of bits. When you convert to an int, you lose any notion of it having a size, so you have to keep track of that separately, and take its effect on the bitwise operations into account manually. E.g. the bitwise complement of an N-bit string is another N-bit string. But the bitwise complement of a positive int is a bit string with an infinite number of leading 1 bits, which you have to mask off. The bitwise complement of a bytes object, on the other hand, would be another bytes object of the same size. 
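Greg's point about widths can be made concrete with today's types: the per-byte complement keeps its size naturally, while the int route needs an explicit mask:

```python
data = b'\xf0\x0f'

# Per-byte complement: the 2-byte width is preserved for free.
complemented = bytes(b ^ 0xFF for b in data)
assert complemented == b'\x0f\xf0'

# Via int, ~ yields a negative number (conceptually infinite leading
# 1-bits), so the width must be masked back in by hand.
n = int.from_bytes(data, 'big')
masked = (~n) & ((1 << (8 * len(data))) - 1)
assert masked.to_bytes(len(data), 'big') == b'\x0f\xf0'
```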
-- Greg From ncoghlan at gmail.com Tue Jun 17 00:48:51 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 17 Jun 2014 08:48:51 +1000 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: <1402947813.4273.28.camel@ipa.example.com> References: <1402941810.4273.26.camel@ipa.example.com> <1402947813.4273.28.camel@ipa.example.com> Message-ID: On 17 Jun 2014 05:44, "Nathaniel McCallum" wrote: > > On Mon, 2014-06-16 at 15:20 -0400, Terry Reedy wrote: > > On 6/16/2014 2:03 PM, Nathaniel McCallum wrote: > > > I find myself, fairly often, needing to perform bitwise operations > > > (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes > > > and bytearray). > > > > If you are often doing and/or/xor on large arrays, as one might do for > > bitmap images, you should probably be using numpy or a derivative thereof. > > > > What use do you have for shifting bits across byte boundaries, where the > > bytes are really bytes? Why would you not turn multiple bytes > > considered together into an int? > > There are many reasons. Anything relating to cryptography, key > derivation, asn1 BitString, etc. Many network protocols have specialized > algorithms which require bit rotations or bitwise operations on blocks. I used to want something like this when trying to deal with bit slips on serial channels - sliding a pattern one bit to the left or right was a pain. It makes more sense on the bytes type to me than it does on multibyte array formats (which would suffer from messy endianness issues). As Nathaniel noted, there's no other obvious meaning for these operations on the binary data types, and it would definitely make bitbashing in Python easier (something that will only become more common with the rise of things like Arduino, Raspberry Pi and MicroPython). Cheers, Nick. > > > I can't think of any other reasonable use for these operators. > > > > I don't understand this. They are routinely used on ints for various > > purposes. 
> > I meant that, for instance, I can't think of any other reasonable > interpretation for what "bytes() ^ bytes()" would mean other than a > bitwise xor of the bytes in the arrays. Yes, of course the operators > have meanings in other contexts. But in this context, I think the > meaning of the operators is self-evident and precise in meaning. > > Perhaps some code will clarify what I'm proposing. Attached is a class I > have found continual reuse for over the last few years. It implements > bitwise operators on a bytes subclass. Something similar could be done > for bytearray. > > Nathaniel > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Tue Jun 17 00:59:30 2014 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 17 Jun 2014 08:59:30 +1000 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: <1402949773.4273.30.camel@ipa.example.com> References: <1402941810.4273.26.camel@ipa.example.com> <1402947813.4273.28.camel@ipa.example.com> <1402949773.4273.30.camel@ipa.example.com> Message-ID: On Tue, Jun 17, 2014 at 6:16 AM, Nathaniel McCallum wrote: > Of course my attached code is slow. This is precisely why I'm proposing > native additions to the bytes class. I presume you're aware that the bytes type is immutable, right? You're still going to have at least some copying going on, whereas with a mutable type you might well be able to avoid that. Efficiency suggests bytearray instead. 
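The copy-avoiding variant Chris alludes to is already expressible today by mutating a bytearray in place (a plain-Python sketch; a native implementation would of course be faster, and the helper name is illustrative):

```python
def xor_inplace(acc, block):
    # XOR `block` into the bytearray `acc` in place: no new object is
    # allocated for the result, unlike `bytes ^ bytes`, which must copy.
    for i, b in enumerate(block):
        acc[i] ^= b
    return acc

cksum = bytearray(b'\x00\x00')
xor_inplace(cksum, b'\x0f\xf0')
xor_inplace(cksum, b'\xff\xff')
assert cksum == bytearray(b'\xf0\x0f')
```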
ChrisA From greg.ewing at canterbury.ac.nz Tue Jun 17 02:00:20 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Jun 2014 12:00:20 +1200 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: References: <1402941810.4273.26.camel@ipa.example.com> <1402947813.4273.28.camel@ipa.example.com> <1402949773.4273.30.camel@ipa.example.com> Message-ID: <539F8514.3030303@canterbury.ac.nz> Chris Angelico wrote: > I presume you're aware that the bytes type is immutable, right? You're > still going to have at least some copying going on, whereas with a > mutable type you might well be able to avoid that. Efficiency suggests > bytearray instead. Why not both? -- Greg From rosuav at gmail.com Tue Jun 17 02:03:24 2014 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 17 Jun 2014 10:03:24 +1000 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: <539F8514.3030303@canterbury.ac.nz> References: <1402941810.4273.26.camel@ipa.example.com> <1402947813.4273.28.camel@ipa.example.com> <1402949773.4273.30.camel@ipa.example.com> <539F8514.3030303@canterbury.ac.nz> Message-ID: On Tue, Jun 17, 2014 at 10:00 AM, Greg Ewing wrote: > Chris Angelico wrote: >> >> I presume you're aware that the bytes type is immutable, right? You're >> still going to have at least some copying going on, whereas with a >> mutable type you might well be able to avoid that. Efficiency suggests >> bytearray instead. > > > Why not both? If you do a series of operations on a large bytes object, each one will involve a full copy. If you do the same series of operations on a large mutable object, they can be optimized down to non-copying. Why both? 
ChrisA From steve at pearwood.info Tue Jun 17 02:55:33 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 17 Jun 2014 10:55:33 +1000 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: References: <1402941810.4273.26.camel@ipa.example.com> <1402947813.4273.28.camel@ipa.example.com> <1402949773.4273.30.camel@ipa.example.com> Message-ID: <20140617005533.GF7742@ando> On Tue, Jun 17, 2014 at 08:59:30AM +1000, Chris Angelico wrote: > On Tue, Jun 17, 2014 at 6:16 AM, Nathaniel McCallum > wrote: > > Of course my attached code is slow. This is precisely why I'm proposing > > native additions to the bytes class. > > I presume you're aware that the bytes type is immutable, right? You're > still going to have at least some copying going on, whereas with a > mutable type you might well be able to avoid that. Efficiency suggests > bytearray instead. The very first sentence of Nathaniel's first post in this thread: "I find myself, fairly often, needing to perform bitwise operations (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes and bytearray)." So yes, I think he is aware of it :-) -- Steven From ncoghlan at gmail.com Tue Jun 17 08:02:36 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 17 Jun 2014 16:02:36 +1000 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: References: <1402941810.4273.26.camel@ipa.example.com> <1402947813.4273.28.camel@ipa.example.com> <1402949773.4273.30.camel@ipa.example.com> <539F8514.3030303@canterbury.ac.nz> Message-ID: On 17 Jun 2014 10:04, "Chris Angelico" wrote: > > On Tue, Jun 17, 2014 at 10:00 AM, Greg Ewing > wrote: > > Chris Angelico wrote: > >> > >> I presume you're aware that the bytes type is immutable, right? You're > >> still going to have at least some copying going on, whereas with a > >> mutable type you might well be able to avoid that. Efficiency suggests > >> bytearray instead. > > > > > > Why not both? 
> > If you do a series of operations on a large bytes object, each one > will involve a full copy. If you do the same series of operations on a > large mutable object, they can be optimized down to non-copying. Why > both? Because the two APIs are currently in sync outside mutating operations, and there isn't a compelling reason to break that symmetry, even if this proposal was put forward as a PEP and ultimately accepted. Cheers, Nick. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Tue Jun 17 08:03:40 2014 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 17 Jun 2014 16:03:40 +1000 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: References: <1402941810.4273.26.camel@ipa.example.com> <1402947813.4273.28.camel@ipa.example.com> <1402949773.4273.30.camel@ipa.example.com> <539F8514.3030303@canterbury.ac.nz> Message-ID: On Tue, Jun 17, 2014 at 4:02 PM, Nick Coghlan wrote: > Because the two APIs are currently in sync outside mutating operations, and > there isn't a compelling reason to break that symmetry, even if this > proposal was put forward as a PEP and ultimately accepted. Ah! That would be why. Sorry for the noise! 
ChrisA From ncoghlan at gmail.com Tue Jun 17 10:36:42 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 17 Jun 2014 18:36:42 +1000 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: References: <1402941810.4273.26.camel@ipa.example.com> <1402947813.4273.28.camel@ipa.example.com> <1402949773.4273.30.camel@ipa.example.com> <539F8514.3030303@canterbury.ac.nz> Message-ID: On 17 June 2014 16:03, Chris Angelico wrote: > On Tue, Jun 17, 2014 at 4:02 PM, Nick Coghlan wrote: >> Because the two APIs are currently in sync outside mutating operations, and >> there isn't a compelling reason to break that symmetry, even if this >> proposal was put forward as a PEP and ultimately accepted. > > Ah! That would be why. Sorry for the noise! Clarifying non-obvious design principles isn't noise on python-ideas, it's one of the reasons the list exists :) Cheers, Nick. From rosuav at gmail.com Tue Jun 17 10:40:55 2014 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 17 Jun 2014 18:40:55 +1000 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: References: <1402941810.4273.26.camel@ipa.example.com> <1402947813.4273.28.camel@ipa.example.com> <1402949773.4273.30.camel@ipa.example.com> <539F8514.3030303@canterbury.ac.nz> Message-ID: On Tue, Jun 17, 2014 at 6:36 PM, Nick Coghlan wrote: > On 17 June 2014 16:03, Chris Angelico wrote: >> On Tue, Jun 17, 2014 at 4:02 PM, Nick Coghlan wrote: >>> Because the two APIs are currently in sync outside mutating operations, and >>> there isn't a compelling reason to break that symmetry, even if this >>> proposal was put forward as a PEP and ultimately accepted. >> >> Ah! That would be why. Sorry for the noise! 
>
> Clarifying non-obvious design principles isn't noise on python-ideas,
> it's one of the reasons the list exists :)

Then I'm glad to have been able to play the role of The Watson [1] for
the benefit of the audience :)

ChrisA

[1] http://tvtropes.org/pmwiki/pmwiki.php/Main/TheWatson

From npmccallum at redhat.com  Tue Jun 17 15:24:57 2014
From: npmccallum at redhat.com (Nathaniel McCallum)
Date: Tue, 17 Jun 2014 09:24:57 -0400
Subject: [Python-ideas] Bitwise operations on bytes class
In-Reply-To: <539F673F.4000101@canterbury.ac.nz>
References: <1402941810.4273.26.camel@ipa.example.com>
	<1402947813.4273.28.camel@ipa.example.com>
	<1402949773.4273.30.camel@ipa.example.com>
	<539F673F.4000101@canterbury.ac.nz>
Message-ID: <1403011497.4273.36.camel@ipa.example.com>

On Tue, 2014-06-17 at 09:53 +1200, Greg Ewing wrote:
> Nathaniel McCallum wrote:
> > In all the surrounding code, you are dealing with bytes *as* bytes.
> > Converting into alternate types breaks up the readability of the
> > algorithm. And given the security requirements of such algorithms,
> > readability is extremely important.
>
> Not to mention needlessly inefficient.
>
> There's also the issue that you are usually dealing
> with a specific number of bits. When you convert to
> an int, you lose any notion of it having a size, so
> you have to keep track of that separately, and take
> its effect on the bitwise operations into account
> manually.
>
> E.g. the bitwise complement of an N-bit string is
> another N-bit string. But the bitwise complement of
> a positive int is a bit string with an infinite
> number of leading 1 bits, which you have to mask
> off. The bitwise complement of a bytes object, on
> the other hand, would be another bytes object of
> the same size.
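A quick sketch makes the size-tracking difference concrete (illustrative only; the `complement` helper below is a hypothetical name, not part of the proposal):

```python
# With ints, ~ conceptually sets an infinite run of leading 1 bits
# (the result is negative), so the width must be tracked by hand:
n = int.from_bytes(b"\x0f\xf0", "big")
masked = ~n & 0xFFFF          # manually mask back down to 16 bits

# A per-byte complement keeps the size implicit in the data itself:
def complement(data: bytes) -> bytes:
    return bytes(~b & 0xFF for b in data)

assert complement(b"\x0f\xf0") == b"\xf0\x0f"
assert masked == int.from_bytes(complement(b"\x0f\xf0"), "big")
```

With a bytes-native operator, that masking and length bookkeeping would be implicit in the object's own size.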
+1

From storchaka at gmail.com  Tue Jun 17 21:29:56 2014
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 17 Jun 2014 22:29:56 +0300
Subject: [Python-ideas] Bitwise operations on bytes class
In-Reply-To: 
References: <1402941810.4273.26.camel@ipa.example.com>
	<1402947813.4273.28.camel@ipa.example.com>
	<1402950480.4273.34.camel@ipa.example.com>
Message-ID: 

On 16.06.14 23:38, Antoine Pitrou wrote:
>
> There's a bitstring package on PyPI, perhaps it has the desired operations:
> https://pypi.python.org/pypi/bitstring/

And bitarray: https://pypi.python.org/pypi/bitarray

From ethan at stoneleaf.us  Tue Jun 17 21:35:02 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 17 Jun 2014 12:35:02 -0700
Subject: [Python-ideas] Bitwise operations on bytes class
In-Reply-To: <1402941810.4273.26.camel@ipa.example.com>
References: <1402941810.4273.26.camel@ipa.example.com>
Message-ID: <53A09866.7010509@stoneleaf.us>

On 06/16/2014 11:03 AM, Nathaniel McCallum wrote:
>
> I find myself, fairly often, needing to perform bitwise operations
> (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes
> and bytearray). I can't think of any other reasonable use for these
> operators. Is upstream Python interested in this kind of behavior by
> default? At the least, it would make many algorithms very easy to read
> and write.

I like the idea, but one question I have: when shifting, are the
incoming bits set to 0 or 1?  Why?

--
~Ethan~

From antoine at python.org  Tue Jun 17 22:37:29 2014
From: antoine at python.org (Antoine Pitrou)
Date: Tue, 17 Jun 2014 16:37:29 -0400
Subject: [Python-ideas] Bitwise operations on bytes class
In-Reply-To: <53A09866.7010509@stoneleaf.us>
References: <1402941810.4273.26.camel@ipa.example.com>
	<53A09866.7010509@stoneleaf.us>
Message-ID: 

On 17/06/2014 15:35, Ethan Furman wrote:
>
> I like the idea, but one question I have: when shifting, are the
> incoming bits set to 0 or 1? Why?

By convention, 0.
Historically, that's how CPUs do it. (and also because it provides a quick way of multiplying / dividing by 2^N). Regards Antoine. From python at mrabarnett.plus.com Tue Jun 17 23:33:29 2014 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 17 Jun 2014 22:33:29 +0100 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: References: <1402941810.4273.26.camel@ipa.example.com> <53A09866.7010509@stoneleaf.us> Message-ID: <53A0B429.1060405@mrabarnett.plus.com> On 2014-06-17 21:37, Antoine Pitrou wrote: > Le 17/06/2014 15:35, Ethan Furman a ?crit : >> >> I like the idea, but one question I have: when shifting, are the >> incoming bits set to 0 or 1? Why? > > By convention, 0. Historically, that's how CPUs do it. > (and also because it provides a quick way of multiplying / dividing by 2^N). > That's sometimes known as a "logical shift". When shifting to the right, there's also the "arithmetic shift", which preserves the most significant bit. Do we need that too? (I don't think so.) If yes, then what should be operator be? Just a 'normal' method call? From ncoghlan at gmail.com Wed Jun 18 00:10:26 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 Jun 2014 08:10:26 +1000 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: <53A0B429.1060405@mrabarnett.plus.com> References: <1402941810.4273.26.camel@ipa.example.com> <53A09866.7010509@stoneleaf.us> <53A0B429.1060405@mrabarnett.plus.com> Message-ID: On 18 Jun 2014 07:34, "MRAB" wrote: > > On 2014-06-17 21:37, Antoine Pitrou wrote: >> >> Le 17/06/2014 15:35, Ethan Furman a ?crit : >>> >>> >>> I like the idea, but one question I have: when shifting, are the >>> incoming bits set to 0 or 1? Why? >> >> >> By convention, 0. Historically, that's how CPUs do it. >> (and also because it provides a quick way of multiplying / dividing by 2^N). >> > That's sometimes known as a "logical shift". 
My bitbashing-with-Python work was all serial communications protocol based, so logical shifts were what I wanted (I was also in the fortunate position of being able to tolerate the slow speed of doing them in Python, because HF radio comms are so slow the data streams to be analysed weren't very big). > When shifting to the right, there's also the "arithmetic shift", which > preserves the most significant bit. > > Do we need that too? (I don't think so.) If yes, then what should be > operator be? Just a 'normal' method call? Wanting an arithmetic shift would be a sign that one is working with integers rather than arbitrary binary data, and ints or one of the fixed width types from NumPy would likely be a better fit. So leaving that out of any proposal sounds fine to me. Cheers, Nick. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Wed Jun 18 01:30:36 2014 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 18 Jun 2014 00:30:36 +0100 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: References: <1402941810.4273.26.camel@ipa.example.com> <53A09866.7010509@stoneleaf.us> <53A0B429.1060405@mrabarnett.plus.com> Message-ID: <53A0CF9C.9090302@mrabarnett.plus.com> On 2014-06-17 23:10, Nick Coghlan wrote: > > On 18 Jun 2014 07:34, "MRAB" wrote: > > > > On 2014-06-17 21:37, Antoine Pitrou wrote: > >> > >> Le 17/06/2014 15:35, Ethan Furman a ?crit : > >>> > >>> > >>> I like the idea, but one question I have: when shifting, are the > >>> incoming bits set to 0 or 1? Why? > >> > >> > >> By convention, 0. Historically, that's how CPUs do it. > >> (and also because it provides a quick way of multiplying / dividing by 2^N). 
> >> > > That's sometimes known as a "logical shift". > > My bitbashing-with-Python work was all serial communications protocol based, so logical shifts were what I wanted (I was also in the fortunate position of being able to tolerate the slow speed of doing them in Python, because HF radio comms are so slow the data streams to be analysed weren't very big). > > > When shifting to the right, there's also the "arithmetic shift", which > > preserves the most significant bit. > > > > Do we need that too? (I don't think so.) If yes, then what should be > > operator be? Just a 'normal' method call? > > Wanting an arithmetic shift would be a sign that one is working with integers rather than arbitrary binary data, and ints or one of the fixed width types from NumPy would likely be a better fit. So leaving that out of any proposal sounds fine to me. > What about rotates? From ncoghlan at gmail.com Wed Jun 18 04:34:42 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 Jun 2014 12:34:42 +1000 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: <53A0CF9C.9090302@mrabarnett.plus.com> References: <1402941810.4273.26.camel@ipa.example.com> <53A09866.7010509@stoneleaf.us> <53A0B429.1060405@mrabarnett.plus.com> <53A0CF9C.9090302@mrabarnett.plus.com> Message-ID: On 18 Jun 2014 09:31, "MRAB" wrote: > > On 2014-06-17 23:10, Nick Coghlan wrote: > > > > Wanting an arithmetic shift would be a sign that one is working with integers rather than arbitrary binary data, and ints or one of the fixed width types from NumPy would likely be a better fit. So leaving that out of any proposal sounds fine to me. > > > What about rotates? Bitwise rotation would be a bit of a pain to build on top of bitwise masking and logical shifts, but it could be done, so I think it would make more sense to keep a proposal minimal. Cheers, Nick. 
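For reference, rotation really can be layered on logical shifts plus masking along these lines — a sketch with an invented helper name (`rotate_left` is not part of any proposal here):

```python
def rotate_left(data: bytes, nbits: int) -> bytes:
    """Rotate a byte string left by nbits, using only shifts and masks.

    Assumes data is non-empty; a real implementation would guard
    against len(data) == 0 before the modulo below.
    """
    width = len(data) * 8
    nbits %= width
    n = int.from_bytes(data, "big")
    # Shift left, OR back in the bits that fell off the top,
    # then mask the result down to the original width.
    rotated = ((n << nbits) | (n >> (width - nbits))) & ((1 << width) - 1)
    return rotated.to_bytes(len(data), "big")

assert rotate_left(b"\x81\x00", 1) == b"\x02\x01"
assert rotate_left(b"\x01", 8) == b"\x01"   # full rotation is a no-op
```

The int round-trip keeps the sketch short; a bytes-native version would do the same thing byte by byte.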
> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From npmccallum at redhat.com Wed Jun 18 07:03:02 2014 From: npmccallum at redhat.com (Nathaniel McCallum) Date: Wed, 18 Jun 2014 01:03:02 -0400 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: References: <1402941810.4273.26.camel@ipa.example.com> <53A09866.7010509@stoneleaf.us> <53A0B429.1060405@mrabarnett.plus.com> <53A0CF9C.9090302@mrabarnett.plus.com> Message-ID: <1403067782.6477.3.camel@ipa.example.com> On Wed, 2014-06-18 at 12:34 +1000, Nick Coghlan wrote: > > On 18 Jun 2014 09:31, "MRAB" wrote: > > > > On 2014-06-17 23:10, Nick Coghlan wrote: > > > > > > Wanting an arithmetic shift would be a sign that one is working > with integers rather than arbitrary binary data, and ints or one of > the fixed width types from NumPy would likely be a better fit. So > leaving that out of any proposal sounds fine to me. > > > > > What about rotates? > > Bitwise rotation would be a bit of a pain to build on top of bitwise > masking and logical shifts, but it could be done, so I think it would > make more sense to keep a proposal minimal. Agreed. The code that I attached to one of my early replies actually implemented rotate, but I don't think that is what should be implemented by default in this proposal. Nathaniel From pcmanticore at gmail.com Wed Jun 18 12:23:26 2014 From: pcmanticore at gmail.com (Claudiu Popa) Date: Wed, 18 Jun 2014 13:23:26 +0300 Subject: [Python-ideas] Improving xmlrpc introspection Message-ID: Hello. This idea proposes enhancing the xmlrpc library by adding a couple of introspectable servers and proxies. For instance, here's an output of using the current idioms. 
>>> proxy = ServerProxy('http://localhost:8000') >>> dir(proxy) ['_ServerProxy__allow_none', '_ServerProxy__close', '_ServerProxy__encoding', '_ServerProxy__handler', '_ServerProxy__host', '_ServerProxy__request', '_ServerProxy__transport', '_ServerProxy__verbose', '__call__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattr__' , '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__'] Nothing useful in dir. The following works only if the server enables introspection: >>> proxy.system.listMethods() ['mul', 'pow', 'system.listMethods', 'system.methodHelp', 'system.methodSignature'] Now, let's see what mul does: >>> proxy.mul >>> help(proxy.mul) Help on _Method in module xmlrpc.client object: class _Method(builtins.object) | Methods defined here: | | __call__(self, *args) | | __getattr__(self, name) | | __init__(self, send, name) | # some magic to bind an XML-RPC method to an RPC server. | # supports "nested" methods (e.g. examples.getStateName) | | ---------------------------------------------------------------------- | Data descriptors defined here: | | __dict__ | dictionary for instance variables (if defined) | | __weakref__ | list of weak references to the object (if defined) Nothing useful for us. Neither methodHelp, nor methodSignature are very useful: >>> proxy.system.methodHelp('mul') 'multiplication' >>> proxy.system.methodSignature('mul') 'signatures not supported' We can find out something about that method by calling it. 
>>> proxy.mul(1, 2, 3)
Traceback (most recent call last):
  File "", line 1, in 
  File "D:\Projects\cpython\lib\xmlrpc\client.py", line 1091, in __call__
    return self.__send(self.__name, args)
  File "D:\Projects\cpython\lib\xmlrpc\client.py", line 1421, in __request
    verbose=self.__verbose
  File "D:\Projects\cpython\lib\xmlrpc\client.py", line 1133, in request
    return self.single_request(host, handler, request_body, verbose)
  File "D:\Projects\cpython\lib\xmlrpc\client.py", line 1149, in single_request
    return self.parse_response(resp)
  File "D:\Projects\cpython\lib\xmlrpc\client.py", line 1320, in parse_response
    return u.close()
  File "D:\Projects\cpython\lib\xmlrpc\client.py", line 658, in close
    raise Fault(**self._stack[0])
xmlrpc.client.Fault: :mul() takes 3 positional arguments but 4 were given">

So, only after calling a method, one can find meaningful information
about it.
My idea behaves like this:

>>> from xmlrpc.client import MagicProxy  # not a very good name, but it does some magic behind
>>> proxy = MagicProxy('http://localhost:8000')
>>> dir(proxy)
['_ServerProxy__allow_none', '_ServerProxy__close',
'_ServerProxy__encoding', '_ServerProxy__handler',
'_ServerProxy__host', '_ServerProxy__request', '_ServerProxy__trans
', '_ServerProxy__verbose', '__call__', '__class__', '__delattr__',
'__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__',
'__format__', '__ge__',
'__getattr__', '__getattribute__', '__gt__', '__hash__', '__init__',
'__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__',
'__reduce_ex__', '__repr__', '__setattr__',
'__sizeof__', '__str__', '__subclasshook__', '__weakref__',
'_collect_methods', '_original_mul', '_original_pow', 'mul', 'pow']
>>> proxy.mul

>>> proxy.pow

>>> help(proxy.mul)
Help on function mul in module xmlrpc.client:

mul(x:1, y) -> 2
    multiplication

>>> help(proxy.pow)
Help on function pow in module xmlrpc.client:

pow(*args, **kwargs)
    pow(x, y[, z]) -> number

    With two arguments, equivalent to x**y.
    With three arguments, equivalent to (x**y) % z, but may be more
    efficient (e.g. for ints).

>>> proxy.mul(1)
Traceback (most recent call last):
  File "", line 1, in 
TypeError: mul() missing 1 required positional argument: 'y'
>>> proxy.mul(1, 2, 3)
Traceback (most recent call last):
  File "", line 1, in 
TypeError: mul() takes 2 positional arguments but 3 were given
>>> proxy.mul(1, 2)
2
>>> import inspect
>>> inspect.signature(proxy.mul)
 2">
>>>

As we can see, the registered methods can be introspected and calling
one with the wrong number of arguments will not trigger a request to
the server, but will fail right in the user's code.
As a problem, it will work only for servers written in Python. For
others, it will fall back to the current idiom.
Would something like this be useful as an addition to the stdlib's
xmlrpc module?
If someone wants to test it, here's a rough patch against tip:
https://gist.github.com/PCManticore/cf82ab421d4dc5c7f6ff.

Thanks!

From mail at robertlehmann.de  Wed Jun 18 13:25:50 2014
From: mail at robertlehmann.de (Robert Lehmann)
Date: Wed, 18 Jun 2014 13:25:50 +0200
Subject: [Python-ideas] Really support custom types for global namespace
Message-ID: 

[resending w/o Google Groups]

I'm not sure if this is a beaten horse; I could only find vaguely
related discussions on other scoping issues (so please, by all means,
point me to past discussions of what I propose.)

The interpreter currently supports setting a custom type for globals()
and overriding __getitem__.
The same is not true for __setitem__: class Namespace(dict): def __getitem__(self, key): print("getitem", key) def __setitem__(self, key, value): print("setitem", key, value) def fun(): global x, y x # should call globals.__getitem__ y = 1 # should call globals.__setitem__ dis.dis(fun) # 3 0 LOAD_GLOBAL 0 (x) # 3 POP_TOP # # 4 4 LOAD_CONST 1 (1) # 7 STORE_GLOBAL 1 (y) # 10 LOAD_CONST 0 (None) # 13 RETURN_VALUE exec(fun.__code__, Namespace()) # => getitem x # no setitem :-( I think it is weird why reading global variables goes through the usual magic methods just fine, while writing does not. The behaviour seems to have been introduced in Python 3.3.x (commit e3ab8aa ) to support custom __builtins__. The documentation is fuzzy on this issue: If only globals is provided, it must be a dictionary, which will be used > for both the global and the local variables. If globals and locals are > given, they are used for the global and local variables, respectively. If > provided, locals can be any mapping object. People at python-list were at odds if this was a bug, unspecified/unsupported behaviour, or a deliberate design decision. If it is just unsupported, I don't think the asymmetry makes it any better. If it is deliberate, I don't understand why dispatching on the dictness of globals (PyDict_CheckExact(f_globals)) is good enough for LOAD_GLOBAL, but not for STORE_GLOBAL in terms of performance. I have a patch (+ tests) to the current default branch straightening out this asymmetry and will happily open a ticket if you think this is indeed a bug. Thanks in advance, Robert -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Wed Jun 18 16:42:21 2014 From: guido at python.org (Guido van Rossum) Date: Wed, 18 Jun 2014 07:42:21 -0700 Subject: [Python-ideas] Improving xmlrpc introspection In-Reply-To: References: Message-ID: Since this is an internet client+server, can you please also consider security as part of your design? Perhaps it's not always a good idea to have that much introspectability on a web interface. On Wed, Jun 18, 2014 at 3:23 AM, Claudiu Popa wrote: > Hello. > > This idea proposes enhancing the xmlrpc library by adding a couple > of introspectable servers and proxies. For instance, here's an output of > using the current idioms. > > >>> proxy = ServerProxy('http://localhost:8000') > >>> dir(proxy) > ['_ServerProxy__allow_none', '_ServerProxy__close', > '_ServerProxy__encoding', '_ServerProxy__handler', > '_ServerProxy__host', '_ServerProxy__request', > '_ServerProxy__transport', '_ServerProxy__verbose', '__call__', > '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', > '__enter__', '__eq__', '__exit__', '__format__', '__ge__', > '__getattr__' > , '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', > '__lt__', '__module__', '__ne__', '__new__', '__reduce__', > '__reduce_ex__', '__repr__', '__setattr__', > '__sizeof__', '__str__', '__subclasshook__', '__weakref__'] > > > Nothing useful in dir. The following works only if the server enables > introspection: > > >>> proxy.system.listMethods() > ['mul', 'pow', 'system.listMethods', 'system.methodHelp', > 'system.methodSignature'] > > Now, let's see what mul does: > > >>> proxy.mul > > >>> help(proxy.mul) > Help on _Method in module xmlrpc.client object: > > class _Method(builtins.object) > | Methods defined here: > | > | __call__(self, *args) > | > | __getattr__(self, name) > | > | __init__(self, send, name) > | # some magic to bind an XML-RPC method to an RPC server. > | # supports "nested" methods (e.g. 
examples.getStateName) > | > | ---------------------------------------------------------------------- > | Data descriptors defined here: > | > | __dict__ > | dictionary for instance variables (if defined) > | > | __weakref__ > | list of weak references to the object (if defined) > > > > Nothing useful for us. Neither methodHelp, nor methodSignature are very > useful: > > >>> proxy.system.methodHelp('mul') > 'multiplication' > >>> proxy.system.methodSignature('mul') > 'signatures not supported' > > > We can find out something about that method by calling it. > > >>> proxy.mul(1, 2, 3) > Traceback (most recent call last): > File "", line 1, in > File "D:\Projects\cpython\lib\xmlrpc\client.py", line 1091, in __call__ > return self.__send(self.__name, args) > File "D:\Projects\cpython\lib\xmlrpc\client.py", line 1421, in __request > verbose=self.__verbose > File "D:\Projects\cpython\lib\xmlrpc\client.py", line 1133, in request > return self.single_request(host, handler, request_body, verbose) > File "D:\Projects\cpython\lib\xmlrpc\client.py", line 1149, in > single_request > return self.parse_response(resp) > File "D:\Projects\cpython\lib\xmlrpc\client.py", line 1320, in > parse_response > return u.close() > File "D:\Projects\cpython\lib\xmlrpc\client.py", line 658, in close > raise Fault(**self._stack[0]) > xmlrpc.client.Fault: :mul() takes 3 > positional arguments but 4 were given"> > > > So, only after calling a method, one can find meaningful informations > about it. 
> My idea behaves like this: > > >>> from xmlrpc.client import MagicProxy # not a very good name, but it > does some magic behind > >>> proxy = MagicProxy('http://localhost:8000') > >>> dir(proxy) > ['_ServerProxy__allow_none', '_ServerProxy__close', > '_ServerProxy__encoding', '_ServerProxy__handler', > '_ServerProxy__host', '_ServerProxy__request', '_ServerProxy__trans > ', '_ServerProxy__verbose', '__call__', '__class__', '__delattr__', > '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', > '__format__', '__ge__', > '__getattr__', '__getattribute__', '__gt__', '__hash__', '__init__', > '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', > '__reduce_ex__', '__repr__', '__setattr__', > '__sizeof__', '__str__', '__subclasshook__', '__weakref__', > '_collect_methods', '_original_mul', '_original_pow', 'mul', 'pow'] > >>> proxy.mul > > >>> proxy.pow > > >>> help(proxy.mul) > Help on function mul in module xmlrpc.client: > > mul(x:1, y) -> 2 > multiplication > > >>> help(proxy.pow) > Help on function pow in module xmlrpc.client: > > pow(*args, **kwargs) > pow(x, y[, z]) -> number > > With two arguments, equivalent to x**y. With three arguments, > equivalent to (x**y) % z, but may be more efficient (e.g. for ints). > > >>> proxy.mul(1) > Traceback (most recent call last): > File "", line 1, in > TypeError: mul() missing 1 required positional argument: 'y' > >>> proxy.mul(1, 2, 3) > Traceback (most recent call last): > File "", line 1, in > TypeError: mul() takes 2 positional arguments but 3 were given > >>> proxy.mul(1, 2) > 2 > >>> import inspect > >>> inspect.signature(proxy.mul) > 2"> > >>> > > As we can see, the registered methods can be introspected and calling > one with the wrong number of arguments will not trigger a request to > the server, but will fail right in the user's code. > As a problem, it will work only for servers written in Python. For > others will fallback to the current idiom. 
> Would something like this be useful as an addition to the stdlib's > xmlrpc module? > If someone wants to test it, here's a rough patch against tip: > https://gist.github.com/PCManticore/cf82ab421d4dc5c7f6ff. > > Thanks! > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Jun 18 16:54:52 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Jun 2014 00:54:52 +1000 Subject: [Python-ideas] Improving xmlrpc introspection In-Reply-To: References: Message-ID: On 19 Jun 2014 00:44, "Guido van Rossum" wrote: > > Since this is an internet client+server, can you please also consider security as part of your design? Perhaps it's not always a good idea to have that much introspectability on a web interface. I don't recall the details, but there was a CVE quite some time ago for an information leak in SimpleXMLRPCServer. That's not necessarily a "this is a bad idea" response, just "the security implications of such a feature would need to be managed very carefully (and if that's too hard to do, it might be a bad idea)". Cheers, Nick. > > > On Wed, Jun 18, 2014 at 3:23 AM, Claudiu Popa wrote: >> >> Hello. >> >> This idea proposes enhancing the xmlrpc library by adding a couple >> of introspectable servers and proxies. For instance, here's an output of >> using the current idioms. 
>> >> >>> proxy = ServerProxy('http://localhost:8000') >> >>> dir(proxy) >> ['_ServerProxy__allow_none', '_ServerProxy__close', >> '_ServerProxy__encoding', '_ServerProxy__handler', >> '_ServerProxy__host', '_ServerProxy__request', >> '_ServerProxy__transport', '_ServerProxy__verbose', '__call__', >> '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', >> '__enter__', '__eq__', '__exit__', '__format__', '__ge__', >> '__getattr__' >> , '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', >> '__lt__', '__module__', '__ne__', '__new__', '__reduce__', >> '__reduce_ex__', '__repr__', '__setattr__', >> '__sizeof__', '__str__', '__subclasshook__', '__weakref__'] >> >> >> Nothing useful in dir. The following works only if the server enables >> introspection: >> >> >>> proxy.system.listMethods() >> ['mul', 'pow', 'system.listMethods', 'system.methodHelp', >> 'system.methodSignature'] >> >> Now, let's see what mul does: >> >> >>> proxy.mul >> >> >>> help(proxy.mul) >> Help on _Method in module xmlrpc.client object: >> >> class _Method(builtins.object) >> | Methods defined here: >> | >> | __call__(self, *args) >> | >> | __getattr__(self, name) >> | >> | __init__(self, send, name) >> | # some magic to bind an XML-RPC method to an RPC server. >> | # supports "nested" methods (e.g. examples.getStateName) >> | >> | ---------------------------------------------------------------------- >> | Data descriptors defined here: >> | >> | __dict__ >> | dictionary for instance variables (if defined) >> | >> | __weakref__ >> | list of weak references to the object (if defined) >> >> >> >> Nothing useful for us. Neither methodHelp, nor methodSignature are very useful: >> >> >>> proxy.system.methodHelp('mul') >> 'multiplication' >> >>> proxy.system.methodSignature('mul') >> 'signatures not supported' >> >> >> We can find out something about that method by calling it. 
>> >> >>> proxy.mul(1, 2, 3) >> Traceback (most recent call last): >> File "", line 1, in >> File "D:\Projects\cpython\lib\xmlrpc\client.py", line 1091, in __call__ >> return self.__send(self.__name, args) >> File "D:\Projects\cpython\lib\xmlrpc\client.py", line 1421, in __request >> verbose=self.__verbose >> File "D:\Projects\cpython\lib\xmlrpc\client.py", line 1133, in request >> return self.single_request(host, handler, request_body, verbose) >> File "D:\Projects\cpython\lib\xmlrpc\client.py", line 1149, in single_request >> return self.parse_response(resp) >> File "D:\Projects\cpython\lib\xmlrpc\client.py", line 1320, in parse_response >> return u.close() >> File "D:\Projects\cpython\lib\xmlrpc\client.py", line 658, in close >> raise Fault(**self._stack[0]) >> xmlrpc.client.Fault: :mul() takes 3 >> positional arguments but 4 were given"> >> >> >> So, only after calling a method, one can find meaningful informations about it. >> My idea behaves like this: >> >> >>> from xmlrpc.client import MagicProxy # not a very good name, but it does some magic behind >> >>> proxy = MagicProxy('http://localhost:8000') >> >>> dir(proxy) >> ['_ServerProxy__allow_none', '_ServerProxy__close', >> '_ServerProxy__encoding', '_ServerProxy__handler', >> '_ServerProxy__host', '_ServerProxy__request', '_ServerProxy__trans >> ', '_ServerProxy__verbose', '__call__', '__class__', '__delattr__', >> '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', >> '__format__', '__ge__', >> '__getattr__', '__getattribute__', '__gt__', '__hash__', '__init__', >> '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', >> '__reduce_ex__', '__repr__', '__setattr__', >> '__sizeof__', '__str__', '__subclasshook__', '__weakref__', >> '_collect_methods', '_original_mul', '_original_pow', 'mul', 'pow'] >> >>> proxy.mul >> >> >>> proxy.pow >> >> >>> help(proxy.mul) >> Help on function mul in module xmlrpc.client: >> >> mul(x:1, y) -> 2 >> multiplication >> >> >>> help(proxy.pow) >> 
Help on function pow in module xmlrpc.client: >> >> pow(*args, **kwargs) >> pow(x, y[, z]) -> number >> >> With two arguments, equivalent to x**y. With three arguments, >> equivalent to (x**y) % z, but may be more efficient (e.g. for ints). >> >> >>> proxy.mul(1) >> Traceback (most recent call last): >> File "", line 1, in >> TypeError: mul() missing 1 required positional argument: 'y' >> >>> proxy.mul(1, 2, 3) >> Traceback (most recent call last): >> File "", line 1, in >> TypeError: mul() takes 2 positional arguments but 3 were given >> >>> proxy.mul(1, 2) >> 2 >> >>> import inspect >> >>> inspect.signature(proxy.mul) >> 2"> >> >>> >> >> As we can see, the registered methods can be introspected and calling >> one with the wrong number of arguments will not trigger a request to >> the server, but will fail right in the user's code. >> As a problem, it will work only for servers written in Python. For >> others will fallback to the current idiom. >> Would something like this be useful as an addition to the stdlib's >> xmlrpc module? >> If someone wants to test it, here's a rough patch against tip: >> https://gist.github.com/PCManticore/cf82ab421d4dc5c7f6ff. >> >> Thanks! >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > -- > --Guido van Rossum (python.org/~guido) > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From npmccallum at redhat.com  Wed Jun 18 17:35:28 2014
From: npmccallum at redhat.com (Nathaniel McCallum)
Date: Wed, 18 Jun 2014 11:35:28 -0400
Subject: [Python-ideas] Bitwise operations on bytes class
In-Reply-To: <1402941810.4273.26.camel@ipa.example.com>
References: <1402941810.4273.26.camel@ipa.example.com>
Message-ID: <1403105728.6477.12.camel@ipa.example.com>

On Mon, 2014-06-16 at 14:03 -0400, Nathaniel McCallum wrote:
> I find myself, fairly often, needing to perform bitwise operations
> (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes
> and bytearray). I can't think of any other reasonable use for these
> operators. Is upstream Python interested in this kind of behavior by
> default? At the least, it would make many algorithms very easy to read
> and write.

So it seems to me that there is a consensus that something like this is
a good idea, with perhaps the exception of Guido's reminder to not
overpopulate the operators (is that a no for this proposal?). Summarizing:

1. In lshift, what bits are introduced on the right-hand side? Zero is
traditional.

2. In rshift, what bits are introduced on the left-hand side? An argument
can be made for either zero (logical) or retaining the left-most bit
(arithmetic). The 'arithmetic shift' seems to fit the sphere of NumPy.
Zero should be preferred.

3. Rotates and other common operations are out of scope for this proposal.

4. One question not discussed is what to do when attempting to and/or/xor
against a bytes() or bytearray() that is of a different length. Should we
left-align the shorter of the two? Right-align? Throw an exception?

Also, I'm new to this process. Where should I go from here? Do I need to
form a PEP?
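The semantics being polled in points 1-4 can be prototyped today on top of int, which already has arbitrary-precision bitwise operators. The sketch below picks one possible set of answers (zero-fill shifts that preserve width, and an exception for mismatched lengths); these are choices still under discussion in the thread, not anything decided:

```python
def bytes_xor(a: bytes, b: bytes) -> bytes:
    # One possible answer to question 4: demand equal lengths.
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))

def bytes_lshift(a: bytes, n: int) -> bytes:
    # Question 1: zero bits enter on the right; the width is preserved,
    # so bits shifted past the left edge are discarded.
    width = 8 * len(a)
    shifted = (int.from_bytes(a, "big") << n) & ((1 << width) - 1)
    return shifted.to_bytes(len(a), "big")

def bytes_rshift(a: bytes, n: int) -> bytes:
    # Question 2: a "logical" shift -- zero bits enter on the left.
    return (int.from_bytes(a, "big") >> n).to_bytes(len(a), "big")
```

With these choices, `bytes_lshift(b"\x01\x80", 1)` gives `b"\x03\x00"` and `bytes_xor(b"\x0f", b"\xff")` gives `b"\xf0"`; an arithmetic rshift or a right-aligning xor would be a few-line variation on the same pattern.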
Nathaniel


From antoine at python.org  Wed Jun 18 17:51:36 2014
From: antoine at python.org (Antoine Pitrou)
Date: Wed, 18 Jun 2014 11:51:36 -0400
Subject: [Python-ideas] Bitwise operations on bytes class
In-Reply-To: <1403105728.6477.12.camel@ipa.example.com>
References: <1402941810.4273.26.camel@ipa.example.com>
	<1403105728.6477.12.camel@ipa.example.com>
Message-ID: 

Le 18/06/2014 11:35, Nathaniel McCallum a écrit :
> On Mon, 2014-06-16 at 14:03 -0400, Nathaniel McCallum wrote:
>> I find myself, fairly often, needing to perform bitwise operations
>> (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes
>> and bytearray). I can't think of any other reasonable use for these
>> operators. Is upstream Python interested in this kind of behavior by
>> default? At the least, it would make many algorithms very easy to read
>> and write.
>
> So it seems to me that there is a consensus that something like this is
> a good idea, with perhaps the exception of Guido's reminder to not
> overpopulate the operators (is that a no for this proposal?).

Rather than adding new operations to bytes/bytearray, an alternative is
a separate type ("bitview"?) which would take a writable buffer as
argument and then provide the operations over that buffer. It would also
make the operations compatible with other writable buffer types such as
numpy arrays, etc.

Regards

Antoine.


From skip at pobox.com  Wed Jun 18 17:52:07 2014
From: skip at pobox.com (Skip Montanaro)
Date: Wed, 18 Jun 2014 10:52:07 -0500
Subject: [Python-ideas] Improving xmlrpc introspection
In-Reply-To: 
References: 
Message-ID: 

I might be a bit confused (nothing new there), but it seemed to me
that Claudiu indicated all his MagicProxy magic happens in the client:

> As we can see, the registered methods can be introspected and calling
> one with the wrong number of arguments will not trigger a request to
> the server, but will fail right in the user's code.
I think we will have to see the code to decide if it's a security risk. Claudiu, I suggest you open an issue in the tracker so others can see how the magic works. Skip From alexander.belopolsky at gmail.com Wed Jun 18 18:05:30 2014 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 18 Jun 2014 12:05:30 -0400 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: References: <1402941810.4273.26.camel@ipa.example.com> <1403105728.6477.12.camel@ipa.example.com> Message-ID: On Wed, Jun 18, 2014 at 11:51 AM, Antoine Pitrou wrote: > Rather than adding new operations to bytes/bytearray, an alternative is a > separate type ("bitview"?) which would take a writable buffer as argument > and then provide the operations over that buffer. +1 .. and it does not have to be part of stdlib. The advantage of implementing this outside of stdlib is that users of older versions of Python will benefit immediately. -------------- next part -------------- An HTML attachment was scrubbed... URL: From npmccallum at redhat.com Wed Jun 18 18:20:37 2014 From: npmccallum at redhat.com (Nathaniel McCallum) Date: Wed, 18 Jun 2014 12:20:37 -0400 Subject: [Python-ideas] Bitwise operations on bytes class In-Reply-To: References: <1402941810.4273.26.camel@ipa.example.com> <1403105728.6477.12.camel@ipa.example.com> Message-ID: <1403108437.6477.14.camel@ipa.example.com> On Wed, 2014-06-18 at 12:05 -0400, Alexander Belopolsky wrote: > > On Wed, Jun 18, 2014 at 11:51 AM, Antoine Pitrou > wrote: > Rather than adding new operations to bytes/bytearray, an > alternative is a separate type ("bitview"?) which would take a > writable buffer as argument and then provide the operations > over that buffer. > > +1 > > > .. and it does not have to be part of stdlib. The advantage of > implementing this outside of stdlib is that users of older versions of > Python will benefit immediately. 
Older versions of Python can just do:

    third = [a ^ b for a, b in zip(first, second)]

The problem is that this is more expensive and less readable than:

    third = first ^ second
    ... or ...
    first ^= second

I'm not making this proposal on the basis that something can't be done
already, but based on the fact that implementing it natively as part of
the base types is a natural growth of the language. Of course this can
be implemented in a module at the cost of "batteries included," a new
dependency, readability and perhaps some additional overhead. I, for
one, would not use such a module and would just implement the operations
myself (as I have done for the last several years).

The reason for this proposal is that such operations seem to me to be
extremely natural to bytes/bytearray. And I think at least some others
agree.

Nathaniel


From ethan at stoneleaf.us  Wed Jun 18 18:27:59 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 18 Jun 2014 09:27:59 -0700
Subject: [Python-ideas] Really support custom types for global namespace
In-Reply-To: 
References: 
Message-ID: <53A1BE0F.5050407@stoneleaf.us>

On 06/18/2014 04:25 AM, Robert Lehmann wrote:
>
> I have a patch (+ tests) to the current default branch straightening out
> this asymmetry and will happily open a ticket if you think this is
> indeed a bug.

If there is not a ticket open for this already, go ahead and open it --
it will provide history and rationale even if rejected.

--
~Ethan~


From storchaka at gmail.com  Wed Jun 18 20:52:04 2014
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 18 Jun 2014 21:52:04 +0300
Subject: [Python-ideas] Bitwise operations on bytes class
In-Reply-To: 
References: <1402941810.4273.26.camel@ipa.example.com>
	<1403105728.6477.12.camel@ipa.example.com>
Message-ID: 

18.06.14 18:51, Antoine Pitrou wrote:
> Rather than adding new operations to bytes/bytearray, an alternative is
> a separate type ("bitview"?)
which would take a writable buffer as
> argument and then provide the operations over that buffer.

+1


From pcmanticore at gmail.com  Thu Jun 19 08:35:26 2014
From: pcmanticore at gmail.com (Claudiu Popa)
Date: Thu, 19 Jun 2014 09:35:26 +0300
Subject: [Python-ideas] Improving xmlrpc introspection
In-Reply-To: 
References: 
Message-ID: 

On Wed, Jun 18, 2014 at 6:52 PM, Skip Montanaro wrote:
> I might be a bit confused (nothing new there), but it seemed to me
> that Claudiu indicated all his MagicProxy magic happens in the client:
>
>> As we can see, the registered methods can be introspected and calling
>> one with the wrong number of arguments will not trigger a request to
>> the server, but will fail right in the user's code.
>
> I think we will have to see the code to decide if it's a security
> risk. Claudiu, I suggest you open an issue in the tracker so others
> can see how the magic works.
>
> Skip
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

That's right, the behaviour occurs in the client; the only catch is that
it needs a new method in xmlrpc.server, and the server must already
support introspection by providing the `system` proxy methods. I already
posted a sample patch in the first message
(https://gist.github.com/PCManticore/cf82ab421d4dc5c7f6ff). Something is
still wrong in the client, though, because it exec's the information
received in order to create the local functions, but there are probably
other methods for achieving the same behaviour.

Anyway, thank you all for your responses. I admit that I didn't think
much about the security implications of this proposal, and the
discussion was enlightening as it is.
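One way a client could avoid exec'ing server-supplied text is to rebuild only the *signature* locally and validate calls against it before sending anything over the wire. The sketch below is hypothetical — it is not Claudiu's patch, and `param_names` stands in for whatever an extended `system.methodSignature`-style call would return:

```python
import inspect

def make_local_checker(method_name, param_names):
    """Build a local arity check from introspected parameter names,
    instead of exec'ing source text received from the server."""
    params = [inspect.Parameter(name, inspect.Parameter.POSITIONAL_OR_KEYWORD)
              for name in param_names]
    sig = inspect.Signature(params)

    def check(*args, **kwargs):
        # Raises TypeError locally on a mismatch, before any request is sent.
        sig.bind(*args, **kwargs)

    check.__name__ = method_name
    check.__signature__ = sig  # makes help() and inspect.signature() work
    return check
```

A proxy could call such a checker at the top of `__call__`, so `proxy.mul(1, 2, 3)` fails with a plain TypeError in the user's code, matching the behaviour Claudiu describes, without ever running code received from the server.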
From ericsnowcurrently at gmail.com Thu Jun 19 21:26:00 2014 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 19 Jun 2014 13:26:00 -0600 Subject: [Python-ideas] Really support custom types for global namespace In-Reply-To: References: Message-ID: On Wed, Jun 18, 2014 at 5:25 AM, Robert Lehmann wrote: > The interpreter currently supports setting a custom type for globals() and > overriding __getitem__. The same is not true for __setitem__: > > class Namespace(dict): > def __getitem__(self, key): > print("getitem", key) > def __setitem__(self, key, value): > print("setitem", key, value) > > def fun(): > global x, y > x # should call globals.__getitem__ > y = 1 # should call globals.__setitem__ > > dis.dis(fun) > # 3 0 LOAD_GLOBAL 0 (x) > # 3 POP_TOP > # > # 4 4 LOAD_CONST 1 (1) > # 7 STORE_GLOBAL 1 (y) > # 10 LOAD_CONST 0 (None) > # 13 RETURN_VALUE > > exec(fun.__code__, Namespace()) > # => getitem x > # no setitem :-( > > I think it is weird why reading global variables goes through the usual > magic methods just fine, while writing does not. The behaviour seems to > have been introduced in Python 3.3.x (commit e3ab8aa) to support custom > __builtins__. The documentation is fuzzy on this issue: > >> If only globals is provided, it must be a dictionary, which will be used >> for both the global and the local variables. If globals and locals are >> given, they are used for the global and local variables, respectively. If >> provided, locals can be any mapping object. "it must be a dictionary" implies to me the exclusion of subclasses. Keep in mind that subclassing core builtin types (like dict) is generally not a great idea and overriding methods there is definitely a bad idea. A big part of this is due to an implementation detail of CPython: the use of the concrete C API, especially for dict. The concrete API is useful for performance, but it isn't subclass-friendly (re: overridden methods) in the least. 
> People at python-list were at odds if this was a bug, > unspecified/unsupported behaviour, or a deliberate design decision. I'd lean toward unspecified behavior, though (again) the docs imply to me that using anything other than dict isn't guaranteed to work right. So I'd consider this a proposal to add a slow path to STORE_GLOBAL that supports dict subclasses with overridden __setitem__() and to explicitly indicate support for get/set in the docs for exec(). To be honest, I'm not sold on the idea. There are subtleties involved here that make messing around with exec a high risk endeavor, requiring sufficient justification. What's the use case here? Also, is this exec-specific? Consider the case of class definitions and that the namespace in which they are executed can be customized via __prepare_class__() on the metaclass. I could be wrong, but I'm pretty sure you don't run into the problem there. So there may be more to the story here. > If it > is just unsupported, I don't think the asymmetry makes it any better. If it > is deliberate, I don't understand why dispatching on the dictness of globals > (PyDict_CheckExact(f_globals)) is good enough for LOAD_GLOBAL, but not for > STORE_GLOBAL in terms of performance. > > I have a patch (+ tests) to the current default branch straightening out > this asymmetry and will happily open a ticket if you think this is indeed a > bug. Definitely open a ticket (and reply here with a link). -eric From victor.stinner at gmail.com Thu Jun 19 23:21:10 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 19 Jun 2014 23:21:10 +0200 Subject: [Python-ideas] Really support custom types for global namespace In-Reply-To: References: Message-ID: 2014-06-18 13:25 GMT+02:00 Robert Lehmann : > I have a patch (+ tests) to the current default branch straightening out > this asymmetry and will happily open a ticket if you think this is indeed a > bug. Hi, I'm the author of the change allowing custom types for builtins. 
I wrote it for my pysandbox project (now abandoned; the sandbox is
broken by design!). I'm interested in supporting custom types for
globals and locals. It may require deep changes in ceval.c, builtin
functions, frames, etc.

In short, only the dict type is supported for globals and locals. Using
other types for builtins is also experimental. Don't do that at home :-)

Victor


From rosuav at gmail.com  Fri Jun 20 01:07:17 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 20 Jun 2014 09:07:17 +1000
Subject: [Python-ideas] Really support custom types for global namespace
In-Reply-To: 
References: 
Message-ID: 

On Fri, Jun 20, 2014 at 5:26 AM, Eric Snow wrote:
> "it must be a dictionary" implies to me the exclusion of subclasses.

This is something where the docs and most code disagree. When you call
isinstance(), it assumes LSP and accepts a subclass, but the Python
docs tend to be explicit about accepting subclasses. That's fine when
they do (eg https://docs.python.org/3/reference/simple_stmts.html#raise
says "subclass or an instance of BaseException"), but less clear when
not. Would it be worth adding a few words to the docs saying this?

"""If only globals is provided, it must be a dictionary, which will be
used..."""
-->
"""If only globals is provided, it must be (exactly) a dict, which will
be used..."""

with the word dict being a link to stdtypes.html#mapping-types-dict ?

ChrisA


From ionel.mc at gmail.com  Sat Jun 21 18:57:04 2014
From: ionel.mc at gmail.com (Ionel Maries Cristian)
Date: Sat, 21 Jun 2014 19:57:04 +0300
Subject: [Python-ideas] "Escape hatch" for preferred encoding (default
	encoding for `open`)
Message-ID: 

It would be nice if there would be an escape hatch for situations where
the value of locale.getpreferredencoding() can't be changed (eg:
windows - try changing that to utf8 )
The idea is that it would override the default encoding for open() for platforms/situations where it's infeasible to manually specify the encoding to open (eg: lots of old code) or change locale to something utf8-ish (windows). I've found an old thread about this problem but to my bewilderment no one considered using an environment variable. Thanks, -- Ionel M. -------------- next part -------------- An HTML attachment was scrubbed... URL: From flying-sheep at web.de Sun Jun 22 11:25:20 2014 From: flying-sheep at web.de (Philipp A.) Date: Sun, 22 Jun 2014 11:25:20 +0200 Subject: [Python-ideas] Empty set, Empty dict In-Reply-To: References: Message-ID: of course it?s ugly, but it?s also obvious that it had to be suggested, because it?s the only obvious idea. which leaves us with either a non-obvious idea or no empty set literal, which is a bit sad and inconsistent. if i?d develop a language from scratch, i?d possibly use the following empty literals: [] = list; () = tuple, {} = set [:] = ordered dict, (:) = named tuple, {:} = dict but that ship has sailed. 2014-06-10 18:39 GMT+02:00 Guido van Rossum : > No. Jeez. :-( > > > On Tue, Jun 10, 2014 at 9:25 AM, Ryan Gonzalez wrote: > >> +1 for using {,}. >> >> >> On Tue, Jun 10, 2014 at 4:07 AM, Wichert Akkerman >> wrote: >> >>> Victor Stinner wrote: >>> >>> 2014-06-10 8:15 GMT+02:00 Neil Girdhar >: >>> >>> >>> >>> >>> >* I've seen this proposed before, and I personally would love this, but my >>> *>* guess is that it breaks too much code for too little gain. >>> *>>* On Wednesday, May 21, 2014 12:33:30 PM UTC-4, Fr?d?ric Legembre wrote: >>> *>>>>>>* Now | Future | >>> *>>* ---------------------------------------------------- >>> *>>* () | () | empty tuple ( 1, 2, 3 ) >>> *>>* [] | [] | empty list [ 1, 2, 3 ] >>> *>>* set() | {} | empty set { 1, 2, 3 } >>> *>>* {} | {:} | empty dict { 1:a, 2:b, 3:c } >>> * >>> >>> Your guess is right. It will break all Python 2 and Python 3 in the world. 
>>> >>> Technically, set((1, 2)) is different than {1, 2}: the first creates a >>> tuple and loads the global name "set" (which can be replaced at >>> runtime!), whereas the later uses bytecode and only store values >>> (numbers 1 and 2). >>> >>> It would be nice to have a syntax for empty set, but {} is a no-no. >>> >>> >>> Perhaps {,} would be a possible spelling. For consistency you might want >>> to allow (,) to create an empty tuple as well; personally I would find that >>> more intuitive that (()). >>> >>> Wichert. >>> >>> >>> >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> >> >> >> -- >> Ryan >> If anybody ever asks me why I prefer C++ to C, my answer will be simple: >> "It's becauseslejfp23(@#Q*(E*EIdc-SEGFAULT. Wait, I don't think that was >> nul-terminated." >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > > > -- > --Guido van Rossum (python.org/~guido) > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From clint.hepner at gmail.com Sun Jun 22 14:29:07 2014 From: clint.hepner at gmail.com (Clint Hepner) Date: Sun, 22 Jun 2014 08:29:07 -0400 Subject: [Python-ideas] Empty set, Empty dict In-Reply-To: References: Message-ID: There's one more, even more obvious (IMO) option for an empty set literal: U+2205, EMPTY SET. 
But that opens a whole other can of worms, namely expanding the grammar to allow Unicode characters outside of identifiers. > On Jun 22, 2014, at 5:25 AM, "Philipp A." wrote: > > of course it?s ugly, but it?s also obvious that it had to be suggested, because it?s the only obvious idea. > > which leaves us with either a non-obvious idea or no empty set literal, which is a bit sad and inconsistent. > > if i?d develop a language from scratch, i?d possibly use the following empty literals: > > [] = list; () = tuple, {} = set > [:] = ordered dict, (:) = named tuple, {:} = dict > > but that ship has sailed. > > > 2014-06-10 18:39 GMT+02:00 Guido van Rossum : >> No. Jeez. :-( >> >> >>> On Tue, Jun 10, 2014 at 9:25 AM, Ryan Gonzalez wrote: >>> +1 for using {,}. >>> >>> >>>> On Tue, Jun 10, 2014 at 4:07 AM, Wichert Akkerman wrote: >>>> Victor Stinner wrote: >>>> 2014-06-10 8:15 GMT+02:00 Neil Girdhar : >>>> >>>>> >>>>> >>>>> >>>>> >>>>> > I've seen this proposed before, and I personally would love this, but my >>>>> > guess is that it breaks too much code for too little gain. >>>>> > >>>>> > On Wednesday, May 21, 2014 12:33:30 PM UTC-4, Fr?d?ric Legembre wrote: >>>>> >> >>>>> >> >>>>> >> Now | Future | >>>>> >> ---------------------------------------------------- >>>>> >> () | () | empty tuple ( 1, 2, 3 ) >>>>> >> [] | [] | empty list [ 1, 2, 3 ] >>>>> >> set() | {} | empty set { 1, 2, 3 } >>>>> >> {} | {:} | empty dict { 1:a, 2:b, 3:c } >>>>> >>>>> >>>>> Your guess is right. It will break all Python 2 and Python 3 in the world. >>>>> >>>>> Technically, set((1, 2)) is different than {1, 2}: the first creates a >>>>> tuple and loads the global name "set" (which can be replaced at >>>>> runtime!), whereas the later uses bytecode and only store values >>>>> (numbers 1 and 2). >>>>> >>>>> It would be nice to have a syntax for empty set, but {} is a no-no. >>>> >>>> Perhaps {,} would be a possible spelling. 
For consistency you might want to allow (,) to create an empty tuple as well; personally I would find that more intuitive that (()). >>>> Wichert. >>>> >>>> >>>> >>>> _______________________________________________ >>>> Python-ideas mailing list >>>> Python-ideas at python.org >>>> https://mail.python.org/mailman/listinfo/python-ideas >>>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >>> >>> >>> -- >>> Ryan >>> If anybody ever asks me why I prefer C++ to C, my answer will be simple: "It's becauseslejfp23(@#Q*(E*EIdc-SEGFAULT. Wait, I don't think that was nul-terminated." >>> >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan_ml at behnel.de Sun Jun 22 14:48:05 2014 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 22 Jun 2014 14:48:05 +0200 Subject: [Python-ideas] Empty set, Empty dict In-Reply-To: References: Message-ID: Clint Hepner, 22.06.2014 14:29: > There's one more, even more obvious (IMO) option for an empty set literal: U+2205, EMPTY SET. But that opens a whole other can of worms, namely expanding the grammar to allow Unicode characters outside of identifiers. ... and then teaching people how to type them on their keyboards. 
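Whatever the eventual spelling, the difference Victor points out earlier in the thread — `set()` goes through a rebindable global name, while a non-empty set display is built directly by the compiler — is easy to verify with `dis` (opcode details vary a little between CPython versions, but these names have been stable):

```python
import dis

def with_call():
    return set()

def with_literal():
    return {1, 2}

# set() goes through a (rebindable) global name lookup...
call_ops = [ins.opname for ins in dis.get_instructions(with_call)]
assert any(op.startswith("LOAD_GLOBAL") for op in call_ops)

# ...while the display builds the set directly, no name lookup involved.
literal_ops = [ins.opname for ins in dis.get_instructions(with_literal)]
assert "BUILD_SET" in literal_ops
assert not any(op.startswith("LOAD_GLOBAL") for op in literal_ops)
```

This is why shadowing `set` breaks the first function but not the second, and why an empty set *literal* would be semantically different from the `set()` call, not just shorter.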
Stefan


From rosuav at gmail.com  Sun Jun 22 14:51:03 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 22 Jun 2014 22:51:03 +1000
Subject: [Python-ideas] Empty set, Empty dict
In-Reply-To: 
References: 
Message-ID: 

On Sun, Jun 22, 2014 at 10:48 PM, Stefan Behnel wrote:
> Clint Hepner, 22.06.2014 14:29:
>> There's one more, even more obvious (IMO) option for an empty set
>> literal: U+2205, EMPTY SET. But that opens a whole other can of worms,
>> namely expanding the grammar to allow Unicode characters outside of
>> identifiers.
>
> ... and then teaching people how to type them on their keyboards.

At least the concept of "empty set literal" has more merit than "save
a few keystrokes on the word 'lambda'". Even if it isn't something
easily typed, it would have value over the current spelling of
"set()", which isn't a literal. (Whether it has *enough* value over
set() to be worth doing is still in question, but it's not like lambda
vs λ.)

ChrisA


From ncoghlan at gmail.com  Sun Jun 22 15:04:31 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 22 Jun 2014 23:04:31 +1000
Subject: [Python-ideas] Empty set, Empty dict
In-Reply-To: 
References: 
Message-ID: 

On 22 June 2014 22:51, Chris Angelico wrote:
> At least the concept of "empty set literal" has more merit than "save
> a few keystrokes on the word 'lambda'". Even if it isn't something
> easily typed, it would have value over the current spelling of
> "set()", which isn't a literal. (Whether it has *enough* value over
> set() to be worth doing is still in question, but it's not like lambda
> vs λ.)

Yep, "status quo wins a stalemate" tends to be the winner on this
particular topic. With a blank slate, the obvious choice is {} for the
empty set and {:} for the empty dict, but Python doesn't have that
option due to builtin sets arriving *long* after builtin dicts (for a
very long time, sets weren't even in the standard library - folks just
used dicts with the values all set to None).
So, for those historical reasons, set() will likely persist indefinitely
with its discontinuity in appearance between the "zero items" and "one
or more predefined items" cases.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From flying-sheep at web.de  Sun Jun 22 16:41:51 2014
From: flying-sheep at web.de (Philipp A.)
Date: Sun, 22 Jun 2014 16:41:51 +0200
Subject: [Python-ideas] Empty set, Empty dict
In-Reply-To: 
References: 
Message-ID: 

i honestly don't see the problem here.

if people are too lazy to find an input method that works for them (Alt
Gr, compose key, copy&paste), they should just continue to type ASCII,
and leave the more elegant unicode variants for others.

this violates TSBOOWTDI, but as there's also dict() next to {}, this
should neither be a problem.

i like scala's way to allow both <- and ←, as well as => and ⇒, and so
on. ∅ and λ seem like good ideas to me as un-redefinable empty set
literal and shorter/more elegant lambda. And '…' for 'Ellipsis'.

there's also ∈, ∉, ∀, ∩, ∪, ≡, ∧, ∨, ¬, ≤, ≥, ≠, and ⋅, but i think
those are a bit much:

my_set = ∅
my_set ∩= other_set
my_set = map(λ e: e ⋅ 5, my_set ∪ third_set)
∀ spam ∈ my_set:
    if spam ≡ None ∨ spam ≤ 8:
        print(spam ∉ allowed_values, ¬ spam)

vs.

my_set = ∅
my_set &= other_set
my_set = map(λ e: e * 5, my_set | third_set)
for spam in my_set:
    if spam is None or spam <= 8:
        print(spam not in allowed_values, not spam)


2014-06-22 14:29 GMT+02:00 Clint Hepner :

> There's one more, even more obvious (IMO) option for an empty set literal:
> U+2205, EMPTY SET. But that opens a whole other can of worms, namely
> expanding the grammar to allow Unicode characters outside of identifiers.
>
> On Jun 22, 2014, at 5:25 AM, "Philipp A." wrote:
>
> of course it's ugly, but it's also obvious that it had to be suggested,
> because it's the only obvious idea.
>
> which leaves us with either a non-obvious idea or no empty set literal,
> which is a bit sad and inconsistent.
> > if i?d develop a language from scratch, i?d possibly use the following > empty literals: > > [] = list; () = tuple, {} = set > [:] = ordered dict, (:) = named tuple, {:} = dict > > but that ship has sailed. > > > 2014-06-10 18:39 GMT+02:00 Guido van Rossum : > >> No. Jeez. :-( >> >> >> On Tue, Jun 10, 2014 at 9:25 AM, Ryan Gonzalez wrote: >> >>> +1 for using {,}. >>> >>> >>> On Tue, Jun 10, 2014 at 4:07 AM, Wichert Akkerman >>> wrote: >>> >>>> Victor Stinner wrote: >>>> >>>> 2014-06-10 8:15 GMT+02:00 Neil Girdhar >: >>>> >>>> >>>> >>>> >>>> >* I've seen this proposed before, and I personally would love this, but my >>>> *>* guess is that it breaks too much code for too little gain. >>>> *>>* On Wednesday, May 21, 2014 12:33:30 PM UTC-4, Fr?d?ric Legembre wrote: >>>> *>>>>>>* Now | Future | >>>> *>>* ---------------------------------------------------- >>>> *>>* () | () | empty tuple ( 1, 2, 3 ) >>>> *>>* [] | [] | empty list [ 1, 2, 3 ] >>>> *>>* set() | {} | empty set { 1, 2, 3 } >>>> *>>* {} | {:} | empty dict { 1:a, 2:b, 3:c } >>>> * >>>> >>>> Your guess is right. It will break all Python 2 and Python 3 in the world. >>>> >>>> Technically, set((1, 2)) is different than {1, 2}: the first creates a >>>> tuple and loads the global name "set" (which can be replaced at >>>> runtime!), whereas the later uses bytecode and only store values >>>> (numbers 1 and 2). >>>> >>>> It would be nice to have a syntax for empty set, but {} is a no-no. >>>> >>>> >>>> Perhaps {,} would be a possible spelling. For consistency you might >>>> want to allow (,) to create an empty tuple as well; personally I would find >>>> that more intuitive that (()). >>>> >>>> Wichert. 
>>>> >>>> >>>> >>>> >>>> _______________________________________________ >>>> Python-ideas mailing list >>>> Python-ideas at python.org >>>> https://mail.python.org/mailman/listinfo/python-ideas >>>> Code of Conduct: http://python.org/psf/codeofconduct/ >>>> >>> >>> >>> >>> -- >>> Ryan >>> If anybody ever asks me why I prefer C++ to C, my answer will be simple: >>> "It's becauseslejfp23(@#Q*(E*EIdc-SEGFAULT. Wait, I don't think that was >>> nul-terminated." >>> >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Sun Jun 22 17:01:02 2014 From: python at mrabarnett.plus.com (MRAB) Date: Sun, 22 Jun 2014 16:01:02 +0100 Subject: [Python-ideas] Empty set, Empty dict In-Reply-To: References: Message-ID: <53A6EFAE.3050803@mrabarnett.plus.com> On 2014-06-22 15:41, Philipp A. wrote: > i honestly don?t see the problem here. 
> > if people are too lazy to find a input method that works for them (Alt > Gr, compose key, copy&paste), they should just continue to type ASCII, > and leave the more elegant unicode variants for others. > > this violates TSBOOWTDI, but as there?s also |dict()| next to |{}|, this > should neither be a problem. > > i like scala?s way to allow both |<-| and |?|, as well as |=>| and |?|, > and so on. ? and ? seem like good ideas to me as un-redefinable empty > set literal and shorter/more elegant lambda. And ??? for ?Ellipsis?. > [snip] ? is a valid identifier in Python 3 because it's a letter. From tjreedy at udel.edu Sun Jun 22 22:18:57 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 22 Jun 2014 16:18:57 -0400 Subject: [Python-ideas] .pyu nicode syntax symbols (was Re: Empty set, Empty dict) In-Reply-To: References: Message-ID: Problem: For years, various people have suggested that they would like to use syntactically significant unicode symbols in Python code. A prime example is using U+2205, EMPTY SET, ?, instead of 'set()'. On the other hand, the conservative, overwhelmed core development group is not much interested and would rather do other things. Solution: Act instead of ask. One or more of the people who really want this could get themselves together and produce a working system. (If multiple people, ask for a new sig and mailing list). 1. Ask core development to reserve '.pyu' for python with unicode symbolds. (If refused, chose something else.) 2. Write pyu.py. It should first translate x.pyu to the equivalent x.py. If x.py exists, check the date (at with .py and .pyc). Optionally, but probably by default, run x.py. Translation requires two operations: masking comments and string literals from translation and translating the remainder. I personally would start by doing the two operations separately, with separately testable functions. def codechunk(unisymcode): '''Yield code_or_not, code_chunk pairs for code with unicode symbols. 
    Chunks are comments or string literals (code_or_not == False),
    and code that might have unicode symbols that need translation
    (code_or_not == True).
    '''

unisym = 

def unisym2ascii(unisymcode):
    blocklist = []
    for code, block in codeblocks(unisymcode):
        if code:
            block = block.translate(unisym)
        blocklist.append(block)
    return ''.join(blocklist)

3. Upload pyu.py to PyPI, *along with instructions on the various ways to
enter unicode symbols on various systems*. Announce and promote.

On 6/22/2014 10:41 AM, Philipp A. wrote:
> if people are too lazy to find an input method that works for them (Alt
> Gr, compose key, copy&paste), they should just continue to type ASCII,
> and leave the more elegant unicode variants for others.

Being snarky can be fun, but if I wrote and distributed pyu.py, I would
want as many users as possible.

> ∅ and λ seem like good ideas to me as un-redefinable empty
> set literal and shorter/more elegant lambda. And '…' for 'Ellipsis'.
>
> there's also ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, and ?, but i think those
> are a bit much:

I think the unisym dict should be inclusive and let people choose to use
the symbols they want. I suspect I would use ∅ and λ sooner than the rest.
A mathematician that used most of those symbols, for a math audience, could
still use the ascii translation for other audiences.

On 6/22/2014 11:01 AM, MRAB wrote:
> λ is a valid identifier in Python 3 because it's a letter.

Overall, I see this as less of a problem than the possibility of rebinding
builtin names. The program could have a 'translate_lambda' (default True)
parameter. But I would be willing to say that if you use unicode symbols,
then you cannot also use λ as an identifier. (If one did, the resulting .py
would stop with SyntaxError where 'lambda' replaced identifier λ.)
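[Editor's note: Terry leaves the translation table as a bare "unisym = " stub. A minimal, hypothetical sketch of the idea follows; the symbol-to-ASCII mapping is an illustrative assumption, and a real pyu.py would first mask comments and string literals as described above.]

```python
# Hypothetical translation table for the unisym2ascii sketch above.
# The choice of symbols and spellings here is an assumption for
# illustration, not part of Terry's proposal.
UNISYM = str.maketrans({"∅": "set()", "λ": "lambda", "…": "..."})

def unisym2ascii(code):
    """Translate unicode symbols in a code chunk to their ASCII spellings."""
    # A real implementation would only translate the code_or_not == True
    # chunks yielded by codeblocks(); this sketch translates everything.
    return code.translate(UNISYM)

print(unisym2ascii("f = λ x: ∅"))  # f = lambda x: set()
```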
--
Terry Jan Reedy

From barry at python.org  Sun Jun 22 22:52:32 2014
From: barry at python.org (Barry Warsaw)
Date: Sun, 22 Jun 2014 16:52:32 -0400
Subject: [Python-ideas] Empty set, Empty dict
References: Message-ID: <20140622165232.0afbb358@anarchist>

On Jun 22, 2014, at 11:04 PM, Nick Coghlan wrote:

>With a blank slate, the obvious choice is {} for the empty set and {:} for
>the empty dict, but Python doesn't have that option due to builtin sets
>arriving *long* after builtin dicts

Although, I think a future-import could help the transition here, if we
decided it was a good idea. Not that I'm necessarily advocating for it;
set() is good enough for me.

-Barry

From guido at python.org  Mon Jun 23 02:30:59 2014
From: guido at python.org (Guido van Rossum)
Date: Sun, 22 Jun 2014 17:30:59 -0700
Subject: [Python-ideas] .pyu nicode syntax symbols (was Re: Empty set, Empty dict)
In-Reply-To: References: Message-ID:

Hm. What's wrong with rejecting bad ideas?

On Jun 22, 2014 1:19 PM, "Terry Reedy" wrote:

> Problem: For years, various people have suggested that they would like to
> use syntactically significant unicode symbols in Python code. A prime
> example is using U+2205, EMPTY SET, ∅, instead of 'set()'. On the other
> hand, the conservative, overwhelmed core development group is not much
> interested and would rather do other things.
>
> Solution: Act instead of ask.
>
> One or more of the people who really want this could get themselves
> together and produce a working system. (If multiple people, ask for a new
> sig and mailing list.)
>
> 1. Ask core development to reserve '.pyu' for python with unicode
> symbols. (If refused, choose something else.)
>
> 2. Write pyu.py. It should first translate x.pyu to the equivalent x.py.
> If x.py exists, check the date (as with .py and .pyc).
Optionally, but
> probably by default, run x.py.
>
> Translation requires two operations: masking comments and string literals
> from translation and translating the remainder. I personally would start by
> doing the two operations separately, with separately testable functions.
>
> def codeblocks(unisymcode):
>     '''Yield code_or_not, code_chunk pairs for code with unicode symbols.
>
>     Chunks are comments or string literals (code_or_not == False),
>     and code that might have unicode symbols that need translation
>     (code_or_not == True).
>     '''
> which already knows how to recognize comments and strings.>
>
> unisym = 
>
> def unisym2ascii(unisymcode):
>     blocklist = []
>     for code, block in codeblocks(unisymcode):
>         if code:
>             block = block.translate(unisym)
>         blocklist.append(block)
>     return ''.join(blocklist)
>
> 3. Upload pyu.py to PyPI, *along with instructions on the various ways to
> enter unicode symbols on various systems*. Announce and promote.
>
> On 6/22/2014 10:41 AM, Philipp A. wrote:
>
>> if people are too lazy to find an input method that works for them (Alt
>> Gr, compose key, copy&paste), they should just continue to type ASCII,
>> and leave the more elegant unicode variants for others.
>
> Being snarky can be fun, but if I wrote and distributed pyu.py, I would
> want as many users as possible.
>
>> ∅ and λ seem like good ideas to me as un-redefinable empty
>> set literal and shorter/more elegant lambda. And '…' for 'Ellipsis'.
>>
>> there's also ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, and ?, but i think those
>> are a bit much:
>
> I think the unisym dict should be inclusive and let people choose to use
> the symbols they want. I suspect I would use ∅ and λ sooner than the rest.
> A mathematician that used most of those symbols, for a math audience, could
> still use the ascii translation for other audiences.
>
> On 6/22/2014 11:01 AM, MRAB wrote:
> > λ is a valid identifier in Python 3 because it's a letter.
> > Overall, I see this as less of a problem than the possibility of rebinding
> builtin names. The program could have a 'translate_lambda' (default True)
> parameter. But I would be willing to say that if you use unicode symbols,
> then you cannot also use λ as an identifier. (If one did, the resulting .py
> would stop with SyntaxError where 'lambda' replaced identifier λ.)
>
> --
> Terry Jan Reedy

From antony.lee at berkeley.edu  Mon Jun 23 03:05:37 2014
From: antony.lee at berkeley.edu (Antony Lee)
Date: Sun, 22 Jun 2014 18:05:37 -0700
Subject: [Python-ideas] Another pathlib suggestion
In-Reply-To: References: Message-ID:

After some more thought, a better API may be to provide a "with_suffixes"
method, such that "p.with_suffixes(*s).suffixes == s" (just like
"p.with_suffix(s).suffix == s"). For example, we'd have

    Path("foo.tar.gz").with_suffixes(".ext") == Path("foo.ext")
    Path("foo.ext").with_suffixes(".tar", ".gz") == Path("foo.tar.gz")

I guess this is a less popular topic than discussing new empty set literals
though :) but if you really like Unicode, you could just use
https://github.com/ehamberg/vim-cute-python

Antony

2014-05-21 13:38 GMT-07:00 Antony Lee :

> Handling of Paths with multiple extensions is currently not so easy with
> pathlib. Specifically, I don't think there is an easy way to go from
> "foo.tar.gz" to "foo.ext", because Path.with_suffix only replaces the last
> suffix.
> > I would therefore like to suggest either
>
> 1/ add Path.replace_suffix, such that
> Path("foo.tar.gz").replace_suffix(".tar.gz", ".ext") == Path("foo.ext")
> (this would also provide extension-checking capabilities, raising
> ValueError if the first argument is not a valid suffix of the initial
> path); or
>
> 2/ add a second argument to Path.with_suffix, "n_to_strip" (although
> perhaps with a better name), defaulting to 1, such that
> Path("foo.tar.gz").with_suffix(".ext", 0) == Path("foo.tar.gz.ext")
> Path("foo.tar.gz").with_suffix(".ext", 1) == Path("foo.tar.ext")
> Path("foo.tar.gz").with_suffix(".ext", 2) == Path("foo.ext")  # set
> n_to_strip to len(path.suffixes) for stripping all of them.
> Path("foo.tar.gz").with_suffix(".ext", 3) raises a ValueError.
>
> Best,
> Antony

From ncoghlan at gmail.com  Mon Jun 23 05:23:32 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 23 Jun 2014 13:23:32 +1000
Subject: [Python-ideas] .pyu nicode syntax symbols (was Re: Empty set, Empty dict)
In-Reply-To: References: Message-ID:

On 23 June 2014 10:30, Guido van Rossum wrote:
> Hm. What's wrong with rejecting bad ideas?

While I agree it's a bad idea to use symbols that can't be readily
typed as part of the language syntax, I think Terry's broader point
that anything which *can* be implemented outside the core usually
*should* be implemented outside the core (at least as a
proof-of-concept) is a good one.

Hy shows it is possible to implement a Lisp on top of the CPython
runtime, so folks should certainly be capable of implementing a
Python-with-Unicode-symbols on top of existing Python runtimes without
needing the blessing of the core development team.

Cheers,
Nick.
--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From guido at python.org  Mon Jun 23 06:00:20 2014
From: guido at python.org (Guido van Rossum)
Date: Sun, 22 Jun 2014 21:00:20 -0700
Subject: [Python-ideas] .pyu nicode syntax symbols (was Re: Empty set, Empty dict)
In-Reply-To: References: Message-ID:

On Sun, Jun 22, 2014 at 8:23 PM, Nick Coghlan wrote:

> On 23 June 2014 10:30, Guido van Rossum wrote:
> > Hm. What's wrong with rejecting bad ideas?
>
> While I agree it's a bad idea to use symbols that can't be readily
> typed as part of the language syntax, I think Terry's broader point
> that anything which *can* be implemented outside the core usually
> *should* be implemented outside the core (at least as a
> proof-of-concept) is a good one.

This particular proposal sounds to me like something that shouldn't be
implemented at all. We don't need another split in the community over how
to spell operators.

> Hy shows it is possible to implement a Lisp on top of the CPython
> runtime,

It wasn't proposed as a serious feature on python-ideas.

> so folks should certainly be capable of implementing a
> Python-with-Unicode-symbols on top of existing Python runtimes without
> needing the blessing of the core development team.

Terry *is* asking for a blessing of the .pyu extension by the core team.
(Although it seems he wouldn't be too upset if he didn't get it. :-)

--
--Guido van Rossum (python.org/~guido)

From stefano.borini at ferrara.linux.it  Mon Jun 23 14:06:05 2014
From: stefano.borini at ferrara.linux.it (Stefano Borini)
Date: Mon, 23 Jun 2014 14:06:05 +0200
Subject: [Python-ideas] Accepting keyword arguments for __getitem__
Message-ID: <20140623120605.GA17255@ferrara.linux.it>

Dear all,

At work we use a notation like LDA[Z=5] to define a specific level of
accuracy for our evaluation.
This notation is used just for textual labels, but it would be nice if it
actually worked at the scripting level, which led me to think of the
following: at the moment, we have the following

>>> class A:
...     def __getitem__(self, y):
...         print(y)
...
>>> a=A()
>>> a[2]
2
>>> a[2,3]
(2, 3)
>>> a[1:3]
slice(1, 3, None)
>>> a[1:3, 4]
(slice(1, 3, None), 4)
>>>

I would propose to add the possibility for a[Z=3], where y would then be a
dictionary {"Z": 3}. In the case of a[1:3, 4, Z=3, R=5], the value of y
would be a tuple containing (slice(1,3,None), 4, {"Z": 3}, {"R": 5}). This
would allow preserving the ordering as specified (e.g. a[Z=3, R=4] vs
a[R=4, Z=3]).

Do you think it would be a good/useful idea? Was this already discussed or
proposed in a PEP? Google did not help in this regard.

Thank you,

Stefano Borini

From rosuav at gmail.com  Mon Jun 23 14:24:53 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 23 Jun 2014 22:24:53 +1000
Subject: [Python-ideas] Accepting keyword arguments for __getitem__
In-Reply-To: <20140623120605.GA17255@ferrara.linux.it>
References: <20140623120605.GA17255@ferrara.linux.it>
Message-ID:

On Mon, Jun 23, 2014 at 10:06 PM, Stefano Borini wrote:

> At work we use a notation like LDA[Z=5] to define a specific level of
> accuracy for our evaluation. This notation is used just for textual
> labels, but it would be nice if it actually worked at the scripting
> level, which led me to think of the following: at the moment, we have
> the following
>
>>>> class A:
> ...     def __getitem__(self, y):
> ...         print(y)
> ...
>>>> a=A()
>>>> a[2]
> 2
>>>> a[2,3]
> (2, 3)
>>>> a[1:3]
> slice(1, 3, None)
>>>> a[1:3, 4]
> (slice(1, 3, None), 4)
>>>>
>
> I would propose to add the possibility for a[Z=3], where y would then be a
> dictionary {"Z": 3}.

The obvious way to accept that would be to support keyword arguments, and
then it begins looking very much like a call. Can you alter your notation
very slightly to become LDA(Z=5) instead?
Then you can accept that with your class thus:

    class A:
        def __call__(self, Z):
            print(Z)

Or you can accept it generically with keyword arg collection:

    class A:
        def __call__(self, **kw):
            print(kw)

    >>> a=A()
    >>> a(Z=3)
    {'Z': 3}

Requires a small change to notation, but no changes to Python, ergo it can
be done without waiting for a new release!

ChrisA

From stefano.borini at ferrara.linux.it  Mon Jun 23 14:53:39 2014
From: stefano.borini at ferrara.linux.it (Stefano Borini)
Date: Mon, 23 Jun 2014 14:53:39 +0200
Subject: [Python-ideas] Accepting keyword arguments for __getitem__
In-Reply-To: References: <20140623120605.GA17255@ferrara.linux.it>
Message-ID: <20140623125339.GA18680@ferrara.linux.it>

On Mon, Jun 23, 2014 at 10:24:53PM +1000, Chris Angelico wrote:
> The obvious way to accept that would be to support keyword arguments,
> and then it begins looking very much like a call. Can you alter your
> notation very slightly to become LDA(Z=5) instead?

We certainly can, but I was wondering if such an extension would be useful
in other contexts. Also, with the function solution, you would lose the
order of the entries. You can't distinguish foo(z=3, r=4) from foo(r=4, z=3).

--
------------------------------------------------------------
-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCS d- s+:--- a?
C++++ UL++++ P+ L++++ E--- W- N+ o K- w--- O+ M- V- PS+ PE+ Y PGP++ t+++ 5 X- R* tv+ b DI-- D+ G e h++ r+ y* ------------------------------------------------------------ From graffatcolmingov at gmail.com Mon Jun 23 15:01:19 2014 From: graffatcolmingov at gmail.com (Ian Cordasco) Date: Mon, 23 Jun 2014 08:01:19 -0500 Subject: [Python-ideas] Accepting keyword arguments for __getitem__ In-Reply-To: <20140623125339.GA18680@ferrara.linux.it> References: <20140623120605.GA17255@ferrara.linux.it> <20140623125339.GA18680@ferrara.linux.it> Message-ID: On Mon, Jun 23, 2014 at 7:53 AM, Stefano Borini wrote: > On Mon, Jun 23, 2014 at 10:24:53PM +1000, Chris Angelico wrote: >> The obvious way to accept that would be to support keyword arguments, >> and then it begins looking very much like a call. Can you alter your >> notation very slightly to become LDA(Z=5) instead? > > We certainly can, but I was wondering if such extension would be useful in other contexts. > Also, with the function solution, you would lose the order of the entries. You can't distinguish > foo(z=3, r=4) from foo(r=4, z=3) Chris may have missed that requirement (as I did) when they first read your email. Your desired behaviour matches no other known behaviour in Python. The only way to achieve that would be to do something akin to: foo(dict(z=3), dict(r=4)) And the same would be true of your proposed feature for __getitem__ because all keyword arguments would be collected into one dictionary. It would be unreasonable for just one method to behave totally differently from the standard behaviour in Python. It would be confusing for only __getitem__ (and ostensibly, __setitem__) to take keyword arguments but instead of turning them into a dictionary, turn them into individual single-item dictionaries. 
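[Editor's note: a sketch of the __call__-based workaround discussed above, with hypothetical names. As a historical footnote to the ordering objection: on Python 3.6 and later (PEP 468), **kwargs preserves the order in which keyword arguments were passed at the call site, so the ordering information Stefano wants is no longer lost.]

```python
class Accuracy:
    """Hypothetical stand-in for the LDA object in this thread."""

    def __call__(self, **kwargs):
        # PEP 468 (Python 3.6+): kwargs is an insertion-ordered mapping,
        # so Z=3, R=4 and R=4, Z=3 are distinguishable here.
        return list(kwargs.items())

LDA = Accuracy()
print(LDA(Z=3, R=4))  # [('Z', 3), ('R', 4)]
print(LDA(R=4, Z=3))  # [('R', 4), ('Z', 3)]
```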
From ram.rachum at gmail.com  Mon Jun 23 14:57:26 2014
From: ram.rachum at gmail.com (Ram Rachum)
Date: Mon, 23 Jun 2014 05:57:26 -0700 (PDT)
Subject: [Python-ideas] as_completed
Message-ID: <62c29483-db9c-4879-a8fe-e9d1de6e4758@googlegroups.com>

What do you think about an argument as_completed=False to Executor.map?
Personally I'd find it really handy.

From rosuav at gmail.com  Mon Jun 23 15:07:33 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 23 Jun 2014 23:07:33 +1000
Subject: [Python-ideas] Accepting keyword arguments for __getitem__
In-Reply-To: <20140623125339.GA18680@ferrara.linux.it>
References: <20140623120605.GA17255@ferrara.linux.it> <20140623125339.GA18680@ferrara.linux.it>
Message-ID:

On Mon, Jun 23, 2014 at 10:53 PM, Stefano Borini wrote:

> On Mon, Jun 23, 2014 at 10:24:53PM +1000, Chris Angelico wrote:
>> The obvious way to accept that would be to support keyword arguments,
>> and then it begins looking very much like a call. Can you alter your
>> notation very slightly to become LDA(Z=5) instead?
>
> We certainly can, but I was wondering if such an extension would be useful
> in other contexts. Also, with the function solution, you would lose the
> order of the entries. You can't distinguish foo(z=3, r=4) from foo(r=4, z=3)

Then you're asking for something where the syntax->semantics translation
is very different from the rest of Python. I suspect that won't fly.

As an alternative, you may want to look into a preprocessor - some sort of
source code or concrete syntax tree transformation (you can't use an AST
transform unless you start with valid, compilable Python). Translate this:

    LDA[z=3, r=4]

into this:

    LDA(("z",3),("r",4))

and then parse it off like this:

    class A:
        def __call__(self, *args):
            for name, value in args:
                blah blah blah

I rather doubt your proposal would see much support in the rest of the
Python world, so a solution that's specific to your codebase would be the
way to go.
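[Editor's note: for illustration, a very rough sketch of the source rewrite Chris describes above. The regex only handles simple name=value pairs with word-character values, an assumption made for brevity; a robust version would build on the tokenize module instead.]

```python
import re

# Rewrite NAME[k1=v1, k2=v2] into NAME(("k1", v1), ("k2", v2)).
# This hypothetical pattern deliberately matches only the simple case.
PATTERN = re.compile(r"(\w+)\[((?:\w+=\w+)(?:,\s*\w+=\w+)*)\]")

def rewrite(source):
    def repl(match):
        name, args = match.group(1), match.group(2)
        pairs = []
        for part in args.split(","):
            key, value = part.strip().split("=")
            pairs.append('("%s", %s)' % (key, value))
        return "%s(%s)" % (name, ", ".join(pairs))
    return PATTERN.sub(repl, source)

print(rewrite("LDA[z=3, r=4]"))  # LDA(("z", 3), ("r", 4))
print(rewrite("a[1:3]"))         # a[1:3]  (ordinary subscripts untouched)
```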
ChrisA From steve at pearwood.info Mon Jun 23 15:15:16 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 23 Jun 2014 23:15:16 +1000 Subject: [Python-ideas] as_completed In-Reply-To: <62c29483-db9c-4879-a8fe-e9d1de6e4758@googlegroups.com> References: <62c29483-db9c-4879-a8fe-e9d1de6e4758@googlegroups.com> Message-ID: <20140623131516.GT7742@ando> On Mon, Jun 23, 2014 at 05:57:26AM -0700, Ram Rachum wrote: > What do you think about an argument as_completed=False to Executor.map ? > Personally I'd find it really handy. What is this argument intended to do? If you have a suggestion, you should explain what the suggestion is. Are you talking about Executor in the futures module? -- Steven From steve at pearwood.info Mon Jun 23 15:42:13 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 23 Jun 2014 23:42:13 +1000 Subject: [Python-ideas] .pyu nicode syntax symbols (was Re: Empty set, Empty dict) In-Reply-To: References: Message-ID: <20140623134212.GV7742@ando> On Sun, Jun 22, 2014 at 09:00:20PM -0700, Guido van Rossum wrote: > On Sun, Jun 22, 2014 at 8:23 PM, Nick Coghlan wrote: > > > On 23 June 2014 10:30, Guido van Rossum wrote: > > > Hm. What's wrong with rejecting bad ideas? > > > > While I agree it's a bad idea to use symbols that can't be readily > > typed as part of the language syntax, I think Terry's broader point > > that anything which *can* be implemented outside the core usually > > *should* be implemented outside the core (at least as a > > proof-of-concept) is a good one. > > > > This particular proposal sounds to me like something that shouldn't be > implemented at all. We don't need another split in the community over how > to spell operators. I think you're exaggerating the danger here a tad. Split the community? 
We can barely get the community to grudgingly accept that maybe there's a
use for Unicode *at all*, let alone use it as syntax :-)

--
Steven

From stefano.borini at ferrara.linux.it  Mon Jun 23 17:59:11 2014
From: stefano.borini at ferrara.linux.it (Stefano Borini)
Date: Mon, 23 Jun 2014 17:59:11 +0200
Subject: [Python-ideas] Accepting keyword arguments for __getitem__
In-Reply-To: References: <20140623120605.GA17255@ferrara.linux.it> <20140623125339.GA18680@ferrara.linux.it>
Message-ID: <53A84ECF.8090500@ferrara.linux.it>

On Mon, Jun 23, 2014 at 08:01:19AM -0500, Ian Cordasco wrote:
> Chris may have missed that requirement (as I did) when they first read
> your email.

I blame my poor choice of subject on that misunderstanding. Apologies.

> Your desired behaviour matches no other known behaviour in
> Python. The only way to achieve that would be to do something akin to:
>
> foo(dict(z=3), dict(r=4))
>
> And the same would be true of your proposed feature for __getitem__
> because all keyword arguments would be collected into one dictionary.
> It would be unreasonable for just one method to behave totally
> differently from the standard behaviour in Python.
> It would be confusing for only __getitem__ (and ostensibly, __setitem__) to take
> keyword arguments but instead of turning them into a dictionary, turn
> them into individual single-item dictionaries.

I tend to agree. However, the fact is that when you say a[2,3,4],
__getitem__ is not called with three arguments. It's called with one tuple
argument, which puts it already in a different category than a(2,3,4),
where each entry is bound to individual arguments. It makes sense if you
understand the comma as a tuple production. With keyword arguments, it
would resemble more of a namedtuple, at least partially.
The alternative, and accidentally proposed by my subject, would be to have __getitem__(self, y, **kwargs) and have a[1,2,Z=3,R=4] produce y=(1,2) kwargs = {"Z":3, "R": 4} but that would be equally heterogeneous (no *args), and it would not preserve ordering. I am not a big fan either of my own idea. I just threw a bone to see if it has already been discussed or if anyone would envision other possible use cases for this notation. From rosuav at gmail.com Mon Jun 23 18:18:24 2014 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 24 Jun 2014 02:18:24 +1000 Subject: [Python-ideas] Accepting keyword arguments for __getitem__ In-Reply-To: <53A84ECF.8090500@ferrara.linux.it> References: <20140623120605.GA17255@ferrara.linux.it> <20140623125339.GA18680@ferrara.linux.it> <53A84ECF.8090500@ferrara.linux.it> Message-ID: On Tue, Jun 24, 2014 at 1:59 AM, Stefano Borini wrote: > I am not a big fan either of my own idea. I just threw a bone to see if it > has > already been discussed or if anyone would envision other possible use cases > for > this notation. Best place to go from here would be a preparser, which you can then publish. If, at some later date, someone else has a similar need, s/he can see what you did and either (1) use it as-is and utter a prayer of thanks that someone's done the work already; (2) tweak it to fit the exact situation required; or (3) grumble at your code, and come back to python-ideas with a proposal. The proposal from #3 would sound something like this: "Here's my use-case. There's this recipe on the internet but it's awkward because X and Y, and it'd be so much better if this could be supported by the core language." And then we'd have this long and fruitful discussion (Sir Humphrey Appleby would approve!), figuring out whether it's of value or not, all with the solid basis of two separate use-cases for the same new syntax. 
ChrisA

From g.rodola at gmail.com  Mon Jun 23 18:29:34 2014
From: g.rodola at gmail.com (Giampaolo Rodola')
Date: Mon, 23 Jun 2014 18:29:34 +0200
Subject: [Python-ideas] .pyu nicode syntax symbols (was Re: Empty set, Empty dict)
In-Reply-To: References: Message-ID:

On Mon, Jun 23, 2014 at 5:23 AM, Nick Coghlan wrote:

> On 23 June 2014 10:30, Guido van Rossum wrote:
> > Hm. What's wrong with rejecting bad ideas?
>
> While I agree it's a bad idea to use symbols that can't be readily
> typed as part of the language syntax, I think Terry's broader point
> that anything which *can* be implemented outside the core usually
> *should* be implemented outside the core (at least as a
> proof-of-concept) is a good one.

AFAIU this *really* looks like a bad idea. I don't even understand why
anyone would want to do such a thing.

--
Giampaolo - http://grodola.blogspot.com

From tjreedy at udel.edu  Mon Jun 23 19:11:29 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 23 Jun 2014 13:11:29 -0400
Subject: [Python-ideas] Accepting keyword arguments for __getitem__
In-Reply-To: <20140623120605.GA17255@ferrara.linux.it>
References: <20140623120605.GA17255@ferrara.linux.it>
Message-ID:

On 6/23/2014 8:06 AM, Stefano Borini wrote:
> Dear all,
>
> At work we use a notation like LDA[Z=5] to define a specific level of
> accuracy for our evaluation. This notation is used just for textual
> labels, but it would be nice if it actually worked at the scripting
> level, which led me to think of the following: at the moment, we have
> the following
>
>>>> class A:
> ...     def __getitem__(self, y):

This actually says that y can be passed by position or name ;-)

> ...         print(y)
> ...
>>>> a=A()
>>>> a[2]
> 2

>>> a.__getitem__(y=2)
2

>>>> a[2,3]
> (2, 3)
>>>> a[1:3]
> slice(1, 3, None)
>>>> a[1:3, 4]
> (slice(1, 3, None), 4)
>>>>
>
> I would propose to add the possibility for a[Z=3], where y would then be a
> dictionary {"Z": 3}. In the case of a[1:3, 4, Z=3, R=5], the value of y would
> be a tuple containing (slice(1,3,None), 4, {"Z": 3}, {"R": 5}). This allows to
> preserve the ordering as specified (e.g. a[Z=3, R=4] vs a[R=4, Z=3]).

As others have pointed out, you are not actually asking that __getitem__
'accept keyword arguments'. Rather you are asking that "x=y" be seen as an
abbreviation for "{'x':y}" in a very rare usage in a particular context to
save 4 (admittedly awkward) keystrokes. The resulting confusion is not
worth it. Saving 4 of 7 might seem worth it, but in real cases, like
"precision=4" versus "{'precision':4}", the ratio is lower.

I also wonder whether you might sometimes use the same spec in multiple
subscriptings, so that you might define "p = {'precision': 4}" once and
use it multiple times.

In your introductory paragraph, you only specify one optional parameter --
accuracy. So it is not clear why you do not just write a .get(self, ob,
accuracy=default) method. If there are multiple options, make them
keyword only.

--
Terry Jan Reedy

From p.f.moore at gmail.com  Mon Jun 23 19:32:58 2014
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 23 Jun 2014 18:32:58 +0100
Subject: [Python-ideas] Accepting keyword arguments for __getitem__
In-Reply-To: References: <20140623120605.GA17255@ferrara.linux.it>
Message-ID:

On 23 June 2014 18:11, Terry Reedy wrote:

> As others have pointed out, you are not actually asking that __getitem__
> 'accept keyword arguments'. Rather you are asking that "x=y" be seen as an
> abbreviation for "{'x':y}" in a very rare usage in a particular context to
> save 4 (admittedly awkward) keystrokes.
The point here is that the OP is viewing Python syntax as (in effect) a DSL[1] for his application, and is looking for syntactical support for constructs that make sense in the context of that DSL. It's not about saving keystrokes, but about expressing things in a way that matches the problem space. The problem is that Python doesn't really support use as a DSL (as opposed to, say Ruby and Perl, which have syntax that is explicitly designed for use as a DSL). Trying to add on DSL-style syntax into Python is always going to be difficult, because that's not how the language was designed. On the other hand, writing a parser or preprocessor that handles a specific DSL is entirely possible - just painful because you need to handle all the niggly details of expression parsers, etc. Maybe a better approach would be to add features to the Python parser to allow it to be used in 3rd party code and customised. Applications could then more easily write their own Python-derived syntax, with a parser that can read from a string, or even implement an import hook to allow directly importable DSL files. I don't know how practical this solution is, or how much of it is already available, but it might be a more productive way of directing people who are looking for "python-like" syntax for their application languages, rather than simply leaving them with writing their own parser, or trying to get Python's syntax changed (which is essentially not going to happen). Just a thought... Paul [1] Domain Specific Language, just in case the term isn't familiar. 
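[Editor's note: a small illustration of Paul's point that pieces of the parser are already exposed in the stdlib. ast.parse can turn Python-like DSL text into a tree that the application walks and reinterprets, with no hand-written parser; the LDA(Z=5) spelling is the call-based workaround discussed earlier in this thread.]

```python
import ast

source = "LDA(Z=5)"  # DSL statement spelled as a valid Python expression
tree = ast.parse(source, mode="eval")
call = tree.body
assert isinstance(call, ast.Call)

# The application decides what the name and keywords *mean*:
name = call.func.id
params = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
print(name, params)  # LDA {'Z': 5}
```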
From joseph.martinot-lagarde at m4x.org  Mon Jun 23 20:22:35 2014
From: joseph.martinot-lagarde at m4x.org (Joseph Martinot-Lagarde)
Date: Mon, 23 Jun 2014 20:22:35 +0200
Subject: [Python-ideas] Accepting keyword arguments for __getitem__
In-Reply-To: <20140623120605.GA17255@ferrara.linux.it>
References: <20140623120605.GA17255@ferrara.linux.it>
Message-ID: <53A8706B.1070106@m4x.org>

On 23/06/2014 14:06, Stefano Borini wrote:
> Dear all,
>
> At work we use a notation like LDA[Z=5] to define a specific level of
> accuracy for our evaluation. This notation is used just for textual
> labels, but it would be nice if it actually worked at the scripting
> level, which led me to think of the following: at the moment, we have
> the following
>
>>>> class A:
> ...     def __getitem__(self, y):
> ...         print(y)
> ...
>>>> a=A()
>>>> a[2]
> 2
>>>> a[2,3]
> (2, 3)
>>>> a[1:3]
> slice(1, 3, None)
>>>> a[1:3, 4]
> (slice(1, 3, None), 4)
>>>>
>
> I would propose to add the possibility for a[Z=3], where y would then be a
> dictionary {"Z": 3}. In the case of a[1:3, 4, Z=3, R=5], the value of y would
> be a tuple containing (slice(1,3,None), 4, {"Z": 3}, {"R": 5}). This allows to
> preserve the ordering as specified (e.g. a[Z=3, R=4] vs a[R=4, Z=3]).
>
> Do you think it would be a good/useful idea? Was this already discussed
> or proposed in a PEP? Google did not help in this regard.
>
> Thank you,
>
> Stefano Borini

Actually I proposed a similar functionality a few months ago:
http://thread.gmane.org/gmane.comp.python.ideas/27584

The main point is not saving a few keystrokes but increasing readability.
It is indeed possible to use __call__ (that's what I'm doing in some
cases), but then the indexing part is lost. Using a dictionary is not
clear either.
Compare:

    table[x=8, y=11]
    table[{x: 8}, {y: 11}]

You could argue that keyword arguments are useless since you can always
add a dictionary as last argument...

Before using python I was using Matlab. One very annoying thing in Matlab
is that both indexing and function call use parentheses. Code mixing both
is really hard to understand. Coming to python was a relief on this
aspect, where [] and () make it really clear whether the operation is a
call or indexing.

Now that I know python better, it bothers me that indexing doesn't have
the same semantics as a function call. To me their intentions are
different but their use should be the same. I guess that the equivalence
between a[1, 2] and a[(1, 2)] is for backward compatibility, but it
shouldn't stop us from adding keyword arguments.

Using a preprocessor seems fine when building a full application, but is
really impracticable when crunching numbers from scripts or ipython.
Also, using a preprocessor for something as simple as indexing seems
really overkill.

Now, I don't understand why you need to know the ordering of keyword
arguments, since they are clearly labeled? I'd hate to have to manually
parse (slice(1,3,None), 4, {"Z": 3}, {"R": 5}).

Joseph

From joseph.martinot-lagarde at m4x.org  Mon Jun 23 20:35:34 2014
From: joseph.martinot-lagarde at m4x.org (Joseph Martinot-Lagarde)
Date: Mon, 23 Jun 2014 20:35:34 +0200
Subject: [Python-ideas] Accepting keyword arguments for __getitem__
In-Reply-To: <53A8706B.1070106@m4x.org>
References: <20140623120605.GA17255@ferrara.linux.it> <53A8706B.1070106@m4x.org>
Message-ID: <53A87376.9020705@m4x.org>

On 23/06/2014 20:22, Joseph Martinot-Lagarde wrote:
> On 23/06/2014 14:06, Stefano Borini wrote:
>> Dear all,
>>
>> At work we use a notation like LDA[Z=5] to define a specific level of
>> accuracy for our evaluation.
This notation is used >> just for textual labels, but it would be nice if it actually worked at >> the scripting level, which led me to think to the following: >> at the moment, we have the following >> >>>>> class A: >> ... def __getitem__(self, y): >> ... print(y) >> ... >>>>> a=A() >>>>> a[2] >> 2 >>>>> a[2,3] >> (2, 3) >>>>> a[1:3] >> slice(1, 3, None) >>>>> a[1:3, 4] >> (slice(1, 3, None), 4) >>>>> >> >> I would propose to add the possibility for a[Z=3], where y would then >> be a >> dictionary {"Z": 3}. In the case of a[1:3, 4, Z=3, R=5], the value of >> y would >> be a tuple containing (slice(1,3,None), 4, {"Z": 3}, {"R": 5}). This >> allows to >> preserve the ordering as specified (e.g. a[Z=3, R=4] vs a[R=4, Z=3]). >> >> Do you think it would be a good/useful idea? Was this already >> discussed or proposed in a PEP? >> Google did not help on this regard. >> >> Thank you, >> >> Stefano Borini >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > Actually I proposed a similar functionality a few months ago: > http://thread.gmane.org/gmane.comp.python.ideas/27584 > > The main point is not saving a few keystrokes but increasing readability. > It is indeed possible to use __call__ (that's what I'm doing in some > cases), but then the indexing part is lost. Using a dictionary is not > clear either. Compare: > > table[x=8, y=11] > table[{x: 8}, {y: 11}] > > You could argue that keyword arguments are useless since you can always > add a dictionary as last argument... > > Before using python I was using Matlab. One very annoying thing in > Matlab is that both indexing and function call use parentheses. Code > mixing both is really hard to understand. Coming to python was a relief > on this aspect, where [] and () make really clear whether the operation > is a call or indexing.
> > Now that I know python better, it bothers me that indexing doesn't have > the same semantics as a function call. To me their intentions are > different but their use should be the same. I guess that the equivalence > between a[1, 2] and a[(1, 2)] is for backward compatibility, but it > shouldn't stop from adding keyword arguments. > > Using a preprocessor seems fine when building a full application, but is > really impracticable when crunching numbers from scripts or ipython. > Also, using a preprocessor for something as simple as indexing seems > really overkill. > > Now, I don't understand why you need to know the ordering of keyword > arguments, since they are clearly labeled? I'd hate to have to manually > parse (slice(1,3,None), 4, {"Z": 3}, {"R": 5}). > > Joseph > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > I forgot to add that slice notation can't be used in function calls, you have to use the far less readable slice() function. For a live example of something that would use keyword arguments in __getitem__, there is numpy.r_ : http://docs.scipy.org/doc/numpy/reference/generated/numpy.r_.html. Joseph From antoine at python.org Mon Jun 23 20:36:55 2014 From: antoine at python.org (Antoine Pitrou) Date: Mon, 23 Jun 2014 14:36:55 -0400 Subject: [Python-ideas] .pyu nicode syntax symbols (was Re: Empty set, Empty dict) In-Reply-To: References: Message-ID: Agreed with Guido. The proposed idea looks terribly silly (I was actually wondering whether the post was serious or not). Regards Antoine. Le 23/06/2014 00:00, Guido van Rossum a écrit : > On Sun, Jun 22, 2014 at 8:23 PM, Nick Coghlan > > wrote: > > On 23 June 2014 10:30, Guido van Rossum > wrote: > > Hm. What's wrong with rejecting bad ideas?
> > While I agree it's a bad idea to use symbols that can't be readily > typed as part of the language syntax, I think Terry's broader point > that anything which *can* be implemented outside the core usually > *should* be implemented outside the core (at least as a > proof-of-concept) is a good one. > > > This particular proposal sounds to me like something that shouldn't be > implemented at all. We don't need another split in the community over > how to spell operators. > > Hy shows it is possible to implement a Lisp on top of the CPython > runtime, > > > It wasn't proposed as a serious feature on python-ideas. > > so folks should certainly be capable of implementing a > Python-with-Unicode-symbols on top of existing Python runtimes without > needing the blessing of the core development team. > > > Terry *is* asking for a blessing of the .pyu extension by the core team. > (Although it seems he wouldn't be too upset if he didn't get it. :-) > > -- > --Guido van Rossum (python.org/~guido ) > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From jeanpierreda at gmail.com Mon Jun 23 20:37:37 2014 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Mon, 23 Jun 2014 11:37:37 -0700 Subject: [Python-ideas] Accepting keyword arguments for __getitem__ In-Reply-To: <20140623120605.GA17255@ferrara.linux.it> References: <20140623120605.GA17255@ferrara.linux.it> Message-ID: What about using slices instead? >>> a['Z': 3, 'B': 2] (slice('Z', 3, None), slice('B', 2, None)) -- Devin On Mon, Jun 23, 2014 at 5:06 AM, Stefano Borini wrote: > Dear all, > > At work we use a notation like LDA[Z=5] to define a specific level of accuracy for our evaluation. 
This notation is used > just for textual labels, but it would be nice if it actually worked at the scripting level, which led me to think to the following: > at the moment, we have the following > >>>> class A: > ... def __getitem__(self, y): > ... print(y) > ... >>>> a=A() >>>> a[2] > 2 >>>> a[2,3] > (2, 3) >>>> a[1:3] > slice(1, 3, None) >>>> a[1:3, 4] > (slice(1, 3, None), 4) >>>> > > I would propose to add the possibility for a[Z=3], where y would then be a > dictionary {"Z": 3}. In the case of a[1:3, 4, Z=3, R=5], the value of y would > be a tuple containing (slice(1,3,None), 4, {"Z": 3}, {"R": 5}). This allows to > preserve the ordering as specified (e.g. a[Z=3, R=4] vs a[R=4, Z=3]). > > Do you think it would be a good/useful idea? Was this already discussed or proposed in a PEP? > Google did not help on this regard. > > Thank you, > > Stefano Borini > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From guido at python.org Mon Jun 23 20:47:51 2014 From: guido at python.org (Guido van Rossum) Date: Mon, 23 Jun 2014 11:47:51 -0700 Subject: [Python-ideas] Accepting keyword arguments for __getitem__ In-Reply-To: References: <20140623120605.GA17255@ferrara.linux.it> Message-ID: I'm not sure yet what to think of the proposal (the proposed workarounds sound pretty reasonable) but it looks to me like the OP (Stefano) did a pretty good and careful analysis of the existing API, and his actual proposal does make the most sense if we wanted to add such a feature at all. (And yes, the subject was a little misleading. :-) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
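Devin's slice-based workaround above needs no language change: a wrapper's __getitem__ can treat a slice whose start is a string as a keyword-style label. A minimal sketch (the LabeledIndex class name is invented for illustration):

```python
class LabeledIndex:
    """Toy container showing the slice trick: a['Z':5] plays the role of a[Z=5]."""
    def __getitem__(self, key):
        # __getitem__ always receives a single object; it is a tuple
        # only when a comma appeared between the brackets.
        items = key if isinstance(key, tuple) else (key,)
        positional, labels = [], {}
        for item in items:
            # A slice whose start is a string is read as a keyword label.
            if isinstance(item, slice) and isinstance(item.start, str):
                labels[item.start] = item.stop
            else:
                positional.append(item)
        return positional, labels

a = LabeledIndex()
print(a['Z':5])                  # ([], {'Z': 5})
print(a[1:3, 4, 'Z':3, 'R':5])   # ([slice(1, 3, None), 4], {'Z': 3, 'R': 5})
```

Since the slices arrive inside the tuple in written order, this spelling also preserves the ordering Stefano asked about.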
URL: From tjreedy at udel.edu Mon Jun 23 21:11:25 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 23 Jun 2014 15:11:25 -0400 Subject: [Python-ideas] .pyu nicode syntax symbols (was Re: Empty set, Empty dict) In-Reply-To: References: Message-ID: On 6/22/2014 8:30 PM, Guido van Rossum wrote: > Hm. What's wrong with rejecting bad ideas? [I am not sure whether you are asking seriously or rhetorically, but I think this question is worth a serious response.] Aside from the fact that different people have different ideas of what is an absolutely bad idea, nothing. I personally reject almost all new syntax ideas because I think most of them are local small-audience optimizations that would overall make Python worse. However, the purpose of python-ideas is "Discussions of speculative Python language ideas". 'Discussion' means not routinely trying to stop discussion. Indeed, some good can come from discussion of ideas I (or you) think are bad. Rejection has multiple forms, not mutually exclusive: Inaction: by default, an idea is effectively rejected until a patch is committed. Education: explaining how one can already accomplish the desired task. Deflection: suggest implementing the idea somewhere other than in core Python. Explanation: explain why something is bad. Downvote or BDFL rejection: (self-explanatory) > On Jun 22, 2014 1:19 PM, "Terry Reedy" > > wrote: > > Problem: For years, various people have suggested that they would > like to use syntactically significant unicode symbols in Python > code. A prime example is using U+2205, EMPTY SET, ∅, instead of > 'set()'. Specifically, I believe people have asked that Python parsers accept and translate unicode symbols *in .py files*. This would have the immediate effect of making some .py files invisibly and unnecessarily incompatible with all existing Python interpreters, even if the translated code would run just fine. I, too, do not want the meaning of '.py' fragmented further than it already is.
> On the other hand, the conservative, overwhelmed core > development group is not much interested and would rather do other > things. In other words, the idea of changing Python itself has been and will be rejected by inaction for at least the next few years, and until circumstances change after that. (Hence, no need for *me* to 'reject' it.) > Solution: Act instead of ask. 'Stop asking' is not only rejection of the idea of changing Python, but also of continuing the discussion that has gone on for years. People who do not want to give up the idea should do something else. In the course of suggesting an implementation, I also suggested some aspects of an implementation that I consider important. > One or more of the people who really want this could get themselves > together and produce a working system. (If multiple people, ask for > a new sig and mailing list). Discuss it elsewhere because python-ideas is not 3rd-party-package-dev. > 1. Ask core development to reserve '.pyu' for python with unicode > symbols. (If refused, choose something else.) In other words, 1. do not use .py for unisym_python. 2. while .pyu seems like an obvious alternative (to me), recognize python-dev's moral rights to .pyx, regardless of legalities. > 2. Write pyu.py. It should first translate x.pyu to the equivalent > x.py. ... run x.py. To be clear, I meant write x.py to disk, where it would be available for humans to read. This is specifically aimed at the issue of 'fragmenting the community'. > [snip implementation idea] > A mathematician that used most of those symbols, for a math > audience, could still use the ascii translation for other audiences. Again, I would want the standard .py file available. In my first post to clp/python list over 17 years ago, I dubbed Python 'executable pseudocode' and opined that it should be used to communicate algorithms in preference to non-executable notation.
I would rather a mathematician use symbols embedded in Python, with a link to a .py file, than the same symbols in a non-executable *and non-testable* notation. -- Terry Jan Reedy From guido at python.org Mon Jun 23 21:28:03 2014 From: guido at python.org (Guido van Rossum) Date: Mon, 23 Jun 2014 12:28:03 -0700 Subject: [Python-ideas] .pyu nicode syntax symbols (was Re: Empty set, Empty dict) In-Reply-To: References: Message-ID: Sorry Terry, I was short (and ended up being cryptic) because I was on a mobile device. I meant "this is a bad idea and should be rejected", and in addition I also meant to discourage a 3rd party implementation of the idea. I also wanted to object against your claim that this idea has only been left unimplemented because of disinterest or inaction by the core dev team; to the contrary, the general sentiment is pretty clear that it's a bad idea. There are other ideas that are not suitable for adding to the language but where we would encourage folks to help themselves by writing a module or extension and posting it on PyPI, or even ideas where it would eventually be a good idea to include such a package into the stdlib. But this is not one of them. On Mon, Jun 23, 2014 at 12:11 PM, Terry Reedy wrote: > On 6/22/2014 8:30 PM, Guido van Rossum wrote: > >> Hm. What's wrong with rejecting bad ideas? >> > > [I am not sure whether you are asking seriously or rhetorically, but I > think this question is worth a serious response.] > > Aside from the fact that different people have different ideas of what is > an absolutely bad idea, nothing. I personally reject almost all new syntax > ideas because I think most of them are local small-audience optimizations > that would overall make Python worse. > > However, the purpose of python-ideas is "Discussions of speculative Python > language ideas". 'Discussion' means not routinely trying to stop > discussion. Indeed, some good can come from discussion of ideas I (or you) > think are bad.
> > Rejection has multiple forms, not mutually exclusive: > > Inaction: by default, an idea is effectively rejected until a patch is > committed. > > Education: explaining how one can already accomplish the desired task. > > Deflection: suggest implementing the idea somewhere other than in core > Python. > > Explanation: explain why something is bad. > > Downvote or BDFL rejection: (self-explanatory) > > On Jun 22, 2014 1:19 PM, "Terry Reedy" >> > > wrote: >> >> Problem: For years, various people have suggested that they would >> like to use syntactically significant unicode symbols in Python >> code. A prime example is using U+2205, EMPTY SET, ∅, instead of >> 'set()'. >> > > Specifically, I believe people have asked that Python parsers accept and > translate unicode symbols *in .py files*. This would have the immediate > effect of making some .py files invisibly and unnecessarily incompatible > with all existing Python interpreters, even if the translated code would > run just fine. I, too, do not want the meaning of '.py' fragmented further > than it already is. > > > On the other hand, the conservative, overwhelmed core >> development group is not much interested and would rather do other >> things. >> > > In other words, the idea of changing Python itself has been and will be > rejected by inaction for at least the next few years, and until > circumstances change after that. (Hence, no need for *me* to 'reject' it.) > > > Solution: Act instead of ask. >> > > 'Stop asking' is not only rejection of the idea of changing Python, but > also of continuing the discussion that has gone on for years. People who do > not want to give up the idea should do something else. In the course of > suggesting an implementation, I also suggested some aspects of an > implementation that I consider important. > > > One or more of the people who really want this could get themselves >> together and produce a working system.
(If multiple people, ask for >> a new sig and mailing list). >> > > Discuss it elsewhere because python-ideas is not 3rd-party-package-dev. > > 1. Ask core development to reserve '.pyu' for python with unicode >> symbols. (If refused, choose something else.) >> > > In other words, > 1. do not use .py for unisym_python. > 2. while .pyu seems like an obvious alternative (to me), recognize > python-dev's moral rights to .pyx, regardless of legalities. > > 2. Write pyu.py. It should first translate x.pyu to the equivalent >> x.py. ... run x.py. >> > > To be clear, I meant write x.py to disk, where it would be available for > humans to read. This is specifically aimed at the issue of 'fragmenting the > community'. > > > [snip implementation idea] > > A mathematician that used most of those symbols, for a math >> audience, could still use the ascii translation for other audiences. >> > > Again, I would want the standard .py file available. > > In my first post to clp/python list over 17 years ago, I dubbed Python > 'executable pseudocode' and opined that it should be used to communicate > algorithms in preference to non-executable notation. I would rather a > mathematician use symbols embedded in Python, with a link to a .py file, > than the same symbols in a non-executable *and non-testable* notation. > > > -- > Terry Jan Reedy > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed...
URL: From abarnert at yahoo.com Mon Jun 23 22:16:55 2014 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 23 Jun 2014 13:16:55 -0700 Subject: [Python-ideas] Accepting keyword arguments for __getitem__ In-Reply-To: <20140623125339.GA18680@ferrara.linux.it> References: <20140623120605.GA17255@ferrara.linux.it> <20140623125339.GA18680@ferrara.linux.it> Message-ID: On Jun 23, 2014, at 5:53, Stefano Borini wrote: > On Mon, Jun 23, 2014 at 10:24:53PM +1000, Chris Angelico wrote: >> The obvious way to accept that would be to support keyword arguments, >> and then it begins looking very much like a call. Can you alter your >> notation very slightly to become LDA(Z=5) instead? > > We certainly can, but I was wondering if such extension would be useful in other contexts. > Also, with the function solution, you would lose the order of the entries. You can't distinguish > foo(z=3, r=4) from foo(r=4, z=3) That last problem is a more general one, which applies to function calls at least as much as to your proposed use case, and there's an open PEP (466) that could probably use more use cases to convince people. With that PEP, you wouldn't get {'z': 3}, {'r': 4}, but OrderedDict(('z', 3), ('r', 4)) or something equivalent. I think that would make the function-calling workaround much more usable. And it would definitely make your additional proposal a lot simpler: add kwargs--which then work exactly the same as in function calls--to getitem. There's also a proposal for namedtuple literals, which seems like it fits your use case a lot better (especially if, like a regular tuple literal, the parens were optional). Unfortunately, if I remember right, nobody was able to come up with a good enough solution to the semantic problems to make it worth writing a PEP. But you could find that in the archives and see if you can come up with a workable version of that idea.
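The ordering problem Andrew describes was eventually settled in the language itself: PEP 468, accepted for Python 3.6, guarantees that **kwargs preserves the order in which the keywords were passed, so a call-based spelling no longer loses the z=3, r=4 versus r=4, z=3 distinction. A small sketch on Python 3.6+:

```python
# PEP 468 (Python 3.6+): **kwargs preserves the order keywords were written,
# so a __call__-based accessor can tell LDA(Z=3, R=4) from LDA(R=4, Z=3).
def capture(*args, **kwargs):
    return args, list(kwargs.items())

print(capture(z=3, r=4))  # ((), [('z', 3), ('r', 4)])
print(capture(r=4, z=3))  # ((), [('r', 4), ('z', 3)])
```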
From stefano.borini at ferrara.linux.it Mon Jun 23 22:40:26 2014 From: stefano.borini at ferrara.linux.it (Stefano Borini) Date: Mon, 23 Jun 2014 22:40:26 +0200 Subject: [Python-ideas] Accepting keyword arguments for __getitem__ In-Reply-To: References: <20140623120605.GA17255@ferrara.linux.it> <20140623125339.GA18680@ferrara.linux.it> Message-ID: <53A890BA.4020904@ferrara.linux.it> On 6/23/14 10:16 PM, Andrew Barnert wrote: > That last problem is a more general one, which applies to function > calls at least as much as to your proposed use case, and there's an > open PEP (466) that could probably use more use cases to convince > people. Sorry, I cannot find it. PEP 466 is about network security. It's the first time I engage in active python proposals, so I might be a bit clueless if for python3 there's a different repository/numbering system. > With that PEP, you wouldn't get {'z': 3}, {'r': 4}, but > OrderedDict(('z', 3), ('r', 4)) or something equivalent. I think that > would make the function-calling workaround much more usable. And it > would definitely make your additional proposal a lot simpler: add > kwargs--which then work exactly the same as in function calls--to > getitem. You would however have to skip *args, as it would never make sense in that context: the full non-keyword arguments would be always packed into the y tuple. > There's also a proposal for namedtuple literals, which seems like it > fit your use case a lot better (especially if, like a regular tuple > literal, the parens were optional). Unfortunately, if I remember > right, nobody was able to come up with a good enough solution to the > semantic problems to make it worth writing a PEP. But you could find > that in the archives and see if you can come up with a workable > version of that idea. I see that the idea spawned some discussion, and at this point I don't really know what a possible course of action might be. 
I am certainly open to doing additional research and aggregate what I find into some kind of proto-PEP, and hack the interpreter for some possible implementation. Thanks, Stefano From eric at trueblade.com Mon Jun 23 23:27:43 2014 From: eric at trueblade.com (Eric V. Smith) Date: Mon, 23 Jun 2014 17:27:43 -0400 Subject: [Python-ideas] Accepting keyword arguments for __getitem__ In-Reply-To: <53A890BA.4020904@ferrara.linux.it> References: <20140623120605.GA17255@ferrara.linux.it> <20140623125339.GA18680@ferrara.linux.it> <53A890BA.4020904@ferrara.linux.it> Message-ID: <53A89BCF.6030203@trueblade.com> On 6/23/2014 4:40 PM, Stefano Borini wrote: > On 6/23/14 10:16 PM, Andrew Barnert wrote: >> That last problem is a more general one, which applies to function >> calls at least as much as to your proposed use case, and there's an >> open PEP (466) that could probably use more use cases to convince >> people. > > Sorry, I cannot find it. PEP 466 is about network security. It's the > first time I engage in active python proposals, so I might be a bit > clueless if for python3 there's a different repository/numbering system. That's PEP 468: http://legacy.python.org/dev/peps/pep-0468/ From ndbecker2 at gmail.com Fri Jun 27 15:05:48 2014 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 27 Jun 2014 09:05:48 -0400 Subject: [Python-ideas] problems with import Message-ID: One problem I often encounter is with import search. An example of the problem is with the package mercurial. It has extensions in a subdirectory called 'hgext'. On fedora, I install mercurial using the vendor package. This creates /usr/lib64/python2.7/site-packages/hgext/ Later, I want to try out an extension as a non-privileged user. cd python setup.py install --user Now I also have ~/.local/lib/python2.7/site-packages/hgext but python won't search there for extensions. Once it finds the system hgext directory, it won't look also in the local one. Any thoughts?
From __peter__ at web.de Fri Jun 27 15:34:48 2014 From: __peter__ at web.de (Peter Otten) Date: Fri, 27 Jun 2014 15:34:48 +0200 Subject: [Python-ideas] problems with import References: Message-ID: Neal Becker wrote: > One problem I often encounter is with import search. > > An example of the problem is with the package mercurial. It has > extensions in a subdirectory called 'hgext'. > > On fedora, I install mercurial using the vendor package. This creates > > /usr/lib64/python2.7/site-packages/hgext/ > > Later, I want to try out an extension as a non-privileged user. > > cd > python setup.py install --user > > Now I also have > ~/.local/lib/python2.7/site-packages/hgext > > but python won't search there for extensions. Once it finds the system > hgext directory, it won't look also in the local one. > > Any thoughts? Isn't that addressed with "PEP 420 -- Implicit Namespace Packages"? http://legacy.python.org/dev/peps/pep-0420/ From rosuav at gmail.com Fri Jun 27 15:44:03 2014 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 27 Jun 2014 23:44:03 +1000 Subject: [Python-ideas] problems with import In-Reply-To: References: Message-ID: On Fri, Jun 27, 2014 at 11:34 PM, Peter Otten <__peter__ at web.de> wrote: > Isn't that addressed with "PEP 420 -- Implicit Namespace Packages"? > > http://legacy.python.org/dev/peps/pep-0420/ Unfortunately for the OP, that doesn't seem to be applicable to Python 2.x; also, changes made to Python in response to this list don't usually apply to 2.x either. Mercurial doesn't work with Python 3, currently, although some of the improvements to the latest Pythons have been in response to reported issues with hg, so it's possible that hg might support 3.5 or 3.6 at some point. I don't know if there's a 2.x-compatible solution to this problem.
ChrisA From __peter__ at web.de Fri Jun 27 16:01:13 2014 From: __peter__ at web.de (Peter Otten) Date: Fri, 27 Jun 2014 16:01:13 +0200 Subject: [Python-ideas] problems with import References: Message-ID: Chris Angelico wrote: > On Fri, Jun 27, 2014 at 11:34 PM, Peter Otten > <__peter__ at web.de> wrote: >> Isn't that addressed with "PEP 420 -- Implicit Namespace Packages"? >> >> http://legacy.python.org/dev/peps/pep-0420/ > > Unfortunately for the OP, that doesn't seem to be applicable to Python > 2.x; also, changes made to Python in response to this list don't > usually apply to 2.x either. Mercurial doesn't work with Python 3, > currently, although some of the improvements to the latest Pythons > have been in response to reported issues with hg, so it's possible > that hg might support 3.5 or 3.6 at some point. > > I don't know if there's a 2.x-compatible solution to this problem. http://legacy.python.org/dev/peps/pep-0420/#namespace-packages-today But this is likely off-topic for python-ideas. From steve at pearwood.info Fri Jun 27 19:12:15 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 28 Jun 2014 03:12:15 +1000 Subject: [Python-ideas] problems with import In-Reply-To: References: Message-ID: <20140627171215.GG13014@ando> On Fri, Jun 27, 2014 at 09:05:48AM -0400, Neal Becker wrote: [...] > Now I also have > ~/.local/lib/python2.7/site-packages/hgext > > but python won't search there for extensions. Once it finds the system hgext > directory, it won't look also in the local one. Re-arrange sys.path so that the local site-packages comes first, before the global site-packages. (I'm surprised Python doesn't already do this.)
-- Steven From antoine at python.org Fri Jun 27 20:33:07 2014 From: antoine at python.org (Antoine Pitrou) Date: Fri, 27 Jun 2014 14:33:07 -0400 Subject: [Python-ideas] problems with import In-Reply-To: <20140627171215.GG13014@ando> References: <20140627171215.GG13014@ando> Message-ID: Le 27/06/2014 13:12, Steven D'Aprano a écrit : > On Fri, Jun 27, 2014 at 09:05:48AM -0400, Neal Becker wrote: > [...] >> Now I also have >> ~/.local/lib/python2.7/site-packages/hgext >> >> but python won't search there for extensions. Once it finds the system hgext >> directory, it won't look also in the local one. > > Re-arrange sys.path so that the local site-packages comes first, > before the global site-packages. (I'm surprised Python doesn't > already do this.) Then he would have the reverse problem: once he installs a user-local hg extension, the bundled (official) hg extensions wouldn't be reachable anymore. The answer here comes down to two possibilities, both of which have to do with Mercurial and none with Python itself: 1) Mercurial could make hgext a namespace package (see Peter Otten's answer above) 2) third-party extensions for Mercurial should never install into the "hgext" package but rather in a separate top-level package or module (presumably called "hgsomething") Regards Antoine. From steve at pearwood.info Fri Jun 27 21:19:26 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 28 Jun 2014 05:19:26 +1000 Subject: [Python-ideas] problems with import In-Reply-To: References: <20140627171215.GG13014@ando> Message-ID: <20140627191926.GH13014@ando> On Fri, Jun 27, 2014 at 02:33:07PM -0400, Antoine Pitrou wrote: > Le 27/06/2014 13:12, Steven D'Aprano a écrit : > >On Fri, Jun 27, 2014 at 09:05:48AM -0400, Neal Becker wrote: > >[...] > >>Now I also have > >>~/.local/lib/python2.7/site-packages/hgext > >> > >>but python won't search there for extensions. Once it finds the system > >>hgext > >>directory, it won't look also in the local one.
> > >Re-arrange sys.path so that the local site-packages comes first, > >before the global site-packages. (I'm surprised Python doesn't > >already do this.) > > Then he would have the reverse problem: once he installs a user-local hg > extension, the bundled (official) hg extensions wouldn't be reachable > anymore. Naturally, but I assumed that the only reason you would install something locally was if you intended it to override the global version. If that's not the case, then you're right, it's an issue for Mercurial to solve. -- Steven From ndbecker2 at gmail.com Fri Jun 27 23:02:14 2014 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 27 Jun 2014 17:02:14 -0400 Subject: [Python-ideas] problems with import References: <20140627171215.GG13014@ando> <20140627191926.GH13014@ando> Message-ID: Steven D'Aprano wrote: > On Fri, Jun 27, 2014 at 02:33:07PM -0400, Antoine Pitrou wrote: >> Le 27/06/2014 13:12, Steven D'Aprano a écrit : >> >On Fri, Jun 27, 2014 at 09:05:48AM -0400, Neal Becker wrote: >> >[...] >> >>Now I also have >> >>~/.local/lib/python2.7/site-packages/hgext >> >> >> >>but python won't search there for extensions. Once it finds the system >> >>hgext >> >>directory, it won't look also in the local one. >> > >> >Re-arrange sys.path so that the local site-packages comes first, >> >before the global site-packages. (I'm surprised Python doesn't >> >already do this.) >> >> Then he would have the reverse problem: once he installs a user-local hg >> extension, the bundled (official) hg extensions wouldn't be reachable >> anymore. > > Naturally, but I assumed that the only reason you would install > something locally was if you intended it to override the global > version. If that's not the case, then you're right, it's an issue for > Mercurial to solve. > > I don't think this is unique to mercurial. I'd like to have 2 areas for installing extensions to a package: a system wide and a local.
I think the semantics we'd want is that the 2 trees are effectively merged, with the local overriding in the event of a conflict From guido at python.org Fri Jun 27 23:06:29 2014 From: guido at python.org (Guido van Rossum) Date: Fri, 27 Jun 2014 14:06:29 -0700 Subject: [Python-ideas] problems with import In-Reply-To: References: <20140627171215.GG13014@ando> <20140627191926.GH13014@ando> Message-ID: Yeah, so in Python 3.3 this is possible through namespace packages (see PEP 420 -- tldr: remove the empty __init__.py). It has been supported for a long time in Python 2 by setuptools, and you can even do it yourself by setting the package's __path__ attribute. See also pkgutil.py in the Python 2 stdlib. On Fri, Jun 27, 2014 at 2:02 PM, Neal Becker wrote: > Steven D'Aprano wrote: > > > On Fri, Jun 27, 2014 at 02:33:07PM -0400, Antoine Pitrou wrote: > >> Le 27/06/2014 13:12, Steven D'Aprano a écrit : > >> >On Fri, Jun 27, 2014 at 09:05:48AM -0400, Neal Becker wrote: > >> >[...] > >> >>Now I also have > >> >>~/.local/lib/python2.7/site-packages/hgext > >> >> > >> >>but python won't search there for extensions. Once it finds the > system > >> >>hgext > >> >>directory, it won't look also in the local one. > >> > > >> >Re-arrange sys.path so that the local site-packages comes first, > >> >before the global site-packages. (I'm surprised Python doesn't > >> >already do this.) > >> > >> Then he would have the reverse problem: once he installs a user-local hg > >> extension, the bundled (official) hg extensions wouldn't be reachable > >> anymore. > > > > Naturally, but I assumed that the only reason you would install > > something locally was if you intended it to override the global > > version. If that's not the case, then you're right, it's an issue for > > Mercurial to solve. > > > > > > I don't think this is unique to mercurial. > > I'd like to have 2 areas for installing extensions to a package: > a system wide and a local.
> > I think the semantics we'd want is that the 2 trees are effectively merged, > with the local overriding in the event of a conflict > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Fri Jun 27 23:39:48 2014 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 27 Jun 2014 15:39:48 -0600 Subject: [Python-ideas] problems with import In-Reply-To: References: Message-ID: On Fri, Jun 27, 2014 at 7:05 AM, Neal Becker wrote: > One problem I often encounter is with import search. > > An example of the problem is with the package mercurial. It has extensions in a > subdirectory called 'hgext'. > > On fedora, I install mercurial using the vendor package. This creates > > /usr/lib64/python2.7/site-packages/hgext/ > > Later, I want to try out an extension as a non-privileged user. > > cd > python setup.py install --user > > Now I also have > ~/.local/lib/python2.7/site-packages/hgext > > but python won't search there for extensions. Once it finds the system hgext > directory, it won't look also in the local one. > > Any thoughts? Use ~/.hgrc to enable extensions: http://www.selenic.com/mercurial/hgrc.5.html#extensions In your case give the explicit path. I've been doing this for years and it works great. 
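Concretely, the ~/.hgrc entry Eric describes would look something like this (the extension name and file path here are made up for illustration - substitute whatever is actually installed under ~/.local):

```ini
# ~/.hgrc
[extensions]
# Enabling an extension by explicit path bypasses the hgext
# package search entirely, so it doesn't matter which hgext
# directory Python's import machinery would find first.
myext = ~/.local/lib/python2.7/site-packages/hgext/myext.py
```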
-eric From ncoghlan at gmail.com Sat Jun 28 00:18:27 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Jun 2014 08:18:27 +1000 Subject: [Python-ideas] problems with import In-Reply-To: <20140627191926.GH13014@ando> References: <20140627171215.GG13014@ando> <20140627191926.GH13014@ando> Message-ID: On 28 Jun 2014 05:20, "Steven D'Aprano" wrote: > Naturally, but I assumed that the only reason you would install > something locally was if you intended it to over-ride the global > version. If that's not the case, then you're right, it's an issue for > Mercurial to solve. Local installs are supported to *add* non-conflicting user specific packages to the system Python installation, not to override them. The fact it's tricky to override the standard library and system provided libraries helps reduce the attack surface when running software with elevated access, but still as a specific user. Packages *can* opt-in to allowing contributions of submodules from later sys.path entries by declaring that package as a namespace package. In Python 3.3+ that's as simple as leaving __init__.py out entirely. In any version, it can be done explicitly using pkgutil.extend_path() (standard library) or pkg_resources.declare_namespace() (published as part of setuptools) Even with namespace packages, though, *modules* earlier in sys.path still take precedence over later entries. Cheers, Nick. > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From antoine at python.org Sat Jun 28 01:52:11 2014 From: antoine at python.org (Antoine Pitrou) Date: Fri, 27 Jun 2014 19:52:11 -0400 Subject: [Python-ideas] problems with import In-Reply-To: References: <20140627171215.GG13014@ando> <20140627191926.GH13014@ando> Message-ID: Le 27/06/2014 17:02, Neal Becker a écrit : > > I don't think this is unique to mercurial. > > I'd like to have 2 areas for installing extensions to a package: > a system wide and a local. > > I think the semantics we'd want is that the 2 trees are effectively merged, > with the local overriding in the event of a conflict Who is "we"? Mercurial extensions have to be enabled manually in your .hgrc, so whether they live in the hgext namespace or anywhere else is quite irrelevant. Regards Antoine. From random832 at fastmail.us Sat Jun 28 07:00:02 2014 From: random832 at fastmail.us (random832 at fastmail.us) Date: Sat, 28 Jun 2014 01:00:02 -0400 Subject: [Python-ideas] .pyu nicode syntax symbols (was Re: Empty set, Empty dict) In-Reply-To: References: Message-ID: <1403931602.14407.135458493.44CF193B@webmail.messagingengine.com> On Sun, Jun 22, 2014, at 16:18, Terry Reedy wrote: > 2. Write pyu.py. It should first translate x.pyu to the equivalent x.py. What is the equivalent x.py for "BUILD_SET 0" rather than "LOAD_GLOBAL (set), CALL_FUNCTION 0"? From rosuav at gmail.com Sat Jun 28 07:28:57 2014 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 28 Jun 2014 15:28:57 +1000 Subject: [Python-ideas] .pyu nicode syntax symbols (was Re: Empty set, Empty dict) In-Reply-To: <1403931602.14407.135458493.44CF193B@webmail.messagingengine.com> References: <1403931602.14407.135458493.44CF193B@webmail.messagingengine.com> Message-ID: On Sat, Jun 28, 2014 at 3:00 PM, wrote: > On Sun, Jun 22, 2014, at 16:18, Terry Reedy wrote: > >> 2. Write pyu.py. It should first translate x.pyu to the equivalent x.py. > > What is the equivalent x.py for "BUILD_SET 0" rather than "LOAD_GLOBAL > (set), CALL_FUNCTION 0"? 
Is there any reason that it has to be normal-looking source code?

def empty_set_literal(): # line 123 of somefile.pyu
    print("I'm an empty set!", ∅)

# becomes

empty_set_literal = type(lambda:0)(type((lambda:0).__code__)(0,0,0,3,67,b't\x00\x00d\x01\x00h\x00\x00\x83\x02\x00\x01d\x00\x00S',(None,"I'm an empty set!",{}),('print',),(),"somefile.pyu","empty_set_literal",123,b"\x00\x01"),globals(),"empty_set_literal")

I got most of the args for the code() constructor by disassembling the function, using a one-element set, and then manually edited the code afterward. It does appear to work:

>>> dis.dis(empty_set_literal)
 124           0 LOAD_GLOBAL              0 (print)
               3 LOAD_CONST               1 ("I'm an empty set!")
               6 BUILD_SET                0
               9 CALL_FUNCTION            2 (2 positional, 0 keyword pair)
              12 POP_TOP
              13 LOAD_CONST               0 (None)
              16 RETURN_VALUE
>>> empty_set_literal()
I'm an empty set! set()

Given that the purpose of this is to make something executable, not something readable (in contrast to, say, 2to3), I don't think it would be a problem to have nightmare-level code in there occasionally. That said, I'm not particularly in favour of the proposal - I just felt like answering this part of it :) ChrisA From rosuav at gmail.com Sat Jun 28 09:26:11 2014 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 28 Jun 2014 17:26:11 +1000 Subject: [Python-ideas] .pyu nicode syntax symbols (was Re: Empty set, Empty dict) In-Reply-To: References: <1403931602.14407.135458493.44CF193B@webmail.messagingengine.com> Message-ID: On Sat, Jun 28, 2014 at 3:28 PM, Chris Angelico wrote: > On Sat, Jun 28, 2014 at 3:00 PM, wrote: >> On Sun, Jun 22, 2014, at 16:18, Terry Reedy wrote: >> >>> 2. Write pyu.py. It should first translate x.pyu to the equivalent x.py. >> >> What is the equivalent x.py for "BUILD_SET 0" rather than "LOAD_GLOBAL >> (set), CALL_FUNCTION 0"? > > Is there any reason that it has to be normal-looking source code? Here's a POC translator. 
Give it a string with the source code for one function, and it'll give back a string that'll generate a similar function. Currently assumes it's working at top level - doesn't handle nested functions, methods, etc, etc. But it seems to work. https://github.com/Rosuav/shed/blob/master/empty_set.py ChrisA From jsbfox at gmail.com Sat Jun 28 10:04:24 2014 From: jsbfox at gmail.com (Thomas Allen) Date: Sat, 28 Jun 2014 01:04:24 -0700 Subject: [Python-ideas] Special keyword denoting an infinite loop Message-ID: Rust language defines a special way to make an infinite loop ( http://doc.rust-lang.org/tutorial.html#loops). I propose adding the same keyword to Python. It will be very useful for WSGI servers and will suit as a more convenient replacement for recursion (since Python doesn't do TRE). I personally find it much prettier than *while True* or *while 1*. It won't cause any problems with existing programs, because *loop* is very rarely used as a variable name. For instance

    while True:
        do_something()
        do_something_else()

would turn to

    loop:
        do_something()
        do_something_else()

-------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sat Jun 28 11:11:12 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 28 Jun 2014 19:11:12 +1000 Subject: [Python-ideas] Special keyword denoting an infinite loop In-Reply-To: References: Message-ID: <20140628091112.GI13014@ando> On Sat, Jun 28, 2014 at 01:04:24AM -0700, Thomas Allen wrote: > Rust language defines a special way to make an infinite loop ( > http://doc.rust-lang.org/tutorial.html#loops). Do they give an explanation for why they use a keyword for such a redundant purpose? > I propose adding the same keyword to Python. It will be very useful for > WSGI servers and will suit as a more convenient replacement for recursion > (since Python doesn't do TRE). 
I understand that *infinite loops* themselves are useful, and that recursion can be replaced by iteration, but how does the "loop" keyword solve these issues better than "while True"? > I personally find it much prettier than *while > True* or *while 1*. If the only advantage of this is that you personally find it prettier, then I'm a strong -1 on this suggestion.

* I personally find it less elegant than "while True". "while True" tells you explicitly what it does: it's a while loop, and it operates while True is true (i.e. forever). "loop" looks like an incomplete line: what sort of loop, while, repeat or for? Loop for how long? It's all implicit.

* It's yet another special keyword to memorise. It doesn't eliminate the need to know "while", or to know "True", and it gives you no extra benefit. It's just completely redundant.

Suppose "loop" becomes a keyword in Python 3.5. That means that every existing Python program that uses "loop" as a function or variable cannot work in Python 3.5. It also means that any Python 3.5 code that uses the "loop" keyword cannot run on earlier versions of Python. Adding new keywords is only done for the most critical reasons, or when there is no other good alternative, not just on a whim. There is already a perfectly good way to write infinite loops, adding the "loop" keyword doesn't add anything to the language, it just breaks working code for the sake of a minor, cosmetic change. > It won't cause any problems with existing programs, > because *loop* is very rarely used as a variable name. How do you know it is rare? 
I've written code where loop is a name:

def main():
    setup()
    loop()

-- Steven From stefan_ml at behnel.de Sat Jun 28 12:05:43 2014 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 28 Jun 2014 12:05:43 +0200 Subject: [Python-ideas] Special keyword denoting an infinite loop In-Reply-To: <20140628091112.GI13014@ando> References: <20140628091112.GI13014@ando> Message-ID: Steven D'Aprano, 28.06.2014 11:11: > On Sat, Jun 28, 2014 at 01:04:24AM -0700, Thomas Allen wrote: >> It won't cause any problems with existing programs, >> because *loop* is very rarely used as a variable name. > > How do you know it is rare? I've written code where loop is a name:
>
> def main():
>     setup()
>     loop()

Also, this example only uses English names. There is no reason to assume that programmers with other native languages that (also) use ASCII letters or transliterations would not happen to have a word spelled "loop" in their language that they may commonly use in their programs. Or that programmers with a lower level of proficiency in the English language would also not consider "loop" a good name for a variable or function. Or that there is no technical/business/science/social/you-name-it terminology whatsoever that makes "loop" appear as the most obvious choice for a name in a program of that domain. "Obvious Reasoning" easily fails when it comes to understanding naming decisions, especially across cultural boundaries. Adding a new keyword needs very serious reasoning, and that's a good thing. Stefan From rosuav at gmail.com Sat Jun 28 12:07:17 2014 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 28 Jun 2014 20:07:17 +1000 Subject: [Python-ideas] Special keyword denoting an infinite loop In-Reply-To: References: Message-ID: On Sat, Jun 28, 2014 at 6:04 PM, Thomas Allen wrote: > I personally find it much prettier than while True or while 1. 
One common technique I've seen is the self-documenting infinite loop:

while "more work to be done":
    get_work()
    do_work()

If you're worried about the prettiness of "while True", this might help. Since any non-empty string counts as true, this can add a bit more information without disrupting the loop itself. ChrisA From ncoghlan at gmail.com Sat Jun 28 12:24:40 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Jun 2014 20:24:40 +1000 Subject: [Python-ideas] Special keyword denoting an infinite loop In-Reply-To: References: Message-ID: On 28 June 2014 18:04, Thomas Allen wrote: > Rust language defines a special way to make an infinite loop > (http://doc.rust-lang.org/tutorial.html#loops). > > I propose adding the same keyword to Python. It will be very useful for WSGI > servers and will suit as a more convenient replacement for recursion (since > Python doesn't do TRE). I personally find it much prettier than while True > or while 1. It won't cause any problems with existing programs, because loop > is very rarely used as a variable name. "won't cause any problems" does not mesh with an assertion of "very rarely used" on two counts:

- "very rarely" means it *will* cause problems for at least some programs
- the "very rarely used" assertion isn't backed by any analysis

However, it's a useful example for illustrating some good questions to ask about any proposals to change the language:

1. What else will have to change as a consequence?
2. Who will be hurt by this change, and how much will they be hurt?
3. Who will gain from this change, and how much will they gain?

I'm going to work through and answer all of these for this proposal - this isn't to pick on you, it's to show the kind of thinking that may lie behind a terse "No" or "That's a terrible idea" when a dev is pressed for time and isn't able to write out their full rationale for disliking a suggestion :) Starting from the top:

1. What else will have to change as a consequence? 
In this case, a quick search over CPython itself for "loop" variables finds:

- the "asyncore.loop" public API
- parameters named "loop" in the asyncio public API

Really, we can stop there - a new keyword that conflicts with public APIs in the standard library just won't happen without an extraordinarily compelling reason, and it's unlikely such a reason is going to be suddenly discovered for a language that has already been around for more than 20 years. However, I'll continue on to illustrate how even a quick check like running "pss --python loop" from a CPython checkout (which is all I did to come up with these examples) can recalibrate our intuitions about variable names and the impact of introducing new keywords. Additional uses of "loop" as a name in CPython:

- many internal variables named "loop" in asyncio and its test suite
- a call to asyncore.loop in the smtpd standard library module
- a "loop" counter in the hashlib standard library module
- calls to asyncore.loop in the test suite for the asyncore standard library module
- a call to asyncore.loop in the test suite for the asynchat standard library module
- a call to asyncore.loop in the test suite for the poplib standard library module
- a call to asyncore.loop in the test suite for the ftplib standard library module
- a call to asyncore.loop in the test suite for the logging standard library module
- a call to asyncore.loop in the test suite for the ssl standard library module
- a call to asyncore.loop in the test suite for the os standard library module
- a "loop" attribute in the test suite for the cyclic garbage collector
- a "loop" variable in the test suite for the faulthandler module
- a "loop" variable in the test suite for the signal module
- a "loop" variable in the ccbench tool (used to check GIL tuning parameters)

In addition to the above cases that actually *do* use "loop", there are plenty of other cases called things like "_loop", "mainloop" or "cmdloop", that could easily have been 
called just "loop" instead. The "very rarely used" claim doesn't hold up, even just looking at the standard library. It's a relatively *domain specific* variable name, but that's not the same as being rare - in the applicable domain, it gets used a *lot*. The search shows that "loop" is also used in many comments as a generic term, and I know from personal experience that it is often used as an umbrella term where saying "loop statement" encompasses both for loops and while loops.

2. Who will be hurt by this change, and how much will they be hurt?

- anyone affected by the backwards compatibility break for asyncore and asyncio
- anyone with an existing variable called "loop" (which includes the core dev team)
- anyone used to using "loop statement" as an umbrella term (which includes me)
- anyone tasked with explaining why there's a dedicated alternative spelling for "while True:" and "while 1:" in a way that students can grasp easily (the compiler can already detect and optimise them with their existing spelling, so the keyword isn't needed for that. Even static analysis tools can pick up the explicit infinite loops pretty easily. That only leaves the readability argument, which has a certain amount of merit as described below)

3. Who will gain from this change, and how much will they gain?

- future learners of Python may more easily grasp that "while True:"/"while 1:" infinite loops tend to serve a fundamentally different purpose than normal while loops.

Unfortunately, Rust chooses to allow both the "infinite loop" and the normal "while loop" to be used to implement loop-and-a-half semantics, so it doesn't actually make that distinction - "loop" is literally just an alternative spelling of "while true", that provides no additional hints as to whether or not "break" might be present in the loop body. 
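Nick's parenthetical in item 2 above - that the compiler already detects and optimises the existing spellings - is easy to check with the dis module. A quick sketch (the function names are just for illustration): on modern CPython, a constant-true condition leaves no test in the bytecode at all, while a real condition does:

```python
import dis

def forever():
    while True:  # constant-true condition: no runtime test is emitted
        break

def guarded(flag):
    while flag:  # real condition: the test survives into the bytecode
        break

# "while True" leaves no trace of the condition: nothing is compared,
# no global is looked up, and no conditional jump is emitted.
forever_ops = [i.opname for i in dis.get_instructions(forever)]
assert not any(op.startswith(('LOAD_GLOBAL', 'COMPARE_OP', 'POP_JUMP'))
               for op in forever_ops)

# "while flag" must load the argument and jump on its truthiness.
guarded_ops = [i.opname for i in dis.get_instructions(guarded)]
assert 'LOAD_FAST' in guarded_ops
assert any(op.startswith('POP_JUMP') for op in guarded_ops)
```

So a dedicated keyword would buy no new optimisation opportunity; the exact opcodes vary between CPython versions, but the absence of a condition test for "while True" does not.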
For Python, the backwards compatibility issues make the idea of "loop" as a new keyword a clear loss, and there's insufficient gain in the idea in general to be worth pursuing it further. I don't think the referenced feature actually makes much sense as part of Rust either, but starting afresh means it is at least harmless, albeit a little redundant. If it disallowed "break", you'd at least have a clear indicator that "this is the last statement in this execution unit - the only way out now is to return to our caller". Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From breamoreboy at yahoo.co.uk Sat Jun 28 14:34:22 2014 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Sat, 28 Jun 2014 13:34:22 +0100 Subject: [Python-ideas] Special keyword denoting an infinite loop In-Reply-To: References: Message-ID: On 28/06/2014 09:04, Thomas Allen wrote: > Rust language defines a special way to make an infinite loop > (http://doc.rust-lang.org/tutorial.html#loops). > > I propose adding the same keyword to Python. It will be very useful for > WSGI servers and will suit as a more convenient replacement for > recursion (since Python doesn't do TRE). I personally find it much > prettier than /while True/ or /while 1/. It won't cause any problems > with existing programs, because /loop/ is very rarely used as a variable > name. > > For instance
>
>     while True:
>         do_something()
>         do_something_else()
>
> would turn to
>
>     loop:
>         do_something()
>         do_something_else()

No thank you, my standard answer applies. I prefer Python in a Nutshell to fit in my pocket, not the back of a 40 ton articulated lorry. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence --- This email is free from viruses and malware because avast! Antivirus protection is active. 
http://www.avast.com From jeanpierreda at gmail.com Sat Jun 28 14:53:08 2014 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Sat, 28 Jun 2014 05:53:08 -0700 Subject: [Python-ideas] Special keyword denoting an infinite loop In-Reply-To: <20140628091112.GI13014@ando> References: <20140628091112.GI13014@ando> Message-ID: On Sat, Jun 28, 2014 at 2:11 AM, Steven D'Aprano wrote: > On Sat, Jun 28, 2014 at 01:04:24AM -0700, Thomas Allen wrote: >> Rust language defines a special way to make an infinite loop ( >> http://doc.rust-lang.org/tutorial.html#loops). > > Do they give an explanation for why they use a keyword for such a > redundant purpose? Sure. "while true {...}" would require magic by the compiler to make it optimized and to make things like the following pass compile-time checks: "let a; while true { a = 1; break;}; return a". With a while loop, Rust can't really know that the loop executes even once without special-casing the argument, so it emits a compile-time error because the variable a might be uninitialized. If Rust magically knew about while true, then it becomes confusing if replacing "true" with something the compiler doesn't directly understand causes the compiler to get confused. Special cases aren't special enough to break the rules, so Rust decides that the special case here deserves its own keyword. In Python, there is no special case at all, so there is no extra keyword. As it should be. Rust has some ideas Python could borrow, but this ain't one of them. -1. 
-- Devin From ncoghlan at gmail.com Sat Jun 28 17:14:53 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Jun 2014 01:14:53 +1000 Subject: [Python-ideas] Special keyword denoting an infinite loop In-Reply-To: <1403967419.84811.YahooMailNeo@web122102.mail.ne1.yahoo.com> References: <1403967419.84811.YahooMailNeo@web122102.mail.ne1.yahoo.com> Message-ID: On 29 June 2014 00:56, Benny Khoo wrote: > rather than a special keyword in Python, how about having Python to support > the concept of passing block (a group of statements) as argument? I thought > that can be quite elegant solution. So a loop statement can be interpreted > simply as a function that accept a block e.g. loop [block]? > > Supporting block has a lot of practical applications. I remember seeing some > special purpose flow control functions as early as Tcl. We also see it in > Ruby and the more recently the new Swift language. This is a well worn path, and it's difficult to retrofit to an existing language. Ruby, at least, relies heavily on a convention of taking blocks as the last argument to a function to make things work, which is a poor fit to Python's keyword arguments and far more varied positional signatures for higher order functions. PEP 403 and PEP 3150 are a couple of different explorations of the idea a more block-like feature. http://python-notes.curiousefficiency.org/en/latest/pep_ideas/suite_expr.html is one that goes even further to consider a delineated subsyntax for Python that would allow entire suites as expressions. However, the stumbling block all these proposals tend to hit is that proponents really, really, struggle to come up with compelling use cases where "just define a named function" isn't a clearer and easier to understand answer. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Jun 28 17:16:20 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Jun 2014 01:16:20 +1000 Subject: [Python-ideas] Special keyword denoting an infinite loop In-Reply-To: References: <20140628091112.GI13014@ando> Message-ID: On 28 June 2014 22:53, Devin Jeanpierre wrote: > On Sat, Jun 28, 2014 at 2:11 AM, Steven D'Aprano wrote: >> On Sat, Jun 28, 2014 at 01:04:24AM -0700, Thomas Allen wrote: >>> Rust language defines a special way to make an infinite loop ( >>> http://doc.rust-lang.org/tutorial.html#loops). >> >> Do they give an explanation for why they use a keyword for such a >> redundant purpose? > > Sure. "while true {...}" would require magic by the compiler to make > it optimized and to make things like the following pass compile-time > checks: "let a; while true { a = 1; break;}; return a". With a while > loop, Rust can't really know that the loop executes even once without > special-casing the argument, so it emits a compile-time error because > the variable a might be uninitialized. If Rust magically knew about > while true, then it becomes confusing if replacing "true" with > something the compiler doesn't directly understand causes the compiler > to get confused. > > Special cases aren't special enough to break the rules, so Rust > decides that the special case here deserves its own keyword. Ah, that makes a lot of sense - I forgot that it wouldn't be just an optimisation for Rust, but a control flow validity checking change as well. Thanks for the explanation. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephen at xemacs.org Sat Jun 28 17:48:01 2014 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Sun, 29 Jun 2014 00:48:01 +0900 Subject: [Python-ideas] problems with import In-Reply-To: References: <20140627171215.GG13014@ando> <20140627191926.GH13014@ando> Message-ID: <87ionlvxvy.fsf@uwakimon.sk.tsukuba.ac.jp> Neal Becker writes: > I think the semantics we'd want is that the 2 trees are effectively > merged, with the local overriding in the event of a conflict Maybe. XEmacs does such overriding, and my experience is that users expect DWIM behavior (the "best" version gets used). Typically local trees get out of date and may not be compatible with newer versions of the main tree updated by the OS's PMS, etc. It makes things hard to diagnose. From abarnert at yahoo.com Sat Jun 28 23:33:58 2014 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 28 Jun 2014 14:33:58 -0700 Subject: [Python-ideas] Special keyword denoting an infinite loop In-Reply-To: References: <1403967419.84811.YahooMailNeo@web122102.mail.ne1.yahoo.com> Message-ID: On Jun 28, 2014, at 8:14, Nick Coghlan wrote: > On 29 June 2014 00:56, Benny Khoo wrote: >> rather than a special keyword in Python, how about having Python to support >> the concept of passing block (a group of statements) as argument? I thought >> that can be quite elegant solution. So a loop statement can be interpreted >> simply as a function that accept a block e.g. loop [block]? >> >> Supporting block has a lot of practical applications. I remember seeing some >> special purpose flow control functions as early as Tcl. We also see it in >> Ruby and the more recently the new Swift language. > > This is a well worn path, and it's difficult to retrofit to an > existing language. Ruby, at least, relies heavily on a convention of > taking blocks as the last argument to a function to make things work, > which is a poor fit to Python's keyword arguments and far more varied > positional signatures for higher order functions. Since Benny mentioned Swift, it's probably worth following up on that. 
Swift doesn't actually have blocks* (somewhat surprising, since Apple previously added blocks to ObjC and even C); it has a clever way of getting all the benefits of blocks without the downsides. Could there be something for Python there? In Ruby (and ObjC) blocks and functions are different types of things. They're defined, called, and passed differently; they have different scope semantics (functions can't capture local variables, blocks can); they can't even easily be converted to each other. Like Python, Swift functions are closures. Swift functions can be defined in two ways, but either way, they're the same kind of function. Just like Python's def and lambda. Their func statement is almost exactly like our def statement except with braces. Their inline closure expression is similar to our lambda expression, but with some major differences (most of which have actually been proposed for Python): no keyword to introduce the expression, params go inside the braces, params can be anonymous (so { $1 + $2 } is a complete definition, equivalent to lambda _1, _2: _1 + _2), and of course they can be multiline and contain statements. (And like ours, return isn't necessary.) Then Swift added one tiny piece of syntactic sugar: if an anonymous function definition is the last argument in a function call, it can go outside the parens. So, you can write this:

    reduce(myArray, 0) { $1 + $2 }
    myArray.filter { $1 >= 0 }

That looks just like Ruby blocks, but it's still just functions. So if you already have a function defined out of line (or a bound method, or a function you received from elsewhere and stored in a variable, or whatever), you don't need to wrap it in a block, you just pass it:

    reduce(myArray, "", smartConcat)
    myArray.filter(myPredicate)

And that looks just like Python or Lisp. And if you want to write a function that takes two functions, with an optional keyword argument after them, it can still take them both inline, quite readable. 
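For comparison, the two Swift calls above spell as plain higher-order calls in today's Python - a sketch, with a summing reduce and a non-negative filter standing in for the Swift closures (note functools.reduce takes the function first and the seed last):

```python
from functools import reduce

my_array = [3, -1, 4, -1, 5]

# Swift: reduce(myArray, 0) { $1 + $2 }
total = reduce(lambda x, y: x + y, my_array, 0)

# Swift: myArray.filter { $1 >= 0 }
non_negative = list(filter(lambda x: x >= 0, my_array))

assert total == 10
assert non_negative == [3, 4, 5]
```

The lambda sits inline in the argument list rather than trailing after the parens, which is exactly the stylistic gap the trailing-closure sugar addresses.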
Ruby users claim they don't miss this ability, but anyone who uses promises in JS (or anything at all in Haskell) can think of dozens of times they passed non-final function arguments today, and wouldn't be happy with an API that made that impossible. So, if we adopted Nick's not-really-serious proposals for omitting lambda before the colon when it's not syntactically ambiguous and allowing anonymous _1-style params, and added the syntactic sugar to allow lambdas as final arguments to come outside the parens, would that make Python better?

    reduce(my_list, 0, lambda x, y: x + y)
    reduce(my_list, 0) :_1 + _2

I think the general agreement on the first two changes was that, while they can be nice in a few cases, they can also be very ugly--and, more importantly, simple cases are already good enough today (see below), while more complex cases can't be done without multiline lambdas so there's no help. And I don't think the last bit of sugar sways things. If we had a way to do multiline (statement-having) lambdas, that might be another story. But after years of trying, nobody's come up with a good solution, so you can't just assume that's solvable. Solve that first, and then we should definitely look at how the Swift stuff could be added on top of that. Meanwhile, I think Nick's PEP 403, or something like it, is both more general and more Pythonic. Instead of trying to find a better way to embed function definitions into expressions, find a way to lift any subexpression out of an expression and define it (normally) after the current statement, and you solve the current problem for free, and a bunch of other problems too. (Although, unfortunately, none that really seem to demand solutions in practical code.) --- * I lied a little at the top. Because Swift compiles into code that interacts with the ObjC runtime and Apple's blocks-filled C frameworks, of course it has to deal with ObjC blocks, in both directions. 
But it does this the same way it deals with C functions--by transparently bridging them to plain-old Swift functions. Unless you want to dig into the bridging using as-yet-undocumented stdlib functions, you never see blocks anywhere. From abarnert at yahoo.com Sun Jun 29 00:02:29 2014 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 28 Jun 2014 15:02:29 -0700 Subject: [Python-ideas] problems with import In-Reply-To: <87ionlvxvy.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20140627171215.GG13014@ando> <20140627191926.GH13014@ando> <87ionlvxvy.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <1403992949.43852.YahooMailNeo@web181006.mail.ne1.yahoo.com> On Saturday, June 28, 2014 8:48 AM, Stephen J. Turnbull wrote: >Neal Becker writes: > >> I think the semantics we'd want is that the 2 trees are effectively >> merged, with the local overriding in the event of a conflict > >Maybe. XEmacs does such overriding, and my experience is that users >expect DWIM behavior (the "best" version gets used). Typically local >trees get out of date and may not be compatible with newer versions of >the main tree updated by the OS's PMS, etc. It makes things hard to >diagnose. Isn't Python already flexible enough here? Whichever one comes first in sys.path wins. If you (as a distro packager, a sysadmin, a user, or even a program at runtime) want to customize that order, it's trivial to do so. If even that isn't good enough, you can replace almost any piece of the import machinery pretty easily--you could write a custom finder that finds all versions and picks the one with the newest timestamp, or whatever you think is better. So, what do people want here that Python doesn't do? Meanwhile, what I personally prefer is to let local beat global, as long as I have an easy way to check when that may not be a good idea anymore. For example, I had PyObjC 2.5.0 installed for my system Python 2.7, back when Apple was using 2.4.something. 
At some point, Apple switched to 2.5.1, so after installing that system upgrade, my local copy was no longer needed, or wanted. Fortunately, I have a script that I wrote for just that occasion that told me about it so I could uninstall it. I definitely wouldn't want Python to automatically start ignoring my 2.5.0 because there's a 2.5.1. I'd love it if something (Python, Apple's installer, a script that came with either Python or OS X, whatever) would alert me to the problem so I didn't need my own script, but I'm not expecting that.

Also, what's right for my primary dev machine is not necessarily what's right for my company's testing systems or live deployed servers, or our customers' disparate systems; all I can really do there is require PyObjC 2.5.0 and let whoever's in charge of the machine figure out that they may be able to get that by uninstalling instead of upgrading. Anything that tries to DWIM its way out of that problem is going to get things mysteriously wrong as often as it helps.

From ncoghlan at gmail.com  Sun Jun 29 07:19:07 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 29 Jun 2014 15:19:07 +1000
Subject: [Python-ideas] Special keyword denoting an infinite loop
In-Reply-To: 
References: <1403967419.84811.YahooMailNeo@web122102.mail.ne1.yahoo.com>
Message-ID: 

On 29 June 2014 07:33, Andrew Barnert wrote:
> Meanwhile, I think Nick's PEP 403, or something like it, is both more
> general and more Pythonic. Instead of trying to find a better way to embed
> function definitions into expressions, find a way to lift any subexpression
> out of an expression and define it (normally) after the current statement,
> and you solve the current problem for free, and a bunch of other problems
> too. (Although, unfortunately, none that really seem to demand solutions in
> practical code.)
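For readers who haven't followed the two PEPs, the shapes being compared look roughly like this (adapted from the examples in the PEP drafts; `data` and `sort_key` are placeholder names, and neither proposed form is valid Python today):

```
# Status quo: the one-shot function must be named and defined first.
def sort_key(item):
    return item.attr1, item.attr2
sorted_data = sorted(data, key=sort_key)

# PEP 403 ("@in" clause): the statement comes first, the definition after.
@in sorted_data = sorted(data, key=sort_key)
def sort_key(item):
    return item.attr1, item.attr2

# PEP 3150 ("given" clause): the statement comes first, with a suite
# supplying its forward references.
sorted_data = sorted(data, key=sort_key) given:
    def sort_key(item):
        return item.attr1, item.attr2
```

In all three, sort_key exists only to be passed to sorted(); the PEPs differ in whether its definition can follow its sole use.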
On that last point, one of my goals at SciPy next month will be to encourage folks in the scientific community that are keen to see something resembling block support in Python to go hunting for compelling *use cases*. The fatal barrier to proposals like PEP 403 and 3150 has long been that there are other options already available, so the substantial additional complexity they introduce isn't adequately justified. The two main stumbling blocks:

- generators-as-coroutines already offer a way of suspending execution of a sequential operation, as embodied in asyncio.coroutine and contextlib.contextmanager
- nested definitions of named functions are usually a readable alternative in the cases lambdas can't handle

The reason I occasionally spend time on PEPs 403 and 3150 is because I think we're missing a case where "one shot" functions could be handled more gracefully - situations where we're defining a function solely because we want to pass it to other code as an object at runtime, not because we need to reference it at multiple places in the *source* code.

That's a pretty narrow niche, though - if you *do* need to invoke the same code in multiple places, then a named function is always going to be better, even if dedicated one-shot function support is available.

Regards,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From abarnert at yahoo.com  Sun Jun 29 09:22:02 2014
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sun, 29 Jun 2014 00:22:02 -0700
Subject: [Python-ideas] Special keyword denoting an infinite loop
In-Reply-To: 
References: <1403967419.84811.YahooMailNeo@web122102.mail.ne1.yahoo.com>
Message-ID: <1404026522.94328.YahooMailNeo@web181001.mail.ne1.yahoo.com>

Sorry, just realized I left out the example I meant to give.
I'll insert it below:

On Saturday, June 28, 2014 2:37 PM, Andrew Barnert wrote:

[snip]

>And if you want to write a function that takes two functions, with an
>optional keyword argument after them, it can still take them both
>inline, quite readable.

Ruby users claim they don't miss this ability, but anyone who uses promises in JS (or anything at all in Haskell) can think of dozens of times they passed non-final function arguments today, and wouldn't be happy with an API that made that impossible.

Here's some slightly simplified real-life JS code using Promises:

    db.select_one(sql, thingid)
    .then(function(rowset) { return rowset[0]['foo']; }, log_error);

Here's what the same code looks like with a Ruby port of Promises:

    db.select_one(sql, thingid)
    .then {|rowset| rowset[0]['foo']}
    .then(nil, proc {|err| log_error(err)})

I think this shows why blocks are a second-rate substitute for first-class, closure-capturing, inline-definable functions (which Ruby and ObjC don't have, but JS and Swift do). The only reason anyone should want blocks in Python is if they're convinced that it's impossible to come up with a clean syntax for multiline lambdas, but it's easy to come up with one for multiline blocks.

From hernan.grecco at gmail.com  Sun Jun 29 14:53:31 2014
From: hernan.grecco at gmail.com (Hernan Grecco)
Date: Sun, 29 Jun 2014 09:53:31 -0300
Subject: [Python-ideas] Special keyword denoting an infinite loop
In-Reply-To: 
References: <1403967419.84811.YahooMailNeo@web122102.mail.ne1.yahoo.com>
Message-ID: 

Hi

On Sun, Jun 29, 2014 at 2:19 AM, Nick Coghlan wrote:
> On that last point, one of my goals at SciPy next month will be to
> encourage folks in the scientific community that are keen to see
> something resembling block support in Python to go hunting for
The fatal barrier to proposals like PEP 403 > and 3150 has long been that there are other options already available, > so the substantial additional complexity they introduce isn't > adequately justified. The two main stumbling blocks: What is the status of PEP 3150? I remember reading that you were withdrawing 3150 in favor of 403 but this is not reflected in http://legacy.python.org/dev/peps/pep-3150/. cheers, Hern?n From ncoghlan at gmail.com Sun Jun 29 15:37:43 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Jun 2014 23:37:43 +1000 Subject: [Python-ideas] Special keyword denoting an infinite loop In-Reply-To: References: <1403967419.84811.YahooMailNeo@web122102.mail.ne1.yahoo.com> Message-ID: On 29 Jun 2014 23:00, "Hernan Grecco" wrote: > > Hi > > On Sun, Jun 29, 2014 at 2:19 AM, Nick Coghlan wrote: > > On that last point, one of my goals at SciPy next month will be to > > encourage folks in the scientific community that are keen to see > > something resembling block support in Python to go hunting for > > compelling *use cases*. The fatal barrier to proposals like PEP 403 > > and 3150 has long been that there are other options already available, > > so the substantial additional complexity they introduce isn't > > adequately justified. The two main stumbling blocks: > > What is the status of PEP 3150? I remember reading that you were > withdrawing 3150 in favor of 403 but this is not reflected in > http://legacy.python.org/dev/peps/pep-3150/. I see merit in both alternatives, so I still update both of them occasionally. I did withdraw 3150 at one point, but I later figured out a possible solution to the previously fatal flaw in its namespace handling semantics and moved it back to Deferred. I tend not to announce any updates to either of them, since they'll remain pure speculation in the absence of clear use cases where they would provide a compelling readability benefit over the status quo. Cheers, Nick. 
>
> cheers,
>
> Hernán
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From j.wielicki at sotecware.net  Sun Jun 29 15:52:38 2014
From: j.wielicki at sotecware.net (Jonas Wielicki)
Date: Sun, 29 Jun 2014 15:52:38 +0200
Subject: [Python-ideas] Special keyword denoting an infinite loop
In-Reply-To: 
References: <1403967419.84811.YahooMailNeo@web122102.mail.ne1.yahoo.com>
Message-ID: <53B01A26.3050309@sotecware.net>

On 29.06.2014 15:37, Nick Coghlan wrote:
> On 29 Jun 2014 23:00, "Hernan Grecco" wrote:
>>
>> Hi
>>
>> On Sun, Jun 29, 2014 at 2:19 AM, Nick Coghlan wrote:
>>> On that last point, one of my goals at SciPy next month will be to
>>> encourage folks in the scientific community that are keen to see
>>> something resembling block support in Python to go hunting for
>>> compelling *use cases*. The fatal barrier to proposals like PEP 403
>>> and 3150 has long been that there are other options already available,
>>> so the substantial additional complexity they introduce isn't
>>> adequately justified. The two main stumbling blocks:
>>
>> What is the status of PEP 3150? I remember reading that you were
>> withdrawing 3150 in favor of 403 but this is not reflected in
>> http://legacy.python.org/dev/peps/pep-3150/.
>
> I see merit in both alternatives, so I still update both of them
> occasionally. I did withdraw 3150 at one point, but I later figured out a
> possible solution to the previously fatal flaw in its namespace handling
> semantics and moved it back to Deferred.

It is still written in the abstract of 403 that 3150 was withdrawn.

regards,
jwi

p.s.: while I'm at it, in the "Explaining Decorator Clause Evaluation and Application", 3150 is missing a ?, I think, and in the "Out of Order Execution"
section there seems to be a markup issue after the second code block

> I tend not to announce any updates to either of them, since they'll remain
> pure speculation in the absence of clear use cases where they would provide
> a compelling readability benefit over the status quo.
>
> Cheers,
> Nick.

From random832 at fastmail.us  Mon Jun 30 19:18:10 2014
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Mon, 30 Jun 2014 13:18:10 -0400
Subject: [Python-ideas] .pyu nicode syntax symbols (was Re: Empty set, Empty dict)
In-Reply-To: 
References: <1403931602.14407.135458493.44CF193B@webmail.messagingengine.com>
Message-ID: <1404148690.18766.136186337.62A26E9E@webmail.messagingengine.com>

On Sat, Jun 28, 2014, at 01:28, Chris Angelico wrote:
> empty_set_literal =
> type(lambda:0)(type((lambda:0).__code__)(0,0,0,3,67,b't\x00\x00d\x01\x00h\x00\x00\x83\x02\x00\x01d\x00\x00S',(None,"I'm

If you're embedding the entire compiler (in fact, a modified one) in your tool, why not just output a .pyc?

From random832 at fastmail.us  Mon Jun 30 19:24:37 2014
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Mon, 30 Jun 2014 13:24:37 -0400
Subject: [Python-ideas] Special keyword denoting an infinite loop
In-Reply-To: 
References: <20140628091112.GI13014@ando>
Message-ID: <1404149077.20890.136187005.57F7B11E@webmail.messagingengine.com>

On Sat, Jun 28, 2014, at 06:05, Stefan Behnel wrote:
> Adding a new keyword needs very serious reasoning, and that's a good
> thing.
For pedantry's sake, I will note that "NAME ':'" is not a valid sequence to start a statement with today. That is, however, probably _not_ a road anyone wants to go down if there is any other option.

It's almost enough to make one wish that Python had defined an expansive set of reserved words as Javascript does - a set which might not contain "loop" but would probably contain "do".

What about _just_ "while:" or "for:"?

From steve at pearwood.info  Mon Jun 30 20:20:30 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 1 Jul 2014 04:20:30 +1000
Subject: [Python-ideas] Special keyword denoting an infinite loop
In-Reply-To: <1404149077.20890.136187005.57F7B11E@webmail.messagingengine.com>
References: <20140628091112.GI13014@ando>
 <1404149077.20890.136187005.57F7B11E@webmail.messagingengine.com>
Message-ID: <20140630182030.GR13014@ando>

On Mon, Jun 30, 2014 at 01:24:37PM -0400, random832 at fastmail.us wrote:
> On Sat, Jun 28, 2014, at 06:05, Stefan Behnel wrote:
> > Adding a new keyword needs very serious reasoning, and that's a good
> > thing.
[...]
> What about _just_ "while:" or "for:"?

Why bother? Is there anything you can do with a bare "while:" that you can't do with "while True:"? If not, what's the point?

--
Steven
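For reference, the pattern the whole thread is abbreviating is the "loop forever, break from the middle" idiom. A trivial, self-contained sketch (the function and data are made up for illustration):

```python
def read_until_quit(items):
    # The "while True: ... break" shape that a bare "while:" (or a
    # dedicated "loop" keyword) would merely spell differently.
    seen = []
    it = iter(items)
    while True:
        item = next(it, None)  # None acts as an end-of-input sentinel
        if item is None or item == "quit":
            break
        seen.append(item)
    return seen

print(read_until_quit(["a", "b", "quit", "c"]))  # ['a', 'b']
```

As a side note, CPython already optimizes away the constant test in a `while True:` loop, so a dedicated keyword would save keystrokes, not cycles.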