From steve at pearwood.info Sat Jul 1 02:13:41 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 1 Jul 2017 16:13:41 +1000 Subject: [Python-ideas] + operator on generators In-Reply-To: <20170630010951.3fc7d17b@grzmot> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170630010951.3fc7d17b@grzmot> Message-ID: <20170701061339.GN3149@ando.pearwood.info> On Fri, Jun 30, 2017 at 01:09:51AM +0200, Jan Kaliszewski wrote: > But implementation of the OP's proposal does not need to be based on > __add__ at all. It could be based on extending the current behaviour of > the `+` operator itself. > > Now this behavior is (roughly): try left side's __add__, if failed try > right side's __radd__, if failed raise TypeError. > > New behavior could be (again: roughly): try left side's __add__, if > failed try right side's __radd__, if failed try __iter__ of both sides > and chain them (creating a new iterator?), if failed raise TypeError. That's what I suggested earlier, except using & instead of + as the operator. The reason I suggested & instead of + is that there will be fewer clashes between iterables that already support the operator and hence fewer surprises. Using + will be a bug magnet. Consider: it = x + y # chain two iterables first = next(it, "default") That code looks pretty safe, but it's actually a minefield waiting to blow you up. It works fine if you pass (let's say) a generator object and a string, or a list and an iterator, but if x and y happen to both be strings, or both lists, or both tuples, the + operator will concatenate them instead of chaining them, and the call to next will blow up. So you would have to write: it = iter(x + y) # chain two iterables, and ensure the result is an iterator to be sure. Which is not a huge burden, but it does take away the benefit of having an operator. In that case, you might as well do: it = chain(x, y) and be done with it. 
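The minefield is easy to demonstrate in today's Python (a small sketch; the proposed `+`-as-chaining does not exist, so this shows the concatenation surprise directly):

```python
from itertools import chain

x, y = "abc", "xyz"          # both operands happen to be strings

it = x + y                   # + concatenates: "abcxyz" -- a str, not an iterator
try:
    next(it, "default")      # blows up: str is not an iterator
except TypeError:
    pass

# Wrapping in iter() is safe regardless of the operand types...
assert next(iter(x + y), "default") == "a"

# ...but at that point explicit chaining is just as short and unambiguous:
assert next(chain(x, y), "default") == "a"
```

The same surprise occurs when both operands are lists or both are tuples, where + also means concatenation.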
It's true that exactly the same potential problem occurs with & but it's less likely. Strings, tuples, lists and other sequences don't typically support __(r)and__ and the & operator, so you're less likely to be burned. Still, the possibility is there. Maybe we should use a different operator. ++ is out because that already has meaning, so that leaves either && or inventing some arbitrary symbol. But the more I think about it the more I agree with Nick. Let's start by moving itertools.chain into built-ins, with zip and map, and only consider giving it an operator after we've had a few years of experience with chain as a built-in. We might even find that an operator doesn't add any real value. > ? Preferably using the existing `yield from` mechanism -- because, in > case of generators, it would provide a way to combine ("concatenate") > *generators*, preserving semantics of all that their __next__(), send(), > throw() nice stuff... I don't think that would be generally useful. If you're sending values into an arbitrary generator, who knows what you're getting? chain() will operate on arbitrary iterables, you can't expect to send values into chain([1, 2, 3], my_generator(), "xyz") and have anything sensible occur. -- Steve From wes.turner at gmail.com Sat Jul 1 02:35:29 2017 From: wes.turner at gmail.com (Wes Turner) Date: Sat, 1 Jul 2017 01:35:29 -0500 Subject: [Python-ideas] + operator on generators In-Reply-To: <20170701061339.GN3149@ando.pearwood.info> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170630010951.3fc7d17b@grzmot> <20170701061339.GN3149@ando.pearwood.info> Message-ID: On Saturday, July 1, 2017, Steven D'Aprano wrote: > On Fri, Jun 30, 2017 at 01:09:51AM +0200, Jan Kaliszewski wrote: > > [...] > > But the more I think about it the more I agree with Nick. 
Let's start > by moving itertools.chain into built-ins, with zip and map, and only > consider giving it an operator after we've had a few years of experience > with chain as a built-in. We might even find that an operator doesn't > add any real value. - Would that include chain.from_iterable? - So there's then a new conditional import (e.g. in a compat package)? What does this add? > > > ? Preferably using the existing `yield from` mechanism -- because, in > > case of generators, it would provide a way to combine ("concatenate") > > *generators*, preserving semantics of all that their __next__(), send(), > > throw() nice stuff... > > I don't think that would be generally useful. Flatten one level? > > If you're sending values > into an arbitrary generator, who knows what you're getting? chain() will > operate on arbitrary iterables, you can't expect to send values into > chain([1, 2, 3], my_generator(), "xyz") and have anything sensible > occur. - is my_generator() mutable (e.g. before or during iteration)? - https://docs.python.org/2/reference/expressions.html#generator.send -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sat Jul 1 04:11:52 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 1 Jul 2017 18:11:52 +1000 Subject: [Python-ideas] + operator on generators In-Reply-To: References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170630010951.3fc7d17b@grzmot> <20170701061339.GN3149@ando.pearwood.info> Message-ID: <20170701081152.GO3149@ando.pearwood.info> On Sat, Jul 01, 2017 at 01:35:29AM -0500, Wes Turner wrote: > On Saturday, July 1, 2017, Steven D'Aprano wrote: > > > On Fri, Jun 30, 2017 at 01:09:51AM +0200, Jan Kaliszewski wrote: > > > > [...] > > > > But the more I think about it the more I agree with Nick. 
Let's start > > by moving itertools.chain into built-ins, with zip and map, and only > > consider giving it an operator after we've had a few years of experience > > with chain as a built-in. We might even find that an operator doesn't > > add any real value. > > > - Would that include chain.from_iterable? Yes. > - So there's then a new conditional import (e.g. in a compat package)? What > does this add? try: chain except NameError: from itertools import chain Two lines, if and only if you both need chain and want to support versions of Python older than 3.7. There's no need to import it if you aren't going to use it. > > > ? Preferably using the existing `yield from` mechanism -- because, in > > > case of generators, it would provide a way to combine ("concatenate") > > > *generators*, preserving semantics of all that their __next__(), send(), > > > throw() nice stuff... > > > > I don't think that would be generally useful. > > Flatten one level? Flattening typically applies to lists and sequences. I'm not saying that chain shouldn't support generators. That would be silly: a generator is an iterable and chaining supports iterables. I'm saying that it wouldn't be helpful to require chain objects to support send(), throw() etc. > > If you're sending values > > into an arbitrary generator, who knows what you're getting? chain() will > > operate on arbitrary iterables, you can't expect to send values into > > chain([1, 2, 3], my_generator(), "xyz") and have anything sensible > > occur. > > > - is my_generator() mutable (e.g. before or during iteration)? It doesn't matter. Sending into a chain of arbitrary iterators isn't a useful thing to do. > - https://docs.python.org/2/reference/expressions.html#generator.send Why are you linking to the 2 version of the docs? We're discussing a hypothetical new feature which must go into 3, not 2. 
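Written out on separate lines, the two-line shim looks like this (a sketch assuming the hypothetical built-in; on any current Python the except branch simply runs, so the snippet works today):

```python
try:
    chain                        # present as a built-in on the hypothetical future Python
except NameError:
    from itertools import chain  # every current Python falls back to this

# Either way the full API, including chain.from_iterable, is available:
assert list(chain([1, 2], (3,), "ab")) == [1, 2, 3, "a", "b"]
assert list(chain.from_iterable([[1, 2], [3, 4]])) == [1, 2, 3, 4]
```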
-- Steve From rosuav at gmail.com Sat Jul 1 04:21:29 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 1 Jul 2017 18:21:29 +1000 Subject: [Python-ideas] + operator on generators In-Reply-To: <20170701081152.GO3149@ando.pearwood.info> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170630010951.3fc7d17b@grzmot> <20170701061339.GN3149@ando.pearwood.info> <20170701081152.GO3149@ando.pearwood.info> Message-ID: On Sat, Jul 1, 2017 at 6:11 PM, Steven D'Aprano wrote: >> - So there's then a new conditional import (e.g. in a compat package)? What >> does this add? > > try: chain > except NameError: from itertools import chain > > Two lines, if and only if you both need chain and want to support > versions of Python older than 3.7. > > There's no need to import it if you aren't going to use it. > It'd be even simpler. If you want to support <3.7 and 3.7+, you write: from itertools import chain At least, I presume it isn't going to be *removed* from itertools. Promotion to builtin shouldn't break pre-existing code, so the way to be compatible with pre-promotion Pythons is simply to code for those and not take advantage of the new builtin. ChrisA From p.f.moore at gmail.com Sat Jul 1 05:30:00 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 1 Jul 2017 10:30:00 +0100 Subject: [Python-ideas] + operator on generators In-Reply-To: <20170701061339.GN3149@ando.pearwood.info> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170630010951.3fc7d17b@grzmot> <20170701061339.GN3149@ando.pearwood.info> Message-ID: On 1 July 2017 at 07:13, Steven D'Aprano wrote: > But the more I think about it the more I agree with Nick. Let's start > by moving itertools.chain into built-ins, with zip and map, and only > consider giving it an operator after we've had a few years of experience > with chain as a built-in. We might even find that an operator doesn't > add any real value. 
I'm struck here by the contrast between this and the "let's slim down the stdlib" debates we've had in the past. How difficult is it really to add "from itertools import chain" at the start of a file? It's not even as if itertools is a 3rd party dependency. Paul From wes.turner at gmail.com Sat Jul 1 10:38:02 2017 From: wes.turner at gmail.com (Wes Turner) Date: Sat, 1 Jul 2017 09:38:02 -0500 Subject: [Python-ideas] + operator on generators In-Reply-To: <20170701081152.GO3149@ando.pearwood.info> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170630010951.3fc7d17b@grzmot> <20170701061339.GN3149@ando.pearwood.info> <20170701081152.GO3149@ando.pearwood.info> Message-ID: On Saturday, July 1, 2017, Steven D'Aprano wrote: > On Sat, Jul 01, 2017 at 01:35:29AM -0500, Wes Turner wrote: > > On Saturday, July 1, 2017, Steven D'Aprano > wrote: > > > > > On Fri, Jun 30, 2017 at 01:09:51AM +0200, Jan Kaliszewski wrote: > > > > > > [...] > > > > > > But the more I think about it the more I agree with Nick. Let's start > > > by moving itertools.chain into built-ins, with zip and map, and only > > > consider giving it an operator after we've had a few years of > experience > > > with chain as a built-in. We might even find that an operator doesn't > > > add any real value. > > > > > > - Would that include chain.from_iterable? > > Yes. > > > - So there's then a new conditional import (e.g. in a compat package)? > What > > does this add? > > try: chain > except NameError: from itertools import chain > > Two lines, if and only if you both need chain and want to support > versions of Python older than 3.7. > > There's no need to import it if you aren't going to use it. Or, can I just continue to import the same function from the same place: from itertools import chain Nice, simple, easy. 
There's even (for all you functional lovers): from itertools import * And, again, this works today: from fn import Stream itr = Stream() << my_generator() << (8,9,0) - https://github.com/kachayev/fn.py/blob/master/README.rst#streams-and-infinite-sequences-declaration - https://github.com/kachayev/fn.py/blob/master/fn/stream.py - AFAIU, + doesn't work because e.g. numpy already defines + and & for Iterable arrays. > > > > > > ? Preferably using the existing `yield from` mechanism -- because, in > > > > case of generators, it would provide a way to combine ("concatenate") > > > > *generators*, preserving semantics of all that their __next__(), > send(), > > > > throw() nice stuff... > > > > > > I don't think that would be generally useful. > > > > Flatten one level? > > Flattening typically applies to lists and sequences. > > I'm not saying that chain shouldn't support generators. That would be > silly: a generator is an iterable and chaining supports iterables. I'm > saying that it wouldn't be helpful to require chain objects to support > send(), throw() etc. So the argspec is/should be Iterables with __iter__ (but not necessarily __len__)? > > > > If you're sending values > > > into an arbitrary generator, who knows what you're getting? chain() > will > > > operate on arbitrary iterables, you can't expect to send values into > > > chain([1, 2, 3], my_generator(), "xyz") and have anything sensible > > > occur. > > > > > > - is my_generator() mutable (e.g. before or during iteration)? > > It doesn't matter. Sending into a chain of arbitrary iterators isn't a > useful thing to do. So, with a generator function, I get a traceback at the current yield statement. With chain() I get whatever line the chain call is on. > > > > - https://docs.python.org/2/reference/expressions.html#generator.send > > Why are you linking to the 2 version of the docs? We're discussing a > hypothetical new feature which must go into 3, not 2. 
In your opinion, has the send() functionality changed at all? > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Sat Jul 1 12:22:31 2017 From: brett at python.org (Brett Cannon) Date: Sat, 01 Jul 2017 16:22:31 +0000 Subject: [Python-ideas] CPython should get... In-Reply-To: References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> <20170630151530.GA23663@phdru.name> Message-ID: On Fri, Jun 30, 2017, 10:38 Koos Zevenhoven, wrote: > On Jun 30, 2017 5:16 PM, "Oleg Broytman" wrote: > > On Fri, Jun 30, 2017 at 12:09:52PM -0300, "Soni L." > wrote: > > CPython should get a > > You're welcome to create one. Go on, send your pull requests! > > > But if you are planning to do that, it is still a good idea to ask for > feedback here first. That will increase the chances of acceptance by a lot. > Also, it doesn't necessarily need to be your own idea :) > I think Oleg was more responding to the fact that Soni said "CPython should" do something. Phrasing it that way comes off as demanding instead of just sharing an idea. Oleg tried to turn it around and point out that if Soni thinks this should happen then he should be ready to contribute the work to see it happen. -brett > -- Koos > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From phd at phdru.name Sat Jul 1 13:17:03 2017 From: phd at phdru.name (Oleg Broytman) Date: Sat, 1 Jul 2017 19:17:03 +0200 Subject: [Python-ideas] CPython should get... 
In-Reply-To: References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> <20170630151530.GA23663@phdru.name> Message-ID: <20170701171703.GA27098@phdru.name> Hi, All! On Sat, Jul 01, 2017 at 04:22:31PM +0000, Brett Cannon wrote: > On Fri, Jun 30, 2017, 10:38 Koos Zevenhoven, wrote: > > On Jun 30, 2017 5:16 PM, "Oleg Broytman" wrote: > > > > On Fri, Jun 30, 2017 at 12:09:52PM -0300, "Soni L." > > wrote: > > > CPython should get a > > > > You're welcome to create one. Go on, send your pull requests! > > > > But if you are planning to do that, it is still a good idea to ask for > > feedback here first. That will increase the chances of acceptance by a lot. > > Also, it doesn't necessarily need to be your own idea :) > > I think Oleg was more responding to the fact that Soni said "CPython > should" do something. Phrasing it that way comes off as demanding instead Exactly! > of just sharing an idea. Oleg tried to turn it around and point out that if > Soni thinks this should happen then he should be ready to contribute the > work to see it happen. I think the sentence "Python should have " should be ;-) forbidden if it is not followed with "I'm in the middle of development. Expect the 1st PR in ." Python can only have features that You, the , implemented (or paid for) and submitted. > -brett > > > -- Koos Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From prometheus235 at gmail.com Sat Jul 1 13:35:09 2017 From: prometheus235 at gmail.com (Nick Timkovich) Date: Sat, 1 Jul 2017 13:35:09 -0400 Subject: [Python-ideas] CPython should get... 
In-Reply-To: <20170701171703.GA27098@phdru.name> References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> <20170630151530.GA23663@phdru.name> <20170701171703.GA27098@phdru.name> Message-ID: On Sat, Jul 1, 2017 at 1:17 PM, Oleg Broytman wrote: > > I think the sentence "Python should have implement feature>" should be ;-) forbidden if it is not followed with > "I'm in the middle of development. Expect the 1st PR in timeframe>." > > Python can only have features that You, the , implemented (or > paid for) and submitted. > Devil's advocate: why prepare a patch and submit it if it is going to be dismissed out of hand. Trying to gauge support for the idea is a reasonable first step. Devil's devil's advocate: if it has value, it could stand on its own and gain its own group of supporters as a CPython fork. Nick -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Sat Jul 1 17:51:56 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 1 Jul 2017 22:51:56 +0100 Subject: [Python-ideas] CPython should get... In-Reply-To: References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> <20170630151530.GA23663@phdru.name> <20170701171703.GA27098@phdru.name> Message-ID: On 1 July 2017 at 18:35, Nick Timkovich wrote: > Devil's advocate: why prepare a patch and submit it if it is going to be > dismissed out of hand. Trying to gauge support for the idea is a reasonable > first step. That's perfectly OK, but it's important to phrase the email in a way that makes that clear - "I'm considering putting together a PR for Python to implement X. Does that sound like a good idea, or does anyone have suggestions for potential issues I might consider? Also, is there any prior work in this area that I should look into?" "Python should have X" implies (a) that you are criticising the Python developers for missing that feature out, (b) that you consider your position self-evident, and (c) that you expect someone to implement it. 
People have different ways of expressing themselves, so we should all be prepared to allow some leeway in how people put their ideas across. But the writer has some responsibility for the tone, too. Paul From victor.stinner at gmail.com Sat Jul 1 18:34:42 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 2 Jul 2017 00:34:42 +0200 Subject: [Python-ideas] Bytecode JIT In-Reply-To: References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> Message-ID: Let's say that you have a function "def mysum (x; y): return x+y", do you always want to use your new IADD instruction here? What if I call mysum ("a", "b")? Victor -------------- next part -------------- An HTML attachment was scrubbed... URL: From fakedme+py at gmail.com Sat Jul 1 18:52:55 2017 From: fakedme+py at gmail.com (Soni L.) Date: Sat, 1 Jul 2017 19:52:55 -0300 Subject: [Python-ideas] Bytecode JIT In-Reply-To: References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> Message-ID: On 2017-07-01 07:34 PM, Victor Stinner wrote: > Let's say that you have a function "def mysum (x; y): return x+y", do > you always want to use your new IADD instruction here? What if I call > mysum ("a", "b")? > > Victor Let's say that you do. Given how short it is, it would just get inlined. Your call of mysum ("a", "b") would indeed not use IADD, nor would it be a call. It would potentially not invoke any operators, but instead get replaced with "ab". When you have a tracing JIT, you can do away with a lot of overhead. You can inline functions, variables, do away with typechecks, and many other things. This holds true even if that JIT never emits a single byte of machine code. From rosuav at gmail.com Sat Jul 1 19:32:39 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 2 Jul 2017 09:32:39 +1000 Subject: [Python-ideas] Bytecode JIT In-Reply-To: References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> Message-ID: On Sun, Jul 2, 2017 at 8:52 AM, Soni L. 
wrote: > On 2017-07-01 07:34 PM, Victor Stinner wrote: >> >> Let's say that you have a function "def mysum (x; y): return x+y", do you >> always want to use your new IADD instruction here? What if I call mysum >> ("a", "b")? >> >> Victor > > > Let's say that you do. Given how short it is, it would just get inlined. > Your call of mysum ("a", "b") would indeed not use IADD, nor would it be a > call. It would potentially not invoke any operators, but instead get > replaced with "ab". > > When you have a tracing JIT, you can do away with a lot of overhead. You can > inline functions, variables, do away with typechecks, and many other things. > This holds true even if that JIT never emits a single byte of machine code. Let's try a more complicated example. # demo.py def mysum(x, y): return x + y def do_stuff(a, b): print(mysum("foo", "bar")) print(mysum(5, 7)) print(mysum(a, 42)) print(mysum(b, "spam")) What can you optimize here? Now let's look at a file that might call it: # cruel.py import random import demo def nasty(x, y): demo.mysum = random.choice([ lambda x, y: x + y, lambda x, y: f"{x} + f{y}", lambda x, y: "muahahaha", ]) return Ellipsis demo.mysum = nasty demo.do_stuff("what", "now?") Unless you can prove that this doesn't happen, you can't really optimize much of mysum away. That's where a tracing JIT compiler has the advantage: it can notice *at run time* that you're not doing this kind of thing, and in effect, forfeit the optimizations when you're running your tests (since test suites are where this kind of monkey-patching tends to happen). ChrisA From rymg19 at gmail.com Sat Jul 1 22:57:52 2017 From: rymg19 at gmail.com (rymg19 at gmail.com) Date: Sat, 1 Jul 2017 19:57:52 -0700 Subject: [Python-ideas] Bytecode JIT In-Reply-To: <> <> References: <<2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com>> <> <> <> Message-ID: This is literally PyPy. There's little reason for something like this to end up in official CPython, at least for now. -- Ryan (????) 
Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone elsehttp://refi64.com On Jul 1, 2017 at 5:53 PM, > wrote: On 2017-07-01 07:34 PM, Victor Stinner wrote: > Let's say that you have a function "def mysum (x; y): return x+y", do > you always want to use your new IADD instruction here? What if I call > mysum ("a", "b")? > > Victor Let's say that you do. Given how short it is, it would just get inlined. Your call of mysum ("a", "b") would indeed not use IADD, nor would it be a call. It would potentially not invoke any operators, but instead get replaced with "ab". When you have a tracing JIT, you can do away with a lot of overhead. You can inline functions, variables, do away with typechecks, and many other things. This holds true even if that JIT never emits a single byte of machine code. _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From fakedme+py at gmail.com Sat Jul 1 23:14:49 2017 From: fakedme+py at gmail.com (Soni L.) Date: Sun, 2 Jul 2017 00:14:49 -0300 Subject: [Python-ideas] Bytecode JIT In-Reply-To: References: <<2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> < < < Message-ID: <45daeeaa-9696-37ea-b300-be81cf6aaaec@gmail.com> On 2017-07-01 11:57 PM, rymg19 at gmail.com wrote: > This is literally PyPy. There's little reason for something like this > to end up in official CPython, at least for now. It's literally not PyPy. PyPy's internal bytecode, for one, does have typechecks. And PyPy emits machine code, which is not something I wanna deal with because you shouldn't need to write a C compiler AND a whole assembly backend just to port python to a new CPU architecture. A C compiler should be enough. > > > -- > Ryan (????) 
> Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else > http://refi64.com >> On Jul 1, 2017 at 5:53 PM, > >> wrote: >> >> >> >> On 2017-07-01 07:34 PM, Victor Stinner wrote: >> >> > Let's say that you have a function "def mysum (x; y): return x+y", do >> >> > you always want to use your new IADD instruction here? What if I call >> >> > mysum ("a", "b")? >> >> > >> >> > Victor >> >> >> >> Let's say that you do. Given how short it is, it would just get inlined. >> >> Your call of mysum ("a", "b") would indeed not use IADD, nor would it be >> >> a call. It would potentially not invoke any operators, but instead get >> >> replaced with "ab". >> >> >> >> When you have a tracing JIT, you can do away with a lot of overhead. You >> >> can inline functions, variables, do away with typechecks, and many other >> >> things. This holds true even if that JIT never emits a single byte of >> >> machine code. >> >> _______________________________________________ >> >> Python-ideas mailing list >> >> Python-ideas at python.org >> >> https://mail.python.org/mailman/listinfo/python-ideas >> >> Code of Conduct:http://python.org/psf/codeofconduct/ >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sun Jul 2 01:41:58 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 2 Jul 2017 15:41:58 +1000 Subject: [Python-ideas] Bytecode JIT In-Reply-To: References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> Message-ID: <20170702054157.GP3149@ando.pearwood.info> On Sat, Jul 01, 2017 at 07:52:55PM -0300, Soni L. wrote: > > > On 2017-07-01 07:34 PM, Victor Stinner wrote: > >Let's say that you have a function "def mysum (x; y): return x+y", do > >you always want to use your new IADD instruction here? What if I call > >mysum ("a", "b")? > > > >Victor > > Let's say that you do. Given how short it is, it would just get inlined. 
> Your call of mysum ("a", "b") would indeed not use IADD, nor would it be > a call. It would potentially not invoke any operators, but instead get > replaced with "ab". What you are describing sounds more like the output of a keyhole optimizer that folds constants, only extended to look inside functions. I expect that it would have to be a VERY clever optimizer, since it would have to do a complete whole-of-program static analysis to be sure that mysum has not been replaced, shadowed or redefined by the time it is called. I won't say that is outright impossible, but it would be *extremely* hard to do something like that at compile time. > When you have a tracing JIT, you can do away with a lot of overhead. You > can inline functions, variables, do away with typechecks, and many other > things. This holds true even if that JIT never emits a single byte of > machine code. What you are describing sounds more like an "Ahead Of Time" (AOT) compiler to me. Especially the part about doing away with typechecks. As far as I know you can really only do away with typechecks or other guards if you know ahead of time (at compile time) what the types of values are, and that requires static typing. -- Steve From rosuav at gmail.com Sun Jul 2 01:52:34 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 2 Jul 2017 15:52:34 +1000 Subject: [Python-ideas] Bytecode JIT In-Reply-To: <20170702054157.GP3149@ando.pearwood.info> References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> <20170702054157.GP3149@ando.pearwood.info> Message-ID: On Sun, Jul 2, 2017 at 3:41 PM, Steven D'Aprano wrote: >> Let's say that you do. Given how short it is, it would just get inlined. >> Your call of mysum ("a", "b") would indeed not use IADD, nor would it be >> a call. It would potentially not invoke any operators, but instead get >> replaced with "ab". > > What you are describing sounds more like the output of a keyhole > optimizer that folds constants, only extended to look inside functions. 
> I expect that it would have to be a VERY clever optimizer, since it > would have to do a complete whole-of-program static analysis to be sure > that mysum has not been replaced, shadowed or redefined by the time it > is called. > > I won't say that is outright impossible, but it would be *extremely* > hard to do something like that at compile time. Isn't that the sort of thing that the "versioned globals dictionary" was supposed to do? If your globals haven't changed, you know that the optimizer was correct. But that's still a hard problem. Or at very least, it's decidedly non-trivial, and the costs are significant, so the net benefits aren't proven. ChrisA From steve at pearwood.info Sun Jul 2 07:16:08 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 2 Jul 2017 21:16:08 +1000 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: References: Message-ID: <20170702111607.GQ3149@ando.pearwood.info> On Sat, Jun 24, 2017 at 10:42:19PM +0300, Koos Zevenhoven wrote: [...] > Clearly, there needs to be some sort of distinction between runtime > classes/types and static types, because static types can be more precise > than Python's dynamic runtime semantics. I think that's backwards: runtime types can be more precise than static types. Runtime types can make use of information known at compile time *and* at runtime, while static types can only make use of information known at compile time. Consider: List[str if today == 'Tuesday' else int] The best that the compile-time checker can do is treat it as List[Union[str, int]] if even that, but at runtime we can tell whether or not [1, 2, 3] is legal or not. But in any case, *static types* and *dynamic types* (runtime types, classes) are distinct concepts, but with significant overlap. Static types apply to *variables* (or expressions) while dynamic types apply to *values*. Values are, in general, only known at runtime. > For example, Iterable[int] is an > iterable that contains integers. 
For a static type checker, it is clear > what this means. But at runtime, it may be impossible to figure out whether > an iterable is really of this type without consuming the whole iterable and > checking whether each yielded element is an integer. There's a difference between *requesting* an object's runtime type and *verifying* that it is what it says it is. Of course if we try to verify that an iterator yields nothing but ints, we can't do so without consuming the iterator, or possibly even entering an infinite loop. But we can ask an object what type they are, they can tell you that they're an Iterable[int], and this could be an extremely fast check. Assuming you trust the object not to lie. ("Consenting adults" may apply here.) > Even that is not > possible if the iterable is infinite. Even Sequence[int] is problematic, > because checking the types of all elements of the sequence could take a > long time. > > Since things like isinstance(it, Iterable[int]) cannot guarantee a proper > answer, one easily arrives at the conclusion that static types and runtime > classes are just two separate things and that one cannot require that all > types support something like isinstance at runtime. That's way too strong. I agree that static types and runtime types (I don't use the term "class" because in principle at least this could include types not implemented as a class, e.g. a struct or record or primitive unboxed value) are distinct, but they do overlap. To describe them as "separate" implies that they are unconnected and that one could sensibly have things which are statically typed as (let's say) Sequence[bool] but runtime typed as float. Gradual typing is useful because the static types are at least an approximation to the runtime types. If they had no connection at all, we'd learn nothing from static type checking and there would be no reason to do it. So static types and runtime types must be at least closely related to be useful. [...] 
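The asymmetry is visible directly at the interpreter: the typing module allows instance checks against the bare generic, but refuses parameterized types precisely because they cannot be verified without consuming the iterable (a small sketch):

```python
from typing import Iterable

it = iter([1, 2, 3])

# The bare generic delegates to the usual ABC machinery and works fine:
assert isinstance(it, Iterable)

# A parameterized type is rejected outright at runtime:
try:
    isinstance(it, Iterable[int])
except TypeError:
    pass  # "Subscripted generics cannot be used with class and instance checks"
```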
> These and other incompatibilities between runtime and static typing will
> create two (or more) different kinds of type-annotated Python:
> runtime-oriented Python and Python with static type checking. These may be
> incompatible in both directions: a static type checker may complain about
> code that is perfectly valid for the runtime folks, and code written for
> static type checking may not be able to use new Python techniques that make
> use of type hints at runtime.

Yes? What's your point? Consenting adults certainly applies here. There
are lots of reasons why people might avoid "new Python techniques" for
*anything*, not just type hints:

- they have to support older versions of Python;
- they're stuck on an older version and can't upgrade;
- they just don't like those new techniques.

Nobody forces you to run a static type-checker. If you choose to run
one, and it gives the wrong answers, then you can:

- stop using it;
- use a better one that gives the right answer;
- fix the broken code that the checker says is broken (regardless of
  whether it is genuinely broken or not);
- add, remove or modify annotations to satisfy the checker;
- disable type-checking for that code unit (module?) alone.

But the critical thing here is that so long as Python is a dynamically
typed language, you cannot eliminate runtime type checks. You can choose
*not* to write them in your code, and rely on duck typing and
exceptions, but the type checks are still there in the implementation.

E.g. you have x + 1 in your code. Even if *you* don't guard with a type
check:

    # if isinstance(x, int):
    y = x + 1

there's still a runtime check in the byte-code which prevents low-level
machine code errors that could lead to a segmentation fault or worse.

> There may not even be a fully functional
> subset of the two "languages".

What do you mean by "fully functional"? Of course there will be working
code that can pass both the static checks and run without error.
Here's a trivial example: print("Hello World") On the other hand, it's trivially true that code which works at runtime cannot *always* be statically checked: s = input("Type some Python code: ") exec(s) The static type checker cannot possibly check code that doesn't even exist until runtime! I don't think it is plausible to say that there is, or could be, no overlap between (a) legal Python code that runs under a type-checker, and (b) legal Python code that runs without it. That's literally impossible since the type-checker is not part of the Python interpreter, so you can always just *not run the type-checker* to turn (a) into (b). > Different libraries will adhere to different > standards and will not be compatible with each other. The split will be > much worse and more difficult to understand than Python 2 vs 3, peoples > around the world will suffer like never before, and programming in Python > will become a very complicated mess. I think this is Chicken Little "The Sky Is Falling" FUD. > One way of solving the problem would be that type annotations are only a > static concept, like with stubs or comment-based type annotations. I don't agree that there's a problem that needs to be solved. > This > would also be nice from a memory and performance perspective, as evaluating > and storing the annotations would not occupy memory (although both issues > and some more might be nicely solved by making the annotations lazily > ealuated). Sounds like premature optimization to me. How many distinct annotations do you have? How much memory do you think they will use? If you're running 64-bit Python, each pointer to the annotation takes a full eight bytes. If we assume that every annotation is distinct, and we allow 1000 bytes for each annotation, a thousand annotations would only use 1MB of memory. On modern machines, that's trivial. I don't think this will be a problem for the average developer. (Although people programming on embedded devices may be different.) 
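Steven's back-of-envelope memory estimate is easy to spot-check on a concrete interpreter; `convert` below is just a made-up annotated function:

```python
import sys

def convert(x: int, y: str, strict: bool = True) -> str:
    # Annotations end up in one small dict per function; the values are
    # pointers to type objects that are shared anyway.
    return y if strict else str(x)

print(convert.__annotations__)                 # entries for x, y, strict and the return type
print(sys.getsizeof(convert.__annotations__))  # size of the dict itself, in bytes
```

The per-function cost is one dict plus a handful of pointers, consistent with the estimate above.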
If we want to support that optimization, we could add an optimization flag that strips annotations at runtime, just as the -OO flag strips docstrings. That becomes a matter of *consenting adults* -- if you don't want annotations, you don't need to keep them, but it then becomes your responsibility that you don't try to use them. (If you do, you'll get a runtime AttributeError.) > However, leaving out runtime effects of type annotations is not > the approach taken, and runtime introspection of annotations seems to have > some promising applications as well. And for many cases, the traditional > Python class actually acts very nicely as both the runtime and static type. > > So if type annotations will be both for runtime and for static checking, > how to make everything work for both static and runtime typing? > > Since a writer of a library does not know what the type hints will be used > for by the library users, No, that's backwards. The library creator gets to decide what their library uses annotations for: type-hints, or something else. As the user of a library, I don't get to decide what the library does with its own annotations. > it is very important that there is only one way > of making type annotations which will work regardless of what the > annotations are used for in the end. This will also make it much easier to > learn Python typing. I don't understand this. > Regarding runtime types and isinstance, let's look at the Iterable[int] > example. For this case, there are a few options: > > 1) Don't implement isinstance > > This is problematic for runtime uses of annotations. > > 2) isinstance([1, '2', 'three'], Iterable[int]) returns True > > This is in fact now the case. That's clearly a bug. If isinstance(... Iterable[int]) is supported at all, then clearly the result should be False. [...] > 3) Check as much as you can at runtime For what purpose? 
> 4) Do a deeper check than in (2) but trust the annotations > > For example, an instance of a class that has a method like > > def __iter__(self) -> Iterator[int]: > some code > > could be identified as Iterable[int] at runtime, even if it is not > guaranteed that all elements are really integers. I suggested something similar to this earlier in this post. > On the other hand, an object returned by > > def get_ints() -> Iterable[int]: > some code > > does not know its own annotations, so the check is difficult to do at > runtime. And of course, there may not be annotations available. Right -- when annotations are not available, the type checker will either infer types, if it can, or default to the Any type. I don't really understand where you are going with this. The premise, that statically-type-checked Python is fundamentally different from Python-without-static-checks, and therefore we have to bring in a bunch of extra runtime checks to make them the same, seems wrong to me. Perhaps I have not understood you. -- Steve From rosuav at gmail.com Sun Jul 2 07:38:11 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 2 Jul 2017 21:38:11 +1000 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: <20170702111607.GQ3149@ando.pearwood.info> References: <20170702111607.GQ3149@ando.pearwood.info> Message-ID: On Sun, Jul 2, 2017 at 9:16 PM, Steven D'Aprano wrote: > If we want to support that optimization, we could add an optimization > flag that strips annotations at runtime, just as the -OO flag strips > docstrings. That becomes a matter of *consenting adults* -- if you don't > want annotations, you don't need to keep them, but it then becomes your > responsibility that you don't try to use them. (If you do, you'll get a > runtime AttributeError.) IMO people should act as if this will eventually be the case. 
Annotations should be evaluated solely for the purpose of populating
__annotations__, and not for any sort of side effects - just like with
assertions.

ChrisA

From steve at pearwood.info  Sun Jul  2 07:54:22 2017
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 2 Jul 2017 21:54:22 +1000
Subject: [Python-ideas] Runtime types vs static types
In-Reply-To: 
References: 
Message-ID: <20170702115422.GR3149@ando.pearwood.info>

On Sun, Jun 25, 2017 at 09:13:44AM -0700, Lucas Wiman wrote:
> >
> > For some background on the removal of __instancecheck__, check the linked
> > issues here:
> >
>
> Thanks for the reference (the most relevant discussion starts here
> ).
> That said, I think I totally disagree with the underlying philosophy of
> throwing away a useful and intuitive feature (having `is_instance(foo,
> Union[Bar, Baz])` just work as you'd naively expect) in the name of making
> sure that people *understand* there's a distinction between types and
> classes.

Yes... I agree. I think Mark Shannon has an exaggerated preference for
"purity over practicality" when he writes:

    Determining whether a class is a subclass of a type is meaningless
    as far I'm concerned.

    https://github.com/python/typing/issues/136#issuecomment-217386769

That implies that runtime types ("classes") and static types are
completely unrelated. I don't think that's true, and I think that would
make static types pointless if it were true.

I'd put it this way... runtime types are instantiations of static types
(not *instances*).

https://en.wiktionary.org/wiki/instantiation

If we didn't already use the terms for something else, I'd say that
static types are *abstract types* and runtime types ("classes") are
*concrete types*. But that clashes with the existing use of abstract
versus concrete types.

I can see that there are actual problems to be solved, and *perhaps*
Mark's conclusion is the right one (even if for the wrong reasons).
For example:

    isinstance([], List[int])
    isinstance([], List[str])

How can a single value be an instance of two mutually incompatible
types? (But see below, for an objection.)

On the other hand, just because a corner case is problematic, doesn't
mean that the vast majority of cases aren't meaningful. It just seems
perverse to me to say that it is "meaningless" (in Mark's words) to ask
whether

    isinstance(['a', 'b'], List[int])
    isinstance(123, List[str])

(for example). If static type checking has any meaning at all, then the
answers to those two surely have to be False.

> This seems opposed to the "zen" of python that there should be exactly one
> obvious way to do it, since (1) there isn't a way to do it without a third
> party library, and (2) the obvious way to do it is with `isinstance` and
> `issubclass`. Indeed, the current implementation makes it somewhat
> nonobvious even how to implement this functionality yourself in a
> third-party library (see this gist
> ).

I think that the current status is that the MyPy folks, including Guido,
consider that it *is* reasonable to ask these questions for the purpose
of introspection, but issubclass and isinstance are not the way to do
it.

> One of the first things I did when playing around with the `typing` module
> was to fire up the REPL, and try runtime typechecks:
>
> >>> from typing import *
> >>> isinstance(0, Union[int, float])
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/Users/lucaswiman/.pyenv/versions/3.6/lib/python3.6/typing.py",
> line 767, in __instancecheck__
>     raise TypeError("Unions cannot be used with isinstance().")
> TypeError: Unions cannot be used with isinstance().
>
> I think the natural reaction of naive users of the library is "That's
> annoying. Why? What is this library good for?", not "Ah, I've sagely
> learned a valuable lesson about the subtle-and-important-though-unmentioned
> distinction between types and classes!"

Indeed!
Why shouldn't

    isinstance(x, Union[A, B])
    isinstance(x, (A, B))

be treated as equivalent?

[...]
> Mark Shannon's example also specifically does not apply to the types I'm
> thinking of for the reasons I mentioned:
>
> > For example,
> > List[int] and List[str] and mutually incompatible types, yet
> > isinstance([], List[int]) and isinstance([], List[str))
> > both return true.
> >
>
> There is no corresponding objection for `Union`; I can't think of any*
> inconsistencies or runtime type changes that would result from defining
> `_Union.__instancecheck__` as `any(isinstance(obj, t) for t in
> self.__args__`.

Or just isinstance(x, tuple(self.__args__)) as above.

> For `Tuple`, it's true that `()` would be an instance of
> `Tuple[X, ...]` for all types X. However, the objection for the `List` case
> (IIUC; extrapolating slightly) is that the type of the object could change
> depending on what's added to it. That's not true for tuples since they're
> immutable, so it's not *inconsistent* to say that `()` is an instance of
> `Tuple[int, ...]` and `Tuple[str, ...]`, it's just applying a sensible
> definition to the base case of an empty tuple.

I'm not even completely convinced that the List example really is a
problem. Well, it may be a problem for applying the theory of types,
which in turn may make actually programming a type-checker more
difficult. But to the human reader, why is it a problem that an empty
list can be considered both a list of strings and a list of ints?

That's just the vacuous truth! An empty bag can be equally well
described as a bag containing no apples or a bag containing no oranges.
They're both true, and if the theory of types cannot cope with that
fact, that's a weakness in the theory, not the fact.

(That's analogous to the Circle-Ellipse problem for the theory behind
object oriented code.)
https://en.wikipedia.org/wiki/Circle-ellipse_problem -- Steve From steve at pearwood.info Sun Jul 2 07:57:33 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 2 Jul 2017 21:57:33 +1000 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: References: <20170702111607.GQ3149@ando.pearwood.info> Message-ID: <20170702115732.GS3149@ando.pearwood.info> On Sun, Jul 02, 2017 at 09:38:11PM +1000, Chris Angelico wrote: > On Sun, Jul 2, 2017 at 9:16 PM, Steven D'Aprano wrote: > > If we want to support that optimization, we could add an optimization > > flag that strips annotations at runtime, just as the -OO flag strips > > docstrings. That becomes a matter of *consenting adults* -- if you don't > > want annotations, you don't need to keep them, but it then becomes your > > responsibility that you don't try to use them. (If you do, you'll get a > > runtime AttributeError.) > > IMO people should act as if this will eventually be the case. > Annotations should be evaluated solely for the purpose of populating > __annotations__, and not for any sort of side effects - just like with > assertions. Avoiding side-effects is generally a good idea, but I think that's probably taking it too far. I think that we should assume that def func(x:Spam()): ... will always look up and call Spam when the function is defined. But we should be prepared that func.__annotations__ might not exist, if we're running in a highly-optimized mode, or MicroPython, or similar. -- Steve From levkivskyi at gmail.com Sun Jul 2 07:58:41 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Sun, 2 Jul 2017 13:58:41 +0200 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: References: Message-ID: @ Koos Zevenhoven > and there should at least *exist* a well-defined answer to whether an object is in 'instance' of a given type. 
> (Not sure if 'instance' should be the word used here)

Let me illustrate why being an "instance" (or any other word) does not
apply well to runtime objects.

Consider a list [1, 2, 3]; is it an "instance" of List[int]? Probably
yes. Is it an "instance" of List[Union[str, int]]? Probably also yes.
However, List[int] and List[Union[str, int]] are mutually incompatible,
i.e. the latter is not a subtype of the former and the former is not a
subtype of the latter. (This is due to lists being mutable and therefore
invariant in their type variable.)

The next important point is that static type checkers never decide (or
at least I have never seen this) whether a given literal (since there
are no objects before runtime) is an "instance" of a type. Static type
checkers (roughly speaking) verify that the semantics represented by an
AST is consistent with declared/inferred types. Concerning the above
example with [1, 2, 3], static type checkers can infer List[int] for
such a literal, or refuse to do so and require an explicit annotation,
or a user can overrule the inference with an explicit annotation. This
decision (whether to use List[int] or any other acceptable type for this
literal) will influence the type checking outcome (i.e. whether there
are errors or not) even _earlier_ in the program; this is something that
is not possible at runtime.

> Ignoring that *types* are also a runtime concept seems dangerous to me.

It is not ignored. Annotations are structured type metadata accessible
both statically and at runtime; there is even a typing_inspect module on
PyPI designed to provide some runtime introspection of types (it is at
an early stage of development) and some elements of it might end up in
typing. Also, checking subtyping between two types (without mixing them
with classes) at runtime is entirely possible, but this is a very
complicated task with lots of corner cases, therefore I don't think it
will be in stdlib. stdlib was always kept simple and easy to maintain.
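Ivan's invariance point can be sketched in a few lines; `add_label` is a hypothetical function used only for illustration:

```python
from typing import List, Union

def add_label(xs: List[Union[str, int]]) -> None:
    xs.append("label")        # perfectly legal for List[Union[str, int]]

ints: List[int] = [1, 2, 3]
# A static checker rejects this call: List[int] is *not* a subtype of
# List[Union[str, int]], precisely because lists are mutable (invariant).
# At runtime nothing objects, and `ints` quietly stops holding only ints:
add_label(ints)
print(ints)   # [1, 2, 3, 'label']
```

This is why the runtime value [1, 2, 3] can plausibly be called an "instance" of both types, even though neither type is a subtype of the other.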
--
Ivan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Sun Jul  2 08:13:15 2017
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 2 Jul 2017 22:13:15 +1000
Subject: [Python-ideas] Bytecode JIT
In-Reply-To: 
References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com>
 <20170702054157.GP3149@ando.pearwood.info>
Message-ID: <20170702121305.GT3149@ando.pearwood.info>

On Sun, Jul 02, 2017 at 03:52:34PM +1000, Chris Angelico wrote:
> On Sun, Jul 2, 2017 at 3:41 PM, Steven D'Aprano wrote:
> >> Let's say that you do. Given how short it is, it would just get inlined.
> >> Your call of mysum ("a", "b") would indeed not use IADD, nor would it be
> >> a call. It would potentially not invoke any operators, but instead get
> >> replaced with "ab".
> >
> > What you are describing sounds more like the output of a keyhole
> > optimizer that folds constants, only extended to look inside functions.
> > I expect that it would have to be a VERY clever optimizer, since it
> > would have to do a complete whole-of-program static analysis to be sure
> > that mysum has not been replaced, shadowed or redefined by the time it
> > is called.
> >
> > I won't say that is outright impossible, but it would be *extremely*
> > hard to do something like that at compile time.
>
> Isn't that the sort of thing that the "versioned globals dictionary"
> was supposed to do? If your globals haven't changed, you know that the
> optimizer was correct.

That only solves the problem of mysum being modified, not whether the
arguments are ints. You still need to know whether it is safe to call
some low-level (fast) integer addition routine, or whether you have to
go through the (slow) high-level Python code.

In any case, guards are a kind of runtime check. It might not be an
explicit isinstance() check, but it logically implies one. If x was an
int, and nothing has changed, then x is still an int.
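The guard idea can be sketched in pure Python; `add_specialized` is a toy stand-in for what a JIT would do with compiled code:

```python
def add_specialized(x, y):
    # The guard: logically an isinstance check.  While it holds, a JIT
    # could run a compiled int-only fast path; when it fails, execution
    # falls back to (or deoptimizes into) the generic dynamic path.
    if type(x) is int and type(y) is int:
        return x + y          # stands in for a low-level integer add
    return x + y              # generic path: full dynamic dispatch

print(add_specialized(2, 3))      # 5
print(add_specialized("a", "b"))  # ab
```

A real JIT would compile the guarded fast path to machine code and invalidate it when the assumption breaks; the control flow, though, is exactly this.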
If Victor is around, he might like to comment on how his FAT Python handles this. > But that's still a hard problem. Or at very least, it's decidedly > non-trivial, and the costs are significant, so the net benefits aren't > proven. In fairness, they are proven for other languages, and they certainly worked for things like Psyco. So this isn't completely pie-in-the-sky dreaming. -- Steve From rosuav at gmail.com Sun Jul 2 08:18:03 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 2 Jul 2017 22:18:03 +1000 Subject: [Python-ideas] Bytecode JIT In-Reply-To: <20170702121305.GT3149@ando.pearwood.info> References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> <20170702054157.GP3149@ando.pearwood.info> <20170702121305.GT3149@ando.pearwood.info> Message-ID: On Sun, Jul 2, 2017 at 10:13 PM, Steven D'Aprano wrote: >> But that's still a hard problem. Or at very least, it's decidedly >> non-trivial, and the costs are significant, so the net benefits aren't >> proven. > > In fairness, they are proven for other languages, and they certainly > worked for things like Psyco. So this isn't completely pie-in-the-sky > dreaming. Yeah. It's the realm of "let's put in some solid research, then do some proof of concept work, and maybe it'll be worth going ahead with" - it's not "Python should be able to optimize this away". ChrisA From fakedme+py at gmail.com Sun Jul 2 09:32:41 2017 From: fakedme+py at gmail.com (Soni L.) Date: Sun, 2 Jul 2017 10:32:41 -0300 Subject: [Python-ideas] Bytecode JIT In-Reply-To: <20170702054157.GP3149@ando.pearwood.info> References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> <20170702054157.GP3149@ando.pearwood.info> Message-ID: <0efb2c28-4b54-86cc-927e-e5e99d15c49a@gmail.com> On 2017-07-02 02:41 AM, Steven D'Aprano wrote: > On Sat, Jul 01, 2017 at 07:52:55PM -0300, Soni L. 
wrote: >> >> On 2017-07-01 07:34 PM, Victor Stinner wrote: >>> Let's say that you have a function "def mysum (x; y): return x+y", do >>> you always want to use your new IADD instruction here? What if I call >>> mysum ("a", "b")? >>> >>> Victor >> Let's say that you do. Given how short it is, it would just get inlined. >> Your call of mysum ("a", "b") would indeed not use IADD, nor would it be >> a call. It would potentially not invoke any operators, but instead get >> replaced with "ab". > What you are describing sounds more like the output of a keyhole > optimizer that folds constants, only extended to look inside functions. > I expect that it would have to be a VERY clever optimizer, since it > would have to do a complete whole-of-program static analysis to be sure > that mysum has not been replaced, shadowed or redefined by the time it > is called. Runtime. Not static. This is the same kind of stuff LuaJIT (and any other JIT) does. > > I won't say that is outright impossible, but it would be *extremely* > hard to do something like that at compile time. > > >> When you have a tracing JIT, you can do away with a lot of overhead. You >> can inline functions, variables, do away with typechecks, and many other >> things. This holds true even if that JIT never emits a single byte of >> machine code. > What you are describing sounds more like an "Ahead Of Time" (AOT) > compiler to me. Especially the part about doing away with typechecks. As > far as I know you can really only do away with typechecks or other > guards if you know ahead of time (at compile time) what the types of > values are, and that requires static typing. > > You can do that at runtime with a JIT and flushing the JIT cache when your assumptions (guards) change. (altho in reality you wouldn't flush the whole JIT cache because that'd be expensive.) 
From tjreedy at udel.edu Sun Jul 2 14:09:38 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 2 Jul 2017 14:09:38 -0400 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: <20170702115732.GS3149@ando.pearwood.info> References: <20170702111607.GQ3149@ando.pearwood.info> <20170702115732.GS3149@ando.pearwood.info> Message-ID: On 7/2/2017 7:57 AM, Steven D'Aprano wrote: > On Sun, Jul 02, 2017 at 09:38:11PM +1000, Chris Angelico wrote: >> On Sun, Jul 2, 2017 at 9:16 PM, Steven D'Aprano wrote: >>> If we want to support that optimization, we could add an optimization >>> flag that strips annotations at runtime, just as the -OO flag strips >>> docstrings. That becomes a matter of *consenting adults* -- if you don't >>> want annotations, you don't need to keep them, but it then becomes your >>> responsibility that you don't try to use them. (If you do, you'll get a >>> runtime AttributeError.) >> >> IMO people should act as if this will eventually be the case. >> Annotations should be evaluated solely for the purpose of populating >> __annotations__, and not for any sort of side effects - just like with >> assertions. > > Avoiding side-effects is generally a good idea, but I think that's > probably taking it too far. > > I think that we should assume that > > def func(x:Spam()): > ... > > will always look up and call Spam when the function is defined. But we > should be prepared that > > func.__annotations__ > > might not exist, if we're running in a highly-optimized mode, or > MicroPython, or similar. Code that does not control the compilation of the file with func should also not assume the existence of func.__doc__. On the other hand, programs, such as IDEs, that do control compilation, by calling the standard compile(), can assume both attributes if they pass the appropriate compile flags. 
-- Terry Jan Reedy From lucas.wiman at gmail.com Sun Jul 2 15:14:28 2017 From: lucas.wiman at gmail.com (Lucas Wiman) Date: Sun, 2 Jul 2017 12:14:28 -0700 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: <20170702115422.GR3149@ando.pearwood.info> References: <20170702115422.GR3149@ando.pearwood.info> Message-ID: On Sun, Jul 2, 2017 at 4:54 AM, Steven D'Aprano wrote: > I think that the current status is that the MyPy folks, including Guido, > consider that it *is* reasonable to ask these questions for the purpose > of introspection, but issubclass and isinstance are not the way to do > it. > That may have once been the viewpoint of the developers of typing/MyPy, though it seems to have changed. The current view is that this should be implemented in a third party library. Further discussion is here . - Lucas -------------- next part -------------- An HTML attachment was scrubbed... URL: From python-ideas at shalmirane.com Sun Jul 2 15:19:54 2017 From: python-ideas at shalmirane.com (Ken Kundert) Date: Sun, 2 Jul 2017 12:19:54 -0700 Subject: [Python-ideas] Arguments to exceptions Message-ID: <20170702191953.GA17773@kundert.designers-guide.com> All, Here is a proposal for enhancing the way BaseException handles arguments. -Ken Rationale ========= Currently, the base exception class takes all the unnamed arguments passed to an exception and saves them into args. In this way they are available to the exception handler. A commonly used and very useful idiom is to extract an error message from the exception by casting the exception to a string. If you do so while passing one argument, you get that argument as a string: >>> try: ... raise Exception('Hey there!') ... except Exception as e: ... print(str(e)) Hey there! However, if more than one argument is passed, you get the string representation of the tuple containing all the arguments: >>> try: ... raise Exception('Hey there!', 'Something went wrong.') ... except Exception as e: ... 
print(str(e))
    ('Hey there!', 'Something went wrong.')

That behavior does not seem very useful, and I believe it leads to
people passing only one argument to their exceptions. An example of that
is the system NameError exception:

    >>> try:
    ...     foo
    ... except Exception as e:
    ...     print('str:', str(e))
    ...     print('args:', e.args)
    str: name 'foo' is not defined
    args: ("name 'foo' is not defined",)

Notice that the only argument is the error message. If you want access
to the problematic name, you have to dig it out of the message. For
example ...

    >>> import Food
    >>> try:
    ...     import meals
    ... except NameError as e:
    ...     name = str(e).split("'")[1]    # <-- fragile code
    ...     from difflib import get_close_matches
    ...     candidates = ', '.join(get_close_matches(name, Food.foods, 1, 0.6))
    ...     print(f'{name}: not found. Did you mean {candidates}?')

In this case, the missing name was needed but not directly available.
Instead, the name must be extracted from the message, which is innately
fragile.

The same is true with AttributeError: the only argument is a message,
and the name of the attribute, if needed, must be extracted from the
message. Oddly, with a KeyError it is the opposite situation: the name
of the key is the argument and no message is included. With IndexError
there is a message but no index. However, none of these behaviors can be
counted on; they could be changed at any time.

When writing exception handlers it is often useful to have both a
generic error message and access to the components of the message, if
for no other reason than to be able to construct a better error message.
However, I believe that the way the arguments are converted to a string
when there are multiple arguments discourages this. When reporting an
exception, you must either give one argument or add a custom __str__
method to the exception.
To do otherwise means that the exception handlers that catch your
exception will not have a reasonable error message, and so would be
forced to construct one from the arguments.

This is spelled out in PEP 352, which explicitly recommends that there
be only one argument and that it be a helpful human-readable message.
Further, it suggests that if more than one argument is required,
Exception should be subclassed and the extra arguments attached as
attributes. However, the extra effort required means that in many cases
people just pass an error message alone.

This approach is in effect discouraging people from adding additional
arguments to exceptions, with the result being that if they are needed
by the handler they have to be extracted from the message. It is
important to remember that the person that writes the exception handler
often does not raise the exception, and so they must just live with what
is available. As such, a convention that encourages the person raising
the exception to include all the individual components of the message
should be preferred.

That is the background. Here is my suggestion on how to improve this
situation.

Proposal
========

I propose that the Exception class be modified to allow passing a
message template as a named argument that is nothing more than a format
string that interpolates the exception arguments into an error message.
If the template is not given, the arguments would simply be converted to
strings individually and combined as in the print function.
So, I am suggesting the BaseException class look something like this:

    class BaseException:
        def __init__(self, *args, **kwargs):
            self.args = args
            self.kwargs = kwargs

        def __str__(self):
            template = self.kwargs.get('template')
            if template is None:
                sep = self.kwargs.get('sep', ' ')
                return sep.join(str(a) for a in self.args)
            else:
                return template.format(*self.args, **self.kwargs)

Benefits
========

Now, NameError could be defined as:

    class NameError(Exception):
        pass

A NameError would be raised with:

    try:
        raise NameError(name, template="name '{0}' is not defined.")
    except NameError as e:
        name = e.args[0]
        msg = str(e)
        ...

Or, perhaps like this:

    try:
        raise NameError(name=name, template="name '{name}' is not defined.")
    except NameError as e:
        name = e.kwargs['name']
        msg = str(e)
        ...

One of the nice benefits of this approach is that the message printed
can be easily changed after the exception is caught. For example, it
could be converted to German:

    try:
        raise NameError(name, template="name '{0}' is not defined.")
    except NameError as e:
        print('{}: nicht gefunden.'.format(e.args[0]))

A method could be provided to generate the error message from a custom
format string:

    try:
        raise NameError(name, template="name '{0}' is not defined.")
    except NameError as e:
        print(e.render('{}: nicht gefunden.'))

Another nice benefit of this approach is that both named and unnamed
arguments to the exception are retained and can be processed by the
exception handler. Currently this is only true of unnamed arguments. And
since named arguments are not currently allowed, this proposal is
completely backward compatible.

Of course, with this change, the built-in exceptions should be changed
to use this new approach. Hopefully over time, others will change the
way they write exceptions to follow suit, making it easier to write more
capable exception handlers.

Conclusion
==========

Sorry for the length of the proposal, but I thought it was important to
give a clear rationale for it.
Hopefully my breaking it into sections made it easier to scan. Comments? From greg.ewing at canterbury.ac.nz Sun Jul 2 18:59:15 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 03 Jul 2017 10:59:15 +1200 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: <20170702115422.GR3149@ando.pearwood.info> References: <20170702115422.GR3149@ando.pearwood.info> Message-ID: <59597AC3.9040408@canterbury.ac.nz> Steven D'Aprano wrote:

> isinstance([], List[int])
> isinstance([], List[str])
>
> How can a single value be an instance of two mutually incompatible
> types?

I don't think there's any contradiction there, because the compatibility rules are different for static and runtime types. Statically, when you assign something of type A to a variable of type B, you're asserting that *all* values of type A are compatible with B. But at runtime, you're only asserting that one *particular* value is compatible with B.

> It just seems perverse to me to say that it is "meaningless" (in Mark's
> words) to ask whether
>
> isinstance(['a', 'b'], List[int])
> isinstance(123, List[str])
>
> (for example). If static type checking has any meaning at all, then the
> answers to those two surely have to be False.

I doubt whether Mark meant "separate" to imply "unrelated". Static and runtime types are clearly related, although the relationship is not one-to-one and involves complicated overlaps.

To my mind, the question isn't whether tests like that are meaningful -- clearly they are. The question is whether we should attempt to support answering them at run time, given that doing so in the general case requires unbounded amounts of computation.
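As a point of reference, CPython today refuses to attempt such checks at all: isinstance() against a subscripted generic raises TypeError rather than trying to inspect elements (a quick check, assuming a typing module from 3.5.3 or later):

```python
# typing refuses isinstance() checks against subscripted generics,
# precisely because answering them could require unbounded work.
from typing import List

try:
    isinstance([], List[int])
    result = 'allowed'
except TypeError:
    result = 'refused'

print(result)  # refused
```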
-- Greg From greg.ewing at canterbury.ac.nz Sun Jul 2 19:04:58 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 03 Jul 2017 11:04:58 +1200 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: <20170702115732.GS3149@ando.pearwood.info> References: <20170702111607.GQ3149@ando.pearwood.info> <20170702115732.GS3149@ando.pearwood.info> Message-ID: <59597C1A.4020406@canterbury.ac.nz> Steven D'Aprano wrote:

> I think that we should assume that
>
>     def func(x:Spam()):
>         ...
>
> will always look up and call Spam when the function is defined.

Personally I think we should be moving towards not even guaranteeing that. Then we would have a chance of some day ending up with a static typing mechanism that's sanely designed and truly fit for purpose. -- Greg From tritium-list at sdamon.com Sun Jul 2 23:18:04 2017 From: tritium-list at sdamon.com (Alex Walters) Date: Sun, 2 Jul 2017 23:18:04 -0400 Subject: [Python-ideas] Should iscoroutine and iscoroutinefunction be in builtins? Message-ID: <0fdc01d2f3aa$fcb27770$f6176650$@sdamon.com> Before async/await it made sense that iscoroutine and iscoroutinefunction live in asyncio. But now that coroutines are a built in type, supported by its own syntax, wouldn't it make sense to make those functions builtins? From steve at pearwood.info Sun Jul 2 23:18:10 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 3 Jul 2017 13:18:10 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <20170702191953.GA17773@kundert.designers-guide.com> References: <20170702191953.GA17773@kundert.designers-guide.com> Message-ID: <20170703031810.GU3149@ando.pearwood.info> On Sun, Jul 02, 2017 at 12:19:54PM -0700, Ken Kundert wrote: [...]

> >>> try:
> ...     foo
> ... except Exception as e:
> ...     print('str:', str(e))
> ...     print('args:', e.args)
> str: name 'foo' is not defined
> args: ("name 'foo' is not defined",)
>
> Notice that the only argument is the error message.
> If you want access to the problematic name, you have to dig it out of
> the message.

In the common case, you don't. You know the name because you know what name you just tried to look up:

    try:
        spam
    except NameError:
        name = "spam"  # what else could it be?
        print("%s missing" % name)

The general rule for try...except is that the try block should contain only the smallest amount of code that can fail, that you can deal with. So if it is important for your code to distinguish *which* name failed, you should put them in separate try blocks:

    # don't do this
    try:
        spam or eggs or cheese or aardvark
    except NameError as err:
        name = err.name  # doesn't actually work
        if name == 'spam':
            spam = 'hello'
        elif name == 'eggs':
            eggs = 'hello'
        elif ...  # you get the picture

One problem with that is that it assumes that only *one* name might not exist. If the lookup of spam failed, that doesn't mean that eggs would have succeeded. Instead, we should write:

    # do this
    try:
        spam
    except NameError:
        spam = "hello"
    try:
        eggs
    except NameError:
        ...
    # and so on

I won't categorically state that it is "never" useful to extract the name from NameError (or key from KeyError, index from IndexError, etc) but I'd consider that needing to do so programmatically may be a code smell: something which may be fine, but it smells a little fishy and requires a closer look.

[...]

> This is spelled out in PEP 352, which explicitly recommends that there be only
> one argument and that it be a helpful human readable message. Further it
> suggests that if more than one argument is required that Exception should be
> subclassed and the extra arguments should be attached as attributes.

No restriction is placed upon what may be passed in for args for backwards-compatibility reasons. In practice, though, only a single string argument should be used.
This keeps the string representation of the exception to be a useful message about the exception that is human-readable; this is why the __str__ method special-cases on length-1 args value. Including programmatic information (e.g., an error code number) should be stored as a separate attribute in a subclass.

https://www.python.org/dev/peps/pep-0352/

> Proposal
> ========
>
> I propose that the Exception class be modified to allow passing a message
> template as a named argument that is nothing more than a format string that
> interpolates the exception arguments into an error message. If the template is
> not given, the arguments would simply be converted to strings individually and
> combined as in the print function. So, I am suggesting the BaseException class
> look something like this:

[snip example implementation]

> A NameError would be raised with:
>
>     try:
>         raise NameError(name, template="name '{0}' is not defined.")
>     except NameError as e:
>         name = e.args[0]
>         msg = str(e)
>         ...

I think that's *exactly* what PEP 352 is trying to get away from: people having to memorize the order of arguments to the exception, so they know what index to give to extract them. And I tend to agree. I think that's a poor API. Assuming I ever find a reason to extract the name, I don't want to have to write `error.args[0]` to do so, not if I can write `error.name` instead.

Look at ImportError and OSError: they define individual attributes for important error information:

    py> dir(OSError)
    [ # uninteresting dunders trimmed ...
     'args', 'characters_written', 'errno', 'filename', 'filename2',
     'strerror', 'with_traceback']

> Or, perhaps like this:
>
>     try:
>         raise NameError(name=name, template="name '{name}' is not defined.")
>     except NameError as e:
>         name = e.kwargs['name']
>         msg = str(e)
>         ...

That's not much of an improvement.
Unless you can give a good rationale for why PEP 352 is wrong to recommend named attributes instead of error.args[index], I think this proposal is the wrong approach. -- Steve From steve at pearwood.info Sun Jul 2 23:25:47 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 3 Jul 2017 13:25:47 +1000 Subject: [Python-ideas] Runtime types vs static types In-Reply-To: <59597C1A.4020406@canterbury.ac.nz> References: <20170702111607.GQ3149@ando.pearwood.info> <20170702115732.GS3149@ando.pearwood.info> <59597C1A.4020406@canterbury.ac.nz> Message-ID: <20170703032547.GV3149@ando.pearwood.info> On Mon, Jul 03, 2017 at 11:04:58AM +1200, Greg Ewing wrote:

> Steven D'Aprano wrote:
> > I think that we should assume that
> >
> >     def func(x:Spam()):
> >         ...
> >
> > will always look up and call Spam when the function is defined.
>
> Personally I think we should be moving towards not even
> guaranteeing that. Then we would have a chance of some
> day ending up with a static typing mechanism that's
> sanely designed and truly fit for purpose.

Did you intend to imply that gradual typing in general, or at least as implemented in MyPy, is *insanely* designed and unfit for purpose? -- Steve From steve at pearwood.info Sun Jul 2 23:28:39 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 3 Jul 2017 13:28:39 +1000 Subject: [Python-ideas] Should iscoroutine and iscoroutinefunction be in builtins? In-Reply-To: <0fdc01d2f3aa$fcb27770$f6176650$@sdamon.com> References: <0fdc01d2f3aa$fcb27770$f6176650$@sdamon.com> Message-ID: <20170703032838.GW3149@ando.pearwood.info> On Sun, Jul 02, 2017 at 11:18:04PM -0400, Alex Walters wrote:

> Before async/await it made sense that iscoroutine and iscoroutinefunction
> live in asyncio. But now that coroutines are a built in type, supported by
> its own syntax, wouldn't it make sense to make those functions builtins?
Generators and generator functions are built-in, supported by their own syntax, but 'isgenerator' and 'isgeneratorfunction' still live in the inspect module. So I think the answer to your question is no. -- Steve From njs at pobox.com Sun Jul 2 23:35:34 2017 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 2 Jul 2017 20:35:34 -0700 Subject: [Python-ideas] Should iscoroutine and iscoroutinefunction be in builtins? In-Reply-To: <0fdc01d2f3aa$fcb27770$f6176650$@sdamon.com> References: <0fdc01d2f3aa$fcb27770$f6176650$@sdamon.com> Message-ID: On Sun, Jul 2, 2017 at 8:18 PM, Alex Walters wrote:

> Before async/await it made sense that iscoroutine and iscoroutinefunction
> live in asyncio. But now that coroutines are a built in type, supported by
> its own syntax, wouldn't it make sense to make those functions builtins?

Technically "builtins" refers to the functions like "len" that are automatically available without having to be imported; this is a pretty select set of functions, and iscoroutine{,function} are unlikely to qualify. However, there are now versions of iscoroutine and iscoroutinefunction that live in the "inspect" module, and which do what you'd expect for native built-in coroutines. The versions in the asyncio module are a little different; they also recognize various objects that asyncio wants to treat like coroutines, like generators decorated with @asyncio.coroutine. -n -- Nathaniel J. Smith -- https://vorpus.org From python-ideas at shalmirane.com Mon Jul 3 04:59:02 2017 From: python-ideas at shalmirane.com (Ken Kundert) Date: Mon, 3 Jul 2017 01:59:02 -0700 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <20170703031810.GU3149@ando.pearwood.info> References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> Message-ID: <20170703085902.GA27217@kundert.designers-guide.com> I think in trying to illustrate the existing behavior I made things more confusing than they needed to be. Let me try again.
Consider this code.

    >>> import Food
    >>> try:
    ...     import meals
    ... except NameError as e:
    ...     name = str(e).split("'")[1]  # <-- fragile code
    ...     from difflib import get_close_matches
    ...     candidates = ', '.join(get_close_matches(name, Food.foods, 1, 0.6))
    ...     print(f'{name}: not found. Did you mean {candidates}?')

In this case *meals* instantiates a collection of foods. It is a Python file, but it is also a data file (in this case the user knows Python, so Python is a convenient data format). In that file thousands of foods may be instantiated. If the user misspells a food, I would like to present the available alternatives. To do so, I need the misspelled name. The only way I can get it is by parsing the error message.

That is the problem. To write the error handler, I need the misspelled name. The only way to get it is to extract it from the error message. The need to unpack information that was just packed suggests that the packing was done too early. That is my point.

Fundamentally, pulling the name out of an error message is a really bad coding practice because it is fragile. The code will likely break if the formatting or the wording of the message changes. But given the way the exception was implemented, I am forced to choose between two unpleasant choices: pulling the name from the error message, or not giving the enhanced message at all.

The above example is a bit contrived. I simply wanted to illustrate the basic issue in a few lines of code. However, my intent was also to illustrate what I see as a basic philosophical problem in the way we approach exceptions in Python: it is a nice convenience that an error message is provided by the source of the error, but the source should not have the final say on the matter. Fundamentally, the code that actually presents the error to the user is in a better position to produce a message that is meaningful to the user.
So, when we raise exceptions, not only should we provide a convenient, human-readable error message, we should anticipate that the exception handler may need to reformat or reinterpret the exception, and provide it with what it needs to do so. The current approach taken by exceptions in Python makes that unnecessarily difficult.

PEP 352 suggests that this situation can be handled with a custom exception, and that is certainly true, but that only works if the person writing the code that raises the exception anticipates the need for passing the components of the error message as separate arguments. But as we can see from NameError, AttributeError, etc., they don't always do so. And PEP 352 actively discourages them from doing so.

What I am hoping to do with this proposal is to get the Python developer community to see that:

1. The code that handles the exception benefits from having access to the components of the error message. At the least, it can present the message to the user in the best possible way. Perhaps that means enforcing a particular style, or presenting it in the user's native language, or perhaps it means providing additional related information, as in the example above.

2. The current approach to exceptions follows the opposite philosophy, suggesting that the best place to construct the error message is at the source of the error. It inadvertently puts obstacles in place that make it difficult to customize the message in the handler.

3. Changing the approach in the BaseException class to provide the best of both approaches provides considerable value and is both trivial and backward compatible.
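For reference, the difflib call used in the example above behaves like this (the words here are the ones from the difflib documentation, not food names):

```python
# get_close_matches(word, possibilities, n=3, cutoff=0.6) returns up to
# n entries of possibilities that match word at least as well as cutoff,
# best matches first.
from difflib import get_close_matches

matches = get_close_matches('appel', ['ape', 'apple', 'peach', 'puppy'])
print(matches)  # ['apple', 'ape']
```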
-Ken From p.f.moore at gmail.com Mon Jul 3 05:37:25 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 3 Jul 2017 10:37:25 +0100 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <20170703085902.GA27217@kundert.designers-guide.com> References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> Message-ID: On 3 July 2017 at 09:59, Ken Kundert wrote:

> I think in trying to illustrate the existing behavior I made things more
> confusing than they needed to be. Let me try again.
>
> Consider this code.
>
>     >>> import Food
>     >>> try:
>     ...     import meals
>     ... except NameError as e:
>     ...     name = str(e).split("'")[1]  # <-- fragile code
>     ...     from difflib import get_close_matches
>     ...     candidates = ', '.join(get_close_matches(name, Food.foods, 1, 0.6))
>     ...     print(f'{name}: not found. Did you mean {candidates}?')
>
> In this case *meals* instantiates a collection of foods. It is a Python file,
> but it is also a data file (in this case the user knows Python, so Python is
> a convenient data format). In that file thousands of foods may be instantiated.
> If the user misspells a food, I would like to present the available
> alternatives. To do so, I need the misspelled name. The only way I can get it
> is by parsing the error message.

As Steven pointed out, this is a pretty good example of a code smell. My feeling is that you may have just proved that Python isn't quite as good a fit for your data file format as you thought - or that your design has flaws. Suppose your user had a breakfast menu, and did something like:

    if now < lunchtim:  # Should have been "lunchtime"

Your error handling will be fairly confusing in that case.

> That is the problem. To write the error handler, I need the misspelled name.
> The only way to get it is to extract it from the error message.
> The need to unpack information that was just packed suggests that the
> packing was done too early. That is my point.

I don't have any problem with *having* the misspelled name as an attribute of the error, I just don't think it's going to be as useful as you hope, and it may indeed (as above) encourage people to use it without thinking about whether there might be problems with using error handling that way.

> Fundamentally, pulling the name out of an error message is a really bad coding
> practice because it is fragile. The code will likely break if the formatting or
> the wording of the message changes. But given the way the exception was
> implemented, I am forced to choose between two unpleasant choices: pulling the
> name from the error message or not giving the enhanced message at all.

Or using a different approach. ("Among our different approaches...!" :-)) Agreed that's also an unpleasant choice at this point.

> What I am hoping to do with this proposal is to get the Python developer
> community to see that:
> 1. The code that handles the exception benefits from having access to the
>    components of the error message. In the least it can present the message to
>    the user in the best possible way. Perhaps that means enforcing a particular
>    style, or presenting it in the user's native language, or perhaps it means
>    providing additional related information as in the example above.

I see it as a minor bug magnet, but not really a problem in principle.

> 2. The current approach to exceptions follows the opposite philosophy,
>    suggesting that the best place to construct the error message is at the
>    source of the error. It inadvertently puts obstacles in place that make it
>    difficult to customize the message in the handler.

It's more about implicitly enforcing the policy of "catch errors over as small a section of code as practical". In your example, you're trapping NameError from anywhere in a "many thousands" of line file.
That's about as far from the typical use of one or two lines in a try block as you can get.

> 3. Changing the approach in the BaseException class to provide the best of both
>    approaches provides considerable value and is both trivial and backward
>    compatible.

A small amount of value in a case we don't particularly want to encourage. Whether it's trivial comes down to implementation - I'll leave that to whoever writes the PR to demonstrate. (Although if it *is* trivial, is it something you could write a PR for?)

Also, given that this would be Python 3.7 only, would people needing this functionality (only you have expressed a need so far) be OK with either insisting their users go straight to Python 3.7, or including backward compatible code for older versions?

Overall, I'm -0 on this request (assuming it is trivial to implement - I certainly don't feel it's worth significant implementation effort).

Paul From apalala at gmail.com Mon Jul 3 06:29:05 2017 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Mon, 3 Jul 2017 06:29:05 -0400 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <20170703085902.GA27217@kundert.designers-guide.com> References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> Message-ID: On Mon, Jul 3, 2017 at 4:59 AM, Ken Kundert wrote:

> That is the problem. To write the error handler, I need the misspelled name.
> The only way to get it is to extract it from the error message. The need to
> unpack information that was just packed suggests that the packing was done
> too early. That is my point.

1. You can pass an object with all the required information and an appropriate __str__() method to the exception constructor.

2. If you own the exception hierarchy, you can modify the __str__() method of the exception class.

-- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed...
URL: From victor.stinner at gmail.com Mon Jul 3 07:48:12 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 3 Jul 2017 13:48:12 +0200 Subject: [Python-ideas] Bytecode JIT In-Reply-To: <20170702121305.GT3149@ando.pearwood.info> References: <2474590d-418d-7aaf-09c7-24b34f7fd4a7@gmail.com> <20170702054157.GP3149@ando.pearwood.info> <20170702121305.GT3149@ando.pearwood.info> Message-ID: 2017-07-02 14:13 GMT+02:00 Steven D'Aprano :

> That only solves the problem of mysum being modified, not whether the
> arguments are ints. You still need to know whether it is safe to call
> some low-level (fast) integer addition routine, or whether you have to
> go through the (slow) high-level Python code.
>
> In any case, guards are a kind of runtime check. It might not be an
> explicit isinstance() check, but it logically implies one. If x was
> an int, and nothing has changed, then x is still an int.
>
> If Victor is around, he might like to comment on how his FAT Python
> handles this.

FAT Python design is generic: you are free to implement any kind of check. A check is an object which provides a C callback. FAT Python provides the following guards:

    GuardArgType
    GuardBuiltins
    GuardDict
    GuardFunc
    GuardGlobals

About "mysum being modified", it's handled by this guard: http://fatoptimizer.readthedocs.io/en/latest/fat.html#GuardFunc

Right now, only the func.__code__ is watched. It's not enough, but it's a compromise to keep my implementation simple :-) Tomorrow, if FAT Python becomes a real thing, the builtin function type can be modified to add a version as we have for dictionaries, and the version will be increased for any function modification: argument defaults, arguments, name, etc. We would only have to modify the GuardFunc implementation, not users of this guard.

To really respect the Python semantics, guards became more complex than expected. GuardBuiltins doesn't only check that len() is still the same function in builtins.
It also has to check the globals of the current frame, globals()[name], and the builtins of the current frame. Python allows crazy stuff like running a single function with custom builtin functions: see the exec() function.

Victor From jeff.walker00 at yandex.com Mon Jul 3 15:46:14 2017 From: jeff.walker00 at yandex.com (Jeff Walker) Date: Mon, 03 Jul 2017 13:46:14 -0600 Subject: [Python-ideas] Arguments to exceptions Message-ID: <213001499111174@web5j.yandex.ru> Paul,

I think you are fixating too much on Ken's example. I think I understand what he is saying and I agree with him. It is a problem I struggle with routinely. It occurs in the following situations:

1. You are handling an exception that you are not raising. This could be because Python itself is raising the exception, as in Ken's example, or it could be raised by some package you did not write.

2. You need to process or transform the message in some way.

Consider this example:

    >>> import json
    >>> s = '{"abc": 0, "cdf: 1}'
    >>> try:
    ...     d = json.loads(s)
    ... except Exception as e:
    ...     print(e)
    ...     print(e.args)
    Unterminated string starting at: line 1 column 12 (char 11)
    ('Unterminated string starting at: line 1 column 12 (char 11)',)

Okay, I have caught an exception for which I have no control over how the exception was raised. Now, imagine that I am writing an application that highlights json errors in place. To do so, I would need the line and column numbers to highlight the location of the error, and ideally I'd like to strip them from the base message and just show that. You can see from my second print statement that the line and column numbers were not passed as separate arguments. Thus I need to parse the error message to extract them. Not a difficult job, but fragile. Any change to the error message could break my code.

I don't know what this code smell is that people keep referring to, but to me, that code would smell.
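The parsing described above would look something like this -- a deliberately fragile sketch, since the regular expression depends on the exact wording of the message:

```python
import json
import re

s = '{"abc": 0, "cdf: 1}'
try:
    json.loads(s)
except ValueError as e:
    # Fish the location back out of the rendered message -- fragile,
    # because any rewording of the message breaks the pattern.
    m = re.search(r'line (\d+) column (\d+)', str(e))
    line, column = int(m.group(1)), int(m.group(2))

print(line, column)  # 1 12
```

(As it happens, json is one library that already does what the proposal asks for: since Python 3.5 it raises json.JSONDecodeError, which carries the message components as msg, lineno, colno and pos attributes, so this particular regex is avoidable.)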
Jeff > On 3 July 2017 at 09:59, Ken Kundert wrote: > > I think in trying to illustrate the existing behavior I made things more > > confusing than they needed to be. Let me try again. > > > > Consider this code. > > > > >>> import Food > > >>> try: > > ... import meals > > ... except NameError as e: > > ... name = str(e).split("'")[1] # <-- fragile code > > ... from difflib import get_close_matches > > ... candidates = ', '.join(get_close_matches(name, Food.foods, 1, 0.6)) > > ... print(f'{name}: not found. Did you mean {candidates}?') > > > > In this case *meals* instantiates a collection of foods. It is a Python file, > > but it is also a data file (in this case the user knows Python, so Python is > > a convenient data format). In that file thousands of foods may be instantiated. > > If the user misspells a food, I would like to present the available > > alternatives. To do so, I need the misspelled name. The only way I can get it > > is by parsing the error message. > > As Steven pointed out, this is a pretty good example of a code smell. > My feeling is that you may have just proved that Python isn't quite as > good a fit for your data file format as you thought - or that your > design has flaws. Suppose your user had a breakfast menu, and did > something like: > > if now < lunchtim: # Should have been "lunchtime" > > Your error handling will be fairly confusing in that case. > > > That is the problem. To write the error handler, I need the misspelled name. > > The only way to get it is to extract it from the error message. The need to > > unpack information that was just packed suggests that the packing was done too > > early. That is my point. > > I don't have any problem with *having* the misspelled name as an > attribute to the error, I just don't think it's going to be as useful > as you hope, and it may indeed (as above) encourage people to use it > without thinking about whether there might be problems with using > error handling that way. 
> > > Fundamentally, pulling the name out of an error message is a really bad coding > > practice because it is fragile. The code will likely break if the formatting or > > the wording of the message changes. But given the way the exception was > > implemented, I am forced to choose between two unpleasant choices: pulling the > > name from the error message or not giving the enhanced message at all. > > Or using a different approach. ("Among our different approaches...!" > :-)) Agreed that's also an unpleasant choice at this point. > > > What I am hoping to do with this proposal is to get the Python developer > > community to see that: > > 1. The code that handles the exception benefits from having access to the > > components of the error message. In the least it can present the message to > > the user is the best possible way. Perhaps that means enforcing a particular > > style, or presenting it in the user's native language, or perhaps it means > > providing additional related information as in the example above. > > I see it as a minor bug magnet, but not really a problem in principle. > > > 2. The current approach to exceptions follows the opposite philosophy, > > suggesting that the best place to construct the error message is at the > > source of the error. It inadvertently puts obstacles in place that make it > > difficult to customize the message in the handler. > > It's more about implicitly enforcing the policy of "catch errors over > as small a section of code as practical". In your example, you're > trapping NameError from anywhere in a "many thousands" of line file. > That's about as far from the typical use of one or two lines in a try > block as you can get. > > > 3. Changing the approach in the BaseException class to provide the best of both > > approaches provides considerable value and is both trivial and backward > > compatible. > > A small amount of value in a case we don't particularly want to encourage. 
> Whether it's trivial comes down to implementation - I'll leave that to > whoever writes the PR to demonstrate. (Although if it *is* trivial, is > it something you could write a PR for?) > > Also, given that this would be Python 3.7 only, would people needing > this functionality (only you have expressed a need so far) be OK with > either insisting their users go straight to Python 3.7, or including > backward compatible code for older versions? > > Overall, I'm -0 on this request (assuming it is trivial to implement - > I certainly don't feel it's worth significant implementation effort). > > Paul From p.f.moore at gmail.com Mon Jul 3 16:23:29 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 3 Jul 2017 21:23:29 +0100 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <213001499111174@web5j.yandex.ru> References: <213001499111174@web5j.yandex.ru> Message-ID: On 3 July 2017 at 20:46, Jeff Walker wrote: > I think you are fixating too much on Ken's example. I think I understand what he > is saying and I agree with him. It is a problem I struggle with routinely. It occurs in > the following situations: Possibly. I hadn't reread the original email. Having done so, I'm confused as to how the proposal and the example are related. The proposal makes no difference unless the places where (for example) NameError are raised are changed. But the proposal doesn't suggest changing how the interpreter raises NameError. So how will the proposal make a difference? I'd understood from the example that Ken's need was to be able to find the name that triggered the NameError. His proposal doesn't do that (unless we are talking about user-raised NameError exceptions, as opposed to ones the interpreter raises - in which case why not just use a user-defined exception? 
So I'm -1 on his proposal, as I don't see anything in it that couldn't be done in user code for user-defined exceptions, and there's nothing in the proposal suggesting a change in how interpreter-raised exceptions are created.

> 1. You are handling an exception that you are not raising. This could be because
>    Python itself is raising the exception, as in Ken's example, or it could be raised
>    by some package you did not write.
> 2. You need to process or transform the message in some way.

Then yes, you need to know the API presented by the exception. Projects (and the core interpreter) are not particularly good at documenting (or designing) the API for their exceptions, but that doesn't alter the fact that exceptions are user-defined classes and as such do have an API. I'd be OK with arguments that the API of built-in exceptions as raised by the interpreter could be improved. Indeed, I thought that was Ken's proposal. But his proposal seems to be that if we add a __str__ method to BaseException, that will somehow automatically improve the API of all other exceptions. To quote Ken:

> However, if more than one argument is passed, you get the string representation
> of the tuple containing all the arguments:
>
>     >>> try:
>     ...     raise Exception('Hey there!', 'Something went wrong.')
>     ... except Exception as e:
>     ...     print(str(e))
>     ('Hey there!', 'Something went wrong.')
>
> That behavior does not seem very useful, and I believe it leads to people
> passing only one argument to their exceptions.

Alternatively, I could argue that code which uses print(str(e)) as its exception handling isn't very well written, and the fact that people do this is what leads to people passing only one argument to their exceptions when creating them.

Look, I see that there might be something that could be improved here. But I don't see an explanation of how, if we implement just the proposed change to BaseException, the user code that Ken's quoting as having a problem could be improved.
There seems to be an assumption of "and because of that change, people raising exceptions would change what they do". Frankly, no they wouldn't. There's no demonstrated benefit for them, and they'd have to maintain a mess of backward compatibility code. So why would they bother? Anyway, you were right that I'd replied to just the example, not the original proposal. I apologise for that, I should have read the thread more carefully. But if I had done so, it wouldn't have made much difference - I still don't see a justification for the proposed change. Paul From jeff.walker00 at yandex.com Mon Jul 3 16:56:39 2017 From: jeff.walker00 at yandex.com (Jeff Walker) Date: Mon, 03 Jul 2017 14:56:39 -0600 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <213001499111174@web5j.yandex.ru> Message-ID: <293031499115399@web13j.yandex.ru> Paul, Indeed, nothing gets better until people change the way they do their exceptions. Ken's suggested enhancement to BaseException does not directly solve the problem, but it removes the roadblocks that discourage people from passing the components to the message. Seems to me that to address this problem, four things are needed: 1. Change BaseException. This allows people to pass the components to the message without ruining str(e). 2. A PEP that educates the Python community on this better way of writing exceptions. 3. Changes to the base language and standard library to employ the new approach. These changes would be quite small and could be done opportunistically. 4. Time. Over time things will just get better as more people see the benefits of this new approach to exceptions and adopt it. And if they don't, we are no worse off than we are now. And frankly, I don't see any downside. The changes he is proposing are tiny and simple and backward compatible. -Jeff 03.07.2017, 14:23, "Paul Moore" : > On 3 July 2017 at 20:46, Jeff Walker wrote: >> I think you are fixating too much on Ken's example.
I think I understand what he >> is saying and I agree with him. It is a problem I struggle with routinely. It occurs in >> the following situations: > > Possibly. I hadn't reread the original email. Having done so, I'm > confused as to how the proposal and the example are related. The > proposal makes no difference unless the places where (for example) > NameError are raised are changed. But the proposal doesn't suggest > changing how the interpreter raises NameError. So how will the > proposal make a difference? I'd understood from the example that Ken's > need was to be able to find the name that triggered the NameError. His > proposal doesn't do that (unless we are talking about user-raised > NameError exceptions, as opposed to ones the interpreter raises - in > which case why not just use a user-defined exception?) > > So I'm -1 on his proposal, as I don't see anything in it that couldn't > be done in user code for user-defined exceptions, and there's nothing > in the proposal suggesting a change in how interpreter-raised > exceptions are created. > >> 1. You are handling an exception that you are not raising. This could be because >> Python itself is raising the exception, as in Ken's example, or it could be raised >> by some package you did not write. >> 2. You need to process or transform the message in some way. > > Then yes, you need to know the API presented by the exception. > Projects (and the core interpreter) are not particularly good at > documenting (or designing) the API for their exceptions, but that > doesn't alter the fact that exceptions are user-defined classes and as > such do have an API. I'd be OK with arguments that the API of built in > exceptions as raised by the interpreter could be improved. Indeed, I > thought that was Ken's proposal. But his proposal seems to be that if > we add a __str__ method to BaseException, that will somehow > automatically improve the API of all other exceptions.
> > To quote Ken: > >> However, if more than one argument is passed, you get the string representation >> of the tuple containing all the arguments: >> >> >>> try: >> ... raise Exception('Hey there!', 'Something went wrong.') >> ... except Exception as e: >> ... print(str(e)) >> ('Hey there!', 'Something went wrong.') >> >> That behavior does not seem very useful, and I believe it leads to people >> passing only one argument to their exceptions. > > Alternatively, I could argue that code which uses print(str(e)) as its > exception handling isn't very well written, and the fact that people > do this is what leads to people passing only one argument to their > exceptions when creating them. > > Look, I see that there might be something that could be improved here. > But I don't see an explanation of how, if we implement just the > proposed change to BaseException, the user code that Ken's quoting as > having a problem could be improved. There seems to be an assumption of > "and because of that change, people raising exceptions would change > what they do". Frankly, no they wouldn't. There's no demonstrated > benefit for them, and they'd have to maintain a mess of backward > compatibility code. So why would they bother? > > Anyway, you were right that I'd replied to just the example, not the > original proposal. I apologise for that, I should have read the thread > more carefully. But if I had done so, it wouldn't have made much > difference - I still don't see a justification for the proposed > change.
> > Paul From p.f.moore at gmail.com Mon Jul 3 17:44:20 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 3 Jul 2017 22:44:20 +0100 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <293031499115399@web13j.yandex.ru> References: <213001499111174@web5j.yandex.ru> <293031499115399@web13j.yandex.ru> Message-ID: On 3 July 2017 at 21:56, Jeff Walker wrote: > Paul, > Indeed, nothing gets better until people change the way they do their > exceptions. Ken's suggested enhancement to BaseException does not > directly solve the problem, but it removes the roadblocks that discourage > people from passing the components to the message. As noted, I disagree that people are not passing components because str(e) displays them the way it does. But we're both just guessing at people's motivations, so there's little point in speculating. > Seems to me that to address this problem, four things are needed: > 1. Change BaseException. This allows people to pass the components > to the message without ruining str(e). I dispute this is the essential place to start. If nothing else, the proposed approach encourages people to use a position-based "args" attribute for exceptions, rather than properly named attributes. > 2. A PEP that educates the Python community on this better way of > writing exceptions. Educating the community can be done right now, and doesn't need a PEP. Someone could write a blog post, or an article, that explains how to code exception classes, how to create the exceptions, and how client code can/should use the API. This can be done now, all you need to do is to start with "at the moment, BaseException doesn't implement these features, so you should create an application-specific base exception class to minimise duplication of code". If project authors take up the proposed approach, then that makes a good argument for moving the supporting code into the built in BaseException class. > 3. 
Changes to the base language and standard library to employ the > new approach. These would changes would be quite small and could > be done opportunistically. And I've never said that there's a problem with these. Although I do dispute that using an args list is the best approach here - I'd much rather see NameError instances have a "name" attribute that had the name that couldn't be found as its value. Opportunistic changes to built in exceptions can implement the most appropriate API for the given exception - why constrain such changes to a "lowest common denominator" API that is ideal for no-one? class NameError(BaseException): def __init__(self, name): self.name = name def __str__(self): return f"name '{self.name}' is not defined" Of course, that's not backward compatible as it stands, but it could probably be made so, just as easily as implementing the proposed solution. > 4. Time. Over time things will just get better as more people see the > benefits of this new approach to exceptions and adopt it. And if they > don't, we are no worse off than we are now. The same could be said of any improved practice. And I agree, let's encourage people to learn to write better code, and promote good practices. There's definitely no reason not to do this. > And frankly, I don't see any downside. The changes he is proposing are > tiny and simple and backward compatible. Well, the main downside I see is that I don't agree that the proposed changes are the best possible approach. Implementing them in the built in exceptions therefore makes it harder for people to choose better approaches (or at least encourages them not to look for better approaches). There's no way I'd consider that e.args[0] as a better way to get the name that triggered a NameError than e.name. This seems to me to be something that should be experimented with and proven outside of the stdlib, before we rush to change the language. I don't see anything that makes that impossible. 
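Paul's NameError sketch can in fact be made backward compatible along these lines; a hypothetical subclass (the name RichNameError is invented here for illustration), not a proposal for the builtin:

```python
# A backward-compatible variant of Paul's sketch: the failed name is
# kept as a named attribute, while str(e) and e.args[0] still yield
# the familiar message for existing code.
class RichNameError(NameError):
    def __init__(self, name):
        self.name = name
        super().__init__(f"name '{name}' is not defined")

try:
    raise RichNameError('foo')
except NameError as err:
    assert err.name == 'foo'                        # structured access
    assert str(err) == "name 'foo' is not defined"  # legacy access
```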
Paul From python at mrabarnett.plus.com Mon Jul 3 19:04:16 2017 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 4 Jul 2017 00:04:16 +0100 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <213001499111174@web5j.yandex.ru> <293031499115399@web13j.yandex.ru> Message-ID: On 2017-07-03 22:44, Paul Moore wrote: > On 3 July 2017 at 21:56, Jeff Walker wrote: >> Paul, >> Indeed, nothing gets better until people change the way they do their >> exceptions. Ken's suggested enhancement to BaseException does not >> directly solve the problem, but it removes the roadblocks that discourage >> people from passing the components to the message. > > As noted, I disagree that people are not passing components because > str(e) displays them the way it does. But we're both just guessing at > people's motivations, so there's little point in speculating. > >> Seems to me that to address this problem, four things are needed: >> 1. Change BaseException. This allows people to pass the components >> to the message without ruining str(e). > > I dispute this is the essential place to start. If nothing else, the > proposed approach encourages people to use a position-based "args" > attribute for exceptions, rather than properly named attributes. > >> 2. A PEP that educates the Python community on this better way of >> writing exceptions. > > Educating the community can be done right now, and doesn't need a PEP. > Someone could write a blog post, or an article, that explains how to > code exception classes, how to create the exceptions, and how client > code can/should use the API. This can be done now, all you need to do > is to start with "at the moment, BaseException doesn't implement these > features, so you should create an application-specific base exception > class to minimise duplication of code". If project authors take up the > proposed approach, then that makes a good argument for moving the > supporting code into the built in BaseException class. > >> 3. 
Changes to the base language and standard library to employ the >> new approach. These would changes would be quite small and could >> be done opportunistically. > > And I've never said that there's a problem with these. Although I do > dispute that using an args list is the best approach here - I'd much > rather see NameError instances have a "name" attribute that had the > name that couldn't be found as its value. Opportunistic changes to > built in exceptions can implement the most appropriate API for the > given exception - why constrain such changes to a "lowest common > denominator" API that is ideal for no-one? > > class NameError(BaseException): > def __init__(self, name): > self.name = name > def __str__(self): > return f"name '{self.name}' is not defined" > > Of course, that's not backward compatible as it stands, but it could > probably be made so, just as easily as implementing the proposed > solution. > >> 4. Time. Over time things will just get better as more people see the >> benefits of this new approach to exceptions and adopt it. And if they >> don't, we are no worse off than we are now. > > The same could be said of any improved practice. And I agree, let's > encourage people to learn to write better code, and promote good > practices. There's definitely no reason not to do this. > >> And frankly, I don't see any downside. The changes he is proposing are >> tiny and simple and backward compatible. > > Well, the main downside I see is that I don't agree that the proposed > changes are the best possible approach. Implementing them in the built > in exceptions therefore makes it harder for people to choose better > approaches (or at least encourages them not to look for better > approaches). There's no way I'd consider that e.args[0] as a better > way to get the name that triggered a NameError than e.name. > > This seems to me to be something that should be experimented with and > proven outside of the stdlib, before we rush to change the language. 
I > don't see anything that makes that impossible. > Maybe exceptions could put any keyword arguments into the instance's __dict__: class BaseException: def __init__(self, *args, **kwargs): self.args = args self.__dict__.update(kwargs) You could then raise: raise NameError('name {!a} is not defined', name='foo') and catch: try: ... except NameError as e: print('{}: nicht gefunden.'.format(e.name)) From greg.ewing at canterbury.ac.nz Mon Jul 3 19:46:13 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 04 Jul 2017 11:46:13 +1200 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <213001499111174@web5j.yandex.ru> <293031499115399@web13j.yandex.ru> Message-ID: <595AD745.4020103@canterbury.ac.nz> Paul Moore wrote: > As noted, I disagree that people are not passing components because > str(e) displays them the way it does. But we're both just guessing at > people's motivations, so there's little point in speculating. I've no doubt that the current situation encourages people to be lazy -- I know, because I'm guilty of it myself! Writing a few extra lines to store attributes away and format them in __str__ might not seem like much, but in most cases those lines are of no direct benefit to the person writing the code, so there's little motivation to do it right. -- Greg From python-ideas at shalmirane.com Mon Jul 3 19:58:21 2017 From: python-ideas at shalmirane.com (Ken Kundert) Date: Mon, 3 Jul 2017 16:58:21 -0700 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <213001499111174@web5j.yandex.ru> <293031499115399@web13j.yandex.ru> Message-ID: <20170703235821.GB18483@kundert.designers-guide.com> All, My primary concern is gaining access to the components that make up the messages. I am not hung up on the implementation. I just proposed the minimum that I thought would resolve the issue and introduce the least amount of risk. Concerning MRAB's idea of making the named arguments attributes, I am good with it. 
I considered it, though I was thinking of using __getattr__(), but thought that perhaps it was a step too far for the BaseException. -Ken On Tue, Jul 04, 2017 at 12:04:16AM +0100, MRAB wrote: > Maybe exceptions could put any keyword arguments into the instance's > __dict__: > > class BaseException: > def __init__(self, *args, **kwargs): > self.args = args > self.__dict__.update(kwargs) > > You could then raise: > > raise NameError('name {!a} is not defined', name='foo') > > and catch: > > try: > ... > except NameError as e: > print('{}: nicht gefunden.'.format(e.name)) From ncoghlan at gmail.com Tue Jul 4 01:08:55 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 4 Jul 2017 15:08:55 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <595AD745.4020103@canterbury.ac.nz> References: <213001499111174@web5j.yandex.ru> <293031499115399@web13j.yandex.ru> <595AD745.4020103@canterbury.ac.nz> Message-ID: On 4 July 2017 at 09:46, Greg Ewing wrote: > Paul Moore wrote: >> >> As noted, I disagree that people are not passing components because >> str(e) displays them the way it does. But we're both just guessing at >> people's motivations, so there's little point in speculating. > > > I've no doubt that the current situation encourages people > to be lazy -- I know, because I'm guilty of it myself! > > Writing a few extra lines to store attributes away and format > them in __str__ might not seem like much, but in most cases > those lines are of no direct benefit to the person writing > the code, so there's little motivation to do it right. So isn't this a variant of the argument that defining well-behaved classes currently involves writing too much boilerplate code, and the fact that non-structured exceptions are significantly easier to define than structured ones is just an example of that more general problem?
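The builtin BaseException can't be swapped out from Python code, but MRAB's kwargs idea works today as the kind of user-level "data record" base class Nick describes; a sketch, with RichError and the template argument invented for illustration:

```python
# A user-level base class in the spirit of MRAB's sketch: keyword
# arguments become named attributes, and __str__ formats them into a
# template, so handlers get both structured data and a readable message.
class RichError(Exception):
    def __init__(self, template, **kwargs):
        super().__init__(template)
        self.template = template
        self.kwargs = kwargs
        self.__dict__.update(kwargs)  # expose kwargs as attributes

    def __str__(self):
        return self.template.format(**self.kwargs)

try:
    raise RichError('name {name!r} is not defined', name='foo')
except RichError as err:
    assert err.name == 'foo'
    assert str(err) == "name 'foo' is not defined"
```

Defining the boilerplate once in a shared base class is the point: each concrete exception then only has to pick a template and pass named components.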
I personally don't think there's anything all *that* special about exceptions in this case - they're just a common example of something that would be better handled as a "data record" type, but is commonly handled as an opaque string because they're so much easier to define that way. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Tue Jul 4 04:28:48 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 4 Jul 2017 09:28:48 +0100 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <213001499111174@web5j.yandex.ru> <293031499115399@web13j.yandex.ru> <595AD745.4020103@canterbury.ac.nz> Message-ID: On 4 July 2017 at 06:08, Nick Coghlan wrote: > On 4 July 2017 at 09:46, Greg Ewing wrote: >> Paul Moore wrote: >>> >>> As noted, I disagree that people are not passing components because >>> str(e) displays them the way it does. But we're both just guessing at >>> people's motivations, so there's little point in speculating. >> >> >> I've no doubt that the current situation encourages people >> to be lazy -- I know, because I'm guilty of it myself! >> >> Writing a few extra lines to store attributes away and format >> them in __str__ might not seem like much, but in most cases >> those lines are of no direct benefit to the person writing >> the code, so there's little motivation to do it right. > > So isn't this a variant of the argument that defining well-behaved > classes currently involves writing too much boilerplate code, and the > fact that non-structured exceptions are significantly easier to define > than structured ones is just an example of that more general problem? > > I personally don't think there's anything all *that* special about > exceptions in this case - they're just a common example of something > that would be better handled as a "data record" type, but is commonly > handled as an opaque string because they're so much easier to define > that way. 
Yes, that's what I was (badly) trying to say. I agree that we could hide a lot of the boilerplate in BaseException (which is what Ken was suggesting) but I don't believe we yet know the best way to write that boilerplate, so I'm reluctant to put anything in the stdlib until we do know better. For now, experimenting with 3rd party "rich exception" base classes seems a sufficient option. It's possible that more advanced methods than simply using a base class may make writing good exception classes even easier, but I'm not sure I've seen any evidence of that yet. Paul From steve at pearwood.info Tue Jul 4 10:03:00 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 5 Jul 2017 00:03:00 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <20170703085902.GA27217@kundert.designers-guide.com> References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> Message-ID: <20170704140259.GX3149@ando.pearwood.info> On Mon, Jul 03, 2017 at 01:59:02AM -0700, Ken Kundert wrote: > I think in trying to illustrate the existing behavior I made things more > confusing than they needed to be. Let me try again. I understood you the first time :-) I agree that scraping the name from the NameError exception is a fragile hack. What I'm questioning is *how often that needs to be done*. As I see it, there are broadly two use-cases for wanting the name from NameError (or AttributeError, or the index from IndexError, or the key from KeyError -- for brevity, I will use "name" and "NameError" to stand in for *all* these cases). 1. You are a developer reading an unexpected NameError exception, and now you need to debug the code and fix the bug. In this case, just reading the error message is sufficient. There's no need to programmatically extract the name. 2. You have a `try...except` block and you've just caught NameError and want to handle it programmatically.
In that second case, needing to extract the name from the exception is a code smell. A *strong* code smell -- it suggests that you're doing too much in the try... block. You should already know which name lookup failed, and so extracting the name from the exception is redundant: try: unicode except NameError: # Python 2/3 compatibility unicode = str What other name could it be? I believe that if you are dealing with a NameError where you want to programmatically deal with a missing name, but you don't know what that name is, you're already in deep, deep trouble and the fact that you have to scrape the error message for the name is the least of your problems: try: # Feature detection for Python 2/3 compatibility. unicode ascii reduce callable except NameError as err: name = extract_name(err) # somehow... if name == 'unicode': unicode = str elif name == 'ascii': ascii = ... elif name == 'reduce': from functools import reduce elif name == 'callable': def callable(obj): ... I trust that the problem with this is obvious. The only way to safely write this code is to test for each name individually, in which case we're back to point 2 above where you know what name failed and you don't need to extract it at all. It is my hand-wavy estimate that these two use-cases cover about 95% of uses for the name. We might quibble over percentages, but I think we should agree that whatever the number is, it is a great majority. Any other use-cases, like your example of translating error messages, or suggesting "Did you mean...?" alternatives, are fairly niche uses. So my position is that given the use-cases for programmatically extracting the name from NameError fall into a quite small niche, this feature is a "Nice To Have" not a "Must Have". It seems to me that the benefit is quite marginal, and so any more than the most trivial cost to implement this is likely to be too much for the benefit gained. I don't just mean the effort of implementing your suggested change. 
I mean, if there is even a small chance that in the future people will expect me (or require me!) to write code like this: raise NameError('foo', template='missing "{}"') instead of raise NameError('missing "foo"') then the cost of this new feature is more than the benefit to me, and I don't want it. I see little benefit and a small cost (more of a nuisance than a major drama, but a nuisance is still a negative). I wouldn't use the new style if it were available (why should I bother?) and I'd be annoyed if I were forced to use it. Contrast that to OSError, where the ability to extract the errno and errstr separately is *very* useful. When you catch an OSError or IOError, you typically have *no idea* what the underlying errno is, it could be one of many. I don't object to writing this: raise OSError(errno, 'message') because I see real benefit. [...] > The above is an example. It is a bit contrived. I simply wanted to illustrate > the basic issue in a few lines of code. However, my intent was also to > illustrate what I see as a basic philosophical problem in the way we approach > exceptions in Python: > > It is a nice convenience that an error message is provided by the source of > the error, but it should not have the final say on the matter. > Fundamentally, the code that actually presents the error to the user is in > a better position to produce a message that is meaningful to the user. So, > when we raise exceptions, not only should we provide a convenient human > readable error message, we should anticipate that the exception handler may > need to reformat or reinterpret the exception and provide it with what it > need to do so. That argument makes a certain amount of sense, but it's still a niche use-case. In *general*, we don't do our users any favours by reinterpreting standard error messages into something that they can't google, so even if this were useful, I can't see it becoming used in a widespread manner. [...]
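The OSError contrast is easy to check: since the PEP 3151 rework in Python 3.3, its constructor arguments land in real named attributes, which is exactly the structured access being argued over:

```python
import errno

# OSError stores its arguments as named attributes -- no message
# scraping is needed to recover the error number or filename.
err = OSError(errno.ENOENT, 'No such file or directory', 'config.yml')

assert err.errno == errno.ENOENT
assert err.strerror == 'No such file or directory'
assert err.filename == 'config.yml'
```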
> What I am hoping to do with this proposal is to get the Python developer > community to see that: > 1. The code that handles the exception benefits from having access to the > components of the error message. I don't think that's generally true. > In the least it can present the message to > the user in the best possible way. Perhaps that means enforcing a particular > style, or presenting it in the user's native language, or perhaps it means > providing additional related information as in the example above. And I think that's actually counter-productive, at least in general, although I'm willing to accept that there may be situations where it is helpful. > 2. The current approach to exceptions follows the opposite philosophy, > suggesting that the best place to construct the error message is at the > source of the error. What else understands the error better than the source of the error? > It inadvertently puts obstacles in place that make it > difficult to customize the message in the handler. > > 3. Changing the approach in the BaseException class to provide the best of both > approaches provides considerable value and is both trivial and backward > compatible. I'll discuss your suggested API in a followup email. -- Steve From steve at pearwood.info Tue Jul 4 10:10:26 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 5 Jul 2017 00:10:26 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> Message-ID: <20170704141026.GY3149@ando.pearwood.info> On Mon, Jul 03, 2017 at 06:29:05AM -0400, Juancarlo Añez wrote: > On Mon, Jul 3, 2017 at 4:59 AM, Ken Kundert > wrote: > > > That is the problem. To write the error handler, I need the misspelled > > name.
The need to > > unpack information that was just packed suggests that the packing was done > > too > > early. That is my point. > > > > > 1. You can pass an object with all the required information and an > appropriate __str__() method to the exception constructor. Playing Devil's Advocate, or in this case, Ken's Advocate, I don't think that's a useful approach. Think of it from the perspective of the caller, who catches the exception. They have no way of forcing the callee (the code being called) to use that custom object with the appropriate __str__ method, so they can't rely on it: try: something() except NameError as err: msg = err.args[0] if hasattr(msg, 'name'): name = msg.name else: # extract using a regex or similar... name = ... which doesn't seem very friendly. And who says that this custom object with a custom __str__ actually uses 'name' as part of its API? I might be extracting a completely different attribute unrelated to the failed name lookup. I think Ken is right that *if* this problem is worth solving, we should solve it in BaseException (or at least StandardException), and not leave it up to individual users to write their own mutually incompatible APIs for extracting the name from NameError. -- Steve From steve at pearwood.info Tue Jul 4 10:37:54 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 5 Jul 2017 00:37:54 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <213001499111174@web5j.yandex.ru> References: <213001499111174@web5j.yandex.ru> Message-ID: <20170704143753.GZ3149@ando.pearwood.info> On Mon, Jul 03, 2017 at 01:46:14PM -0600, Jeff Walker wrote: > Consider this example: > > import json > > >>> s = '{"abc": 0, "cdf: 1}' > > >>> try: > ... d = json.loads(s) > ... except Exception as e: > ... print(e) > ... 
print(e.args) > Unterminated string starting at: line 1 column 12 (char 11) > ('Unterminated string starting at: line 1 column 12 (char 11)',) > > Okay, I have caught an exception for which I have no control over how the > exception was raised. Now, imagine that I am writing an application that highlights > json errors in place. To do so, I would need the line and column numbers to > highlight the location of the error, and ideally I'd like to strip them from the base > message and just show that. I disagree: you should be using a JSON library which offers the line and column number as part of its guaranteed, supported API for dealing with bad JSON. If those numbers aren't part of the library API, then it is just as much of a fragile hack to extract them from exception.args[] as to scrape them from the error message. Suppose you write this: print(err.args) which gives: [35, 42, 'Unterminated string starting at: line 42 column 36'] Without parsing the error string, how do you know whether 35 is the line number or the column number or the character offset or something else that you didn't think of? Even when the information is there, in the exception args, it is still a fragile hack to extract them if they aren't part of the API. > I don't know what this code smell is that people keep referring to, but to me, > that code would smell. https://www.martinfowler.com/bliki/CodeSmell.html -- Steve From steve at pearwood.info Tue Jul 4 10:45:44 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 5 Jul 2017 00:45:44 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <213001499111174@web5j.yandex.ru> <293031499115399@web13j.yandex.ru> Message-ID: <20170704144544.GA3149@ando.pearwood.info> On Mon, Jul 03, 2017 at 10:44:20PM +0100, Paul Moore wrote: > > 1. Change BaseException. This allows people to pass the components > > to the message without ruining str(e). > > I dispute this is the essential place to start. 
If nothing else, the > proposed approach encourages people to use a position-based "args" > attribute for exceptions, rather than properly named attributes. Right -- and not only does that go against the exception PEP https://www.python.org/dev/peps/pep-0352/ but it's still fragile unless the callee guarantees to always pass the values you want in the order you expect. And even if they do make that promise, named attributes are simply better than positional arguments. Who wants to write: err.args[3] # or is it 4, I always have to look it up... when you could write: err.column instead? -- Steve From steve at pearwood.info Tue Jul 4 10:50:34 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 5 Jul 2017 00:50:34 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <20170704141026.GY3149@ando.pearwood.info> References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704141026.GY3149@ando.pearwood.info> Message-ID: <20170704145034.GB3149@ando.pearwood.info> On Wed, Jul 05, 2017 at 12:10:26AM +1000, Steven D'Aprano wrote: > I think Ken is right that *if* this problem is worth solving, we should > solve it in BaseException (or at least StandardException), Oops, I forgot that StandardException is gone. And it used to be spelled StandardError. 
-- Steve From steve at pearwood.info Tue Jul 4 11:33:12 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 5 Jul 2017 01:33:12 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <20170702191953.GA17773@kundert.designers-guide.com> References: <20170702191953.GA17773@kundert.designers-guide.com> Message-ID: <20170704153309.GC3149@ando.pearwood.info> On Sun, Jul 02, 2017 at 12:19:54PM -0700, Ken Kundert wrote: > class BaseException: > def __init__(self, *args, **kwargs): > self.args = args > self.kwargs = kwargs > > def __str__(self): > template = self.kwargs.get('template') > if template is None: > sep = self.kwargs.get('sep', ' ') > return sep.join(str(a) for a in self.args) > else: > return template.format(*self.args, **self.kwargs) I think this API is too general. It accepts arbitrary keyword arguments and stores them in the exception. I think that makes for a poor experience for the caller: try: something() except MyParsingError as err: if 'column' in err.kwargs: ... elif 'col' in err.kwargs: ... elif 'x' in err.kwargs: # 'x' is the column, or is it the row? ... The problem here is, unless the exception class offers a real API for extracting the column, how do you know what key to use? You can't expect BaseException to force the user of MyParsingError to be consistent: raise MyParsingError(17, 45, template='Error at line {} column {}') raise MyParsingError(45, 17, template='Error at column {} line {}') raise MyParsingError(45, 17, temlpate='Error at col {} row {}') # oops raise MyParsingError(17, template='Error at line {}') raise MyParsingError(99, 45, 17, template='Error code {} at column {} line {}') Unless MyParsingError offers a consistent (preferably using named attributes) API for extracting the data, pulling it out of args is just as much of a hack as scraping it from the error message. It seems to me that we can't solve this problem at BaseException. It has to be tackled by each exception class. 
Only the author of each exception class knows what information it carries and should be provided via named attributes.

I think OSError has a good approach.

https://docs.python.org/3/library/exceptions.html#OSError

For backwards compatibility, when you raise OSError with a single argument, it looks like this:

raise OSError('spam is my problem')
=> OSError: spam is my problem

When you offer two or up to five arguments, they get formatted into a nicer error message, and stored into named attributes:

raise OSError(99, 'spam is my problem', 'xxx', 123, 'yyy')
=> OSError: [Errno 99] spam is my problem: 'xxx' -> 'yyy'

Missing arguments default to None.

I don't think there is a generic recipe that will work for all exceptions, or even all exceptions in the standard exception hierarchy, that can make that pleasant. Perhaps I'm insufficiently imaginative, but I don't think this problem can be solved with a quick hack of the BaseException class.

-- Steve

From mertz at gnosis.cx Tue Jul 4 15:32:41 2017 From: mertz at gnosis.cx (David Mertz) Date: Tue, 4 Jul 2017 12:32:41 -0700 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <20170704140259.GX3149@ando.pearwood.info> References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> Message-ID:

I don't see the usefulness of rich exception data as being at all as limited as this. Here's some toy code that shows a use:

----

# For some reason, imports might not be ready immediately
# Maybe flaky network drive, maybe need to install modules, etc
# The overall program can do things to make them available
lazy_import("foo", "bar", "baz", "blat")

while True:
    try:
        x = foo(1) * bar(2) + baz(3)**blat(4)
        break
    except NameError as err:
        lazy_import(err.name)
        sleep(1)

----

I'd like the expression defining x all in one place together.
Then I'd like to try again to import the functions used until they are available. In this case, I have assumed the "making available" might take some time, so I sleep in the loop. Of course I could write this in other ways also. But this one feels natural and concise. Of course, I probably *could* monkey-patch NameError with no language change. But not needing to would be a nice pattern. On Jul 4, 2017 7:04 AM, "Steven D'Aprano" wrote: On Mon, Jul 03, 2017 at 01:59:02AM -0700, Ken Kundert wrote: > I think in trying to illustrate the existing behavior I made things more > confusing than they needed to be. Let me try again. I understood you the first time :-) I agree that scraping the name from the NameError exception is a fragile hack. What I'm questioning is *how often that needs be done*. As I see it, there are broadly two use-cases for wanting the name from NameError (or AttributeError, or the index from IndexError, or the key from KeyError -- for brevity, I will use "name" and "NameError" stand in for *all* these cases). 1. You are a developer reading an unexpected NameError exception, and now you need to debug the code and fix the bug. In this case, just reading the error message is sufficient. There's no need to programmatically extract the name. 2. You have a `try...except` block and you've just caught NameError and want to handle it programmatically. In that second case, needing to extract the name from the exception is a code smell. A *strong* code smell -- it suggests that you're doing too much in the try... block. You should already know which name lookup failed, and so extracting the name from the exception is redundant: try: unicode except NameError: # Python 2/3 compatibility unicode = str What other name could it be? 
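A runnable sketch of that per-name pattern, where each try block guards exactly one lookup, so the handler already knows which name failed and never needs to inspect the exception (the specific fallback bindings here are illustrative):

```python
# Each try block guards a single name, so when NameError is caught
# the handler knows exactly which lookup failed.
try:
    unicode                       # Python 2 builtin, gone in Python 3
except NameError:
    unicode = str                 # illustrative fallback

try:
    reduce                        # builtin in Python 2 only
except NameError:
    from functools import reduce  # its Python 3 location

print(unicode is str, callable(reduce))   # -> True True (on Python 3)
```

On Python 3 both handlers fire; on Python 2 neither does, and in both cases the names end up bound.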
I believe that if you are dealing with a NameError where you want to programmatically deal with a missing name, but you don't know what that name is, you're already in deep, deep trouble and the fact that you have to scrape the error message for the name is the least of your problems: try: # Feature detection for Python 2/3 compatibility. unicode ascii reduce callable except NameError as err: name = extract_name(err) # somehow... if name == 'unicode': unicode = str elif name == 'ascii': ascii = ... elif name == 'reduce': from functools import reduce elif name == 'callable': def callable(obj): ... I trust that the problem with this is obvious. The only way to safely write this code is to test for each name individually, in which case we're back to point 2 above where you know what name failed and you don't need to extract it at all. It is my hand-wavy estimate that these two use-cases cover about 95% of uses for the name. We might quibble over percentages, but I think we should agree that whatever the number is, it is a great majority. Any other use-cases, like your example of translating error messages, or suggesting "Did you mean...?" alternatives, are fairly niche uses. So my position is that given the use-cases for programmatically extracting the name from NameError fall into a quite small niche, this feature is a "Nice To Have" not a "Must Have". It seems to me that the benefit is quite marginal, and so any more than the most trivial cost to implement this is likely to be too much for the benefit gained. I don't just mean the effort of implementing your suggested change. I mean, if there is even a small chance that in the future people will expect me (or require me!) to write code like this: raise NameError('foo', template='missing "{}"') instead of raise NameError('missing "foo"') then the cost of this new feature is more than the benefit to me, and I don't want it. 
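Worth noting: since Python 3.10, NameError does carry the failing name as a structured `name` attribute, so a handler that really wants it need not scrape the message. A minimal sketch (version-guarded so it also runs on older 3.x; the undefined name is illustrative):

```python
import sys

try:
    frobnicate                    # illustrative undefined name
except NameError as err:
    if sys.version_info >= (3, 10):
        # The failing name is exposed as a real attribute.
        print(err.name)           # -> frobnicate
    else:
        print(err.args[0])        # older versions: only the formatted message
```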
I see little benefit and a small cost (more of a nuisance than a major drama, but a nuisance is still a negative). I wouldn't use the new style if it were available (why should I bother?) and I'd be annoyed if I were forced to use it.

Contrast that to OSError, where the ability to extract the errno and errstr separately is *very* useful. When you catch an OSError or IOError, you typically have *no idea* what the underlying errno is, it could be one of many. I don't object to writing this:

raise OSError(errno, 'message')

because I see real benefit.

[...]

> The above is an example. It is a bit contrived. I simply wanted to illustrate
> the basic issue in a few lines of code. However, my intent was also to
> illustrate what I see as a basic philosophical problem in the way we approach
> exceptions in Python:
>
> It is a nice convenience that an error message is provided by the source of
> the error, but it should not have the final say on the matter.
> Fundamentally, the code that actually presents the error to the user is in
> a better position to produce a message that is meaningful to the user. So,
> when we raise exceptions, not only should we provide a convenient human
> readable error message, we should anticipate that the exception handler may
> need to reformat or reinterpret the exception and provide it with what it
> needs to do so.

That argument makes a certain amount of sense, but it's still a niche use-case. In *general*, we don't do our users any favours by reinterpreting standard error messages into something that they can't google, so even if this were useful, I can't see it becoming used in a widespread manner.

[...]

> What I am hoping to do with this proposal is to get the Python developer
> community to see that:
> 1. The code that handles the exception benefits from having access to the
> components of the error message.

I don't think that's generally true.

> In the least it can present the message to
> the user in the best possible way.
> Perhaps that means enforcing a particular
> style, or presenting it in the user's native language, or perhaps it means
> providing additional related information as in the example above.

And I think that's actually counter-productive, at least in general, although I'm willing to accept that there may be situations where it is helpful.

> 2. The current approach to exceptions follows the opposite philosophy,
> suggesting that the best place to construct the error message is at the
> source of the error.

What else understands the error better than the source of the error?

> It inadvertently puts obstacles in place that make it
> difficult to customize the message in the handler.
>
> 3. Changing the approach in the BaseException class to provide the best of both
> approaches provides considerable value and is both trivial and backward
> compatible.

I'll discuss your suggested API in a followup email.

-- Steve

_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tjreedy at udel.edu Tue Jul 4 16:54:11 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 4 Jul 2017 16:54:11 -0400 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <20170704140259.GX3149@ando.pearwood.info> References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> Message-ID:

On 7/4/2017 10:03 AM, Steven D'Aprano wrote:
> On Mon, Jul 03, 2017 at 01:59:02AM -0700, Ken Kundert wrote:
>> I think in trying to illustrate the existing behavior I made things more
>> confusing than they needed to be. Let me try again.
> > I understood you the first time :-) > > I agree that scraping the name from the NameError exception is a fragile > hack. What I'm questioning is *how often that needs be done*. > > As I see it, there are broadly two use-cases for wanting the name from > NameError (or AttributeError, or the index from IndexError, or the key > from KeyError -- for brevity, I will use "name" and "NameError" stand in > for *all* these cases). > > 1. You are a developer reading an unexpected NameError exception, and > now you need to debug the code and fix the bug. > > In this case, just reading the error message is sufficient. There's no > need to programmatically extract the name. > > 2. You have a `try...except` block and you've just caught NameError and > want to handle it programmatically. > > In that second case, needing to extract the name from the exception is a > code smell. A *strong* code smell -- it suggests that you're doing too > much in the try... block. You should already know which name lookup > failed, and so extracting the name from the exception is redundant: > > try: > unicode > except NameError: > # Python 2/3 compatibility > unicode = str > > What other name could it be? > > I believe that if you are dealing with a NameError where you want to > programmatically deal with a missing name, but you don't know what that > name is, you're already in deep, deep trouble and the fact that you have > to scrape the error message for the name is the least of your problems: > > try: > # Feature detection for Python 2/3 compatibility. > unicode > ascii > reduce > callable > except NameError as err: > name = extract_name(err) # somehow... > if name == 'unicode': > unicode = str > elif name == 'ascii': > ascii = ... > elif name == 'reduce': > from functools import reduce > elif name == 'callable': > def callable(obj): ... > > I trust that the problem with this is obvious. 
> The only way to safely
> write this code is to test for each name individually, in which case
> we're back to point 2 above where you know what name failed and you
> don't need to extract it at all.
>
> It is my hand-wavy estimate that these two use-cases cover about 95% of
> uses for the name. We might quibble over percentages, but I think we
> should agree that whatever the number is, it is a great majority.
>
> Any other use-cases, like your example of translating error messages, or
> suggesting "Did you mean...?" alternatives, are fairly niche uses.
>
> So my position is that given the use-cases for programmatically
> extracting the name from NameError fall into a quite small niche, this
> feature is a "Nice To Have" not a "Must Have". It seems to me that the
> benefit is quite marginal, and so any more than the most trivial cost to
> implement this is likely to be too much for the benefit gained.
>
> I don't just mean the effort of implementing your suggested change. I
> mean, if there is even a small chance that in the future people
> will expect me (or require me!) to write code like this:
>
> raise NameError('foo', template='missing "{}"')
>
> instead of
>
> raise NameError('missing "foo"')
>
> then the cost of this new feature is more than the benefit to me, and I
> don't want it.

There have been many proposals for what we might call RichExceptions, with more easily accessed information. But as Raymond Hettinger keeps pointing out, Python does not use exceptions only for (hopefully rare) errors. It also uses them as signals for flow control, both as an alternative form for alternation and for iteration. Alternation with try:except instead of if:else is common. In the try: unicode example above, the NameError is not an error. Until 2.2, IndexError served the role that StopIteration serves today, and it can still be used for iteration. For flow control, richer exceptions just slow code execution.
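That flow-control role is concrete in the old sequence-iteration protocol, where a bare IndexError is nothing but a stop signal — a minimal self-contained sketch:

```python
class Squares:
    """Legacy-protocol iterable: no __iter__, only __getitem__."""
    def __getitem__(self, i):
        if i >= 5:
            raise IndexError   # bare signal -- the index itself is irrelevant
        return i * i

# for-loops and list() fall back to calling __getitem__ with 0, 1, 2, ...
# until IndexError is raised: the exception is flow control, not an error.
print(list(Squares()))   # -> [0, 1, 4, 9, 16]
```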
> I see little benefit and a small cost (more of a nuisance than a major
> drama, but a nuisance is still a negative). I wouldn't use the new style
> if it were available (why should I bother?) and I'd be annoyed if I were
> forced to use it.
>
> Contrast that to OSError, where the ability to extract the errno and
> errstr separately is *very* useful. When you catch an OSError or
> IOError, you typically have *no idea* what the underlying errno is, it
> could be one of many. I don't object to writing this:
>
> raise OSError(errno, 'message')
>
> because I see real benefit.

In other words, the richness of the exception should depend on the balance between the exception class's use as flow signal versus error reporting. Note that the code raising the exception usually does not know what the exception's use will be, so choosing between a bare signal exception and a rich report exception is not an option. An additional consideration, as Raymond has also pointed out, is the fractional overhead, which depends on the context of the exception raising.

IndexError: list index out of range

This is probably the most common flow signal after StopIteration. Also, as Raymond has noted, IndexError may come from loops with short execution per loop. Attaching a *constant* string is very fast, to the consternation of people who would like the index reported. I believe there should usually be the workaround of naming a calculated index and accessing it in the exception clause.

NameError: name 'xyz' is not defined

This is less commonly a flow signal. (But not never!) Interpolating the name is worth the cost.

OSError: whatever

I believe this is even less commonly used as a flow signal. In any case, the context is usually a relatively slow operation, like opening a file (and reading it if successful).

>> The above is an example. It is a bit contrived. I simply wanted to illustrate
>> the basic issue in a few lines of code.
>> However, my intent was also to
>> illustrate what I see as a basic philosophical problem in the way we approach
>> exceptions in Python:
>>
>> It is a nice convenience that an error message is provided by the source of
>> the error, but it should not have the final say on the matter.
>> Fundamentally, the code that actually presents the error to the user is in
>> a better position to produce a message that is meaningful to the user. So,
>> when we raise exceptions, not only should we provide a convenient human
>> readable error message, we should anticipate that the exception handler may
>> need to reformat or reinterpret the exception and provide it with what it
>> needs to do so.
>
> That argument makes a certain amount of sense, but it's still a niche
> use-case. In *general*, we don't do our users any favours by
> reinterpreting standard error messages into something that they can't
> google, so even if this were useful, I can't see it becoming used in a
> widespread manner.

I hope not. One of the frustrations of trying to answer StackOverflow questions is when the user used an environment that suppresses stacktraces and mangles exception names and messages.

-- Terry Jan Reedy

From tjreedy at udel.edu Tue Jul 4 17:40:39 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 4 Jul 2017 17:40:39 -0400 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> Message-ID:

On 7/4/2017 3:32 PM, David Mertz wrote:
> I don't see the usefulness of rich exception data as being at all as limited as
> this.
Here's some toy code that shows a use: > > > ---- > > # For some reason, imports might not be ready immediately > # Maybe flaky network drive, maybe need to install modules, etc > # The overall program can do things too make them available > lazy_import("foo", "bar", "baz", "blat") > > while True: > try: > x = foo(1) * bar(2) + baz(3)**blat(4) > break > except NameError as err: > lazy_import(err.name ) > sleep(1) Alternate proposal: give the NameError class a .name instance method that extracts the name from the message. This should not increase the time to create an instance. You would then write 'err.name()' instead of 'err.name'. For 3.6 def name(self): msg = self.args[0] return msg[6:msg.rindex("'")] # current test try: xyz except NameError as e: print(name(e) == 'xyz') # Exceptions unittest to ensure that the method # stays synchronized with future versions of instances def test_nameerror_name(self): try: xyz except NameError as e: self.assertEqual(e.name(), 'xyz') Generalize to other exceptions. Further only-partially baked idea: Since exceptions are (in cpython) coded in C, I wonder if C data could be inexpensively attached to the instance to be retrieved and converted to python objects by methods when needed. -- Terry Jan Reedy From brenbarn at brenbarn.net Tue Jul 4 17:47:58 2017 From: brenbarn at brenbarn.net (Brendan Barnwell) Date: Tue, 04 Jul 2017 14:47:58 -0700 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> Message-ID: <595C0D0E.2030600@brenbarn.net> On 2017-07-04 13:54, Terry Reedy wrote: > There have been many proposals for what we might call RichExceptions, > with more easily access information. But as Raymond Hettinger keeps > pointing out, Python does not use exceptions only for (hopefully rare) > errors. 
It also uses them as signals for flow control, both as an > alternative form for alternation and for iteration. Alternation with > try:except instead of if:else is common. In the try: unicode example > above, the NameError is not an error. Until 2.2, IndexError served the > role of StopIteration today, and can still be used for iteration. For > flow control, richer exceptions just slow code execution. How significant is this slowdown in practical terms? Rejecting all "rich" exceptions just because they might add a bit of a slowdown seems premature to me. The information available from the rich exceptions has value that may or may not outweigh the performance hit. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown From mertz at gnosis.cx Tue Jul 4 17:48:51 2017 From: mertz at gnosis.cx (David Mertz) Date: Tue, 4 Jul 2017 14:48:51 -0700 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> Message-ID: If a method, why not a property? On Jul 4, 2017 2:41 PM, "Terry Reedy" wrote: > On 7/4/2017 3:32 PM, David Mertz wrote: > >> I don't see the usefulness rich exception data as at all as limited as >> this. Here's some toy code that shows a use: >> >> >> ---- >> >> # For some reason, imports might not be ready immediately >> # Maybe flaky network drive, maybe need to install modules, etc >> # The overall program can do things too make them available >> lazy_import("foo", "bar", "baz", "blat") >> >> while True: >> try: >> x = foo(1) * bar(2) + baz(3)**blat(4) >> break >> except NameError as err: >> lazy_import(err.name ) >> sleep(1) >> > > Alternate proposal: give the NameError class a .name instance method that > extracts the name from the message. 
This should not increase the time to > create an instance. You would then write 'err.name()' instead of ' > err.name'. For 3.6 > > def name(self): > msg = self.args[0] > return msg[6:msg.rindex("'")] > > # current test > > try: xyz > except NameError as e: > print(name(e) == 'xyz') > > # Exceptions unittest to ensure that the method > # stays synchronized with future versions of instances > > def test_nameerror_name(self): > try: > xyz > except NameError as e: > self.assertEqual(e.name(), 'xyz') > > Generalize to other exceptions. > > Further only-partially baked idea: Since exceptions are (in cpython) coded > in C, I wonder if C data could be inexpensively attached to the instance to > be retrieved and converted to python objects by methods when needed. > > -- > Terry Jan Reedy > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Tue Jul 4 18:31:45 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 05 Jul 2017 10:31:45 +1200 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> Message-ID: <595C1751.4050208@canterbury.ac.nz> Terry Reedy wrote: > Attaching a *constant* string is very fast, to the > consternation of people who would like the index reported. Seems to me that storing the index as an attribute would help with this. It shouldn't be much slower than storing a constant string, and formatting the message would be deferred until it's needed, if at all. 
-- Greg From python-ideas at shalmirane.com Tue Jul 4 20:08:57 2017 From: python-ideas at shalmirane.com (Ken Kundert) Date: Tue, 4 Jul 2017 17:08:57 -0700 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> Message-ID: <20170705000857.GB19426@kundert.designers-guide.com> On Tue, Jul 04, 2017 at 04:54:11PM -0400, Terry Reedy wrote: > There have been many proposals for what we might call > RichExceptions, with more easily access information. But as Raymond > Hettinger keeps pointing out, Python does not use exceptions only > for (hopefully rare) errors. It also uses them as signals for flow > control, both as an alternative form for alternation and for > iteration. Alternation with try:except instead of if:else is > common. In the try: unicode example above, the NameError is not an > error. Until 2.2, IndexError served the role of StopIteration > today, and can still be used for iteration. For flow control, > richer exceptions just slow code execution. Terry, Correct me if I am wrong, but this seems like an argument for the proposal. Consider the NameError, currently when raised the error message must be constructed before it is passed to the exception. But in the proposal, you simply pass the name (already available) and the format string (a constant). The name is never interpolated into the format string unless the message is actually used, which it would not in the cases you cite. 
-Ken From steve at pearwood.info Tue Jul 4 20:58:33 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 5 Jul 2017 10:58:33 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <20170705000857.GB19426@kundert.designers-guide.com> References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <20170705000857.GB19426@kundert.designers-guide.com> Message-ID: <20170705005833.GD3149@ando.pearwood.info> On Tue, Jul 04, 2017 at 05:08:57PM -0700, Ken Kundert wrote: > On Tue, Jul 04, 2017 at 04:54:11PM -0400, Terry Reedy wrote: > > There have been many proposals for what we might call > > RichExceptions, with more easily access information. But as Raymond > > Hettinger keeps pointing out, Python does not use exceptions only > > for (hopefully rare) errors. It also uses them as signals for flow > > control, both as an alternative form for alternation and for > > iteration. Alternation with try:except instead of if:else is > > common. In the try: unicode example above, the NameError is not an > > error. Until 2.2, IndexError served the role of StopIteration > > today, and can still be used for iteration. For flow control, > > richer exceptions just slow code execution. > > Terry, > Correct me if I am wrong, but this seems like an argument for the proposal. > Consider the NameError, currently when raised the error message must be > constructed before it is passed to the exception. But in the proposal, you > simply pass the name (already available) and the format string (a constant). The > name is never interpolated into the format string unless the message is actually > used, which it would not in the cases you cite. Terry's argument is that when used for flow control, you don't care what the index is. You just raise IndexError("index out of bounds") or similar. 
Or for that matter, just raise IndexError(), as generators usually raise StopIteration() with no arguments. -- Steve From tjreedy at udel.edu Tue Jul 4 23:13:43 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 4 Jul 2017 23:13:43 -0400 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> Message-ID: On 7/4/2017 5:48 PM, David Mertz wrote: > If a method, why not a property? Since the time to respond in human terms is trivial, I can imagine that this might be accepted. I just did not think of that option. > On Jul 4, 2017 2:41 PM, "Terry Reedy" > > wrote: > > On 7/4/2017 3:32 PM, David Mertz wrote: > > I don't see the usefulness rich exception data as at all as > limited as this. Here's some toy code that shows a use: > > > ---- > > # For some reason, imports might not be ready immediately > # Maybe flaky network drive, maybe need to install modules, etc > # The overall program can do things too make them available > lazy_import("foo", "bar", "baz", "blat") > > while True: > try: > x = foo(1) * bar(2) + baz(3)**blat(4) > break > except NameError as err: > lazy_import(err.name ) > sleep(1) > > > Alternate proposal: give the NameError class a .name instance method > that extracts the name from the message. This should not increase > the time to create an instance. You would then write 'err.name > ()' instead of 'err.name '. For 3.6 > > def name(self): > msg = self.args[0] > return msg[6:msg.rindex("'")] > > # current test > > try: xyz > except NameError as e: > print(name(e) == 'xyz') > > # Exceptions unittest to ensure that the method > # stays synchronized with future versions of instances > > def test_nameerror_name(self): > try: > xyz > except NameError as e: > self.assertEqual(e.name (), 'xyz') > > Generalize to other exceptions. 
> > Further only-partially baked idea: Since exceptions are (in cpython) > coded in C, I wonder if C data could be inexpensively attached to > the instance to be retrieved and converted to python objects by > methods when needed. > > -- > Terry Jan Reedy > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Terry Jan Reedy From tjreedy at udel.edu Tue Jul 4 23:37:51 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 4 Jul 2017 23:37:51 -0400 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <595C0D0E.2030600@brenbarn.net> References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> Message-ID: On 7/4/2017 5:47 PM, Brendan Barnwell wrote: > On 2017-07-04 13:54, Terry Reedy wrote: >> There have been many proposals for what we might call RichExceptions, >> with more easily access information. But as Raymond Hettinger keeps >> pointing out, Python does not use exceptions only for (hopefully rare) >> errors. It also uses them as signals for flow control, both as an >> alternative form for alternation and for iteration. Alternation with >> try:except instead of if:else is common. In the try: unicode example >> above, the NameError is not an error. Until 2.2, IndexError served the >> role of StopIteration today, and can still be used for iteration. For >> flow control, richer exceptions just slow code execution. > > How significant is this slowdown in practical terms? 
Rejecting all > "rich" exceptions just because they might add a bit of a slowdown seems > premature to me. The information available from the rich exceptions has > value that may or may not outweigh the performance hit. I don't know if anyone has ever gone so far as to write a patch to test. I have personally been on the side of wanting richer exceptions. So what has been the resistance? Speed is definitely one. Maybe space? Probably maintenance cost. Lack of interest among true 'core' (C competent) developers? My suggestion for speed is that we create exception instances as fast as possible, with the information needed to produce Python values and string representations on demand. Consumer pays. This is, after all, what we often do for other builtin classes. Consider ints. They have a bit length, which is an integer itself. But ints do not have a Python int data attribute for that. (If they did, the bit_length int would need its own bit_length attribute, and so on.) Instead, ints have a bit_length *method* to create a Python int from internal data. Ints also have a decimal string representation, but we do not, as far as I know, precompute it. Ditto for the byte representation. -- Terry Jan Reedy From tjreedy at udel.edu Wed Jul 5 00:21:36 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 5 Jul 2017 00:21:36 -0400 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <595C1751.4050208@canterbury.ac.nz> References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C1751.4050208@canterbury.ac.nz> Message-ID: On 7/4/2017 6:31 PM, Greg Ewing wrote: > Terry Reedy wrote: >> Attaching a *constant* string is very fast, to the consternation of >> people who would like the index reported. Actually, the constant string should be attached to the class, so there is no time needed.
> Seems to me that storing the index as an attribute would help > with this. It shouldn't be much slower than storing a constant > string, Given that the offending int is available as a Python int, then storing a reference should be quick, though slower than 0 (see above ;-). > and formatting the message would be deferred until > it's needed, if at all. I agree that this would be the way to do it. I will let an advocate of this enhancement lookup the rejected issue (there may be more than one) proposing to make the bad index available and see if this is the actual proposal rejected and if so, why (better than I may remember). It occurs to me that if the exception object has no reference to any python object, then all would be identical and only one cached instance should be needed. I checked and neither IndexError nor ZeroDivisionError do this. -- Terry Jan Reedy From tjreedy at udel.edu Wed Jul 5 00:43:25 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 5 Jul 2017 00:43:25 -0400 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <20170705000857.GB19426@kundert.designers-guide.com> References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <20170705000857.GB19426@kundert.designers-guide.com> Message-ID: On 7/4/2017 8:08 PM, Ken Kundert wrote: > On Tue, Jul 04, 2017 at 04:54:11PM -0400, Terry Reedy wrote: >> There have been many proposals for what we might call >> RichExceptions, with more easily access information. But as Raymond >> Hettinger keeps pointing out, Python does not use exceptions only >> for (hopefully rare) errors. It also uses them as signals for flow >> control, both as an alternative form for alternation and for >> iteration. Alternation with try:except instead of if:else is >> common. In the try: unicode example above, the NameError is not an >> error. 
Until 2.2, IndexError served the role of StopIteration >> today, and can still be used for iteration. For flow control, >> richer exceptions just slow code execution. > > Terry, > Correct me if I am wrong, but this seems like an argument for the proposal. I actually do not know what 'the proposal' is and how it is the same as or different from past proposals, especially those that have been rejected. I initially elaborated on some points of Steven D'Aprano that I agree with, in light of past discussions and tracker issues. > Consider NameError: currently, when raised, the error message must be > constructed before it is passed to the exception. But in the proposal, you > simply pass the name (already available) and the format string (a constant). The > name is never interpolated into the format string unless the message is actually > used, which it would not be in the cases you cite. That is close to what I am thinking. I would give the format a default value, the one Python uses most often:

class NameError(Exception):
    def __init__(self, name, template="name '{name}' is not defined"):
        self.name = name
        self.template = template

-- Terry Jan Reedy From tjreedy at udel.edu Wed Jul 5 00:54:05 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 5 Jul 2017 00:54:05 -0400 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C1751.4050208@canterbury.ac.nz> Message-ID: On 7/5/2017 12:21 AM, Terry Reedy wrote: > On 7/4/2017 6:31 PM, Greg Ewing wrote: >> Terry Reedy wrote: >>> Attaching a *constant* string is very fast, to the consternation of >>> people who would like the index reported. > > Actually, the constant string should be attached to the class, so there > is no time needed. I should say, a default string, or parameter default value.
I just checked, and Python puts the numerator type in the ZeroDivisionError message and the sequence type in the IndexError message, so scratch that idea. >> Seems to me that storing the index as an attribute would help >> with this. It shouldn't be much slower than storing a constant >> string, > > Given that the offending int is available as a Python int, then storing > a reference should be quick, though slower than 0 (see above ;-). 0 is wrong. Just a reference storage in both cases. >> and formatting the message would be deferred until >> it's needed, if at all. > > I agree that this would be the way to do it. I will let an advocate of > this enhancement lookup the rejected issue (there may be more than one) > proposing to make the bad index available and see if this is the actual > proposal rejected and if so, why (better than I may remember). > It occurs to me that if the exception object has no reference to any > python object, then all would be identical and only one cached instance > should be needed. I checked and neither IndexError nor > ZeroDivisionError do this. -- Terry Jan Reedy From greg.ewing at canterbury.ac.nz Wed Jul 5 01:09:03 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 05 Jul 2017 17:09:03 +1200 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C1751.4050208@canterbury.ac.nz> Message-ID: <595C746F.5060709@canterbury.ac.nz> Terry Reedy wrote: > It occurs to me that if the exception object has no reference to any > python object, then all would be identical and only one cached instance > should be needed. I don't think that's true now that exceptions get tracebacks attached to them.
-- Greg From __peter__ at web.de Wed Jul 5 05:03:32 2017 From: __peter__ at web.de (Peter Otten) Date: Wed, 05 Jul 2017 11:03:32 +0200 Subject: [Python-ideas] Arguments to exceptions References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> Message-ID: Terry Reedy wrote: > Alternate proposal: give the NameError class a .name instance method > that extracts the name from the message. This should not increase the > time to create an instance. You would then write 'err.name()' instead > of 'err.name'. For 3.6 > > def name(self): > msg = self.args[0] > return msg[6:msg.rindex("'")] > > # current test > > try: xyz > except NameError as e: > print(name(e) == 'xyz') > > # Exceptions unittest to ensure that the method > # stays synchronized with future versions of instances > > def test_nameerror_name(self): > try: > xyz > except NameError as e: > self.assertEqual(e.name(), 'xyz') > > Generalize to other exceptions. This sounds a bit like "Nine out of ten guests in our restaurant want scrambled eggs -- let's scramble all eggs then as we can always unscramble on demand..." Regardless of usage frequency it feels wrong to regenerate the raw data from the string representation, even when it's possible. On the other hand, if AttributeError were implemented as class AttributeError: def __init__(self, obj, name): self.obj = obj self.name = name @property def message(self): ... # generate message constructing the error message could often be avoided. 
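[A runnable version of that sketch might look like this. The class name is invented to avoid shadowing the real AttributeError; it illustrates the lazy-message pattern, not CPython's implementation.]

```python
# Runnable sketch of the lazy-message idea: store the raw data at
# raise time, build the human-readable message only on demand.
class LazyAttributeError(Exception):
    def __init__(self, obj, name):
        self.obj = obj
        self.name = name

    @property
    def message(self):
        # Interpolation happens here, not in __init__.
        return "{!r} object has no attribute {!r}".format(
            type(self.obj).__name__, self.name)

    def __str__(self):
        return self.message

try:
    raise LazyAttributeError([], "foo")
except LazyAttributeError as e:
    assert e.name == "foo"                      # structured access
    assert str(e) == "'list' object has no attribute 'foo'"
```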
It would also encourage DRY and put an end to subtle variations in the error message like >>> NAME = "A" * 50 + "NobodyExpectsTheSpanishInquisition" >>> exec("class {}: __slots__ = ()".format(NAME)) >>> A = globals()[NAME] >>> A().foo Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA' object has no attribute 'foo' >>> A().foo = 42 Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAANobodyExpectsTheSpanishInquisition' object has no attribute 'foo' From srkunze at mail.de Wed Jul 5 06:44:07 2017 From: srkunze at mail.de (Sven R. Kunze) Date: Wed, 5 Jul 2017 12:44:07 +0200 Subject: [Python-ideas] + operator on generators In-Reply-To: <20170630005742.3ec21b38@grzmot> References: <0b42de3e-a9f3-d0a1-0c51-39c751f3af1f@laposte.net> <20170626032336.GV3149@ando.pearwood.info> <20170627071245.GY3149@ando.pearwood.info> <595219F7.7030804@canterbury.ac.nz> <20170630005742.3ec21b38@grzmot> Message-ID: On 30.06.2017 00:57, Jan Kaliszewski wrote: > Please, note that it can be upturned: maybe they are not so common as > they could be because of all that burden with importing from separate > module -- after all we are saying about somewhat very simple operation, > so using lists and `+` just wins because of our (programmers') > laziness. :-) Is it laziness or is it just "the best way to do it"?? ;-) Regards, Sven -------------- next part -------------- An HTML attachment was scrubbed...
URL: From steve at pearwood.info Wed Jul 5 07:36:35 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 5 Jul 2017 21:36:35 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> Message-ID: <20170705113634.GF3149@ando.pearwood.info> On Tue, Jul 04, 2017 at 11:37:51PM -0400, Terry Reedy wrote: > I have personally been on the side of wanting richer exceptions. Could you explain what you would use them for? Ken has given two use-cases which I personally consider relatively niche, and perhaps even counter-productive: - translation into the user's native language; - providing some sort of "did you mean...?" functionality. Jeff Walker also suggested being able to extract the line and column from certain kinds of JSON errors. (But that would depend on the json module having an API that supports that use-case. You can't just say line_no = exception.args[0] if there's no guarantee that it actually will be the line number.) What would you use these for? I imagine you're thinking of this as the maintainer of IDLE? > So what has > been the resistance? Speed is definitely one. Maybe space? Probably > maintenance cost. Lack of interest among true 'core' (C competent) > developers? Where there is a clear and obvious need, the core devs have spent the time to give the exception class a rich API for extracting useful information. See OSError, which offers named attributes for the errno, error message, Windows error number (when appropriate) and two file names. I expect that the fact that few of the other builtin or stdlib exceptions similarly offer named attributes is because nobody thought of it, or saw any need.
-- Steve From bussonniermatthias at gmail.com Wed Jul 5 09:29:35 2017 From: bussonniermatthias at gmail.com (Matthias Bussonnier) Date: Wed, 5 Jul 2017 15:29:35 +0200 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <20170705113634.GF3149@ando.pearwood.info> References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> Message-ID: Hi all, I want to point out that if it's not common to dispatch on values of exceptions, it might be **because** it is hard to do, or hard to know whether an exception will be structured or not. If exceptions were more structured by default, if CPython provided a default "StructuredException", or if structured exceptions were the norm because CPython used more of them, then you might see more use cases and new code patterns. In particular, the fact that you have to catch an exception and then check values is annoying. I could see:

try:
    ...
except OSError(EACCES):
    # handle "not permitted" only
except OSError(ENOENT):
    # handle "does not exist" only
# implicit: otherwise reraise

being a much cleaner way of filtering exceptions, _if_ Python were to provide that. Yes, in Python 3 these are sometimes their own subclasses (FileNotFoundError), but isn't the fact that you _have_ to subclass a hint that something might be wrong there? You can see it in the fact that FileNotFoundError needs to be caught _before_ OSError; it is not obvious to the beginner that FileNotFoundError is a subclass of OSError and that you can't refactor by changing the order of the excepts. Of course the custom subclass names are way more readable (at least to me), but it is easy, as an experienced Python programmer, to forget what is difficult for newcomers. And this is (IMHO) one point which is not easy.
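[That ordering trap is easy to demonstrate with today's exception hierarchy: FileNotFoundError is a subclass of OSError, so swapping the two handlers below would make the FileNotFoundError clause unreachable.]

```python
def classify(path):
    # The more specific subclass must come first.
    try:
        with open(path):
            pass
    except FileNotFoundError:
        return "missing"
    except OSError as e:
        return "other OS error (errno %s)" % e.errno
    return "ok"

assert issubclass(FileNotFoundError, OSError)
assert classify("/no/such/path/at/all") == "missing"
```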
One example where I often wish for better filtering is warnings, where you have to use a regular expression to mute (or turn into errors) a subset of warnings. Of course most of these use cases can be dealt with by subclassing and having custom attributes (if you are in control of the raising code). But nobody will use it if it's not encouraged by the core. > Where there is a clear and obvious need the core devs have spent the > time to give the exception class a rich API for extracting useful > information. See OSError, which offers named attributes for the errno, > error message, Windows error number (when appropriate) and two file > names. This does not encourage it, and shows that doing structured exceptions is hard and _not_ the standard. > There should be one-- and preferably only one --obvious way to do it. So, structured or not? Please find, after the signature, an extremely unscientific grep of all the usage of structured information on exceptions on my machine. Thanks, -- Matthias $ rg -tpy ' e\.[^g]' | grep -v test | grep -v ENOENT | grep if | grep -v sympy | grep -v 'for e ' | grep -v getorg | grep -v ename |grep -v errno gh-activity/ghactivity.py: source = get_source_login(e.repo) if e.repo[0] == login else e.repo[0] pypi/store.py: if e.code == 204: pypi-legacy/store.py: if e.code == 204: cpython/Lib/asyncore.py: if e.args[0] not in _DISCONNECTED: cpython/Lib/nntplib.py: if e.response.startswith('480'): cpython/Lib/pdb.py: if e.startswith(text)] cpython/Lib/runpy.py: if e.name is None or (e.name != pkg_name and flit/flit/upload.py: if (not repo['is_warehouse']) and e.response.status_code == 403: gitsome/xonsh/execer.py: if (e.loc is None) or (last_error_line == e.loc.lineno and git-cpython/Lib/asyncore.py: if e.args[0] not in _DISCONNECTED: git-cpython/Lib/nntplib.py: if e.response.startswith('480'): git-cpython/Lib/pdb.py: if e.startswith(text)] git-cpython/Lib/runpy.py: if e.name is None or (e.name != pkg_name and jupyter_client/jupyter_client/manager.py: if
e.winerror != 5: jupyterhub/jupyterhub/utils.py: if e.code >= 500: jupyterhub/jupyterhub/utils.py: if e.code != 599: procbuild/procbuild/builder.py: if not 'Resource temporarily unavailable' in e.strerror: qtconsole/qtconsole/console_widget.py: if e.mimeData().hasUrls(): qtconsole/qtconsole/console_widget.py: elif e.mimeData().hasText(): qtconsole/qtconsole/console_widget.py: if e.mimeData().hasUrls(): qtconsole/qtconsole/console_widget.py: elif e.mimeData().hasText(): qtconsole/qtconsole/console_widget.py: if e.mimeData().hasUrls(): qtconsole/qtconsole/console_widget.py: elif e.mimeData().hasText(): xonsh/xonsh/execer.py: if (e.loc is None) or (last_error_line == e.loc.lineno and xonsh/xonsh/execer.py: if not greedy and maxcol in (e.loc.column + 1, e.loc.column): xonsh/xonsh/proc.py: r = e.code if isinstance(e.code, int) else int(bool(e.code)) django/django/forms/fields.py: if hasattr(e, 'code') and e.code in self.error_messages: django/django/forms/fields.py: errors.extend(m for m in e.error_list if m not in errors) django/django/http/multipartparser.py: if not e.connection_reset: cpython/Lib/asyncio/streams.py: if self._buffer.startswith(sep, e.consumed): cpython/Lib/idlelib/MultiCall.py: if not APPLICATION_GONE in e.args[0]: cpython/Lib/idlelib/MultiCall.py: if not APPLICATION_GONE in e.args[0]: cpython/Lib/idlelib/MultiCall.py: if not APPLICATION_GONE in e.args[0]: cpython/Lib/multiprocessing/connection.py: if e.winerror == _winapi.ERROR_BROKEN_PIPE: cpython/Lib/multiprocessing/connection.py: if e.winerror != _winapi.ERROR_NO_DATA: cpython/Lib/multiprocessing/connection.py: if e.winerror not in (_winapi.ERROR_SEM_TIMEOUT, cpython/Lib/multiprocessing/process.py: if not e.args: git-cpython/cpython/Lib/asyncore.py: if e.args[0] not in (EBADF, ECONNRESET, ENOTCONN, ESHUTDOWN, git-cpython/cpython/Lib/nntplib.py: if user and e.response[:3] == '480': git-cpython/cpython/Lib/socket.py: if e.args[0] == EINTR: git-cpython/cpython/Lib/socket.py: if e.args[0] == EINTR: 
git-cpython/cpython/Lib/socket.py: if e.args[0] == EINTR: git-cpython/cpython/Lib/socket.py: if e.args[0] == EINTR: git-cpython/cpython/Lib/socket.py: if e.args[0] == EINTR: git-cpython/Lib/asyncio/streams.py: if self._buffer.startswith(sep, e.consumed): ipyparallel/ipyparallel/client/client.py: if e.engine_info: ipython/IPython/core/inputtransformer.py: if 'multi-line string' in e.args[0]: ipython/IPython/core/inputsplitter.py: if 'multi-line string' in e.args[0]: ipython/IPython/core/inputsplitter.py: elif 'multi-line statement' in e.args[0]: git-cpython/Lib/idlelib/multicall.py: if not APPLICATION_GONE in e.args[0]: git-cpython/Lib/idlelib/multicall.py: if not APPLICATION_GONE in e.args[0]: git-cpython/Lib/idlelib/multicall.py: if not APPLICATION_GONE in e.args[0]: git-cpython/Lib/multiprocessing/connection.py: if e.winerror == _winapi.ERROR_BROKEN_PIPE: git-cpython/Lib/multiprocessing/connection.py: if e.winerror != _winapi.ERROR_NO_DATA: git-cpython/Lib/multiprocessing/connection.py: if e.winerror not in (_winapi.ERROR_SEM_TIMEOUT, git-cpython/Lib/multiprocessing/process.py: if not e.args: jedi/jedi/api/completion.py: if e.error_leaf.value == '.': meeseeksbox/meeseeksdev/meeseeksbox/commands.py: if ('git commit --allow-empty' in e.stderr) or ('git commit --allow-empty' in e.stdout): meeseeksbox/meeseeksdev/meeseeksbox/commands.py: elif "after resolving the conflicts" in e.stderr: notebook/notebook/notebook/handlers.py: if e.status_code == 404 and 'files' in path.split('/'): pandas/pandas/core/strings.py: if len(e.args) >= 1 and re.search(p_err, e.args[0]): pip/pip/_vendor/pyparsing.py: if not e.mayReturnEmpty: prompt_toolkit/prompt_toolkit/terminal/vt100_output.py: elif e.args and e.args[0] == 0: rust/src/etc/dec2flt_table.py:range of exponents e. 
The output is one array of 64 bit significands and rust/src/etc/htmldocck.py: if e.tail: rust/src/etc/htmldocck.py: if attr in e.attrib: setuptools/pkg_resources/_vendor/pyparsing.py: if not e.mayReturnEmpty: scikit-learn/sklearn/datasets/mldata.py: if e.code == 404: numpy/tools/npy_tempita/__init__.py: if e.args: django/django/core/files/images.py: if e.args[0].startswith("Error -5"): django/django/core/management/base.py: if e.is_serious() cpython/Lib/xml/etree/ElementInclude.py: if e.tag == XINCLUDE_INCLUDE: cpython/Lib/xml/etree/ElementInclude.py: if e.tail: cpython/Lib/xml/etree/ElementInclude.py: elif e.tag == XINCLUDE_FALLBACK: cpython/Lib/xml/etree/ElementPath.py: if e.tag == tag: git-cpython/cpython/Demo/pdist/rcvs.py: if not e.commitcheck(): git-cpython/cpython/Demo/pdist/rcvs.py: if e.commit(message): git-cpython/cpython/Demo/pdist/rcvs.py: e.diff(opts) git-cpython/cpython/Demo/pdist/rcvs.py: if e.proxy is None: git-cpython/cpython/Lib/multiprocessing/connection.py: if e.args[0] != win32.ERROR_PIPE_CONNECTED: git-cpython/cpython/Lib/multiprocessing/connection.py: if e.args[0] != win32.ERROR_PIPE_CONNECTED: git-cpython/cpython/Lib/multiprocessing/connection.py: if e.args[0] not in (win32.ERROR_SEM_TIMEOUT, git-cpython/cpython/Lib/multiprocessing/process.py: if not e.args: git-cpython/Lib/xml/etree/ElementInclude.py: if e.tag == XINCLUDE_INCLUDE: git-cpython/Lib/xml/etree/ElementInclude.py: if e.tail: git-cpython/Lib/xml/etree/ElementInclude.py: elif e.tag == XINCLUDE_FALLBACK: git-cpython/Lib/xml/etree/ElementPath.py: if e.tag == tag: pip/pip/_vendor/distlib/locators.py: if e.code != 404: pip/pip/_vendor/distlib/scripts.py: if e.startswith('.py'): pip/pip/_vendor/html5lib/serializer.py: if not e.endswith(";"): pythondotorg/blogs/management/commands/update_blogs.py: if e.pub_date < entry['pub_date']: scikit-learn/doc/tutorial/machine_learning_map/svg2imagemap.py: if e.nodeName == 'g': scikit-learn/doc/tutorial/machine_learning_map/svg2imagemap.py: if 
e.hasAttribute('transform'): scikit-learn/doc/tutorial/machine_learning_map/pyparsing.py: if not e.mayReturnEmpty: scikit-learn/doc/tutorial/machine_learning_map/pyparsing.py: if not e.mayReturnEmpty: scikit-learn/doc/tutorial/machine_learning_map/pyparsing.py: if e.mayReturnEmpty: scikit-learn/doc/tutorial/machine_learning_map/pyparsing.py: if e.mayReturnEmpty: scikit-learn/doc/tutorial/machine_learning_map/pyparsing.py: if not e.mayReturnEmpty: django/django/db/backends/mysql/base.py: if e.args[0] in self.codes_for_integrityerror: django/django/db/backends/mysql/base.py: if e.args[0] in self.codes_for_integrityerror: django/django/db/models/fields/__init__.py: if hasattr(e, 'code') and e.code in self.error_messages: git-cpython/cpython/Lib/xml/etree/ElementInclude.py: if e.tag == XINCLUDE_INCLUDE: git-cpython/cpython/Lib/xml/etree/ElementInclude.py: if e.tail: git-cpython/cpython/Lib/xml/etree/ElementInclude.py: elif e.tag == XINCLUDE_FALLBACK: pip/pip/_vendor/requests/packages/urllib3/contrib/pyopenssl.py: if self.suppress_ragged_eofs and e.args == (-1, 'Unexpected EOF'): pip/pip/_vendor/requests/packages/urllib3/contrib/pyopenssl.py: if self.suppress_ragged_eofs and e.args == (-1, 'Unexpected EOF'): pip/pip/_vendor/requests/packages/urllib3/contrib/socks.py: if e.socket_err: On Wed, Jul 5, 2017 at 1:36 PM, Steven D'Aprano wrote: > On Tue, Jul 04, 2017 at 11:37:51PM -0400, Terry Reedy wrote: > >> I personally been on the side of wanting richer exceptions. > > Could you explain what you would use them for? Ken has give two > use-cases which I personally consider are relatively niche, and perhaps > even counter-productive: > > - translation into the user's native language; > > - providing some sort of "did you mean...?" functionality. > > Jeff Walker also suggested being able to extract the line and column > from certain kinds of JSON errors. (But that would depend on the json > module having an API that supports that use-case. 
You can't just say > line_no = exception.args[0] if there's no guarantee that it actually > will be the line number.) > > What would you use these for? I imagine you're thinking of this as the > maintainer of IDLE? > > >> So what has >> been the resistance? Speed is definitely one. Maybe space? Probably >> maintenance cost. Lack of interest among true 'core' (C competent) >> developers? > > Where there is a clear and obvious need the core devs have spent the > time to give the exception class a rich API for extracting useful > information. See OSError, which offers named attributes for the errno, > error message, Windows error number (when appropriate) and two file > names. > > I expect that the fact that few of the other builtin or stdlib > exceptions similarly offer named attributes is because nobody thought > of it, or saw any need. > > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From steve at pearwood.info Wed Jul 5 11:39:02 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 6 Jul 2017 01:39:02 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> Message-ID: <20170705153901.GG3149@ando.pearwood.info> On Wed, Jul 05, 2017 at 03:29:35PM +0200, Matthias Bussonnier wrote: > Hi all, > > I want to point out that if it's not common to dispatch on values of > exceptions it might be **because** it is hard to do or to know wether > an exception will be structured or not. It "might be" for a lot of reasons. 
The argument that "if Python did X, we might find a use for X" could be used to justify any idea at all, regardless of its usefulness. I found your post ironic -- after spending many paragraphs complaining that exceptions aren't structured enough, that people won't write structured exception classes or make use of structured exceptions "if it's not encouraged by the core" (your words), you then run a grep over your code and find (by my rough count) nearly 100 examples of structured exceptions. So it seems to me that third party libraries are already doing what you say they won't do unless the core exceptions lead the way: providing structured exceptions. Just a handful of examples from your grep: > gh-activity/ghactivity.py: e.repo > pypi/store.py: e.code > cpython/Lib/nntplib.py: e.response > cpython/Lib/runpy.py: e.name > jupyter_client/jupyter_client/manager.py: e.winerror > jupyterhub/jupyterhub/utils.py: e.code > qtconsole/qtconsole/console_widget.py: e.mimeData > xonsh/xonsh/execer.py: e.loc > django/django/forms/fields.py: e.error_list > django/django/http/multipartparser.py: e.connection_reset > cpython/Lib/asyncio/streams.py: e.consumed > ipyparallel/ipyparallel/client/client.py: e.engine_info > jedi/jedi/api/completion.py: e.error_leaf and more. Third parties *are* providing rich exception APIs where it makes sense to do so, using the interface encouraged by PEP 352 (named attributes), without needing a default "StructuredException" in the core language. > If Exceptions were by default > more structured, if CPython would provide a default > "StructuredException", As I said earlier, perhaps I'm just not sufficiently imaginative, but I don't think you can solve the problem of providing better structure to exceptions with a simple hack on BaseException. (And a complicated hack would have costs of its own.) 
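[As a concrete illustration of that pattern (all names here are invented, not taken from any real library): a PEP 352-style exception stores its parts as named attributes and derives the display text from them.]

```python
class ResponseError(Exception):
    """Invented example of a structured exception with named attributes."""
    def __init__(self, code, reason):
        # A single message arg keeps str(e) and e.args sensible.
        super().__init__("HTTP %d: %s" % (code, reason))
        self.code = code
        self.reason = reason

try:
    raise ResponseError(404, "not found")
except ResponseError as e:
    handled = (e.code == 404)   # dispatch on the attribute, not the text
    text = str(e)

assert handled
assert text == "HTTP 404: not found"
```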
I've come to the conclusion that there's no substitute for tackling each individual exception class individually, deciding whether or not it makes sense for it to be structured, and give it the structure needed for that exception. > or were the norm because CPython would use > more of them ? then you might see more use case, and new code > patterns. And you might not. You might spend hours or days or weeks adding bloat to exceptions for no purpose. That's why we typically insist on good use-cases for new features before accepting them. If we accepted even 10% of the ideas proposed, Python would be a bloated, inconsistent, hard to use mess. (Actually I plucked that number out of thin air. It would be an interesting project to look at what proportion of ideas end up being accepted.) -- Steve From edk141 at gmail.com Wed Jul 5 12:12:29 2017 From: edk141 at gmail.com (Ed Kellett) Date: Wed, 05 Jul 2017 16:12:29 +0000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <20170705153901.GG3149@ando.pearwood.info> References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> Message-ID: Hi, On Wed, 5 Jul 2017 at 16:41 Steven D'Aprano wrote: > and more. Third parties *are* providing rich exception APIs where it > makes sense to do so, using the interface encouraged by PEP 352 (named > attributes), without needing a default "StructuredException" in the > core language. > Your arguments might be used to dismiss anything. If people are already doing $thing, clearly they don't need help from the language. If they're not already doing it, any language feature would be pointless. Ed -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From klahnakoski at mozilla.com Wed Jul 5 12:47:42 2017 From: klahnakoski at mozilla.com (Kyle Lahnakoski) Date: Wed, 5 Jul 2017 12:47:42 -0400 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C1751.4050208@canterbury.ac.nz> Message-ID: <3766dbb0-198a-b367-f70f-68d899d3ef88@mozilla.com> I agree with Ken for the need to make rich exceptions easy to write, but I do not know enough to say if changing BaseException to support this is a good idea; I made my own error reporting library to do this: For example, to define a new exception type with a `name` attribute: raise Log.error("name {{name|quote}} is not defined.", name=name) My point is that the first parameter, the template string, acts as the class definition. Now, there are some extra "features": The mustaches demand named parameters, and use pipe for some simple formatting hints; I hope you can overlook that and maybe use f-strings, or similar. The template string is not expanded unless the text log is required; the errors serialize nicely to JSON for a structured logging system; querying for instances of this error is the same as looking for the template string. My library does not solve the problem of extracting parameters out of the standard lib errors; but catching them early, chaining, and properly classing them, is good enough. >>> try: ... a={} # imagine this is something complicated ... c = a.b ... except Exception as e: ... Log.error("can not do something complicated", cause=e) # just like raise from, plus we are defining a new exception type On 2017-07-05 00:21, Terry Reedy wrote: > On 7/4/2017 6:31 PM, Greg Ewing wrote: >> Terry Reedy wrote: >>> Attaching a *constant* string is very fast, to the consternation of >>> people who would like the index reported. 
> > Actually, the constant string should be attached to the class, so > there is no time needed. > >> Seems to me that storing the index as an attribute would help >> with this. It shouldn't be much slower than storing a constant >> string, > > Given that the offending int is available as a Python int, then > storing a reference should be quick, though slower than 0 (see above > ;-). > >> and formatting the message would be deferred until >> it's needed, if at all. > > I agree that this would be the way to do it. I will let an advocate > of this enhancement lookup the rejected issue (there may be more than > one) proposing to make the bad index available and see if this is the > actual proposal rejected and if so, why (better than I may remember). > > It occurs to me that if the exception object has no reference to any > python object, then all would be identical and only one cached > instance should be needed. I checked and neither IndexError nor > ZeroDivisionError do this. > From steve at pearwood.info Wed Jul 5 13:32:21 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 6 Jul 2017 03:32:21 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> Message-ID: <20170705173221.GH3149@ando.pearwood.info> On Wed, Jul 05, 2017 at 04:12:29PM +0000, Ed Kellett wrote: > Hi, > > On Wed, 5 Jul 2017 at 16:41 Steven D'Aprano wrote: > > > and more. Third parties *are* providing rich exception APIs where it > > makes sense to do so, using the interface encouraged by PEP 352 (named > > attributes), without needing a default "StructuredException" in the > > core language. > > > > Your arguments might be used to dismiss anything. Do you have an answer for why the argument is wrong? 
People *are* writing structured exceptions, which undercuts the argument that we must do something because if we don't lead the way others won't.

The argument that "we must do something, this is something, therefore we must do it" doesn't cut any ice here. Read the Zen of Python:

    Now is better than never.
    Although never is often better than *right* now.

The burden is not on me to prove that this idea is a bad idea. I don't have to prove this is a bad idea. The burden is on those who want to make this change to demonstrate to the satisfaction of the core developers that their idea:

- solves a genuine problem;

- without introducing worse problems;

- that it will do what they expect it to do;

- and that the change is worth the effort in implementation and the cost to the language (bloat and churn).

If proponents of the idea can't do that, then the status quo wins:

http://www.curiousefficiency.org/posts/2011/02/status-quo-wins-stalemate.html

I've had a number of private emails complaining that I'm "too negative" for this list because I pointed out flaws. Do people think that we make Python better by introducing flawed changes that don't solve the problem they're supposed to solve?

(I'm not going to name names, you know who you are.)

If people want this change, it's not enough to get all snarky and complain that critics are "too negative" or "too critical of minor problems". You need to start by **addressing the criticism**.

In the very first reply to Ken's initial proposal, it was pointed out that his plan goes against PEP 352 and that he would have to address why that PEP is wrong to encourage named attributes over positional arguments in exception.args. As far as I can see, nobody has even attempted to do that.

I think that's the place to start: if your plan for giving exceptions structure is to just dump everything into an unstructured args list with no guaranteed order, then you're not actually giving exceptions structure and you're not solving the problem.
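[Editorial aside: the named-attribute style that PEP 352 encourages can be sketched as follows. This is a hypothetical exception written for illustration, not code from any proposal in this thread; the class name and attributes are invented.]

```python
# Sketch of a "structured" exception in the PEP 352 named-attribute style:
# the interesting values are stored as attributes at raise time, so a
# handler never has to parse the message string. All names hypothetical.

class InvalidQuantityError(ValueError):
    def __init__(self, name, value, units):
        super().__init__(f"{name}: invalid quantity {value!r} {units}")
        self.name = name      # structured data, available to handlers
        self.value = value
        self.units = units

try:
    raise InvalidQuantityError("fuel", -3, "litres")
except InvalidQuantityError as e:
    # no message parsing needed
    assert (e.name, e.value, e.units) == ("fuel", -3, "litres")
```

Nothing here requires changes to BaseException; it is simply the convention of storing the components under named attributes in addition to the formatted message.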
(like the fact that the idea doesn't actually solve the problem it is intended to). You know what? I don't have to prove anything here. It's up to the people wanting this change to prove that it is useful, worth the effort, and that it will do what they expect. Ken suggested a concrete change to BaseException to solve a real lack. His solution can't work, for reasons I've already gone over, but at least he's made an attempt at a solution. (He hasn't demonstrated that there is a real problem If people are already > doing $thing, clearly they don't need help from the language. If they're > not already doing it, any language feature would be pointless. > > Ed From brenbarn at brenbarn.net Wed Jul 5 14:34:44 2017 From: brenbarn at brenbarn.net (Brendan Barnwell) Date: Wed, 05 Jul 2017 11:34:44 -0700 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <20170705113634.GF3149@ando.pearwood.info> References: <20170702191953.GA17773@kundert.designers-guide.com> <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> Message-ID: <595D3144.5080703@brenbarn.net> On 2017-07-05 04:36, Steven D'Aprano wrote: > On Tue, Jul 04, 2017 at 11:37:51PM -0400, Terry Reedy wrote: > >> I personally been on the side of wanting richer exceptions. > > Could you explain what you would use them for? Ken has give two > use-cases which I personally consider are relatively niche, and perhaps > even counter-productive: > > - translation into the user's native language; > > - providing some sort of "did you mean...?" functionality. I'm not the one you were replying to, but I just consider it useful debugging information that reduces the work required to track down errors. 
Suppose I have code like this:

    for item in blah:
        do_stuff(item)

    def do_stuff(item):
        foo = item['bar']
        do_other_stuff(foo)

    # etc., then 10 functions down the call stack . . .

    def do_yet_additional_stuff(whatever):
        x = whatever['yadda']

Now I get a KeyError exception in do_yet_additional_stuff. If the object on which the key was not found ("whatever" here) is stored as an attribute of the exception, I can put a try/except in any function in the call stack and still have access to the object that caused the exception further down. This makes it easier to try out hypotheses about what's causing the error. ("Hmmm, I know that in do_stuff it reads the 'bar' attribute, maybe if that value is negative it's resulting in a NaN later on. . .")

If the object is not available as an exception attribute, I have to start by going all the way to the bottom of the call stack (to do_yet_additional_stuff) and either creating my own custom exception that does store the object (thus implementing this attribute-rich exception myself) or slowly working my way back up the call stack. Having the object available on the exception is like having your finger in the pages of a book to hold the place where the exception actually occurred, while still being able to flip back and forth to other sections of the call stack to try to figure out how they're leading to that exception.

A related use is when the exception is already caught and logged or handled somewhere higher up. It makes it easier to log messages like "AttributeError was raised when foo had item=-10.345", or to conditionally handle exceptions based on such information, without requiring extra instrumentation elsewhere in the code.

I realize this isn't a life-and-death use case, but to be frank I don't really see that the objections are that strong either.
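[Editorial aside: the "do it myself" workaround described above can be sketched like this — a custom exception that carries the offending object up the stack. All names are hypothetical, invented for illustration.]

```python
# Hypothetical sketch: attach the failing object to the exception so any
# frame up the call stack can inspect it without re-instrumenting the
# intermediate functions.

class RichKeyError(KeyError):
    def __init__(self, key, obj):
        super().__init__(key)
        self.obj = obj  # keep a "finger in the pages" on the failing object

def do_yet_additional_stuff(whatever):
    try:
        return whatever['yadda']
    except KeyError as e:
        raise RichKeyError('yadda', whatever) from e

try:
    do_yet_additional_stuff({'bar': -10.345})
except RichKeyError as e:
    # many frames up, the object that lacked the key is still in hand
    assert e.obj == {'bar': -10.345}
```

The point of the proposal is that built-in exceptions would carry this kind of attribute already, so this boilerplate would not need to be written per project.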
So far I've seen "You don't really need that information", "You don't need that information if you're using exceptions for flow control" and "There might theoretically be a performance penalty of unknown significance". (I seem to recall seeing something like "exceptions aren't supposed to be debugging tools" in an earlier discussion about this, and I basically just disagree with that one.) Now, obviously there is the generic burden-of-proof argument that the suggestion doesn't meet the bar for changing the status quo, and that's as may be, but that's different from there being no use case. It also doesn't necessarily rule out a potential resolution like "okay, this isn't urgent, but let's try to clean this up gradually as we move forward, by adding useful info to exceptions where possible". That said, I think this use case of mine is on a different track from where most of the discussion in this thread seems to have gone. For one thing, I agree with you that NameError is a particularly odd exception to pick as the poster child for rich exceptions, because NameError is almost always raised in response to a variable name that's typed literally in the code, so it's much less likely to result in the kind of situation I described above, where the true cause of an exception is far away in the call stack from the point where it's raised. I also agree that it is much better to have the information available as named attributes on the exception rather than accessing them positionally with something like exception.args[1]. In fact I agree so much that basically what I'm saying is that all exceptions as much as possible should store relevant information in such attributes rather than putting it ONLY into the message string. Also, in the situation I'm describing, I'm not particularly attached to the idea that the "rich" information would even have to be part of the exception message at all by default (since the object might have a long __str__ that would be irritating). 
It would just be there, attached to the exception, so that it could be used if needed.

--
Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown

From chris.barker at noaa.gov Wed Jul 5 14:34:25 2017
From: chris.barker at noaa.gov (Chris Barker)
Date: Wed, 5 Jul 2017 11:34:25 -0700
Subject: [Python-ideas] Allow function to return multiple values
In-Reply-To:
References:
Message-ID:

On Sun, Jun 25, 2017 at 6:20 PM, Mikhail V wrote:

> And it reminded me times starting with Python and wondering
> why I can't simply write something like:
>
> def move(x,y):
>     x = x + 10
>     y = y + 20
> move(x,y)
>
> Instead of this:
>
> def move(x,y):
>     x1 = x + 10
>     y1 = y + 20
>     return x1,y1
> x,y = move(x,y)

you CAN do that, if x and y are mutable types.

I've found that when folk want this behavior (often called "pass by reference" or something), what they really want is a mutable number. And you can make one of those if you like -- here's a minimal one that can be used as a counter:

    In [12]: class Mint():
        ...:     def __init__(self, val=0):
        ...:         self.val = val
        ...:
        ...:     def __iadd__(self, other):
        ...:         self.val += other
        ...:         return self
        ...:
        ...:     def __repr__(self):
        ...:         return "Mint(%i)" % self.val

so now your move() function can work:

    In [17]: def move(x, y):
        ...:     x += 1
        ...:     y += 1

    In [18]: a = Mint()

    In [19]: b = Mint()

    In [20]: move(a, b)

    In [21]: a
    Out[21]: Mint(1)

    In [22]: b
    Out[22]: Mint(1)

I've seen recipes for a complete mutable integer, though Google is failing me right now.

This does come up fairly often -- I usually think there are more Pythonic ways of solving the problem - like returning multiple values, but maybe it has its uses.

-CHB

--
Christopher Barker, Ph.D.
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Jul 5 14:55:20 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 5 Jul 2017 11:55:20 -0700 Subject: [Python-ideas] socket module: plain stuples vs named tuples In-Reply-To: References: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de> Message-ID: On Mon, Jun 19, 2017 at 8:04 PM, Nick Coghlan wrote: > As context for anyone not familiar with the time module precedent that > Guido mentioned, we have a C level `PyStructSequence` that provides > some of the most essential namedtuple features, but not all of them: > https://github.com/python/cpython/blob/master/Objects/structseq.c > > So there's potentially a case to be made for: > > 1. Including the struct sequence header from "Python.h" and making it > part of the stable ABI > 2. Documenting it in the C API reference > +1 -- I was just thinking this morning that a C-level named tuple would be nice. And certainly better than re-implementing it in various places it is needed. Would there be any benefit in making a C implementation available from Python? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From casevh at gmail.com Wed Jul 5 15:48:24 2017
From: casevh at gmail.com (Case Van Horsen)
Date: Wed, 5 Jul 2017 12:48:24 -0700
Subject: [Python-ideas] [off topic] Allow function to return multiple values
Message-ID:

On Wed, Jul 5, 2017 at 11:34 AM, Chris Barker wrote:
> On Sun, Jun 25, 2017 at 6:20 PM, Mikhail V wrote:
>>
>> And it reminded me times starting with Python and wondering
>> why I can't simply write something like:
>>
>> def move(x,y):
>>     x = x + 10
>>     y = y + 20
>> move(x,y)
>>
>> Instead of this:
>>
>> def move(x,y):
>>     x1 = x + 10
>>     y1 = y + 20
>>     return x1,y1
>> x,y = move(x,y)
>
> you CAN do that, if x and y are mutable types.
>
> I've found that when folk want this behavior (often called "pass by
> reference" or something), what they really want is a mutable number. And you
> can make one of those if you like -- here's a minimal one that can be used
> as a counter:

[veering off-topic]

I've implemented mutable integers as part of the gmpy2 library. The eXperimental MPZ (xmpz) type breaks many of the normal rules.

Mutable:

    >>> a=gmpy2.xmpz(1)
    >>> b=a
    >>> a+=1
    >>> a
    xmpz(2)
    >>> b
    xmpz(2)

Direct access to individual bits:

    >>> a=gmpy2.xmpz(123)
    >>> a[0]
    1
    >>> a[1]
    1
    >>> a[0]=0
    >>> a
    xmpz(122)

Iterating over bits:

    >>> a=gmpy2.xmpz(104)
    >>> bin(a)
    '0b1101000'
    >>> list(a.iter_bits())
    [False, False, False, True, False, True, True]
    >>> list(a.iter_clear())
    [0, 1, 2, 4]
    >>> list(a.iter_set())
    [3, 5, 6]

I'm not sure how useful it really is but it was fun to write. :)

casevh

From k7hoven at gmail.com Wed Jul 5 16:51:23 2017
From: k7hoven at gmail.com (Koos Zevenhoven)
Date: Wed, 5 Jul 2017 23:51:23 +0300
Subject: [Python-ideas] Runtime types vs static types
In-Reply-To: <20170702111607.GQ3149@ando.pearwood.info>
References: <20170702111607.GQ3149@ando.pearwood.info>
Message-ID:

On Sun, Jul 2, 2017 at 2:16 PM, Steven D'Aprano wrote:

> On Sat, Jun 24, 2017 at 10:42:19PM +0300, Koos Zevenhoven wrote:
>
> [...]
> > Clearly, there needs to be some sort of distinction between runtime
> > classes/types and static types, because static types can be more precise
> > than Python's dynamic runtime semantics.
>
> I think that's backwards: runtime types can be more precise than static
> types. Runtime types can make use of information known at compile time
> *and* at runtime, while static types can only make use of information
> known at compile time.

This is not backwards -- just a different interpretation of the same situation. In fact, the problem is that 'type' already means too many different things. Clarity of terminology for a concept helps a lot in making the concept itself simpler and easier for both the designers and the users.

This is analogous to the problem in the English language that the verb 'argue' does not have a single meaning. Sometimes the concepts of an argument and a productive discussion get mixed up, and people who want to discuss productively just end up going away because others are turning the discussion into an argument.

-- Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From songofacandy at gmail.com Wed Jul 5 19:16:00 2017
From: songofacandy at gmail.com (INADA Naoki)
Date: Thu, 6 Jul 2017 08:16:00 +0900
Subject: [Python-ideas] socket module: plain stuples vs named tuples
In-Reply-To:
References: <07d5cb54-e6f4-7f2d-7995-d6c01c94401f@thomas-guettler.de>
Message-ID:

> Would there be any benefit in making a C implementation available from
> Python?
>
> -CHB

Yes, from a startup-time point of view.

Current Python namedtuple implementation uses `eval`. It means we can't cache bytecode in pyc files. For example, importing functools is not so fast, and it's because of the `_CacheInfo` namedtuple for `lru_cache`. And structseq is faster than Python namedtuple too.
ref: http://bugs.python.org/issue28638#msg282412

Regards,

From jeff.walker00 at yandex.com Wed Jul 5 20:46:12 2017
From: jeff.walker00 at yandex.com (Jeff Walker)
Date: Wed, 05 Jul 2017 18:46:12 -0600
Subject: [Python-ideas] Arguments to exceptions
In-Reply-To: <20170705173221.GH3149@ando.pearwood.info>
References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info>
Message-ID: <166441499301972@web57j.yandex.ru>

I am one of those that also find you to be too negative. I find your critiques to be useful. You often raise issues that had not occurred to me. But then you go further and make pronouncements which I think go too far. For example:

> the idea doesn't actually solve the problem it is intended to

or

> His solution can't work

or

> He hasn't demonstrated that there is a real problem

None of these ring true to me. Rather it seems like you just don't like the approach he has taken.

You are often dismissive of other people's code, experiences and opinions. For example, in Ken's first post he used NameError as an example, though he indicated that the same issue occurred widely in the language. Rather than asking for further examples, you immediately dismissed his idea as 'code smell' largely on his use of NameError. That is a pretty derogatory response that really cannot be argued against because it is simply your opinion.

I think we all understand that the chance of actually changing the language by sending an idea to python-ideas at python.org is pretty low, but it can be fun and educational to discuss the ideas. And doing so makes this more of a community. However it seems like you feel you must play the role of a barrier that protects the language from poorly conceived ideas. I don't think we need that.
The good ideas are naturally sorted from the bad ones through prolonged discussion. And the early criticism can take the joy out of the process and can lead to the original poster becoming defensive (for example, consider how you just reacted to a little criticism). They can feel like people have not taken the time to understand their idea before criticizing it. It is worth noting that Ken felt the need to start fresh after your initial criticism.

When people send an idea to this mailing list they are exposing themselves, and the criticism can really hurt. It is like they are being told that they are not good enough to contribute. We should be sensitive to that, and focus first on the good aspects of their idea. We should defer our criticisms and perhaps try couching them as suggested improvements.

I find you a very valuable member of the community, but I have often wished that you would take a more positive approach.

Jeff

05.07.2017, 11:33, "Steven D'Aprano" :
> On Wed, Jul 05, 2017 at 04:12:29PM +0000, Ed Kellett wrote:
>> Hi,
>>
>> On Wed, 5 Jul 2017 at 16:41 Steven D'Aprano wrote:
>>
>> > and more. Third parties *are* providing rich exception APIs where it
>> > makes sense to do so, using the interface encouraged by PEP 352 (named
>> > attributes), without needing a default "StructuredException" in the
>> > core language.
>> >
>>
>> Your arguments might be used to dismiss anything.
>
> Do you have an answer for why the argument is wrong? People *are*
> writing structured exceptions, which undercuts the argument that we must
> do something because if we don't lead the way others won't.
>
> The argument that "we must do something, this is something, therefore we
> must do it" doesn't cut any ice here. Read the Zen

And you seem rather dismissive of other people's code and experiences. For example, in Ken's first post he used NameError as an example, though he indicated that the same issue occurred widely in the language.
Yet you dismissed his idea as 'code smell'. That is a pretty derogatory response that does not allow for the fact that not all code needs to be hardened to be useful. One of the nice things about Python over other languages such as Java and C++ is that you can throw something together quickly that works, and that may be what is needed. If you are throwing a script together for your own use, you often don't need to harden the program like you would if you are distributing it to the world. You've said repeatedly that using unnamed arguments would be

05.07.2017, 11:33, "Steven D'Aprano" :
> On Wed, Jul 05, 2017 at 04:12:29PM +0000, Ed Kellett wrote:
>> Hi,
>>
>> On Wed, 5 Jul 2017 at 16:41 Steven D'Aprano wrote:
>>
>> > and more. Third parties *are* providing rich exception APIs where it
>> > makes sense to do so, using the interface encouraged by PEP 352 (named
>> > attributes), without needing a default "StructuredException" in the
>> > core language.
>> >
>>
>> Your arguments might be used to dismiss anything.
>
> Do you have an answer for why the argument is wrong? People *are*
> writing structured exceptions, which undercuts the argument that we must
> do something because if we don't lead the way others won't.
>
> The argument that "we must do something, this is something, therefore we
> must do it" doesn't cut any ice here. Read the Zen of Python:
>
>     Now is better than never.
>     Although never is often better than *right* now.
>
> The burden is not on me to prove that this idea is a bad idea. I don't
> have to prove this is a bad idea. The burden is on those who want to
> make this change to demonstrate to the satisfaction of the core
> developers that their idea:
>
> - solves a genuine problem;
>
> - without introducing worse problems;
>
> - that it will do what they expect it to do;
>
> - and that the change is worth the effort in implementation
>   and the cost to the language (bloat and churn).
> If proponents of the idea can't do that, then the status quo wins:
>
> http://www.curiousefficiency.org/posts/2011/02/status-quo-wins-stalemate.html
>
> I've had a number of private emails complaining that I'm "too negative" for
> this list because I pointed out flaws. Do people think that we make
> Python better by introducing flawed changes that don't solve the problem
> they're supposed to solve?
>
> (I'm not going to name names, you know who you are.)
>
> If people want this change, it's not enough to get all snarky and
> complain that critics are "too negative" or "too critical of minor
> problems". You need to start by **addressing the criticism**.
>
> In the very first reply to Ken's initial proposal, it was pointed out
> that his plan goes against PEP 352 and that he would have to address why
> that PEP is wrong to encourage named attributes over positional
> arguments in exception.args. As far as I can see, nobody has even
> attempted to do that.
>
> I think that's the place to start: if your plan for giving exceptions
> structure is to just dump everything into an unstructured args list with
> no guaranteed order, then you're not actually giving exceptions
> structure and you're not solving the problem.
>
> (like the fact that the idea doesn't actually solve the problem
> it is intended to).
>
> You know what? I don't have to prove anything here. It's up to the
> people wanting this change to prove that it is useful, worth the
> effort, and that it will do what they expect.
>
> Ken suggested a concrete change to BaseException to solve a real lack.
> His solution can't work, for reasons I've already gone over, but at
> least he's made an attempt at a solution. (He hasn't demonstrated that
> there is a real problem
>
> If people are already
>> doing $thing, clearly they don't need help from the language. If they're
>> not already doing it, any language feature would be pointless.
>>
>> Ed

> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From rosuav at gmail.com Wed Jul 5 21:22:07 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 6 Jul 2017 11:22:07 +1000
Subject: [Python-ideas] Arguments to exceptions
In-Reply-To: <166441499301972@web57j.yandex.ru>
References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru>
Message-ID:

On Thu, Jul 6, 2017 at 10:46 AM, Jeff Walker wrote:
> I am one of those that also find you to be too negative. I find your critiques to
> be useful. You often raise issues that had not occurred to me. But then you
> go further and make pronouncements which I think go too far. For example:
>
>> the idea doesn't actually solve the problem it is intended to
>
> or
>
>> His solution can't work
>
> or
>
>> He hasn't demonstrated that there is a real problem
>
> None of these ring true to me. Rather it seems like you just don't like the
> approach he has taken.
>
> You are often dismissive of other people's code, experiences and opinions.
> For example, in Ken's first post he used NameError as an example, though
> he indicated that the same issue occurred widely in the language. Rather
> than asking for further examples, you immediately dismissed his idea as
> 'code smell' largely on his use of NameError. That is a pretty derogatory
> response that really cannot be argued against because it is simply your
> opinion.

For what it's worth, I'm in favour of Steven's "too negative" approach - or rather, I don't think his style is too negative.
Yes, it's a bit rough and uncomfortable to be on the receiving end of it, but it's exactly correct. All three of the statements you quote are either provably true from the emails in this thread, or are at least plausible. If you think he's wrong to say them, *say so*, and ask him to justify them.

Perhaps what we need is a "falsehoods programmers believe about python-ideas" collection. I'll start it:

* All ideas are worthy of respect.
* My use-case is enough justification for adding something to the language.
* Criticism is bad. Ideas should be welcomed just because they're ideas.
* "Why not?" is enough reason to do something.
* PyPI doesn't exist.
* I don't need a concrete use-case; a rough idea of "this could be neat" is enough.
* Performance doesn't matter.
* Performance matters.
* You must hate me, because you're picking holes in my brilliant idea.
* Code smell is inherently bad.
* Criticism means your idea is bad.
* Criticism means your idea is good.
* Criticism means your idea is interesting.
* CPython is the only Python there is.

As usual, many of these wouldn't be articulated, but you'll find that a lot of people's posts have been written with these as unwitting assumptions. I'll leave those thoughts with you.
ChrisA From steve at pearwood.info Wed Jul 5 21:26:34 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 6 Jul 2017 11:26:34 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <20170705173221.GH3149@ando.pearwood.info> References: <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> Message-ID: <20170706012634.GI3149@ando.pearwood.info> On Thu, Jul 06, 2017 at 03:32:21AM +1000, Steven D'Aprano wrote: > On Wed, Jul 05, 2017 at 04:12:29PM +0000, Ed Kellett wrote: > > Hi, > > > > On Wed, 5 Jul 2017 at 16:41 Steven D'Aprano wrote: > > > > > and more. Third parties *are* providing rich exception APIs where it > > > makes sense to do so, using the interface encouraged by PEP 352 (named > > > attributes), without needing a default "StructuredException" in the > > > core language. > > > > > > > Your arguments might be used to dismiss anything. > > Do you have an answer for why the argument is wrong? People *are* > writing structured exceptions, which undercuts the argument that we must > do something because if we don't lead the way others won't. [...] Apologies, I hit the wrong key intending to save the email for later but accidentally hit send instead, so the end of the post may be a bit (or a lot) incoherent. 
-- Steve

From jeff.walker00 at yandex.com Wed Jul 5 21:53:05 2017
From: jeff.walker00 at yandex.com (Jeff Walker)
Date: Wed, 05 Jul 2017 19:53:05 -0600
Subject: [Python-ideas] Arguments to exceptions
In-Reply-To:
References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru>
Message-ID: <192641499305985@web47g.yandex.ru>

Steven,

These statements do not ring true to me. I have been following the conversation closely and I have not seen support for any of them. Perhaps I missed it. Could you please expand on these statements:

> the idea doesn't actually solve the problem it is intended to

Specifically, Ken started by saying that it should not be necessary to parse the messages to get the components of the message. He then gave an example where he was able to access the components of the message without parsing the message. So how is it that he is not solving the problem he intended to solve?

> His solution can't work

Again, he gave an example where he was able to access the components of the message without parsing the message. Yet you claim his solution cannot work. Is his example wrong?

> He hasn't demonstrated that there is a real problem

You yourself admitted that parsing a message to extract the components is undesirable. Ken and others, including myself, gave examples where this was necessary. Each example was given as either being a real problem or representative of a real problem. Are we all wrong?
Jeff From p.f.moore at gmail.com Thu Jul 6 05:58:43 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 6 Jul 2017 10:58:43 +0100 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <192641499305985@web47g.yandex.ru> References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> Message-ID: On 6 July 2017 at 02:53, Jeff Walker wrote: > Could you please expand on these statements: > >> the idea doesn't actually solve the problem it is intended to > > Specifically Ken started by saying that it should not be necessary to parse the > messages to get the components of the message. He then gave an example > where he was able to access the components of the message without parsing > the message. So how is it that he is not solving the problem he intended to solve? Just to add my perspective here, his proposed solution (to modify BaseException) doesn't include any changes to the derived exceptions that would need to store the components. To use the (already over-used) NameError example, Ken's proposal doesn't include any change to how NameError exceptions are raised to store the name separately on the exception. So *as the proposal stands* it doesn't allow users to extract components of any exceptions, simply because the proposal doesn't suggest changing exceptions to *store* those components. >> His solution can't work > > Again, he gave an example where he was able to access the components of the > message without parsing the message. Yet you claim his solution cannot work. > Is his example wrong? Yes. 
Because he tries to extract the name component of a NameError, and yet that component isn't stored anywhere - under his proposal or under current CPython. >> He hasn't demonstrated that there is a real problem > > You yourself admitted that parsing a message to extract the components is > undesirable. Ken and others, including myself, gave examples where this was > necessary. Each example was given as either being a real problem or > representative of a real problem. Are we all wrong? He's given examples of use cases. To that extent, Steven is being a touch absolute here. However, there has been some debate over whether those examples are valid. We've had multiple responses pointing out that the code examples aren't restricting what's in the try block sufficiently tightly, for example (the NameError case in particular was importing a file which, according to Ken himself, had potentially *thousands* of places where NameError could be raised). It's possible that introspecting exceptions is the right way to design a solution to this particular problem, but it does go against the normal design principles that have been discussed on this list and elsewhere many times. So, to demonstrate that there's a problem, it's necessary to address the question of whether the code could in fact have been written in a different manner that avoided the claimed problem. That's obviously not a black and white situation - making it easier to write code in a certain style is a valid reason for suggesting an enhancement - but the debate has edged towards a stance of "this is needed" (as in, the lack of it is an issue) rather than "this would be an improvement". That's not what Ken said, though, and we all bear a certain responsibility for becoming a little too entrenched in our positions. As far as whether Steven's (or anyone's) comments are too negative, I think the pushback is reasonable. 
In particular, as far as I know Ken is not a first-time contributor here, so he's almost certainly aware of the sorts of concerns that come up in discussions like this, and with that context I doubt he's offended by the reception his idea got (indeed, his responses on this thread have been perfectly sensible and positive). I do think we need to be more sensitive with newcomers, and Chris Angelico's idea of a "falsehoods programmers believe about python-ideas" collection may well be a good resource to gently remind newcomers of some of the parameters of discussion around here. You also say > but it can be fun and educational to discuss the ideas Indeed, very much so. I've certainly learned a lot about language and API design from discussions here over the years. But again, that's the point - the biggest things I've learned are about how *hard* good design is, and how important it is to think beyond your own personal requirements. Most of the "negative" comments I've seen on this list have been along those lines - reminding people that there's a much wider impact for their proposals, and that benefits need to be a lot more compelling than you originally thought. That's daunting, and often discouraging (plenty of times, I've been frustrated by the fact that proposals that seem good to me are blocked by the risk that someone might have built their whole business around a corner case that I'd never thought of, or cared about). But it's the reality and it's an immensely valuable lesson to learn (IMO). Paul From chris.barker at noaa.gov Thu Jul 6 12:54:41 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 6 Jul 2017 09:54:41 -0700 Subject: [Python-ideas] [off topic] Allow function to return multiple values In-Reply-To: References: Message-ID: On Wed, Jul 5, 2017 at 12:48 PM, Case Van Horsen wrote: > [veering off-topic] > > I've implemented mutable integers as part of the gmpy2 library. The > eXperimental MPZ (xmpz) type breaks many of the normal rules.
> looks like you've added a lot more than mutability :-) > I'm not sure how useful it really is > Exactly -- when this comes up, I tend to think "I should make a mutable integer and put it on PyPI" -- but then I realize that the use case at hand is better handled another way. I have yet to encounter a use case where it seems like the best way to handle it. It's mostly people trying to write C# (or another language) in Python :-) But a good exercise in writing code to emulate types in python.... -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From mehaase at gmail.com Thu Jul 6 13:30:52 2017 From: mehaase at gmail.com (Mark E. Haase) Date: Thu, 6 Jul 2017 13:30:52 -0400 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> Message-ID: On Wed, Jul 5, 2017 at 9:22 PM, Chris Angelico wrote: > For what it's worth, I'm in favour of Steven's "too negative" approach > - or rather, I don't think his style is too negative. Yes, it's a bit > rough and uncomfortable to be on the receiving end of it, but it's > exactly correct. All three of the statements you quote are either > provably true from the emails in this thread, or are at least > plausible. If you think he's wrong to say them, *say so*, and ask him > to justify them. > > Perhaps what we need is a "falsehoods programmers believe about > python-ideas" collection. I'll start it: > > * All ideas are worthy of respect.
> * My use-case is enough justification for adding something to the language. > * Criticism is bad. Ideas should be welcomed just because they're ideas. > ...snip... I don't think Ken actually made any of the false assumptions you've listed here, so it's a bit harsh to post that list in this thread. This list is for "speculative language ideas" and "discussion". Ken has met that standard. The topic of tone is interesting, and a broader discussion of how to use python-ideas for newcomers and regulars alike is probably overdue, just not in this thread. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Jul 6 13:46:27 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 7 Jul 2017 03:46:27 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> Message-ID: On Fri, Jul 7, 2017 at 3:30 AM, Mark E. Haase wrote: > On Wed, Jul 5, 2017 at 9:22 PM, Chris Angelico wrote: >> >> For what it's worth, I'm in favour of Steven's "too negative" approach >> - or rather, I don't think his style is too negative. Yes, it's a bit >> rough and uncomfortable to be on the receiving end of it, but it's >> exactly correct. All three of the statements you quote are either >> provably true from the emails in this thread, or are at least >> plausible. If you think he's wrong to say them, *say so*, and ask him >> to justify them. >> >> Perhaps what we need is a "falsehoods programmers believe about >> python-ideas" collection. I'll start it: >> >> * All ideas are worthy of respect. >> * My use-case is enough justification for adding something to the >> language. 
>> * Criticism is bad. Ideas should be welcomed just because they're ideas. >> ...snip... > > > I don't think Ken actually made any of the false assumptions you've listed > here, so it's a bit harsh to post that list in this thread. This list is for > "speculative language ideas" and "discussion". Ken has met that standard. > > The topic of tone is interesting, and a broader discussion of how to use > python-ideas for newcomers and regulars alike is probably overdue, just not > in this thread. I didn't intend to imply that any one person had made any particular assumptions. But if a list like this could be published somewhere, it would help people to realise what they're unintentionally implying. ChrisA From mehaase at gmail.com Thu Jul 6 13:59:02 2017 From: mehaase at gmail.com (Mark E. Haase) Date: Thu, 6 Jul 2017 13:59:02 -0400 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> Message-ID: On Thu, Jul 6, 2017 at 5:58 AM, Paul Moore wrote: > To use the (already > over-used) NameError example, Ken's proposal doesn't include any > change to how NameError exceptions are raised to store the name > separately on the exception. > Maybe I'm misunderstanding you, but the proposal has a clear example of raising NameError and getting the name attribute from the exception instance: try: raise NameError(name=name, template="name '{name}' is not defined.") except NameError as e: name = e.kwargs['name'] msg = str(e) ... Yes. Because he tries to extract the name component of a NameError, > and yet that component isn't stored anywhere - under his proposal or > under current CPython. 
> I'm not sure what you mean by "extract", but the proposal calls for the name to be passed as a keyword argument (see above) and stored in self.kwargs: class BaseException: def __init__(self, *args, **kwargs): self.args = args self.kwargs = kwargs I think this code does what you're asking it to do, right? (N.B. BaseException is written in C, so this Python code is presumably illustrative, not an actual implementation.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Thu Jul 6 14:26:41 2017 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 6 Jul 2017 11:26:41 -0700 (PDT) Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <213001499111174@web5j.yandex.ru> References: <213001499111174@web5j.yandex.ru> Message-ID: <735d53f7-d781-4ddd-9943-02d78e2d22c5@googlegroups.com> This is a good example. I like this idea. I think that a good place to start would be setting the right example in the standard library: IndexError could have the offending index, KeyError the offending key, TypeError the offending type, etc. On Monday, July 3, 2017 at 3:49:23 PM UTC-4, Jeff Walker wrote: > > Paul, > I think you are fixating too much on Ken's example. I think I > understand what he > is saying and I agree with him. It is a problem I struggle with routinely. > It occurs in > the following situations: > > 1. You are handling an exception that you are not raising. This could be > because > Python itself is raising the exception, as in Ken's example, or it > could be raised > by some package you did not write. > 2. You need to process or transform the message in some way. > > Consider this example: > > import json > > > > >>> s = '{"abc": 0, "cdf: 1}' > > > > >>> try: > > ... d = json.loads(s) > > ... except Exception as e: > > ... print(e) > > ... 
print(e.args) > Unterminated string starting at: line 1 column 12 (char 11) > ('Unterminated string starting at: line 1 column 12 (char 11)',) > > Okay, I have caught an exception for which I have no control over how the > exception was raised. Now, imagine that I am writing an application that > highlights > json errors in place. To do so, I would need the line and column numbers > to > highlight the location of the error, and ideally I'd like to strip them > from the base > message and just show that. You can see from my second print statement > that > the line and column numbers were not passed as separate arguments. Thus > I need to parse the error message to extract them. Not a difficult job, > but fragile. > Any change to the error message could break my code. > > I don't know what this code smell is that people keep referring to, but to > me, > that code would smell. > > Jeff > > > > On 3 July 2017 at 09:59, Ken Kundert > wrote: > > > I think in trying to illustrate the existing behavior I made things > more > > > confusing than they needed to be. Let me try again. > > > > > > Consider this code. > > > > > > >>> import Food > > > >>> try: > > > ... import meals > > > ... except NameError as e: > > > ... name = str(e).split("'")[1] # <-- fragile code > > > ... from difflib import get_close_matches > > > ... candidates = ', '.join(get_close_matches(name, Food.foods, > 1, 0.6)) > > > ... print(f'{name}: not found. Did you mean {candidates}?') > > > > > > In this case *meals* instantiates a collection of foods. It is a > Python file, > > > but it is also a data file (in this case the user knows Python, so > Python is > > > a convenient data format). In that file thousands of foods may be > instantiated. > > > If the user misspells a food, I would like to present the available > > > alternatives. To do so, I need the misspelled name. The only way I > can get it > > > is by parsing the error message. 
> > > > As Steven pointed out, this is a pretty good example of a code smell. > > My feeling is that you may have just proved that Python isn't quite as > > good a fit for your data file format as you thought - or that your > > design has flaws. Suppose your user had a breakfast menu, and did > > something like: > > > > if now < lunchtim: # Should have been "lunchtime" > > > > Your error handling will be fairly confusing in that case. > > > > > That is the problem. To write the error handler, I need the > misspelled name. > > > The only way to get it is to extract it from the error message. The > need to > > > unpack information that was just packed suggests that the packing was > done too > > > early. That is my point. > > > > I don't have any problem with *having* the misspelled name as an > > attribute to the error, I just don't think it's going to be as useful > > as you hope, and it may indeed (as above) encourage people to use it > > without thinking about whether there might be problems with using > > error handling that way. > > > > > Fundamentally, pulling the name out of an error message is a really > bad coding > > > practice because it is fragile. The code will likely break if the > formatting or > > > the wording of the message changes. But given the way the exception > was > > > implemented, I am forced to choose between two unpleasant choices: > pulling the > > > name from the error message or not giving the enhanced message at all. > > > > Or using a different approach. ("Among our different approaches...!" > > :-)) Agreed that's also an unpleasant choice at this point. > > > > > What I am hoping to do with this proposal is to get the Python > developer > > > community to see that: > > > 1. The code that handles the exception benefits from having access to > the > > > components of the error message. In the least it can present the > message to > > > the user in the best possible way.
Perhaps that means enforcing a > particular > > > style, or presenting it in the user's native language, or perhaps > it means > > > providing additional related information as in the example above. > > > > I see it as a minor bug magnet, but not really a problem in principle. > > > > > 2. The current approach to exceptions follows the opposite philosophy, > > > suggesting that the best place to construct the error message is at > the > > > source of the error. It inadvertently puts obstacles in place that > make it > > > difficult to customize the message in the handler. > > > > It's more about implicitly enforcing the policy of "catch errors over > > as small a section of code as practical". In your example, you're > > trapping NameError from anywhere in a "many thousands" of line file. > > That's about as far from the typical use of one or two lines in a try > > block as you can get. > > > > > 3. Changing the approach in the BaseException class to provide the > best of both > > > approaches provides considerable value and is both trivial and > backward > > > compatible. > > > > A small amount of value in a case we don't particularly want to > encourage. > > Whether it's trivial comes down to implementation - I'll leave that to > > whoever writes the PR to demonstrate. (Although if it *is* trivial, is > > it something you could write a PR for?) > > > > Also, given that this would be Python 3.7 only, would people needing > > this functionality (only you have expressed a need so far) be OK with > > either insisting their users go straight to Python 3.7, or including > > backward compatible code for older versions? > > > > Overall, I'm -0 on this request (assuming it is trivial to implement - > > I certainly don't feel it's worth significant implementation effort). > > > > Paul > _______________________________________________ > Python-ideas mailing list > Python... 
at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Thu Jul 6 14:33:58 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 6 Jul 2017 19:33:58 +0100 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> Message-ID: On 6 July 2017 at 18:59, Mark E. Haase wrote: > On Thu, Jul 6, 2017 at 5:58 AM, Paul Moore wrote: >> >> To use the (already >> >> over-used) NameError example, Ken's proposal doesn't include any >> change to how NameError exceptions are raised to store the name >> separately on the exception. > > > Maybe I'm misunderstanding you, but the proposal has a clear example of > raising NameError and getting the name attribute from the exception > instance: But no-one manually raises NameError, so Ken's example wouldn't work with "real" NameErrors. If Ken was intending to present a use case that did involve manually-raised NameError exceptions, then he needs to show the context to demonstrate why manually raising NameError rather than a custom exception (which can obviously work like he wants) is necessary. 
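A minimal sketch of that custom-exception route (class and names invented for illustration): when you control the raise site, the interesting component can be stored as an attribute instead of being scraped back out of the message text.

```python
# Sketch only: a custom exception that keeps the offending component
# separate from the rendered message.  All names here are invented.
class UnknownFoodError(Exception):
    def __init__(self, name):
        super().__init__("{}: not found.".format(name))
        self.name = name  # the component, stored as an attribute

try:
    raise UnknownFoodError('peech')
except UnknownFoodError as e:
    print(e.name)   # the misspelled name, no message parsing required
    print(str(e))
```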
Paul From steve at pearwood.info Thu Jul 6 14:56:32 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 7 Jul 2017 04:56:32 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> Message-ID: <20170706185632.GK3149@ando.pearwood.info> On Thu, Jul 06, 2017 at 01:59:02PM -0400, Mark E. Haase wrote: > On Thu, Jul 6, 2017 at 5:58 AM, Paul Moore wrote: > > > To use the (already > > over-used) NameError example, Ken's proposal doesn't include any > > change to how NameError exceptions are raised to store the name > > separately on the exception. > > > > Maybe I'm misunderstanding you, but the proposal has a clear example of > raising NameError and getting the name attribute from the exception > instance: > > try: > raise NameError(name=name, template="name '{name}' is not defined.") > except NameError as e: > name = e.kwargs['name'] > msg = str(e) > ... What prevents the programmer from writing this? raise NameError(nym=s, template="name '{nym}' is not defined.") Or any other keyword name for that matter. Since the exception class accepts arbitrary keyword arguments, we have to expect that it could be used with arbitrary keyword arguments. Only the exception subclass knows how many and what information it expects: - NameError knows that it expects a name; - IndexError knows that it expects an index; - OSError knows that it expects anything up to five arguments (errno, errstr, winerr, filename1, filename2); etc. BaseException cannot be expected to enforce that. Ken's suggestion to put the argument handling logic in BaseException doesn't give us any way to guarantee that NameError.kwargs['name'] will even exist, or that NameError.args[0] is the name. > > Yes. 
Because he tries to extract the name component of a NameError, > and yet that component isn't stored anywhere - under his proposal or > under current CPython. > > I'm not sure what you mean by "extract", but the proposal calls for the > name to be passed as a keyword argument (see above) and stored in > self.kwargs:
>
> class BaseException:
>     def __init__(self, *args, **kwargs):
>         self.args = args
>         self.kwargs = kwargs

Keyword *or positional argument*. Even if given as a keyword argument, it could use any keyword. That's a problem. Assuming it will be "name" is fragile and barely any better than the status quo, and it's harder to use than named attributes. If there is a need for NameError to make the name programmably discoverable without scraping the error message, then as PEP 352 recommends, it should be made available via an attribute: err.name, not err.args[0] or err.kwargs['name']. Here's a proof of concept of the sort of thing we could do that is backwards compatible and follows PEP 352:

    class NameError(Exception):
        def __init__(self, *args):
            self.args = args
            if len(args) == 2:
                self.name = args[0]
            else:
                self.name = None

        def __str__(self):
            if len(self.args) == 1:
                return str(self.args[0])
            elif len(self.args) == 2:
                return "[{}] {}".format(*self.args)
            elif self.args:
                return str(self.args)
            return ''

-- Steven From python at mrabarnett.plus.com Thu Jul 6 16:17:59 2017 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 6 Jul 2017 21:17:59 +0100 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <192641499305985@web47g.yandex.ru> References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> Message-ID:
<2b5935ef-b675-1786-0c0e-f189a4bdb63a@mrabarnett.plus.com> On 2017-07-06 02:53, Jeff Walker wrote: > Stephen, > These statements do not ring true to me. I have been following the conversation > closely and I have not seen support for any of them. Perhaps I missed it. > Could you please expand on these statements: > >> the idea doesn't actually solve the problem it is intended to > > Specifically Ken started by saying that it should not be necessary to parse the > messages to get the components of the message. He then gave an example > where he was able to access the components of the message without parsing > the message. So how is it that he is not solving the problem he intended to solve? > >> His solution can't work > > Again, he gave an example where he was able to access the components of the > message without parsing the message. Yet you claim his solution cannot work. > Is his example wrong? > > >> He hasn't demonstrated that there is a real problem > > You yourself admitted that parsing a message to extract the components is > undesirable. Ken and others, including myself, gave examples where this was > necessary. Each example was given as either being a real problem or > representative of a real problem. Are we all wrong? > Sometimes you can't even parse the message. Here's an annoyance I've just come across:

    Python 3.6.1 (v3.6.1:69c0db5, Mar 21 2017, 18:41:36) [MSC v.1900 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> my_list = ['foo', 'bar']
    >>> my_list.remove('baz')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: list.remove(x): x not in list

If it was a KeyError, it would at least tell me what it was that was missing! From mehaase at gmail.com Thu Jul 6 16:29:25 2017 From: mehaase at gmail.com (Mark E.
Haase) Date: Thu, 6 Jul 2017 16:29:25 -0400 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <20170706185632.GK3149@ando.pearwood.info> References: <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> <20170706185632.GK3149@ando.pearwood.info> Message-ID: On Thu, Jul 6, 2017 at 2:56 PM, Steven D'Aprano wrote: > > Maybe I'm misunderstanding you, but the proposal has a clear example of > > raising NameError and getting the name attribute from the exception > > instance: > > > > try: > > raise NameError(name=name, template="name '{name}' is not > defined.") > > except NameError as e: > > name = e.kwargs['name'] > > msg = str(e) > > ... > > What prevents the programmer from writing this? > > raise NameError(nym=s, template="name '{nym}' is not defined.") > > Or any other keyword name for that matter. Since the exception class > accepts arbitrary keyword arguments, we have to expect that it could be > used with arbitrary keyword arguments. > I agree completely with your point here, as well as the overall conclusion that the proposal to change BaseException is a bad idea. I was merely replying to [what I perceived as] straw manning of the proposal. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From greg.ewing at canterbury.ac.nz Thu Jul 6 20:26:31 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 07 Jul 2017 12:26:31 +1200 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> Message-ID: <595ED537.40605@canterbury.ac.nz> Paul Moore wrote: > But no-one manually raises NameError, so Ken's example wouldn't work > with "real" NameErrors. Part of his suggestion was that core and stdlib code would move towards using attributes. The NameError example makes sense in that context. -- Greg From ncoghlan at gmail.com Thu Jul 6 22:18:30 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 7 Jul 2017 12:18:30 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <595ED537.40605@canterbury.ac.nz> References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> <595ED537.40605@canterbury.ac.nz> Message-ID: On 7 July 2017 at 10:26, Greg Ewing wrote: > Paul Moore wrote: >> >> But no-one manually raises NameError, so Ken's example wouldn't work >> with "real" NameErrors. > > Part of his suggestion was that core and stdlib code would > move towards using attributes. The NameError example makes > sense in that context. 
sense in that context. I haven't been following the thread, so making sure this is stated explicitly: as a matter of general policy, we're OK with enhancing builtin exception types to store additional state for programmatic introspection (when we aren't, it's usually because the exception is often raised and then handled in contexts where it never actually gets instantiated, and you can't use that trick anymore if you add meaningful state to the exception instance). Actually doing that isn't especially hard conceptually, it's just tedious in practice, since somebody has to do the work to:

1. Decide what field they want to add
2. Figure out how to support that in the C API without breaking backwards compatibility
3. Go through the code base to actually set the new field in the relevant locations

By contrast, we're incredibly wary of trying to enhance exception behaviour by way of `BaseException` changes due to our past negative experiences with that during the original implementation of PEP 352: https://www.python.org/dev/peps/pep-0352/#retracted-ideas In principle, such approaches sound great. In practice, they tend to fall apart once they try to cope with the way CPython's internals (as opposed to user level Python code) creates and manipulates exceptions. Cheers, Nick. P.S.
It isn't a coincidence that the import system's exceptions started to become significantly more informative *after* Brett rewrote most of it in Python :) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From jeff.walker00 at yandex.com Thu Jul 6 23:54:24 2017 From: jeff.walker00 at yandex.com (Jeff Walker) Date: Thu, 06 Jul 2017 21:54:24 -0600 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> Message-ID: <781721499399664@web27g.yandex.ru> Paul, I am having trouble understanding your response. 06.07.2017, 03:58, "Paul Moore" : > On 6 July 2017 at 02:53, Jeff Walker wrote: >> Could you please expand on these statements: >> >>> the idea doesn't actually solve the problem it is intended to >> >> Specifically Ken started by saying that it should not be necessary to parse the >> messages to get the components of the message. He then gave an example >> where he was able to access the components of the message without parsing >> the message. So how is it that he is not solving the problem he intended to solve? > > Just to add my perspective here, his proposed solution (to modify > BaseException) doesn't include any changes to the derived exceptions > that would need to store the components. To use the (already > over-used) NameError example, Ken's proposal doesn't include any > change to how NameError exceptions are raised to store the name > separately on the exception. > > So *as the proposal stands* it doesn't allow users to extract > components of any exceptions, simply because the proposal doesn't > suggest changing exceptions to *store* those components.
> >>> His solution can't work >> >> Again, he gave an example where he was able to access the components of the >> message without parsing the message. Yet you claim his solution cannot work. >> Is his example wrong? > > Yes. Because he tries to extract the name component of a NameError, > and yet that component isn't stored anywhere - under his proposal or > under current CPython. > Here is Ken's original proposal:

    class BaseException:
        def __init__(self, *args, **kwargs):
            self.args = args
            self.kwargs = kwargs

        def __str__(self):
            template = self.kwargs.get('template')
            if template is None:
                sep = self.kwargs.get('sep', ' ')
                return sep.join(str(a) for a in self.args)
            else:
                return template.format(*self.args, **self.kwargs)

The code for storing the arguments is in the constructor. You can access the arguments through the args and kwargs attributes. This is exactly the way BaseException works currently, except that it provides no support for kwargs. Here is an example:

    class NameError(BaseException):
        pass

    try:
        raise NameError('welker', db='users', template='{0}: unknown {db}.')
    except NameError as e:
        unknown_name = e.args[0]
        missing_from = e.kwargs['db']
        print(str(e))

Given this example, please explain why it is you say that the arguments are not stored and are not accessible. >>> He hasn't demonstrated that there is a real problem >> >> You yourself admitted that parsing a message to extract the components is >> undesirable. Ken and others, including myself, gave examples where this was >> necessary. Each example was given as either being a real problem or >> representative of a real problem. Are we all wrong? > > He's given examples of use cases. To that extent, Steven is being a > touch absolute here. However, there has been some debate over whether > those examples are valid.
We've had multiple responses pointing out > that the code examples aren't restricting what's in the try block > sufficiently tightly, for example (the NameError case in particular > was importing a file which, according to Ken himself, had potentially > *thousands* of places where NameError could be raised). It's possible > that introspecting exceptions is the right way to design a solution to > this particular problem, but it does go against the normal design > principles that have been discussed on this list and elsewhere many > times. So, to demonstrate that there's a problem, it's necessary to > address the question of whether the code could in fact have been > written in a different manner that avoided the claimed problem. People keep picking on that example, but Ken gave a reasonable justification for why it was written that way. It was a script written for a knowledgeable user, and further investment in a more 'correct' solution was neither desired nor justifiable. By dismissing the example, you dismiss a use-model for Python that is one of its strengths: its ability to let users throw together powerful scripts with minimal effort.
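For reference, the proposal being debated here runs as-is once laid out. This is a self-contained sketch of Ken's proof-of-concept constructor, with the classes renamed (`KwargsException`, `UnknownName` are invented names) so they do not shadow the real builtins:

```python
# Sketch of the proposed BaseException.__init__/__str__ replacement,
# renamed so it does not shadow the real BaseException.
class KwargsException(Exception):
    def __init__(self, *args, **kwargs):
        self.args = args      # positional components of the message
        self.kwargs = kwargs  # keyword components, plus 'template'/'sep'

    def __str__(self):
        template = self.kwargs.get('template')
        if template is None:
            sep = self.kwargs.get('sep', ' ')
            return sep.join(str(a) for a in self.args)
        # str.format() ignores unused keyword arguments, so passing the
        # whole kwargs dict (including 'template' itself) is harmless.
        return template.format(*self.args, **self.kwargs)

class UnknownName(KwargsException):
    """Stand-in for the NameError subclass used in the thread's examples."""

try:
    raise UnknownName('welker', db='users', template='{0}: unknown {db}.')
except UnknownName as e:
    assert e.args[0] == 'welker'       # positional component survives
    assert e.kwargs['db'] == 'users'   # keyword component survives
    assert str(e) == 'welker: unknown users.'
```

With no template the arguments are simply joined, which is the behaviour the later messages in this thread rely on.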
Also people seem to forget that he pointed out that his proposal allows error messages to be trivially translated to different languages:

    try:
        raise NameError('welker')
    except NameError as e:
        print('{}: nicht gefunden.'.format(e.args[0]))

From ncoghlan at gmail.com Fri Jul 7 00:28:07 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 7 Jul 2017 14:28:07 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> <595ED537.40605@canterbury.ac.nz> Message-ID: On 7 July 2017 at 12:18, Nick Coghlan wrote: > By contrast, we're incredibly wary of trying to enhance exception > behaviour by way of `BaseException` changes due to our past negative > experiences with that during the original implementation of PEP 352: > https://www.python.org/dev/peps/pep-0352/#retracted-ideas > > In principle, such approaches sound great. In practice, they tend to > fall apart once they try to cope with the way CPython's internals (as > opposed to user level Python code) creates and manipulates exceptions. To elaborate on the potential problem with the specific proposal in this thread: adding arbitrary kwargs support to BaseException would be straightforward. Adding arbitrary kwargs support to the dozens of exceptions defined as builtins and in the standard library would *not* necessarily be straightforward, *and* would potentially get in the way of adding more clearly defined attributes to particular subclasses in the future. It would also create potential inconsistencies with third party exceptions which may or may not start accepting arbitrary keywords depending on how they're defined.
As a result, our advice is to *avoid* trying to come up with systemic fixes for structured exception handling, and instead focus on specific use cases of "I want *this* exception type to have *that* attribute for *these* reasons". Those kinds of proposals usually don't even rise to the level of needing a PEP - they're just normal RFEs on the issue tracker. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From jeff.walker00 at yandex.com Fri Jul 7 00:37:57 2017 From: jeff.walker00 at yandex.com (Jeff Walker) Date: Thu, 06 Jul 2017 22:37:57 -0600 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <20170706185632.GK3149@ando.pearwood.info> References: <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> <20170706185632.GK3149@ando.pearwood.info> Message-ID: <803211499402277@web26g.yandex.ru> 06.07.2017, 13:00, "Steven D'Aprano" : > What prevents the programmer from writing this? > > raise NameError(nym=s, template="name '{nym}' is not defined.") > > Or any other keyword name for that matter. Since the exception class > accepts arbitrary keyword arguments, we have to expect that it could be > used with arbitrary keyword arguments. > > Only the exception subclass knows how many and what information it > expects: > > - NameError knows that it expects a name; > - IndexError knows that it expects an index; > - OSError knows that it expects anything up to five arguments > (errno, errstr, winerr, filename1, filename2); > > etc. BaseException cannot be expected to enforce that. Ken's suggestion > to put the argument handling logic in BaseException doesn't give us any > way to guarantee that NameError.kwargs['name'] will even exist, or that > NameError.args[0] is the name. I am not understanding what is bad about your example. 
Yes, BaseException would allow an arbitrary set of arguments to be passed to the exception. It also defines how they would be handled by default. If people did not want that, they could subclass BaseException and replace the constructor and the __str__ method. With Ken's proposal, NameError and IndexError could then be replaced with:

    class NameError(Exception):
        '''Name not found.

        :param *args:
            args[0] contains the name that was not found. Values passed
            in args[1:] are ignored.
        :param **kwargs:
            kwargs['template'] is a format string used to assemble an
            error message. This string's format() method is called with
            *args and **kwargs as arguments, and the result is returned
            by __str__().

        args and kwargs are saved as attributes, so you can access the
        exception's arguments through them.
        '''
        pass

    class IndexError(Exception):
        '''Sequence index out of range.

        :param *args:
            args[0] contains the value of the index.
        :param **kwargs:
            kwargs['template'] is a format string used to assemble an
            error message. This string's format() method is called with
            *args and **kwargs as arguments, and the result is returned
            by __str__().

        args and kwargs are saved as attributes, so you can access the
        exception's arguments through them.
        '''
        pass

OSError could be implemented by overriding the __str__() method. Once this is done, the new versions do everything the old versions do, but provide access to the components of the error message. They are also discoverable, more discoverable than the originals. In addition, they are easily extensible. For example, if I raise the NameError myself, I can provide additional useful information:

    try:
        raise NameError('welker', db='user')
    except NameError as e:
        db = e.kwargs.get('db')
        print('{}: {} not found.'.format(e.args[0], db) if db else str(e))

If, as you suggest, someone writes:

    raise NameError(nym=s, template="name '{nym}' is not defined.")

it will work as expected as long as they confine themselves to using str(e).
It would only fail if someone directly tried to access the argument using e.args, but that would be a very unusual thing to do and the issue could be worked around by examining args and kwargs. Jeff From p.f.moore at gmail.com Fri Jul 7 04:10:54 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 7 Jul 2017 09:10:54 +0100 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <781721499399664@web27g.yandex.ru> References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> <781721499399664@web27g.yandex.ru> Message-ID: On 7 July 2017 at 04:54, Jeff Walker wrote: > Here is an example:
>
>     class NameError(BaseException):
>         pass
>
>     try:
>         raise NameError('welker', db='users', template='{0}: unknown {db}.')
>     except NameError as e:
>         unknown_name = e.args[0]
>         missing_from = e.kwargs['db']
>         print(str(e))
>
> Given this example, please explain why it is you say that the arguments are not
> being stored and are not accessible.

Because the proposal doesn't state that NameError is to be changed, and the example code isn't real, as it's manually raising a system exception. Anyway, I'm tired of this endless debate about what Ken may or may not have meant. I'm going to bow out now and restrict myself to only reading and responding to actual proposals.
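Steven's "nym" objection can be spelled out as a small runnable sketch (the class name is invented; this models the proposed arbitrary-kwargs constructor, not current CPython). Two raise sites that are both legal under the proposal leave the name component in different places, so a generic handler has no single reliable access pattern:

```python
# Models the proposed arbitrary-kwargs constructor (invented class name).
class KwargsError(Exception):
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs

# Two raise sites, both "working" under the proposal:
e1 = KwargsError('welker', template="name '{0}' is not defined.")
e2 = KwargsError(nym='welker', template="name '{nym}' is not defined.")

# A handler that expects the name in args[0] works for e1 only...
assert e1.args[0] == 'welker'
assert e2.args == ()              # e2.args[0] would raise IndexError
# ...and a handler that expects kwargs['nym'] works for e2 only.
assert e2.kwargs['nym'] == 'welker'
assert 'nym' not in e1.kwargs
```

Nothing in the shared base class can guarantee that both raise sites agree, which is the sense in which the structure is only as reliable as each individual raise statement.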
Paul From k7hoven at gmail.com Fri Jul 7 08:23:35 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Fri, 7 Jul 2017 15:23:35 +0300 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> <595ED537.40605@canterbury.ac.nz> Message-ID: On Fri, Jul 7, 2017 at 7:28 AM, Nick Coghlan wrote: > > As a result, our advice is to *avoid* trying to come up with systemic > fixes for structured exception handling, and instead focus on specific > use cases of "I want *this* exception type to have *that* attribute > for *these* reasons". Those kinds of proposals usually don't even rise > to the level of needing a PEP - they're just normal RFEs on the issue > tracker. > > It would seem to make sense to try and 'standardize' how this is done for the cases where it is done at all. For example, there could be some kind of collection in the exceptions (not necessarily all exceptions) that contains all the objects that participated in the operation that led to the exception. Something consistent across the different exception types would be easier to learn. For instance, with an AttributeError coming from a deeper call stack, you might want to check something like: myobj in exc.participants or myname in exc.participants This could also be more precise. For instance 'participants' might be divided into 'subjects' and 'objects', or whatever might be a suitable categorization for a broad range of different exceptions. This would need some research, though, to find the most useful naming and guidelines for these things. 
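A minimal sketch of what such a cross-type convention could look like (every name below is invented for illustration; nothing like this exists today):

```python
# Hypothetical convention: exceptions carry a 'participants' collection
# holding the objects involved in the operation that failed.
class ParticipantMixin:
    def __init__(self, *args, participants=()):
        super().__init__(*args)
        self.participants = frozenset(participants)

class DeepAttributeError(ParticipantMixin, Exception):
    """Stand-in for an AttributeError raised deep in a call stack."""

myobj = object()
try:
    raise DeepAttributeError('lookup failed', participants=[myobj, 'size'])
except DeepAttributeError as exc:
    # The membership tests from the text become possible:
    assert myobj in exc.participants
    assert 'size' in exc.participants
```

Because the collection is just a set of objects, it could be carried over unchanged when one exception type is converted to another.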
This might also be handy when converting from one exception type to another, because the 'participants' (or whatever they would be called) might stay the same. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From zuo at chopin.edu.pl Fri Jul 7 20:57:02 2017 From: zuo at chopin.edu.pl (Jan Kaliszewski) Date: Sat, 8 Jul 2017 02:57:02 +0200 Subject: [Python-ideas] Improving Catching Exceptions In-Reply-To: <594E5BAC.5050809@canterbury.ac.nz> References: <20170622205532.GP3149@ando.pearwood.info> <20170622232923.GA48632@cskk.homeip.net> <594DBA3F.1010505@canterbury.ac.nz> <20170624100326.GT3149@ando.pearwood.info> <594E5BAC.5050809@canterbury.ac.nz> Message-ID: <20170708025702.33035ef8@grzmot> 2017-06-25 Greg Ewing dixit: > > (2) There's a *specific* problem with property where a bug in your > > getter or setter that raises AttributeError will be masked, > > appearing as if the property itself doesn't exist. [...] > Case 2 needs to be addressed within the method concerned on a > case-by-case basis. If there's a general principle there, it's > something like this: If you're writing a method that uses > an exception as part of its protocol, you should catch any > incidental occurrences of the same exception and reraise it > as a different exception. > > I don't think there's anything more the Python language could do > to help with either of those. In "case 2", maybe some auto-zeroing flag (or even a decrementing counter?) could be useful? Please, consider the following draft:

    class ExtAttributeError(AttributeError):

        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self._propagate_beyond = True

    # of course this should be a class and it should support not only
    # getters but also setters and deleters -- but you get the idea...
    def new_property(func):
        def wrapper(self):
            try:
                return func(self)
            except AttributeError as exc:
                if getattr(exc, '_propagate_beyond', False):
                    exc._propagate_beyond = False
                    raise
                raise RuntimeError(
                    f'Unexpected {exc.__class__.__name__}') from exc
        return wrapper

Then we could have:

    class Spam:

        @new_property
        def target(self):
            if len(self.targets) == 1:
                return self.targets[0]
            raise ExtAttributeError(
                'only exists when this has exactly one target')

Cheers. *j From mistersheik at gmail.com Fri Jul 7 21:25:38 2017 From: mistersheik at gmail.com (Neil Girdhar) Date: Fri, 7 Jul 2017 18:25:38 -0700 (PDT) Subject: [Python-ideas] Consider allowing the use of abstractmethod without metaclasses Message-ID: <38b9d49d-e498-41ef-bd2e-747a22804cb1@googlegroups.com> I want to use abstractmethod, but I have my own metaclasses and I don't want to build composite metaclasses using abc.ABCMeta. Thanks to PEP 487, one approach is to factor out the abstractmethod checks from ABCMeta into a regular (non-meta) class. So, my first suggestion is to split abc.ABC into two pieces, a parent regular class with metaclass "type":

    class AbstractBaseClass:
        def __init_subclass__(cls):
            # Compute set of abstract method names
            abstracts = {name
                         for name, value in vars(cls).items()
                         if getattr(value, "__isabstractmethod__", False)}
            for base in cls.__bases__:
                for name in getattr(base, "__abstractmethods__", set()):
                    value = getattr(cls, name, None)
                    if getattr(value, "__isabstractmethod__", False):
                        abstracts.add(name)
            cls.__abstractmethods__ = frozenset(abstracts)

My alternative suggestion is to move this logic directly into "type" so that all classes have this logic, and then move abstractmethod into builtins. Of course, this isn't pressing since I can do this in my own code, it's just a suggestion from a neatness standpoint. Best, Neil -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ncoghlan at gmail.com Sat Jul 8 02:53:26 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 8 Jul 2017 16:53:26 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> <595ED537.40605@canterbury.ac.nz> Message-ID: On 7 July 2017 at 22:23, Koos Zevenhoven wrote: > On Fri, Jul 7, 2017 at 7:28 AM, Nick Coghlan wrote: >> >> >> As a result, our advice is to *avoid* trying to come up with systemic >> fixes for structured exception handling, and instead focus on specific >> use cases of "I want *this* exception type to have *that* attribute >> for *these* reasons". Those kinds of proposals usually don't even rise >> to the level of needing a PEP - they're just normal RFEs on the issue >> tracker. >> > > It would seem to make sense to try and 'standardize' how this is done for > the cases where it is done at all. For example, there could be some kind of > collection in the exceptions (not necessarily all exceptions) that contains > all the objects that participated in the operation that led to the > exception. Something consistent across the different exception types would > be easier to learn. Potentially, but that would just require a proposal for a new created-on-first-access property on BaseException, rather than a proposal to change the constructor to implicitly populate that attribute from arbitrary keyword arguments. 
The functional equivalent of:

    @property
    def details(self):
        if "_details" in self.__dict__:
            details = self._details
        else:
            details = self._details = dict()
        return details

Given something like that, the preferred API for adding details to an exception could be to do "exc.details[k] = v" rather than passing arbitrary arguments to the constructor, which would avoid most of the problems that afflicted the original "message" attribute concept. Subclasses would also be free to override the property to do something different (e.g. pre-populating the dictionary based on other exception attributes). So if folks want to explore this further, it's probably worth framing the design question in terms of a few different audiences:

1. Folks writing sys.excepthook implementations wanting to provide more useful detail to their users without overwhelming them with irrelevant local variables and without having to customise the hook for every structured exception type
2. Folks wanting to "make notes" on an exception before re-raising it
3. Folks writing new exception subclasses in Python and wanting to provide structured exception details in a consistent way
4. Folks adapting existing structured exception subclasses in Python to expose the new API
5. Folks adapting existing structured exception subclasses in C to expose the new API
6. Folks raising structured exceptions for flow control purposes who'd be annoyed by performance lost to unnecessary memory allocations
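As a sketch of how a couple of those audiences might use such a property, here it is on a standalone class (`DetailedError` is an invented name; this is not the real BaseException):

```python
# Lazily created 'details' mapping, as outlined above, on a stand-in class.
class DetailedError(Exception):
    @property
    def details(self):
        # Created on first access, so exceptions raised purely for flow
        # control (audience 6) pay nothing unless someone asks for it.
        if "_details" not in self.__dict__:
            self._details = {}
        return self._details

def annotate_and_reraise():
    try:
        raise DetailedError("low-level failure")
    except DetailedError as exc:
        exc.details["stage"] = "parsing"   # "making notes" (audience 2)
        raise

try:
    annotate_and_reraise()
except DetailedError as exc:
    # An excepthook (audience 1) could render this mapping generically.
    assert exc.details == {"stage": "parsing"}
```

The dictionary is only materialised in `__dict__` once something touches `details`, which is the memory-allocation point being made in item 6.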
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From zuo at chopin.edu.pl Sat Jul 8 13:15:32 2017 From: zuo at chopin.edu.pl (Jan Kaliszewski) Date: Sat, 8 Jul 2017 19:15:32 +0200 Subject: [Python-ideas] Improving Catching Exceptions [ERRATUM] In-Reply-To: <20170708025702.33035ef8@grzmot> References: <20170622205532.GP3149@ando.pearwood.info> <20170622232923.GA48632@cskk.homeip.net> <594DBA3F.1010505@canterbury.ac.nz> <20170624100326.GT3149@ando.pearwood.info> <594E5BAC.5050809@canterbury.ac.nz> <20170708025702.33035ef8@grzmot> Message-ID: <20170708191532.27edb15b@grzmot> 2017-07-08 Jan Kaliszewski dixit: > return wrapper Should be: return property(wrapper) From turnbull.stephen.fw at u.tsukuba.ac.jp Mon Jul 10 07:36:36 2017 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Mon, 10 Jul 2017 20:36:36 +0900 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <192641499305985@web47g.yandex.ru> References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> Message-ID: <22883.26308.389706.743344@turnbull.sk.tsukuba.ac.jp> Jeff Walker writes: > Stephen, Mr. d'Aprano is a "Steven." I'm a "Stephen". We both go by "Steve". And you replied to Chris, not to Steve. It's worth being careful about these things. I don't see what I would consider satisfactory, concise answers to your questions about Steve's claims about defects in Ken's proposal. So I'll take a somewhat belated hack at it myself. Warning, I don't do so well on "concise", after all. 
;-) > > the idea doesn't actually solve the problem it is intended to > > Specifically Ken started by saying that it should not be necessary > to parse the messages to get the components of the message. He then > gave an example where he was able to access the components of the > message without parsing the message. So how is it that he is not > solving the problem he intended to solve? Changing the convention for raising NameError to

    >>> raise NameError("foo", "is undefined")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: ('foo', 'is undefined')

provides all of the functionality of Ken's BaseException with the default template of None. Thus, it's not Ken's BaseException that solves the problem, it's the required change to the convention for raising NameError. Ken's BaseException does provide additional functionality (the template feature, allowing a pretty error message), but that's irrelevant to the actual solution of instantiating Exceptions with unformatted argument(s). Note that even

    >>> raise NameError('foo')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: foo

isn't actually terrible in the interpreter, and would give Ken a more reliable way to extract the name (e.args[0]). Of course there's a good chance that wouldn't work so well for arbitrary Exceptions, but the multiple argument, ugly but parseable message, approach usually should work. > > His solution can't work > > Again, he gave an example where he was able to access the > components of the message without parsing the message. Yet you > claim his solution cannot work. Is his example wrong? This claim is missing three words: "in full generality". This is demonstrated by Steven's "nym" example, which is a general problem that can occur in any raise statement. I think this indicates that you really want to provide a custom API for each and every derived Exception.
Can't enforce that, or even provide a nice generic default, in BaseException as Ken proposed to revise it. There's also the rather tricky question of backward compatibility. In particular, any use of this feature in existing raise statements that involves changing the content of Exception.args will break code that parses the args rather than stringifying the Exception and parsing that. I think this breakage is likely to be very tempting when people consider "upgrading" existing raise statements with preformatted string arguments to "structured" arguments containing objects of arbitrary types. And any change to the format string (such as translating to a different language where the name appears in a different position) would imply the same breakage. Finally, while it's unfair to judge the whole proposal on proof-of- concept code, his BaseException is inadequate. Specifically, the template argument should get a decent default ("{:s}" seems reasonable, as we've seen above), and given that some arguments are likely to be strings containing spaces, his generic __str__ is going to confuse the heck out of users at the interpreter. Maybe it's really as easy as indicated here to fix those problems, but I wouldn't bet on it. To address all of the above problems, it seems reasonable to me to follow PEP 352's advice and initialize attributes on the exception from the arguments to the constructor. That plus custom __str__s to deal with nice formatting in tracebacks would make changes to BaseException unnecessary, although the addition of the template feature *might* be useful to reduce boilerplate in __str__, or even allow BaseException to provide a widely useful generic __str__. I'm doubtful it would be that widely useful, but who knows? > > He hasn't demonstrated that there is a real problem > > You yourself admitted that parsing a message to extract the > components is undesirable. Sure, but again two words are missing from the claim: "in BaseException". 
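Following the PEP 352 advice mentioned above, a concrete sketch of "initialize attributes on the exception from the arguments to the constructor" for one exception type might look like this (the class and its attributes are invented for illustration; this is not the real NameError):

```python
# Per-type structured exception in the PEP 352 style: documented,
# guaranteed attributes set in the constructor, plus a custom __str__.
class StructuredNameError(Exception):
    def __init__(self, name, namespace=None):
        super().__init__(name, namespace)   # keep args for compatibility
        self.name = name                    # guaranteed attribute
        self.namespace = namespace          # optional extra context

    def __str__(self):
        if self.namespace is None:
            return "name {0.name!r} is not defined".format(self)
        return "name {0.name!r} is not defined in {0.namespace!r}".format(self)

try:
    raise StructuredNameError('welker', namespace='users')
except StructuredNameError as exc:
    assert exc.name == 'welker'   # no message parsing needed
    assert str(exc) == "name 'welker' is not defined in 'users'"
```

Because the attribute names are part of the class's documented API, there is no "nym" ambiguity: every raise site is forced through the same constructor signature.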
The claim is that "parse a string" is a problem in the definitions of *derived* exceptions, and in the conventions used to raise them. Programmers writing inadequate code is a real problem, but not one that Python (the language) can fix. (Of course we can submit bug reports and PEPs and fix our own inadequate code!) Another way to express this point is that we know that the raise statements will have to be fixed *everywhere*, and that some __str__s will need to be changed or added. The claim "hasn't demonstrated" is meant to say "a problem that can be fixed by changing BaseException". It seems likely to me that there are going to be a pile of slightly different ways to address this depending on the particular exception in question, and many of the more important ones (eg, OSError) are already structured and won't benefit from this change at all. Finally, generic experience has shown that the closer to root in one of these hierarchies you go, the more conservative you should be. The ramifications of a change in a very general feature for older code, as well as for more derived components, are often very hard to foresee accurately, and most surprises are unpleasant. Specifically for BaseException, Nick points out that a lot of corner cases were considered for exception hierarchies, and it was decided at that time that this kind of thing was rather risky for the foreseeable benefit. That doesn't mean it's still so risky, but I think going slow here, and doing our best to beat the proposal into a twisting, steaming hunk of half-melted metal is warranted. If it's still holding its shape after attacks with precision-guided munitions, then the risk is more likely to be worth it. :-) Steve From victor.stinner at gmail.com Tue Jul 11 06:19:06 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 11 Jul 2017 12:19:06 +0200 Subject: [Python-ideas] PEP: Hide implementation details in the C API Message-ID: Hi, This is the first draft of a big (?) 
project to prepare CPython to be able to "modernize" its implementation. Proposed changes should allow CPython to become more efficient in the future. The optimizations themselves are out of scope for this PEP, but some examples are listed to explain why these changes are needed. For the background, see also my talk at the previous Python Language Summit at Pycon US, Portland OR: "Keeping Python competitive" https://lwn.net/Articles/723752/#723949 "Python performance", slides (PDF): https://github.com/haypo/conf/raw/master/2017-PyconUS/summit.pdf Since this is really the first draft, I didn't assign a PEP number to it yet. I prefer to wait for a first feedback round. Victor PEP: xxx Title: Hide implementation details in the C API Version: $Revision$ Last-Modified: $Date$ Author: Victor Stinner , Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 31-May-2017 Abstract ======== Modify the C API to remove implementation details. Add an opt-in option to compile C extensions to get the old full API with implementation details. The modified C API makes it easier to experiment with new optimizations: * Indirect Reference Counting * Remove Reference Counting, New Garbage Collector * Remove the GIL * Tagged pointers Reference counting may be emulated in a future implementation for backward compatibility. Rationale ========= History of CPython forks ------------------------ Over the last 10 years, CPython was forked multiple times to attempt different CPython enhancements: * Unladen Swallow: add a JIT compiler based on LLVM * Pyston: add a JIT compiler based on LLVM (CPython 2.7 fork) * Pyjion: add a JIT compiler based on Microsoft CLR * Gilectomy: remove the Global Interpreter Lock nicknamed "GIL" * etc. Sadly, none of these projects has been merged back into CPython. Unladen Swallow lost its funding from Google, Pyston lost its funding from Dropbox, Pyjion is developed in the limited spare time of two Microsoft employees.
One technically hard issue which blocked these projects from really unleashing their power is the C API of CPython. Many old technical choices of CPython are hardcoded in this API: * reference counting * garbage collector * C structures like PyObject which contain headers for reference counting and the garbage collector * specific memory allocators * etc. PyPy ---- PyPy uses more efficient structures and uses a more efficient garbage collector without reference counting. Thanks to that (but also many other optimizations), PyPy succeeded in running Python code up to 5x faster than CPython. Plan made of multiple small steps ================================= Step 1: split Include/ into subdirectories ------------------------------------------ Split the ``Include/`` directory of CPython: * ``python`` API: ``Include/Python.h`` remains the default C API * ``core`` API: ``Include/core/Python.h`` is a new C API designed for building Python * ``stable`` API: ``Include/stable/Python.h`` is the stable ABI Expect declarations to be duplicated on purpose: ``#include`` should not be used to include files from a different API to prevent mistakes. In the past, too many functions were exposed *by mistake*, especially symbols exported to the stable ABI by mistake. At this point, ``Include/Python.h`` is not changed at all: zero risk of backward incompatibility. The ``core`` API is the most complete API, exposing *all* implementation details and using macros for best performance. XXX should we abandon the stable ABI? Never really used by anyone. Step 2: Add an opt-in API option to tools building packages ----------------------------------------------------------- Modify Python packaging tools (distutils, setuptools, flit, pip, etc.) to add an opt-in option to choose the API: ``python``, ``core`` or ``stable``. For example, debuggers like ``vmprof`` need the ``core`` API to get full access to implementation details. XXX handle backward compatibility for packaging tools.
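To make Step 2 slightly more concrete, one hypothetical shape such an opt-in could take is a helper mapping the chosen API variant onto the Step 1 include layout. Every name below (`python_include_dir`, the subdirectory scheme) is invented for illustration and is not part of any existing tool:

```python
# Hypothetical helper a packaging tool could expose for the proposed
# opt-in API choice; the 'core'/'stable' subdirectories follow Step 1.
import os
import sysconfig

def python_include_dir(api='python'):
    base = sysconfig.get_paths()['include']
    if api == 'python':
        return base                      # default API: Include/Python.h
    if api in ('core', 'stable'):
        return os.path.join(base, api)   # Include/core/ or Include/stable/
    raise ValueError('unknown API variant: {!r}'.format(api))
```

A build backend could then add the selected directory to the compiler's include path when compiling an extension that declared, say, a core-API requirement.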
Step 3: first pass of implementation detail removal --------------------------------------------------- Modify the ``python`` API: * Add a new ``API`` subdirectory in the Python source code which will "implement" the Python C API * Replace macros with functions. The implementation of new functions will be written in the ``API/`` directory. For example, Py_INCREF() becomes the function ``void Py_INCREF(PyObject *op)`` and its implementation will be written in the ``API`` directory. * Slowly remove more and more implementation details from this API. Modifications of these APIs should be driven by tests of popular third party packages like: * Django with database drivers * numpy * scipy * Pillow * lxml * etc. Compilation errors on these extensions are expected. This step should help to draw a line for the backward incompatible change. Goal: remove a few implementation details but don't break numpy and lxml. Step 4 ------ Switch the default API to the new restricted ``python`` API. Help third party projects to patch their code: don't break the "Python world". Step 5 ------ Continue Step 3: remove even more implementation details. Long-term goal to complete this PEP: Remove *all* implementation details, remove all structures and macros. Alternative: keep core as the default API ========================================= A smoother transition would be to not touch the existing API but work on a new API which would only be used as an opt-in option. Similar plan used by Gilectomy: opt-in option to get the best performance. There would be at least two Python binaries per Python version: default compatible version, and a new faster but incompatible version. Idea: implementation of the C API supporting old Python versions? ================================================================= Open questions. Q: Would it be possible to design an external library which would work on Python 2.7, Python 3.4-3.6, and the future Python 3.7? Q: Should such a library be linked to libpythonX.Y?
Or even to a pythonX.Y binary which wasn't built with a shared library? Q: Would it be easy to use it? How would it be downloaded and installed to build extensions? Collaboration with PyPy, IronPython, Jython and MicroPython =========================================================== XXX to be done Enhancements becoming possible thanks to a new C API ==================================================== Indirect Reference Counting --------------------------- * Replace ``Py_ssize_t ob_refcnt;`` (integer) with ``Py_ssize_t *ob_refcnt;`` (pointer to an integer). * Same change for GC headers? * Store all reference counters in a separate memory block (or maybe multiple memory blocks) Expected advantage: smaller memory footprint when using fork() on UNIX which is implemented with Copy-On-Write on physical memory pages. See also `Dismissing Python Garbage Collection at Instagram `_. Remove Reference Counting, New Garbage Collector ------------------------------------------------ If the new C API hides all implementation details well, it becomes possible to change fundamental features like how CPython tracks the lifetime of an object. * Remove ``Py_ssize_t ob_refcnt;`` from the PyObject structure * Replace the current XXX garbage collector with a new tracing garbage collector * Use new macros to define a variable storing an object and to set the value of an object * Reimplement Py_INCREF() and Py_DECREF() on top of that using a hash table: object => reference counter. XXX PyPy is only partially successful on that project, cpyext remains very slow. XXX Would it require an opt-in option to really limit backward compatibility? Remove the GIL -------------- * Don't remove the GIL, but replace the GIL with smaller locks * Builtin mutable types: list, set, dict * Modules * Classes * etc. Backward compatibility: * Keep the GIL Tagged pointers --------------- https://en.wikipedia.org/wiki/Tagged_pointer A common optimization, especially used for "small integers".
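The trick is easy to model in pure Python (a sketch of the general technique, not of any particular implementation): steal the low bit of an aligned machine word to mark "small integer stored inline" versus "real pointer":

```python
# Illustrative model of tagged pointers: aligned pointers always have
# their low bit clear, so that bit can tag inline small integers.
TAG_INT = 1

def box_int(value):
    return (value << 1) | TAG_INT   # store the int inline, low bit set

def box_pointer(addr):
    assert addr % 2 == 0            # aligned addresses have low bit clear
    return addr

def is_inline_int(word):
    return word & TAG_INT

def unbox_int(word):
    return word >> 1                # drop the tag bit

w = box_int(42)
assert is_inline_int(w) and unbox_int(w) == 42
assert not is_inline_int(box_pointer(0x7f00))
```

Because a Py_INCREF-style macro would dereference such a word directly, code compiled against the current C API cannot tolerate tag bits, which is why the PEP treats hiding those details as a prerequisite.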
The current C API doesn't allow implementing tagged pointers.

Tagged pointers are used in MicroPython to reduce the memory footprint.

Note: ARM64 recently extended its address space to 48 bits, causing an
issue in LuaJIT: `47 bit address space restriction on ARM64 `_.

Misc ideas
----------

* Software Transactional Memory? See `PyPy STM `_

Idea: Multiple Python binaries
==============================

Instead of a single ``python3.7``, providing two or more binaries, as
PyPy does, would make it easier to experiment with changes without
breaking backward compatibility.

For example, ``python3.7`` would remain the default binary with
reference counting and the current garbage collector, whereas
``fastpython3.7`` would drop reference counting and use a new garbage
collector.

It would make it possible to "break backward compatibility" more
quickly, and make it even more explicit that only prepared C extensions
will be compatible with the new ``fastpython3.7``.

cffi
====

XXX Long term goal: "more cffi, less libpython".

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From ncoghlan at gmail.com  Tue Jul 11 08:22:50 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 11 Jul 2017 22:22:50 +1000
Subject: [Python-ideas] PEP: Hide implementation details in the C API
In-Reply-To:
References:
Message-ID:

On 11 July 2017 at 20:19, Victor Stinner wrote:
> Hi,
>
> This is the first draft of a big (?) project to prepare CPython to be
> able to "modernize" its implementation. Proposed changes should allow
> to make CPython more efficient in the future. The optimizations
> themself are out of the scope of the PEP, but some examples are listed
> to explain why these changes are needed.
Please don't use the word "needed" for speed increases, as we can just as well diagnose the problem with the status quo as "The transition from writing and publishing pure Python modules to writing and publishing pre-compiled accelerated modules in Cython, C, C++, Rust, or Go is too hard, so folks mistakenly think they need to rewrite their whole application in something else, rather than just selectively replacing key pieces of it".

After all, we already *have* an example of a breakout success for writing Python applications that rival C and FORTRAN applications for raw speed: Cython. By contrast, most of the alternatives that have attempted to make Python faster without forcing users to give up on some of the language's dynamism in the process have been plagued by compatibility challenges and found themselves needing to ask for the language's runtime semantics to be relaxed in one way or another.

That said, trying to improve how we manage the distinction between the public API and the interpreter's internal APIs is still an admirable goal, and it would be *great* to have CPython natively provide the public API that cffi relies on (so that other projects could also effectively target it), so my main request is just to rein in the dramatic rhetoric and start by exploring how many of the benefits you'd like can be obtained *without* a hard compatibility break.

While the broad C API is one of CPython's greatest strengths that enabled the Python ecosystem to become the powerhouse that it is, it is *also* a pain to maintain consistently, *and* it poses problems for some technical experiments various folks would like to carry out. Those kinds of use cases are more than enough to justify changes to the way we manage our public header files - you don't need to dress it up in "sky is falling" rhetoric founded in the fear of other programming languages.

Yes, Python is a nice language to program in, and it's great that we can get jobs where we can get paid to program in it.
That doesn't mean we have to treat it as an existential threat that we aren't always going to be the best choice for everything :)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From p.f.moore at gmail.com  Tue Jul 11 08:36:10 2017
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 11 Jul 2017 13:36:10 +0100
Subject: [Python-ideas] PEP: Hide implementation details in the C API
In-Reply-To:
References:
Message-ID:

On 11 July 2017 at 11:19, Victor Stinner wrote:
> XXX should we abandon the stable ABI? Never really used by anyone.

Please don't. On Windows, embedding Python is a pain because a new version of Python requires a recompile (which isn't ideal for apps that just want to optionally allow Python scripting, for example). Also, the recent availability of the embedded distribution on Windows has opened up some opportunities and I've been using the stable ABI there.

It's not the end of the world if we lose it, but I'd rather see it retained (or better still, enhanced).

Paul

From victor.stinner at gmail.com  Tue Jul 11 11:04:53 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 11 Jul 2017 17:04:53 +0200
Subject: [Python-ideas] PEP: Hide implementation details in the C API
In-Reply-To:
References:
Message-ID:

> Step 3: first pass of implementation detail removal
> ---------------------------------------------------
>
> Modify the ``python`` API:
>
> * Add a new ``API`` subdirectory in the Python source code which will
> "implement" the Python C API
> * Replace macros with functions. The implementation of new functions
> will be written in the ``API/`` directory. For example, Py_INCREF()
> becomes the function ``void Py_INCREF(PyObject *op)`` and its
> implementation will be written in the ``API`` directory.
> * Slowly remove more and more implementation details from this API.

When I discussed this issue with Serhiy Storchaka, he didn't see the purpose of the API directory.
I started to implement the PEP in my "capi2" fork of CPython:
https://github.com/haypo/cpython/tree/capi2

See https://github.com/haypo/cpython/tree/capi2/API for examples of C code to "implement the C API".

Just one example: the macro

    #define PyUnicode_IS_READY(op) (((PyASCIIObject*)op)->state.ready)

is replaced with a function:

    int PyUnicode_IS_READY(const PyObject *op)
    {
        return ((PyASCIIObject*)op)->state.ready;
    }

So the header file doesn't have to expose the PyASCIIObject, PyCompactUnicodeObject and PyUnicodeObject structures. I was already able to remove the PyUnicodeObject structure without breaking the C extensions of the stdlib.

I don't want to pollute Objects/unicodeobject.c with such "wrapper" functions. In the future, the implementation of API/ can evolve a lot.

Victor

From ericsnowcurrently at gmail.com  Tue Jul 11 13:41:07 2017
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 11 Jul 2017 11:41:07 -0600
Subject: [Python-ideas] PEP: Hide implementation details in the C API
In-Reply-To:
References:
Message-ID:

On Tue, Jul 11, 2017 at 4:19 AM, Victor Stinner wrote:
> Step 1: split Include/ into subdirectories
> ------------------------------------------
>
> Split the ``Include/`` directory of CPython:
>
> * ``python`` API: ``Include/Python.h`` remains the default C API
> * ``core`` API: ``Include/core/Python.h`` is a new C API designed for
> building Python
> * ``stable`` API: ``Include/stable/Python.h`` is the stable ABI
>
> Expect declarations to be duplicated on purpose: ``#include`` should be
> not used to include files from a different API to prevent mistakes. In
> the past, too many functions were exposed *by mistake*, especially
> symbols exported to the stable ABI by mistake.
>
> At this point, ``Include/Python.h`` is not changed at all: zero risk of
> backward incompatibility.
>
> The ``core`` API is the most complete API exposing *all* implementation
> details and use macros for best performances.
FWIW, this is similar to something I've done while working on gathering up the CPython global runtime state. [1] I needed to share some internal details across compilation modules. Nick suggested a separate Include/internal directory for header files containing "private" API. There is a _Python.h file there that starts with:

    #ifndef Py_BUILD_CORE
    #error "Internal headers are not available externally."
    #endif

In Include/Python.h, Include/internal/_Python.h gets included if Py_BUILD_CORE is defined.

This approach makes a strict boundary that keeps internal details out of the public API. That way we don't accidentally leak private API. It sounds similar to this part of your proposal (adding the "core" API).

-eric

[1] http://bugs.python.org/issue30860

From barry at barrys-emacs.org  Tue Jul 11 17:01:30 2017
From: barry at barrys-emacs.org (Barry Scott)
Date: Tue, 11 Jul 2017 22:01:30 +0100
Subject: [Python-ideas] PEP: Hide implementation details in the C API
In-Reply-To:
References:
Message-ID:

This is a great idea. The suggestions in your first draft would help clean up some of the uglier corners of the PyCXX code.

I'd suggest that you might want to add at least one PyCXX-based extension to your testing. PyCXX aims to expose all the C API as C++ classes. (I'm missing the class variable support, as you discussed on this list a while ago.)

I'd be happy to support your API in PyCXX.
Barry

From ncoghlan at gmail.com  Tue Jul 11 23:30:19 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Jul 2017 13:30:19 +1000
Subject: [Python-ideas] PEP: Hide implementation details in the C API
In-Reply-To:
References:
Message-ID:

Commenting more on specific technical details rather than just tone this time :)

On 11 July 2017 at 20:19, Victor Stinner wrote:
> PEP: xxx
> Title: Hide implementation details in the C API
> Version: $Revision$
> Last-Modified: $Date$
> Author: Victor Stinner ,
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 31-May-2017
>
>
> Abstract
> ========
>
> Modify the C API to remove implementation details. Add an opt-in option
> to compile C extensions to get the old full API with implementation
> details.
>
> The modified C API allows to more easily experiment new optimizations:
>
> * Indirect Reference Counting
> * Remove Reference Counting, New Garbage Collector
> * Remove the GIL
> * Tagged pointers
>
> Reference counting may be emulated in a future implementation for
> backward compatibility.

I don't believe this is the best rationale to use for the PEP, as we (or at least I) have emphatically promised *not* to do another Python 3 style compatibility break, and we know from PyPy's decade of challenges that a lot of Python's users care even more about CPython C API/ABI compatibility than they do about the core data model.

It also has the downside of not really being true, since *other implementations* are happily experimenting with alternative approaches, and projects like PyMetabiosis attempt to use CPython itself as an adapter between other runtimes and the full C API for those extension modules that need it.

What is unequivocally true though is that in the current C API:

1. We're not sure which APIs other projects (including extension module generators and helper libraries like Cython, Boost, PyCXX, SWIG, cffi, etc) are *actually* relying on.
2.
It's easy for us to accidentally expand the public C API without thinking about it, since Py_BUILD_CORE guards are opt-in and Py_LIMITED_API guards are opt-out
3. We haven't structured our header files in a way that makes it obvious at a glance which API we're modifying (internal API, public API, stable ABI)

> Rationale
> =========
>
> History of CPython forks
> ------------------------
>
> Last 10 years, CPython was forked multiple times to attempt
> different CPython enhancements:
>
> * Unladen Swallow: add a JIT compiler based on LLVM
> * Pyston: add a JIT compiler based on LLVM (CPython 2.7 fork)
> * Pyjion: add a JIT compiler based on Microsoft CLR
> * Gilectomy: remove the Global Interpreter Lock nicknamed "GIL"
> * etc.
>
> Sadly, none is this project has been merged back into CPython. Unladen
> Swallow looses its funding from Google, Pyston looses its funding from
> Dropbox, Pyjion is developed in the limited spare time of two Microsoft
> employees.
>
> One hard technically issue which blocked these projects to really
> unleash their power is the C API of CPython.

This is a somewhat misleadingly one-sided presentation of Python's history, as the broad access to CPython internals offered by the C API is precisely what *enabled* the scientific Python stack (including NumPy, SciPy, Pandas, scikit-learn, Cython, Numba, PyCUDA, etc) to develop largely independently of CPython itself.

So for folks that are willing to embrace the use of Cython (and extension modules in general), many of CPython's runtime limitations (like the GIL and the overheads of working with boxed values) can already be avoided by pushing particular sections of code closer to C semantics than they are to traditional Python semantics.
We've also been working to bring the runtime semantics of extension modules ever closer to those of pure Python modules, to the point where Python 3.7 is likely to be able to run an extension module as __main__ (see https://www.python.org/dev/peps/pep-0547/ for details)

> Many old technical choices
> of CPython are hardcoded in this API:
>
> * reference counting
> * garbage collector
> * C structures like PyObject which contains headers for reference
> counting and the garbage collector
> * specific memory allocators
> * etc.
>
> PyPy
> ----
>
> PyPy uses more efficient structures and use a more efficient garbage
> collector without reference counting. Thanks to that (but also many
> other optimizations), PyPy succeeded to run Python code up to 5x faster
> than CPython.

This framing makes it look a bit like you're saying "It's hard for PyPy to correctly emulate these aspects of CPython, so we should eliminate them as a barrier to adoption for PyPy by breaking them for CPython's currently happy users as well".

I don't think that's really a framing you want to run with in the near term, as it's going to start a needless fight, when there's plenty of unambiguously beneficial work that could be done before anyone starts contemplating any kind of API compatibility break :)

In particular, better segmenting our APIs into "solely for CPython's internal use", "ABI is specific to a CPython version", "API is portable across Python implementations", "ABI is portable across CPython versions (and maybe even Python implementations)" allows tooling developers and extension module authors to make more informed decisions about how closely they want to couple their work to CPython specifically.
And then *after* we've done that API clarification work, *then* we can ask the question about what the default behaviour of "#include <Python.h>" should be, and perhaps introduce an opt-in Py_CPYTHON_API flag to request access to the full traditional C API for extension modules and embedding applications that actually need it.

(While that's still a compatibility break, it's one that can be trivially resolved by putting an unconditional "#define Py_CPYTHON_API" before the Python header inclusion for projects that find they were actually relying on CPython specifics)

> Plan made of multiple small steps
> =================================
>
> Step 1: split Include/ into subdirectories
> ------------------------------------------
>
> Split the ``Include/`` directory of CPython:
>
> * ``python`` API: ``Include/Python.h`` remains the default C API
> * ``core`` API: ``Include/core/Python.h`` is a new C API designed for
> building Python
> * ``stable`` API: ``Include/stable/Python.h`` is the stable ABI
>
> Expect declarations to be duplicated on purpose: ``#include`` should be
> not used to include files from a different API to prevent mistakes. In
> the past, too many functions were exposed *by mistake*, especially
> symbols exported to the stable ABI by mistake.
>
> At this point, ``Include/Python.h`` is not changed at all: zero risk of
> backward incompatibility.
>
> The ``core`` API is the most complete API exposing *all* implementation
> details and use macros for best performances.

This part I like, although as Eric noted, we can avoid making wholesale changes to the headers of our implementation files by putting a Py_BUILD_CORE guard around the inclusion of a "Include/core/_CPython.h" header from "Include/Python.h"

> XXX should we abandon the stable ABI? Never really used by anyone.

It's also not available in Python 2.7, so anyone straddling the 2/3 boundary isn't currently able to rely on it.
As folks become more willing to drop Python 2.7 support, expending the effort to start targeting the stable ABI instead becomes more attractive (especially for extension module creation tools like Cython, cffi, and SWIG), since the stable ABI usage can *replace* the code that uses the traditional CPython API.

> Step 2: Add an opt-in API option to tools building packages
> -----------------------------------------------------------
>
> Modify Python packaging tools (distutils, setuptools, flit, pip, etc.)
> to add an opt-in option to choose the API: ``python``, ``core`` or
> ``stable``.
>
> For example, debuggers like ``vmprof`` need need the ``core`` API to get
> a full access to implementation details.
>
> XXX handle backward compatibility for packaging tools.

For handcoded extensions, defining which API to use would be part of the C/C++ code. For generated extensions, it would be an option passed to Cython, cffi, etc. Packaging frontends shouldn't need to explicitly support it any more than they explicitly support the stable ABI today.

> Step 3: first pass of implementation detail removal
> ---------------------------------------------------
>
> Modify the ``python`` API:
>
> * Add a new ``API`` subdirectory in the Python source code which will
> "implement" the Python C API
> * Replace macros with functions. The implementation of new functions
> will be written in the ``API/`` directory. For example, Py_INCREF()
> becomes the function ``void Py_INCREF(PyObject *op)`` and its
> implementation will be written in the ``API`` directory.
> * Slowly remove more and more implementation details from this API.

I'd suggest doing this slightly differently by ensuring that the APIs are defined as strict supersets of each other as follows:

1. CPython internal APIs (Py_BUILD_CORE)
2. CPython C API (status quo, currently no qualifier)
3. Portable Python API (new, starts as equivalent to stable ABI)
4.
Stable Python ABI (Py_LIMITED_API)

The two new qualifiers would then be:

    #define Py_CPYTHON_API
    #define Py_PORTABLE_API

And Include/Python.h would end up looking something like this:

    [Common configuration includes would still go here]

    #ifdef Py_BUILD_CORE
        #include "core/_CPython.h"
    #else
        #ifdef Py_LIMITED_API
            #include "stable/Python.h"
        #else
            #ifdef Py_PORTABLE_API
                #include "portable/Python.h"
            #else
                #define Py_CPYTHON_API
                #include "cpython/Python.h"
            #endif
        #endif
    #endif

At some future date, the default could then potentially switch to being the portable API for the current Python version, with folks having to opt in to using either the full CPython API or the portable API for an older version.

To avoid having to duplicate prototype definitions, and to ensure that C compilers complain when we inadvertently redefine a symbol differently from the way a more restricted API defines it, each API superset would start by including the next narrower API. So we'd have this:

    Include/stable/Python.h:

        [No special preamble, as it's the lowest common denominator API]

    Include/portable/Python.h:

        #define Py_LIMITED_API Py_PORTABLE_API
        #include "../stable/Python.h"
        #undef Py_LIMITED_API
        [Any desired API additions and overrides]

    Include/cpython/Python.h:

        #include "../patchlevel.h"
        #define Py_PORTABLE_API PY_VERSION_HEX
        #include "../portable/Python.h"
        #undef Py_PORTABLE_API
        [Include the rest of the current public C API]

    Include/core/_CPython.h:

        #ifndef Py_BUILD_CORE
        #error "Internal headers are only available when building CPython"
        #endif
        #include "../cpython/Python.h"
        [Include the rest of the internal C API]

And at least initially, the subdirectories would be mostly empty - instead, we'd have the following setup:

1. Unported headers would remain directly in "Include/" and be included from "Include/Python.h"
2. Ported headers would have their contents split between core, cpython, and stable based on their #ifdef chains
3.
When porting, the more expansive APIs would use "#undef" as needed when overriding a symbol deliberately.

And then, once all the APIs had been clearly categorised in a way that C compilers can better help us manage, the folks that were interested in this could start building key extension modules (such as NumPy and lxml) using "Py_PORTABLE_API=0x03070000", and *adding* to the portable API on an explicitly needs-driven basis.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From tjreedy at udel.edu  Wed Jul 12 00:24:27 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 12 Jul 2017 00:24:27 -0400
Subject: [Python-ideas] PEP: Hide implementation details in the C API
In-Reply-To:
References:
Message-ID:

On 7/11/2017 11:30 PM, Nick Coghlan wrote:
> Commenting more on specific technical details rather than just tone this time :)
>
> On 11 July 2017 at 20:19, Victor Stinner wrote:
>> Reference counting may be emulated in a future implementation for
>> backward compatibility.

One heavy user's experience with garbage collection:
https://engineering.instagram.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172

"Instagram can run 10% more efficiently. ... Yes, you heard it right! By disabling GC [and relying only on ref counting], we can reduce the memory footprint and improve the CPU LLC cache hit ratio"

It turned out that gc.disable() was inadequate because imported libraries could turn it on, and one did.

> I don't believe this is the best rationale to use for the PEP, as we
> (or at least I) have emphatically promised *not* to do another Python
> 3 style compatibility break, and we know from PyPy's decade of
> challenges that a lot of Python's users care even more about CPython C
> API/ABI compatibility than they do the core data model.
[snip most]

--
Terry Jan Reedy

From ronaldoussoren at mac.com  Wed Jul 12 03:24:23 2017
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Wed, 12 Jul 2017 09:24:23 +0200
Subject: [Python-ideas] PEP: Hide implementation details in the C API
In-Reply-To:
References:
Message-ID: <0A294574-B7A4-4764-98D2-5584278CD757@mac.com>

> On 11 Jul 2017, at 12:19, Victor Stinner wrote:
>
> Hi,
>
> This is the first draft of a big (?) project to prepare CPython to be
> able to "modernize" its implementation. Proposed changes should allow
> to make CPython more efficient in the future. The optimizations
> themself are out of the scope of the PEP, but some examples are listed
> to explain why these changes are needed.

I'm not sure if hiding implementation details will help a lot w.r.t. making CPython more efficient, but cleaning up the public API would avoid accidentally depending on non-public information (and is sound engineering anyway). That said, a lot of care should be taken to avoid breaking existing extensions as the ease of writing extensions is one of the strong points of CPython.

>
> Plan made of multiple small steps
> =================================
>
> Step 1: split Include/ into subdirectories
> ------------------------------------------
>
> Split the ``Include/`` directory of CPython:
>
> * ``python`` API: ``Include/Python.h`` remains the default C API
> * ``core`` API: ``Include/core/Python.h`` is a new C API designed for
> building Python
> * ``stable`` API: ``Include/stable/Python.h`` is the stable ABI

Looks good in principle. It is currently too easy to accidentally add to the stable ABI by forgetting to add "#if" guards around a non-stable API.

>
> Expect declarations to be duplicated on purpose: ``#include`` should be
> not used to include files from a different API to prevent mistakes. In
> the past, too many functions were exposed *by mistake*, especially
> symbols exported to the stable ABI by mistake.
Not sure about this, shouldn't it be possible to have ``python`` include ``core`` and ``core`` include ``stable``? This would avoid having to update multiple header files when adding new definitions.

>
> At this point, ``Include/Python.h`` is not changed at all: zero risk of
> backward incompatibility.
>
> The ``core`` API is the most complete API exposing *all* implementation
> details and use macros for best performances.
>
> XXX should we abandon the stable ABI? Never really used by anyone.

Assuming that's true, has anyone looked into why it is barely used? If I had to guess, it's due to inertia.

>
> Step 3: first pass of implementation detail removal
> ---------------------------------------------------
>
> Modify the ``python`` API:
>
> * Add a new ``API`` subdirectory in the Python source code which will
> "implement" the Python C API
> * Replace macros with functions. The implementation of new functions
> will be written in the ``API/`` directory. For example, Py_INCREF()
> becomes the function ``void Py_INCREF(PyObject *op)`` and its
> implementation will be written in the ``API`` directory.

In this particular case (Py_INCREF/DECREF) making them functions isn't really useful and is likely to be harmful for performance. It is not useful because these macros manipulate state in a struct that must be public because that struct is included into the structs for custom objects (PyObject_HEAD). Having them as macros also doesn't preclude moving to indirect reference counts. Moving to anything that isn't reference counts likely needs changes to the API (but not necessarily, see PyPy's cpext).

> * Slowly remove more and more implementation details from this API.
>
> Modifications of these API should be driven by tests of popular third
> party packages like:
>
> * Django with database drivers
> * numpy
> * scipy
> * Pillow
> * lxml
> * etc.
>
> Compilation errors on these extensions are expected.
> This step should help to draw a line for the backward incompatible
> change.

This could also help to find places where the documented API is not sufficient. One of the places where I poke directly into implementation details is a C-level subclass of str (PyUnicode_Type). I'd prefer not doing that, but AFAIK there is no other way to be string-like to the C API other than by being a subclass of str.

BTW. The reason I need to subclass str: in PyObjC I use a subclass of str to represent Objective-C strings (NSString/NSMutableString), and I need to keep track of the original value; mostly because there are some Objective-C APIs that use object identity. The worst part is that fully initialising the PyUnicodeObject fields often isn't necessary as a lot of Objective-C strings aren't used as strings in Python code.

>
> Enhancements becoming possible thanks to a new C API
> ====================================================
>
> Indirect Reference Counting
> ---------------------------
>
> * Replace ``Py_ssize_t ob_refcnt;`` (integer)
> with ``Py_ssize_t *ob_refcnt;`` (pointer to an integer).
> * Same change for GC headers?
> * Store all reference counters in a separated memory block
> (or maybe multiple memory blocks)

This could be done right now with a minimal change to the API: just make the ob_refcnt and ob_type fields of the PyObject struct private by renaming them; in Py3 the documented way to access these fields is through function macros and these could be changed to do indirect refcounting instead.

> Tagged pointers
> ---------------
>
> https://en.wikipedia.org/wiki/Tagged_pointer
>
> Common optimization, especially used for "small integers".
>
> Current C API doesn't allow to implement tagged pointers.

Why not? Thanks to Py_TYPE and Py_INCREF/Py_DECREF it should be possible to use tagged pointers without major changes to the API (also: see above).

> Tagged pointers are used in MicroPython to reduce the memory footprint.
>
> Note: ARM64 was recently extended its address space to 48 bits, causing
> issue in LuaJIT: `47 bit address space restriction on ARM64
> `_.

That shouldn't be a problem when only using the least significant bits as tag bits (those bits that are known to be zero in untagged pointers due to alignment).

>
> Idea: Multiple Python binaries
> ==============================
>
> Instead of a single ``python3.7``, providing two or more binaries, as
> PyPy does, would allow to experiment more easily changes without
> breaking the backward compatibility.
>
> For example, ``python3.7`` would remain the default binary with
> reference counting and the current garbage collector, whereas
> ``fastpython3.7`` would not use reference counting and a new garbage
> collector.
>
> It would allow to more quickly "break the backward compatibility" and
> make it even more explicit than only prepared C extensions will be
> compatible with the new ``fastpython3.7``.

The cost is having to maintain both indefinitely.

Ronald

From brett at python.org  Wed Jul 12 14:51:22 2017
From: brett at python.org (Brett Cannon)
Date: Wed, 12 Jul 2017 18:51:22 +0000
Subject: [Python-ideas] PEP: Hide implementation details in the C API
In-Reply-To: <0A294574-B7A4-4764-98D2-5584278CD757@mac.com>
References: <0A294574-B7A4-4764-98D2-5584278CD757@mac.com>
Message-ID:

On Wed, 12 Jul 2017 at 01:25 Ronald Oussoren wrote:
> > On 11 Jul 2017, at 12:19, Victor Stinner wrote:
> >
> > Hi,
> >
> > This is the first draft of a big (?) project to prepare CPython to be
> > able to "modernize" its implementation. Proposed changes should allow
> > to make CPython more efficient in the future. The optimizations
> > themself are out of the scope of the PEP, but some examples are listed
> > to explain why these changes are needed.
>
> I'm not sure if hiding implementation details will help a lot w.r.t.
> making CPython more efficient, but cleaning up the public API would avoid
> accidentally depending on non-public information (and is sound engineering
> anyway). That said, a lot of care should be taken to avoid breaking
> existing extensions as the ease of writing extensions is one of the strong
> points of CPython.

I also think the motivation doesn't have to be performance but simply cleaning up how we expose our C APIs to users as shown by the fact we have messed up the stable API by making it opt-out instead of opt-in.

> > Plan made of multiple small steps
> > =================================
> >
> > Step 1: split Include/ into subdirectories
> > ------------------------------------------
> >
> > Split the ``Include/`` directory of CPython:
> >
> > * ``python`` API: ``Include/Python.h`` remains the default C API
> > * ``core`` API: ``Include/core/Python.h`` is a new C API designed for
> > building Python
> > * ``stable`` API: ``Include/stable/Python.h`` is the stable ABI
>
> Looks good in principle. It is currently too easy to accidentally add to
> the stable ABI by forgetting to add "#if" guards around a non-stable API.
>
> > Expect declarations to be duplicated on purpose: ``#include`` should be
> > not used to include files from a different API to prevent mistakes. In
> > the past, too many functions were exposed *by mistake*, especially
> > symbols exported to the stable ABI by mistake.
>
> Not sure about this, shouldn't it be possible to have ``python`` include
> ``core`` and ``core`` include ``stable``? This would avoid having to update
> multiple header files when adding new definitions.

Yeah, that's also what I initially thought. Use a cascading hierarchy so that people know they should put anything as high up as possible to minimize its exposure.
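To make that concrete, here is a compilable toy model of such a cascading opt-in (illustrative only: the proposed headers are collapsed into a single translation unit, the macro names follow Nick's sketch earlier in the thread, and ``selected_api``/``PY_API_NAME`` are invented for this example):

```c
#include <assert.h>
#include <string.h>

/* Toy model of the proposed cascading selection in Include/Python.h:
 * exactly one API level is chosen, preferring the most restricted
 * qualifier that the extension opted into before inclusion. */

#define Py_PORTABLE_API 0x03070000   /* the extension opts in here */

#if defined(Py_BUILD_CORE)
#  define PY_API_NAME "core"         /* would include core/_CPython.h */
#elif defined(Py_LIMITED_API)
#  define PY_API_NAME "stable"       /* would include stable/Python.h */
#elif defined(Py_PORTABLE_API)
#  define PY_API_NAME "portable"     /* would include portable/Python.h */
#else
#  define Py_CPYTHON_API             /* default: full CPython API */
#  define PY_API_NAME "cpython"
#endif

const char *selected_api(void) { return PY_API_NAME; }
```

Removing the ``#define Py_PORTABLE_API`` line would fall through to the full-CPython-API branch, which is the "opt-out" behaviour Nick describes for the transition period.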
[SNIP]

> > Step 3: first pass of implementation detail removal
> > ---------------------------------------------------
> >
> > Modify the ``python`` API:
> >
> > * Add a new ``API`` subdirectory in the Python source code which will
> > "implement" the Python C API
> > * Replace macros with functions. The implementation of new functions
> > will be written in the ``API/`` directory. For example, Py_INCREF()
> > becomes the function ``void Py_INCREF(PyObject *op)`` and its
> > implementation will be written in the ``API`` directory.
>
> In this particular case (Py_INCREF/DECREF) making them functions isn't
> really useful and is likely to be harmful for performance. It is not useful
> because these macros manipulate state in a struct that must be public
> because that struct is included into the structs for custom objects
> (PyObject_HEAD). Having them as macro's also doesn't preclude moving to
> indirect reference counts. Moving to anything that isn't reference counts
> likely needs changes to the API (but not necessarily, see PyPy's cpext).

I think Victor has long-term plans to try and hide the struct details at a higher-level and so that would make macros a bad thing. But ignoring the specific Py_INCREF/DECREF example, switching to functions does buy us the ability to actually change the function implementations between Python versions compared to having to worry about what a macro used to do (which is a possibility with the stable ABI).

> > * Slowly remove more and more implementation details from this API.
> >
> > Modifications of these API should be driven by tests of popular third
> > party packages like:
> >
> > * Django with database drivers
> > * numpy
> > * scipy
> > * Pillow
> > * lxml
> > * etc.
> >
> > Compilation errors on these extensions are expected. This step should
> > help to draw a line for the backward incompatible change.
>
> This could also help to find places where the documented API is not
> sufficient.
One of the places where I poke directly into implementation > details is a C-level subclass of str (PyUnicode_Type). I'd prefer not doing > that, but AFAIK there is no other way to be string-like to the C API other > than by being a subclass of str. > Yeah, this would allow us to very clearly know what should or should not be documented (I would say the same for the stdlib but we all know old code didn't hide things with a leading underscore consistently). > > BTW. The reason I need to subclass str: in PyObjC I use a subclass of str > to represent Objective-C strings (NSString/NSMutableString), and I need to > keep track of the original value; mostly because there are some Objective-C > APIs that use object identity. The worst part is that fully initialising > the PyUnicodeObject fields often isn't necessary as a lot of Objective-C > strings aren't used as strings in Python code. > > > > > > > Enhancements becoming possible thanks to a new C API > > ==================================================== > > > > Indirect Reference Counting > > --------------------------- > > > > * Replace ``Py_ssize_t ob_refcnt;`` (integer) > > with ``Py_ssize_t *ob_refcnt;`` (pointer to an integer). > > * Same change for GC headers? > > * Store all reference counters in a separated memory block > > (or maybe multiple memory blocks) > > This could be done right now with a minimal change to the API: just make > the ob_refcnt and ob_type fields of the PyObject struct private by renaming > them; in Py3 the documented way to access these fields is through function > macros and these could be changed to do indirect refcounting instead. > I think this is why Victor wants functions, because even if you change the names the macros will be locked into their implementations if you try to write code that supports multiple versions and so you can't change it per-version of Python. -Brett -------------- next part -------------- An HTML attachment was scrubbed...
URL: From stefan_ml at behnel.de Wed Jul 12 18:23:39 2017 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 13 Jul 2017 00:23:39 +0200 Subject: [Python-ideas] PEP: Hide implementation details in the C API In-Reply-To: References: Message-ID: Victor Stinner wrote on 11.07.2017 at 12:19: > Split the ``Include/`` directory of CPython: > > * ``python`` API: ``Include/Python.h`` remains the default C API > * ``core`` API: ``Include/core/Python.h`` is a new C API designed for > building Python > * ``stable`` API: ``Include/stable/Python.h`` is the stable ABI > [...] > Step 3: first pass of implementation detail removal > --------------------------------------------------- > > Modify the ``python`` API: > > * Add a new ``API`` subdirectory in the Python source code which will > "implement" the Python C API > * Replace macros with functions. The implementation of new functions > will be written in the ``API/`` directory. For example, Py_INCREF() > becomes the function ``void Py_INCREF(PyObject *op)`` and its > implementation will be written in the ``API`` directory. > * Slowly remove more and more implementation details from this API. From a Cython perspective, it's (not great but) ok if these "implementation details" were moved somewhere else, but it would be a problem if they became entirely unavailable for external modules. Cython uses some of the internals for performance reasons, and we adapt it to changes of these internals whenever necessary. The question then arises if this proposal fulfills its intended purpose if Cython-based tools like NumPy or lxml continued to use internal implementation details in their Cython-generated C code. Specifically because that code is generated, I find it acceptable that it actively exploits non-portable details, because it already takes care of adapting to different Python platforms anyway.
Cython has incorporated support for CPython, PyPy and Pyston that way; adding others is probably not difficult, and optimising for a specific one (usually CPython) is also easy. The general rule of thumb in Cython core development is that it's ok to exploit internals as long as there is a generic fallback through some C-API operations which can be used in other Python implementations. I'd be happy if that continued to be supported by CPython in the future. Exposing CPython internals is a good thing! :) Stefan From greg at krypto.org Wed Jul 12 21:02:43 2017 From: greg at krypto.org (Gregory P. Smith) Date: Thu, 13 Jul 2017 01:02:43 +0000 Subject: [Python-ideas] Data Classes (was: Re: JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)) In-Reply-To: <3c1923e6-19c3-eeba-39ec-b5d0b406234a@trueblade.com> References: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> <3c1923e6-19c3-eeba-39ec-b5d0b406234a@trueblade.com> Message-ID: On Thu, May 18, 2017 at 6:38 PM Eric V. Smith wrote: > On 5/18/17 2:26 PM, Sven R. Kunze wrote: > > On 17.05.2017 23:29, Ivan Levkivskyi wrote: > >> the idea is to write it into a PEP and consider API/corner > >> cases/implementation/etc. > > > > Who's writing it? > > Guido, Hynek, and I met today. I'm writing up our notes, and hopefully > that will eventually become a PEP. I'm going to propose calling this > feature "Data Classes" as a placeholder until we come up with something > better. > > Once I have something readable, I'll open it up for discussion. > Did anything PEP-ish ever come out of this? -gps -------------- next part -------------- An HTML attachment was scrubbed...
URL: From guido at python.org Wed Jul 12 21:48:56 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 12 Jul 2017 18:48:56 -0700 Subject: [Python-ideas] Data Classes (was: Re: JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)) In-Reply-To: References: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> <3c1923e6-19c3-eeba-39ec-b5d0b406234a@trueblade.com> Message-ID: I hope it will happen. https://github.com/ericvsmith/dataclasses On Wed, Jul 12, 2017 at 6:02 PM, Gregory P. Smith wrote: > > On Thu, May 18, 2017 at 6:38 PM Eric V. Smith wrote: > >> On 5/18/17 2:26 PM, Sven R. Kunze wrote: >> > On 17.05.2017 23:29, Ivan Levkivskyi wrote: >> >> the idea is to write it into a PEP and consider API/corner >> >> cases/implementation/etc. >> > >> > Who's writing it? >> >> Guido, Hynek, and I met today. I'm writing up our notes, and hopefully >> that will eventually become a PEP. I'm going to propose calling this >> feature "Data Classes" as a placeholder until we come up with something >> better. >> >> Once I have something readable, I'll open it up for discussion. >> > > Did anything PEP-ish ever come out of this? > > -gps > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Wed Jul 12 21:10:35 2017 From: eric at trueblade.com (Eric V. 
Smith) Date: Wed, 12 Jul 2017 21:10:35 -0400 Subject: [Python-ideas] Data Classes (was: Re: JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)) In-Reply-To: References: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> <3c1923e6-19c3-eeba-39ec-b5d0b406234a@trueblade.com> Message-ID: <6CA23670-BC9A-4394-B4C2-A9CF7E32B93E@trueblade.com> > On Jul 12, 2017, at 9:02 PM, Gregory P. Smith wrote: > > >> On Thu, May 18, 2017 at 6:38 PM Eric V. Smith wrote: >> On 5/18/17 2:26 PM, Sven R. Kunze wrote: >> > On 17.05.2017 23:29, Ivan Levkivskyi wrote: >> >> the idea is to write it into a PEP and consider API/corner >> >> cases/implementation/etc. >> > >> > Who's writing it? >> >> Guido, Hynek, and I met today. I'm writing up our notes, and hopefully >> that will eventually become a PEP. I'm going to propose calling this >> feature "Data Classes" as a placeholder until we come up with something >> better. >> >> Once I have something readable, I'll open it up for discussion. > > Did anything PEP-ish ever come out of this? > > -gps The real world has intruded on my "Data Classes" time. But when I can, I'll get back to it and complete it. Hopefully I can spend all of the Core Sprint time on it. Eric. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Jul 12 23:33:42 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 13 Jul 2017 13:33:42 +1000 Subject: [Python-ideas] PEP: Hide implementation details in the C API In-Reply-To: References: Message-ID: On 13 July 2017 at 08:23, Stefan Behnel wrote: > The general rule of thumb in Cython core development is that it's ok to > exploit internals as long as there is a generic fallback through some C-API > operations which can be used in other Python implementations. I'd be happy > if that continued to be supported by CPython in the future. Exposing > CPython internals is a good thing! 
:) +1 This is my major motivation for suggesting "Include/cpython/" as the directory for the header files that define a supported API that is specific to CPython - it helps make it clear to other implementations that it's OK to go beyond the portable Python C API, but such API extensions should be clearly flagged as implementation specific so that consumers can make an informed decision as to which level they want to target. I do want to revise my naming suggestions slightly though: I think it would make sense for the internal APIs (the ones already guarded by Py_BUILD_CORE) to be under "Include/_core/", where the leading underscore helps to emphasise "if you are not working on CPython itself, you should not be going anywhere near these header files". I think the other key point to clarify will be API versioning, since that will flow through to things like the C ABI compatibility tags in the binary wheel format. Currently [1], that looks like:

cp35m    # Specifically built for CPython 3.5 with PyMalloc
cp35dm   # Debugging enabled
cp3_10m  # Disambiguation uses underscores
pp18     # It's the implementation version, not the Python version

There's currently only one tag for the stable ABI:

abi3     # Built for the stable ABI as of Python 3.2

So I think the existing Py_LIMITED_API/stable ABI is the right place for the strict "No public structs!" policy that completely decouples extension modules from CPython internals. We'll just need to refine the definition of the compatibility tags so that folks can properly indicate the minimum required version of that API:

abi3     # Py_LIMITED_API=0x03020000
abi32    # Py_LIMITED_API=0x03020000
abi33    # Py_LIMITED_API=0x03030000
abi34    # Py_LIMITED_API=0x03040000
abi35    # Py_LIMITED_API=0x03050000
etc...
Where Py_PORTABLE_API would come in is that it could be less strict on the "no public structs" rule (allowing some structs to be exposed as needed to enable building key projects like NumPy and lxml), and instead represent an API that offered source code and extension module portability across Python implementations, rather than strict ABI stability across versions.

api37    # Py_PORTABLE_API=0x03070000
api38    # Py_PORTABLE_API=0x03080000
api39    # Py_PORTABLE_API=0x03090000
api3_10  # Py_PORTABLE_API=0x030A0000
etc...

Cheers, Nick. [1] https://www.python.org/dev/peps/pep-0425/#details -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Jul 12 23:43:14 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 13 Jul 2017 13:43:14 +1000 Subject: [Python-ideas] Data Classes (was: Re: JavaScript-Style Object Creation in Python (using a constructor function instead of a class to create objects)) In-Reply-To: <6CA23670-BC9A-4394-B4C2-A9CF7E32B93E@trueblade.com> References: <17a0b553-161b-cff5-9ad1-8710260d7ec6@mail.de> <3c1923e6-19c3-eeba-39ec-b5d0b406234a@trueblade.com> <6CA23670-BC9A-4394-B4C2-A9CF7E32B93E@trueblade.com> Message-ID: On 13 July 2017 at 11:10, Eric V. Smith wrote: > The real world has intruded on my "Data Classes" time. But when I can, I'll > get back to it and complete it. Hopefully I can spend all of the Core Sprint > time on it. One of the presentations at this year's PyCon Australia Education Seminar is on teaching OOP concepts with Python [1], so I'll make sure to bring this topic up with Bruce (the presenter) and other attendees. While https://github.com/ericvsmith/dataclasses/blob/master/pep-xxxx.rst is clearly still a work in progress, there's enough there for me to at least collect some first impressions :) Cheers, Nick.
[1] https://2017.pycon-au.org/schedule/presentation/94/ -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Thu Jul 13 07:38:48 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 13 Jul 2017 13:38:48 +0200 Subject: [Python-ideas] PEP: Hide implementation details in the C API In-Reply-To: References: Message-ID: 2017-07-13 0:23 GMT+02:00 Stefan Behnel : > From a Cython perspective, it's (not great but) ok if these "implementation > details" were moved somewhere else, but it would be a problem if they > became entirely unavailable for external modules. Cython uses some of the > internals for performance reasons, and we adapt it to changes of these > internals whenever necessary. I don't want to break the Python world, or my project will just fail. For me, it's ok if Cython or even numpy use the full CPython C API with structures and macros to get the best performances. But we need something like the PEP 399 for C extensions: https://www.python.org/dev/peps/pep-0399/ The best would be if C extensions had two compilation modes:

* "Optimize for CPython" (with impl. detail)
* "Use the smaller portable C API" (no impl. detail)

For example, use the new private Python 3.6 _PyObject_FastCall() if available, but fall back on PyObject_Call() (or another similar function) otherwise. Once we are able to compile in the two "modes", it becomes possible to run benchmarks and decide if it's worth it. For extensions written with Cython, I expect that Cython will take care of that. The problem is more for C code written manually. The best would be to keep the "optimized" code as small as possible and mostly write "portable" code.
Victor From victor.stinner at gmail.com Thu Jul 13 07:46:50 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 13 Jul 2017 13:46:50 +0200 Subject: [Python-ideas] PEP: Hide implementation details in the C API In-Reply-To: References: <0A294574-B7A4-4764-98D2-5584278CD757@mac.com> Message-ID: 2017-07-12 20:51 GMT+02:00 Brett Cannon : > I also think the motivation doesn't have to be performance but simply > cleaning up how we expose our C APIs to users as shown by the fact we have > messed up the stable API by making it opt-out instead of opt-in. It's hard to sell a "cleanup" to users with no carrot :-) Does anyone remember trying to sell the "Python 3 cleanup"? :-) > Yeah, that's also what I initially thought. Use a cascading hierarchy so > that people know they should put anything as high up as possible to minimize > its exposure. Yeah, maybe we can do that. I have to run my own experiments to make sure that #include doesn't leak symbols by mistake and that it's still possible to use optimized macros or functions in builtin modules. > I think Victor has long-term plans to try and hide the struct details at a > higher level and so that would make macros a bad thing. But ignoring the > specific Py_INCREF/DECREF example, switching to functions does buy us the > ability to actually change the function implementations between Python > versions compared to having to worry about what a macro used to do (which is > a possibility with the stable ABI). I think that my PEP is currently badly written :-) In fact, the idea is just to make the stable ABI usable :-) Instead of hiding structures *and* removing macros, my idea is just to hide structures but still provide macros... as functions. Basically, it will be the same API, but usable across more implementations of Python.
Victor From ncoghlan at gmail.com Thu Jul 13 09:21:39 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 13 Jul 2017 23:21:39 +1000 Subject: [Python-ideas] PEP: Hide implementation details in the C API In-Reply-To: References: <0A294574-B7A4-4764-98D2-5584278CD757@mac.com> Message-ID: On 13 July 2017 at 21:46, Victor Stinner wrote: > 2017-07-12 20:51 GMT+02:00 Brett Cannon : >> I think Victor has long-term plans to try and hide the struct details at a >> higher level and so that would make macros a bad thing. But ignoring the >> specific Py_INCREF/DECREF example, switching to functions does buy us the >> ability to actually change the function implementations between Python >> versions compared to having to worry about what a macro used to do (which is >> a possibility with the stable ABI). > > I think that my PEP is currently badly written :-) > > In fact, the idea is just to make the stable ABI usable :-) Instead of > hiding structures *and* removing macros, my idea is just to hide > structures but still provide macros... as functions. Basically, it > will be the same API, but usable across more implementations of > Python. As far as I know, this isn't really why folks find the stable ABI hard to switch to. Rather, I believe it's because switching to the stable ABI means completely changing how you define classes to be closer to the way you define them from Python code. That's why I like the idea of defining a "portable" API that *doesn't* adhere to the "no public structs" rule - if we can restore support for static class declarations (which requires exposing all the static method structs as well as the object header structs, although perhaps with obfuscated field names to avoid any dependency on the details of CPython's reference counting model), I think such an API would have dramatically lower barriers to adoption than the stable ABI does. Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Thu Jul 13 11:35:18 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 13 Jul 2017 17:35:18 +0200 Subject: [Python-ideas] PEP: Hide implementation details in the C API In-Reply-To: References: <0A294574-B7A4-4764-98D2-5584278CD757@mac.com> Message-ID: 2017-07-13 15:21 GMT+02:00 Nick Coghlan : > As far as I know, this isn't really why folks find the stable ABI hard > to switch to. Rather, I believe it's because switching to the stable > ABI means completely changing how you define classes to be closer to > the way you define them from Python code. > > That's why I like the idea of defining a "portable" API that *doesn't* > adhere to the "no public structs" rule - if we can restore support for > static class declarations (which requires exposing all the static > method structs as well as the object header structs, although perhaps > with obfuscated field names to avoid any dependency on the details of > CPython's reference counting model), I think such an API would have > dramatically lower barriers to adoption than the stable ABI does. I am not aware of this issue. Can you give an example of a missing > feature in the stable ABI? Or maybe an example of a class definition > in C which cannot be implemented with the stable ABI? Victor From ronaldoussoren at mac.com Thu Jul 13 12:11:26 2017 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 13 Jul 2017 18:11:26 +0200 Subject: [Python-ideas] PEP: Hide implementation details in the C API In-Reply-To: References: <0A294574-B7A4-4764-98D2-5584278CD757@mac.com> Message-ID: <5E6F0295-8B89-42A9-BA23-84A02DAB1CD2@mac.com> > On 12 Jul 2017, at 20:51, Brett Cannon wrote: > > > > On Wed, 12 Jul 2017 at 01:25 Ronald Oussoren > wrote: > > > On 11 Jul 2017, at 12:19, Victor Stinner > wrote: > > > > Hi, > > > > This is the first draft of a big (?)
project to prepare CPython to be > > able to "modernize" its implementation. Proposed changes should allow > > to make CPython more efficient in the future. The optimizations > > themself are out of the scope of the PEP, but some examples are listed > > to explain why these changes are needed. > > I?m not sure if hiding implementation details will help a lot w.r.t. making CPython more efficient, but cleaning up the public API would avoid accidentally depending on non-public information (and is sound engineering anyway). That said, a lot of care should be taken to avoid breaking existing extensions as the ease of writing extensions is one of the strong points of CPython. > > I also think the motivation doesn't have to be performance but simply cleaning up how we expose our C APIs to users as shown by the fact we have messed up the stable API by making it opt-out instead of opt-in. I agree with this. [?] > > > > Step 3: first pass of implementation detail removal > > --------------------------------------------------- > > > > Modify the ``python`` API: > > > > * Add a new ``API`` subdirectory in the Python source code which will > > "implement" the Python C API > > * Replace macros with functions. The implementation of new functions > > will be written in the ``API/`` directory. For example, Py_INCREF() > > becomes the function ``void Py_INCREF(PyObject *op)`` and its > > implementation will be written in the ``API`` directory. > > In this particular case (Py_INCREF/DECREF) making them functions isn?t really useful and is likely to be harmful for performance. It is not useful because these macros manipulate state in a struct that must be public because that struct is included into the structs for custom objects (PyObject_HEAD). Having them as macro?s also doesn?t preclude moving to indirect reference counts. Moving to anything that isn?t reference counts likely needs changes to the API (but not necessarily, see PyPy?s cpext). 
> > I think Victor has long-term plans to try and hide the struct details at a higher level and so that would make macros a bad thing. But ignoring the specific Py_INCREF/DECREF example, switching to functions does buy us the ability to actually change the function implementations between Python versions compared to having to worry about what a macro used to do (which is a possibility with the stable ABI). I don't understand. Moving to functions instead of macros for something doesn't really help with keeping the public API stable (for the non-stable ABI). Avoiding macros does help with keeping more of the object internals hidden, and possibly easier to change within a major release, but doesn't help (or hinder) changing the implementation of an API. AFAIK there is no API stability guarantee for the details of the struct definitions for object representation, which is why it was possible to change the dict representation for CPython 3.6, and the str representation earlier. I wouldn't mind having to explicitly opt in to getting access to those internals, but removing them from public headers altogether does have a cost. > > > > * Slowly remove more and more implementation details from this API. > > > > Modifications of these APIs should be driven by tests of popular third > party packages like: > > * Django with database drivers > > * numpy > > * scipy > > * Pillow > > * lxml > > * etc. > > > > Compilation errors on these extensions are expected. This step should > > help to draw a line for the backward incompatible change. > > This could also help to find places where the documented API is not sufficient. One of the places where I poke directly into implementation details is a C-level subclass of str (PyUnicode_Type). I'd prefer not doing that, but AFAIK there is no other way to be string-like to the C API other than by being a subclass of str.
> > Yeah, this would allow us to very clearly know what should or should not be documented (I would say the same for the stdlib but we all know old code didn't hide things with a leading underscore consistently). I tried to write about how this could help to evolve the API by exposing documented APIs or features for things where extensions currently directly peek and poke into implementation details. Moving away from private stuff is a lot easier when there are sanctioned alternatives :-) > > > BTW. The reason I need to subclass str: in PyObjC I use a subclass of str to represent Objective-C strings (NSString/NSMutableString), and I need to keep track of the original value; mostly because there are some Objective-C APIs that use object identity. The worst part is that fully initialising the PyUnicodeObject fields often isn't necessary as a lot of Objective-C strings aren't used as strings in Python code. > > > > > > > Enhancements becoming possible thanks to a new C API > > ==================================================== > > > > Indirect Reference Counting > > --------------------------- > > > > * Replace ``Py_ssize_t ob_refcnt;`` (integer) > > with ``Py_ssize_t *ob_refcnt;`` (pointer to an integer). > > * Same change for GC headers? > > * Store all reference counters in a separated memory block > > (or maybe multiple memory blocks) > > This could be done right now with a minimal change to the API: just make the ob_refcnt and ob_type fields of the PyObject struct private by renaming them; in Py3 the documented way to access these fields is through function macros and these could be changed to do indirect refcounting instead. > > I think this is why Victor wants functions, because even if you change the names the macros will be locked into their implementations if you try to write code that supports multiple versions and so you can't change it per-version of Python. I really don't understand.
The macros are part of the code for a version of Python and can be changed when necessary between Python versions; the only advantage of functions is that it's easier to tweak the implementation in patch releases. BTW. As I mentioned before, the PyObject struct is one that cannot be made private without major changes because that struct is included in all extension object definitions by way of PyObject_HEAD. But anyway, that's just a particular example and doesn't mean we cannot hide any implementation details. Ronald P.S. I've surfaced because I'm at EuroPython, and experience teaches that I'll likely submerge again afterwards even if I'd prefer not to do so :-( -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Thu Jul 13 12:29:06 2017 From: brett at python.org (Brett Cannon) Date: Thu, 13 Jul 2017 16:29:06 +0000 Subject: [Python-ideas] PEP: Hide implementation details in the C API In-Reply-To: <5E6F0295-8B89-42A9-BA23-84A02DAB1CD2@mac.com> References: <0A294574-B7A4-4764-98D2-5584278CD757@mac.com> <5E6F0295-8B89-42A9-BA23-84A02DAB1CD2@mac.com> Message-ID: On Thu, 13 Jul 2017 at 09:12 Ronald Oussoren wrote: > On 12 Jul 2017, at 20:51, Brett Cannon wrote: > > > > On Wed, 12 Jul 2017 at 01:25 Ronald Oussoren > wrote: > >> >> > On 11 Jul 2017, at 12:19, Victor Stinner >> wrote: >> > [SNIP] > > > >> > Step 3: first pass of implementation detail removal >> > --------------------------------------------------- >> > >> > Modify the ``python`` API: >> > >> > * Add a new ``API`` subdirectory in the Python source code which will >> > "implement" the Python C API >> > * Replace macros with functions. The implementation of new functions >> > will be written in the ``API/`` directory. For example, Py_INCREF() >> > becomes the function ``void Py_INCREF(PyObject *op)`` and its >> > implementation will be written in the ``API`` directory.
>> >> In this particular case (Py_INCREF/DECREF) making them functions isn?t >> really useful and is likely to be harmful for performance. It is not useful >> because these macros manipulate state in a struct that must be public >> because that struct is included into the structs for custom objects >> (PyObject_HEAD). Having them as macro?s also doesn?t preclude moving to >> indirect reference counts. Moving to anything that isn?t reference counts >> likely needs changes to the API (but not necessarily, see PyPy?s cpext). >> > > I think Victor has long-term plans to try and hide the struct details at a > higher-level and so that would make macros a bad thing. But ignoring the > specific Py_INCREF/DECREF example, switching to functions does buy us the > ability to actually change the function implementations between Python > versions compared to having to worry about what a macro used to do (which > is a possibility with the stable ABI). > > > I don?t understand. Moving too functions instead of macros for some thing > doesn?t really help with keeping the public API stable (for the non-stable > ABI). > Sorry, I didn't specify which ABI/API I was talking about; my point was from the stable ABI. I think this is quickly showing how naming is going to play into this since e.g. we say "stable ABI" but call it "Py_LIMITED_API" in the code which is rather confusing. Just to make sure I'm not missing anything, it seems we have a few levels here: 1. The stable A**B**I which is compatible across versions 2. A stable A**P**I which hides enough details that if we change a struct your code won't require an update, just a recompile 3. An API that exposes CPython-specific details such as structs and other details that might not be entirely portable to e.g. PyPy easily but that we try not to break 4. An internal API that we use for implementing the interpreter but don't expect anyone else to use, so we can break it between feature releases (although if e.g. 
Cython chooses to use it they can) (There's also an API local to a single file, but since that is never exported to the linker it doesn't come into play here.) So, a portable API/ABI, a stable API, a CPython API, and then an internal/core/interpreter API. Correct? -------------- next part -------------- An HTML attachment was scrubbed... URL: From ronaldoussoren at mac.com Thu Jul 13 12:30:37 2017 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 13 Jul 2017 18:30:37 +0200 Subject: [Python-ideas] PEP: Hide implementation details in the C API In-Reply-To: References: <0A294574-B7A4-4764-98D2-5584278CD757@mac.com> Message-ID: > On 13 Jul 2017, at 13:46, Victor Stinner wrote: > > 2017-07-12 20:51 GMT+02:00 Brett Cannon : >> I also think the motivation doesn't have to be performance but simply >> cleaning up how we expose our C APIs to users as shown by the fact we have >> messed up the stable API by making it opt-out instead of opt-in. > > It's hard to sell a "cleanup" to users with no carrot :-) Did someone > remind trying to sell the "Python 3 cleanup"? :-) But then there should actually be a carrot ;-). Just declaring the contents of object definitions private in the documentation could also help, especially when adding preprocessor guards to enable access to those definitions. Consulting adults etc? > >> Yeah, that's also what I initially thought. Use a cascading hierarchy so >> that people know they should put anything as high up as possible to minimize >> its exposure. > > Yeah, maybe we can do that. > > I have to make my own experiment to make sure that #include doesn't > leak symbols by mistakes and that it's still possible to use optimized > macros or functions in builtin modules. Avoiding symbol leaks with a cascading hierarchy should be easy enough, some care may be needed to be able to override definitions in the ?more private? headers, especially when making current macros available as functions in the more public headers. 
Although it could be considered to just remove macros like PyTuple_GET_ITEM from the most public layer. > > >> I think Victor has long-term plans to try and hide the struct details at a >> higher-level and so that would make macros a bad thing. But ignoring the >> specific Py_INCREF/DECREF example, switching to functions does buy us the >> ability to actually change the function implementations between Python >> versions compared to having to worry about what a macro used to do (which is >> a possibility with the stable ABI). > > I think that my PEP is currently badly written :-) > > In fact, the idea is just to make the stable ABI usable :-) Instead of > hiding structures *and* remove macros, my idea is just to hide > structures but still provides macros... as functions. Basically, it > will be the same API, but usable on more various implementations of > Python. It might be better to push users towards tools like ctypes and cffi, the latter especially is tuned to work both with CPython and PyPy and appears to gain momentum. That won?t work for everything, but could work for a large subset of extensions. Ronald From ncoghlan at gmail.com Fri Jul 14 00:13:19 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 14 Jul 2017 14:13:19 +1000 Subject: [Python-ideas] PEP: Hide implementation details in the C API In-Reply-To: References: <0A294574-B7A4-4764-98D2-5584278CD757@mac.com> Message-ID: On 14 July 2017 at 01:35, Victor Stinner wrote: > 2017-07-13 15:21 GMT+02:00 Nick Coghlan : >> As far as I know, this isn't really why folks find the stable ABI hard >> to switch to. Rather, I believe it's because switching to the stable >> ABI means completely changing how you define classes to be closer to >> the way you define them from Python code. 
>> >> That's why I like the idea of defining a "portable" API that *doesn't* >> adhere to the "no public structs" rule - if we can restore support for >> static class declarations (which requires exposing all the static >> method structs as well as the object header structs, although perhaps >> with obfuscated field names to avoid any dependency on the details of >> CPython's reference counting model), I think such an API would have >> dramatically lower barriers to adoption than the stable ABI does. > > I am not aware of this issue. Can you give an example of missing > feature in the stable ABI? Or maybe an example of a class definition > in C which cannot be implemented with the stable ABI? Pretty much all the type definitions in CPython except the ones in https://github.com/python/cpython/blob/master/Modules/xxlimited.c will fail on the stable ABI :) It's not that they *can't* be ported to the stable ABI, it's that they *haven't* been, and there isn't currently any kind of code generator to automate the conversion process. For the standard library, the lack of motivation comes from the fact that we recompile for every version anyway, so there's nothing specific to be gained from switching to compiling optional extension modules under the stable ABI instead of the default CPython API. For third party projects, the problem is that they need to continue using static type declarations if they want to support Python 2.7, so using static type declarations for both Py2 and Py3 is a more attractive option than defining their types differently depending on the version. As folks start dropping Python 2.7 support *then* the stable ABI starts to become a more attractive option, as it should let them significantly reduce the number of wheels they publish to PyPI *without* having to maintain two different ways of defining types (assuming we redefine the stable ABI compatibility tags to let people specify a minimum required version that's higher than 3.2). Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri Jul 14 00:26:53 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 14 Jul 2017 14:26:53 +1000 Subject: [Python-ideas] PEP: Hide implementation details in the C API In-Reply-To: References: <0A294574-B7A4-4764-98D2-5584278CD757@mac.com> <5E6F0295-8B89-42A9-BA23-84A02DAB1CD2@mac.com> Message-ID: On 14 July 2017 at 02:29, Brett Cannon wrote: > On Thu, 13 Jul 2017 at 09:12 Ronald Oussoren wrote: >> I don't understand. Moving to functions instead of macros for some things >> doesn't really help with keeping the public API stable (for the non-stable >> ABI). > > > Sorry, I didn't specify which ABI/API I was talking about; my point was from > the stable ABI. > > I think this is quickly showing how naming is going to play into this since > e.g. we say "stable ABI" but call it "Py_LIMITED_API" in the code which is > rather confusing. I honestly think we should just change that symbol to Py_STABLE_ABI (with Py_LIMITED_API retained as a backwards compatibility feature). Yes, Py_LIMITED_API is technically more correct, but it's confusing in practice, while "Py_STABLE_ABI" matches the user's intent: "make sure my binaries only depend on the stable ABI". > Just to make sure I'm not missing anything, it seems we have a few levels > here: > > 1. The stable A**B**I which is compatible across versions > 2. A stable A**P**I which hides enough details that if we change a struct > your code won't require an update, just a recompile I don't think we want to promise that the portable API will be completely backwards compatible over time - unlike the stable ABI, it should be subject to Python's normal deprecation policy (i.e. if an API emits a deprecation warning in X.Y, we may remove it entirely in X.Y+1). Instead, I think the key promises of the portable API should be: 1. It only exposes interfaces that are genuinely portable across at least CPython and PyPy 2.
It adheres as closely to the stable ABI as it can, with additions made *solely* to support the building of existing popular extension modules (e.g. by adding back static type declaration support) > 3. An API that exposes CPython-specific details such as structs and other > details that might not be entirely portable to e.g. PyPy easily but that we > try not to break > 4. An internal API that we use for implementing the interpreter but don't > expect anyone else to use, so we can break it between feature releases > (although if e.g. Cython chooses to use it they can) > > (There's also an API local to a single file, but since that is never > exported to the linker it doesn't come into play here.) > > So, a portable API/ABI, a stable API, a CPython API, and then an > internal/core/interpreter API. Correct? Not quite: - stable ABI (strict extension module compatibility policy) - portable API (no ABI stability guarantees, normal deprecation policy) - public CPython API (no cross-implementation portability guarantees) - internal-only CPython core API (arbitrary changes, no deprecation warnings) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From jeff.walker00 at yandex.com Sat Jul 15 20:12:13 2017 From: jeff.walker00 at yandex.com (Jeff Walker) Date: Sat, 15 Jul 2017 18:12:13 -0600 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <22883.26308.389706.743344@turnbull.sk.tsukuba.ac.jp> References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> <22883.26308.389706.743344@turnbull.sk.tsukuba.ac.jp> Message-ID: <1169061500163933@web18o.yandex.ru> Sorry Stephen (and Steven). I'll do better next time. 
The way I see it there are two problems, and the second causes the first. The first problem is that there is no direct access to the components that make up the error in some of the standard Python exceptions.

>>> foo
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'foo' is not defined

If you need access to the name, you must de-construct the error message. To get direct access to the name, it would need to be passed to the exception when raised. Why wasn't that done? That leads us to the second problem: the base exception does not handle arguments gracefully unless you only pass an error message all by itself. For example:

>>> try:
...     name = 'foo'
...     raise NameError('%s: name is not defined.' % name)
... except NameError as e:
...     print(str(e))
...
foo: name is not defined.

Here, printing the exception cast to a string produces a reasonable error message.

>>> try:
...     name = 'foo'
...     raise NameError(name, '%s: name is not defined.' % name)
... except NameError as e:
...     print(str(e))
...
('foo', 'foo: name is not defined.')

In this case, printing the exception cast to a string does not result in a reasonable error message. So the basic point is that the lack of reasonable behavior for str(e) when passing multiple arguments encourages encapsulating everything into an error message, which makes it difficult to do many useful things when handling the exceptions. Steven seems to object to the fact that the proposal takes arbitrary keyword arguments. I think this is the point of the 'nym' example. But in fact that is not really the point of the proposal; it is just a minor design choice. Instead, Steven recommends creating a custom exception with explicitly declared named arguments. And I agree with him that that is the right thing to do in many cases. But sometimes you just want something quick that you can use that does not require you to define your own exception. And sometimes, even though people should define custom exceptions, they don't.
For example, NameError, KeyError, IndexError, ... -Jeff From rosuav at gmail.com Sat Jul 15 20:33:15 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 16 Jul 2017 10:33:15 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <1169061500163933@web18o.yandex.ru> References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> <22883.26308.389706.743344@turnbull.sk.tsukuba.ac.jp> <1169061500163933@web18o.yandex.ru> Message-ID: On Sun, Jul 16, 2017 at 10:12 AM, Jeff Walker wrote: > The first problem is that there is no direct access to the components that make up the error in some of the standard Python exceptions. > > >>> foo > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > NameError: name 'foo' is not defined > > If you need access to the name, you must de-construct the error message. To get direct access to the name, it would need to be passed to the exception when raised. Why wasn't that done? > Because it normally isn't needed. Can you give an example of where NameError could legitimately be raised from multiple causes? Most of the time, NameError is either being used to probe a feature (eg raw_input vs input), or indicates a bug (in which case you just let the exception get printed out as is). When do you actually need access to that name?
ChrisA From ncoghlan at gmail.com Sun Jul 16 00:18:40 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 16 Jul 2017 14:18:40 +1000 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <1169061500163933@web18o.yandex.ru> References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> <22883.26308.389706.743344@turnbull.sk.tsukuba.ac.jp> <1169061500163933@web18o.yandex.ru> Message-ID: On 16 July 2017 at 10:12, Jeff Walker wrote: > Sorry Stephen (and Steven). I'll do better next time. > > The way I see it there are two problems, and the second causes the first. > > The first problem is that there is no direct access to the components that make up the error in some of the standard Python exceptions. > > >>> foo > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > NameError: name 'foo' is not defined > > If you need access to the name, you must de-construct the error message. To get direct access to the name, it would need to be passed to the exception when raised. Why wasn't that done? Ease of implementation mainly, as PyErr_Format is currently the simplest general purpose mechanism for raising informative exceptions from the C code: https://docs.python.org/3/c-api/exceptions.html#c.PyErr_Format All the more structured ones currently require expanding the C API to cover creating particular exception types with particular field values (the various PyErr_Set* functions listed on that page).
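As a rough pure-Python illustration of the structured exception Jeff is asking for (the subclass below is a hypothetical sketch, not an existing API):

```python
class StructuredNameError(NameError):
    """Sketch: a NameError that keeps the offending name as a real
    attribute instead of burying it in the message string."""

    def __init__(self, name):
        super().__init__("name %r is not defined" % name)
        self.name = name


try:
    raise StructuredNameError("foo")
except NameError as exc:
    # The handler can read the structured field directly...
    assert exc.name == "foo"
    # ...while str(exc) still gives the familiar message.
    assert str(exc) == "name 'foo' is not defined"
```

With something like PyErr_SetStructured in place, the interpreter's own NameError could carry the same `name` attribute without any message parsing.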
That does prompt a thought, though: what if there was an f-string style "PyErr_SetStructured" API, where instead of writing 'PyErr_Format(PyExc_NameError, "name '%.200s' is not defined", name);', you could instead write 'PyErr_SetStructured(PyExc_NameError, "name '{name:.200s}' is not defined", name);', and that effectively translated to setting a NameError exception with that message *and* setting its "name" attribute appropriately. The key benefit such a switch to f-string style formatting would provide is the ability to optionally *name* the fields in the exception message if they correspond to an instance attribute on the exception being raised. Actually implementing such an API wouldn't be easy by any means (and I'm not volunteering to do it myself), but it *would* address the main practical barrier to making structured exceptions more pervasive in the reference interpreter. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From cgfarrellchg at gmail.com Sun Jul 16 10:37:26 2017 From: cgfarrellchg at gmail.com (Connor Farrell) Date: Sun, 16 Jul 2017 10:37:26 -0400 Subject: [Python-ideas] Custom Code Folding: Standardized Rules and Syntax? Message-ID: Background: I work in scientific computing and use Community Pycharm IDE. I'm a religious follower of the 'readability counts' mantra, and two things I find myself doing often are: - Writing custom code folds to segregate code, from groups of classes in a file, to groups of lines in an individual function. While spacing works great to separate ideas, my IDE allows me to collapse the entirety of the code in exchange for a line of English. For my purposes, this enhances readability immensely, as first time users are confronted with an explanation of the contents, rather than the code itself with a comment on top. I find comments don't draw the eye, and also don't have the ability to their code as well. 
- Writing high level code, such as __init__ calls for large aggregates, with one keyworded argument per line (plus dict unpackings at the end), sort of like a simple XML file. This allows me to make parameters explicit for other users, and optionally provide a comment indicating physical units, cite sources, and/or give a list of tag/enum options for every parameter. In the end I have 30+ line inits, but the readability is 10x greater. My IDE doesn't yet offer to fold long parameter lists by default, but I think it makes sense. In the end, I end up with very well folded code (except for large parameter lists) and a bunch of co-workers asking about all the "editor-fold" comments that don't work in their API. Python was a trail-blazer in terms of emphasizing the importance of code readability and effective syntax. I think that we should consider some sort of standard for folding comments, if not only to promote stronger code organizations. I know standards don't usually interact with IDEs, but hey, the 'typing' module is pretty dang nice. TL;DR: Code folding is great, custom code folding is great, let's upgrade it to a language feature. Cheers -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sun Jul 16 11:15:59 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 17 Jul 2017 01:15:59 +1000 Subject: [Python-ideas] Custom Code Folding: Standardized Rules and Syntax? In-Reply-To: References: Message-ID: <20170716151559.GW3149@ando.pearwood.info> Hi Connor, and welcome! On Sun, Jul 16, 2017 at 10:37:26AM -0400, Connor Farrell wrote: > Background: I work in scientific computing and use Community Pycharm IDE. > > I'm a religious follower of the 'readability counts' mantra, and two things > I find myself doing often are: > - Writing custom code folds to segregate code, from groups of classes in a > file, to groups of lines in an individual function. 
While spacing works > great to separate ideas, my IDE allows me to collapse the entirety of the > code in exchange for a line of English. For my purposes, this enhances > readability immensely, as first time users are confronted with an > explanation of the contents, rather than the code itself with a comment on > top. I find comments don't draw the eye, and also don't have the ability to > their code as well. I'm afraid I'm having a lot of difficulty understanding this. I think the last sentence is missing a word. Comments don't have the ability to **what** their (whose?) code? Which IDE are you using? When you say it collapses the "entirety of the code", do you mean the entire file? > - Writing high level code, such as __init__ calls for large aggregates, > with one keyworded argument per line (plus dict unpackings at the end), > sort of like a simple XML file. Even if I accept that this is a reasonable design for __init__, I would not agree that it is a reasonable design for "high level code" in general. > This allows me to make parameters explicit > for other users, and optionally provide a comment indicating physical > units, cite sources, and/or give a list of tag/enum options for every > parameter. In the end I have 30+ line inits, but the readability is 10x > greater. Perhaps I might be convinced if I saw some actual code, but from your description alone, it doesn't sound particularly more readable. Why would I want to read citations in the parameter list of a method? I want to call the method, not do peer review on the theory behind it. > My IDE doesn't yet offer to fold long parameter lists by default, > but I think it makes sense. *shrug* Personally, I don't find code folding a big help. Perhaps once in a blue moon. I'm glad you like it and that it helps you. > In the end, I end up with very well folded code (except for large parameter > lists) and a bunch of co-workers asking about all the "editor-fold" > comments that don't work in their API. 
I'm afraid I'm not understanding you here either. What's an "editor-fold" comment? What do they mean by API? API for which application? How does the programming interface to an application relate to code folding in a text editor? > Python was a trail-blazer in terms of emphasizing the importance of code > readability and effective syntax. I think that we should consider some sort > of standard for folding comments, if not only to promote stronger code > organizations. I know standards don't usually interact with IDEs, but hey, > the 'typing' module is pretty dang nice. > > TL;DR: Code folding is great, custom code folding is great, let's upgrade > it to a language feature. What does that even mean? Are you suggesting that the Python interpreter should raise a SyntaxError or something if your code was written in an editor that didn't support code folding? How would it know? Python is a programming language. The source code is text. I should be able to write Python code in NotePad if I want. Why should the Python core devs try to force every text editor and IDE fold code exactly the same way? That sounds awful to me. People choose different editors because they like different features, and that may include the particular way the editor folds code. Or to not fold it at all. I'm sorry to be so negative, but I don't understand your proposal, and the bits that I *think* I understand sound pretty awful to me. Perhaps you can explain a bit better what you mean and why it should be a language feature, apart from "I want everyone to lay out their source code the way I do". Because that's what it sounds like to me. -- Steve From cgfarrellchg at gmail.com Sun Jul 16 12:42:10 2017 From: cgfarrellchg at gmail.com (Connor Farrell) Date: Sun, 16 Jul 2017 12:42:10 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 128, Issue 41 In-Reply-To: References: Message-ID: Thanks for your feedback, I guess I was a little unclear. 
In short, I was thinking of a pair of comment tokens (something like #<<, #>>, idk) that would indicate a code fold, like what virtually all IDEs do for classes and methods, but with more granularity. It would allow devs to better organize their code. I agree with what you're saying about separation of language and text editor, but we already have the *typing* module in 3.6, so improved linting and communication is apparently fair game. I have no desire to get the interpreter involved; this is pure linter. A good example would be something like:

class A:
    #<< INTERFACE
    def foo():
        ....
    def bar():
        ....
    #>>

    #<< BACKEND
    def _guts():
        ....
    #>>

Would become something like:

class A:
    >+ INTERFACE
    >+ BACKEND

Where *modern* editors should fold the code. It provides an optional additional layer of granularity above the current system, but has no effect on behaviour otherwise. It increases visual information density in complex functions, large classes, and lets you group classes. It looks stupid with only three functions, but at larger sizes, or with complex code, I'd rather see:

def func(f: Callable[[float], float]) -> None:
    >+ Input validation
    >+ Some complex algorithm
    >+ Some other complex algorithm
    >+ Generating plot

It adds a human explanation of the code within, additional level(s) of organization, and easier navigation. It's not for everyone, but I feel like it improves on the idea of modular code for active devs, without any drawbacks for library users.
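A quick sketch of the "pure linter" side of this: a checker that does nothing but pair up the proposed #<< / #>> tokens (the marker syntax is the straw-man above, not any accepted standard):

```python
import re

# Straw-man marker syntax from this thread -- not a standard.
OPEN = re.compile(r"^\s*#<<(?:\s+(?P<label>.*\S))?\s*$")
CLOSE = re.compile(r"^\s*#>>\s*$")


def check_folds(source):
    """Return a list of problems: unmatched #<< or #>> markers."""
    problems = []
    stack = []  # (lineno, label) of currently open folds
    for lineno, line in enumerate(source.splitlines(), 1):
        m = OPEN.match(line)
        if m:
            stack.append((lineno, m.group("label") or ""))
        elif CLOSE.match(line):
            if stack:
                stack.pop()
            else:
                problems.append("line %d: #>> without matching #<<" % lineno)
    for lineno, label in stack:
        problems.append("line %d: #<< %r never closed" % (lineno, label))
    return problems


code = """\
class A:
    #<< INTERFACE
    def foo(self): ...
    #>>
    #<< BACKEND
    def _guts(self): ...
"""
print(check_folds(code))  # the BACKEND fold opened on line 5 is never closed
```

Anything beyond this balance check (where to fold, how to display the label) would stay in the editor, which keeps the interpreter out of it entirely.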
On Sun, Jul 16, 2017 at 12:00 PM, wrote: > [...] -------------- next part -------------- An HTML attachment was scrubbed...
URL: From contact at brice.xyz Sun Jul 16 13:58:51 2017 From: contact at brice.xyz (Brice PARENT) Date: Sun, 16 Jul 2017 19:58:51 +0200 Subject: [Python-ideas] Custom Code Folding: Standardized Rules and Syntax? In-Reply-To: References: Message-ID: Hi Connor, I'm also a Pycharm user (Professional edition for me, but I'm not sure it changes anything here). I didn't know about this functionality (if you're referring to this: https://www.jetbrains.com/help/pycharm/code-folding.html#using_folding_comments at least):

//region Description
...
//endregion

But in general, I'm not a fan of having this type of meta-comment: they don't document the code or help in understanding what it's doing; they just help those who happen to use an IDE that implemented this and who know what it means. And the help it gives is not to show something, but to hide it. For everyone else, it's just polluting the file with nonsensical (to their eyes) words. Another aspect that I'm worried about when inserting things that don't really belong to the code: it will be committed with the rest of the code. And if you added those comments because they helped you during the development of a special part of the file, they will still get pushed to the repository whether they are still useful or not. And if they're not, it will cost another commit to clean up, along with many potential cascading and maybe automatic things (pull request, unit tests, build, push to production, which can be a huge thing if it's shared between many servers and services, for example). All for a comment that might be only usable with some IDEs, and maybe even only useful for 4 hours for a single developer. But... That doesn't mean that the functionality isn't useful. The IDE could manage an external gitignorable file containing metadata about which parts of the code could be folded as a pack. If there are many people who want this, they might consider implementing it. But I don't believe this should be implemented at Python's level.
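For what it's worth, the clean-up cost mentioned above could also be scripted away. A minimal sketch of a pre-commit pass that drops fold-marker-only comment lines (both marker styles matched here are just the ones discussed in this thread, nothing official):

```python
import re

# Lines containing nothing but a fold marker comment: the "#<<" / "#>>"
# straw-man, or PyCharm-style "# <editor-fold ...>" / "# </editor-fold>".
FOLD_ONLY = re.compile(
    r"^\s*#\s*(?:<<.*|>>\s*|<editor-fold\b.*|</editor-fold>\s*)$"
)


def strip_fold_comments(source):
    """Return source with fold-marker-only lines removed."""
    kept = [line for line in source.splitlines()
            if not FOLD_ONLY.match(line)]
    return "\n".join(kept) + "\n"


before = "#<< INTERFACE\ndef foo():\n    pass\n#>>\n"
print(strip_fold_comments(before))  # only the function body remains
```

Regular comments are untouched, since only lines that consist solely of a marker match the pattern.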
Code folding feels to me like something some of us may want while others may not, or not in the same cases. If I'm off topic (I might not have understood it very well), don't hesitate to tell me. And to provide examples! -Brice Le 16/07/17 à 16:37, Connor Farrell a écrit : > [...] -------------- next part -------------- An HTML attachment was scrubbed... URL: From contact at brice.xyz Sun Jul 16 14:26:55 2017 From: contact at brice.xyz (Brice PARENT) Date: Sun, 16 Jul 2017 20:26:55 +0200 Subject: [Python-ideas] Python-ideas Digest, Vol 128, Issue 41 In-Reply-To: References: Message-ID: Sorry I didn't see this answer (as the title changed, it was moved to another topic in my mailbox). So I still believe code-folding indications don't really belong in the source files. But what you're showing is interesting, and the code folding is just a consequence of it. They can act as chapters or sub-chapters, which might be a nice organization of the code. I'm not sure of the relevance of the closing token though: if they are like chapters, they describe the content until the next chapter, or until the scope ends (end of method, of function, of class, ...). So the IDE just needs to parse those comments and implement a code folding method based on them. It could simply be `### [description]`, like `### Security features`. I think it could be a good help in some cases, like in a settings.py file (like Django's, which can be pretty long). But there are downsides: - We should try to avoid writing classes, methods or functions that are too long. It often means that they should be split into smaller functionalities. This would probably not push the developers in the right direction if we asked them to use such a tool.
But I don't know if this big-pieces-of-code-is-a-code-smell-for-refactorisation is true for scientific development, as I imagine there can be some quite long pieces of code that really are meant to be together.

- It still enforces a way of coding. Some prefer to group their methods by functionality (everything related to security is grouped, for example), but others prefer to sort their methods alphabetically. Or by type of methods (higher level methods first, lower level last), etc. Also, what about magic methods, or methods that are shared between two or more functionalities?

So I'm not sure how I feel about this, the discussion and examples may be interesting (more than what I understood from the first mail anyway!).

-Brice

Le 16/07/17 à 18:42, Connor Farrell a écrit :
> Thanks for your feedback, I guess I was a little unclear. In short, I
> was thinking of a pair of comment tokens (something like #<<, #>>,
> idk) that would indicate a code fold, like what virtually all IDEs do
> for classes and methods, but with more granularity. It would allow
> devs to better organize their code. I agree with what you're saying
> about separation of language and text editor, but we already have the
> /typing/ module in 3.6, so improved linting and communication is
> apparently game. I have no desire to get the interpreter involved,
> this is pure linter. A good example would be something like:
>
> class A:
>     #<< INTERFACE
>     def foo():
>         ....
>     def bar():
>         ....
>     #>>
>
>     #<< BACKEND
>     def _guts():
>         ....
>     #>>
>
> Would become something like:
>
> class A:
>     >+ INTERFACE
>     >+ BACKEND
>
> Where *modern* editors should fold the code. It provides an optional
> additional layer of granularity above the current system, but has no
> effect on behaviour otherwise. It increases visual information density
> in complex functions, large classes, and lets you group classes.
> It looks stupid with only three functions, but at larger sizes, or with
> complex code, I'd rather see:
>
> def func(f: Callable[[float],float, float]) -> None:
>     >+ Input validation
>     >+ Some complex algorithm
>     >+ Some other complex algorithm
>     >+ Generating plot
>
> It adds a human explanation of the code within, an additional level(s)
> of organization, and easier navigation. It's not for everyone, but I
> feel like it improves on the idea of modular code for active devs,
> without any drawbacks for library users.
>
> On Sun, Jul 16, 2017 at 12:00 PM, wrote:
>
>     Send Python-ideas mailing list submissions to
>     python-ideas at python.org
>
>     To subscribe or unsubscribe via the World Wide Web, visit
>     https://mail.python.org/mailman/listinfo/python-ideas
>     or, via email, send a message with subject or body 'help' to
>     python-ideas-request at python.org
>
>     You can reach the person managing the list at
>     python-ideas-owner at python.org
>
>     When replying, please edit your Subject line so it is more specific
>     than "Re: Contents of Python-ideas digest..."
>
>     Today's Topics:
>
>        1. Re: Custom Code Folding: Standardized Rules and Syntax?
>           (Steven D'Aprano)
>
>     ----------------------------------------------------------------------
>
>     Message: 1
>     Date: Mon, 17 Jul 2017 01:15:59 +1000
>     From: Steven D'Aprano
>     To: python-ideas at python.org
>     Subject: Re: [Python-ideas] Custom Code Folding: Standardized Rules
>     and Syntax?
>     Message-ID: <20170716151559.GW3149 at ando.pearwood.info>
>     Content-Type: text/plain; charset=us-ascii
>
>     Hi Connor, and welcome!
>
>     On Sun, Jul 16, 2017 at 10:37:26AM -0400, Connor Farrell wrote:
>     > Background: I work in scientific computing and use Community Pycharm IDE.
> > > > I'm a religious follower of the 'readability counts' mantra, and > two things > > I find myself doing often are: > > - Writing custom code folds to segregate code, from groups of > classes in a > > file, to groups of lines in an individual function. While > spacing works > > great to separate ideas, my IDE allows me to collapse the > entirety of the > > code in exchange for a line of English. For my purposes, this > enhances > > readability immensely, as first time users are confronted with an > > explanation of the contents, rather than the code itself with a > comment on > > top. I find comments don't draw the eye, and also don't have the > ability to > > their code as well. > > I'm afraid I'm having a lot of difficulty understanding this. I think > the last sentence is missing a word. Comments don't have the > ability to > **what** their (whose?) code? > > Which IDE are you using? When you say it collapses the "entirety > of the > code", do you mean the entire file? > > > > - Writing high level code, such as __init__ calls for large > aggregates, > > with one keyworded argument per line (plus dict unpackings at > the end), > > sort of like a simple XML file. > > Even if I accept that this is a reasonable design for __init__, I > would > not agree that it is a reasonable design for "high level code" in > general. > > > > This allows me to make parameters explicit > > for other users, and optionally provide a comment indicating > physical > > units, cite sources, and/or give a list of tag/enum options for > every > > parameter. In the end I have 30+ line inits, but the readability > is 10x > > greater. > > Perhaps I might be convinced if I saw some actual code, but from your > description alone, it doesn't sound particularly more readable. Why > would I want to read citations in the parameter list of a method? > I want > to call the method, not do peer review on the theory behind it. 
> > > > My IDE doesn't yet offer to fold long parameter lists by default, > > but I think it makes sense. > > *shrug* > > Personally, I don't find code folding a big help. Perhaps once in > a blue > moon. I'm glad you like it and that it helps you. > > > > In the end, I end up with very well folded code (except for > large parameter > > lists) and a bunch of co-workers asking about all the "editor-fold" > > comments that don't work in their API. > > I'm afraid I'm not understanding you here either. What's an > "editor-fold" comment? What do they mean by API? API for which > application? How does the programming interface to an application > relate > to code folding in a text editor? > > > > Python was a trail-blazer in terms of emphasizing the importance > of code > > readability and effective syntax. I think that we should > consider some sort > > of standard for folding comments, if not only to promote > stronger code > > organizations. I know standards don't usually interact with > IDEs, but hey, > > the 'typing' module is pretty dang nice. > > > > TL;DR: Code folding is great, custom code folding is great, > let's upgrade > > it to a language feature. > > What does that even mean? Are you suggesting that the Python > interpreter > should raise a SyntaxError or something if your code was written in an > editor that didn't support code folding? How would it know? > > Python is a programming language. The source code is text. I should be > able to write Python code in NotePad if I want. Why should the Python > core devs try to force every text editor and IDE fold code exactly the > same way? That sounds awful to me. People choose different editors > because they like different features, and that may include the > particular way the editor folds code. Or to not fold it at all. > > I'm sorry to be so negative, but I don't understand your proposal, and > the bits that I *think* I understand sound pretty awful to me. 
> Perhaps you can explain a bit better what you mean and why it should be a
> language feature, apart from "I want everyone to lay out their source
> code the way I do". Because that's what it sounds like to me.
>
> --
> Steve
>
> ------------------------------
>
> Subject: Digest Footer
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
>
> ------------------------------
>
> End of Python-ideas Digest, Vol 128, Issue 41
> *********************************************
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mertz at gnosis.cx  Sun Jul 16 14:57:15 2017
From: mertz at gnosis.cx (David Mertz)
Date: Sun, 16 Jul 2017 11:57:15 -0700
Subject: [Python-ideas] Python-ideas Digest, Vol 128, Issue 41
In-Reply-To: References: Message-ID:

Many editors allow you to explicitly select blocks to fold rather than only basing it on explicit syntax in a code file. Obviously, the information on where those folds occurred is then generally stored somewhere apart from the text of the code itself. It sounds like you should choose an editor that does that, and/or add a macro/extension/mode to your favorite editor to behave as you like.

On Jul 16, 2017 11:27 AM, "Brice PARENT" wrote:
> Sorry I didn't see this answer (as the title changed, it was moved to
> another topic in my mailbox).
>
> So I still believe code-folding indications don't really belong to the
> source files. But what you're showing is interesting, and the code folding
> is just a consequence of it.
>
> It can be as chapters or sub-chapters, which might be a nice organization
> of the code.
> I'm not sure of the relevance of the closing code though: If they are like
> chapters, they describe the content until the next chapter, or until the
> scope ends (end of method, of function, of class, ...).
>
> So the IDE just needs to parse those comments and implement a code folding
> method based on it.
>
> It could simply be `### [description]`, like `### Security features`.
>
> I think it could be a good help in some cases, like in a settings.py file
> (like Django's, which can be pretty long).
>
> But there are downsides:
>
> - We should try to avoid writing classes, methods or functions that are
> too long. It often means that they should be split into smaller
> functionalities. This would probably not push the developers in the right
> direction if we asked them to use such a tool. But I don't know if this
> big-pieces-of-code-is-a-code-smell-for-refactorisation is true for
> scientific development, as I imagine there can be some quite long pieces of
> code that really are meant to be together.
>
> - It still enforces a way of coding. Some prefer to group their methods by
> functionality (everything related to security is grouped, for example),
> but others prefer to sort their methods alphabetically. Or by type of
> methods (higher level methods first, lower level last), etc. Also, what
> about magic methods, or methods that are shared between two or more
> functionalities?
>
> So I'm not sure how I feel about this, the discussion and examples may be
> interesting (more than what I understood from the first mail anyway!).
>
> -Brice
>
> Le 16/07/17 à 18:42, Connor Farrell a écrit :
>
> Thanks for your feedback, I guess I was a little unclear. In short, I was
> thinking of a pair of comment tokens (something like #<<, #>>, idk) that
> would indicate a code fold, like what virtually all IDEs do for classes and
> methods, but with more granularity. It would allow devs to better organize
> their code.
> I agree with what you're saying about separation of language
> and text editor, but we already have the *typing* module in 3.6, so
> improved linting and communication is apparently game. I have no desire to
> get the interpreter involved, this is pure linter. A good example would be
> something like:
>
> class A:
>     #<< INTERFACE
>     def foo():
>         ....
>     def bar():
>         ....
>     #>>
>
>     #<< BACKEND
>     def _guts():
>         ....
>     #>>
>
> Would become something like:
>
> class A:
>     >+ INTERFACE
>     >+ BACKEND
>
> Where *modern* editors should fold the code. It provides an optional
> additional layer of granularity above the current system, but has no effect
> on behaviour otherwise. It increases visual information density in complex
> functions, large classes, and lets you group classes. It looks stupid with
> only three functions, but at larger sizes, or with complex code, I'd rather
> see:
>
> def func(f: Callable[[float],float, float]) -> None:
>     >+ Input validation
>     >+ Some complex algorithm
>     >+ Some other complex algorithm
>     >+ Generating plot
>
> It adds a human explanation of the code within, an additional level(s) of
> organization, and easier navigation. It's not for everyone, but I feel like
> it improves on the idea of modular code for active devs, without any
> drawbacks for library users.
>
> On Sun, Jul 16, 2017 at 12:00 PM, wrote:
>
>> Send Python-ideas mailing list submissions to
>> python-ideas at python.org
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>> https://mail.python.org/mailman/listinfo/python-ideas
>> or, via email, send a message with subject or body 'help' to
>> python-ideas-request at python.org
>>
>> You can reach the person managing the list at
>> python-ideas-owner at python.org
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of Python-ideas digest..."
>>
>> Today's Topics:
>>
>>    1.
Re: Custom Code Folding: Standardized Rules and Syntax? >> (Steven D'Aprano) >> >> >> ---------------------------------------------------------------------- >> >> Message: 1 >> Date: Mon, 17 Jul 2017 01:15:59 +1000 >> From: Steven D'Aprano >> To: python-ideas at python.org >> Subject: Re: [Python-ideas] Custom Code Folding: Standardized Rules >> and Syntax? >> Message-ID: <20170716151559.GW3149 at ando.pearwood.info> >> Content-Type: text/plain; charset=us-ascii >> >> Hi Connor, and welcome! >> >> On Sun, Jul 16, 2017 at 10:37:26AM -0400, Connor Farrell wrote: >> > Background: I work in scientific computing and use Community Pycharm >> IDE. >> > >> > I'm a religious follower of the 'readability counts' mantra, and two >> things >> > I find myself doing often are: >> > - Writing custom code folds to segregate code, from groups of classes >> in a >> > file, to groups of lines in an individual function. While spacing works >> > great to separate ideas, my IDE allows me to collapse the entirety of >> the >> > code in exchange for a line of English. For my purposes, this enhances >> > readability immensely, as first time users are confronted with an >> > explanation of the contents, rather than the code itself with a comment >> on >> > top. I find comments don't draw the eye, and also don't have the >> ability to >> > their code as well. >> >> I'm afraid I'm having a lot of difficulty understanding this. I think >> the last sentence is missing a word. Comments don't have the ability to >> **what** their (whose?) code? >> >> Which IDE are you using? When you say it collapses the "entirety of the >> code", do you mean the entire file? >> >> >> > - Writing high level code, such as __init__ calls for large aggregates, >> > with one keyworded argument per line (plus dict unpackings at the end), >> > sort of like a simple XML file. 
>> >> Even if I accept that this is a reasonable design for __init__, I would >> not agree that it is a reasonable design for "high level code" in >> general. >> >> >> > This allows me to make parameters explicit >> > for other users, and optionally provide a comment indicating physical >> > units, cite sources, and/or give a list of tag/enum options for every >> > parameter. In the end I have 30+ line inits, but the readability is 10x >> > greater. >> >> Perhaps I might be convinced if I saw some actual code, but from your >> description alone, it doesn't sound particularly more readable. Why >> would I want to read citations in the parameter list of a method? I want >> to call the method, not do peer review on the theory behind it. >> >> >> > My IDE doesn't yet offer to fold long parameter lists by default, >> > but I think it makes sense. >> >> *shrug* >> >> Personally, I don't find code folding a big help. Perhaps once in a blue >> moon. I'm glad you like it and that it helps you. >> >> >> > In the end, I end up with very well folded code (except for large >> parameter >> > lists) and a bunch of co-workers asking about all the "editor-fold" >> > comments that don't work in their API. >> >> I'm afraid I'm not understanding you here either. What's an >> "editor-fold" comment? What do they mean by API? API for which >> application? How does the programming interface to an application relate >> to code folding in a text editor? >> >> >> > Python was a trail-blazer in terms of emphasizing the importance of code >> > readability and effective syntax. I think that we should consider some >> sort >> > of standard for folding comments, if not only to promote stronger code >> > organizations. I know standards don't usually interact with IDEs, but >> hey, >> > the 'typing' module is pretty dang nice. >> > >> > TL;DR: Code folding is great, custom code folding is great, let's >> upgrade >> > it to a language feature. >> >> What does that even mean? 
Are you suggesting that the Python interpreter >> should raise a SyntaxError or something if your code was written in an >> editor that didn't support code folding? How would it know? >> >> Python is a programming language. The source code is text. I should be >> able to write Python code in NotePad if I want. Why should the Python >> core devs try to force every text editor and IDE fold code exactly the >> same way? That sounds awful to me. People choose different editors >> because they like different features, and that may include the >> particular way the editor folds code. Or to not fold it at all. >> >> I'm sorry to be so negative, but I don't understand your proposal, and >> the bits that I *think* I understand sound pretty awful to me. Perhaps >> you can explain a bit better what you mean and why it should be a >> language feature, apart from "I want everyone to lay out their source >> code the way I do". Because that's what it sounds like to me. >> >> >> -- >> Steve >> >> >> ------------------------------ >> >> Subject: Digest Footer >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> >> >> ------------------------------ >> >> End of Python-ideas Digest, Vol 128, Issue 41 >> ********************************************* >> > > > > _______________________________________________ > Python-ideas mailing listPython-ideas at python.orghttps://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From python-ideas at shalmirane.com Sun Jul 16 14:59:12 2017 From: python-ideas at shalmirane.com (Ken Kundert) Date: Sun, 16 Jul 2017 11:59:12 -0700 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: References: <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> <22883.26308.389706.743344@turnbull.sk.tsukuba.ac.jp> <1169061500163933@web18o.yandex.ru> Message-ID: <20170716185912.GA6300@kundert.designers-guide.com> Nick, I see. The Python C interface provides a very simple way of raising an exception in the case where the exception is only passed one argument, the error message. It even makes it easy to interpolate arguments into that error message. If it is important to include other arguments, as with OSError, one would presumably use another mechanism that requires more effort. Your suggestion of providing an alternative to PyErr_Format() that squirreled away its arguments into the exception and deferred their interpolation into the error message seems like a very nice approach. But as you say, it does not seem trivial to implement in C. If it could be extended to support named arguments, it does have the benefit that it could unify the way exceptions are raise from C. The alternative would be to simply enhance the individual exceptions on an as needed basis as you suggested earlier. That could be easy if the exceptions are only raised in one or two places. Do you have a sense for how many places raise some of these common exceptions such as NameError, KeyError, etc.? -Ken On Sun, Jul 16, 2017 at 02:18:40PM +1000, Nick Coghlan wrote: > On 16 July 2017 at 10:12, Jeff Walker wrote: > > Sorry Stephen (and Steven). I'll do better next time. > > > > The way I see it there are two problems, and the second causes the first. 
> > The first problem is that there is no direct access to the components
> > that make up the error in some of the standard Python exceptions.
> >
> >     >>> foo
> >     Traceback (most recent call last):
> >       File "<stdin>", line 1, in <module>
> >     NameError: name 'foo' is not defined
> >
> > If you need access to the name, you must de-construct the error
> > message. To get direct access to the name, it would need to be passed
> > to the exception when raised. Why wasn't that done?
>
> Ease of implementation mainly, as PyErr_Format is currently the
> simplest general purpose mechanism for raising informative exceptions
> from the C code:
> https://docs.python.org/3/c-api/exceptions.html#c.PyErr_Format
>
> All the more structured ones currently require expanding the C API to
> cover creating particular exception types with particular field values
> (the various PyErr_Set* functions listed on that page).
>
> That does prompt a thought, though: what if there was an f-string
> style "PyErr_SetStructured" API, where instead of writing
> 'PyErr_Format(PyExc_NameError, "name '%.200s' is not defined",
> name);', you could instead write 'PyErr_SetStructured(PyExc_NameError,
> "name '{name:.200s}' is not defined", name);', and that effectively
> translated to setting a NameError exception with that message *and*
> setting its "name" attribute appropriately.
>
> The key benefit such a switch to f-string style formatting would
> provide is the ability to optionally *name* the fields in the
> exception message if they correspond to an instance attribute on the
> exception being raised.
>
> Actually implementing such an API wouldn't be easy by any means (and
> I'm not volunteering to do it myself), but it *would* address the main
> practical barrier to making structured exceptions more pervasive in
> the reference interpreter.
>
> Cheers,
> Nick.
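[A rough Python-level sketch of the deferred-interpolation idea discussed in this exchange: the exception stores its fields as attributes and only interpolates them into the message when the text is actually rendered. The names `StructuredError` and `StructuredNameError` are made up for illustration; the actual proposal concerns the C-level PyErr_* functions.]

```python
class StructuredError(Exception):
    """Exception that keeps its fields and defers message formatting."""

    def __init__(self, template, **fields):
        super().__init__(template)
        self.template = template
        self.fields = fields
        # Expose each field as a plain attribute, e.g. exc.name
        for field_name, value in fields.items():
            setattr(self, field_name, value)

    def __str__(self):
        # Interpolation happens only when the message is rendered.
        return self.template.format(**self.fields)


class StructuredNameError(StructuredError, NameError):
    pass


try:
    raise StructuredNameError("name {name!r} is not defined", name="foo")
except NameError as exc:
    assert exc.name == "foo"                        # structured access
    assert str(exc) == "name 'foo' is not defined"  # formatted message
```

No de-construction of the error message is needed: a handler can read `exc.name` directly, while `str(exc)` still produces the familiar text.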
From jeff.walker00 at yandex.com  Sun Jul 16 19:17:39 2017
From: jeff.walker00 at yandex.com (Jeff Walker)
Date: Sun, 16 Jul 2017 17:17:39 -0600
Subject: [Python-ideas] Arguments to exceptions
In-Reply-To: References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> <22883.26308.389706.743344@turnbull.sk.tsukuba.ac.jp> <1169061500163933@web18o.yandex.ru>
Message-ID: <1764561500247059@web21m.yandex.ru>

15.07.2017, 18:33, "Chris Angelico" :
> On Sun, Jul 16, 2017 at 10:12 AM, Jeff Walker wrote:
>> The first problem is that there is no direct access to the components
>> that make up the error in some of the standard Python exceptions.
>>
>>     >>> foo
>>     Traceback (most recent call last):
>>       File "<stdin>", line 1, in <module>
>>     NameError: name 'foo' is not defined
>>
>> If you need access to the name, you must de-construct the error message.
>> To get direct access to the name, it would need to be passed to the
>> exception when raised. Why wasn't that done?
>
> Because it normally isn't needed. Can you give an example of where
> NameError could legitimately be raised from multiple causes?

For me it generally occurs when users are using Python to hold their actual data. Python can be used as a data format in much the same way as JSON or YAML. And if the end user is familiar with Python, it provides many nice benefits over the alternatives. I tend to use it in this fashion a great deal, particularly for configuration information.
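[A small sketch of the use case described above: executing Python source as a configuration format, where an undefined name in the user's config surfaces as a NameError whose details currently have to be fished out of the message text. The config text and the `load_config` helper are hypothetical.]

```python
# A user's configuration, written as ordinary Python.
config_text = """
host = 'localhost'
port = 8080
workers = cpu_count   # oops: cpu_count was never defined or imported
"""

def load_config(text):
    """Execute config text; return (settings, error_message)."""
    namespace = {}
    try:
        exec(text, namespace)
    except NameError as e:
        # Today the offending name has to be fished out of str(e);
        # direct, structured access to it is what is being asked for.
        return None, str(e)
    namespace.pop('__builtins__', None)  # added by exec
    return namespace, None

settings, error = load_config(config_text)
assert settings is None
assert "cpu_count" in error   # e.g. "name 'cpu_count' is not defined"
```

With structured arguments on NameError, the `except` clause could report the bad name (and perhaps the config line) without parsing the message string.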
In these cases, it would be nice to be able to access the 'participants' in the following exceptions:

    NameError (offending name)
    KeyError (collection, offending key)
    IndexError (collection, offending index)

Jeff

From evanbenadler at gmail.com  Mon Jul 17 01:05:57 2017
From: evanbenadler at gmail.com (Evan Adler)
Date: Mon, 17 Jul 2017 01:05:57 -0400
Subject: [Python-ideas] Is this PEP viable?
Message-ID:

I would like to submit the following proposal. In the logging module, I would like handlers (like file handlers and stream handlers) to have a field for exc_info printing. This way, a call to logger.exception() will write the stack trace to the handlers with this flag set, and only print the message and other info to handlers without the flag set. This allows a single logger to write to a less detailed console output, a less detailed run log, and a more detailed error log.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncoghlan at gmail.com  Mon Jul 17 02:27:40 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 17 Jul 2017 16:27:40 +1000
Subject: [Python-ideas] Arguments to exceptions
In-Reply-To: <20170716185912.GA6300@kundert.designers-guide.com>
References: <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> <22883.26308.389706.743344@turnbull.sk.tsukuba.ac.jp> <1169061500163933@web18o.yandex.ru> <20170716185912.GA6300@kundert.designers-guide.com>
Message-ID:

On 17 July 2017 at 04:59, Ken Kundert wrote:
> Nick,
> I see. The Python C interface provides a very simple way of raising an
> exception in the case where the exception is only passed one argument, the error
> message. It even makes it easy to interpolate arguments into that error message.
> If it is important to include other arguments, as with OSError, one would
> presumably use another mechanism that requires more effort.
> Your suggestion of providing an alternative to PyErr_Format() that squirreled
> away its arguments into the exception and deferred their interpolation into the
> error message seems like a very nice approach. But as you say, it does not seem
> trivial to implement in C. If it could be extended to support named arguments,
> it does have the benefit that it could unify the way exceptions are raised from
> C.

I'll note that while it wouldn't be trivial, it *should* at least theoretically be possible to share the string segmentation and processing steps with the f-string implementation.

However, I also realised that a much simpler approach to the same idea would instead look like:

    PyErr_SetFromAttributes(PyExc_NameError,
                            "name '%.200s' is not defined", "name", name);

Where the field names alternate with the field values in the argument list, rather than being embedded in the format string. Ignoring error handling, the implementation would then be something like:

- split va_list into a list of field names and a list of field values
- set the exception via `PyErr_SetObject(exc_type, PyUnicode_Format(exc_msg, field_values))`
- retrieve the just set exception and do `PyObject_SetAttr(exc, field_name, field_value)` for each field with a non-NULL name

> The alternative would be to simply enhance the individual exceptions on an as
> needed basis as you suggested earlier. That could be easy if the exceptions are
> only raised in one or two places. Do you have a sense for how many places raise
> some of these common exceptions such as NameError, KeyError, etc.?

A quick check shows around a dozen hits for PyExc_NameError in c files (all in ceval.c), and most of the hits for NameError in Python files being related to catching it rather than raising it.
PyExc_KeyError is mentioned in a dozen or so C files (most of them raising it, a few of them check for it being raised elsewhere), and while the hits in Python files still mostly favour catching it, it's raised explicitly in the standard library more often than NameError is.

The nice thing about the PyErr_SetFromAttributes approach at the C API level is that it pairs nicely with the following approach at the constructor level:

    class NameError(Exception):
        def __init__(self, message, *, name=None):
            super().__init__(message)
            self.name = name

Something like KeyError may want to distinguish between "key value was None" and "key value wasn't reported when raising the exception" (by not defining the attribute at all in the latter case), but PyErr_SetFromAttributes wouldn't need to care about those kinds of details.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From rhodri at kynesim.co.uk  Mon Jul 17 07:11:24 2017
From: rhodri at kynesim.co.uk (Rhodri James)
Date: Mon, 17 Jul 2017 12:11:24 +0100
Subject: [Python-ideas] Custom Code Folding: Standardized Rules and Syntax?
In-Reply-To: <20170716151559.GW3149@ando.pearwood.info>
References: <20170716151559.GW3149@ando.pearwood.info>
Message-ID: <8dfd6e1a-38a6-32b1-b8a9-bdb2ebf58b96@kynesim.co.uk>

On 16/07/17 16:15, Steven D'Aprano wrote:
> Hi Connor, and welcome!
>
> On Sun, Jul 16, 2017 at 10:37:26AM -0400, Connor Farrell wrote:
>> Background: I work in scientific computing and use Community Pycharm IDE.
>> In the end, I end up with very well folded code (except for large parameter
>> lists) and a bunch of co-workers asking about all the "editor-fold"
>> comments that don't work in their API.
>
> I'm afraid I'm not understanding you here either. What's an
> "editor-fold" comment? What do they mean by API? API for which
> application? How does the programming interface to an application relate
> to code folding in a text editor?
The usual mechanism for folding (in those editors that don't think they know best) is to delimit them with comments that start with a particular sequence of characters. You might have:

    class Shrubbery:
        def __init__(self, size, planting, rockery=False):
            # {{{ Boring bits you don't need to see
            self.size = size
            self.rockery = rockery
            # }}}
            self.cost = complicated_function_of(planting)

...and so on.

>> Python was a trail-blazer in terms of emphasizing the importance of code
>> readability and effective syntax. I think that we should consider some sort
>> of standard for folding comments, if not only to promote stronger code
>> organizations. I know standards don't usually interact with IDEs, but hey,
>> the 'typing' module is pretty dang nice.
>>
>> TL;DR: Code folding is great, custom code folding is great, let's upgrade
>> it to a language feature.
>
> What does that even mean? Are you suggesting that the Python interpreter
> should raise a SyntaxError or something if your code was written in an
> editor that didn't support code folding? How would it know?
>
> Python is a programming language. The source code is text. I should be
> able to write Python code in NotePad if I want. Why should the Python
> core devs try to force every text editor and IDE fold code exactly the
> same way? That sounds awful to me. People choose different editors
> because they like different features, and that may include the
> particular way the editor folds code. Or to not fold it at all.
>
> I'm sorry to be so negative, but I don't understand your proposal, and
> the bits that I *think* I understand sound pretty awful to me. Perhaps
> you can explain a bit better what you mean and why it should be a
> language feature, apart from "I want everyone to lay out their source
> code the way I do". Because that's what it sounds like to me.

I think I do understand the proposal, but I still agree with Steven.
Folding is a feature of IDEs, not languages, and trying to legislate about it in a language definition is a bad plan. I say this as someone who wrote a folding editor to make programming in Occam bearable :-) -- Rhodri James *-* Kynesim Ltd From turnbull.stephen.fw at u.tsukuba.ac.jp Mon Jul 17 11:20:55 2017 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Tue, 18 Jul 2017 00:20:55 +0900 Subject: [Python-ideas] Arguments to exceptions In-Reply-To: <1169061500163933@web18o.yandex.ru> References: <20170703031810.GU3149@ando.pearwood.info> <20170703085902.GA27217@kundert.designers-guide.com> <20170704140259.GX3149@ando.pearwood.info> <595C0D0E.2030600@brenbarn.net> <20170705113634.GF3149@ando.pearwood.info> <20170705153901.GG3149@ando.pearwood.info> <20170705173221.GH3149@ando.pearwood.info> <166441499301972@web57j.yandex.ru> <192641499305985@web47g.yandex.ru> <22883.26308.389706.743344@turnbull.sk.tsukuba.ac.jp> <1169061500163933@web18o.yandex.ru> Message-ID: <22892.54743.25255.224365@turnbull.sk.tsukuba.ac.jp> First, I would like to make it clear that I am not claiming that the current design of standard Exceptions cannot be improved, nor do I necessarily have a problem with doing it at the BaseException level. My argument has always been directed to the issue of "does changing BaseException actually save work in design and implementation of the interpreter, of standard derived Exceptions, and of new user Exceptions?" I don't think it does. But Ken and you have taken opposition to Ken's proposal to indicate that we don't see a problem with the current implementation of the standard Exception hierarchy. That's not true. We do understand the problems. We differ on importance: Steven d'A seems to deprecate the NameError in "data-sets-formatted-as-Python-containers" problem as rare and only encountered by a few users. 
I think that's a big enough class of use cases (I do it myself, though I've never encountered Ken's use case) to worry about, though I wouldn't be willing to do the work myself on the evidence so far presented. But we do understand. What you and Ken haven't done yet is to show 1. the implementation of an improvement is simple and generic, and 2. enough so to justify what seems to be an improvement restricted to an uncommon set of use cases. Nick has helped to convince me that (1) can be addressed, though he warned that it's not all that easy (especially if you're a DRY fanatic). The question then is the balance between (1) and (2). Jeff Walker writes: > The first problem is that there is no direct access to the > components that make up the error in some of the standard Python > exceptions. That's not a problem in BaseException, though. BaseException allows access to arbitrary Python objects, if, and only if, the developer puts them in the exception. > Why wasn't that done? I don't know, but it's not prevented by BaseException. > That leads us to the second problem: the base exception does not > handle arguments gracefully unless you only pass an error message > all by itself. I think it would have been a mistake to try to do something graceful. The point is that looking at the design, I would suppose that BaseException had only one job: conveying whatever information the programmer chose to put in it faithfully to an exception handler. It does that, even if you hand it something really wierd, and in several pieces. This is called "humility in design when you have no idea what you're designing for". The Zen expresses it "But sometimes never is better than *right* now." It's easy to design a better BaseException for a handful of cases. It's not at all clear that such a BaseException would work well for more than a double handful. 
If it doesn't, it would tend to cause the same problem for exceptions that don't fit it very well but that the programmer doesn't think are important enough to deserve a well-defined derived Exception. Since it would necessarily be more complex than the current BaseException, it might have even uglier failure modes.

> >>> try:
> >>>     name = 'foo'
> >>>     raise NameError(name, '%s: name is not defined.' % name)
> >>> except NameError as e:
> >>>     print(str(e))
> ('foo', 'foo: name is not defined.')
>
> In this case, printing the exception cast to a string does not
> result in a reasonable error message.

Sure, but that's because the error message was poorly chosen. I would raise NameError(name, 'error: undefined name'), which would stringify as ('foo', 'error: undefined name') (and there are probably even better ways to phrase that).

But in the case of NameError, I don't understand why a string is used at all. I really don't see a problem with the interpreter printing NameError: foo and a traceback indicating where "foo" was encountered, and having programmatic handlers do something appropriate with e.args[0]. It's not obvious to me that the same is not true of most of the "stupid string formatting" Exceptions that have been given as examples.

> So the basic point is that the lack of reasonable behavior for
> str(e)

I hate to tell you this, but str doesn't guarantee anything like reasonable behavior for error handling. Even repr doesn't guarantee you can identify an object passed to it, nor reproduce it. I have to assume that the developers who designed these Exceptions believed that in the vast majority of cases the mere fact that an instance of a particular Exception was raised is enough to identify the offending object. Perhaps they were mistaken. But if they were, the places in the interpreter where they are raised and handled must all be checked to see if they need to be changed, whether we change BaseException or the individual derived Exceptions.
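[The behaviour under discussion here -- BaseException faithfully carrying whatever objects the developer puts into it, and stringifying a multi-argument exception as the repr of its args tuple -- can be checked directly; this is a small illustrative snippet added by the editor, not code from the thread:]

```python
# BaseException stores its positional arguments in .args unchanged.
# With more than one argument, str() of the exception is just the
# repr of that tuple -- exactly the stringification shown in the
# quoted example above.
e = NameError('foo', 'error: undefined name')
assert e.args == ('foo', 'error: undefined name')
assert str(e) == "('foo', 'error: undefined name')"

# With a single argument, str() is simply that argument, which is
# why single-message exceptions print reasonably by default.
e2 = NameError("name 'foo' is not defined")
assert str(e2) == "name 'foo' is not defined"
```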
And if we change BaseException, I expect it will be very desirable to change individual derived Exceptions as well (to give them intelligible error message templates). Also, it's not obvious to me that default templates are such a useful idea. I almost always use unique messages when "printf debugging", out of habit developed in other languages. That means I rarely have to look at tracebacks for simple errors. If better templates make it easier for people to not use context-specific messages, when they are warranted that's not a win, I think. > But in fact [use of keyword arguments to set "advanced" formatting] > is not really the point of the proposal, it is just a minor design > choice. I don't think it's so minor. The keyword argument approach was apparently chosen to avoid breaking backward compatibility in BaseException. Compatibility break is not an option here, IMO. I think it's also an important part of Ken's proposal that this be an "easy" tweak that makes it easier to write better derived Exceptions and better exception handling in many cases. > But sometimes you just want something quick that you can use that > does not require you to define your own exception. And sometimes, > even though people should define custom exceptions, they don't. For > example, NameError, KeyError, IndexError, ... I don't see how that is an argument for the proposed change to BaseException. Steve From brett at python.org Mon Jul 17 16:50:27 2017 From: brett at python.org (Brett Cannon) Date: Mon, 17 Jul 2017 20:50:27 +0000 Subject: [Python-ideas] Is this PEP viable? In-Reply-To: References: Message-ID: This doesn't require a PEP as it falls under "adding" to a pre-existing module: https://devguide.python.org/stdlibchanges/. On Sun, 16 Jul 2017 at 22:06 Evan Adler wrote: > I would like to submit the following proposal. In the logging module, I > would like handlers (like file handlers and stream handlers) to have a > field for exc_info printing. 
This way, a call to logger.exception() will > write the stack trace to the handlers with this flag set, and only print > the message and other info to handlers without the flag set. This allows a > single logger to write to a less detailed console output, a less detailed > run log, and a more detailed error log. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Mon Jul 17 20:01:58 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 17 Jul 2017 17:01:58 -0700 Subject: [Python-ideas] a new namedtuple Message-ID: <596D4FF6.5080908@stoneleaf.us> Guido has decreed that namedtuple shall be reimplemented with speed in mind. I haven't timed it (I'm hoping somebody will volunteer to be the bench mark guru), I'll offer my NamedTuple implementation from my aenum [1] library. It uses the same metaclass techniques as Enum, and offers doc string and default value support in the class-based form. -- ~Ethan~ [1] https://pypi.python.org/pypi/aenum/1.4.5 From ethan at stoneleaf.us Mon Jul 17 20:04:34 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 17 Jul 2017 17:04:34 -0700 Subject: [Python-ideas] a new namedtuple In-Reply-To: <596D4FF6.5080908@stoneleaf.us> References: <596D4FF6.5080908@stoneleaf.us> Message-ID: <596D5092.2040604@stoneleaf.us> On 07/17/2017 05:01 PM, Ethan Furman wrote: > I haven't timed it (I'm hoping somebody will volunteer to be the bench mark guru), I'll offer my NamedTuple > implementation from my aenum [1] library. It uses the same metaclass techniques as Enum, and offers doc string and > default value support in the class-based form. 
Oh, and to be clear, there are some other nice-to-have and/or neat features (such as variable-sized NamedTuples), that I would expect to be trimmed from a stdlib version. -- ~Ethan~ From levkivskyi at gmail.com Mon Jul 17 20:04:37 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Tue, 18 Jul 2017 02:04:37 +0200 Subject: [Python-ideas] a new namedtuple In-Reply-To: <596D4FF6.5080908@stoneleaf.us> References: <596D4FF6.5080908@stoneleaf.us> Message-ID: Just FYI, typing.NamedTuple is there for almost a year and already supports default values, methods, docstrings etc. Also there is ongoing work towards dataclasses PEP, see https://github.com/ericvsmith/dataclasses So that would keep namedtuple API as it is, and focus only on performance improvements. -- Ivan On 18 July 2017 at 02:01, Ethan Furman wrote: > Guido has decreed that namedtuple shall be reimplemented with speed in > mind. > > I haven't timed it (I'm hoping somebody will volunteer to be the bench > mark guru), I'll offer my NamedTuple implementation from my aenum [1] > library. It uses the same metaclass techniques as Enum, and offers doc > string and default value support in the class-based form. > > -- > ~Ethan~ > > > [1] https://pypi.python.org/pypi/aenum/1.4.5 > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From joejev at gmail.com Mon Jul 17 20:24:37 2017 From: joejev at gmail.com (Joseph Jevnik) Date: Mon, 17 Jul 2017 20:24:37 -0400 Subject: [Python-ideas] a new namedtuple In-Reply-To: References: <596D4FF6.5080908@stoneleaf.us> Message-ID: If we are worried about speed but want to keep the same API I have a near drop in replacement for collections.namedtuple that dramatically improves class and instance creation speed [1]. 
The only things missing from this implementation are `_source` and `verbose` which could be dynamically computed to provide equivalent Python source. This project was originally proposed as a replacement for the standard namedtuple, but after talking to Raymond we decided the performance did not outweigh the simplicity of the existing implementation. Now that people seem more concerned with performance, I wanted to bring this up again. [1] https://github.com/llllllllll/cnamedtuple On Mon, Jul 17, 2017 at 8:04 PM, Ivan Levkivskyi wrote: > Just FYI, typing.NamedTuple is there for almost a year and already supports > default values, methods, docstrings etc. > Also there is ongoing work towards dataclasses PEP, see > https://github.com/ericvsmith/dataclasses > > So that would keep namedtuple API as it is, and focus only on performance > improvements. > > -- > Ivan > > > > On 18 July 2017 at 02:01, Ethan Furman wrote: >> >> Guido has decreed that namedtuple shall be reimplemented with speed in >> mind. >> >> I haven't timed it (I'm hoping somebody will volunteer to be the bench >> mark guru), I'll offer my NamedTuple implementation from my aenum [1] >> library. It uses the same metaclass techniques as Enum, and offers doc >> string and default value support in the class-based form. 
>> >> -- >> ~Ethan~ >> >> >> [1] https://pypi.python.org/pypi/aenum/1.4.5 >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From ericsnowcurrently at gmail.com Mon Jul 17 21:25:09 2017 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 17 Jul 2017 19:25:09 -0600 Subject: [Python-ideas] a new namedtuple In-Reply-To: <596D4FF6.5080908@stoneleaf.us> References: <596D4FF6.5080908@stoneleaf.us> Message-ID: On Mon, Jul 17, 2017 at 6:01 PM, Ethan Furman wrote: > Guido has decreed that namedtuple shall be reimplemented with speed in mind. FWIW, I'm sure that any changes to namedtuple will be kept as minimal as possible. Changes would be limited to the underlying implementation, and would not include the namedtuple() signature, or using metaclasses, etc. However, I don't presume to speak for Guido or Raymond. :) -eric From steve at pearwood.info Mon Jul 17 21:34:48 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 18 Jul 2017 11:34:48 +1000 Subject: [Python-ideas] a new namedtuple In-Reply-To: <596D4FF6.5080908@stoneleaf.us> References: <596D4FF6.5080908@stoneleaf.us> Message-ID: <20170718013447.GB3149@ando.pearwood.info> On Mon, Jul 17, 2017 at 05:01:58PM -0700, Ethan Furman wrote: > Guido has decreed that namedtuple shall be reimplemented with speed in mind. > > I haven't timed it (I'm hoping somebody will volunteer to be the bench mark > guru), I'll offer my NamedTuple implementation from my aenum [1] library. 
With respect Ethan, if you're going to offer up NamedTuple as a faster version of namedtuple, you should at least do a quick proof of concept to demonstrate that it actually *is* faster. Full bench marking can wait, but you should be able to do at least something like: python3 -m timeit --setup "from collections import namedtuple" \ "K = namedtuple('K', 'a b c')" versus python3 -m timeit --setup "from aenum import NamedTuple" \ "K = NamedTuple('K', 'a b c')" (or whatever the interface is). If there's only a trivial speed up, or if its slower, then there's no point even considing it unless you speed it up first. -- Steve From ethan at stoneleaf.us Mon Jul 17 22:57:08 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 17 Jul 2017 19:57:08 -0700 Subject: [Python-ideas] a new namedtuple In-Reply-To: <20170718013447.GB3149@ando.pearwood.info> References: <596D4FF6.5080908@stoneleaf.us> <20170718013447.GB3149@ando.pearwood.info> Message-ID: <596D7904.3090900@stoneleaf.us> On 07/17/2017 06:34 PM, Steven D'Aprano wrote: > On Mon, Jul 17, 2017 at 05:01:58PM -0700, Ethan Furman wrote: >> Guido has decreed that namedtuple shall be reimplemented with speed in mind. >> >> I haven't timed it (I'm hoping somebody will volunteer to be the bench mark >> guru), I'll offer my NamedTuple implementation from my aenum [1] library. > > With respect Ethan, if you're going to offer up NamedTuple as a faster > version of namedtuple, you should at least do a quick proof of > concept to demonstrate that it actually *is* faster. I suck at benchmarking, so thank you for providing those quick-and-dirty hints. > Full bench marking > can wait, but you should be able to do at least something like: > > > python3 -m timeit --setup "from collections import namedtuple" \ > "K = namedtuple('K', 'a b c')" 546 usec > versus > > python3 -m timeit --setup "from aenum import NamedTuple" \ > "K = NamedTuple('K', 'a b c')" 250 usec So it seems to be faster! 
:) It is also namedtuple compatible, except for the _source attribute. -- ~Ethan~ From mertz at gnosis.cx Tue Jul 18 00:06:05 2017 From: mertz at gnosis.cx (David Mertz) Date: Mon, 17 Jul 2017 21:06:05 -0700 Subject: [Python-ideas] a new namedtuple In-Reply-To: <596D7904.3090900@stoneleaf.us> References: <596D4FF6.5080908@stoneleaf.us> <20170718013447.GB3149@ando.pearwood.info> <596D7904.3090900@stoneleaf.us> Message-ID: Can you try across a range of tuple sizes? E.g. what about with 100 items? 1000? On Jul 17, 2017 7:56 PM, "Ethan Furman" wrote: On 07/17/2017 06:34 PM, Steven D'Aprano wrote: > On Mon, Jul 17, 2017 at 05:01:58PM -0700, Ethan Furman wrote: > Guido has decreed that namedtuple shall be reimplemented with speed in mind. >> >> I haven't timed it (I'm hoping somebody will volunteer to be the bench >> mark >> guru), I'll offer my NamedTuple implementation from my aenum [1] library. >> > > With respect Ethan, if you're going to offer up NamedTuple as a faster > version of namedtuple, you should at least do a quick proof of > concept to demonstrate that it actually *is* faster. > I suck at benchmarking, so thank you for providing those quick-and-dirty hints. Full bench marking > can wait, but you should be able to do at least something like: > > > python3 -m timeit --setup "from collections import namedtuple" \ > "K = namedtuple('K', 'a b c')" > 546 usec versus > > python3 -m timeit --setup "from aenum import NamedTuple" \ > "K = NamedTuple('K', 'a b c')" > 250 usec So it seems to be faster! :) It is also namedtuple compatible, except for the _source attribute. -- ~Ethan~ _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From guido at python.org Tue Jul 18 00:31:39 2017
From: guido at python.org (Guido van Rossum)
Date: Mon, 17 Jul 2017 21:31:39 -0700
Subject: [Python-ideas] a new namedtuple
In-Reply-To: References: <596D4FF6.5080908@stoneleaf.us>
Message-ID:

On Mon, Jul 17, 2017 at 6:25 PM, Eric Snow wrote:
> On Mon, Jul 17, 2017 at 6:01 PM, Ethan Furman wrote:
> > Guido has decreed that namedtuple shall be reimplemented with speed in
> > mind.
>
> FWIW, I'm sure that any changes to namedtuple will be kept as minimal
> as possible. Changes would be limited to the underlying
> implementation, and would not include the namedtuple() signature, or
> using metaclasses, etc. However, I don't presume to speak for Guido
> or Raymond. :)

Indeed. I referred people here for discussion of ideas like this:

>>> a = (x=1, y=0)

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From george at fischhof.hu Tue Jul 18 07:55:15 2017
From: george at fischhof.hu (George Fischhof)
Date: Tue, 18 Jul 2017 13:55:15 +0200
Subject: [Python-ideas] tempfile.TemporaryDirectory() should be able to create temporary directory at a given arbitrary place
Message-ID:

Hi there,

I used tempfile.TemporaryDirectory(). On first usage it was good, but on the second one there was a need to create a temporary directory and files in it at a given place. (It was needed for a test.)

And I found that TemporaryDirectory() is not able to do this. So my idea is to implement this behaviour with an additional path parameter in it.

BR,
George
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From storchaka at gmail.com Tue Jul 18 08:06:20 2017
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 18 Jul 2017 15:06:20 +0300
Subject: [Python-ideas] tempfile.TemporaryDirectory() should be able to create temporary directory at a given arbitrary place
In-Reply-To: References: Message-ID:

18.07.17 14:55, George Fischhof wrote:
> I used tempfile.TemporaryDirectory(). On first usage it was good, but on
> second one there was a need to create tempopray directory and files in
> it a given place. (It needed for a test).
>
> And I found that TemporaryDirectory() is not able to do this. So my idea
> is to implement this behaviour with an addittional path parameter in it.

You can pass the dir argument to TemporaryDirectory().

From george at fischhof.hu Tue Jul 18 08:08:47 2017
From: george at fischhof.hu (George Fischhof)
Date: Tue, 18 Jul 2017 14:08:47 +0200
Subject: [Python-ideas] pathlib.Path should handle Pythonpath or package root
Message-ID:

Hi there,

I created a program which uses plugins (import them). I started to test it, and found that I need two types of paths: one for the file system and another one which is package relative.

So I think this is a good idea, to enhance pathlib to handle package roots. (I know about sys.path, environment variables, importlib etc, but I think it should be in pathlib.)

BR,
George
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From george at fischhof.hu Tue Jul 18 08:14:14 2017
From: george at fischhof.hu (George Fischhof)
Date: Tue, 18 Jul 2017 14:14:14 +0200
Subject: [Python-ideas] tempfile.TemporaryDirectory() should be able to create temporary directory at a given arbitrary place
In-Reply-To: References: Message-ID:

2017-07-18 14:06 GMT+02:00 Serhiy Storchaka :
> 18.07.17 14:55, George Fischhof wrote:
>> I used tempfile.TemporaryDirectory(). On first usage it was good, but on
>> second one there was a need to create tempopray directory and files in it a
>> given place.
(It needed for a test). >> >> And I found that TemporaryDirectory() is not able to do this. So my idea >> is to implement this behaviour with an addittional path parameter in it. >> > > You can pass the dir argument to TemporaryDirectory(). > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > Hi Serhiy, thank you very much. ;-) I was lost in documentation... and did not find this. BR, George -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Tue Jul 18 08:15:39 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 18 Jul 2017 22:15:39 +1000 Subject: [Python-ideas] tempfile.TemporaryDirectory() should be able to create temporary directory at a given arbitrary place In-Reply-To: References: Message-ID: <20170718121537.GC3149@ando.pearwood.info> On Tue, Jul 18, 2017 at 01:55:15PM +0200, George Fischhof wrote: > Hi there, > > I used tempfile.TemporaryDirectory(). On first usage it was good, but on > second one there was a need to create tempopray directory and files in it a > given place. (It needed for a test). > > And I found that TemporaryDirectory() is not able to do this. So my idea is > to implement this behaviour with an addittional path parameter in it. Guido's Time Machine strikes again. TemporaryDirectory takes a dir argument to set the location where the temporary directory is created. 
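[For the archives, Steven's point above sketches as follows; this snippet is the editor's illustration of the documented dir parameter, using a throwaway parent directory rather than any particular "given place":]

```python
import os
import tempfile

# dir= controls *where* the temporary directory is created; the leaf
# name is still randomly generated, so only the parent is predictable.
parent = tempfile.mkdtemp()  # stand-in for the desired location
with tempfile.TemporaryDirectory(dir=parent) as tmp:
    assert os.path.dirname(tmp) == parent
    assert os.path.isdir(tmp)
# The directory and its contents are removed when the context exits.
assert not os.path.exists(tmp)
os.rmdir(parent)
```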
https://docs.python.org/3/library/tempfile.html#tempfile.TemporaryDirectory As far as I can tell, this functionality has existed as far back as Python 2.3: https://docs.python.org/2/library/tempfile.html#tempfile.mkdtemp -- Steve From ncoghlan at gmail.com Tue Jul 18 09:29:47 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 18 Jul 2017 23:29:47 +1000 Subject: [Python-ideas] a new namedtuple In-Reply-To: References: <596D4FF6.5080908@stoneleaf.us> Message-ID: On 18 July 2017 at 14:31, Guido van Rossum wrote: > On Mon, Jul 17, 2017 at 6:25 PM, Eric Snow > wrote: >> >> On Mon, Jul 17, 2017 at 6:01 PM, Ethan Furman wrote: >> > Guido has decreed that namedtuple shall be reimplemented with speed in >> > mind. >> >> FWIW, I'm sure that any changes to namedtuple will be kept as minimal >> as possible. Changes would be limited to the underlying >> implementation, and would not include the namedtuple() signature, or >> using metaclasses, etc. However, I don't presume to speak for Guido >> or Raymond. :) > > > Indeed. I referred people here for discussion of ideas like this: > >>>> a = (x=1, y=0) In that vein, something I'll note that *wasn't* historically possible due to the lack of keyword argument order preservation is an implementation that implicitly defines anonymous named tuple types based on the received keyword arguments. 
Given Python 3.6+ though, this works:

    from collections import namedtuple

    def _make_named_tuple(*fields):
        cls_name = "_ntuple_" + "_".join(fields)
        # Use the module globals as a cache for pickle compatibility
        namespace = globals()
        try:
            return namespace[cls_name]
        except KeyError:
            cls = namedtuple(cls_name, fields)
            return namespace.setdefault(cls_name, cls)

    def ntuple(**items):
        cls = _make_named_tuple(*items)
        return cls(*items.values())

    >>> p1 = ntuple(x=1, y=2)
    >>> p2 = ntuple(x=4, y=5)
    >>> type(p1) is type(p2)
    True
    >>> type(p1)

That particular approach isn't *entirely* pickle friendly (since unpickling will still fail if a suitable type hasn't been defined in the destination process yet), but you can fix that by way of playing games with setting cls.__qualname__ to refer to an instance of a custom class that splits "_ntuple_*" back out into the component field names in __getattr__ and then calls _make_named_tuple, rather than relying solely on a factory function as I have done here.

However, it also isn't all that hard to imagine a literal syntax instead using a dedicated builtin type factory (perhaps based on structseq) that implicitly produced types that knew to rely on the appropriate builtin to handle instance creation on unpickling - the hardest part of the problem (preserving the keyword argument order) was already addressed in 3.6.

Cheers, Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Tue Jul 18 09:56:07 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 18 Jul 2017 23:56:07 +1000
Subject: [Python-ideas] pathlib.Path should handle Pythonpath or package root
In-Reply-To: References: Message-ID:

On 18 July 2017 at 22:08, George Fischhof wrote:
> Hi there,
>
> I created a program which uses plugins (import them). I started to test it,
> and found that I need two types of paths: one for file system and another
> one which is package relative.
> > So I thing this is a good idea, to enhance pathlib to handle package roots. Is there a specific behaviour you're looking for that isn't already addressed by "pathlib.Path(module.__file__).parent"? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ethan at stoneleaf.us Tue Jul 18 11:56:54 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 18 Jul 2017 08:56:54 -0700 Subject: [Python-ideas] a new namedtuple In-Reply-To: References: <596D4FF6.5080908@stoneleaf.us> Message-ID: <596E2FC6.8020007@stoneleaf.us> On 07/17/2017 06:25 PM, Eric Snow wrote: > On Mon, Jul 17, 2017 at 6:01 PM, Ethan Furman wrote: >> Guido has decreed that namedtuple shall be reimplemented with speed in mind. > > FWIW, I'm sure that any changes to namedtuple will be kept as minimal > as possible. Changes would be limited to the underlying > implementation, and would not include the namedtuple() signature, or > using metaclasses, etc. However, I don't presume to speak for Guido > or Raymond. :) I certainly don't expect the signature to change, but why is using a metaclass out? The use (or not) of a metaclass /is/ an implementation detail. -- ~Ethan~ From guido at python.org Tue Jul 18 12:09:18 2017 From: guido at python.org (Guido van Rossum) Date: Tue, 18 Jul 2017 09:09:18 -0700 Subject: [Python-ideas] a new namedtuple In-Reply-To: <596E2FC6.8020007@stoneleaf.us> References: <596D4FF6.5080908@stoneleaf.us> <596E2FC6.8020007@stoneleaf.us> Message-ID: On Tue, Jul 18, 2017 at 8:56 AM, Ethan Furman wrote: > I certainly don't expect the signature to change, but why is using a > metaclass out? The use (or not) of a metaclass /is/ an implementation > detail. > It is until you try to subclass with another metaclass -- then you have a metaclass conflict. If the namedtuple had no metaclass this would not be a conflict. (This is one reason to love class decorators.) 
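[Guido's point about metaclass conflicts is easy to reproduce; this is a minimal sketch by the editor with made-up metaclass names, unrelated to the actual namedtuple implementation:]

```python
class MetaA(type):
    pass

class MetaB(type):
    pass

class A(metaclass=MetaA):
    pass

class B(metaclass=MetaB):
    pass

# Combining bases whose metaclasses are unrelated fails at class
# creation time with "metaclass conflict: ...".
try:
    class C(A, B):
        pass
except TypeError as exc:
    conflict = str(exc)

# A class decorator, by contrast, leaves the class an ordinary
# instance of type, so no such conflict can arise when subclassing:
def add_marker(cls):
    cls.marker = True
    return cls

@add_marker
class Plain:
    pass

class AlsoFine(Plain, B):  # only one real metaclass (MetaB) involved
    pass
```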
-- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Tue Jul 18 14:38:46 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 18 Jul 2017 11:38:46 -0700 Subject: [Python-ideas] a new namedtuple In-Reply-To: References: <596D4FF6.5080908@stoneleaf.us> <20170718013447.GB3149@ando.pearwood.info> <596D7904.3090900@stoneleaf.us> Message-ID: <596E55B6.2030609@stoneleaf.us> On 07/17/2017 09:06 PM, David Mertz wrote: > On Jul 17, 2017 7:56 PM, "Ethan Furman" > wrote: > > On 07/17/2017 06:34 PM, Steven D'Aprano wrote: > > On Mon, Jul 17, 2017 at 05:01:58PM -0700, Ethan Furman wrote: > > > Guido has decreed that namedtuple shall be reimplemented with speed in mind. > > I haven't timed it (I'm hoping somebody will volunteer to be the bench mark > guru), I'll offer my NamedTuple implementation from my aenum [1] library. > > > With respect Ethan, if you're going to offer up NamedTuple as a faster > version of namedtuple, you should at least do a quick proof of > concept to demonstrate that it actually *is* faster. > > > I suck at benchmarking, so thank you for providing those quick-and-dirty hints. > > > Full bench marking > can wait, but you should be able to do at least something like: > > > python3 -m timeit --setup "from collections import namedtuple" \ > "K = namedtuple('K', 'a b c')" > > > 546 usec > > > versus > > python3 -m timeit --setup "from aenum import NamedTuple" \ > "K = NamedTuple('K', 'a b c')" > > > 250 usec > > > So it seems to be faster! :) > Can you try across a range of tuple sizes? E.g. what about with 100 items? 1000? I tried 26 and 52 (which really seems unlikely for a named tuple), and NamedTuple was consistently faster by about 250 usec. Using a metaclass is off the table, but this is still interesting data for me. 
:)

--
~Ethan~

From george at fischhof.hu Tue Jul 18 16:40:53 2017
From: george at fischhof.hu (George Fischhof)
Date: Tue, 18 Jul 2017 22:40:53 +0200
Subject: [Python-ideas] pathlib.Path should handle Pythonpath or package root
In-Reply-To: References: Message-ID:

2017-07-18 15:56 GMT+02:00 Nick Coghlan :
> On 18 July 2017 at 22:08, George Fischhof wrote:
> > Hi there,
> >
> > I created a program which uses plugins (import them). I started to test it,
> > and found that I need two types of paths: one for file system and another
> > one which is package relative.
> >
> > So I thing this is a good idea, to enhance pathlib to handle package roots.
>
> Is there a specific behaviour you're looking for that isn't already
> addressed by "pathlib.Path(module.__file__).parent"?
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

Hi Nick,

I think yes ;-)

I would like to use (or I think it would be good to use) something like pathlib.Path(package_root), so I could use

    importlib.import_module(pathlib.Path(package_root) / plugins / plugin_name)

and normal file system operations, for example

    with open(pathlib.Path(package_root) / plugins / plugin_name) as my_file:
        do_something_with_file

The import statement can be used to go downward, and a path can be used up and down in the directory hierarchy. If someone wants to use both of them, the only common point (branch) is the package root. From the package root one can use the same path expressions for import and for other file system operations.

BR,
George
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jsbueno at python.org.br Tue Jul 18 17:06:00 2017
From: jsbueno at python.org.br (Joao S. O.
Bueno) Date: Tue, 18 Jul 2017 18:06:00 -0300 Subject: [Python-ideas] a new namedtuple In-Reply-To: <20170718013447.GB3149@ando.pearwood.info> References: <596D4FF6.5080908@stoneleaf.us> <20170718013447.GB3149@ando.pearwood.info> Message-ID: In the other thread, I had mentioned my "extradict" implementation - it does have quite a few differences as it did not try to match namedtuple API, but it works nicely for all common use cases - these are the timeit timings: (env) [gwidion at caylus ]$ python3 -m timeit --setup "from collections import namedtuple" "K = namedtuple('K', 'a b c')" 1000 loops, best of 3: 362 usec per loop (env) [gwidion at caylus ]$ python3 -m timeit --setup "from extradict import namedtuple" "K = namedtuple('K', 'a b c')" 10000 loops, best of 3: 20 usec per loop (env) [gwidion at caylus ]$ python3 -m timeit --setup "from extradict import fastnamedtuple as namedtuple" "K = namedtuple('K', 'a b c')" 10000 loops, best of 3: 21 usec per loop Source at: https://github.com/jsbueno/extradict/blob/master/extradict/extratuple.py On 17 July 2017 at 22:34, Steven D'Aprano wrote: > On Mon, Jul 17, 2017 at 05:01:58PM -0700, Ethan Furman wrote: > > Guido has decreed that namedtuple shall be reimplemented with speed in > mind. > > > > I haven't timed it (I'm hoping somebody will volunteer to be the bench > mark > > guru), I'll offer my NamedTuple implementation from my aenum [1] library. > > With respect Ethan, if you're going to offer up NamedTuple as a faster > version of namedtuple, you should at least do a quick proof of > concept to demonstrate that it actually *is* faster. Full bench marking > can wait, but you should be able to do at least something like: > > > python3 -m timeit --setup "from collections import namedtuple" \ > "K = namedtuple('K', 'a b c')" > > versus > > python3 -m timeit --setup "from aenum import NamedTuple" \ > "K = NamedTuple('K', 'a b c')" > > (or whatever the interface is). 
If there's only a trivial speed up, or > if it's slower, then there's no point even considering it unless you speed > it up first. > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pfreixes at gmail.com Tue Jul 18 17:33:00 2017 From: pfreixes at gmail.com (Pau Freixes) Date: Tue, 18 Jul 2017 23:33:00 +0200 Subject: [Python-ideas] Enabling Event.set to notify all waiters with an exception Message-ID: Morning guys, I came across with that idea trying to resolve a typical dogpile pattern [1], having many DNS calls to the same domain because of a miss in a DNS cache. The usage of the set either to notify that the waiters could be awake and get the result from the cache or use it to notify that something was wrong helped me to reduce the complexity of the code. Just as an example: ``` if key in throttle_dns_events: yield from throttle_dns_events[key].wait() else: throttle_dns_events[key] = Event(loop=loop) try: addrs = yield from \ resolver.resolve(host, port, family=family) cached_hosts.add(key, addrs) throttle_dns_events[key].set() except Exception as e: # any DNS exception, independently of the implementation # is set for the waiters to raise the same exception. throttle_dns_events[key].set(exc=e) raise finally: throttle_dns_events.pop(key) ``` Any error caught by the locker will be broadcasted to the waiters. For example, an invalid hostname. I tried to open a PR to the CPython implementation, and they claim that the current interface of all of the locks objects behind the asyncio.locks [2] module try to keep the same interface as the threading one [3]. Therefore, to modify the asyncio implementation would need first a change in the threading interface.
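(To make the proposal concrete outside asyncio, here is a minimal threading-flavoured sketch of an Event whose set() can carry an exception. This is a hypothetical API, not the stdlib threading.Event; the asyncio variant would be analogous, with a coroutine wait().)

```python
import threading

class ExcEvent:
    """Sketch of the proposed API: set() may record an exception,
    and every waiter re-raises it on wake-up.  Hypothetical, not stdlib."""

    def __init__(self):
        self._event = threading.Event()
        self._exc = None

    def set(self, exc=None):
        # Record the optional exception first, then wake all waiters.
        self._exc = exc
        self._event.set()

    def wait(self, timeout=None):
        notified = self._event.wait(timeout)
        if notified and self._exc is not None:
            raise self._exc
        return notified
```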
I was determined to justify that change, but after a bit research, I didn't find any example in other languages such as Java [4], C# [5] or C++ [6] allowing you to send an exception as a signal value to wake up the sleeping threads. Is that enough to give up? I'm still reticent; I believe that this simple change in the interface can help reduce the complexity of handling errors in some scenarios. I would like to gather more ideas, thoughts, and comments from you about how I can still justify this change ... Thanks, [1] https://github.com/pfreixes/aiohttp/blob/throttle_dns_requests/aiohttp/connector.py#L678 [2] https://docs.python.org/3/library/asyncio-sync.html [3] https://docs.python.org/3/library/threading.html [4] https://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#notifyAll() [5] https://msdn.microsoft.com/en-us/library/system.threading.monitor.pulseall.aspx [6] http://en.cppreference.com/w/cpp/thread/condition_variable/notify_all -- --pau From ethan at stoneleaf.us Tue Jul 18 17:58:33 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 18 Jul 2017 14:58:33 -0700 Subject: [Python-ideas] a new namedtuple In-Reply-To: References: <596D4FF6.5080908@stoneleaf.us> <596E2FC6.8020007@stoneleaf.us> Message-ID: <596E8489.2050705@stoneleaf.us> On 07/18/2017 09:09 AM, Guido van Rossum wrote: > On Tue, Jul 18, 2017 at 8:56 AM, Ethan Furman wrote: >> I certainly don't expect the signature to change, but why is using a metaclass out? The use (or not) of a metaclass >> /is/ an implementation detail. > > It is until you try to subclass with another metaclass -- then you have a metaclass conflict. If the namedtuple had no > metaclass this would not be a conflict. (This is one reason to love class decorators.) Ah, so metaclasses are leaky implementation details. Makes sense. -- ~Ethan~ From jimjjewett at gmail.com Tue Jul 18 18:16:26 2017 From: jimjjewett at gmail.com (Jim J.
Jewett) Date: Tue, 18 Jul 2017 18:16:26 -0400 Subject: [Python-ideas] namedtuple with ordereddict Message-ID: Given that (1) dicts now always pay the price for ordering (2) namedtuple is being accelerated is there any reason not to simply define it as a view on a dict, or at least as a limited proxy to one? Then constructing a specific instance from the arguments used to create it could be as simple as keeping a reference to the temporary created to pass those arguments... -jJ From greg.ewing at canterbury.ac.nz Tue Jul 18 18:18:13 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 19 Jul 2017 10:18:13 +1200 Subject: [Python-ideas] a new namedtuple In-Reply-To: <596E2FC6.8020007@stoneleaf.us> References: <596D4FF6.5080908@stoneleaf.us> <596E2FC6.8020007@stoneleaf.us> Message-ID: <596E8925.5030907@canterbury.ac.nz> Ethan Furman wrote: > I certainly don't expect the signature to change, but why is using a > metaclass out? The use (or not) of a metaclass /is/ an implementation > detail. For me, the main benefit of using a metaclass would be that it enables using normal class declaration syntax to define a namedtuple. That's not just an implementation detail! -- Greg From njs at pobox.com Tue Jul 18 18:24:59 2017 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 18 Jul 2017 15:24:59 -0700 Subject: [Python-ideas] Enabling Event.set to notify all waiters with an exception In-Reply-To: References: Message-ID: On Tue, Jul 18, 2017 at 2:33 PM, Pau Freixes wrote: > Morning guys, (Not everyone here is a guy.) > I came across with that idea trying to resolve a typical dogpile > pattern [1], having many DNS calls to the same domain because of a > miss in a DNS cache. > > The usage of the the set either to notify that the waiters could be > awake and get the result from the cache or use it to notify that > something was wrong helped me to reduce the complexity of the code. 
> Just as an example: > > ``` > if key in throttle_dns_events: > yield from throttle_dns_events[key].wait() > else: > throttle_dns_events[key] = Event(loop=loop) > try: > addrs = yield from \ > resolver.resolve(host, port, family=family) > cached_hosts.add(key, addrs) > throttle_dns_events[key].set() > except Exception as e: > # any DNS exception, independently of the implementation > # is set for the waiters to raise the same exception. > throttle_dns_events[key].set(exc=e) > raise > finally: > throttle_dns_events.pop(key) > ``` > > Any error caught by the locker will be broadcasted to the waiters. For > example, a invalid hostname. > > I tried to open a PR to the CPython implementation, and they claim > that the current interface of all of the locks objects behind the > asyncio.locks [2] module try to keep the same interface as the > threading one [3]. Therefore, to modify the asyncio implementation > would need first a change in the threading interface. > > I was determined to justify that change, but after a bit research, I > didn't find any example in other languages such as Java [4], C# [5] or > C++ [6] allowing you to send an exception as a signal value to wake > up the sleeping threads. 'Event' is designed as a lowish-level primitive: the idea is that it purely provides the operation of "waiting for something", and then you can compose it with other data structures to build whatever higher-level semantics you need. From this point of view, it doesn't make much sense to add features like exception throwing -- that would make it more useful for some particular cases, but add overhead that others don't want or need. In this case, don't you want to cache an error return as well, anyway? It sounds like you're reinventing the idea of a Future, which is intended as a multi-reader eventually-arriving value-or-error -- exactly what you want here. 
So it seems like you could just write: # Conceptually correct, but subtly broken due to asyncio quirks if key not in cache: cache[key] = asyncio.ensure_future(resolver.resolve(...)) return await cache[key] BUT, unfortunately, this turns out to be really broken when combined with asyncio's cancellation feature, so you shouldn't do this :-(. When using asyncio, you basically need to make sure to never await any given Future more than once. Maybe adding a shield() inside the await is the right solution? The downside is that you actually do want to propagate the cancellation into the resolution Task, just... only if *all* the callers are cancelled *and* only if you can make sure that the cancellation is not cached. It's quite tricky actually! But I don't think adding exception-throwing functionality to Event() is the right solution :-) -n -- Nathaniel J. Smith -- https://vorpus.org From greg.ewing at canterbury.ac.nz Tue Jul 18 18:33:22 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 19 Jul 2017 10:33:22 +1200 Subject: [Python-ideas] namedtuple with ordereddict In-Reply-To: References: Message-ID: <596E8CB2.8080008@canterbury.ac.nz> Jim J. Jewett wrote: > is there any reason not to simply define it as a view on a dict, or at > least as a limited proxy to one? Some valuable characteristics of namedtuples as they are now: * Instances are very lightweight * Access by index is fast * Can be used as a dict key All of those would be lost if namedtuple became a dict view. -- Greg From jimjjewett at gmail.com Tue Jul 18 18:35:11 2017 From: jimjjewett at gmail.com (Jim J. Jewett) Date: Tue, 18 Jul 2017 18:35:11 -0400 Subject: [Python-ideas] Alternative Unicode implementations (NSString/NSMutableString) Message-ID: Ronald Oussoren came up with a concrete use case for wanting the interpreter to consider something a string, even if it isn't implemented with the default datastructure. 
In https://mail.python.org/pipermail/python-ideas/2017-July/046407.html he writes: The reason I need to subclass str: in PyObjC I use a subclass of str to represent Objective-C strings (NSString/NSMutableString), and I need to keep track of the original value; mostly because there are some Objective-C APIs that use object identity. The worst part is that fully initialising the PyUnicodeObject fields often isn't necessary as a lot of Objective-C strings aren't used as strings in Python code. The PyUnicodeObject (via its leading PyASCIIObject member) currently uses 7 flag bits including 2 for kind. Would it be worth adding an 8th bit to indicate that string is a virtual subclass, and that the internals should not be touched directly? (This would require changing some of the macros; at the time of PEP 393 Martin ruled it YAGNI ... but is this something that might reasonably be reconsidered, if someone did the work. Which I am considering, but not committing to.) -jJ From guido at python.org Tue Jul 18 19:20:23 2017 From: guido at python.org (Guido van Rossum) Date: Tue, 18 Jul 2017 16:20:23 -0700 Subject: [Python-ideas] a new namedtuple In-Reply-To: <596E8925.5030907@canterbury.ac.nz> References: <596D4FF6.5080908@stoneleaf.us> <596E2FC6.8020007@stoneleaf.us> <596E8925.5030907@canterbury.ac.nz> Message-ID: On Tue, Jul 18, 2017 at 3:18 PM, Greg Ewing wrote: > Ethan Furman wrote: > >> I certainly don't expect the signature to change, but why is using a >> metaclass out? The use (or not) of a metaclass /is/ an implementation >> detail. >> > > For me, the main benefit of using a metaclass would be that > it enables using normal class declaration syntax to define a > namedtuple. That's not just an implementation detail! Newer versions of the typing module do this: https://docs.python.org/3/library/typing.html#typing.NamedTuple (and indeed it's done with a metaclass).
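(For illustration, that class-based spelling looks roughly like this -- the names are made up, and it needs a sufficiently recent typing module:)

```python
from typing import NamedTuple

class Point(NamedTuple):
    x: int
    y: int = 0  # the class form also supports defaults

p = Point(3)
print(p)            # Point(x=3, y=0)
print(p == (3, 0))  # still a real tuple: True
```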
-- --Guido van Rossum (python.org/~guido ) -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Tue Jul 18 19:40:46 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 19 Jul 2017 01:40:46 +0200 Subject: [Python-ideas] Alternative Unicode implementations (NSString/NSMutableString) In-Reply-To: References: Message-ID: Supporting a new kind of string storage would require a lot of effort. There is a lot of C code specialized for each Unicode kind. Victor On 19 Jul 2017 at 12:43 AM, "Jim J. Jewett" wrote: > Ronald Oussoren came up with a concrete use case for wanting the > interpreter to consider something a string, even if it isn't > implemented with the default datastructure. > > In https://mail.python.org/pipermail/python-ideas/2017-July/046407.html > he writes: > > The reason I need to subclass str: in PyObjC I use > a subclass of str to represent Objective-C strings > (NSString/NSMutableString), and I need to keep track > of the original value; mostly because there are some > Objective-C APIs that use object identity. The worst > part is that fully initialising the PyUnicodeObject fields > often isn't necessary as a lot of Objective-C strings > aren't used as strings in Python code. > > The PyUnicodeObject (via its leading PyASCIIObject member) currently > uses 7 flag bits including 2 for kind. Would it be worth adding an > 8th bit to indicate that string is a virtual subclass, and that the > internals should not be touched directly? (This would require > changing some of the macros; at the time of PEP 393 Martin ruled > it YAGNI ... but is this something that might reasonably be reconsidered, > if someone did the work. Which I am considering, but not committing > to.)
> > -jJ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimjjewett at gmail.com Tue Jul 18 21:13:11 2017 From: jimjjewett at gmail.com (Jim J. Jewett) Date: Tue, 18 Jul 2017 21:13:11 -0400 Subject: [Python-ideas] between block and function [was: Custom Code Folding: Standardized Rules and Syntax?] Message-ID: There have been times when I wanted to group portions of a module, class, or function. Giving names to each of those groupings would be useful, and would be appropriate for python-level changes. That said, it would probably need more proof in the wild first, so the first step would be getting support in an editor (such as PyCharm), and the second would be an informational PEP on how to standardize the practice, like PEP 257 does for docstrings. https://www.python.org/dev/peps/pep-0257/ For these steps, you would probably have to use a comment or string convention, such as #{{{ or ":(group X ", though I suppose an external file (similar to the stub files for typing) is also possible. If these take off enough that the line noise gets annoying, that will prove the need, and getting support from python itself will be a lot easier. Example code :group C class C1: :group emulateFoo ... :group manageBar def xx(self, bar, name, val=None): :group setupBar ... :group actualXX class C2: ... :group D From steve at pearwood.info Tue Jul 18 21:27:02 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 19 Jul 2017 11:27:02 +1000 Subject: [Python-ideas] namedtuple with ordereddict In-Reply-To: References: Message-ID: <20170719012702.GD3149@ando.pearwood.info> On Tue, Jul 18, 2017 at 06:16:26PM -0400, Jim J. 
Jewett wrote: > Given that > > (1) dicts now always pay the price for ordering > (2) namedtuple is being accelerated > > is there any reason not to simply define it as a view on a dict, or at > least as a limited proxy to one? Tuples are much more memory efficient than dicts, they support lookup by index, and you'll break a whole lot of code that treats namedtuples as tuples and performs tuple operations on them. For instance, tuple concatenation. > Then constructing a specific instance from the arguments used to > create it could be as simple as keeping a reference to the temporary > created to pass those arguments... The bottleneck isn't creating the instances themselves, the expensive part is calling namedtuple() to generate the named tuple CLASS itself. Creating the instances themselves should be fast, they're just tuples. -- Steve From ncoghlan at gmail.com Tue Jul 18 21:51:08 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 19 Jul 2017 11:51:08 +1000 Subject: [Python-ideas] pathlib.Path should handle Pythonpath or package root In-Reply-To: References: Message-ID: On 19 July 2017 at 06:40, George Fischhof wrote: > I think yes ;-) > I would like to use (or I think it would be good to use) something like > pathlib.Path(package_root) > so I could use > > importlib.import_module(pathlib.Path(package_root) / plugins / plugin_name) No, as that's fundamentally incompatible with how Python's import system works - the filesystem is *a* way of representing package namespacing, but far from the only way. Managing the import state also has nothing whatsoever to do with pathlib. That said, the idea of better encapsulating the import state so we can more readily have constrained "import engines" *is* a reasonable one, it just runs into significant practical problems related to the handling of transitive imports in both Python modules and (especially) extension modules. 
The last serious attempt at pursuing something like that is documented in PEP 406, "Improved Encapsulation of Import State": https://www.python.org/dev/peps/pep-0406/ Unfortunately, the main outcome of Greg Slodkowicz's GSoC work on the idea was to conclude that that particular approach wasn't viable due to the fact that import system plugins at that time were pretty much required to directly manipulate global state, which ran directly counter to the goal of encapsulation. However, we also haven't had anyone seriously revisit the idea since the updated import plugin API was defined in PEP 451 - that deliberately moved a lot of the global state management out of the plugins and into the import system, so it should be more amenable to an "import engine" style approach to state encapsulation. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Tue Jul 18 21:52:17 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 19 Jul 2017 11:52:17 +1000 Subject: [Python-ideas] between block and function [was: Custom Code Folding: Standardized Rules and Syntax?] In-Reply-To: References: Message-ID: <20170719015217.GE3149@ando.pearwood.info> On Tue, Jul 18, 2017 at 09:13:11PM -0400, Jim J. Jewett wrote: > There have been times when I wanted to group portions of a module, > class, or function. Giving names to each of those groupings would be > useful, and would be appropriate for python-level changes. I don't know about grouping parts of a class or function, but I've certainly often wanted something in between the single file module and the multiple file package. As far as grouping parts of a class, we already have ways to do that: - parts of a class are called "methods" :-) - more usefully, you can use mixins (or traits) and multiple inheritance to separate related code into their own classes - or forego inheritance at all and use composition.
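(A tiny sketch of the mixin route -- the class names here are purely illustrative:)

```python
import json

class JSONExportMixin:
    """One grouping of behaviour, kept in its own class."""
    def to_json(self):
        return json.dumps(self.__dict__)

class EqualityMixin:
    """Another grouping: compare instances by their attributes."""
    def __eq__(self, other):
        return type(other) is type(self) and self.__dict__ == other.__dict__

class Config(JSONExportMixin, EqualityMixin):
    def __init__(self, host, port):
        self.host = host
        self.port = port

c = Config("localhost", 8080)
print(c.to_json())  # {"host": "localhost", "port": 8080}
```

Each mixin groups one concern, and the concrete class just composes them.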
For functions, I think that if you need to group parts of a function, your function is probably too big. If not, you can use nested functions (although they're not as useful as they might be) or refactor. But your third suggestion would be useful to me. I'd really like to follow the Zen: Namespaces are one honking great idea -- let's do more of those! and have a namespace data structure that was functionally like a module but didn't need to be separated out into another file. A sketch of syntax: namespace Spam: # introduces a new block x = 1 def function(a): return a + x class K: ... assert Spam.x == 1 x = 99 assert Spam.function(100) == 101 # not 199 Name resolution inside Spam.function would go: locals -> nonlocals -> namespace -> module globals -> builtins I've played around with using a class statement (to get the indented block) and a decorator to create a namespace. The namespace itself is easy: I just subclass types.ModuleType, and inject names into that. The hard part is changing the functions to search their enclosing namespace first, before searching the module globals. If we had such a construct, then code folding would be given. If we used the existing "class" keyword, then any editor which can fold classes would automatically work. If we introduced a new keyword, then there would be a delay until editors gained support for that new keyword, but eventually the better editors would support code folding them. -- Steve From ncoghlan at gmail.com Tue Jul 18 22:34:19 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 19 Jul 2017 12:34:19 +1000 Subject: [Python-ideas] Alternative Unicode implementations (NSString/NSMutableString) In-Reply-To: References: Message-ID: On 19 July 2017 at 09:40, Victor Stinner wrote: > Supporting a new kind of string storage would require a lot of efforts. > There are a lot of C code specialized for each Unicode kind If I understand the requested flag correctly, it would be to request one of the following: 1. 
*Never* use any of CPython's fast paths, and instead be permanently slow; or 2. Indicate that it's a "lazily rendered" subclass that should hold off on calling PyUnicode_Ready for as long as possible, but still do so when necessary (akin to creating strings via the old Py_UNICODE APIs and then calling PyUnicode_READY on them) Neither of those is exactly straightforward, but I think it has the potential to tie in well with a Rust concept that Armin Ronacher recently pointed out, which is that in addition to their native String type, they also define a *separate* CString type as part of their C FFI layer: https://doc.rust-lang.org/std/ffi/struct.CString.html The Rust example does prompt me to ask whether this might be better modeled as a "PlatformString" data type (essentially a str subclass with an extra void * entry for a pointer to the native object), while the operator.index() precedent prompts me to ask whether or not this might be better handled with a "__platformstr__" protocol, but the basic *idea* of having a clearly defined way of modeling platform-native text strings at least somewhat independently of the core Python data types seems reasonable to me. (If we do go with the "flag bit" option, then it may actually be possible to steal the existing "Py_UNICODE *" pointer at the same time - that way an externally defined string would automatically be handled the same way as any other unready string, and "Py_UNICODE *" would just be a particular example of a platform string type) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ronaldoussoren at mac.com Wed Jul 19 02:07:07 2017 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Wed, 19 Jul 2017 08:07:07 +0200 Subject: [Python-ideas] Alternative Unicode implementations (NSString/NSMutableString) In-Reply-To: References: Message-ID: > On 19 Jul 2017, at 00:35, Jim J. 
Jewett wrote: > > Ronald Oussoren came up with a concrete use case for wanting the > interpreter to consider something a string, even if it isn't > implemented with the default datastructure. > > In https://mail.python.org/pipermail/python-ideas/2017-July/046407.html > he writes: > > The reason I need to subclass str: in PyObjC I use > a subclass of str to represent Objective-C strings > (NSString/NSMutableString), and I need to keep track > of the original value; mostly because there are some > Objective-C APIs that use object identity. The worst > part is that fully initialising the PyUnicodeObject fields > often isn't necessary as a lot of Objective-C strings > aren't used as strings in Python code. > > The PyUnicodeObject (via its leading PyASCIIObject member) currently > uses 7 flag bits including 2 for kind. Would it be worth adding an > 8th bit to indicate that string is a virtual subclass, and that the > internals should not be touched directly? (This would require > changing some of the macros; at the time of PEP 393 Martin ruled > it YAGNI ... but is this something that might reasonably be reconsidered, > if someone did the work. Which I am considering, but not committing > to.) The reason I subclass str is primarily that it isn't possible to be accepted as string-like by the C API otherwise (that is, PyArg_Parse and the like require a PyUnicode_Type instance when the caller asks for a string). Adding a string equivalent of __index__ would most likely be a solution for my use case[1]. Without such a hook it would be nice to be able to postpone moving to the PyUnicode_IS_READY state as long as possible, with a hook to provide the character buffer when the transition happens. That would make it possible to avoid duplicating the string buffer until it is truly needed. Ronald [1] Ignoring backward compatibility concerns on my side and without having fully thought through the consequences.
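(At the Python level, the pattern Ronald describes is roughly the following sketch; the actual Objective-C bridging is elided and the names are invented:)

```python
class ProxiedString(str):
    """Sketch: a str subclass that keeps a reference to the original
    (e.g. NSString) object, so identity-based APIs can still reach it."""

    def __new__(cls, value, original=None):
        self = super().__new__(cls, value)
        self.original = original  # the native object this string proxies
        return self

ns_object = object()  # stand-in for a real NSString
s = ProxiedString("hello", original=ns_object)
print(isinstance(s, str), s.original is ns_object)  # True True
```

Because it really is a str, it is accepted anywhere the C API demands one; the open question above is how much of the str machinery must be eagerly initialised to get there.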
> > -jJ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From pfreixes at gmail.com Wed Jul 19 02:37:07 2017 From: pfreixes at gmail.com (Pau Freixes) Date: Wed, 19 Jul 2017 08:37:07 +0200 Subject: [Python-ideas] Enabling Event.set to notify all waiters with an exception In-Reply-To: References: Message-ID: Yeps, > 'Event' is designed as a lowish-level primitive: the idea is that it > purely provides the operation of "waiting for something", and then you > can compose it with other data structures to build whatever > higher-level semantics you need. From this point of view, it doesn't > make much sense to add features like exception throwing -- that would > make it more useful for some particular cases, but add overhead that > others don't want or need. I do agree; indeed, this was the main rationale that I wrote down. All languages have kept the same interface, and Python shouldn't be different in this. But, on the other side, the change added to the set method is almost negligible, with the advantage of reducing the complexity of the client code that has to handle these situations. Just that, pros and cons. > > In this case, don't you want to cache an error return as well, anyway? Not at all, the error is ephemeral; it is never cached. If an error is produced, perhaps a network outage, it is broadcast to the other waiting clients. Once there are no more waiting clients, the same DNS resolution will have the chance to make the proper query. No error caching. > > It sounds like you're reinventing the idea of a Future, which is > intended as a multi-reader eventually-arriving value-or-error -- > exactly what you want here.
So it seems like you could just write: > > # Conceptually correct, but subtly broken due to asyncio quirks > if key not in cache: > cache[key] = asyncio.ensure_future(resolver.resolve(...)) > return await cache[key] Not at all; the idea is to take advantage of the Event principle, having a set of Futures waiting to be awakened and returning either a value or an exception. > > BUT, unfortunately, this turns out to be really broken when combined > with asyncio's cancellation feature, so you shouldn't do this :-(. > When using asyncio, you basically need to make sure to never await any > given Future more than once. > > Maybe adding a shield() inside the await is the right solution? The > downside is that you actually do want to propagate the cancellation > into the resolution Task, just... only if *all* the callers are > cancelled *and* only if you can make sure that the cancellation is not > cached. It's quite tricky actually! I've realized that I must protect the resolve() with a shield() for the caller that holds the event. Otherwise, the other waiters will have the chance to get a CancelledError exception. Regarding propagating the cancellation if and only if *all* callers are cancelled, IMHO that falls on the side of a complex problem, and the solution might be to do nothing. > > But I don't think adding exception-throwing functionality to Event() > is the right solution :-) Then I will be forced to make the code stateful, ending up with a more complex solution compared with the code that you might get using the Event() -- --pau From storchaka at gmail.com Wed Jul 19 02:48:15 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 19 Jul 2017 09:48:15 +0300 Subject: [Python-ideas] namedtuple with ordereddict In-Reply-To: <596E8CB2.8080008@canterbury.ac.nz> References: <596E8CB2.8080008@canterbury.ac.nz> Message-ID: 19.07.17 01:33, Greg Ewing wrote: > Jim J.
Jewett wrote: >> is there any reason not to simply define it as a view on a dict, or at >> least as a limited proxy to one? > > Some valuable characteristics of namedtuples as they are now: > > * Instances are very lightweight > * Access by index is fast > * Can be used as a dict key * Are tuple subclasses. This is important for compatibility with tuples, because namedtuples usually are used as replacements for tuples. From victor.stinner at gmail.com Wed Jul 19 07:38:18 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 19 Jul 2017 13:38:18 +0200 Subject: [Python-ideas] Alternative Unicode implementations (NSString/NSMutableString) In-Reply-To: References: Message-ID: 2017-07-19 4:34 GMT+02:00 Nick Coghlan : > 2. Indicate that it's a "lazily rendered" subclass that should hold > off on calling PyUnicode_Ready for as long as possible, but still do > so when necessary (akin to creating strings via the old Py_UNICODE > APIs and then calling PyUnicode_READY on them) Py_UNICODE is deprecated and should go away in the long term. Serhiy Storchaka started to deprecate APIs using Py_UNICODE. We call PyUnicode_READY() *everywhere* to cast "legacy string" to the new compact format *as soon as possible*. So I don't think that you should abuse this machinery :-( Victor From george at fischhof.hu Wed Jul 19 08:01:37 2017 From: george at fischhof.hu (George Fischhof) Date: Wed, 19 Jul 2017 14:01:37 +0200 Subject: [Python-ideas] pathlib.Path should handle Pythonpath or package root In-Reply-To: References: Message-ID: Sorry, it was easy to misunderstand. I do not want to touch the import system. I just would like to access the package root (of my current script) - practically as a Path object, if the package resides in the file system. Parent is not enough, because the location of the actually running script can vary.
BR, George 2017-07-19 3:51 GMT+02:00 Nick Coghlan : > On 19 July 2017 at 06:40, George Fischhof wrote: > > I think yes ;-) > > I would like to use (or I think it would be good to use) something like > > pathlib.Path(package_root) > > so I could use > > > > importlib.import_module(pathlib.Path(package_root) / plugins / > plugin_name) > > No, as that's fundamentally incompatible with how Python's import > system works - the filesystem is *a* way of representing package > namespacing, but far from the only way. Managing the import state also > has nothing whatsoever to do with pathlib. > > That said, the idea of better encapsulating the import state so we can > more readily have constrained "import engines" *is* a reasonable one, > it just runs into significant practical problems related to the > handling of transitive imports in both Python modules and (especially) > extension modules. > > The last serious attempt at pursuing something like that is documented > in PEP 406, "Improved Encapsulation of Import State": > https://www.python.org/dev/peps/pep-0406/ > > Unfortunately, the main outcome of Greg Slodkowicz's GSoC work on the > idea was to conclude that that particular approach wasn't viable due > to the fact that import system plugins at that time were pretty much > required to directly manipulate global state, which ran directly > counter to the goal of encapsulation. > > However, we also haven't had anyone seriously revisit the idea since > the updated import plugin API was defined in PEP 451 - that > deliberately moved a lot of the global state management out of the > plugins and into the import system, so it should be more amenable to > an "import engine" style approach to state encapsulation. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From g.rodola at gmail.com  Wed Jul 19 10:28:37 2017
From: g.rodola at gmail.com (Giampaolo Rodola')
Date: Wed, 19 Jul 2017 16:28:37 +0200
Subject: [Python-ideas] namedtuple with ordereddict
In-Reply-To: <20170719012702.GD3149@ando.pearwood.info>
References: <20170719012702.GD3149@ando.pearwood.info>
Message-ID:

On Wed, Jul 19, 2017 at 3:27 AM, Steven D'Aprano wrote:
> On Tue, Jul 18, 2017 at 06:16:26PM -0400, Jim J. Jewett wrote:
> > Then constructing a specific instance from the arguments used to
> > create it could be as simple as keeping a reference to the temporary
> > created to pass those arguments...
>
> The bottleneck isn't creating the instances themselves, the expensive
> part is calling namedtuple() to generate the named tuple CLASS itself.
> Creating the instances themselves should be fast, they're just tuples.

Still much slower (-4.3x) than plain tuples though:

$ python3.7 -m timeit -s "import collections; Point = collections.namedtuple('Point', ('x', 'y'));" "Point(5, 11)"
1000000 loops, best of 5: 313 nsec per loop

$ python3.7 -m timeit "tuple((5, 11))"
5000000 loops, best of 5: 71.4 nsec per loop

Giampaolo - http://grodola.blogspot.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncoghlan at gmail.com  Wed Jul 19 10:53:23 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 20 Jul 2017 00:53:23 +1000
Subject: [Python-ideas] pathlib.Path should handle Pythonpath or package root
In-Reply-To:
References:
Message-ID:

On 19 July 2017 at 22:01, George Fischhof wrote:
> Sorry, that was easy to misunderstand.
> I do not want to touch the import.
>
> I just would like to access the package root (of my current script) --
> practically, as a Path object, if the package resides in the file system.
> Parent is not enough, because the location of the actually running script can vary.

If you just want to do relative imports, then that's the purpose of the explicit relative import syntax ("from . import sibling" etc).
You can also do dynamic relative imports by passing "__package__" as the second argument to importlib.import_module.

If you're looking to read *data* files relative to the current module, then the API you're likely after is `pkgutil.get_data`: https://docs.python.org/3/library/pkgutil.html#pkgutil.get_data

And if you're looking to access files relative to *__main__*, then the metadata attribute you want is __main__.__file__.

So it's currently still really unclear what you actually mean by "package root", and what related metadata you see as currently being missing at runtime.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From tim.peters at gmail.com  Wed Jul 19 11:20:38 2017
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 19 Jul 2017 10:20:38 -0500
Subject: [Python-ideas] namedtuple with ordereddict
In-Reply-To:
References: <20170719012702.GD3149@ando.pearwood.info>
Message-ID:

[Giampaolo Rodola']
> Still much slower (-4.3x) than plain tuples though:
>
> $ python3.7 -m timeit -s "import collections; Point =
> collections.namedtuple('Point', ('x', 'y'));" "Point(5, 11)"
> 1000000 loops, best of 5: 313 nsec per loop
>
> $ python3.7 -m timeit "tuple((5, 11))"
> 5000000 loops, best of 5: 71.4 nsec per loop

I believe this was pointed out earlier: in the second case,

1. (5, 11) is built at _compile_ time, so at runtime it's only measuring a LOAD_CONST to fetch it from the code's constants block.

2. The tuple() constructor does close to nothing when passed a tuple: it just increments the argument's reference count and returns it.

>>> t = (1, 2)
>>> tuple(t) is t
True

In other words, the second case isn't measuring tuple _creation_ time in any sense: it's just measuring how long it takes to look up the name "tuple" and increment the refcount on a tuple that was created at compile time.
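Tim's two points are easy to check directly; a small self-contained sketch (illustrative only -- the absolute timings will vary by machine and build, so none are asserted here):

```python
import timeit
from collections import namedtuple

# Point 2: tuple() passed a tuple returns the very same object --
# it only bumps the reference count.
t = (5, 11)
assert tuple(t) is t

# A fairer instantiation comparison forces both sides to build a
# tuple at runtime, from a pre-existing list:
Point = namedtuple("Point", ("x", "y"))
x = [5, 11]
plain = timeit.timeit("tuple(x)", globals={"x": x}, number=100_000)
named = timeit.timeit("Point(*x)", globals={"Point": Point, "x": x}, number=100_000)
print(f"tuple(x): {plain:.4f}s   Point(*x): {named:.4f}s")
```

On CPython the namedtuple line typically comes out a few times slower, in line with the corrected numbers quoted in the follow-up.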
From g.rodola at gmail.com  Wed Jul 19 12:10:05 2017
From: g.rodola at gmail.com (Giampaolo Rodola')
Date: Wed, 19 Jul 2017 18:10:05 +0200
Subject: [Python-ideas] namedtuple with ordereddict
In-Reply-To:
References: <20170719012702.GD3149@ando.pearwood.info>
Message-ID:

On Wed, Jul 19, 2017 at 5:20 PM, Tim Peters wrote:
> [Giampaolo Rodola']
> > Still much slower (-4.3x) than plain tuples though:
> >
> > $ python3.7 -m timeit -s "import collections; Point =
> > collections.namedtuple('Point', ('x', 'y'));" "Point(5, 11)"
> > 1000000 loops, best of 5: 313 nsec per loop
> >
> > $ python3.7 -m timeit "tuple((5, 11))"
> > 5000000 loops, best of 5: 71.4 nsec per loop
>
> I believe this was pointed out earlier: in the second case,
>
> 1. (5, 11) is built at _compile_ time, so at runtime it's only
> measuring a LOAD_CONST to fetch it from the code's constants block.
>
> 2. The tuple() constructor does close to nothing when passed a tuple:
> it just increments the argument's reference count and returns it.
>
> >>> t = (1, 2)
> >>> tuple(t) is t
> True
>
> In other words, the second case isn't measuring tuple _creation_ time
> in any sense: it's just measuring how long it takes to look up the
> name "tuple" and increment the refcount on a tuple that was created at
> compile time.

Oh right, I didn't realize that, sorry. Should have been something like this instead:

$ python3.7 -m timeit -s "import collections; Point = collections.namedtuple('Point', ('x', 'y')); x = [5, 1]" "Point(*x)"
1000000 loops, best of 5: 311 nsec per loop

$ python3.7 -m timeit -s "x = [5, 1]" "tuple(x)"
5000000 loops, best of 5: 89.8 nsec per loop

--
Giampaolo - http://grodola.blogspot.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mehaase at gmail.com  Wed Jul 19 14:28:28 2017
From: mehaase at gmail.com (Mark E. Haase)
Date: Wed, 19 Jul 2017 14:28:28 -0400
Subject: [Python-ideas] Enabling Event.set to notify all waiters with an exception
In-Reply-To:
References:
Message-ID:

On Wed, Jul 19, 2017 at 2:37 AM, Pau Freixes wrote:
> Not at all; the idea is to take advantage of the Event principle:
> having a set of Futures waiting to be awakened and returning either a
> value or an exception.

That's not the principle of Event. You are describing a Future.

> Regarding propagating the cancellation if and only if *all* callers
> are cancelled: IMHO that falls on the side of a complex problem, and
> the solution might be to do nothing.

If you're willing to do nothing when all callers cancel, then the Future solution that Nathaniel posted should work for you (replacing ensure_future() with shield()). Have you tried it? Do you have a specific objection to it?

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From alexander.belopolsky at gmail.com  Wed Jul 19 14:35:08 2017
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 19 Jul 2017 14:35:08 -0400
Subject: [Python-ideas] namedtuple with ordereddict
In-Reply-To:
References: <20170719012702.GD3149@ando.pearwood.info>
Message-ID:

On Wed, Jul 19, 2017 at 12:10 PM, Giampaolo Rodola' wrote:
> Should have been something like this instead:
>
> $ python3.7 -m timeit -s "import collections; Point =
> collections.namedtuple('Point', ('x', 'y')); x = [5, 1]" "Point(*x)"
> 1000000 loops, best of 5: 311 nsec per loop
>
> $ python3.7 -m timeit -s "x = [5, 1]" "tuple(x)"
> 5000000 loops, best of 5: 89.8 nsec per loop

This looks like typical python function call overhead. Consider a toy class:

$ cat c.py
class C(tuple):
    def __new__(cls, *items):
        return tuple.__new__(cls, items)

Comparing to a naked tuple, creation of a C instance is more than 3x slower.
$ python3 -m timeit -s "from c import C; x = [1, 2]" "C(*x)"
1000000 loops, best of 3: 0.363 usec per loop

$ python3 -m timeit -s "x = [1, 2]" "tuple(x)"
10000000 loops, best of 3: 0.114 usec per loop

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From njs at pobox.com  Wed Jul 19 14:39:47 2017
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 19 Jul 2017 11:39:47 -0700
Subject: [Python-ideas] Enabling Event.set to notify all waiters with an exception
In-Reply-To:
References:
Message-ID:

On Jul 18, 2017 11:37 PM, "Pau Freixes" wrote:

Yeps,

> 'Event' is designed as a lowish-level primitive: the idea is that it
> purely provides the operation of "waiting for something", and then you
> can compose it with other data structures to build whatever
> higher-level semantics you need.
[...]
> But I don't think adding exception-throwing functionality to Event()
> is the right solution :-)

Then I will be forced to make the code stateful, ending up with a more complex solution than the code that I might get using the Event()

Not really -- the point of the first part of my message is that if you really want a Future/Event hybrid that can raise an error from 'wait', then python gives you the tools to implement this yourself, and then you can use it however you like. Something like:

    class ErrorfulOneShotEvent:
        def __init__(self):
            self._event = asyncio.Event()
            self._error = None

        def set(self, error=None):
            self._error = error
            self._event.set()

        async def wait(self):
            await self._event.wait()
            if self._error is not None:
                raise self._error

...and you're good to go.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tjreedy at udel.edu  Wed Jul 19 15:06:08 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 19 Jul 2017 15:06:08 -0400
Subject: [Python-ideas] namedtuple with ordereddict
In-Reply-To:
References: <20170719012702.GD3149@ando.pearwood.info>
Message-ID:

On 7/19/2017 12:10 PM, Giampaolo Rodola' wrote:
> On Wed, Jul 19, 2017 at 5:20 PM, Tim Peters wrote:
>
> > [Giampaolo Rodola']
> > > Still much slower (-4.3x) than plain tuples though:
> > >
> > > $ python3.7 -m timeit -s "import collections; Point =
> > > collections.namedtuple('Point', ('x', 'y'));" "Point(5, 11)"
> > > 1000000 loops, best of 5: 313 nsec per loop
> > >
> > > $ python3.7 -m timeit "tuple((5, 11))"
> > > 5000000 loops, best of 5: 71.4 nsec per loop
> >
> > I believe this was pointed out earlier: in the second case,
> >
> > 1. (5, 11) is built at _compile_ time, so at runtime it's only
> > measuring a LOAD_CONST to fetch it from the code's constants block.
> >
> > 2. The tuple() constructor does close to nothing when passed a tuple:
> > it just increments the argument's reference count and returns it.
> >
> > >>> t = (1, 2)
> > >>> tuple(t) is t
> > True
> >
> > In other words, the second case isn't measuring tuple _creation_ time
> > in any sense: it's just measuring how long it takes to look up the
> > name "tuple" and increment the refcount on a tuple that was created at
> > compile time.
>
> Oh right, I didn't realize that, sorry. Should have been something like
> this instead:
>
> $ python3.7 -m timeit -s "import collections; Point =
> collections.namedtuple('Point', ('x', 'y')); x = [5, 1]" "Point(*x)"
> 1000000 loops, best of 5: 311 nsec per loop
>
> $ python3.7 -m timeit -s "x = [5, 1]" "tuple(x)"
> 5000000 loops, best of 5: 89.8 nsec per loop

I think "x,y = 5, 1" in the setup and "Point(x,y)", and "(x,y)" better model real situations.
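The "x,y = 5, 1" setup indeed sidesteps the constant-folding trap Tim described; a quick, CPython-specific check of the difference between the two statements being timed:

```python
def constant():
    return (5, 11)   # folded into the code object's constants at compile time

def built(x, y):
    return (x, y)    # a BUILD_TUPLE executed on every call

# The literal tuple is pre-built and stored on the code object...
assert (5, 11) in constant.__code__.co_consts
assert constant() is constant()   # the same cached object every time

# ...whereas (x, y) has to be constructed at runtime:
assert (5, 11) not in built.__code__.co_consts
assert built(5, 11) == constant()
```

So timing "(x,y)" really does measure tuple construction, while timing "(5, 11)" only measures a constant load.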
On my Win10 machine with 3.7 with debug win32 build (half as fast as without debug), I get

F:\dev\3x>python -m timeit -s "import collections as c; Point = c.namedtuple('Point',('x','y')); x,y=5,1", "Point(x,y)"
200000 loops, best of 5: 1.86 usec per loop

F:\dev\3x>python -m timeit -s "x,y=5,1", "(x,y)"
2000000 loops, best of 5: 156 nsec per loop

If one starts with a tuple, then the Point call is pure extra overhead. If one does start with a list, I get 1.85 usec and 419 nsec.

> --
> Giampaolo - http://grodola.blogspot.com
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

--
Terry Jan Reedy

From pfreixes at gmail.com  Wed Jul 19 15:51:12 2017
From: pfreixes at gmail.com (Pau Freixes)
Date: Wed, 19 Jul 2017 21:51:12 +0200
Subject: [Python-ideas] Enabling Event.set to notify all waiters with an exception
In-Reply-To:
References:
Message-ID:

Yes, that's the behaviour I was looking for. The code is simple enough. Using the shield(), as has been proposed, to protect against the cancellation, plus this pattern, it will work.

Thanks for the feedback and the proposal; I will abandon the idea of modifying the set method.

Cheers,

El 19/07/2017 20:39, "Nathaniel Smith" escribió:

On Jul 18, 2017 11:37 PM, "Pau Freixes" wrote:

Yeps,

> 'Event' is designed as a lowish-level primitive: the idea is that it
> purely provides the operation of "waiting for something", and then you
> can compose it with other data structures to build whatever
> higher-level semantics you need.
[...]
> But I don't think adding exception-throwing functionality to Event()
> is the right solution :-)

Then I will be forced to make the code stateful, ending up with a more complex solution than the code that I might get using the Event()

Not really -- the point of the first part of my message is that if you really want a Future/Event hybrid that can raise an error from 'wait', then python gives you the tools to implement this yourself, and then you can use it however you like. Something like:

    class ErrorfulOneShotEvent:
        def __init__(self):
            self._event = asyncio.Event()
            self._error = None

        def set(self, error=None):
            self._error = error
            self._event.set()

        async def wait(self):
            await self._event.wait()
            if self._error is not None:
                raise self._error

...and you're good to go.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From g.rodola at gmail.com  Wed Jul 19 20:14:21 2017
From: g.rodola at gmail.com (Giampaolo Rodola')
Date: Thu, 20 Jul 2017 02:14:21 +0200
Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple]
Message-ID:

On Tue, Jul 18, 2017 at 6:31 AM, Guido van Rossum wrote:
> On Mon, Jul 17, 2017 at 6:25 PM, Eric Snow wrote:
>
>> On Mon, Jul 17, 2017 at 6:01 PM, Ethan Furman wrote:
>> > Guido has decreed that namedtuple shall be reimplemented with speed in
>> > mind.
>>
>> FWIW, I'm sure that any changes to namedtuple will be kept as minimal
>> as possible. Changes would be limited to the underlying
>> implementation, and would not include the namedtuple() signature, or
>> using metaclasses, etc. However, I don't presume to speak for Guido
>> or Raymond. :)
>
> Indeed. I referred people here for discussion of ideas like this:
>
> >>> a = (x=1, y=0)

Thanks for bringing this up, I'm gonna summarize my idea in the form of a PEP-like draft, hoping to collect some feedback.
Proposal
========

Introduction of a new syntax and builtin function to create lightweight namedtuples "on the fly", as in:

>>> (x=10, y=20)
(x=10, y=20)
>>> ntuple(x=10, y=20)
(x=10, y=20)

Motivations
===========

Avoid declaration
-----------------

Other than the startup time cost:
https://mail.python.org/pipermail/python-dev/2017-July/148592.html
...the fact that namedtuples need to be declared upfront implies they mostly end up being used only in public, end-user APIs / functions. For generic functions returning more than one value it would be nice to just do:

    def get_coordinates():
        return (x=10, y=20)

...instead of:

    from collections import namedtuple

    Coordinates = namedtuple('coordinates', ['x', 'y'])

    def get_coordinates():
        return Coordinates(10, 20)

Declaration also has the drawback of unnecessarily polluting the module API with an object (Coordinates) which is rarely needed. AFAIU namedtuple was designed this way for efficiency of the pure-Python implementation currently in place and for serialization purposes (e.g. pickle), but I may be missing something else. Generally namedtuples are declared in a private module, imported from elsewhere, and never exposed in the main namespace, which is kind of annoying. In the case of one-module scripts it's not uncommon to add a leading underscore, which makes __repr__ uglier. To me, this suggests that the factory function should have been a first-class function instead.
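Such a first-class, caching factory can be prototyped today on top of collections.namedtuple; the `ntuple` name and the cache-by-field-names policy below are assumptions of this sketch, not settled semantics:

```python
from collections import namedtuple

_cache = {}

def ntuple(**fields):
    """Build a lightweight named tuple on the fly.

    Types are cached by their (ordered) field names, so repeated calls
    with the same field names reuse a single class.
    """
    names = tuple(fields)
    cls = _cache.get(names)
    if cls is None:
        cls = _cache[names] = namedtuple("ntuple", names)
    return cls(**fields)

p = ntuple(x=10, y=20)
q = ntuple(x=1, y=2)
assert (p.x, p.y) == (10, 20)
assert type(p) is type(q)   # same field names -> same cached type
assert p == (10, 20)        # still an ordinary tuple
```

This captures the convenience but not the speed: the proposal's point is precisely that a builtin could avoid both the declaration and the per-call overhead.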
Speed
------

Other than the startup declaration overhead, a namedtuple is slower than a tuple or a C structseq in almost any aspect:

- Declaration (50x slower than cnamedtuple):

$ python3.7 -m timeit -s "from collections import namedtuple" \
    "namedtuple('Point', ('x', 'y'))"
1000 loops, best of 5: 264 usec per loop

$ python3.7 -m timeit -s "from cnamedtuple import namedtuple" \
    "namedtuple('Point', ('x', 'y'))"
50000 loops, best of 5: 5.27 usec per loop

- Instantiation (3.5x slower than tuple):

$ python3.7 -m timeit -s "import collections; Point = collections.namedtuple('Point', ('x', 'y')); x = [1, 2]" "Point(*x)"
1000000 loops, best of 5: 310 nsec per loop

$ python3.7 -m timeit -s "x = [1, 2]" "tuple(x)"
5000000 loops, best of 5: 88 nsec per loop

- Unpacking (2.8x slower than tuple):

$ python3.7 -m timeit -s "import collections; p = collections.namedtuple( \
    'Point', ('x', 'y'))(5, 11)" "x, y = p"
5000000 loops, best of 5: 41.9 nsec per loop

$ python3.7 -m timeit -s "p = (5, 11)" "x, y = p"
20000000 loops, best of 5: 14.8 nsec per loop

- Field access by name (1.9x slower than structseq and cnamedtuple):

$ python3.7 -m timeit -s "from collections import namedtuple as nt; \
    p = nt('Point', ('x', 'y'))(5, 11)" "p.x"
5000000 loops, best of 5: 42.7 nsec per loop

$ python3.7 -m timeit -s "from cnamedtuple import namedtuple as nt; \
    p = nt('Point', ('x', 'y'))(5, 11)" "p.x"
10000000 loops, best of 5: 22.5 nsec per loop

$ python3.7 -m timeit -s "import os; p = os.times()" "p.user"
10000000 loops, best of 5: 22.6 nsec per loop

- Field access by index is the same as tuple:

$ python3.7 -m timeit -s "from collections import namedtuple as nt; \
    p = nt('Point', ('x', 'y'))(5, 11)" "p[0]"
10000000 loops, best of 5: 20.3 nsec per loop

$ python3.7 -m timeit -s "p = (5, 11)" "p[0]"
10000000 loops, best of 5: 20.5 nsec per loop

It is being suggested that most of these complaints about speed aren't an issue, but in certain circumstances, such as busy loops, getattr() being 1.9x slower could make a difference, e.g.:
https://github.com/python/cpython/blob/3e2ad8ec61a322370a6fbdfb2209cf74546f5e08/Lib/asyncio/selector_events.py#L523
Same goes for values unpacking.

isinstance()
------------

Probably a minor complaint; I just bring this up because I recently had to do this in psutil's unit tests. Anyway, checking whether something is a namedtuple instance isn't exactly straightforward:
https://stackoverflow.com/a/2166841

Backward compatibility
======================

This is probably the biggest barrier other than the "a C implementation is less maintainable" argument. In order to avoid duplication of functionality it would be great if collections.namedtuple() could remain a (deprecated) factory function using ntuple() internally.

FWIW I tried running the stdlib's unittests against https://github.com/llllllllll/cnamedtuple; I removed the ones about the "_source", "verbose" and "module" arguments and I get a couple of errors about __doc__. I'm not sure about more advanced use cases (subclassing, others...?) but overall it appears pretty doable. The collections.namedtuple() Python wrapper can include the necessary logic to implement the "verbose" and "rename" parameters when they're used. I'm not entirely sure about the implications of the "module" parameter though (Raymond?). _make(), _asdict(), _replace() and the _fields attribute should also be exposed; as for "_source", it appears it can easily be turned into a property, which would also save some memory.

The biggest annoyance is probably fields' __doc__ assignment:
https://github.com/python/cpython/blob/ced36a993fcfd1c76637119d31c03156a8772e11/Lib/selectors.py#L53-L58
...which would require returning a clever class object, slowing down the namedtuple declaration also in case no parameters are passed, but considering that the long-term plan is to replace collections.namedtuple() with ntuple() I consider this acceptable.

Thoughts?
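For reference, the usual workaround from that Stack Overflow answer amounts to duck-typing on the namedtuple-specific attributes; a minimal sketch:

```python
from collections import namedtuple

def is_namedtuple_instance(obj):
    # Heuristic: a tuple subclass whose class carries a _fields tuple of strings.
    return (isinstance(obj, tuple)
            and hasattr(type(obj), "_fields")
            and all(isinstance(f, str) for f in type(obj)._fields))

Point = namedtuple("Point", ("x", "y"))
assert is_namedtuple_instance(Point(1, 2))
assert not is_namedtuple_instance((1, 2))
```

Being a heuristic, it can be fooled by any tuple subclass that happens to define a `_fields` attribute, which is exactly the awkwardness being complained about.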
-- Giampaolo - http://grodola.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.rodola at gmail.com Wed Jul 19 20:28:34 2017 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Thu, 20 Jul 2017 02:28:34 +0200 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: On Thu, Jul 20, 2017 at 2:14 AM, Giampaolo Rodola' wrote > > In case of one module scripts it's not uncommon to add a leading > underscore which makes __repr__ uglier. > Actually forget about this: __repr__ is dictated by the first argument. =) -- Giampaolo - http://grodola.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Jul 19 21:08:27 2017 From: guido at python.org (Guido van Rossum) Date: Wed, 19 Jul 2017 18:08:27 -0700 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: The proposal in your email seems incomplete -- there's two paragraphs on the actual proposal, and the entire rest of your email is motivation. That may be appropriate for a PEP, but while you're working on a proposal it's probably better to focus on clarifying the spec. Regarding that spec, I think there's something missing: given a list (or tuple!) of values, how do you turn it into an 'ntuple'? That seems a common use case, e.g. when taking database results like row_factory in sqlite3. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From python at mrabarnett.plus.com Wed Jul 19 21:20:10 2017 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 20 Jul 2017 02:20:10 +0100 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: <660a04e7-a593-884d-55db-de1f93d79bd0@mrabarnett.plus.com> On 2017-07-20 02:08, Guido van Rossum wrote: > The proposal in your email seems incomplete -- there's two paragraphs on > the actual proposal, and the entire rest of your email is motivation. > That may be appropriate for a PEP, but while you're working on a > proposal it's probably better to focus on clarifying the spec. > > Regarding that spec, I think there's something missing: given a list (or > tuple!) of values, how do you turn it into an 'ntuple'? That seems a > common use case, e.g. when taking database results like row_factory in > sqlite3. > It could borrow from what a dict does: accept a list of 2-tuples, name and value. From shoyer at gmail.com Wed Jul 19 21:21:05 2017 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 19 Jul 2017 18:21:05 -0700 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: On Wed, Jul 19, 2017 at 6:08 PM, Guido van Rossum wrote: > Regarding that spec, I think there's something missing: given a list (or > tuple!) of values, how do you turn it into an 'ntuple'? That seems a common > use case, e.g. when taking database results like row_factory in sqlite3. > One obvious choice is to allow for construction from a dict with **kwargs unpacking. This actually works now that keyword arguments are ordered. This would mean either ntuple(**kwargs) or the possibly too cute (**kwargs) . -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From alexander.belopolsky at gmail.com  Wed Jul 19 21:35:35 2017
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 19 Jul 2017 21:35:35 -0400
Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple]
In-Reply-To:
References:
Message-ID:

On Wed, Jul 19, 2017 at 9:08 PM, Guido van Rossum wrote:
> The proposal in your email seems incomplete

The proposal does not say anything about type((x=1, y=2)). I assume it will be the same as the type currently returned by namedtuple(?, 'x y'), but will these types be cached? Will type((x=1, y=2)) is type((x=3, y=4)) be True?

> Regarding that spec, I think there's something missing: given a list (or
> tuple!) of values, how do you turn it into an 'ntuple'?

Maybe type((x=1, y=2))(values) will work?

From songofacandy at gmail.com  Wed Jul 19 21:45:18 2017
From: songofacandy at gmail.com (INADA Naoki)
Date: Thu, 20 Jul 2017 10:45:18 +0900
Subject: [Python-ideas] Fwd: Consider allowing the use of abstractmethod without metaclasses
In-Reply-To:
References: <38b9d49d-e498-41ef-bd2e-747a22804cb1@googlegroups.com>
Message-ID:

Hi, Neil.

> I want to use abstractmethod, but I have my own metaclasses and I don't want
> to build composite metaclasses using abc.ABCMeta.
>
> Thanks to PEP 487, one approach is to factor out the abstractmethod checks
> from ABCMeta into a regular (non-meta) class. So, my first suggestion is to
> split abc.ABC into two pieces, a parent regular class with metaclass "type":

I'm +1 on your idea from a performance point of view.

Some people with a background in other languages (C# or Java) want to use ABCs like Java interfaces. But ABC is too heavy to use only for checking abstract methods. It uses three inefficient WeakSets [1] and it overrides isinstance and issubclass with slow Python implementations.

[1] WeakSet is implemented in Python, having one __dict__, a list and two sets.

And a C implementation of gathering abstract methods would reduce Python startup time too.
Because the `_collections_abc` module defines many ABCs and the `os` module imports it.

## in abc.py

# Helper function.  ABCMeta can use this too.
# And Python 3.7 can have a C implementation of this.
def _init_abstractclass(cls, bases=None):
    # Compute the set of abstract method names
    if bases is None:
        bases = cls.__bases__
    abstracts = {name
                 for name, value in vars(cls).items()
                 if getattr(value, "__isabstractmethod__", False)}
    for base in bases:
        for name in getattr(base, "__abstractmethods__", set()):
            value = getattr(cls, name, None)
            if getattr(value, "__isabstractmethod__", False):
                abstracts.add(name)
    cls.__abstractmethods__ = frozenset(abstracts)

class Abstract:
    __init_subclass__ = _init_abstractclass

## usage

import abc

class AbstractBar(abc.Abstract):
    @abc.abstractmethod
    def bar(self):
        ...

Bests,
INADA Naoki

From ncoghlan at gmail.com  Thu Jul 20 00:06:31 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 20 Jul 2017 14:06:31 +1000
Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple]
In-Reply-To:
References:
Message-ID:

On 20 July 2017 at 11:35, Alexander Belopolsky wrote:
> On Wed, Jul 19, 2017 at 9:08 PM, Guido van Rossum wrote:
>> The proposal in your email seems incomplete
>
> The proposal does not say anything about type((x=1, y=2)). I assume
> it will be the same as the type currently returned by namedtuple(?, 'x
> y'), but will these types be cached? Will type((x=1, y=2)) is
> type((x=3, y=4)) be True?.

Right, this is one of the key challenges to be addressed, as is the question of memory consumption - while Giampaolo's write-up is good in terms of covering the runtime performance motivation, it misses that one of the key motivations of the namedtuple design is to ensure that the amortised memory overhead of namedtuple instances is *zero*, since the name/position mapping is stored on the type, and *not* on the individual instances.
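The zero-overhead point is directly observable on CPython: the field machinery lives on the class, and instances are plain tuples with no per-instance __dict__. A quick check:

```python
import sys
from collections import namedtuple

Point = namedtuple("Point", ("x", "y"))
p = Point(1, 2)

assert Point._fields == ("x", "y")   # the name/position mapping is on the type
assert not hasattr(p, "__dict__")    # __slots__ = () -> no instance dict

# On CPython an instance costs exactly as much as the equivalent tuple:
assert sys.getsizeof(p) == sys.getsizeof((1, 2))
```

Any ntuple design that stored field names on each instance would give up this property.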
From my point of view, I think the best available answers to those questions are:

- ntuple literals will retain the low memory overhead characteristics of collections.namedtuple
- we will use a type caching strategy akin to string interning
- ntuple types will be uniquely identified by their field names and order
- if you really want to prime the type cache, just create a module level instance without storing it:

    (x=1, y=2)  # Prime the ntuple type cache

A question worth asking will be whether or not "collections.namedtuple" will implicitly participate in the use of the type cache, and I think the answer needs to be "No". The problem is twofold:

1. collections.namedtuple accepts an additional piece of info that won't be applicable for ntuple types: the *name*
2. collections.namedtuple has existed for years *without* implicit type caching, so adding it now would be a bit weird

That means the idiomatic way of getting the type of an ntuple would be to create an instance and take the type of it: type((x=1, y=2))

That could still be the same kind of type as is created by collections.namedtuple, or else a slight variant that tailors repr() and pickling support based on the fact it's a kind of tuple literal.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From mertz at gnosis.cx  Thu Jul 20 01:12:55 2017
From: mertz at gnosis.cx (David Mertz)
Date: Wed, 19 Jul 2017 22:12:55 -0700
Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple]
In-Reply-To:
References:
Message-ID:

I'm concerned about losing access to type information (i.e. the name) in this proposal. For example, I might write some code like this now:

>>> from collections import namedtuple
>>> Car = namedtuple("Car", "cost hp weight")
>>> Motorcycle = namedtuple("Motorcycle", "cost hp weight")
>>> smart = Car(18_900, 89, 949)
>>> harley = Motorcycle(18_900, 89, 949)
>>> if smart==harley and type(smart)==type(harley):
...     print("These are identical vehicles")

The proposal to define this as:

>>> smart = (cost=18_900, hp=89, weight=949)
>>> harley = (cost=18_900, hp=89, weight=949)

doesn't seem to leave any way to distinguish the objects of different types that happen to have the same fields. Comparing `smart._fields==harley._fields` doesn't help here, nor does any type constructed solely from the fields.

Yes, I know a Harley-Davidson only weighs about half as much as a SmartCar, although the price and HP aren't far off.

I can think of a few syntax ideas for how we might mix in a "name" to the `ntuple` objects, but I don't want to bikeshed. I'd just like to have the option of giving a name or class that isn't solely derived from the field names.

On Wed, Jul 19, 2017 at 9:06 PM, Nick Coghlan wrote:
> On 20 July 2017 at 11:35, Alexander Belopolsky wrote:
> > On Wed, Jul 19, 2017 at 9:08 PM, Guido van Rossum wrote:
> >> The proposal in your email seems incomplete
> >
> > The proposal does not say anything about type((x=1, y=2)). I assume
> > it will be the same as the type currently returned by namedtuple(?, 'x
> > y'), but will these types be cached? Will type((x=1, y=2)) is
> > type((x=3, y=4)) be True?.
>
> Right, this is one of the key challenges to be addressed, as is the
> question of memory consumption - while Giampaolo's write-up is good in
> terms of covering the runtime performance motivation, it misses that
> one of the key motivations of the namedtuple design is to ensure that
> the amortised memory overhead of namedtuple instances is *zero*, since
> the name/position mapping is stored on the type, and *not* on the
> individual instances.
> > From my point of view, I think the best available answers to those > questions are: > > - ntuple literals will retain the low memory overhead characteristics > of collections.namedtuple > - we will use a type caching strategy akin to string interning > - ntuple types will be uniquely identified by their field names and order > - if you really want to prime the type cache, just create a module > level instance without storing it: > > (x=1, y=2) # Prime the ntuple type cache > > A question worth asking will be whether or not > "collections.namedtuple" will implicitly participate in the use of the > type cache, and I think the answer needs to be "No". The problem is > twofold: > > 1. collections.namedtuple accepts an additional piece of info that > won't be applicable for ntuple types: the *name* > 2. collections.namedtuple has existed for years *without* implicit > type caching, so adding it now would be a bit weird > > That means the idiomatic way of getting the type of an ntuple would be > to create an instance and take the type of it: type((x=1, y=2)) > > The could still be the same kind of type as is created by > collections.namedtuple, or else a slight variant that tailors repr() > and pickling support based on the fact it's a kind of tuple literal. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From contact at brice.xyz Thu Jul 20 02:50:13 2017 From: contact at brice.xyz (Brice PARENT) Date: Thu, 20 Jul 2017 08:50:13 +0200 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: If the type is data, it probably belongs inside the tuple: smart = (type="Car", cost=18_900, hp=89, weight=949) harley = (type="Motorcycle", cost=18_900, hp=89, weight=949) both_vehicles = (type(smart) == type(harley)) # True - type+cost+hp+weight on both sides same_vehicles = (smart == harley) # False - cost, hp and weight are identical, but not type On 20/07/17 at 07:12, David Mertz wrote: > I'm concerned about losing access to type information > (i.e. the name) in this proposal. For example, I might write some code > like this now: > > >>> from collections import namedtuple > >>> Car = namedtuple("Car", "cost hp weight") > >>> Motorcycle = namedtuple("Motorcycle", "cost hp weight") > >>> smart = Car(18_900, 89, 949) > >>> harley = Motorcycle(18_900, 89, 949) > >>> if smart==harley and type(smart)==type(harley): > ... print("These are identical vehicles") > > The proposal to define this as: > > >>> smart = (cost=18_900, hp=89, weight=949) > >>> harley = (cost=18_900, hp=89, weight=949) > > Doesn't seem to leave any way to distinguish the objects of different > types that happen to have the same fields. Comparing > `smart._fields==harley._fields` doesn't help here, nor does any type > constructed solely from the fields. > > Yes, I know a Harley-Davidson only weighs about half as much as a > SmartCar, although the price and HP aren't far off. > > I can think of a few syntax ideas for how we might mix in a "name" to > the `ntuple` objects, but I don't want to bikeshed. I'd just like to > have the option of giving a name or class that isn't solely derived > from the field names.
> > > On Wed, Jul 19, 2017 at 9:06 PM, Nick Coghlan > wrote: > > On 20 July 2017 at 11:35, Alexander Belopolsky > > wrote: > > On Wed, Jul 19, 2017 at 9:08 PM, Guido van Rossum > > wrote: > >> The proposal in your email seems incomplete > > > > The proposal does not say anything about type((x=1, y=2)). I assume > > it will be the same as the type currently returned by > namedtuple(?, 'x > > y'), but will these types be cached? Will type((x=1, y=2)) is > > type((x=3, y=4)) be True?. > > Right, this is one of the key challenges to be addressed, as is the > question of memory consumption - while Giampaolo's write-up is good in > terms of covering the runtime performance motivation, it misses that > one of the key motivations of the namedtuple design is to ensure that > the amortised memory overhead of namedtuple instances is *zero*, since > the name/position mapping is stored on the type, and *not* on the > individual instances. > > From my point of view, I think the best available answers to those > questions are: > > - ntuple literals will retain the low memory overhead characteristics > of collections.namedtuple > - we will use a type caching strategy akin to string interning > - ntuple types will be uniquely identified by their field names > and order > - if you really want to prime the type cache, just create a module > level instance without storing it: > > (x=1, y=2) # Prime the ntuple type cache > > A question worth asking will be whether or not > "collections.namedtuple" will implicitly participate in the use of the > type cache, and I think the answer needs to be "No". The problem is > twofold: > > 1. collections.namedtuple accepts an additional piece of info that > won't be applicable for ntuple types: the *name* > 2. 
collections.namedtuple has existed for years *without* implicit > type caching, so adding it now would be a bit weird > > That means the idiomatic way of getting the type of an ntuple would be > to create an instance and take the type of it: type((x=1, y=2)) > > The could still be the same kind of type as is created by > collections.namedtuple, or else a slight variant that tailors repr() > and pickling support based on the fact it's a kind of tuple literal. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com > | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > > > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Jul 20 02:58:14 2017 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 19 Jul 2017 23:58:14 -0700 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: On Wed, Jul 19, 2017 at 9:06 PM, Nick Coghlan wrote: > A question worth asking will be whether or not > "collections.namedtuple" will implicitly participate in the use of the > type cache, and I think the answer needs to be "No". The problem is > twofold: > > 1. 
collections.namedtuple accepts an additional piece of info that > won't be applicable for ntuple types: the *name* > 2. collections.namedtuple has existed for years *without* implicit > type caching, so adding it now would be a bit weird > > That means the idiomatic way of getting the type of an ntuple would be > to create an instance and take the type of it: type((x=1, y=2)) > > They could still be the same kind of type as is created by > collections.namedtuple, or else a slight variant that tailors repr() > and pickling support based on the fact it's a kind of tuple literal. The problem with namedtuple's semantics is that they're perfect for its original use case (replacing legacy tuple returns without breaking backwards compatibility), but turn out to be sub-optimal for pretty much anything else, which is one of the motivations behind stuff like attrs and Eric's dataclasses PEP: https://github.com/ericvsmith/dataclasses/blob/61bc9354621694a93b215e79a7187ddd82000256/pep-xxxx.rst#why-not-namedtuple From the above it sounds like this ntuple literal idea would be giving us a third independent way to solve this niche use case (ntuple, namedtuple, structseq). This seems like two too many? Especially given that namedtuple is already arguably *too* convenient, in the sense that it's become an attractive nuisance that gets used in places where it isn't really appropriate. Also, what's the advantage of (x=1, y=2) over ntuple(x=1, y=2)? I.e., why does this need to be syntax instead of a library? -n -- Nathaniel J.
Smith -- https://vorpus.org From pierre.quentel at gmail.com Thu Jul 20 03:15:05 2017 From: pierre.quentel at gmail.com (Pierre Quentel) Date: Thu, 20 Jul 2017 09:15:05 +0200 Subject: [Python-ideas] HTTP compression support for http.server Message-ID: I have reported an issue in the tracker (https://bugs.python.org/issue30576) and proposed a Pull Request on the Github CPython repository (https://github.com/python/cpython/pull/2078) to make http.server in the standard library support HTTP compression (gzip). I have been asked to get feedback from core devs: - should HTTP compression be supported? - if so, should it be supported by default? It is the case in the PR, where a number of content types, eg text/html, are compressed if the user agent accepts the gzip "encoding" - if not, should the implementation of http.server be adapted so that subclasses could implement it? For the moment the only way to add it is to modify the send_head() method of SimpleHTTPRequestHandler. My opinion is that it should be supported: http.server is not meant to be a full-featured HTTP server, but compression is a basic, standardized feature of HTTP servers; it is supported by most browsers, including on smartphones, it reduces network load, and it's very easy to implement (cf. the Pull Request). For the same reason, I recently added browser cache support to http.server (PR #298). I also think that it should be supported by default for the most common content types (text/html, text/css, application/javascript...); the implementation is based on a list of types to compress (SimpleHTTPServer.compressed_types) that can be modified at will, eg set to the empty list to disable compression. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From p.f.moore at gmail.com Thu Jul 20 05:02:27 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 20 Jul 2017 10:02:27 +0100 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: On 20 July 2017 at 07:58, Nathaniel Smith wrote: > From the above it sounds like this ntuple literal idea would be giving > us a third independent way to solve this niche use case (ntuple, > namedtuple, structseq). This seems like two too many? Especially given > that namedtuple is already arguably *too* convenient, in the sense > that it's become an attractive nuisance that gets used in places where > it isn't really appropriate. Agreed. This discussion was prompted by the fact that namedtuple class creation was slow, resulting in startup time issues. It seems to have morphed into a generalised discussion of how we design a new "named values" type. I know that rewriting the implementation is a good time to review the semantics, but it feels like we've gone too far in that direction. As has been noted, the new proposal - no longer supports multiple named types with the same set of field names - doesn't allow creation from a simple sequence of values I would actually struggle to see how this can be considered a replacement for namedtuple - it feels like a completely independent beast. Certainly code intended to work on multiple Python versions would seem to have no motivation to change. > Also, what's the advantage of (x=1, y=2) over ntuple(x=1, y=2)? I.e., > why does this need to be syntax instead of a library? Agreed. Now that keyword argument dictionaries retain their order, there's no need for new syntax here. In fact, that's one of the key motivating reasons for the feature.
Paul From cpitclaudel at gmail.com Thu Jul 20 05:15:00 2017 From: cpitclaudel at gmail.com (=?UTF-8?Q?Cl=c3=a9ment_Pit-Claudel?=) Date: Thu, 20 Jul 2017 11:15:00 +0200 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: <830f5684-a6ca-b532-cb88-6d3e8bd184f9@gmail.com> On 2017-07-20 11:02, Paul Moore wrote: >> Also, what's the advantage of (x=1, y=2) over ntuple(x=1, y=2)? I.e., >> why does this need to be syntax instead of a library? > > Agreed. Now that keyword argument dictionaries retain their order, > there's no need for new syntax here. In fact, that's one of the key > motivating reasons for the feature. Isn't there a speed aspect? That is, doesn't the library approach require creating (and likely discarding) a dictionary every time a new ntuple is created? The syntax approach wouldn't need to do that. Cl?ment. From p.f.moore at gmail.com Thu Jul 20 05:30:58 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 20 Jul 2017 10:30:58 +0100 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <830f5684-a6ca-b532-cb88-6d3e8bd184f9@gmail.com> References: <830f5684-a6ca-b532-cb88-6d3e8bd184f9@gmail.com> Message-ID: On 20 July 2017 at 10:15, Cl?ment Pit-Claudel wrote: > On 2017-07-20 11:02, Paul Moore wrote: >>> Also, what's the advantage of (x=1, y=2) over ntuple(x=1, y=2)? I.e., >>> why does this need to be syntax instead of a library? >> >> Agreed. Now that keyword argument dictionaries retain their order, >> there's no need for new syntax here. In fact, that's one of the key >> motivating reasons for the feature. > > Isn't there a speed aspect? That is, doesn't the library approach require creating (and likely discarding) a dictionary every time a new ntuple is created? The syntax approach wouldn't need to do that. 
I don't think anyone has suggested that the instance creation time penalty for namedtuple is the issue (it's the initial creation of the class that affects interpreter startup time), so it's not clear that we need to optimise that (at this stage). However, it's also true that namedtuple instances are created from sequences, not dictionaries (because the class holds the position/name mapping, so instance creation doesn't need it). So it could be argued that the backward-incompatible means of creating instances is *also* a problem because it's slower... Paul PS Taking ntuple as "here's a neat idea for a new class", rather than as a possible namedtuple replacement, changes the context of all of the above significantly. Just treating ntuple purely as a new class being proposed, I quite like it, but I'm not sure it's justified given all of the similar approaches available, so let's see how a 3rd party implementation fares. And it's too early to justify new syntax, but if the overhead of a creation function turns out to be too high in practice, we can revisit that question. But that's *not* what this thread is about, as I understand it. From cpitclaudel at gmail.com Thu Jul 20 05:39:56 2017 From: cpitclaudel at gmail.com (=?UTF-8?Q?Cl=c3=a9ment_Pit-Claudel?=) Date: Thu, 20 Jul 2017 11:39:56 +0200 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <830f5684-a6ca-b532-cb88-6d3e8bd184f9@gmail.com> Message-ID: <5d375135-66e7-8490-7af2-2d9bbc8d6742@gmail.com> On 2017-07-20 11:30, Paul Moore wrote: > On 20 July 2017 at 10:15, Cl?ment Pit-Claudel wrote: >> On 2017-07-20 11:02, Paul Moore wrote: >>>> Also, what's the advantage of (x=1, y=2) over ntuple(x=1, y=2)? I.e., >>>> why does this need to be syntax instead of a library? >>> >>> Agreed. Now that keyword argument dictionaries retain their order, >>> there's no need for new syntax here. In fact, that's one of the key >>> motivating reasons for the feature. 
>> >> Isn't there a speed aspect? That is, doesn't the library approach require creating (and likely discarding) a dictionary every time a new ntuple is created? The syntax approach wouldn't need to do that. > > I don't think anyone has suggested that the instance creation time > penalty for namedtuple is the issue (it's the initial creation of the > class that affects interpreter startup time), so it's not clear that > we need to optimise that (at this stage) Indeed, it's not clear we do. I was just offering a response to the original question, "what's the advantage of (x=1, y=2) over ntuple(x=1, y=2)?". From victor.stinner at gmail.com Thu Jul 20 08:19:03 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 20 Jul 2017 14:19:03 +0200 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: For me, namedtuple was first used to upgrade an old API from returning a tuple to a "named" tuple. There was a hard requirement on backward compatibility: namedtuple API is a superset of the tuple API. For new code, there is no such backward compatibility issue. If you don't need a type, types.SimpleNamespace is a good choice. Using ns=types.SimpleNamespace, you can replace (x=0, y=1) with ns(x=0, y=1). It already works, no syntax change. *If* someone really wants (x=0, y=1) syntax sugar, I would prefer to get a namespace (no indexed (tuple) API). Victor On 20 Jul 2017 at 2:15 AM, "Giampaolo Rodola'" wrote: On Tue, Jul 18, 2017 at 6:31 AM, Guido van Rossum wrote: > On Mon, Jul 17, 2017 at 6:25 PM, Eric Snow > wrote: > >> On Mon, Jul 17, 2017 at 6:01 PM, Ethan Furman wrote: >> > Guido has decreed that namedtuple shall be reimplemented with speed in >> mind. >> >> FWIW, I'm sure that any changes to namedtuple will be kept as minimal >> as possible. Changes would be limited to the underlying >> implementation, and would not include the namedtuple() signature, or >> using metaclasses, etc.
However, I don't presume to speak for Guido >> or Raymond. :) > > Indeed. I referred people here for discussion of ideas like this: > > >>> a = (x=1, y=0) > Thanks for bringing this up, I'm gonna summarize my idea in the form of a PEP-like draft, hoping to collect some feedback. Proposal ======== Introduction of a new syntax and builtin function to create lightweight namedtuples "on the fly" as in: >>> (x=10, y=20) (x=10, y=20) >>> ntuple(x=10, y=20) (x=10, y=20) Motivations =========== Avoid declaration ----------------- Other than the startup time cost: https://mail.python.org/pipermail/python-dev/2017-July/148592.html ...the fact that namedtuples need to be declared upfront implies they mostly end up being used only in public, end-user APIs / functions. For generic functions returning more than one value it would be nice to just do: def get_coordinates(): return (x=10, y=20) ...instead of: from collections import namedtuple Coordinates = namedtuple('coordinates', ['x', 'y']) def get_coordinates(): return Coordinates(10, 20) Declaration also has the drawback of unnecessarily polluting the module API with an object (Coordinates) which is rarely needed. AFAIU namedtuple was designed this way for efficiency of the pure-python implementation currently in place and for serialization purposes (e.g. pickle), but I may be missing something else. Generally namedtuples are declared in a private module, imported from elsewhere and they are never exposed in the main namespace, which is kind of annoying. In the case of one-module scripts it's not uncommon to add a leading underscore which makes __repr__ uglier. To me, this suggests that the factory function should have been a first-class function instead.
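To illustrate, here is a rough pure-Python sketch of what such a first-class factory could look like today, with the type caching discussed earlier in the thread (the names and the caching policy here are mine, for illustration only, not part of the proposal):

```python
from collections import namedtuple
from functools import lru_cache

@lru_cache(maxsize=None)
def _ntuple_type(fields):
    # One cached class per unique sequence of field names, akin to the
    # "type caching strategy" discussed earlier in this thread.
    return namedtuple('ntuple', fields)

def ntuple(**kwargs):
    # Since 3.6, keyword arguments keep their call order, so the field
    # order is well defined.
    return _ntuple_type(tuple(kwargs))(*kwargs.values())

p = ntuple(x=10, y=20)
print(p.x, p.y)                            # 10 20
print(p == (10, 20))                       # True: still a plain tuple
print(type(p) is type(ntuple(x=1, y=2)))   # True: same fields, cached type
```

This doesn't address the declaration-time or instantiation-time speed numbers below, but it shows the intended semantics without new syntax.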
Speed ------ Other than the startup declaration overhead, a namedtuple is slower than a tuple or a C structseq in almost any aspect: - Declaration (50x slower than cnamedtuple): $ python3.7 -m timeit -s "from collections import namedtuple" \ "namedtuple('Point', ('x', 'y'))" 1000 loops, best of 5: 264 usec per loop $ python3.7 -m timeit -s "from cnamedtuple import namedtuple" \ "namedtuple('Point', ('x', 'y'))" 50000 loops, best of 5: 5.27 usec per loop - Instantiation (3.5x slower than tuple): $ python3.7 -m timeit -s "import collections; Point = collections.namedtuple('Point', ('x', 'y')); x = [1, 2]" "Point(*x)" 1000000 loops, best of 5: 310 nsec per loop $ python3.7 -m timeit -s "x = [1, 2]" "tuple(x)" 5000000 loops, best of 5: 88 nsec per loop - Unpacking (2.8x slower than tuple): $ python3.7 -m timeit -s "import collections; p = collections.namedtuple( \ 'Point', ('x', 'y'))(5, 11)" "x, y = p" 5000000 loops, best of 5: 41.9 nsec per loop $ python3.7 -m timeit -s "p = (5, 11)" "x, y = p" 20000000 loops, best of 5: 14.8 nsec per loop - Field access by name (1.9x slower than structseq and cnamedtuple): $ python3.7 -m timeit -s "from collections import namedtuple as nt; \ p = nt('Point', ('x', 'y'))(5, 11)" "p.x" 5000000 loops, best of 5: 42.7 nsec per loop $ python3.7 -m timeit -s "from cnamedtuple import namedtuple as nt; \ p = nt('Point', ('x', 'y'))(5, 11)" "p.x" 10000000 loops, best of 5: 22.5 nsec per loop $ python3.7 -m timeit -s "import os; p = os.times()" "p.user" 10000000 loops, best of 5: 22.6 nsec per loop - Field access by index is the same as tuple: $ python3.7 -m timeit -s "from collections import namedtuple as nt; \ p = nt('Point', ('x', 'y'))(5, 11)" "p[0]" 10000000 loops, best of 5: 20.3 nsec per loop $ python3.7 -m timeit -s "p = (5, 11)" "p[0]" 10000000 loops, best of 5: 20.5 nsec per loop It is being suggested that most of these complaints about speed aren't an issue but in certain circumstances such as busy loops, getattr() being 1.9x slower 
could make a difference, e.g.: https://github.com/python/cpython/blob/3e2ad8ec61a322370a6fbdfb2209cf74546f5e08/Lib/asyncio/selector_events.py#L523 Same goes for values unpacking. isinstance() ------------ Probably a minor complaint, I just bring this up because I recently had to do this in psutil's unit tests. Anyway, checking a namedtuple instance isn't exactly straightforward: https://stackoverflow.com/a/2166841 Backward compatibility ====================== This is probably the biggest barrier other than the "a C implementation is less maintainable" argument. In order to avoid duplication of functionality it would be great if collections.namedtuple() could remain a (deprecated) factory function using ntuple() internally. FWIW I tried running stdlib's unittests against https://github.com/llllllllll/cnamedtuple, I removed the ones about "_source", "verbose" and "module" arguments and I get a couple of errors about __doc__. I'm not sure about more advanced use cases (subclassing, others...?) but overall it appears pretty doable. collections.namedtuple() Python wrapper can include the necessary logic to implement "verbose" and "rename" parameters when they're used. I'm not entirely sure about the implications of the "module" parameter though (Raymond?). _make(), _asdict(), _replace() and _fields attribute should also be exposed; as for "_source" it appears it can easily be turned into a property which would also save some memory. The biggest annoyance is probably fields' __doc__ assignment: https://github.com/python/cpython/blob/ced36a993fcfd1c76637119d31c03156a8772e11/Lib/selectors.py#L53-L58 ...which would require returning a clever class object slowing down the namedtuple declaration also in case no parameters are passed, but considering that the long-term plan is to replace collections.namedtuple() with ntuple() I consider this acceptable. Thoughts?
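For reference, the duck-typed check behind the StackOverflow answer linked in the isinstance() section above amounts to something like the following sketch (the helper name is mine; this is a heuristic, not an official API):

```python
from collections import namedtuple

def isinstance_namedtuple(obj):
    # Heuristic: a namedtuple instance is a tuple whose class carries a
    # _fields attribute holding a tuple of strings. Any tuple subclass
    # can fake this, which is exactly why the check isn't straightforward.
    fields = getattr(type(obj), '_fields', None)
    return (isinstance(obj, tuple)
            and isinstance(fields, tuple)
            and all(isinstance(f, str) for f in fields))

Point = namedtuple('Point', 'x y')
print(isinstance_namedtuple(Point(1, 2)))  # True
print(isinstance_namedtuple((1, 2)))       # False
```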
-- Giampaolo - http://grodola.blogspot.com _______________________________________________ Python-ideas mailing list Python-ideas at python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Thu Jul 20 08:23:29 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 20 Jul 2017 14:23:29 +0200 Subject: [Python-ideas] Fwd: Consider allowing the use of abstractmethod without metaclasses In-Reply-To: References: <38b9d49d-e498-41ef-bd2e-747a22804cb1@googlegroups.com> Message-ID: On 20 Jul 2017 at 3:49 AM, "INADA Naoki" wrote: > I'm +1 with your idea in performance point of view. (...) But ABC is too heavy to use only for checking abstract methods. It uses three inefficient WeakSet [1] and it overrides isinstance and issubclass with slow Python implementation. I don't think that we can get rid of abc from the io and importlib. They are here to stay. I hear your performance analysis. Why not making abc faster instead of trying to workaround abc for perf issue? Someone already wrote WeakrefSet, a PR is waiting for our review! Victor -------------- next part -------------- An HTML attachment was scrubbed...
URL: From k7hoven at gmail.com Thu Jul 20 08:35:41 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Thu, 20 Jul 2017 15:35:41 +0300 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: On Thu, Jul 20, 2017 at 9:58 AM, Nathaniel Smith wrote: > > The problem with namedtuple's semantics are that they're perfect for > its original use case (replacing legacy tuple returns without breaking > backwards compatibility), but turn out to be sub-optimal for pretty > much anything else, which is one of the motivations behind stuff like > attrs and Eric's dataclasses PEP: > https://github.com/ericvsmith/dataclasses/blob/ > 61bc9354621694a93b215e79a7187ddd82000256/pep-xxxx.rst#why-not-namedtuple > > Well put! I agree that adding attribute names to elements in a tuple (e.g. return values) in a backwards-compatible way is where namedtuple is great. > From the above it sounds like this ntuple literal idea would be giving > us a third independent way to solve this niche use case (ntuple, > namedtuple, structseq). This seems like two too many? Especially given > that namedtuple is already arguably *too* convenient, in the sense > that it's become an attractive nuisance that gets used in places where > it isn't really appropriate. > > I do think it makes sense to add a convenient way to upgrade a function to return named values. Is there any reason why that couldn't replace structseq completely? These anonymous namedtuple classes could also be made fast to create (and more importantly, cached). > Also, what's the advantage of (x=1, y=2) over ntuple(x=1, y=2)? I.e., > why does this need to be syntax instead of a library? > > Indeed, we might need the syntax (x=1, y=2) later for something different. However, I hope we can forget about 'ntuple', because it suggests a tuple of n elements. Maybe something like return tuple.named(x=foo, y=bar) which is backwards compatible with return foo, bar -- Koos
-- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Thu Jul 20 08:39:11 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Thu, 20 Jul 2017 21:39:11 +0900 Subject: [Python-ideas] Fwd: Consider allowing the use of abstractmethod without metaclasses In-Reply-To: References: <38b9d49d-e498-41ef-bd2e-747a22804cb1@googlegroups.com> Message-ID: Hi, Victor. > Why not making abc faster instead of trying to workaround abc for perf > issue? Current ABC provides: a) Prohibit instantiating without implement abstract methods. b) registry based subclassing People want Java's interface only wants (a). (b) is unwanted side effect. Additionally, even if CPython provide C implementation of ABCMeta, other Python implementations won't. So Abstract Class (not ABC) may be nice on such implementations too. I'm +1 to implement abc module in C. And I think (a) can be nice first step, instead of implement all at once. Regards, INADA Naoki From levkivskyi at gmail.com Thu Jul 20 08:56:29 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Thu, 20 Jul 2017 14:56:29 +0200 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <5d375135-66e7-8490-7af2-2d9bbc8d6742@gmail.com> References: <830f5684-a6ca-b532-cb88-6d3e8bd184f9@gmail.com> <5d375135-66e7-8490-7af2-2d9bbc8d6742@gmail.com> Message-ID: Something probably not directly related, but since we started to talk about syntactic changes... I think what would be great to eventually have is some form of pattern matching. Essentially pattern matching could be just a "tagged" unpacking protocol. 
For example, something like this will simplify a common pattern with a sequence of if isinstance() branches: class Single(NamedTuple): x: int class Pair(NamedTuple): x: int y: int def func(arg: Union[Single, Pair]) -> int: whether arg: Single as a: return a + 2 Pair as a, b: return a * b else: return 0 The idea is that the expression before ``as`` is evaluated, then if ``arg`` is an instance of the result, then ``__unpack__`` is called on it. Then the resulting tuple is unpacked into the names a, b, etc. I think named tuples could provide the __unpack__, and especially it would be great for dataclasses to provide the __unpack__ method. (Maybe we can then call it __data__?) -- Ivan On 20 July 2017 at 11:39, Clément Pit-Claudel wrote: > On 2017-07-20 11:30, Paul Moore wrote: > > On 20 July 2017 at 10:15, Clément Pit-Claudel > wrote: > >> On 2017-07-20 11:02, Paul Moore wrote: > >>>> Also, what's the advantage of (x=1, y=2) over ntuple(x=1, y=2)? I.e., > >>>> why does this need to be syntax instead of a library? > >>> > >>> Agreed. Now that keyword argument dictionaries retain their order, > >>> there's no need for new syntax here. In fact, that's one of the key > >>> motivating reasons for the feature. > >> > >> Isn't there a speed aspect? That is, doesn't the library approach > require creating (and likely discarding) a dictionary every time a new > ntuple is created? The syntax approach wouldn't need to do that. > > > > I don't think anyone has suggested that the instance creation time > > penalty for namedtuple is the issue (it's the initial creation of the > > class that affects interpreter startup time), so it's not clear that > > we need to optimise that (at this stage) > > Indeed, it's not clear we do. I was just offering a response to the > original question, "what's the advantage of (x=1, y=2) over ntuple(x=1, > y=2)?".
> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Thu Jul 20 10:54:03 2017 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 20 Jul 2017 14:54:03 +0000 Subject: [Python-ideas] Fwd: Consider allowing the use of abstractmethod without metaclasses In-Reply-To: References: <38b9d49d-e498-41ef-bd2e-747a22804cb1@googlegroups.com> Message-ID: Good discussion so far. Please let me know if I can help with implementation or documentation. On Thu, Jul 20, 2017 at 8:40 AM INADA Naoki wrote: > Hi, Victor. > > > Why not making abc faster instead of trying to workaround abc for perf > > issue? > > Current ABC provides: > > a) Prohibit instantiating without implement abstract methods. > b) registry based subclassing > > People want Java's interface only wants (a). (b) is unwanted side effect. > Right. (b) is only unwanted because it requires metaclasses, and metaclasses put constraints on inheritance. I have switched to using the "AbstractBaseClass" class I defined above in my own code. If that is the sort of solution that will be undertaken, then there is a question of what to call this class. If instead, every class will get this functionality automatically (which is more elegant), then someone needs to show that performance is unaffected. Also, this whole thing might not be that important if (as Guido implies) linters supplant this functionality. Although, linters would not catch the rare case where classes are programmatically generated.
> And I think (a) can be nice first step, instead of implement all at once. > > Regards, > > INADA Naoki > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/r2YLrIEQlig/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From levkivskyi at gmail.com Thu Jul 20 11:12:18 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Thu, 20 Jul 2017 17:12:18 +0200 Subject: [Python-ideas] Fwd: Consider allowing the use of abstractmethod without metaclasses In-Reply-To: References: <38b9d49d-e498-41ef-bd2e-747a22804cb1@googlegroups.com> Message-ID: To be honest, I am not very happy with addition of a new special class. Imagine that the PEP 544 will be accepted (and I hope so). Then we would have, abstract classes, abstract base classes, and protocols. I think users will be overwhelmed by having three similar concepts instead of one. I think we still could squeeze a lot of performance from good old ABCs by optimizing various parts and reimplementing some parts in C. In fact, my desire to optimize and rewrite ABCMeta in C is partially due to reluctance to add yet another concept of "abstractness". -- Ivan -------------- next part -------------- An HTML attachment was scrubbed... 
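As a concrete illustration of what feature (a) without ABCMeta could look like: the sketch below is an invented stand-in (the ``Abstract`` base name and the MRO scan are assumptions of this sketch, not Neil's actual class or any proposed API). It refuses instantiation while any ``@abstractmethod`` remains unoverridden, using only a plain base class:

```python
from abc import abstractmethod

class Abstract:
    """Plain base class enforcing ABCMeta's feature (a) only:
    no registry, no custom metaclass."""

    def __new__(cls, *args, **kwargs):
        missing = sorted({
            name
            for klass in cls.__mro__
            for name, attr in vars(klass).items()
            if getattr(attr, "__isabstractmethod__", False)
            # still abstract after normal MRO resolution?
            and getattr(getattr(cls, name), "__isabstractmethod__", False)
        })
        if missing:
            raise TypeError(
                f"Can't instantiate abstract class {cls.__name__} "
                f"with abstract methods {', '.join(missing)}"
            )
        return super().__new__(cls)

class Base(Abstract):
    @abstractmethod
    def run(self):
        ...

class Impl(Base):
    def run(self):
        return 42
```

With this, ``Impl()`` works and ``Base()`` raises TypeError, while isinstance stays the plain, fast built-in check; the per-instantiation scan is the cost ABCMeta avoids by precomputing ``__abstractmethods__`` at class creation.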
URL: From steve at pearwood.info Thu Jul 20 12:32:16 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 21 Jul 2017 02:32:16 +1000 Subject: [Python-ideas] a new namedtuple In-Reply-To: References: <596D4FF6.5080908@stoneleaf.us> <20170718013447.GB3149@ando.pearwood.info> Message-ID: <20170720163215.GI3149@ando.pearwood.info> On Tue, Jul 18, 2017 at 06:06:00PM -0300, Joao S. O. Bueno wrote: > In the other thread, I had mentioned my "extradict" implementation - it > does have quite a few differences as it did not try to match namedtuple > API, but it works nicely for all common use cases - these are the timeit > timings: [...] > (env) [gwidion at caylus ]$ python3 -m timeit --setup "from extradict import > namedtuple" "K = namedtuple('K', 'a b c')" > 10000 loops, best of 3: 20 usec per loop > > (env) [gwidion at caylus ]$ python3 -m timeit --setup "from extradict import > fastnamedtuple as namedtuple" "K = namedtuple('K', 'a b c')" > 10000 loops, best of 3: 21 usec per loop Are you concerned that "fastnamedtuple" is no faster, and possibly slower, than "namedtuple"? -- Steve From songofacandy at gmail.com Thu Jul 20 13:51:53 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Fri, 21 Jul 2017 02:51:53 +0900 Subject: [Python-ideas] Fwd: Consider allowing the use of abstractmethod without metaclasses In-Reply-To: References: <38b9d49d-e498-41ef-bd2e-747a22804cb1@googlegroups.com> Message-ID: On Fri, Jul 21, 2017 at 12:12 AM, Ivan Levkivskyi wrote: > To be honest, I am not very happy with addition of a new special class. > Imagine that the PEP 544 will be accepted (and I hope so). > Then we would have, abstract classes, abstract base classes, and protocols. > I think users will be overwhelmed by having > three similar concepts instead of one. Hmm, couldn't split protocol and ABC? Of course, existing ABCs should be ABC for backward compatibility. But any reason to force using ABCMeta for user defined classes? 
I hate subclassing ABC because concrete classes which mix-in some ABC are forced to use it. > > I think we still could squeeze a lot of performance from good old ABCs by > optimizing various parts and reimplementing some parts in C. > In fact, my desire to optimize and rewrite ABCMeta in C is partially due to > reluctance to add yet another concept of "abstractness". > Even if it's implemented in C, issubclass implementation is much complicated than normal type. I don't want to introduce unnecessary complexity because I'm minimalist. Regards, > -- > Ivan > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From levkivskyi at gmail.com Thu Jul 20 13:59:45 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Thu, 20 Jul 2017 19:59:45 +0200 Subject: [Python-ideas] Fwd: Consider allowing the use of abstractmethod without metaclasses In-Reply-To: References: <38b9d49d-e498-41ef-bd2e-747a22804cb1@googlegroups.com> Message-ID: On 20 July 2017 at 19:51, INADA Naoki wrote: > On Fri, Jul 21, 2017 at 12:12 AM, Ivan Levkivskyi > wrote: > > To be honest, I am not very happy with addition of a new special class. > > Imagine that the PEP 544 will be accepted (and I hope so). > > Then we would have, abstract classes, abstract base classes, and > protocols. > > I think users will be overwhelmed by having > > three similar concepts instead of one. > > Hmm, couldn't split protocol and ABC? > > Unfortunately no, it was considered and rejected for various reasons (in particular to provide smooth transition to protocols). > > I think we still could squeeze a lot of performance from good old ABCs by > > optimizing various parts and reimplementing some parts in C. > > In fact, my desire to optimize and rewrite ABCMeta in C is partially due > to > > reluctance to add yet another concept of "abstractness". 
> > > > Even if it's implemented in C, issubclass implementation is much > complicated > than normal type. > I don't want to introduce unnecessary complexity because I'm minimalist. > > This complexity is already there, and attempt to reduce might lead to actually an increase of complexity. This is probably the case where I would be with Raymond in terms of performance vs ease of maintenance.

-- Ivan

-------------- next part -------------- An HTML attachment was scrubbed...

From lucas.wiman at gmail.com Thu Jul 20 14:05:02 2017 From: lucas.wiman at gmail.com (Lucas Wiman) Date: Thu, 20 Jul 2017 11:05:02 -0700 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: On Thu, Jul 20, 2017 at 5:19 AM, Victor Stinner wrote:
> For me, namedtuple was first used to upgrade an old API from returning a
> tuple to a "named" tuple. There was a hard requirement on backward
> compatibility: namedtuple API is a superset of the tuple API.
>
> For new code, there is no such backward compatibility issue. If you don't
> need a type, types.Namespace is a good choice.
>
> Using ns=types.Namespace, you can replace (x=0, y=1) with ns(x=0, y=1). It
> already works, no syntax change.
>
> *If* someone really wants (x=0, y=1) syntax sugar, I would prefer to get a
> namespace (no indexed (tuple) API).

It's a minor point, but the main reason I use namedtuple is because it's far easier to get a hashable object than writing one yourself. Namespaces are not hashable. If the (x=0, y=1) sugar is accepted, IMO it should be immutable and hashable like tuples/namedtuples.

Best, Lucas

-------------- next part -------------- An HTML attachment was scrubbed...
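Lucas's point is easy to demonstrate with the stdlib alone: a namedtuple class inherits tuple's value-based ``__eq__`` and ``__hash__``, so instances work as dict keys and set members with no extra code:

```python
from collections import namedtuple

Point = namedtuple("Point", "x y")

p1 = Point(1, 2)
p2 = Point(1, 2)

# Value-based equality and hashing come straight from tuple.
assert p1 == p2
assert hash(p1) == hash(p2) == hash((1, 2))
assert len({p1, p2}) == 1  # deduplicated in a set

# A minimal hand-written class gets identity semantics instead:
# two value-identical instances remain distinct keys.
class PointObj:
    def __init__(self, x, y):
        self.x, self.y = x, y

assert len({PointObj(1, 2), PointObj(1, 2)}) == 2
```

Getting the hand-written class to behave like the namedtuple means writing ``__eq__`` and ``__hash__`` yourself, which is exactly the boilerplate being discussed.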
URL: From tjreedy at udel.edu Thu Jul 20 15:11:38 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 20 Jul 2017 15:11:38 -0400 Subject: [Python-ideas] HTTP compression support for http.server In-Reply-To: References: Message-ID: On 7/20/2017 3:15 AM, Pierre Quentel wrote: > I have reported an issue in the tracker > (https://bugs.python.org/issue30576) and proposed a Pull Request on the > Github CPython repository (https://github.com/python/cpython/pull/2078) > to make http.server in the standard library support HTTP compression (gzip). Full response on the issue. Briefly > I have been suggested to require feedback from core devs : > - should HTTP compression be supported ? Yes. > - if so, should it be supported by default ? No, contrary to painfully wrought policy. > It is the case in the PR, > where a number of content types, eg text/html, are compressed if the > user agent accepts the gzip "encoding" > - if not, should the implementation of http.server be adapted so that > subclasses could implement it ? For the moment the only way to add it is > to modify method send_head() of SimpleHTTPRequestHandler Add subclass. > My opinion is that it should be supported : http.server is not meant to > be a full-featured HTTP server, but compression it is a basic and > normalized feature of HTTP servers, it is supported by most browsers > including on smartphones, it reduces network load, and it's very easy to > implement (cf. the Pull Request). For the same reason, I recently added > browser cache to http.server (PR #298 > ). > > I also think that it should be supported by default for the most common > content types (text/html, text/css, application/javascript...) ; the > implementation is based on a list of types to compress > (SimpleHTTPServer.compressed_types) that can be modified at will, eg set > to the empty list to disable compression. -- Terry Jan Reedy From jimjjewett at gmail.com Thu Jul 20 16:17:05 2017 From: jimjjewett at gmail.com (Jim J. 
Jewett) Date: Thu, 20 Jul 2017 16:17:05 -0400 Subject: [Python-ideas] namedtuple from ordereddict Message-ID: Several of the replies seemed to suggest that re-using the current dict structure for a tuple wouldn't work. Since I'm not sure whether people are still thinking of the old structure, or I'm missing something obvious, I'll be more explicit. Per https://github.com/python/cpython/blob/master/Include/dictobject.h#L40 the last element of a dict object is now a separate pointer to ma_values, which is an array of objects. Per https://github.com/python/cpython/blob/master/Include/tupleobject.h#L27 the last element of a tuple object is also an array of objects. Is there some reason these arrays cannot be the same memory? e.g., does a tuple header *have* to be contiguous with its data, and if so, is there a reason that the dict's ma_array can't be allocated with an extra tuple-header prefix? -jJ From njs at pobox.com Thu Jul 20 20:36:44 2017 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 20 Jul 2017 17:36:44 -0700 Subject: [Python-ideas] Fwd: Consider allowing the use of abstractmethod without metaclasses In-Reply-To: References: <38b9d49d-e498-41ef-bd2e-747a22804cb1@googlegroups.com> Message-ID: On Jul 20, 2017 05:39, "INADA Naoki" wrote: Hi, Victor. > Why not making abc faster instead of trying to workaround abc for perf > issue? Current ABC provides: a) Prohibit instantiating without implement abstract methods. b) registry based subclassing People want Java's interface only wants (a). (b) is unwanted side effect. Except (b) is what allows you to subclass an ABC without using the ABC metaclass :-) I wonder if it would make sense to go further and merge *both* of these features into regular classes. Checking for @abstractmethod in type.__new__ surely can't be that expensive, can it? And if regular types supported 'register', then it would allow for a potentially simpler and faster implementation. 
Right now, superclass.register(subclass) has to work by mutating superclass, because that's the special ABCMeta object, and this leads to complicated stuff with weakrefs and all that. But if this kind of nominal inheritance was a basic feature of 'type' itself, then it could work by doing something like subclass.__nominal_bases__ += (superclass,) and then precalculating the "nominal mro" just like it already precalculates the mro, so issubclass/isinstance would remain fast. I guess enabling this across the board might cause problems for C classes whose users currently use isinstance to get information about the internal memory layout. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Thu Jul 20 20:51:45 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Fri, 21 Jul 2017 09:51:45 +0900 Subject: [Python-ideas] Fwd: Consider allowing the use of abstractmethod without metaclasses In-Reply-To: References: <38b9d49d-e498-41ef-bd2e-747a22804cb1@googlegroups.com> Message-ID: INADA Naoki On Fri, Jul 21, 2017 at 2:59 AM, Ivan Levkivskyi wrote: > On 20 July 2017 at 19:51, INADA Naoki wrote: >> >> On Fri, Jul 21, 2017 at 12:12 AM, Ivan Levkivskyi >> wrote: >> > To be honest, I am not very happy with addition of a new special class. >> > Imagine that the PEP 544 will be accepted (and I hope so). >> > Then we would have, abstract classes, abstract base classes, and >> > protocols. >> > I think users will be overwhelmed by having >> > three similar concepts instead of one. >> >> Hmm, couldn't split protocol and ABC? >> > > Unfortunately no, it was considered and rejected for various reasons (in > particular to provide smooth transition to protocols). > Sorry about my poor English. "split" meant "optionally ABC". I understand that existing classes (like typing.Sequence) must be ABC. But why new user defined protocol must be ABC? >> > by >> > optimizing various parts and reimplementing some parts in C. 
>> > In fact, my desire to optimize and rewrite ABCMeta in C is partially due >> > to >> > reluctance to add yet another concept of "abstractness". >> > >> >> Even if it's implemented in C, issubclass implementation is much >> complicated >> than normal type. >> I don't want to introduce unnecessary complexity because I'm minimalist. >> > > This complexity is already there, and attempt to reduce might lead to > actually an increase of complexity. > This is probably the case where I would be with Raymond in terms of > performance vs ease of maintenance. >

Sorry again. I meant I don't want to import the complexity into my class when I don't need it. In other words, I hate inheriting ABC when I don't need register based subclassing.

> -- > Ivan > >

From songofacandy at gmail.com Thu Jul 20 21:03:56 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Fri, 21 Jul 2017 10:03:56 +0900 Subject: [Python-ideas] Fwd: Consider allowing the use of abstractmethod without metaclasses In-Reply-To: References: <38b9d49d-e498-41ef-bd2e-747a22804cb1@googlegroups.com> Message-ID:
> > I wonder if it would make sense to go further and merge *both* of these
> features into regular classes.
> > Checking for @abstractmethod in type.__new__ surely can't be that expensive,
> can it?
>

But it affects startup time. It iterates over all of the namespace and tries `getattr(obj, '__isabstractmethod__', False)`. It means massive AttributeErrors are raised and cleared while loading a large library. OTOH, I have another idea:

    class AbstractFoo:
        def foo(self):
            ...

        __abstractmethods__ = ("foo",)

In this idea, `type.__new__` can check only `__abstractmethods__`.
But if this kind of nominal inheritance was a basic > feature of 'type' itself, then it could work by doing something like > > subclass.__nominal_bases__ += (superclass,) > > and then precalculating the "nominal mro" just like it already precalculates > the mro, so issubclass/isinstance would remain fast. > I don't like it. In 99.9% of my classes, I don't need register based subclassing. > I guess enabling this across the board might cause problems for C classes > whose users currently use isinstance to get information about the internal > memory layout. > > -n

From storchaka at gmail.com Fri Jul 21 01:28:40 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 21 Jul 2017 08:28:40 +0300 Subject: [Python-ideas] namedtuple from ordereddict In-Reply-To: References: Message-ID: 20.07.17 23:17, Jim J. Jewett wrote:
> Several of the replies seemed to suggest that re-using the current
> dict structure for a tuple wouldn't work. Since I'm not sure whether
> people are still thinking of the old structure, or I'm missing
> something obvious, I'll be more explicit.
>
> Per https://github.com/python/cpython/blob/master/Include/dictobject.h#L40
> the last element of a dict object is now a separate pointer to
> ma_values, which is an array of objects.
>
> Per https://github.com/python/cpython/blob/master/Include/tupleobject.h#L27
> the last element of a tuple object is also an array of objects.
>
> Is there some reason these arrays cannot be the same memory? e.g.,
> does a tuple header *have* to be contiguous with its data, and if so,
> is there a reason that the dict's ma_array can't be allocated with an
> extra tuple-header prefix?

Having a tuple header to be contiguous with its data decreases a total size of consumed memory, decreases memory fragmentation, speeds up tuple's creation and item access. Memory consumption and performance of tuples are critically important.
Allocating the dict's ma_values with an extra tuple-header prefix will increase memory consumption for instance dictionaries and will complicate the dict implementation (this can harm the performance). This also will increase a code coupling between dicts and tuples. Named tuples are rarely used in comparison with ordinary tuples and dictionaries, and they shouldn't be improved at the cost of tuples and dicts.

From storchaka at gmail.com Fri Jul 21 01:49:20 2017 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 21 Jul 2017 08:49:20 +0300 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: 20.07.17 04:35, Alexander Belopolsky wrote:
> On Wed, Jul 19, 2017 at 9:08 PM, Guido van Rossum wrote:
>> The proposal in your email seems incomplete
>
> The proposal does not say anything about type((x=1, y=2)). I assume
> it will be the same as the type currently returned by namedtuple(?, 'x
> y'), but will these types be cached? Will type((x=1, y=2)) is
> type((x=3, y=4)) be True?.

Yes, this is the key problem with this idea.

If the type of every namedtuple literal is unique, this is a waste of memory and CPU time. Creating a new type is much more slower than instantiating it, even without compiling. If support the global cache of types, we have problems with mutability and life time. If types are mutable (namedtuple classes are), setting the __doc__ or __name__ attributes of type((x=1, y=2)) will affect type((x=3, y=4)). How to create two different named tuple types with different names and docstrings?

In Python 2 all types are immortal, in python 3 they can be collected as ordinary objects, and you can create types dynamically without a fear of spent too much memory. If types are cached, we should take care about collecting unused types, this will significantly complicate the implementation.
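To make the caching trade-off concrete, here is a sketch of such a cached factory. The ``ntuple`` name is hypothetical (nothing in the proposal fixes a spelling), and the unbounded ``lru_cache`` deliberately exhibits the lifetime problem described above: cached types are kept alive forever.

```python
from collections import namedtuple
from functools import lru_cache

@lru_cache(maxsize=None)
def _ntuple_type(fields):
    # One type per distinct tuple of field names. The cache holds a
    # strong reference, so these types are never collected -- exactly
    # the lifetime problem Serhiy describes.
    return namedtuple("ntuple", fields)

def ntuple(**kwargs):
    # kwargs preserve order (3.6+), so ('x', 'y') and ('y', 'x')
    # intentionally map to different cached types.
    return _ntuple_type(tuple(kwargs))(**kwargs)

a = ntuple(x=1, y=2)
b = ntuple(x=3, y=4)
assert type(a) is type(b)                 # cached: same structural type
assert type(a) is not type(ntuple(x=1))   # different fields, different type
```

It also shows the shared-mutability problem: setting ``type(a).__doc__`` would change the docstring seen through ``type(b)`` as well.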
From k7hoven at gmail.com Fri Jul 21 10:09:08 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Fri, 21 Jul 2017 17:09:08 +0300 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: On Fri, Jul 21, 2017 at 8:49 AM, Serhiy Storchaka wrote: > 20.07.17 04:35, Alexander Belopolsky ????: > >> On Wed, Jul 19, 2017 at 9:08 PM, Guido van Rossum >> wrote: >> >>> The proposal in your email seems incomplete >>> >> >> The proposal does not say anything about type((x=1, y=2)). I assume >> it will be the same as the type currently returned by namedtuple(?, 'x >> y'), but will these types be cached? Will type((x=1, y=2)) is >> type((x=3, y=4)) be True?. >> > > Yes, this is the key problem with this idea. > > If the type of every namedtuple literal is unique, this is a waste of > memory and CPU time. Creating a new type is much more slower than > instantiating it, even without compiling. If support the global cache of > types, we have problems with mutability and life time. If types are mutable > (namedtuple classes are), setting the __doc__ or __name__ attributes of > type((x=1, y=2)) will affect type((x=3, y=4)). How to create two different > named tuple types with different names and docstrings? How about just making a named namedtuple if you want to mutate the type? Or perhaps make help() work better for __doc__ attributes on instances. Currently, >>> class Foo: pass ... >>> f = Foo() >>> f.__doc__ = "Hello" >>> help(f) does not show "Hello" at all. In Python 2 all types are immortal, in python 3 they can be collected as > ordinary objects, and you can create types dynamically without a fear of > spent too much memory. If types are cached, we should take care about > collecting unused types, this will significantly complicate the > implementation. > > Hmm. Good point. Even if making large amounts of arbitrary disposable anonymous namedtuples is probably not a great idea, someone might do it. 
Maybe having a separate type for each anonymous named tuple is not worth it. After all, keeping references to the attribute names in the object shouldn't take up that much memory. And the tuples are probably often short-lived. Given all this, the syntax for creating anonymous namedtuples efficiently probably does not really need to be super convenient on the Python side, but having it available and unified with that structseq thing would seem useful. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Jul 21 11:18:09 2017 From: guido at python.org (Guido van Rossum) Date: Fri, 21 Jul 2017 08:18:09 -0700 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: Honestly I would like to declare the bare (x=1, y=0) proposal dead. Let's encourage the use of objects rather than tuples (named or otherwise) for most data exchanges. I know of a large codebase that uses dicts instead of objects, and it's a mess. I expect the bare ntuple to encourage the same chaos. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jhihn at gmx.com Fri Jul 21 13:07:13 2017 From: jhihn at gmx.com (Jason H) Date: Fri, 21 Jul 2017 19:07:13 +0200 Subject: [Python-ideas] Idea : for smarter assignment? Message-ID: I experimented with Python in college and I've been for close to 20 years now. (Coming and going as needed) I love the language. But there is one annoyance that I continually run into. There are basically two assignment operators, based on context, = and : a = 1 { a: 1 } They cannot be used interchangeably: a: 1 # error {a=1} # error I don't think I should be this way. There are times when I have a bunch of variables that I want to collect into an object or destructure. This involves adding commas, and swapping the :/=. 
I don't have a good fix for the adding of commas (maybe a newline?) but I think : should at least be accepted as = everywhere except in ifs:

    a: 1 # same as a = 1

One area where it might help (although the python parser already catches it) is in ifs:

    if a:1 # always error ?
    if a=1 # currently error, but might be accepted shorthand for == ?

Ideally, I could take

    a: 1
    b: 2

then in 3 edits:
1. first line prepend 'x: {'
2. last line append '}'
3. indent between { }

I guess this would imply that { open up an assignment scope, where newlines are commas if the last line did not end with an operator or the next line did not start with an operator:

    x: {
        a: x -  # - operator
           f(x)
        b:      # : operator
           5487234728394720348988734574357
        c: 7    # c is 13, no trailing operator but next line has a preceding operator
           + 6
    }

The only issue then is how do we address a?

    x.a    # looks fine to me
    x['a'] # as a dict, but the conversion of a to string 'a' could be confusing.

Additionally, I was also thinking about : as an implied await:

    a = await f() # await generator
    a: f()        # await generator, or direct assignment if return type is not a generator/async func

Thoughts? Please be gentle :-)

From rhodri at kynesim.co.uk Fri Jul 21 13:13:35 2017 From: rhodri at kynesim.co.uk (Rhodri James) Date: Fri, 21 Jul 2017 18:13:35 +0100 Subject: [Python-ideas] Idea : for smarter assignment? In-Reply-To: References: Message-ID: On 21/07/17 18:07, Jason H wrote:
> There are basically two assignment operators, based on context, = and :
> a = 1
> { a: 1 }

No there aren't. The colon isn't assigning at all, it's separating a key from a corresponding value. The object referenced by 'a' is unchanged by being part of a dictionary literal. From that point on your whole argument falls apart.

-- Rhodri James *-* Kynesim Ltd

From brett at python.org Fri Jul 21 13:19:12 2017 From: brett at python.org (Brett Cannon) Date: Fri, 21 Jul 2017 17:19:12 +0000 Subject: [Python-ideas] Idea : for smarter assignment?
In-Reply-To: References: Message-ID: On Fri, 21 Jul 2017 at 10:08 Jason H wrote: > I experimented with Python in college and I've been for close to 20 years > now. (Coming and going as needed) I love the language. But there is one > annoyance that I continually run into. > > There are basically two assignment operators, based on context, = and : > a = 1 > { a: 1 } > So the latter is not an assignment. `=` is an assignment as it creates a new entry in a namespace, while the later just associates a key with a value in a dictionary. > > They cannot be used interchangeably: > a: 1 # error > {a=1} # error > > I don't think I should be this way. > But it's on purpose as they do different things. Thanks for sharing the idea but I don't see this changing. -Brett > > There are times when I have a bunch of variables that I want to collect > into an object or destructure. This involves adding commas, and swapping > the :/=. I don't have a good fix for the adding of commas (maybe a > newline?) > but I think : should at least be accepted as = everywhere except in ifs: > a: 1 # same as a = 1 > > One area where it might help (although the python parser already catches > it) is in ifs: > if a:1 # always error ? > if a=1 # currently error, but might be accepted shorthand for == ? > > Ideally, I could take > a: 1 > b: 2 > then in 3 edits: > 1. first line prepend 'x: {' > 2. last line append '}' > 3. indent between { } > > I guess his would imply that { open up an assignment scope, where newlines > are commas if the last line did not end with an operator or the next line > did not start with an operator: > x: { > a: x - # - operator > f(x) > b: # : operator > 5487234728394720348988734574357 > c: 7 # c is 13, no trailing operator but next line has a preceding > operator > + 6 > } > > The only issue then is how do we address a? > x.a # looks fine to me > x['a'] # as a dict, but the conversion of a to string 'a' could be > confusing. 
> > > Additionally, I was also thinking about : as an implied await: > a = await f() # await generator > a: f() # await generator, or direct assignment if return type is > not a generator/async func > > Thoughts? Please be gentle :-) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jelle.zijlstra at gmail.com Fri Jul 21 13:31:57 2017 From: jelle.zijlstra at gmail.com (Jelle Zijlstra) Date: Fri, 21 Jul 2017 10:31:57 -0700 Subject: [Python-ideas] Idea : for smarter assignment? In-Reply-To: References: Message-ID: 2017-07-21 10:07 GMT-07:00 Jason H : > I experimented with Python in college and I've been for close to 20 years > now. (Coming and going as needed) I love the language. But there is one > annoyance that I continually run into. > > There are basically two assignment operators, based on context, = and : > a = 1 > { a: 1 } > > They cannot be used interchangeably: > a: 1 # error > {a=1} # error > > I don't think I should be this way. > > There are times when I have a bunch of variables that I want to collect > into an object or destructure. This involves adding commas, and swapping > the :/=. I don't have a good fix for the adding of commas (maybe a > newline?) > but I think : should at least be accepted as = everywhere except in ifs: > a: 1 # same as a = 1 > > One area where it might help (although the python parser already catches > it) is in ifs: > if a:1 # always error ? > if a=1 # currently error, but might be accepted shorthand for == ? > > Ideally, I could take > a: 1 > b: 2 > This conflicts with PEP 526 variable annotations: "a: int" already means "a is of type int", but with your syntax there would be no way to distinguish between "a = int" and "a: int". > then in 3 edits: > 1. 
first line prepend 'x: {' > 2. last line append '}' > 3. indent between { } > > I guess his would imply that { open up an assignment scope, where newlines > are commas if the last line did not end with an operator or the next line > did not start with an operator: > x: { > a: x - # - operator > f(x) > b: # : operator > 5487234728394720348988734574357 > c: 7 # c is 13, no trailing operator but next line has a preceding > operator > + 6 > } > > The only issue then is how do we address a? > x.a # looks fine to me > x['a'] # as a dict, but the conversion of a to string 'a' could be > confusing. > > > Additionally, I was also thinking about : as an implied await: > a = await f() # await generator > a: f() # await generator, or direct assignment if return type is > not a generator/async func > > Thoughts? Please be gentle :-) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Fri Jul 21 13:59:14 2017 From: mertz at gnosis.cx (David Mertz) Date: Fri, 21 Jul 2017 10:59:14 -0700 Subject: [Python-ideas] Idea : for smarter assignment? In-Reply-To: References: Message-ID: On Fri, Jul 21, 2017 at 10:19 AM, Brett Cannon wrote: > On Fri, 21 Jul 2017 at 10:08 Jason H wrote: > >> I experimented with Python in college and I've been for close to 20 years >> now. (Coming and going as needed) I love the language. But there is one >> annoyance that I continually run into. >> >> There are basically two assignment operators, based on context, = and : >> a = 1 >> { a: 1 } >> > The `=` isn't an assignment operator, it's a *binding*. The name 'a' gets bound to the integer object "1" in your example. Don't confuse this with a language like C where it really is an assignment. 
If I later write:

    a = 2

I haven't changed the "cell" that contains the integer object, I've rebound the NAME `a` to a different object.

But you've left out quite a few binding operations. I might forget some, but here are several:

    import a        # bind the name `a` to a module object

    with open(fname) as a:
        pass        # bind the name `a` to a file handle

    for a in [1]:
        pass        # bind the name `a` to each of the objects in an iterable
                    # ... In this case, the net result is identical to `a=1`

    def a(): pass   # bind the name `a` to a function object defined in the body

    class a: pass   # bind the name `a` to a class object defined in the body

With a bit of circuitous code, you *can* use a dictionary to bind a variable too:

    >>> globals().update({'a':1})
    >>> a
    1

-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From apalala at gmail.com Fri Jul 21 14:36:54 2017 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Fri, 21 Jul 2017 14:36:54 -0400 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID:
> Honestly I would like to declare the bare (x=1, y=0) proposal dead. Let's
> encourage the use of objects rather than tuples (named or otherwise) for
> most data exchanges. I know of a large codebase that uses dicts instead of
> objects, and it's a mess. I expect the bare ntuple to encourage the same
> chaos.

Languages since the original Pascal have had a way to define types by structure. If Python did the same, ntuples with the same structure would be typed "objects" that are not pre-declared.
In Python's case, because typing of fields is not required and thus can't be used to hint the structure's type, the names and order of fields could be used. Synthesizing a (reserved) type name for (x=1, y=0) should be straightforward. In short, >>> isinstance((x=None, y=None), type((x=1, y=0))) True That can be implemented with namedtuple with some ingenious mangling for the (quasi-anonymous) type name. Equivalence of types by structure is useful, and is very different from the mess that using dicts as records can produce. Cheers, -- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From desmoulinmichel at gmail.com Fri Jul 21 16:11:07 2017 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Fri, 21 Jul 2017 22:11:07 +0200 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: Having to define objects for scripts or small projects is really adding a burden. A namedtuple literal strikes the perfect balance between expressiveness and concision for those cases. Le 21/07/2017 à 17:18, Guido van Rossum a écrit : > Honestly I would like to declare the bare (x=1, y=0) proposal dead. > Let's encourage the use of objects rather than tuples (named or > otherwise) for most data exchanges. I know of a large codebase that uses > dicts instead of objects, and it's a mess. I expect the bare ntuple to > encourage the same chaos.
> > -- > --Guido van Rossum (python.org/~guido ) > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From toddrjen at gmail.com Sun Jul 23 12:08:03 2017 From: toddrjen at gmail.com (Todd) Date: Sun, 23 Jul 2017 12:08:03 -0400 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: On Jul 20, 2017 1:13 AM, "David Mertz" wrote: I'm concerned about losing access to type information (i.e. the name) in this proposal. For example, I might write some code like this now: >>> from collections import namedtuple >>> Car = namedtuple("Car", "cost hp weight") >>> Motorcycle = namedtuple("Motorcycle", "cost hp weight") >>> smart = Car(18_900, 89, 949) >>> harley = Motorcycle(18_900, 89, 949) >>> if smart==harley and type(smart)==type(harley): ... print("These are identical vehicles") The proposal to define this as: >>> smart = (cost=18_900, hp=89, weight=949) >>> harley = (cost=18_900, hp=89, weight=949) Doesn't seem to leave any way to distinguish the objects of different types that happen to have the same fields. Comparing `smart._fields==harley._fields` doesn't help here, nor does any type constructed solely from the fields. What about making a syntax to declare a type? The ones that come to mind are name = (x=, y=) Or name = (x=pass, y=pass) They may not be clear enough, though. -------------- next part -------------- An HTML attachment was scrubbed... URL: From desmoulinmichel at gmail.com Sun Jul 23 13:47:16 2017 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Sun, 23 Jul 2017 19:47:16 +0200 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: <8dcbd07b-7bd8-7cb4-db71-6ec46d520b44@gmail.com> I'm not sure why everybody has such a grip on the type.
When we use regular tuples, no one cares, it's all tuples, no matter what. Well in that case, let's make all those namedtuples and be done with it. If somebody really needs a type, this person will either use collections.namedtuple the old way, or use a namespace or a class. If using the type "namedtuple" is an issue because it already exists, let's find a name for this new type that conveys the meaning, like labelledtuple or something. The whole point of this is to make it a literal, simple and quick to use. If you make it more than it is, we already have everything to do this and don't need to modify the language. Le 23/07/2017 à 18:08, Todd a écrit : > > > On Jul 20, 2017 1:13 AM, "David Mertz" > wrote: > > I'm concerned about losing access to type > information (i.e. the name) in this proposal. For example, I might > write some code like this now: > > >>> from collections import namedtuple > >>> Car = namedtuple("Car", "cost hp weight") > >>> Motorcycle = namedtuple("Motorcycle", "cost hp weight") > >>> smart = Car(18_900, 89, 949) > >>> harley = Motorcycle(18_900, 89, 949) > >>> if smart==harley and type(smart)==type(harley): > ... print("These are identical vehicles") > > The proposal to define this as: > > >>> smart = (cost=18_900, hp=89, weight=949) > >>> harley = (cost=18_900, hp=89, weight=949) > > Doesn't seem to leave any way to distinguish the objects of > different types that happen to have the same fields. Comparing > `smart._fields==harley._fields` doesn't help here, nor does any type > constructed solely from the fields. > > > What about making a syntax to declare a type? The ones that come to mind are > > name = (x=, y=) > > Or > > name = (x=pass, y=pass) > > They may not be clear enough, though.
> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From markusmeskanen at gmail.com Sun Jul 23 14:36:30 2017 From: markusmeskanen at gmail.com (Markus Meskanen) Date: Sun, 23 Jul 2017 21:36:30 +0300 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <8dcbd07b-7bd8-7cb4-db71-6ec46d520b44@gmail.com> References: <8dcbd07b-7bd8-7cb4-db71-6ec46d520b44@gmail.com> Message-ID: 23.7.2017 20.59 "Michel Desmoulin" wrote: I'm not sure why everybody have such a grip on the type. When we use regular tuples, noone care, it's all tuples, no matter what. Well in that case, let's make all those namedtuple and be done with it. If somebody really needs a type, this person will either used collections.namedtuple the old way, or use a namespace or a class. If using the type "namedtuple" is an issue because it already exist, let's find a name for this new type that convey the meaning, like labelledtuple or something. The whole point of this is to make it a litteral, simple and quick to use. If you make it more than it is, we already got everything to do this and don't need to modify the language. +1 to this, why not just have: type((x=0, y=0)) == namedtuple similar to how tuples work. If you want to go advanced, feel free to use classes. Also, would it be crazy to suggest mixing tuples and named tuples: >>> t = ('a', x='b', y='c', 'd') >>> t[0], t[2] ('a', 'c') >>> t.y 'c' Just an idea, I'm not sure if it would have any use though. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From python at mrabarnett.plus.com Sun Jul 23 14:56:42 2017 From: python at mrabarnett.plus.com (MRAB) Date: Sun, 23 Jul 2017 19:56:42 +0100 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: <32afe663-ca0f-153d-72df-332f9c8b0485@mrabarnett.plus.com> On 2017-07-23 17:08, Todd wrote: > > On Jul 20, 2017 1:13 AM, "David Mertz" > wrote: > > I'm concerned in the proposal about losing access to type > information (i.e. name) in this proposal. For example, I might > write some code like this now: > > >>> from collections import namedtuple > >>> Car = namedtuple("Car", "cost hp weight") > >>> Motorcycle = namedtuple("Motorcycle", "cost hp weight") > >>> smart = Car(18_900, 89, 949) > >>> harley = Motorcyle(18_900, 89, 949) > >>> if smart==harley and type(smart)==type(harley): > ... print("These are identical vehicles") > > The proposal to define this as: > > >>> smart = (cost=18_900, hp=89, weight=949) > >>> harley = (cost=18_900, hp=89, weight=949) > > Doesn't seem to leave any way to distinguish the objects of > different types that happen to have the same fields. Comparing > `smart._fields==harley._fields` doesn't help here, nor does any type > constructed solely from the fields. > > > What about making a syntax to declare a type? The ones that come to mind are > > name = (x=, y=) > > Or > > name = (x=pass, y=pass) > > They may not be clear enough, though. > Guido has already declared that he doesn't like those bare forms, so it'll probably be something like ntuple(...). 
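An `ntuple(...)` builtin along those lines can be prototyped today on top of collections.namedtuple. The sketch below is an assumption, not anything from the actual PR or thread: the helper name, the cache, and the mangled type name are illustrative. It synthesizes one class per distinct field-name sequence, so structurally identical calls share a type, echoing Juancarlo's structural-equivalence idea:

```python
from collections import namedtuple

_cache = {}  # one synthesized class per distinct field-name sequence

def ntuple(**fields):
    # Keyword arguments keep their call order (Python 3.6+), so the
    # field sequence is well defined.  The type name is mangled from
    # the field names, as suggested upthread.
    names = tuple(fields)
    cls = _cache.get(names)
    if cls is None:
        cls = namedtuple('ntuple_' + '_'.join(names), names)
        _cache[names] = cls
    return cls(**fields)

smart = ntuple(cost=18_900, hp=89, weight=949)
harley = ntuple(cost=18_900, hp=89, weight=949)
assert type(smart) is type(harley)   # same structure, same synthesized type
assert smart == harley and smart.hp == 89
```

Note that this is exactly the behaviour David Mertz objected to: two records with the same fields are indistinguishable by type, which is the point of contention in this thread.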
From c at anthonyrisinger.com Sun Jul 23 16:26:52 2017 From: c at anthonyrisinger.com (C Anthony Risinger) Date: Sun, 23 Jul 2017 15:26:52 -0500 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <32afe663-ca0f-153d-72df-332f9c8b0485@mrabarnett.plus.com> Message-ID: On Jul 23, 2017 1:56 PM, "MRAB" wrote: On 2017-07-23 17:08, Todd wrote: > On Jul 20, 2017 1:13 AM, "David Mertz" mertz at gnosis.cx>> wrote: > > I'm concerned in the proposal about losing access to type > information (i.e. name) in this proposal. For example, I might > write some code like this now: > > >>> from collections import namedtuple > >>> Car = namedtuple("Car", "cost hp weight") > >>> Motorcycle = namedtuple("Motorcycle", "cost hp weight") > >>> smart = Car(18_900, 89, 949) > >>> harley = Motorcyle(18_900, 89, 949) > >>> if smart==harley and type(smart)==type(harley): > ... print("These are identical vehicles") > > The proposal to define this as: > > >>> smart = (cost=18_900, hp=89, weight=949) > >>> harley = (cost=18_900, hp=89, weight=949) > > Doesn't seem to leave any way to distinguish the objects of > different types that happen to have the same fields. Comparing > `smart._fields==harley._fields` doesn't help here, nor does any type > constructed solely from the fields. > > > What about making a syntax to declare a type? The ones that come to mind > are > > name = (x=, y=) > > Or > > name = (x=pass, y=pass) > > They may not be clear enough, though. > > Guido has already declared that he doesn't like those bare forms, so it'll probably be something like ntuple(...). Not exactly a literal in that case. If this is true, why not simply add keyword arguments to tuple(...)? Something like (a=1, b=2, c=3) makes very clear sense to me, or even (1, 2, c=3), where the first two are accessible by index only. Or even (1, 2, c: 3), which reminds me of Elixir's expansion of tuple and list keywords. A tuple is a tuple is a tuple. No types. 
Just convenient accessors. -- C Anthony -------------- next part -------------- An HTML attachment was scrubbed... URL: From turnbull.stephen.fw at u.tsukuba.ac.jp Sun Jul 23 21:54:57 2017 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Mon, 24 Jul 2017 10:54:57 +0900 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <32afe663-ca0f-153d-72df-332f9c8b0485@mrabarnett.plus.com> Message-ID: <22901.21361.654220.556617@turnbull.sk.tsukuba.ac.jp> C Anthony Risinger writes: > A tuple is a tuple is a tuple. No types. Just convenient accessors. That's not possible, though. A *tuple* is an immutable collection indexed by the natural numbers, which is useful to define as a single type precisely because the natural numbers are the canonical abstraction of "sequence". You can use the venerable idiom X = 0 Y = 1 point = (1.0, 1.0) x = point[X] to give the tuple "named attributes", restricting the names to Python identifiers. Of course this lacks the "namespace" aspect of namedtuple, where ".x" has the interpretation of "[0]" only in the context of a namedtuple with an ".x" attribute. But this is truly an untyped tuple-with-named-attributes. However, once you attach specific names to particular indexes, you have a type. The same attribute identifiers may be reused to correspond to different indexes to represent a different "convenience type". Since we need to be able to pass these objects to functions, pickle them, etc, that information has to be kept in the object somewhere, either directly (losing the space efficiency of tuples) or indirectly in a class-like structure. I see the convenience of the unnamed-type-typed tuple, but as that phrase suggests, I think it's fundamentally incoherent, a high price to pay for a small amount of convenience. Note that this is not an objection to a forgetful syntax that creates a namedtuple subtype but doesn't bother to record the type name explicitly in the program. 
In fact, we already have that: >>> from collections import namedtuple >>> a = namedtuple('_', ['x', 'y'])(0,1) >>> b = namedtuple('_', ['x', 'y'])(0,1) >>> a == b True >>> c = namedtuple('_', ['a', 'b'])(0,1) This even gives you free equality as I suppose you want it: >>> a == c True >>> a.x == c.a True >>> a.a == c.x Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: '_' object has no attribute 'a' >>> c.x == a.a Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: '_' object has no attribute 'x' Bizarre errors are the inevitable price to pay for this kind of abuse, of course. I'm not a fan of syntaxes like "(x=0, y=1)" or "(x:0, y:1)", but I'll leave it up to others to decide how to abbreviate the abominably ugly notation I used. Steve From ethan at stoneleaf.us Mon Jul 24 07:02:33 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 24 Jul 2017 04:02:33 -0700 Subject: [Python-ideas] namedtuple redesign goals Message-ID: <5975D3C9.5060104@stoneleaf.us> On 07/23/2017 10:47 AM, Michel Desmoulin wrote: > I'm not sure why everybody has such a grip on the type. If I understand the goal of "a new namedtuple" correctly, it is not to come up with yet another namedtuple type -- it is to make the existing collections.namedtuple a faster experience, and possibly add another way to create such a thing.
This means that the "replacement" namedtuple MUST be backwards compatible with the existing collections.namedtuple, and keeping track of type is one of the things it does: --> from collections import namedtuple --> Point = namedtuple('Point', 'x y') --> p1 = Point(3, 7) --> p1.x 3 --> p1.y 7 --> isinstance(p1, Point) True -- ~Ethan~ From ethan at stoneleaf.us Mon Jul 24 07:45:58 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 24 Jul 2017 04:45:58 -0700 Subject: [Python-ideas] namedtuple redesign goals In-Reply-To: References: <5975D3C9.5060104@stoneleaf.us> Message-ID: <5975DDF6.3010304@stoneleaf.us> [redirecting back to list] On 07/24/2017 04:19 AM, Michel Desmoulin wrote: > Le 24/07/2017 ? 13:02, Ethan Furman a ?crit : >> On 07/23/2017 10:47 AM, Michel Desmoulin wrote: >>> I'm not sure why everybody have such a grip on the type. >> >> If I understand the goal of "a new namedtuple" correctly, it is not to >> come up with yet another namedtuple type -- it is to make the existing >> collections.namedtuple a faster experience, and possibly add another way >> to create such a thing. >> >> This means that the "replacement" namedtuple MUST be backwards >> compatible with the existing collections.namedtuple, and keeping track >> of type is one of the things it does: > > Is it ? Maybe we should check that, cause we may be arguing around a "nice to have" for nothing. Um, yes, it is. Did you not read the section you snipped? [1] > How many people among those intereted by the proposal have a strong need for the type ? Whether there is a strong need for it is largely irrelevant; it's there now, it needs to stay. If we were to remove it there would need to be a strong need for what we gain and that it outweighs the broken backward compatibility commitment that we try very hard to maintain. -- ~Ethan~ [1] My apologies for the first paragraph if this is a language translation issue and you were talking about the backwards compatibility and not the type tracking. 
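Type tracking is only one part of that contract. A quick sketch, using today's collections.namedtuple, of the public surface a backwards-compatible replacement would also have to preserve:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
p1 = Point(3, 7)

assert isinstance(p1, Point) and isinstance(p1, tuple)  # type tracking
assert p1 == (3, 7) and p1[0] == 3                      # plain-tuple behaviour
assert p1._fields == ('x', 'y')                         # field introspection
assert dict(p1._asdict()) == {'x': 3, 'y': 7}           # dict conversion
assert p1._replace(x=4) == Point(4, 7)                  # functional update
assert Point._make([1, 2]) == Point(1, 2)               # alternate constructor
```

(`_asdict()` returns an OrderedDict on the Python versions discussed here, so it is normalized with `dict()` before comparing.)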
From steve at pearwood.info Mon Jul 24 09:31:19 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 24 Jul 2017 23:31:19 +1000 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <8dcbd07b-7bd8-7cb4-db71-6ec46d520b44@gmail.com> References: <8dcbd07b-7bd8-7cb4-db71-6ec46d520b44@gmail.com> Message-ID: <20170724133117.GO3149@ando.pearwood.info> On Sun, Jul 23, 2017 at 07:47:16PM +0200, Michel Desmoulin wrote: > I'm not sure why everybody has such a grip on the type. > > When we use regular tuples, no one cares, it's all tuples, no matter what. Some people care. This is one of the serious disadvantages of ordinary tuples as a record/struct type. There's no way to distinguish between (let's say) rectangular coordinates (1, 2) and polar coordinates (1, 2), or between (name, age) and (movie_title, score). They're all just 2-tuples. [...] > The whole point of this is to make it a literal, simple and quick to > use. If you make it more than it is, we already have everything to do > this and don't need to modify the language. I disagree: in my opinion, the whole point is to make namedtuple faster, so that Python's startup time isn't affected so badly. Creating new syntax for a new type of tuple is scope-creep. Even if we had that new syntax, the problem of namedtuple slowing down Python startup would remain. People can't use this new syntax until they have dropped support for everything before 3.7, which might take many years. But a fast namedtuple will give them benefit the moment their users upgrade to 3.7. I agree that there is a strong case to be made for a fast, built-in, easy way to make record/structs without having to pre-declare them. But as the Zen of Python says: Now is better than never. Although never is often better than *right* now. Let's not rush into designing a poor record/struct builtin just because we have a consensus (Raymond dissenting?) that namedtuple is too slow.
The two issues are, not unrelated, but orthogonal. Record syntax would still be useful even if namedtuple was accelerated, and faster namedtuple would still be necessary even if we have record syntax. I believe that a couple of people (possibly including Guido?) are already thinking about a PEP for that. If that's the case, let's wait and see what they come up with. In the meantime, let's get back to the original question here: how can we make namedtuple faster? - Guido has ruled out using a metaclass as the implementation, as that makes it hard to inherit from namedtuple and another class with a different metaclass. - Backwards compatibility is a must. - *But* maybe we can afford to bend backwards compatibility a bit. Perhaps we don't need to generate the *entire* class using exec, just __new__. - I don't think that the _source attribute itself makes namedtuple slow. That might affect the memory usage of the class object itself, but it's just a name binding: result._source = class_definition The expensive part is, I'm fairly sure, this: exec(class_definition, namespace) (Taken from the 3.5 collections/__init__.py.) I asked on PythonList at python.org whether people made use of the _source attribute, and the overwhelming response was that they either didn't know it existed, or if they did know, they didn't use it. https://mail.python.org/pipermail/python-list/2017-July/723888.html *If* it is accurate to say that nobody uses _source, then perhaps we might be willing to make this minor backwards-incompatible change in 3.7 (but not in a bug-fix release): - Only the __new__ method is generated by exec (my rough tests suggest that may make namedtuple four times faster); - _source only gives the source to __new__; - or perhaps we can save backwards compatibility by making _source generate the rest of the template lazily, when needed, even if the entire template isn't used by exec. That risks getting the *actual* source and the *reported* source getting out of sync.
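The "generate only __new__ with exec" bullet can be sketched as follows. This is an illustration of the idea only, not the real implementation: field-name validation, __repr__, _make/_replace/_asdict, pickling support, and _source are all omitted.

```python
from operator import itemgetter

def fast_namedtuple(typename, field_names):
    fields = tuple(field_names.split())
    arglist = ', '.join(fields)
    # Only the argument-handling __new__ goes through exec; everything
    # else is built with ordinary Python objects instead of a big
    # exec'd class template.
    namespace = {}
    exec(f'def __new__(cls, {arglist}):\n'
         f'    return tuple.__new__(cls, ({arglist},))\n', namespace)
    attrs = {'__new__': namespace['__new__'],
             '__slots__': (),
             '_fields': fields}
    for index, name in enumerate(fields):
        attrs[name] = property(itemgetter(index))  # positional field access
    return type(typename, (tuple,), attrs)

Point = fast_namedtuple('Point', 'x y')
p = Point(3, 7)
assert p == (3, 7) and p.x == 3 and p.y == 7
```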
Maybe it's better to just break compatibility rather than risk introducing a discrepancy between the two. -- Steve From ncoghlan at gmail.com Mon Jul 24 10:12:33 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 25 Jul 2017 00:12:33 +1000 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: On 22 July 2017 at 01:18, Guido van Rossum wrote: > Honestly I would like to declare the bare (x=1, y=0) proposal dead. Let's > encourage the use of objects rather than tuples (named or otherwise) for > most data exchanges. I know of a large codebase that uses dicts instead of > objects, and it's a mess. I expect the bare ntuple to encourage the same > chaos. That sounds sensible to me - given ordered keyword arguments, anything that bare syntax could do can be done with a new builtin instead, and be inherently more self-documenting as a result. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From chris.barker at noaa.gov Mon Jul 24 12:20:47 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 24 Jul 2017 09:20:47 -0700 Subject: [Python-ideas] HTTP compression support for http.server In-Reply-To: References: Message-ID: The opinion of some random guy on the list... On Thu, Jul 20, 2017 at 12:15 AM, Pierre Quentel wrote: > I have been suggested to require feedback from core devs : - should HTTP compression be supported ? > Yes. You are quite right, it's pretty standard stuff these days. > - if so, should it be supported by default ? It is the case in the PR, > where a number of content types, eg text/html, are compressed if the user > agent accepts the gzip "encoding" > I'm pretty wary of compression happening by default -- i.e. someone runs exactly the same code with a newer version of Python, and suddenly some content is getting compressed. - if not, should the implementation of http.server be adapted so that > subclasses could implement it ?
For the moment the only way to add it is to > modify method send_head() of SimpleHTTPRequestHandler > sure -- though it would be nice for folks to be able to use compression without going through that process. The implementation is based on a list of types to compress > (SimpleHTTPServer.compressed_types) that can be modified at will, eg set > to the empty list to disable compression. > How about having it be an empty list by default and have one or more lists of common types be pre-populated and available in the SimpleHTTPServer namespace. that is: SimpleHTTPServer.compressed_types = SimpleHTTPServer.standard_compressed_types Or may be a method to turn on the "standard" set -- though if it really is simply a list, better to expose that so it's obvious that you can create your own list or examine and edit the existing one(s). Thanks for doing this -- nice feature! -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Mon Jul 24 12:30:53 2017 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 25 Jul 2017 02:30:53 +1000 Subject: [Python-ideas] HTTP compression support for http.server In-Reply-To: References: Message-ID: On Tue, Jul 25, 2017 at 2:20 AM, Chris Barker wrote: > On Thu, Jul 20, 2017 at 12:15 AM, Pierre Quentel > wrote: >> - if so, should it be supported by default ? It is the case in the PR, >> where a number of content types, eg text/html, are compressed if the user >> agent accepts the gzip "encoding" > > > I'm pretty wary of compression happening by default -- i.e. someone runs > exactly the same code with a newer version of Python, and suddenly some > content is getting compressed. FWIW I'm quite okay with that. 
HTTP already has a mechanism for negotiating compression (Accept-Encoding), designed to be compatible with servers that don't support it. Any time a server gains support for something that clients already support, it's going to start happening as soon as you upgrade. Obviously this kind of change won't be happening in a bugfix release of Python, so it would be part of the regular checks when you upgrade from 3.6 to 3.7 - it'll be in the NEWS file and so on, so you read up on it before you upgrade. ChrisA From desmoulinmichel at gmail.com Mon Jul 24 12:37:37 2017 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Mon, 24 Jul 2017 18:37:37 +0200 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <20170724133117.GO3149@ando.pearwood.info> References: <8dcbd07b-7bd8-7cb4-db71-6ec46d520b44@gmail.com> <20170724133117.GO3149@ando.pearwood.info> Message-ID: <4f64e52e-6877-2b8c-cff5-f5dd0e139233@gmail.com> Le 24/07/2017 ? 15:31, Steven D'Aprano a ?crit : > On Sun, Jul 23, 2017 at 07:47:16PM +0200, Michel Desmoulin wrote: > >> I'm not sure why everybody have such a grip on the type. >> >> When we use regular tuples, noone care, it's all tuples, no matter what. > > Some people care. > > This is one of the serious disadvantages of ordinary tuples as a > record/struct type. There's no way to distinguish between (let's say) > rectangular coordinates (1, 2) and polar coordinates (1, 2), or between > (name, age) and (movie_title, score). They're all just 2-tuples. You are just using my figure of speech as a way to counter argument. It's not a very useful thing to do. Of course some people care, there are always a few people caring about anything. But you just created your manual namedtuple or a namespace and be done with it. Rejecting completly the literal syntax just because it doesn't improve this use case you already had and worked but was a bit verbose is very radical. 
Unless you have a very nice counter proposal that makes everyone happy, accepting the current one doesn't take anything from you. > > > [...] >> The whole point of this is to make it a literal, simple and quick to >> use. If you make it more than it is, we already have everything to do >> this and don't need to modify the language. > > I disagree: in my opinion, the whole point is to make namedtuple faster, > so that Python's startup time isn't affected so badly. Creating new > syntax for a new type of tuple is scope-creep. You are in the wrong thread. This thread is specifically about namedtuple literals. Making namedtuple faster can be done in many other ways and doesn't require a literal syntax. A literal syntax, while making things slightly faster by nature, is essentially there to make things faster to read and write. > > Even if we had that new syntax, the problem of namedtuple slowing down > Python startup would remain. People can't use this new syntax until they > have dropped support for everything before 3.7, which might take many > years. But a fast namedtuple will give them benefit the moment their > users upgrade to 3.7. Again you are mixing the 2 things. This is why we have 2 threads: the debate split. > > I agree that there is a strong case to be made for a fast, built-in, > easy way to make record/structs without having to pre-declare them. Do other languages have such a thing that can be checked against types ? > But > as the Zen of Python says: > > Now is better than never. > Although never is often better than *right* now. > I agree. I don't think we need to rush it. I can live without it now. I can live without it at all. > Let's not rush into designing a poor record/struct builtin just because > we have a consensus (Raymond dissenting?) that namedtuple is too slow. We don't. We can solve the slowness problem without the namedtuple literal. The literal is a convenience. > The two issues are, not unrelated, but orthogonal.
Record syntax would > still be useful even if namedtuple was accelerated, and faster > namedtuple would still be necessary even if we have record syntax. On that we agree. > > I believe that a couple of people (possibly including Guido?) are > already thinking about a PEP for that. If that's the case, let's wait > and see what they come up with. Yes, but it's about making classes less verbose if I recall. Or at least it uses the class syntax. It's nice but not the same thing. Namedtuple literals are way more suited for scripting. You really don't want to write a class in quick scripts, when you do exploratory programming or data analysis on the fly. > > In the meantime, let's get back to the original question here: how can we > make namedtuple faster? Then go to the other thread for that. > > - Guido has ruled out using a metaclass as the implementation, > as that makes it hard to inherit from namedtuple and another > class with a different metaclass. > > - Backwards compatibility is a must. > > - *But* maybe we can afford to bend backwards compatibility > a bit. Perhaps we don't need to generate the *entire* class > using exec, just __new__. > > - I don't think that the _source attribute itself makes > namedtuple slow. That might affect the memory usage of the > class object itself, but it's just a name binding: > > result._source = class_definition > > The expensive part is, I'm fairly sure, this: > > exec(class_definition, namespace) > > (Taken from the 3.5 collections/__init__.py.) > > I asked on PythonList at python.org whether people made use of the _source > attribute, and the overwhelming response was that they either didn't > know it existed, or if they did know, they didn't use it.
> > https://mail.python.org/pipermail/python-list/2017-July/723888.html > > > *If* it is accurate to say that nobody uses _source, then perhaps we > might be willing to make this minor backwards-incompatible change in 3.7 > (but not in a bug-fix release): > > - Only the __new__ method is generated by exec (my rough tests > suggest that may make namedtuple four times faster); > > - _source only gives the source to __new__; > > - or perhaps we can save backwards compatibility by making _source > generate the rest of the template lazily, when needed, even if > the entire template isn't used by exec. > > That risks getting the *actual* source and the *reported* source > getting out of sync. Maybe it's better to just break compatibility rather > than risk introducing a discrepancy between the two. > > > From desmoulinmichel at gmail.com Mon Jul 24 12:46:53 2017 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Mon, 24 Jul 2017 18:46:53 +0200 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> Le 24/07/2017 à 16:12, Nick Coghlan a écrit : > On 22 July 2017 at 01:18, Guido van Rossum wrote: >> Honestly I would like to declare the bare (x=1, y=0) proposal dead. Let's >> encourage the use of objects rather than tuples (named or otherwise) for >> most data exchanges. I know of a large codebase that uses dicts instead of >> objects, and it's a mess. I expect the bare ntuple to encourage the same >> chaos. This is people who work on big code bases talking. Remember, Python is not just for Google and Dropbox. We have thousands of users, sysadmins, mathematicians, bankers, analysts, who just want a quick way to make a record. They don't want nor need a class. Dictionaries and collections.namedtuple are verbose, and so they just use regular tuples. They don't use mypy either, so having a type would be moot for them.
In many languages we have the opposite problem: people using classes as a container for everything. It makes things very complicated with little value. Python actually has a good balance here. Yes, Python doesn't have pattern matching, which makes it harder to check if a nested data structure matches the desired schema, but all in all, the bloat/expressiveness equilibrium is quite nice. A literal namedtuple would allow a clearer way to make a quick and simple record. > > That sounds sensible to me - given ordered keyword arguments, anything > that bare syntax could do can be done with a new builtin instead, and > be inherently more self-documenting as a result. > > Cheers, > Nick. > From desmoulinmichel at gmail.com Mon Jul 24 12:49:04 2017 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Mon, 24 Jul 2017 18:49:04 +0200 Subject: [Python-ideas] namedtuple redesign goals In-Reply-To: <5975DDF6.3010304@stoneleaf.us> References: <5975D3C9.5060104@stoneleaf.us> <5975DDF6.3010304@stoneleaf.us> Message-ID: <45220c91-b6e7-364f-6395-b41a833ef0ae@gmail.com> Le 24/07/2017 à 13:45, Ethan Furman a écrit : > [redirecting back to list] > > On 07/24/2017 04:19 AM, Michel Desmoulin wrote: >> Le 24/07/2017 à 13:02, Ethan Furman a écrit : >>> On 07/23/2017 10:47 AM, Michel Desmoulin wrote: > >>>> I'm not sure why everybody has such a grip on the type. >>> >>> If I understand the goal of "a new namedtuple" correctly, it is not to >>> come up with yet another namedtuple type -- it is to make the existing >>> collections.namedtuple a faster experience, and possibly add another way >>> to create such a thing. >>> >>> This means that the "replacement" namedtuple MUST be backwards >>> compatible with the existing collections.namedtuple, and keeping track >>> of type is one of the things it does: >> >> Is it ? Maybe we should check that, cause we may be arguing around a >> "nice to have" for nothing. > > Um, yes, it is. Did you not read the section you snipped?
[1] > >> How many people among those interested in the proposal have a strong >> need for the type? > > Whether there is a strong need for it is largely irrelevant; it's there > now, it needs to stay. If we were to remove it there would need to be a > strong need for what we gain and that it outweighs the broken backward > compatibility commitment that we try very hard to maintain. You are assuming a namedtuple literal would mean collections.namedtuple would lose the type ability. That's not the case. The literals can be a complement, not a replacement. Accelerating namedtuple can be done by rewriting it in C. The literal namedtuple is not necessary for that. > > -- > ~Ethan~ > > [1] My apologies for the first paragraph if this is a language > translation issue and you were talking about the backwards compatibility > and not the type tracking. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From chris.barker at noaa.gov Mon Jul 24 12:50:36 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 24 Jul 2017 09:50:36 -0700 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <20170724133117.GO3149@ando.pearwood.info> References: <8dcbd07b-7bd8-7cb4-db71-6ec46d520b44@gmail.com> <20170724133117.GO3149@ando.pearwood.info> Message-ID: On Mon, Jul 24, 2017 at 6:31 AM, Steven D'Aprano wrote: > > I'm not sure why everybody has such a grip on the type. > > > > When we use regular tuples, no one cares, it's all tuples, no matter what. > > Some people care. > > This is one of the serious disadvantages of ordinary tuples as a > record/struct type. There's no way to distinguish between (let's say) > rectangular coordinates (1, 2) and polar coordinates (1, 2), or between > (name, age) and (movie_title, score). They're all just 2-tuples.
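Steven's point goes even further than plain tuples: since namedtuple inherits equality from tuple, giving the two coordinate pairs distinct named types still doesn't make them distinguishable by `==`. A small sketch (the type names are invented for illustration):

```python
from collections import namedtuple

Rect = namedtuple("Rect", ["x", "y"])
Polar = namedtuple("Polar", ["r", "theta"])

# Equality is inherited from tuple, so two semantically unrelated
# coordinate pairs -- even as distinct namedtuple types -- compare equal:
assert Rect(1, 2) == Polar(1, 2)
assert Rect(1, 2) == (1, 2)

# The field names do, at least, make intent readable at the access site:
assert Polar(1, 2).theta == 2
```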
>

sure -- but Python is dynamically typed, and we all like to talk about it as duck typing -- so asking "Is this a 'rect_coord' or a 'polar_coord' object?" isn't only unnecessary, it's considered non-pythonic. Bad example, actually, as a rect_coord would likely have names like 'x' and 'y', while a polar_coord would have 'r' and 'theta' -- showing why having a named-tuple-like structure is helpful, even without types. So back to the example before of "Motorcycle" vs "Car" -- if they have the same attributes, then who cares which it is? If there is different functionality tied to each one, then that's what classes and sub-classing are for. I think the entire point of this proposed object is that it be as lightweight as possible -- it's just a data storage object -- if you want to switch functionality on type, then use subclasses. As has been said, NamedTuple is partly the way it is because it was desired to be a drop-in replacement for a regular tuple, and needed to be reasonably implemented in pure python. If we can have an object that is:

immutable
indexable like a tuple
has named attributes
is lightweight and efficient

I think that would be very useful, and would take the place of NamedTuple for most use-cases, while being both more pythonic and more efficient. Whether it gets a literal or a simple constructor makes little difference, though if it got a literal, it would likely end up seeing much wider use (kind of like the set literal). > I disagree: in my opinion, the whole point is to make namedtuple faster, > so that Python's startup time isn't affected so badly. Creating new > syntax for a new type of tuple is scope-creep. > I think making it easier to access and use is a worthwhile goal, too. If we are re-thinking this, a little scope creep is OK. > Even if we had that new syntax, the problem of namedtuple slowing down
> People can't use this new syntax until they > have dropped support for everything before 3.7, which might take many > years. But a fast namedtuple will give them benefit immediately their > users upgrade to 3.7. > These aren't mutually exclusive, if 3.7 has collections.namedtuple wrap the new object. IIUC, the idea of cached types would mean that objects _would_ be a Type, even if that wasn't usually exposed -- so it could be exposed in the case where it was constructed from a collections.namedtuple() -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From c at anthonyrisinger.com Mon Jul 24 13:13:01 2017 From: c at anthonyrisinger.com (C Anthony Risinger) Date: Mon, 24 Jul 2017 12:13:01 -0500 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <22901.21361.654220.556617@turnbull.sk.tsukuba.ac.jp> References: <32afe663-ca0f-153d-72df-332f9c8b0485@mrabarnett.plus.com> <22901.21361.654220.556617@turnbull.sk.tsukuba.ac.jp> Message-ID: On Sun, Jul 23, 2017 at 8:54 PM, Stephen J. Turnbull < turnbull.stephen.fw at u.tsukuba.ac.jp> wrote: > C Anthony Risinger writes: > > > A tuple is a tuple is a tuple. No types. Just convenient accessors. > > That's not possible, though. A *tuple* is an immutable collection > indexed by the natural numbers, which is useful to define as a single > type precisely because the natural numbers are the canonical > abstraction of "sequence". You can use the venerable idiom > > X = 0 > Y = 1 > > point = (1.0, 1.0) > x = point[X] > > to give the tuple "named attributes", restricting the names to Python > identifiers.
Of course this lacks the "namespace" aspect of > namedtuple, where ".x" has the interpretation of "[0]" only in the > context of a namedtuple with an ".x" attribute. But this is truly an > untyped tuple-with-named-attributes. > > However, once you attach specific names to particular indexes, you > have a type. The same attribute identifiers may be reused to > correspond to different indexes to represent a different "convenience > type". Since we need to be able to pass these objects to functions, > pickle them, etc, that information has to be kept in the object > somewhere, either directly (losing the space efficiency of tuples) or > indirectly in a class-like structure. > > I see the convenience of the unnamed-type-typed tuple, but as that > phrase suggests, I think it's fundamentally incoherent, a high price > to pay for a small amount of convenience. > > Note that this is not an objection to a forgetful syntax that creates > a namedtuple subtype but doesn't bother to record the type name > explicitly in the program. In fact, we already have that: > > >>> from collections import namedtuple > >>> a = namedtuple('_', ['x', 'y'])(0,1) > >>> b = namedtuple('_', ['x', 'y'])(0,1) > >>> a == b > True > >>> c = namedtuple('_', ['a', 'b'])(0,1) > > This even gives you free equality as I suppose you want it: > > >>> a == c > True > >>> a.x == c.a > True > >>> a.a == c.x > Traceback (most recent call last): > File "", line 1, in > AttributeError: '_' object has no attribute 'a' > >>> c.x == a.a > Traceback (most recent call last): > File "", line 1, in > AttributeError: '_' object has no attribute 'x' > > Bizarre errors are the inevitable price to pay for this kind of abuse, > of course. > > I'm not a fan of syntaxes like "(x=0, y=1)" or "(x:0, y:1)", but I'll > leave it up to others to decide how to abbreviate the abominably ugly > notation I used. 
> Sure sure, this all makes sense, and I agree you can't get the accessors without storing information, somewhere, that links indexes to attributes, and it makes complete sense it might be implemented as a subtype, just like namedtuple works today. I was more commenting on what it conceptually means to have the designation "literal". It seems surprising to me that one literal has a different type from another literal with the same construction syntax. If underneath the hood it's technically a different type stored in some cached and hidden lookup table, so be it, but on the surface I think most just want a basic container with simpler named indexes. Every time I've used namedtuples, I've thought it more of a chore to pick a name for it, because it's only semi-useful to me in reprs, and I simply don't care about the type, ever. I only care about the shape for comparison with other tuples. If I want typed comparisons I can always just use a class. I'd also be perfectly fine with storing the "type" as a field on the tuple itself, because it's just a value container, and that's all I'll ever want from it. Alas, when I think I want a namedtuple, I usually end up using a dict subclass that assigns `self.__dict__ = self` within __new__, because this makes attribute access (and assignment) work automagically, and I care about that more than order (though it can be made to support both). At the end of the day, I don't see a way to have both a literal and something that is externally "named", because the only ways to pass the name I can imagine would make it look like a value within the container itself (such as using a literal string for the first item), unless even more new syntax was added. -- C Anthony -------------- next part -------------- An HTML attachment was scrubbed... 
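The dict subclass Anthony describes (binding `self.__dict__ = self` so keys double as attributes) can be sketched roughly like this; the `Record` name is invented, and `__init__` stands in for the `__new__` he mentions, which works equally well for the trick:

```python
class Record(dict):
    """A dict whose keys are also attributes (a sketch of the
    pattern described above)."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Attribute access and item access now share one namespace:
        # the instance __dict__ *is* the dict's own key/value storage.
        self.__dict__ = self

r = Record(x=1, y=2)
assert r.x == 1 and r["y"] == 2
r.z = 3                     # unlike a namedtuple, it stays mutable
assert r["z"] == 3
```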
URL: From p.f.moore at gmail.com Mon Jul 24 14:36:55 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 24 Jul 2017 19:36:55 +0100 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <4f64e52e-6877-2b8c-cff5-f5dd0e139233@gmail.com> References: <8dcbd07b-7bd8-7cb4-db71-6ec46d520b44@gmail.com> <20170724133117.GO3149@ando.pearwood.info> <4f64e52e-6877-2b8c-cff5-f5dd0e139233@gmail.com> Message-ID: On 24 July 2017 at 17:37, Michel Desmoulin wrote: > You are in the wrong thread. This thread is specifically about > namedtupels literal. In which case, did you not see Guido's post "Honestly I would like to declare the bare (x=1, y=0) proposal dead."? The namedtuple literal proposal that started this thread is no longer an option, so can we move on? Preferably by dropping the whole idea - no-one has to my mind offered any sort of "replacement namedtuple" proposal that can't be implemented as a 3rd party library on PyPI *except* the (x=1, y=0) syntax proposal, and I see no justification for adding a *fourth* implementation of this type of object in the stdlib (which means any proposal would have to include deprecation of at least one of namedtuple, structseq or types.SimpleNamespace). The only remaining discussion on the table that I'm aware of is how we implement a more efficient version of the stdlib namedtuple class (and there's not much of that to be discussed here - implementation details can be thrashed out on the tracker issue). 
Paul From ethan at stoneleaf.us Mon Jul 24 14:49:13 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 24 Jul 2017 11:49:13 -0700 Subject: [Python-ideas] namedtuple redesign goals In-Reply-To: <45220c91-b6e7-364f-6395-b41a833ef0ae@gmail.com> References: <5975D3C9.5060104@stoneleaf.us> <5975DDF6.3010304@stoneleaf.us> <45220c91-b6e7-364f-6395-b41a833ef0ae@gmail.com> Message-ID: <59764129.10006@stoneleaf.us> On 07/24/2017 09:49 AM, Michel Desmoulin wrote: > You are assuming a namedtuple literal would mean collections.namedtuple > would lose the type ability. That's not the case. The literals can be > a complement, not a replacement. > > Accelerating namedtuple can be done by rewriting it in C. The literal > namedtuple is not necessary for that. Ah, that makes sense. Personally, though, I'm not excited about another namedtuple variant. -- ~Ethan~ From chris.barker at noaa.gov Mon Jul 24 21:15:12 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 24 Jul 2017 18:15:12 -0700 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: On Fri, Jul 21, 2017 at 8:18 AM, Guido van Rossum wrote: > Honestly I would like to declare the bare (x=1, y=0) proposal dead. Let's > encourage the use of objects rather than tuples (named or otherwise) for > most data exchanges. I know of a large codebase that uses dicts instead of > objects, and it's a mess. I expect the bare ntuple to encourage the same > chaos. > I've seen the same sort of mess, but I think it's because folks have come down on the wrong side of "what's code, and what's data?" Data belongs in dicts (and tuples, and lists, and...) and code belongs in objects. With Python's dynamic nature, it is very easy to blur these lines, but the way I define it: a_point['x'] is accessing data, and a_point.x is running code. It more or less comes down to -- "if you know the names you need when you are writing the code, then it is probably code.
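A minimal sketch of that code-vs-data distinction (the names here are invented for illustration):

```python
from collections import namedtuple

# "Data": the keys arrive at runtime (parsed input, config, etc.),
# so they are spelled as strings and looked up dynamically.
row = {"x": 1, "y": 2}
total_from_data = sum(row[key] for key in row)

# "Code": the names are known when the program is written,
# so they appear as attribute access in the source itself.
Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)
total_from_code = p.x + p.y

assert total_from_data == total_from_code == 3
```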
So be wary if you are using literals for dict keys frequently. But from this perspective a NamedTuple (with or without a clearly defined type) is code, as it should be. In the duck-typing spirit, you should be able to do something like:

p = get_the_point(something)
do_something_with(p.x, p.y)

And not know or care what type p is. With this perspective, a NamedTuple, with a known type or otherwise, AVOIDS the chaos of passing dicts around, and thus should be encouraged. And indeed, making it as easy as possible to create and pass an object_with_attributes around, rather than a plain tuple or dict, would be a good thing. I do agree that we have multiple goals on the table, and DON'T want to have any more similar, but slightly different, lightweight objects with named attributes. So it makes sense to:

1) make namedtuple faster

and then, optionally:

2) make it easier to quickly whip out an (anonymous) namedtuple.

Maybe types.SimpleNamespace is the "better" solution to the above, but it hasn't gained the traction that namedtuple has. And there is a lot to be said for immutability, and the SimpleNamespace docs even say: "... for a structured record type use namedtuple() instead." -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Jul 24 21:57:18 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 25 Jul 2017 11:57:18 +1000 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> Message-ID: On 25 July 2017 at 02:46, Michel Desmoulin wrote: > Le 24/07/2017 à
16:12, Nick Coghlan a écrit : >> On 22 July 2017 at 01:18, Guido van Rossum wrote: >>> Honestly I would like to declare the bare (x=1, y=0) proposal dead. Let's >>> encourage the use of objects rather than tuples (named or otherwise) for >>> most data exchanges. I know of a large codebase that uses dicts instead of >>> objects, and it's a mess. I expect the bare ntuple to encourage the same >>> chaos. > > This is the people working on big code base talking. Dedicated syntax:

    (x=1, y=0)

New builtin:

    ntuple(x=1, y=0)

So the only thing being ruled out is the dedicated syntax option, since it doesn't let us do anything that a new builtin can't do, it's harder to find help on (as compared to "help(ntuple)" or searching online for "python ntuple"), and it can't be readily backported to Python 3.6 as part of a third party library (you can't easily backport it any further than that regardless, since you'd be missing the order-preservation guarantee for the keyword arguments passed to the builtin). Having such a builtin implicitly create and cache new namedtuple type definitions so the end user doesn't need to care about pre-declaring them is still fine, and remains the most straightforward way of building a capability like this atop the underlying `collections.namedtuple` type. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Mon Jul 24 23:20:42 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 25 Jul 2017 13:20:42 +1000 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> Message-ID: On 25 July 2017 at 11:57, Nick Coghlan wrote: > Having such a builtin implicitly create and cache new namedtuple type > definitions so the end user doesn't need to care about pre-declaring > them is still fine, and remains the most straightforward way of > building a capability like this atop the underlying > `collections.namedtuple` type.
I've updated the example I posted in the other thread with all the necessary fiddling required for full pickle compatibility with auto-generated collections.namedtuple type definitions: https://gist.github.com/ncoghlan/a79e7a1b3f7dac11c6cfbbf59b189621

This shows that given ordered keyword arguments as a building block, most of the actual implementation complexity now lies in designing an implicit type cache that plays nicely with the way pickle works:

from collections import namedtuple

class _AutoNamedTupleTypeCache(dict):
    """Pickle compatibility helper for autogenerated collections.namedtuple type definitions"""

    def __new__(cls):
        # Ensure that unpickling reuses the existing cache instance
        self = globals().get("_AUTO_NTUPLE_TYPE_CACHE")
        if self is None:
            maybe_self = super().__new__(cls)
            self = globals().setdefault("_AUTO_NTUPLE_TYPE_CACHE", maybe_self)
        return self

    def __missing__(self, fields):
        cls_name = "_ntuple_" + "_".join(fields)
        return self._define_new_type(cls_name, fields)

    def __getattr__(self, cls_name):
        parts = cls_name.split("_")
        if not parts[:2] == ["", "ntuple"]:
            raise AttributeError(cls_name)
        fields = tuple(parts[2:])
        return self._define_new_type(cls_name, fields)

    def _define_new_type(self, cls_name, fields):
        cls = namedtuple(cls_name, fields)
        cls.__module__ = __name__
        cls.__qualname__ = "_AUTO_NTUPLE_TYPE_CACHE." + cls_name
        # Rely on setdefault to handle race conditions between threads
        return self.setdefault(fields, cls)

_AUTO_NTUPLE_TYPE_CACHE = _AutoNamedTupleTypeCache()

def auto_ntuple(**items):
    cls = _AUTO_NTUPLE_TYPE_CACHE[tuple(items)]
    return cls(*items.values())

But given such a cache, you get implicitly defined types that are automatically shared between instances that want to use the same field names:

>>> p1 = auto_ntuple(x=1, y=2)
>>> p2 = auto_ntuple(x=4, y=5)
>>> type(p1) is type(p2)
True
>>>
>>> import pickle
>>> p3 = pickle.loads(pickle.dumps(p1))
>>> p1 == p3
True
>>> type(p1) is type(p3)
True
>>>
>>> p1, p2, p3
(_ntuple_x_y(x=1, y=2), _ntuple_x_y(x=4, y=5), _ntuple_x_y(x=1, y=2))
>>> type(p1)

And writing the pickle out to a file and reloading it also works without needing to explicitly predefine that particular named tuple variant:

>>> with open("auto_ntuple.pkl", "rb") as f:
...     p1 = pickle.load(f)
...
>>> p1
_ntuple_x_y(x=1, y=2)

In effect, implicitly named tuples would be like key-sharing dictionaries, but sharing at the level of full type objects rather than key sets.

Cheers,
Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From apalala at gmail.com Tue Jul 25 02:42:30 2017 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Tue, 25 Jul 2017 02:42:30 -0400 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> Message-ID: > So the only thing being ruled out is the dedicated syntax option, > since it doesn't let us do anything that a new builtin can't do, it's > harder to find help on (as compared to "help(ntuple)" or searching > online for "python ntuple"), and it can't be readily backported to > Python 3.6 as part of a third party library (you can't easily backport > it any further than that regardless, since you'd be missing the > order-preservation guarantee for the keyword arguments passed to the > builtin).
> If an important revamp of namedtuple will happen (actually, "easy and friendly immutable structures"), I'd suggest that the new syntax is not discarded upfront, but rather left as a final decision, after all the other forces are resolved. FWIW, there's another development thread about "easy class declarations (with typing)". From my POV, the threads are different enough to remain separate. Cheers! -- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From turnbull.stephen.fw at u.tsukuba.ac.jp Tue Jul 25 05:05:11 2017 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Tue, 25 Jul 2017 18:05:11 +0900 Subject: [Python-ideas] namedtuple redesign goals In-Reply-To: <45220c91-b6e7-364f-6395-b41a833ef0ae@gmail.com> References: <5975D3C9.5060104@stoneleaf.us> <5975DDF6.3010304@stoneleaf.us> <45220c91-b6e7-364f-6395-b41a833ef0ae@gmail.com> Message-ID: <22903.2503.99960.882973@turnbull.sk.tsukuba.ac.jp> Michel Desmoulin writes: > You are assuming a namedtuple literal would mean > collections.namedtuple would lose the type ability. That's not the > case. The literals can be a complement, not a replacement. Unlikely to fly in Python. We really don't like things that have "obvious semantics" based on appearance that don't have those semantics. Something like "(x=0, y=1)" is so obviously a literal creating a collections.namedtuple, the object it creates really needs to *be* a collections.namedtuple. YMMV, but I suspect most Python developers will agree with me to some extent, and most of those, pretty strongly. Steve From turnbull.stephen.fw at u.tsukuba.ac.jp Tue Jul 25 05:08:18 2017 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J.
Turnbull) Date: Tue, 25 Jul 2017 18:08:18 +0900 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <32afe663-ca0f-153d-72df-332f9c8b0485@mrabarnett.plus.com> <22901.21361.654220.556617@turnbull.sk.tsukuba.ac.jp> Message-ID: <22903.2690.232850.422549@turnbull.sk.tsukuba.ac.jp> C Anthony Risinger writes: > At the end of the day, I don't see a way to have both a literal and > something that is externally "named", because the only ways to pass the > name I can imagine would make it look like a value within the container > itself (such as using a literal string for the first item), unless even > more new syntax was added. OK, so I took your "a tuple is a tuple is a tuple" incorrectly. What you want (as I understand it now) is not what

def ntuple0(attr_list):
    return namedtuple("_", attr_list)

gives you, but something like what

def ntuple1(attr_list):
    return namedtuple("ImplicitNamedtuple_" + "_".join(attr_list), attr_list)

does. Then this would truly be a "duck-typed namedtuple" as Chris Barker proposed in response to Steven d'Aprano elsewhere in this thread. See also Nick's full, namedtuple-compatible, implementation. Of course we still have the horrible "list of strings naming attributes" argument, so you still want a literal if possible, but with a **key argument, a new builtin would do the trick for me. YMMV.
-- Associate Professor Division of Policy and Planning Science http://turnbull/sk.tsukuba.ac.jp/ Faculty of Systems and Information Email: turnbull at sk.tsukuba.ac.jp University of Tsukuba Tel: 029-853-5175 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN From apalala at gmail.com Tue Jul 25 07:19:26 2017 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Tue, 25 Jul 2017 07:19:26 -0400 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <22903.2690.232850.422549@turnbull.sk.tsukuba.ac.jp> References: <32afe663-ca0f-153d-72df-332f9c8b0485@mrabarnett.plus.com> <22901.21361.654220.556617@turnbull.sk.tsukuba.ac.jp> <22903.2690.232850.422549@turnbull.sk.tsukuba.ac.jp> Message-ID: Steven, (short of time here) With **kwargs and a little more work, the function would check if the type is already defined, and return the ntuple with the correct type, not the type. Your sketch of a solution convinced me it can be done with a library function; no additional syntax needed. Cheers, On Tue, Jul 25, 2017 at 5:08 AM, Stephen J. Turnbull < turnbull.stephen.fw at u.tsukuba.ac.jp> wrote: > C Anthony Risinger writes: > > > At the end of the day, I don't see a way to have both a literal and > > something that is externally "named", because the only ways to pass the > > name I can imagine would make it look like a value within the container > > itself (such as using a literal string for the first item), unless even > > more new syntax was added. > > OK, so I took your "a tuple is a tuple is a tuple" incorrectly. What > you want (as I understand it now) is not what > > def ntuple0(attr_list): > return namedtuple("_", attr_list) > > gives you, but something like what > > def ntuple1(attr_list): > return namedtuple("ImplicitNamedtuple_" + "_".join(attr_list), > attr_list) > > does. Then this would truly be a "duck-typed namedtuple" as Chris > Barker proposed in response to Steven d'Aprano elsewhere in this > thread.
> See also Nick's full, namedtuple-compatible, implementation. > Of course we still have the horrible "list of strings naming > attributes" argument, so you still want a literal if possible, but > with a **key argument, a new builtin would do the trick for me. YMMV. > > > -- > Associate Professor Division of Policy and Planning Science > http://turnbull/sk.tsukuba.ac.jp/ Faculty of Systems and Information > Email: turnbull at sk.tsukuba.ac.jp University of Tsukuba > Tel: 029-853-5175 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From prometheus235 at gmail.com Tue Jul 25 13:02:58 2017 From: prometheus235 at gmail.com (Nick Timkovich) Date: Tue, 25 Jul 2017 12:02:58 -0500 Subject: [Python-ideas] Idea : for smarter assignment? In-Reply-To: References: Message-ID: On Fri, Jul 21, 2017 at 12:59 PM, David Mertz wrote: > > But you've left out quite a few binding operations. I might forget some, > but here are several: > Ned Batchelder had a good presentation at PyCon 2015 about names/values/assignments/binding: https://youtu.be/_AEJHKGk9ns?t=12m52s His summary of all assignment operators:

X = ...
for X in ...
class X: pass
def X: pass
def fn(X): # when called, X is bound
import X
from ... import X
except ... as X:
with ... as X:

...I think only includes one other assignment type from what you listed (function parameters) that ironically is where one could maybe blur =/:, as doing f(x=3) and f(**{x: 3}) are usually similar (I think some C functions react poorly?). -------------- next part -------------- An HTML attachment was scrubbed...
URL: From rosuav at gmail.com Tue Jul 25 13:34:43 2017 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 26 Jul 2017 03:34:43 +1000 Subject: [Python-ideas] Idea : for smarter assignment? In-Reply-To: References: Message-ID: On Wed, Jul 26, 2017 at 3:02 AM, Nick Timkovich wrote: > ...I think only includes one other assignment type from what you listed > (function parameters) that ironically is where one could maybe blur =/:, as > doing f(x=3) and f(**{x: 3}) are usually similar (I think some C functions > react poorly?). The only difference with C functions is that you can have named positional-only parameters, which you can't do in a pure-Python function. The nearest equivalent is to use *args and then peel the arguments off that manually, but then they don't have names at all. ChrisA From python at mrabarnett.plus.com Tue Jul 25 13:49:30 2017 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 25 Jul 2017 18:49:30 +0100 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> Message-ID: <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> On 2017-07-25 02:57, Nick Coghlan wrote: > On 25 July 2017 at 02:46, Michel Desmoulin wrote: >> Le 24/07/2017 à 16:12, Nick Coghlan a écrit : >>> On 22 July 2017 at 01:18, Guido van Rossum wrote: >>>> Honestly I would like to declare the bare (x=1, y=0) proposal dead. Let's >>>> encourage the use of objects rather than tuples (named or otherwise) for >>>> most data exchanges. I know of a large codebase that uses dicts instead of >>>> objects, and it's a mess. I expect the bare ntuple to encourage the same >>>> chaos. >> >> This is the people working on big code base talking.
> > Dedicated syntax: > > (x=1, y=0) > > New builtin: > > ntuple(x=1, y=0) > > So the only thing being ruled out is the dedicated syntax option, > since it doesn't let us do anything that a new builtin can't do, it's > harder to find help on (as compared to "help(ntuple)" or searching > online for "python ntuple"), and it can't be readily backported to > Python 3.6 as part of a third party library (you can't easily backport > it any further than that regardless, since you'd be missing the > order-preservation guarantee for the keyword arguments passed to the > builtin). > [snip] I think it's a little like function arguments. Arguments can be all positional, but you have to decide in what order they are listed. Named arguments are clearer than positional arguments when calling functions. So an ntuple would be like a tuple, but with names (attributes) instead of positions. I don't see how they could be compatible with tuples because the positions aren't fixed. You would need a NamedTuple where the type specifies the order. I think... From python at mrabarnett.plus.com Tue Jul 25 13:55:52 2017 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 25 Jul 2017 18:55:52 +0100 Subject: [Python-ideas] Idea : for smarter assignment? In-Reply-To: References: Message-ID: <396d19fe-19dd-3030-06cf-acdc5ae891a5@mrabarnett.plus.com> On 2017-07-25 18:02, Nick Timkovich wrote: > On Fri, Jul 21, 2017 at 12:59 PM, David Mertz > wrote: > > But you've left out quite a few binding operations. I might forget > some, but here are several: > > > Ned Batchelder had a good presentation at PyCon 2015 about > names/values/assignments/binding: https://youtu.be/_AEJHKGk9ns?t=12m52s > His summary of all assignment operators: > > X = ... > for X in ... > class X: pass > def X: pass > def fn(X): # when called, X is bound > import X > from ... import X > except ... as X: > with ... as X: > There's also: import ... as X from ... import ... 
as X > ...I think only includes one other assignment type from what you listed > (function parameters) that ironically is where one could maybe blur =/:, > as doing f(x=3) and f(**{x: 3}) are usually similar (I think some C > functions react poorly?). > From pavol.lisy at gmail.com Tue Jul 25 14:23:13 2017 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Tue, 25 Jul 2017 20:23:13 +0200 Subject: [Python-ideas] Idea : for smarter assignment? In-Reply-To: <396d19fe-19dd-3030-06cf-acdc5ae891a5@mrabarnett.plus.com> References: <396d19fe-19dd-3030-06cf-acdc5ae891a5@mrabarnett.plus.com> Message-ID: On 7/25/17, MRAB wrote: > On 2017-07-25 18:02, Nick Timkovich wrote: >> On Fri, Jul 21, 2017 at 12:59 PM, David Mertz >> > wrote: >> >> But you've left out quite a few binding operations. I might forget >> some, but here are several: >> >> >> Ned Batchelder had a good presentation at PyCon 2015 about >> names/values/assignments/binding: https://youtu.be/_AEJHKGk9ns?t=12m52s >> His summary of all assignment operators: >> >> X = ... >> for X in ... >> class X: pass >> def X: pass >> def fn(X): # when called, X is bound >> import X >> from ... import X >> except ... as X: >> with ... as X: >> > There's also: > > import ... as X > from ... import ... as X globals()['X'] = ... From g.rodola at gmail.com Tue Jul 25 14:48:01 2017 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Tue, 25 Jul 2017 20:48:01 +0200 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> Message-ID: On Tue, Jul 25, 2017 at 7:49 PM, MRAB wrote: > On 2017-07-25 02:57, Nick Coghlan wrote: >> On 25 July 2017 at 02:46, Michel Desmoulin >> wrote: >>> Le 24/07/2017 à
16:12, Nick Coghlan a ?crit : >>> >>>> On 22 July 2017 at 01:18, Guido van Rossum wrote: >>>> >>>>> Honestly I would like to declare the bare (x=1, y=0) proposal dead. >>>>> Let's >>>>> encourage the use of objects rather than tuples (named or otherwise) >>>>> for >>>>> most data exchanges. I know of a large codebase that uses dicts >>>>> instead of >>>>> objects, and it's a mess. I expect the bare ntuple to encourage the >>>>> same >>>>> chaos. >>>>> >>>> >>> This is the people working on big code base talking. >>> >> >> Dedicated syntax: >> >> (x=1, y=0) >> >> New builtin: >> >> ntuple(x=1, y=0) >> >> So the only thing being ruled out is the dedicated syntax option, >> since it doesn't let us do anything that a new builtin can't do, it's >> harder to find help on (as compared to "help(ntuple)" or searching >> online for "python ntuple"), and it can't be readily backported to >> Python 3.6 as part of a third party library (you can't easily backport >> it any further than that regardless, since you'd be missing the >> order-preservation guarantee for the keyword arguments passed to the >> builtin). >> >> [snip] > > I think it's a little like function arguments. > > Arguments can be all positional, but you have to decide in what order they > are listed. Named arguments are clearer than positional arguments when > calling functions. > > So an ntuple would be like a tuple, but with names (attributes) instead of > positions. > > I don't see how they could be compatible with tuples because the positions > aren't fixed. You would need a NamedTuple where the type specifies the > order. > > I think... > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > Most likely ntuple() will require keyword args only, whereas for collections.namedtuple they are mandatory only during declaration. 
The order is the same as kwargs, so: >>> nt = ntuple(x=1, y=2) >>> nt[0] 1 >>> nt[1] 2 What's less clear is how isinstance() should behave. Perhaps: >>> t = (1, 2) >>> nt = ntuple(x=1, y=2) >>> isinstance(nt, tuple) True >>> isinstance(t, ntuple) False -- Giampaolo - http://grodola.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Tue Jul 25 15:30:14 2017 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 25 Jul 2017 20:30:14 +0100 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> Message-ID: On 2017-07-25 19:48, Giampaolo Rodola' wrote: > > On Tue, Jul 25, 2017 at 7:49 PM, MRAB > wrote: > > On 2017-07-25 02:57, Nick Coghlan wrote: > > On 25 July 2017 at 02:46, Michel Desmoulin > > > wrote: > > Le 24/07/2017 ? 16:12, Nick Coghlan a ?crit : > > On 22 July 2017 at 01:18, Guido van Rossum > > wrote: > > Honestly I would like to declare the bare (x=1, > y=0) proposal dead. Let's > encourage the use of objects rather than tuples > (named or otherwise) for > most data exchanges. I know of a large codebase > that uses dicts instead of > objects, and it's a mess. I expect the bare ntuple > to encourage the same > chaos. > > > This is the people working on big code base talking. 
> > > Dedicated syntax: > > (x=1, y=0) > > New builtin: > > ntuple(x=1, y=0) > > So the only thing being ruled out is the dedicated syntax option, > since it doesn't let us do anything that a new builtin can't > do, it's > harder to find help on (as compared to "help(ntuple)" or searching > online for "python ntuple"), and it can't be readily backported to > Python 3.6 as part of a third party library (you can't easily > backport > it any further than that regardless, since you'd be missing the > order-preservation guarantee for the keyword arguments passed > to the > builtin). > > [snip] > > I think it's a little like function arguments. > > Arguments can be all positional, but you have to decide in what > order they are listed. Named arguments are clearer than positional > arguments when calling functions. > > So an ntuple would be like a tuple, but with names (attributes) > instead of positions. > > I don't see how they could be compatible with tuples because the > positions aren't fixed. You would need a NamedTuple where the type > specifies the order. > > I think... > > > Most likely ntuple() will require keyword args only, whereas for > collections.namedtuple they are mandatory only during declaration. The > order is the same as kwargs, so: > > >>> nt = ntuple(x=1, y=2) > >>> nt[0] > 1 > >>> nt[1] > 2 > > What's less clear is how isinstance() should behave. Perhaps: > > >>> t = (1, 2) > >>> nt = ntuple(x=1, y=2) > >>> isinstance(nt, tuple) > True > >>> isinstance(t, ntuple) > False Given: >>> nt = ntuple(x=1, y=2) you have nt[0] == 1 because that's the order of the args. But what about: >>> nt2 = ntuple(y=2, x=1) ? Does that mean that nt[0] == 2? Presumably, yes. Does nt == nt2? If it's False, then you've lost some of the advantage of using names instead of positions. It's a little like saying that functions can be called with keyword arguments, but the order of those arguments still matters! 
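[Editorial aside: MRAB's question can be checked against today's collections.namedtuple, whose equality already compares positions rather than names. In the sketch below, two explicitly declared types stand in for the types an implicit ntuple(...) call would create; the names PointXY/PointYX are illustrative only.]

```python
from collections import namedtuple

# Two tuple types with the same field names in opposite order, standing in
# for the types an implicit ntuple(...) call would create.
PointXY = namedtuple("PointXY", ["x", "y"])
PointYX = namedtuple("PointYX", ["y", "x"])

nt = PointXY(x=1, y=2)   # positional layout (1, 2)
nt2 = PointYX(y=2, x=1)  # same name-to-value bindings, layout (2, 1)

print(nt == nt2)      # False: tuple equality compares positions, not names
print(nt == (1, 2))   # True: a named tuple is still a plain tuple
print(nt[0], nt2[0])  # 1 2
```

So under a "field order follows keyword order" rule, nt == nt2 would indeed be False, which is exactly the trade-off MRAB is pointing at.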
From g.rodola at gmail.com Tue Jul 25 16:48:14 2017 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Tue, 25 Jul 2017 22:48:14 +0200 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> Message-ID: On Tue, Jul 25, 2017 at 9:30 PM, MRAB wrote: > On 2017-07-25 19:48, Giampaolo Rodola' wrote: > >> >> On Tue, Jul 25, 2017 at 7:49 PM, MRAB > > wrote: >> >> On 2017-07-25 02:57, Nick Coghlan wrote: >> >> On 25 July 2017 at 02:46, Michel Desmoulin >> > >> wrote: >> >> Le 24/07/2017 ? 16:12, Nick Coghlan a ?crit : >> >> On 22 July 2017 at 01:18, Guido van Rossum >> > wrote: >> >> Honestly I would like to declare the bare (x=1, >> y=0) proposal dead. Let's >> encourage the use of objects rather than tuples >> (named or otherwise) for >> most data exchanges. I know of a large codebase >> that uses dicts instead of >> objects, and it's a mess. I expect the bare ntuple >> to encourage the same >> chaos. >> >> >> This is the people working on big code base talking. >> >> >> Dedicated syntax: >> >> (x=1, y=0) >> >> New builtin: >> >> ntuple(x=1, y=0) >> >> So the only thing being ruled out is the dedicated syntax option, >> since it doesn't let us do anything that a new builtin can't >> do, it's >> harder to find help on (as compared to "help(ntuple)" or searching >> online for "python ntuple"), and it can't be readily backported to >> Python 3.6 as part of a third party library (you can't easily >> backport >> it any further than that regardless, since you'd be missing the >> order-preservation guarantee for the keyword arguments passed >> to the >> builtin). >> >> [snip] >> >> I think it's a little like function arguments. >> >> Arguments can be all positional, but you have to decide in what >> order they are listed. Named arguments are clearer than positional >> arguments when calling functions. 
>> >> So an ntuple would be like a tuple, but with names (attributes) >> instead of positions. >> >> I don't see how they could be compatible with tuples because the >> positions aren't fixed. You would need a NamedTuple where the type >> specifies the order. >> >> I think... >> >> >> Most likely ntuple() will require keyword args only, whereas for >> collections.namedtuple they are mandatory only during declaration. The >> order is the same as kwargs, so: >> >> >>> nt = ntuple(x=1, y=2) >> >>> nt[0] >> 1 >> >>> nt[1] >> 2 >> >> What's less clear is how isinstance() should behave. Perhaps: >> >> >>> t = (1, 2) >> >>> nt = ntuple(x=1, y=2) >> >>> isinstance(nt, tuple) >> True >> >>> isinstance(t, ntuple) >> False >> > > Given: > > >>> nt = ntuple(x=1, y=2) > > you have nt[0] == 1 because that's the order of the args. > > But what about: > > >>> nt2 = ntuple(y=2, x=1) > > ? Does that mean that nt[0] == 2? Presumably, yes. > Does nt == nt2? > > If it's False, then you've lost some of the advantage of using names > instead of positions. > > It's a little like saying that functions can be called with keyword > arguments, but the order of those arguments still matters! Mmmm excellent point. I would expect "nt == nt2" to be True because collections.namedtuple() final instance works like that (compares pure values), because at the end of the day it's a tuple subclass and so should be ntuple() (meaning I expect "isinstance(ntuple(x=1, y=2), tuple)" to be True). On the other hand it's also legitimate to expect "nt == nt2" to be False because field names are different. That would be made clear in the doc, but the fact that people will have to look it up means it's not obvious. -- Giampaolo - http://grodola.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From g.rodola at gmail.com Tue Jul 25 17:00:00 2017 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Tue, 25 Jul 2017 23:00:00 +0200 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: Message-ID: On Thu, Jul 20, 2017 at 3:35 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > On Wed, Jul 19, 2017 at 9:08 PM, Guido van Rossum > wrote: > > The proposal in your email seems incomplete > > The proposal does not say anything about type((x=1, y=2)). I assume > it will be the same as the type currently returned by namedtuple(?, 'x > y'), but will these types be cached? I suppose that the type should be immutable at least as long as field names are the same, and the cache will occur on creation, in order to retain the 0 memory footprint. Will type((x=1, y=2)) is type((x=3, y=4)) be True?. Yes. > Maybe type((x=1, y=2))(values) will work? > It's supposed to behave like a tuple or any other primitive type (list, set, etc.), so yes. > Regarding that spec, I think there's something missing: given a list (or > tuple!) of values, how do you turn it into an 'ntuple'? As already suggested, it probably makes sense to just reuse the dict syntax: >>> dict([('a', 1), ('b', 2)]) {'a': 1, 'b': 2} >>> ntuple([('a', 1), ('b', 2)]) (a=1, b=2) -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Tue Jul 25 19:58:44 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 26 Jul 2017 11:58:44 +1200 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> Message-ID: <5977DB34.8020305@canterbury.ac.nz> Nick Coghlan wrote: > New builtin: > > ntuple(x=1, y=0) Do we really want this to be a tuple, with ordered fields? If so, what determines the order? 
If it's the order of the keyword arguments, this means that ntuple(x=1, y=0) and ntuple(y=0, x=1) would give objects with different behaviour. This goes against the usual expectation that keyword arguments of a constructor can be written in any order. That's one of the main benefits of using keyword arguments, that you don't have to remember a specific order for them. If we're going to have such a type, I suggest making it a pure named-fields object without any tuple aspects. In which case "ntuple" wouldn't be the right name for it, and something like "record" or "struct" would be better. Also, building a whole type object for each combination of fields seems like overkill to me. Why not have just one type of object with an attribute referencing a name-to-slot mapping? -- Greg From steve at pearwood.info Tue Jul 25 21:05:12 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 26 Jul 2017 11:05:12 +1000 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> Message-ID: <20170726010511.GT3149@ando.pearwood.info> On Tue, Jul 25, 2017 at 08:30:14PM +0100, MRAB wrote: > Given: > > >>> nt = ntuple(x=1, y=2) > > you have nt[0] == 1 because that's the order of the args. > > But what about: > > >>> nt2 = ntuple(y=2, x=1) > > ? Does that mean that nt[0] == 2? Presumably, yes. It better be. > Does nt == nt2? > > If it's False, then you've lost some of the advantage of using names > instead of positions. Not at all. It's a *tuple*, so the fields have a definite order. If you don't want a tuple, why are you using a tuple?
Use SimpleNamespace for an unordered "bag of attributes": py> from types import SimpleNamespace py> x = SimpleNamespace(spam=4, eggs=3) py> y = SimpleNamespace(eggs=3, spam=4) py> x == y True > It's a little like saying that functions can be called with keyword > arguments, but the order of those arguments still matters! That's the wrong analogy and it won't work. But people will expect that it will, and be surprised when it doesn't! The real problem here is that we're combining two distinct steps into one. The *first* step should be to define the order of the fields in the record (a tuple): [x, y] is not the same as [y, x]. Once the field order is defined, then you can *instantiate* those fields either positionally, or by name in any order. But by getting rid of that first step, we no longer have the option to specify the order of the fields. We can only infer them from the order they are given when you instantiate the fields. Technically, Nick's scheme to implicitly cache the type could work around this at the cost of making it impossible to have two types with the same field names in different orders. Given: ntuple(y=1, x=2) ntuple could look up the *unordered set* {y, x} in the cache, and if found, use that type. If not found, define a new type with the fields in the stated order [y, x]. So now you can, or at least you will *think* that you can, safely write this: spam = ntuple(x=2, y=1, z=0) # defines the field order [x, y, z] eggs = ntuple(z=0, y=1, x=2) # instantiate using kwargs in any order assert spam == eggs But this has a hidden landmine. If *any* module happens to use ntuple with the same field names as you, but in a different order, you will have mysterious bugs: x, y, z = spam You expect x=2, y=1, z=0 because that's the way you defined the field order, but unknown to you some other module got in first and defined it as [z, y, x] and so your code will silently do the wrong thing. Even if the cache is per-module, the same problem will apply.
If the spam and eggs assignments above are in different functions, the field order will depend on which function happens to be called first, which may not be easily predictable. I don't see any way that this proposal can be anything but a subtle source of bugs. We have two *incompatible* requirements: - we want to define the order of the fields according to the order we give keyword arguments; - we want to give keyword arguments in any order without caring about the field order. We can't have both, and we can't give up either without being a surprising source of annoyance and bugs. As far as I am concerned, this kills the proposal for me. If you care about field order, then use namedtuple and explicitly define a class with the field order you want. If you don't care about field order, use SimpleNamespace. -- Steve From steve at pearwood.info Tue Jul 25 21:15:28 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 26 Jul 2017 11:15:28 +1000 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <5977DB34.8020305@canterbury.ac.nz> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <5977DB34.8020305@canterbury.ac.nz> Message-ID: <20170726011528.GU3149@ando.pearwood.info> On Wed, Jul 26, 2017 at 11:58:44AM +1200, Greg Ewing wrote: > If we're going to have such a type, I suggest making it a > pure named-fields object without any tuple aspects. In which > case "ntuple" wouldn't be the right name for it, and something > like "record" or "struct" would be better. Guido's time machine strikes again. from types import SimpleNamespace By the way: records and structs define their fields in a particular order too. namedtuple does quite well at modelling records and structs in other languages. > Also, building a whole type object for each combination of > fields seems like overkill to me. Why not have just one type > of object with an attribute referencing a name-to-slot > mapping?
You mean one globally shared mapping for all ntuples? So given: spam = ntuple(name="fred", age=99) eggs = ntuple(model=2, colour="green") we would have spam.colour == 99, and eggs.name == 2. Personally, I think this whole proposal for implicitly deriving type information from the way we instantiate a tuple is a bad idea. I don't see this becoming anything other than a frustrating and annoying source of subtle, hard to diagnose bugs. -- Steve From ncoghlan at gmail.com Wed Jul 26 11:48:52 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 27 Jul 2017 01:48:52 +1000 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> Message-ID: On 26 July 2017 at 03:49, MRAB wrote: > I don't see how they could be compatible with tuples because the positions > aren't fixed. You would need a NamedTuple where the type specifies the > order. Python 3.6+ guarantees that keyword-argument order is preserved in the namespace passed to the called function. This means that a function that only accepts **kwargs can reliably check the order of arguments used in the call, and hence tell the difference between: ntuple(x=1, y=2) ntuple(y=1, x=2) So because these are implicitly typed, you *can't* put the arguments in an arbitrary order - you have to put them in the desired field order, or you're going to accidentally define a different type. If that possibility bothers someone and they want to avoid it, then the solution is straightforward: predefine an explicit type, and use that instead of an implicitly defined one. Cheers, Nick. 
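[Editorial aside: Nick's point that a **kwargs-only callable observes the call-site keyword order on Python 3.6+ (PEP 468) can be sketched with a toy factory built on collections.namedtuple. This is purely illustrative -- no type caching, and "ntuple" is just the thread's working name, not an existing builtin.]

```python
from collections import namedtuple

def ntuple(**fields):
    # PEP 468 (Python 3.6+): **fields preserves the caller's keyword order,
    # so the field order of the implied type is simply the argument order.
    return namedtuple("ntuple", list(fields))(**fields)

a = ntuple(x=1, y=2)  # fields (x, y), values (1, 2)
b = ntuple(y=1, x=2)  # fields (y, x), values (1, 2)

print(a == b)    # True: both are the tuple (1, 2)...
print(a.x, b.x)  # 1 2  ...but each slot now means something different
```

This reproduces Nick's exact example: the two calls produce equal tuples yet differently-shaped implied types.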
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Jul 26 12:05:47 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 27 Jul 2017 02:05:47 +1000 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <20170726010511.GT3149@ando.pearwood.info> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> Message-ID: On 26 July 2017 at 11:05, Steven D'Aprano wrote: > I don't see any way that this proposal can be anything but a subtle > source of bugs. We have two *incompatible* requirements: > > - we want to define the order of the fields according to the > order we give keyword arguments; > > - we want to give keyword arguments in any order without > caring about the field order. > > We can't have both, and we can't give up either without being a > surprising source of annoyance and bugs. I think the second stated requirement isn't a genuine requirement, as that *isn't* a general expectation. After all, one of the reasons we got ordered-by-default keyword arguments is because people were confused by the fact that you couldn't reliably do: mydict = collections.OrderedDict(x=1, y=2) Now, though, that's fully supported and does exactly what you'd expect: >>> from collections import OrderedDict >>> OrderedDict(x=1, y=2) OrderedDict([('x', 1), ('y', 2)]) >>> OrderedDict(y=2, x=1) OrderedDict([('y', 2), ('x', 1)]) In this case, the "order matters" expectation is informed by the nature of the constructor being called: it's an *ordered* dict, so the constructor argument order matters. The same applies to the ntuple concept, except there it's the fact that it's a *tuple* that conveys the "order matters" expectation.
ntuple(x=1, y=2) == ntuple(y=1, x=2) == tuple(1, 2) ntuple(x=2, y=1) == ntuple(y=2, x=1) == tuple(2, 1) Putting the y-coordinate first would be *weird* though, and I don't think it's an accident that we mainly discuss tuples with strong order conventions in the context of implicit typing: they're the ones where it feels most annoying to have to separately define that order rather than being able to just use it directly. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Wed Jul 26 13:10:16 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 27 Jul 2017 03:10:16 +1000 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> Message-ID: <20170726171016.GW3149@ando.pearwood.info> On Thu, Jul 27, 2017 at 02:05:47AM +1000, Nick Coghlan wrote: > On 26 July 2017 at 11:05, Steven D'Aprano wrote: > > I don't see any way that this proposal can be anything by a subtle > > source of bugs. We have two *incompatible* requirements: > > > > - we want to define the order of the fields according to the > > order we give keyword arguments; > > > > - we want to give keyword arguments in any order without > > caring about the field order. > > > > We can't have both, and we can't give up either without being a > > surprising source of annoyance and bugs. > > I think the second stated requirement isn't a genuine requirement, as > that *isn't* a general expectation. > > After all, one of the reasons we got ordered-by-default keyword > arguments is because people were confused by the fact that you > couldn't reliably do: > > mydict = collections.OrderedDict(x=1, y=2) Indeed. But the reason we got *keyword arguments* in the first place was so you didn't need to care about the order of parameters. 
As is often the case, toy examples with arguments x and y don't really demonstrate the problem in real code. We need more realistic, non-trivial examples. Most folks can remember the first two arguments to open: open(name, 'w') but for anything more complex, we not only want to skip arguments and rely on their defaults, but we don't necessarily remember the order of definition: open(name, 'w', newline='\r', encoding='macroman', errors='replace') Without checking the documentation, how many people could tell you whether that order matches the positional order? I know I couldn't. You say > ntuple(x=1, y=2) == ntuple(y=1, x=2) == tuple(1, 2) > ntuple(x=2, y=1) == ntuple(y=2, x=1) == tuple(2, 1) > > Putting the y-coordinate first would be *weird* though Certainly, if you're used to the usual mathematics convention that the horizontal coordinate x comes first. But if you are used to the curses convention that the vertical coordinate y comes first, putting y first is completely natural. And how about ... ? ntuple(flavour='strange', spin='1/2', mass=95.0, charge='-1/3', isospin='-1/2', hypercharge='1/3') versus: ntuple(flavour='strange', mass=95.0, spin='1/2', charge='-1/3', hypercharge='1/3', isospin='-1/2') Which one is "weird"? This discussion has been taking place for many days, and it is only now (thanks to MRAB) that we've noticed this problem. I think it is dangerous to assume that the average Python coder will either: - always consistently specify the fields in the same order; - or recognise ahead of time (during the design phase of the program) that they should pre-declare a class with the fields in a particular order. Some people will, of course. But many won't. Instead, they'll happily start instantiating ntuples with keyword arguments in inconsistent order, and if they are lucky they'll get unexpected, tricky to debug exceptions. If they're unlucky, their program will silently do the wrong thing, and nobody will notice that their results are garbage. 
SimpleNamespace doesn't have this problem: the fields in SimpleNamespace aren't ordered, and cannot be packed or unpacked by position. namedtuple doesn't have this problem: you have to predeclare the fields in a certain order, after which you can instantiate them by keyword in any order, and unpacking the tuple will always honour that order. > Now, though, that's fully supported and does exactly what you'd expect: > > >>> from collections import OrderedDict > >>> OrderedDict(x=1, y=2) > OrderedDict([('x', 1), ('y', 2)]) > >>> OrderedDict(y=2, x=1) > OrderedDict([('y', 2), ('x', 1)]) > > In this case, the "order matters" expectation is informed by the > nature of the constructor being called: it's an *ordered* dict, so the > constructor argument order matters. I don't think that's a great analogy. There's no real equivalent of packing/unpacking OrderedDicts by position to trip us up here. It is better to think of OrderedDicts as "order-preserving dicts" rather than "dicts where the order matters". Yes, it does matter, in a weak sense. But not in the important sense of binding values to keys: py> from collections import OrderedDict py> a = OrderedDict([('spam', 1), ('eggs', 2)]) py> b = OrderedDict([('eggs', -1), ('spam', 99)]) py> a.update(b) py> a OrderedDict([('spam', 99), ('eggs', -1)]) update() has correctly bound 99 to key 'spam', even though the keys are in the wrong order. The same applies to dict unpacking: a.update(**b) In contrast, named tuples aren't just order-preserving. The field order is part of their definition, and tuple unpacking honours the field order, not the field names. While we can't update tuples in place, we can and often do unpack them into variables. When we do, we need to know the field order: flavour, charge, mass, spin, isospin, hypercharge = mytuple but the risk is that the field order may not be what we expect unless we are scrupulously careful to *always* call ntuple(...) with the arguments in the same order. 
-- Steve From ethan at stoneleaf.us Wed Jul 26 13:44:01 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 26 Jul 2017 10:44:01 -0700 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> Message-ID: <5978D4E1.8010601@stoneleaf.us> On 07/26/2017 09:05 AM, Nick Coghlan wrote: > On 26 July 2017 at 11:05, Steven D'Aprano wrote: >> I don't see any way that this proposal can be anything by a subtle >> source of bugs. We have two *incompatible* requirements: >> >> - we want to define the order of the fields according to the >> order we give keyword arguments; >> >> - we want to give keyword arguments in any order without >> caring about the field order. >> >> We can't have both, and we can't give up either without being a >> surprising source of annoyance and bugs. > > I think the second stated requirement isn't a genuine requirement, as > that *isn't* a general expectation. I have to agree with D'Aprano on this one. I certainly do not *expect* keyword argument position to matter, and it seems to me the primary reason to make it matter was not for dicts, but because a class name space is implemented by dicts. Tuples, named or otherwise, are positional first -- order matters. Specifying point = ntuple(y=2, x=-3) and having point[0] == 3 is going to be bizarre. This will be a source for horrible bugs. 
-- ~Ethan~ From abrault at mapgears.com Wed Jul 26 13:47:31 2017 From: abrault at mapgears.com (Alexandre Brault) Date: Wed, 26 Jul 2017 13:47:31 -0400 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <20170726171016.GW3149@ando.pearwood.info> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <20170726171016.GW3149@ando.pearwood.info> Message-ID: <34cfcc19-22ca-554a-5f6a-a647d02ff4b1@mapgears.com> On 2017-07-26 01:10 PM, Steven D'Aprano wrote: > On Thu, Jul 27, 2017 at 02:05:47AM +1000, Nick Coghlan wrote: >> On 26 July 2017 at 11:05, Steven D'Aprano wrote: >>> I don't see any way that this proposal can be anything by a subtle >>> source of bugs. We have two *incompatible* requirements: >>> >>> - we want to define the order of the fields according to the >>> order we give keyword arguments; >>> >>> - we want to give keyword arguments in any order without >>> caring about the field order. >>> >>> We can't have both, and we can't give up either without being a >>> surprising source of annoyance and bugs. >> I think the second stated requirement isn't a genuine requirement, as >> that *isn't* a general expectation. >> >> After all, one of the reasons we got ordered-by-default keyword >> arguments is because people were confused by the fact that you >> couldn't reliably do: >> >> mydict = collections.OrderedDict(x=1, y=2) > > Indeed. But the reason we got *keyword arguments* in the first place was > so you didn't need to care about the order of parameters. As is often > the case, toy examples with arguments x and y don't really demonstrate > the problem in real code. We need more realistic, non-trivial examples. 
> Most folks can remember the first two arguments to open: > > open(name, 'w') > > but for anything more complex, we not only want to skip arguments and > rely on their defaults, but we don't necessarily remember the order of > definition: > > open(name, 'w', newline='\r', encoding='macroman', errors='replace') > > Without checking the documentation, how many people could tell you > whether that order matches the positional order? I know I couldn't. > > > You say > >> ntuple(x=1, y=2) == ntuple(y=1, x=2) == tuple(1, 2) >> ntuple(x=2, y=1) == ntuple(y=2, x=1) == tuple(2, 1) >> >> Putting the y-coordinate first would be *weird* though > Certainly, if you're used to the usual mathematics convention that the > horizontal coordinate x comes first. But if you are used to the curses > convention that the vertical coordinate y comes first, putting y first > is completely natural. > > And how about ... ? > > ntuple(flavour='strange', spin='1/2', mass=95.0, charge='-1/3', > isospin='-1/2', hypercharge='1/3') > > versus: > > ntuple(flavour='strange', mass=95.0, spin='1/2', charge='-1/3', > hypercharge='1/3', isospin='-1/2') > > Which one is "weird"? > > This discussion has been taking place for many days, and it is only now > (thanks to MRAB) that we've noticed this problem. I think it is > dangerous to assume that the average Python coder will either: > > - always consistently specify the fields in the same order; > > - or recognise ahead of time (during the design phase of the program) > that they should pre-declare a class with the fields in a particular > order. > > > Some people will, of course. But many won't. Instead, they'll happily > start instantiating ntuples with keyword arguments in inconsistent > order, and if they are lucky they'll get unexpected, tricky to debug > exceptions. If they're unlucky, their program will silently do the wrong > thing, and nobody will notice that their results are garbage. 
> > SimpleNamespace doesn't have this problem: the fields in SimpleNamespace > aren't ordered, and cannot be packed or unpacked by position. > > namedtuple doesn't have this problem: you have to predeclare the fields > in a certain order, after which you can instantiate them by keyword in > any order, and unpacking the tuple will always honour that order. > > >> Now, though, that's fully supported and does exactly what you'd expect: >> >> >>> from collections import OrderedDict >> >>> OrderedDict(x=1, y=2) >> OrderedDict([('x', 1), ('y', 2)]) >> >>> OrderedDict(y=2, x=1) >> OrderedDict([('y', 2), ('x', 1)]) >> >> In this case, the "order matters" expectation is informed by the >> nature of the constructor being called: it's an *ordered* dict, so the >> constructor argument order matters. > I don't think that's a great analogy. There's no real equivalent of > packing/unpacking OrderedDicts by position to trip us up here. It is > better to think of OrderedDicts as "order-preserving dicts" rather than > "dicts where the order matters". Yes, it does matter, in a weak sense. > But not in the important sense of binding values to keys: > > py> from collections import OrderedDict > py> a = OrderedDict([('spam', 1), ('eggs', 2)]) > py> b = OrderedDict([('eggs', -1), ('spam', 99)]) > py> a.update(b) > py> a > OrderedDict([('spam', 99), ('eggs', -1)]) > > update() has correctly bound 99 to key 'spam', even though the keys are > in the wrong order. The same applies to dict unpacking: > > a.update(**b) > > In contrast, named tuples aren't just order-preserving. The field order > is part of their definition, and tuple unpacking honours the field > order, not the field names. While we can't update tuples in place, we > can and often do unpack them into variables. 
When we do, we need to know > the field order: > > > flavour, charge, mass, spin, isospin, hypercharge = mytuple > > but the risk is that the field order may not be what we expect unless we > are scrupulously careful to *always* call ntuple(...) with the arguments > in the same order. > > > The main use case for ntuple literals, imo, would be to replace functions like this: >>> def spam(...): ... [...] ... return eggs, ham With the more convenient for the caller >>> def spam(...): ... [...] ... return (eggs=eggs, ham=ham) Ntuple literals don't introduce a new field-ordering problem, because this problem already existed with the bare tuple literal it replaced. In the case where multiple functions need to create compatible ntuples, collections.namedtuple is still available to predefine the named tuple type. Or you can use a one-liner helper function like this: >>> def _gai(family, type, proto, canonname, sockaddr): ... return (family=family, type=type, proto=proto, canonname=canonname, sockaddr=sockaddr) >>> type(_gai(family=1, type=2, [...])) is type(_gai(type=2, family=1, [...])) Alex Brault From k7hoven at gmail.com Wed Jul 26 17:50:09 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Thu, 27 Jul 2017 00:50:09 +0300 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <20170726171016.GW3149@ando.pearwood.info> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <20170726171016.GW3149@ando.pearwood.info> Message-ID: On Wed, Jul 26, 2017 at 8:10 PM, Steven D'Aprano wrote: > On Thu, Jul 27, 2017 at 02:05:47AM +1000, Nick Coghlan wrote: > > >>> from collections import OrderedDict > > >>> OrderedDict(x=1, y=2) > > OrderedDict([('x', 1), ('y', 2)]) > > >>> OrderedDict(y=2, x=1) > > OrderedDict([('y', 2), ('x', 1)]) > > > > In this case, the "order matters" expectation is informed by the > > nature of the
constructor being called: it's an *ordered* dict, so the > > constructor argument order matters. > > I don't think that's a great analogy. There's no real equivalent of > packing/unpacking OrderedDicts by position to trip us up here. It is > better to think of OrderedDicts as "order-preserving dicts" rather than > "dicts where the order matters". Yes, it does matter, in a weak sense. Careful here, this is misleading. What you say applies to the normal dict since 3.6, which now *preserves* order. But in OrderedDict, order matters in quite a strong way: od1 = OrderedDict(a=1, b=2) od2 = OrderedDict(b=2, a=1) # (kwargs order obviously matters) od1 == od2 # gives False !! od1 == dict(a=1, b=2) # gives True od2 == dict(a=1, b=2) # gives True od1 == OrderedDict(a=1, b=2) # gives True I also think this is how OrderedDict *should* behave to earn its name. It's great that we now also have an order-*preserving* dict, because often you want that, but still dict(a=1, b=2) == dict(b=2, a=1). But not in the important sense of binding values to keys: > > py> from collections import OrderedDict > py> a = OrderedDict([('spam', 1), ('eggs', 2)]) > py> b = OrderedDict([('eggs', -1), ('spam', 99)]) > py> a.update(b) > py> a > OrderedDict([('spam', 99), ('eggs', -1)]) > > update() has correctly bound 99 to key 'spam', even though the keys are > in the wrong order. The same applies to dict unpacking: > > The reason for this is that the order is determined by the first binding, not by subsequent updates to already-existing keys. > a.update(**b) > > Unpacking by name is still ordered in a.update(**b), but the update order does not matter, because keys 'spam' and 'eggs' already exist. > In contrast, named tuples aren't just order-preserving. The field order > is part of their definition, and tuple unpacking honours the field > order, not the field names. While we can't update tuples in place, we > can and often do unpack them into variables.
When we do, we need to know > the field order: > > I hope this was already clear to people in the discussion, but in case not, thanks for clarifying. > > flavour, charge, mass, spin, isospin, hypercharge = mytuple > > but the risk is that the field order may not be what we expect unless we > are scrupulously careful to *always* call ntuple(...) with the arguments > in the same order. > > This is indeed among the reasons why the tuple api is desirable mostly for backwards compatibility in existing functions, as pointed out early in this thread. New functions will hopefully use something with only attribute access to the values, unless there is a clear reason to also have integer indexing and unpacking by order. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From k7hoven at gmail.com Wed Jul 26 18:33:39 2017 From: k7hoven at gmail.com (Koos Zevenhoven) Date: Thu, 27 Jul 2017 01:33:39 +0300 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <34cfcc19-22ca-554a-5f6a-a647d02ff4b1@mapgears.com> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <20170726171016.GW3149@ando.pearwood.info> <34cfcc19-22ca-554a-5f6a-a647d02ff4b1@mapgears.com> Message-ID: On Wed, Jul 26, 2017 at 8:47 PM, Alexandre Brault wrote: > > The main use case for ntuple literals, imo, would be to replace > functions like this: > >>> def spam(...): > ... [...] > ... return eggs, ham > > With the more convenient for the caller > >>> def spam(...): > ... [...] > ... return (eggs=eggs, ham=ham) > > Yes, but for the caller it's just as convenient without new namedtuple syntax. If there's new *syntax* for returning multiple values, it would indeed hopefully look more into the future and not create a tuple.
-- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Wed Jul 26 19:46:45 2017 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 27 Jul 2017 11:46:45 +1200 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> Message-ID: <597929E5.9020103@canterbury.ac.nz> Nick Coghlan wrote: > The same applies to the ntuple concept, expect there it's the fact > that it's a *tuple* that conveys the "order matters" expectation. That assumes there's a requirement that it be a tuple in the first place. I don't see that requirement in the use cases suggested here so far. -- Greg From steve at pearwood.info Wed Jul 26 20:38:07 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 27 Jul 2017 10:38:07 +1000 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <597929E5.9020103@canterbury.ac.nz> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <597929E5.9020103@canterbury.ac.nz> Message-ID: <20170727003807.GY3149@ando.pearwood.info> On Thu, Jul 27, 2017 at 11:46:45AM +1200, Greg Ewing wrote: > Nick Coghlan wrote: > >The same applies to the ntuple concept, expect there it's the fact > >that it's a *tuple* that conveys the "order matters" expectation. > > That assumes there's a requirement that it be a tuple in > the first place. I don't see that requirement in the use > cases suggested here so far. This is an excellent point. Perhaps we should just find a shorter name for SimpleNamespace and promote it as the solution. 
I'm not sure about other versions, but in Python 3.5 it will even save memory for small records: py> from types import SimpleNamespace py> spam = SimpleNamespace(flavour='up', charge='1/3') py> sys.getsizeof(spam) 24 py> from collections import namedtuple py> eggs = namedtuple('record', 'flavour charge')(charge='1/3', flavour='up') py> sys.getsizeof(eggs) 32 py> sys.getsizeof(('up', '1/3')) 32 -- Steve From python-ideas at mgmiller.net Wed Jul 26 23:23:53 2017 From: python-ideas at mgmiller.net (Mike Miller) Date: Wed, 26 Jul 2017 20:23:53 -0700 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <20170727003807.GY3149@ando.pearwood.info> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <597929E5.9020103@canterbury.ac.nz> <20170727003807.GY3149@ando.pearwood.info> Message-ID: <2c0750e1-2a1a-45f6-d6a1-90d464e68623@mgmiller.net> Many times in the olden days when I needed a bag o' attributes to be passed around like a struct I'd make a dummy class, then instantiate it. (A lot harder than the javascript equivalent.) Unfortunately, the modern Python solution: from types import SimpleNamespace as ns is only a few characters shorter. Perhaps a 'ns()' or 'bag()' builtin alias could fit the bill. Another idea I had not too long ago, was to let an object() be writable, then no further changes would be necessary. -Mike On 2017-07-26 17:38, Steven D'Aprano wrote: > This is an excellent point. Perhaps we should just find a shorter name > for SimpleNamespace and promote it as the solution. 
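Mike's alias idea needs no interpreter changes to experiment with. Here is a minimal sketch, where `ns` and `bag` are hypothetical short names bound at module level, not real builtins:

```python
from types import SimpleNamespace

# Hypothetical short aliases -- nothing here is an actual builtin.
ns = bag = SimpleNamespace

p = bag(x=1, y=2)   # a mutable "bag o' attributes"
p.z = 3             # unlike a (named)tuple, new attributes can be added

# Equality compares the attribute dicts, so keyword order is irrelevant:
assert bag(x=1, y=2) == bag(y=2, x=1)
```

Note how this sidesteps the field-ordering debate entirely: a namespace has no positional behaviour to disagree about.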
> From desmoulinmichel at gmail.com Thu Jul 27 05:29:44 2017 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Thu, 27 Jul 2017 11:29:44 +0200 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <2c0750e1-2a1a-45f6-d6a1-90d464e68623@mgmiller.net> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <597929E5.9020103@canterbury.ac.nz> <20170727003807.GY3149@ando.pearwood.info> <2c0750e1-2a1a-45f6-d6a1-90d464e68623@mgmiller.net> Message-ID: <995437d0-98a7-c752-2cde-e76e3238fc3b@gmail.com> To avoid introducing a new built-in, we could do object.bag = SimpleNamespace On 27/07/2017 at 05:23, Mike Miller wrote: > Many times in the olden days when I needed a bag o' attributes to be > passed around like a struct I'd make a dummy class, then instantiate > it. (A lot harder than the javascript equivalent.) > > Unfortunately, the modern Python solution: > > from types import SimpleNamespace as ns > > is only a few characters shorter. Perhaps a 'ns()' or 'bag()' builtin > alias could fit the bill. > > Another idea I had not too long ago, was to let an object() be writable, > then no further changes would be necessary. > > -Mike > > > On 2017-07-26 17:38, Steven D'Aprano wrote: >> This is an excellent point. Perhaps we should just find a shorter name >> for SimpleNamespace and promote it as the solution.
>> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From ncoghlan at gmail.com Thu Jul 27 07:19:07 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 27 Jul 2017 21:19:07 +1000 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <20170726171016.GW3149@ando.pearwood.info> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <20170726171016.GW3149@ando.pearwood.info> Message-ID: On 27 July 2017 at 03:10, Steven D'Aprano wrote: > On Thu, Jul 27, 2017 at 02:05:47AM +1000, Nick Coghlan wrote: >> On 26 July 2017 at 11:05, Steven D'Aprano wrote: >> > I don't see any way that this proposal can be anything by a subtle >> > source of bugs. We have two *incompatible* requirements: >> > >> > - we want to define the order of the fields according to the >> > order we give keyword arguments; >> > >> > - we want to give keyword arguments in any order without >> > caring about the field order. >> > >> > We can't have both, and we can't give up either without being a >> > surprising source of annoyance and bugs. >> >> I think the second stated requirement isn't a genuine requirement, as >> that *isn't* a general expectation. >> >> After all, one of the reasons we got ordered-by-default keyword >> arguments is because people were confused by the fact that you >> couldn't reliably do: >> >> mydict = collections.OrderedDict(x=1, y=2) > > > Indeed. But the reason we got *keyword arguments* in the first place was > so you didn't need to care about the order of parameters. As is often > the case, toy examples with arguments x and y don't really demonstrate > the problem in real code. We need more realistic, non-trivial examples. 
Trivial examples in ad hoc throwaway scripts, analysis notebooks, and student exercises *are* the use case. For non-trivial applications and non-trivial data structures with more than a few fields, the additional overhead of predefining and appropriately documenting a suitable class (whether with collections.namedtuple, a data class library like attrs, or completely by hand) is going to be small relative to the overall complexity of the application, so the most sensible course of action is usually going to be to just go ahead and do that. Instead, as Alexandre describes, the use cases that the ntuple builtin proposal aims to address *aren't* those where folks are already willing to use a properly named tuple: it's those where they're currently creating a *plain* tuple, and we want to significantly lower the barrier to making such objects a bit more self-describing, by deliberately eliminating the need to predefine the related type. In an educational setting, it may even provide a gentler introduction to the notion of custom class definitions for developers just starting out. From an API user's perspective, if you see `ntuple(x=1, y=2)` as an object's representation, that will also carry significantly more concrete information about the object's expected behaviour than if you see `Point2D(x=1, y=2)`. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Jul 27 07:48:39 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 27 Jul 2017 21:48:39 +1000 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <20170727003807.GY3149@ando.pearwood.info> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <597929E5.9020103@canterbury.ac.nz> <20170727003807.GY3149@ando.pearwood.info> Message-ID: On 27 July 2017 at 10:38, Steven D'Aprano wrote: > On Thu, Jul 27, 2017 at 11:46:45AM +1200, Greg Ewing wrote: >> Nick Coghlan wrote: >> >The same applies to the ntuple concept, expect there it's the fact >> >that it's a *tuple* that conveys the "order matters" expectation. >> >> That assumes there's a requirement that it be a tuple in >> the first place. I don't see that requirement in the use >> cases suggested here so far. > > This is an excellent point. Perhaps we should just find a shorter name > for SimpleNamespace and promote it as the solution. > > I'm not sure about other versions, but in Python 3.5 it will even save > memory for small records: > > py> from types import SimpleNamespace > py> spam = SimpleNamespace(flavour='up', charge='1/3') > py> sys.getsizeof(spam) > 24 sys.getsizeof() isn't recursive, so this is only measuring the overhead of CPython's per-object bookkeeping. The actual storage expense is incurred via the instance dict: >>> sys.getsizeof(spam.__dict__) 240 >>> data = dict(charge='1/3', flavour='up') >>> sys.getsizeof(data) 240 Note: this is a 64-bit system, so the per-instance overhead is also higher (48 bytes rather than 24), and tuples incur a cost of 8 bytes per item rather than 4 bytes. 
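Nick's point is easy to check directly. The exact byte counts vary between CPython versions and platforms, but the ordering -- a plain tuple well below an instance plus its backing dict -- should be stable:

```python
import sys
from types import SimpleNamespace

spam = SimpleNamespace(flavour='up', charge='1/3')
# Per-instance footprint of the namespace approach: the object header
# plus its backing __dict__ (sys.getsizeof is not recursive).
namespace_total = sys.getsizeof(spam) + sys.getsizeof(spam.__dict__)

# The same two values stored as one contiguous tuple:
tuple_total = sys.getsizeof(('up', '1/3'))

# On CPython, tuple_total is substantially smaller than namespace_total.
print(tuple_total, namespace_total)
```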
It's simply not desirable to rely on dicts for this kind of use case, as the per-instance cost of their bookkeeping machinery is overly high for small data classes, and key-sharing only mitigates that problem, it doesn't eliminate it. By contrast, tuples are not only the most memory efficient data structure Python offers, they're also one of the fastest to allocate: since they're fixed length, they can be allocated as a single contiguous block, rather than requiring multiple memory allocations per instance (and that's before taking the free list into account). As a result, "Why insist on a tuple?" has three main answers: - lowest feasible per-instance memory overhead - lowest feasible runtime allocation cost overhead - backwards compatibility with APIs that currently return a tuple without impacting either of the above benefits Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From chris.barker at noaa.gov Thu Jul 27 12:22:05 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 27 Jul 2017 09:22:05 -0700 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <597929E5.9020103@canterbury.ac.nz> <20170727003807.GY3149@ando.pearwood.info> Message-ID: > To avoid introducing a new built-in, we could do object.bag = SimpleNamespace I am liking the idea of making SimpleNamespace more accessible, but maybe we need to think a bit more about why one might want a tuple-with-names, rather than just an easy way to create an object-with-just-attributes. That is -- how many times do folks use a namedtuple rather than SimpleNamespace just because they know about it, rather than because they really need it. I know that is often the case... but here are some reasons to want an actual tuple (or, an actual ImmutableSequence) 1) Backward compatibility with tuples. 
This may have been a common use case when they were new, and maybe still is, but If we are future-looking, I don't think this the the primary use case. But maybe some of the features you get from that are important. 2) order-preserving: this makes them a good match for "records" from a DB or CSV file or something. 3) unpacking: x, y = a_point 4) iterating: for coord in a_point: ... 5) immutability: being able to use them as a key in a dict. What else? So the question is -- If we want an easier way to create a namedtuple-like object -- which of these features are desired? Personally, I think an immutable SimpleNamespace would be good. And if you want the other stuff, use a NamedTuple. And a quick and easy way to make one would be nice. I understand that the ordering could be confusing to folks, but I'm still thinking yes -- in the spirit of duck-typing, I think having to think about the Type is unfortunate. And will people really get confused if: ntuple(x=1, y=2) == ntuple(y=2, x=1) returns False? If so -- then, if we are will to introduce new syntax, then we can make that more clear. Note that: ntuple(x=1, y=2) == ntuple(z=1, w=2) Should also be False. and ntuple(x=1, y=2) == (1, 2) also False (this is losing tuple-compatibility) That is, the names, and the values, and the order are all fixed. If we use a tuple to define the "type" == ('x','y') then it's easy enough to cache and compare based on that. If, indeed, you need to cache at all. BTW, I think we need to be careful about what assumptions we are making in terms of "dicts are order-preserving". My understanding is that the fact that the latest dict in cpython is order preserving should be considered an implementation detail, and not relied on. But that we CAN count on **kwargs being order-preserving. That is, **kwargs is an order-preserving mapping, but the fact that it IS a dict is an implementation detail. Have I got that right? Of course, this will make it hard to back-port a "ntuple" implementation.... 
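The "cache on the field-name tuple" idea sketched above can be prototyped today on top of collections.namedtuple. The `ntuple` helper below is hypothetical, not an existing builtin, and note one caveat: a real namedtuple keeps plain-tuple equality, so here ntuple(x=1, y=2) == (1, 2) stays True rather than becoming False as proposed:

```python
from collections import namedtuple

_cache = {}

def ntuple(**kwargs):
    # The ordered field names serve as the cache key, so field order is
    # effectively part of the "type" (**kwargs preserves call order, PEP 468).
    fields = tuple(kwargs)
    if fields not in _cache:
        _cache[fields] = namedtuple('ntuple', fields)
    return _cache[fields](**kwargs)

a = ntuple(x=1, y=2)
assert type(a) is type(ntuple(x=3, y=4))  # same field order -> same cached type
assert a != ntuple(y=2, x=1)              # (1, 2) != (2, 1): order is part of the value
assert a == (1, 2)                        # tuple compatibility retained, per the caveat
```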
And ntuple(('x', 2), ('y', 3)) is unfortunate. -CHB On Thu, Jul 27, 2017 at 4:48 AM, Nick Coghlan wrote: > On 27 July 2017 at 10:38, Steven D'Aprano wrote: > > On Thu, Jul 27, 2017 at 11:46:45AM +1200, Greg Ewing wrote: > >> Nick Coghlan wrote: > >> >The same applies to the ntuple concept, expect there it's the fact > >> >that it's a *tuple* that conveys the "order matters" expectation. > >> > >> That assumes there's a requirement that it be a tuple in > >> the first place. I don't see that requirement in the use > >> cases suggested here so far. > > > > This is an excellent point. Perhaps we should just find a shorter name > > for SimpleNamespace and promote it as the solution. > > > > I'm not sure about other versions, but in Python 3.5 it will even save > > memory for small records: > > > > py> from types import SimpleNamespace > > py> spam = SimpleNamespace(flavour='up', charge='1/3') > > py> sys.getsizeof(spam) > > 24 > > sys.getsizeof() isn't recursive, so this is only measuring the > overhead of CPython's per-object bookkeeping. The actual storage > expense is incurred via the instance dict: > > >>> sys.getsizeof(spam.__dict__) > 240 > >>> data = dict(charge='1/3', flavour='up') > >>> sys.getsizeof(data) > 240 > > Note: this is a 64-bit system, so the per-instance overhead is also > higher (48 bytes rather than 24), and tuple incur a cost of 8 bytes > per item rather than 4 bytes. > > It's simply not desirable to rely on dicts for this kind of use case, > as the per-instance cost of their bookkeeping machinery is overly high > for small data classes and key-sharing only mitigates that problem, it > doesn't eliminate it. > > By contrast, tuples are not only the most memory efficient data > structure Python offers, they're also one of the fastest to allocate: > since they're fixed length, they can be allocated as a single > contiguous block, rather than requiring multiple memory allocations > per instance (and that's before taking the free list into account). 
> > As a result, "Why insist on a tuple?" has three main answers: > > - lowest feasible per-instance memory overhead > - lowest feasible runtime allocation cost overhead > - backwards compatibility with APIs that currently return a tuple > without impacting either of the above benefits > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Jul 27 12:26:34 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 27 Jul 2017 09:26:34 -0700 Subject: [Python-ideas] namedtuple nit... Message-ID: Since we are talking about namedtuple and implementation, I just noticed: In [22]: Point = namedtuple('Point', ['x', 'y']) In [23]: p = Point(2,3) In [24]: p.x = 5 --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) in () ----> 1 p.x = 5 AttributeError: can't set attribute OK -- that makes sense. but then, if you try: In [25]: p.z = 5 --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) in () ----> 1 p.z = 5 AttributeError: 'Point' object has no attribute 'z' I think this should be a different message -- key here is that you can't set a new attribute, not that one doesn't exist. Maybe: "AttributeError: can't set new attribute" -CHB -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From levkivskyi at gmail.com Thu Jul 27 12:41:28 2017 From: levkivskyi at gmail.com (Ivan Levkivskyi) Date: Thu, 27 Jul 2017 18:41:28 +0200 Subject: [Python-ideas] namedtuple nit... In-Reply-To: References: Message-ID: This error message is the same for types with __slots__, and probably it is indeed a bit too terse. -- Ivan On 27 July 2017 at 18:26, Chris Barker wrote: > Since we are talking about namedtuple and implementation, I just noticed: > > In [22]: Point = namedtuple('Point', ['x', 'y']) > In [23]: p = Point(2,3) > > In [24]: p.x = 5 > ------------------------------------------------------------ > --------------- > AttributeError Traceback (most recent call last) > in () > ----> 1 p.x = 5 > AttributeError: can't set attribute > > OK -- that makes sense. but then, if you try: > > In [25]: p.z = 5 > ------------------------------------------------------------ > --------------- > AttributeError Traceback (most recent call last) > in () > ----> 1 p.z = 5 > AttributeError: 'Point' object has no attribute 'z' > > I think this should be a different message -- key here is that you can't > set a new attribute, not that one doesn't exist. Maybe: > > "AttributeError: can't set new attribute" > > -CHB > > -- > > Christopher Barker, Ph.D. 
> Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pavol.lisy at gmail.com Thu Jul 27 17:22:09 2017 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Thu, 27 Jul 2017 23:22:09 +0200 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <20170726010511.GT3149@ando.pearwood.info> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> Message-ID: On 7/26/17, Steven D'Aprano wrote: [...] > But this has a hidden landmine. If *any* module happens to use ntuple > with the same field names as you, but in a different order, you will > have mysterious bugs: > > x, y, z = spam > > You expect x=2, y=1, z=0 because that's the way you defined the field > order, but unknown to you some other module got in first and defined it > as [z, y, x] and so your code will silently do the wrong thing. We have: from module import x, y, z # where order is not important Could we have something similar with ntuple (SimpleNamespace, ...)? 
maybe: for x, y, z from spam: print(x, y, z) From rosuav at gmail.com Thu Jul 27 17:50:52 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 28 Jul 2017 07:50:52 +1000 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> Message-ID: On Fri, Jul 28, 2017 at 7:22 AM, Pavol Lisy wrote: > On 7/26/17, Steven D'Aprano wrote: > > [...] > >> But this has a hidden landmine. If *any* module happens to use ntuple >> with the same field names as you, but in a different order, you will >> have mysterious bugs: >> >> x, y, z = spam >> >> You expect x=2, y=1, z=0 because that's the way you defined the field >> order, but unknown to you some other module got in first and defined it >> as [z, y, x] and so your code will silently do the wrong thing. > > We have: > from module import x, y, z # where order is not important > > Could we have something similar with ntuple (SimpleNamespace, ...)? > > maybe: > for x, y, z from spam: > print(x, y, z) What you're asking for is something like JavaScript's "object destructuring" syntax. It would sometimes be cool, but I haven't ever really yearned for it in Python. But you'd need to decide whether you want attributes (spam.x, spam.y) or items (spam["x"], spam["y"]). Both would be useful at different times. ChrisA From markusmeskanen at gmail.com Thu Jul 27 18:00:28 2017 From: markusmeskanen at gmail.com (Markus Meskanen) Date: Fri, 28 Jul 2017 01:00:28 +0300 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> Message-ID: > But you'd need to decide whether you want attributes (spam.x, spam.y) or items (spam["x"], spam["y"]). 
Both would be useful at different times. ChrisA If something like this was ever added, it'd probably be items, then you could implement a custom __unpack__ (or whatever name it'd be) method that would return a dict. -------------- next part -------------- An HTML attachment was scrubbed... URL: From elazarg at gmail.com Thu Jul 27 18:14:47 2017 From: elazarg at gmail.com (=?UTF-8?B?15DXnNei15bXqA==?=) Date: Thu, 27 Jul 2017 22:14:47 +0000 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> Message-ID: On Fri, Jul 28, 2017 at 12:51 AM Chris Angelico wrote: > On Fri, Jul 28, 2017 at 7:22 AM, Pavol Lisy wrote: > > > We have: > > from module import x, y, z # where order is not important > > > > Could we have something similar with ntuple (SimpleNamespace, ...)? > > > > maybe: > > for x, y, z from spam: > > print(x, y, z) > > What you're asking for is something like JavaScript's "object > destructuring" syntax. It would sometimes be cool, but I haven't ever > really yearned for it in Python. But you'd need to decide whether you > want attributes (spam.x, spam.y) or items (spam["x"], spam["y"]). Both > would be useful at different times. > > Maybe something like {x, y} = point Or with aliasing {x as dx, y as dy} = point Then you can have {'x' as x, 'y' as y, **} = spam Elazar -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.barker at noaa.gov Thu Jul 27 19:31:25 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 27 Jul 2017 16:31:25 -0700 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> Message-ID: On Thu, Jul 27, 2017 at 2:50 PM, Chris Angelico wrote: > On Fri, Jul 28, 2017 at 7:22 AM, Pavol Lisy wrote: > > maybe: > > for x, y, z from spam: > > print(x, y, z) > > What you're asking for is something like JavaScript's "object > destructuring" syntax. Wasn't there just a big long discussion about something like that on this list? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Jul 27 19:54:37 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 28 Jul 2017 09:54:37 +1000 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> Message-ID: On Fri, Jul 28, 2017 at 9:31 AM, Chris Barker wrote: > > > On Thu, Jul 27, 2017 at 2:50 PM, Chris Angelico wrote: >> >> On Fri, Jul 28, 2017 at 7:22 AM, Pavol Lisy wrote: >> > maybe: >> > for x, y, z from spam: >> > print(x, y, z) >> >> What you're asking for is something like JavaScript's "object >> destructuring" syntax. > > > Wasn't there just a big long discussion about something like that on this > list? Yeah, and the use cases just aren't as strong in Python. 
I think part of it is because a JS function taking keyword arguments looks like this:

function fetch(url, options) {
    const {method, body, headers} = options;
    // ...
}

fetch("http://httpbin.org/post", {method: "POST", body: "blah"});

whereas Python would spell it this way:

def fetch(url, *, method="GET", body=None, headers=[]):
    ...

fetch("http://httpbin.org/post", method="POST", body="blah")

So that's one big slab of use-case gone, right there. ChrisA From python-ideas at mgmiller.net Thu Jul 27 20:09:56 2017 From: python-ideas at mgmiller.net (Mike Miller) Date: Thu, 27 Jul 2017 17:09:56 -0700 Subject: [Python-ideas] namedtuple nit... In-Reply-To: References: Message-ID: <0879cff7-b6d2-5cc9-d256-c621c44f4c0c@mgmiller.net> I've never liked that error message either:

>>> object().foo = 'bar'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'object' object has no attribute 'foo'

Should say the "object is immutable," not writable, or something of the sort. On 2017-07-27 09:26, Chris Barker wrote: > Since we are talking about namedtuple and implementation, I just noticed: From rosuav at gmail.com Thu Jul 27 21:02:48 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 28 Jul 2017 11:02:48 +1000 Subject: [Python-ideas] namedtuple nit... In-Reply-To: <0879cff7-b6d2-5cc9-d256-c621c44f4c0c@mgmiller.net> References: <0879cff7-b6d2-5cc9-d256-c621c44f4c0c@mgmiller.net> Message-ID: On Fri, Jul 28, 2017 at 10:09 AM, Mike Miller wrote: > I've never liked that error message either: > > >>> object().foo = 'bar' > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > AttributeError: 'object' object has no attribute 'foo' > > > Should say the "object is immutable," not writable, or something of the > sort. As Ivan said, this is to do with __slots__. It's nothing to do with immutability:

>>> class Demo:
...     __slots__ = 'spam'
...
>>> x = Demo()
>>> x.spam = 1
>>> x.ham = 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Demo' object has no attribute 'ham'

If the object has a __dict__, any unknown attributes go into there:

>>> class Demo2(Demo): pass
...
>>> y = Demo2()
>>> y.ham = 2
>>> y.spam = 3
>>> y.__dict__
{'ham': 2}

which prevents that "has no attribute" error. ChrisA From turnbull.stephen.fw at u.tsukuba.ac.jp Thu Jul 27 21:21:43 2017 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Fri, 28 Jul 2017 10:21:43 +0900 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> Message-ID: <22906.37287.360608.165676@turnbull.sk.tsukuba.ac.jp> MRAB writes: > But what about: > > >>> nt2 = ntuple(y=2, x=1) > > ? Does that mean that nt[0] == 2? Presumably, yes. > > Does nt == nt2? > > If it's False, then you've lost some of the advantage of using names > instead of positions. Sure. And if you use a dict, you've lost some of the advantage of using names instead of positions too. I'm not sure a somewhat hacky use case (see my reply to Ethan elsewhere in the thread) justifies a builtin, but I can easily see using it myself if it did exist. Steve From turnbull.stephen.fw at u.tsukuba.ac.jp Thu Jul 27 21:24:09 2017 From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull) Date: Fri, 28 Jul 2017 10:24:09 +0900 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <5978D4E1.8010601@stoneleaf.us> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <5978D4E1.8010601@stoneleaf.us> Message-ID: <22906.37433.808865.420400@turnbull.sk.tsukuba.ac.jp> Ethan Furman writes: > Tuples, named or otherwise, are positional first -- order matters.
> Specifying > > point = ntuple(y=2, x=-3) > > and having point[0] == 3 is going to be bizarre. This will be a > source for horrible bugs. I don't see how you get that? Anyway, I expect that ntuples will *very* frequently be *written* in an order-dependent (and probably highly idiomatic) way, and *read* using attribute notation:

def db_from_csv(sheet):
    db = []
    names = next(sheet)
    for row in sheet:
        # dict(zip(...)) because ** unpacking requires a mapping
        db.append(ntuple(**dict(zip(names, row))))
    return db

my_db = []
for sheet in my_sheets:
    my_db.extend(db_from_csv(sheet))

# sorted() rather than list.sort(), which returns None
x_index = sorted(my_db, key=lambda row: row.x)
y_index = sorted(my_db, key=lambda row: row.y)

(untested). As far as I can see, this is just duck-typed collection data, as Chris Barker puts it. Note that the above idiom can create a non-rectangular database from sheets of arbitrary column orders as long as both 'x' and 'y' columns are present in all of my sheets. A bit hacky, but it's the kind of thing you might do in a one-off script, or when you're aggregating data collected by unreliable RAs. Sure, this can be abused, but an accidental pitfall? Seems to me just as likely that you'd get that with ordinary tuples. I can easily imagine scenarios like "Oh, these are tuples but I need even *more* performance. I know! I'll read my_db into a numpy array!" But I would consider that an abuse, or at least a hack (consider how you'd go about getting variable names for the numpy array columns). From steve at pearwood.info Thu Jul 27 21:53:08 2017 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 28 Jul 2017 11:53:08 +1000 Subject: [Python-ideas] namedtuple nit...
In-Reply-To: <0879cff7-b6d2-5cc9-d256-c621c44f4c0c@mgmiller.net> References: <0879cff7-b6d2-5cc9-d256-c621c44f4c0c@mgmiller.net> Message-ID: <20170728015306.GB3149@ando.pearwood.info> On Thu, Jul 27, 2017 at 05:09:56PM -0700, Mike Miller wrote: > I've never liked that error message either: > > >>> object().foo = 'bar' > Traceback (most recent call last): > File "", line 1, in > AttributeError: 'object' object has no attribute 'foo' > > > Should say the "object is immutable," not writable, or something of the > sort. Then put in a feature request on the bug tracker. It shouldn't require a week's discussion on Python-Ideas to improve an error message. -- Steve From ethan at stoneleaf.us Thu Jul 27 22:42:01 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 27 Jul 2017 19:42:01 -0700 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <22906.37433.808865.420400@turnbull.sk.tsukuba.ac.jp> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <5978D4E1.8010601@stoneleaf.us> <22906.37433.808865.420400@turnbull.sk.tsukuba.ac.jp> Message-ID: <597AA479.7020404@stoneleaf.us> On 07/27/2017 06:24 PM, Stephen J. Turnbull wrote: > Ethan Furman writes: >> Tuples, named or otherwise, are positional first -- order matters. >> Specifying >> >> point = ntuple(y=2, x=-3) >> >> and having point[0] == 3 is going to be bizarre. This will be a >> source for horrible bugs. > > I don't see how you get that? How I get the point[0] == 3? The first definition of an ntuple had the order as x, y, and since the proposal is only comparing field names (not order), this (y, x) ntuple ends up being reversed to how it was specified. 
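The two readings being debated here can be made concrete with today's collections.namedtuple; the hypothetical ntuple semantics are simulated by hand, and sorting the field names below is only a stand-in for "comparing field names, not order":

```python
from collections import namedtuple

# Reading 1: field order is the order as written
P1 = namedtuple('P1', ['y', 'x'])
p1 = P1(y=2, x=-3)
assert p1[0] == 2

# Reading 2: field names canonicalized (sorted here as a stand-in),
# so the written order is discarded
P2 = namedtuple('P2', sorted(['y', 'x']))  # fields become ('x', 'y')
p2 = P2(y=2, x=-3)
assert p2[0] == -3  # same call, different value at index 0
```

Under the second reading the element at index 0 silently changes, which is the "reversed to how it was specified" hazard.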
> Anyway, I expect that ntuples will *very* frequently be *written* in an > order-dependent (and probably highly idiomatic) way, and *read* using > attribute notation: Sure, but they can also be unpacked, and order matters there. Also, as D'Aprano pointed out, if the first instance of an ntuple has the fields in a different order than expected, all subsequent ntuples that are referenced in an order-dependent fashion will be returning data from the wrong indexes. -- ~Ethan~ From python-ideas at mgmiller.net Fri Jul 28 04:00:13 2017 From: python-ideas at mgmiller.net (Mike Miller) Date: Fri, 28 Jul 2017 01:00:13 -0700 Subject: [Python-ideas] namedtuple nit... In-Reply-To: References: <0879cff7-b6d2-5cc9-d256-c621c44f4c0c@mgmiller.net> Message-ID: <68487227-25c6-1bfb-0222-b6a7085e7970@mgmiller.net> On 2017-07-27 18:02, Chris Angelico wrote: > As Ivan said, this is to do with __slots__. It's nothing to do with > immutability: >>> object().__slots__ Traceback (most recent call last): File "", line 1, in AttributeError: 'object' object has no attribute '__slots__' >>> object().__dict__ Traceback (most recent call last): File "", line 1, in AttributeError: 'object' object has no attribute '__dict__' >>> If an object has no slots or dict and does not accept attribute assignment, is it not effectively immutable? -Mike From antoine.rozo at gmail.com Fri Jul 28 04:06:19 2017 From: antoine.rozo at gmail.com (Antoine Rozo) Date: Fri, 28 Jul 2017 10:06:19 +0200 Subject: [Python-ideas] namedtuple nit... In-Reply-To: <68487227-25c6-1bfb-0222-b6a7085e7970@mgmiller.net> References: <0879cff7-b6d2-5cc9-d256-c621c44f4c0c@mgmiller.net> <68487227-25c6-1bfb-0222-b6a7085e7970@mgmiller.net> Message-ID: > If an object has no slots or dict and does not accept attribute assignment, is it not effectively immutable? No, not necessarily. 
class A(list): __slots__ = () 2017-07-28 10:00 GMT+02:00 Mike Miller : > > > On 2017-07-27 18:02, Chris Angelico wrote: > > As Ivan said, this is to do with __slots__. It's nothing to do with > > immutability: > > >>> object().__slots__ > Traceback (most recent call last): > File "", line 1, in > AttributeError: 'object' object has no attribute '__slots__' > > >>> object().__dict__ > Traceback (most recent call last): > File "", line 1, in > AttributeError: 'object' object has no attribute '__dict__' > >>> > > If an object has no slots or dict and does not accept attribute > assignment, is it not effectively immutable? > > -Mike > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Antoine Rozo -------------- next part -------------- An HTML attachment was scrubbed... URL: From python-ideas at mgmiller.net Fri Jul 28 14:23:49 2017 From: python-ideas at mgmiller.net (Mike Miller) Date: Fri, 28 Jul 2017 11:23:49 -0700 Subject: [Python-ideas] namedtuple nit... In-Reply-To: References: <0879cff7-b6d2-5cc9-d256-c621c44f4c0c@mgmiller.net> <68487227-25c6-1bfb-0222-b6a7085e7970@mgmiller.net> Message-ID: That's a subclass. Also: >>> class A(list): __slots__ = () ... >>> >>> a = A() >>> a.foo = 'bar' Traceback (most recent call last): File "", line 1, in AttributeError: 'A' object has no attribute 'foo' -Mike On 2017-07-28 01:06, Antoine Rozo wrote: > > If an object has no slots or dict and does not accept attribute assignment, > is it not effectively immutable? > > No, not necessarily. > > class A(list): __slots__ = () > From antoine.rozo at gmail.com Fri Jul 28 14:37:19 2017 From: antoine.rozo at gmail.com (Antoine Rozo) Date: Fri, 28 Jul 2017 20:37:19 +0200 Subject: [Python-ideas] namedtuple nit... 
In-Reply-To: References: <0879cff7-b6d2-5cc9-d256-c621c44f4c0c@mgmiller.net> <68487227-25c6-1bfb-0222-b6a7085e7970@mgmiller.net> Message-ID: Yes, but A objects have no slots, no dict, do not accept attribute assignment, but are mutable. >>> a = A() >>> a [] >>> a.__slots__ () >>> a.__dict__ Traceback (most recent call last): File "", line 1, in AttributeError: 'A' object has no attribute '__dict__' >>> a.append(1) >>> a.append(2) >>> a [1, 2] 2017-07-28 20:23 GMT+02:00 Mike Miller : > That's a subclass. Also: > > >>> class A(list): __slots__ = () > ... > >>> > >>> a = A() > >>> a.foo = 'bar' > Traceback (most recent call last): > File "", line 1, in > AttributeError: 'A' object has no attribute 'foo' > > -Mike > > > On 2017-07-28 01:06, Antoine Rozo wrote: > >> > If an object has no slots or dict and does not accept attribute >> assignment, is it not effectively immutable? >> >> No, not necessarily. >> >> class A(list): __slots__ = () >> >> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Antoine Rozo -------------- next part -------------- An HTML attachment was scrubbed... URL: From python-ideas at mgmiller.net Fri Jul 28 14:56:32 2017 From: python-ideas at mgmiller.net (Mike Miller) Date: Fri, 28 Jul 2017 11:56:32 -0700 Subject: [Python-ideas] namedtuple nit... In-Reply-To: References: <0879cff7-b6d2-5cc9-d256-c621c44f4c0c@mgmiller.net> <68487227-25c6-1bfb-0222-b6a7085e7970@mgmiller.net> Message-ID: Nice. Ok, so there are different dimensions of mutability. Still, haven't found any "backdoors" to object(), the one I claimed was immutable. 
-Mike From chris.barker at noaa.gov Fri Jul 28 20:27:23 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 28 Jul 2017 17:31:25 -0700 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <597AA479.7020404@stoneleaf.us> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <5978D4E1.8010601@stoneleaf.us> <22906.37433.808865.420400@turnbull.sk.tsukuba.ac.jp> <597AA479.7020404@stoneleaf.us> Message-ID: On Thu, Jul 27, 2017 at 7:42 PM, Ethan Furman wrote: > How I get the point[0] == 3? The first definition of an ntuple had the > order as x, y, and since the proposal is only comparing field names (not > order), this (y, x) ntuple ends up being reversed to how it was specified. > I'm not sure there ever was a "proposal" per se, but: ntuple(x=a, y=b) had better be a different type than: ntuple(y=b, x=a) but first we need to decide if we want an easy way to make a namedtuple-like object or a SimpleNamespace-like object.... but if you are going to allow indexing by integer, then order needs to be part of the definition. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Jul 29 09:49:35 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 29 Jul 2017 23:49:35 +1000 Subject: [Python-ideas] namedtuple nit... In-Reply-To: References: <0879cff7-b6d2-5cc9-d256-c621c44f4c0c@mgmiller.net> <68487227-25c6-1bfb-0222-b6a7085e7970@mgmiller.net> Message-ID: On 29 July 2017 at 04:56, Mike Miller wrote: > Nice. Ok, so there are different dimensions of mutability. > > Still, haven't found any "backdoors" to object(), the one I claimed was > immutable.
It's possible to write builtin types that are truly immutable, and there are several examples of that (direct instances of object, tuple instances, instances of the builtin numeric types), but it isn't particularly straightforward to make Python defined classes truly immutable. While this gets close for stateless instances (akin to instantiating object() directly):

>>> class MostlyImmutable:
...     __slots__ = ()
...     @property
...     def __class__(self):
...         return type(self)
...

It's far more difficult to actually store any meaningful state without making it open to mutation in some way (since it needs to be settable from __new__, and Python doesn't provide any inherent mechanism for distinguishing those cases from post-creation modifications). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From chris.barker at noaa.gov Sat Jul 29 12:03:57 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Sat, 29 Jul 2017 09:03:57 -0700 Subject: [Python-ideas] namedtuple nit... In-Reply-To: References: <0879cff7-b6d2-5cc9-d256-c621c44f4c0c@mgmiller.net> <68487227-25c6-1bfb-0222-b6a7085e7970@mgmiller.net> Message-ID: On Sat, Jul 29, 2017 at 6:49 AM, Nick Coghlan wrote: > It's possible to write builtin types that are truly immutable, and > there are several examples of that (direct instances of object, tuple > instances, instances of the builtin numeric types), Maybe this is an argument for namedtuple to be a "proper" builtin. Though, as the names have to be defined at run time, I suppose that isn't possible. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed...
URL: From tritium-list at sdamon.com Sat Jul 29 12:14:51 2017 From: tritium-list at sdamon.com (Alex Walters) Date: Sat, 29 Jul 2017 12:14:51 -0400 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <5978D4E1.8010601@stoneleaf.us> <22906.37433.808865.420400@turnbull.sk.tsukuba.ac.jp> <597AA479.7020404@stoneleaf.us> Message-ID: <069501d30885$cee8c430$6cba4c90$@sdamon.com> My $0.02 on the entire series of namedtuple threads is: there *might* be value in an immutable namespace type, and a mutable namespace type, but namedtuple's promise is that they can be used anywhere a tuple can be used. If passing in kwargs to create the potential replacement to namedtuple is sensitive to dict iteration order, it really isn't a viable replacement for namedtuple. I do feel like there isn't that big of a usecase for an immutable namespace type as there is for a namedtuple. I would rather namedtuple class creation be quicker. From: Python-ideas [mailto:python-ideas-bounces+tritium-list=sdamon.com at python.org] On Behalf Of Chris Barker Sent: Friday, July 28, 2017 8:27 PM To: Ethan Furman Cc: Python-Ideas Subject: Re: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] On Thu, Jul 27, 2017 at 7:42 PM, Ethan Furman > wrote: How I get the point[0] == 3? The first definition of an ntuple had the order as x, y, and since the proposal is only comparing field names (not order), this (y, x) ntuple ends up being reversed to how it was specified. I'm not sure there ever was a "proposal" per se, but: ntuple(x=a, y=b) had better be a different type than: ntuple(y=b, x=a) but first we need to decide if we want an easy way to make a namedtuple-like object or a SimpleNamespace-like object.... but if you are going to allow indexing by integer, then order needs to be part of the definition.
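A minimal sketch of a factory with exactly that property (a hypothetical helper, not an actual proposal) keys a per-process type cache on the field order, relying on PEP 468's guarantee that **kwargs order is preserved in Python 3.6+:

```python
from collections import namedtuple

_cache = {}

def ntuple(**kwargs):
    # Field order is part of the type's identity.
    names = tuple(kwargs)
    if names not in _cache:
        _cache[names] = namedtuple('ntuple_' + '_'.join(names), names)
    return _cache[names](**kwargs)

a, b = 1, 2
p = ntuple(x=a, y=b)
q = ntuple(y=b, x=a)
assert type(p) is not type(q)   # different field order -> different type
assert p[0] == 1 and q[0] == 2  # indexing follows the order as written
```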
-CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From desmoulinmichel at gmail.com Sun Jul 30 04:19:26 2017 From: desmoulinmichel at gmail.com (Michel Desmoulin) Date: Sun, 30 Jul 2017 10:19:26 +0200 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <069501d30885$cee8c430$6cba4c90$@sdamon.com> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <5978D4E1.8010601@stoneleaf.us> <22906.37433.808865.420400@turnbull.sk.tsukuba.ac.jp> <597AA479.7020404@stoneleaf.us> <069501d30885$cee8c430$6cba4c90$@sdamon.com> Message-ID: <80455b4d-ad6f-87e9-dc7a-8d671e8c3c95@gmail.com> On 29/07/2017 at 18:14, Alex Walters wrote: > My $0.02 on the entire series of namedtuple threads is: there *might* > be value in an immutable namespace type, and a mutable namespace type, > but namedtuple's promise is that they can be used anywhere a tuple can > be used. If passing in kwargs to create the potential replacement to > namedtuple is sensitive to dict iteration order, it really isn't a > viable replacement for namedtuple. In Python 3.6, kwargs order is preserved and guaranteed. It's currently implemented by relying on the non-guaranteed dict order. But the two are not linked. The spec does guarantee that from now on, kwargs order is always preserved whether the dict order is or not. > > > > I do feel like there isn't that big of a usecase for an immutable > namespace type as there is for a namedtuple. I would rather namedtuple
> > > > > > *From:* Python-ideas > [mailto:python-ideas-bounces+tritium-list=sdamon.com at python.org] *On > Behalf Of *Chris Barker > *Sent:* Friday, July 28, 2017 8:27 PM > *To:* Ethan Furman > *Cc:* Python-Ideas > *Subject:* Re: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] > > > > On Thu, Jul 27, 2017 at 7:42 PM, Ethan Furman > wrote: > > How I get the point[0] == 3? The first definition of an ntuple had > the order as x, y, and since the proposal is only comparing field > names (not order), this (y, x) ntuple ends up being reversed to how > it was specified. > > > > I'm not sure there ever was a "proposal" per se, but: > > ntuple(x=a, y=b) > > > > had better be a different type than: > > ntuple(y=b, x=a) > > but first we need to decide if we want an easy way to make an > namedtuple-like object or a SimpleNemaspace-like object.... > > > > but if you are going to allow indexing by integer, then order needs to > be part of the definition. > > > > -CHB > > > -- > > > Christopher Barker, Ph.D. 
> Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > From tritium-list at sdamon.com Sun Jul 30 06:03:10 2017 From: tritium-list at sdamon.com (Alex Walters) Date: Sun, 30 Jul 2017 06:03:10 -0400 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <80455b4d-ad6f-87e9-dc7a-8d671e8c3c95@gmail.com> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <5978D4E1.8010601@stoneleaf.us> <22906.37433.808865.420400@turnbull.sk.tsukuba.ac.jp> <597AA479.7020404@stoneleaf.us> <069501d30885$cee8c430$6cba4c90$@sdamon.com> <80455b4d-ad6f-87e9-dc7a-8d671e8c3c95@gmail .com> Message-ID: <080201d3091b$0d677f90$28367eb0$@sdamon.com> > -----Original Message----- > From: Python-ideas [mailto:python-ideas-bounces+tritium- > list=sdamon.com at python.org] On Behalf Of Michel Desmoulin > Sent: Sunday, July 30, 2017 4:19 AM > To: python-ideas at python.org > Subject: Re: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] > > > > Le 29/07/2017 ? 18:14, Alex Walters a ?crit : > > My $0.02 on the entire series of nametuple threads is? there **might** > > be value in an immutable namespace type, and a mutable namespace type, > > but namedtuple?s promise is that they can be used anywhere a tuple can > > be used. If passing in kwargs to create the potential replacement to > > namedtuple is sensitive to dict iteration order, it really isn?t a > > viable replacement for namedtuple. > > In Python 3.6, kwargs order is preserved and guaranteed. 
It's currently > implemented by relying on the non guaranteed dict order. But the 2 are > not linked. The spec does guaranty that for now on, kwargs order is > always preserved whether the dict order is or not. MyNT = namedtuple_replacement('MyNT', 'foo bar') data = {} data['bar'] = 8675309 data['foo'] = 525600 MyNT(*data) == (525600, 8675309) # better be true, or else, we are depending on iteration order. > > > > > > > > > I do feel like there isn?t that big of a usecase for an immutable > > namespace type as there is for a namedtuple. I would rather namedtuple > > class creation be quicker. > > > > > > > > > > > > *From:* Python-ideas > > [mailto:python-ideas-bounces+tritium-list=sdamon.com at python.org] *On > > Behalf Of *Chris Barker > > *Sent:* Friday, July 28, 2017 8:27 PM > > *To:* Ethan Furman > > *Cc:* Python-Ideas > > *Subject:* Re: [Python-ideas] namedtuple literals [Was: RE a new > namedtuple] > > > > > > > > On Thu, Jul 27, 2017 at 7:42 PM, Ethan Furman > > wrote: > > > > How I get the point[0] == 3? The first definition of an ntuple had > > the order as x, y, and since the proposal is only comparing field > > names (not order), this (y, x) ntuple ends up being reversed to how > > it was specified. > > > > > > > > I'm not sure there ever was a "proposal" per se, but: > > > > ntuple(x=a, y=b) > > > > > > > > had better be a different type than: > > > > ntuple(y=b, x=a) > > > > but first we need to decide if we want an easy way to make an > > namedtuple-like object or a SimpleNemaspace-like object.... > > > > > > > > but if you are going to allow indexing by integer, then order needs to > > be part of the definition. > > > > > > > > -CHB > > > > > > -- > > > > > > Christopher Barker, Ph.D. 
> > Oceanographer > > > > Emergency Response Division > > NOAA/NOS/OR&R (206) 526-6959 voice > > 7600 Sand Point Way NE (206) 526-6329 fax > > Seattle, WA 98115 (206) 526-6317 main reception > > > > Chris.Barker at noaa.gov > > > > > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From ncoghlan at gmail.com Sun Jul 30 11:24:12 2017 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 31 Jul 2017 01:24:12 +1000 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: <080201d3091b$0d677f90$28367eb0$@sdamon.com> References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <5978D4E1.8010601@stoneleaf.us> <22906.37433.808865.420400@turnbull.sk.tsukuba.ac.jp> <597AA479.7020404@stoneleaf.us> <069501d30885$cee8c430$6cba4c90$@sdamon.com> <80455b4d-ad6f-87e9-dc7a-8d671e8c3c95@gmail.com> <080201d3091b$0d677f90$28367eb0$@sdamon.com> Message-ID: On 30 July 2017 at 20:03, Alex Walters wrote: >> -----Original Message----- >> From: Python-ideas [mailto:python-ideas-bounces+tritium- >> list=sdamon.com at python.org] On Behalf Of Michel Desmoulin >> Sent: Sunday, July 30, 2017 4:19 AM >> To: python-ideas at python.org >> Subject: Re: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] >> >> >> >> Le 29/07/2017 ? 18:14, Alex Walters a ?crit : >> > My $0.02 on the entire series of nametuple threads is? 
there **might** >> > be value in an immutable namespace type, and a mutable namespace type, >> > but namedtuple?s promise is that they can be used anywhere a tuple can >> > be used. If passing in kwargs to create the potential replacement to >> > namedtuple is sensitive to dict iteration order, it really isn?t a >> > viable replacement for namedtuple. >> >> In Python 3.6, kwargs order is preserved and guaranteed. It's currently >> implemented by relying on the non guaranteed dict order. But the 2 are >> not linked. The spec does guaranty that for now on, kwargs order is >> always preserved whether the dict order is or not. > > MyNT = namedtuple_replacement('MyNT', 'foo bar') > data = {} > data['bar'] = 8675309 > data['foo'] = 525600 > > MyNT(*data) == (525600, 8675309) # better be true, or else, we are depending on iteration order. Did you mean "MyNT(**data)" in the last line? Either way, this is just normal predefined namedtuple creation, where the field order is set when the type is defined. Rather than being about any changes on that front, these threads are mostly about making it possible to write that first line as: MyNT = type(implicitly_typed_named_tuple_factory(foo=None, bar=None)) ... (While they do occasionally veer into discussing the idea of yet-another-kind-of-data-storage-type, that is an extraordinarily unlikely outcome) Cheers, Nick. 
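Nick's distinction is checkable with today's collections.namedtuple: once the field order is fixed at definition time, **data matches by name regardless of dict insertion order, while *data unpacks the keys themselves (the slip being queried above):

```python
from collections import namedtuple

MyNT = namedtuple('MyNT', 'foo bar')  # field order fixed here, once

data = {}
data['bar'] = 8675309
data['foo'] = 525600

assert MyNT(**data) == (525600, 8675309)  # matched by name; insertion order irrelevant
assert MyNT(*data) == ('bar', 'foo')      # *data unpacks the dict's keys, not its values
```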
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Sun Jul 30 14:31:02 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 30 Jul 2017 19:31:02 +0100 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <5978D4E1.8010601@stoneleaf.us> <22906.37433.808865.420400@turnbull.sk.tsukuba.ac.jp> <597AA479.7020404@stoneleaf.us> <069501d30885$cee8c430$6cba4c90$@sdamon.com> <80455b4d-ad6f-87e9-dc7a-8d671e8c3c95@gmail.com> <080201d3091b$0d677f90$28367eb0$@sdamon.com> Message-ID: On 30 July 2017 at 16:24, Nick Coghlan wrote: > Rather than being about any changes on that front, these threads are > mostly about making it possible to write that first line as: > > MyNT = type(implicitly_typed_named_tuple_factory(foo=None, bar=None)) Is that really true, though? There's a lot of discussion about whether ntuple(x=1, y=2) and ntuple(y=2, x=1) are equal (which implies they are the same type). If there's any way they can be the same type, then your definition of MyNT above is inherently ambiguous, depending on whether we've previously referred to implicitly_typed_named_tuple_factory(bar=None, foo=None). For me, the showstopper with regard to this whole discussion about ntuple(x=1, y=2) is this key point - every proposed behaviour has turned out to be surprising to someone (and not just in a "hmm, that's odd" sense, but rather in the sense that it'd almost certainly result in bugs as a result of misunderstood behaviour). 
Paul From markusmeskanen at gmail.com Sun Jul 30 14:57:19 2017 From: markusmeskanen at gmail.com (Markus Meskanen) Date: Sun, 30 Jul 2017 21:57:19 +0300 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <5978D4E1.8010601@stoneleaf.us> <22906.37433.808865.420400@turnbull.sk.tsukuba.ac.jp> <597AA479.7020404@stoneleaf.us> <069501d30885$cee8c430$6cba4c90$@sdamon.com> <80455b4d-ad6f-87e9-dc7a-8d671e8c3c95@gmail.com> <080201d3091b$0d677f90$28367eb0$@sdamon.com> Message-ID: I've been experimenting with this:

class QuickNamedTuple(tuple):

    def __new__(cls, **kwargs):
        inst = super().__new__(cls, tuple(kwargs.values()))
        inst._names = tuple(kwargs.keys())
        return inst

    def __getattr__(self, attr):
        if attr in self._names:
            return self[self._names.index(attr)]
        raise AttributeError(attr)

    def __repr__(self):
        values = []
        for i, name in enumerate(self._names):
            values.append(f'{name}={self[i]}')
        return f'({", ".join(values)})'

It's a quick scrap and probably not ideal code, but the idea is the point. I believe this is how the new "quick" named tuple should ideally work:

In: ntuple = QuickNamedTuple(x=1, y=2, z=-1)
In: ntuple
Out: (x=1, y=2, z=-1)
In: ntuple[1] == ntuple.y
Out: True
In: ntuple == (1, 2, -1)
Out: True
In: ntuple == QuickNamedTuple(z=-1, y=2, x=1)
Out: False

So yeah, the order of the keyword arguments would matter in my case, and I've found it to work the best. How often do you get the keywords in a random order? And for those cases, you have SimpleNamespace, or you can just use the old namedtuple. But most of the time you always have the same attributes in the same order (think of reading a CSV for example), and this would be just a normal tuple, but with custom names for the indexes. Just my two cents and thoughts from an everyday Python developer.
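One consequence of making keyword order significant is worth spelling out: two instances built from the same name/value pairs compare unequal even though attribute access agrees. A condensed, self-contained rerun of the sketch (with __repr__ dropped and a guard added so __getattr__ cannot recurse on a missing _names):

```python
class QuickNamedTuple(tuple):
    """Condensed version of the sketch above."""

    def __new__(cls, **kwargs):
        inst = super().__new__(cls, tuple(kwargs.values()))
        inst._names = tuple(kwargs)
        return inst

    def __getattr__(self, attr):
        # Read the instance dict directly so a missing _names
        # (e.g. during unpickling) cannot trigger recursion.
        names = self.__dict__.get('_names', ())
        if attr in names:
            return self[names.index(attr)]
        raise AttributeError(attr)

p = QuickNamedTuple(x=1, y=2)
q = QuickNamedTuple(y=2, x=1)
assert p != q           # same pairs, different keyword order
assert p.x == q.x == 1  # yet attribute access agrees
assert p.y == q.y == 2
```

This is precisely the behaviour Paul flags as a likely source of misunderstood-behaviour bugs.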
-------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Sun Jul 30 21:40:19 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 30 Jul 2017 21:40:19 -0400 Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple] In-Reply-To: References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <5978D4E1.8010601@stoneleaf.us> <22906.37433.808865.420400@turnbull.sk.tsukuba.ac.jp> <597AA479.7020404@stoneleaf.us> <069501d30885$cee8c430$6cba4c90$@sdamon.com> <80455b4d-ad6f-87e9-dc7a-8d671e8c3c95@gmail.com> <080201d3091b$0d677f90$28367eb0$@sdamon.com> Message-ID: On 7/30/2017 2:57 PM, Markus Meskanen wrote: > I've been experimenting with this: > > class QuickNamedTuple(tuple): > > def __new__(cls, **kwargs): > inst = super().__new__(cls, tuple(kwargs.values())) > inst._names = tuple(kwargs.keys()) > return inst > > def __getattr__(self, attr): > if attr in self._names: > return self[self._names.index(attr)] > raise AttributeError(attr) > > def __repr__(self): > values = [] > for i, name in enumerate(self._names): > values.append(f'{name}={self[i]}') > return f'({", ".join(values)})' > > It's a quick scrap and probably not ideal code, but the idea is the > point. I believe this is how the new "quick" named tuple should ideally > work: > > In: ntuple = QuickNamedTuple(x=1, y=2, z=-1) > In: ntuple > Out: (x=1, y=2, z=-1) > In: ntuple[1] == ntuple.y > Out: True > In: ntuple == (1, 2, 3) > Out: True > In: ntuple == QuickNamedTuple(z=-1, y=2, x=1) > Out: False > > So yeah, the order of the keyword arguments would matter in my case, and > I've found it to work the best. How often do you get the keywords in a > random order? And for those cases, you have SimpleNameSpace, or you can > just use the old namedtuple. 
> But most of the time you always have the
> same attributes in the same order (think of reading a CSV for example),
> and this would be just a normal tuple, but with custom names for the
> indexes.

Using a name to position map:

class QuickNamedTuple(tuple):

    def __new__(cls, **kwargs):
        inst = super().__new__(cls, tuple(kwargs.values()))
        inst._namepos = {name: i for i, name in enumerate(kwargs.keys())}
        return inst

    def __getattr__(self, attr):
        try:
            return self[self._namepos[attr]]
        except KeyError:
            raise AttributeError(attr) from None

    def __repr__(self):
        values = []
        for name, i in self._namepos.items():
            values.append(f'{name}={self[i]}')
        return f'({", ".join(values)})'

Same outputs as above.

--
Terry Jan Reedy

From steve at pearwood.info  Mon Jul 31 00:34:04 2017
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 31 Jul 2017 14:34:04 +1000
Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple]
In-Reply-To:
References: <5978D4E1.8010601@stoneleaf.us> <22906.37433.808865.420400@turnbull.sk.tsukuba.ac.jp> <597AA479.7020404@stoneleaf.us> <069501d30885$cee8c430$6cba4c90$@sdamon.com> <80455b4d-ad6f-87e9-dc7a-8d671e8c3c95@gmail.com> <080201d3091b$0d677f90$28367eb0$@sdamon.com>
Message-ID: <20170731043404.GC3149@ando.pearwood.info>

On Sun, Jul 30, 2017 at 09:57:19PM +0300, Markus Meskanen wrote:

> So yeah, the order of the keyword arguments would matter in my case, and
> I've found it to work the best. How often do you get the keywords in a
> random order?

"Random" order? Never. *Arbitrary* order? All the time. That's the
whole point of keyword arguments: you don't have to care about the
order.

> And for those cases, you have SimpleNamespace,

Which is no good for when you need a tuple.

> or you can
> just use the old namedtuple.
> But most of the time you always have the same
> attributes in the same order (think of reading a CSV for example),

If you're reading from CSV, you probably aren't specifying the
arguments by keyword, you're probably reading them and assigning by
position. You may not even know what the columns are until you read the
CSV file.

Let's think some more about reading from a CSV file. How often do you
have three one-letter column names like "x", "y", "z"? I don't know
about you, but for me, never. I'm more likely to have a dozen columns,
or more, and I can't remember and don't want to remember what order
they're supposed to be in *every single time* I read a row or make a
tuple of values. The point of using keywords is to avoid needing to
remember the order. If I have to remember the order, why bother naming
them?

I think this proposal combines the worst of both worlds:

- like positional arguments, you have to care about the order, and if
  you get it wrong, your code will likely silently break in a hard to
  debug way;

- and like keyword arguments, you have the extra typing of having to
  include the field names;

- but unlike keyword arguments, you have to include every single one,
  in the right order.
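That "silently break" failure mode can be made concrete with the stdlib, using two collections.namedtuple types that differ only in field order to stand in for the proposed ntuple(x=1, y=2) vs ntuple(y=2, x=1):

```python
from collections import namedtuple

# Two types whose only difference is field order, standing in for
# ntuple(x=1, y=2) vs ntuple(y=2, x=1) in the proposal.
PointXY = namedtuple('PointXY', ['x', 'y'])
PointYX = namedtuple('PointYX', ['y', 'x'])

p = PointXY(x=1, y=2)
q = PointYX(y=2, x=1)

# By name the two records agree completely...
print(p.x == q.x and p.y == q.y)  # True

# ...but as tuples they silently disagree, and positional code
# reads the wrong field without any exception being raised:
print(p == q)             # False
print(tuple(p), tuple(q)) # (1, 2) (2, 1)
```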
--
Steve

From ncoghlan at gmail.com  Mon Jul 31 01:38:26 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 31 Jul 2017 15:38:26 +1000
Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple]
In-Reply-To:
References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <5978D4E1.8010601@stoneleaf.us> <22906.37433.808865.420400@turnbull.sk.tsukuba.ac.jp> <597AA479.7020404@stoneleaf.us> <069501d30885$cee8c430$6cba4c90$@sdamon.com> <80455b4d-ad6f-87e9-dc7a-8d671e8c3c95@gmail.com> <080201d3091b$0d677f90$28367eb0$@sdamon.com>
Message-ID:

On 31 July 2017 at 04:31, Paul Moore wrote:
> On 30 July 2017 at 16:24, Nick Coghlan wrote:
>> Rather than being about any changes on that front, these threads are
>> mostly about making it possible to write that first line as:
>>
>> MyNT = type(implicitly_typed_named_tuple_factory(foo=None, bar=None))
>
> Is that really true, though? There's a lot of discussion about whether
> ntuple(x=1, y=2) and ntuple(y=2, x=1) are equal (which implies they
> are the same type).

No, they're different types, because the requested field order is
different, just as if you made two separate calls to
"collections.namedtuple".

If you want them to be the same type, so that the parameter order in
the second call gets ignored, then you need to ask for that explicitly
(either by using "collections.namedtuple" directly, or by calling
type() on an implicitly typed instance), or else by keeping the field
order consistent.

> If there's any way they can be the same type, then
> your definition of MyNT above is inherently ambiguous, depending on
> whether we've previously referred to
> implicitly_typed_named_tuple_factory(bar=None, foo=None).
This is why any implicit type definition would *have* to use the field
order as given: anything else opens up the opportunity for
action-at-a-distance that changes the field order based on the order in
which instances are created. (Even without that concern, you'd also get
a problematic combinatorial expansion when searching for matching
existing field definitions as the number of field names increases)

> For me, the showstopper with regard to this whole discussion about
> ntuple(x=1, y=2) is this key point - every proposed behaviour has
> turned out to be surprising to someone (and not just in a "hmm, that's
> odd" sense, but rather in the sense that it'd almost certainly result
> in bugs as a result of misunderstood behaviour).

I suspect the only way it would make sense is if the addition was made
in tandem with a requirement that the builtin dictionary type be
insertion ordered by default. The reason I say that is that given such
a rule, it would *consistently* be true that:

    tuple(dict(x=1, y=2).items()) != tuple(dict(y=2, x=1).items())

Just as this is already reliably true in Python 3.6 today:

>>> from collections import OrderedDict
>>> x_first = tuple(OrderedDict(x=1, y=2).items())
>>> y_first = tuple(OrderedDict(y=2, x=1).items())
>>> x_first != y_first
True
>>> x_first
(('x', 1), ('y', 2))
>>> y_first
(('y', 2), ('x', 1))

In both PyPy and CPython 3.6+, that's actually true for the builtin
dict as well (since their builtin implementations are order preserving
and that's now a requirement for keyword argument and class execution
namespace handling).
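The ordering claims above are directly checkable as a runnable snippet (the builtin-dict part assumes CPython 3.6+ or PyPy, where plain dicts preserve insertion order):

```python
from collections import OrderedDict

# The OrderedDict comparison shown above, as a runnable check:
x_first = tuple(OrderedDict(x=1, y=2).items())
y_first = tuple(OrderedDict(y=2, x=1).items())
print(x_first)             # (('x', 1), ('y', 2))
print(y_first)             # (('y', 2), ('x', 1))
print(x_first != y_first)  # True

# On order-preserving implementations the same inequality holds
# for the builtin dict, without OrderedDict:
print(tuple(dict(x=1, y=2).items()) != tuple(dict(y=2, x=1).items()))  # True
```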
That way, the invariant that folks would need to learn would just be:

    ntuple(x=1, y=2) == tuple(dict(x=1, y=2).values())
    ntuple(y=2, x=1) == tuple(dict(y=2, x=1).values())

rather than the current:

    from collections import OrderedDict
    auto_ntuple(x=1, y=2) == tuple(OrderedDict(x=1, y=2).values())
    auto_ntuple(y=2, x=1) == tuple(OrderedDict(y=2, x=1).values())

(Using Python 3.6 and auto_ntuple from
https://gist.github.com/ncoghlan/a79e7a1b3f7dac11c6cfbbf59b189621#file-auto_ntuple-py )

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From mertz at gnosis.cx  Mon Jul 31 01:41:35 2017
From: mertz at gnosis.cx (David Mertz)
Date: Sun, 30 Jul 2017 22:41:35 -0700
Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple]
In-Reply-To:
References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <5978D4E1.8010601@stoneleaf.us> <22906.37433.808865.420400@turnbull.sk.tsukuba.ac.jp> <597AA479.7020404@stoneleaf.us> <069501d30885$cee8c430$6cba4c90$@sdamon.com> <80455b4d-ad6f-87e9-dc7a-8d671e8c3c95@gmail.com> <080201d3091b$0d677f90$28367eb0$@sdamon.com>
Message-ID:

> But most of the time you always have the same attributes in the same order
> (think of reading a CSV for example), and this would be just a normal
> tuple, but with custom names for the indexes.

You apparently live in a halcyon world of data cleanliness where CSV
data is so well behaved.

In my world, I more typically deal with stuff like

data1.csv:
--------------
name,age,salaryK
John,39,50
Sally,52,37

data2.csv:
--------------
name,salaryK,age
Juan,47,31
Siu,88,66

I'm likely to define different namedtuples for dealing with this:

NameSalAge = namedtuple('NSA','name salary age')
NameAgeSal = namedtuple('NAS','name age salary')

Then later, indeed, I might ask:

if employee1.salary == employee2.salary: ...

And this would work even though I got the data from the different
formats.
--
Keeping medicines from the bloodstreams of the sick; food from the
bellies of the hungry; books from the hands of the uneducated;
technology from the underdeveloped; and putting advocates of freedom in
prisons. Intellectual property is to the 21st century what the slave
trade was to the 16th.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rosuav at gmail.com  Mon Jul 31 01:54:44 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 31 Jul 2017 15:54:44 +1000
Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple]
In-Reply-To:
References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <5978D4E1.8010601@stoneleaf.us> <22906.37433.808865.420400@turnbull.sk.tsukuba.ac.jp> <597AA479.7020404@stoneleaf.us> <069501d30885$cee8c430$6cba4c90$@sdamon.com> <80455b4d-ad6f-87e9-dc7a-8d671e8c3c95@gmail.com> <080201d3091b$0d677f90$28367eb0$@sdamon.com>
Message-ID:

On Mon, Jul 31, 2017 at 3:41 PM, David Mertz wrote:
>> But most of the time you always have the same attributes in the same order
>> (think of reading a CSV for example), and this would be just a normal tuple,
>> but with custom names for the indexes.
>
> You apparently live in a halcyon world of data cleanliness where CSV data is
> so well behaved.
>
> In my world, I more typically deal with stuff like
>
> data1.csv:
> --------------
> name,age,salaryK
> John,39,50
> Sally,52,37
>
> data2.csv:
> --------------
> name,salaryK,age
> Juan,47,31
> Siu,88,66
>
> I'm likely to define different namedtuples for dealing with this:
>
> NameSalAge = namedtuple('NSA','name salary age')
> NameAgeSal = namedtuple('NAS','name age salary')
>
> Then later, indeed, I might ask:
>
> if employee1.salary == employee2.salary: ...
>
> And this would work even though I got the data from the different formats.

Then you want csv.DictReader and dictionary lookups.
ChrisA

From mertz at gnosis.cx  Mon Jul 31 02:27:36 2017
From: mertz at gnosis.cx (David Mertz)
Date: Sun, 30 Jul 2017 23:27:36 -0700
Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple]
In-Reply-To:
References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <5978D4E1.8010601@stoneleaf.us> <22906.37433.808865.420400@turnbull.sk.tsukuba.ac.jp> <597AA479.7020404@stoneleaf.us> <069501d30885$cee8c430$6cba4c90$@sdamon.com> <80455b4d-ad6f-87e9-dc7a-8d671e8c3c95@gmail.com> <080201d3091b$0d677f90$28367eb0$@sdamon.com>
Message-ID:

Yep. DictReader is better for my simple example. Just pointing out that
encountering attributes in different orders isn't uncommon.

On Jul 30, 2017 10:55 PM, "Chris Angelico" wrote:

On Mon, Jul 31, 2017 at 3:41 PM, David Mertz wrote:
>> But most of the time you always have the same attributes in the same order
>> (think of reading a CSV for example), and this would be just a normal tuple,
>> but with custom names for the indexes.
>
> You apparently live in a halcyon world of data cleanliness where CSV data is
> so well behaved.
>
> In my world, I more typically deal with stuff like
>
> data1.csv:
> --------------
> name,age,salaryK
> John,39,50
> Sally,52,37
>
> data2.csv:
> --------------
> name,salaryK,age
> Juan,47,31
> Siu,88,66
>
> I'm likely to define different namedtuples for dealing with this:
>
> NameSalAge = namedtuple('NSA','name salary age')
> NameAgeSal = namedtuple('NAS','name age salary')
>
> Then later, indeed, I might ask:
>
> if employee1.salary == employee2.salary: ...
>
> And this would work even though I got the data from the different formats.

Then you want csv.DictReader and dictionary lookups.
ChrisA
_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From greg.ewing at canterbury.ac.nz  Mon Jul 31 01:40:04 2017
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 31 Jul 2017 17:40:04 +1200
Subject: [Python-ideas] namedtuple literals [Was: RE a new namedtuple]
In-Reply-To:
References: <4fc400d9-2625-f829-f441-08ed34a84ac9@gmail.com> <58c03f26-45e1-a3df-85e3-f48d4e302895@mrabarnett.plus.com> <20170726010511.GT3149@ando.pearwood.info> <5978D4E1.8010601@stoneleaf.us> <22906.37433.808865.420400@turnbull.sk.tsukuba.ac.jp> <597AA479.7020404@stoneleaf.us> <069501d30885$cee8c430$6cba4c90$@sdamon.com> <80455b4d-ad6f-87e9-dc7a-8d671e8c3c95@gmail.com> <080201d3091b$0d677f90$28367eb0$@sdamon.com>
Message-ID: <597EC2B4.1010108@canterbury.ac.nz>

> On 7/30/2017 2:57 PM, Markus Meskanen wrote:
>
>> How often do you get the keywords
>> in a random order?

The x, y, z example being used here is a bit deceptive, because the
fields have an obvious natural order. That isn't always going to be
the case.

--
Greg
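To make the csv.DictReader suggestion from earlier in the thread concrete: DictReader keys each row by the header line, so David's two column orders read identically. A minimal sketch with his data inlined as strings:

```python
import csv
import io

# David's two files, same columns in different orders, inlined here.
data1 = "name,age,salaryK\nJohn,39,50\nSally,52,37\n"
data2 = "name,salaryK,age\nJuan,47,31\nSiu,88,66\n"

def read_rows(text):
    # DictReader takes field names from the header row, so the
    # physical column order in the file no longer matters.
    return list(csv.DictReader(io.StringIO(text)))

rows1 = read_rows(data1)
rows2 = read_rows(data2)
print(rows1[0]["salaryK"])  # 50
print(rows2[0]["salaryK"])  # 47  (values are strings; convert as needed)
```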