From brett at python.org Tue Mar 21 06:19:36 2006
From: brett at python.org (Brett Cannon)
Date: Mon, 20 Mar 2006 21:19:36 -0800
Subject: [Python-3000] feature proposal process
Message-ID:

It seems to me that we should follow the normal process we have been following on python-dev; library changes can just be done at the discretion of committers and language changes require a PEP.

I guess the real question is how stringent we should be with the PEP process. For instance, do we really need a PEP for changing dict.keys() to return an iterator and to drop dict.iterkeys()? This has been planned for so long, I say it is not needed. PEP 3000 can be augmented to make sure all obvious changes have a basic explanation.

But what about situations like changing dict.keys() to an attribute? Does that require a full PEP? Since mutation is not expected during iteration on the returned iterator, I say it should be an attribute. But I am not sure whether that idea, backed up by my thinking that something that returns immutable information should come from an attribute, should be enshrined in a PEP describing general Py3K design principles or in a separate PEP describing overall planned changes to dict.

It probably wouldn't hurt to write a general guidelines PEP in terms of design. Obviously it would not be hard rules or authoritative, but stuff like what should be an attribute compared to a function with no arguments might be helpful. Also gives us a clear idea of where things should go in terms of direction instead of relying on Guido spilling his brain every so often (obviously Guido should be the primary author of a PEP like this).

The over-arching question I am posing is how granular we should be with the PEPs. Should some meta-PEPs in terms of design be hashed out first so we have general guidelines to follow? I say we should get at least some rough ideas down.

After that we should probably have a PEP for each of the core built-in types to cover exactly what the API is that we want. I doubt they will change much beyond the API so these shouldn't be major, but it will provide a good overview of what is and isn't lacking in them at the moment.

-Brett

From pedronis at strakt.com Tue Mar 21 13:05:11 2006
From: pedronis at strakt.com (Samuele Pedroni)
Date: Tue, 21 Mar 2006 13:05:11 +0100
Subject: [Python-3000] feature proposal process
In-Reply-To:
References:
Message-ID: <441FEBF7.3030304@strakt.com>

Brett Cannon wrote:
> It seems to me that we should follow the normal process we have been
> following on python-dev; library changes can just be done at the
> discretion of committers and language changes require a PEP.
>
> I guess the real question is how stringent we should be with the PEP
> process. For instance, do we really need a PEP for changing
> dict.keys() to return an iterator and to drop dict.iterkeys()? This
> has been planned for so long, I say it is not needed. PEP 3000 can be
> augmented to make sure all obvious changes have a basic explanation.
>
> But what about situations like changing dict.keys() to an attribute?
> Does that require a full PEP? Since mutation is not expected during
> iteration on the returned iterator, I say it should be an attribute.

uh? it produces a fresh iterator each time though. I cannot recall anything like this that follows your philosophy at the moment.

I think that mixing your ideas with process discussion is a mistake. (or was it a weird example)

> But I am not sure if that idea should be backed up by me thinking that a
> general design idea that something that returns immutable information
> should come from an attribute should be enshrined in a PEP describing
> general Py3K design principles or as a separate PEP describing overall
> planned changes to dict.
>
> It probably wouldn't hurt to write a general guidelines PEP in terms
> of design.
> Obviously it would not be hard rules or authoritative, but
> stuff like what should be an attribute compared to a function with no
> arguments might be helpful. Also gives us a clear idea of where
> things should go in terms of direction instead of relying on Guido
> spilling his brain every so often (obviously Guido should be the
> primary author of a PEP like this).
>
> The over-arching question I am posing is how granular we should be
> with the PEPs. Should some meta-PEPs in terms of design be hashed out
> first so we have general guidelines to follow? I say we should get at
> least some rough ideas down.
>
> After that we should probably have a PEP for each of the core built-in
> types to cover exactly what the API is that we want. I doubt they
> will change much beyond the API so these shouldn't be major, but it
> will provide a good overview of what is and isn't lacking in them at
> the moment.
>
> -Brett
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe: http://mail.python.org/mailman/options/python-3000/pedronis%40strakt.com

From thomas at python.org Tue Mar 21 17:01:31 2006
From: thomas at python.org (Thomas Wouters)
Date: Tue, 21 Mar 2006 17:01:31 +0100
Subject: [Python-3000] [Python-Dev] Python 3000 Process
In-Reply-To:
References:
Message-ID: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com>

On 3/20/06, Guido van Rossum wrote:
>
> We need to start deciding on important meta-issues like:
>
> - What's the timeline? I don't expect to be setting a schedule now and
> sticking to it for the next five years.
> But we owe everybody out there
> who is watching some clarity about when Python 3000 can be expected,
> and how we plan to get there; there are widely differing estimates of
> how long it will take, and I don't want to scare users away or cause
> developers to hold their breath waiting for it (some of which I
> imagine is happening with Perl 6).

Personally, I like the idea of Python 3.0 being released somewhere between 2.6 and 2.7, as you suggested at PyCon. That puts it between 1.5 and 3 years from now, given the usual release schedules. From what I can see now, that should be enough time to work out the major changes in 3.0. Maybe we should bill 3.0 as a 'developer release', urging extension/module-writers to adapt their code to 3.0 but not pushing it for general use. On the other hand, maybe it's better to make many pre-releases of 3.0, to give people a chance to look at 3.0 and to give developers (including ourselves) a chance to figure out how to upgrade (or facilitate upgrading) libraries, extensions and applications.

> - What's the upgrade path? Do we provide a conversion tool, or a
> compatibility mode, or both, or what? Will it be at all possible to
> write code that runs in Python 2.x (for large enough values of x) as
> well as in 3.0?

I would like Pythonic code to be usable in both 2.x (where x is, say, 6 or higher) and 3.0, but 3.0 shouldn't need to worry too much about it. A compatibility mode is probably a bad idea -- isn't the whole reason for 3.0 to not worry about compatibility? :-) Then again, if it costs us little or nothing, it may be worth considering.

As for a conversion tool, it's a nice idea, but I wonder how well it'll work in practice. Take dict.items() returning an iterator; how are you going to detect it, let alone convert it? You can 'convert' it by translating it to 'list(dict.items())', but it won't result in high quality code.
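
The mechanical rewrite described above can be sketched in a few lines. This is only an illustration, assuming an interpreter in which dict.items() no longer returns a list; the dictionary and the variable names are made up for the example.

```python
# Sketch only: assumes d.items() yields pairs lazily instead of building a list.
inventory = {"spam": 3, "eggs": 12}

# 2.x-style code often sorts or indexes the result, which needs a real list,
# so a conversion tool would wrap the call in list().
pairs = list(inventory.items())
pairs.sort()
first_key, first_count = pairs[0]

# When the result is only looped over, the wrapper would be a wasted copy.
total = sum(count for _, count in inventory.items())
```

The first use genuinely needs a list, so the wrapper is correct there; the second only loops, so wrapping it would cost a copy for nothing. A tool that cannot tell the two apart has to wrap both, which is the quality concern raised in the message.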
Maybe a 'change highlighter', which runs through code and detects things that might have changed in Python 3.0, is a more workable idea. I do believe that's the minimum we should aim for, though.

Or, if source-inspection turns out to be too hard, have a backward-compatibility-aware 3.0, which can warn that something will behave differently than 2.x. (Maybe that's what you meant by 'compatibility mode') I wouldn't call it 'python' or 'python3' though, but keep it separate from python. It would have to replace a number of types with proxy-types that also check various uses.

> This also touches upon the issue of parallel releases
> of Python 2.x and 3.x. My personal expectation (contrary to what MvL
> said recently) is that there will be several 2.x releases issued even
> after 3.0 is out; possibly 3.0 and 2.6 may coexist, and 2.7-2.9 may
> continue to evolve 2.x while 3.x is maturing. I've seen this used
> successfully in Perl (with 4->5) and Apache, and closer to home in
> Zope. Again, this is important in the light of how the transition is
> perceived in the world outside python-dev.

I think the choice for parallel releases is a no-brainer, although we'll have to see how it turns out in practice. If 3.0 is a smash hit with an easy upgrade path, we'll need fewer 2.x releases than if upgrading ends up being a laborious process.

> - Will we do a grand library reform at the same time? Personally I see
> that as quite a separate issue; apart from some specific things like
> the stdio redesign, we could start the library reform in 2.6, or
> post-3.0, depending on how much energy there is.

Agreed. We can certainly add new modules/names to 2.6 to make them forward-compatible and add pending-deprecation warnings to the old names.

> - What's the implementation strategy? I've started a branch where I
> plan to do some weeding out; but I've already found that the large
> amount of legacy code makes the weeding difficult.
> I may yet decide to
> switch to a sandbox model where only new code or carefully modernized
> old code is added (this is how Zope 3 was developed).

Hm, I don't feel that we need to throw out that much code. If you want to re-implement Python from scratch (with copious copy-pasting from 2.x) a sandbox model might make more sense, but then 1.5-3 years is quite unrealistic, and we'd also need to consider the C API stability. I'd much rather go for rigorous reviews and weeding of the existing source, than copying reviewed code until we have a working interpreter.

> - What's the procedure for proposing and new features? It may be time
> to start a new series of PEPs that focus exclusively on Python 3000.
> I'd like to reserve the numbers 3000-3099 for meta-PEPs (e.g.
> addressing the above questions) and 3100-3999 for feature PEPs.

Aye. I would like to suggest we document all changes, even when they are no-brainers that have been on the wishlist for years and don't fit the PEP style: making dict.keys/items/values return iterators, making -tt the default, removing the emacs/vi-tabsize-comment-hack from the parser, making all strings unicode, etc. In the end, a 'best practices' document describing how to write 2.x-and-3.x-compatible code may come in handy, too (but that can't be written until many features are fleshed out.)

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From aahz at pythoncraft.com Tue Mar 21 17:09:14 2006
From: aahz at pythoncraft.com (Aahz)
Date: Tue, 21 Mar 2006 08:09:14 -0800
Subject: [Python-3000] feature proposal process
In-Reply-To:
References:
Message-ID: <20060321160914.GA9056@panix.com>

[some snippage]

On Mon, Mar 20, 2006, Brett Cannon wrote:
>
> I guess the real question is how stringent we should be with the PEP
> process.
For instance, do we really need a PEP for changing > dict.keys() to return an iterator and to drop dict.iterkeys()? This > has been planned for so long, I say it is not needed. PEP 3000 can be > augmented to make sure all obvious changes have a basic explanation. > > The over-arching question I am posing is how granular we should be > with the PEPs. Should some meta-PEPs in terms of design be hashed out > first so we have general guidelines to follow. I say we should get at > least some rough ideas down. My take is that we should include all changes in some PEP. I think it makes good sense to group small changes into a single related PEP. The main reason for this is to provide a formalized mechanism to record changes in these small changes ;-) as we work through the Py3K process. Thus, we will need to be careful to continuously record the current state in the PEPs. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From nas at arctrix.com Tue Mar 21 20:08:40 2006 From: nas at arctrix.com (Neil Schemenauer) Date: Tue, 21 Mar 2006 12:08:40 -0700 Subject: [Python-3000] C API changes? [Was: Python 3000 Process] In-Reply-To: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> References: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> Message-ID: <20060321190839.GB13871@mems-exchange.org> On Tue, Mar 21, 2006 at 05:01:31PM +0100, Thomas Wouters wrote: > I would like Pythonic code to be usable in both 2.x (where x is, > say, 6 or higher) and 3.0, but 3.0 shouldn't need to worry too > much about it. The other question I haven't seen much discussion about is C API changes. Are we hoping that people can port extension modules with little effort or are we going to be doing lots of cleanup? Some possible areas of cleanup: * Use unions for PyObject structure (allowing strict aliasing by the compiler). 
* Move GC attributes into the PyObject structure (e.g. similar to how ob_size works). * Rethink weakref implementation (not sure about details but maybe we can do better). * Rationalize method structures (e.g. sq_concat and sq_repeat could go away since we can use nb_add and nb_multiply). * Rationalize finalizer behavior (e.g. get rid of __del__ methods in favor of guardians?). * Make it easier to write simple extension types. For example, there could be standard tp_clear and tp_traverse methods that could work if the object is simply an array of PyObject pointers. * Do we still need tp_print and tp_repr? * Register instead of stack based VM? One part of me hopes that we could do a lot of cleanup in these areas. Another part is concerned about badly breaking the huge number of extensions out there (a major reason for Python's success, IMO). In any case, I hope an important objective of P3K will be to make writing extensions even easier. Neil From bioinformed at gmail.com Tue Mar 21 20:23:36 2006 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Tue, 21 Mar 2006 14:23:36 -0500 Subject: [Python-3000] C API changes? [Was: Python 3000 Process] In-Reply-To: <20060321190839.GB13871@mems-exchange.org> References: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> <20060321190839.GB13871@mems-exchange.org> Message-ID: <2e1434c10603211123j16a18c1ai87e891da78079bd@mail.gmail.com> Hi Neil, I've been fairly quiet on the Python-dev front for a while -- mainly because I use Python every day and it almost always works. However, as someone who regularly writes small C extension modules and types, I would certainly add a vote for sacrificing any vestiges of backward compatibility in order to rationalize the PyObject structure. The relative short-term pain of forward porting my code base is far offset by the long-term gains of having a legacy-free and streamlined structure. 
Put another way, forward porting code is less work than continuing to write code long-term against the current model. Two important disclaimers: 1) For my purposes, I don't need code that runs on both Python pre-3k and post-3k without change. 2) I have the ability and means to forward port my code and the necessary third-party extensions that I'll need. Many other developers and users will not have this luxury. -Kevin On 3/21/06, Neil Schemenauer wrote: > > On Tue, Mar 21, 2006 at 05:01:31PM +0100, Thomas Wouters wrote: > > I would like Pythonic code to be usable in both 2.x (where x is, > > say, 6 or higher) and 3.0, but 3.0 shouldn't need to worry too > > much about it. > > The other question I haven't seen much discussion about is C API > changes. Are we hoping that people can port extension modules with > little effort or are we going to be doing lots of cleanup? > > Some possible areas of cleanup: > > * Use unions for PyObject structure (allowing strict aliasing > by the compiler). > > * Move GC attributes into the PyObject structure (e.g. similar > to how ob_size works). > > * Rethink weakref implementation (not sure about details but > maybe we can do better). > > * Rationalize method structures (e.g. sq_concat and sq_repeat > could go away since we can use nb_add and nb_multiply). > > * Rationalize finalizer behavior (e.g. get rid of __del__ > methods in favor of guardians?). > > * Make it easier to write simple extension types. For example, > there could be standard tp_clear and tp_traverse methods that > could work if the object is simply an array of PyObject pointers. > > * Do we still need tp_print and tp_repr? > > * Register instead of stack based VM? > > > One part of me hopes that we could do a lot of cleanup in these > areas. Another part is concerned about badly breaking the huge > number of extensions out there (a major reason for Python's success, > IMO). 
In any case, I hope an important objective of P3K will be to > make writing extensions even easier. > > Neil From nas at arctrix.com Tue Mar 21 20:30:35 2006 From: nas at arctrix.com (Neil Schemenauer) Date: Tue, 21 Mar 2006 12:30:35 -0700 Subject: [Python-3000] C API changes? [Was: Python 3000 Process] In-Reply-To: <2e1434c10603211123j16a18c1ai87e891da78079bd@mail.gmail.com> References: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> <20060321190839.GB13871@mems-exchange.org> <2e1434c10603211123j16a18c1ai87e891da78079bd@mail.gmail.com> Message-ID: <20060321193035.GA14247@mems-exchange.org> Kevin Jacobs wrote: > Put another way, forward porting code is less work than continuing > to write code long-term against the current model. Another argument for cleanup: the number of extensions to be written for P3k will outnumber the current number of extensions. That's perhaps optimistic but I think it's entirely possible. Neil From msoulier at digitaltorque.ca Tue Mar 21 20:49:29 2006 From: msoulier at digitaltorque.ca (Michael P. Soulier) Date: Tue, 21 Mar 2006 14:49:29 -0500 Subject: [Python-3000] C API changes? [Was: Python 3000 Process] In-Reply-To: <20060321190839.GB13871@mems-exchange.org> References: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> <20060321190839.GB13871@mems-exchange.org> Message-ID: <20060321194928.GO20833@tigger.digitaltorque.ca> On 21/03/06 Neil Schemenauer said: > The other question I haven't seen much discussion about is C API > changes.
Are we hoping that people can port extension modules with > little effort or are we going to be doing lots of cleanup? Can the old API stay around for a while? There's little to anger users more than API changes that aren't backwards compatible. Mike -- Michael P. Soulier "Any intelligent fool can make things bigger and more complex... It takes a touch of genius - and a lot of courage to move in the opposite direction." --Albert Einstein From bioinformed at gmail.com Tue Mar 21 21:30:12 2006 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Tue, 21 Mar 2006 15:30:12 -0500 Subject: [Python-3000] C API changes? [Was: Python 3000 Process] In-Reply-To: <20060321194928.GO20833@tigger.digitaltorque.ca> References: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> <20060321190839.GB13871@mems-exchange.org> <20060321194928.GO20833@tigger.digitaltorque.ca> Message-ID: <2e1434c10603211230q5b797ccbvb1e824daa19fab43@mail.gmail.com> On 3/21/06, Michael P. Soulier wrote: > > On 21/03/06 Neil Schemenauer said: > > > The other question I haven't seen much discussion about is C API > > changes. Are we hoping that people can port extension modules with > > little effort or are we going to be doing lots of cleanup? > > Can the old API stay around for a while? There's little to anger users > more > than API changes that aren't backwards compatible. > Python 2.x will be around for a long time and should remain compatible. My view is that Python 3k should be an optional (but hopefully compelling) upgrade and should start afresh without any unnecessary legacy.
It sounds like many of the suggested core language changes for Py3k are likely to be substantial enough to require changes to both Python code and extensions anyway.

-Kevin

From martin at v.loewis.de Tue Mar 21 21:44:36 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 21 Mar 2006 21:44:36 +0100
Subject: [Python-3000] C API changes? [Was: Python 3000 Process]
In-Reply-To: <20060321194928.GO20833@tigger.digitaltorque.ca>
References: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> <20060321190839.GB13871@mems-exchange.org> <20060321194928.GO20833@tigger.digitaltorque.ca>
Message-ID: <442065B4.2090804@v.loewis.de>

Michael P. Soulier wrote:
> On 21/03/06 Neil Schemenauer said:
>
>> The other question I haven't seen much discussion about is C API
>> changes. Are we hoping that people can port extension modules with
>> little effort or are we going to be doing lots of cleanup?
>
> Can the old API stay around for a while? There's little to anger users more
> than API changes that aren't backwards compatible.

The changes Neil mentions do not necessarily involve API changes. For example:

  * Use unions for PyObject structure (allowing strict aliasing by the compiler).

I don't think this is really necessary; perhaps Neil meant "use struct inheritance" instead, where you have

    typedef struct _object {
        Py_ssize_t ob_refcnt;
        struct _typeobject *ob_type;
    } PyObject;

    typedef struct _var_object {
        PyObject ob;
        Py_ssize_t ob_size;
    } PyVarObject;

    typedef struct {
        PyVarObject ob;
        long ob_shash;
        int ob_sstate;
        char ob_sval[1];
    } PyStringObject;

So, to access ob_refcnt when given a PyStringObject *o, you currently could write

    o->ob_refcnt

and, in Py3k, you would write

    o->ob.ob.ob_refcnt

However, this isn't really an API change: You don't access ob_refcnt *anyway*.
Instead, you write Py_INCREF and Py_DECREF, and these continue to work, with Py_INCREF being defined as

    #define Py_INCREF(o) (((PyObject*)(o))->ob_refcnt++)

Now, there are slight details, such as access to ob_type. We could introduce a Py_OB_TYPE macro that gives access to ob_type across versions, likewise Py_OB_SIZE.

Other changes Neil mentioned would be API changes, e.g. dropping sq_concat.

Yet other changes might be unimplementable, e.g. "Move GC attributes into the PyObject structure": where would you place ob_size if both are present, and how would code deal with ob_size and/or these fields being at different offsets?

So it must be considered on a case-by-case basis. Making a general promise that the old API could be maintained in all cases misses the point of Python 3.

Regards,
Martin

From brett at python.org Tue Mar 21 23:00:40 2006
From: brett at python.org (Brett Cannon)
Date: Tue, 21 Mar 2006 14:00:40 -0800
Subject: [Python-3000] C API changes? [Was: Python 3000 Process]
In-Reply-To: <20060321190839.GB13871@mems-exchange.org>
References: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> <20060321190839.GB13871@mems-exchange.org>
Message-ID:

On 3/21/06, Neil Schemenauer wrote:
> On Tue, Mar 21, 2006 at 05:01:31PM +0100, Thomas Wouters wrote:
> > I would like Pythonic code to be usable in both 2.x (where x is,
> > say, 6 or higher) and 3.0, but 3.0 shouldn't need to worry too
> > much about it.
>
> The other question I haven't seen much discussion about is C API
> changes. Are we hoping that people can port extension modules with
> little effort or are we going to be doing lots of cleanup?
>

I hope there is lots of cleanup. I don't write extensions that often, but when I do it takes me a little while to remember how to do them. If there is a way to simplify it I am in support of it.
Perhaps we should take some cues from other languages known for easy extension writing (Lua and Ruby come to mind, although I have never done extensions for them and instead just hear their names come up constantly on this subject).

-Brett

From brett at python.org Wed Mar 22 00:40:21 2006
From: brett at python.org (Brett Cannon)
Date: Tue, 21 Mar 2006 15:40:21 -0800
Subject: [Python-3000] feature proposal process
In-Reply-To: <441FEBF7.3030304@strakt.com>
References: <441FEBF7.3030304@strakt.com>
Message-ID:

On 3/21/06, Samuele Pedroni wrote:
> Brett Cannon wrote:
> > It seems to me that we should follow the normal process we have been
> > following on python-dev; library changes can just be done at the
> > discretion of committers and language changes require a PEP.
> >
> > I guess the real question is how stringent we should be with the PEP
> > process. For instance, do we really need a PEP for changing
> > dict.keys() to return an iterator and to drop dict.iterkeys()? This
> > has been planned for so long, I say it is not needed. PEP 3000 can be
> > augmented to make sure all obvious changes have a basic explanation.
> >
> > But what about situations like changing dict.keys() to an attribute?
> > Does that require a full PEP? Since mutation is not expected during
> > iteration on the returned iterator, I say it should be an attribute.
>
>
> uh? it produces a fresh iterator each time though. I cannot recall
> anything like this that follows your philosophy at the moment.

I am not talking about getting back the same iterator. I am talking about if you request an iterator and mutate the dict before exhausting the iterator. That makes the dict "read-only" in a way temporarily.

>
> I think that mixing your ideas with process discussion is a mistake.
> (or was it a weird example)
>

Weird example. Just trying to illustrate the various levels of granularity that we can take in terms of proposing changes.

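
The "read-only in a way" remark above can be illustrated with a short sketch. It relies on CPython raising RuntimeError when a dict changes size while one of its iterators is still live; the dictionary contents and variable names are invented for the example.

```python
settings = {"debug": True, "verbose": False}

keys = iter(settings.keys())   # request an iterator over the keys
next(keys)                     # consume one key, leaving the iterator unexhausted

settings["color"] = "auto"     # mutate the dict, so its size changes

try:
    next(keys)                 # resuming the old iterator is refused
except RuntimeError:
    outcome = "RuntimeError"
else:
    outcome = "no error"
```

Until the iterator is exhausted or discarded, adding or removing keys makes it unusable, which is the temporary restriction the message refers to.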
-Brett

From nnorwitz at gmail.com Wed Mar 22 05:52:34 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue, 21 Mar 2006 20:52:34 -0800
Subject: [Python-3000] C API changes? [Was: Python 3000 Process]
In-Reply-To: <20060321190839.GB13871@mems-exchange.org>
References: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> <20060321190839.GB13871@mems-exchange.org>
Message-ID:

On 3/21/06, Neil Schemenauer wrote:
>
> The other question I haven't seen much discussion about is C API
> changes. Are we hoping that people can port extension modules with
> little effort or are we going to be doing lots of cleanup?

Ideally both. I started ripping out coerce. I thought about re-organizing all the methods to try to group them a bit better. Then I realized the benefit is likely far too small for the larger pain of having to maintain 2 tables of methods (one for 2.x one for 3.x). It would be easier for extensions to conditionally not compile in the nb_coerce method for 3.x.

I don't have any idea about the longer term. It will be clearer where we stand once we rip out all the cruft/backwards compatibility. Get rid of old APIs, macros, etc. At that point we should be in a better position to determine how to proceed next. I'm going to strive to rip out all the old stuff by the end of April, but 2.5 should take higher priority.

> Some possible areas of cleanup:
>
> * Use unions for PyObject structure (allowing strict aliasing
> by the compiler).

Martin expanded on this one a bit. We should do something so we don't need -fno-strict-aliasing.

> * Rethink weakref implementation (not sure about details but
> maybe we can do better).

Sure, we should rethink everything IMO. If we can't come up with anything better, there's no work.

> * Rationalize method structures (e.g. sq_concat and sq_repeat
> could go away since we can use nb_add and nb_multiply).

Yes.

> * Rationalize finalizer behavior (e.g. get rid of __del__
> methods in favor of guardians?).

At least try to get rid of __del__. > * Make it easier to write simple extension types. Yes. > * Register instead of stack based VM? IMO this could go into 2.6, but definitely could be experimented in 3.x and backported. > One part of me hopes that we could do a lot of cleanup in these > areas. Another part is concerned about badly breaking the huge > number of extensions out there (a major reason for Python's success, > IMO). In any case, I hope an important objective of P3K will be to > make writing extensions even easier. Definitely. n From nnorwitz at gmail.com Wed Mar 22 06:08:04 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 21 Mar 2006 21:08:04 -0800 Subject: [Python-3000] [Python-Dev] Python 3000 Process In-Reply-To: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> References: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> Message-ID: On 3/21/06, Thomas Wouters wrote: > > Personally, I like the idea of Python 3.0 being released somewhere between > 2.6 and 2.7, as you suggested at PyCon. That puts it between 1.5 and 3 years > from now, given the usual release schedules. From what I can see now, that > should be enough time to work out the major changes in 3.0. Maybe we should > bill 3.0 as a 'developer release', urging extension/module-writers to adapt > their code to 3.0 but not pushing it for general use. On the other hand, > maybe it's better to make many pre-releases of 3.0, to give people a chance > to look at 3.0 and to give developers (including ourselves) a chance to > figure out how to upgrade (or facilitate upgrading) libraries, extensions > and applications. I was thinking something very similar. I originally hoped 2.6 (or even 2.5!) would be the end of 2.x. That doesn't seem right. I think there should definitely be a 2.6 and 2.7. Possibly more 2.x, but that would depend on the future and we shouldn't promise any. 2.x would continue development/release as it has. 
3.0 target near 2.6, about 2 years, maybe a bit longer, but definitely long before 2.7. I think 3.x should have releases every 4-6 months. These would be preview releases and might not even have a ton of testing. Although given the buildbot, I suspect all the previews will work reasonably well. The first 3.x release should be the first one that rips out all the old behaviour and is reasonably minimal. Meaning all the Python and C crufty APIs, builtins, etc have been removed, but (almost) nothing added. As I said in a previous mail, I hope to have everything ripped out by the end of April.

I wonder if we should call the 3.x series 2.99.x for the "pre-releases". That would give us 99 tries to get it right, plus we have an asymptote for 2.x. :-)

> > - What's the upgrade path? Do we provide a conversion tool, or a
> > compatibility mode, or both, or what? Will it be at all possible to
> > write code that runs in Python 2.x (for large enough values of x) as
> > well as in 3.0?
>
> I would like Pythonic code to be usable in both 2.x (where x is, say, 6 or
> higher) and 3.0, but 3.0 shouldn't need to worry too much about it. A
> compatibility mode is probably a bad idea -- isn't the whole reason for 3.0
> to not worry about compatibility? :-) Then again, if it costs us little or
> nothing, it may be worth considering.

Definitely.

> As for a conversion tool, it's a nice idea, but I wonder how well it'll work
> in practice. Take dict.items() returning an iterator; how are you going to
> detect it, let alone convert it? You can 'convert' it by translating it to
> 'list(dict.items())', but it won't result in high quality code. Maybe a
> 'change highlighter', which runs through code and detects things that might
> have changed in Python 3.0, is a more workable idea. I do believe that's the
> minimum we should aim for, though.
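
A 'change highlighter' of the kind suggested in the quoted text could start out as a purely syntactic scan. The sketch below uses the standard-library ast module; the SUSPECT_METHODS set and the highlight() helper are hypothetical names chosen for illustration and are not part of any tool proposed in this thread.

```python
import ast

# Method names whose results are discussed as changing in this thread.
# The list is an illustrative assumption, not an official catalogue.
SUSPECT_METHODS = {"keys", "items", "values",
                   "iterkeys", "iteritems", "itervalues"}

def highlight(source):
    """Return sorted (line, method_name) pairs for calls worth a manual look."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr in SUSPECT_METHODS):
            hits.append((node.lineno, node.func.attr))
    return sorted(hits)

sample = "d = {}\nfor k in d.keys():\n    pass\nn = len(d.items())\n"
report = highlight(sample)
```

Because the scan never knows whether the receiver is really a dict, it flags any object that has a method with one of these names, which is the detection problem raised in the quoted paragraph.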
> > Or, if source-inspection turns out to be too hard, have a > backward-compatibility-aware 3.0, which can warn that something will behave > differently than 2.x. (Maybe that's what you meant by 'compatibility mode') > I wouldn't call it 'python' or 'python3' though, but keep it separate from > python. It would have to replace a number of types with proxy-types that > also check various uses. Perhaps we could have an extension module. I have no idea if a source conversion tool could work. We kinda have to wait and see what 3k ends up looking like. We should at least try to flag constructs in a tool. > > - Will we do a grand library reform at the same time? Personally I see > > that as quite a separate issue; apart from some specific things like > > the stdio redesign, we could start the library reform in 2.6, or > > post-3.0 , depending on how much energy there it. > > Agreed. We can certainly add new modules/names to 2.6 to make them > forward-compatible and add pending-deprecationwarnings to the old names. Better to start sooner. We could move everything to the new location, but still keep the old locations around with pending deprecations as Thomas says. This way the 2.6 library should closely resemble 3.0. > > - What's the implementation strategy? I've started a branch where I > > plan to do some weeding out; but I've already found that the large > > amount of legacy code makes the weeding difficult. I may yet decide to > > switch to a sandbox model where only new code or carefully modernized > > old code is added (this is how Zope 3 was developed). > > Hm, I don't feel that we need to throw out that much code. If you want to > re-implement Python from scratch (with copious copy-pasting from 2.x ) a > sandbox model might make more sense, but then 1.5-3 years is quite > unrealistic, and we'd also need to consider the C API stability. 
I'd much > rather go for rigorous reviews and weeding of the existing source, than > copying reviewed code until we have a working interpreter. I agree with Thomas at this point. We should try to rip out as much as possible. If we don't like the result, we can always go to a sandbox or another approach later. I don't think it will be necessary. I do expect major components to be rewritten, like stdio, but the new implementation can be moved into place atomically with just that portion developed in a sandbox. n From nas at arctrix.com Wed Mar 22 06:14:25 2006 From: nas at arctrix.com (Neil Schemenauer) Date: Tue, 21 Mar 2006 22:14:25 -0700 Subject: [Python-3000] C API changes? [Was: Python 3000 Process] In-Reply-To: References: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> <20060321190839.GB13871@mems-exchange.org> Message-ID: <20060322051424.GC16724@mems-exchange.org> On Tue, Mar 21, 2006 at 08:52:34PM -0800, Neal Norwitz wrote: > I thought about re-organizing all the methods to try to group them > a bit better. Then I realized the benefit is likely far too small > for the larger pain of having to maintain 2 tables of methods (one > for 2.x one for 3.x). Maybe C99 designated initializers could solve that problem (assuming we are going to require a C99 compiler). Neil From nas at arctrix.com Wed Mar 22 06:19:08 2006 From: nas at arctrix.com (Neil Schemenauer) Date: Tue, 21 Mar 2006 22:19:08 -0700 Subject: [Python-3000] [Python-Dev] Python 3000 Process In-Reply-To: References: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> Message-ID: <20060322051907.GD16724@mems-exchange.org> On Tue, Mar 21, 2006 at 09:08:04PM -0800, Neal Norwitz wrote: > I wonder if we should call the 3.x series as 2.99.x for the > "pre-releases". That would give us 99 tries to get it write, plus we > have an asymptote for 2.x. :-) I'd rather see 3.0alphaX.
Personally I think 2.x should go on as long as someone is interested in doing the releases (similar to Linux 2.0.x, 2.2.x, and 2.4.x). I guess the chances of making it to 2.99 is very slim. Neil From nnorwitz at gmail.com Wed Mar 22 06:23:43 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 21 Mar 2006 21:23:43 -0800 Subject: [Python-3000] C API changes? [Was: Python 3000 Process] In-Reply-To: <20060322051424.GC16724@mems-exchange.org> References: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> <20060321190839.GB13871@mems-exchange.org> <20060322051424.GC16724@mems-exchange.org> Message-ID: On 3/21/06, Neil Schemenauer wrote: > On Tue, Mar 21, 2006 at 08:52:34PM -0800, Neal Norwitz wrote: > > I thought about re-organizing all the methods to try to group them > > a bit better. Then I realized the benefit is likely far too small > > for the larger pain of having to maintain 2 tables of methods (one > > for 2.x one for 3.x). > > Maybe C99 designated initializers could solve that problem (assuming > we are going to require a C99 compiler). That (using initializers) would be good for several reasons. We can get rid of the "holes" 0s. We don't need the comments as they would be redundant with the real names. If we used C99, we could also use // comments and inline declarations, rather than only at the start of a scope. I would like all of these, though I'm not sure we want to require C99. I updated the PEP with an outstanding issues section and added using C99. n From martin at v.loewis.de Wed Mar 22 09:23:58 2006 From: martin at v.loewis.de ("Martin v. Löwis") Date: Wed, 22 Mar 2006 09:23:58 +0100 Subject: [Python-3000] C API changes?
[Was: Python 3000 Process] In-Reply-To: <20060322051424.GC16724@mems-exchange.org> References: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> <20060321190839.GB13871@mems-exchange.org> <20060322051424.GC16724@mems-exchange.org> Message-ID: <4421099E.1090702@v.loewis.de> Neil Schemenauer wrote: > On Tue, Mar 21, 2006 at 08:52:34PM -0800, Neal Norwitz wrote: >> I thought about re-organizing all the methods to try to group them >> a bit better. Then I realized the benefit is likely far too small >> for the larger pain of having to maintain 2 tables of methods (one >> for 2.x one for 3.x). > > Maybe C99 designated initializers could solve that problem (assuming > we are going to require a C99 compiler). For that to work, you would have to compile your 2.x extension with a C99 compiler, as well - otherwise, you still have two tables in your extension. Still, I think that Python 3 should use designated initializers in its own code base, which implies that it requires a C99 compiler (at least with respect to that aspect of the language). Regards, Martin From nnorwitz at gmail.com Wed Mar 22 10:47:54 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Wed, 22 Mar 2006 01:47:54 -0800 Subject: [Python-3000] [Python-Dev] r43214 - peps/trunk/pep-3000.txt In-Reply-To: <2mlkv2x1pa.fsf@starship.python.net> References: <20060322052432.419F01E4004@bag.python.org> <2mlkv2x1pa.fsf@starship.python.net> Message-ID: [bcc python-dev to move to python-3000] On 3/22/06, Michael Hudson wrote: > "Fredrik Lundh" writes: > > > neal.norwitz wrote: > > > >> +Outstanding Issues > >> +================== > >> + > >> +* Require C99, so we can use // comments, named initializers, declare variables > >> + without introducing a new scope, among other benefits. > > > > gcc only, in other words ? > > Heh, I was going to make this point as well: it's not clear that MSVC > will ever support C99. It supports some of the features listed here, > of course, but probably won't support everything.
Note this is the outstanding issues section. So everything here is a question. By the time 3k is released, who knows what compilers will support. > >> +* Remove support for old systems, including: OS2, BeOS, RISCOS, (SGI) Irix, Tru64 > > > > what's old with tru64 ? it's not that uncommon in places where Python > > has a strong presence, you can still buy AXP hardware throughout 2006, > > and HP says they'll keep developing and supporting the software platform > > at least through 2011. > > And we still have someone actively interested in maintaining the OS2 > port, it seems. OS2 was a mistake. I created this list a few days ago before Alan said he was interested in maintaining it. Tru64 is difficult (I think there are still some open bugs that go back years) because no developer has access to any of these boxes. It would be good for people interested in these platforms to speak up and offer their time or at least access to the platform so we can test. I haven't researched Tru64 status, but I remember hearing a bunch of times how it wasn't going to be supported. If we support these platforms, we still need to decide what versions. Again remember all of these are questions. n From thomas at python.org Wed Mar 22 11:49:12 2006 From: thomas at python.org (Thomas Wouters) Date: Wed, 22 Mar 2006 11:49:12 +0100 Subject: [Python-3000] C API changes? [Was: Python 3000 Process] In-Reply-To: References: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> <20060321190839.GB13871@mems-exchange.org> <20060322051424.GC16724@mems-exchange.org> Message-ID: <9e804ac0603220249j71c2a8d5ob0fd3fe09998e626@mail.gmail.com> On 3/22/06, Neal Norwitz wrote: > > On 3/21/06, Neil Schemenauer wrote: > > On Tue, Mar 21, 2006 at 08:52:34PM -0800, Neal Norwitz wrote: > > > I thought about re-organizing all the methods to try to group them > > > a bit better. 
Then I realized the benefit is likely far too small > > > for the larger pain of having to maintain 2 tables of methods (one > > > for 2.x one for 3.x). > > > > Maybe C99 designated initializers could solve that problem (assuming > > we are going to require a C99 compiler). > > That (using initializers) would be good for several reasons. We can > get rid of the "holes" 0s. We don't need the comments as they would > be redundant with the real names. For the old and ignorant among us (although I concede I may be the only one) could someone explain 'designated initializers' and how the code would look? As for keeping 3.x and 2.x Type structs similar, we _could_ add a new type struct for 3.0 and transparently support the old type struct (by translating it to a 3.0 one.) The two structs would have to be mirrored for the benefit of the 'old' code (so references to typeobj.tp_free where it was initialized to 0 would still work right) but at least we can have source-level perfection without compromising source-level backward compatibility. > If we used C99, we could also use // comments and inline declarations, > rather than only at the start of a scope. I would like all of these, > though I'm not sure we want to require C99. I updated the PEP with an > outstanding issues section and added using C99. I'm not sure I like those. In fact, I'm pretty sure I don't :) // is all about style; I don't think it has any. Inline declarations is about code readability. Code that spreads out the declarations all over the place just look messy. I'll grant that there are cases where it's more logical to keep a declaration right near its (only) use, but a great many of those cases also introduce a new block. I've worked a bit with (messy) C99 code, and I always got a much clearer picture of what a function tries to do (and how messy it is) when I moved all the declarations to the top of the function/block. Just my two cents; I'll happily follow whatever PEP-3007 says.
Speaking of PEP-3007, any ideas on how different from PEP-0007 it'll be? tabs or four-space indents? How rigourously will it be applied? (I honestly don't have any problems with PEP-0007 except for its intermittent application.) Do we get to arbitrarily clean up extensions and the like? :> -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From barry at python.org Wed Mar 22 13:28:49 2006 From: barry at python.org (Barry Warsaw) Date: Wed, 22 Mar 2006 07:28:49 -0500 Subject: [Python-3000] C API changes? [Was: Python 3000 Process] In-Reply-To: References: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> <20060321190839.GB13871@mems-exchange.org> <20060322051424.GC16724@mems-exchange.org> Message-ID: <1143030529.5178.55.camel@geddy.wooz.org> On Tue, 2006-03-21 at 21:23 -0800, Neal Norwitz wrote: > If we used C99, we could also use // comments and inline declarations, > rather than only at the start of a scope. I would like all of these, > though I'm not sure we want to require C99. I updated the PEP with an > outstanding issues section and added using C99. I was going to ask the same question. Maybe we should make a table outlining the major platform/compilers that would support (or be missing) a sufficient C99 compiler. I do think we should strive for C99 as a base-line requirement. -Barry From barry at python.org Wed Mar 22 13:34:30 2006 From: barry at python.org (Barry Warsaw) Date: Wed, 22 Mar 2006 07:34:30 -0500 Subject: [Python-3000] C API changes?
[Was: Python 3000 Process] In-Reply-To: <9e804ac0603220249j71c2a8d5ob0fd3fe09998e626@mail.gmail.com> References: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> <20060321190839.GB13871@mems-exchange.org> <20060322051424.GC16724@mems-exchange.org> <9e804ac0603220249j71c2a8d5ob0fd3fe09998e626@mail.gmail.com> Message-ID: <1143030870.5198.58.camel@geddy.wooz.org> On Wed, 2006-03-22 at 11:49 +0100, Thomas Wouters wrote: > I'm not sure I like those. In fact, I'm pretty sure I don't :) // is > all about style; I don't think it has any. Inline declarations is > about code readability. Code that spreads out the declarations all > over the place just look messy. I'll grant that there are cases where > it's more logical to keep a declaration right near its (only) use, but > a great many of those cases also introduce a new block. I've worked a > bit with (messy) C99 code, and I always got a much clearer picture of > what a function tries to do (and how messy it is) when I moved all the > declarations to the top of the function/block. Just my two cents; I'll > happily follow whatever PEP-3007 says. > > Speaking of PEP-3007, any ideas on how different from PEP-0007 it'll > be? tabs or four-space indents? How rigourously will it be applied? (I > honestly don't have any problems with PEP-0007 except for its > intermittent application.) Do we get to arbitrarily clean up > extensions and the like? :> One thing I hope we can finally do is re-indent to 4 spaces. That's been on my wish list since about 1995. :) -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 309 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-3000/attachments/20060322/d79bc224/attachment.pgp From anthony at interlink.com.au Wed Mar 22 15:50:23 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu, 23 Mar 2006 01:50:23 +1100 Subject: [Python-3000] [Python-Dev] Python 3000 Process In-Reply-To: <20060322051907.GD16724@mems-exchange.org> References: <20060322051907.GD16724@mems-exchange.org> Message-ID: <200603230150.25122.anthony@interlink.com.au> On Wednesday 22 March 2006 16:19, Neil Schemenauer wrote: > On Tue, Mar 21, 2006 at 09:08:04PM -0800, Neal Norwitz wrote: > > I wonder if we should call the 3.x series as 2.99.x for the > > "pre-releases". That would give us 99 tries to get it write, > > plus we have an asymptote for 2.x. :-) > > I'd rather see 3.0alphaX. Personally I think 2.x should go on > as long as someone is interested in doing the releases (similar to > Linux 2.0.x, 2.2.x, and 2.4.x). I guess the chances of making it > to 2.99 is very slim. I could see some bugfix releases of the last 2.x continuing for some time, but I doubt we have the resources to put into feature releases of both 2.x and 3.x for an extended period of time. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From martin at v.loewis.de Wed Mar 22 19:31:30 2006 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Wed, 22 Mar 2006 19:31:30 +0100 Subject: [Python-3000] C API changes? 
[Was: Python 3000 Process] In-Reply-To: <9e804ac0603220249j71c2a8d5ob0fd3fe09998e626@mail.gmail.com> References: <9e804ac0603210801q3c975675n96cd45b2a0b38a0d@mail.gmail.com> <20060321190839.GB13871@mems-exchange.org> <20060322051424.GC16724@mems-exchange.org> <9e804ac0603220249j71c2a8d5ob0fd3fe09998e626@mail.gmail.com> Message-ID: <44219802.5060106@v.loewis.de> Thomas Wouters wrote: > For the old and ignorant among us (although I concede I may be the only > one) could someone explain 'designated initializers' and how the code > would look? These are like keyword arguments. To initialize mmapmodule.c:mmap_object_type, you would write

    static PyTypeObject mmap_object_type = {
        PyObject_HEAD_INIT(0)
        .tp_name = "mmap.mmap",
        .tp_basicsize = sizeof(mmap_object),
        .tp_dealloc = mmap_object_dealloc,
        .tp_getattr = mmap_object_getattr,
        .tp_as_sequence = &mmap_as_sequence,
        .tp_as_buffer = &mmap_as_buffer,
        .tp_flags = Py_TPFLAGS_HAVE_GETCHARBUFFER,
    };

Everything not mentioned is null-initialized; order of fields does not matter. FWIW, the same is also available for arrays:

    char *hash[1000] = {
        [317] = "foo",
        [220] = "bar"
    };

Regards, Martin From brett at python.org Thu Mar 23 00:14:06 2006 From: brett at python.org (Brett Cannon) Date: Wed, 22 Mar 2006 15:14:06 -0800 Subject: [Python-3000] C style guide (was: C API changes?) Message-ID: On 3/22/06, Thomas Wouters wrote: [SNIP] > Speaking of PEP-3007, any ideas on how different from PEP-0007 it'll be? > tabs or four-space indents? How rigourously will it be applied? (I honestly > don't have any problems with PEP-0007 except for its intermittent > application.) Do we get to arbitrarily clean up extensions and the like? :> Yes, please! I think for the Py3K codebase we should at least require code meet the style guide.
We are all guilty of having ignored it at some point, but over the years it has made the C code a pain to edit since you have to pay attention to the formatting of the code around where you are touching which can vary from file to file, making multi-file patches a pain to review. Plus if we require this before we add new code in it will help guarantee at least a basic code review. Plus I would like to say that upping the amount of source comments would be a very good thing. While we do tend to have great comments that cover really technical stuff (usually from Tim), we tend not to comment on what a function does or why it exists. While it might be obvious to some, it won't be to someone who is just stepping into Python development. Hell, I still sometimes have to spend a good amount of time jumping around and grepping to figure out what a function does or why it is there. We are already used to doing simple docstrings for Python code so I don't see extending this to C code as a huge overhead. -Brett From ncoghlan at gmail.com Thu Mar 23 03:11:49 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Mar 2006 12:11:49 +1000 Subject: [Python-3000] C style guide (was: C API changes?) In-Reply-To: References: Message-ID: <442203E5.7090009@gmail.com> Brett Cannon wrote: > On 3/22/06, Thomas Wouters wrote: > [SNIP] >> Speaking of PEP-3007, any ideas on how different from PEP-0007 it'll be? >> tabs or four-space indents? How rigourously will it be applied? (I honestly >> don't have any problems with PEP-0007 except for its intermittent >> application.) Do we get to arbitrarily clean up extensions and the like? :> > > Yes, please! I think for the Py3K codebase we should at least require > code meet the style guide. 
We are all guilty of having ignored it at > some point, but over the years it has made the C code a pain to edit > since you have to pay attention to the formatting of the code around > where you are touching which can vary from file to file, making > multi-file patches a pain to review. Plus if we require this before > we add new code in it will help guarantee at least a basic code > review. I would love it if PEP 3007 standardised on 4-space indents, the same as the standard for Python code in the standard lib. I'd love it even more if reindent.py cleaned up C whitespace as well as Python whitespace. These days, getting any C file to display properly involves tinkering with my editor's indentation and tab display settings. . . Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From nnorwitz at gmail.com Thu Mar 23 03:29:30 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Wed, 22 Mar 2006 18:29:30 -0800 Subject: [Python-3000] C style guide (was: C API changes?) In-Reply-To: <442203E5.7090009@gmail.com> References: <442203E5.7090009@gmail.com> Message-ID: On 3/22/06, Nick Coghlan wrote: > > I would love it if PEP 3007 standardised on 4-space indents, the same as the > standard for Python code in the standard lib. I'd love it even more if > reindent.py cleaned up C whitespace as well as Python whitespace. Wait! I thought we were switching to 2-space indents for all code. ducks, n :-) From guido at python.org Thu Mar 23 03:31:10 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 22 Mar 2006 18:31:10 -0800 Subject: [Python-3000] C style guide (was: C API changes?) In-Reply-To: <442203E5.7090009@gmail.com> References: <442203E5.7090009@gmail.com> Message-ID: On 3/22/06, Nick Coghlan wrote: > I would love it if PEP 3007 standardised on 4-space indents, the same as the > standard for Python code in the standard lib. 
I'd love it even more if > reindent.py cleaned up C whitespace as well as Python whitespace. These days, > getting any C file to display properly involves tinkering with my editor's > indentation and tab display settings. . . That won't go away for me (Google's settings default to TWO-space indents :-( ) but I agree with the 4-space indent -- eventually. Right now I think that making it nearly impossible to merge changes from 2.5 into the 3.0 branch is a disadvantage; I'd rather not deal with this just yet (but I will eventually, of course). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.peters at gmail.com Thu Mar 23 04:56:16 2006 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 22 Mar 2006 22:56:16 -0500 Subject: [Python-3000] C style guide (was: C API changes?) In-Reply-To: <442203E5.7090009@gmail.com> References: <442203E5.7090009@gmail.com> Message-ID: <1f7befae0603221956q3d49f6afodbeaa4f5bd2e182c@mail.gmail.com> [Brett Cannon] >> Yes, please! I think for the Py3K codebase we should at least require >> code meet the style guide. We are all guilty of having ignored it at >> some point, I'm not :-) > ... [Nick Coghlan] > I would love it if PEP 3007 standardised on 4-space indents, the same as the > standard for Python code in the standard lib. +1 here. > I'd love it even more if reindent.py cleaned up C whitespace as well as > Python whitespace. I doubt that will happen. reindent.py relies entirely on tokenize.py for parsing, and that's 100% specific to Python. Last time I was a Unix-head, though, there were 3797 different programs for reindenting C code. There are probably a million now. Someone who knows of a good one can do the whole job. As reindent.py's docstring notes: The hard part of reindenting is figuring out what to do with comment lines.
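The dependence on tokenize mentioned above can be illustrated with a small sketch. It assumes a current Python 3 interpreter (which the thread predates), and `indentation_events` is a hypothetical helper name, not something taken from reindent.py. It only shows that the standard `tokenize` module reports block structure as INDENT and DEDENT tokens according to Python's own grammar, which is the kind of information a Python reindenter can build on and which has no counterpart for brace-delimited C source.

```python
import io
import tokenize


def indentation_events(source):
    """Return the names of the INDENT/DEDENT tokens found in Python source.

    Hypothetical helper for illustration only; reindent.py itself does
    considerably more work (comment lines, continuation lines, and so on).
    """
    readline = io.StringIO(source).readline
    return [
        tokenize.tok_name[tok.type]
        for tok in tokenize.generate_tokens(readline)
        if tok.type in (tokenize.INDENT, tokenize.DEDENT)
    ]


python_source = "if x:\n    y = 1\nz = 2\n"
print(indentation_events(python_source))  # ['INDENT', 'DEDENT']
```

The tokenizer here decides where blocks begin and end from leading whitespace alone, so the same approach would not carry over to C, where indentation is purely cosmetic.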
From jim at zope.com Thu Mar 23 20:52:54 2006 From: jim at zope.com (Jim Fulton) Date: Thu, 23 Mar 2006 14:52:54 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) Message-ID: <4422FC96.2020409@zope.com> Looking over some of the messages in the archives, I saw a reference to making dict keys, items, and values methods return iterators. I've heard Guido mention this in the past. I'd like to offer a word of caution here. ZODB has a BTree implementation that uses iterators for keys, values, and items. After years of experience with this, I can definitely say that this is very annoying. A common scenario is that I'm exploring a complex data structure. I give an expression and repeatedly re-execute, adding a bit to the expression each time. I can generally call up a previous line interactively and add bits to it. That is, until I get to a BTree keys (or items or values). At that point, I have to edit both the beginning and end of the expression to put a list() call around it. I then often have to take the list() call off when I want to look at an individual item in more detail. If we are dead set on making these methods return iterators, I'd really like to see a way to either get non-iterators by calling a method or see some new facilities in the iterators returned. Perhaps these iterators could have a method for getting a set of values? Jim -- Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From ianb at colorstudy.com Thu Mar 23 21:06:57 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 23 Mar 2006 14:06:57 -0600 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <4422FC96.2020409@zope.com> References: <4422FC96.2020409@zope.com> Message-ID: <4422FFE1.8050807@colorstudy.com> Jim Fulton wrote: > Looking over some of the messages in the archives, I saw a reference > to making dict keys, items, and values methods return iterators. I've heard > Guido mention this in the past. I saw this too in the archives, and thought shit, that's going to mess up a lot of my code. I would assume (though it's a separate point of discussion) that Python 3k should still try hard to keep backward compatibility. Backward compatibility isn't a requirement, but it's still clearly a feature. For an instance of code that would be broken: for key in d.keys(): if something(key): del d[key] If I didn't want a list, I probably would have iterated over d, wouldn't I? Items is a little fuzzier, but I do a lot of: items = d.items() items.sort() Not as big an issue, because these days I can already do sorted(d.items()) for the same effect. Still, the change doesn't seem that interesting or useful to me, in comparison to the effect it will have on so much code. One idea I had after reading a post of Brett's was a dual-use attribute; if you do d.keys you get an iterable (not an iterator, of course), and if you call that iterable you get a list. This is backward compatible, arguably prettier anyway to make it a property (since there's no side effects and getting an iterable isn't expensive, the method call seems somewhat superfluous). One can argue that this adds redundancy. But anyway, a conceptual argument against .items() returning an iterator: .items() reads as a request for a concrete object to me. 
That is, it doesn't read as "give me a promise that later you can give me the items from this object", it reads as "give me the items, right now and right here". If it was a set-like object instead of a list, that'd be fine (maybe better -- avoid arbitrary ordering entirely). But that's a separate conversation. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From jeremy at alum.mit.edu Thu Mar 23 21:31:25 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Thu, 23 Mar 2006 15:31:25 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <4422FFE1.8050807@colorstudy.com> References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> Message-ID: On 3/23/06, Ian Bicking wrote: > One idea I had after reading a post of Brett's was a dual-use attribute; > if you do d.keys you get an iterable (not an iterator, of course), and > if you call that iterable you get a list. This is backward compatible, > arguably prettier anyway to make it a property (since there's no side > effects and getting an iterable isn't expensive, the method call seems > somewhat superfluous). I don't think we should overload attributes name such that they are sometimes attributes and sometimes methods, particularly when they return things that behave almost-but-not-quite the same. It will create confusion and subtle bugs. Jeremy From guido at python.org Thu Mar 23 21:42:25 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 23 Mar 2006 12:42:25 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <4422FFE1.8050807@colorstudy.com> References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> Message-ID: On 3/23/06, Ian Bicking wrote: > Jim Fulton wrote: > > Looking over some of the messages in the archives, I saw a reference > > to making dict keys, items, and values methods return iterators. I've heard > > Guido mention this in the past. 
It's been one of the first things I've always wanted in Python 3000, ever since we added iterators in 2.2. I've read and re-read Jim's message, and I'm not sure I understand it. It seems he's working in an interactive session but I'm not sure I understand the problem he has with adding list() around an expression (unless he hasn't got readline, in which case he's got worse problems). IMO the feature he asks for getting a list back already exists is adding list() around the expression. (I actually suspect that in many cases set() would be a more useful choice.) > I saw this too in the archives, and thought shit, that's going to mess > up a lot of my code. I would assume (though it's a separate point of > discussion) that Python 3k should still try hard to keep backward > compatibility. Backward compatibility isn't a requirement, but it's > still clearly a feature. You seem to be misunderstanding what Python 3000 is. The whole point of Python 3000 is to *not* be bound by backwards compatibility constraints, but instead make the best decisions possible (without making it a different language). > For an instance of code that would be broken: > > for key in d.keys(): > if something(key): > del d[key] > > If I didn't want a list, I probably would have iterated over d, wouldn't > I? Depends on whether you wrote that code before or after Python 2.2. In 2.1 and before, you *couldn't* iterate over d, so you were forced to use d.keys() whether you wanted it or not. > Items is a little fuzzier, but I do a lot of: > > items = d.items() > items.sort() > > Not as big an issue, because these days I can already do > sorted(d.items()) for the same effect. Still, the change doesn't seem > that interesting or useful to me, in comparison to the effect it will > have on so much code. It's interesting to me because there's a bunch of APIs that currently have two versions: one to get a list and one to get an iterator. 
It would be cleaner if only the iterator version existed, and the way to get a list was to put an explicit list() around it. Building the list is expensive, and often not needed (a lot of algorithms don't mutate the dict). > One idea I had after reading a post of Brett's was a dual-use attribute; > if you do d.keys you get an iterable (not an iterator, of course), and > if you call that iterable you get a list. This is backward compatible, > arguably prettier anyway to make it a property (since there's no side > effects and getting an iterable isn't expensive, the method call seems > somewhat superfluous). You gotta be kidding about calling something pretty which allows a common mistake (leaving out the () brackets) turn into such a subtle bug (not making a copy). > One can argue that this adds redundancy. > > But anyway, a conceptual argument against .items() returning an > iterator: .items() reads as a request for a concrete object to me. That > is, it doesn't read as "give me a promise that later you can give me the > items from this object", it reads as "give me the items, right now and > right here". If it was a set-like object instead of a list, that'd be > fine (maybe better -- avoid arbitrary ordering entirely). But that's a > separate conversation. Actually that's a very interesting conversation. Last year I wrote a large body of Java code that used the Java collections package a lot, and I ended up liking some of its choices. Its maps have methods to return keys, values and items, but these return neither new lists nor iterators; they return "views" which obey set (or multiset, in the case of items) semantics. The effect of mutating one or the other is carefully defined to allow an efficient implementation and foolproof use (e.g. you can delete an item from the keys set and it will remove the corresponding item from the underlying map, but you can't insert an item into the keys set because there's no value to map it to in the map).
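The view behaviour described above can be sketched in Python terms. This is a minimal illustration, assuming the dict views that later Python 3 releases provide for `dict.keys()`; `demo_views` is a hypothetical name introduced only for this sketch. One difference from the Java key sets is worth noting: Python's key views have no `remove` method, so the deletion below goes through the dict itself and the view simply reflects it.

```python
def demo_views():
    """Show a few properties of a dict key view (hypothetical demo function)."""
    d = {"a": 1, "b": 2, "c": 3}
    keys = d.keys()  # a view: neither a new list nor a one-shot iterator

    # A view can be iterated as many times as needed.
    first_pass = sorted(keys)
    second_pass = sorted(keys)

    # The view tracks later changes to the underlying dict.
    del d["b"]
    after_delete = sorted(keys)

    # Key views obey set semantics.
    common = keys & {"a", "z"}

    # An explicit list() takes an independent snapshot.
    snapshot = list(keys)
    d["e"] = 5

    return first_pass, second_pass, after_delete, common, sorted(snapshot), sorted(keys)


for result in demo_views():
    print(result)
```

Wrapping the call in `list()` or `set()` remains the explicit way to get a concrete copy, while the view itself costs nothing to create.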
The views can then be iterated over as many times as you want to. I'd like to explore this as an alternative to making keys() etc. return iterators. It also might make keys() etc. more similar to the new range(), which will behave like the current xrange(): it doesn't return an iterator, but an "iterator well". That's the behavior range() would have had in the first place if I had thought of iterators earlier. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jim at zope.com Thu Mar 23 21:57:09 2006 From: jim at zope.com (Jim Fulton) Date: Thu, 23 Mar 2006 15:57:09 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> Message-ID: <44230BA5.8070407@zope.com> Guido van Rossum wrote: ... > I've read and re-read Jim's message, and I'm not sure I understand it. > It seems he's working in an interactive session Yup > but I'm not sure I > understand the problem he has with adding list() around an expression It's a hassle. > (unless he hasn't got readline, in which case he's got worse > problems). Nope. I've got readline or the emacs shell equivalent. > IMO the feature he asks for getting a list back already > exists is adding list() around the expression. That and the fact that I almost always type: foo.keys() and then get something useless and *then*, say "Doh!" and then have to add the list call. This is a case where I can give feedback based on actual experience with a feature. It's like you get user testing for free. :) I'd be interested to hear if other people who have experience working with ZODB BTrees have been as annoyed as I've been. Then again, I suspect that when you actually implement this you or perhaps others will be as annoyed with this feature as I am. Jim -- Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From barry at python.org Thu Mar 23 22:04:19 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 23 Mar 2006 16:04:19 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <4422FC96.2020409@zope.com> References: <4422FC96.2020409@zope.com> Message-ID: <1143147859.10792.5.camel@resist.wooz.org> On Thu, 2006-03-23 at 14:52 -0500, Jim Fulton wrote: > If we are dead set on making these methods return iterators, I'd really like > to see a way to either get non-iterators by calling a method or see some > new facilities in the iterators returned. Perhaps these iterators > could have a method for getting a set of values? What I've found painful as we converted all of our APIs from returning lists (or tuples) to returning iterators, is that the few places that did random access into the concrete sequence broke, requiring a wrapping of list() or some other fix. What I think would be useful would be to add /optional/ support for __getitem__() to certain iterators. IOW, iterators could choose to support random access or not. So, dict().keys could support __getitem__() because it knows the size of the iterator's underlying sequence. Generators and other iterators could choose to raise TypeError if they have an unknown size. There are probably all sorts of reasons why this won't work, but it seems like it would be useful. -Barry -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 309 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-3000/attachments/20060323/4235fbbd/attachment.pgp From skip at pobox.com Thu Mar 23 22:12:47 2006 From: skip at pobox.com (skip at pobox.com) Date: Thu, 23 Mar 2006 15:12:47 -0600 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> Message-ID: <17443.3919.235023.510956@montanaro.dyndns.org> Guido> It's interesting to me because there's a bunch of APIs that Guido> currently have two versions: one to get a list and one to get an Guido> iterator. It would be cleaner if only the iterator version Guido> existed, and the way to get a list was to put an explicit list() Guido> around it. Building the list is expensive, and often not needed Guido> (a lot of algorithms don't mutate the dict). Agreed. I think there's also a perceived performance difference, whether one exists or not. I'd be real surprised if for key, val in somedict.iteritems(): blah(key, val) was faster than for key, val in somedict.items(): blah(key, val) for small dicts. Still, around work I see a great preference for the longer (and uglier IMO) spelling. Maybe it's a mental carryover from C++ that makes people want that version? In any case, I vote to get rid of iterBLAH in favor of just BLAH, and in most cases make BLAH() return an iterator (or a view I suppose), with explicit list(), tuple(), set() required to get various concrete containers.
Skip From aleaxit at gmail.com Thu Mar 23 22:21:27 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Thu, 23 Mar 2006 13:21:27 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <17443.3919.235023.510956@montanaro.dyndns.org> References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <17443.3919.235023.510956@montanaro.dyndns.org> Message-ID: On 3/23/06, skip at pobox.com wrote: ... > Agreed. I think there's also a perceived performance difference, whether > one exists or not. I'd be real surprised if > > for key, val in somedict.iteritems(): > blah(key, val) > > was faster than > > for key, val in somedict.items(): > blah(key, val) > > for small dicts. Still, around work I see a great preference for the longer Not sure what's a "small dict" in your world -- here, for example: python2.4 -mtimeit -s'd=dict.fromkeys(range(23))' 'for k, v in d.iteritems(): pass' 100000 loops, best of 3: 2.81 usec per loop python2.4 -mtimeit -s'd=dict.fromkeys(range(23))' 'for k, v in d.items(): pass' 100000 loops, best of 3: 4.82 usec per loop and an avoidable overhead of 2.01/2.81 = 71.5% does matter. But maybe you mean dictionaries that are much smaller than a couple dozen items? Using range(3) instead of range(23), I measure 0.804 vs 1.36 microseconds -- still, even that 68% avoidable overhead, while definitely less than 71.5%, is still pretty unpleasant, no? Alex From guido at python.org Thu Mar 23 22:26:11 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 23 Mar 2006 13:26:11 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <44230BA5.8070407@zope.com> References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <44230BA5.8070407@zope.com> Message-ID: On 3/23/06, Jim Fulton wrote: > Guido van Rossum wrote: > ... > > I've read and re-read Jim's message, and I'm not sure I understand it. 
> > It seems he's working in an interactive session > > Yup Feedback taken. But I don't want to design the feature or base its user testing entirely on interactive sessions. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From fumanchu at amor.org Thu Mar 23 22:27:58 2006 From: fumanchu at amor.org (Robert Brewer) Date: Thu, 23 Mar 2006 13:27:58 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) Message-ID: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> Guido van Rossum: > > I've read and re-read Jim's message, and I'm not > > sure I understand it. It seems he's working in > > an interactive session but I'm not sure I > > understand the problem he has with adding list() > > around an expression Jim Fulton wrote: > It's a hassle. > ... > I'd be interested to hear if other people who have experience working > with ZODB BTrees have been as annoyed as I've been. It is a hassle. I recently changed my ORM's API from returning iterators to lists based on similar user feedback (and I could have sworn I already made this comment on this issue, but I can't find it now). That experience hints to me that any interface that prefers iterators over lists is going to be resisted; builtins doubly so. Robert Brewer System Architect Amor Ministries fumanchu at amor.org From guido at python.org Thu Mar 23 22:43:58 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 23 Mar 2006 13:43:58 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> Message-ID: On 3/23/06, Robert Brewer wrote: > It is a hassle. I recently changed my ORM's API from returning iterators > to lists based on similar user feedback (and I could have sworn I > already made this comment on this issue, but I can't find it now). 
That > experience hints to me that any interface that prefers iterators over > lists is going to be resisted; builtins doubly so. Interesting. Was your feedback also based on use in interactive sessions only (like Jim's)? Would the same objection exist against APIs that return "views" as I described in a previous message (a la the Java collections package)? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ianb at colorstudy.com Thu Mar 23 23:00:30 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 23 Mar 2006 16:00:30 -0600 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> Message-ID: <44231A7E.6050801@colorstudy.com> Robert Brewer wrote: > Guido van Rossum: > >>>I've read and re-read Jim's message, and I'm not >>>sure I understand it. It seems he's working in >>>an interactive session but I'm not sure I >>>understand the problem he has with adding list() >>>around an expression > > > Jim Fulton wrote: > >>It's a hassle. >>... >>I'd be interested to hear if other people who have experience working >>with ZODB BTrees have been as annoyed as I've been. > > > It is a hassle. I recently changed my ORM's API from returning iterators > to lists based on similar user feedback (and I could have sworn I > already made this comment on this issue, but I can't find it now). That > experience hints to me that any interface that prefers iterators over > lists is going to be resisted; builtins doubly so. This has been my personal experience with the iterators in SQLObject as well. The fact that an empty iterator is true tends to cause particular problems in that case, though I notice iterkeys() acts properly in this case; maybe part of the issue is that I'm actually using iterables instead of iterators, where I can't actually test the truthfulness. 
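A minimal sketch of the truth-value problem described in the paragraph above, assuming a current Python 3 interpreter rather than the 2.x releases the thread was written against. The helper name empty_generator is made up for illustration.

```python
def empty_generator():
    """A generator that yields nothing (hypothetical helper for this sketch)."""
    return
    yield  # never reached; its presence makes this a generator function


empty_list = []
empty_iter = iter(empty_list)
empty_gen = empty_generator()

# A container knows its length, so emptiness shows up in its truth value.
# A plain iterator defines neither __len__ nor __bool__, so it is always true.
truthiness = {
    "list": bool(empty_list),
    "list iterator": bool(empty_iter),
    "generator": bool(empty_gen),
}
print(truthiness)
```

Under those assumptions only the list reports itself as false, so "if results:" cannot tell an empty iterator from a non-empty one.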
Another issue I have with generators (and hence iterators) is the uselessness of repr() on them. This causes a lot of list() invocations as well, which works with interactive testing, but fails badly with print statements (where you have to do another run with list() around it -- but that itself causes problems when exhausting the iterator introduces a new bug). -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From guido at python.org Thu Mar 23 23:10:43 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 23 Mar 2006 14:10:43 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <44231A7E.6050801@colorstudy.com> References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> Message-ID: On 3/23/06, Ian Bicking wrote: > This has been my personal experience with the iterators in SQLObject as > well. The fact that an empty iterator is true tends to cause particular > problems in that case, though I notice iterkeys() acts properly in this > case; maybe part of the issue is that I'm actually using iterables > instead of iterators, where I can't actually test the truthfulness. This sounds like some kind of fundamental confusion -- you should never be tempted to test an iterator for its truth value. I wonder if the confusion lies in the API similarities with dict, which *does* return an iterable (not an iterator) from keys() etc.? Other objects that return iterators() from the same API are not obeying the (2.x) API contract for maps. > Another issue I have with generators (and hence iterators) is the > uselessness of repr() on them. This causes a lot of list() invocations > as well, which works with interactive testing, but fails badly with > print statements (where you have to do another run with list() around it > -- but that itself causes problems when exhausting the iterator > introduces a new bug). 
Views could solve this problem as well, since they are reiterable. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ianb at colorstudy.com Thu Mar 23 23:19:45 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 23 Mar 2006 16:19:45 -0600 Subject: [Python-3000] Backward compatibility In-Reply-To: References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> Message-ID: <44231F01.9060900@colorstudy.com> Guido van Rossum wrote: >>I saw this too in the archives, and thought shit, that's going to mess >>up a lot of my code. I would assume (though it's a separate point of >>discussion) that Python 3k should still try hard to keep backward >>compatibility. Backward compatibility isn't a requirement, but it's >>still clearly a feature. > > > You seem to be misunderstanding what Python 3000 is. The whole point > of Python 3000 is to *not* be bound by backwards compatibility > constraints, but instead make the best decisions possible (without > making it a different language). I think this was in the bullet points of pending discussions, so maybe should be a separate thread. When I say it is a feature... I guess that seems obvious to me. In 2.x backward compatibility is something of a requirement. But in 3.0/3000 backward compatibility is still a nice thing to have, that can be weighed against other nice things. That doesn't seem overly constrained, just practical. As someone who wants to use new features, but also wants to do useful work collaboratively with other people who may not care about nice features, the upgrade path is really important to me. That there's some working subset for both 2.x and 3.0 is *really* important. That code doesn't break in weird or hard ways is also really important. Easy ways isn't that big a deal -- a few AttributeErrors, or even better a suggestive NotImplementedError. Better yet SyntaxError. And maybe if d.keys().append() just fails, that'll be okay. 
That's probably an argument against clever implementations, really -- better to get an exception sooner than later. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From ianb at colorstudy.com Thu Mar 23 23:29:15 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 23 Mar 2006 16:29:15 -0600 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> Message-ID: <4423213B.4050603@colorstudy.com> Guido van Rossum wrote: > On 3/23/06, Ian Bicking wrote: > >>This has been my personal experience with the iterators in SQLObject as >>well. The fact that an empty iterator is true tends to cause particular >>problems in that case, though I notice iterkeys() acts properly in this >>case; maybe part of the issue is that I'm actually using iterables >>instead of iterators, where I can't actually test the truthfulness. > > > This sounds like some kind of fundamental confusion -- you should > never be tempted to test an iterator for its truth value. I'm testing if it is empty or not, which seems natural enough. Or would be, if it worked. So I start out doing: for item in select_results: ... Then I realize that the zero-item case is special (which is common), and do: select_results = list(select_results) if select_results: ... else: for item in select_results:... That's not a very comfortable code transformation. When I was just first learning Python I thought this would work: for item in select_results: ... else: ... stuff when there are no items ... But it doesn't work like that. .iterkeys() does return an iterator with a useful __len__ method, so the principle that iterators shouldn't be tested for truth doesn't seem right. 
(Very small mostly unrelated problem that occurs to me just at this moment -- I can't override __len__ with any implementation that isn't really cheap, because lots of code calls __len__ under the covers, like list() -- originally SQLObject used len(query) to do a COUNT(*) query, but that didn't work) >>Another issue I have with generators (and hence iterators) is the >>uselessness of repr() on them. This causes a lot of list() invocations >>as well, which works with interactive testing, but fails badly with >>print statements (where you have to do another run with list() around it >>-- but that itself causes problems when exhausting the iterator >>introduces a new bug). > > > Views could solve this problem as well, since they are reiterable. Yes, I would expect them to have a good __repr__, and so it wouldn't be a problem. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From guido at python.org Thu Mar 23 23:58:02 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 23 Mar 2006 14:58:02 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <4423213B.4050603@colorstudy.com> References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> Message-ID: On 3/23/06, Ian Bicking wrote: > Guido van Rossum wrote: > > On 3/23/06, Ian Bicking wrote: > > > >>This has been my personal experience with the iterators in SQLObject as > >>well. The fact that an empty iterator is true tends to cause particular > >>problems in that case, though I notice iterkeys() acts properly in this > >>case; maybe part of the issue is that I'm actually using iterables > >>instead of iterators, where I can't actually test the truthfulness. > > > > This sounds like some kind of fundamental confusion -- you should > > never be tempted to test an iterator for its truth value. > > I'm testing if it is empty or not, which seems natural enough. 
Or would > be, if it worked. Testing whether an iterator is empty or not is an oxymoron; the only legit way is to call next() and see whether it raises StopIteration. This is the fundamental confusion I am talking about. It is NOT "natural enough". It reveals a fundamental misunderstanding of the design of the iterator protocol. (There's also a design bug in 2.4 which perpetuates the confusion, unfortunately; see below.) > So I start out doing: > > for item in select_results: ... > > Then I realize that the zero-item case is special (which is common), and do: > > select_results = list(select_results) > if select_results: > ... > else: > for item in select_results:... You should write that like this: empty = True for item in select_results: empty = False ... if empty: ... > That's not a very comfortable code transformation. When I was just > first learning Python I thought this would work: > > for item in select_results: > ... > else: > ... stuff when there are no items ... > > But it doesn't work like that. Another fundamental confusion (about the for loop's else clause). It can't mean two different things. It means "if I didn't break out of the loop with a break statement". > .iterkeys() does return an iterator with a useful __len__ method, so the > principle that iterators shouldn't be tested for truth doesn't seem right. Which iterkeys()? This is dependent on the object and on the Python version; Python 2.4 accidentally implemented __len__ on certain built-in iterators, which may explain why you are seeing this. It doesn't work pre-2.4 nor post-2.5, at least not for dict.iterkeys().
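The two points made above (the "empty" flag and the meaning of the for loop's else clause) can be sketched as follows. This is an illustration only: the function names handle and find are invented here, and the code is written for a current Python 3 interpreter.

```python
def handle(results):
    """Flag pattern from the message: a single pass, no list() copy."""
    seen = []
    empty = True
    for item in results:
        empty = False
        seen.append(item)
    if empty:
        return "no results"
    return seen


def find(items, target):
    """for/else: the else suite runs only if the loop was not left via break."""
    for item in items:
        if item == target:
            outcome = "found"
            break
    else:
        outcome = "not found"
    return outcome


print(handle(iter([])))      # no results
print(handle(iter([1, 2])))  # [1, 2]
print(find([], 1))           # not found
print(find([1, 2], 3))       # not found: else also runs for non-empty input
print(find([1, 2], 2))       # found
```

The second and third calls to find show why the else clause cannot double as a "no items" branch: it fires whenever the loop runs to completion, whether or not there were items.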
> (Very small mostly unrelated problem that occurs to me just at this > moment -- I can't override __len__ with any implementation that isn't > really cheap, because lots of code calls __len__ under the covers, like > list() -- originally SQLObject used len(query) to do a COUNT(*) query, > but that didn't work) Except in 2.4, you can avoid most implicit __len__ calls by implementing __nonzero__ separately; bool(x) tries __nonzero__ before __len__. Unfortunately, the iterator accelerator in 2.4 is called __len__ so various code tries to call __len__ when converting an iterator to a list/tuple. 2.3 didn't; 2.5 won't. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ianb at colorstudy.com Fri Mar 24 00:31:37 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 23 Mar 2006 17:31:37 -0600 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> Message-ID: <44232FD9.6050209@colorstudy.com> Guido van Rossum wrote: > On 3/23/06, Ian Bicking wrote: > >>Guido van Rossum wrote: >> >>>On 3/23/06, Ian Bicking wrote: >>> >>> >>>>This has been my personal experience with the iterators in SQLObject as >>>>well. The fact that an empty iterator is true tends to cause particular >>>>problems in that case, though I notice iterkeys() acts properly in this >>>>case; maybe part of the issue is that I'm actually using iterables >>>>instead of iterators, where I can't actually test the truthfulness. >>> >>>This sounds like some kind of fundamental confusion -- you should >>>never be tempted to test an iterator for its truth value. >> >>I'm testing if it is empty or not, which seems natural enough. Or would >>be, if it worked. > > > Testing whether an iterator is empty or not is an oxymoron; the only > legit way is to call next() and see whether it raises StopIteration. 
> This is the fundamental confusion I am talking about. It is NOT > "natural enough". It reveals a fundamental misunderstanding of the > design of the iterator protocol. I'm talking about a use case, not the protocol. Where iterators are used, it is very common that you also want to distinguish between zero and some items. The use case isn't odd or confused or oxymoronic -- it's very natural. The problem is that the natural and common use case doesn't translate into nice Python when you use iterators. > (There's also a design bug in 2.4 which perpetuates the confusion, > unfortunately; see below.) > > >>So I start out doing: >> >> for item in select_results: ... >> >>Then I realize that the zero-item case is special (which is common), and do: >> >> select_results = list(select_results) >> if select_results: >> ... >> else: >> for item in select_results:... > > > You should write that like this: > > empty = True > for item in select_results: > empty = False > ... > if empty: > ... I write this sometimes (if I don't want to use list()), and it hurts me. It makes me unhappy. Literally. This came to mind: for count, item in enumerate(select_results): ... if not count: ... But I realized that doesn't work at all. Damn, that would have been nice. >>That's not a very comfortable code transformation. When I was just >>first learning Python I thought this would work: >> >> for item in select_results: >> ... >> else: >> ... stuff when there are no items ... >> >>But it doesn't work like that. > > > Another fundamental confusion (about the for loop's else clause). It > can't mean two different things. It means "if I didn't break out of > the loop with a break statement". I'm not proposing it be changed, just recalling my own learning process. It was actually a really long time before I understood what that else: means, and my initial intuition was incorrect. 
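A short sketch of why the enumerate() idea above does not work as written, together with one variant that does. It assumes a current Python 3 interpreter (the start argument to enumerate() did not exist in 2006), and the function names last_index and item_count are made up for this example.

```python
def last_index(results):
    """The attempted idiom: after the loop, count is the index of the last item."""
    for count, item in enumerate(results):
        pass
    return count  # unbound if the loop body never ran


def item_count(results):
    """Working variant: pre-set count and number the items from 1."""
    count = 0
    for count, item in enumerate(results, 1):
        pass
    return count


# With exactly one item the last index is 0, so "if not count" misfires.
print(last_index(iter(["a"])))

# With no items the name was never bound at all.
try:
    last_index(iter([]))
except NameError as exc:
    print("empty input:", type(exc).__name__)

print(item_count(iter([])))          # 0
print(item_count(iter(["a", "b"])))  # 2
```

The variant still needs the explicit count = 0 before the loop, so it is only marginally shorter than the flag version.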
>>.iterkeys() does return an iterator with a useful __len__ method, so the >>principle that iterators shouldn't be tested for truth doesn't seem right. > > > Which iterkeys()? This is dependent on the object and on the Python > version; Python 2.4 accidentally implemented __len__ on certain > built-in iterators, which may explain why you are seeing this. It > doesn't work pre-2.4 not post-2.5, at least not for dict.iterkeys(). And here I thought it was a feature ;) -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From guido at python.org Fri Mar 24 00:41:37 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 23 Mar 2006 15:41:37 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <44232FD9.6050209@colorstudy.com> References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44232FD9.6050209@colorstudy.com> Message-ID: On 3/23/06, Ian Bicking wrote: [Guido] > > Testing whether an iterator is empty or not is an oxymoron; the only > > legit way is to call next() and see whether it raises StopIteration. > > This is the fundamental confusion I am talking about. It is NOT > > "natural enough". It reveals a fundamental misunderstanding of the > > design of the iterator protocol. > > I'm talking about a use case, not the protocol. Where iterators are > used, it is very common that you also want to distinguish between zero > and some items. Really? Methinks you are thinking of a fairly specific context -- when presenting database query results to a user. The problem IMO lies in SQLObject (which I admit I've never used) or perhaps in SQL itself, or the specific underlying DB. In most other situations, you have an honest-to-god container (e.g. a dict) which you can test for emptiness before even asking for an iterator over its items. When all you have is a query represented as an iterator this doesn't fly. 
That's why some DB API implementations return the number of results as the non-standard return value of the query API (at least that's what I recall -- it's been a while since I used the DB API). > The use case isn't odd or confused or oxymoronic -- > it's very natural. The problem is that the natural and common use case > doesn't translate into nice Python when you use iterators. There's no reason why the *specific* iterator returned by a query can't have an additional API to inquire whether it returned any results at all, or an exact result count; IIRC next() is but one of the many methods of query results. But this doesn't generalize to a flaw in the iterator protocol per se. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ianb at colorstudy.com Fri Mar 24 01:01:08 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 23 Mar 2006 18:01:08 -0600 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44232FD9.6050209@colorstudy.com> Message-ID: <442336C4.6050807@colorstudy.com> Guido van Rossum wrote: > On 3/23/06, Ian Bicking wrote: > [Guido] > >>>Testing whether an iterator is empty or not is an oxymoron; the only >>>legit way is to call next() and see whether it raises StopIteration. >>>This is the fundamental confusion I am talking about. It is NOT >>>"natural enough". It reveals a fundamental misunderstanding of the >>>design of the iterator protocol. >> >>I'm talking about a use case, not the protocol. Where iterators are >>used, it is very common that you also want to distinguish between zero >>and some items. > > > Really? Methinks you are thinking of a fairly specific context -- when > presenting database query results to a user. 
The problem IMO lies in > SQLObject (which I admit I've never used) or perhaps in SQL itself, or > the specific underlying DB. In most other situations, you have an > honest-to-god container (e.g. a dict) which you can test for emptiness > before even asking for an iterator over its items. When all you have > is a query represented as an iterator this doesn't fly. That's why > some DB API implementations return the number of results as the > non-standard return value of the query API (at least that's what I > recall -- it's been a while since I used the DB API). In SQLObject it came about due to a desire to lazily load objects out of a query. The lazy behavior had other problems (mostly introducing concurrency where you wouldn't expect). In addition, the query is only run when you start iterating. I'm not sure if that is good or bad design -- that queries are iterable doesn't seem that bad, except that the query is only invoked with iter() and that doesn't give very good access to the actual executed-query object; it's all too implicit. I don't know if the same issues exist for .items/.keys; I guess it would only be an issue if you passed one of iterators to some routine that didn't have access to the original dict. The identical problem does exist for all generators. Using ad hoc flags in for loops isn't a great solution. It's all somewhat similar to the repr() problem as well. Coming back around to the idea of implementing __getitem__ and such, I suppose a list-like iterator wrapper could be useful. That would consume and retain the results of the iterator lazily to satisfy the things done to the object. That would be kind of interesting; I implemented several such methods on the select result object in SQLObject for that purpose, and that aspect actually works pretty well. There's some predictability problems, though. 
bool(obj) would only have to consume one item, but len(obj) would consume the entire thing, and usually len() is a pretty innocuous function to use. If this was done, it would be nice if an iterator could give hints, like a faster implementation of __len__ than the fallback behavior that only can use .next(). -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From brett at python.org Fri Mar 24 01:02:30 2006 From: brett at python.org (Brett Cannon) Date: Thu, 23 Mar 2006 16:02:30 -0800 Subject: [Python-3000] Backward compatibility In-Reply-To: <44231F01.9060900@colorstudy.com> References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <44231F01.9060900@colorstudy.com> Message-ID: On 3/23/06, Ian Bicking wrote: > Guido van Rossum wrote: > >>I saw this too in the archives, and thought shit, that's going to mess > >>up a lot of my code. I would assume (though it's a separate point of > >>discussion) that Python 3k should still try hard to keep backward > >>compatibility. Backward compatibility isn't a requirement, but it's > >>still clearly a feature. > > > > > > You seem to be misunderstanding what Python 3000 is. The whole point > > of Python 3000 is to *not* be bound by backwards compatibility > > constraints, but instead make the best decisions possible (without > > making it a different language). > > I think this was in the bullet points of pending discussions, so maybe > should be a separate thread. > > When I say it is a feature... I guess that seems obvious to me. In 2.x > backward compatibility is something of a requirement. But in 3.0/3000 > backward compatibility is still a nice thing to have, that can be > weighed against other nice things. That doesn't seem overly > constrained, just practical. > I don't think things are going to be broken gratuitously. Just look at the backlash against PEP 348. 
The most common reason for not wanting a change was not because a suggestion was bad, just that the benefit of the change was outweighed by the breakage and pain of transition. I suspect all changes will be weighed this way and thus extreme breakage will be done only in cases where a clear benefit exists. Plus we will have to all convert stdlib code over so we will have a decent idea of what is needed in terms of guidelines or tools to make the transition easier. -Brett From ianb at colorstudy.com Fri Mar 24 01:11:47 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 23 Mar 2006 18:11:47 -0600 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <442336C4.6050807@colorstudy.com> References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44232FD9.6050209@colorstudy.com> <442336C4.6050807@colorstudy.com> Message-ID: <44233943.50201@colorstudy.com> Ian Bicking wrote: > Coming back around to the idea of implementing __getitem__ and such, I > suppose a list-like iterator wrapper could be useful. That would > consume and retain the results of the iterator lazily to satisfy the > things done to the object. That would be kind of interesting; I > implemented several such methods on the select result object in > SQLObject for that purpose, and that aspect actually works pretty well. > There's some predictability problems, though. bool(obj) would only > have to consume one item, but len(obj) would consume the entire thing, > and usually len() is a pretty innocuous function to use. > > If this was done, it would be nice if an iterator could give hints, like > a faster implementation of __len__ than the fallback behavior that only > can use .next().
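The list-like wrapper described in the quoted paragraph could look roughly like the sketch below. This is not SQLObject's implementation; the class name ListishIterator and its methods are hypothetical, and the code targets a current Python 3 interpreter. It shows the asymmetry mentioned above: bool() pulls one item, len() pulls everything.

```python
class ListishIterator:
    """Hypothetical wrapper that caches items pulled lazily from an iterator."""

    def __init__(self, iterable):
        self._source = iter(iterable)
        self._cache = []

    def _fill(self, wanted=None):
        """Pull items until the cache holds `wanted` items (all if None)."""
        while wanted is None or len(self._cache) < wanted:
            try:
                self._cache.append(next(self._source))
            except StopIteration:
                break

    def __bool__(self):
        self._fill(1)   # one item is enough to answer
        return bool(self._cache)

    def __len__(self):
        self._fill()    # has to exhaust the underlying iterator
        return len(self._cache)

    def __getitem__(self, index):
        if isinstance(index, slice) or index < 0:
            self._fill()            # slices and negative indexes need the end
        else:
            self._fill(index + 1)
        return self._cache[index]

    def __iter__(self):
        position = 0
        while True:
            self._fill(position + 1)
            if position >= len(self._cache):
                return
            yield self._cache[position]
            position += 1


rows = ListishIterator(n * n for n in range(5))
print(bool(rows), rows[2], len(rows), list(rows))
```

Because the items are retained, the object can be iterated repeatedly, at the cost of holding everything consumed so far in memory.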
BTW, I actually like the view idea better, though in some ways they are similar -- listish(iterator) is kind of like a list-like view on an iterator, like dict.keys() would be a multiset-like view of a dictionary's keys. As a generalized concept, views could be pretty neat and expansive. A downside is mistranslations of old code will lead to very hard bugs, as .keys() currently makes a copy. Though if the new objects aren't quite the same as lists -- e.g., implementing .add() instead of .append() -- then maybe that won't be so bad. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From barry at python.org Fri Mar 24 01:13:33 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 23 Mar 2006 19:13:33 -0500 Subject: [Python-3000] Backward compatibility In-Reply-To: References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <44231F01.9060900@colorstudy.com> Message-ID: <1143159213.10793.11.camel@resist.wooz.org> On Thu, 2006-03-23 at 16:02 -0800, Brett Cannon wrote: > I don't think things are going to be broken gratuitously. However, I hope we don't throw out clearly beneficial improvements for backward compatibility's sake. But yes, all things being equal, if it comes down to a tie-breaker then backward compatibility may tip the scales against a particular change. -Barry
From guido at python.org Fri Mar 24 01:36:41 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 23 Mar 2006 16:36:41 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <442336C4.6050807@colorstudy.com> References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44232FD9.6050209@colorstudy.com> <442336C4.6050807@colorstudy.com> Message-ID: On 3/23/06, Ian Bicking wrote: > Guido van Rossum wrote: > > On 3/23/06, Ian Bicking wrote: > > [Guido] > >>>Testing whether an iterator is empty or not is an oxymoron; the only > >>>legit way is to call next() and see whether it raises StopIteration. > >>>This is the fundamental confusion I am talking about. It is NOT > >>>"natural enough". It reveals a fundamental misunderstanding of the > >>>design of the iterator protocol. > >> > >>I'm talking about a use case, not the protocol. Where iterators are > >>used, it is very common that you also want to distinguish between zero > >>and some items. > > > > Really? Methinks you are thinking of a fairly specific context -- when > > presenting database query results to a user. The problem IMO lies in > > SQLObject (which I admit I've never used) or perhaps in SQL itself, or > > the specific underlying DB. In most other situations, you have an > > honest-to-god container (e.g. a dict) which you can test for emptiness > > before even asking for an iterator over its items. When all you have > > is a query represented as an iterator this doesn't fly.
That's why > > some DB API implementations return the number of results as the > > non-standard return value of the query API (at least that's what I > > recall -- it's been a while since I used the DB API). > > In SQLObject it came about due to a desire to lazily load objects out of > a query. The lazy behavior had other problems (mostly introducing > concurrency where you wouldn't expect). In addition, the query is only > run when you start iterating. I'm not sure if that is good or bad > design -- that queries are iterable doesn't seem that bad, except that > the query is only invoked with iter() and that doesn't give very good > access to the actual executed-query object; it's all too implicit. I'm becoming more and more doubtful about the design of SQLobject; perhaps it's just not a good example since the issues seem to be caused by its specific design more than by the language features it's using. > I don't know if the same issues exist for .items/.keys; I guess it would > only be an issue if you passed one of the iterators to some routine that > didn't have access to the original dict. But again that's an API design issue -- if the routine needed to know ahead of time whether the underlying collection was empty it should be given access to the collection. OTOH if you have an API that knows it can be given *any* iterator, then the "empty" flag pattern that I mentioned earlier is the only reliable way to differentiate between an empty and a non-empty container. (Note that I refuse to say "empty iterator"!) > The identical problem does exist for all generators. Using ad hoc flags > in for loops isn't a great solution. It's all somewhat similar to the > repr() problem as well. Not all generators. A fair number of generators are methods on collections that implement various iterators. OTOH generators are one of the reasons that the iterator protocol is as restricted as it is.
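To make the restriction concrete, here is a minimal sketch in modern Python 3 spelling (the thread's `it.next()` is written `next(it)` there). The generator `numbers` is a hypothetical stand-in; the point is only that a generator offers neither a truth test nor a length, so asking for the next item is the one protocol-level probe available.

```python
def numbers(limit):
    """Hypothetical generator used only for this illustration."""
    for n in range(limit):
        yield n

empty_gen = numbers(0)

# A generator object is always truthy, even when it will yield nothing.
gen_is_truthy = bool(empty_gen)

# len() is not part of the iterator protocol, so it raises TypeError here.
try:
    len(empty_gen)
    len_supported = True
except TypeError:
    len_supported = False

# The only protocol-level test is to ask for the next item.
try:
    next(empty_gen)
    produced_item = True
except StopIteration:
    produced_item = False
```

The probe is destructive: had the generator produced an item, that item would now be consumed.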
> Coming back around to the idea of implementing __getitem__ and such, I > suppose a list-like iterator wrapper could be useful. That would > consume and retain the results of the iterator lazily to satisfy the > things done to the object. I'm not sure that's all that useful. It reminds me of early pseudo-iterators that were implemented as lazy lists using __getitem__; these were eventually replaced by true iterators. The two extremes of the spectrum are already taken care of: use list(it) if you need truly random access; or iterate over the iterator exactly once if you can handle sequential access (like reading a file). > That would be kind of interesting; I > implemented several such methods on the select result object in > SQLObject for that purpose, and that aspect actually works pretty well. > There's some predictability problems, though. bool(obj) would only > have to consume one item, but len(obj) would consume the entire thing, > and usually len() is a pretty innocuous function to use. Which is why I think it's a bad idea to go down this lane. > If this was done, it would be nice if an iterator could give hints, like > a faster implementation of __len__ than the fallback behavior that only > can use .next(). That's what __len__ on iterators was intended for in 2.4. In 2.5 it will be reincarnated as __sizehint__ (I believe that's the name we settled on). 
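For readers on current Python 3 releases: the hint discussed here is spelled `__length_hint__` there (the `__sizehint__` name above was tentative) and is read through `operator.length_hint`. A small sketch, assuming only the standard library; the generator `gen` is hypothetical:

```python
import operator

items = iter([10, 20, 30])

# List iterators report how many items remain, without consuming any.
hint_before = operator.length_hint(items)
next(items)
hint_after = operator.length_hint(items)

def gen():
    """Hypothetical generator; generators provide no hint at all."""
    yield 1

# With no hint available, the caller-supplied default is returned.
hint_for_generator = operator.length_hint(gen(), 5)
```

The value is only a hint: callers such as `list()` may use it to presize storage, but it is not a reliable emptiness or length test.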
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From brett at python.org Fri Mar 24 01:46:29 2006 From: brett at python.org (Brett Cannon) Date: Thu, 23 Mar 2006 16:46:29 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44232FD9.6050209@colorstudy.com> Message-ID: On 3/23/06, Guido van Rossum wrote: > On 3/23/06, Ian Bicking wrote: > [Guido] > > > Testing whether an iterator is empty or not is an oxymoron; the only > > > legit way is to call next() and see whether it raises StopIteration. > > > This is the fundamental confusion I am talking about. It is NOT > > > "natural enough". It reveals a fundamental misunderstanding of the > > > design of the iterator protocol. > > > > I'm talking about a use case, not the protocol. Where iterators are > > used, it is very common that you also want to distinguish between zero > > and some items. > > Really? Methinks you are thinking of a fairly specific context -- when > presenting database query results to a user. The problem IMO lies in > SQLObject (which I admit I've never used) or perhaps in SQL itself, or > the specific underlying DB. In most other situations, you have an > honest-to-god container (e.g. a dict) which you can test for emptiness > before even asking for an iterator over its items. When all you have > is a query represented as an iterator this doesn't fly. That's why > some DB API implementations return the number of results as the > non-standard return value of the query API (at least that's what I > recall -- it's been a while since I used the DB API). > I think there is a fundamental difference between your views of iterators. It sounds like Ian is viewing them as a separate object; something that happens to have derived its values from a dict in the situation begin discussed. 
While it seems Guido views the iterator for the dict as a view (ala Java in a way) of the data and providing an object with an API for viewing that data. And this leads to the difference between wanting to know the length of the iterator compared to the object that returned the iterator. Taking the view of the iterator as its own object with data, you would do:: obj = {} it = obj.keys() len(it) But taking the view of the iterator as a view of the original object, it makes more sense to work off the original object:: obj = {} len(obj) it = obj.keys() I understand Ian's view since I know I like to pass around iterators for use and that disconnects the iterator from the object that generated it and thus makes it impossible to find out possible info on the data contained without exhausting the iterator compared to just performing data upon the object containing the original the data. But I think if objects returned iterators instead of lists the iterator-as-view will begin to be used more than viewing them as iterator-has-own-data. But this also means that making a more view-like interface would be handy. In terms of what would need to be supported (len, deletion, etc.) I don't know. I personally have not had that much of a need since I then just pass the originating object and get the iterator as needed instead of passing around the iterator. -Brett From brett at python.org Fri Mar 24 01:52:32 2006 From: brett at python.org (Brett Cannon) Date: Thu, 23 Mar 2006 16:52:32 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <4423213B.4050603@colorstudy.com> References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> Message-ID: On 3/23/06, Ian Bicking wrote: > Guido van Rossum wrote: > > On 3/23/06, Ian Bicking wrote: [SNIP] > I'm testing if it is empty or not, which seems natural enough. Or would > be, if it worked. 
So I start out doing: > > for item in select_results: ... > > Then I realize that the zero-item case is special (which is common), and do: > > select_results = list(select_results) > if select_results: > ... > else: > for item in select_results:... > > That's not a very comfortable code transformation. When I was just > first learning Python I thought this would work: > > for item in select_results: > ... > else: > ... stuff when there are no items ... > > But it doesn't work like that. I have to admit that is what I initially thought as well. I think it is because when I read 'else' I viewed it as an alternative if the clause it was attached to didn't happen (ala an 'if' statement). Obviously Python has broken me of that habit of thinking of it that way, but I bet most people are used to 'else' working like that. Would be nice to have a an easy way to specify what to do if the loop didn't execute. But that would require a keyword. Otherwise we should really include the idiom of having a boolean that gets set within the loop as part of the docs (or some Python Best Practices doc in terms of using loops or something). -Brett From ianb at colorstudy.com Fri Mar 24 02:00:17 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 23 Mar 2006 19:00:17 -0600 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44232FD9.6050209@colorstudy.com> <442336C4.6050807@colorstudy.com> Message-ID: <442344A1.4070001@colorstudy.com> Guido van Rossum wrote: >>In SQLObject it came about due to a desire to lazily load objects out of >>a query. The lazy behavior had other problems (mostly introducing >>concurrency where you wouldn't expect). In addition, the query is only >>run when you start iterating. 
I'm not sure if that is good or bad >>design -- that queries are iterable doesn't seem that bad, except that >>the query is only invoked with iter() and that doesn't give very good >>access to the actual executed-query object; it's all too implicit. > > > I'm becoming more and more doubtful about the design of SQLobject; > perhaps it's just not a good example since the issues seem to be > caused by its specific design more than by the language features it's > using. I'm just outlining the specific problems I found looking back on the design there, where I tried some of these techniques, with different levels of success or frustration. I haven't argued that those decisions were all good decisions. >>I don't know if the same issues exist for .items/.keys; I guess it would >>only be an issue if you passed one of the iterators to some routine that >>didn't have access to the original dict. > > > But again that's an API design issue -- if the routine needed to know > ahead of time whether the underlying collection was empty it should be > given access to the collection. OTOH if you have an API that knows it > can be given *any* iterator, then the "empty" flag pattern that I > mentioned earlier is the only reliable way to differentiate between an > empty and a non-empty container. (Note that I refuse to say "empty > iterator"!) Empty iterator or iterator that produced no items -- from the outside it's the same use case. Iterators look a lot like containers. Often I only use a list by iterating over it; if that's all I do then I can't tell the difference. At that point it is ambiguous. I'm not even sure if a "sequence" means a list-like object or an iterable. That's ambiguous too. So I'm only pointing out an existing ambiguity, and a place where that ambiguity causes problems. Right now this is how I would iterate over a container, special-casing an empty container: if container: for item in container: ... else: ...
In this case I am testing if the container is empty, and this generally works. Then an iterator is introduced, and my code breaks. So, I have to choose -- do I convert the iterator to a container with list() (and maybe needlessly copying a container), or do I switch to only using the iterable aspect of the container, like: empty = True for item in container: empty = False ... if empty: ... If using the iterable interface in this case felt as natural as using the container interface, then I'd probably have used the iterable form from the beginning and I wouldn't have a problem. But it doesn't feel as natural, so I don't. I can't say *everyone* makes the same choice as me, so I am using the first person in this argument. But I think most people do the same as I do, and so because the language does not make the iterable form very pretty it causes people to use the container interface (i.e., __nonzero__) even though they don't really need to. >>The identical problem does exist for all generators. Using ad hoc flags >>in for loops isn't a great solution. It's all somewhat similar to the >>repr() problem as well. > > > Not all generators. A fair number of generators are methods on > collections that implement various iterators. > > OTOH generators are one of the reasons that the iterator protocol is > as restricted as it is. I'm not arguing for adding __nonzero__ to iterators, only for addressing this use case where currently I make use of __nonzero__. Or, alternately, having whatever d.keys() returns implement __nonzero__, or otherwise be an iterable and not an iterator.
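The flag-based form can be written out as a runnable sketch. The function name `process` and the upper-casing step are hypothetical placeholders for real work; the point is that this form behaves identically for a list, a plain iterator, and a generator, whereas `if container:` would only be meaningful for the list.

```python
def process(iterable):
    """Return the handled items, or the string 'nothing' if none were produced."""
    handled = []
    empty = True
    for item in iterable:
        empty = False                 # flag flips on the first item seen
        handled.append(item.upper())  # placeholder for real per-item work
    if empty:
        return "nothing"
    return handled

from_list = process(["a", "b"])
from_empty_list = process([])
from_empty_iterator = process(iter([]))
from_empty_generator = process(x for x in [])
```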
-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From guido at python.org Fri Mar 24 02:06:30 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 23 Mar 2006 17:06:30 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> Message-ID: On 3/23/06, Brett Cannon wrote: > On 3/23/06, Ian Bicking wrote: > > When I was just > > first learning Python I thought this would work: > > > > for item in select_results: > > ... > > else: > > ... stuff when there are no items ... > > > > But it doesn't work like that. > > I have to admit that is what I initially thought as well. I think it > is because when I read 'else' I viewed it as an alternative if the > clause it was attached to didn't happen (ala an 'if' statement). > Obviously Python has broken me of that habit of thinking of it that > way, but I bet most people are used to 'else' working like that. > > Would be nice to have a an easy way to specify what to do if the loop > didn't execute. But that would require a keyword. Otherwise we > should really include the idiom of having a boolean that gets set > within the loop as part of the docs (or some Python Best Practices doc > in terms of using loops or something). But this is only needed if *all you have* is the iterator. Most of the time, the code containing the for loop has access to the container, and the iterator is only instantiated by the __iter__() call implied by the for loop. (Off-topic: maybe we can drop the fall-back behavior of iter() if __iter__ isn't found?) So the common pattern is simply this: if not container: ...it's empty... else: for item in container: ...handle items... The pattern with the 'empty' flag is only needed when due to API constraints you have only got an iterator. 
I don't think that's very common -- IMO SQLobject just made a poor choice there. It would have made more sense if its query object conceptually represented the *result set* and iterating over it was just the way of accessing the items of the result set. Even if you're iterating over the lines of a file, you can do without the 'empty' flag pattern -- you can simply stat the file to see whether it's empty or not. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Mar 24 02:16:24 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 23 Mar 2006 17:16:24 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44232FD9.6050209@colorstudy.com> Message-ID: On 3/23/06, Brett Cannon wrote: > I think there is a fundamental difference between your views of > iterators. It sounds like Ian is viewing them as a separate object; > something that happens to have derived its values from a dict in the > situation being discussed. While it seems Guido views the iterator > for the dict as a view (ala Java in a way) of the data and providing > an object with an API for viewing that data. Actually I'd say it's the opposite -- I *don't* see an iterator as a view, I see it as a throw-away way to access the elements of some underlying container in a sequential fashion. Sometimes the nature of the container makes it impractical to inquire the container directly about its size (e.g. if the "container" represents the lines of a file, or a graph represented by the pages of a web server).
Ian (or perhaps SQLobject) seems to collapse the notion of the iterator and the underlying container, and that causes the problems because the properties of the iterator are quite limited (all you can really do is ask for the next item, fingers crossed) while the container may have other metadata even if it can't tell you how many values the iteration will return. [SNIP] > I understand Ian's view since I know I like to pass around iterators > for use and that disconnects the iterator from the object that > generated it and thus makes it impossible to find out possible info on > the data contained without exhausting the iterator compared to just > performing data upon the object containing the original the data. You shouldn't do that unless you are consciously designing an API that must be able to work with in(de)finite sequences or other strange things. The itertools library is an example of such an API because (by intent) it must work for all iterators. Most APIs aren't as constrained and it's fine to require an iterable instead of an iterator. > But I think if objects returned iterators instead of lists the > iterator-as-view will begin to be used more than viewing them as > iterator-has-own-data. But this also means that making a more > view-like interface would be handy. In terms of what would need to be > supported (len, deletion, etc.) I don't know. I personally have not > had that much of a need since I then just pass the originating object > and get the iterator as needed instead of passing around the iterator. I'm dead set against giving iterators more view-like properties; it would rule out generators and other potentially infinite sequences. But I'm all for a different concept, views in the sense of Java's collection framework. 
Please study it; it's worth it: http://java.sun.com/j2se/1.4.2/docs/api/java/util/Collection.html -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ianb at colorstudy.com Fri Mar 24 02:21:24 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 23 Mar 2006 19:21:24 -0600 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> Message-ID: <44234994.1020508@colorstudy.com> Guido van Rossum wrote: > But this is only needed if *all you have* is the iterator. Most of the > time, the code containing the for loop has access to the container, > and the iterator is only instantiated by the __iter__() call implied > by the for loop. I don't think that is the case. For instance: def non_empty_lines(seq): for line in seq: if line.strip() and not line.strip().startswith('#'): yield line for line in non_empty_lines(open('config.txt')): ... I think wrapping the iterator in non_empty_lines() shouldn't cause you to have to rewrite your logic too radically. More generally, I find myself using list() fairly often lately as generators have become more popular, and it's not just with SQLObject. Testing for the existence of any items in the iterator (is that a better way of saying it than empty?) is often the reason.
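One way to test for "any items" without copying everything into a list is to pull a single item and chain it back on. This is a sketch only, in Python 3 spelling: `peek` is a hypothetical helper (not a standard library function), and the in-memory `config` list stands in for the `config.txt` file so the example is self-contained.

```python
import itertools

def non_empty_lines(seq):
    """Yield lines that are neither blank nor '#' comments (as in the message above)."""
    for line in seq:
        if line.strip() and not line.strip().startswith('#'):
            yield line

def peek(iterator):
    """Hypothetical helper: return (has_items, iterator_with_first_item_restored)."""
    sentinel = object()
    first = next(iterator, sentinel)
    if first is sentinel:
        return False, iter(())
    # Put the consumed item back in front of the remaining ones.
    return True, itertools.chain([first], iterator)

config = ["# comment\n", "\n", "name = value\n"]
has_items, lines = peek(non_empty_lines(config))
remaining = list(lines)

only_comments, _ = peek(non_empty_lines(["# nothing here\n"]))
```

The cost is that one item is consumed eagerly, so the underlying generator runs slightly earlier than a plain `for` loop would run it.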
-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From guido at python.org Fri Mar 24 02:27:48 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 23 Mar 2006 17:27:48 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <442344A1.4070001@colorstudy.com> References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44232FD9.6050209@colorstudy.com> <442336C4.6050807@colorstudy.com> <442344A1.4070001@colorstudy.com> Message-ID: On 3/23/06, Ian Bicking wrote: > Empty iterator or iterator that produced no items -- from the outside > it's the same use case. I see it as an education issue. Because we have generators, the iterator protocol can't be extended to support a test for emptiness -- that's just not something that generators can be expected to support. So yes, there's confusion, but there's no way to remove the confusion, no matter how many times you repeat the usability issues. If we extend the iterator protocol with a mandatory emptiness test API, that excludes generators from being considered iterators, and the API design question you have to ask is whether you want to support generators. If you find the 'empty' flag pattern too ugly, you can always write a helper class that takes an iterator and returns an object that represents the same iterator, but sometimes buffers one element. But the buffering violates the coroutine-ish properties of generators, so it should not be the only (or even the default) way to access generators. 
Here's a sample wrapper (untested):

    class IteratorWrapper(object):

        def __init__(self, it):
            self.it = it
            self.buffer = None
            self.buffered = False
            self.exhausted = False

        def __iter__(self):
            return self

        def next(self):
            if self.buffered:
                value = self.buffer
                self.buffered = False
                self.buffer = None
                return value
            if self.exhausted:
                raise StopIteration()
            try:
                return self.it.next()
            except StopIteration:
                self.exhausted = True
                raise

        def __nonzero__(self):
            if self.buffered:
                return True
            if self.exhausted:
                return False
            try:
                self.buffer = self.it.next()
            except StopIteration:
                self.exhausted = True
                return False
            self.buffered = True
            return True

-- --Guido van Rossum (home page: http://www.python.org/~guido/) From brett at python.org Fri Mar 24 02:24:06 2006 From: brett at python.org (Brett Cannon) Date: Thu, 23 Mar 2006 17:24:06 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> Message-ID: On 3/23/06, Guido van Rossum wrote: > On 3/23/06, Brett Cannon wrote: > > On 3/23/06, Ian Bicking wrote: > > > When I was just > > > first learning Python I thought this would work: > > > > > > for item in select_results: > > > ... > > > else: > > > ... stuff when there are no items ... > > > > > > But it doesn't work like that. > > > > I have to admit that is what I initially thought as well. I think it > > is because when I read 'else' I viewed it as an alternative if the > > clause it was attached to didn't happen (ala an 'if' statement). > > Obviously Python has broken me of that habit of thinking of it that > > way, but I bet most people are used to 'else' working like that. > > > > Would be nice to have an easy way to specify what to do if the loop > > didn't execute. But that would require a keyword.
Otherwise we > > should really include the idiom of having a boolean that gets set > > within the loop as part of the docs (or some Python Best Practices doc > > in terms of using loops or something). > > But this is only needed if *all you have* is the iterator. Most of the > time, the code containing the for loop has access to the container, > and the iterator is only instantiated by the __iter__() call implied > by the for loop. Right. That is what I do; pass around an iterator if I know all I need is an iterator, otherwise I have API require the actual object so I can get the iterator directly and perform any possible queries I need. I just don't know how often people think about doing that. > (Off-topic: maybe we can drop the fall-back behavior > of iter() if __iter__ isn't found?) > I say yes. Iterators will be common enough that objects that want the support should just directly support it. > So the common pattern is simply this: > > if not container: > ...it's empty... > else: > for item in container: > ...handle items... > Right. I am really starting to think that having a group of Best Practices essays that discuss common Python idioms might be handy. Part tutorial, part advanced usage, they would provide a way for people to have a place to go to find out expected usage of things such as iterators without having to discover this kind of thing the hard way. Could also help us see where possible improvements could come in for Py3K if we write them from the perspective of 2.x, or even how things improve if we write them for Py3K. Basically like the HOWTOs, but for the language instead of a module and with a better name. 
=) -Brett From guido at python.org Fri Mar 24 02:31:27 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 23 Mar 2006 17:31:27 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <44234994.1020508@colorstudy.com> References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44234994.1020508@colorstudy.com> Message-ID: On 3/23/06, Ian Bicking wrote: > Guido van Rossum wrote: > > But this is only needed if *all you have* is the iterator. Most of the > > time, the code containing the for loop has access to the container, > > and the iterator is only instantiated by the __iter__() call implied > > by the for loop. > > I don't think that is the case. For instance: > > def non_empty_lines(seq): > for line in seq: > if line.strip() and not line.strip().startswith('#'): > yield line > > for line in non_empty_lines(open('config.txt')): > ... > > I think wrapping the iterator in non_empty_lines() shouldn't cause you > to have to rewrite your logic to radically. Radically compared to what? > More generally, I find > myself using list() fairly often lately as generators have become more > popular, and it's not just with SQLObject. Testing for the existence of > any items in the iterator (is that a better way of saying it than > empty?) is often the reason. If creating a copy of all items using list() is not a problem, then you shouldn't have been using iterators in the first place. Iterators exist so you can efficiently handle cases where list() would overflow memory. If you don't have such cases, you should just design your APIs to return lists in the first place. But I betcha that many of the APIs you're using are giving you iterators instead of lists *because* (in the general case -- maybe not for your application) they can return more data than fits in memory. SQL queries being an example. 
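A small sketch of the case iterators exist for, in Python 3 spelling. The generator `rows` is a hypothetical stand-in for a result stream too large (here, unbounded) to copy with list(); `itertools.islice` takes only what is needed.

```python
import itertools

def rows():
    """Hypothetical unbounded result stream standing in for a very large query."""
    for n in itertools.count():
        yield {"id": n}

# Only three rows are ever produced; list(rows()) would never finish.
first_three = [row["id"] for row in itertools.islice(rows(), 3)]
```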
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From brett at python.org Fri Mar 24 02:48:58 2006 From: brett at python.org (Brett Cannon) Date: Thu, 23 Mar 2006 17:48:58 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44232FD9.6050209@colorstudy.com> Message-ID: On 3/23/06, Guido van Rossum wrote: > On 3/23/06, Brett Cannon wrote: > > I think there is a fundamental difference between your views of > > iterators. It sounds like Ian is viewing them as a separate object; > > something that happens to have derived its values from a dict in the > > situation begin discussed. While it seems Guido views the iterator > > for the dict as a view (ala Java in a way) of the data and providing > > an object with an API for viewing that data. > > Actually I'd say it's the opposite -- I *don't* see an iterator as a > view, I see it as a throw-away way to access the elements of some > underlying container in a sequential fashion. Sometimes the nature of > the containing makes it impractical to inquire the container directly > about its size (e.g. if the "container" represents the lines of a > file, or a graph represented by the pages of a web server). > OK, I think view was the wrong word to choose then. But your view of iterators as providing a way to access the data of the underlying container does match up with what I was thinking. > Ian (or perhaps SQLobject) seems to collapse the notion of the > iterator and the underlying container, and that causes the problems > because the properties of the iterator are quite limited (all you can > really do is ask for the next item, fingers crossed) while the > container may have other metadata even if it can't tell you how many > values the iteration will return. 
> > [SNIP] > > > I understand Ian's view since I know I like to pass around iterators > > for use and that disconnects the iterator from the object that > > generated it and thus makes it impossible to find out possible info on > > the data contained without exhausting the iterator compared to just > > performing data upon the object containing the original the data. > > You shouldn't do that unless you are consciously designing an API that > must be able to work with in(de)finite sequences or other strange > things. The itertools library is an example of such an API because (by > intent) it must work for all iterators. > > Most APIs aren't as constrained and it's fine to require an iterable > instead of an iterator. > > > But I think if objects returned iterators instead of lists the > > iterator-as-view will begin to be used more than viewing them as > > iterator-has-own-data. But this also means that making a more > > view-like interface would be handy. In terms of what would need to be > > supported (len, deletion, etc.) I don't know. I personally have not > > had that much of a need since I then just pass the originating object > > and get the iterator as needed instead of passing around the iterator. > > I'm dead set against giving iterators more view-like properties; it > would rule out generators and other potentially infinite sequences. > Good point. > But I'm all for a different concept, views in the sense of Java's > collection framework. Please study it; it's worth it: > > http://java.sun.com/j2se/1.4.2/docs/api/java/util/Collection.html > This looks like a container/set protocol to me. Beyond having stronger guarantees in terms of the deletion methods (would have them always raise NotImplemented if they don't mutate the object they represent) and ditching Java-specific stuff (like the toArray() cruft) this would be a nice complement to iterators. If we formalize the mapping protocol we would have a specified API for most major uses of data structures.
We would have a protocol for detecting whether an object contains
something (container), a way to go linearly over the values of an
object (iterator), and a way to provide random access to values in an
object (mapping). The container protocol could also have an optional
location or key method that returns the value associated with the
wanted object, if such a thing exists (so lists would return the index,
dicts the key, and sets would not implement the method).

-Brett

From fumanchu at amor.org Fri Mar 24 02:51:36 2006
From: fumanchu at amor.org (Robert Brewer)
Date: Thu, 23 Mar 2006 17:51:36 -0800
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
Message-ID: <435DF58A933BA74397B42CDEB8145A86EAD1C6@ex9.hostedexchange.local>

Guido van Rossum wrote:
> On 3/23/06, Robert Brewer wrote:
> > It is a hassle. I recently changed my ORM's API from
> > returning iterators to lists based on similar user
> > feedback (and I could have sworn I already made
> > this comment on this issue, but I can't find it
> > now). That experience hints to me that any
> > interface that prefers iterators over lists
> > is going to be resisted; builtins doubly so.
>
> Interesting. Was your feedback also based on use in interactive
> sessions only (like Jim's)?

Not at all; the interfaces I mentioned are never used interactively as
far as I know. I think the issue is one of wanting to write code
quickly and easily (whether done 'interactively' or not), and to
therefore use idioms that fit the domain best. In my cases (database
search) the results are almost always going to be 1) iterated over
completely, and 2) more than once, whether that's an additional sort()
operation, or a DSU, or type coercion, reformatting, etc. In that
domain, lists are the natural fit. Having to wrap iterators with list()
at that point is boilerplate, for the mind as much as the fingers.
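
A minimal sketch of the multiple-pass point above, written in present-day Python 3 rather than the 2.x of this thread. The search() generator and its rows are made up for illustration; they stand in for an ORM query and are not Robert's actual API.

```python
def search():
    # Hypothetical stand-in for a database query that yields rows lazily.
    for row in [("carol", 3), ("alice", 1), ("bob", 2)]:
        yield row

# Using the iterator directly: the second traversal sees nothing,
# because the first one already consumed every row.
results = search()
first_pass = [name for name, _ in results]
second_pass = [name for name, _ in results]

# Wrapping in list() (the boilerplate under discussion) makes the
# result reusable for several sorts or reformatting passes.
rows = list(search())
by_name = sorted(rows)
by_rank = sorted(rows, key=lambda row: row[1])
```

If search() returned a list in the first place, callers could drop the list() call, which is the trade-off being weighed in this message.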

You've got two types of objects in the wild, the "list" set and the
"iterator" set, and they overlap by a large margin (because you can
perform many operations on either with no change in syntax). Say it's
an 80% overlap. It might seem trivial, since the overlap is large, to
switch to preferring iterators over lists for return values. But you
still have to weigh the benefits of improvements for the
"iterator-specific" operations (10%) versus the burden of additional
syntax for the "list-specific" operations (10%).

         +----------------------+
         |         list         |
    +----+----------------------+
iter|    |----------------------|
    |    |------- shared -------|
    +----+----------------------+

But in my experience, the remainders are not currently equal. I'd make
a wild-ass guess like "60% shared syntax, 30% lists, 10% iter". So I
feel like the "list operations" (like indexing, slicing, len, etc.) are
going to be penalized with additional boilerplate, in exchange for a
very small win (e.g., typing "keys" instead of "iterkeys").

I'd also opine that the standard library is not similar in this regard
to most application code, simply because most library and framework
code is going to try to be more generic, accepting both kinds of
iterables where possible (whereas application code tends to be more
naive, and therefore assume far more about data types).

> Would the same objection exist against APIs that return "views" as I
> described in a previous message (a la the Java collections package)?

If the vastly common case is to coerce an object into something else
before using it, I think that's a strong argument that the design is
flawed. For example, Python is better IMO for not having an Integer
class on top of an int primitive type (like Java has). If "views"
introduce a similar amount of conceptual and lexical overhead for the
common case, then I think the objection would remain.
Robert Brewer System Architect Amor Ministries fumanchu at amor.org From steven.bethard at gmail.com Fri Mar 24 02:57:16 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Thu, 23 Mar 2006 18:57:16 -0700 Subject: [Python-3000] else-clause on for-loops Message-ID: On 3/23/06, Ian Bicking wrote: > When I was just first learning Python I thought this would work: > > for item in select_results: > ... > else: > ... stuff when there are no items ... > > But it doesn't work like that. On 3/23/06, Brett Cannon wrote: > I have to admit that is what I initially thought as well. I think it > is because when I read 'else' I viewed it as an alternative if the > clause it was attached to didn't happen (ala an 'if' statement). Yeah, I use for-else occasionally, and I know how it works in Python, but every time I want to special-case the empty iterable case, I still have to remind myself that the else-clause doesn't do what I want it to. There was talk previously_ about removing the else clause on for-loops (and while-loops). One possibility would be to change the else-clause to behave as expected above (i.e. only executed when the loop fails to iterate over any items). I don't feel strongly on this one way or another -- I use the current for-else syntax about as often as I need to special-case an empty iterable. Just thought I'd point out the old thread since it was aimed at Python 3000-ish anyway. .. previously: http://mail.python.org/pipermail/python-dev/2005-July/054695.html STeVe -- Grammar am for people who can't think for myself. --- Bucky Katt, Get Fuzzy From edcjones at comcast.net Fri Mar 24 03:05:01 2006 From: edcjones at comcast.net (Edward C. Jones) Date: Thu, 23 Mar 2006 21:05:01 -0500 Subject: [Python-3000] Best Practices essays In-Reply-To: References: Message-ID: <442353CD.1050101@comcast.net> "Brett Cannon" wrote: > Right. I am really starting to think that having a group of Best > Practices essays that discuss common Python idioms might be handy. 
> Part tutorial, part advanced usage, they would provide a way for > people to have a place to go to find out expected usage of things > such as iterators without having to discover this kind of thing the > hard way. Could also help us see where possible improvements could > come in for Py3K if we write them from the perspective of 2.x, or > even how things improve if we write them for Py3K. That's a good idea. Include everything from m * [n * [0]] to pickling classes with __new__. For the latter, Google on "pickling a subclass of tuple". From tdelaney at avaya.com Fri Mar 24 03:15:00 2006 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Fri, 24 Mar 2006 13:15:00 +1100 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) Message-ID: <2773CAC687FD5F4689F526998C7E4E5FF1E619@au3010avexu1.global.avaya.com> Jeremy Hylton wrote: > On 3/23/06, Ian Bicking wrote: >> One idea I had after reading a post of Brett's was a dual-use >> attribute; if you do d.keys you get an iterable (not an iterator, of >> course), and if you call that iterable you get a list. This is >> backward compatible, arguably prettier anyway to make it a property >> (since there's no side effects and getting an iterable isn't >> expensive, the method call seems somewhat superfluous). > > I don't think we should overload attributes name such that they are > sometimes attributes and sometimes methods, particularly when they > return things that behave almost-but-not-quite the same. It will > create confusion and subtle bugs. This whole discussion suggests to me that what would be best is if we defined an actual "view" protocol, and various builtins return views, rather than either copies or iterators. A view provides the same access methods, etc as the object it is backed by. The aim of a view is to be lightweight. 
A view should not allow modification of the underlying object, but the
view itself may change if the underlying object changes (and how it
changes would need to be documented).

For the case of dict.keys(), a list-view would be returned. From all
appearances, this would be an immutable sequence i.e. it would implement
__getitem__, __iter__ and __len__.

Tim Delaney

From ianb at colorstudy.com Fri Mar 24 04:24:12 2006
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 23 Mar 2006 21:24:12 -0600
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To:
References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44234994.1020508@colorstudy.com>
Message-ID: <4423665C.9080807@colorstudy.com>

Guido van Rossum wrote:
> On 3/23/06, Ian Bicking wrote:
>> Guido van Rossum wrote:
>>> But this is only needed if *all you have* is the iterator. Most of the
>>> time, the code containing the for loop has access to the container,
>>> and the iterator is only instantiated by the __iter__() call implied
>>> by the for loop.
>> I don't think that is the case. For instance:
>>
>>     def non_empty_lines(seq):
>>         for line in seq:
>>             if line.strip() and not line.strip().startswith('#'):
>>                 yield line
>>
>>     for line in non_empty_lines(open('config.txt')):
>>         ...
>>
>> I think wrapping the iterator in non_empty_lines() shouldn't cause you
>> to have to rewrite your logic to radically.
>
> Radically compared to what?

Starts out:

    if os.stat(filename).st_size:  # weird, but your suggestion ;)
        with open(filename) as lines:
            for line in lines:
                read_config(line)
    else:
        get_default_config()

Adding comments and empty line handling, the code becomes:

    with open(filename) as lines:
        empty = True
        for line in non_empty_lines(lines):
            empty = False
            read_config(line)
        if empty:
            get_default_config()

To me that feels like a big transformation, where I would prefer to just
be able to use "non_empty_lines(lines)" in place of "lines" and
everything would work perfectly.

If I started out with the second example instead of the first, it
*would* work perfectly. But I don't do so. If that second example
looked just a little nicer I would use that form, and then there
wouldn't be any problem. It really doesn't matter for this case if the
test comes before (using __nonzero__) or after the for loop (using a
did-that-loop-run flag).

Of course, no syntax comes to mind to improve this. Repurposing the
else clause in for loops seems like it just adds to the confusion of an
already confusing construct. If you could somehow count how many times
the loop had run, that'd work great; but I don't see any way to do that
without new syntax.

>> More generally, I find
>> myself using list() fairly often lately as generators have become more
>> popular, and it's not just with SQLObject. Testing for the existence of
>> any items in the iterator (is that a better way of saying it than
>> empty?) is often the reason.
>
> If creating a copy of all items using list() is not a problem, then
> you shouldn't have been using iterators in the first place. Iterators
> exist so you can efficiently handle cases where list() would overflow
> memory. If you don't have such cases, you should just design your APIs
> to return lists in the first place.

You've just made the argument that dict.keys() should return a list ;)

Or a view would work just as well, I suppose.
Maybe you've just made the argument that it should return an iterable, not an iterator. -- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From guido at python.org Fri Mar 24 05:17:29 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 23 Mar 2006 20:17:29 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <4423665C.9080807@colorstudy.com> References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44234994.1020508@colorstudy.com> <4423665C.9080807@colorstudy.com> Message-ID: On 3/23/06, Ian Bicking wrote: [...] > To me that feels like a big transformation, where I would prefer to just > be able to use "non_empty_lines(lines)" in place of "lines" and > everything would work perfectly. Well wishing ain't going to make it so. I'm not sure what you're proposing; what you want can be done just fine by casting the iterator to a list(). But that's not acceptable as the fundamental API because of the premise that the whole sequence may not fit in memory. It's clear that YOU don't care about that case, but *I* do. [Guido] > > If creating a copy of all items using list() is not a problem, then > > you shouldn't have been using iterators in the first place. Iterators > > exist so you can efficiently handle cases where list() would overflow > > memory. If you don't have such cases, you should just design your APIs > > to return lists in the first place. > > You've just made the argument that dict.keys() should return a list ;) How so? If the dict fits in memory, the dict plus the list may not -- or it may be so tight that you end up swapping. > Or a view would work just as well, I suppose. Maybe you've just made > the argument that it should return an iterable, not an iterator. Technically an iterator is an iterable, so requiring it to return an iterable doesn't solve your problem. 
Requiring a sequence or a collection which may be a view instead of a
copy *does* solve it, so I propose to go for that. This would also
solve the redundancy of having iter(d) and iter(d.keys()) return the
same thing -- d.keys() would return a set (not multiset!) view which
has other uses than either d or iter(d).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Fri Mar 24 05:25:33 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 23 Mar 2006 20:25:33 -0800
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5FF1E619@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5FF1E619@au3010avexu1.global.avaya.com>
Message-ID:

On 3/23/06, Delaney, Timothy (Tim) wrote:
> This whole discussion suggests to me that what would be best is if we
> defined an actual "view" protocol, and various builtins return views,
> rather than either copies or iterators.

Right.

> A view provides the same access methods, etc as the object it is backed
> by.

No it doesn't! Otherwise there would be no difference between a view
and the underlying object. Please read the Java docs that I referenced
before.

> The aim of a view is to be lightweight. A view should not allow
> modification of the underlying object, but the view itself may change if
> the underlying object changes (and how it changes would need to be
> documented).

The Java collections framework has precise rules for this. Please read
it. A better URL to start than the one I gave before is:
http://java.sun.com/j2se/1.4.2/docs/guide/collections/index.html

For example, here's a quote from the spec for Map.keySet() -- the Java
equivalent of dict.keys() -- at
http://java.sun.com/j2se/1.4.2/docs/api/java/util/Map.html#keySet():

"""
Returns a set view of the keys contained in this map. The set is
backed by the map, so changes to the map are reflected in the set, and
vice-versa.
If the map is modified while an iteration over the set is in progress, the results of the iteration are undefined. The set supports element removal, which removes the corresponding mapping from the map, via the Iterator.remove, Set.remove, removeAll retainAll, and clear operations. It does not support the add or addAll operations. """ > For the case of dict.keys(), a list-view would be returned. No, a set view. You know that the keys are unique so it is a set; and a list view would imply cheap random access, which does not work for a hash table -- it's expensive to compute the N'th element of a hash table with holes, since the only way to do it is to start at the beginning and count forward, skipping the holes, until you've passed the requested number of non-holes. > From all > appearances, this would be an immutable sequence i.e. it would implement > __getitem__, __iter__ and __len__. No. See the Java quote above. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jack at performancedrivers.com Fri Mar 24 06:08:46 2006 From: jack at performancedrivers.com (Jack Diederich) Date: Fri, 24 Mar 2006 00:08:46 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44232FD9.6050209@colorstudy.com> <442336C4.6050807@colorstudy.com> <442344A1.4070001@colorstudy.com> Message-ID: <20060324050846.GB4440@performancedrivers.com> On Thu, Mar 23, 2006 at 05:27:48PM -0800, Guido van Rossum wrote: > On 3/23/06, Ian Bicking wrote: > > Empty iterator or iterator that produced no items -- from the outside > > it's the same use case. > > I see it as an education issue. Because we have generators, the > iterator protocol can't be extended to support a test for emptiness -- > that's just not something that generators can be expected to support. 
> So yes, there's confusion, but there's no way to remove the confusion,
> no matter how many times you repeat the usability issues. If we extend
> the iterator protocol with a mandatory emptiness test API, that
> excludes generators from being considered iterators, and the API
> design question you have to ask is whether you want to support
> generators.

90% of the time I don't know or care if I am using an iterable or
an iterator. As Martelli pointed out in this thread, for the times when
you don't care iterators are preferable on speed-and-space grounds.
Most of the other 10% requires special testing for empty regardless of
a list or iter.

    import csv
    import itertools as it

    # list way
    mycsv = list(csv.reader(open('foo.csv')))
    header = mycsv[0]
    rows = mycsv[1:]
    for (d) in [dict(zip(header, row)) for (row) in rows]:
        pass  # do stuff

    # iter way
    mycsv = csv.reader(open('foo.csv'))
    header = mycsv.next()
    for (d) in (dict(zip(header, row)) for (row) in mycsv):
        pass  # do stuff

For finger typing the two are a push, I'd give the iter version a slight
edge in clarity. Both examples raise exceptions for empty files.
Ian and others seem to prefer the readability of iterables for that
exceptional case.

> If you find the 'empty' flag pattern too ugly, you can always write a
> helper class that takes an iterator and returns an object that
> represents the same iterator, but sometimes buffers one element. But
> the buffering violates the coroutine-ish properties of generators, so
> it should not be the only (or even the default) way to access
> generators.

[snip Guido's peek ahead iterator class]

My good sir, you have stopped at mere hack when a depraved hack was
available. Peek-ahead isn't necessary for this one, pep 343 is.

    from __future__ import pep343_is_implemented

    with iter_or_false(it) as test:
        for (val) in test:
            print val
        else:
            print "break wasn't called"
    if (not test):
        print "it was empty!"

    # implementation of iter_or_false (tested!)
    import itertools

    class iter_or_false(object):
        def __init__(self, iterable):
            self.it = iter(iterable)
            self.empty = True
        def __nonzero__(self):
            return not self.empty
        def __context__(self):
            return self
        def __exit__(self, *info):
            return False
        def __enter__(self):
            return self
        def __iter__(self):
            return self
        def next(self):
            val = self.it.next()
            self.empty = False
            return val

I agree it is much better to put the smarts in the SQL result class
(grow a __nonzero__ method and a ob.rows iterator) than to force
it into a general case. But if it absolutely must be tacked-on ...

-jackdied

From adam.deprince at gmail.com Fri Mar 24 07:25:09 2006
From: adam.deprince at gmail.com (Adam DePrince)
Date: Fri, 24 Mar 2006 01:25:09 -0500
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5FF1E619@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5FF1E619@au3010avexu1.global.avaya.com>
Message-ID: <1143181509.3287.36.camel@localhost.localdomain>

On Fri, 2006-03-24 at 13:15 +1100, Delaney, Timothy (Tim) wrote:
> Jeremy Hylton wrote:
>
> > On 3/23/06, Ian Bicking wrote:
> >> One idea I had after reading a post of Brett's was a dual-use
> >> attribute; if you do d.keys you get an iterable (not an iterator, of
> >> course), and if you call that iterable you get a list. This is
> >> backward compatible, arguably prettier anyway to make it a property
> >> (since there's no side effects and getting an iterable isn't
> >> expensive, the method call seems somewhat superfluous).
> >
> > I don't think we should overload attributes name such that they are
> > sometimes attributes and sometimes methods, particularly when they
> > return things that behave almost-but-not-quite the same. It will
> > create confusion and subtle bugs.
> > This whole discussion suggests to me that what would be best is if we > defined an actual "view" protocol, and various builtins return views, > rather than either copies or iterators. > > A view provides the same access methods, etc as the object it is backed > by. The aim of a view is to be lightweight. A view should not allow > modification of the underlying object, but the view itself may change if > the underlying object changes (and how it changes would need to be > documented). > > For the case of dict.keys(), a list-view would be returned. From all > appearances, this would be an immutable sequence i.e. it would implement > __getitem__, __iter__ and __len__. By immutable you mean changes only arise from changes in the view's underlying data source, correct? The contents of the list would change as the key-set changed. There is one problem that I see. A list-view actually requires more information than the dict keys provide. Keys in a dict are unordered, when dict.keys() is called a synthetic ordering is created to accommodate the fact that a list is ordered, and therefore the list must have some order. Right now this a reflection of a combination of hash value and insertion order - completely arbitrary from the application's perspective. When items are removed from the dict, do we leave a hole in the list? Or do we rearrange the positions of the remaining items in the list? Now, even if we do this in a documented fashion, if we are giving the user the ability to index into our keys like a real list, what happens when their saved index value no longer matches the location they intended. An additional problem arises if we have multiple list-views in existence, created at different times. Does the existence of old list views affect the ordering of our new views? If not, then an older view that has been subject to a number of insertions and deletions will have a distinctly different ordering than a brand new view. 
But if so, then we have considerable housekeeping expenses of maintaining this information. Forgive my tangent, but my only concern is that if we are going to introduce a notion of a view, then we should be very careful about situations where the target perspective encodes more information than the source. Anytime the target perspective requires more information, such as going from an unordered to ordered or perhaps a partially ordered to fully ordered collection, we start to entail housekeeping associated with tracking the added information, thus ensuring the self consistency of separate implementations of our views. And beyond the logistical complexity is the processing time ... dicts are fast, anything fully ordered automatically involves O(lg(n)) insertion/deletion times. I like the idea of information neutral or reducing views; but for anything that requires the addition of "synthetic information" to construct a view worries me. Cheers, Adam DePrince > > Tim Delaney > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/adam.deprince%40gmail.com From aahz at pythoncraft.com Fri Mar 24 07:28:22 2006 From: aahz at pythoncraft.com (Aahz) Date: Thu, 23 Mar 2006 22:28:22 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44232FD9.6050209@colorstudy.com> Message-ID: <20060324062821.GA19103@panix.com> On Thu, Mar 23, 2006, Guido van Rossum wrote: > On 3/23/06, Brett Cannon wrote: >> >> I understand Ian's view since I know I like to pass around iterators >> for use and that disconnects the iterator from the object that >> generated it and thus makes it impossible to find out possible info on >> the data 
contained without exhausting the iterator compared to just >> performing data upon the object containing the original the data. > > You shouldn't do that unless you are consciously designing an API that > must be able to work with in(de)finite sequences or other strange > things. The itertools library is an example of such an API because (by > intent) it must work for all iterators. > > Most APIs aren't as constrained and it's fine to require an iterable > instead of an iterator. The problem is that prior to Python 2.1, there weren't any iterators; moreover, my impression is that the majority of the Python community didn't really "get" iterators until fairly recently. I think part of the objection here is that while Py3K is intended to be a point where we can break backward compatibility, semantic breakage is going to be very hard to track down in this case. I'm not really sure where I stand. I *like* the idea of making d.keys() return an iterator, but my impression is that it's going to be one of the more painful changes. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "Look, it's your affair if you want to play with five people, but don't go calling it doubles." --John Cleese anticipates Usenet From adam.deprince at gmail.com Fri Mar 24 08:06:33 2006 From: adam.deprince at gmail.com (Adam DePrince) Date: Fri, 24 Mar 2006 02:06:33 -0500 Subject: [Python-3000] Best Practices essays In-Reply-To: <442353CD.1050101@comcast.net> References: <442353CD.1050101@comcast.net> Message-ID: <1143183993.3287.67.camel@localhost.localdomain> On Thu, 2006-03-23 at 21:05 -0500, Edward C. Jones wrote: > "Brett Cannon" wrote: > > > Right. I am really starting to think that having a group of Best > > Practices essays that discuss common Python idioms might be handy. 
> > Part tutorial, part advanced usage, they would provide a way for
> > people to have a place to go to find out expected usage of things
> > such as iterators without having to discover this kind of thing the
> > hard way. Could also help us see where possible improvements could
> > come in for Py3K if we write them from the perspective of 2.x, or
> > even how things improve if we write them for Py3K.
>
> That's a good idea. Include everything from m * [n * [0]] to pickling
> classes with __new__. For the latter, Google on "pickling a
> subclass of tuple".

Sort of like a cookbook? Martelli, Ravenscroft and Ascher beat you to
it.

Yes, while "There should be one -- and preferably only one -- obvious
way to do it", the truth is there are many ways of doing the same thing
- and each way will have its trade-offs and underlying philosophy that
will, from its author's perspective, make it the one way.

I don't want to discourage your efforts in creating this, but
self-discovery of the best way to do things via the "hard way" is
important. I always like to say that experience is simply a measure of
how much stuff you have broken over your career.

Now, as for your example m * [ n * [0]], I would exclude it from a best
practices document. If your goal is to create a two dimensional array
of numbers, it doesn't work. The first part, n* [0] is right, you are
creating a list of n zeros, and when you say l[x]=y you are replacing
that element. The second part, m *, is wrong. You are creating a list
of m references to the same list of n zeros. Look at the following:

>>> m = 5
>>> n = 10
>>> l = m * [n * [0]]
>>> l[3][5] = 1
>>> l
[[0, 0, 0, 0, 0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]]
>>>

I don't think that is what you had in mind. One "quick and dirty" way
of fixing that is:

l = map( list, m * [n * [0] ])

Now it works.

>>> l = map( list, m*[n*[0]] )
>>> l[3][5] = 1
>>> l
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]
>>>

But there are other ways.

>>> l = [n*[0] for _m in xrange( m )]
>>> l[3][5] = 1
>>> l
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]

[adam at localhost ~]$ python2.4 -mtimeit -s 'from numarray import array;n=50;m=100' 'a = map( list, m * [n * [0] ])'
10000 loops, best of 3: 86.6 usec per loop
[adam at localhost ~]$ python2.4 -mtimeit -s 'from numarray import array;n=50;m=100' 'a = [n*[0] for _m in xrange( m )]'
10000 loops, best of 3: 105 usec per loop

Which is best practice? Well, if you are doing real bona-fide
scientific work, best practice is to install numpy and say

from numarray import array
a = array( (0,)*(n*m), shape=(n,m))  # default type is 32 bit signed int
a-=a    # To zero the array, without initial
        # data, we get random junk.

[adam at localhost ~]$ python2.4 -mtimeit -s 'from numarray import array;n=50;m=100' 'a = array( shape=(n,m));a-=a'
10000 loops, best of 3: 25.2 usec per loop
[adam at localhost ~]$

Well, which is best practice? Numpy beats plain old lists by a factor
of 3, and all of your subsequent computation is likely to be faster too,
but it requires an addition to python.

Cheers - Adam DePrince

From greg.ewing at canterbury.ac.nz Fri Mar 24 08:41:16 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 24 Mar 2006 19:41:16 +1200
Subject: [Python-3000] C style guide
In-Reply-To:
References: <442203E5.7090009@gmail.com>
Message-ID: <4423A29C.3060006@canterbury.ac.nz>

Guido van Rossum wrote:
> That won't go away for me (Google's settings default to TWO-space
> indents :-( ) but I agree with the 4-space indent -- eventually.
If we standardised on all-tabs, people could set their editors to display indentation however they wanted, and there would be no need to argue about how many spaces should be dancing at the head of a code line. Greg From jcarlson at uci.edu Fri Mar 24 09:26:50 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 24 Mar 2006 00:26:50 -0800 Subject: [Python-3000] Best Practices essays In-Reply-To: <442353CD.1050101@comcast.net> References: <442353CD.1050101@comcast.net> Message-ID: <20060324002439.F7E0.JCARLSON@uci.edu> "Edward C. Jones" wrote: > That's a good idea. Include everything from m * [n * [0]] to pickling Except that it probably doesn't do what you intend... >>> a = 2*[3*[0]] >>> a [[0, 0, 0], [0, 0, 0]] >>> a[0][0] = 2 >>> a [[2, 0, 0], [2, 0, 0]] Unless you intend something like that... - Josiah From nico at tekNico.net Fri Mar 24 09:36:37 2006 From: nico at tekNico.net (Nicola Larosa) Date: Fri, 24 Mar 2006 09:36:37 +0100 Subject: [Python-3000] else-clause on for-loops In-Reply-To: References: Message-ID: <4423AF95.40701@tekNico.net> Ian Bicking: >>> When I was just first learning Python I thought this would work: >>> >>> for item in select_results: >>> ... >>> else: >>> ... stuff when there are no items ... >>> >>> But it doesn't work like that. Brett Cannon: >> I have to admit that is what I initially thought as well. I think it >> is because when I read 'else' I viewed it as an alternative if the >> clause it was attached to didn't happen (ala an 'if' statement). Steven Bethard: > Yeah, I use for-else occasionally, and I know how it works in Python, > but every time I want to special-case the empty iterable case, I still > have to remind myself that the else-clause doesn't do what I want it > to. The same for me. I sometimes may have had a need for the current semantics of the else after loops, but I don't remember it; on the other hand, I have had a use for the no-iteration case a number of times. 
Somehow I find it hard to stick into my mind that's not what it means. > There was talk previously_ about removing the else clause on for-loops > (and while-loops). One possibility would be to change the else-clause > to behave as expected above (i.e. only executed when the loop fails to > iterate over any items). I'd like that. Of course it would break compatibility with the past, and may cause subtle bugs; the advantages would surpass the drawbacks, in my case, since I rarely use the current semantics, if at all. -- Nicola Larosa - http://www.tekNico.net/ Life [...] is a tale told by an idiot, full of sound and fury, signifying nothing. -- William Shakespeare, MacBeth Life is like the chicken ladder: short and full of shit. -- Anonymous From greg.ewing at canterbury.ac.nz Fri Mar 24 10:43:49 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 24 Mar 2006 21:43:49 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> Message-ID: <4423BF55.7030608@canterbury.ac.nz> Guido van Rossum wrote: > Its maps have methods to > return keys, values and items, but these return neither new lists nor > iterators; they return "views" which obey set (or multiset, in the > case of items) semantics. > I'd like to explore this as an alternative to making keys() etc. > return iterators. This sounds like a really really good idea! It would solve Jim's problem, because the result of d.keys() would print out just like a real list, and then he could backspace over the .keys() and do something else. 
Greg From greg.ewing at canterbury.ac.nz Fri Mar 24 10:43:56 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 24 Mar 2006 21:43:56 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44232FD9.6050209@colorstudy.com> <442336C4.6050807@colorstudy.com> <442344A1.4070001@colorstudy.com> Message-ID: <4423BF5C.4020402@canterbury.ac.nz> Guido van Rossum wrote: > If you find the 'empty' flag pattern too ugly, you can always write a > helper class that takes an iterator and returns an object that > represents the same iterator, but sometimes buffers one element. Perhaps one of these could be included as an itertool? Greg From greg.ewing at canterbury.ac.nz Fri Mar 24 10:44:54 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 24 Mar 2006 21:44:54 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <442344A1.4070001@colorstudy.com> References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44232FD9.6050209@colorstudy.com> <442336C4.6050807@colorstudy.com> <442344A1.4070001@colorstudy.com> Message-ID: <4423BF96.4040608@canterbury.ac.nz> Ian Bicking wrote: > Iterators look a lot like containers. Actually, they hardly look like containers at all. About the only thing you can do with a container that you can also do with an iterator is use it in a for-loop. There are a great many other things you *can't* do with an iterator -- index it, slice it, take its len(), etc. In some ways it's rather misleading that you're allowed to say for x in iterator: because the items are not "in" the iterator, they're *produced* by the iterator on demand. 
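To make the difference concrete, here is a minimal sketch, assuming a current Python 3 interpreter (the thread itself predates it). The variable names and the helper `raises_type_error` are illustrative only.

```python
container = [10, 20, 30]
iterator = iter(container)

# A container supports len() and can be traversed as often as you like.
container_len = len(container)
first_pass = list(container)
second_pass = list(container)


def raises_type_error(operation):
    """Return True if calling operation() raises TypeError."""
    try:
        operation()
    except TypeError:
        return True
    return False


# A plain iterator supports neither len() nor indexing...
no_len = raises_type_error(lambda: len(iterator))
no_index = raises_type_error(lambda: iterator[0])

# ...and its items are produced once, on demand, then it is exhausted.
drained = list(iterator)
drained_again = list(iterator)
```

The second call to `list(iterator)` comes back empty, which is the practical consequence of the items not being "in" the iterator.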
I've speculated that perhaps it should be illegal to use an iterator that way, and you would instead have to say for x from iterator: This would force people to keep the distinction firmly in mind and might lead to less confusion. Greg From greg.ewing at canterbury.ac.nz Fri Mar 24 10:46:18 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 24 Mar 2006 21:46:18 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> Message-ID: <4423BFEA.1040109@canterbury.ac.nz> Brett Cannon wrote: > Someone else wrote: > > When I was just > > first learning Python I thought this would work: > > > > for item in select_results: > > ... > > else: > > ... stuff when there are no items ... > > > > But it doesn't work like that. I have to agree that's actually a more intuitive use of "else" in relation to a for-loop. It's a pity that some other word wasn't chosen that would have left "else" free for this purpose. Could something perhaps be done about this in Py3k? Blatantly changing the meaning of "else" here might be going too far, but maybe some other construct could be found that expresses the same intent. Greg From greg.ewing at canterbury.ac.nz Fri Mar 24 10:49:01 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 24 Mar 2006 21:49:01 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <17443.3919.235023.510956@montanaro.dyndns.org> Message-ID: <4423C08D.1000704@canterbury.ac.nz> Alex Martelli wrote: > Not sure what's a "small dict" in your world -- here, for example: He's obviously talking about quantum dicts which (averaged over a sufficiently large ensemble) contain a fractional number of items. 
Greg From greg.ewing at canterbury.ac.nz Fri Mar 24 10:49:19 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 24 Mar 2006 21:49:19 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <44233943.50201@colorstudy.com> References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44232FD9.6050209@colorstudy.com> <442336C4.6050807@colorstudy.com> <44233943.50201@colorstudy.com> Message-ID: <4423C09F.8020708@canterbury.ac.nz> Ian Bicking wrote: > A downside is mistranslations of old code will lead to very hard bugs, > as .keys() currently makes a copy. Though if the new objects aren't > quite the same as lists -- e.g., implementing .add() instead of > .append() -- then maybe that won't be so bad. I'd suggest that the view objects be immutable. Then code which was expecting an independent object would fail rather than accidentally messing up the underlying object. If you need a mutable copy, you can always wrap list() around it, or something else such as sorted(). Greg From greg.ewing at canterbury.ac.nz Fri Mar 24 10:49:24 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 24 Mar 2006 21:49:24 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <17443.3919.235023.510956@montanaro.dyndns.org> References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <17443.3919.235023.510956@montanaro.dyndns.org> Message-ID: <4423C0A4.50703@canterbury.ac.nz> skip at pobox.com wrote: > Still, around work I see a great preference for the longer > (and uglier IMO) spelling. Maybe it's a mental carryover from C++ that > makes people want that version? I think it's a natural tendency, and one that isn't necessarily wrong to have.
Although the iterator version won't necessarily be quite as fast as it could be in some situations, you at least know that it won't perform spectacularly badly in any situation. Think of it as a form of defensive coding. Greg From greg.ewing at canterbury.ac.nz Fri Mar 24 11:03:58 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 24 Mar 2006 22:03:58 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <4423665C.9080807@colorstudy.com> References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44234994.1020508@colorstudy.com> <4423665C.9080807@colorstudy.com> Message-ID: <4423C40E.9070808@canterbury.ac.nz> Ian Bicking wrote: > If you could somehow count how many times > the loop had run, that'd work great; but I don't see any way to do that > without new syntax. line = None for line in some_lines: ... if line is None: # we didn't get any lines Speculating on a syntax to make this one line shorter: for line in some_lines else None: ... if line is None: # we didn't get any lines Or using a different keyword: for line in some_lines: ... except: # we didn't get any lines Greg From greg.ewing at canterbury.ac.nz Fri Mar 24 11:13:04 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 24 Mar 2006 22:13:04 +1200 Subject: [Python-3000] Best Practices essays In-Reply-To: <1143183993.3287.67.camel@localhost.localdomain> References: <442353CD.1050101@comcast.net> <1143183993.3287.67.camel@localhost.localdomain> Message-ID: <4423C630.1010306@canterbury.ac.nz> Adam DePrince wrote: > Now, as for your example m * [ n * [0]], I would exclude it from a best > practices document. I'm assuming he meant the best-practices document would be documenting how to do that *right*! 
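As a rough sketch of what "right" could look like in such a document (written for Python 3, which is an assumption; `m`, `n`, `shared` and `independent` are illustrative names):

```python
m, n = 3, 4

# Repeating the outer list copies references, so every row is the same object.
shared = m * [n * [0]]
shared[0][0] = 1
rows_are_aliased = all(row is shared[0] for row in shared)

# Building each row in its own expression yields independent rows.
independent = [n * [0] for _ in range(m)]
independent[0][0] = 1
rows_are_distinct = independent[0] is not independent[1]
```

The identity checks with `is` show the cause directly, rather than only the symptom of one assignment appearing in every row.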
BTW, something I had in mind for the list comprehension syntax back when it was being developed, but didn't get around to pursuing, was letting you say [0 times n] or for multidimensional arrays [[0 times m] times n] etc. Not sure if the use cases are frequent enough to justify it, though. Greg From greg.ewing at canterbury.ac.nz Fri Mar 24 11:15:48 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 24 Mar 2006 22:15:48 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5FF1E619@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5FF1E619@au3010avexu1.global.avaya.com> Message-ID: <4423C6D4.8080305@canterbury.ac.nz> Delaney, Timothy (Tim) wrote: > This whole discussion suggests to me that what would be best is if we > defined an actual "view" protocol, and various builtins return views, > rather than either copies or iterators. I'm not sure that any formal protocol is needed. Each container will be providing its own set of methods for producing views, and each of the views will behave in ways specific to the type that it's viewing. I can't see there being a place for anything like the __iter__ slot for views. Each case will be unique. > A view provides the same access methods, etc as the object it is backed > by. They can't be *exactly* the same, or there would be no point in having a view in the first place. > The aim of a view is to be lightweight. Agreed. > A view should not allow modification of the underlying object I think I agree with that, at least in the cases where we're changing an existing method (e.g. dict.keys()) that used to return an independent object. Then code which is expecting the old semantics will fail fairly obviously. 
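For comparison, a short sketch of how this reads in a Python 3 interpreter, where `dict.keys()` returns a view object. That is an assumption about the reader's interpreter; at the time of this thread nothing had been decided. The variable names are illustrative.

```python
d = {"a": 1, "b": 2}
keys = d.keys()              # a view onto d, not an independent list

d["c"] = 3                   # the view reflects later changes to the dict
live = sorted(keys)

# The view has no list-mutating methods, so code that expects the old
# independent list fails loudly instead of silently altering the dict.
has_append = hasattr(keys, "append")

# Set-like operations are available on the keys view.
common = keys & {"a", "z"}

# An independent, mutable copy is still one call away.
snapshot = list(keys)
snapshot.append("d")
dict_untouched = "d" not in d
```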
Greg From fredrik at pythonware.com Fri Mar 24 10:28:16 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 24 Mar 2006 10:28:16 +0100 Subject: [Python-3000] else-clause on for-loops References: Message-ID: Steven Bethard wrote: > There was talk previously_ about removing the else clause on for-loops > (and while-loops). One possibility would be to change the else-clause > to behave as expected above (i.e. only executed when the loop fails to > iterate over any items). I'm well aware that Python 3000 doesn't aim to be backwards compatible, but I would prefer if we could refrain from subtly changing the behaviour of existing constructs in way that breaks all existing use, and makes it un- necessarily hard to convert old code (whether by hand or by machine). (fwiw, the else statement in Python *always* means the same thing: exe- cute this when the controlling condition has been tested and found false) From nico at tekNico.net Fri Mar 24 11:50:25 2006 From: nico at tekNico.net (Nicola Larosa) Date: Fri, 24 Mar 2006 11:50:25 +0100 Subject: [Python-3000] else-clause on for-loops In-Reply-To: References: Message-ID: <4423CEF1.8050109@tekNico.net> Steven Bethard wrote: >> There was talk previously_ about removing the else clause on for-loops >> (and while-loops). One possibility would be to change the else-clause >> to behave as expected above (i.e. only executed when the loop fails to >> iterate over any items). Fredrik Lundh: > I'm well aware that Python 3000 doesn't aim to be backwards compatible, > but I would prefer if we could refrain from subtly changing the behaviour > of existing constructs in way that breaks all existing use, and makes it un- > necessarily hard to convert old code (whether by hand or by machine). That's right. Unfortunately this case does not present a clear balance of advantages and disadvantages. 
> (fwiw, the else statement in Python *always* means the same thing: exe- > cute this when the controlling condition has been tested and found false) Precisely. And that's why the current behavior is counterintuitive: "no loops executed" is a better "false controlling condition" than "a few loops executed, but not all", as is the case when using "break". Even if it would be suboptimal, for backward compatibility's sake we could keep the current "else" semantics, and adopt the "except" keyword for the "no loops executed" case, as suggested by Greg Ewing. Otherwise, I'd remove the current semantics altogether: they're confusing, and apparently not useful enough. -- Nicola Larosa - http://www.tekNico.net/ Life [...] is a tale told by an idiot, full of sound and fury, signifying nothing. -- William Shakespeare, MacBeth Life is like the chicken ladder: short and full of shit. -- Anonymous From ncoghlan at gmail.com Fri Mar 24 11:59:50 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Mar 2006 20:59:50 +1000 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44232FD9.6050209@colorstudy.com> Message-ID: <4423D126.8060208@gmail.com> Guido van Rossum wrote: > On 3/23/06, Brett Cannon wrote: >> But I think if objects returned iterators instead of lists the >> iterator-as-view will begin to be used more than viewing them as >> iterator-has-own-data. But this also means that making a more >> view-like interface would be handy. In terms of what would need to be >> supported (len, deletion, etc.) I don't know. I personally have not >> had that much of a need since I then just pass the originating object >> and get the iterator as needed instead of passing around the iterator. 
> > I'm dead set against giving iterators more view-like properties; it > would rule out generators and other potentially infinite sequences. > > But I'm all for a different concept, views in the sense of Java's > collection framework. Please study it; it's worth it: > > http://java.sun.com/j2se/1.4.2/docs/api/java/util/Collection.html A very interesting read indeed (Their explanation in the Design FAQ regarding the use of 'UnsupportedOperationException' sounds downright Pythonic. . .) Some more specific reading material can be found by looking at List.subList [1] along with Map.keySet, Map.entrySet and Map.values [2]. An interesting point is that making the views immutable doesn't really make life any easier, because you need to define what happens to the views if someone mutates the *original*. And once you do that, then that means you should be able to define what happens to the original if someone mutates the view. I would be a big fan of adding views as a core concept - it would also provide a nice bridge to the way array slicing works in numpy (you get a mutable view rather than a copy). Regards, Nick. 
[1] http://java.sun.com/j2se/1.4.2/docs/api/java/util/List.html#subList(int,%20int) [2] http://java.sun.com/j2se/1.4.2/docs/api/java/util/Map.html#keySet() -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Fri Mar 24 12:12:55 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Mar 2006 21:12:55 +1000 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> Message-ID: <4423D437.6080805@gmail.com> Brett Cannon wrote: > On 3/23/06, Guido van Rossum wrote: >> (Off-topic: maybe we can drop the fall-back behavior >> of iter() if __iter__ isn't found?) >> > > I say yes. Iterators will be common enough that objects that want the > support should just directly support it. Hmm, I'd expect the typical generator used for this to be a fair bit slower than the current custom sequence iterator: def __iter__(self): for i in range(len(self)): yield i OTOH, it would make sense if the fallback could instead be written: __iter__ = itertools.iterseq (where 'iterseq' gives a real name to the currently hidden default sequence iterator) Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Fri Mar 24 12:23:10 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Mar 2006 21:23:10 +1000 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44234994.1020508@colorstudy.com> <4423665C.9080807@colorstudy.com> Message-ID: <4423D69E.9090308@gmail.com> Guido van Rossum wrote: > On 3/23/06, Ian Bicking wrote: >> Or a view would work just as well, I suppose. Maybe you've just made >> the argument that it should return an iterable, not an iterator. > > Technically an iterator is an iterable, so requiring it to return an > iterable doesn't solve your problem. Requiring a sequence or a > collection which may be a view instead of a copy *does* solve it, so I > propose to go for that. This would also solve the redundancy of having > iter(d) and iter(d.keys()) return the same thing -- d.keys() would > return a set (not multiset!) view which has other uses than either d > or iter(d). Well, d.keys and d.items would be set views, while d.values would be a multiset view. For sequence views (the equivalent of Java's List.subList), maybe it would make sense to have a separate 'seqview' type that gives a view on an arbitrary existing sequence. First argument would be the sequence itself, while the second argument would be an optional slice that limited the visible sequence elements. Something like: >>> data = range(10) >>> view_all = seqview(data) >>> view_end = view_all[5:10] >>> view_end[0] = 10 >>> data[5] 10 Cheers, Nick. 
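One way such a `seqview` might be sketched, purely as an illustration: no such type exists in the standard library, and the class below, including the choice to store its window as a `range` of indices, is an assumption rather than anything specified in this thread. It is written for Python 3 and handles only integer item assignment.

```python
class seqview:
    """Hypothetical mutable view onto part of an existing sequence."""

    def __init__(self, seq, window=None):
        self._seq = seq
        indices = range(len(seq))
        # The window is fixed at creation; later resizing of seq is not tracked.
        self._range = indices if window is None else indices[window]

    def __len__(self):
        return len(self._range)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing a view gives another view onto the same sequence.
            sub = seqview(self._seq)
            sub._range = self._range[index]
            return sub
        return self._seq[self._range[index]]

    def __setitem__(self, index, value):
        # Integer indices only; writes go straight to the underlying sequence.
        self._seq[self._range[index]] = value


data = list(range(10))
view_all = seqview(data)
view_end = view_all[5:10]
view_end[0] = 10
```

After the assignment through `view_end`, `data[5]` holds the new value, matching the session shown above.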
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Fri Mar 24 13:12:54 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Mar 2006 22:12:54 +1000 Subject: [Python-3000] else-clause on for-loops In-Reply-To: <4423CEF1.8050109@tekNico.net> References: <4423CEF1.8050109@tekNico.net> Message-ID: <4423E246.7010903@gmail.com> Nicola Larosa wrote: > Fredrik Lundh: >> (fwiw, the else statement in Python *always* means the same thing: exe- >> cute this when the controlling condition has been tested and found false) > > Precisely. And that's why the current behavior is counterintuitive: "no > loops executed" is a better "false controlling condition" than "a few loops > executed, but not all", as is the case when using "break". It actually maps better than one might think - the only thing that's not necessarily obvious is that the condition being tested in both the while loop and for loop cases is "do I want to run the loop body?". If that's False, we execute the else clause and get out of there, just like a normal if statement. The reason 'break' is special is because it kills the loop without the loop condition ever becoming false - so the else clause gets skipped as a result. Making the criteria "no loops were executed" means that the loop condition now has to be tested in two separate places, because the first iteration has to be special cased. Defining those semantics is definitely possible, but it really isn't very nice. 
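The rule described above can be checked with a small sketch (Python 3 assumed; `search` is a hypothetical helper written for this illustration):

```python
def search(items, target):
    """Report how the loop over items ended."""
    for item in items:
        if item == target:
            outcome = "found"
            break  # leaves the loop without the loop condition becoming false
    else:
        # Runs when the loop ran out of items, which includes an empty iterable.
        outcome = "exhausted"
    return outcome


hit = search([1, 2, 3], 2)
miss = search([1, 2, 3], 9)
empty = search([], 9)
```

Both the unsuccessful search and the empty input reach the else clause, so the current semantics cannot tell "no items" apart from "no match".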
OTOH, there's a fairly easy alternative for handling arbitrary iterables: from itertools import chain def checked_iter(iterable): "Returns None if the iterable is empty, equivalent iterator otherwise" itr = iter(iterable) try: item = itr.next() except StopIteration: return None return chain((item,), itr) Used like: my_itr = checked_iter(iterable) if my_itr is not None: # Use it else: # It was empty Is it worth sending this to Raymond as an itertools candidate? It's the cleanest way I know of to convert code that relies on "bool(container)" to handle arbitrary iterators instead, and judging from responses here, it's only obvious if you've drunk enough of the iterator Kool-Aid ;) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From barry at python.org Fri Mar 24 14:09:20 2006 From: barry at python.org (Barry Warsaw) Date: Fri, 24 Mar 2006 08:09:20 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> Message-ID: <1143205760.2317.20.camel@geddy.wooz.org> On Thu, 2006-03-23 at 17:06 -0800, Guido van Rossum wrote: > The pattern with the 'empty' flag is only needed when due to API > constraints you have only got an iterator. Which can happen quite often actually. Perhaps making the original object available as an attribute of the iterator can help in those situations though. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
From barry at python.org Fri Mar 24 14:46:21 2006 From: barry at python.org (Barry Warsaw) Date: Fri, 24 Mar 2006 08:46:21 -0500 Subject: [Python-3000] C style guide In-Reply-To: <4423A29C.3060006@canterbury.ac.nz> References: <442203E5.7090009@gmail.com> <4423A29C.3060006@canterbury.ac.nz> Message-ID: <1143207981.2310.22.camel@geddy.wooz.org> On Fri, 2006-03-24 at 19:41 +1200, Greg Ewing wrote: > Guido van Rossum wrote: > > > That won't go away for me (Google's settings default to TWO-space > > indents :-( ) but I agree with the 4-space indent -- eventually. > > If we standardised on all-tabs, people could set their > editors to display indentation however they wanted, and > there would be no need to argue about how many spaces > should be dancing at the head of a code line. 4 space tabs are evil, as are all-tab styles. Everyone knows that a tab is 8 spaces. :) -Barry -------------- next part -------------- A non-text attachment was scrubbed...
From michael.walter at gmail.com Fri Mar 24 15:05:54 2006 From: michael.walter at gmail.com (Michael Walter) Date: Fri, 24 Mar 2006 15:05:54 +0100 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1143205760.2317.20.camel@geddy.wooz.org> References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <1143205760.2317.20.camel@geddy.wooz.org> Message-ID: <877e9a170603240605v543ae2dcgc660b52d69d1dcaf@mail.gmail.com> On 3/24/06, Barry Warsaw wrote: > On Thu, 2006-03-23 at 17:06 -0800, Guido van Rossum wrote: > > > The pattern with the 'empty' flag is only needed when due to API > > constraints you have only got an iterator. > > Which can happen quite often actually. Perhaps making the original > object available as an attribute of the iterator can help in those > situations though. This potentially gives a different result (the iterator could be partially consumed). If you are interested in all items anyway you can just pass the iterable object.
Michael > > -Barry From skip at pobox.com Fri Mar 24 15:41:37 2006 From: skip at pobox.com (skip at pobox.com) Date: Fri, 24 Mar 2006 08:41:37 -0600 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <17443.3919.235023.510956@montanaro.dyndns.org> Message-ID: <17444.1313.614763.170662@montanaro.dyndns.org> Alex> Not sure what's a "small dict" in your world -- here, for example: I would say "small" for us is typically fewer than ten keys. We rarely have dictionaries with dozens or hundreds of keys. Alex> python2.4 -mtimeit -s'd=dict.fromkeys(range(23))' 'for k, v in d.iteritems(): pass' Alex> 100000 loops, best of 3: 2.81 usec per loop Alex> python2.4 -mtimeit -s'd=dict.fromkeys(range(23))' 'for k, v in d.items(): pass' Alex> 100000 loops, best of 3: 4.82 usec per loop We're still on 2.3 at work. (All the people who have begun using iteritems() have never used 2.4.) Running your tests w/ 2.3 I get: ink:% timeit -s'd=dict.fromkeys(range(23))' 'for k, v in d.items(): pass' 100000 loops, best of 3: 7.16 usec per loop ink:% type python python is hashed (/opt/lang/bin/python) ink:% timeit -s'd=dict.fromkeys(range(23))' 'for k, v in d.iteritems(): pass' 100000 loops, best of 3: 6.83 usec per loop Hardly seems worth the effort to type the extra four letters.
In fact, just iterating over the dict's keys and assigning to v (my personally preferred idiom) is faster: ink:% timeit -s'd=dict.fromkeys(range(23))' 'for k in d: v=d[k]' 100000 loops, best of 3: 5.03 usec per loop Sticking the 2.4 directory in front of PATH, I get: ink:% timeit -s'd=dict.fromkeys(range(23))' 'for k, v in d.items(): pass' 100000 loops, best of 3: 6.49 usec per loop ink:% timeit -s'd=dict.fromkeys(range(23))' 'for k, v in d.iteritems(): pass' 100000 loops, best of 3: 4.16 usec per loop ink:% timeit -s'd=dict.fromkeys(range(23))' 'for k in d: v = d[k]' 100000 loops, best of 3: 4.58 usec per loop Not as dramatic an improvement as you saw, but yes, I'm surprised that iteritems() is faster than items(). I stand corrected. Thx, Skip From guido at python.org Fri Mar 24 16:13:55 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 24 Mar 2006 07:13:55 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <4423BFEA.1040109@canterbury.ac.nz> References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <4423BFEA.1040109@canterbury.ac.nz> Message-ID: On 3/24/06, Greg Ewing wrote: > > Someone else wrote: > > > When I was just > > > first learning Python I thought this would work: > > > > > > for item in select_results: > > > ... > > > else: > > > ... stuff when there are no items ... > > > > > > But it doesn't work like that. > > I have to agree that's actually a more intuitive use > of "else" in relation to a for-loop. It's a pity that > some other word wasn't chosen that would have left > "else" free for this purpose. > > Could something perhaps be done about this in Py3k? > Blatantly changing the meaning of "else" here might > be going too far, but maybe some other construct could > be found that expresses the same intent. 
I suspect that this isn't good enough for the crowd who want to special-case "empty", and in fact the "empty flag" pattern won't work for them as well. They typically want to do this: results = query() if not results: print "

<h1>No results</h1>" else: print "<h1>Results</h1>" print "<table>" for value in results: print "<tr><td>%s</td></tr>" % value print "</table>
" IOW they want to do something *before* entering the for-loop only if it's not empty. In this type of use case, casting to list() is totally fine -- if the list contains more than a few dozen items they'd want to insert pagination code anyway... -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Mar 24 16:22:44 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 24 Mar 2006 07:22:44 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <4423D437.6080805@gmail.com> References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <4423D437.6080805@gmail.com> Message-ID: On 3/24/06, Nick Coghlan wrote: > Brett Cannon wrote: > > On 3/23/06, Guido van Rossum wrote: > >> (Off-topic: maybe we can drop the fall-back behavior > >> of iter() if __iter__ isn't found?) > > > > I say yes. Iterators will be common enough that objects that want the > > support should just directly support it. > > Hmm, I'd expect the typical generator used for this to be a fair bit slower > than the current custom sequence iterator: But you wouldn't do that. You'd just rebuke the author of the uniterable sequence type for not getting with the program after 7 years. Some folks (not me) would like to make this a feature and remove the __iter__ method on strings... 
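The fall-back behaviour under discussion can be seen in a minimal sketch. It assumes a Python 3 interpreter, where the fall-back is still present; `OldStyleSequence` is a hypothetical class made up for the illustration.

```python
class OldStyleSequence:
    """Defines only __getitem__, with no __iter__ method."""

    def __init__(self, data):
        self._data = list(data)

    def __getitem__(self, index):
        # IndexError past the end is what stops the fall-back iteration.
        return self._data[index]


seq = OldStyleSequence("abc")
has_iter = hasattr(seq, "__iter__")

# iter() falls back to calling __getitem__ with 0, 1, 2, ... until IndexError.
items = list(iter(seq))
```

Dropping the fall-back would make `iter(seq)` fail for a class like this until its author added an `__iter__` method.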
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Mar 24 16:26:39 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 24 Mar 2006 07:26:39 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1143205760.2317.20.camel@geddy.wooz.org> References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <1143205760.2317.20.camel@geddy.wooz.org> Message-ID: On 3/24/06, Barry Warsaw wrote: > On Thu, 2006-03-23 at 17:06 -0800, Guido van Rossum wrote: > > > The pattern with the 'empty' flag is only needed when due to API > > constraints you have only got an iterator. > > Which can happen quite often actually. Perhaps making the original > object available as an attribute of the iterator can help in those > situations though. It can't work, at least not in general. How do you do this if the iterator is a generator? Or an infinite sequence? Or a filter? It can't be made part of the iterator protocol. You can design your own extension of the iterator protocol, but then it wouldn't accept arbitrary iterators any more. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at discworld.dyndns.org Fri Mar 24 16:12:31 2006 From: python at discworld.dyndns.org (Charles Cazabon) Date: Fri, 24 Mar 2006 09:12:31 -0600 Subject: [Python-3000] C style guide In-Reply-To: <4423A29C.3060006@canterbury.ac.nz> References: <442203E5.7090009@gmail.com> <4423A29C.3060006@canterbury.ac.nz> Message-ID: <20060324151231.GB22474@discworld.dyndns.org> Greg Ewing wrote: > Guido van Rossum wrote: > > > That won't go away for me (Google's settings default to TWO-space indents > > :-( ) but I agree with the 4-space indent -- eventually. 
> > If we standardised on all-tabs, people could set their editors to display > indentation however they wanted, and there would be no need to argue about > how many spaces should be dancing at the head of a code line. Well, except for the fact that people who like 2-space indents and people who like 8-space indents would then start to argue vociferously about how many levels of indentation are acceptable given an 80-character-wide source file. Charles -- ----------------------------------------------------------------------- Charles Cazabon GPL'ed software available at: http://pyropus.ca/software/ ----------------------------------------------------------------------- From aahz at pythoncraft.com Fri Mar 24 16:33:27 2006 From: aahz at pythoncraft.com (Aahz) Date: Fri, 24 Mar 2006 07:33:27 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <4423BF96.4040608@canterbury.ac.nz> References: <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44232FD9.6050209@colorstudy.com> <442336C4.6050807@colorstudy.com> <442344A1.4070001@colorstudy.com> <4423BF96.4040608@canterbury.ac.nz> Message-ID: <20060324153327.GC425@panix.com> On Fri, Mar 24, 2006, Greg Ewing wrote: > > I've speculated that perhaps it should be illegal to > use an iterator that way, and you would instead have > to say > > for x from iterator: > > This would force people to keep the distinction firmly > in mind and might lead to less confusion. How would you distinguish? What about objects that are their own iterator (such as files)? -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "Look, it's your affair if you want to play with five people, but don't go calling it doubles." 
--John Cleese anticipates Usenet From barry at python.org Fri Mar 24 16:36:41 2006 From: barry at python.org (Barry Warsaw) Date: Fri, 24 Mar 2006 10:36:41 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86EACCB4@ex9.hostedexchange.local> <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <1143205760.2317.20.camel@geddy.wooz.org> Message-ID: <1143214601.10792.15.camel@resist.wooz.org> On Fri, 2006-03-24 at 07:26 -0800, Guido van Rossum wrote: > On 3/24/06, Barry Warsaw wrote: > > On Thu, 2006-03-23 at 17:06 -0800, Guido van Rossum wrote: > > > > > The pattern with the 'empty' flag is only needed when due to API > > > constraints you have only got an iterator. > > > > Which can happen quite often actually. Perhaps making the original > > object available as an attribute of the iterator can help in those > > situations though. > > It can't work, at least not in general. How do you do this if the > iterator is a generator? Or an infinite sequence? Or a filter? It > can't be made part of the iterator protocol. You can design your own > extension of the iterator protocol, but then it wouldn't accept > arbitrary iterators any more. Yes, absolutely true. I wasn't really proposing a change to the generic iterator protocol, just suggesting something "one" could do if "one" needed that functionality (although an agreed upon convention would make it somewhat more general). -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 309 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-3000/attachments/20060324/0e20f6e5/attachment.pgp From adam.deprince at gmail.com Fri Mar 24 17:03:16 2006 From: adam.deprince at gmail.com (Adam DePrince) Date: Fri, 24 Mar 2006 11:03:16 -0500 Subject: [Python-3000] Best Practices essays In-Reply-To: <4423C630.1010306@canterbury.ac.nz> References: <442353CD.1050101@comcast.net> <1143183993.3287.67.camel@localhost.localdomain> <4423C630.1010306@canterbury.ac.nz> Message-ID: <1143216197.3190.74.camel@localhost.localdomain> On Fri, 2006-03-24 at 22:13 +1200, Greg Ewing wrote: > Adam DePrince wrote: > > > Now, as for your example m * [ n * [0]], I would exclude it from a best > > practices document. > > I'm assuming he meant the best-practices document would be > documenting how to do that *right*! > > BTW, something I had in mind for the list comprehension > syntax back when it was being developed, but didn't > get around to pursuing, was letting you say > > [0 times n] > > or for multidimensional arrays > > [[0 times m] times n] > > etc. Not sure if the use cases are frequent enough to > justify it, though. I think that when serious users start nesting rectangles into lists of lists, they just reach for the right tools instead. And in my own experience, d=dict;d[(n,m)] makes a fine two dimensional array./ I'm curious, however, what do you envision the semantics of [x times i] being? [x,]*i - or - [copy.copy( x ) for _i in xrange( i )] Part of me likes it, but IMHO this feature shines the best at those times that you shouldn't be using it. > I'm assuming he meant the best-practices document would be > documenting how to do that *right*! But m * [n * [0 ]] already does m * [n * [ 0 ]] alright, if that's what you really want, right? I propose a reference that has on the left how to do it wrong, and on the right what you really meant. 
Perhaps even an adjunct to help() called iMeant() Forgive, I can't ... imagine the aggregate trauma to the careers of new Python users as they searched one person's idea of a common error, superimposing their own misunderstandings, arriving at a ... no ... please, think of the children. First of all, the point of my earlier post is there are different ways of doing the same thing. Each is right for reasons removed from the exact task at hand; right, or "best" cannot be handed down from above. There are many factors used in picking a way of expressing something. In general, I've grown cynical and opposed to the term "Best Practices." Too many gilded handcuffs, too many egos wrapped in the faux authenticity and subtle terror of the words "That's not best practices." My own experiences have taught me that "best practices" is a synonym for "my kids need braces, I'm going for a promotion!" You know, you can destroy a "Wall Street" programmer's career by pronouncing that phrase just right during the middle of a meeting. The placement doesn't even have to make sense semantically, as long as it's pronounced correctly. Programmer: and that concludes our database migration strategy; now, what do you propose we do for lunch? Boss: Yeah, that's a good idea. Heckler: Mr Programmer, your latest proposal, lunch, has components that are inconsistent with best practices. Boss: Security: This way sir ... Some of our brethren still follow best practices that require, using papyrus, code to be punched as follows:

    a = 1
    b = 2
    c = myfunc( a, b )

Yes, it matches the "best practices" of FORTRAN compilers in years past that couldn't quite grasp how to pass a constant by reference. In all fairness, their best practices generally keep up with the times, the punch cards are paper now and few programmers still use their teeth. 
I like the idea, but there is very little space between the mistakes that are a necessary part of the user's learning experience, and the already existing Python Cookbook. What we could do, and I'd enjoy this, is to take a slightly different tack. Index by what users intended to do, show what they tried to write and explain why it is wrong. Now that would be fun, as long as we don't call it best practices. The hubris of that term gives me chills. Cheers - Adam DePrince From edcjones at comcast.net Fri Mar 24 17:12:55 2006 From: edcjones at comcast.net (Edward C. Jones) Date: Fri, 24 Mar 2006 11:12:55 -0500 Subject: [Python-3000] Best Practices essays In-Reply-To: <1143183993.3287.67.camel@localhost.localdomain> References: <442353CD.1050101@comcast.net> <1143183993.3287.67.camel@localhost.localdomain> Message-ID: <44241A87.1040603@comcast.net> Adam DePrince wrote: > Now, as for your example m * [ n * [0]], I would exclude it from a best > practices document. If your goal is to create a two dimensional array > of numbers, it doesn't work. The first part, n* [0] is right, you are > creating a list of n zeros, and when you say l[x]=y you are replacing > that element. > > The second part, m *, is wrong. You are creating a list of m references > to the same list of n zeros. I know it's wrong. It was a mistake I made several times when I was learning Python. The m * [ n * [0]] problem is a dark corner of Python that exists because Python variables are really pointers. Unfortunately, this dark corner is visible to newbies. Which is why it needs to be mentioned. From tim.peters at gmail.com Fri Mar 24 17:38:00 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 24 Mar 2006 11:38:00 -0500 Subject: [Python-3000] else-clause on for-loops In-Reply-To: <4423AF95.40701@tekNico.net> References: <4423AF95.40701@tekNico.net> Message-ID: <1f7befae0603240838g26c52495xef5857838b1fdf98@mail.gmail.com> [Nicola Larosa] > ... 
I sometimes may have had a need for the current semantics > of the else after loops, but I don't remember it; on the other hand, I have > had a use for the no-iteration case a number of times. Somehow I find it > hard to stick into my mind that's not what it means. The primary use case is "search loops".

    for item in sequence:
        if desirable(item):
            break
    else:
        no desirable item exists

Just remember "search loop", and you'll never be surprised again. Now go back and recode your search loops in this simpler way ;-) From bioinformed at gmail.com Fri Mar 24 17:49:40 2006 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Fri, 24 Mar 2006 11:49:40 -0500 Subject: [Python-3000] C style guide In-Reply-To: <20060324151231.GB22474@discworld.dyndns.org> References: <442203E5.7090009@gmail.com> <4423A29C.3060006@canterbury.ac.nz> <20060324151231.GB22474@discworld.dyndns.org> Message-ID: <2e1434c10603240849h27b7bd52q408c6d93b6b439c2@mail.gmail.com> On 3/24/06, Charles Cazabon wrote: > > like 8-space indents would then start to argue vociferously about how many > levels of indentation are acceptable given an 80-character-wide source > file. Don't forget those of us who are now pushing for 120 character wide source files! -Kevin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-3000/attachments/20060324/1ddfab14/attachment.htm From barry at python.org Fri Mar 24 18:04:34 2006 From: barry at python.org (Barry Warsaw) Date: Fri, 24 Mar 2006 12:04:34 -0500 Subject: [Python-3000] C style guide In-Reply-To: <2e1434c10603240849h27b7bd52q408c6d93b6b439c2@mail.gmail.com> References: <442203E5.7090009@gmail.com> <4423A29C.3060006@canterbury.ac.nz> <20060324151231.GB22474@discworld.dyndns.org> <2e1434c10603240849h27b7bd52q408c6d93b6b439c2@mail.gmail.com> Message-ID: <1143219874.10793.18.camel@resist.wooz.org> On Fri, 2006-03-24 at 11:49 -0500, Kevin Jacobs wrote: > On 3/24/06, Charles Cazabon wrote: > like 8-space indents would then start to argue vociferously > about how many > levels of indentation are acceptable given an > 80-character-wide source file. > > > Don't forget those of us who are now pushing for 120 character wide > source files! Please gawd no! -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 309 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-3000/attachments/20060324/e55c4aca/attachment-0001.pgp From fdrake at acm.org Fri Mar 24 18:11:03 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 24 Mar 2006 12:11:03 -0500 Subject: [Python-3000] C style guide In-Reply-To: <2e1434c10603240849h27b7bd52q408c6d93b6b439c2@mail.gmail.com> References: <20060324151231.GB22474@discworld.dyndns.org> <2e1434c10603240849h27b7bd52q408c6d93b6b439c2@mail.gmail.com> Message-ID: <200603241211.03500.fdrake@acm.org> On Friday 24 March 2006 11:49, Kevin Jacobs wrote: > Don't forget those of us who are now pushing for 120 character wide source > files! That would be bad. Do you realize just how small fonts would have to get to let us still have as many editor windows on-screen? I don't think my old eyes could handle it! -Fred -- Fred L. Drake, Jr. 
From adam.deprince at gmail.com Fri Mar 24 18:31:20 2006 From: adam.deprince at gmail.com (Adam DePrince) Date: Fri, 24 Mar 2006 12:31:20 -0500 Subject: [Python-3000] C style guide In-Reply-To: <200603241211.03500.fdrake@acm.org> References: <20060324151231.GB22474@discworld.dyndns.org> <2e1434c10603240849h27b7bd52q408c6d93b6b439c2@mail.gmail.com> <200603241211.03500.fdrake@acm.org> Message-ID: <1143221481.3163.14.camel@localhost.localdomain> On Fri, 2006-03-24 at 12:11 -0500, Fred L. Drake, Jr. wrote: > On Friday 24 March 2006 11:49, Kevin Jacobs wrote: > > Don't forget those of us who are now pushing for 120 character wide source > > files! > > That would be bad. Do you realize just how small fonts would have to get to > let us still have as many editor windows on-screen? I don't think my old > eyes could handle it! Not to mention the specter of landscape printing! Normal sized letters means legal sized paper for those unlucky souls with narrow format printers and no access to wacky US paper sizes. (legal sized = 21.5 x 35.5 cm - a B4 with 3 cm trimmed off the side and one tacked on the bottom.) - Adam DePrince From guido at python.org Fri Mar 24 18:36:37 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 24 Mar 2006 09:36:37 -0800 Subject: [Python-3000] C style guide In-Reply-To: <4423A29C.3060006@canterbury.ac.nz> References: <442203E5.7090009@gmail.com> <4423A29C.3060006@canterbury.ac.nz> Message-ID: On 3/23/06, Greg Ewing wrote: > If we standardised on all-tabs, people could set their > editors to display indentation however they wanted, and > there would be no need to argue about how many spaces > should be dancing at the head of a code line. [and much followup] Please tell me this whole thread is a cruel joke. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Fri Mar 24 18:41:49 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 24 Mar 2006 18:41:49 +0100 Subject: [Python-3000] Best Practices essays References: <442353CD.1050101@comcast.net><1143183993.3287.67.camel@localhost.localdomain> <44241A87.1040603@comcast.net> Message-ID: Edward C. Jones wrote: > I know its wrong. It was a mistake I made several times when I was > earning Python. The m * [ n * [0]] problem is a dark corner of Python > that exists because Python variables are really pointers. Unfortunately, > this dark corner is visible to newbies. Which is why it needs to be > mentioned. it's already explained in the FAQ, of course. From rwgk at yahoo.com Fri Mar 24 17:52:57 2006 From: rwgk at yahoo.com (Ralf W. Grosse-Kunstleve) Date: Fri, 24 Mar 2006 08:52:57 -0800 (PST) Subject: [Python-3000] [Python-Dev] r43214 - peps/trunk/pep-3000.txt In-Reply-To: Message-ID: <20060324165257.97945.qmail@web31504.mail.mud.yahoo.com> --- Neal Norwitz wrote: > I created this list a few days ago before Alan > said he was interested in maintaining it. Tru64 is difficult (I think > there are still some open bugs that go back years) because no > developer has access to any of these boxes. It would be good for > people interested in these platforms to speak up and offer their time > or at least access to the platform so we can test. We have two Tru64 machines. Write me if you are interested in ssh access (login name, optionally public ssh key). We use Python (any version after 2.2) under Tru64 all the time. I am not aware of bugs. It could be though that some extensions don't build, but all the ones we care about work just fine. Cheers, Ralf __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! 
Mail has the best spam protection around http://mail.yahoo.com From steven.bethard at gmail.com Fri Mar 24 19:11:37 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Fri, 24 Mar 2006 11:11:37 -0700 Subject: [Python-3000] else-clause on for-loops In-Reply-To: <1f7befae0603240838g26c52495xef5857838b1fdf98@mail.gmail.com> References: <4423AF95.40701@tekNico.net> <1f7befae0603240838g26c52495xef5857838b1fdf98@mail.gmail.com> Message-ID: On 3/24/06, Tim Peters wrote: > [Nicola Larosa] > > ... I sometimes may have had a need for the current semantics > > of the else after loops, but I don't remember it; on the other hand, I have > > had a use for the no-iteration case a number of times. Somehow I find it > > hard to stick into my mind that's not what it means. > > The primary use case is "search loops".
>
> for item in sequence:
>     if desirable(item):
>         break
> else:
>     no desirable item exists
>
> Just remember "search loop", and you'll never be surprised again.

Yep, that's what I use 'em for. Of course once you drink that Kool-Aid too often, you start wanting to write things like:

    for item in seq:
        for subitem in item:
            if desirable(subitem):
                break
        else:
            continue
        break
    else:
        print 'no subsequences contain a desirable item'

It's about that time that I just refactor it to a function and replace all those funny breaks and else-clauses with a simple return statement. STeVe -- Grammar am for people who can't think for myself. 
--- Bucky Katt, Get Fuzzy From steven.bethard at gmail.com Fri Mar 24 19:20:13 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Fri, 24 Mar 2006 11:20:13 -0700 Subject: [Python-3000] how to transition 2.X code to 3.0 code Message-ID: On 3/24/06, Fredrik Lundh wrote: > I'm well aware that Python 3000 doesn't aim to be backwards compatible, > but I would prefer if we could refrain from subtly changing the behaviour > of existing constructs in way that breaks all existing use, and makes it un- > necessarily hard to convert old code (whether by hand or by machine). Agreed. I wonder what the plan is for the transition? Things like changing the semantics of the for-loop else-clause are actually relatively easy to flag since they can be spotted easily with the AST, and *all* current uses of the else-clause would be wrong. However, flagging things like "expects a list from dict.items instead of an iterator" might be harder to do. Some uses (for-loops, list-comps and genexps) are still perfectly fine while other uses (indexing, getting the len) aren't. I think most of these should raise an exception pretty quickly, but if any of them don't, that would make me pretty nervous when transitioning code. I wonder if it would be worth having a branch of python-3000 at some point that inserts warning code on all constructs that changed (e.g. all calls to dict.items). I have a suspicion that this would generate a lot of false-positives, but maybe it would still be worthwhile... STeVe -- Grammar am for people who can't think for myself. 
--- Bucky Katt, Get Fuzzy From aleaxit at gmail.com Fri Mar 24 19:27:06 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Fri, 24 Mar 2006 10:27:06 -0800 Subject: [Python-3000] Best Practices essays In-Reply-To: <1143216197.3190.74.camel@localhost.localdomain> References: <442353CD.1050101@comcast.net> <1143183993.3287.67.camel@localhost.localdomain> <4423C630.1010306@canterbury.ac.nz> <1143216197.3190.74.camel@localhost.localdomain> Message-ID: On 3/24/06, Adam DePrince wrote: ... > "my kids need braces, I'm going for a promotion!" If your kids need braces, they'll find many more in C, Java, etc, than in Python, where indentation prevails (unless you use a HUGE lot of dictionary displays, of course). Alex From guido at python.org Fri Mar 24 19:40:17 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 24 Mar 2006 10:40:17 -0800 Subject: [Python-3000] how to transition 2.X code to 3.0 code In-Reply-To: References: Message-ID: On 3/24/06, Steven Bethard wrote: > On 3/24/06, Fredrik Lundh wrote: > > I'm well aware that Python 3000 doesn't aim to be backwards compatible, > > but I would prefer if we could refrain from subtly changing the behaviour > > of existing constructs in way that breaks all existing use, and makes it un- > > necessarily hard to convert old code (whether by hand or by machine). > > Agreed. I wonder what the plan is for the transition? That's the agenda of this list! The users need to speak up. > Things like > changing the semantics of the for-loop else-clause are actually > relatively easy to flag since they can be spotted easily with the AST, > and *all* current uses of the else-clause would be wrong. Regarding this particular feature proposal, I'd like to strongly discourage it. I don't think it's well thought out; it looks more like an ad-hoc response to the issue brought up by Ian Bicking. If you disagree, please write a well-argued PEP. 
> However, flagging things like "expects a list from dict.items instead > of an iterator" might be harder to do. Some uses (for-loops, > list-comps and genexps) are still perfectly fine while other uses > (indexing, getting the len) aren't. I think most of these should > raise an exception pretty quickly, but if any of them don't, that > would make me pretty nervous when transitioning code. Right. Unfortunately we're already bound to have incompatibilities that will give existing code new meanings that are hard to detect by source code inspection only -- e.g. i/j will return a float when i and j are ints, and you can't always tell whether any particular use of x/y will ever involve two ints. Python 3000 *will* be backwards incompatible, and sometimes it will be awkward to convert code. The question I lay in front of the community is, how much incompatibility can we handle? And what will the transition strategy be? Someone needs to own this discussion topic, start collecting ideas and feedback, and draft a PEP that will guide the Python 3000 feature specification process. One possibility is that every feature PEP is required to contain a section on compatibility issues and how the transition is handled. Sometimes the only reasonable answer will be "Some fraction of code will break and users will have to debug this on a per-case basis." (I imagine this is the most sensible approach in the case of the elimination of classic classes, which have a subtly different multiple inheritance pattern.) But hopefully in most cases we can provide tools to scan old code for potential issues (e.g. I wrote one long ago for int division) or even a conversion tool. But all this needs to be guided by an explicit policy, which hasn't been defined yet. Without an explicit policy we'll have never-ending discussions about whether a particular incompatibility is acceptable or not. 
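As a rough illustration of the kind of scanning tool mentioned above, the sketch below walks a module's syntax tree and reports every line that uses the `/` operator, since each of those may change meaning when both operands turn out to be ints. It relies on the standard `ast` module found in current Python releases, which is an assumption of this sketch and not something the thread specifies; `find_divisions` is a made-up name. Like any source-only check it over-reports: it cannot tell which divisions actually involve two ints.

```python
import ast


def find_divisions(source):
    """Return the line numbers of every `/` binary operation in `source`.

    Hypothetical helper for illustration only. It flags all uses of `/`
    (not `//`), because source inspection alone cannot tell whether the
    operands will be ints at run time. Augmented assignment (`/=`) is
    not covered by this sketch.
    """
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Div)
    )


sample = "a = 1\nb = a / 2\nc = a // 2\nd = b / c\n"
flagged_lines = find_divisions(sample)
```

Here `flagged_lines` lists the two lines of `sample` that use `/`; the `//` on the third line is left alone because floor division keeps its meaning.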
> I wonder if it would be worth having a branch of python-3000 at some > point that inserts warning code on all constructs that changed (e.g. > all calls to dict.items). I have a suspicion that this would generate > a lot of false-positives, but maybe it would still be worthwhile... That's one proposal. I don't know how feasible it is, or if there are better solutions. Clearly we need to think about this more. Perhaps you would like to (co-)own the transition strategy topic? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.peters at gmail.com Fri Mar 24 19:51:00 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 24 Mar 2006 13:51:00 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <44230BA5.8070407@zope.com> References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <44230BA5.8070407@zope.com> Message-ID: <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> [Jim Fulton] > ... > I'd be interested to hear if other people who have experience working > with ZODB BTrees have been as annoyed as I've been. Not since the first week ;-), no. Two things exacerbate this in ZODB: - BTrees are a mapping type, but unlike other mapping types its keys() (etc) methods don't return a list. This makes the result of BTree.keys() surprising to many, just because it's different from the way other keys() methods work. - ZODB Buckets are also mapping types, and pretty much interchangeable with BTrees (they support the same extended (relative to the base mapping interface) set of methods with the same meanings), _except_ that Bucket.keys() (etc) returns a list. Sometimes operations on BTrees even return Buckets, so from one line of code to the next it's hard to remember whether keys() (etc) will return a list or an iterator. 
That said, it doesn't much matter, since BTrees.keys() (etc) returns a particularly rich kind of iterator (as you know, it supports, e.g., __len__ and indexing, much like a list). There were only two ways I got surprised:

- Typing, e.g.,

    >>> b.keys()

  at an interactive shell to see the keys, and getting back

- Writing, e.g.,

    self.assertEqual(b.keys(), [1, 2, 3])

  in a Bucket unit test, forgetting that it was in a test class that was also (re)used to test BTrees.

"The solution" in both cases was to wrap the method result in list(), and stop caring that this would make a "needless" copy when `b` was in fact a Bucket. If dict.keys() (etc) had also returned an iterator, I doubt anyone would have been surprised by any of the above. There's one other common surprise in Zope-land, namely that

    for key in b.keys():
        ... possibly try to delete `key` from `b`

"doesn't work" when `b` is a BTree. The _expectation_, derived from experience with Python dicts, is that it's bulletproof, but that's again because dict.keys() has returned a distinct list and BTrees were just different that way. It's a little nastier for BTrees because they can't reliably detect a size change during iteration, so BTree users _usually_ don't get the

    >>> d = {1: 2, 3: 4}
    >>> for key in d:
    ...     del d[key]
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    RuntimeError: dictionary changed size during iteration

they're accustomed to when they try to mutate a dict that changes size during iteration. 
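The two behaviors described in this message can be sketched for a plain dict as follows. This is only an illustration written for current Python, not code from the thread: deleting while iterating over the dict itself raises RuntimeError, while iterating over a list() snapshot of the keys is safe, at the cost of the "needless" copy mentioned above.

```python
d = {1: 2, 3: 4}

# Deleting while iterating over the dict itself is detected and rejected.
error_message = None
try:
    for key in d:
        del d[key]
except RuntimeError as exc:
    error_message = str(exc)

# Iterating over a snapshot of the keys makes deletion safe,
# because the loop no longer looks at the dict while it changes.
d = {1: 2, 3: 4}
for key in list(d):
    if key == 1:
        del d[key]

remaining = dict(d)
```

After the second loop only the entry for key 3 is left in `remaining`.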
From guido at python.org Fri Mar 24 20:03:55 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 24 Mar 2006 11:03:55 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> Message-ID: On 3/24/06, Tim Peters wrote: > There's one other common surprise in Zope-land, namely that > > for key in b.keys(): > ... possibly try to delete `key` from `b` > > "doesn't work" when `b` is a BTree. The _expectation_, derived from > experience with Python dicts, is that it's bulletproof, but that's > again because dict.keys() has returned a distinct list and BTrees were > just different that way. The Java collections framework actually has an API and an idiom that make this work: there's a method on the *iterator* that deletes the current item without disturbing the iteration. The iterator is required to maintain enough state to know whether deletion is currently valid -- it's not before you've started iterating, or after the iterator is exhausted, or if you've already deleted the item. Deleting the item does not automatically move to the next item; you must still call next() for that. Clearly this requires careful cooperation between the iterator and the container! For Python sets and dicts, it would be sufficient to guarantee not to rehash upon such a deletion. For lists, it would require the iterator to remember not to increment the index on the subsequent next() call. Java has a few flavors of iterators; for lists it also has an extended iterator that allows moving back, and of course it also has iterators that don't support deletion for whatever reason. It's a really neat API! Note that this is all quite independent from the views proposal (also inspired by Java). 
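A minimal sketch of what such an iterator could look like for a Python list, assuming a hypothetical class named `RemovingListIterator`; nothing like it exists in the standard library and the thread does not propose this exact code. It tracks whether remove() is currently valid and steps the index back after a deletion, which has the same effect as not incrementing it on the following next() call.

```python
class RemovingListIterator:
    """Hypothetical list iterator with a Java-style remove() method."""

    def __init__(self, items):
        self._items = items        # the underlying list, mutated in place
        self._next = 0             # index of the item next() will return
        self._can_remove = False   # True only right after a next() call

    def __iter__(self):
        return self

    def __next__(self):
        if self._next >= len(self._items):
            self._can_remove = False
            raise StopIteration
        item = self._items[self._next]
        self._next += 1
        self._can_remove = True
        return item

    def remove(self):
        """Delete the item most recently returned by next()."""
        if not self._can_remove:
            raise RuntimeError("remove() is not valid in this state")
        self._next -= 1            # so the following next() skips nothing
        del self._items[self._next]
        self._can_remove = False


numbers = [1, 2, 3, 4, 5, 6]
it = RemovingListIterator(numbers)
for n in it:
    if n % 2 == 0:
        it.remove()                # drop even numbers during the loop
```

The loop leaves only the odd numbers in `numbers`. Calling remove() before the first next(), twice in a row, or after exhaustion raises RuntimeError, mirroring the three invalid states listed above.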
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From steven.bethard at gmail.com Fri Mar 24 23:11:45 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Fri, 24 Mar 2006 15:11:45 -0700 Subject: [Python-3000] how to transition 2.X code to 3.0 code In-Reply-To: References: Message-ID: Guido van Rossum wrote: > Python 3000 *will* be backwards incompatible, and sometimes it will be > awkward to convert code. The question I lay in front of the community > is, how much incompatibility can we handle? And what will the > transition strategy be? Someone needs to own this discussion topic, > start collecting ideas and feedback, and draft a PEP that will guide > the Python 3000 feature specification process. > > One possibility is that every feature PEP is required to contain a > section on compatibility issues and how the transition is handled. +1. I definitely think this should be a requirement. Every PEP should explain how things change, and preferably give some code to identify the changes. I guess the question is, what are acceptable false-positive and false-negative rates? If some false-positives and false-negatives are okay, PEP's could just provide some regular expressions or AST-tree searches to identify possible code problems and either log or correct them. But if we need zero false-positives or zero false-negatives, I don't see any way to do that but to branch the Python 3000 trunk and insert a bunch of warnings in the code wherever it differs from Python 2.X. What are people happy with here? FWIW, I'm okay with imperfect change-finding because I think keeping a branch with all the appropriate warnings is going to take a fair bit of effort, and I suspect that effort would be better directed at other issues. Guido van Rossum wrote: > Clearly we need to think about this more. Perhaps > you would like to (co-)own the transition strategy topic? Sure. 
I'll wait for some feedback in this thread, and then try to turn it into a transition-strategy PEP. STeVe -- Grammar am for people who can't think for myself. --- Bucky Katt, Get Fuzzy From guido at python.org Fri Mar 24 23:33:44 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 24 Mar 2006 14:33:44 -0800 Subject: [Python-3000] how to transition 2.X code to 3.0 code In-Reply-To: References: Message-ID: On 3/24/06, Steven Bethard wrote: > Sure. I'll wait for some feedback in this thread, and then try to > turn it into a transition-strategy PEP. Excellent! Let me know when you're ready for feedback. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.peters at gmail.com Sat Mar 25 03:22:39 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 24 Mar 2006 21:22:39 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> Message-ID: <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> [Tim] >> ... >> There's one other common surprise in Zope-land, namely that >> >> for key in b.keys(): >> ... possibly try to delete `key` from `b` >> >> "doesn't work" when `b` is a BTree. The _expectation_, derived from >> experience with Python dicts, is that it's bulletproof, but that's >> again because dict.keys() has returned a distinct list and BTrees were >> just different that way. [Guido] > The Java collections framework actually has an API and an idiom that > make this work: there's a method on the *iterator* that deletes the > current item without disturbing the iteration. Yup, I saw that: http://java.sun.com/j2se/1.4.2/docs/api/java/util/Iterator.html It's at least curious that there isn't also a method to add an item in the base Iterator interface. 
> The iterator is required to maintain enough state to know whether > deletion is currently valid -- it's not before you've started iterating, or after > the iterator is exhausted, or if you've already deleted the item. > Deleting the item does not automatically move to the next item; you > must still call next() for that. Or, even easier, it can just throw UnsupportedOperationException if it doesn't feel like implementing Iterator.remove() in a useful way :-) > Clearly this requires careful cooperation between the iterator and the > container! For Python sets and dicts, it would be sufficient to > guarantee not to rehash upon such a deletion. For lists, it would > require the iterator to remember not to increment the index on the > subsequent next() call. Iterator.remove() would be reasonably easy to implement for BTrees too, given the Java constraint that all bets are off if the collection is modified in any way other than via Iterator.remove(). > Java has a few flavors of iterators; for lists it also has an extended > iterator that allows moving back, and of course it also has iterators > that don't support deletion for whatever reason. It's a really neat > API! On the page referenced above, ListIterator is given as the only "known" subinterface of Iterator. Iterators that don't want to support remove() still implement the base Iterator interface, but take the UnsupportedOperationException dodge noted above. > Note that this is all quite independent from the views proposal (also > inspired by Java). I'm not sure I saw a coherent "views proposal" ;-) Java has some nice ideas, but I'd think it's too elaborate for your tastes (and, by extension, for Python's). For example, it has six(?) collection interfaces, two of which don't even extend the base Collection interface. AFAICT, what Java calls "views" are unique to some methods of its Map and SortedMap interfaces. 
To be concrete, is what you're calling "a view" the kind of thing returned by Java's AbstsactMap.entrySet()?: http://java.sun.com/j2se/1.4.2/docs/api/java/util/AbstractMap.html That's closest to Python's items(), and AbstractMap.keys() is closest to Python's keys(). Java being Java, AbstractMap.values() returns a different _type_ of object than those two (Java's further sub-distinction between "set views" and "collection views"). I vote to steal the good parts and drop all the distinctions :-) From guido at python.org Sat Mar 25 03:58:22 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 24 Mar 2006 18:58:22 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> Message-ID: On 3/24/06, Tim Peters wrote: > [Guido] > > The Java collections framework actually has an API and an idiom that > > make this work: therd's a method on the *iterator* that deletes the > > current item without disturbing the iteration. > > Yup, I saw that: > > http://java.sun.com/j2se/1.4.2/docs/api/java/util/Iterator.html > > It's at least curious that there isn't also a method to add an item in > the base Iterator interface. I could ask my drinking buddy Josh Bloch, but I'll try to channel him first: (a) the use case isn't as attractive (why would you want to insert an item during an iteration? the use case for deletion during iteration is very natural); (b) did you mean to insert it before, after, or a random item? for hash sets that doesn't matter, but for trees and lists it does. (c) it would be hard to implement for hash sets because you may not have a choice about not rehashing. [...] 
> > Note that this is all quite independent from the views proposal (also > > inspired by Java). > > I'm not sure I saw a coherent "views proposal" ;-) Java has some nice > ideas, but I'd think it's too elaborate for your tastes (and, by > extension, for Python's). For example, it has six(?) collection > interfaces, two of which don't even extend the base Collection > interface. AFAICT, what Java calls "views" are unique to some methods > of its Map and SortedMap interfaces. Right, the views only make sense for maps that have separate keys and values. > To be concrete, is what you're calling "a view" the kind of thing > returned by Java's AbstsactMap.entrySet()?: > > http://java.sun.com/j2se/1.4.2/docs/api/java/util/AbstractMap.html Yes. Another example is Map.keySet(), which returns a set view of the keys of a map: http://java.sun.com/j2se/1.4.2/docs/api/java/util/Map.html#keySet() > That's closest to Python's items(), and AbstractMap.keys() is closest > to Python's keys(). Java being Java, AbstractMap.values() returns a > different _type_ of object than those two (Java's further > sub-distinction between "set views" and "collection views"). The "Abstract" classes aren't the most interesting pieces; they are really building blocks for implementers, as explained here: http://java.sun.com/j2se/1.4.2/docs/guide/collections/overview.html (section "Collection Implementations", last para of the intro). The really core stuff is in the interfaces. The difference between sets and collections is that collections may have duplicates; that's pretty essential for the thing you get from values(), but keys() and items() return sets (of different types). > I vote to steal the good parts and drop all the distinctions :-) Maybe. I need a volunteer to write the PEP! 
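The set-versus-collection split described above can be sketched in present-day Python 3, whose dict methods return view objects. This is an illustration written with hindsight, not something the thread had settled at this point; the dictionary contents and variable names are made up for the example.

```python
from collections.abc import Set

d = {"a": 1, "b": 1}

keys = d.keys()      # set-like view of the keys
items = d.items()    # set-like view of (key, value) pairs
values = d.values()  # plain collection: duplicates are allowed

# Keys and items behave like sets, so set operators work on them.
common_keys = keys & {"a", "z"}
common_items = items & {("a", 1), ("b", 2)}

# Values may repeat, so they are not exposed as a set.
values_are_set = isinstance(values, Set)
value_list = sorted(values)

# Views are live: a later insertion is visible without calling keys() again.
d["c"] = 2
sees_new_key = "c" in keys
```

The two equal values survive in `value_list`, which is why a set type would be the wrong container for `values()`.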
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Sat Mar 25 05:18:33 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 25 Mar 2006 16:18:33 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <20060324153327.GC425@panix.com> References: <44231A7E.6050801@colorstudy.com> <4423213B.4050603@colorstudy.com> <44232FD9.6050209@colorstudy.com> <442336C4.6050807@colorstudy.com> <442344A1.4070001@colorstudy.com> <4423BF96.4040608@canterbury.ac.nz> <20060324153327.GC425@panix.com> Message-ID: <4424C499.4090204@canterbury.ac.nz> Aahz wrote: > On Fri, Mar 24, 2006, Greg Ewing wrote: >> for x from iterator: > How would you distinguish? What about objects that are their own > iterator (such as files)? Files would really and truly be iterators (i.e. they would have a next() method) and you would need to use for line from a_file: ... Greg From greg.ewing at canterbury.ac.nz Sat Mar 25 05:21:24 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 25 Mar 2006 16:21:24 +1200 Subject: [Python-3000] Best Practices essays In-Reply-To: <1143216197.3190.74.camel@localhost.localdomain> References: <442353CD.1050101@comcast.net> <1143183993.3287.67.camel@localhost.localdomain> <4423C630.1010306@canterbury.ac.nz> <1143216197.3190.74.camel@localhost.localdomain> Message-ID: <4424C544.5080906@canterbury.ac.nz> Adam DePrince wrote: > I'm curious, however, what do you envision the semantics of [x times i] > being? 
    [x for _i in xrange(i)]

Greg

From greg.ewing at canterbury.ac.nz Sat Mar 25 05:25:10 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 25 Mar 2006 16:25:10 +1200
Subject: [Python-3000] else-clause on for-loops
In-Reply-To: <1f7befae0603240838g26c52495xef5857838b1fdf98@mail.gmail.com>
References: <4423AF95.40701@tekNico.net> <1f7befae0603240838g26c52495xef5857838b1fdf98@mail.gmail.com>
Message-ID: <4424C626.30903@canterbury.ac.nz>

Tim Peters wrote:
> The primary use case is "search loops".
>
>     for item in sequence:
>         if desirable(item):
>             break
>     else:
>         no desirable item exists

Except that almost all of my search loops are in functions of their own, and I just use return. So I don't get to use this use case.

Greg

From greg.ewing at canterbury.ac.nz Sat Mar 25 05:28:11 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 25 Mar 2006 16:28:11 +1200
Subject: [Python-3000] C style guide
In-Reply-To: <200603241211.03500.fdrake@acm.org>
References: <20060324151231.GB22474@discworld.dyndns.org> <2e1434c10603240849h27b7bd52q408c6d93b6b439c2@mail.gmail.com> <200603241211.03500.fdrake@acm.org>
Message-ID: <4424C6DB.6090308@canterbury.ac.nz>

Fred L. Drake, Jr. wrote:
> That would be bad. Do you realize just how small fonts would have to get to
> let us still have as many editor windows on-screen?

I'm still waiting for desk-sized displays with stylus for input. Let's make the desktop metaphor more than a metaphor!

Greg

From greg.ewing at canterbury.ac.nz Sat Mar 25 05:32:31 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 25 Mar 2006 16:32:31 +1200
Subject: [Python-3000] C style guide
In-Reply-To:
References: <442203E5.7090009@gmail.com> <4423A29C.3060006@canterbury.ac.nz>
Message-ID: <4424C7DF.4090106@canterbury.ac.nz>

Guido van Rossum wrote:
> Please tell me this whole thread is a cruel joke.
I know it isn't going to happen for Python, but I seriously believe that all-tabs is a better technical approach to program indentation, for a variety of reasons. The only arguments against it that I can see are legacy-related ones. Greg From adam.deprince at gmail.com Sat Mar 25 06:16:18 2006 From: adam.deprince at gmail.com (Adam DePrince) Date: Sat, 25 Mar 2006 00:16:18 -0500 Subject: [Python-3000] Best Practices essays In-Reply-To: <4424C544.5080906@canterbury.ac.nz> References: <442353CD.1050101@comcast.net> <1143183993.3287.67.camel@localhost.localdomain> <4423C630.1010306@canterbury.ac.nz> <1143216197.3190.74.camel@localhost.localdomain> <4424C544.5080906@canterbury.ac.nz> Message-ID: <1143263778.4203.38.camel@localhost.localdomain> On Sat, 2006-03-25 at 16:21 +1200, Greg Ewing wrote: > Adam DePrince wrote: > > > I'm curious, however, what do you envision the semantics of [x times i] > > being? > > [x for _i in xrange(i)] > > Greg So [[x times n] times m] would be really the same as n*[m*[0,]] So its different than list*int ... It looks pretty, but to grow the language and add a keyword for what some might call syntactic saccharine. I'd vote -1. 
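To make the difference concrete, here is a minimal sketch in present-day Python 3 (so `range` stands in for `xrange`). It assumes Greg's reading above, that the hypothetical `[x times n]` would mean `[x for _ in range(n)]`; the `times` syntax exists only as a suggestion in this thread, and the names `n`, `m`, `fresh` and `shared` are made up for the example.

```python
n, m = 3, 2

# Comprehension reading of [[0 times n] times m]:
# the inner expression is re-evaluated, so every row is a new list.
fresh = [[0 for _ in range(n)] for _ in range(m)]

# list * int repeats references, so every row is the same list object.
shared = m * [n * [0]]

# Mutate the first row of each and compare the effect on the second row.
fresh[0][0] = 1
shared[0][0] = 1

rows_are_distinct = fresh[0] is not fresh[1]
rows_are_aliased = shared[0] is shared[1]
```

After the two assignments `fresh` holds one changed row and one untouched row, while both rows of `shared` show the change.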
Cheers - Adam DePrince From adam.deprince at gmail.com Sat Mar 25 06:22:31 2006 From: adam.deprince at gmail.com (Adam DePrince) Date: Sat, 25 Mar 2006 00:22:31 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> Message-ID: <1143264152.4203.42.camel@localhost.localdomain> On Fri, 2006-03-24 at 18:58 -0800, Guido van Rossum wrote: > On 3/24/06, Tim Peters wrote: > > [Guido] > > > The Java collections framework actually has an API and an idiom that > > > make this work: therd's a method on the *iterator* that deletes the > > > current item without disturbing the iteration. > > > > Yup, I saw that: > > > > http://java.sun.com/j2se/1.4.2/docs/api/java/util/Iterator.html > > > > It's at least curious that there isn't also a method to add an item in > > the base Iterator interface. > > I could ask my drinking buddy Josh Bloch, but I'll try to channel him > first: (a) the use case isn't as attractive (why would you want to > insert an item during an iteration? the use case for deletion during > iteration is very natural); (b) did you mean to insert it before, > after, or a random item? for hash sets that doesn't matter, but for > trees and lists it does. (c) it would be hard to implement for hash > sets because you may not have a choice about not rehashing. > > [...] > > > Note that this is all quite independent from the views proposal (also > > > inspired by Java). > > > > I'm not sure I saw a coherent "views proposal" ;-) Java has some nice > > ideas, but I'd think it's too elaborate for your tastes (and, by > > extension, for Python's). For example, it has six(?) collection > > interfaces, two of which don't even extend the base Collection > > interface. 
AFAICT, what Java calls "views" are unique to some methods > > of its Map and SortedMap interfaces. > > Right, the views only make sense for maps that have separate keys and values. > > > To be concrete, is what you're calling "a view" the kind of thing > > returned by Java's AbstsactMap.entrySet()?: > > > > http://java.sun.com/j2se/1.4.2/docs/api/java/util/AbstractMap.html > > Yes. Another example is Map.keySet(), which returns a set view of the > keys of a map: > > http://java.sun.com/j2se/1.4.2/docs/api/java/util/Map.html#keySet() > > > That's closest to Python's items(), and AbstractMap.keys() is closest > > to Python's keys(). Java being Java, AbstractMap.values() returns a > > different _type_ of object than those two (Java's further > > sub-distinction between "set views" and "collection views"). > > The "Abstract" classes aren't the most interesting pieces; they are > really building blocks for implementers, as explained here: > > http://java.sun.com/j2se/1.4.2/docs/guide/collections/overview.html > (section "Collection Implementations", last para of the intro). > > The really core stuff is in the interfaces. > > The difference between sets and collections is that collections may > have duplicates; that's pretty essential for the thing you get from > values(), but keys() and items() return sets (of different types). > > > I vote to steal the good parts and drop all the distinctions :-) > > Maybe. I need a volunteer to write the PEP! Oh, why not. Me me! 
- Adam From guido at python.org Sat Mar 25 08:38:42 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 24 Mar 2006 23:38:42 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1143264152.4203.42.camel@localhost.localdomain> References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143264152.4203.42.camel@localhost.localdomain> Message-ID: On 3/24/06, Adam DePrince wrote: [Guido] > > Maybe. I need a volunteer to write the PEP! > > Oh, why not. Me me! Excellent! Let us know when it's ready or when you'r stuck. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bioinformed at gmail.com Sat Mar 25 14:36:14 2006 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Sat, 25 Mar 2006 08:36:14 -0500 Subject: [Python-3000] C style guide In-Reply-To: <200603241211.03500.fdrake@acm.org> References: <20060324151231.GB22474@discworld.dyndns.org> <2e1434c10603240849h27b7bd52q408c6d93b6b439c2@mail.gmail.com> <200603241211.03500.fdrake@acm.org> Message-ID: <2e1434c10603250536j3ac396aah8aecda487098986f@mail.gmail.com> On 3/24/06, Fred L. Drake, Jr. wrote: > > On Friday 24 March 2006 11:49, Kevin Jacobs > wrote: > > Don't forget those of us who are now pushing for 120 character wide > source > > files! > > That would be bad. Do you realize just how small fonts would have to get > to > let us still have as many editor windows on-screen? I don't think my old > eyes could handle it! > Take the "glass half full" approach -- just think how big a monitor you'll get to see all that information on the screen! -Kevin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-3000/attachments/20060325/c49c0ac5/attachment.html From guido at python.org Sat Mar 25 16:37:41 2006 From: guido at python.org (Guido van Rossum) Date: Sat, 25 Mar 2006 07:37:41 -0800 Subject: [Python-3000] C style guide In-Reply-To: <2e1434c10603250536j3ac396aah8aecda487098986f@mail.gmail.com> References: <20060324151231.GB22474@discworld.dyndns.org> <2e1434c10603240849h27b7bd52q408c6d93b6b439c2@mail.gmail.com> <200603241211.03500.fdrake@acm.org> <2e1434c10603250536j3ac396aah8aecda487098986f@mail.gmail.com> Message-ID: On 3/25/06, Kevin Jacobs wrote: > On 3/24/06, Fred L. Drake, Jr. wrote: > > On Friday 24 March 2006 11:49, Kevin Jacobs > wrote: > > > Don't forget those of us who are now pushing for 120 character wide > > > source files! > > > > That would be bad. Do you realize just how small fonts would have to get > > to let us still have as many editor windows on-screen? I don't think my old > > eyes could handle it! > > Take the "glass half full" approach -- just think how big a monitor you'll > get to see all that information on the screen! Actually, a 120-wide window is mostly a bigger waste of space since *most* code easily fits in 80 columns (remember, average line length is 30!). Folding the occasional long line is much better use of resources than stretching the window to accommodate it. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From msoulier at digitaltorque.ca Sat Mar 25 17:08:41 2006 From: msoulier at digitaltorque.ca (Michael P. 
Soulier) Date: Sat, 25 Mar 2006 11:08:41 -0500 Subject: [Python-3000] C style guide In-Reply-To: References: <20060324151231.GB22474@discworld.dyndns.org> <2e1434c10603240849h27b7bd52q408c6d93b6b439c2@mail.gmail.com> <200603241211.03500.fdrake@acm.org> <2e1434c10603250536j3ac396aah8aecda487098986f@mail.gmail.com> Message-ID: <20060325160841.GB12075@tigger.digitaltorque.ca> On 25/03/06 Guido van Rossum said: > Actually, a 120-wide window is mostly a bigger waste of space since > *most* code easily fits in 80 columns (remember, average line length > is 30!). Folding the occasional long line is much better use of > resources than stretching the window to accommodate it. Seconded. I hate reading code with all of the lines going way beyond 80 columns. Mike -- Michael P. Soulier "Any intelligent fool can make things bigger and more complex... It takes a touch of genius - and a lot of courage to move in the opposite direction." --Albert Einstein From adam.deprince at gmail.com Sat Mar 25 18:02:13 2006 From: adam.deprince at gmail.com (Adam DePrince) Date: Sat, 25 Mar 2006 12:02:13 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> Message-ID: <1143306134.3186.1.camel@localhost.localdomain> > Maybe. I need a volunteer to write the PEP!

PEP: XXX
Title: Mutable Iterations
Version: $Revision$
Last-Modified: $Date$
Author: , Adam DePrince
Status: Draft
Type: Standards
Content-Type: text/plain
Created: 25-March-2006
Post-History:

Abstract:

    This PEP proposes an extension to the iteration protocol to
    support deletion. Invocation of the delete method would result in
    the deletion of the corresponding object from the iterations
    underlying data store.

    This PEP further proposes that dict.iter{keys,items,values} be
    removed and the functions dict.{keys,items,values} return iters of
    this new deletable variation, and that the iter prefixed function
    variations be deprecated.

    Support for delete would become an optional component of the iter
    protocol.

Motivation

    The current dictionary API has separate functions to return lists
    or iters for each of keys, values and items. This is cumbersome
    and annoying, yet is tolerated because neither alone possesses the
    full functionality desired by the python community.

    The iter variation provides all of the performance advantages
    normally associated with iters; primarily minimal in-flight memory
    consumption. Its use, however, denies the user the ability to
    mutate the underlying data structure. Modification of the
    underlying dict results in in a RuntimeError upon the subsequent
    call of iter.next

    The non-iter variation returns a snapshot of the current
    dictionary state stored within a list. This has the advantage of
    permitting in-situ mutation of the underlying dictionary without
    upsetting the current loop. The disadvantage is that of
    performance and resource consumption; the portions requested of
    the underlying dictionary could easily exceed marginal memory.

    In many situation, such as with dictionaries that consume
    substantial portion of memory, or dictionary implementations that
    operate out of core, the list variant of these operations is
    effectively unavailable.

    A common programming pattern is to iterate over the elements of a
    dictionary selecting those for removal.
    Somewhat less common is the insertion of elements during the
    traversal. It is the former that we seek to address at this time.

    This PEP attempts to merge the benefits of the list and iter
    variants of keys/values/items under a single implementation.

SPECIFICATION:

    Example of desired operation:

    >>> d = {'foo':1,'bar':2 }
    >>> i = d.values()
    >>> print i.next()
    2
    >>> i.delete()
    >>> print i.next()
    1
    >>> print d
    {'foo':1}
    >>> print i.next()
    StopIteration

SPECIAL CONSIDERATION

    This would require the implementation of a non-rehashing variant
    of dict.__del__.

    It may not be possible to prevent rehashing upon the insertion of
    an element into a dict as it is for a delete, therefore element
    insertion is not being considered at this time.

ALSO CONSIDERED

    Implementation is the style of the JAVA views proposal. One
    concrete example of a list view superimposed upon a dict.
    Concerns were expressed about the record keeping and computational
    overhead of addressing holes that would appear within the list
    upon insertion and deletion.

IMPLEMENTATION

    TBD

REFERENCES

    TBD

COPYRIGHT

    This document has been placed in the public domain.

From fdrake at acm.org Sat Mar 25 18:29:04 2006
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sat, 25 Mar 2006 12:29:04 -0500
Subject: [Python-3000] C style guide
In-Reply-To: <2e1434c10603250536j3ac396aah8aecda487098986f@mail.gmail.com>
References: <200603241211.03500.fdrake@acm.org> <2e1434c10603250536j3ac396aah8aecda487098986f@mail.gmail.com>
Message-ID: <200603251229.05095.fdrake@acm.org>

On Saturday 25 March 2006 08:36, Kevin Jacobs wrote:
> Take the "glass half full" approach -- just think how big a monitor you'll
> get to see all that information on the screen!

I'm afraid laptop monitors aren't enlarged so easily, and I find myself on a laptop most of the time these days.

  -Fred

--
Fred L. Drake, Jr.

From g.brandl at gmx.net Sat Mar 25 18:52:54 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 25 Mar 2006 18:52:54 +0100 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1143306134.3186.1.camel@localhost.localdomain> References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> Message-ID: <44258376.1080709@gmx.net> Adam DePrince wrote: >> Maybe. I need a volunteer to write the PEP! > > PEP: XXX > Title: Mutable Iterations Comments (mostly grammar) inline. > This PEP proposes an extension to the iteration protocol to > support deletion. Invocation of the delete method would result in ^ "of items of the objected iterated over"? > the deletion of the corresponding object from the iterations > underlying data store. > > This PEP further proposes that dict.iter{keys,items,values} be > removed and the functions dict.{keys,items,values} return iters of > this new deletable variation, and that the iter prefixed function ^^^^^^^^^ the iterator is deletable? > variations be deprecated. ^^^^^^^^^^ deprecated or removed, like written above? > > Support for delete would become an optional component of the iter ^^^^^^ "deletion"? > protocol. > > Motivation > > The current dictionary API has separate functions to return lists > or iters for each of keys, values and items. This is cumbersome > and annoying, yet is tolerated because neither alone possesses the > full functionality desired by the python community. > > The iter variation provides all of the performance advantages > normally associated with iters; primarily minimal in-flight memory > consumption. Its use, however, denies the user the ability to > mutate the underlying data structure. 
Modification of the > underlying dict results in in a RuntimeError upon the subsequent > call of iter.next ^"()." > > The non-iter variation returns a snapshot of the current > dictionary state stored within a list. This has the advantage of > permitting in-situ mutation of the underlying dictionary without > upsetting the current loop. The disadvantage is that of ^^^^ no mention of a loop before > performance and resource consumption; the portions requested of > the underlying dictionary could easily exceed marginal memory. > > In many situation, such as with dictionaries that consume ^"s" ^" a" > substantial portion of memory, or dictionary implementations that > operate out of core, the list variant of these operations is > effectively unavailable. > > A common programming pattern is to iterate over the elements of a > dictionary selecting those for removal. Somewhat less common is > the insertion of elements during the traversal. It is the former > that we seek to address at this time. > > This PEP attempts to merge the benefits of the list and iter > variants of keys/values/items under a single implementation. > > SPECIFICATION: > > Example of desired operation: > > >>> d = {'foo':1,'bar':2 } > >>> i = d.values() > >>> print i.next() > 2 > >>> i.delete() > >>> print i.next() > 1 > >>> print d > {'foo':1} > >>> print i.next() > StopIteration > > SPECIAL CONSIDERATION > > This would require the implementation of a non-rehashing variant > of dict.__del__. ^^^^^^^ do you mean __delitem__? > It may not be possible to prevent rehashing upon the insertion of > an element into a dict as it is for a delete, therefore element > insertion is not being considered at this time. > > ALSO CONSIDERED > > Implementation is the style of the JAVA views proposal. One ^^ "in" > concrete example of a list view superimposed upon a dict. ^^ "is"? 
> Concerns were expressed about the record keeping and computational > overhead of addressing holes that would appear within the list > upon insertion and deletion. Georg -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 191 bytes Desc: OpenPGP digital signature Url : http://mail.python.org/pipermail/python-3000/attachments/20060325/7dc76c6e/attachment.pgp From adam.deprince at gmail.com Sat Mar 25 19:24:05 2006 From: adam.deprince at gmail.com (Adam DePrince) Date: Sat, 25 Mar 2006 13:24:05 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143264152.4203.42.camel@localhost.localdomain> Message-ID: <1143311045.3186.9.camel@localhost.localdomain> On Fri, 2006-03-24 at 23:38 -0800, Guido van Rossum wrote: > On 3/24/06, Adam DePrince wrote: > [Guido] > > > Maybe. I need a volunteer to write the PEP! > > > > Oh, why not. Me me! > > Excellent! Let us know when it's ready or when you'r stuck. I've added "ALSO CONSIDERED" and "OBJECTIONS" as well as consideration of the edge cases were iter.delete is called before the first .next or afte the reception of a StopIteration error. If everybody can take a look and start sending edits my way. The PEP is getting too big to copy and paste to the bottom of the email message. http://www.deprince.net/ideas/peps.html. Click on "Mutable Iterations". 
I've already started on an implementation, if I seriously misjudged the direction of the community somebody yell at me before I commit too many cups of coffee to it :-) Cheers - Adam DePrince From greg.ewing at canterbury.ac.nz Sun Mar 26 00:59:41 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 26 Mar 2006 11:59:41 +1200 Subject: [Python-3000] C style guide In-Reply-To: <200603251229.05095.fdrake@acm.org> References: <200603241211.03500.fdrake@acm.org> <2e1434c10603250536j3ac396aah8aecda487098986f@mail.gmail.com> <200603251229.05095.fdrake@acm.org> Message-ID: <4425D96D.50304@canterbury.ac.nz> Fred L. Drake, Jr. wrote: > I'm afraid laptop monitors aren't enlarged so easily, and I find myself on a > laptop most of the time these days. Eyephones. Virtual 360-degree wraparound displays. Greg From adam.deprince at gmail.com Sun Mar 26 01:59:00 2006 From: adam.deprince at gmail.com (Adam DePrince) Date: Sat, 25 Mar 2006 19:59:00 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143264152.4203.42.camel@localhost.localdomain> Message-ID: <1143334741.3682.46.camel@localhost.localdomain> I've made some updates to the Mutable iterators PEP. http://deprince.net/ideas/pep-dict.txt For the implementation I'm going to just implement the deleting iter for list and dict. I don't want iter implementors to scurry over their code just to add "nop exception throwers" as delete methods, so I'm calling for the absence of a .delete method to indicate that deleting is not supported. If it quacks ... Also, there is the issue of modification while iterating. Lists don't care at all and dictionaries only care so long as you don't change the current size. 
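A minimal sketch of the behaviour described above, written for present-day Python 3 on CPython rather than the 2.x of this thread; the helper names and sample data are made up for the illustration.

```python
def delete_from_dict_while_iterating():
    """Deleting during dict iteration changes the size and is detected."""
    d = {"foo": 1, "bar": 2, "baz": 3}
    try:
        for key in d:
            del d[key]
    except RuntimeError as exc:
        return str(exc)
    return None


def delete_from_list_while_iterating():
    """Lists raise nothing; the index-based iterator just skips items."""
    items = [1, 2, 3, 4, 5, 6]
    visited = []
    for item in items:
        visited.append(item)
        items.remove(item)
    return visited, items


def delete_via_snapshot():
    """Iterating over a copied key list makes deletion safe."""
    d = {"foo": 1, "bar": 2, "baz": 3}
    for key in list(d):
        if d[key] > 1:
            del d[key]
    return d


dict_error = delete_from_dict_while_iterating()
visited, leftover = delete_from_list_while_iterating()
survivors = delete_via_snapshot()
```

The snapshot variant is the pattern a list-returning `keys()` gives for free, at the cost of copying every key up front.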
I propose that we adopt Java's fail-fast semantics for the new deleting iterators. Basically, if the underlying data-store is touched in a way that doesn't make sense (i.e. we touch the dict and it rehashes), we actively tell any still-alive iters (weak references are nice for this).

What qualifies as a compatible or incompatible change will depend on the iterator and backing store implementation, of course ... how specific or demanding should we be?

On Fri, 2006-03-24 at 23:38 -0800, Guido van Rossum wrote:
> On 3/24/06, Adam DePrince wrote:
> [Guido]
> > > Maybe. I need a volunteer to write the PEP!
> >
> > Oh, why not. Me me!
>
> Excellent! Let us know when it's ready or when you're stuck.
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)

From adam.deprince at gmail.com Mon Mar 27 06:08:04 2006
From: adam.deprince at gmail.com (Adam DePrince)
Date: Sun, 26 Mar 2006 23:08:04 -0500
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To: <44258376.1080709@gmx.net>
References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net>
Message-ID: <1143432484.14391.67.camel@localhost.localdomain>

I have a draft PEP and an implementation of mutable iterators for lists and dicts that support delete only.

The PEP (Mutable Iterations) and sample code can be found at:

http://www.deprince.net/ideas/peps.html

Or, you can get the patch from:

http://sourceforge.net/tracker/index.php?func=detail&aid=1459011&group_id=5470&atid=305470

There are two questions that I'd like to put before the Python community at large. The first is which of Java's iterator methods we would like to see.
I'm using Java's list-iterator as a template -

http://java.sun.com/j2se/1.4.2/docs/api/java/util/ListIterator.html

Question #1:

Pick which you like, and which you think are too non-pythonic. Let me know. At a minimum we need delete. I'd like to see others, but I'd like to get some feedback before adding something that would make everyone recoil.

Operation       Works with list   Works with dict
add             Yes               Unreliably**
hasNext         Yes               Yes
hasPrev         Yes               Yes
previous        Yes               Yes
next            Already implemented
remove          Implemented with patch, called delete (see question #3)
set             Yes               Yes only for dict.values()
previousIndex   Yes               Yes
nextIndex       Yes               Yes

And not part of Java, but I'd like to ask everyone anyway.

current         Yes               Yes (the item last returned by next/prev)
currentIndex    Yes               Yes (the item index last returned by next/prev)

Question #2:

What should delete() return? I currently have it returning the iter itself to make it possible to say:

    value = iter.delete().next()

Example:

    item = i.next()
    while reject( item ):
        item = i.delete().next()

I fully expect a groundswell of people to say "just return None."

Question #3:

Delete vs. remove. On the first cut I took some liberties with the naming. A lot of data structures have a remove method, with the semantics datastructure.remove( offending_item ). Because in the iterator the item is selected by the iterator's state, not by a parameter, I didn't want to confuse the two, so I called it delete. The idea is that remove has to be told what to remove, while delete refers to the iterator's internal state.

Question #4:

At first I wanted to implement Java's fail-fast semantics, but now I have reservations. It would require a bit of housekeeping on each and every insert. IMHO, even adding as little as "flag=1" is too much for what is likely one of the most commonly used pieces of code in Python. Can I have a show of hands? For? Against? Wait and see how bad the performance hit is before deciding?

---

** Python dicts never rehash on delete, but they can and sometimes must on insert.
In fact, you often *can't* do more than fixed number of inserts without a mandatory rehash, otherwise you just run out slots. Even if somebody decides that delete should rehash, iters don't use the dict's delete ... why delete by key when we have something better, the slot #? A little housekeeping and the item is gone with no searching and no comparisons. Cheers - Adam DePrince From anthony at interlink.com.au Mon Mar 27 07:05:24 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Mon, 27 Mar 2006 16:05:24 +1100 Subject: [Python-3000] else-clause on for-loops In-Reply-To: References: Message-ID: <200603271605.26206.anthony@interlink.com.au> On Friday 24 March 2006 12:57, Steven Bethard wrote: > There was talk previously_ about removing the else clause on > for-loops (and while-loops). One possibility would be to change > the else-clause to behave as expected above (i.e. only executed > when the loop fails to iterate over any items). Ok, I could see _maybe_ removing the current code, but for the love of whatever gods you worship, please don't *change* it in this way. That would be insane. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From benji at benjiyork.com Mon Mar 27 14:22:39 2006 From: benji at benjiyork.com (Benji York) Date: Mon, 27 Mar 2006 07:22:39 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1143432484.14391.67.camel@localhost.localdomain> References: <4422FC96.2020409@zope.com> <4422FFE1.8050807@colorstudy.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> Message-ID: <4427D90F.2080306@benjiyork.com> Adam DePrince wrote: > Question #2: > > What should delete() return? 
I currently have it returning the iter > itself to make it possible to say: > > value = iter.delete().next() Python doesn't generally return self for call-chaining purposes. I'd say delete() should return None. -- Benji York From steven.bethard at gmail.com Mon Mar 27 19:55:39 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Mon, 27 Mar 2006 10:55:39 -0700 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1143432484.14391.67.camel@localhost.localdomain> References: <4422FC96.2020409@zope.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> Message-ID: On 3/26/06, Adam DePrince wrote: > I have a draft PEP and an implementation of mutable iterators for lists > and dicts that supports delete only. > > The PEP (Mutable Iterations) and sample code can be found at: > > http://www.deprince.net/ideas/peps.html I think the PEP really needs a much stronger motivation section, particularly with real-world examples of code that gets improved by the additional methods. The whole discussion was spawned by a request for determining the length of an iterable, a problem which this PEP doesn't solve at all. What problem is this PEP solving? Is there real-world code where this PEP would help out? One of the reasons I'm having trouble imagining it is that especially for lists, code like:: >>> for c in 'abcdefg': ... l.insert( 0, c ) is almost certainly a bad idea performance-wise due to the Python implementation of lists. You don't want to repeatedly insert a single element at the beginning of a list. 
You'd probably do much better just writing: >>> lst = [] >>> lst.extend(reversed('abcdefg')) >>> lst.extend(l) Performance matters are probably better in the dict-case, but without some compelling real-world examples, this really feels like YAGNI to me. STeVe -- Grammar am for people who can't think for myself. --- Bucky Katt, Get Fuzzy From steven.bethard at gmail.com Mon Mar 27 21:15:20 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Mon, 27 Mar 2006 12:15:20 -0700 Subject: [Python-3000] pre-PEP: Procedure for PEPs with Backwards-Incompatible Changes Message-ID: The (pre-)PEP should be mostly self-explanatory. I'm trying to lay down some guidelines for how backwards-incompatible changes should be introduced in Python 3000. Feedback is greatly appreciated, especially in the Identifying Correct Code section. PEP: XXX Title: Procedure for PEPs with Backwards-Incompatible Changes Version: $Revision$ Last-Modified: $Date$ Author: Steven Bethard Status: Draft Type: Informational Content-Type: text/x-rst Created: 03-Mar-2006 Post-History: 03-Mar-2006 Abstract ======== This PEP describes the procedure for changes to Python that introduce backwards incompatible changes between the Python 2.X series and Python 3000. All such changes must be documented by an appropriate Python 3000 PEP and must be accompanied by code that can identify when pieces of Python 2.X code will be incorrect in Python 3000. Rationale ========= Python 3000 will introduce a number of backwards-incompatible changes to Python, mainly to streamline the language and to remove some previous design mistakes. But Python 3000 is not intended to be a new and completely different language from the Python 2.X series, and it is expected that much of the Python user community will make the transition to Python 3000 when it becomes available. To encourage this transition, it is crucial to provide a clear and complete guide on how to upgrade Python 2.X code to Python 3000 code. 
Thus, for any backwards-incompatible change, two things are required:

* An official Python Enhancement Proposal (PEP)
* Code that can identify pieces of Python 2.X code that will be
  incorrect in Python 3000


Python Enhancement Proposals
============================

Every backwards-incompatible change must be accompanied by a PEP. This PEP should follow the usual PEP guidelines and explain the purpose and reasoning behind the backwards incompatible change. In addition to the usual PEP sections, all PEPs proposing backwards-incompatible changes must include an additional section: Compatibility Issues. This section should describe what is backwards incompatible about the proposed change to Python, and the major sorts of breakage to be expected.

While PEPs must still be evaluated on a case-by-case basis, a PEP may be inappropriate for Python 3000 if its Compatibility Issues section implies any of the following:

* Most or all instances of a Python 2.X construct are incorrect in
  Python 3000, and most or all instances of the Python 3000 construct
  are incorrect in Python 2.X.

  So for example, changing the meaning of the for-loop else-clause
  from "executed when the loop was not broken out of" to "executed
  when the loop had zero iterations" would mean that all Python 2.X
  for-loop else-clauses would be broken, and there would be no way to
  use a for-loop else-clause in a Python-3000-appropriate manner.
  Thus a PEP for such an idea would likely be rejected.

* Many instances of a Python 2.X construct are incorrect in Python
  3000 and the PEP fails to demonstrate real-world use-cases for the
  changes.

  Backwards incompatible changes are allowed in Python 3000, but not
  to excess. A PEP that proposes backwards-incompatible changes
  should provide good examples of code that visibly benefits from the
  changes.

Of course, PEP-writing is time-consuming, so it is encouraged that when a number of backwards-incompatible changes are closely related, they be proposed in the same PEP.
Such PEPs will likely have longer Compatibility Issues sections however, since they must now describe the sorts of breakage expected from *all* the proposed changes.


Identifying Incorrect Code
==========================

In addition to the PEP required, backwards incompatible changes to Python must also be accompanied by code that can identify pieces of Python 2.X code that will be incorrect in Python 3.0.

This PEP proposes to house this code in tools/scripts/python3warn.py. Thus PEPs for backwards incompatible changes should include a patch to this file that produces the appropriate warnings. Code in python3warn.py should be written to the latest version of Python 2.X (not Python 3000) so that Python 2.X users will be able to run the program without having Python 3000 installed.

Currently, it seems too stringent to require that the code in python3warn.py identify all changes perfectly. Thus it is permissible if a backwards-incompatible PEP's python3warn.py code produces a number of false-positives (warning that a piece of code might be invalid in Python 3000 when it's actually still okay). However, false-negatives (not issuing a warning for code that will do the wrong thing in Python 3000) should be avoided whenever possible -- users of python3warn.py should be reasonably confident that they have been warned about the vast majority of incompatibilities.


Optional Extensions
===================

Instead of the python3warn.py script, a branch of Python 3000 could be maintained that added warnings at all the appropriate points in the code-base. PEPs proposing backwards-incompatible changes would then provide patches to the Python-3000-warn branch instead of to python3warn.py. With such a branch, the warnings issued could be near-perfect and Python users could be confident that their code was correct Python 3000 code by first running it on the Python-3000-warn branch and fixing all the warnings.
At the moment, however, this PEP opts for the weaker measure (python3warn.py) as it is feared that maintaining a Python-3000-warn branch will be too much of a time drain. References ========== TBD Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: From p.f.moore at gmail.com Mon Mar 27 21:17:35 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 27 Mar 2006 20:17:35 +0100 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <4422FC96.2020409@zope.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> Message-ID: <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> On 3/27/06, Steven Bethard wrote: > On 3/26/06, Adam DePrince wrote: > > I have a draft PEP and an implementation of mutable iterators for lists > > and dicts that supports delete only. > > > > The PEP (Mutable Iterations) and sample code can be found at: > > > > http://www.deprince.net/ideas/peps.html > > I think the PEP really needs a much stronger motivation section, > particularly with real-world examples of code that gets improved by > the additional methods. The whole discussion was spawned by a request > for determining the length of an iterable, a problem which this PEP > doesn't solve at all. What problem is this PEP solving? Is there > real-world code where this PEP would help out? Agreed. I don't really see the relationship between this PEP and what went before. 
My understanding of the previous discussion was that there were a few use cases, based around the need to have more information about the underlying collection than is provided by the minimalist iterator spec, without passing concrete collections about. Guido referred to Java's collection framework, as a good example of how he saw such a facility developing. The next step was the suggestion that a PEP be written. So, in my mind, I was expecting a PEP which defined one or more new formal interfaces, (views) much like the iterator interface but with a wider set of methods. The view would be backed by a concrete collection, and the effects of view methods on on the underlying collection would be specified (e.g., view.delete() removes the entry from the view, and from the underlying collection, without affecting the element which will be produced by the next application of next(); or view.length() returns the number of elements in the underlying collection). The PEP would then go on to specify implementations to be provided in the core - views available from dict, list and set objects. That's a lot of work, and possibly more than one PEP, but that's what I imagined. The current PEP doesn't seem to match that. Some specific points: - Given that most uses of d.keys/values/items are in a for loop, there's no direct access to the iterator anyway. So the delete method is also inaccessible. This change would require code reorganisation, encouraging far more explicit passing round of iterators when looping. I'm not sure that's a good thing. - I don't see an *optional* addition to the iterator protocol as a good thing. Functions can't assume it exists for general iterators. So they have to be specified as taking iterators with a delete method". Better to give the interface a proper name and be done with it. - As Steven points out, deletion during an iteration isn't a particularly common use case. 
There are others (knowing the length of a sequence, or lookahead, for example) which are equally compelling. Do we get a special-purpose PEP for every one, with each concrete type growing a variety of optional iterator extensions? Python 2.x has already rejected an optional extension to some iterators allowing them to signal their length, as ultimately unhelpful. Better to have a well-defined concept which is a superset of the iterator protocol (mutable set view, say) which encapsulates a particular interface (iterator, plus other methods including length and delete) and just say that d.keys() returns a mutable set view. I won't go on any more - you probably get the idea... Paul. From brett at python.org Mon Mar 27 21:26:25 2006 From: brett at python.org (Brett Cannon) Date: Mon, 27 Mar 2006 11:26:25 -0800 Subject: [Python-3000] pre-PEP: Procedure for PEPs with Backwards-Incompatible Changes In-Reply-To: References: Message-ID: On 3/27/06, Steven Bethard wrote: > The (pre-)PEP should be mostly self-explanatory. I'm trying to lay > down some guidelines for how backwards-incompatible changes should be > introduced in Python 3000. Feedback is greatly appreciated, > especially in the Identifying Correct Code section. > > > PEP: XXX > Title: Procedure for PEPs with Backwards-Incompatible Changes > Version: $Revision$ > Last-Modified: $Date$ > Author: Steven Bethard > Status: Draft > Type: Informational > Content-Type: text/x-rst > Created: 03-Mar-2006 > Post-History: 03-Mar-2006 > > > Abstract > ======== > > This PEP describes the procedure for changes to Python that introduce > backwards incompatible changes between the Python 2.X series and > Python 3000. All such changes must be documented by an appropriate > Python 3000 PEP and must be accompanied by code that can identify > when pieces of Python 2.X code will be incorrect in Python 3000. 
> > > Rationale > ========= > > Python 3000 will introduce a number of backwards-incompatible changes > to Python, mainly to streamline the language and to remove some > previous design mistakes. But Python 3000 is not intended to be a new > and completely different language from the Python 2.X series, and it > is expected that much of the Python user community will make the > transition to Python 3000 when it becomes available. > > To encourage this transition, it is crucial to provide a clear and > complete guide on how to upgrade Python 2.X code to Python 3000 code. > Thus, for any backwards-incompatible change, two things are required: > > * An official Python Enhancement Proposal (PEP) > * Code that can identify pieces of Python 2.X code that will be > incorrect in Python 3000 > Requiring code that can identify things in 2.x that will change in 3.0 that are coded externally from the interpreter is going to be *really* difficult in some situations, if not impossible to get right. Just look at dict.items(); how do you do that? You can't just assume any object in a function that was passed in and has a items method called on it is a dict. But if you don't you can only detect for local scope (assuming you do a type inference for the code block). I say documenting where code will break and provide any possible transition strategy for people with code that will be affected is the best we can do in terms of requiring stuff. If you can provide code to detect the problem, great; but don't make it a base requirement for every change. 
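
To make the difficulty concrete, here is a minimal sketch of a purely syntactic check, written against the modern `ast` module rather than anything that existed for this proposal. The function name `find_items_calls` is hypothetical. It flags every `.items()` call it sees, and it has no way of knowing whether the receiver is actually a dict, which is exactly the limitation described above.

```python
import ast


def find_items_calls(source):
    """Return the line numbers of every `<expr>.items()` call in `source`.

    This is a syntactic check only: it cannot tell whether the object
    the method is called on is a dict, so it over-reports by design.
    """
    tree = ast.parse(source)
    hits = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "items"):
            hits.append(node.lineno)
    return sorted(hits)


sample = (
    "def f(obj):\n"
    "    return obj.items()\n"   # receiver type unknown: may or may not be a dict
    "x = {}.items()\n"           # receiver is visibly a dict literal
)
flagged = find_items_calls(sample)
```

Both calls are reported, even though only the second one is certain to involve a dict.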
-Brett From steven.bethard at gmail.com Mon Mar 27 21:37:32 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Mon, 27 Mar 2006 12:37:32 -0700 Subject: [Python-3000] pre-PEP: Procedure for PEPs with Backwards-Incompatible Changes In-Reply-To: References: Message-ID: On 3/27/06, Brett Cannon wrote: > Requiring code that can identify things in 2.x that will change in 3.0 > that are coded externally from the interpreter is going to be *really* > difficult in some situations, if not impossible to get right. Just > look at dict.items(); how do you do that? You issue a warning for every .items() call. If you want to be a little more intelligent about it, you issue a warning to every .items() call that isn't used directly in a for-loop. Sure, it's going to generate false-positives[1], but the PEP says that's okay. At least in my own code, it wouldn't generate that many false positives either -- most of my calls to items are on builtin dicts. Others who use a lot of custom dict-like objects would get more, but the custom dict-like objects likely need upgraded to the Python 3000 interface as well. Maybe each PEP should have it's own flag to enable/disable the warnings issued by python3warn.py? Then after you'd checked all your .items() calls (and presumably corrected them or determined they were false-positives), you could ask python3warn.py to stop warning you. Steve [1] It's going to generate some false-negatives too, when someone's done something like ``func = mydict.items``. But we're never going to get perfect warnings (unless people are crazy enough to want to maintain the branch), so I'm willing to accept that. -- Grammar am for people who can't think for myself. 
--- Bucky Katt, Get Fuzzy From jimjjewett at gmail.com Mon Mar 27 21:52:29 2006 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 27 Mar 2006 14:52:29 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1143432484.14391.67.camel@localhost.localdomain> References: <4422FC96.2020409@zope.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> Message-ID: On 3/26/06, Adam DePrince wrote: > [ On porting Java Iterators to python ] Summary of my response: Add a (convention of an) "exhausted" property that indicates whether the iterator is used up, without wasting a value. Add a conventional name for a reference to the underlying collection, where there is one. Anything beyond that feels is probably at least overkill, and may be actively bad. Detailed response: These extensions seem to be useful only when the iterator represents an underlying collection, and often only certain types of collection, such as ordered or mapping. If Iterators are really about views on a collection (rather than any iterator), then it might make sense to also consider databases and SQL. IMHO, everything but the four basic Insert, Update, Delete, and Select (=lookup, get) is clearly too heavyweight. Someone listing the members doesn't want the overhead of locking. Inserts can be badly defined (unordered collections) or unexpectedly expensive (insert at a position instead of appending), so a generic interface might just be an attractive nuisance. Deletes already work for dictionaries as a special case. Doing them on a list is a bad idea -- using a comprehension expression is far more efficient. So again, a generic interface might just be an attractive nuisance. Updates already work with a dictionary as a special case. 
For ordered collections, either it isn't meaningful, or you can just iterate for enum(it) instead. > hasPrev Yes Yes > previous Yes Yes Reversible iteration seems useful. But the original pep semi-reserved a name for it, and I haven't seen many objects bothering to implement it; perhaps it isn't so useful after all, in python practice. > current Yes Yes > (the item last returned by next/prev) How would this be used? The only time I've wanted something like this was so that I could write something like while file.readline(): ... instead of for line in file: ... and I'm not sure that my code would actually have been better if I could have done it. > Question #4: > > At first I wanted to implement Java fast-fail semantics, but now I have > reservations. It would require a bit of house keeping on each and evert > insert. IMHO, even adding as little as "flag=1" is too much for what is > likely one of the most commonly used pieces of code in Python. Can I > have a show of hands? For? against? Wait and see how bad the > performance hit is before deciding? I think an even bigger problem is either (1) checking isvalid on every *lookup*, or (2) the mess and inefficiency of forcing every (mutable) collection to (weakly) track all its iterators, and forcing every iterator to have methods for handling notification. (So now "for k in dict" needs to allocate (and deallocate) an extra weakref, and to do an extra insert and delete on the weakrefs list ... OK for large stable objects like a module, but not good for large numbers of three-item dicts.) 
-jJ From aahz at pythoncraft.com Mon Mar 27 21:53:57 2006 From: aahz at pythoncraft.com (Aahz) Date: Mon, 27 Mar 2006 11:53:57 -0800 Subject: [Python-3000] pre-PEP: Procedure for PEPs with Backwards-Incompatible Changes In-Reply-To: References: Message-ID: <20060327195357.GB21164@panix.com> On Mon, Mar 27, 2006, Steven Bethard wrote: > > Abstract > ======== > > This PEP describes the procedure for changes to Python that introduce > backwards incompatible changes between the Python 2.X series and > Python 3000. All such changes must be documented by an appropriate > Python 3000 PEP and must be accompanied by code that can identify > when pieces of Python 2.X code will be incorrect in Python 3000. s/incorrect/problematic/g There will be plenty of cases where code may be either correct or incorrect depending on the exact circumstances. Correctly determining whether any code is incorrect is impossible for a computer program; all we can guarantee is flagging code that *may* cause problems. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "Look, it's your affair if you want to play with five people, but don't go calling it doubles." --John Cleese anticipates Usenet From steven.bethard at gmail.com Mon Mar 27 22:17:19 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Mon, 27 Mar 2006 13:17:19 -0700 Subject: [Python-3000] pre-PEP: Procedure for PEPs with Backwards-Incompatible Changes In-Reply-To: <20060327195357.GB21164@panix.com> References: <20060327195357.GB21164@panix.com> Message-ID: On 3/27/06, Aahz wrote: > On Mon, Mar 27, 2006, Steven Bethard wrote: > > > > Abstract > > ======== > > > > This PEP describes the procedure for changes to Python that introduce > > backwards incompatible changes between the Python 2.X series and > > Python 3000. All such changes must be documented by an appropriate > > Python 3000 PEP and must be accompanied by code that can identify > > when pieces of Python 2.X code will be incorrect in Python 3000. 
> > s/incorrect/problematic/g > > There will be plenty of cases where code may be either correct or > incorrect depending on the exact circumstances. Correctly determining > whether any code is incorrect is impossible for a computer program; all > we can guarantee is flagging code that *may* cause problems. Yeah, that looks better. I've updated the PEP accordingly and posted it at: http://ucsu.colorado.edu/~bethard/py/pep-backwards-incompatible.txt Steve -- Grammar am for people who can't think for myself. --- Bucky Katt, Get Fuzzy From skip at pobox.com Tue Mar 28 06:39:42 2006 From: skip at pobox.com (skip at pobox.com) Date: Mon, 27 Mar 2006 22:39:42 -0600 Subject: [Python-3000] pre-PEP: Procedure for PEPs with Backwards-Incompatible Changes In-Reply-To: References: Message-ID: <17448.48654.533321.884011@montanaro.dyndns.org> Steven> The (pre-)PEP should be mostly self-explanatory.... Steven> PEP: XXX Steven> Title: Procedure for PEPs with Backwards-Incompatible Changes Suggestion: Make this PEP 3001 and start any Py3k PEPs from 3100. That gives plenty of room between any PEPs that might be written for 2.x and gives some space for various informational PEPs that are specific to Python 3.x. Skip From greg.ewing at canterbury.ac.nz Tue Mar 28 07:04:29 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 28 Mar 2006 17:04:29 +1200 Subject: [Python-3000] Parallel iteration syntax Message-ID: <4428C3DD.9090603@canterbury.ac.nz> Some years ago there was a long discussion about extending the for-loop to express parallel iteration over a number of iterables, which ended with the conclusion that such an extension was syntactically impossible, and the creation of zip(). Slightly too late for consideration, I did come up with what I believe is a backwards-compatible syntax extension to support this: for (x in iter1, y in iter2): ... This is currently a syntax error, so there is no clash with existing semantics. 
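
For comparison, a minimal sketch of how the same lockstep loop is spelled with the existing zip() builtin. The function and variable names are hypothetical, and the sketch uses present-day Python, where zip() stops at the end of the shorter input.

```python
def pair_up(iter1, iter2):
    """Walk two iterables in lockstep and collect the (x, y) pairs."""
    pairs = []
    for x, y in zip(iter1, iter2):   # x comes from iter1, y from iter2
        pairs.append((x, y))
    return pairs


result = pair_up("ab", [1, 2, 3])    # the extra 3 is never reached
```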
I'm mentioning it here again just in case anyone wants to consider it for Py3k. I still believe it would be nice to have a direct syntax for parallel iteration to avoid the overhead of using zip or iterzip. Also I think the above is easier to read, because it puts each variable next to the relevant expression. Greg From adam.deprince at gmail.com Tue Mar 28 07:26:11 2006 From: adam.deprince at gmail.com (adam deprince) Date: Tue, 28 Mar 2006 00:26:11 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> References: <4422FC96.2020409@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> Message-ID: > I won't go on any more - you probably get the idea... Agreed, scratch that, I'll rework it in the spriit of views. Cheers, Adam DePrince From adam.deprince at gmail.com Tue Mar 28 07:32:08 2006 From: adam.deprince at gmail.com (adam deprince) Date: Tue, 28 Mar 2006 00:32:08 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <4422FC96.2020409@zope.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> Message-ID: > I think an even bigger problem is either > (1) checking isvalid on every *lookup*, or > (2) the mess and inefficiency of forcing every (mutable) collection to > (weakly) track all its iterators, and forcing every iterator to have > methods for handling notification. 
(So now "for k in dict" needs
> to allocate (and deallocate) an extra weakref, and to do an extra
> insert and delete on the weakrefs list ... OK for large stable
> objects like
> a module, but not good for large numbers of three-item dicts.)

I agree about #2 wholeheartedly ... as for #1, well, dict.iter does this to some extent ... each call to .next entails confirming that the dict and iter agree about the size of the former.

I'm not going to address these right now because I'm tossing this PEP ... now that I reread it myself I don't really like it. (Things always look good right before you submit them.) I'm going to backtrack and describe a Java-views-style proposal that concrete classes can implement.

Cheers - Adam DePrince

>
> -jJ
>

From p.f.moore at gmail.com Tue Mar 28 10:45:34 2006
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 28 Mar 2006 09:45:34 +0100
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To: 
References: <4422FC96.2020409@zope.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com>
Message-ID: <79990c6b0603280045h4e670ea0l59d5f5feb925814d@mail.gmail.com>

On 3/28/06, adam deprince wrote:
> > I won't go on any more - you probably get the idea...
>
> Agreed, scratch that, I'll rework it in the spirit of views.

Thanks for taking my comments so well! When I wrote them, I was *really* worried they came across as too negative.

The key here (in my view) is probably to think in terms of use patterns. In a for loop

    for k in d: # or d.keys()
    for v in d.values():
    for k, v in d.items():

you have no access to anything other than the return values from next(). So, extending the methods on the iterator is effectively useless.
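
The point about the hidden iterator can be seen by writing out roughly what a for loop does. This is only an illustrative sketch in present-day Python with a hypothetical function name; the interpreter's real loop is implemented in C.

```python
def loop_by_hand(d):
    """Rough equivalent of `for k in d: seen.append(k)`."""
    it = iter(d)              # the iterator object a for loop creates internally
    seen = []
    while True:
        try:
            k = next(it)      # the loop body only ever receives this value
        except StopIteration:
            break
        seen.append(k)
    return seen


keys = sorted(loop_by_hand({"a": 1, "b": 2}))
```

In the real `for` statement the name `it` does not exist, so any extra method added to the iterator could not be reached from the loop body.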
If you pass around the iterator, you're going to be passing it to a function (for in-line code, you will definitely have access to the underlying container, so there's no value to going via the iterator). OK, so we're looking specifically at functions, which have access to the iterator only, and not to the underlying object. That's the key use case. So, we need to document the function's requirements. That's why we need a named protocol, and not just "an iterator which also supports methods X, Y and Z", because otherwise we end up with yet another "file-like object" mess. There are *lots* of concepts that might help people write functions like this. Just off the top of my head (excuse my lousy naming): - bounded iterator (supports a length method, guaranteed non-infinite) - reiterator (supports clone/restart type operations) - mutable iterator (supports delete and insert, with appropriate guarantees) It's only really use cases that can identify which of these might be worth supporting. Hope this helps, Paul. From adam.deprince at gmail.com Tue Mar 28 16:29:56 2006 From: adam.deprince at gmail.com (Adam DePrince) Date: Tue, 28 Mar 2006 09:29:56 -0500 Subject: [Python-3000] Parallel iteration syntax In-Reply-To: <4428C3DD.9090603@canterbury.ac.nz> References: <4428C3DD.9090603@canterbury.ac.nz> Message-ID: <1143556197.3305.18.camel@localhost.localdomain> On Tue, 2006-03-28 at 17:04 +1200, Greg Ewing wrote: > Some years ago there was a long discussion about extending > the for-loop to express parallel iteration over a number > of iterables, which ended with the conclusion that such > an extension was syntactically impossible, and the creation > of zip(). > > Slightly too late for consideration, I did come up with > what I believe is a backwards-compatible syntax extension > to support this: > > for (x in iter1, y in iter2): > ... > > This is currently a syntax error, so there is no clash > with existing semantics. 
> > I'm mentioning it here again just in case anyone wants > to consider it for Py3k. I still believe it would be > nice to have a direct syntax for parallel iteration > to avoid the overhead of using zip or iterzip. Does this save any overhead, other than the mental state of programmers? Cheers - Adam DePrince From adam.deprince at gmail.com Tue Mar 28 18:44:01 2006 From: adam.deprince at gmail.com (Adam DePrince) Date: Tue, 28 Mar 2006 11:44:01 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <79990c6b0603280045h4e670ea0l59d5f5feb925814d@mail.gmail.com> References: <4422FC96.2020409@zope.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <79990c6b0603280045h4e670ea0l59d5f5feb925814d@mail.gmail.com> Message-ID: <1143564241.3305.75.camel@localhost.localdomain> > for k in d: # or d.keys() > for v in d.values(): > for k, v in d.items(): Right now I'm entertaining two competing "answers" to some of the issues addressed in this thread. The first, and easiest to write about and implement, was to make iters deletable to give the appearance of having a view in what I thought was the only way we cared. The second is views. My concern was the artificial ordering, and an explosion of interfaces as we tried to accommodate 2^n feature flats. I started a PEP for views as well. The view PEP will be larger when finished, I'm proposing a number of interfaces, including proper C-level interfaces. Ironically, when finished, the Set object's methods might be entirely SetViewInterface methods :-) I don't see the mutable iterator going anywhere just yet, for one, as you mentioned it doesn't work with common use cases, and its proper operation in the dict is dependent on our specific implementation. 
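
As a reference point for the dict case, here is a small sketch of how deletion during iteration behaves in present-day CPython, together with the usual snapshot workaround. The function names are hypothetical, and the error shown is current behaviour, not part of either proposal.

```python
def delete_while_iterating(d):
    """Return True if deleting during direct iteration is detected."""
    try:
        for k in d:
            del d[k]          # changes the dict's size mid-iteration
    except RuntimeError:      # "dictionary changed size during iteration"
        return True
    return False


def delete_rejected(d, reject):
    """Delete every key for which reject(key) is true, via a snapshot."""
    for k in list(d):         # list(d) copies the keys, so d can be mutated
        if reject(k):
            del d[k]
    return d


detected = delete_while_iterating({"a": 1, "b": 2})
kept = delete_rejected({"a": 1, "b": 2}, lambda k: k == "a")
```

The snapshot version is what a deleting iterator or a view would replace, at the cost of copying the keys up front.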
> - bounded iterator (supports a length method, guaranteed non-infinite)

Perhaps we can just add to itertools:

    def bound( i, c ):
        # yield at most c items from the iterator i
        for x in xrange( c ):
            yield i.next()

> - reiterator (supports clone/restart type operations)

itertools.tee or repeat?

> - mutable iterator (supports delete and insert, with appropriate guarantees)

The original mutable iterator supported this idea; the sudden explosion of methods afterward was intended to gauge the community's view of how rich it should be. I'll continue this only to see if it grows into something, but my hopes aren't that high for it :-)

>
> It's only really use cases that can identify which of these might be
> worth supporting.
>
> Hope this helps,
> Paul.

From adam.deprince at gmail.com Tue Mar 28 19:07:08 2006
From: adam.deprince at gmail.com (Adam DePrince)
Date: Tue, 28 Mar 2006 12:07:08 -0500
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To: <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com>
References: <4422FC96.2020409@zope.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com>
Message-ID: <1143565628.3305.82.camel@localhost.localdomain>

> Agreed. I don't really see the relationship between this PEP and what
> went before. My understanding of the previous discussion was that

I know. Here is my start of a ViewInterface PEP. It's a closer match, and a bit larger.
Updated versions can be found at

http://deprince.net/ideas/pep-views.txt

PEP: XXX
Title: Views
Version: $Revision: 1.4 $
Last-Modified: $Date: 2006/03/29 04:08:48 $
Author: , Adam DePrince
Status: Draft
Type: Standards
Content-Type: text/plain
Created: 25-March-2006
Post-History:

Abstract:

    This PEP proposes a plurality of interfaces in the spirit of Java's
    Collection interface. These interfaces will be collectively known
    as Views.

Motivation

    Data-stores typically have a number of accessor methods that allow
    a data-store to be viewed from a perspective different than its own.
    There currently exists no collection of standard interfaces for
    data-stores to choose and implement. The result is a plethora of
    overlapping methods and semantics with no strict commonality. The
    concern is that the applicability of duck typing in Python would
    become increasingly restricted as the result of subtle variations
    in semantics and methods for what should be otherwise similar
    implementations.

    If we hold up the dict for special consideration, the functions
    keys, items and values represent the initial effort to provide this
    abstraction. At the time the only logical structure to map to was a
    concrete list. With the arrival of iteration came the iter variants
    of these functions. Again, neither of these structures is the
    natural view of their perspective on the underlying data store.
    Now with the set type formally part of Python 2.4 we are faced
    again with the urge to provide a view that represents an even more
    natural perspective. Nobody wants to see a setkeys, setitems ...

    Further complicating the issue are similar data structures whose
    aggregation functions are conceptually similar ... without a
    central, language-enforced notion of what a specific variation of a
    view should sound like, we run the risk of incompatibility and
    rejection of views that should otherwise be equivalent.

    The second problem lies in the connectivity between these two.
    Again, holding our dict type up for undue scrutiny as a specific
    example of a general problem, the aggregation functions do not
    provide a synchronized view into the underlying data structure;
    the list variants are strictly snapshots and don't reflect
    modification. The iter variants fail upon modification.

    The notion of a view is a natural offshoot of the same reason we
    have separate keys, items and values methods. It is desirable to
    examine a data-structure from more than one perspective. While this
    might seem of marginal utility for a List -> Collection view,
    consider the needs of future data-structure designers. A graph
    could be viewed as a mapping, a set of nodes or a collection of
    edges. A database object could provide arbitrarily complicated
    views.

    By providing a standard set of view interfaces we provide future
    users a way of ensuring compatibility between their existing code
    and a new data store without the worry that their goose might not
    sound enough like a duck.

    If your aggregate return looks like: then you must implement.

SPECIFICATION - VIEW

    All views must support the SetView interface.

    UNIQUENESS

    Data-stores whose elements are not unique among themselves must
    support the MultiView interface. Data-stores whose elements are
    unique among themselves must not implement the CollectionView
    interface.

    ORDERED

    Data-stores whose elements are ordered, either an assigned order as
    in a list, or an intrinsic order such as a Tree, must implement the
    OrderedView interface. Data-stores whose elements are not ordered
    need not support any interface of this category.

    MAP

    Data-stores that represent a map must implement the MappingView
    interface.

    The following interface names are abbreviations for the following
    permutations of the above.

    * CollectionView (SetView + MultiView)
    * ListView (SetView + MultiView + OrderedView)
    * OrderedSetView (SetView + OrderedView)
    * MapView (SetView + MappingView)
    * OrderedMapView (SetView + MappingView + OrderedView)
    * MultiMapView (SetView + MultiView + MappingView)
    * OrderedMultiMapView (SetView + CollectionView + MappingView +
      OrderedView)

    The following interfaces alone are functional members of the
    ViewsInterface:

    * SetView

SPECIFICATION - VIEW CASTING

    Views may be cast between each other so long as the new view's
    implementor list is a subset of the old.

    Casting a MapView to a view that does not implement MapView must
    take place through one of the MapView's keys, items or values
    methods.

    The following is the full ruleset for value casts.

    MapView.keys -> SetView
    MapView.items -> SetView
    MapView.values -> CollectionView
    MultiMapView.keys -> CollectionView
    MultiMapView.items -> CollectionView
    MultiMapView.values -> CollectionView
    .unordered ->
    .unique ->

SPECIFICATION - CHANGES TO CONCRETE CLASSES

    The 3 methods iter{keys,values,items} of dict will be deprecated.
    The existing keys, values and items will adopt the semantics of the
    MapView interface.

    Example use

    >>> d = {}
    >>> print d.keys()

SPECIFICATION - ABSTRACT METHODS

    Implementors of the CollectionsView would be required to offer:

    Implementors of the ListView would be required to offer.

ALSO CONSIDERED

    Mutable iterators were also considered and rejected.
Numerous concerns were cited: * Excessive housekeeping for mutable iterators in dict * Performance problems for mutable iterators on list * Limited utility to due inaccessibility of iterator on common use cases (for-loop) From guido at python.org Tue Mar 28 19:40:34 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 28 Mar 2006 09:40:34 -0800 Subject: [Python-3000] pre-PEP: Procedure for PEPs with Backwards-Incompatible Changes In-Reply-To: <17448.48654.533321.884011@montanaro.dyndns.org> References: <17448.48654.533321.884011@montanaro.dyndns.org> Message-ID: On 3/27/06, skip at pobox.com wrote: > Suggestion: Make this PEP 3001 and start any Py3k PEPs from 3100. That > gives plenty of room between any PEPs that might be written for 2.x and > gives some space for various informational PEPs that are specific to Python > 3.x. I already proposed that numbering scheme. More formally, Py3k meta PEPs go between 3001 and 3099, and feature PEPs start at 3100 (and hopefully we won't have to overflow into 4000 :-). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From steven.bethard at gmail.com Tue Mar 28 19:42:34 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Tue, 28 Mar 2006 10:42:34 -0700 Subject: [Python-3000] Parallel iteration syntax In-Reply-To: <4428C3DD.9090603@canterbury.ac.nz> References: <4428C3DD.9090603@canterbury.ac.nz> Message-ID: On 3/27/06, Greg Ewing wrote: > Some years ago there was a long discussion about extending > the for-loop to express parallel iteration over a number > of iterables, which ended with the conclusion that such > an extension was syntactically impossible, and the creation > of zip(). > > Slightly too late for consideration, I did come up with > what I believe is a backwards-compatible syntax extension > to support this: > > for (x in iter1, y in iter2): > ... I assume this would be exactly equivalent to:: for x, y in zip(iter1, iter2): ... 
where zip is actually izip since we're talking Python 3000? I'm -1, at least until I see some code that's substantially improved by the syntax. The zip version isn't that complicated -- you just need to understand how zip works. And zip has a variety of other use-cases, so any consistent user of Python should get themselves familiar with it. STeVe -- Grammar am for people who can't think for myself. --- Bucky Katt, Get Fuzzy From mcherm at mcherm.com Tue Mar 28 15:54:12 2006 From: mcherm at mcherm.com (Michael Chermside) Date: Tue, 28 Mar 2006 05:54:12 -0800 Subject: [Python-3000] Parallel iteration syntax Message-ID: <20060328055412.7w4i7h7ten4kc4ow@login.werra.lunarpages.com> Greg Ewing writes: > Some years ago there was a long discussion about extending > the for-loop to express parallel iteration over a number > of iterables [...] > I'm mentioning it here again just in case anyone wants > to consider it for Py3k. I still believe it would be > nice to have a direct syntax for parallel iteration > to avoid the overhead of using zip or iterzip. There's one big problem I see with this. Parallel iteration is underspecified... there are several reasonable choices for what to do if the iterables are of differing length. (1) Fill out the short iterables with "None" elements. (2) Stop as soon as any iterable runs out. (3) Raise an exception if the iterables are not all the same length. Today, we support these with different idioms: (1) result = map(some_func, seq_x, seq_y) (2) for x, y in zip(seq_x, seq_y): some_func(x, y) (3) # all right, there's no common idiom I know of for # (3), but there ought to be, because it is probably # what the programmer wants most often, since the # most common case is where you are expecting the # lists to be the same length. Frankly, I have no interest in supporting (1)... it's not really natural. But both (2) and (3) have some good points. 
(2) is fundamentally more powerful (it can do things that (3) can't),
while (3) is probably what programmers need most often; it has the
advantage of being explicit about errors and it is difficult to
simulate.

So which to support?

-- Michael Chermside

PS: To be explicit, what I mean by (3) is this:

def equizip(*iterables):
    iterators = [iter(x) for x in iterables]
    while True:
        try:
            first_value = iterators[0].next()
            try:
                other_values = [x.next() for x in iterators[1:]]
            except StopIteration:
                raise IterableLengthMismatch
            else:
                values = [first_value] + other_values
                yield tuple(values)
        except StopIteration:
            for iterator in iterators[1:]:
                try:
                    extra_value = iterator.next()
                except StopIteration:
                    pass # this is what we expect
                else:
                    raise IterableLengthMismatch
            raise StopIteration

for x, y in equizip(seq_x, seq_y):
    some_func(x, y)

From rasky at develer.com Tue Mar 28 16:52:55 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Tue, 28 Mar 2006 16:52:55 +0200
Subject: [Python-3000] Parallel iteration syntax
References: <4428C3DD.9090603@canterbury.ac.nz>
Message-ID: <080301c65277$4bd0f890$bf03030a@trilan>

Greg Ewing wrote:

> for (x in iter1, y in iter2):
>     ...

Contrary to zip()/izip(), this does not easily allow further
composition, as far as I can tell. For instance:

    for i,(x,y) in enumerate(izip(iter1, iter2)):
        ...

must be translated to:

    for (i,x in enumerate(iter1), y in iter2):

or:

    for (x in iter1, i,y in enumerate(iter2)):

both of which require one further mental step, if you're coming from:

    i = 0
    for (x in iter1, y in iter2):
        ...
i += 1 -- Giovanni Bajo From skip at pobox.com Tue Mar 28 19:47:45 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 28 Mar 2006 11:47:45 -0600 Subject: [Python-3000] pre-PEP: Procedure for PEPs with Backwards-Incompatible Changes In-Reply-To: References: <17448.48654.533321.884011@montanaro.dyndns.org> Message-ID: <17449.30401.409795.632234@montanaro.dyndns.org> >> Suggestion: Make this PEP 3001 and start any Py3k PEPs from 3100.... Guido> I already proposed that numbering scheme. More formally, Py3k Guido> meta PEPs go between 3001 and 3099, and feature PEPs start at Guido> 3100 (and hopefully we won't have to overflow into 4000 :-). Should there be some distinction between Py3k PEPs which fall under the purview of Steven's PEP and those which contain completely new stuff and aren't going to impact Python 2.x code? Skip From guido at python.org Tue Mar 28 20:07:13 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 28 Mar 2006 10:07:13 -0800 Subject: [Python-3000] Parallel iteration syntax In-Reply-To: <4428C3DD.9090603@canterbury.ac.nz> References: <4428C3DD.9090603@canterbury.ac.nz> Message-ID: On 3/27/06, Greg Ewing wrote: > Some years ago there was a long discussion about extending > the for-loop to express parallel iteration over a number > of iterables, which ended with the conclusion that such > an extension was syntactically impossible, and the creation > of zip(). > > Slightly too late for consideration, I did come up with > what I believe is a backwards-compatible syntax extension > to support this: > > for (x in iter1, y in iter2): > ... > > This is currently a syntax error, so there is no clash > with existing semantics. > > I'm mentioning it here again just in case anyone wants > to consider it for Py3k. I still believe it would be > nice to have a direct syntax for parallel iteration > to avoid the overhead of using zip or iterzip. 
> > Also I think the above is easier to read, because it > puts each variable next to the relevant expression. Based on the feedback so far I think not. There's also the issue that for (x in A, y in B): could just as well be meant as a shortcut for for x in A: for y in B: The proposed syntax doesn't quite jive with my guts, and the issue of "what to do if they are of unequal length" is a good one, which is better solved by being explicit and using zip (== izip). Finally (maybe this is why it doesn't jive :-) it seems to me that separating the variables is *worse* than for x, y in zip(A, B): because the latter emphasizes that you get a new x and a new y at the same time. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Mar 28 20:08:39 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 28 Mar 2006 10:08:39 -0800 Subject: [Python-3000] pre-PEP: Procedure for PEPs with Backwards-Incompatible Changes In-Reply-To: <17449.30401.409795.632234@montanaro.dyndns.org> References: <17448.48654.533321.884011@montanaro.dyndns.org> <17449.30401.409795.632234@montanaro.dyndns.org> Message-ID: On 3/28/06, skip at pobox.com wrote: > > >> Suggestion: Make this PEP 3001 and start any Py3k PEPs from 3100.... > > Guido> I already proposed that numbering scheme. More formally, Py3k > Guido> meta PEPs go between 3001 and 3099, and feature PEPs start at > Guido> 3100 (and hopefully we won't have to overflow into 4000 :-). > > Should there be some distinction between Py3k PEPs which fall under the > purview of Steven's PEP and those which contain completely new stuff and > aren't going to impact Python 2.x code? I don't think there are enough dimensions in the numbering scheme to indicate all possible distinctions. IOW, no. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From g.brandl at gmx.net Tue Mar 28 20:13:31 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 28 Mar 2006 20:13:31 +0200 Subject: [Python-3000] Parallel iteration syntax In-Reply-To: References: <4428C3DD.9090603@canterbury.ac.nz> Message-ID: <44297CCB.10008@gmx.net> Guido van Rossum wrote: >> Slightly too late for consideration, I did come up with >> what I believe is a backwards-compatible syntax extension >> to support this: >> >> for (x in iter1, y in iter2): >> ... [...] > Based on the feedback so far I think not. There's also the issue that > > for (x in A, y in B): > > could just as well be meant as a shortcut for > > for x in A: > for y in B: > > The proposed syntax doesn't quite jive with my guts, and the issue of > "what to do if they are of unequal length" is a good one, which is > better solved by being explicit and using zip (== izip). > > Finally (maybe this is why it doesn't jive :-) it seems to me that > separating the variables is *worse* than > > for x, y in zip(A, B): > > because the latter emphasizes that you get a new x and a new y at the same time. I agree. A related issue is, when map() is gone, could zip() grow a keyword argument, say, 'extend' so that map(None, A, B) translates to zip(A, B, extend=None)? zip(A, B, extend=0) would fill with 0 etc. Georg -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 191 bytes Desc: OpenPGP digital signature Url : http://mail.python.org/pipermail/python-3000/attachments/20060328/1cc37db8/attachment.pgp From g.brandl at gmx.net Tue Mar 28 20:14:19 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 28 Mar 2006 20:14:19 +0200 Subject: [Python-3000] pre-PEP: Procedure for PEPs with Backwards-Incompatible Changes In-Reply-To: References: <17448.48654.533321.884011@montanaro.dyndns.org> <17449.30401.409795.632234@montanaro.dyndns.org> Message-ID: <44297CFB.7010601@gmx.net> Guido van Rossum wrote: > On 3/28/06, skip at pobox.com wrote: >> >> >> Suggestion: Make this PEP 3001 and start any Py3k PEPs from 3100.... >> >> Guido> I already proposed that numbering scheme. More formally, Py3k >> Guido> meta PEPs go between 3001 and 3099, and feature PEPs start at >> Guido> 3100 (and hopefully we won't have to overflow into 4000 :-). >> >> Should there be some distinction between Py3k PEPs which fall under the >> purview of Steven's PEP and those which contain completely new stuff and >> aren't going to impact Python 2.x code? > > I don't think there are enough dimensions in the numbering scheme to > indicate all possible distinctions. IOW, no. Why, I expected 31xx to be Apocalypses, 32xx to be Synopses and 33xx to be Exe$(Z(/)"//&& -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 191 bytes Desc: OpenPGP digital signature Url : http://mail.python.org/pipermail/python-3000/attachments/20060328/cc262c17/attachment.pgp From ianb at colorstudy.com Tue Mar 28 20:14:25 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 28 Mar 2006 12:14:25 -0600 Subject: [Python-3000] pre-PEP: Procedure for PEPs with Backwards-Incompatible Changes In-Reply-To: <17449.30401.409795.632234@montanaro.dyndns.org> References: <17448.48654.533321.884011@montanaro.dyndns.org> <17449.30401.409795.632234@montanaro.dyndns.org> Message-ID: <44297D01.7040100@colorstudy.com> skip at pobox.com wrote: > >> Suggestion: Make this PEP 3001 and start any Py3k PEPs from 3100.... > > Guido> I already proposed that numbering scheme. More formally, Py3k > Guido> meta PEPs go between 3001 and 3099, and feature PEPs start at > Guido> 3100 (and hopefully we won't have to overflow into 4000 :-). > > Should there be some distinction between Py3k PEPs which fall under the > purview of Steven's PEP and those which contain completely new stuff and > aren't going to impact Python 2.x code? I notice also that some of these suggestions are applicable to 2.x, like the "Parallel iteration syntax" (which introduces no backward incompatibilities). If something can be applied to 2.x, should that be brought up in py-dev instead (or c.l.p.)? There seems to be a danger that Py3K is seen as a more friendly place to suggest changes than Python 2.x (or maybe that the python-3000 list is more friendly to these suggestions than py-dev), and so changes are brought up here even though they could be applied earlier. This would be problematic in part because if a change *can* go in 2.x, it would be really good if it did, so that the move to 3.0 involves as few changes as possible. Formalizing the target implementation through the PEP numbering might also cause premature expectations about when the feature might be introduced. 
Though if it's okay to just renumber the PEP then that wouldn't be a problem. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From ianb at colorstudy.com Tue Mar 28 20:22:31 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 28 Mar 2006 12:22:31 -0600 Subject: [Python-3000] Parallel iteration syntax In-Reply-To: References: <4428C3DD.9090603@canterbury.ac.nz> Message-ID: <44297EE7.7020300@colorstudy.com> Guido van Rossum wrote: > The proposed syntax doesn't quite jive with my guts, and the issue of > "what to do if they are of unequal length" is a good one, which is > better solved by being explicit and using zip (== izip). Is zip() going to be equivalent to izip(), or will it be a view? I vote for view. xrange() does not produce an iterator, so there is some precedence that we not replace list-constructing-builtins with iterator-constructing-builtins. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From nnorwitz at gmail.com Tue Mar 28 20:48:36 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 28 Mar 2006 10:48:36 -0800 Subject: [Python-3000] pre-PEP: Procedure for PEPs with Backwards-Incompatible Changes In-Reply-To: <44297D01.7040100@colorstudy.com> References: <17448.48654.533321.884011@montanaro.dyndns.org> <17449.30401.409795.632234@montanaro.dyndns.org> <44297D01.7040100@colorstudy.com> Message-ID: On 3/28/06, Ian Bicking wrote: > > There seems to be a danger that Py3K is seen as a more friendly place to > suggest changes than Python 2.x (or maybe that the python-3000 list is > more friendly to these suggestions than py-dev), and so changes are > brought up here even though they could be applied earlier. This would > be problematic in part because if a change *can* go in 2.x, it would be > really good if it did, so that the move to 3.0 involves as few changes > as possible. I'm not concerned about that. Everyone here will ensure that if a feature should go into 2.x, it will. 
The feature may be implemented first in 3k, the PEP may be 3xxx, but when it's ready, it will migrate to 2.x also. This is important for moving to 3k. We need to make the migration as simple as possible. "Backporting" these features is one aspect of making it easier. No one is forgetting about 2.x by any means. There seemed to be general consensus that there will be at least a couple more 2.x releases. Or maybe that was just my view and no one disagreed. :-) n From guido at python.org Tue Mar 28 20:55:59 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 28 Mar 2006 10:55:59 -0800 Subject: [Python-3000] pre-PEP: Procedure for PEPs with Backwards-Incompatible Changes In-Reply-To: <44297D01.7040100@colorstudy.com> References: <17448.48654.533321.884011@montanaro.dyndns.org> <17449.30401.409795.632234@montanaro.dyndns.org> <44297D01.7040100@colorstudy.com> Message-ID: On 3/28/06, Ian Bicking wrote: > skip at pobox.com wrote: > > >> Suggestion: Make this PEP 3001 and start any Py3k PEPs from 3100.... > > > > Guido> I already proposed that numbering scheme. More formally, Py3k > > Guido> meta PEPs go between 3001 and 3099, and feature PEPs start at > > Guido> 3100 (and hopefully we won't have to overflow into 4000 :-). > > > > Should there be some distinction between Py3k PEPs which fall under the > > purview of Steven's PEP and those which contain completely new stuff and > > aren't going to impact Python 2.x code? > > I notice also that some of these suggestions are applicable to 2.x, like > the "Parallel iteration syntax" (which introduces no backward > incompatibilities). If something can be applied to 2.x, should that be > brought up in py-dev instead (or c.l.p.)? > > There seems to be a danger that Py3K is seen as a more friendly place to > suggest changes than Python 2.x (or maybe that the python-3000 list is > more friendly to these suggestions than py-dev), and so changes are > brought up here even though they could be applied earlier. 
This would
> be problematic in part because if a change *can* go in 2.x, it would be
> really good if it did, so that the move to 3.0 involves as few changes
> as possible.
>
> Formalizing the target implementation through the PEP numbering might
> also cause premature expectations about when the feature might be
> introduced. Though if it's okay to just renumber the PEP then that
> wouldn't be a problem.

Right. I guess this is one of the meta-issues that we should decide
upon shortly. I like your strawman: if incompatibilities or synergy
don't require it to go into Py3k, let's propose it for 2.x.

(But in general it's too late for 2.5, unless you have a *really*
small tweak or a *really* important issue. We don't want to make 2.5
late.)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Tue Mar 28 21:01:49 2006
From: guido at python.org (Guido van Rossum)
Date: Tue, 28 Mar 2006 11:01:49 -0800
Subject: [Python-3000] Parallel iteration syntax
In-Reply-To: <44297EE7.7020300@colorstudy.com>
References: <4428C3DD.9090603@canterbury.ac.nz> <44297EE7.7020300@colorstudy.com>
Message-ID:

On 3/28/06, Ian Bicking wrote:
> Guido van Rossum wrote:
> > The proposed syntax doesn't quite jive with my guts, and the issue of
> > "what to do if they are of unequal length" is a good one, which is
> > better solved by being explicit and using zip (== izip).
>
> Is zip() going to be equivalent to izip(), or will it be a view? I vote
> for view. xrange() does not produce an iterator, so there is some
> precedence that we not replace list-constructing-builtins with
> iterator-constructing-builtins.

I believe it should be an iterator (i.e. izip()). I think we should be
careful with making everything a view, especially if the *input* can
be an arbitrary iterator. filter(), map(), zip(), enumerate() all make
perfect sense with an iterator as input, and I don't want to think
about the consequences of allowing views on iterators.

I think views should only be used when the view invariants can be
easily sustained by the underlying data type. A dict can be taught
about its views and has complete control because you get the views by
calling a method. That's not the case for zip().

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From skip at pobox.com Tue Mar 28 21:03:56 2006
From: skip at pobox.com (skip at pobox.com)
Date: Tue, 28 Mar 2006 13:03:56 -0600
Subject: [Python-3000] pre-PEP: Procedure for PEPs with Backwards-Incompatible Changes
In-Reply-To: <44297D01.7040100@colorstudy.com>
References: <17448.48654.533321.884011@montanaro.dyndns.org> <17449.30401.409795.632234@montanaro.dyndns.org> <44297D01.7040100@colorstudy.com>
Message-ID: <17449.34972.48418.435555@montanaro.dyndns.org>

    Ian> There seems to be a danger that Py3K is seen as a more friendly
    Ian> place to suggest changes than Python 2.x (or maybe that the
    Ian> python-3000 list is more friendly to these suggestions than
    Ian> py-dev), and so changes are brought up here even though they could
    Ian> be applied earlier. This would be problematic in part because if a
    Ian> change *can* go in 2.x, it would be really good if it did, so that
    Ian> the move to 3.0 involves as few changes as possible.

There are some things which would be difficult to do cleanly in 2.x
because of syntactic or semantic limitations. We've agonized over a
number of changes to the language in the past few years which were made
more difficult by the preexisting bits of the language or its various
implementations. I imagine proposals which fall into that category
might well be Py3k only. I see no reason to implement them badly in 2.x
at this point only to break them incompatibly in 3.x.

That was why I asked about a distinction between PEPs.
Skip

From ianb at colorstudy.com Tue Mar 28 21:07:31 2006
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 28 Mar 2006 13:07:31 -0600
Subject: [Python-3000] Parallel iteration syntax
In-Reply-To: <44297EE7.7020300@colorstudy.com>
References: <4428C3DD.9090603@canterbury.ac.nz> <44297EE7.7020300@colorstudy.com>
Message-ID: <44298973.6030201@colorstudy.com>

Ian Bicking wrote:
> Guido van Rossum wrote:
>
>>The proposed syntax doesn't quite jive with my guts, and the issue of
>>"what to do if they are of unequal length" is a good one, which is
>>better solved by being explicit and using zip (== izip).
>
>
> Is zip() going to be equivalent to izip(), or will it be a view? I vote
> for view. xrange() does not produce an iterator, so there is some
> precedence that we not replace list-constructing-builtins with
> iterator-constructing-builtins.

It occurs to me it could just be a funny kind of delegate as well:

class zip:
    def __init__(self, *subobjs):
        self._subobjs = subobjs
    def __repr__(self):
        return '%s(%s)' % (
            self.__class__.__name__,
            ', '.join(repr(obj) for obj in self._subobjs))
    def __getattr__(self, attr):
        def repl(*args, **kw):
            return tuple([getattr(obj, attr)(*args, **kw)
                          for obj in self._subobjs])
        return repl
    def __iter__(self):
        iters = [iter(obj) for obj in self._subobjs]
        while 1:
            yield tuple([i.next() for i in iters])

z = zip([1, 2, 3], ['one', 'two', 'three'])

That __getattr__ doesn't work correctly for new style classes. I
haven't figured out the proper way to create delegates since new-style
classes came along.

Also, maybe zip() really should be sequence-oriented. E.g., z[0:2]
should probably return [(1, 'one'), (2, 'two')], not ([1, 2], ['one',
'two']) (as it does with this implementation). So maybe it's not quite
so generalizable as I make it here. But a more specific sequence-based
delegation seems appropriate.
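A minimal sketch of the sequence-oriented alternative described above, assuming every input is an indexable sequence. The name ZipView is hypothetical, and the code targets current Python 3, not the Python 2 of the class above; it is one possible reading of "sequence-based delegation", not the author's implementation.

```python
class ZipView:
    """Hypothetical read-only, sequence-style view over several sequences."""

    def __init__(self, *seqs):
        self._seqs = seqs

    def __len__(self):
        # Least common denominator: as long as the shortest input.
        return min((len(s) for s in self._seqs), default=0)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slice over the combined rows, not over each input separately.
            return [self[i] for i in range(*index.indices(len(self)))]
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("ZipView index out of range")
        return tuple(s[index] for s in self._seqs)

    def __iter__(self):
        return (self[i] for i in range(len(self)))


numbers = [1, 2, 3]
names = ['one', 'two', 'three']
z = ZipView(numbers, names)
first_two = z[0:2]      # rows, i.e. [(1, 'one'), (2, 'two')]
```

Because nothing is copied, a later change to numbers or names shows up through z, which is the property that separates a view from a list snapshot.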
z[0] should (and does) only work if all inputs are indexable; iterators as input are fine, so long as you only use the resulting object as an iterator. That is, you get the least-common-denominator of the inputs. I think this is attractive since it preserves the current behavior of zip() in many (but not all) ways, and seems both fairly transparent and efficient. The resulting object could be mutable -- there's no technical reason it should be mutable -- but it is somewhat questionable. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From guido at python.org Tue Mar 28 21:09:41 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 28 Mar 2006 11:09:41 -0800 Subject: [Python-3000] pre-PEP: Procedure for PEPs with Backwards-Incompatible Changes In-Reply-To: References: <17448.48654.533321.884011@montanaro.dyndns.org> <17449.30401.409795.632234@montanaro.dyndns.org> <44297D01.7040100@colorstudy.com> Message-ID: On 3/28/06, Neal Norwitz wrote: > I'm not concerned about that. Everyone here will ensure that if a > feature should go into 2.x, it will. The feature may be implemented > first in 3k, the PEP may be 3xxx, but when it's ready, it will migrate > to 2.x also. This is important for moving to 3k. We need to make the > migration as simple as possible. "Backporting" these features is one > aspect of making it easier. > > No one is forgetting about 2.x by any means. There seemed to be > general consensus that there will be at least a couple more 2.x > releases. Or maybe that was just my view and no one disagreed. :-) It's my view too. But I think there will be less reason/opportunity to backport 3.x features to 2.x than there were, for example, in the Zope 3 vs. Zope 2 situation. Zope 3 was a completely new system where much new ground was broken, including new implementations of functionality that already existed in Zope 2. 
When the new implementation in Zope 3 was deemed mature enough and a serious improvement on the equivalent in Zope 2, it was sometimes backported. The Python situation is a bit different -- we're not starting with a reimplementation from the ground up, and I expect that many of the new features will be constrained by syntax or semantics (e.g. Unicode-only strings) which will prevent them to be backported. However, I suppose a bytes type might find its way into 2.6, and perhaps also an alternate I/O stack. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Tue Mar 28 21:50:52 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 28 Mar 2006 13:50:52 -0600 Subject: [Python-3000] pre-PEP: Procedure for PEPs with Backwards-Incompatible Changes In-Reply-To: References: <17448.48654.533321.884011@montanaro.dyndns.org> <17449.30401.409795.632234@montanaro.dyndns.org> <44297D01.7040100@colorstudy.com> Message-ID: <17449.37788.807759.169269@montanaro.dyndns.org> >> No one is forgetting about 2.x by any means. There seemed to be >> general consensus that there will be at least a couple more 2.x >> releases. Or maybe that was just my view and no one disagreed. :-) Guido> It's my view too. Are you sure it's not your iter? ;-) Skip From steven.bethard at gmail.com Tue Mar 28 22:21:11 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Tue, 28 Mar 2006 13:21:11 -0700 Subject: [Python-3000] pre-PEP: Procedure for PEPs with Backwards-Incompatible Changes In-Reply-To: References: <17448.48654.533321.884011@montanaro.dyndns.org> <17449.30401.409795.632234@montanaro.dyndns.org> <44297D01.7040100@colorstudy.com> Message-ID: On 3/28/06, Guido van Rossum wrote: > I like your strawman: if incompatibilities or synergy > don't require it to go into Py3k, let's propose it for 2.x. Yeah, I think this makes a lot of sense - and we should probably document it somewhere. Do you want this in the Backwards-Incompatible Changes PEP? 
Or another PEP? Or maybe just an update to PEP 1?

Steve
--
Grammar am for people who can't think for myself.
        --- Bucky Katt, Get Fuzzy

From greg.ewing at canterbury.ac.nz Wed Mar 29 04:20:04 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 29 Mar 2006 14:20:04 +1200
Subject: [Python-3000] pre-PEP: Procedure for PEPs with Backwards-Incompatible Changes
In-Reply-To:
References: <17448.48654.533321.884011@montanaro.dyndns.org> <17449.30401.409795.632234@montanaro.dyndns.org>
Message-ID: <4429EED4.8020509@canterbury.ac.nz>

Guido van Rossum wrote:

> I don't think there are enough dimensions in the numbering scheme to
> indicate all possible distinctions.

Dotted PEP numbers? PEP numbers with keyword arguments? :-)

--
Greg

From greg.ewing at canterbury.ac.nz Wed Mar 29 04:52:15 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 29 Mar 2006 14:52:15 +1200
Subject: [Python-3000] Parallel iteration syntax
In-Reply-To: <20060328055412.7w4i7h7ten4kc4ow@login.werra.lunarpages.com>
References: <20060328055412.7w4i7h7ten4kc4ow@login.werra.lunarpages.com>
Message-ID: <4429F65F.4040904@canterbury.ac.nz>

Michael Chermside wrote:

> There's one big problem I see with this. Parallel iteration
> is underspecified... there are several reasonable choices
> for what to do if the iterables are of differing length.

I have trouble seeing that as a *big* problem. I'd go for raising an exception (when in doubt...) Most of the time it's probably a bug if the sequences are of different lengths. If not, the user can catch the exception and take appropriate action. Especially if the exception includes info about which sequence was shorter.
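
The "raise an exception on unequal lengths" behaviour suggested above can be sketched in a few lines. This is only an illustration: the name strict_zip and the wording of the error message are assumptions made for this example, not part of any proposal in the thread.

```python
def strict_zip(*iterables):
    """Yield tuples like zip(), but raise ValueError if lengths differ.

    Hypothetical helper: the name and error text are illustrative only.
    """
    iterators = [iter(it) for it in iterables]
    if not iterators:
        return
    sentinel = object()  # marks an exhausted iterator
    while True:
        items = [next(it, sentinel) for it in iterators]
        exhausted = [i for i, item in enumerate(items) if item is sentinel]
        if not exhausted:
            yield tuple(items)
        elif len(exhausted) == len(iterators):
            return  # every input ended on the same step
        else:
            # Report which positional arguments were the shorter ones.
            raise ValueError("argument(s) %s ran out first" % exhausted)


pairs = list(strict_zip([1, 2], "ab"))
```

A caller that expects ragged input can catch ValueError and decide what to do; a caller that does not gets the probable bug reported instead of silently truncated data.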
> Today, we support these with different idioms: > > (1) > result = map(some_func, seq_x, seq_y) > > (2) > for x, y in zip(seq_x, seq_y): > some_func(x, y) It's really only an accident that these correspond to different handlings of unequal length sequences, though, especially considering that we're trying to make the need for map() go away with things like zip() and LCs. If it were a deliberate feature, we'd have different versions of zip() corresponding to the different behaviours. Greg From greg.ewing at canterbury.ac.nz Wed Mar 29 04:53:26 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 29 Mar 2006 14:53:26 +1200 Subject: [Python-3000] Parallel iteration syntax In-Reply-To: <080301c65277$4bd0f890$bf03030a@trilan> References: <4428C3DD.9090603@canterbury.ac.nz> <080301c65277$4bd0f890$bf03030a@trilan> Message-ID: <4429F6A6.4020506@canterbury.ac.nz> Giovanni Bajo wrote: > for i,(x,y) in enumerate(izip(iter1, iter2)): > ... > > must be translated to: > > for (i,x in enumerate(iter1), y in iter2): Maybe the functionality of enumerate() could be incorporated into the syntax as well. for (i in *, x in iter1, y in iter2): ... 
-- Greg From greg.ewing at canterbury.ac.nz Wed Mar 29 04:53:35 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 29 Mar 2006 14:53:35 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1143564241.3305.75.camel@localhost.localdomain> References: <4422FC96.2020409@zope.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <79990c6b0603280045h4e670ea0l59d5f5feb925814d@mail.gmail.com> <1143564241.3305.75.camel@localhost.localdomain> Message-ID: <4429F6AF.80906@canterbury.ac.nz> Adam DePrince wrote: > The first, and easiest to write about and > implement, was to make iters deletable to give the appearance of having > a view in what I thought was the only way we cared. This is massively wrong. There's much more to the views idea than just being able to delete things. > I'm proposing a number of interfaces, including proper C-level > interfaces. I don't see a need for any formal protocols or interfaces here. Each type will be unique in the kinds of views it provides and what functionality they have. It's a design pattern, not an interface. 
-- Greg From greg.ewing at canterbury.ac.nz Wed Mar 29 04:53:49 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 29 Mar 2006 14:53:49 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1143565628.3305.82.camel@localhost.localdomain> References: <4422FC96.2020409@zope.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> Message-ID: <4429F6BD.60704@canterbury.ac.nz> Adam DePrince wrote: > The following interface names are abbreviations for the following > permutations of the above. > > * Collection View( SetView + Multiview ) > * ListView: (SetView + MultiView + OrderedView) > * OrderedSetView (SetView + OrderedView ) > * MapView( SetView + MappingView ) > * OrderedMapView( SetView + MappingView + OrderedView ) > * MultiMapView( SetView + MultiView + MappingView ) > * OrderedMultiMapView( SetView + CollectionView + MappingView + OrderedView ) Nooooo.... This is massive over-design. Python is NOT Java! -- Greg From greg.ewing at canterbury.ac.nz Wed Mar 29 04:54:00 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 29 Mar 2006 14:54:00 +1200 Subject: [Python-3000] Parallel iteration syntax In-Reply-To: References: <4428C3DD.9090603@canterbury.ac.nz> Message-ID: <4429F6C8.8020401@canterbury.ac.nz> Guido van Rossum wrote: > for (x in A, y in B): > > could just as well be meant as a shortcut for > > for x in A: > for y in B: Well, the parens around the whole thing make it look like a single tuple to me. But that could just be because I already know what it's supposed to mean. Ultimately it would be something you just have to learn, like the meaning of zip(). 
-- Greg From brett at python.org Wed Mar 29 09:11:50 2006 From: brett at python.org (Brett Cannon) Date: Tue, 28 Mar 2006 23:11:50 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <4429F6BD.60704@canterbury.ac.nz> References: <4422FC96.2020409@zope.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> Message-ID: On 3/28/06, Greg Ewing wrote: > Adam DePrince wrote: > > > The following interface names are abbreviations for the following > > permutations of the above. > > > > * Collection View( SetView + Multiview ) > > * ListView: (SetView + MultiView + OrderedView) > > * OrderedSetView (SetView + OrderedView ) > > * MapView( SetView + MappingView ) > > * OrderedMapView( SetView + MappingView + OrderedView ) > > * MultiMapView( SetView + MultiView + MappingView ) > > * OrderedMultiMapView( SetView + CollectionView + MappingView + OrderedView ) > > Nooooo.... > > This is massive over-design. > > Python is NOT Java! > What I was taking away from this whole view discussion was basically just coming up with a simple, minimal, set/container interface that allows one to know about what a data structure contains. So I basically expected that it would implement __contains__, __len__, and if people wanted delete(obj) (optional or not). Basically a simple set interface where we could have a __container__/__view__/__set__ whatever method to call to get a view of the data structure. Basically a read-only (with a possible delete possibility) mapping interface. I am with Greg with wanting to minimize any official protocols we have. Iterators were desirable because they formalized how 'for' loops worked. 
The reason the view topic came up was people wanted to be able to know if an iterator had any value to return without having to call next(). So the proposed interface has a use, it doesn't directly tie into why we added iterators as much. Perhaps if people need to know if a specific iterator has a certain amount their iterator can also implement __len__, but it obviously would not be part of the iterator interface. Without a direct reason in terms of the language needing a standardization of an interface, perhaps we just don't need views. If people want their iterator to have a __len__ method, then fine, they can add it without breaking anything, just realize it isn't part of the iterator protocol and thus may limit what objects a function can accept, but that is there choice. -Brett From p.f.moore at gmail.com Wed Mar 29 11:29:48 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 29 Mar 2006 10:29:48 +0100 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <4422FC96.2020409@zope.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> Message-ID: <79990c6b0603290129m7bed22cft3e242cd009efe9e0@mail.gmail.com> On 3/29/06, Brett Cannon wrote: > Without a direct reason in terms of the language needing a > standardization of an interface, perhaps we just don't need views. If > people want their iterator to have a __len__ method, then fine, they > can add it without breaking anything, just realize it isn't part of > the iterator protocol and thus may limit what objects a function can > accept, but that is there choice. Good point. I think we need to start from strong use cases. With these, I agree that the view concept is a good implementation technique to consider. 
But let's not implement views just for the sake of having them - I'm pretty sure that was never Guido's intention. I still think my earlier analysis is important - for loops have no direct access to the iterator/view/whatever, and inline code has access to the original object. So the *only* relevant use cases are those where people are writing functions which take "extended iterator" arguments, where those functions cannot reasonably take either an additional argument which is the original object, or take the original object (an iterable) *instead* of an iterator. The key here is generators - with a generator, there is no "original object". But then again, generators are never going to be anything other than pure iterators, either. So you either allow them or don't, and if you don't, you can't say you take "any iterator". I'm still left wondering whether there are any really good use cases. The itertools module shows how much you can do without needing more than an iterator. Suggestion: Start with a call for use cases, and begin the PEP with nothing more than those. It's not normally what a PEP looks like, but it would probably help to capture things in this case. Paul. PS I also wonder whether adaptation would apply here. Pass the "original object", and the function adapts it to the view it needs. The advantage is that it supports the idea of passing concrete objects round, rather than partial views, while still allowing functions to be specified in terms of a limited interface. The disadvantage is that adaptation is still somewhat in flux, and it's possibly overkill for this issue. 
From stefan.rank at ofai.at Wed Mar 29 11:37:20 2006 From: stefan.rank at ofai.at (Stefan Rank) Date: Wed, 29 Mar 2006 11:37:20 +0200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <4422FC96.2020409@zope.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> Message-ID: <442A5550.5090804@ofai.at> on 29.03.2006 09:11 Brett Cannon said the following: > On 3/28/06, Greg Ewing wrote: >> Adam DePrince wrote: >> [snip ... massive over-design.] >> >> Python is NOT Java! >> > > What I was taking away from this whole view discussion was basically > just coming up with a simple, minimal, set/container interface that > allows one to know about what a data structure contains. So I > basically expected that it would implement __contains__, __len__, and > if people wanted delete(obj) (optional or not). Basically a simple > set interface where we could have a __container__/__view__/__set__ > whatever method to call to get a view of the data structure. > Basically a read-only (with a possible delete possibility) mapping > interface. > > I am with Greg with wanting to minimize any official protocols we > have. Iterators were desirable because they formalized how 'for' > loops worked. The reason the view topic came up was people wanted to > be able to know if an iterator had any value to return without having > to call next(). So the proposed interface has a use, it doesn't > directly tie into why we added iterators as much. Perhaps if people > need to know if a specific iterator has a certain amount their > iterator can also implement __len__, but it obviously would not be > part of the iterator interface. 
> > Without a direct reason in terms of the language needing a > standardization of an interface, perhaps we just don't need views. If > people want their iterator to have a __len__ method, then fine, they > can add it without breaking anything, just realize it isn't part of > the iterator protocol and thus may limit what objects a function can > accept, but that is there choice. I think that two ideas get mixed up here: - deletion of collection members via iterators this has nearly nothing to do with 'views' of collections. deletion via the iterator in Java is as syntactically awkward as it would be in Python, it does not allow the (new) ``for (item : collection) {}`` syntax. Maybe it is not worth it for Python. I think it would be enough to think of guarantees for iterating in the face of concurrent modifications: dont choke if something is deleted that you already iterated over and dont care if something is deleted that you would have iterated over later (or guarantee to fail fast?). so deletion would still be possible only if you have access to the *iterable*. - views versus copies of collections views in Java just use the same interface that the original collection has, there are no iterators involved, and no new interfaces/types. examples (from the Java **1.5.0** docs at http://java.sun.com/j2se/1.5.0/docs/api/java/util/package-summary.html): - List.subList returns a List SortedSet.{headSet,subSet,tailSet} return SortedSet-s (and that's it!! The basic Collection does not know anything about relations between elements, it is a multiset, so it could only return the whole as a view... and that would be itself.) 
- Java's Map is not a Collection, but it returns views that use collection interfaces for: Map.entrySet returns a Set Map.keySet returns a Set Map.values returns a Collection (a multiset) - SortedMap.{headMap,subMap,tailMap} return SortedMap-s the {head,sub,tail}xxx things correspond to (a weak form of) slicing, the map methods correspond to the dict methods. the only difference is, they do not return a copy but internally stay connected to their ancestor. So I don't think new types/interfaces are necessary at all. I don't see why views should restrict the interface of the original iterable. Maybe an optional addition to all iterables to query if they have their own data or use data from an ancestor. Maybe a property: iterable.ancestor / __ancestor__ (defaults to None). A big question is: Should slicing also return views? and why not? Another one (and here's the only relation to the deletion thing): which guarantees in the face of modification? NumPy might be a good inspiration. (as mentioned earlier in the thread) cheers, stefan From ncoghlan at gmail.com Wed Mar 29 13:15:35 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 29 Mar 2006 21:15:35 +1000 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <79990c6b0603290129m7bed22cft3e242cd009efe9e0@mail.gmail.com> References: <4422FC96.2020409@zope.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <79990c6b0603290129m7bed22cft3e242cd009efe9e0@mail.gmail.com> Message-ID: <442A6C57.1030309@gmail.com> Paul Moore wrote: > On 3/29/06, Brett Cannon wrote: >> Without a direct reason in terms of the language needing a >> standardization of an interface, perhaps we just don't need views. 
If >> people want their iterator to have a __len__ method, then fine, they >> can add it without breaking anything, just realize it isn't part of >> the iterator protocol and thus may limit what objects a function can >> accept, but that is there choice. > > Good point. I think we need to start from strong use cases. With > these, I agree that the view concept is a good implementation > technique to consider. But let's not implement views just for the sake > of having them - I'm pretty sure that was never Guido's intention. There are three big use cases: dict.keys dict.values dict.items Currently these all return lists, which may be expensive in terms of copying. They all have iter* variants which while memory efficient, are far less convenient to work with. For Py3k, the intent is to have only one version which produces a view with the memory efficiency of an iterator, but the convenience of a list. To give these views the benefits of having a real list, the following is all that's really needed: 1. implement __len__ (allows bool() and len() to work) - all delegate to dict.__len__ 2. implement __contains__ (allows containment tests to work) - delegate to dict.__contains__ for dict.keys() - use (or fallback to) linear search for dict.values() - check "dict[item[0]] == item[1]" for dict.items() 3. implement __iter__ (allows iteration to work) - make iter(dict.keys()) equivalent to current dict.iterkeys() - make iter(dict.values()) equivalent to current dict.itervalues() - make iter(dict.items()) equivalent to current dict.iteritems() For an immutable view, that's all you need. IOW, take the iterable protocol (an __iter__ that returns a new iterator when invoked) and add __len__ and __contains__ to get a "container" protocol. Given that containment falls back on __iter__ anyway, __len__ is the only essential addition to turn an iterable into a container. 
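
The three-step recipe above (__len__, __contains__, __iter__) can be sketched in pure Python. The class names KeysSketch, ValuesSketch and ItemsSketch are hypothetical and exist only for this illustration; they are not a proposed API, and a real implementation would live in C inside dict.

```python
class KeysSketch:
    """Hypothetical read-only view over a mapping's keys."""

    def __init__(self, mapping):
        self._mapping = mapping  # no copy: the view stays connected

    def __len__(self):
        return len(self._mapping)

    def __contains__(self, key):
        return key in self._mapping  # delegate to the dict's hash lookup

    def __iter__(self):
        return iter(self._mapping)  # a fresh iterator on every call


class ValuesSketch(KeysSketch):
    """Hypothetical view over values; containment is a linear search."""

    def __contains__(self, value):
        return any(self._mapping[k] == value for k in self._mapping)

    def __iter__(self):
        return (self._mapping[k] for k in self._mapping)


class ItemsSketch(KeysSketch):
    """Hypothetical view over (key, value) pairs."""

    def __contains__(self, item):
        key, value = item
        try:
            return self._mapping[key] == value
        except KeyError:
            return False

    def __iter__(self):
        return ((k, self._mapping[k]) for k in self._mapping)


d = {"a": 1, "b": 2}
keys, values, items = KeysSketch(d), ValuesSketch(d), ItemsSketch(d)
d["c"] = 3  # the views see the change because nothing was copied
```

Because __iter__ hands out a new iterator each time, these objects can be looped over repeatedly, which is the property that separates a container from a bare iterator.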

Note that adding __len__ to an *iterator* does NOT give you something that would satisfy such a container protocol - invoking __iter__ again does not give you a fresh iterator, so you can't easily iterate repeatedly.

With reiterability as a defining characteristic, other niceties become possible (potentially available as a mixin):

1. a generic container __str__ (not __repr__!) implementation:

    def __str__(self):
        # keep default __repr__ since eval(repr(x)) won't round trip
        name = type(self).__name__
        guts = ", ".join(repr(x) for x in self)
        return "%s([%s])" % (name, guts)

2. generic container value based equality testing:

    def __eq__(self, other):
        # izip comes from the itertools module
        if len(self) != len(other):
            return False
        for this, that in izip(self, other):
            if this != that:
                return False
        return True

Further refinement of such a container protocol to the minimal requirements for a sequence protocol is already defined by such things as the requirements of the reversed() builtin:

    for i, x in enumerate(seq):
        assert seq[i] == x

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
            http://www.boredomandlaziness.org

From adam.deprince at gmail.com Wed Mar 29 17:37:37 2006
From: adam.deprince at gmail.com (Adam DePrince)
Date: Wed, 29 Mar 2006 10:37:37 -0500
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To: <4429F6BD.60704@canterbury.ac.nz>
References: <4422FC96.2020409@zope.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz>
Message-ID: <1143646657.3074.52.camel@localhost.localdomain>

On Wed, 2006-03-29 at 14:53 +1200, Greg Ewing
wrote:
> Adam DePrince wrote:
>
> > The following interface names are abbreviations for the following
> > permutations of the above.
> >
> > * Collection View( SetView + Multiview )
> > * ListView: (SetView + MultiView + OrderedView)
> > * OrderedSetView (SetView + OrderedView )
> > * MapView( SetView + MappingView )
> > * OrderedMapView( SetView + MappingView + OrderedView )
> > * MultiMapView( SetView + MultiView + MappingView )
> > * OrderedMultiMapView( SetView + CollectionView + MappingView + OrderedView )
>
> Nooooo....
>
> This is massive over-design.
>
> Python is NOT Java!

I couldn't agree more. My goal is *not* to introduce a weighty set of abstractions into the Python interpreter. I should have worded this better.

What I'm proposing is multiple inheritance, mix-in classes to a basic SetView so that we can accommodate all of the permutations of views we are likely to encounter. What I am *not* proposing is that we copy the Java specification into Python. What I am proposing is that we decide exactly which methods will be required for each flavor of view.

This is almost more about mental bookkeeping than it is about the Python implementation. From an implementation standpoint, how heavy will it be? Well, maybe:

SetView implements:
    .__contains__
    .add
    .discard
    .__len__

A MultiView mixin might add
    .howmany

An OrderedView mixin might add
    .__getitem__
    .__setitem__

A Mapping mixin might add
    .get
    .set

The idea here is really to document "what it means to" offer a particular view. The only thing heavy about it is the multiple inheritance. Let's not confuse the heavy abstraction surrounding the motivation with a heavy end product.

IMHO P3K is a good time to sit back and ask ourselves what we mean as opposed to what we do. Part of what I'm calling for here in this PEP is to see .values for what it really is, a collection.
Nothing here stops or makes it difficult to continue to say:

    for i in dict.items():
        process( i )

But because the identity of .items as a collection remains at the forefront of our mind, we are less likely to hobble ourselves as we did before with the iter variants of keys/values/items.

I'm all for use cases, but keep in mind that Python use cases are motivated by the capabilities of older versions of Python, so they will naturally be biased towards older ways of working.

By allowing dict.keys() to announce "I'm a SetView" we open up all sorts of possibilities, such as direct interaction with our set type.

>>> a = set( "abcdef" )
>>> a
set(['a', 'c', 'b', 'e', 'd', 'f'])
>>> d = dict( a=1 )
>>> a-d.keys()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unsupported operand type(s) for -: 'set' and 'list'

Consider that having d.keys be able to assert "I'm a set view" would allow the above to work with any object that understood the view. I don't expect the set type to have special case code for lists and iters, but it is reasonable that a set object would be able to accommodate a set view, and the above would work.

You don't see it as a common use case right now because the proper way of saying it today isn't particularly efficient; there is no way to say it without either making a full copy of the keys

>>> a-set( d.iterkeys() )
set(['c', 'b', 'e', 'd', 'f'])
>>>

or building a loop. Ergo, the for-loop becomes the common use case. Besides, common use cases are too often about what benchmarks the fastest anyway :-)

By allowing for a closer conceptual match, and more abstract types, we increase the number of opportunities for duck typing to "just work." If we had a SetView interface, everybody that implemented that interface would "just work" in the above example.
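
For comparison, the failing session above can be replayed against Python 3 as it was eventually released, where dict.keys() returns a set-like view rather than a list. The snippet only exercises the built-in dict and set types; the variable names remaining and common are chosen for this illustration.

```python
a = set("abcdef")
d = dict(a=1)

# Set difference against the keys view works directly, with no explicit
# set(...) conversion of the keys written by the caller.
remaining = a - d.keys()

# The view also supports set operators with the view on the left.
common = d.keys() & a

print(sorted(remaining))
print(sorted(common))
```

Both results are ordinary set objects, so they can be used wherever a set is expected.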

Cheers - Adam DePrince

From adam.deprince at gmail.com Wed Mar 29 17:45:33 2006
From: adam.deprince at gmail.com (Adam DePrince)
Date: Wed, 29 Mar 2006 10:45:33 -0500
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To:
References: <4422FC96.2020409@zope.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz>
Message-ID: <1143647134.3074.61.camel@localhost.localdomain>

> set interface where we could have a __container__/__view__/__set__

Why would I call a method to get a view on an object when the object can just as well implement the view? The *only* time we want to call a method to get a view is when there is not one single, complete definition of what the object's canonical perspective should be.

List has a single obvious view. If you really want to see a list as a setview, just pretend it doesn't implement OrderedView and CollectionView. Down-casting by neglect works fine.

Dict doesn't; there are four possible views: the mapping view provided directly by the dict, and keys/values/items.
- Adam DePrince From adam.deprince at gmail.com Wed Mar 29 18:10:53 2006 From: adam.deprince at gmail.com (Adam DePrince) Date: Wed, 29 Mar 2006 11:10:53 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <442A6C57.1030309@gmail.com> References: <4422FC96.2020409@zope.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <79990c6b0603290129m7bed22cft3e242cd009efe9e0@mail.gmail.com> <442A6C57.1030309@gmail.com> Message-ID: <1143648653.3074.86.camel@localhost.localdomain> On Wed, 2006-03-29 at 21:15 +1000, Nick Coghlan wrote: > Paul Moore wrote: > > On 3/29/06, Brett Cannon wrote: > >> Without a direct reason in terms of the language needing a > >> standardization of an interface, perhaps we just don't need views. If > >> people want their iterator to have a __len__ method, then fine, they > >> can add it without breaking anything, just realize it isn't part of > >> the iterator protocol and thus may limit what objects a function can > >> accept, but that is there choice. > > > > Good point. I think we need to start from strong use cases. With > > these, I agree that the view concept is a good implementation > > technique to consider. But let's not implement views just for the sake > > of having them - I'm pretty sure that was never Guido's intention. > > There are three big use cases: > > dict.keys > dict.values > dict.items There is more than that. Everybody who accesses a database has to jump and down to extract their fields. Wouldn't it be nice if you could say to your result set from a database: >>> rs.execute( "select upc, description, price from my_table" ) >>> data = rs.fetch().fieldby( 'price','upc') >>> print type( data ) Or a tree implementation of a dictionary. 
>>> type( tree_dict.keys() ) The idea that is there is so much more we can do if we had some mechanism of identifying at a higher level the semantics of the data structure. While dict is pretty much it for core python, there are a lot of data stores in the wild, and the View's would give us the ability for better interaction and abstraction than passing around lists or their performance modified twin iter. Consider for instance if you had to dictionaries, both of which are so large you don't want to work on copies of their keys. You want to know which items are in only the first ... dicta.keys() - dictb.keys() Because each supports the SetView interface, we need only provide a single generic SetView.difference operator and move on. This prevents the ungainly conversion to sets first which, while easy to write, is slow, especially considering how well dict's implement sets in the first place. Cheers - Adam DePrince > To give these views the benefits of having a real list, the following is all > that's really needed: > > 1. implement __len__ (allows bool() and len() to work) > - all delegate to dict.__len__ > > 2. implement __contains__ (allows containment tests to work) > - delegate to dict.__contains__ for dict.keys() > - use (or fallback to) linear search for dict.values() > - check "dict[item[0]] == item[1]" for dict.items() > > 3. implement __iter__ (allows iteration to work) > - make iter(dict.keys()) equivalent to current dict.iterkeys() > - make iter(dict.values()) equivalent to current dict.itervalues() > - make iter(dict.items()) equivalent to current dict.iteritems() > > For an immutable view, that's all you need. IOW, take the iterable protocol Mutability isn't really a problem for Views, unlike iters, views don't store state, they are just wrappers. Now for a view created iter, yeah, the normal iter mutation problems still exist. 
Views do partly solve the iter mutability problem by allowing many operations of an iteration that would otherwise take place within. Consider this: unwanted_words = set( ... index = { .... for k in index.keys(): if k in unwanted_words: del( index[ k ] ) But with a view, we could say: index.keys() -= unwanted_words Basically, my understanding of the the idea behind a view is eliminate the need for a mutation compatible iterator by reducing the pressure and demand for one to a level acceptable for something ignored. > (an __iter__ that returns a new iterator when invoked) and add __len__ and > __contains__ to get a "container" protocol. Given that containment falls back > on __iter__ anyway, __len__ is the only essential addition to turn an iterable > into a container. > > Note that adding __len__ to an *iterator* does NOT give you something that > would satisfy such a container protocol - invoking __iter__ again does not > give you a fresh iterator, so you can't easily iterate repeatedly. > > With reiterability as a defining characteristic, other niceties become > possible (potentially available as a mixin): > > 1. a generic container __str__ (not __repr__!) implementation: > > def __str__(self): > # keep default __repr__ since eval(repr(x)) won't round trip > name = self.__name__ > guts = ", ".join(repr(x) for x in self) > return "%s([%s])" % guts > > 2. 
generic container value based equality testing: > def __eq__(self, other): > if len(self) != len(other): > return False > for this, that in izip(self, other): > if this != that: > return False > return True > > Further refinement of such a container protocol to the minimal requirements > for a sequence protocol is already defined by such things as the requirements > of the reversed() builtin: > > for i, x in enumerate(seq): > assert seq[i] == x > > Cheers, From brett at python.org Wed Mar 29 21:08:27 2006 From: brett at python.org (Brett Cannon) Date: Wed, 29 Mar 2006 11:08:27 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1143647134.3074.61.camel@localhost.localdomain> References: <4422FC96.2020409@zope.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <1143647134.3074.61.camel@localhost.localdomain> Message-ID: On 3/29/06, Adam DePrince wrote: > > > set interface where we could have a __container__/__view__/__set__ > > Why would I call a method to get a view on an object when the object can > just as well implement the view? The *only* time we want to call a > method to get a view is when there is not one, single, completing > definition of the object's canonical perspective should be. > That's my point. What are we gaining by trying to augment iterators or come up with some specified way to know if something contains something else when we can get it off of the object itself. Although, as Nick pointed out, dicts have multiple views of their data. But if you just want to know if a key or value exists, then you can either use a dict's __contains__ for keys or create a set to use for values. 
That's what a view will end up having to do anyway underneath the covers, so I don't know if we are going to get much benefit from going through this beyond just saying dict.value_view() returns a set of the values in a dict. Don't know if we really need to go through all of this formality. > List has a single obvious view. If you really want to see a list as a > setview, just pretend it doesn't implement OrderedView and > CollectionView. Down-casting by neglect works fine. > > Dict doesn't, there are 4 possiable views. The mapping view provided > directly by the dict, and keys/values/items. Four? In terms of "atomic" data views, there are keys and values. Items could be counted, but that is just a way to pair the different types of data in a dict together so I don't know if I would count it as a view, let alone whether it would be useful outside of an iterator. But counting the dict itself is just counting the key view twice since you can get all of that data directly from the dict without needing a view as you pointed out above. -Brett From p.f.moore at gmail.com Thu Mar 30 00:28:14 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 29 Mar 2006 23:28:14 +0100 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1143648653.3074.86.camel@localhost.localdomain> References: <4422FC96.2020409@zope.com> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <79990c6b0603290129m7bed22cft3e242cd009efe9e0@mail.gmail.com> <442A6C57.1030309@gmail.com> <1143648653.3074.86.camel@localhost.localdomain> Message-ID: <79990c6b0603291428l5d050419hd5ede4c821da4b37@mail.gmail.com> On 3/29/06, Adam DePrince wrote: > There is more than that. Everybody who accesses a database has to jump > and down to extract their fields. 
Wouldn't it be nice if you could say
> to your result set from a database:
>
> >>> rs.execute( "select upc, description, price from my_table" )
> >>> data = rs.fetch().fieldby( 'price','upc')
> >>> print type( data )
>

Um. I use databases a lot and I wouldn't find this useful... Reasons:

1. I nearly always do "for row in cursor" to get rows one at a time via
the cursor-as-iterator interface. So I have a tuple in row, and don't
access the cursor directly for much else (and again, considering the
cursor as the "original object", I have it present so why bother with
views?)

2. Why fetch columns I'm not going to use? And even if I did, row
indexing is fine. If I want named access, I use a more complex idiom:

    # the column name is the first item of each description entry
    cols = [d[0] for d in cursor.description]
    for row in cursor:
        row_d = dict(zip(cols, row))
        # now use row_d['column_name']

which is fine for general DB API interfaces. And many DB interfaces have
named access to rowsets as an extension, if I really care.

3. To change this would require changes to the DB API, which is outside
the scope of the Python 3000 project (apart from anything else, it could
be done now, or independently of any particular Python release).

> Or a tree implementation of a dictionary.
>
> >>> type( tree_dict.keys() )
>

This doesn't explain why this is better than just an iterator (which
happens to return keys in sorted sequence).

> The idea that is there is so much more we can do if we had some
> mechanism of identifying at a higher level the semantics of the data
> structure.

No problem with this. But why do we NEED the extra? Nobody's disputing
that we can do more, we're just looking for a reason why more is better.
What can't we do with what we have? (Again note - I like the idea of
views, I just don't want them to be a solution looking for a problem. If
there's a problem, views can be a solution to it, if there isn't a
problem, we don't need views).
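
For readers who want to try the named-access idiom from point 2, here is a minimal sketch written for Python 3 against the standard sqlite3 module. The in-memory table and its single row are made up for illustration; the only thing taken from the DB API itself is that each entry of cursor.description starts with the column name.

```python
import sqlite3

# Hypothetical in-memory table standing in for "my_table" from the thread.
conn = sqlite3.connect(":memory:")
conn.execute("create table my_table (upc text, description text, price real)")
conn.execute("insert into my_table values ('0001', 'widget', 9.5)")

cursor = conn.execute("select upc, description, price from my_table")

# DB API: every description entry is a 7-item sequence, name first.
cols = [d[0] for d in cursor.description]

# One dict per row, built lazily from the cursor-as-iterator interface.
rows = [dict(zip(cols, row)) for row in cursor]
conn.close()
```

Each row costs one extra dict, which is the price of named access without any change to the driver.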
> Consider for instance if you had to dictionaries, both of which are so > large you don't want to work on copies of their keys. You want to know > which items are in only the first ... > > dicta.keys() - dictb.keys() [...] > Views do partly solve the iter mutability problem by allowing many > operations of an iteration that would otherwise take place within. > Consider this: > > unwanted_words = set( ... > > index = { .... > > for k in index.keys(): > if k in unwanted_words: > del( index[ k ] ) > > But with a view, we could say: > > index.keys() -= unwanted_words Those are reasonable use cases, but artificial. Is there real life code that would benefit? The proposed solution is a lot of machinery - it should have correspondingly substantial benefits. Paul. From greg.ewing at canterbury.ac.nz Thu Mar 30 02:04:10 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 30 Mar 2006 12:04:10 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <4422FC96.2020409@zope.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> Message-ID: <442B207A.60305@canterbury.ac.nz> Brett Cannon wrote: > Basically a simple > set interface where we could have a __container__/__view__/__set__ > whatever method to call to get a view of the data structure. > Basically a read-only (with a possible delete possibility) mapping > interface. If there's an obvious default meaning for the basic access methods like __contains__ and __len__, there's no need for a view to provide these -- the original object can (and should) just implement them itself. Views only come into play when there is more than one possible view of an object (e.g. 
dict has keys, items, values). Then the details are completely type-specific. There might be room for a general immutable view object that doesn't allow any changes, but that could be provided as a generic wrapper that doesn't need to know anything about the base object or vice versa. > Without a direct reason in terms of the language needing a > standardization of an interface, perhaps we just don't need > views. On the contrary, views are a very useful idea, *as a design pattern*. What we *don't* need in Python, as far as I can see, is any formalised protocols or interfaces for views, because there's nothing that can be said about them in general. Thinking that "having views" means "having a formally defined interface for views" is a mindset that comes from B&D languages like Java. It doesn't apply to Python at all. -- Greg From greg.ewing at canterbury.ac.nz Thu Mar 30 02:05:04 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 30 Mar 2006 12:05:04 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1143646657.3074.52.camel@localhost.localdomain> References: <4422FC96.2020409@zope.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <1143646657.3074.52.camel@localhost.localdomain> Message-ID: <442B20B0.8060207@canterbury.ac.nz> Adam DePrince wrote: > SetView implements: > .__contains__ > .add > .discard > .__len__ But what would there be to inherit from the mixin? Each view class will have entirely its own implementation of these, depending on the details of the base object. 
Inheritance in Python is *entirely* about implementation inheritance, not interface inheritance. > By allowing dict.keys() to announce "I'm a SetView" If you mean to make it possible for code to do if isinstance(something, SetView): ... that's another thing that Python tries to stay away from as much as possible. If what you're proposing is purely a conceptual classification, I still think it's far too elaborate. Anyone reading such a specification is going to have their eyes glaze over within a few milliseconds. -- Greg From greg.ewing at canterbury.ac.nz Thu Mar 30 02:05:24 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 30 Mar 2006 12:05:24 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <442A5550.5090804@ofai.at> References: <4422FC96.2020409@zope.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <442A5550.5090804@ofai.at> Message-ID: <442B20C4.70309@canterbury.ac.nz> Stefan Rank wrote: > A big question is: Should slicing also return views? and why not? That's been considered before, in relation to strings. The stumbling block is the problem of a view of a small part of the object keeping the whole thing alive and using up memory. While having a separate way of getting slice-views could be useful, I think it would be too big a change in semantics to make it the default behaviour of slicing notation. 
--
Greg

From greg.ewing at canterbury.ac.nz Thu Mar 30 02:05:33 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 30 Mar 2006 12:05:33 +1200
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To: <79990c6b0603290129m7bed22cft3e242cd009efe9e0@mail.gmail.com>
References: <4422FC96.2020409@zope.com>
	<1143306134.3186.1.camel@localhost.localdomain>
	<44258376.1080709@gmx.net>
	<1143432484.14391.67.camel@localhost.localdomain>
	<79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com>
	<1143565628.3305.82.camel@localhost.localdomain>
	<4429F6BD.60704@canterbury.ac.nz>
	<79990c6b0603290129m7bed22cft3e242cd009efe9e0@mail.gmail.com>
Message-ID: <442B20CD.6000908@canterbury.ac.nz>

Paul Moore wrote:

> I still think my earlier analysis is important - for loops have no
> direct access to the iterator/view/whatever, and inline code has
> access to the original object. So the *only* relevant use cases are
> those where people are writing functions which take "extended
> iterator" arguments, where those functions cannot reasonably take
> either an additional argument which is the original object, or take
> the original object (an iterable) *instead* of an iterator.

The problem I thought the deletable-iterator proposal was addressing
was that it's often awkward to iterate over something and delete
selected items from it. Even if you have the original object, you have
to either make a copy of it to iterate over while deleting from the
original, or keep a list of items to be deleted and then delete them
afterwards, both of which are memory-inefficient.

If you knew you had a sequence that produces deletable iterators, you
could do

    diter = iter(myseq)
    for item in diter:
        if is_nasty(item):
            diter.delete()

If it were a general principle that mutable containers produce
deletable iterators, then code such as the above would work on most
mutable containers.

So I think the idea does have some possible merit, as an orthogonal
issue to views.
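
A rough sketch of what such a deletable iterator could look like for a plain list, written for Python 3. The class name DeletableListIterator and its internals are hypothetical; nothing like it was specified in the thread, and a built-in version would presumably be returned by iter() rather than constructed by hand.

```python
class DeletableListIterator:
    """Hypothetical list iterator that can delete the item it just yielded."""

    def __init__(self, seq):
        self._seq = seq
        self._next = 0            # index of the next item to yield
        self._can_delete = False  # True only right after an item was yielded

    def __iter__(self):
        return self

    def __next__(self):
        if self._next >= len(self._seq):
            self._can_delete = False
            raise StopIteration
        item = self._seq[self._next]
        self._next += 1
        self._can_delete = True
        return item

    def delete(self):
        """Remove the most recently yielded item from the underlying list."""
        if not self._can_delete:
            raise ValueError("no current item to delete")
        # Step back so the item that slides into this slot is not skipped.
        self._next -= 1
        del self._seq[self._next]
        self._can_delete = False


myseq = [1, -2, 3, -4, -5, 6]
diter = DeletableListIterator(myseq)
for item in diter:
    if item < 0:          # stand-in for is_nasty(item)
        diter.delete()
```

This avoids copying the list or keeping a separate to-delete list, although each deletion from a list still shifts the remaining elements.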
-- Greg From brett at python.org Thu Mar 30 02:31:04 2006 From: brett at python.org (Brett Cannon) Date: Wed, 29 Mar 2006 16:31:04 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <442B207A.60305@canterbury.ac.nz> References: <4422FC96.2020409@zope.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <442B207A.60305@canterbury.ac.nz> Message-ID: On 3/29/06, Greg Ewing wrote: > Brett Cannon wrote: > > > Basically a simple > > set interface where we could have a __container__/__view__/__set__ > > whatever method to call to get a view of the data structure. > > Basically a read-only (with a possible delete possibility) mapping > > interface. > > If there's an obvious default meaning for the basic access > methods like __contains__ and __len__, there's no need for > a view to provide these -- the original object can (and > should) just implement them itself. > > Views only come into play when there is more than one > possible view of an object (e.g. dict has keys, items, > values). Then the details are completely type-specific. > > There might be room for a general immutable view object > that doesn't allow any changes, but that could be > provided as a generic wrapper that doesn't need to know > anything about the base object or vice versa. > Maybe. If people really want to have a frozen set that contains the keys or values of a dict those could be added to the dict type. > > Without a direct reason in terms of the language needing a > > standardization of an interface, perhaps we just don't need > > views. > > On the contrary, views are a very useful idea, *as a > design pattern*. 
What we *don't* need in Python, as far > as I can see, is any formalised protocols or interfaces > for views, because there's nothing that can be said about > them in general. > > Thinking that "having views" means "having a formally > defined interface for views" is a mindset that comes > from B&D languages like Java. It doesn't apply to > Python at all. > Right. I am just talking about views not being needed in terms of being formalized for the language. Just like how the mapping protocol is there, but not formalized. I am not arguing about the usefulness of views in terms of a concept. -Brett From greg.ewing at canterbury.ac.nz Thu Mar 30 03:16:44 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 30 Mar 2006 13:16:44 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1143648653.3074.86.camel@localhost.localdomain> References: <4422FC96.2020409@zope.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <79990c6b0603290129m7bed22cft3e242cd009efe9e0@mail.gmail.com> <442A6C57.1030309@gmail.com> <1143648653.3074.86.camel@localhost.localdomain> Message-ID: <442B317C.4000009@canterbury.ac.nz> Adam DePrince wrote: > dicta.keys() - dictb.keys() > > Because each supports the SetView interface, we need only provide a > single generic SetView.difference operator and move on. I can see some use for inheritance there. But keep in mind that there is no multiple inheritance at the C level, so thinking terms of "mixins" doesn't really apply. There would need to be a single base class, or just a bunch of C functions for implementations to plug into the relevant slots. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! 
| Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Mar 30 03:19:00 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 30 Mar 2006 13:19:00 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <4422FC96.2020409@zope.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <1143647134.3074.61.camel@localhost.localdomain> Message-ID: <442B3204.9020803@canterbury.ac.nz> Brett Cannon wrote: > Four? In terms of "atomic" data views, there are keys and values. > Items could be counted, but that is just a way to pair the different > types of data in a dict together so I don't know if I would count it > as a view, let alone whether it would be useful outside of an > iterator. Having a reiterable view of (key, value) pairs would be useful even if it provided nothing else. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Mar 30 04:03:05 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 30 Mar 2006 14:03:05 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <79990c6b0603291428l5d050419hd5ede4c821da4b37@mail.gmail.com> References: <4422FC96.2020409@zope.com> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <79990c6b0603290129m7bed22cft3e242cd009efe9e0@mail.gmail.com> <442A6C57.1030309@gmail.com> <1143648653.3074.86.camel@localhost.localdomain> <79990c6b0603291428l5d050419hd5ede4c821da4b37@mail.gmail.com> Message-ID: <442B3C59.4020504@canterbury.ac.nz> On 3/29/06, Adam DePrince wrote: > >>>rs.execute( "select upc, description, price from my_table" ) > >>>data = rs.fetch().fieldby( 'price','upc') > >>>print type( data ) > > Seems to me it would be a better idea for the DB module to return tuple-with-attributes objects for the rows in the first place, rather than plain tuples. When I get around to reworking my custom Firebird module, I'm going to make it do that. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From aleaxit at gmail.com Thu Mar 30 04:23:53 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Wed, 29 Mar 2006 18:23:53 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <442B3C59.4020504@canterbury.ac.nz> References: <4422FC96.2020409@zope.com> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <79990c6b0603290129m7bed22cft3e242cd009efe9e0@mail.gmail.com> <442A6C57.1030309@gmail.com> <1143648653.3074.86.camel@localhost.localdomain> <79990c6b0603291428l5d050419hd5ede4c821da4b37@mail.gmail.com> <442B3C59.4020504@canterbury.ac.nz> Message-ID: <178A09D2-A36F-4C51-B34C-3139C1ABD2F4@gmail.com> On Mar 29, 2006, at 6:03 PM, Greg Ewing wrote: > On 3/29/06, Adam DePrince wrote: > >>>>> rs.execute( "select upc, description, price from my_table" ) >>>>> data = rs.fetch().fieldby( 'price','upc') >>>>> print type( data ) >> >> > > Seems to me it would be a better idea for the DB > module to return tuple-with-attributes objects for > the rows in the first place, rather than plain > tuples. > > When I get around to reworking my custom Firebird > module, I'm going to make it do that. It does indeed seem a great idea for a lot of tuples to sprout attributes in this way. Do we have any plans to make the "attributeing" of tuples easier for C extension writers? What about Python programmers? Looks like a simple metaclass should suffice, and we might have somewhere in the stdlib a class to inherit from in order to get the appropriate metaclass -- module 'types' seems the natural place for it. 
Alex From ianb at colorstudy.com Thu Mar 30 05:28:50 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Wed, 29 Mar 2006 21:28:50 -0600 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <442B3C59.4020504@canterbury.ac.nz> References: <4422FC96.2020409@zope.com> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <79990c6b0603290129m7bed22cft3e242cd009efe9e0@mail.gmail.com> <442A6C57.1030309@gmail.com> <1143648653.3074.86.camel@localhost.localdomain> <79990c6b0603291428l5d050419hd5ede4c821da4b37@mail.gmail.com> <442B3C59.4020504@canterbury.ac.nz> Message-ID: <442B5072.5030005@colorstudy.com> Greg Ewing wrote: > On 3/29/06, Adam DePrince wrote: > >>>>> rs.execute( "select upc, description, price from my_table" ) >>>>> data = rs.fetch().fieldby( 'price','upc') >>>>> print type( data ) >> > > Seems to me it would be a better idea for the DB > module to return tuple-with-attributes objects for > the rows in the first place, rather than plain > tuples. > > When I get around to reworking my custom Firebird > module, I'm going to make it do that. I find it fairly useless when database-specific drivers fancy up their results, because I can't rely on them, and they all work differently, and I have to stick to the lowest common denominator. Which is off-topic here, except to say that a view on the tuple would be useful in a way that returning a fancy tuple would not, because it could wrap any DB-API-compliant result set. Which might still be off-topic, since the implementation of that particular view would be mostly unrelated to any other view we've talked about here. Except perhaps to note that patterns for building light and fast views would also be nice. 
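
As an illustration of a light view that wraps rows from any DB-API result set, here is a small Python 3 sketch. RowView and row_views are hypothetical names, and the sketch assumes the caller passes the column names in the same order as the columns of each row.

```python
class RowView:
    """Hypothetical read-only view exposing a row tuple's fields as attributes."""

    __slots__ = ("_index", "_row")

    def __init__(self, index, row):
        self._index = index   # shared mapping: column name -> position
        self._row = row       # the underlying tuple, not copied

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._row[self._index[name]]
        except KeyError:
            raise AttributeError(name)


def row_views(rows, *names):
    """Wrap each tuple from an iterable (for example a cursor) in a RowView."""
    index = {name: i for i, name in enumerate(names)}
    for row in rows:
        yield RowView(index, row)


# Plain tuples stand in for what a DB-API cursor would yield.
fake_cursor = iter([("0001", "widget", 9.5), ("0002", "gadget", 12.0)])
pairs = [(r.upc, r.price)
         for r in row_views(fake_cursor, "upc", "description", "price")]
```

The name-to-position mapping is built once and shared by every view, so the per-row overhead is one small object holding two references.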
--
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

From greg.ewing at canterbury.ac.nz Thu Mar 30 05:36:55 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 30 Mar 2006 15:36:55 +1200
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To: <442B5072.5030005@colorstudy.com>
References: <4422FC96.2020409@zope.com>
	<1143432484.14391.67.camel@localhost.localdomain>
	<79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com>
	<1143565628.3305.82.camel@localhost.localdomain>
	<4429F6BD.60704@canterbury.ac.nz>
	<79990c6b0603290129m7bed22cft3e242cd009efe9e0@mail.gmail.com>
	<442A6C57.1030309@gmail.com>
	<1143648653.3074.86.camel@localhost.localdomain>
	<79990c6b0603291428l5d050419hd5ede4c821da4b37@mail.gmail.com>
	<442B3C59.4020504@canterbury.ac.nz>
	<442B5072.5030005@colorstudy.com>
Message-ID: <442B5257.5010500@canterbury.ac.nz>

Ian Bicking wrote:

> Which is off-topic here, except to say that a view on the tuple would be
> useful in a way that returning a fancy tuple would not, because it could
> wrap any DB-API-compliant result set.

A wrapper like that could be built quite generically. Also, better to
wrap the whole sequence, I think, rather than each tuple individually:

    results = tableview(mycursor, 'price', 'upc')
    for row in results:
        print row.price, row.upc

Something for the itertools module, perhaps? (It would need to be an
iterator wrapper, not a sequence wrapper, to work on DB cursors etc.)

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | Carpe post meridiam!                 |
Christchurch, New Zealand          | (I'm not a morning person.)
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From tjreedy at udel.edu Thu Mar 30 04:08:11 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 29 Mar 2006 21:08:11 -0500 Subject: [Python-3000] pre-PEP: Procedure for PEPs withBackwards-Incompatible Changes References: <17448.48654.533321.884011@montanaro.dyndns.org><17449.30401.409795.632234@montanaro.dyndns.org><44297D01.7040100@colorstudy.com> Message-ID: "Steven Bethard" wrote in message news:d11dcfba0603281221q286833fag1405e2f9b476cd20 at mail.gmail.com... > On 3/28/06, Guido van Rossum wrote: >> I like your strawman: if incompatibilities or synergy >> don't require it to go into Py3k, let's propose it for 2.x. > > Yeah, I think this makes a lot of sense - and we should probably > document it somewhere. Do you want this in the Backwards-Incompatible > Changes PEP? Or another PEP? Or maybe just an update to PEP 1? A PEP that proposes that functions be moved in 3.0 can be split into two actions: an addition and deletion. The addition can be moved back to 2.x, but the deletion cannot. Example: move filter() from builtins to functools (or whatever it ends up being called). I think this should be submitted (and possibly approved) as one 3.0 BIC PEP that mentions the possibility of a two-phase implementation. In such a case, the addition to 2.x should not happen unless and until the deletion in 3.0 is approved and accepted. The BIC PEP could discuss generic categories like this. 
Terry Jan Reedy From ncoghlan at gmail.com Thu Mar 30 11:46:26 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 30 Mar 2006 19:46:26 +1000 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1143648653.3074.86.camel@localhost.localdomain> References: <4422FC96.2020409@zope.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <79990c6b0603290129m7bed22cft3e242cd009efe9e0@mail.gmail.com> <442A6C57.1030309@gmail.com> <1143648653.3074.86.camel@localhost.localdomain> Message-ID: <442BA8F2.3050604@gmail.com> Adam DePrince wrote: > On Wed, 2006-03-29 at 21:15 +1000, Nick Coghlan wrote: >> Paul Moore wrote: >>> On 3/29/06, Brett Cannon wrote: >>>> Without a direct reason in terms of the language needing a >>>> standardization of an interface, perhaps we just don't need views. If >>>> people want their iterator to have a __len__ method, then fine, they >>>> can add it without breaking anything, just realize it isn't part of >>>> the iterator protocol and thus may limit what objects a function can >>>> accept, but that is there choice. >>> Good point. I think we need to start from strong use cases. With >>> these, I agree that the view concept is a good implementation >>> technique to consider. But let's not implement views just for the sake >>> of having them - I'm pretty sure that was never Guido's intention. >> There are three big use cases: >> >> dict.keys >> dict.values >> dict.items > > There is more than that. But those are the three where we *need* to do something for Py3k. We want to get rid of the copying that exists in Py 2.x, but get a result that is as easy to work with as a real set or list. We don't really have a convention for doing that at this point. 
The other characteristic of these three is that they can be easily generalised to a simple protocol (add __len__ to a reiterable iterable to get something that implements the container protocol). > Everybody who accesses a database has to jump > and down to extract their fields. Wouldn't it be nice if you could say > to your result set from a database: > >>>> rs.execute( "select upc, description, price from my_table" ) >>>> data = rs.fetch().fieldby( 'price','upc') >>>> print type( data ) > Think new protocols, not new types. This specific example is the old named-tuple problem. While expressing that as a wrapper type that works on an arbitrary underlying sequence is an interesting (and, IMO, good) idea, it doesn't require a new protocol, just a convenient type. > Or a tree implementation of a dictionary. > >>>> type( tree_dict.keys() ) > > > The idea that is there is so much more we can do if we had some > mechanism of identifying at a higher level the semantics of the data > structure. While dict is pretty much it for core python, there are a > lot of data stores in the wild, and the View's would give us the ability > for better interaction and abstraction than passing around lists or > their performance modified twin iter. But we can get most of that benefit just by defining a container protocol, and implementing a few container views for things that currently return lists or iterators. We don't yet have any concrete use cases to justify going beyond the simple interface needed to make dict.keys(), .values() and .items() both memory efficient and easy to use. > Consider for instance if you had to dictionaries, both of which are so > large you don't want to work on copies of their keys. You want to know > which items are in only the first ... > > dicta.keys() - dictb.keys() > > Because each supports the SetView interface, we need only provide a > single generic SetView.difference operator and move on. 
This prevents
> the ungainly conversion to sets first which, while easy to write, is
> slow, especially considering how well dict's implement sets in the first
> place.

But there's already an easy way to write that diff so it works based on
the iterator and container protocols:

def only_in_first(first, second):
    """Returns a set containing only items in the first iterable

    first can be any iterable
    second can be any container (preferably a container with O(1)
    item lookup such as a set)
    """
    first_only = set()
    for key in first:
        if key not in second:
            first_only.add(key)
    return first_only

Memory efficient creation of new collections almost always involves
starting with an empty container and building it incrementally.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org

From ncoghlan at gmail.com Thu Mar 30 11:55:24 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 30 Mar 2006 19:55:24 +1000
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To: <442B207A.60305@canterbury.ac.nz>
References: <4422FC96.2020409@zope.com>
	<1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com>
	<1143306134.3186.1.camel@localhost.localdomain>
	<44258376.1080709@gmx.net>
	<1143432484.14391.67.camel@localhost.localdomain>
	<79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com>
	<1143565628.3305.82.camel@localhost.localdomain>
	<4429F6BD.60704@canterbury.ac.nz>
	<442B207A.60305@canterbury.ac.nz>
Message-ID: <442BAB0C.6080901@gmail.com>

Greg Ewing wrote:
> Brett Cannon wrote:
>> Without a direct reason in terms of the language needing a
>> standardization of an interface, perhaps we just don't need
> > views.
>
> On the contrary, views are a very useful idea, *as a
> design pattern*.
What we *don't* need in Python, as far > as I can see, is any formalised protocols or interfaces > for views, because there's nothing that can be said about > them in general. > > Thinking that "having views" means "having a formally > defined interface for views" is a mindset that comes > from B&D languages like Java. It doesn't apply to > Python at all. +lots All I think we're currently missing is an idea of 'what magic methods does an object need to provide in order to pretend to be a container, rather than just an iterable?' Change dict.keys/values/items to return views which implement that set of methods, and all should be good. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Thu Mar 30 12:23:56 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 30 Mar 2006 20:23:56 +1000 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local> References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local> Message-ID: <442BB1BC.9030804@gmail.com> Robert Brewer wrote: > Nick Coghlan wrote: >> There are three big use cases: >> >> dict.keys >> dict.values >> dict.items >> >> Currently these all return lists, which may be expensive in >> terms of copying. They all have iter* variants which while >> memory efficient, are far less convenient to work with. > > I'm still wondering what "far less convenient" means. Is it simply the 4 > extra key presses? I find the iter* variants to be a great solution. An iterator has some serious limitations as a view of a container: 1. Can only iterate once 2. Can't check number of items 3. Truth value testing doesn't work 4. Containment tests don't work 5. String representation is well-nigh useless 6. 
Value-based comparison doesn't work

The source object supports all of these things, but the iterator doesn't. This
is why iteritems and friends are poor substitutes for their list based
equivalents in many situations.

Consider, however, the following view-based approach:

    from itertools import izip

    class _containerview(object):
        """Default behaviour for container views"""
        def __init__(self, container):
            self.container = container
        def __len__(self):
            return len(self.container)
        def __iter__(self):
            return iter(self.container)
        def __contains__(self, val):
            return val in self.container
        def __eq__(self, other):
            # Check they're the same type
            if type(self) != type(other):
                return False
            # Check they're the same size
            if len(self) != len(other):
                return False
            # Check they have the same elements
            for this, that in izip(self, other):
                if this != that:
                    return False
            return True
        def __ne__(self, other):
            return not self.__eq__(other)
        def __str__(self):
            type_name = type(self).__name__
            # repr() each element so that non-string items can be joined
            details = ", ".join(map(repr, self))
            return "%s([%s])" % (type_name, details)

    class keyview(_containerview):
        """Convenient view of dictionary keys"""
        # The default behaviour is always right

    class valueview(_containerview):
        """Convenient view of dictionary values"""
        def __iter__(self):
            return self.container.itervalues()
        def __contains__(self, val):
            for item in self:
                if item == val:
                    return True
            return False

    class itemview(_containerview):
        """Convenient view of dictionary items"""
        def __iter__(self):
            return self.container.iteritems()
        def __contains__(self, item):
            if not isinstance(item, tuple):
                return False
            if len(item) != 2:
                return False
            key, val = item
            try:
                stored_val = self.container[key]
            except KeyError:
                # A missing key raises KeyError, not AttributeError
                return False
            return stored_val == val

Cheers,
Nick.
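
The six limitations listed above are easy to reproduce at the prompt. What
follows is only a sketch, written for a current Python 3 interpreter rather
than the 2.x iteritems() discussed in this thread; the names stock and
has_len are invented for illustration.

```python
# Sketch (Python 3 syntax): what a plain iterator over a dict cannot do.
stock = {"spam": 1, "eggs": 2}

items_iter = iter(stock.items())
first_pass = list(items_iter)    # consumes the iterator
second_pass = list(items_iter)   # nothing is left the second time


def has_len(obj):
    """Return True if len() works on obj."""
    try:
        len(obj)
    except TypeError:
        return False
    return True


iter_has_len = has_len(items_iter)
exhausted_is_true = bool(items_iter)   # truth value says nothing about emptiness

# A containment test "works" only by consuming items while it searches.
fresh = iter(stock.items())
found = ("spam", 1) in fresh
leftover = list(fresh)

# Two iterators over the same data are compared by identity, not by value.
same_by_value = iter(stock.items()) == iter(stock.items())
```

A view object such as the itemview class above avoids each of these because
it keeps no iteration state of its own.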
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From greg.ewing at canterbury.ac.nz Thu Mar 30 13:01:56 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 30 Mar 2006 23:01:56 +1200 Subject: [Python-3000] Parallel iteration syntax In-Reply-To: <0af401c653d0$2e23d100$f44c2597@bagio> References: <4428C3DD.9090603@canterbury.ac.nz> <080301c65277$4bd0f890$bf03030a@trilan> <4429F6A6.4020506@canterbury.ac.nz> <0af401c653d0$2e23d100$f44c2597@bagio> Message-ID: <442BBAA4.2020407@canterbury.ac.nz> Giovanni Bajo wrote: > And what about the ambiguity in parsing: > > for (x in iter1,y,z in iter2): > ... It would probably be necessary to require some parens there, e.g. for (x in iter1, (y,z) in iter2): ... -- Greg From bioinformed at gmail.com Thu Mar 30 14:59:05 2006 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Thu, 30 Mar 2006 07:59:05 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <442B5072.5030005@colorstudy.com> References: <4422FC96.2020409@zope.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <79990c6b0603290129m7bed22cft3e242cd009efe9e0@mail.gmail.com> <442A6C57.1030309@gmail.com> <1143648653.3074.86.camel@localhost.localdomain> <79990c6b0603291428l5d050419hd5ede4c821da4b37@mail.gmail.com> <442B3C59.4020504@canterbury.ac.nz> <442B5072.5030005@colorstudy.com> Message-ID: <2e1434c10603300459h5ef3653bw97156bb8b4da86af@mail.gmail.com> On 3/29/06, Ian Bicking wrote: > > Greg Ewing wrote: > > On 3/29/06, Adam DePrince wrote: > > > >>>>> rs.execute( "select upc, description, price from my_table" ) > >>>>> data = rs.fetch().fieldby( 'price','upc') > >>>>> print type( data ) > >> > > > > Seems to me it would be a better idea for the DB > > module to return tuple-with-attributes objects for > > the rows in the first place, rather than plain > > tuples. 
> > > > When I get around to reworking my custom Firebird > > module, I'm going to make it do that. See the db_row module from http://opensource.theopalgroup.com/. It attempts to provide a tuple-looking (sequence-like) object that also provides "object-like" and "mapping-like" interfaces that is both space and time efficient. While being far from a perfect solution, it does solve many practical problems and is used by many. > Which is off-topic here, except to say that a view on the tuple would be > useful in a way that returning a fancy tuple would not, because it could > wrap any DB-API-compliant result set. Which might still be off-topic, > since the implementation of that particular view would be mostly > unrelated to any other view we've talked about here. Except perhaps to > note that patterns for building light and fast views would also be nice. db_api result sets are the tip of the iceberg, though an important start. These annotated tuples can also be used in many other bulk data processing contexts where metadata is either implicit or decoupled with the data values -- e.g., processing csv files with the csv module. -Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-3000/attachments/20060330/cd0a998c/attachment.html From Ben.Young at risk.sungard.com Thu Mar 30 14:51:12 2006 From: Ben.Young at risk.sungard.com (Ben.Young at risk.sungard.com) Date: Thu, 30 Mar 2006 12:51:12 +0000 Subject: [Python-3000] [Python-Dev] Class decorators In-Reply-To: Message-ID: python-dev-bounces+python=theyoungfamily.co.uk at python.org wrote on 30/03/2006 13:01:25: > Ben.Young at risk.sungard.com wrote: > > python-dev-bounces+python=theyoungfamily.co.uk at python.org wrote on > > 30/03/2006 11:38:30: > > > >> Jack Diederich wrote: > >> > >> > Classes have a unique property in that they are the easiest way to > > make > >> > little namespaces in python. 
> >> > >> For a while now, I've been wondering whether it would > >> be worth having a construct purely for creating little > >> namespaces, instead of abusing a class for this. > >> > >> I've been thinking about an 'instance' statement that > >> creates an instance of a class: > >> > >> instance my_thing(MyClass): > >> > >> # attribute assignments go here > > > > Maybe this would be a use for the proposal a while back where: > > > > 'statement' name(args): > > ... > > > > implied > > > > name = 'statement'("name", args, namespace) > [...] > > I like that generalization (since a class definition statement > currently does about the same anyway). > > However, please post that to the python-3000 list as this would > be a change for Python 3. > Right, sorry. Now forwarded to python-3000! Cheers, Ben > Cheers, > Georg > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python- > dev/python%40theyoungfamily.co.uk > From fabien-ml at x-phuture.com Thu Mar 30 14:54:56 2006 From: fabien-ml at x-phuture.com (Fabien Schwob) Date: Thu, 30 Mar 2006 14:54:56 +0200 Subject: [Python-3000] Parallel iteration syntax In-Reply-To: <442BBAA4.2020407@canterbury.ac.nz> Message-ID: <20060330125456.732B67EC1F@postix.sdv.fr> > > And what about the ambiguity in parsing: > > > > for (x in iter1,y,z in iter2): > > ... > > It would probably be necessary to require some > parens there, e.g. > > for (x in iter1, (y,z) in iter2): > ... I've also a proposition, but I don't know if it can't be done since I don't know how Python works internally : for x in iter1 and y in iter2: ... for x in iter1 and y,z in iter2: ... 
-- Fabien From rasky at develer.com Thu Mar 30 10:01:41 2006 From: rasky at develer.com (Giovanni Bajo) Date: Thu, 30 Mar 2006 10:01:41 +0200 Subject: [Python-3000] Parallel iteration syntax References: <4428C3DD.9090603@canterbury.ac.nz><080301c65277$4bd0f890$bf03030a@trilan> <4429F6A6.4020506@canterbury.ac.nz> Message-ID: <0af401c653d0$2e23d100$f44c2597@bagio> Greg Ewing wrote: >> for i,(x,y) in enumerate(izip(iter1, iter2)): >> ... >> >> must be translated to: >> >> for (i,x in enumerate(iter1), y in iter2): > > Maybe the functionality of enumerate() could be > incorporated into the syntax as well. > > for (i in *, x in iter1, y in iter2): > ... And what about the ambiguity in parsing: for (x in iter1,y,z in iter2): ... Giovanni Bajo From guido at python.org Thu Mar 30 17:17:56 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 30 Mar 2006 07:17:56 -0800 Subject: [Python-3000] Parallel iteration syntax In-Reply-To: <20060330125456.732B67EC1F@postix.sdv.fr> References: <442BBAA4.2020407@canterbury.ac.nz> <20060330125456.732B67EC1F@postix.sdv.fr> Message-ID: Save your breath on this one, folks. This isn't going to happen. 
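
For reference, the parallel loop discussed in this thread can already be
spelled with existing builtins. A small sketch in Python 3 syntax, where
zip() is lazy like the izip() quoted earlier; the sample lists are invented
for illustration.

```python
names = ["spam", "eggs", "ham"]
prices = [1, 2, 3]

# Index plus two parallel sequences, without any new syntax.
rows = []
for i, (name, price) in enumerate(zip(names, prices)):
    rows.append((i, name, price))

# Nested targets unpack in the same way as the proposed "(y, z) in iter2".
pairs = [("a", 1), ("b", 2), ("c", 3)]
flat = [(x, y, z) for x, (y, z) in zip(names, pairs)]
```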
-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From adam.deprince at gmail.com Thu Mar 30 19:01:28 2006
From: adam.deprince at gmail.com (Adam DePrince)
Date: Thu, 30 Mar 2006 12:01:28 -0500
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To: <442BAB0C.6080901@gmail.com>
References: <4422FC96.2020409@zope.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <442B207A.60305@canterbury.ac.nz> <442BAB0C.6080901@gmail.com>
Message-ID: <1143738089.3204.41.camel@localhost.localdomain>

[snip]
> All I think we're currently missing is an idea of 'what magic methods does an
> object need to provide in order to pretend to be a container, rather than just
> an iterable?'
>
> Change dict.keys/values/items to return views which implement that set of
> methods, and all should be good.

That was my intent. I've changed the PEP to better reflect what I'm
trying to do. In short, a view is really a semantic mechanism: a list of
the minimal methods we need to do xyz.

I've heard people going back and forth about views and iters; there seems
to be some confusion, so I want to clarify.

An iter is something that your object *generates* to contain the state
of your current iteration. Notice that a dict has no __next__ method,
but an iter does; this is because the state of your loop is specific to
that for loop and not shared. Iters are generated: each call to __iter__
creates a new object.

Views are not generated; they are either directly implemented or
returned. I see .items/.values returning a link to another object that
is symbiotic to a dict, basically the dict but with different method
mappings. In the C code, the structure would be:

dict <-- |
         |<- item view
         |<- value view

Dict .items and dict .values would basically be references to these
parasitic objects that provide the alternative interfaces. There isn't
actually a reason for them even to be callable except for backwards
compatibility. As a view contains only the state of its host, you don't
need to generate one, only to take a reference.

Now for backwards compatibility, I'd like to consider making them
callable; that is, calling a dict's items will return itself so that old
code doesn't break.

Somebody mentioned having a __views__ or __flavor_a_view__ method to
implement views. We should, but I don't see the difference between

dict.items()
items_view( dict )

In fact, IMHO the former is more useful ... it's the dict object that has
the best idea of what views the dict object can provide.

Personally, I don't like the idea of a __views__ method because it
doesn't seem to make sense. Perhaps I'm just unable to wrap my mind
around it, but internal to each object is the knowledge of what views
make sense. So, what would a __view__ method do? Perhaps return a
catalog of perspectives ( my keys, my items, my values ) and the views
they are associated with ( SetView, SetView, View )?

- Adam DePrince

From guido at python.org Thu Mar 30 19:32:52 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 30 Mar 2006 09:32:52 -0800
Subject: [Python-3000] Need list owner for py3k lists
Message-ID: 

Barry made me the list owner for python-3000 and python-3000-checkins.
That's fine for a while, but long term I need to delegate this. Any
volunteers? (Especially during my trip to the UK I won't have time to
attend to the list owner responsibilities.)

There's really not much to do, except maybe a moderation request once
a day. (I'm not sure why these happen -- the list seems to be open?)
-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From adam.deprince at gmail.com Thu Mar 30 19:44:06 2006
From: adam.deprince at gmail.com (Adam DePrince)
Date: Thu, 30 Mar 2006 12:44:06 -0500
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To: 
References: <4422FC96.2020409@zope.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <1143647134.3074.61.camel@localhost.localdomain>
Message-ID: <1143740646.3204.71.camel@localhost.localdomain>

On Wed, 2006-03-29 at 11:08 -0800, Brett Cannon wrote:
> On 3/29/06, Adam DePrince wrote:
> >
> > > set interface where we could have a __container__/__view__/__set__
> >
> > Why would I call a method to get a view on an object when the object can
> > just as well implement the view? The *only* time we want to call a
> > method to get a view is when there is not one, single, completing
> > definition of the object's canonical perspective should be.
> >
>
> That's my point. What are we gaining by trying to augment iterators
> or come up with some specified way to know if something contains
> something else when we can get it off of the object itself. Although,
> as Nick pointed out, dicts have multiple views of their data.

All views are the object itself, just different perspectives.
.items, .values would just return a parasitic object to the dict that
contains a different set of methods to support that perspective.

> But if you just want to know if a key or value exists, then you can
> either use a dict's __contains__ for keys or create a set to use for
> values. That's what a view will end up having to do anyway underneath

There are three problems with your proposal.

1. The semantics of the problem and your solution are different
2. You ignore the issue of mutability among the items
3. The performance in the simple case is reduced with no gain in the
   larger case

1.
d = dict( ....
'x' in set( d.items() )

This creates a snapshot of your items; the snapshot of your items is
then examined for membership.

'x' in d.items

This directly examines the data store. The difference is one of
concurrency.

2. Mutability. Items in dict.items can be mutable. set fails for that.

3. Performance. If you have even one mutable item, then a linear scan
is the best you can do. But a scan will always be faster than a copy
and scan. If you have all unique, immutable items, then future dict
implementations are in a good position to do this optimization for you.
Some sort of heuristic, such as reverse indexing the table after
lg(__len__) calls to values.__contains__, or maintaining a reverse index
once beyond a certain size until you start inserting mutable or
duplicate items, might be other possibilities. But really, if set
operations on .values are what you really want on a regular basis, then
the dict is the wrong data structure for you anyway.

> the covers, so I don't know if we are going to get much benefit from
> going through this beyond just saying dict.value_view() returns a set
> of the values in a dict. Don't know if we really need to go through
> all of this formality.
>
> > List has a single obvious view. If you really want to see a list as a
> > setview, just pretend it doesn't implement OrderedView and
> > CollectionView. Down-casting by neglect works fine.
> >
> > Dict doesn't, there are 4 possible views. The mapping view provided
> > directly by the dict, and keys/values/items.
>
> Four? In terms of "atomic" data views, there are keys and values.

keys
values
items

> Items could be counted, but that is just a way to pair the different
> types of data in a dict together so I don't know if I would count it
> as a view, let alone whether it would be useful outside of an
> iterator.
But counting the dict itself is just counting the key view > twice since you can get all of that data directly from the dict > without needing a view as you pointed out above. There are 3 views, the items view is a valid perspective that is used enough to consider special treatment. I separate the dict and dict.key uses for the same reason we do now ... we had keys, we added iterkeys, now we have dict.__iter__ and nobody wants to take the brave step of breaking old code by taking away dict.keys() In the collective minds of the python development community there are 4, even if there are only 3. - Adam From fdrake at acm.org Thu Mar 30 20:22:27 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 30 Mar 2006 13:22:27 -0500 Subject: [Python-3000] Need list owner for py3k lists In-Reply-To: References: Message-ID: <200603301322.27817.fdrake@acm.org> On Thursday 30 March 2006 12:32, Guido van Rossum wrote: > Barry made me the list owner for python-3000 and python-3000-checkins. > That's fine for a while, but long term I need to delegate this. Any > volunteers? (Especially during my trip to the UK I won't have time to > attend to the list owner responsibilities.) I can help out there, but we should really have a couple of list admins. -Fred -- Fred L. Drake, Jr. From skip at pobox.com Thu Mar 30 20:26:30 2006 From: skip at pobox.com (skip at pobox.com) Date: Thu, 30 Mar 2006 12:26:30 -0600 Subject: [Python-3000] Need list owner for py3k lists In-Reply-To: References: Message-ID: <17452.8918.379093.381174@montanaro.dyndns.org> Guido> Barry made me the list owner for python-3000 and Guido> python-3000-checkins. That's fine for a while, but long term I Guido> need to delegate this. Any volunteers? Feel free to add me. I moderate a number of lists and have a little script that helps me plow through moderation requests in short order. 
Skip From guido at python.org Thu Mar 30 20:37:56 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 30 Mar 2006 10:37:56 -0800 Subject: [Python-3000] Need list owner for py3k lists In-Reply-To: References: Message-ID: I've got two volunteers. Thanks! On 3/30/06, Guido van Rossum wrote: > Barry made me the list owner for python-3000 and python-3000-checkins. > That's fine for a while, but long term I need to delegate this. Any > volunteers? (Especially during my trip to the UK I won't have time to > attend to the list owner responsibilities.) > > There's really not much to do, except maybe a moderation request once > a day. (I'm not sure why these happen -- the list seems to be open?) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From adam.deprince at gmail.com Thu Mar 30 20:40:58 2006 From: adam.deprince at gmail.com (Adam DePrince) Date: Thu, 30 Mar 2006 13:40:58 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <442BB1BC.9030804@gmail.com> References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local> <442BB1BC.9030804@gmail.com> Message-ID: <1143744058.3204.124.camel@localhost.localdomain> On Thu, 2006-03-30 at 20:23 +1000, Nick Coghlan wrote: > Robert Brewer wrote: > > Nick Coghlan wrote: > >> There are three big use cases: > >> > >> dict.keys > >> dict.values > >> dict.items > >> > >> Currently these all return lists, which may be expensive in > >> terms of copying. They all have iter* variants which while > >> memory efficient, are far less convenient to work with. > > > > I'm still wondering what "far less convenient" means. Is it simply the 4 > > extra key presses? I find the iter* variants to be a great solution. > > An iterator has some serious limitations as a view of a container: > > 1. Can only iterate once Call the iter generator again to get a fresh iter. > 2. Can't check number of items len( dict ) > 3. 
Truth value testing doesn't work

What does true and false for an iter mean?

> 4. Containment tests don't work

Put the iter down and use the view.

'x' in dict
(1,'abc') in dict.items
'albatross' in dict.values

> 5. String representation is well-nigh useless

print list( mydata )

> 6. Value-based comparison doesn't work

You mean

d1 = {...
d2 = {...
d1.keys() == d2.keys()

Eww, you are depending a lot on the internal operation of the dict.
It is possible to generate different orders for the same key set (hash
tables start at 8 slots; any 5 items x & 0x7 == y & 0x7 will collide ...

Watch:

>>> d = {}
>>> d[0] = 0
>>> d[8] = 8
>>> d.keys()
[0, 8]
>>> d = {}
>>> d[8] = 8
>>> d[0] = 0
>>> d.keys()
[8, 0]

>
> The source object supports all of these things, but the iterator doesn't. This
> is why iteritems and friends are poor substitutes for their list based
> equivalents in many situations.

I feel strongly about this. Sorry in advance about the rant ...

What you describe in this paragraph is a special case. A dict.key isn't
a list; while it was implemented as a list, and the list form is
currently understood, it isn't really a list because dicts have no
ordering. That we call it a list is a product of our own inertia.

At risk of being sarcastic, the semantics (however not the performance)
of dict.keys is basically:

random.shuffle( list(dict.iter()) )

Basically, part of what I'm advocating is for the functions associated
with a datastore to do the minimum amount of work to support their
underlying semantics.

There seemed to be a consensus in the community on the size of the view
proposal, and I'm reimplementing the PEP to reflect that. But what I
can't resolve is the other ancillary issue: "To list or iter." I'm not
yet ready to resolve that issue. The views don't resolve it either, and
by their nature are biased towards the iter approach. They provide
__iter__ because it's lightweight to do, but there is no way a
lightweight view can provide you with ordering information from an
unordered datastore. Now, as a means of resolving this conflict, I'm
open to the notion of a view implementing both __iter__ and an explicit
.list method to avoid any extra overhead in generating a list from an
iter instead of directly from the dict as we do now.

Personally, I think returning a list as a default is disgusting. Given
any two operations, one of which is cheap and the other expensive, I feel
it's the expensive one that you should have to work for.

We have two choices:

* Force the iter club to build that nasty list and iterate off of it.
  There are times when you *can't* do this ... when your dict is big and
  memory is limited. The integral( d_time, memory ) is a lot worse here,
  and this will be a killer on memory constrained devices.

* Force the list club to get a nasty iter that they have to build a
  list from. The cost: list( ... ) -- they have the same overall O(n)
  cost, perhaps with a small constant overhead, they had before.

(Lots of pain upfront, or a little pain as you go; oh, I can see all
sorts of classical economic "utility of" arguments. The value of your
n-th byte of ram vs. your m-th.)

If given the requirement that the language implement only one, I feel
that it must be an iter. If the programmer feels that the iter is a bad
substitute for the list, then they can cast to a list. If the
programmer feels that the list is a bad substitute for the iter, well,
they could cast to an iter when their machine is done paging or their
telephone reboots.

Insistence on a list represents a certain degree of hubris with respect
to the available resources. Some users, particularly the embedded
market, have significant space constraints.
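
The trade-off argued above can be made concrete. This is a rough sketch in
Python 3 syntax, on the assumption that sys.getsizeof() is an adequate
stand-in for the memory cost (it counts only the container itself, not the
keys it refers to); squares is an invented example dict.

```python
import sys

squares = {n: n * n for n in range(10000)}

key_iter = iter(squares)   # small, fixed-size cursor into the dict
key_list = list(squares)   # O(n) copy of every key reference

iter_bytes = sys.getsizeof(key_iter)
list_bytes = sys.getsizeof(key_list)

# Going from an iter to a list is one explicit call when a list is wanted.
as_list = list(iter(squares))

# The copy is a snapshot: later inserts do not show up in it.
squares[-1] = 1
snapshot_is_stale = -1 not in key_list
```

The list club pays the O(n) copy only when it asks for it; the iter club
never pays it.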
Cheers - Adam From adam.deprince at gmail.com Thu Mar 30 20:44:05 2006 From: adam.deprince at gmail.com (Adam DePrince) Date: Thu, 30 Mar 2006 13:44:05 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <442B20C4.70309@canterbury.ac.nz> References: <4422FC96.2020409@zope.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <442A5550.5090804@ofai.at> <442B20C4.70309@canterbury.ac.nz> Message-ID: <1143744246.3204.126.camel@localhost.localdomain> On Thu, 2006-03-30 at 12:05 +1200, Greg Ewing wrote: > Stefan Rank wrote: > > > A big question is: Should slicing also return views? and why not? > > That's been considered before, in relation to strings. > The stumbling block is the problem of a view of a > small part of the object keeping the whole thing > alive and using up memory. > > While having a separate way of getting slice-views > could be useful, I think it would be too big a > change in semantics to make it the default > behaviour of slicing notation. No reason we can't make other string operations views as well ... concatenation is one example. If I recall, that's how snobol handles strings, view upon view upon view. Eww. Maybe my memory failed me. 
- Adam, From adam.deprince at gmail.com Thu Mar 30 20:49:48 2006 From: adam.deprince at gmail.com (Adam DePrince) Date: Thu, 30 Mar 2006 13:49:48 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <442B20B0.8060207@canterbury.ac.nz> References: <4422FC96.2020409@zope.com> <44230BA5.8070407@zope.com> <1f7befae0603241051p513439d7ob98bca0b25091ee1@mail.gmail.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <1143646657.3074.52.camel@localhost.localdomain> <442B20B0.8060207@canterbury.ac.nz> Message-ID: <1143744589.3204.132.camel@localhost.localdomain> On Thu, 2006-03-30 at 12:05 +1200, Greg Ewing wrote: > Adam DePrince wrote: > > > SetView implements: > > .__contains__ > > .add > > .discard > > .__len__ > > But what would there be to inherit from the mixin? > Each view class will have entirely its own implementation > of these, depending on the details of the base object. > Inheritance in Python is *entirely* about implementation > inheritance, not interface inheritance. > > > By allowing dict.keys() to announce "I'm a SetView" > > If you mean to make it possible for code to do > > if isinstance(something, SetView): > ... > > that's another thing that Python tries to stay away > from as much as possible. > > If what you're proposing is purely a conceptual > classification, I still think it's far too > elaborate. Anyone reading such a specification is > going to have their eyes glaze over within a few > milliseconds. Only if they examine Python from the perspective of Python ... from a C-api perspective there are certain advantages to lumping things into interfaces, and in the C-API there is no inheritance. 
The manifestation of these interfaces would be in the naming
convention ... __len__ would become part of the View (note the PEP
changed since the above SetView implementation description) interface
and called PyView_len.

I'm trying to address both perspectives here; the user is free to toss
these interfaces and code just knowing what methods to provide, trusting
that everything will quack right.

>
> --
> Greg

From tds333+pydev at gmail.com Thu Mar 30 23:00:41 2006
From: tds333+pydev at gmail.com (Wolfgang Langner)
Date: Thu, 30 Mar 2006 23:00:41 +0200
Subject: [Python-3000] Need list owner for py3k lists
In-Reply-To: 
References: 
Message-ID: <4c45c1530603301300o86f688fo55ce7d2dabff462c@mail.gmail.com>

Hello,

On 3/30/06, Guido van Rossum wrote:
> Barry made me the list owner for python-3000 and python-3000-checkins.
> That's fine for a while, but long term I need to delegate this. Any
> volunteers? (Especially during my trip to the UK I won't have time to
> attend to the list owner responsibilities.)

I can help too.

> There's really not much to do, except maybe a moderation request once
> a day. (I'm not sure why these happen -- the list seems to be open?)

There are people like me who are not able to set the correct From
address in gmail.
:-) -- bye by Wolfgang From ncoghlan at gmail.com Thu Mar 30 23:44:50 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 31 Mar 2006 07:44:50 +1000 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1143744058.3204.124.camel@localhost.localdomain> References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local> <442BB1BC.9030804@gmail.com> <1143744058.3204.124.camel@localhost.localdomain> Message-ID: <442C5152.2000400@gmail.com> Adam DePrince wrote: > There seemed to be a concensus in the community on the size of the view > proposal, and I'm reimplementing the PEP to reflect that. But what I > can't resolve is the other anciliary issue: "To list or iter." I'm not > yet ready to resolve that issue. The views don't resolve it either, and > by their nature are biased towards the iter approach. They provide > __iter__ because its light weight to do, but there is no way a light > weight view can provide you with ordering information from an unordered > datastore. Now, as a means of resolving this conflict, I'm open to the > notion of a view implementing both __iter__ and an explicit .list method > to avoid any extra overhead in generating a list from an iter instead of > directly from the dict as we do now. Umm, the whole point of the views discussion is the realisation that "list or iterator" is a false dichotomy. The correct answer is "new iterable that looks like a container in its own right, but is really just a view of the original". As far as the value-based comparison goes, yes, in reality the view will need to have greater knowledge of the underlying container than in my sample classes in order to be sure of getting a consistent ordering from the underlying objects. As Python's own set and dict show, however, unordered collections can be legitimately compared by value. Cheers, Nick. 
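
The [0, 8] versus [8, 0] example from earlier in the thread shows why a list
comparison is the wrong tool for unordered data. A sketch in Python 3 syntax
follows; note the assumption that a current dict remembers insertion order,
so the two key lists differ for a different reason than the hash collisions
described above. Comparing as sets ignores order either way.

```python
first = {}
first[0] = 0
first[8] = 8

second = {}
second[8] = 8
second[0] = 0

lists_equal = list(first) == list(second)   # order-sensitive comparison
sets_equal = set(first) == set(second)      # order-insensitive comparison
dicts_equal = first == second               # dicts already compare by value
```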
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From guido at python.org Fri Mar 31 00:06:16 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 30 Mar 2006 14:06:16 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <442C5152.2000400@gmail.com> References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local> <442BB1BC.9030804@gmail.com> <1143744058.3204.124.camel@localhost.localdomain> <442C5152.2000400@gmail.com> Message-ID: On 3/30/06, Nick Coghlan wrote: > Adam DePrince wrote: > > There seemed to be a concensus in the community on the size of the view > > proposal, and I'm reimplementing the PEP to reflect that. But what I > > can't resolve is the other anciliary issue: "To list or iter." I'm not > > yet ready to resolve that issue. The views don't resolve it either, and > > by their nature are biased towards the iter approach. They provide > > __iter__ because its light weight to do, but there is no way a light > > weight view can provide you with ordering information from an unordered > > datastore. Now, as a means of resolving this conflict, I'm open to the > > notion of a view implementing both __iter__ and an explicit .list method > > to avoid any extra overhead in generating a list from an iter instead of > > directly from the dict as we do now. > > Umm, the whole point of the views discussion is the realisation that "list or > iterator" is a false dichotomy. The correct answer is "new iterable that looks > like a container in its own right, but is really just a view of the original". > > As far as the value-based comparison goes, yes, in reality the view will need > to have greater knowledge of the underlying container than in my sample > classes in order to be sure of getting a consistent ordering from the > underlying objects. 
> > As Python's own set and dict show, however, unordered collections can be > legitimately compared by value. Boy. I definitely need to make time to read this discussion and the PEP. Java does it this way and I think we can do the same thing: keys() and items() return views that behave like sets; values() returns a view that behaves like a collection (aka multiset or bag). Neither behaves like a list, which means that the order is unspecified (even though of course iteration reveals an order, there's nothing that says the order needs to remain the same). We need to be careful with copying Java's definition of equality though -- for sets, any two objects implementing the Set interface must compare equal iff they are contained in each other (the standard mathematical definition), and for collections they give two options: reference equality (i.e. they are the same object) or some other symmetric equality that however cannot equal a set or list containing the same values (sets can only be equal to other sets, and lists only to other lists). That's quite different from Python's equality definitions, which don't take interfaces into account but only concrete types. Adam correctly pointed out the bugs in Nick's __eq__ implementation (depending on the order), but this is easy enough to fix (though expensive to execute since it would require casting keys and items to sets, and doing some kind of multiset comparison for values). But that doesn't answer the question about the following code a = {1: "one", 2: "two"} b = set(a.keys()) c = a.keys() # assuming keys() returns a "set view" print b == c Here b is a concrete set object; c is a set view of a's keys. These are presumably different types (since the concrete set contains its own hash table while the set view just contains a reference to the dict a). So Nick's definition of __eq__ makes them unequal. However Java would consider them equal (since both implement the set interface, and both contain the same elements). 
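The b == c question above can be tried out with a toy view class. This is only a sketch under stated assumptions: KeySetView is a hypothetical name, not anything proposed in the thread, and it leans on the collections.abc.Set mixin found in current Python 3 to supply an interface-based __eq__ in the Java style described above. The view holds a reference to the dict and never copies the keys.

```python
from collections.abc import Set


class KeySetView(Set):
    """Hypothetical read-only, set-like view of a dict's keys (no copy is made)."""

    def __init__(self, mapping):
        self._mapping = mapping  # keep a reference only; the dict stays the source of truth

    def __contains__(self, key):
        return key in self._mapping

    def __iter__(self):
        return iter(self._mapping)

    def __len__(self):
        return len(self._mapping)


a = {1: "one", 2: "two"}
b = set(a)          # concrete set with its own hash table
c = KeySetView(a)   # view that only refers back to a

print(b == c)       # True: same elements, different concrete types
a[3] = "three"      # the view follows the dict, the concrete set does not
print(b == c)       # False
```

The comparison succeeds in both directions because the built-in set declines to compare with an unknown type and the mixin's __eq__ is then tried; whether that is the desired semantics is exactly the open question in the message.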
We could say that sets and set-like objects should be comparable, but that requires us to define exactly what we consider set-like; this is difficult in a world of duck typing. It would also presumably require us (out of fairness if anything) to allow sequences to be compared for equality, so [1,2,3] and (1,2,3) would henceforth compare equal. And similar for mappings. All with the same problems. (And we'd also have to do it for collections/multisets/bags -- I don't like Java's cop-out there, and in fact I'd like multisets to be comparable to sets, with the semantics that a multiset can be equal to a set only if the multiset has no duplicates.) But do we really want this? It's a pretty serious change in basic semantics of collection data types, *and* it requires us to find a way to determine unequivocally whether something is a set, sequence, mapping, or multiset (and it can't be more than one!). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ianb at colorstudy.com Fri Mar 31 00:50:33 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 30 Mar 2006 16:50:33 -0600 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local> <442BB1BC.9030804@gmail.com> <1143744058.3204.124.camel@localhost.localdomain> <442C5152.2000400@gmail.com> Message-ID: <442C60B9.9020302@colorstudy.com> Guido van Rossum wrote: > On 3/30/06, Nick Coghlan wrote: > >>Adam DePrince wrote: >> >>>There seemed to be a concensus in the community on the size of the view >>>proposal, and I'm reimplementing the PEP to reflect that. But what I >>>can't resolve is the other anciliary issue: "To list or iter." I'm not >>>yet ready to resolve that issue. The views don't resolve it either, and >>>by their nature are biased towards the iter approach. 
They provide >>>__iter__ because its light weight to do, but there is no way a light >>>weight view can provide you with ordering information from an unordered >>>datastore. Now, as a means of resolving this conflict, I'm open to the >>>notion of a view implementing both __iter__ and an explicit .list method >>>to avoid any extra overhead in generating a list from an iter instead of >>>directly from the dict as we do now. >> >>Umm, the whole point of the views discussion is the realisation that "list or >>iterator" is a false dichotomy. The correct answer is "new iterable that looks >>like a container in its own right, but is really just a view of the original". >> >>As far as the value-based comparison goes, yes, in reality the view will need >>to have greater knowledge of the underlying container than in my sample >>classes in order to be sure of getting a consistent ordering from the >>underlying objects. >> >>As Python's own set and dict show, however, unordered collections can be >>legitimately compared by value. > > > Boy. I definitely need to make time to read this discussion and the PEP. > > Java does it this way and I think we can do the same thing: > > keys() and items() return views that behave like sets; values() > returns a view that behaves like a collection (aka multiset or bag). > Neither behaves like a list, which means that the order is unspecified > (even though of course iteration reveals an order, there's nothing > that says the order needs to remain the same). > > We need to be careful with copying Java's definition of equality > though -- for sets, any two objects implementing the Set interface > must compare equal iff they are contained in each other (the standard > mathematical definition), and for collections they give two options: > reference equality (i.e. 
they are the same object) or some other > symmetric equality that however cannot equal a set or list containing > the same values (sets can only be equal to other sets, and lists only > to other lists). > > That's quite different from Python's equality definitions, which don't > take interfaces into account but only concrete types. Adam correctly > pointed out the bugs in Nick's __eq__ implementation (depending on the > order), but this is easy enough to fix (though expensive to execute > since it would require casting keys and items to sets, and doing some > kind of multiset comparison for values). > > But that doesn't answer the question about the following code > > a = {1: "one", 2: "two"} > b = set(a.keys()) > c = a.keys() # assuming keys() returns a "set view" > print b == c > > Here b is a concrete set object; c is a set view of a's keys. These > are presumably different types (since the concrete set contains its > own hash table while the set view just contains a reference to the > dict a). So Nick's definition of __eq__ makes them unequal. However > Java would consider them equal (since both implement the set > interface, and both contain the same elements). > > We could say that sets and set-like objects should be comparable, but > that requires us to define exactly what we consider set-like; this is > difficult in a world of duck typing. It would also presumably require > us (out of fairness if anything) to allow sequences to be compared for > equality, so [1,2,3] and (1,2,3) would henceforth compare equal. And > similar for mappings. All with the same problems. Set-like is anything that subclasses baseset? But maybe there's a deeper answer somewhere, as base* types seem a bit kludgy. A collection-specific protocol for testing equality would be reasonable. It doesn't break duck typing, just adds a bit more to the interfaces of collections than purely a bunch of disparate methods. 
A generic view protocol could maybe handle it too, so you'd ask an object to give a set-like view of itself when comparing that object to a set, and then test that. And the object could return itself, if it already implemented a set-like view. Or return None, meaning no such view was possible. At which point it sounds just like adaptation.

--
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org

From guido at python.org Fri Mar 31 01:47:49 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 30 Mar 2006 15:47:49 -0800
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To: <442C60B9.9020302@colorstudy.com>
References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local>
    <442BB1BC.9030804@gmail.com>
    <1143744058.3204.124.camel@localhost.localdomain>
    <442C5152.2000400@gmail.com>
    <442C60B9.9020302@colorstudy.com>
Message-ID:

On 3/30/06, Ian Bicking wrote:
> Set-like is anything that subclasses baseset?

No, it should support duck typing.

> But maybe there's a
> deeper answer somewhere, as base* types seem a bit kludgy.

Not just kludgy, but unpythonic.

> A collection-specific protocol for testing equality would be reasonable.

I'm not sure what you mean here. Are you proposing using a different method than __eq__()?

> It doesn't break duck typing, just adds a bit more to the interfaces of
> collections than purely a bunch of disparate methods.
>
> A generic view protocol could maybe handle it too, so you'd ask an
> object to give a set-like view of itself when comparing that object to a
> set, and then test that. And the object could return itself, if it
> already implemented a set-like view. Or return None, meaning no such
> view was possible. At which point it sounds just like adaptation.

That's definitely an interesting thought, but I'm not sure if it'll go anywhere. I wouldn't want this to turn into the creation of a new set object (a copy); if it can't be a view, it should be refused.
That's different from adaptation. I don't want copies to be created because views ought to be lightweight; that's part of the contract for views. If the caller doesn't mind a copy to be taken, they should just use set(x) directly.

FWIW, if anyone is reading the Java collections framework docs, their "abstract" classes are really implementation helpers along the lines of Python's DictMixin. (In Java they can't be mixins because of the constraint to single inheritance.)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ianb at colorstudy.com Fri Mar 31 02:11:47 2006
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 30 Mar 2006 18:11:47 -0600
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To:
References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local>
    <442BB1BC.9030804@gmail.com>
    <1143744058.3204.124.camel@localhost.localdomain>
    <442C5152.2000400@gmail.com>
    <442C60B9.9020302@colorstudy.com>
Message-ID: <442C73C3.2090203@colorstudy.com>

Guido van Rossum wrote:
>>A collection-specific protocol for testing equality would be reasonable.
>
>
> I'm not sure what you mean here. Are you proposing using a different
> method than __eq__()?

No, that a collection that wanted to do a nice equality test might do something like:

    class Set:
        def __eq__(self, other):
            if not other.__is_collection__():
                return False
            if len(self) != len(other):
                return False
            for item in other:
                if item not in self:
                    return False
            return True

Though in this example "Set([1, 2]) == {1: None, 2: None}" is true, which I wouldn't like. In general mapping and non-mapping collections seem fairly different to me. Though if "Set([(1, None), (2, None)]) == {1: None, 2: None}" is true, that's actually perfectly fine to me. I guess I don't like that dictionaries, when treated like sequences, reveal only their keys, when that's just a subset of what they are.
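A runnable variant of the equality test sketched above, for anyone who wants to try the two comparisons mentioned. It is a rough sketch, not the author's implementation: the __is_collection__() method does not exist on built-in types, so this version assumes collections.abc.Collection (Python 3.6 or later) as a stand-in for that check, and the storage behind the toy Set class is likewise an assumption.

```python
from collections.abc import Collection


class Set:
    """Toy set wrapper illustrating the equality test sketched above."""

    def __init__(self, items=()):
        self._items = set(items)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)

    def __contains__(self, item):
        return item in self._items

    def __eq__(self, other):
        # Assumption: isinstance(..., Collection) stands in for the
        # hypothetical other.__is_collection__() call.
        if not isinstance(other, Collection):
            return NotImplemented
        if len(self) != len(other):
            return False
        # Iterating a dict yields its keys, so a dict is compared by keys only.
        return all(item in self for item in other)


print(Set([1, 2]) == {1: None, 2: None})                  # True
print(Set([(1, None), (2, None)]) == {1: None, 2: None})  # False
```

The first result is the one described as unwelcome. The second comes out False here because the dict is iterated by key; it would only become True if the mapping exposed its items when treated as a collection, which is the point made in the closing sentence above.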
>>It doesn't break duck typing, just adds a bit more to the interfaces of >>collections than purely a bunch of disparate methods. >> >>A generic view protocol could maybe handle it too, so you'd ask an >>object to give a set-like view of itself when comparing that object to a >>set, and then test that. And the object could return itself, if it >>already implemented a set-like view. Or return None, meaning no such >>view was possible. At which point it sounds just like adaptation. > > > That's definitely an interesting thought, but I'm not sure if it'll go > anywhere. I wouldn't want this to turn into the creation of a new set > object (a copy); if it can't be a view, it should be refused. That's > different from adaptation. I don't want copies to be created because > views ought to be lightweight; that's part of the contract for views. > If the caller doesn't mind a copy to be taken, they should just use > set(x) directly. I can't remember if there is an adaptation term related to that. I seem to remember people (where "people" probably means "Jim") referring to the fact that for some adaptations "IFoo(IBar(foo)) is foo" is true, when foo implements IFoo, whereas for views this would be more-or-less implied...? And I don't know if it is guaranteed that any changes made to IBar(foo) will be reflected in foo itself. I think that isn't guaranteed, and that Zope interfaces have some extra things to handle the results (Annotatable or something). The more I think about it, the more views seem like a more explicit and constrained form of adaptation. They are more explicit because they generally are accessed in a specific way, like dict.keys(). But if views were used for equality, there'd need to be an implicit way to get a view too (though I assume that no implicit means of getting a view would ever give you something quite like dict.keys(), which is only a subset of dict -- but it might give you something like dict.items()). 
Views are more constrained because they have no state and are only proxies to some underlying object.

Generally speaking I've remained suspicious of adaptation. Clearly other people have too, since adaptation has long been lingering and hasn't become part of mainstream Python (in or out of the standard library). Views as a more conservative kind of adaptation seems like it addresses many of the issues -- preserving dynamic typing while also allowing for explicit interfaces -- without some of the complexity of "full" adaptation.

--
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org

From greg.ewing at canterbury.ac.nz Fri Mar 31 02:32:06 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 31 Mar 2006 12:32:06 +1200
Subject: [Python-3000] Parallel iteration syntax
In-Reply-To: <20060330125456.732B67EC1F@postix.sdv.fr>
References: <20060330125456.732B67EC1F@postix.sdv.fr>
Message-ID: <442C7886.6060501@canterbury.ac.nz>

Fabien Schwob wrote:

> for x in iter1 and y in iter2:

It would be tricky to avoid having that parsed as

    for x in (iter1 and y in iter2):

since 'in' is also a valid part of an expression.

That's one of the reasons I suggested parens around the whole thing, which would make it unambiguous. The other reason was to try to make it look more like parallel than nested iteration, but it seems this is too subtle a clue for some people.
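The parsing concern above can be checked with the ast module, at least on a current Python 3 interpreter. This is a small sketch, not part of the proposal: it shows that the existing grammar already accepts the proposed spelling and reads everything after the first 'in' as one boolean expression. The zip() loop at the end is the spelling of parallel iteration that already works.

```python
import ast

# The proposed spelling is already valid syntax, with a different meaning.
tree = ast.parse("for x in iter1 and y in iter2:\n    pass\n")
loop = tree.body[0]

# The loop iterates over a single expression: (iter1 and (y in iter2)).
print(type(loop.iter).__name__)             # BoolOp
print(type(loop.iter.values[1]).__name__)   # Compare

# Parallel iteration as it can be written today.
pairs = list(zip([1, 2, 3], "abc"))
for x, y in pairs:
    print(x, y)
```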
--
Greg

From aahz at pythoncraft.com Fri Mar 31 03:20:37 2006
From: aahz at pythoncraft.com (Aahz)
Date: Thu, 30 Mar 2006 17:20:37 -0800
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To:
References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local>
    <442BB1BC.9030804@gmail.com>
    <1143744058.3204.124.camel@localhost.localdomain>
    <442C5152.2000400@gmail.com>
Message-ID: <20060331012037.GA7960@panix.com>

On Thu, Mar 30, 2006, Guido van Rossum wrote:
>
> Java does it this way and I think we can do the same thing:
>
> keys() and items() return views that behave like sets; values()
> returns a view that behaves like a collection (aka multiset or bag).
> Neither behaves like a list, which means that the order is unspecified
> (even though of course iteration reveals an order, there's nothing
> that says the order needs to remain the same).

What do we want to tell people who have code like this:

    keys = d.keys()
    keys.sort()

Not so much in terms of the fix, but where/why we drew the line about what's supported by the value returned by d.keys() and what's not. I'm not getting clarity about that from this discussion so far, and I think it's needed.

--
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/

"Look, it's your affair if you want to play with five people, but don't go calling it doubles." --John Cleese anticipates Usenet

From greg.ewing at canterbury.ac.nz Fri Mar 31 03:39:13 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 31 Mar 2006 13:39:13 +1200
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To:
References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local>
    <442BB1BC.9030804@gmail.com>
    <1143744058.3204.124.camel@localhost.localdomain>
    <442C5152.2000400@gmail.com>
Message-ID: <442C8841.3030905@canterbury.ac.nz>

Guido van Rossum wrote:

> But do we really want this?
It's a pretty serious change in basic > semantics of collection data types, *and* it requires us to find a way > to determine unequivocally whether something is a set, sequence, > mapping, or multiset (and it can't be more than one!). If we *did* want it, I think there would have to be a collection of abstract types -- SequenceBase, MappingBase, SetBase, etc., that all types wanting to participate in this scheme would have to be based on. Since this would go against duck typing, my feeling is no, we don't want it. If it's okay for lists and tuples containing the same items to be unequal even though they're both sequences, then I think it's okay for a real set not to be equal to a set view of something. If you really want to be able to compare different set-like objects, there could be a function for that. Or even a bunch of functions for doing set operations on set-like objects. There's a precedent for this in Numeric. The Numeric array objects know how to do arithmetic with each other, but there is also a set of functions add(), multiply(), etc. which do the corresponding things with any objects that can be treated as sequences of sequences. It could be worth having a set of functions like this in the core for doing sequence operations, set operations, etc. 
--
Greg

From guido at python.org Fri Mar 31 03:39:08 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 30 Mar 2006 17:39:08 -0800
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To: <20060331012037.GA7960@panix.com>
References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local>
    <442BB1BC.9030804@gmail.com>
    <1143744058.3204.124.camel@localhost.localdomain>
    <442C5152.2000400@gmail.com>
    <20060331012037.GA7960@panix.com>
Message-ID:

On 3/30/06, Aahz wrote:
> What do we want to tell people who have code like this:
>
> keys = d.keys()
> keys.sort()
>
> Not so much in terms of the fix, but where/why we drew the line about
> what's supported by the value returned by d.keys() and what's not. I'm
> not getting clarity about that from this discussion so far, and I think
> it's needed.

That's a really good point; we need a meta-PEP on how to handle this kind of issue.

In this particular case I'm convinced that we must allow such code to break (perhaps silently or painfully); making keys() return an iterator has been on the Python 3000 agenda for years (pretty much since iterators were first introduced in Python 2.2).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz Fri Mar 31 03:53:14 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 31 Mar 2006 13:53:14 +1200
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To: <442C73C3.2090203@colorstudy.com>
References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local>
    <442BB1BC.9030804@gmail.com>
    <1143744058.3204.124.camel@localhost.localdomain>
    <442C5152.2000400@gmail.com>
    <442C60B9.9020302@colorstudy.com>
    <442C73C3.2090203@colorstudy.com>
Message-ID: <442C8B8A.40906@canterbury.ac.nz>

Ian Bicking wrote:

> Though if "Set([(1, None), (2, None)]) ==
> {1: None, 2: None}" is true, that's actually perfectly fine to me.
That would be rather too loose for my tastes. A mapping can be *represented* as a set of tuples, but that's not the same thing as it *being* a set of tuples.

> Generally speaking I've remained suspicious of adaptation.

I think to most people it seems like a solution looking for a problem. In all the code I've ever written, plain duck typing has been perfectly adequate. I'm willing to concede that it might have use in some specialised areas such as Zope, but there doesn't seem to be any general demand for it.

--
Greg

From greg.ewing at canterbury.ac.nz Fri Mar 31 04:05:11 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 31 Mar 2006 14:05:11 +1200
Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :)
In-Reply-To: <20060331012037.GA7960@panix.com>
References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local>
    <442BB1BC.9030804@gmail.com>
    <1143744058.3204.124.camel@localhost.localdomain>
    <442C5152.2000400@gmail.com>
    <20060331012037.GA7960@panix.com>
Message-ID: <442C8E57.3000603@canterbury.ac.nz>

Aahz wrote:

> What do we want to tell people who have code like this:
>
> keys = d.keys()
> keys.sort()

I think the view returned in this case should be immutable, so that the above fails. Then we tell them to replace it with

    keys = sorted(d.keys())

In general, where we're changing the semantics of an existing method to return a view instead of a new object, the view returned should be immutable.
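A short sketch of the suggested migration, assuming a Python 3 interpreter, where keys() returns a view object that has no sort() method. The dictionary contents are made up for illustration.

```python
d = {"pear": 3, "apple": 1, "fig": 2}

# The old idiom: take the keys, then sort them in place.
try:
    keys = d.keys()
    keys.sort()
    outcome = "sorted in place"
except AttributeError:
    outcome = "no sort() on the view"

# The suggested replacement builds a new sorted list from the view.
keys = sorted(d.keys())

print(outcome)   # no sort() on the view
print(keys)      # ['apple', 'fig', 'pear']
```

The old idiom fails loudly at the sort() call rather than silently misbehaving, and sorted() works the same way whether keys() returns a list, an iterator, or a view.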
-- Greg From greg.ewing at canterbury.ac.nz Fri Mar 31 04:21:24 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 31 Mar 2006 14:21:24 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1143744246.3204.126.camel@localhost.localdomain> References: <4422FC96.2020409@zope.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <442A5550.5090804@ofai.at> <442B20C4.70309@canterbury.ac.nz> <1143744246.3204.126.camel@localhost.localdomain> Message-ID: <442C9224.3050601@canterbury.ac.nz> Adam DePrince wrote: > No reason we can't make other string operations views as well ... > concatenation is one example. If I recall, that's how snobol handles > strings, view upon view upon view. I don't think it was quite as bad as that. If I remember correctly, when you took a substring you didn't get a view of a view, but another view of the underlying layer holding the characters. And there was some way of detecting when parts of the underlying buffer were no longer used and freeing them. One strange thing Snobol did do was effectively intern every string, though. And it had some weird-assed name for them like "natural variable" or some such... 
-- Greg From greg.ewing at canterbury.ac.nz Fri Mar 31 04:34:04 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 31 Mar 2006 14:34:04 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <1143738089.3204.41.camel@localhost.localdomain> References: <4422FC96.2020409@zope.com> <1f7befae0603241822n1280ba67i3ed87e842114f18b@mail.gmail.com> <1143306134.3186.1.camel@localhost.localdomain> <44258376.1080709@gmx.net> <1143432484.14391.67.camel@localhost.localdomain> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <442B207A.60305@canterbury.ac.nz> <442BAB0C.6080901@gmail.com> <1143738089.3204.41.camel@localhost.localdomain> Message-ID: <442C951C.4010002@canterbury.ac.nz> Adam DePrince wrote: > Views > are not generated, they are either directly implemented, or returned. If you're thinking that the object would keep a set of pre-allocated views, there's a problem with that -- the views need to have a reference to the base object, thus creating a circular reference. The object could perhaps keep a cache of weakly- referenced views, returning one of those if it's available, otherwise creating a new one. -- Greg From aleaxit at gmail.com Fri Mar 31 05:22:03 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Thu, 30 Mar 2006 19:22:03 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <442C8B8A.40906@canterbury.ac.nz> References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local> <442BB1BC.9030804@gmail.com> <1143744058.3204.124.camel@localhost.localdomain> <442C5152.2000400@gmail.com> <442C60B9.9020302@colorstudy.com> <442C73C3.2090203@colorstudy.com> <442C8B8A.40906@canterbury.ac.nz> Message-ID: <8541A849-BEBC-416B-AAF6-C222BEA9F51C@gmail.com> On Mar 30, 2006, at 5:53 PM, Greg Ewing wrote: ... 
>> Generally speaking I've remained suspicious of adaptation.
>
> I think to most people it seems like a solution
> looking for a problem. In all the code I've ever
> written, plain duck typing has been perfectly
> adequate. I'm willing to concede that it might
> have use in some specialised areas such as Zope,
> but there doesn't seem to be any general demand
> for it.

I concede that the peasants haven't (yet!-) stormed Guido's castle with pitchforks and torches to get him to approve PEP 246, but I view that as me not having done a good job of communication. Each and every time a new ad-hoc-adaptation gets into the language (e.g., most recently the __index__ one), I'm tempted to point out how much better life would be with adaptation... but these days I mostly shrug and get on with my life instead.

Consider __index__, and a user of gmpy, assuming gmpy didn't rush out a 2.5 release with tp_index support. The user of gmpy would be stuck -- no way he could use a gmpy.mpz as an index into a list, because the ad-hoc-adaptation of __index__ means that the type itself must grow the slot. _IF_ adaptation existed, a third party could write an adapter from gmpy.mpz to indextype (or whatever), without needing any tweak on gmpy itself NOR list objects' sources, and everybody else could just import and use that adapter to make gmpy.mpz instances usable as indices in the natural way.

And that's just for somebody using one humble library -- adaptation really shines when you're using independently developed frameworks. If the framework consuming X requested adaptation-to-X on all objects it's passed, rather than checking them for this or that ad-hoc protocol, as long as a suitable adapter is registered the framework producing Y's and the one consuming X's would just fit with each other with no effort required on the part of the long-suffering application developer.
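To make the registration idea above concrete, here is a deliberately tiny sketch of a third-party adapter registry. It is not the PEP 246 API: register_adapter, adapt and the BigInt class are hypothetical names invented for this illustration, with BigInt standing in for a library type such as gmpy.mpz that cannot itself be modified.

```python
_adapters = {}  # maps (source type, protocol) to an adapter callable


def register_adapter(source_type, protocol, adapter):
    """Record how objects of source_type can be turned into protocol."""
    _adapters[(source_type, protocol)] = adapter


def adapt(obj, protocol):
    """Return obj as protocol, using a registered adapter if one is needed."""
    if isinstance(obj, protocol):
        return obj  # already conforms, nothing to do
    for cls in type(obj).__mro__:
        adapter = _adapters.get((cls, protocol))
        if adapter is not None:
            return adapter(obj)
    raise TypeError("cannot adapt %r to %s" % (obj, protocol.__name__))


class BigInt:
    """Stand-in for a third-party number type that we cannot change."""

    def __init__(self, digits):
        self.digits = digits


# Written by a third party: neither BigInt nor list is touched.
register_adapter(BigInt, int, lambda big: int(big.digits))

items = ["a", "b", "c"]
print(items[adapt(BigInt("2"), int)])   # c
print(adapt(1, int))                    # 1
```

In this sketch the consuming code still has to call adapt() at the point of use, which is the part of the bargain questioned in the replies.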
I don't know why this shines so bright, so OBVIOUS, to me, and yet I'm unable to convince Guido _or_ build a groundswell of support. It's not as if I've done more multi-framework development than anybody else, after all... just my share. I guess I'll be back to campaigning for it more actively in the future, once the Nutshell's 2nd edition is out. I'm seriously convinced that having protocol-adaptation is the ONE change that would make most positive difference to typical app developers today... Alex From greg.ewing at canterbury.ac.nz Fri Mar 31 06:45:07 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 31 Mar 2006 16:45:07 +1200 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <8541A849-BEBC-416B-AAF6-C222BEA9F51C@gmail.com> References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local> <442BB1BC.9030804@gmail.com> <1143744058.3204.124.camel@localhost.localdomain> <442C5152.2000400@gmail.com> <442C60B9.9020302@colorstudy.com> <442C73C3.2090203@colorstudy.com> <442C8B8A.40906@canterbury.ac.nz> <8541A849-BEBC-416B-AAF6-C222BEA9F51C@gmail.com> Message-ID: <442CB3D3.3010705@canterbury.ac.nz> Alex Martelli wrote: > If the framework consuming X requested adaptation-to-X on all objects > it's passed, This is the part that bothers me, I think. It seems like all these adaptation requests would be a huge burden on the framework developer. In PyGUI, for example, I currently have about 2 or 3 dozen classes. Should I be defining and registering a formal protocol for every one of those? Should I be putting adaptation calls in all of the few hundred places where an object might come in from somewhere else and I'm expecting it to be one of my classes? Does it even stop there? Should I be doing IInteger(x) on every x someone gives me that I'm going to use as an integer? What you're proposing seems to be tantamount to a sort of dynamic version of static typing, with much of its bookkeeping overhead. 
Python tries to avoid this by not going in for that sort of thing. -- Greg From guido at python.org Fri Mar 31 07:56:24 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 30 Mar 2006 21:56:24 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <442C9224.3050601@canterbury.ac.nz> References: <4422FC96.2020409@zope.com> <79990c6b0603271117l4c69372h11362dd5d2d0ca32@mail.gmail.com> <1143565628.3305.82.camel@localhost.localdomain> <4429F6BD.60704@canterbury.ac.nz> <442A5550.5090804@ofai.at> <442B20C4.70309@canterbury.ac.nz> <1143744246.3204.126.camel@localhost.localdomain> <442C9224.3050601@canterbury.ac.nz> Message-ID: > Adam DePrince wrote: > > No reason we can't make other string operations views as well ... > > concatenation is one example. If I recall, that's how snobol handles > > strings, view upon view upon view. But that's irrelevant for immutable strings -- views are about semantic links, not implementation. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ianb at colorstudy.com Fri Mar 31 08:41:49 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 31 Mar 2006 00:41:49 -0600 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <8541A849-BEBC-416B-AAF6-C222BEA9F51C@gmail.com> References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local> <442BB1BC.9030804@gmail.com> <1143744058.3204.124.camel@localhost.localdomain> <442C5152.2000400@gmail.com> <442C60B9.9020302@colorstudy.com> <442C73C3.2090203@colorstudy.com> <442C8B8A.40906@canterbury.ac.nz> <8541A849-BEBC-416B-AAF6-C222BEA9F51C@gmail.com> Message-ID: <442CCF2D.9090206@colorstudy.com> Alex Martelli wrote: > On Mar 30, 2006, at 5:53 PM, Greg Ewing wrote: > ... >>> Generally speaking I've remained suspicious of adaptation. >> I think to most people it seems like a solution >> looking for a problem. 
In all the code I've ever >> written, plain duck typing has been perfectly >> adequate. I'm willing to concede that it might >> have use in some specialised areas such as Zope, >> but there doesn't seem to be any general demand >> for it. > > I concede that the peasants haven't (yet!-) stormed Guido's castle > with pitchforks and torches to get him to approve PEP 246, but I view > that as me not having done a good job of communication. Each and > every time a new ad-hoc-adaptation gets into the language (e.g., most > recently the __index__ one), I'm tempted to point out how much better > life would be with adaptation... but these days I mostly shrug and > get on with my life instead. There's nothing stopping people from using adaptation right now. And yet it only has happened in a few very specific communities -- Twisted and Zope 3. And probably several small projects and whatnot, but mass adoption has really stayed inside those communities. There's no doubt lots of interesting interpretations one could make about this, about social dynamics and goals and problem areas and whatnot, but whatever it is, people are not scrambling to get their hands on adaptation. Which is not to say that there is no use to adaptation, just that as currently formulated it's a difficult sell. I think the ambition of the current implementations and usage might be part of this. Or maybe the way it changes the system, and demands a great deal from the system. It only really starts getting interesting when everyone starts to use interfaces and adaptation, and that requires more buy-in than most people are willing to give. > And that's just for somebody using one humble library -- adaptation > really shines when you're using independently developed frameworks. 
> If the framework consuming X requested adaptation-to-X on all objects > it's passed, rather than checking them for this or that ad-hoc > protocol, as long as a suitable adapter is registered the framework > producing Y's and the one consuming X's would just fit with each > other with no effort required on the part of the long-suffering > application developer. Does it really shine? I don't know. I think the geometrically increasing returns from more use of adaptation and interfaces also work against it -- because the returns get geometrically smaller as you consider the incremental value of introducing adaptation into a system. I don't see multiframework integration using adaptation. I see frameworks that are loosely coupled but otherwise insular -- Twisted and Zope 3 -- using adaptation. I don't even get the impression they use each other's code that much, even though they use the same adaptation system. Explicit, stupid, obvious code is how integration works. Adaptation is too clever, and cleverness kills integration IMHO. -- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From tjreedy at udel.edu Fri Mar 31 09:12:22 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 31 Mar 2006 02:12:22 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local><442BB1BC.9030804@gmail.com><1143744058.3204.124.camel@localhost.localdomain><442C5152.2000400@gmail.com> <20060331012037.GA7960@panix.com> Message-ID: "Aahz" wrote in message news:20060331012037.GA7960 at panix.com... > What do we want to tell people who have code like this: > > keys = d.keys() > keys.sort() Could a good-enough code analyzer detect such, even if separated by intervening lines? If so, it could suggest sorted() as a fix. I wonder if the pypy analyzer could be adapted for 2.x to 3.0 warning and upgrade purposes. Or do pylint or pychecker gather enough information?
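The sorted() fix suggested above can be sketched as follows. This is only an illustration of the rewrite an analyzer might propose, assuming keys() stops returning a list; the dictionary contents are made up.

```python
d = {"pear": 3, "apple": 1, "fig": 2}  # made-up sample data

# Python 2 idiom quoted in the thread; it relies on keys() returning a list:
#     keys = d.keys()
#     keys.sort()

# Possible replacement: sorted() accepts any iterable and returns a new
# list, so it does not matter whether keys() yields a list, an iterator
# or a view.
keys = sorted(d.keys())
print(keys)
```

The same pattern with list() instead of sorted() covers code that only needs indexing or slicing rather than ordering.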
tjr From nnorwitz at gmail.com Fri Mar 31 09:37:28 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Thu, 30 Mar 2006 23:37:28 -0800 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local> <442BB1BC.9030804@gmail.com> <1143744058.3204.124.camel@localhost.localdomain> <442C5152.2000400@gmail.com> <20060331012037.GA7960@panix.com> Message-ID: On 3/30/06, Terry Reedy wrote: > > "Aahz" wrote in message > news:20060331012037.GA7960 at panix.com... > > What do we want to tell people who have code like this: > > > > keys = d.keys() > > keys.sort() > > Could a good-enough code analyzer detect such, even if separated by > intervening lines? If so, it could suggest sorted() as a fix. I wonder if > the pypy analyzer could be adapted for 2.x to 3.0 warning and upgrade > purposes. Or do pylint or pychecker gather enough information? pychecker is supposed to have this info, but only if d is known to be a dict. It could be extended to assume any method keys() (and friends) should return iterators, in which case it would say that an iterator doesn't have a sort method. Below is the output of the current version. n

### file: tt.py
def foo():
    d = {}
    keys = d.keys()
    keys.sort()
    keys.sort2()
###
$ pychecker tt.py
Processing tt...

Warnings...

tt.py:6: Object (keys) has no attribute (sort2)

From taroso at gmail.com Fri Mar 31 13:26:39 2006 From: taroso at gmail.com (Taro Ogawa) Date: Fri, 31 Mar 2006 11:26:39 +0000 (UTC) Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local> <442BB1BC.9030804@gmail.com> Message-ID: [Originally misposted to python-dev] Nick Coghlan gmail.com> writes: > There are three big use cases: > dict.keys > dict.values > dict.items > Currently these all return lists, which may be expensive in terms of copying.
> They all have iter* variants which while memory efficient, are far less > convenient to work with. Is there any reason why they can't be view objects - a dictionary has keys, has values, has items - rather than methods returning view objects:

    for k in mydict.keys:
        ...
    for v in mydict.values:
        ...
    for k, v in mydict.items:
        ...

For backward compatibility with Py2.x, calling them would raise a DeprecationWarning and return a list. This could even be introduced in 2.x (with a PendingDeprecationWarning instead?). Cheers, -T. From p.f.moore at gmail.com Fri Mar 31 14:18:17 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 31 Mar 2006 13:18:17 +0100 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <442CB3D3.3010705@canterbury.ac.nz> References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local> <1143744058.3204.124.camel@localhost.localdomain> <442C5152.2000400@gmail.com> <442C60B9.9020302@colorstudy.com> <442C73C3.2090203@colorstudy.com> <442C8B8A.40906@canterbury.ac.nz> <8541A849-BEBC-416B-AAF6-C222BEA9F51C@gmail.com> <442CB3D3.3010705@canterbury.ac.nz> Message-ID: <79990c6b0603310418v738bec87s5ac710a9a1aa6dab@mail.gmail.com> On 3/31/06, Greg Ewing wrote: > Alex Martelli wrote: > > > If the framework consuming X requested adaptation-to-X on all objects > > it's passed, > > This is the part that bothers me, I think. It > seems like all these adaptation requests would > be a huge burden on the framework developer. That *is* the big stumbling block, I agree. However, where it's used, I've generally found adaptation to be a nice solution (although I've only written "learning" or "toy" code, so I don't have production-level experience to back this up). But it's not true that you have to do this from day one. You can leave your framework as it is, and only add adaptation when it's needed. The __index__ example is a good one here - for a long time, __index__ didn't exist.
But in the end, the requirement to index with user-defined types became sufficiently pressing that a solution was needed. The "traditional" solution, __index__, requires co-operation from all classes that want to support the new protocol. Adaptation doesn't - it can be added externally. The downside of adaptation is that it either requires buy-in to an existing interface framework (zope interfaces, PyProtocols, whatever) or it requires language (stdlib) support. Rather than being a solution looking for a problem, I suspect it's more of a chicken and egg issue. But either way it's a stumbling block. Paul. From ncoghlan at gmail.com Fri Mar 31 14:22:13 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 31 Mar 2006 22:22:13 +1000 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local> <442BB1BC.9030804@gmail.com> Message-ID: <442D1EF5.4020702@gmail.com> Taro Ogawa wrote: > [Originally misposted to python-dev] > Nick Coghlan gmail.com> writes: >> There are three big use cases: >> dict.keys >> dict.values >> dict.items >> Currently these all return lists, which may be expensive in terms of copying. >> They all have iter* variants which while memory efficient, are far less >> convenient to work with. > > Is there any reason why they can't be view objects - a dictionary has keys, > has values, has items - rather than methods returning view objects: > for k in mydict.keys: > ... > for v in mydict.values: > ... > for k, v in mydict.items: > ... > For backward compatibility with Py2.x, calling them would raise a > DeprecationWarning and return a list. This could even be introduced in 2.x > (with a PendingDeprecationWarning instead?). Too much pain for not enough gain, IMO. The only real benefit is avoiding typing a couple of parentheses, but we'd be breaking an awful lot more code than the change of data type will break. 
One of the great joys of duck-typing is that so long as what we return is sufficiently containerish, a lot of code will continue to just work. It's only code that requires an *actual* list (e.g. by indexing, slicing or sorting the result directly) that will need to change to wrap the method call in either list() or sorted(). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From benji at benjiyork.com Fri Mar 31 16:30:43 2006 From: benji at benjiyork.com (Benji York) Date: Fri, 31 Mar 2006 09:30:43 -0500 Subject: [Python-3000] Iterators for dict keys, values, and items == annoying :) In-Reply-To: <442CB3D3.3010705@canterbury.ac.nz> References: <435DF58A933BA74397B42CDEB8145A86010CBC5E@ex9.hostedexchange.local> <442BB1BC.9030804@gmail.com> <1143744058.3204.124.camel@localhost.localdomain> <442C5152.2000400@gmail.com> <442C60B9.9020302@colorstudy.com> <442C73C3.2090203@colorstudy.com> <442C8B8A.40906@canterbury.ac.nz> <8541A849-BEBC-416B-AAF6-C222BEA9F51C@gmail.com> <442CB3D3.3010705@canterbury.ac.nz> Message-ID: <442D3D13.7050103@benjiyork.com> Alex Martelli wrote: > If the framework consuming X requested adaptation-to-X on all objects > it's passed, That's not generally the way Zope 3 does it (and doesn't sound like a good idea to me). There are three ways (as I see it) adaptation is used in Z3. First, the traditional idea of an "adapter", in which something has the right state, but the wrong interface. That form of adaptation is used when plugging different components together (whether they are Z3 or come from other projects). In that way of using adaptation a function/method/whatever wants things that act in a particular way (duck typing). If I, as the user of the interface, have something I want to pass in that doesn't match, I adapt it to the appropriate interface; the burden is on me to create something that matches expectations.
People do that all the time today without an interface/adaptation framework, they just write code that takes one thing and builds another. Greg Ewing wrote: > This is the part that bothers me, I think. It > seems like all these adaptation requests would > be a huge burden on the framework developer. Not in the above scenario; when using adapters like that, the burden is on the user. That might sound like a bad thing, but if they're exclusively using your library, they already have objects of the necessary type; if not, they have an adaptation framework to help them do something they'd have to do anyway. The second way adaptation is used is as a general lookup facility. Say that I have a user object and I want to know their security information. Instead of building an API for looking up security descriptions from a user name that I have to pull out of the user object, I could instead register an adapter from IUser to ISecurityInfo; now I don't need any new APIs, I just do sec_info = ISecurityInfo(the_user). This form of adaptation is good for the "I have something and want more information about it" use case. It also adds some flexibility: the workings of the adapter can change without having to change all the client code as you'd have to do if you changed (for example) the parameters an API expected. The third way it's used is to make systems pluggable. You mentioned PyGUI, so say you had a schema describing a data entry form. You could use adaptation to decide which GUI widget would be used for each field, looping over the form fields and adapting each to IWidget and getting back a TextField for a string, CheckBox for a boolean, etc. Then if the user has a nice TextField object with spell checking, they could just plug in a different adapter and all their fields would get the new widget without PyGUI having to support a plug-in framework. > In PyGUI, for example, I currently have about > 2 or 3 dozen classes.
Should I be defining and > registering a formal protocol for every one of > those? Unless PyGUI wants to use the interfaces for something, probably not. Another theme of Zope 3 is to use other projects' code instead of reinventing the wheel. If we want to use a third-party class with adaptation, we can (externally to the class) assert that it conforms to a particular interface. I don't intend to bore people with a study of adaptation in Zope 3, but instead am attempting to echo Alex's realization that once you have the simple tools of adaptation in mind (much like other practices like OO or data-driven programming) you start to recognize places where they help you solve problems in better ways. -- Benji York
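The lookup-style use of adaptation described above (an adapter from a user object to its security information) can be sketched roughly as follows. This is a toy registry written for illustration, not the Zope 3 or PyProtocols API: register_adapter, adapt, User, UserSecurityInfo and the can_edit policy are all hypothetical names, and ISecurityInfo is reduced to a plain marker class.

```python
# Hypothetical, minimal adapter registry. This is NOT the zope.interface
# or PyProtocols API; register_adapter and adapt are made-up names.
_adapters = {}


def register_adapter(from_type, to_interface, factory):
    """Record that `factory` turns a `from_type` instance into `to_interface`."""
    _adapters[(from_type, to_interface)] = factory


def adapt(obj, to_interface):
    """Return an object providing `to_interface`, built from `obj`."""
    # Walk the MRO so adapters registered for a base class also apply.
    for cls in type(obj).__mro__:
        factory = _adapters.get((cls, to_interface))
        if factory is not None:
            return factory(obj)
    raise TypeError("no adapter from %s to %s"
                    % (type(obj).__name__, to_interface.__name__))


class User:
    def __init__(self, name):
        self.name = name


class ISecurityInfo:
    """Marker class standing in for an interface declaration."""


class UserSecurityInfo(ISecurityInfo):
    """Adapter: exposes security information for a User."""

    def __init__(self, user):
        self.user = user

    def can_edit(self):
        # Toy policy, purely for illustration.
        return self.user.name == "admin"


# Registered externally; the User class itself is never modified.
register_adapter(User, ISecurityInfo, UserSecurityInfo)

sec_info = adapt(User("admin"), ISecurityInfo)
print(sec_info.can_edit())
```

Swapping in a different factory for the same (User, ISecurityInfo) pair changes what every caller of adapt() receives without touching client code, which is the flexibility and pluggability point made in the message.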