From ziade.tarek at gmail.com Sun Feb 1 13:40:03 2009 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 1 Feb 2009 13:40:03 +0100 Subject: [Python-ideas] Name mangling removal ? Message-ID: <94bdd2610902010440t363d6a5ev8f2dbcaac11f3137@mail.gmail.com> Hello, I have found several posts of people (including Python committers) asking for the removal of name mangling in Python (the latest in 2006 in python-3000). I have searched in the various mailing lists and I can't find a clear status on this. If someone knows, please let me know. Otherwise: I would like to propose that this feature be removed, because it breaks Python philosophy, imho. Regards Tarek -- Tarek Ziadé | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From jnoller at gmail.com Sun Feb 1 13:56:17 2009 From: jnoller at gmail.com (Jesse Noller) Date: Sun, 1 Feb 2009 07:56:17 -0500 Subject: [Python-ideas] Name mangling removal ? In-Reply-To: <94bdd2610902010440t363d6a5ev8f2dbcaac11f3137@mail.gmail.com> References: <94bdd2610902010440t363d6a5ev8f2dbcaac11f3137@mail.gmail.com> Message-ID: <4222a8490902010456w3719f1b9je1b6e60af9e28669@mail.gmail.com> On Sun, Feb 1, 2009 at 7:40 AM, Tarek Ziadé wrote: > Hello, > > I have found several posts of people (including Python commiters) > asking for the removal of name mangling in Python, > (the latest in 2006 in python-3000). I have searched in the various > mailing lists and I can't find a clear status on this. > > If someone knows please let me know. > > Otherwise : I would like to propose this feature to be removed because > it brakes Python philosophy imho. > > > Regards > Tarek > This could only be done in Python 3000, and without 2to3 fixers (and smart ones at that) it would break an insane amount of code.
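[For readers who have not met the feature under discussion: inside a class body, an attribute written with two leading underscores is rewritten to `_ClassName__name`. A minimal sketch — the class name `A` here is purely illustrative, not code from any project in the thread:]

```python
class A:
    def __init__(self):
        self.__secret = 42  # mangled: stored on the instance as _A__secret

a = A()
assert not hasattr(a, "__secret")  # the literal name is not visible outside
assert a._A__secret == 42          # but the mangled form is still public
```

[The rewrite is purely textual and applies only inside class bodies, which is why it offers no real privacy and can always be bypassed by consenting adults.]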
From a philosophically pure standpoint, I suppose this makes sense - name mangling just doesn't make sense when we're dealing with a language that's based on the consenting adults theory (and it can still be bypassed). I think though - there would be a fair amount of outcry due to this from the userbase/people looking at the language. Some people simply want the ability to hide things within their modules/classes/etc from the external consumers - this makes sense if you approach things from the JWODT (Java way of doing things): privacy of internal methods/variables is "key" to a clean and stable API. I think if you find enough support, a PEP would need to be written up for this - it's a large enough change that we could not make it lightly. Not to mention that if done in a 3.1 or 3.2 it would break compatibility with the older 3.x releases (ergo, we'd need deprecation warnings and so on). Ultimately, I'm +1 on this but I'm wary of it too - it's a pretty big change, and without some ability to lock down/hide a given method/variable or way of constructing "public interfaces" I think people in the community may be upset. Now, removing the name mangling but adding some way of declaring an interface into a module might be nice ;) jesse From ziade.tarek at gmail.com Sun Feb 1 14:25:43 2009 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 1 Feb 2009 14:25:43 +0100 Subject: [Python-ideas] Name mangling removal ? In-Reply-To: <4222a8490902010456w3719f1b9je1b6e60af9e28669@mail.gmail.com> References: <94bdd2610902010440t363d6a5ev8f2dbcaac11f3137@mail.gmail.com> <4222a8490902010456w3719f1b9je1b6e60af9e28669@mail.gmail.com> Message-ID: <94bdd2610902010525h173568fycd9ff17ba91608ab@mail.gmail.com> On Sun, Feb 1, 2009 at 1:56 PM, Jesse Noller wrote: > I think though - there's would be a fair amount of outcry due to this > from the userbase/people looking at the language.
Some people simply > want the ability to hide things within their modules/classes/etc from > the external consumers - this makes sense if you approach things from > the JWODT (Java way of doing things), privacy of internal > methods/variables is "key" to a clean and stable API. That's the first thing that surprised me when I started Python, since I came from Delphi, where private and protected attributes were a religion back then. > > I think if you find enough support, a PEP would need to be written up > for this - it's a large enough change that we could not make it > lightly. Not to mention if done in a 3.1 or 3.2 it would break > compatibility with the older 3.xx releases (ergo, we'd need > deprecation warnings and so on). Yes, I am just throwing this out here, and it is a big piece of work indeed. If people support this idea, maybe it can be suggested to Brett for discussion at the Python Language Summit (for the "new features and future plans" part): http://us.pycon.org/2009/about/summits/language/ > > Ultimately, I'm +1 on this but I'm wary of it too - it's a pretty big > change and without some ability to lock down/hide a given > method/variable or way of constructing "public interfaces" I think > people in the community may be upset. > I would tend to think that properties are a good alternative for people who want to protect attributes from external consumers. Tarek From ziade.tarek at gmail.com Sun Feb 1 15:20:56 2009 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 1 Feb 2009 15:20:56 +0100 Subject: [Python-ideas] Name mangling removal ?
In-Reply-To: <61849.151.53.143.156.1233497147.squirrel@webmail4.pair.com> References: <94bdd2610902010440t363d6a5ev8f2dbcaac11f3137@mail.gmail.com> <4222a8490902010456w3719f1b9je1b6e60af9e28669@mail.gmail.com> <94bdd2610902010525h173568fycd9ff17ba91608ab@mail.gmail.com> <61849.151.53.143.156.1233497147.squirrel@webmail4.pair.com> Message-ID: <94bdd2610902010620we353249k164bbb35bc15d4b5@mail.gmail.com> On Sun, Feb 1, 2009 at 3:05 PM, Cesare Di Mauro wrote: > And I can assure you that it wasn't > rare having the need to access private members of some VCL component. Hehe, right. And the paradox is that most people I worked with were complaining about that and were still using private/protected parts in their own code. Just for the anecdote, since you are familiar with Delphi: my path to open source was Delphi -> getting frustrated with expensive, closed components -> getting a taste of freedom with the Indy Components -> moving to Python and OSS > > So, if access control can be a matter of religion, men must have the freedom to decide whatever they want to do with classes. Like Python does. > ;) > > Cheers > Cesare > -- Tarek Ziadé | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From cesare.dimauro at a-tono.com Sun Feb 1 15:05:47 2009 From: cesare.dimauro at a-tono.com (Cesare Di Mauro) Date: Sun, 1 Feb 2009 15:05:47 +0100 (CET) Subject: [Python-ideas] Name mangling removal ? In-Reply-To: <94bdd2610902010525h173568fycd9ff17ba91608ab@mail.gmail.com> References: <94bdd2610902010440t363d6a5ev8f2dbcaac11f3137@mail.gmail.com> <4222a8490902010456w3719f1b9je1b6e60af9e28669@mail.gmail.com> <94bdd2610902010525h173568fycd9ff17ba91608ab@mail.gmail.com> Message-ID: <61849.151.53.143.156.1233497147.squirrel@webmail4.pair.com> On Dom, Feb 1, 2009 14:25, Tarek Ziadé
wrote: > On Sun, Feb 1, 2009 at 1:56 PM, Jesse Noller wrote: >> I think though - there's would be a fair amount of outcry due to this >> from the userbase/people looking at the language. Some people simply >> want the ability to hide things within their modules/classes/etc from >> the external consumers - this makes sense if you approach things from >> the JWODT (Java way of doing things), privacy of internal >> methods/variables is "key" to a clean and stable API. > > That's the first thing that surprised me when I started Python, since > I came from Delphi where private and protected attributes were a > religion back then, In Delphi (like many other languages, such as C++, Java, etc.) protected members aren't really protected. You can always use class crackers to gain full (public-like) access to them. It's a pretty common practice, and everyone who has built components has surely used it at least once. Access control can be a good thing on paper, but it leaves everything in the designer's hands. So if he makes a mistake designing the class, you can be out of business. If you are lucky, the members are protected and you can always crack them, but when they are private, you can only give up or rewrite the class yourself: not a beautiful prospect. And I can assure you that it wasn't rare to need access to private members of some VCL component. So, if access control can be a matter of religion, men must have the freedom to decide whatever they want to do with classes. Like Python does. ;) Cheers Cesare From guido at python.org Sun Feb 1 17:30:31 2009 From: guido at python.org (Guido van Rossum) Date: Sun, 1 Feb 2009 08:30:31 -0800 Subject: [Python-ideas] Name mangling removal ? In-Reply-To: <94bdd2610902010440t363d6a5ev8f2dbcaac11f3137@mail.gmail.com> References: <94bdd2610902010440t363d6a5ev8f2dbcaac11f3137@mail.gmail.com> Message-ID: On Sun, Feb 1, 2009 at 4:40 AM, Tarek Ziadé
wrote: > I have found several posts of people (including Python commiters) > asking for the removal of name mangling in Python, > (the latest in 2006 in python-3000). I have searched in the various > mailing lists and I can't find a clear status on this. This is the first time I've heard of this request. > If someone knows please let me know. AFAIK it's never been brought up on python-dev while we were discussing Py3k. > Otherwise : I would like to propose this feature to be removed because > it brakes Python philosophy imho. I'm against removing it. While the "privacy" it offers is marginal, it also offers protection against accidental clashes between attribute names. E.g. consider person A who writes a library containing a class A, and person B who writes an application with a class B that subclasses A. Let's say B needs to add new instance variables, and wants to be "future-proof" against newer versions of A that might add instance variables too. Using name-mangled variables gives B a "namespace" of his own (_B__*), so he doesn't have to worry about clashes between attribute names he chooses now and attribute names that A might choose in the future. Without name-mangling, B would have to worry that A could add private variables with clashing names as well -- in fact, the presence of any private variables in A would have to be documented in order to ensure that subclasses wouldn't accidentally clash with them, defeating the whole purpose of private. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From lists at cheimes.de Sun Feb 1 17:45:53 2009 From: lists at cheimes.de (Christian Heimes) Date: Sun, 01 Feb 2009 17:45:53 +0100 Subject: [Python-ideas] Name mangling removal ? In-Reply-To: <94bdd2610902010440t363d6a5ev8f2dbcaac11f3137@mail.gmail.com> References: <94bdd2610902010440t363d6a5ev8f2dbcaac11f3137@mail.gmail.com> Message-ID: Tarek Ziadé
schrieb: > Hello, > > I have found several posts of people (including Python commiters) > asking for the removal of name mangling in Python, > (the latest in 2006 in python-3000). I have searched in the various > mailing lists and I can't find a clear status on this. > > If someone knows please let me know. > > Otherwise : I would like to propose this feature to be removed because > it brakes Python philosophy imho. In my opinion the removal of the feature is going to create too much pain. The name mangling just causes a minor unpleasantness for newbies. Is it really a big deal in your opinion? Christian From ggpolo at gmail.com Sun Feb 1 18:10:17 2009 From: ggpolo at gmail.com (Guilherme Polo) Date: Sun, 1 Feb 2009 15:10:17 -0200 Subject: [Python-ideas] Adding a test discovery into Python Message-ID: Hi, I believe it would be good to include a test discovery mechanism in Python; right now I notice that all the following packages: bsddb, ctypes, distutils, email, json, lib2to3 and lib-tk (tkinter in py3k) duplicate some form of test discovery that works for each one of them (there are also sqlite3 tests, but they are not using a real "test discovery"). In the future it is very likely that this "duplication count" will increase, since from time to time new modules and packages get into Python. I can also feel the "idlelib" package starting to get tests, just making things worse. External packages would benefit from it too. Right now you either duplicate the test discovery once more (because your project is small enough that you don't want to use something specific for that), or you use nose, trial, py.test or whatever looks better for you. So... is there any chance we can agree on what features would be useful in a test discovery mechanism that could be included with Python?
I for myself do not have fancy wishes for this one; I would be happy with something that would collect unittests inside packages and subpackages, with some fixed patterns, and let me run them with test.test_support.run_unittests, or maybe something that would collect unittests and doctests and have something like run_tests in test.test_support. But then I believe this wouldn't be good enough to replace any of the current tools, making the addition mostly useless. Thanks for reading, -- -- Guilherme H. Polo Goncalves From jnoller at gmail.com Sun Feb 1 18:21:42 2009 From: jnoller at gmail.com (Jesse Noller) Date: Sun, 1 Feb 2009 12:21:42 -0500 Subject: [Python-ideas] Adding a test discovery into Python In-Reply-To: References: Message-ID: <4222a8490902010921s6f9273e8t586fbc68340f7366@mail.gmail.com> On Sun, Feb 1, 2009 at 12:10 PM, Guilherme Polo wrote: > Hi, > > I believe it would be good to include a test discovery into Python, > right now I notice all the following packages: > > bsdbb, ctypes, distutils, email, json, lib2to3 and lib-tk (tkinter in py3k) > > duplicate some form of test discovery that works for each one of them > (there are also sqlite3 tests, but they are not using a real "test > discovery"). In the future it is very likely that this "duplication > count" increases, since from time to time new modules and packages get > into Python. I can also feel the "idlelib" package starting getting > tests, just making things worse. > > External packages would benefit from it too. Right now you either > duplicate the test discovery once more (because your project is small > enough that you don't want to use something specific for that), or you > use nose, trial, py.test or whatever looks better for you. > > So.. is there any chance we can enter in agreement what features would > be useful in a test discovery that could be included with Python ?
I > for myself do not have fancy wishes for this one, I would be happy > with something that would collect unittests, inside packages and > subpackages, with some fixed patterns and let me run them with > test.test_support.run_unittests, or maybe something that would collect > unittests and doctests and have something like run_tests in > test.test_support. But then I believe this wouldn't be good enough to > substitute any of the current tools, making the addition mostly > useless. > > Thanks for reading, > > -- > -- Guilherme H. Polo Goncalves I think reinventing something which has been implemented ad nauseam in the community (py.test, nose, etc.) would be silly. I'm biased towards nose (really biased) - but I think using the discovery core of one of these might be good. jesse From ggpolo at gmail.com Sun Feb 1 18:34:00 2009 From: ggpolo at gmail.com (Guilherme Polo) Date: Sun, 1 Feb 2009 15:34:00 -0200 Subject: [Python-ideas] Adding a test discovery into Python In-Reply-To: <4222a8490902010921s6f9273e8t586fbc68340f7366@mail.gmail.com> References: <4222a8490902010921s6f9273e8t586fbc68340f7366@mail.gmail.com> Message-ID: On Sun, Feb 1, 2009 at 3:21 PM, Jesse Noller wrote: > On Sun, Feb 1, 2009 at 12:10 PM, Guilherme Polo wrote: >> Hi, >> >> I believe it would be good to include a test discovery into Python, >> . >> . > > I think reinventing something which has been implemented ad-nauseam in > the community (py.test, nose, etc) would be silly. I'm biased towards > nose (really biased) - but I think using the discovery core of one of > these might be good. I didn't want to sound like I was reinventing code; that is why I put in the word "include" sometimes ;) We are in agreement about using the core of some other project, and I like nose too. -- -- Guilherme H.
Polo Goncalves From brett at python.org Sun Feb 1 21:46:19 2009 From: brett at python.org (Brett Cannon) Date: Sun, 1 Feb 2009 12:46:19 -0800 Subject: [Python-ideas] Adding a test discovery into Python In-Reply-To: References: Message-ID: On Sun, Feb 1, 2009 at 09:10, Guilherme Polo wrote: > Hi, > > I believe it would be good to include a test discovery into Python, > right now I notice all the following packages: > > bsdbb, ctypes, distutils, email, json, lib2to3 and lib-tk (tkinter in py3k) > ... and importlib. > duplicate some form of test discovery that works for each one of them > (there are also sqlite3 tests, but they are not using a real "test > discovery"). In the future it is very likely that this "duplication > count" increases, since from time to time new modules and packages get > into Python. I can also feel the "idlelib" package starting getting > tests, just making things worse. > > External packages would benefit from it too. Right now you either > duplicate the test discovery once more (because your project is small > enough that you don't want to use something specific for that), or you > use nose, trial, py.test or whatever looks better for you. > > So.. is there any chance we can enter in agreement what features would > be useful in a test discovery that could be included with Python ? I > for myself do not have fancy wishes for this one, I would be happy > with something that would collect unittests, inside packages and > subpackages, with some fixed patterns and let me run them with > test.test_support.run_unittests, or maybe something that would collect > unittests and doctests and have something like run_tests in > test.test_support. But then I believe this wouldn't be good enough to > substitute any of the current tools, making the addition mostly > useless. Yep. I want to be able to state "find the tests in this package and search the entire package top-down looking for tests to run". 
The only trick is if you store the tests in something other than on the file system directly (e.g. zipfile). That would require a little bit more work, like having modules listed in the __all__ value of the package and using that to know which modules to look in. -Brett From ggpolo at gmail.com Sun Feb 1 22:18:40 2009 From: ggpolo at gmail.com (Guilherme Polo) Date: Sun, 1 Feb 2009 19:18:40 -0200 Subject: [Python-ideas] Adding a test discovery into Python In-Reply-To: References: Message-ID: On Sun, Feb 1, 2009 at 6:46 PM, Brett Cannon wrote: > On Sun, Feb 1, 2009 at 09:10, Guilherme Polo wrote: >> Hi, >> >> I believe it would be good to include a test discovery into Python, >> right now I notice all the following packages: >> >> bsdbb, ctypes, distutils, email, json, lib2to3 and lib-tk (tkinter in py3k) >> > > ... and importlib. My bad, I was even looking at it the other day, but when I compiled this list of packages I looked only in trunk and forgot about the py3k branch. >> So.. is there any chance we can enter in agreement what features would >> be useful in a test discovery that could be included with Python ? I >> for myself do not have fancy wishes for this one, I would be happy >> with something that would collect unittests, inside packages and >> subpackages, with some fixed patterns and let me run them with >> test.test_support.run_unittests, or maybe something that would collect >> unittests and doctests and have something like run_tests in >> test.test_support. But then I believe this wouldn't be good enough to >> substitute any of the current tools, making the addition mostly >> useless. > > Yep. To the agreement question? > I want to be able to state "find the tests in this package and > search the entire package top-down looking for tests to run". The only > trick is if you store the tests in something other than on the file > system directly (e.g. zipfile).
That would require a little bit more > work like having modules listed in the __all__ value of the package > and use that to know which modules to look in. This made me remember another two features I think would be useful to have. One of these features would allow me to restrict from which packages I want to collect tests; this is useful even in "flat packages" (no sub-packages) like tkinter, where some of its modules possibly depend on external packages in order to be tested/executed. This adds the possibility to skip tests of single modules instead of entire packages. The other one would be to check the requirements of a test and decide whether to collect it or not based on some given parameters. So I could collect all the tests that require a GUI, and others that do not, for example. This adds the possibility to skip single tests, or at least the possibility to divide the tests before running. > -Brett > -- -- Guilherme H. Polo Goncalves From ziade.tarek at gmail.com Sun Feb 1 22:40:39 2009 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 1 Feb 2009 22:40:39 +0100 Subject: [Python-ideas] Name mangling removal ? In-Reply-To: References: <94bdd2610902010440t363d6a5ev8f2dbcaac11f3137@mail.gmail.com> Message-ID: <94bdd2610902011340k55c6e0bcod0d1a28f355d574c@mail.gmail.com> On Sun, Feb 1, 2009 at 5:30 PM, Guido van Rossum wrote: > This is the first time I've heard of this request. The most recent one is here, I think: http://mail.python.org/pipermail/python-3000/2006-September/003857.html > I'm against removing it. While the "privacy" it offers is marginal, it > also offers protection against accidental clashes between attribute > names. E.g. consider person A who writes a library containing a class > A, and person B who writes an application with a class B that > subclasses A. Let's say B needs to add new instance variables, and > wants to be "future-proof" against newer versions of A that might add > instance variables too.
Using name-mangled variables gives B a > "namespace" of his own (_B__*), so he doesn't have to worry about > clashes between attribute names he chooses now and attribute names > that A might choose in the future. Without name-mangling, B would have > to worry that A could add private variables with clashing names as > well -- in fact, the presence of any private variables in A would have > to be documented in order to ensure that subclasses wouldn't > accidentally clash with them, defeating the whole purpose of private. Right, thanks for the explanation. I found a thread where it has already been discussed, so I'll study it: http://mail.python.org/pipermail/python-dev/2005-December/058555.html Regards Tarek -- Tarek Ziadé | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From ziade.tarek at gmail.com Sun Feb 1 22:42:54 2009 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 1 Feb 2009 22:42:54 +0100 Subject: [Python-ideas] Name mangling removal ? In-Reply-To: References: <94bdd2610902010440t363d6a5ev8f2dbcaac11f3137@mail.gmail.com> Message-ID: <94bdd2610902011342m171d58eeu1228b051d73aa6f9@mail.gmail.com> On Sun, Feb 1, 2009 at 5:45 PM, Christian Heimes wrote: > > In my opinion the removal of the feature is going to create too much > pain. The name mangling just causes a minor unpleasantness for newbies. > Is it really a big deal in your opinion? No, I guess people can live with it. And if it's too much pain to remove, it sounds right not to do it. It seems to make sense for mixins, but that is not a pattern I am using at all. Regards Tarek -- Tarek Ziadé
| Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From lists at cheimes.de Sun Feb 1 23:29:21 2009 From: lists at cheimes.de (Christian Heimes) Date: Sun, 01 Feb 2009 23:29:21 +0100 Subject: [Python-ideas] Adding a test discovery into Python In-Reply-To: References: Message-ID: Guilherme Polo schrieb: > So.. is there any chance we can enter in agreement what features would > be useful in a test discovery that could be included with Python ? I > for myself do not have fancy wishes for this one, I would be happy > with something that would collect unittests, inside packages and > subpackages, with some fixed patterns and let me run them with > test.test_support.run_unittests, or maybe something that would collect > unittests and doctests and have something like run_tests in > test.test_support. But then I believe this wouldn't be good enough to > substitute any of the current tools, making the addition mostly > useless. I'm +1 for a simple (!) test discovery system. I'm emphasizing simple because there are enough frameworks for elaborate unit testing. Such a tool should: - find all modules and packages named 'tests' for a given package name - load all subclasses of unittest.TestCase from a 'tests' module or '*/tests/test*.py' files - support some basic filtering for test cases or test functions Do we need more features? Christian From steve at pearwood.info Sun Feb 1 23:29:10 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 02 Feb 2009 09:29:10 +1100 Subject: [Python-ideas] Name mangling removal ? In-Reply-To: References: <94bdd2610902010440t363d6a5ev8f2dbcaac11f3137@mail.gmail.com> Message-ID: <49862236.80609@pearwood.info> Guido van Rossum wrote: > E.g. consider person A who writes a library containing a class > A, and person B who writes an application with a class B that > subclasses A.
Let's say B needs to add new instance variables, and > wants to be "future-proof" against newer versions of A that might add > instance variables too. Using name-mangled variables gives B a > "namespace" of his own (_B__*), so he doesn't have to worry about > clashes between attribute names he chooses now and attribute names > that A might choose in the future. Without name-mangling, B would have > to worry that A could add private variables with clashing names as > well -- in fact, the presence of any private variables in A would have > to be documented in order to ensure that subclasses wouldn't > accidentally clash with them, defeating the whole purpose of private. Just for completeness' sake, I'll point out that there is still a possible name clash using name-mangling: if you subclass B, and inadvertently name your subclass A (or any other superclass of B), then your __names may clash with A's __names. I don't particularly like name-mangling, but I don't see that it is a large enough problem that it needs to be removed, particularly in the absence of any viable alternative. -- Steven From guido at python.org Sun Feb 1 23:32:01 2009 From: guido at python.org (Guido van Rossum) Date: Sun, 1 Feb 2009 14:32:01 -0800 Subject: [Python-ideas] Name mangling removal ? In-Reply-To: <94bdd2610902011340k55c6e0bcod0d1a28f355d574c@mail.gmail.com> References: <94bdd2610902010440t363d6a5ev8f2dbcaac11f3137@mail.gmail.com> <94bdd2610902011340k55c6e0bcod0d1a28f355d574c@mail.gmail.com> Message-ID: On Sun, Feb 1, 2009 at 1:40 PM, Tarek Ziadé wrote: > On Sun, Feb 1, 2009 at 5:30 PM, Guido van Rossum wrote: >> This is the first time I've heard of this request. > > The most recent one is here I think : > http://mail.python.org/pipermail/python-3000/2006-September/003857.html Thanks for refreshing my memory. But the explanation below holds -- it should stay.
While the "privacy" it offers is marginal, it >> also offers protection against accidental clashes between attribute >> names. E.g. consider person A who writes a library containing a class >> A, and person B who writes an application with a class B that >> subclasses A. Let's say B needs to add new instance variables, and >> wants to be "future-proof" against newer versions of A that might add >> instance variables too. Using name-mangled variables gives B a >> "namespace" of his own (_B__*), so he doesn't have to worry about >> clashes between attribute names he chooses now and attribute names >> that A might choose in the future. Without name-mangling, B would have >> to worry that A could add private variables with clashing names as >> well -- in fact, the presence of any private variables in A would have >> to be documented in order to ensure that subclasses wouldn't >> accidentally clash with them, defeating the whole purpose of private. > > Right, thanks for the explanation, > > I found back a thread where it has been discussed already so I'll study it > > http://mail.python.org/pipermail/python-dev/2005-December/058555.html > > > Regards > Tarek > -- > Tarek Ziad? | Association AfPy | www.afpy.org > Blog FR | http://programmation-python.org > Blog EN | http://tarekziade.wordpress.com/ > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sun Feb 1 23:34:23 2009 From: guido at python.org (Guido van Rossum) Date: Sun, 1 Feb 2009 14:34:23 -0800 Subject: [Python-ideas] Name mangling removal ? In-Reply-To: <49862236.80609@pearwood.info> References: <94bdd2610902010440t363d6a5ev8f2dbcaac11f3137@mail.gmail.com> <49862236.80609@pearwood.info> Message-ID: On Sun, Feb 1, 2009 at 2:29 PM, Steven D'Aprano wrote: >> E.g. consider person A who writes a library containing a class >> >> A, and person B who writes an application with a class B that >> subclasses A. 
Let's say B needs to add new instance variables, and >> wants to be "future-proof" against newer versions of A that might add >> instance variables too. Using name-mangled variables gives B a >> "namespace" of his own (_B__*), so he doesn't have to worry about >> clashes between attribute names he chooses now and attribute names >> that A might choose in the future. Without name-mangling, B would have >> to worry that A could add private variables with clashing names as >> well -- in fact, the presence of any private variables in A would have >> to be documented in order to ensure that subclasses wouldn't >> accidentally clash with them, defeating the whole purpose of private. > Just for completeness sake, I'll point out that there is still a possible > name clash using name-mangling: if you subclass B, and inadvertently name > your subclass A (or any other superclass of B), then your __names may clash > with A's __names. Of course. But that's several orders of magnitude easier to avoid, since classes are so much rarer than attributes. > I don't particularly like name-mangling, but I don't see it is a large > enough problem that it needs to be removed, particularly in the absence of > any viable alternative. I usually recommend against it (in favor of a single underscore), but there are a few situations where it is really useful. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sun Feb 1 23:40:32 2009 From: guido at python.org (Guido van Rossum) Date: Sun, 1 Feb 2009 14:40:32 -0800 Subject: [Python-ideas] Adding a test discovery into Python In-Reply-To: References: Message-ID: On Sun, Feb 1, 2009 at 2:29 PM, Christian Heimes wrote: > Guilherme Polo schrieb: >> So.. is there any chance we can enter in agreement what features would >> be useful in a test discovery that could be included with Python ? 
I >> for myself do not have fancy wishes for this one, I would be happy >> with something that would collect unittests, inside packages and >> subpackages, with some fixed patterns and let me run them with >> test.test_support.run_unittests, or maybe something that would collect >> unittests and doctests and have something like run_tests in >> test.test_support. But then I believe this wouldn't be good enough to >> substitute any of the current tools, making the addition mostly >> useless. > > I'm +1 for a simple (!) test discovery system. I'm emphasizing on simple > because there are enough frameworks for elaborate unit testing. > > Such a tool should > > - find all modules and packages named 'tests' for a given package name I predict that this part is where you'll have a hard time getting consensus. There are lots of different naming conventions. It would be nice if people could use the new discovery feature without having to move all their tests around. > - load all subclasses of unittest.TestCase from 'tests' module or > '*/tests/test*.py' files Once you've found the test modules, TestCase subclasses are a good place to start. Though beware -- there's a pattern that defines several alternative classes in a module, e.g. one per platform, and then defines a test suite that dynamically decides which class to use. The stdlib test suite uses this in a number of places, though I can't quite remember where. > - support some basic filtering for test cases or test functions Or module names. > Do we need more features? I'd look at some of the popular alternatives to unittest.py and see what wisdom or conventions they have to offer. 
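[Christian's wish list and Guido's caveats are easy to make concrete. Below is a minimal, hypothetical sketch of the convention-based file lookup step only — the helper name and default pattern are illustrative assumptions, not the API of nose, py.test, or any stdlib module; the pattern is a parameter precisely because naming conventions differ:]

```python
import fnmatch
import os

def find_test_files(root, pattern="test*.py"):
    """Yield paths under `root` whose file names match `pattern`.

    The pattern is an argument rather than a hard-coded convention,
    so projects with different naming schemes need not move tests.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if fnmatch.fnmatch(name, pattern):
                yield os.path.join(dirpath, name)
```

[A fuller tool would then import each matching file and harvest unittest.TestCase subclasses from it, with Brett's earlier point about tests stored in a zipfile handled separately.]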
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Mon Feb 2 00:37:41 2009 From: aahz at pythoncraft.com (Aahz) Date: Sun, 1 Feb 2009 15:37:41 -0800 Subject: [Python-ideas] Adding a test discovery into Python In-Reply-To: References: Message-ID: <20090201233741.GA17625@panix.com> On Sun, Feb 01, 2009, Christian Heimes wrote: > > I'm +1 for a simple (!) test discovery system. I'm emphasizing on simple > because there are enough frameworks for elaborate unit testing. > > Such a tool should > > - find all modules and packages named 'tests' for a given package name > - load all subclasses of unittest.TestCase from 'tests' module or > '*/tests/test*.py' files > - support some basic filtering for test cases or test functions > > Do we need more features? Depends on whether the above features find doctests. In my previous job (got laid off a month ago), I'd say that about eighty percent of test suites were doctests (harder to guess the percentage by line counts), and many of these were substantial. I think that doctests are one of the killer Python features that make it easy to do testing, and I think it's critical that all core testing tools support them. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Weinberg's Second Law: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization. From brett at python.org Mon Feb 2 03:14:41 2009 From: brett at python.org (Brett Cannon) Date: Sun, 1 Feb 2009 18:14:41 -0800 Subject: [Python-ideas] Adding a test discovery into Python In-Reply-To: References: Message-ID: On Sun, Feb 1, 2009 at 14:40, Guido van Rossum wrote: > On Sun, Feb 1, 2009 at 2:29 PM, Christian Heimes wrote: >> Guilherme Polo schrieb: >>> So.. is there any chance we can enter in agreement what features would >>> be useful in a test discovery that could be included with Python ? 
I >>> for myself do not have fancy wishes for this one, I would be happy >>> with something that would collect unittests, inside packages and >>> subpackages, with some fixed patterns and let me run them with >>> test.test_support.run_unittests, or maybe something that would collect >>> unittests and doctests and have something like run_tests in >>> test.test_support. But then I believe this wouldn't be good enough to >>> substitute any of the current tools, making the addition mostly >>> useless. >> >> I'm +1 for a simple (!) test discovery system. I'm emphasizing on simple >> because there are enough frameworks for elaborate unit testing. >> >> Such a tool should >> >> - find all modules and packages named 'tests' for a given package name > > I predict that this part is where you'll have a hard time getting > consensus. There are lots of different naming conventions. It would be > nice if people could use the new discovery feature without having to > move all their tests around. > Guido is right; this cannot be convention-based but instead configuration-based. Let me specify in a function call what package to look in and what naming convention I used for test files. >> - load all subclasses of unittest.TestCase from 'tests' module or >> '*/tests/test*.py' files > > Once you've found the test modules, TestCase subclasses are a good > place to start. Though beware -- there's a pattern that defines > several alternative classes in a module, e.g. one per platform, and > then defines a test suite that dynamically decides which class to use. > The stdlib test suite uses this in a number of places, though I can't > quite remember where. > Really? I have never come across that before in the standard library, but I am sure people do stuff like that. There is no way any tool will work in all pre-existing situations like this where people heavily rely on a test_main function to handle decisions on what to call. 
But as long as it is made clear how to restrict what tests are found and it is an easy solution I am sure people will adjust. >> - support some basic filtering for test cases or test functions > > Or module names. > yes please! To be able to specify only the test being worked on would be very nice indeed. -Brett From guido at python.org Mon Feb 2 04:04:56 2009 From: guido at python.org (Guido van Rossum) Date: Sun, 1 Feb 2009 19:04:56 -0800 Subject: [Python-ideas] Adding a test discovery into Python In-Reply-To: References: Message-ID: On Sun, Feb 1, 2009 at 6:14 PM, Brett Cannon wrote: > On Sun, Feb 1, 2009 at 14:40, Guido van Rossum wrote: >> On Sun, Feb 1, 2009 at 2:29 PM, Christian Heimes wrote: >>> Guilherme Polo schrieb: >>>> So.. is there any chance we can enter in agreement what features would >>>> be useful in a test discovery that could be included with Python ? I >>>> for myself do not have fancy wishes for this one, I would be happy >>>> with something that would collect unittests, inside packages and >>>> subpackages, with some fixed patterns and let me run them with >>>> test.test_support.run_unittests, or maybe something that would collect >>>> unittests and doctests and have something like run_tests in >>>> test.test_support. But then I believe this wouldn't be good enough to >>>> substitute any of the current tools, making the addition mostly >>>> useless. >>> >>> I'm +1 for a simple (!) test discovery system. I'm emphasizing on simple >>> because there are enough frameworks for elaborate unit testing. >>> >>> Such a tool should >>> >>> - find all modules and packages named 'tests' for a given package name >> >> I predict that this part is where you'll have a hard time getting >> consensus. There are lots of different naming conventions. It would be >> nice if people could use the new discovery feature without having to >> move all their tests around. >> > > Guido is right; this cannot be convention-based but instead > configuration-based. 
Let me specify in a function call what package to > look in and what naming convention I used for test files. > >>> - load all subclasses of unittest.TestCase from 'tests' module or >>> '*/tests/test*.py' files >> >> Once you've found the test modules, TestCase subclasses are a good >> place to start. Though beware -- there's a pattern that defines >> several alternative classes in a module, e.g. one per platform, and >> then defines a test suite that dynamically decides which class to use. >> The stdlib test suite uses this in a number of places, though I can't >> quite remember where. >> > > Really? I have never come across that before in the standard library, > but I am sure people do stuff like that. I'm 100% positive I've seen it. 95% sure it was in the stdlib, but it could have been somewhere else. > There is no way any tool will work in all pre-existing situations like > this where people heavily rely on a test_main function to handle > decisions on what to call. But as long as it is made clear how to > restrict what tests are found and it is an easy solution I am sure > people will adjust. A convention that the presence of e.g. a test_main function overrides the default in-module discovery would help too. >>> - support some basic filtering for test cases or test functions >> >> Or module names. >> > > yes please! To be able to specify only the test being worked on would > be very nice indeed. The alternative of excluding tests by pattern would also be handy. As would be pointing the tool to a specific module or class. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From rocky at gnu.org Tue Feb 3 10:56:47 2009 From: rocky at gnu.org (rocky at gnu.org) Date: Tue, 3 Feb 2009 04:56:47 -0500 Subject: [Python-ideas] Add a cryptographic hash (e.g SHA1) of source to Python Compiled objects? Message-ID: <18824.5343.933820.298039@panix5.panix.com> I've been re-examining from ground up the whole state of affairs in writing a debugger. 
One of the challenges of a debugger or any source-code analysis tool is verifying that the source code that the tool is reporting on corresponds to the compiled object under execution.

For debuggers, this problem becomes more likely to occur when you are debugging on a computer that isn't the same as the computer where the code is running.

For this, it would be useful to have a cryptographic hash like a SHA1 in the compiled object, but hopefully accessible via the module object where the file path is stored.

I understand that there is an mtime timestamp in the .pyc but this is not as reliable as a cryptographic hash such as SHA1.

There seems to be some confusion in thinking the only use case for this is in remote debugging where source code may be on a different computer than where the code is running, but I do not believe this is so. Here are two other situations which come up.

First is a code coverage tool like coverage.py which checks coverage over several runs. Let's say the source code is erased and checked out again; or edited and temporarily changed several times but in the end the file stays the same. A SHA1 hash will show the file hasn't changed; mtime won't.

A second, more contrived example is in some sort of secure environment. Let's say I am using the compiled Python code (say for an embedded device) and someone offers me what's purported to be the source code. How can I easily verify that this is correct?

In theory I suppose if I have enough information about the version of Python and which platform, I can compile the purported source, ignoring some bits of information (like the mtime ;-) in the compiled object. But one would have to be careful about getting compilers and platforms the same, or understand how this changes compilation.
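For the tool side, the digest rocky describes is straightforward to compute today; what the proposal adds is having the byte-compiler store it in the .pyc and expose it on the module object. A minimal sketch of the fingerprinting idea (the function name and return format are mine, not part of the proposal):

```python
import hashlib

def source_fingerprint(path):
    """Return the SHA1 hex digest of a source file's bytes -- the value
    the proposal would have the byte-compiler embed alongside the code."""
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()
```

Unlike an mtime, the digest survives the erase/check-out/revert cycle described above: restoring the original bytes restores the original fingerprint.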
From brett at python.org Tue Feb 3 20:06:15 2009
From: brett at python.org (Brett Cannon)
Date: Tue, 3 Feb 2009 11:06:15 -0800
Subject: [Python-ideas] Add a cryptographic hash (e.g SHA1) of source to Python Compiled objects?
In-Reply-To: <18824.5343.933820.298039@panix5.panix.com>
References: <18824.5343.933820.298039@panix5.panix.com>
Message-ID: 

On Tue, Feb 3, 2009 at 01:56, wrote:
> I've been re-examining from ground up the whole state of affairs in
> writing a debugger. One of the challenges of a debugger or any
> source-code analysis tool is verifying that the source code that the
> tool is reporting on corresponds to the compiled object under
> execution.
>
> For debuggers, this problem becomes more likely to occur when you are
> debugging on a computer that isn't the same as the computer where the
> code is running.
>
> For this, it would be useful to have a cryptographic hash like a SHA1
> in the compiled object, but hopefully accessible via the module object
> where the file path is stored.
>
> I understand that there is an mtime timestamp in the .pyc but this is
> not as reliable as a cryptographic hash such as SHA1.

Well, whatever solution you propose would need to have this signing be optional since it is in no way required in day-to-day executions. The overhead of calculating the hash is not worth the benefit in the general case.

> There seems to be some confusion in thinking the only use case for
> this is in remote debugging where source code may be on a different
> computer than where the code is running, but I do not believe this is
> so. Here are two other situations which come up.
>
> First is a code coverage tool like coverage.py which checks coverage
> over several runs. Let's say the source code is erased and checked out
> again; or edited and temporarily changed several times but in the end
> the file stays the same. A SHA1 hash will show the file hasn't
> changed; mtime won't.

That seems somewhat contrived.
Assuming you do not have coverage as part of your continuous build process, is having a couple of files be covered again that expensive? And if you were mucking with the files you might want to make sure that you really did not change something.

> A second, more contrived example is in some sort of secure
> environment. Let's say I am using the compiled Python code (say for
> an embedded device) and someone offers me what's purported to be the
> source code. How can I easily verify that this is correct?

I really do not see that situation ever coming up.

> In theory I suppose if I have enough information about the version of
> Python and which platform, I can compile the purported source, ignoring
> some bits of information (like the mtime ;-) in the compiled
> object. But one would have to be careful about getting compilers and
> platforms the same, or understand how this changes compilation.

The only thing you need to compile bytecode is the same Python version (and thus the same magic number) and whether it is a .pyc or .pyo (and thus if -O/-OO was used). Your platform has nothing to do with bytecode compilation.

-Brett

From python at rcn.com Tue Feb 3 20:44:47 2009
From: python at rcn.com (Raymond Hettinger)
Date: Tue, 3 Feb 2009 11:44:47 -0800
Subject: [Python-ideas] Add a cryptographic hash (e.g SHA1) of source toPython Compiled objects?
References: <18824.5343.933820.298039@panix5.panix.com>
Message-ID: <7C24D4B5998D43AA9289972758E9B426@RaymondLaptop1>

[Brett]
> The only thing you need to compile bytecode is the same Python version
> (and thus the same magic number) and whether it is a .pyc or .pyo (and
> thus if -O/-OO was used). Your platform has nothing to do with
> bytecode compilation.

Well said. The best validation of a pyc is to compile the source.
Raymond From brett at python.org Tue Feb 3 20:59:14 2009 From: brett at python.org (Brett Cannon) Date: Tue, 3 Feb 2009 11:59:14 -0800 Subject: [Python-ideas] Add a cryptographic hash (e.g SHA1) of source toPython Compiled objects? In-Reply-To: <7C24D4B5998D43AA9289972758E9B426@RaymondLaptop1> References: <18824.5343.933820.298039@panix5.panix.com> <7C24D4B5998D43AA9289972758E9B426@RaymondLaptop1> Message-ID: On Tue, Feb 3, 2009 at 11:44, Raymond Hettinger wrote: > [Brett] >> >> The only thing you need to compile bytecode is the same Python version >> (and thus the same magic number) and whether it is a .pyc or .pyo (and >> thus if -O/-OO was used). Your platform has nothing to do with >> bytecode compilation. > > Well said. The best validation of a pyc is to compile the source. Actually, recompilation is so cheap and easy (see py_compile or compileall with the --force flag) that bothering with the hash is probably not worth it. Might as well recompile the thing and just see if the strings are equivalent. Unless you are doing this for a ton of files the speed difference, while possibly relatively huge, in absolute terms is still really small and thus probably not worth the added complexity of the hashes. -Brett From rocky at gnu.org Wed Feb 4 05:16:55 2009 From: rocky at gnu.org (rocky at gnu.org) Date: Tue, 3 Feb 2009 23:16:55 -0500 Subject: [Python-ideas] Add a cryptographic hash (e.g SHA1) of source toPython Compiled objects? In-Reply-To: References: <18824.5343.933820.298039@panix5.panix.com> <7C24D4B5998D43AA9289972758E9B426@RaymondLaptop1> Message-ID: <18825.5815.559562.370640@panix5.panix.com> Brett Cannon writes: > On Tue, Feb 3, 2009 at 11:44, Raymond Hettinger wrote: > > [Brett] > >> > >> The only thing you need to compile bytecode is the same Python version > >> (and thus the same magic number) and whether it is a .pyc or .pyo (and > >> thus if -O/-OO was used). Your platform has nothing to do with > >> bytecode compilation. > > > > Well said. 
The best validation of a pyc is to compile the source.
>
> Actually, recompilation is so cheap and easy (see py_compile or
> compileall with the --force flag) that bothering with the hash is
> probably not worth it.

Ok. I'm now enlightened as to a viable approach. Thanks.

Without a doubt you all are much more familiar with this stuff than I am. (In fact I'm a rank novice.) So I'd be grateful if someone would post code for a function, say:

    compare_src_obj(python_src, python_obj)

that takes two strings -- a Python source filename and a Python compiled-object filename -- does what's outlined above, and returns a status which indicates whether they are the same and, if not, whether the difference is because the wrong version of Python was used.

(The use case given was a little more involved than this because one needs to make sure Python is found for the same version as the given compiled object; but to start out I'd be happy to skip that complexity, if it's too much trouble to do.)

From tjreedy at udel.edu Wed Feb 4 07:17:22 2009
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 04 Feb 2009 01:17:22 -0500
Subject: [Python-ideas] Add a cryptographic hash (e.g SHA1) of source toPython Compiled objects?
In-Reply-To: <18825.5815.559562.370640@panix5.panix.com>
References: <18824.5343.933820.298039@panix5.panix.com> <7C24D4B5998D43AA9289972758E9B426@RaymondLaptop1> <18825.5815.559562.370640@panix5.panix.com>
Message-ID: 

rocky at gnu.org wrote:
> Without a doubt you all are much more familiar with this stuff than I
> am. (In fact I'm a rank novice.) So I'd be grateful if someone would
> post code for a function, say:
>
>     compare_src_obj(python_src, python_obj)
>
> that takes two strings -- a Python source filename and a Python
> compiled-object filename -- does what's outlined above, and returns a
> status which indicates whether they are the same and, if not, whether
> the difference is because the wrong version of Python was used.

Interesting question.
For equality, I would start with, just guessing a bit:

    marshal(compile(open(file.py).read())) == open(file.pyc).read()

Specifically for version clash, I believe the first 4 bytes are a magic version number. If that is not part of the marshal string, it would need to be skipped for the equality comparison.

From rocky at gnu.org Wed Feb 4 10:57:18 2009
From: rocky at gnu.org (rocky at gnu.org)
Date: Wed, 4 Feb 2009 04:57:18 -0500
Subject: [Python-ideas] Add a cryptographic hash (e.g SHA1) of source toPython Compiled objects?
In-Reply-To: 
References: <18824.5343.933820.298039@panix5.panix.com> <7C24D4B5998D43AA9289972758E9B426@RaymondLaptop1> <18825.5815.559562.370640@panix5.panix.com>
Message-ID: <18825.26238.914919.705377@panix5.panix.com>

Terry Reedy writes:
> rocky at gnu.org wrote:
>
> > Without a doubt you all are much more familiar with this stuff than I
> > am. (In fact I'm a rank novice.) So I'd be grateful if someone would
> > post code for a function, say:
> >
> >     compare_src_obj(python_src, python_obj)
> >
> > that takes two strings -- a Python source filename and a Python
> > compiled-object filename -- does what's outlined above, and returns a
> > status which indicates whether they are the same and, if not, whether
> > the difference is because the wrong version of Python was used.
>
> Interesting question. For equality, I would start with, just guessing
> a bit:
>
>     marshal(compile(open(file.py).read())) == open(file.pyc).read()
>
> Specifically for version clash, I believe the first 4 bytes are a magic
> version number. If that is not part of the marshal string, it would
> need to be skipped for the equality comparison.

There's also the mtime that needs to be ignored, as mentioned in prior posts. And is there a table which converts a magic number version back into a string with the Python version number? Thanks.
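Terry's guess above maps directly onto today's stdlib: `importlib.util.MAGIC_NUMBER` is the running interpreter's magic number, and everything after the `.pyc` header is the marshalled code object. A hedged sketch of the requested function (the name and return values are mine; note that in modern CPython, 3.7 and later, the header is 16 bytes -- magic, bit field, mtime, size -- rather than the 8 bytes of the 2009 format):

```python
import importlib.util
import marshal

PYC_HEADER_SIZE = 16  # CPython 3.7+: magic(4) + flags(4) + mtime(4) + size(4)

def compare_src_obj(py_path, pyc_path):
    """Return 'version-mismatch', 'match', or 'differ' for a source file
    and a compiled file, skipping the header (magic, mtime, etc.)."""
    with open(pyc_path, "rb") as f:
        data = f.read()
    if data[:4] != importlib.util.MAGIC_NUMBER:
        return "version-mismatch"  # compiled by a different Python version
    with open(py_path) as f:
        source = f.read()
    # dont_inherit=True mirrors how py_compile compiles a file, so the
    # marshalled bytes should be directly comparable.
    code = compile(source, py_path, "exec", dont_inherit=True)
    return "match" if marshal.dumps(code) == data[PYC_HEADER_SIZE:] else "differ"
```

This only answers the version question against the running interpreter; a full magic-number-to-version table lives in the CPython sources themselves, as the replies below point out.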
From arnodel at googlemail.com Wed Feb 4 11:18:10 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Wed, 4 Feb 2009 10:18:10 +0000 Subject: [Python-ideas] Add a cryptographic hash (e.g SHA1) of source toPython Compiled objects? In-Reply-To: <18825.26238.914919.705377@panix5.panix.com> References: <18824.5343.933820.298039@panix5.panix.com> <7C24D4B5998D43AA9289972758E9B426@RaymondLaptop1> <18825.5815.559562.370640@panix5.panix.com> <18825.26238.914919.705377@panix5.panix.com> Message-ID: <9bfc700a0902040218m3c95a620sced72084e76c2ad4@mail.gmail.com> 2009/2/4 : > There's also the mtime that needs to be ignored mentioned in prior > posts. And is there a table which converts a magic number version back > into a string with the Python version number? Thanks. You can look at Python/import.c, near the top of the file. -- Arnaud From brett at python.org Wed Feb 4 19:36:26 2009 From: brett at python.org (Brett Cannon) Date: Wed, 4 Feb 2009 10:36:26 -0800 Subject: [Python-ideas] Add a cryptographic hash (e.g SHA1) of source toPython Compiled objects? In-Reply-To: <18825.26238.914919.705377@panix5.panix.com> References: <18824.5343.933820.298039@panix5.panix.com> <7C24D4B5998D43AA9289972758E9B426@RaymondLaptop1> <18825.5815.559562.370640@panix5.panix.com> <18825.26238.914919.705377@panix5.panix.com> Message-ID: On Wed, Feb 4, 2009 at 01:57, wrote: > Terry Reedy writes: > > rocky at gnu.org wrote: > > > > > Without a doubt you all are much more familiar at this stuff that I > > > am. (In fact I'm a rank novice.) So I'd be grateful if someone would > > > post code for a function say: > > > > > > compare_src_obj(python_src, python_obj) > > > > > > that takes two strings -- a Python source filename and a Python object > > > -- does what's outlined above, and returns a status which indicates > > > the same or if not and if not whether the difference is because of the > > > wrong version of Python was used. > > > > Interesting question. 
For equaility, I would start with, just guessing > > a bit: > > > > marshal(compile(open(file.py).read())) == open(file.pyc).read() > > > > Specifically for version clash, I believe the first 4 bytes are a magic > > version number. If that is not part of the marshal string, it would > > need to be skipped for the equality comparison. > > There's also the mtime that needs to be ignored mentioned in prior > posts. And is there a table which converts a magic number version back > into a string with the Python version number? Thanks. marshal.dumps(compile(open('file.py').read(), 'file.py', 'exec')) == open('file.pyc').read()[8:] -Brett From brett at python.org Wed Feb 4 19:37:40 2009 From: brett at python.org (Brett Cannon) Date: Wed, 4 Feb 2009 10:37:40 -0800 Subject: [Python-ideas] Add a cryptographic hash (e.g SHA1) of source toPython Compiled objects? In-Reply-To: <9bfc700a0902040218m3c95a620sced72084e76c2ad4@mail.gmail.com> References: <18824.5343.933820.298039@panix5.panix.com> <7C24D4B5998D43AA9289972758E9B426@RaymondLaptop1> <18825.5815.559562.370640@panix5.panix.com> <18825.26238.914919.705377@panix5.panix.com> <9bfc700a0902040218m3c95a620sced72084e76c2ad4@mail.gmail.com> Message-ID: On Wed, Feb 4, 2009 at 02:18, Arnaud Delobelle wrote: > 2009/2/4 : > >> There's also the mtime that needs to be ignored mentioned in prior >> posts. And is there a table which converts a magic number version back >> into a string with the Python version number? Thanks. > > You can look at Python/import.c, near the top of the file. The other option to see how all of this works is importlib as found in the py3k branch. That's in pure Python so it's easier to follow. -Brett From yaogzhan at gmail.com Thu Feb 5 04:05:30 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Wed, 4 Feb 2009 23:35:30 -0330 Subject: [Python-ideas] Making colons optional? Message-ID: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> Hi everybody, I'm proposing to make colons optional. 
Let me call this feature Colonless Python. To show what it is like without colons, here are some examples adapted from the official tutorial.

---------- sample code -----------

if statement
============

if x < 0
    x = 0
    print 'Negative changed to zero'
elif x == 0
    print 'Zero'
elif x == 1
    print 'Single'
else
    print 'More'

for statement
=============

for x in a
    print x, len(x)

while statement
===============

# Fibonacci series:
# the sum of two elements defines the next
a, b = 0, 1
while b < 10
    print b
    a, b = b, a+b

function definition
==================

def fib(n)    # write Fibonacci series up to n
    """Print a Fibonacci series up to n."""
    a, b = 0, 1
    while b < n
        print b,
        a, b = b, a+b

class definition
================

class MyClass
    """A simple example class"""
    i = 12345
    def f(self)
        return 'hello world'

------------- end --------------

Please note that like semicolons, one can still write one-liners with colons like

    if x > 0: y = z; x = y

or

    for each in range(10): x += each; x -= each**2

I'm proposing this due to several reasons, namely

- I noticed a strong tendency to forget colons by new users of Python in a second-year computer science undergraduate course. The students seemed not getting used to colons even near the end of the course. I guess it is probably because they learn Java and C first, both of which do not have colons. What other languages do you know that require colons? Colons seem to be a trap for new Python users from other languages.

- We already have indentation to visually separate different levels of code. Why bother with those extra colons at all? They should be optional just like semicolons are optional (and discouraged) for line breaks. I doubt we will ever lose much, if any, by omitting colons.

- I find colons pretty annoying. They interrupt mental flows of thinking.
I have to do a mental break at the end of conditionals, loops, function and class definitions, and then a physical break to type "SHIFT+;" to get the colons before I continue on the subsequent lines. It's not really a big issue, but it is annoying.

What do you think? Specifically, may I ask your feedback on the following issues?

- Do you find yourself sometimes forget to type them and the interpreter complains?

- Are the above pieces of code less readable due to lack of colons?

- What problems do you think will occur if colons are made optional?

PS. I had some preliminary discussion in the comments on one of Guido's blog posts about Python's history. If you are interested, you can check out this link http://python-history.blogspot.com/2009/01/pythons-design-philosophy.html#comments

Best, Rio

From timothy.c.delaney at gmail.com Thu Feb 5 04:30:53 2009
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Thu, 5 Feb 2009 14:30:53 +1100
Subject: [Python-ideas] Making colons optional?
In-Reply-To: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com>
References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com>
Message-ID: <98985ab20902041930m7385f427nd83e7c12c6962091@mail.gmail.com>

On Thu, Feb 5, 2009 at 2:05 PM, Riobard Zhan wrote:

> I'm proposing to make colons optional. Let me call this feature Colonless
> Python. To show what it is like without colons, here are some examples
> adapted from the official tutorial.

http://www.google.com.au/search?hl=en&q=python+make+colon+optional&meta=

Tim Delaney
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From yaogzhan at gmail.com Thu Feb 5 04:49:50 2009
From: yaogzhan at gmail.com (Riobard Zhan)
Date: Thu, 5 Feb 2009 00:19:50 -0330
Subject: [Python-ideas] Making colons optional?
In-Reply-To: <98985ab20902041930m7385f427nd83e7c12c6962091@mail.gmail.com>
References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <98985ab20902041930m7385f427nd83e7c12c6962091@mail.gmail.com>
Message-ID: 

Hi Tim,

Thanks for pointing out the links. Actually I read the original thread on python-list titled "why not drop the colons". But as I said in the comments of Guido's blog post, the so-called usability surveys did not pop up any real data (we do not know how the surveys were conducted, in what context, with what results, etc) to convince the original proposer (and me). So I think we can probably discuss it. Things change, right? :P

Hope you don't mind that I bring up such an ancient topic ...

Best, Rio

On 5-Feb-09, at 12:00 AM, Tim Delaney wrote:
>
> On Thu, Feb 5, 2009 at 2:05 PM, Riobard Zhan wrote:
>
> I'm proposing to make colons optional. Let me call this feature
> Colonless Python. To show what it is like without colons, here are
> some examples adapted from the official tutorial.
>
> http://www.google.com.au/search?hl=en&q=python+make+colon+optional&meta=
>
> Tim Delaney

From ben+python at benfinney.id.au Thu Feb 5 05:07:36 2009
From: ben+python at benfinney.id.au (Ben Finney)
Date: Thu, 05 Feb 2009 15:07:36 +1100
Subject: [Python-ideas] Making colons optional?
References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com>
Message-ID: <871vudzfuf.fsf@benfinney.id.au>
I find that the line-end colon is a strong visual indicator that a suite is being introduced, as contrasted with some other difference in indentation. > - Do you find yourself sometimes forget to type them and the > interpreter complains? Not since a few days learning the language, no. > - Are the above pieces of code less readable due to lack of colons? The examples you've chosen, no; but I think that a lot of code which uses multi-line statements with implicit continuation would be made less readable without colons to indicate the introduction of suites. -- \ ?I hate it when my foot falls asleep during the day, because | `\ that means it's gonna be up all night.? ?Steven Wright | _o__) | Ben Finney From george.sakkis at gmail.com Thu Feb 5 05:26:25 2009 From: george.sakkis at gmail.com (George Sakkis) Date: Wed, 4 Feb 2009 23:26:25 -0500 Subject: [Python-ideas] Making colons optional? In-Reply-To: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> Message-ID: <91ad5bf80902042026l8d3bb9fj3beaf149b6bab5c3@mail.gmail.com> On Wed, Feb 4, 2009 at 10:05 PM, Riobard Zhan wrote: > Hi everybody, > > > I'm proposing to make colons optional. Let me call this feature Colonless > Python. http://cobra-language.com/docs/hello-world/ George From steve at pearwood.info Thu Feb 5 12:01:21 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 05 Feb 2009 22:01:21 +1100 Subject: [Python-ideas] Making colons optional? In-Reply-To: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> Message-ID: <498AC701.6030300@pearwood.info> Riobard Zhan wrote: > - I noticed a strong tendency to forget colons by new users of Python in > a second-year computer science undergraduate course. The students seemed > not getting used to colons even near the end of the course. I guess it > is probably because they learn Java and C first, both of which do not > have colons. 
> What other languages do you know that require colons?

Pascal uses colons, but not for the exact same purpose as Python. Both languages use colons in ways similar to the colon's use in English. In particular, Python uses colons as a break between clauses: larger than a comma, smaller than a period.

> - I find colons pretty annoying. ...

I'm sorry you dislike colons, but I like them.

> What do you think? Specifically, may I ask your feedback on the
> following issues?
>
> - Do you find yourself sometimes forget to type them and the interpreter
> complains?

Perhaps one time in 1000, MUCH less often than I miss a closing parenthesis, and about as often as I accidentally use round brackets instead of square (or vice versa).

> - Are the above pieces of code less readable due to lack of colons?

I think so. I came to Python from Pascal and Hypertalk, two very different languages. I never missed Pascal's semi-colons at the end of each line, and Hypertalk rarely uses punctuation (other than for arithmetic). I didn't find it difficult to pick up on using colons. As an exercise, I just took a look at some old Hypertalk code, and found the lack of colons equally distracting in it as I find it in your examples.

> - What problems do you think will occur if colons are made optional?

I don't think it would lead to any problems, but I think it would make Python less elegant.

-- 
Steven

From steve at pearwood.info Thu Feb 5 12:05:50 2009
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 05 Feb 2009 22:05:50 +1100
Subject: [Python-ideas] Making colons optional?
In-Reply-To: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com>
References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com>
Message-ID: <498AC80E.2010708@pearwood.info>

I forgot one other comment...

Riobard Zhan wrote:
> - I noticed a strong tendency to forget colons by new users of Python in
> a second-year computer science undergraduate course. The students seemed
> not getting used to colons even near the end of the course.
While I would not support the removal of colons, I would support a better
error message when one is missing.

>>> for x in 1,2,3
  File "", line 1
    for x in 1,2,3
                 ^
SyntaxError: invalid syntax

Perhaps this should say what is missing?

SyntaxError: linebreak in statement or missing colon

-- 
Steven

From denis.spir at free.fr  Thu Feb 5 13:26:02 2009
From: denis.spir at free.fr (spir)
Date: Thu, 5 Feb 2009 13:26:02 +0100
Subject: [Python-ideas] Making colons optional? Syntactic alternative.
In-Reply-To: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com>
References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com>
Message-ID: <20090205132602.58368969@o>

Le Wed, 4 Feb 2009 23:35:30 -0330,
Riobard Zhan a écrit :

> Hi everybody,
>
>
> I'm proposing to make colons optional.

I 99% agree with that. But 100% sure this will never happen. No chance at
all, imo... Still, it may be worth exchanging some thoughts on the topic
as an opportunity to take some distance, and watch Python's common traits
with an outsider's look (correct?).

Block starts are already indicated, to humans as well as to the machine,
by instruction type and indentation. Parsing for validity checks or for
editor smart indentation may rely on instruction 'type':

if last_instruction.type in block_headline_types:

I personally do not find that expression's semantics more complicated
than e.g.:

if last_instruction.text.endswith(':'):

The first one has the additional advantage of preventing the common error
of editors indenting after comments that happen to end with ":" ;-)
Which by the way shows that relying on ':' is a wrong algorithm; as well
as the fact that we get an error when forgetting ':' shows that Python
parsers actually rely on more information, namely instruction type.
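A minimal sketch of such an instruction-type check (the names and the
keyword-based classification here are hypothetical illustrations, not a
real parser API):

```python
# Hypothetical sketch: decide whether a line introduces a suite by its
# leading keyword rather than by a trailing ':'.  Note that a comment
# which happens to end with ':' is classified correctly here, unlike
# with the endswith(':') test.
BLOCK_KEYWORDS = {"if", "elif", "else", "for", "while",
                  "def", "class", "try", "except", "finally", "with"}

def introduces_suite(line):
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return False
    first_word = stripped.split(None, 1)[0].rstrip(":")
    return first_word in BLOCK_KEYWORDS

print(introduces_suite("for item in items:"))  # True
print(introduces_suite("x = 1"))               # False
print(introduces_suite("# ends with a :"))     # False
```

A real implementation would of course work on token types rather than on
raw text, but the principle is the same.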
Smart unindent in editors is already done that way, by identifying
instructions like 'return' that very probably end a block and thus are
followed by unindentation:

if last_instruction.type in probable_unindent_types:
    do_unindent

On the other hand, at the human semantic level, the presence of a ':'
nicely introduces a block. One could object that this fact is basically
Western-culture centered (like e.g. '#' meaning number is, as far as I
know, purely English), but this is true for all common programming
conventions. Computer science and technology are products of Western
culture, anyway. People from other cultures have to get familiar with
much more than the arbitrary meaning of signs before having a chance to
explore computer programming.

A kind of meaning conflict nevertheless exists even in Python itself: a
':' sign even more nicely carries a sense of binding, or link. Python
uses this common real-world semantics in dict literal key:value pairs.

Starting from this point, and looking for clarity and consistency, my
wished-for syntactic revolution ;-) reads as follows:

* Get rid of ':' at the end of headlines.
* Allow only single-instruction blocks (suites) for one-liners:
"if cond; do_that". Or maybe no separator at all is necessary here for
unambiguous parsing? Or replace ':' or ';' by 'then'. This would also
allow if...then...else one-liners. Alternatively, make newline+indent
compulsory. This is anyway often recommended in style guidelines.
* Extend the name binding format "name:value" to assignments. That is,
write "a:1" instead of "a=1". This may avoid tons of misunderstandings
(imo, '=' for assignment is a semantic plague, a major error/fault, a
pedagogic calamity).
* Which leaves the "=" sign free for better use: use "=" for the
semantics of equality, like in math and the common interpretation learnt
at school.
* Perhaps: use "==" for identity, instead of "is".
Again, this is not intended as a proposal for Python, not even a hope ;-)
Just a reflection on an alternative syntactic ruleset. I would enjoy
comments & critiques about that.

Denis

PS: I will take the opportunity to propose a thread on the topic of
"binding vs rebinding".

------
la vida e estranya

From denis.spir at free.fr  Thu Feb 5 14:22:11 2009
From: denis.spir at free.fr (spir)
Date: Thu, 5 Feb 2009 14:22:11 +0100
Subject: [Python-ideas] binding vs rebinding
Message-ID: <20090205142211.77ba2568@o>

Hello,

I wonder why there is no difference in syntax between binding and
rebinding. Obviously, the semantics is not at all the same, for humans as
well as for the interpreter:
* Binding: create a name, bind a value to it.
* Rebinding: change the value bound to the name.

I see several advantages for this distinction and no drawback. The first
advantage, which imo is worthwhile enough, is to let syntax match
semantics, as the distinction *makes sense*.

A nice side-effect would be to allow detection of typographic or (human
;-) memory errors:
* When an error wrongly and silently creates a new name instead of
raising a NameError exception. A distinct syntax for rebinding would
prevent that.
* When an error wrongly and silently rebinds an existing name instead of
raising a NameError exception. A distinct syntax for (first) binding
would prevent that.
No need, I guess, to insist on the fact that such errors sometimes lead
to long and difficult debugging precisely because they are silent. This
is because in all cases "a=1" is a valid instruction, as there is no
distinction between binding and rebinding.

I suspect a further advantage may be to get rid of "global" and
"nonlocal" declarations -- which, as I see it, do not at all fit the
Python way. I may be wrong on that; still, it seems such declarations are
necessary only because the above distinction is lacking.
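The first kind of silent error is easy to demonstrate in today's Python
(a minimal illustration; the distinct rebinding spelling itself remains
hypothetical):

```python
# A typo on the left-hand side silently binds a *new* name instead of
# rebinding the intended one -- no NameError is raised.
total = 0
for n in [1, 2, 3]:
    totl = total + n   # typo: 'totl' was meant to be 'total'

print(total)   # 0 -- the bug passes silently
```

With a distinct rebinding syntax such as the ':=' discussed below, the
typo line would raise NameError, since no 'totl' was ever bound.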
My rationale for this is:
* It is very common and helpful to allow a local variable being named
identically to another one in an external scope.
* There is no binding/rebinding distinction in Python syntax.
* So that whenever a name appears on the left side of an assignment,
inside a non-global scope, there is no way to know whether the programmer
intends to create a local name or to access a possibly existing external
name.
* To resolve this ambiguity, Python adopts the rule of creating a local
name.
* Thus, it becomes impossible to rebind an external name from a local
scope. Which is still useful in rather rare, but relevant, use cases.
* So that 'global', and later 'nonlocal', declarations had to be
introduced in Python.
(I tried to be as clear and step-by-step as I can so that this reasoning
can easily be refuted if ever it holds errors I cannot see.)

It seems that if ever the second step would not hold, then there would be
no reason for such declarations. Imagine that rebinding is spelt using
':='. Then, from a non-global scope:
* a=1 causes creation of a local name
* a:=1 rebinds a local name if one exists, or rebinds an external name if
one exists (step-by-step up to module-level scope), or else raises
NameError.
There may be reasons why such a behaviour is not the best a programmer
would expect: I await your comments.

Obviously, for the sake of compatibility, this is more a base for
discussion, if you find the topic interesting, than a proposal for Python
9000...

Denis

PS: As a reference to the thread on the sign ':' at the end of block
headlines, the syntactic format I would actually enjoy is:
* binding      name : value
* rebinding    name :: value

------
la vida e estranya

From rocky at gnu.org  Thu Feb 5 14:40:44 2009
From: rocky at gnu.org (rocky at gnu.org)
Date: Thu, 5 Feb 2009 08:40:44 -0500
Subject: [Python-ideas] Add a cryptographic hash (e.g. SHA1) of source to Python Compiled objects?
In-Reply-To: References: <18824.5343.933820.298039@panix5.panix.com> <7C24D4B5998D43AA9289972758E9B426@RaymondLaptop1> <18825.5815.559562.370640@panix5.panix.com> <18825.26238.914919.705377@panix5.panix.com> Message-ID: <18826.60508.910995.447316@panix5.panix.com> Brett Cannon writes: > On Wed, Feb 4, 2009 at 01:57, wrote: > > Terry Reedy writes: > > > rocky at gnu.org wrote: > > > > > > > Without a doubt you all are much more familiar at this stuff that I > > > > am. (In fact I'm a rank novice.) So I'd be grateful if someone would > > > > post code for a function say: > > > > > > > > compare_src_obj(python_src, python_obj) > > > > > > > > that takes two strings -- a Python source filename and a Python object > > > > -- does what's outlined above, and returns a status which indicates > > > > the same or if not and if not whether the difference is because of the > > > > wrong version of Python was used. > > > > > > Interesting question. For equaility, I would start with, just guessing > > > a bit: > > > > > > marshal(compile(open(file.py).read())) == open(file.pyc).read() > > > > > > Specifically for version clash, I believe the first 4 bytes are a magic > > > version number. If that is not part of the marshal string, it would > > > need to be skipped for the equality comparison. > > > > There's also the mtime that needs to be ignored mentioned in prior > > posts. And is there a table which converts a magic number version back > > into a string with the Python version number? Thanks. > > marshal.dumps(compile(open('file.py').read(), 'file.py', 'exec')) == > open('file.pyc').read()[8:] Thanks. Alas, I can't see how in practice this will be generally useful. Again, here is the problem: I have a some sort of compiled python file and something which I think is the source code for it. I want to verify that it is. (In a debugger it means we can warn that what you are seeing is not what's being run. 
However I do not believe this is the only situation where getting the
answer to this question is helpful/important.)

The solution above is very sensitive to knowing the name of the file
(files?) used in compilation because those are stored in the co_filename
portion of the code object. For example if what's stored in that field is
'foo.py' but I compile with the name './foo.py' or some other equivalent
name, then I get a false mismatch. Worse, as we've seen before when
dealing with zipped eggs, the name stored in co_filename is a somewhat
temporary location and something very few people are going to guess or
recognize as the location of where they think the file originated.

What seems to me to be a weakness of this approach is that it requires
that you get two additional pieces of information correct that really are
irrelevant from the standpoint of the problem: the name of the file and
the version of Python used in the compilation process. I just care about
the source text.

As I write this I can't help but be amused, because when I asked before
on python-dev about how I could get more accurate file names in
co_filename (for zipped eggs), the answer invariably offered was
something along the lines of "why not use the source text?"

From lists at cheimes.de  Thu Feb 5 14:34:05 2009
From: lists at cheimes.de (Christian Heimes)
Date: Thu, 05 Feb 2009 14:34:05 +0100
Subject: [Python-ideas] Making colons optional?
In-Reply-To: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com>
References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com>
Message-ID: 

-1

Please keep Python sources readable.
Christian From dangyogi at gmail.com Thu Feb 5 15:59:45 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Thu, 05 Feb 2009 09:59:45 -0500 Subject: [Python-ideas] binding vs rebinding In-Reply-To: <20090205142211.77ba2568@o> References: <20090205142211.77ba2568@o> Message-ID: <498AFEE1.7070400@gmail.com> spir wrote: > It seems that if ever the second step would not hold, then there would be no reason for such declarations. Imagine that rebinbing is spellt using ':='. Then, from a non-glocal scope: > * a=1 causes creation of a local name > * a:=1 rebinds a local name if exists, or rebinds an external name if exists (step-by-step up to module level scope), or else launches NameError. > There may be reasons why such a behaviour is not the best a programmer would expect: I wait for your comments. > I like the idea. On the global/nonlocal thing, it would be possible that a nested function does a:=l and there is both a nonlocal "a" and a global "a". The current global/nonlocal mechanism allows the programmer to disambiguate this case. But it is hard to imagine use cases where this couldn't be resolved by renaming one of the "a" variables. And the same problem occurs in the current mechanism where a function nested 2 levels down has two nonlocal "a" variables: in its direct parent, and in its grandparent. -bruce From yaogzhan at gmail.com Thu Feb 5 17:05:17 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Thu, 5 Feb 2009 12:35:17 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <871vudzfuf.fsf@benfinney.id.au> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <871vudzfuf.fsf@benfinney.id.au> Message-ID: On 5-Feb-09, at 12:37 AM, Ben Finney wrote: > >> - We already have indentation to visually separate different levels >> of code. Why bother with those extra colons at all? > > Because indentation also occurs for other reasons, a major example > being continuations of previous lines. 
> > for line in foo:
> > do_something(
> > spam, line)
>
> That's three lines at differing indentation levels, but two
> statements.
>
> I find that the line-end colon is a strong visual indicator that a
> suite is being introduced, as contrasted with some other difference in
> indentation.

Actually this is the first concrete example for colons. Thanks very much
for bringing it up, Ben! :)

Here is a counter-example to the strong-visual-indicator claim.

for some_list in some_collection:
    do_something(some_list[1:
], something_else)

From yaogzhan at gmail.com  Thu Feb 5 17:11:19 2009
From: yaogzhan at gmail.com (Riobard Zhan)
Date: Thu, 5 Feb 2009 12:41:19 -0330
Subject: [Python-ideas] Making colons optional?
In-Reply-To: 
References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com>
Message-ID: 

On 5-Feb-09, at 10:04 AM, Christian Heimes wrote:
> -1
>
> Please keep Python sources readable.
>
> Christian

-1

Please elaborate on (better yet, give concrete examples of) why it makes
Python code less readable.

The reason I propose to make colons optional is that I fail to see how
colons make code more readable. They seem to be line noise to me. You
might disagree, but please explain why.

From dickinsm at gmail.com  Thu Feb 5 17:12:17 2009
From: dickinsm at gmail.com (Mark Dickinson)
Date: Thu, 5 Feb 2009 16:12:17 +0000
Subject: [Python-ideas] binding vs rebinding
In-Reply-To: <20090205142211.77ba2568@o>
References: <20090205142211.77ba2568@o>
Message-ID: <5c6f2a5d0902050812u10a82baaxd643053256ac7f80@mail.gmail.com>

On Thu, Feb 5, 2009 at 1:22 PM, spir wrote:
> Hello,
>
> I wonder why there is no difference in syntax between binding and rebinding. Obviously, the semantics is not at all the same, for humans as well as for the interpreter:
> * Binding: create a name, bind a value to it.
> * Rebinding: change the value bound to the name.
>
> I see several advantages for this distinction and no drawback.

How would you write code like:

my_result = []
for item in items:
    a = 
    ...
my_result.append() where a is bound for the first time on the first iteration of the loop, and rebound on all subsequent iterations? Mark From lists at cheimes.de Thu Feb 5 17:53:46 2009 From: lists at cheimes.de (Christian Heimes) Date: Thu, 05 Feb 2009 17:53:46 +0100 Subject: [Python-ideas] Making colons optional? In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> Message-ID: Riobard Zhan schrieb: > On 5-Feb-09, at 10:04 AM, Christian Heimes wrote: > >> -1 >> >> Please keep Python sources readable. >> >> Christian > > > -1 > > Please elaborate (better yet, give concrete examples) why it makes > Python code less readable. > > > The reason I propose to make colons optional is that I fail to see > colons make code more readable. They seem to be line noise to me. You > might disagree, but please explain why. The colon at the end makes it clear it's the end of the statement, too. Some example: def method(self, some, very, long, method, going, over, lots, and, lots, and, lots, of, lines): pass The second example makes it even more obvious: if (some() and some_other() or some_more(complex=(True,)) and a_final_call(egg=(1,2,3))): do_something() You see a line starting with "if" but not ending with a colon. You know for sure that you have to search for a trailing colon in order to find the end of a very long "if" line. Yes, the colon is extra noise but it's the kind of good noise that makes life more joyful like the noise of rain on a roof. Did you notice that I'm using a colon in my regular postings, too? I've used two colons to separate my text from the examples. Colon separators are natural to me. Christian From denis.spir at free.fr Thu Feb 5 18:10:39 2009 From: denis.spir at free.fr (spir) Date: Thu, 5 Feb 2009 18:10:39 +0100 Subject: [Python-ideas] Making colons optional? 
In-Reply-To: 
References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com>
Message-ID: <20090205181039.6a7bfd73@o>

Le Thu, 05 Feb 2009 17:53:46 +0100,
Christian Heimes a écrit :

[...]
> Yes, the colon is extra noise but it's the kind of good noise that makes
> life more joyful like the noise of rain on a roof. Did you notice that
> I'm using a colon in my regular postings, too? I've used two colons to
> separate my text from the examples. Colon separators are natural to me.

So why not support a PEP to introduce a compulsory ';' at the end of all
non-compound statements? All your comments and examples hold both for
ordinary statements and block headlines -- or do I miss the point?

> The second example makes it even more obvious:
>
> if (some() and some_other() or some_more(complex=(True,))
> and a_final_call(egg=(1,2,3))):

Do(some(), some_other(), some_more(complex=(True,)),
   and_final_call(egg=(1,2,3,zzzzz=False)));   # end of statement is obvious

Denis

> Christian
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

------
la vida e estranya

From scott+python-ideas at scottdial.com  Thu Feb 5 18:07:39 2009
From: scott+python-ideas at scottdial.com (Scott Dial)
Date: Thu, 05 Feb 2009 12:07:39 -0500
Subject: [Python-ideas] Making colons optional?
In-Reply-To: 
References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com>
Message-ID: <498B1CDB.3000804@scottdial.com>

Christian Heimes wrote:
> def method(self, some, very, long, method,
> going, over, lots,
> and, lots,
> and, lots, of, lines):
> pass
>

This example line-continues via parentheses, so why isn't the
right-paren enough?

> if (some() and some_other() or some_more(complex=(True,))
> and a_final_call(egg=(1,2,3))):
> do_something()

This example uses the same mechanism as above.
BTW, I tend to indent this as:

if (some() and some_other() or some_more(complex=(True,))
    and a_final_call(egg=(1,2,3))):
    do_something()

With or without the colon, it's more readable than your version (IMHO),
and clearly the colon provides no aid to it.

> You see a line starting with "if" but not ending with a colon. You know
> for sure that you have to search for a trailing colon in order to find
> the end of a very long "if" line.

I'd also add that every C programmer has dealt with this before with
single-statement if clauses that require no braces. This is after all the
reason why I indent line-continued test expressions the way I do.

> Yes, the colon is extra noise but it's the kind of good noise that makes
> life more joyful like the noise of rain on a roof. Did you notice that
> I'm using a colon in my regular postings, too? I've used two colons to
> separate my text from the examples. Colon separators are natural to me.

All said, I would prefer to keep the colons as well. I like the
regularity of the colon<=>suite equilibrium. Even if the slicing operator
spits in the face of it, rarely do you see [x:y] line-continued to leave
a naked colon at the end of the line.

-- 
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu

From qrczak at knm.org.pl  Thu Feb 5 18:52:04 2009
From: qrczak at knm.org.pl (Marcin 'Qrczak' Kowalczyk)
Date: Thu, 5 Feb 2009 18:52:04 +0100
Subject: [Python-ideas] binding vs rebinding
In-Reply-To: <5c6f2a5d0902050812u10a82baaxd643053256ac7f80@mail.gmail.com>
References: <20090205142211.77ba2568@o>
	<5c6f2a5d0902050812u10a82baaxd643053256ac7f80@mail.gmail.com>
Message-ID: <3f4107910902050952t442b3810j328cb22339ca95dc@mail.gmail.com>

On Thu, Feb 5, 2009 at 17:12, Mark Dickinson wrote:
> How would you write code like:
>
> my_result = []
> for item in items:
> a = 
> ...
> my_result.append()
>
> where a is bound for the first time on the first iteration
> of the loop, and rebound on all subsequent iterations?

It would be bound locally to the body of the loop.

-- 
Marcin Kowalczyk
qrczak at knm.org.pl
http://qrnik.knm.org.pl/~qrczak/

From yaogzhan at gmail.com  Thu Feb 5 19:12:58 2009
From: yaogzhan at gmail.com (Riobard Zhan)
Date: Thu, 5 Feb 2009 14:42:58 -0330
Subject: [Python-ideas] Making colons optional?
In-Reply-To: <498AC701.6030300@pearwood.info>
References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com>
	<498AC701.6030300@pearwood.info>
Message-ID: <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com>

On 5-Feb-09, at 7:31 AM, Steven D'Aprano wrote:
> Riobard Zhan wrote:
>
>> - I noticed a strong tendency to forget colons by new users of
>> Python in a second-year computer science undergraduate course. The
>> students seemed not getting used to colons even near the end of the
>> course. I guess it is probably because they learn Java and C first,
>> both of which do not have colons. What other languages do you know
>> that require colons?
>
> Pascal uses colons, but not for the exact same purpose as Python.
> Both languages use colons in ways similar to their use in English. In
> particular, Python uses colons as a break between clauses: larger
> than a comma, smaller than a period.
>

Pascal is my first language. It has been some years, so I cannot
remember the details now. I checked Wikipedia and did not find that
colons are used after if's. Not sure if you mean declarations? If so, I
don't think that is what we are discussing here; Java and C also use
colons in switch/case statements. AFAIK, Python is quite unique in
requiring trailing colons after if's, for's, and function/class
definitions.

>
>> - I find colons pretty annoying.
> ...
>
> I'm sorry you dislike colons, but I like them.
>

Yes I agree with you that many people like colons. What bothers me is
that some people dislike them, but are not given the choice to avoid
them.
We don't like semicolons in Python, but what would stop a hard-core C
user from ending every statement with a semicolon? They have the choice.

And I would also argue that many of those who like colons do so not
because they really feel colons improve readability, but because they got
used to colons in the first place. You like colons, I don't. How do you
know another Python user will like them or not? By making trailing colons
OPTIONAL, we can probably have the chance to field test. If people really
think colons improve readability that much, they can still use them, just
like we feel semicolons are line noise and avoid them if possible, even
though we CAN use them. I don't think we will ever lose anything by
making colons optional.

>
>> - What problems do you think will occur if colons are made optional?
>
> I don't think it would lead to any problems, but I think it would
> make Python less elegant.
>

I think omitting colons makes Python more elegant--more uniform, less
clutter. It's an itch every time I see a piece of Ruby code with lots of
def's and if's without trailing colons ...

From yaogzhan at gmail.com  Thu Feb 5 19:21:18 2009
From: yaogzhan at gmail.com (Riobard Zhan)
Date: Thu, 5 Feb 2009 14:51:18 -0330
Subject: [Python-ideas] Making colons optional? Syntactic alternative.
In-Reply-To: <20090205132602.58368969@o>
References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com>
	<20090205132602.58368969@o>
Message-ID: <1F2A6158-A36D-42E1-96F2-993F373BDB35@gmail.com>

On 5-Feb-09, at 8:56 AM, spir wrote:
> Le Wed, 4 Feb 2009 23:35:30 -0330,
> Riobard Zhan a écrit :
>
>> Hi everybody,
>>
>>
>> I'm proposing to make colons optional.
>
> I 99% agree with that. But 100% sure this will never happen. No
> chance at all, imo... Still, it may be worth exchanging some
> thoughts on the topic as an opportunity to take some distance, and
> watch Python's common traits with an outsider's look (correct?).
I am a little bit different from you in the numbers--I 100% dislike
colons, but am 99% sure they will never be made optional. That last 1%
leads me to give it a try here :)

I totally agree with you that discussing such issues would be a great
chance to learn something about Python. How many Python users outside
this mailing list understand the rationale behind colons? I guess it
might be a few. It's an issue worth thinking about.

> A kind of meaning conflict nevertheless exists even in Python
> itself: a ':' sign even more nicely carries a sense of binding, or
> link. Python uses this common real-world semantics in dict literal
> key:value pairs.

Colons are also used for slicing. They seem to be overused ...

From bruce at leapyear.org  Thu Feb 5 19:30:00 2009
From: bruce at leapyear.org (Bruce Leban)
Date: Thu, 5 Feb 2009 10:30:00 -0800
Subject: [Python-ideas] Making stars optional? (was: Making colons optional?)
Message-ID: 

In algebra, you don't have to put a multiplication sign in between two
quantities that you want to multiply. I've seen beginning programmers
write things like

x = 3a + 4(b-c)

instead of

x = 3*a + 4*(b-c)

Why should we require the stars when it's unambiguous what the first
statement means?

--- Bruce

P.S. Pascal used words if ... then ... and while ... do ... and begin ...
end. I like non-alpha symbols that stand out better, like colons, braces
and indentation. I don't need a semicolon to tell me where the end of a
line is. I only need something to tell me where the end of a line ISN'T
and I handle that with prominent indentation.

From yaogzhan at gmail.com  Thu Feb 5 19:29:59 2009
From: yaogzhan at gmail.com (Riobard Zhan)
Date: Thu, 5 Feb 2009 14:59:59 -0330
Subject: [Python-ideas] Making colons optional?
In-Reply-To: <20090205181039.6a7bfd73@o>
References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com>
	<20090205181039.6a7bfd73@o>
Message-ID: 

On 5-Feb-09, at 1:40 PM, spir wrote:
> Le Thu, 05 Feb 2009 17:53:46 +0100,
> Christian Heimes a écrit :
>
> [...]
>> Yes, the colon is extra noise but it's the kind of good noise that
>> makes
>> life more joyful like the noise of rain on a roof. Did you notice
>> that
>> I'm using a colon in my regular postings, too? I've used two colons
>> to
>> separate my text from the examples. Colon separators are natural to
>> me.
>
> So why not support a PEP to introduce a compulsory ';' at the end of
> all non-compound statements? All your comments and examples hold
> both for ordinary statements and block headlines -- or do I miss the
> point?
>
>> The second example makes it even more obvious:
>>
>> if (some() and some_other() or some_more(complex=(True,))
>> and a_final_call(egg=(1,2,3))):
>
> Do(some(), some_other(), some_more(complex=(True,)),
> and_final_call(egg=(1,2,3,zzzzz=False))); # end of
> statement is obvious

This is exactly the point in my mind! :) If semicolons are optional, so
should be colons. If line continuation really matters, we should use
both anyway. Why the irregularity?

From denis.spir at free.fr  Thu Feb 5 19:45:38 2009
From: denis.spir at free.fr (spir)
Date: Thu, 5 Feb 2009 19:45:38 +0100
Subject: [Python-ideas] Fw: binding vs rebinding
Message-ID: <20090205194538.23689a13@o>

Le Thu, 5 Feb 2009 16:12:17 +0000,
Mark Dickinson a écrit :

> On Thu, Feb 5, 2009 at 1:22 PM, spir wrote:
> > Hello,
> >
> > I wonder why there is no difference in syntax between binding and rebinding. Obviously, the semantics is not at all the same, for humans as well as for the interpreter:
> > * Binding: create a name, bind a value to it.
> > * Rebinding: change the value bound to the name.
> >
> > I see several advantages for this distinction and no drawback.
>
> How would you write code like:
>
> my_result = []
> for item in items:
> a = 
> ...
> my_result.append()
>
> where a is bound for the first time on the first iteration
> of the loop, and rebound on all subsequent iterations?

Yes, good point! That's probably a reason why in some languages loops
create local scopes/namespaces. In which case the issue disappears, as
'a' is newly created for each iteration. An incremental value (e.g. a
sum) is first defined/initialized outside the loop instead.

I take this as a different, but related, issue; because the intention and
meaning here is really to have an 'a' for each iteration, matching each
item. It is not a value that will evolve along with the loop cycles, as a
sum would: it is not rebinding. Conceptually, 'a' is thus a loop-local
name. Maybe a special case for loops, where such "utility short-life
names" are common? Not very nice.

If "a=..." is allowed even when 'a' exists, and recreates the name, then
a distinction with rebinding still holds, in the sense that explicit
rebinding is possible (including in non-local scope). But we lose the
side-effect of getting a NameError in case the programmer erroneously
types an existing name.

> Mark
>

------
la vida e estranya

From pyideas at rebertia.com  Thu Feb 5 19:51:04 2009
From: pyideas at rebertia.com (Chris Rebert)
Date: Thu, 5 Feb 2009 10:51:04 -0800
Subject: [Python-ideas] Making stars optional? (was: Making colons optional?)
In-Reply-To: 
References: 
Message-ID: <50697b2c0902051051m5d76d2f7q18123a2f510dd943@mail.gmail.com>

On Thu, Feb 5, 2009 at 10:30 AM, Bruce Leban wrote:
> In algebra, you don't have to put a multiplication sign in between two
> quantities that you want to multiply. I've seen beginning programmers write
> things like
>
> x = 3a + 4(b-c)
>
> instead of
>
> x = 3*a + 4*(b-c)
>
> Why should we require the stars when it's unambiguous what the first
> statement means?
Because there's a /very/ high likelihood that it was a typo, and per the
Zen, Python shouldn't guess in the face of ambiguity, and (likely) errors
should never pass silently. Further, if it is a typo, it's not
unambiguous enough that we can infer with certainty that multiplication
was intended; it's just as likely the programmer forgot to type the
operator, and there's only a 1 in 11 (or worse) chance (that's how many
binary operators I could come up with without looking at the manual) that
multiplication was indeed intended.

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com

From curt at hagenlocher.org  Thu Feb 5 20:04:51 2009
From: curt at hagenlocher.org (Curt Hagenlocher)
Date: Thu, 5 Feb 2009 11:04:51 -0800
Subject: [Python-ideas] Making stars optional? (was: Making colons optional?)
In-Reply-To: <50697b2c0902051051m5d76d2f7q18123a2f510dd943@mail.gmail.com>
References: 
	<50697b2c0902051051m5d76d2f7q18123a2f510dd943@mail.gmail.com>
Message-ID: 

On Thu, Feb 5, 2009 at 10:30 AM, Bruce Leban wrote:
> In algebra, you don't have to put a multiplication sign in between two
> quantities that you want to multiply. I've seen beginning programmers write
> things like
>
> x = 3a + 4(b-c)
>
> instead of
>
> x = 3*a + 4*(b-c)
>
> Why should we require the stars when it's unambiguous what the first
> statement means?

Sure, and given the following program:

a = 2
b = 4
print ab
-- Curt Hagenlocher curt at hagenlocher.org From santagada at gmail.com Thu Feb 5 20:07:12 2009 From: santagada at gmail.com (Leonardo Santagada) Date: Thu, 5 Feb 2009 17:07:12 -0200 Subject: [Python-ideas] Making colons optional? In-Reply-To: <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> Message-ID: <515259D2-9EFF-4432-8670-129362D1AA01@gmail.com> On Feb 5, 2009, at 4:12 PM, Riobard Zhan wrote: > I think omitting colons makes Python more elegant--more uniform, > less clutter. It's an itch every time I see a piece of Ruby code > with lots of def's and if's without trailing colons ... Could you explain to us how ruby does it? ps: Just remember that they do have an end in the end of suites (which in my opinion look worse than the colons...) -- Leonardo Santagada santagada at gmail.com From pyideas at rebertia.com Thu Feb 5 20:07:27 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Thu, 5 Feb 2009 11:07:27 -0800 Subject: [Python-ideas] Making colons optional? In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <871vudzfuf.fsf@benfinney.id.au> Message-ID: <50697b2c0902051107w6f383798vd6dfe78823f41564@mail.gmail.com> On Thu, Feb 5, 2009 at 8:05 AM, Riobard Zhan wrote: > > On 5-Feb-09, at 12:37 AM, Ben Finney wrote: >> >>> - We already have indentation to visually separate different levels >>> of code. Why bother with those extra colons at all? >> >> Because indentation also occurs for other reasons, a major example >> being continuations of previous lines. >> >> for line in foo: >> do_something( >> spam, line) >> >> That's three lines at differing indentation levels, but two >> statements. >> >> I find that the line-end colon is a strong visual indicator that a >> suite is being introduced, as contrasted with some other difference in >> indentation. 
> > Actually this is the first concrete example for colons. Thanks very much for > bringing it up, Ben! :) > > Here is a counter-example for the strong visual indicator cause. > > for some_list in some_collection: > do_something(some_list[1: > ], something_else) True; however, the parentheses and brackets are unbalanced and that immediately stands out, at least for me, so I see something odd is going on right away. You do have to admit that example is a bit contrived though. Move the `],` back onto the previous line and the code becomes perfectly clear. I would also defend colons on the grounds that they let particularly short statements be one-liners: def add(x,y): return x+y #this one especially comes up a lot in practice if x is None: x=42 for i in alist: print(i) with the added bonus that you're forced to indent if the body becomes multiline (unless you use semicolons that is, in which case you're destined to burn in the Nth circle of Hell for your sin against the BDFL (blessed be his holiness) ;-P). -1; colons enhance readability and are almost never forgotten after you get over the initial hurdle of learning a new language. Cheers, Chris -- Follow the path of the Iguana... http://rebertia.com From pyideas at rebertia.com Thu Feb 5 20:33:21 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Thu, 5 Feb 2009 11:33:21 -0800 Subject: [Python-ideas] Making colons optional? Syntactic alternative. In-Reply-To: <20090205132602.58368969@o> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <20090205132602.58368969@o> Message-ID: <50697b2c0902051133k6051a633qf5c9d1de9f43fcb@mail.gmail.com> On Thu, Feb 5, 2009 at 4:26 AM, spir wrote: > Le Wed, 4 Feb 2009 23:35:30 -0330, > Riobard Zhan wrote: > * Extend name binding format "name:value" to assignments. That is, write "a:1" instead of "a=1". This may avoid tons of misunderstandings (imo, '=' for assignment is a semantic plague, a major error/fault, a pedagogic calamity).
> * Which lets the "=" sign free for better use: use "=" for the semantics of equality, like in math and common interpretation learnt from school. > * Perhaps: use "==" for identity, instead of "is". I'm sorry but regarding these bullet points, C makes for a persuasive argument for not changing the meaning of those operators. Pretty much everyone in computing either knows well or is at least familiar with C and probably knows one of its syntactic descendants quite well, so gratuitous deviation from C violates the Principle of Least Surprise for anyone who already is a programmer. Newbies may trip over it, but they catch on eventually. If you did want to change the assignment operator, the colon seems visually a poor choice IMHO; it's too easily overlooked and `x : y` looks rather sparse on a line by itself. The other somewhat popular choices of assignment syntax that I've seen are x := y, x <- y, and let x = y. The first and last of those still use a =, albeit with something extra, and so aren't too much of a stretch. x <- y is particularly deviant and isn't readable (as in read it out loud) IMHO; you either have to read it right-to-left (which is unnatural) as "Take y and put it into x", or as "x has y placed into it" (which uses the passive voice and so does not fit well with an imperative language. Cheers, Chris -- Follow the path of the Iguana... http://rebertia.com From yaogzhan at gmail.com Thu Feb 5 20:38:49 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Thu, 5 Feb 2009 16:08:49 -0330 Subject: [Python-ideas] Making colons optional? 
In-Reply-To: <7528bcdd0902051106r24efc8c1h16930029bb5210bf@mail.gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <7528bcdd0902051106r24efc8c1h16930029bb5210bf@mail.gmail.com> Message-ID: <74155C36-D718-4C43-BDFC-9CCD115F3A89@gmail.com> On 5-Feb-09, at 3:36 PM, Andre Roberge wrote: > > >> Yes I agree with you that many people like colons. What bothers me >> is that >> some people dislike them, but not given the choice to avoid them. >> We don't >> like semicolons in Python, but what would stop a hard-core C users >> to end >> every statement with a semicolon? They have the choice. >> > > I have seen *very* few code samples where semi-colon were used - and > most were by people just learning Python (and familiar with some other > language). If it were up to me, semi-colon would not be allowed - > again, for consistency. Semicolons are bad but are allowed because you'll need them sometimes for one-liners. You can use colons too if you wish. Now the inconsistency is that semicolons are optional but colons are required. I did not propose to eliminate colons altogether; just make them optional so we have the choice. >> And I would also argue that many of those like colons not because >> they >> really feel colons improve readability, but that they have get used >> to >> colons in the first place. You like colons, I don't. How do you >> know another >> Python user will like them or not? By making trailing colons >> OPTIONAL, we >> can probably have the chance to field test. If people really think >> colons >> improve readability that much, they can still use them, just like >> we feel >> semicolons are line noise and avoid them if possible, even though we >> CAN use >> them. I don't think we will ever lose anything to make colons >> optional. >> > > The emphasis for Python is improved readability (*perhaps* at the > expense of a small extra burden when writing, i.e.
when adding > colons). I do find code more readable with colons (and have from day > 1 when I started learning Python). Apparently you don't. But, if you > are given the choice and write code without colons and I read it, I > know that it will be less readable for me. So I do disagree with the > opinion that we would would not ever lose anything by making colons > optional. Either they stay or they are removed - don't allow both. I agree readability counts. That's why I love Python. And that's also why I don't like colons--they are line noise. Semicolons improve readability; they tell you exactly where lines end (esp. in case of line continuation), in the same way colons tell you where suites end. What do you feel if you see a piece of Python code full of semicolons at the end of every statement? That will probably be the same feeling I have when facing colons at the end of every suite. And unfortunately, we allow both for semicolons. From yaogzhan at gmail.com Thu Feb 5 20:47:06 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Thu, 5 Feb 2009 16:17:06 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <515259D2-9EFF-4432-8670-129362D1AA01@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <515259D2-9EFF-4432-8670-129362D1AA01@gmail.com> Message-ID: <06586733-716E-4D17-A1B3-7E0DB0CDF421@gmail.com> On 5-Feb-09, at 3:37 PM, Leonardo Santagada wrote: > > On Feb 5, 2009, at 4:12 PM, Riobard Zhan wrote: > >> I think omitting colons makes Python more elegant--more uniform, >> less clutter. It's an itch every time I see a piece of Ruby code >> with lots of def's and if's without trailing colons ... > > > Could you explain to us how ruby does it? > > ps: Just remember that they do have an end in the end of suites > (which in my opinion look worse than the colons...) I don't like the end-end-end-end part of Ruby either. 
But they do have a better starting part IMO. def foo(var) if var == 10 print "Variable is 10" else print "Variable is something else" end end (Note: there is actually an optional "then" keyword allowed at the end of the "if" line.) From andre.roberge at gmail.com Thu Feb 5 20:53:35 2009 From: andre.roberge at gmail.com (Andre Roberge) Date: Thu, 5 Feb 2009 15:53:35 -0400 Subject: [Python-ideas] Making colons optional? Syntactic alternative. In-Reply-To: <50697b2c0902051133k6051a633qf5c9d1de9f43fcb@mail.gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <20090205132602.58368969@o> <50697b2c0902051133k6051a633qf5c9d1de9f43fcb@mail.gmail.com> Message-ID: <7528bcdd0902051153l505a71d3jcd6c6fc10cf8da5a@mail.gmail.com> On Thu, Feb 5, 2009 at 3:33 PM, Chris Rebert wrote: [snip] > The other somewhat popular choices of assignment syntax that I've seen > are x := y, x <- y, and let x = y. The first and last of those still > use a =, albeit with something extra, and so aren't too much of a > stretch. x <- y is particularly deviant and isn't readable (as in read > it out loud) IMHO; you either have to read it right-to-left (which is > unnatural) as "Take y and put it into x", or as "x has y placed into > it" (which uses the passive voice and so does not fit well with an > imperative language). Furthermore, the "take y and put it into x" might be an appropriate interpretation for C but not for Python. For Python it should be instead something like "Assign the name x to object y" which could be visually represented by x -> y instead of x <- y as you mention. André > > Cheers, > Chris > > -- > Follow the path of the Iguana...
> http://rebertia.com > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From yaogzhan at gmail.com Thu Feb 5 20:57:05 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Thu, 5 Feb 2009 16:27:05 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <50697b2c0902051107w6f383798vd6dfe78823f41564@mail.gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <871vudzfuf.fsf@benfinney.id.au> <50697b2c0902051107w6f383798vd6dfe78823f41564@mail.gmail.com> Message-ID: On 5-Feb-09, at 3:37 PM, Chris Rebert wrote: > On Thu, Feb 5, 2009 at 8:05 AM, Riobard Zhan > wrote: >> >> On 5-Feb-09, at 12:37 AM, Ben Finney wrote: >>> >>>> - We already have indentation to visually separate different levels >>>> of code. Why bother with those extra colons at all? >>> >>> Because indentation also occurs for other reasons, a major example >>> being continuations of previous lines. >>> >>> for line in foo: >>> do_something( >>> spam, line) >>> >>> That's three lines at differing indentation levels, but two >>> statements. >>> >>> I find that the line-end colon is a strong visual indicator that a >>> suite is being introduced, as contrasted with some other >>> difference in >>> indentation. >> >> Actually this is the first concrete example for colons. Thanks very >> much for >> bringing it up, Ben! :) >> >> Here is a counter-example for the strong visual indicator cause. >> >> for some_list in some_collection: >> do_something(some_list[1: >> ], something_else) > > True; however, the parentheses and brackets are unbalanced and that > immediately stands out, at least for me, so I see something odd is > going on right away. > You do have to admit that example is a bit contrived though. Move the > `],` back onto the previous line and the code becomes perfectly clear. Do you notice immediately that the left ( is unbalanced in your original example? 
:P I see it odd right away, too. Line continuations are not good examples to demonstrate the necessity of colons, because soon you will run into the problem of requiring semicolons at the end of every statement. I think Denis made it quite clear in a previous reply. do(some(), some_other(), some_more(complex=(True,)), and_final_call(egg=(1,2,3), zzzzz=False)); # end of statement is obvious (if semicolons are required) > I would also defend colons on the grounds that they let particularly > short statements be one-liners: > > def add(x,y): return x+y > > #this one especially comes up a lot in practice > if x is None: x=42 > > for i in alist: print(i) > > with the added bonus that you're forced to indent if the body becomes > multiline (unless you use semicolons that is, in which case you're > destined to burn in the Nth circle of Hell for your sin against the > BDFL (blessed be his holiness) ;-P). I think in my original proposal I stated very clearly that the colons are OPTIONAL and can be used for one-liners, just like semicolons. I did not propose to eliminate colons altogether.
The first and last of those still >> use a =, albeit with something extra, and so aren't too much of a >> stretch. x <- y is particularly deviant and isn't readable (as in read >> it out loud) IMHO; you either have to read it right-to-left (which is >> unnatural) as "Take y and put it into x", or as "x has y placed into >> it" (which uses the passive voice and so does not fit well with an >> imperative language. > > Furthermore, the "take y and put it into x" might be an appropriate > interpretation for C but not for Python. For Python it should be > instead something like > "Assign the name x to object y" which could be visually represented by > x -> y instead of x <- y as you mention. Indeed. Of course, I think that commits the equally cardinal sin of flipping the direction of the assignment operator, from variable-value to value-variable. :) Cheers, Chris -- Follow the path of the Iguana... http://rebertia.com From bruce at leapyear.org Thu Feb 5 21:47:59 2009 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 5 Feb 2009 12:47:59 -0800 Subject: [Python-ideas] Making stars optional? (was: Making colons optional?) In-Reply-To: References: <50697b2c0902051051m5d76d2f7q18123a2f510dd943@mail.gmail.com> Message-ID: I apologize for leaving the :-) out in my original post. Just to be clear: (1) I indeed have seen this mistake from beginning programmers and (2) I think they should get over it. I also don't think that = and == should be the same syntax and trust the compiler to figure out which one you mean. I also don't think spaces should be optional, despite the fact that no space probe has ever been lost as a result of that feature. :-( http://my.safaribooksonline.com/0131774298/ch02 http://catless.ncl.ac.uk/Risks/9.54.html --- Bruce On Thu, Feb 5, 2009 at 11:04 AM, Curt Hagenlocher wrote: > On Thu, Feb 5, 2009 at 10:30 AM, Bruce Leban wrote: > > In algebra, you don't have to put a multiplication sign in between two > > quantities that you want to multiply. 
I've seen beginning programmers > write > > things like > > > > x = 3a + 4(b-c) > > > > instead of > > > > x = 3*a + 4*(b-c) > > > > Why should we require the stars when it's unambiguous what the first > > statement means? > > Sure, and given the following program: > a = 2 > b = 4 > print ab > shouldn't we be able to print "8", given that the meaning of the > program is unambiguous? > That would be so cool! If any variable is undefined, break it up into smaller variables and if they're numbers multiply them and if they're strings concatenate them. Wow! > > > Ultimately, you have to balance ease-of-use against consistency -- > both because too much inconsistency can actually harm ease-of-use and > because "special cases" tend to combine in crazy ways to create > horrible edge cases. Where to draw the line is always a matter of > personal taste, but the Python language has consistently favored > consistency in its philosophy. > > -- > Curt Hagenlocher > curt at hagenlocher.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From curt at hagenlocher.org Thu Feb 5 21:51:23 2009 From: curt at hagenlocher.org (Curt Hagenlocher) Date: Thu, 5 Feb 2009 12:51:23 -0800 Subject: [Python-ideas] Making stars optional? (was: Making colons optional?) In-Reply-To: References: <50697b2c0902051051m5d76d2f7q18123a2f510dd943@mail.gmail.com> Message-ID: On Thu, Feb 5, 2009 at 12:47 PM, Bruce Leban wrote: > I apologize for leaving the :-) out in my original post. 
*blush* -- Curt Hagenlocher curt at hagenlocher.org From pyideas at rebertia.com Thu Feb 5 22:16:12 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Thu, 5 Feb 2009 13:16:12 -0800 Subject: [Python-ideas] binding vs rebinding In-Reply-To: <20090205142211.77ba2568@o> References: <20090205142211.77ba2568@o> Message-ID: <50697b2c0902051316h58c7b7c1r823da85842916029@mail.gmail.com> On Thu, Feb 5, 2009 at 5:22 AM, spir wrote: > Hello, > > I wonder why there is no difference in syntax between binding and rebinding. Obviously, the semantics is not at all the same, for humans as well as for the interpreter: > * Binding: create a name, bind a value to it. > * Rebinding: change the value bound to the name. > > I see several advantages for this distinction and no drawback. The first advantage, which imo is worthwhile enough, is to let syntax match semantics; as the distinction *makes sense*. > > A nice side-effect would be to allow detection of typographic or (human ;-) memory errors: > * When an error wrongly and silently creates a new name instead of launching a NameError exception. A distinct syntax for rebinding would prevent that. > * When an error wrongly and silently rebinds an existing name instead of launching a NameError exception. A distinct syntax for (first) binding would prevent that. > No need, I guess, to insist on the fact that such errors sometimes lead to long and difficult debugging precisely for they are silent. This, because in all cases "a=1" is a valid instruction, as there is no distinction between binding and rebinding. > > I suspect a further advantage may be to get rid of "global" and "nonlocal" declarations -- which, as I see it, do not at all fit the python way. I may be wrong on that, still it seems such declarations are necessary only because of the above distinction lacking. My rationale on this is: > * It is very common and helpful to allow a local variable being named identically as another one in an external scope.
> * There is no binding/rebinding distinction in python syntax. > * So that whenever a name appears on the left side of an assignment, inside a non-global scope, there is no way to know whether the programmer intends to create a local name or to access a possibly existing external name. > * To resolve this ambiguity, python adopts the rule of creating a local name. > * Thus, it becomes impossible to rebind an external name from a local scope. Which is still useful in rather rare, but relevant, use cases. > * So that 'global', and later 'nonlocal', declarations had to be introduced in python. > (I tried to be as clear and step-by-step as I can so that this reasoning can easily be refuted if ever it holds errors I cannot see.) > > It seems that if ever the second step would not hold, then there would be no reason for such declarations. Imagine that rebinding is spelt using ':='. Then, from a non-global scope: > * a=1 causes creation of a local name > * a:=1 rebinds a local name if exists, or rebinds an external name if exists (step-by-step up to module level scope), or else launches NameError. > There may be reasons why such a behaviour is not the best a programmer would expect: I wait for your comments. > > Obviously, for the sake of compatibility, this is more a base for discussion, if you find the topic interesting, than for a proposal for python 9000... It was proposed and rejected for Python 3000, so it's unlikely (though not impossible) to be changed. See PEP 3099 (http://www.python.org/dev/peps/pep-3099/ : "There will be no alternative binding operators such as :=") and http://mail.python.org/pipermail/python-dev/2006-July/066995.html I haven't read the thread to see what the reasoning was, but I trust Guido's judgement, Cheers, Chris -- Follow the path of the Iguana...
http://rebertia.com From ben+python at benfinney.id.au Thu Feb 5 22:31:58 2009 From: ben+python at benfinney.id.au (Ben Finney) Date: Fri, 06 Feb 2009 08:31:58 +1100 Subject: [Python-ideas] binding vs rebinding References: <20090205142211.77ba2568@o> Message-ID: <87skmsy3ht.fsf@benfinney.id.au> spir writes: > I wonder why there is no difference in syntax between binding and > rebinding. Obviously, the semantics is not at all the same, for > humans as well as for the interpreter: > * Binding: create a name, bind a value to it. > * Rebinding: change the value bound to the name. That's not obvious. The semantics could just as well be described as: * Binding: bind this name to that value. * Rebinding: bind this name to that value. If I claimed this semantic description as "obvious", I'd be just as wrong to do so. But this description functions very well to explain the semantics of these operations for me and others. > I see several advantages for this distinction and no drawback. The > first advantage, which imo is worthwhile enough, is to let syntax > match semantics; as the distinction *makes sense*. Since I see no sense in the distinction, I see the drawback of unnecessarily complicating the syntax. -- \ "I went to a garage sale. 'How much for the garage?' 'It's not | `\ for sale.'" --Steven Wright | _o__) | Ben Finney From steve at pearwood.info Thu Feb 5 22:35:53 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 06 Feb 2009 08:35:53 +1100 Subject: [Python-ideas] Making colons optional? In-Reply-To: <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> Message-ID: <498B5BB9.60909@pearwood.info> Riobard Zhan wrote: >> Pascal uses colons, but not for the exact same purpose as Python. Both >> languages use colons in similar ways to its use in English.
In >> particular, Python uses colons as a break between clauses: larger than >> a comma, smaller than a period. >> > > Pascal is my first language. It has been some years ago, so I cannot > remember the detail now. I checked wikipedia and did not find colons are > used after if's. Not sure if you mean declarations? I didn't say that Pascal uses colons after IFs. I explicitly said Pascal used colons "not for the exact same purpose as Python". > If so, I don't think > that is what we are discussing here; Java and C also use colons in > switch/case statements. AFAIK, Python is quite unique in requiring > trailing colons after if's, for's, and function/class definitions. A switch/case statement is equivalent to a series of if...elif... statements, so it would be inconsistent to require colons in a switch but not in if...elif. My point was that both languages (Pascal and Python) use colons in a way which is very familiar and standard to the English language, albeit different usages in the two languages. If newbies to either language find colons confusing, that's indicative of the general decline of educational standards. If people whose first language is not English have trouble with colons, then I sympathize, but then so much of Python is based on English-like constructs that colons will be the least of their problem. >>> - I find colons pretty annoying. >> ... >> >> I'm sorry you dislike colons, but I like them. >> > > Yes I agree with you that many people like colons. What bothers me is > that some people dislike them, but not given the choice to avoid them. > We don't like semicolons in Python, but what would stop a hard-core C > users to end every statement with a semicolon? Peer pressure. Everybody would laugh at their code and think they're foolish. > And I would also argue that many of those like colons not because they > really feel colons improve readability, but that they have get used to > colons in the first place. You like colons, I don't. 
How do you know > another Python user will like them or not? I don't really care. I'm sure that there are millions of programmers who don't like brackets around function calls, but we don't make them optional: myfunction x, y, z For that matter, commas in lists and tuples: myfunction x y z Flexibility of punctuation in human languages is a good thing, because it enables the writer to express subtle differences in semantics. There is a subtle difference between Guido is an excellent language designer: he knows what he is doing. and Guido is an excellent language designer, he knows what he is doing. But compared to human languages, the semantics expressed by computer languages are simple and unsubtle. Flexibility of punctuation hurts computer languages, not helps. > By making trailing colons > OPTIONAL, we can probably have the chance to field test. If people > really think colons improve readability that much, they can still use > them, just like we feel semicolons are line noise and avoid them if > possible, even though we CAN use them. I don't think we will ever lose > anything to make colons optional. Of course we do. It makes the language bigger and more complex. The parser has to be more complex. Who is going to write that, maintain that? Every time you write a def block you have to decide "colon or not?". Most people will standardize on one or the other, and the decision will go away, but they will still need to read other people's code, and then colon-haters will be unhappy, because they have to read other people's code containing colons, while colon-likers will be unhappy, because they have to read other people's code missing colons. Optional colons are the worst of both worlds, because it makes *everybody* unhappy. -- Steven From ben+python at benfinney.id.au Thu Feb 5 22:40:27 2009 From: ben+python at benfinney.id.au (Ben Finney) Date: Fri, 06 Feb 2009 08:40:27 +1100 Subject: [Python-ideas] Making colons optional?
References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> Message-ID: <87ocxgy33o.fsf@benfinney.id.au> Riobard Zhan writes: > On 5-Feb-09, at 7:31 AM, Steven D'Aprano wrote: > > > I'm sorry you dislike colons, but I like them. > > Yes I agree with you that many people like colons. What bothers me is > that some people dislike them, but not given the choice to avoid them. That argument doesn't address the point of the existing syntax. I (and presumably Steven) like the colons in code *when I have to read it*. If they are optional, and some significant proportion of coders stop using them to introduce a suite, then they entirely lose their strong association with "here comes a suite" that is the main benefit of having them as compulsory syntax. > We don't like semicolons in Python, but what would stop a hard-core > C users to end every statement with a semicolon? They have the > choice. Laziness (the good kind). Once someone discovers that they *don't have to* add the semicolons, and it doesn't affect the operation of their program, those semicolons will, I predict, become much less frequent. -- \ "Pinky, are you pondering what I'm pondering?" "I think so, | `\ Brain, but Tuesday Weld isn't a complete sentence." --_Pinky and | _o__) The Brain_ | Ben Finney From steve at pearwood.info Thu Feb 5 22:51:14 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 06 Feb 2009 08:51:14 +1100 Subject: [Python-ideas] binding vs rebinding In-Reply-To: <20090205142211.77ba2568@o> References: <20090205142211.77ba2568@o> Message-ID: <498B5F52.1080200@pearwood.info> spir wrote: > Hello, > > I wonder why there is no difference in syntax between binding and rebinding. Obviously, the semantics is not at all the same, for humans as well as for the interpreter: Denis, as you can see from the above line, your email breaks the Internet standard of using hard-line breaks within paragraphs.
This causes problems for other people. Please set your mail client to wrap text at 68, 70 or 72 characters. > * Binding: create a name, bind a value to it. > * Rebinding: change the value bound to the name. > > I see several advantages for this distinction and no drawback. The first advantage, which imo is worthwhile enough, is to let syntax match semantics; as the distinction *makes sense*. In Python, names are stored in namespaces, which are implemented as dictionaries. There is a nice correspondence between the syntax of namespaces and of dicts: x = 1 # create a new name and bind it to 1 x = 2 # rebind name to 2 del x # delete name mydict['x'] = 1 # create new key and bind it to 1 mydict['x'] = 2 # rebind key to 2 del mydict['x'] # delete key Also, your suggestion is conceptually the same as requiring declarations: x = 1 # declare x with value 1 x := 2 # assign to x Finally, what should we do here? if flag: x = 2 print foo(x) x = 3 # is this a rebinding or a new binding? print bar(x) -- Steven From tjreedy at udel.edu Thu Feb 5 22:54:12 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 05 Feb 2009 16:54:12 -0500 Subject: [Python-ideas] binding vs rebinding In-Reply-To: <20090205142211.77ba2568@o> References: <20090205142211.77ba2568@o> Message-ID: spir wrote: > Hello, > > I wonder why there is no difference in syntax between binding and rebinding. Obviously, the semantics is not at all the same, for humans as well as for the interpreter: > * Binding: create a name, bind a value to it. > * Rebinding: change the value bound to the name. Python has several name-binding statements other than assignment. 1. augmented assignments 2. for loops (mentioned by Mark already) 3. import statements (several variations) 4. class statements 5. def statements Any proposal to adjust one should uniformly adjust all.
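Terry's list is easy to verify: running each statement form in a fresh namespace shows that all of them bind ordinary names, just as plain assignment does (a sketch using exec into a dict; the names x, i, m, C and f are illustrative only).

```python
# Each of the name-binding statement forms Terry lists, executed in a
# fresh namespace dict so we can inspect exactly which names got bound.
src = """
x = 1
x += 1                 # augmented assignment rebinds x
for i in range(3):     # for binds the loop variable i
    pass
import math as m       # import binds m
class C:               # class statement binds C
    pass
def f():               # def statement binds f
    return 42
"""

ns = {}
exec(src, ns)

print(sorted(n for n in ("C", "f", "i", "m", "x") if n in ns))
# ['C', 'f', 'i', 'm', 'x'] -- all five forms produced plain dict entries
print(ns["x"], ns["i"], ns["f"]())   # 2 2 42
```

Every one of them is "bind this name to that value" under the hood, which is why a binding/rebinding split would have to touch all five forms, not just `=`.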
All binding statements currently have the same meaning: if name is currently bound: unbind it bind it to the indicated object If not currently bound, the unbind step is obviously skipped. Simple to understand. Requiring the programmer to indicate whether the unbind step *must* be done or not makes more work and pain for the programmer. It will make certain editing operations much harder by increasing the context sensitivity of code. Suppose I see code like the following: x = (a*a + b*b) / (1 - a*a - b*b) and I realize I can improve efficiency by pulling out the subexpression: tem = a*a + b*b x = tem / (1 - tem) Under this proposal, I would have to care whether tem had been (irrelevantly) used before or not and write the above differently depending on which. Similarly I would also have to care if tem were (irrelevantly) used after and possibly revise the later code depending on whether it were also used previously. What a bother! This will make for more new bugs than the proposal might eliminate. There really is some virtue to the simple-minded design most languages use. Terry Jan Reedy From tjreedy at udel.edu Thu Feb 5 22:57:19 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 05 Feb 2009 16:57:19 -0500 Subject: [Python-ideas] Add a cryptographic hash (e.g. SHA1) of source to Python Compiled objects? In-Reply-To: <18826.60508.910995.447316@panix5.panix.com> References: <18824.5343.933820.298039@panix5.panix.com> <7C24D4B5998D43AA9289972758E9B426@RaymondLaptop1> <18825.5815.559562.370640@panix5.panix.com> <18825.26238.914919.705377@panix5.panix.com> <18826.60508.910995.447316@panix5.panix.com> Message-ID: rocky at gnu.org wrote: > Brett Cannon writes: > > marshal.dumps(compile(open('file.py').read(), 'file.py', 'exec')) == > > open('file.pyc').read()[8:] > > Thanks. > > Alas, I can't see how in practice this will be generally useful.
> > Again, here is the problem: I have some sort of compiled Python file and something which I think is the source code for it. I want to verify that it is. > > (In a debugger it means we can warn that what you are seeing is not what's being run. However I do not believe this is the only situation where getting the answer to this question is helpful/important.) > > The solution above is very sensitive to knowing the name of the file > (files?) used in compilation because those are stored in the > co_filename portion of the code object. Instead of comparing marshal strings, custom compare the code objects, ignoring .co_filename. tjr From steve at pearwood.info Thu Feb 5 23:03:42 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 06 Feb 2009 09:03:42 +1100 Subject: [Python-ideas] Making stars optional? (was: Making colons optional?) In-Reply-To: References: Message-ID: <498B623E.1040206@pearwood.info> Bruce Leban wrote: > In algebra, you don't have to put a multiplication sign in between two > quantities that you want to multiply. I've seen beginning programmers > write things like > > x = 3a + 4(b-c) > > instead of > > x = 3*a + 4*(b-c) > > Why should we require the stars when it's unambiguous what the first > statement means? Because it's never unambiguous. If I write x = 3a, does that mean that I've accidentally left out the + sign I intended, or that it is a multiplication? Worse, if I write x = a3, is that an assignment of a*3 or a variable named "a3"? Similarly for x = ab. The tradition in mathematics is to use one-letter variable names, so a variable called "ab" is so rare as to be virtually non-existent, but this obviously doesn't hold for programming. Mathematicians get away with this sort of ambiguity because they are writing for other mathematicians, not for a computer.
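Steven's examples can be checked directly against the interpreter (a sketch in modern Python 3): "3a" does not even tokenize, and "a3" is a single ordinary identifier, so there is nothing for a parser to split.

```python
# Sketch: how the interpreter actually treats Steven's examples.
a, b = 2, 4
a3 = 30   # a3 is one identifier, not a * 3

try:
    compile("x = 3a", "<example>", "exec")
except SyntaxError:
    print("x = 3a is a SyntaxError, not an implicit 3 * a")

print(eval("a3"))   # prints 30: the whole name is looked up as-is
```

Supporting juxtaposed multiplication would mean guessing how to carve such tokens apart, which is exactly the ambiguity being objected to.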
Because mathematical proofs rely on a sequence of equations, not just a single statement, the ambiguity can be resolved:

    y = a(b+c) - ac    # does this mean a+() or a*() or something else?
    y = ab + ac - ac   # ah, it must have been a*()
    y = ab

-- 
Steven

From tjreedy at udel.edu Thu Feb 5 23:01:37 2009
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 05 Feb 2009 17:01:37 -0500
Subject: [Python-ideas] Making stars optional? (was: Making colons optional?)
In-Reply-To: 
References: <50697b2c0902051051m5d76d2f7q18123a2f510dd943@mail.gmail.com>
Message-ID: 

Curt Hagenlocher wrote:
> On Thu, Feb 5, 2009 at 12:47 PM, Bruce Leban wrote:
>> I apologize for leaving the :-) out in my original post.
>
> *blush*

No need to blush, really. This question really has been asked by naive newbies who did not notice (as you pointed out) that the no-star math notation *depends* on single-char var names.

From leif.walsh at gmail.com Thu Feb 5 23:06:59 2009
From: leif.walsh at gmail.com (Leif Walsh)
Date: Thu, 5 Feb 2009 17:06:59 -0500
Subject: [Python-ideas] Making colons optional? Syntactic alternative.
In-Reply-To: <20090205132602.58368969@o>
References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <20090205132602.58368969@o>
Message-ID: 

2009/2/5 spir :
> Le Wed, 4 Feb 2009 23:35:30 -0330,
> Riobard Zhan a écrit :
>> Hi everybody,
>>
>> I'm proposing to make colons optional.

-1. I like the colons. They're not strictly necessary, but defining functions like

    >>> def foo(x)
    ...     return x+1

just looks sloppy to me. The colon helps show it's a definition. Perhaps this is my anglocentrism (sp?) showing, as in "Here is my function: ...". Native English speakers will notice how helpful that colon is.

> [snip]
>
> Starting from this point, and looking for clarity and consistency, my wished syntactic revolution ;-) reads as follows:
> * Get rid of ':' at end of headlines.
> * Allow only single-instruction blocks (suites) for one-liners: "if cond; do_that".
Or maybe no separator at all is necessary here for unambiguous parsing? Or replace ':' or ';' by 'then'. This would also allow if...then...else one-liners. Alternatively, make newline+indent compulsery. This is anyway often recommended in style guidelines. I sort of like this. One-liners should be doable with "if cond then stmt else stmt" instead of "if cond: stmt". I just think it's prettier; I can't really prove this or try to convince many people. Another option would be to add "if cond then stmt else stmt", and also require a newline after a colon, though this would mentally separate the two forms of 'if'. > * Extend name binding format "name:value" to assignments. That is, write "a:1" instead of "a=1. This may avoid tons of misunderstandings (imo, '=' for assignment is a semantic plague, a major error/fault, a pedagogic calamity). Eww. I'll always keep ':=' and '<-' close to my heart, but I can't envision ':' as assignment. > * Which lets the "=" sign free for better use: use "=" for the semantics of equality, like in math and common interpretation learnt from school. Yeah, yeah, yeah...the day we finally get rid of the throngs of programmers complaining that assignment is too many characters for their weak little fingers to type will also be the day we finally get 'import soul' working correctly, and never have to program again. If only.... > * Perhaps: use "==" for identity, instead of "is". I've always been a fan of 'is'. I think it looks nice. -- Cheers, Leif From leif.walsh at gmail.com Thu Feb 5 23:08:24 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Thu, 5 Feb 2009 17:08:24 -0500 Subject: [Python-ideas] Making colons optional? 
In-Reply-To: <498B1CDB.3000804@scottdial.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498B1CDB.3000804@scottdial.com> Message-ID: 2009/2/5 Scott Dial : > Christian Heimes wrote: >> def method(self, some, very, long, method, >> going, over, lots, >> and, lots, >> and, lots, of, lines): >> pass >> > > This example line-continues via parentheses, so why isn't the > right-paren enough? Because I can sweep the code visually for colons very easily. Consider: >>> def method(self, some, very, long, method, ... going, over, lots, of, ... lines, with, some=(default, argument, ... lists, that, are), also=(continued), ... and=None, some=None, more=1): ... pass We could keep getting more and more complicated, certainly, but I think I've made my point, and I'm probably not going to convince you. >> if (some() and some_other() or some_more(complex=(True,)) >> and a_final_call(egg=(1,2,3))): >> do_something() > > This example uses the same mechanism as above. BTW, I tend to indent > this as: > > if (some() and some_other() or some_more(complex=(True,)) > and a_final_call(egg=(1,2,3))): > do_something() > > With or without the colon, and it's more readable than your version > (IMHO), and clearly the colon provides no aide to it. For your definition of 'clearly', perhaps. I get a lot of help out of that colon. >> You see a line starting with "if" but not ending with a colon. You know >> for sure that you have to search for a trailing colon in order to find >> the end of a very long "if" line. > > I'd also add that every C programmer has dealt with this before with > single-statement if clauses that require no braces. This is after all > the reason why I indent line-continued test expression the way I do.. This is why good C programmers always use braces around one-liners. :P >> Yes, the colon is extra noise but it's the kind of good noise that makes >> life more joyful like the noise of rain on a roof. 
Did you notice that
>> I'm using a colon in my regular postings, too? I've used two colons to
>> separate my text from the examples. Colon separators are natural to me.
>
> All said, I would prefer to keep the colons as well.. I like regularity
> of colon<=>suite equilibrium. Even if the splicing operator spits in the
> face of it, rarely do you see [x:y] line-continued to leave a naked
> colon at the end of the line.

If I were code-reviewing something with [x: at the end of a line, I'd fire the programmer instantly. There's no excuse for that kind of unreadability.

-- 
Cheers, Leif

From rocky at gnu.org Thu Feb 5 23:40:34 2009
From: rocky at gnu.org (rocky at gnu.org)
Date: Thu, 5 Feb 2009 17:40:34 -0500
Subject: [Python-ideas] Add a cryptographic hash (e.g SHA1) of source toPython Compiled objects?
In-Reply-To: 
References: <18824.5343.933820.298039@panix5.panix.com> <7C24D4B5998D43AA9289972758E9B426@RaymondLaptop1> <18825.5815.559562.370640@panix5.panix.com> <18825.26238.914919.705377@panix5.panix.com> <18826.60508.910995.447316@panix5.panix.com>
Message-ID: <18827.27362.801325.257186@panix5.panix.com>

Terry Reedy writes:
>
> Instead of comparing marshal strings, custom compare the code objects,
> ignoring .co_filename.
>
> tjr

Alas, I suspect going down this path will lead to plugging more and more leaks. It is not just co_filename that might need ignoring, possibly artifacts from __file__ as well. When I run this Python program:

    import marshal
    print marshal.dumps(compile(open(__file__).read(), __file__, 'exec'))

which I store in "/tmp/foo.py", and I look at the output, I see the string "/tmp/foo.py". No doubt this comes from wherever the value of __file__ is stored, which seems computed at compile time. So one would probably need to ignore the values of __file__ variables, wherever that is stored.

That said, if someone can write such a program I'd appreciate it.
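[Editor's note: Terry's suggestion — compare code objects attribute by attribute while skipping co_filename, recursing into nested code objects — can be sketched roughly as follows. This is only an illustration, not the robust program rocky asked for: the list of co_* attributes compared is an assumption (it deliberately skips the filename- and line-number-related fields), it is written in modern Python syntax rather than the 2.x of this thread, and a real version would have to track the attribute set across Python versions.]

```python
import types

# Attributes assumed sufficient for comparison; co_filename,
# co_firstlineno and the line-number table are deliberately ignored,
# since those are what differ when the same source is compiled under
# two different file names.
CODE_ATTRS = ('co_argcount', 'co_nlocals', 'co_stacksize', 'co_flags',
              'co_code', 'co_names', 'co_varnames', 'co_name')

def code_equal(c1, c2):
    """Compare two code objects, ignoring where they were compiled from."""
    for attr in CODE_ATTRS:
        if getattr(c1, attr) != getattr(c2, attr):
            return False
    # co_consts may contain nested code objects (function bodies),
    # which carry their own co_filename and must be compared recursively.
    if len(c1.co_consts) != len(c2.co_consts):
        return False
    for a, b in zip(c1.co_consts, c2.co_consts):
        if isinstance(a, types.CodeType) and isinstance(b, types.CodeType):
            if not code_equal(a, b):
                return False
        elif a != b:
            return False
    return True

# Same source, two different (hypothetical) file names: only
# co_filename and friends differ, so the codes compare equal.
src = "def f(x):\n    return x + 1\n"
same = code_equal(compile(src, '/tmp/foo.py', 'exec'),
                  compile(src, '/elsewhere/bar.py', 'exec'))
print(same)
```

Unlike the marshal-string comparison quoted earlier in the thread, this never serializes the filename at all, so the "__file__ leak" rocky observed does not arise for the compile-time case.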
From rocky at gnu.org Fri Feb 6 04:38:01 2009 From: rocky at gnu.org (rocky at gnu.org) Date: Thu, 5 Feb 2009 22:38:01 -0500 Subject: [Python-ideas] Add a cryptographic hash (e.g SHA1) of source toPython Compiled objects? In-Reply-To: References: <18824.5343.933820.298039@panix5.panix.com> <7C24D4B5998D43AA9289972758E9B426@RaymondLaptop1> <18825.5815.559562.370640@panix5.panix.com> <18825.26238.914919.705377@panix5.panix.com> <9bfc700a0902040218m3c95a620sced72084e76c2ad4@mail.gmail.com> Message-ID: <18827.45209.826238.805391@panix5.panix.com> Brett Cannon writes: > On Wed, Feb 4, 2009 at 02:18, Arnaud Delobelle wrote: > > 2009/2/4 : > > > >> There's also the mtime that needs to be ignored mentioned in prior > >> posts. And is there a table which converts a magic number version back > >> into a string with the Python version number? Thanks. > > > > You can look at Python/import.c, near the top of the file. > > The other option to see how all of this works is importlib as found in > the py3k branch. That's in pure Python so it's easier to follow. > > -Brett > Sorry for the delayed response - I finally had a chance to check out the py3k code and look. Perhaps I'm missing something. Although there is some really cool, well-written and neat Python code there (and some of the private methods there seem to me like they should public and somewhere else, perhaps in os or os.path), I don't see a table mapping magic numbers to a string containing a Python version as you would find when running "python -V" and that's what was kind of asked for. As Arnaud mentioned, Python/import.c has this magic-number mapping in comments near the top of the file. Of course one could take those comments and turn it into a dictionary, but I was hoping Python had such a dictionary/function built in already since needs to be maintained along with changes to the magic number. Thanks. From stephen at xemacs.org Fri Feb 6 05:38:46 2009 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Fri, 06 Feb 2009 13:38:46 +0900 Subject: [Python-ideas] Making colons optional? In-Reply-To: <498B1CDB.3000804@scottdial.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498B1CDB.3000804@scottdial.com> Message-ID: <87vdrodvs9.fsf@xemacs.org> Scott Dial writes: > Christian Heimes wrote: > > def method(self, some, very, long, method, > > going, over, lots, > > and, lots, > > and, lots, of, lines): > > pass > > > > This example line-continues via parentheses, so why isn't the > right-paren enough? My eyes are going; I no longer can see how many open parens are there without moving focus. The colon is a local indicator that allows me to avoid backtracking. > > if (some() and some_other() or some_more(complex=(True,)) > > and a_final_call(egg=(1,2,3))): > > do_something() > > This example uses the same mechanism as above. Except that Python syntax allows you to omit one level of parentheses and still continue across lines, which you couldn't do without the colon. In this example I probably couldn't count the parentheses correctly even if I could see them all without moving eye focus. I love Lisp, but I'm also very glad that Python is not Lisp. From greg.ewing at canterbury.ac.nz Fri Feb 6 07:21:21 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 06 Feb 2009 19:21:21 +1300 Subject: [Python-ideas] Making stars optional? (was: Making colons optional?) In-Reply-To: References: Message-ID: <498BD6E1.8080407@canterbury.ac.nz> Bruce Leban wrote: > > x = 3a + 4(b-c) > > Why should we require the stars when it's unambiguous what the first > statement means? Indeed. Quite obviously the programmer meant to call the number 4 with argument b - c there. :-) I believe that one of HP's programmable calculator/ computer thingies had a language that let you write implied multiplications like that, but it only had single-letter variable names and no function calls, so there wasn't so much of a problem. 
In the next model you were allowed multi-char variable names, so they had to drop that feature. BTW, I think in Icon you can actually call numbers, although I forget what weird-assed thing it means just at the moment. -- Greg From yaogzhan at gmail.com Fri Feb 6 07:42:34 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Fri, 6 Feb 2009 03:12:34 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <498B5BB9.60909@pearwood.info> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <498B5BB9.60909@pearwood.info> Message-ID: <178CBE16-379E-4BD9-9394-1460CFA79A84@gmail.com> On 5-Feb-09, at 6:05 PM, Steven D'Aprano wrote: > A switch/case statement is equivalent to a series of if...elif... > statements, so it would be inconsistent to require colons in a > switch but not in if...elif. Python does not have switch/case statements. Java/C does not have significant indentation. Do you mean Java/C should also use colons after if's, given that they use colons after case's? > My point was that both languages (Pascal and Python) use colons in a > way which is very familiar and standard to the English language, > albeit different usages in the two languages. If newbies to either > language find colons confusing, that's indicative of the general > decline of educational standards. If people whose first language is > not English have trouble with colons, then I sympathize, but then so > much of Python is based on English-like constructs that colons will > be the least of their problem. I did not notice anybody complained about colons because their first language is not English. Why bring it up? Yes, colons are natural in English. It is even more natural in English to end sentences with periods. Do you want to do that in Python? Erlang does that, and it's ugly. 
you cannot simply copy what we do in English to programming languages after all we do not carriage-return and indent our sentences like Python >> Yes I agree with you that many people like colons. What bothers me >> is that some people dislike them, but not given the choice to avoid >> them. We don't like semicolons in Python, but what would stop a >> hard-core C users to end every statement with a semicolon? > > Peer pressure. Everybody would laugh at their code and think they're > foolish. Same for semicolons, I would laugh and think it's foolish to type colons when we have to carriage-return and indent right after them anyway. > Flexibility of punctuation hurts computer languages, not helps. So we should really forbid x, y = m, n and instead force (x, y) = (m, n) >> By making trailing colons OPTIONAL, we can probably have the chance >> to field test. If people really think colons improve readability >> that much, they can still use them, just like we feel semicolons >> are line noise and void them if possible, even though we CAN use >> them. I don't think we will ever lose anything to make colons >> optional. > > Of course we do. It makes the language bigger and more complex. The > parser has to be more complex. Who is going to write that, maintain > that? Do you really think making colons optional makes Python bigger and more complex? One less thing to worry about in addition to indentation makes the parser more complex? > Every time you write a def block you have to decide "colon or not?". > Most people will standardize on one or the other, and the decision > will go away, but they will still need to read other people's code, > and then colon-haters will be unhappy, because they have to read > other people's code containing colons, while colon-likers will be > unhappy, because they have to read other people's code missing colons. > > Optional colons are the worst of both worlds, because it makes > *everybody* unhappy. 
"Every time you start a new line you have to decide "semicolon or not?". Most people will standardize on one or the other, and the decision will go away, but they will still need to read other people's code, and then semicolon-haters will be unhappy, because they have to read other people's code containing semicolons, while semicolons- likers will be unhappy, because they have to read other people's code missing semicolons. Optional semicolons are the worst of both worlds, because it makes *everybody* unhappy. " Does that happen? I would argue if colons are made optional, in the long run we will treat them like dinosaurs. If you disagree, think this: try convincing hard-core C users that they should really get rid of semicolons. From yaogzhan at gmail.com Fri Feb 6 07:42:36 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Fri, 6 Feb 2009 03:12:36 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <87ocxgy33o.fsf@benfinney.id.au> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> Message-ID: <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> On 5-Feb-09, at 6:10 PM, Ben Finney wrote: > Riobard Zhan writes: > >> On 5-Feb-09, at 7:31 AM, Steven D'Aprano wrote: >> >>> I'm sorry you dislike colons, but I like them. >> >> Yes I agree with you that many people like colons. What bothers me is >> that some people dislike them, but not given the choice to avoid >> them. > > That argument doesn't address the point of the existing syntax. I (and > presumably Steven) like the colons in code *when I have to read it*. > > If they are optional, and some significant proportion of coders stop > using them to introduce a suite, then they entirely lose their strong > association with ?here comes a suite? that is the main benefit of > having them as complulsory syntax. 
Your strong association with "here comes a suite" should come from indentation, that's how Python works. Or you should fallback to opening and ending braces like Java/C (or even old school begin-end keywords) if you fail to do so. >> We don't like semicolons in Python, but what would stop a hard-core >> C users to end every statement with a semicolon? They have the >> choice. > > Laziness (the good kind). Once someone discovers that they *don't have > to* add the semicolons, and it doesn't affect the operation of their > program, those semicolons will, I predict, become much less frequent. "Once someone discovers that they *don't have to* add the colons, and it doesn't affect the operation of their program, those colons will, I predict, become much less frequent. " Thank God, finally we are on the right track. Making colons optional is just the first step to kill both semicolons and colons all together. From ben+python at benfinney.id.au Fri Feb 6 08:17:45 2009 From: ben+python at benfinney.id.au (Ben Finney) Date: Fri, 06 Feb 2009 18:17:45 +1100 Subject: [Python-ideas] Making colons optional? References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> Message-ID: <87skmsvxt2.fsf@benfinney.id.au> Riobard Zhan writes: > On 5-Feb-09, at 6:10 PM, Ben Finney wrote: > > > If they are optional, and some significant proportion of coders > > stop using them to introduce a suite, then they entirely lose > > their strong association with ?here comes a suite? that is the > > main benefit of having them as complulsory syntax. > > Your strong association with "here comes a suite" should come from > indentation, that's how Python works. We're going around in circles: I've already demonstrated that there is plenty of indentation changes in Python code that *isn't* associated with here-comes-a-suite. 
> Or you should fallback to opening and ending braces like Java/C (or > even old school begin-end keywords) if you fail to do so. Why? I already have indentation plus here-comes-a-suite colons in Python. -- \ ?It's up to the masses to distribute [music] however they want | `\ ? The laws don't matter at that point. People sharing music in | _o__) their bedrooms is the new radio.? ?Neil Young, 2008-05-06 | Ben Finney From greg.ewing at canterbury.ac.nz Fri Feb 6 06:52:34 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 06 Feb 2009 18:52:34 +1300 Subject: [Python-ideas] Making colons optional? In-Reply-To: <06586733-716E-4D17-A1B3-7E0DB0CDF421@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <515259D2-9EFF-4432-8670-129362D1AA01@gmail.com> <06586733-716E-4D17-A1B3-7E0DB0CDF421@gmail.com> Message-ID: <498BD022.4080508@canterbury.ac.nz> Riobard Zhan wrote: > I don't like the end-end-end-end part of Ruby either. But they do have > better starting part IMO. The really annoying thing about ends in Ruby is that if you leave one out, you don't get an error until the very end of the file, giving you no clue *where* the missing 'end' is... That sort of thing never seems to happen in Python, thankfully. I don't think this issue has anything to do with semicolons, though. -- Greg From scott+python-ideas at scottdial.com Fri Feb 6 08:23:47 2009 From: scott+python-ideas at scottdial.com (Scott Dial) Date: Fri, 06 Feb 2009 02:23:47 -0500 Subject: [Python-ideas] Making colons optional? 
In-Reply-To: <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> Message-ID: <498BE583.4040701@scottdial.com> Riobard Zhan wrote: > Your strong association with "here comes a suite" should come from > indentation, that's how Python works. Your argument makes no sense, since an indention does not automatically indicate a suite. We as programmers regularly indent line-continuations for the sake of code readability, and we rarely break lines after a colon. This is a certainly a style and not a law, but it's *the* style of Python, codified by PEP 8: """ The preferred way of wrapping long lines is by using Python's implied line continuation inside parentheses, brackets and braces. If necessary, you can add an extra pair of parentheses around an expression, but sometimes using a backslash looks better. *Make sure to indent the continued line appropriately.* The preferred place to break around a binary operator is *after* the operator, not before it. """ Although the binary operator instruction might lead to breaking "[x:y]" into "[x:\ny]", many in the community have rejected that as good form. Perhaps it should be noted in the PEP 8 guidelines that its bad. Presuming that is accepted practice, one can easily assert that colons at the end of a line in (well-written) code precede suites. On the other hand, you have no hope of making such an assertion about indentions. > Or you should fallback to opening > and ending braces like Java/C (or even old school begin-end keywords) > if you fail to do so. Huh? You want to add more line noise? The colon is a pleasant compromise between the "please give me some indicator that is visual" and the "please don't make me type extra characters" crowds. 
Your argument that people who don't want to type them shouldn't have to doesn't hold up when you consider that other people have to read the code, and they are not given the option to put the colons back in upon reading it. And there is the simple fact that a piece of code will be read more often than it is ever written. Does your burden outweigh all of the potential readers of your code?

-- 
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu

From leif.walsh at gmail.com Fri Feb 6 08:40:27 2009
From: leif.walsh at gmail.com (Leif Walsh)
Date: Fri, 6 Feb 2009 02:40:27 -0500
Subject: [Python-ideas] Making colons optional?
In-Reply-To: <498BE583.4040701@scottdial.com>
References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com>
Message-ID: 

> [noise]

Sounds like there are a lot of fevered egos on both sides of the debate. Personally, I like colons. I'm sure if python forbade them, I'd stop using it. That said, the folks on the other side seem to have a good argument in that "semicolons are optional; [sic] why aren't colons?" I don't at all agree with the people calling them line noise or saying that they hurt readability (seriously, what?), but it seems sensible to allow the option.

After all, if you have to read someone's code, there's a decent chance that you can use a style guide to force them to use colons. It's also probably not hard to hack the parser to create a tool that simply adds or removes colons to or from old code to make it conform to a new style guide, if this were necessary.

In short, the idiots that are against colons might be idiots, but at least they seem to be more fair.
-- Cheers, Leif From steve at pearwood.info Fri Feb 6 08:49:12 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 06 Feb 2009 18:49:12 +1100 Subject: [Python-ideas] Making colons optional? In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> Message-ID: <498BEB78.5000400@pearwood.info> Leif Walsh wrote: > In short, the idiots that are against colons might be idiots, but at > least they seem to be more fair. "Fairness" is not a language design principle. -- Steven From stephen at xemacs.org Fri Feb 6 09:02:52 2009 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 06 Feb 2009 17:02:52 +0900 Subject: [Python-ideas] Making colons optional? In-Reply-To: <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> Message-ID: <87ocxgdmc3.fsf@xemacs.org> Riobard Zhan writes: > "Once someone discovers that they *don't have to* add the colons, and > it doesn't affect the operation of their program, those colons will, I > predict, become much less frequent. " > > Thank God, finally we are on the right track. Making colons optional > is just the first step to kill both semicolons and colons all together. I think the language you are looking for is called "Haskell". From leif.walsh at gmail.com Fri Feb 6 09:19:53 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Fri, 6 Feb 2009 03:19:53 -0500 Subject: [Python-ideas] Making colons optional? 
In-Reply-To: <498BEB16.4020608@scottdial.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <498BEB16.4020608@scottdial.com> Message-ID: On Fri, Feb 6, 2009 at 2:47 AM, Scott Dial wrote: > I don't think reducing my post to "[noise]" or calling people idiots is > at all beneficial. I'd appreciate it if you toned down the condescension > in the future. Ahh. I accidentally listed you as the To: recipient. I meant [noise] to be the thread in general, not your contribution. Sorry about that. The 'idiots' comment was meant tongue-in-cheek. Didn't mean for anyone to take offense. Let's everyone be happy, ok? -- Cheers, Leif From leif.walsh at gmail.com Fri Feb 6 09:20:51 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Fri, 6 Feb 2009 03:20:51 -0500 Subject: [Python-ideas] Making colons optional? In-Reply-To: <498BEB78.5000400@pearwood.info> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <498BEB78.5000400@pearwood.info> Message-ID: On Fri, Feb 6, 2009 at 2:49 AM, Steven D'Aprano wrote: > "Fairness" is not a language design principle. Hopefully it would be a principle on which discussion could be based. I was hoping more that the colon-supporters would come up with a good counter to that argument, because I can't. -- Cheers, Leif From denis.spir at free.fr Fri Feb 6 10:11:30 2009 From: denis.spir at free.fr (spir) Date: Fri, 6 Feb 2009 10:11:30 +0100 Subject: [Python-ideas] Making colons optional? 
In-Reply-To: <87ocxgy33o.fsf@benfinney.id.au> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> Message-ID: <20090206101130.7688c3ce@o> Le Fri, 06 Feb 2009 08:40:27 +1100, Ben Finney a ?crit : > > We don't like semicolons in Python, but what would stop a hard-core > > C users to end every statement with a semicolon? They have the > > choice. > > Laziness (the good kind). Once someone discovers that they *don't have > to* add the semicolons, and it doesn't affect the operation of their > program, those semicolons will, I predict, become much less frequent. Ditto for colons!
Strange that you do not realize that all arguments pro / against semicolons apply to colons as well: they have the same syntactic position and the same semantics. So the clear, consistent choices are:

* both to trash
* both compulsory
* both optional

My opinion: the rest is blahblah confusing habits and good design. Saying "I prefer this because I'm used to it" is fully respectable. This does not mean that other options are weaker design choices.
Denis ------ la vida e estranya From denis.spir at free.fr Fri Feb 6 10:18:41 2009 From: denis.spir at free.fr (spir) Date: Fri, 6 Feb 2009 10:18:41 +0100 Subject: [Python-ideas] Making colons optional? In-Reply-To: <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> Message-ID: <20090206101841.15604ff0@o> Le Fri, 6 Feb 2009 03:12:36 -0330, Riobard Zhan a ?crit : > > Laziness (the good kind). Once someone discovers that they *don't have > > to* add the semicolons, and it doesn't affect the operation of their > > program, those semicolons will, I predict, become much less frequent. > > "Once someone discovers that they *don't have to* add the colons, and > it doesn't affect the operation of their program, those colons will, I > predict, become much less frequent. " > > Thank God, finally we are on the right track. Making colons optional > is just the first step to kill both semicolons and colons all together. Probably... Denis ------ la vida e estranya From denis.spir at free.fr Fri Feb 6 10:21:19 2009 From: denis.spir at free.fr (spir) Date: Fri, 6 Feb 2009 10:21:19 +0100 Subject: [Python-ideas] Making colons optional? In-Reply-To: <498BE583.4040701@scottdial.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> Message-ID: <20090206102119.243ae6f0@o> Le Fri, 06 Feb 2009 02:23:47 -0500, Scott Dial a ?crit : > Riobard Zhan wrote: > > Your strong association with "here comes a suite" should come from > > indentation, that's how Python works. > > Your argument makes no sense, since an indention does not automatically > indicate a suite. 
We as programmers regularly indent line-continuations
> for the sake of code readability, and we rarely break lines after a
> colon.
[...]

Your argument makes no sense, because it addresses semicolons as well. Do you really mean semicolons must be compulsory for the sake of readability?

Denis
------
la vida e estranya

From denis.spir at free.fr Fri Feb 6 10:43:40 2009
From: denis.spir at free.fr (spir)
Date: Fri, 6 Feb 2009 10:43:40 +0100
Subject: [Python-ideas] binding vs rebinding
In-Reply-To: <87skmsy3ht.fsf@benfinney.id.au>
References: <20090205142211.77ba2568@o> <87skmsy3ht.fsf@benfinney.id.au>
Message-ID: <20090206104340.610e17f5@o>

Le Fri, 06 Feb 2009 08:31:58 +1100,
Ben Finney a écrit :

> > I wonder why there is no difference in syntax between binding and
> > rebinding. Obviously, the semantics is not at all the same, for
> > humans as well as for the interpreter:
> > * Binding: create a name, bind a value to it.
> > * Rebinding: change the value bound to the name.
>
> That's not obvious. The semantics could just as well be described as:
>
> * Binding: bind this name to that value.
> * Rebinding: bind this name to that value.

You are right to describe it like that. I agree that the point of view is not wrong. However,

* At the interpreter level, as far as I know, rebinding does not create the name as if it were unknown. (Note that a similar issue happens with dicts.)
* At the programmer level, changing the value associated to a name is really a different action than introducing and giving an initial value to a new symbol. For me, *this* is the relevant point: even if the language would behave "behind the scene" the same way in both cases. This is an implementation concern.

A pertinent objection has been raised already in the case of loops:

    for item in container:
        # temp symbol
        foo = func(item)
        # process
        do_stuff

Conceptually, there is a foo for each item. Obviously, we cannot express that properly, because the loop's body is the same for each iteration.
The issue here lies in the fact that a loop does not introduce a local namespace, while this is precisely what we *mean* when defining such a utility variable as foo. In other words: there is a distortion between language semantics and modeling semantics. Denis ------ la vida e estranya From denis.spir at free.fr Fri Feb 6 11:31:45 2009 From: denis.spir at free.fr (spir) Date: Fri, 6 Feb 2009 11:31:45 +0100 Subject: [Python-ideas] binding vs rebinding In-Reply-To: <498B5F52.1080200@pearwood.info> References: <20090205142211.77ba2568@o> <498B5F52.1080200@pearwood.info> Message-ID: <20090206113145.6b2227bd@o> Le Fri, 06 Feb 2009 08:51:14 +1100, Steven D'Aprano a écrit : > spir wrote: > > > Hello, > > > > I wonder why there is no difference in syntax between binding and rebinding. Obviously, the semantics is not at all the same, for humans as well as for the interpreter: > > * Binding: create a name, bind a value to it. > > * Rebinding: change the value bound to the name. > > > > I see several advantages for this distinction and no drawback. The first advantage, which imo is worthwhile enough, is to let syntax match semantics; as the distinction *makes sense*. > > In Python, names are stored in namespaces, which are implemented as > dictionaries. There is a nice correspondence between the syntax of > namespaces and of dicts: > > x = 1 # create a new name and bind it to 1 > x = 2 # rebind name to 2 > del x # delete name > > mydict['x'] = 1 # create new key and bind it to 1 > mydict['x'] = 2 # rebind key to 2 > del mydict['x'] # delete key Good point, yes, this holds for dicts, too! mydict['x'] := bar # ==> KeyError: mydict['x'] does not exist. mydict['x'] = 1 # create new key and bind it to 1 mydict['x'] := 2 # (syntax suggestion) rebind key to 2 mydict['x'] = foo # ==> KeyError: mydict['x'] already exists. > Also, your suggestion is conceptually the same as requiring declarations: > > x = 1 # declare x with value 1 > x := 2 # assign to x Hem...
well, you can actually see it like that. The main difference is that, unlike in "declarative" languages, there is no declaration without binding. var foo; int i # not in python So I would rather call that 'initialization' (== declaration + first binding) as opposed to 'rebinding'. > Finally, what should we do here? > > if flag: > x = 2 > print foo(x) > x = 3 # is this a rebinding or a new binding? > print bar(x) Interesting, thank you. Now the naming problem comes on stage. What is expressed here is an optional additional step, right? This is written as a special case where whatever is symbolized with 'x' will take a specific value. Later the program re-enters the main flow and this thing will have a standard value. I assert that the same name should not be used for both special and standard cases. This imo shows a lack of distinction. Using a different name will make things clearer: if flag: # 'flag' expresses a special case flag_x = 2 # possibly reuse the case name as prefix print foo(flag_x) x = 3 # this is now clearly a new binding print bar(x) I am rather sure that in any concrete model, once the distinction is made clear, then obvious naming can be found. if ambiguity: ambiguity_message = 'ambiguity warning: "%s"\n' % ambiguous_expr print warning_format(ambiguity_message) message = "\t%s\n" % line_body print standard_format(message) Probably most name rebindings rather express a lack of distinction that makes the code obscure. Haven't you ever read things like: config = "config.txt" # string: name config = open(config) # file object config = config.read() # string: content config = config.lines() # array of strings config = parse(config) # e.g.
custom object Denis ------ la vida e estranya From steve at pearwood.info Fri Feb 6 14:39:18 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 07 Feb 2009 00:39:18 +1100 Subject: [Python-ideas] binding vs rebinding In-Reply-To: <20090206104340.610e17f5@o> References: <20090205142211.77ba2568@o> <87skmsy3ht.fsf@benfinney.id.au> <20090206104340.610e17f5@o> Message-ID: <498C3D86.5030506@pearwood.info> spir wrote: > Le Fri, 06 Feb 2009 08:31:58 +1100, > Ben Finney a écrit : > >>> I wonder why there is no difference in syntax between binding and >>> rebinding. Obviously, the semantics is not at all the same, for >>> humans as well as for the interpreter: >>> * Binding: create a name, bind a value to it. >>> * Rebinding: change the value bound to the name. >> That's not obvious. The semantics could just as well be described as: >> >> * Binding: bind this name to that value. >> * Rebinding: bind this name to that value. > > You are right to describe it like that. I agree that the point of view is not wrong. However, > > * At the interpreter level, as far as I know, rebinding does not create the name as if it were unknown. (Note that a similar issue happens with dicts.) Both binding and rebinding actions are identical. >>> x = 5 >>> import dis >>> dis.dis( compile("x=3 # rebinding", '', 'single') ) 1 0 LOAD_CONST 0 (3) 3 STORE_NAME 0 (x) 6 LOAD_CONST 1 (None) 9 RETURN_VALUE >>> del x >>> dis.dis( compile("x=3 # new binding", '', 'single') ) 1 0 LOAD_CONST 0 (3) 3 STORE_NAME 0 (x) 6 LOAD_CONST 1 (None) 9 RETURN_VALUE Whatever implementation difference there may be is transparent at the interpreter level. > * At the programmer level, changing the value associated to a name is really a different action than introducing and giving an initial value to a new symbol. For me, *this* is the relevant point: even if the language behaves "behind the scenes" the same way in both cases. This is an implementation concern. I disagree.
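For what it's worth, the interactive dis session above can be reproduced non-interactively; a minimal sketch (exact opcodes and offsets vary across Python versions, but the two assignments always compile to the same sequence, ending in a plain STORE_NAME either way):

```python
import dis

# Compile "x = 3" twice: once as a notional "new binding", once as a
# notional "rebinding". The compiler has no such distinction, so the
# two code objects contain identical instruction sequences.
new_binding = compile("x = 3  # new binding", "", "single")
rebinding = compile("x = 3  # rebinding", "", "single")

ops_new = [ins.opname for ins in dis.get_instructions(new_binding)]
ops_re = [ins.opname for ins in dis.get_instructions(rebinding)]

assert ops_new == ops_re        # same bytecode in both cases
assert "STORE_NAME" in ops_new  # a plain store, not a "declare"
```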
At the programmer level, x=5 is the same thing whether x already exists or not. > A pertinent objection has been raised already in the case of loops: > > for item in container: > # temp symbol > foo = func(item) > # process > do_stuff > > Conceptually, there is a foo for each item. Obviously, we cannot express that properly, because the loop's body is the same for each iteration. The issue here lies in the fact that a loop does not introduce a local namespace, while this is precisely what we *mean* when defining such a utility variable as foo. Speak for yourself. When I use a for loop, I do not expect or want it to create a new namespace. > In other words: there is a distortion between language semantics and modeling semantics. I disagree. I think the language semantics match precisely the semantics of name binding as I expect it. -- Steven From yaogzhan at gmail.com Fri Feb 6 15:16:26 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Fri, 6 Feb 2009 10:46:26 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <87skmsvxt2.fsf@benfinney.id.au> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87skmsvxt2.fsf@benfinney.id.au> Message-ID: <9B3D70F3-673C-465F-BA17-FE95DF7CF6D6@gmail.com> On 6-Feb-09, at 3:47 AM, Ben Finney wrote: > Riobard Zhan writes: > >> On 5-Feb-09, at 6:10 PM, Ben Finney wrote: >> >>> If they are optional, and some significant proportion of coders >>> stop using them to introduce a suite, then they entirely lose >>> their strong association with "here comes a suite" that is the >>> main benefit of having them as compulsory syntax. >> >> Your strong association with "here comes a suite" should come from >> indentation, that's how Python works.
> > We're going around in circles: I've already demonstrated that there is > plenty of indentation changes in Python code that *isn't* associated > with here-comes-a-suite. > >> Or you should fallback to opening and ending braces like Java/C (or >> even old school begin-end keywords) if you fail to do so. > > Why? I already have indentation plus here-comes-a-suite colons in > Python. Why do you want a strong association with "here comes a suite" coming from colons? Why don't you want a strong association with "here comes a statement" coming from semicolons? Can you explain the inconsistency? From yaogzhan at gmail.com Fri Feb 6 15:17:48 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Fri, 6 Feb 2009 10:47:48 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <87ocxgdmc3.fsf@xemacs.org> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> Message-ID: <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> On 6-Feb-09, at 4:32 AM, Stephen J. Turnbull wrote: > Riobard Zhan writes: > >> "Once someone discovers that they *don't have to* add the colons, and >> it doesn't affect the operation of their program, those colons >> will, I >> predict, become much less frequent. " >> >> Thank God, finally we are on the right track. Making colons optional >> is just the first step to kill both semicolons and colons all >> together. > > I think the language you are looking for is called "Haskell". I think the mailing list I am posting to is called "Python Ideas". From phd at phd.pp.ru Fri Feb 6 15:21:55 2009 From: phd at phd.pp.ru (Oleg Broytmann) Date: Fri, 6 Feb 2009 17:21:55 +0300 Subject: [Python-ideas] Making colons optional? 
In-Reply-To: <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> Message-ID: <20090206142155.GD13165@phd.pp.ru> On Fri, Feb 06, 2009 at 10:47:48AM -0330, Riobard Zhan wrote: > On 6-Feb-09, at 4:32 AM, Stephen J. Turnbull wrote: > >> Riobard Zhan writes: >> >>> "Once someone discovers that they *don't have to* add the colons, and >>> it doesn't affect the operation of their program, those colons will, >>> I >>> predict, become much less frequent. " >>> >>> Thank God, finally we are on the right track. Making colons optional >>> is just the first step to kill both semicolons and colons all >>> together. >> >> I think the language you are looking for is called "Haskell". > > I think the mailing list I am posting to is called "Python Ideas". I think the idea of making colons optional is dead on arrival. Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From yaogzhan at gmail.com Fri Feb 6 15:37:08 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Fri, 6 Feb 2009 11:07:08 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <498BE583.4040701@scottdial.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> Message-ID: <19D96DF5-6757-4793-B25A-5AE345CB7B4E@gmail.com> On 6-Feb-09, at 3:53 AM, Scott Dial wrote: > Riobard Zhan wrote: >> Your strong association with "here comes a suite" should come from >> indentation, that's how Python works. 
> > Your argument makes no sense, since an indention does not > automatically > indicate a suite. Indentation does not automatically indicate a statement, either. Why do you omit semicolons? > Huh? You want to add more line noise? The colon is a pleasant > compromise > between the "please give me some indicator that is visual" and the > "please don't make me type extra characters" crowds. Your arguments > that > people who don't want to type them shouldn't don't hold up when you > consider that other people have to read the code, and they are not > given > the option to put the colons back in upon reading it. And there is the > simple fact that a piece of code will be read more often than it is > ever > written. Does your burden outweigh all of the potential readers of > your code? So you do agree colons are line noise? Colons are unpleasant for me not because I have to type them (though this certainly is a factor, too), but because I have to *read* them every time I read Python code and they disrupt the mental flow. You are assuming colons improve readability, with which I disagree. Readability comes from indentation, not from punctuation. If your logic holds, you might have to use semicolons as well. From aahz at pythoncraft.com Fri Feb 6 15:38:39 2009 From: aahz at pythoncraft.com (Aahz) Date: Fri, 6 Feb 2009 06:38:39 -0800 Subject: [Python-ideas] Making colons optional? In-Reply-To: <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> Message-ID: <20090206143839.GA5218@panix.com> On Fri, Feb 06, 2009, Riobard Zhan wrote: > > I think the mailing list I am posting to is called "Python Ideas". That doesn't mean that people will have any interest in your ideas.
You should pay attention to what people are telling you and spend less time arguing with them. Keep in mind that Guido is still BDFL, and you have effectively zero chance of convincing him to drop the colons. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Weinberg's Second Law: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization. From aahz at pythoncraft.com Fri Feb 6 15:44:30 2009 From: aahz at pythoncraft.com (Aahz) Date: Fri, 6 Feb 2009 06:44:30 -0800 Subject: [Python-ideas] binding vs rebinding In-Reply-To: <20090206113145.6b2227bd@o> References: <20090205142211.77ba2568@o> <498B5F52.1080200@pearwood.info> <20090206113145.6b2227bd@o> Message-ID: <20090206144430.GB5218@panix.com> On Fri, Feb 06, 2009, spir wrote: > > Probably most name rebindings rather express a lack of distinction > that makes the code obscure. Haven't you ever read things like: > > config = "config.txt" # string: name > config = open(config) # file object > config = config.read() # string: content > config = config.lines() # array of strings > config = parse(config) # e.g. custom object Yup, I do that all the time. I happen to like that style. Your proposal would break existing code, therefore it's not going to happen. (Keep in mind that even the 2.x to 3.x transition changed very little of Python's semantic model.) -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Weinberg's Second Law: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization. From curt at hagenlocher.org Fri Feb 6 15:45:03 2009 From: curt at hagenlocher.org (Curt Hagenlocher) Date: Fri, 6 Feb 2009 06:45:03 -0800 Subject: [Python-ideas] Making colons optional? 
In-Reply-To: <19D96DF5-6757-4793-B25A-5AE345CB7B4E@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <19D96DF5-6757-4793-B25A-5AE345CB7B4E@gmail.com> Message-ID: On Fri, Feb 6, 2009 at 6:37 AM, Riobard Zhan wrote: > > You are assuming colons improve readability, which I disagree. "Readability" isn't an entirely objective metric. But making colons optional would mean that you're now looking at two different kinds of code -- with colons and without. And it's extremely hard for me to imagine that having two different kinds of code would not harm readability. -- Curt Hagenlocher curt at hagenlocher.org From bwinton at latte.ca Fri Feb 6 15:38:50 2009 From: bwinton at latte.ca (Blake Winton) Date: Fri, 06 Feb 2009 09:38:50 -0500 Subject: [Python-ideas] binding vs rebinding In-Reply-To: <20090206113145.6b2227bd@o> References: <20090205142211.77ba2568@o> <498B5F52.1080200@pearwood.info> <20090206113145.6b2227bd@o> Message-ID: <498C4B7A.3060902@latte.ca> spir wrote: > I assert that the same name should not be used for both special > and standard cases. This imo shows a lack of distinction. Using > a different name will make things clearer: I assert that once you've gotten to the point of suggesting that people change the way they program, you've lost the debate. :) "Practicality beats purity." How many other languages support this split between binding and rebinding? How many of them are popular? (What's the highest rank that one of them has on, say, http://www.tiobe.com/content/paperinfo/tpci/index.html ?) (Don't take this to mean that I disagree with you. I think that if people either had to say when they were rebinding names, or couldn't rebind names (a la Erlang), programs would probably be more stable. 
But Python isn't the kind of language to do that sort of thing in. I feel it would fit in better with a more academic language like Haskell or Scheme.) > Probably most name rebindings rather express a lack of distinction > that makes the code obscure. Haven't you ever read things like: > > config = "config.txt" # string: name > config = open(config) # file object > config = config.read() # string: content > config = config.lines() # array of strings > config = parse(config) # e.g. custom object I don't see that code as particularly obscure. Sure you could re-write it to use a billion different temporary variables (but you'd better make sure they didn't conflict with ones above you!), but that doesn't really help anything from my point of view, and this example (with the addition of some extra syntax) would still be legal in the rebinding-version of Python you're suggesting. Later, Blake. From yaogzhan at gmail.com Fri Feb 6 15:47:35 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Fri, 6 Feb 2009 11:17:35 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> Message-ID: <08A88544-C428-4BAF-8AF2-86FB05056BC6@gmail.com> On 6-Feb-09, at 4:10 AM, Leif Walsh wrote: > the folks on the other side seem to have a good > argument in that "semicolons are optional; [sic] why aren't colons?" That's exactly the point. Explain the inconsistency, please. Hints for the [sic] part: use semicolons for one-liners; ditto for colons. That's why making them "optional". A colon and a semicolon are used in this one-liner. From yaogzhan at gmail.com Fri Feb 6 15:51:17 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Fri, 6 Feb 2009 11:21:17 -0330 Subject: [Python-ideas] Making colons optional?
In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <498BEB78.5000400@pearwood.info> Message-ID: <5EE58FCE-3420-4FB3-BB82-0AAEBBA1E645@gmail.com> On 6-Feb-09, at 4:50 AM, Leif Walsh wrote: > On Fri, Feb 6, 2009 at 2:49 AM, Steven D'Aprano > wrote: >> "Fairness" is not a language design principle. > > Hopefully it would be a principle on which discussion could be based. > I was hoping more that the colon-supporters would come up with a good > counter to that argument, because I can't. What bothers me here is that colon-lovers seem to assume their choice is *the* choice. From yaogzhan at gmail.com Fri Feb 6 15:52:58 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Fri, 6 Feb 2009 11:22:58 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <20090206101130.7688c3ce@o> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <20090206101130.7688c3ce@o> Message-ID: On 6-Feb-09, at 5:41 AM, spir wrote: > Le Fri, 06 Feb 2009 08:40:27 +1100, > Ben Finney a écrit : > >>> We don't like semicolons in Python, but what would stop a hard-core >>> C user from ending every statement with a semicolon? They have the >>> choice. >> >> Laziness (the good kind). Once someone discovers that they *don't >> have >> to* add the semicolons, and it doesn't affect the operation of their >> program, those semicolons will, I predict, become much less frequent. > > Ditto for colons! > >
> > Strange that you do not realize that all arguments pro / against > semi-colons apply to colons as well: they have the same syntactic > position and the same semantics. So the clear, consistent choices > are: > * both to trash > * both compulsory > * both optional > My opinion: the rest is blahblah confusing habits and good design. > Saying "I prefer this because I'm used to it." is fully > respectable. This does not mean that other options are weaker design > choices. >
Exactly! Making colons optional just like semicolons makes Python more consistent. It is a better design, even though we might have to fight old habits at the beginning. From yaogzhan at gmail.com Fri Feb 6 15:53:32 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Fri, 6 Feb 2009 11:23:32 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <20090206142155.GD13165@phd.pp.ru> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> <20090206142155.GD13165@phd.pp.ru> Message-ID: <982C396D-82B9-49FA-97EF-9B85844A6D4D@gmail.com> On 6-Feb-09, at 10:51 AM, Oleg Broytmann wrote: > On Fri, Feb 06, 2009 at 10:47:48AM -0330, Riobard Zhan wrote: >> On 6-Feb-09, at 4:32 AM, Stephen J. Turnbull wrote: >> >>> Riobard Zhan writes: >>> >>>> "Once someone discovers that they *don't have to* add the colons, >>>> and >>>> it doesn't affect the operation of their program, those colons >>>> will, >>>> I >>>> predict, become much less frequent. " >>>> >>>> Thank God, finally we are on the right track. Making colons >>>> optional >>>> is just the first step to kill both semicolons and colons all >>>> together. >>> >>> I think the language you are looking for is called "Haskell". >> >> I think the mailing list I am posting to is called "Python Ideas". > > I think the idea of making colons optional is dead on arrival. What do you think of the idea of making semicolons optional? From phd at phd.pp.ru Fri Feb 6 16:09:51 2009 From: phd at phd.pp.ru (Oleg Broytmann) Date: Fri, 6 Feb 2009 18:09:51 +0300 Subject: [Python-ideas] Making colons optional?
In-Reply-To: <982C396D-82B9-49FA-97EF-9B85844A6D4D@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> <20090206142155.GD13165@phd.pp.ru> <982C396D-82B9-49FA-97EF-9B85844A6D4D@gmail.com> Message-ID: <20090206150951.GA26929@phd.pp.ru> On Fri, Feb 06, 2009 at 11:23:32AM -0330, Riobard Zhan wrote: > What do you think the idea of making semicolons optional? You are trying to change the language - and changing the language is a major, big scale change - but you are trying to change the language for nothing. Small inconsistency (even if I'd agree there is an inconsistency there) doesn't by itself warrant such a major change. -1000 for the change. http://python.org/dev/peps/pep-0008/ : "A Foolish Consistency is the Hobgoblin of Little Minds" Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From leif.walsh at gmail.com Fri Feb 6 16:10:31 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Fri, 6 Feb 2009 10:10:31 -0500 Subject: [Python-ideas] Making colons optional? In-Reply-To: <5EE58FCE-3420-4FB3-BB82-0AAEBBA1E645@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <498BEB78.5000400@pearwood.info> <5EE58FCE-3420-4FB3-BB82-0AAEBBA1E645@gmail.com> Message-ID: On Fri, Feb 6, 2009 at 9:51 AM, Riobard Zhan wrote: > What bothers me here is that colon-lovers seem to assume their choice is > *the* choice. But, currently, it is. 
:-P -- Cheers, Leif From yaogzhan at gmail.com Fri Feb 6 16:11:16 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Fri, 6 Feb 2009 11:41:16 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <20090206143839.GA5218@panix.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> <20090206143839.GA5218@panix.com> Message-ID: On 6-Feb-09, at 11:08 AM, Aahz wrote: > On Fri, Feb 06, 2009, Riobard Zhan wrote: >> >> I think the mailing list I am posting to is called "Python Ideas". > > That doesn't mean that people will have any interest in your ideas. > You > should pay attention to what people are telling you and spend less > time > arguing with them. It is pretty clear that colon-supporters do not pay attention to the inconsistency of semicolons being optional and colons being mandatory that I mentioned, and try to argue with me using a set of reasons why they love colons when in fact the same reasons would obviously make them love semicolons, and ignore the fact that they are used to colons not because it is a better idea but they are never allowed to omit. > Keep in mind that Guido is still BDFL, and you have > effectively zero chance of convincing him to drop the colons. It does not stop me giving it a try. From yaogzhan at gmail.com Fri Feb 6 16:14:44 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Fri, 6 Feb 2009 11:44:44 -0330 Subject: [Python-ideas] Making colons optional? 
In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <19D96DF5-6757-4793-B25A-5AE345CB7B4E@gmail.com> Message-ID: On 6-Feb-09, at 11:15 AM, Curt Hagenlocher wrote: > On Fri, Feb 6, 2009 at 6:37 AM, Riobard Zhan > wrote: >> >> You are assuming colons improve readability, which I disagree. > > "Readability" isn't an entirely objective metric. But making colons > optional would mean that you're now looking at two different kinds of > code -- with colons and without. And it's extremely hard for me to > imagine that having two different kinds of code would not harm > readability. Not when we dump colons like semicolons. We don't see two different kinds of code, one with semicolons and the other without. From yaogzhan at gmail.com Fri Feb 6 16:25:12 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Fri, 6 Feb 2009 11:55:12 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <20090206150951.GA26929@phd.pp.ru> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> <20090206142155.GD13165@phd.pp.ru> <982C396D-82B9-49FA-97EF-9B85844A6D4D@gmail.com> <20090206150951.GA26929@phd.pp.ru> Message-ID: <63E445CF-F77B-4D2C-92D4-789DBE4B959A@gmail.com> On 6-Feb-09, at 11:39 AM, Oleg Broytmann wrote: > On Fri, Feb 06, 2009 at 11:23:32AM -0330, Riobard Zhan wrote: >> What do you think the idea of making semicolons optional? > > You are trying to change the language - and changing the language is > a major, big scale change - but you are trying to change the > language for > nothing. 
Small inconsistency (even if I'd agree there is an > inconsistency > there) doesn't by itself warrant such a major change. -1000 for the > change. Making colons optional is not a major, big scale change. You can use them if you want. There are not even any backward compatibility issues. I guess hardly anybody would notice if we do make the change. By comparison, what do we get by making "print" a function? Why not create a "put" or "echo" built-in function if we really want the flexibility? Isn't that a major, big scale change? From yaogzhan at gmail.com Fri Feb 6 16:26:36 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Fri, 6 Feb 2009 11:56:36 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <498BEB78.5000400@pearwood.info> <5EE58FCE-3420-4FB3-BB82-0AAEBBA1E645@gmail.com> Message-ID: <5748636D-8254-46F7-B18C-2FA6C3612757@gmail.com> On 6-Feb-09, at 11:40 AM, Leif Walsh wrote: > On Fri, Feb 6, 2009 at 9:51 AM, Riobard Zhan > wrote: >> What bothers me here is that colon-lovers seem to assume their >> choice is >> *the* choice. > > But, currently, it is. :-P That's what's to be debated. And they fail to give a rational argument to explain why. From phd at phd.pp.ru Fri Feb 6 16:31:39 2009 From: phd at phd.pp.ru (Oleg Broytmann) Date: Fri, 6 Feb 2009 18:31:39 +0300 Subject: [Python-ideas] Making colons optional?
In-Reply-To: <63E445CF-F77B-4D2C-92D4-789DBE4B959A@gmail.com> References: <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> <20090206142155.GD13165@phd.pp.ru> <982C396D-82B9-49FA-97EF-9B85844A6D4D@gmail.com> <20090206150951.GA26929@phd.pp.ru> <63E445CF-F77B-4D2C-92D4-789DBE4B959A@gmail.com> Message-ID: <20090206153139.GA31773@phd.pp.ru> On Fri, Feb 06, 2009 at 11:55:12AM -0330, Riobard Zhan wrote: > On 6-Feb-09, at 11:39 AM, Oleg Broytmann wrote: >> On Fri, Feb 06, 2009 at 11:23:32AM -0330, Riobard Zhan wrote: >>> What do you think the idea of making semicolons optional? >> >> You are trying to change the language - and changing the language is >> a major, big scale change - but you are trying to change the language >> for >> nothing. Small inconsistency (even if I'd agree there is an >> inconsistency >> there) doesn't by itself warrant such a major change. -1000 for the >> change. > > Making colons is not a major, big scale change. It is, because it is now a part of *the language*. Changing the language is always a major change. If you don't understand - well, we failed to communicate, and I am stopping now. You have to learn to live with small inconsistencies. > By comparison, what do we get by making "print" a function? Why not > create a "put" or "echo" built-in function if we really want the > flexibility? Isn't that a major, big scale change? Yes, it was a major change, and it warranted a new major release - 3.0. Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From curt at hagenlocher.org Fri Feb 6 16:37:49 2009 From: curt at hagenlocher.org (Curt Hagenlocher) Date: Fri, 6 Feb 2009 07:37:49 -0800 Subject: [Python-ideas] Making colons optional? 
In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <19D96DF5-6757-4793-B25A-5AE345CB7B4E@gmail.com> Message-ID: On Fri, Feb 6, 2009 at 7:14 AM, Riobard Zhan wrote: > >> But making colons >> optional would mean that you're now looking at two different kinds of >> code -- with colons and without. And it's extremely hard for me to >> imagine that having two different kinds of code would not harm >> readability. > > Not when we dump colons like semicolons. We don't see two different kinds of > code, one with semicolons and the other without. If you're suggesting that -- for consistency -- it shouldn't be legal to have an empty statement after a semicolon, I might be tempted to agree. I've never seen anyone use a semicolon where it's optional. In any event, as several people have suggested, there's basically zero likelihood of Python ever being changed to make colons optional. And there doesn't even have to be a rational basis for that. If it were 1994 and Python were still young, we could discuss the relative merits of required/optional/disallowed. But at this point, you're basically trying to convince people that their many years of experience with reading Python code should be thrown away in order to accommodate a small number of new users who sometimes forget a trivial piece of punctuation. -- Curt Hagenlocher curt at hagenlocher.org From stephen at xemacs.org Fri Feb 6 16:41:53 2009 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 07 Feb 2009 00:41:53 +0900 Subject: [Python-ideas] Making colons optional? 
In-Reply-To: <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> Message-ID: <87bptfefni.fsf@xemacs.org> Riobard Zhan writes: > > On 6-Feb-09, at 4:32 AM, Stephen J. Turnbull wrote: > > I think the language you are looking for is called "Haskell". > > I think the mailing list I am posting to is called "Python Ideas". I think that's exactly my point. From stephen at xemacs.org Fri Feb 6 17:21:46 2009 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 07 Feb 2009 01:21:46 +0900 Subject: [Python-ideas] Making colons optional? In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> <20090206143839.GA5218@panix.com> Message-ID: <87ab8zedt1.fsf@xemacs.org> Riobard Zhan writes: > It is pretty clear that colon-supporters do not pay attention to the > inconsistency of semicolons being optional and colons being mandatory > that I mentioned, The inconsistency is quite small. In fact, in English colons and semicolons have both different syntax and different semantics, and the same is true of Python. In English, the semicolon expresses a conjunction of two not necessarily related sentences. Python is the same, except that since we all run von Neumann machines, conjunction implies sequencing. The colon, on the other hand, represents a relationship of some kind. A native-speaker simply doesn't write things like "The sky is blue: mesons are composed of two quarks." 
(Well, the coiner of the word "quark" might, which is the exception
that proves the rule.)  And the same is true in Python; the code
fragment preceding the colon controls the statement (suite) following
it.

Thus, the apparent isomorphism of the semantics is quite superficial,
almost syntactic itself.

Syntactically, in English a semicolon takes two sentences to conjoin.
A colon, on the other hand, must have a full sentence on one side or
the other, but the other side may be almost anything.[1]  Similarly in
Python: both what precedes the semicolon and what follows it must be
complete statements, while what precedes the colon must be a fragment,
and what follows it is a complete statement (suite).  So the colon and
semicolon do not have isomorphic syntax, either.

In fact, the semicolon is indeed redundant, as it is precisely
equivalent to a newline + the current level of indentation.  This kind
of redundancy is quite unusual in Python, so I wonder if the BDFL
might regret having permitted semicolons at all, as he has expressed
regret for allowing the extended assignment operators (eg, +=).

> and try to argue with me using a set of reasons why
> they love colons when in fact the same reasons would obviously make
> them love semicolons, and ignore the fact that they are used to colons
> not because it is a better idea but they are never allowed to omit.

I think that

    if condition:
        act (appropriately)

is more readable than

    if (condition)
        act (appropriately);

or

    if condition
        act (appropriately)

In C, the visual parallelism of the control structure and the function
call is spurious, and the trailing semicolon actually terminates the
if statement, which is not the way I really think about it.  In
colon-less Python, the control fragment is insufficiently divided from
the controlled suite.  Just as in English, a bit of redundancy to
reflect the hierarchical relationship between the control structure
and the controlled suite is useful.
In Python, the same character is used, which is helpful to those whose native languages use colons. This is indeed *good* design. It expresses, and emphasizes, the hierarchical structure of the control construct, and does so using a syntax that any English-speaker (at least) will recognize as analogous. I don't expect to convince you that the required colon is good design, but I think it puts paid to the notion that colons and semicolons are isomorphic. It is not true that what we conclude about semicolons is necessarily what we should conclude about colons. Footnotes: [1] That's not entirely true. It is possible to write something like Required colon: an idea which is still correct. which does not contain any complete sentence, and where the colon is actually a copula. But even then, it's a relationship. From jared.grubb at gmail.com Fri Feb 6 18:09:17 2009 From: jared.grubb at gmail.com (Jared Grubb) Date: Fri, 6 Feb 2009 09:09:17 -0800 Subject: [Python-ideas] Making colons optional? In-Reply-To: <19D96DF5-6757-4793-B25A-5AE345CB7B4E@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <19D96DF5-6757-4793-B25A-5AE345CB7B4E@gmail.com> Message-ID: >> > So you do agree colons are line noise? Colons are unpleasant for me > not because I have to type them (though this certainly is a factor, > too), but I have to *read* them every time I read Python code and > they disrupt the mental flow. > > You are assuming colons improve readability, which I disagree. > Readability comes from indentation, not from punctuation. If you > logic holds, you might have to use semicolons as well. Just because punctuation is not necessary does not mean it's not useful. For example, you could argue that "f(1,2,3)" and "f(1 2 3)" should be both allowed. 
Or you could argue that you don't even need colons OR semicolons in
one-liners: "if condition do_f() do_g() do_h()". At some point
aesthetics matters (even if it is a subjective thing).

One major difference between semicolons and colons is their frequency
of use. A colon once every 8-10 lines of code is less tedious/noisy
than a semicolon on EVERY line of code. Second, a colon always
precedes a linebreak AND an indent (a visible change), but semicolons
never precede anything but a simple linebreak. Colons and indents are
complementary. Third, colons and semicolons look similar; in some
ways, colons are more useful when semicolons are NOT used because
visual ambiguity is lower.

I almost never forget a colon in Python, because (to me) they feel and
look natural (and the parser gives a great error message when a colon
is missing!). I get missing-semicolon and missing-brace errors in C++
often enough (albeit infrequently) that I do NOT miss them in Python
(and the cryptic error messages they cause, sometimes in different
files than the one they're missing from)

Jared

From bruce at leapyear.org  Fri Feb 6 18:57:35 2009
From: bruce at leapyear.org (Bruce Leban)
Date: Fri, 6 Feb 2009 09:57:35 -0800
Subject: [Python-ideas] Making colons optional?
In-Reply-To: 
References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com>
	<498AC701.6030300@pearwood.info>
	<9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com>
	<87ocxgy33o.fsf@benfinney.id.au>
	<6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com>
	<498BE583.4040701@scottdial.com>
	<19D96DF5-6757-4793-B25A-5AE345CB7B4E@gmail.com>
Message-ID: 

Riobard Zhan writes:
> It is pretty clear that colon-supporters do not pay attention to the
> inconsistency of semicolons being optional and colons being mandatory
> that I mentioned

So what? Commas are also mandatory. They look like semicolons so we
should make them optional too. Am I joking?
Sure but look at this syntax:

f(1 2 3 a=4 b=5 c=6)

I can figure out where each parameter ends easily enough without the
commas [because my implicit multiplication proposal was rejected :-(].
Sure some people will quibble that f(1 -2) is a problem and so on but
that can be fixed just as C distinguishes between x =- y and x = -y.**
That's not the real problem.

The real problem is that we *want* those commas to break up the flow
of text. In English, a comma is a pause in the reading. If you, stick
extra, commas in, your sentences you will, confuse people. Colons,
semicolons, dashes, etc. all serve similar purposes to cue the reader
in on what the sentence is about? Hah, tricked you there with that
question mark.

Enough of this for me. Consistency for consistency's sake is
overrated. Let's get on to important stuff like whether we number the
bits from the right end or left end of the number and whether array
indexes should start at 1 or 0.

--- Bruce

P.S. Colons are redundant by their very nature that they use two dots
when one is all you need (or love is all you need (or something like
that)).

** For the historically challenged, original C used =- instead of -=.

From g.brandl at gmx.net  Fri Feb 6 19:21:13 2009
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 06 Feb 2009 19:21:13 +0100
Subject: [Python-ideas] Making stars optional? (was: Making colons optional?)
In-Reply-To: <498B623E.1040206@pearwood.info>
References: <498B623E.1040206@pearwood.info>
Message-ID: 

Steven D'Aprano schrieb:

> Mathematicians get away with this sort of ambiguity because they are
> writing for other mathematicians, not for a computer. Because
> mathematical proofs rely on a sequence of equations, not just a single
> statement, the ambiguity can be resolved:
>
> y = a(b+c) - ac  # does this mean a+() or a*() or something else?
> y = ab + ac - ac # ah, it must have been a*() > y = ab No context is needed to know what a(b+c) means. In maths, you only have single-character variable names (sub-/superscripts notwithstanding), so ab always means a*b. Together with some other conventions, like that nobody writes a3 instead of 3a, everything is unambiguous. Though there *are* programmers who wouldn't notice if Python 3.0 switched to one-character-only variable names, they are probably a minority and hopefully dying out :) Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From g.brandl at gmx.net Fri Feb 6 19:33:04 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 06 Feb 2009 19:33:04 +0100 Subject: [Python-ideas] Making colons optional? In-Reply-To: <5748636D-8254-46F7-B18C-2FA6C3612757@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <498BEB78.5000400@pearwood.info> <5EE58FCE-3420-4FB3-BB82-0AAEBBA1E645@gmail.com> <5748636D-8254-46F7-B18C-2FA6C3612757@gmail.com> Message-ID: Riobard Zhan schrieb: > On 6-Feb-09, at 11:40 AM, Leif Walsh wrote: > >> On Fri, Feb 6, 2009 at 9:51 AM, Riobard Zhan >> wrote: >>> What bothers me here is that colon-lovers seem to assume their >>> choice is >>> *the* choice. >> >> But, currently, it is. :-P > > That's what to be debated. And they fail to give a rational argument > to explain why. I don't want to discredit your efforts here, but that's simply the laziness you kind of acquire when you know that the discussion won't make any difference. 
Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From brett at python.org Fri Feb 6 19:34:38 2009 From: brett at python.org (Brett Cannon) Date: Fri, 6 Feb 2009 10:34:38 -0800 Subject: [Python-ideas] Add a cryptographic hash (e.g SHA1) of source toPython Compiled objects? In-Reply-To: <18827.45209.826238.805391@panix5.panix.com> References: <18824.5343.933820.298039@panix5.panix.com> <7C24D4B5998D43AA9289972758E9B426@RaymondLaptop1> <18825.5815.559562.370640@panix5.panix.com> <18825.26238.914919.705377@panix5.panix.com> <9bfc700a0902040218m3c95a620sced72084e76c2ad4@mail.gmail.com> <18827.45209.826238.805391@panix5.panix.com> Message-ID: On Thu, Feb 5, 2009 at 19:38, wrote: > Brett Cannon writes: > > On Wed, Feb 4, 2009 at 02:18, Arnaud Delobelle wrote: > > > 2009/2/4 : > > > > > >> There's also the mtime that needs to be ignored mentioned in prior > > >> posts. And is there a table which converts a magic number version back > > >> into a string with the Python version number? Thanks. > > > > > > You can look at Python/import.c, near the top of the file. > > > > The other option to see how all of this works is importlib as found in > > the py3k branch. That's in pure Python so it's easier to follow. > > > > -Brett > > > > Sorry for the delayed response - I finally had a chance to check out the > py3k code and look. > > Perhaps I'm missing something. Although there is some really cool, > well-written and neat Python code there (and some of the private > methods there seem to me like they should public and somewhere else, Still working on exposing the API. > perhaps in os or os.path), Nothing in that module belongs in os. 
> I don't see a table mapping magic numbers > to a string containing a Python version as you would find when running > "python -V" and that's what was kind of asked for. > Sorry, misread the email. Python/import.c is the right place then. > As Arnaud mentioned, Python/import.c has this magic-number mapping in > comments near the top of the file. Of course one could take those > comments and turn it into a dictionary, but I was hoping Python had > such a dictionary/function built in already since needs to be > maintained along with changes to the magic number. It actually doesn't need to be maintained. If the magic number doesn't match from a .pyc then it needs to be regenerated, period. We do not try to see if the magic number can be different yet compatible with the running interpreter. And as for changing it, it is simply a specific increment along with committing the file. The magic number history is documented in that file "just in case". -Brett From g.brandl at gmx.net Fri Feb 6 19:46:22 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 06 Feb 2009 19:46:22 +0100 Subject: [Python-ideas] Making colons optional? In-Reply-To: <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> Message-ID: Riobard Zhan schrieb: > On 5-Feb-09, at 7:31 AM, Steven D'Aprano wrote: > >> Riobard Zhan wrote: >> >>> - I noticed a strong tendency to forget colons by new users of >>> Python in a second-year computer science undergraduate course. The >>> students seemed not getting used to colons even near the end of the >>> course. I guess it is probably because they learn Java and C first, >>> both of which do not have colons. What other languages do you know >>> that require colons? >> >> Pascal uses colons, but not for the exact same purpose as Python. >> Both languages use colons in similar ways to it's use in English. 
>> In particular, Python uses colons as a break between clauses: larger
>> than a comma, smaller than a period.
>>
>
> Pascal is my first language. It has been some years, so I cannot
> remember the details now. I checked Wikipedia and did not find that
> colons are used after if's. Not sure if you mean declarations? If so,
> I don't think that is what we are discussing here; Java and C also use
> colons in switch/case statements. AFAIK, Python is quite unique in
> requiring trailing colons after if's, for's, and function/class
> definitions.

Python is also quite unique in using indentation to delimit blocks, so
I'm not sure what point you're trying to make.

>>
>>> - I find colons pretty annoying.
>> ...
>>
>> I'm sorry you dislike colons, but I like them.
>>
>
> Yes, I agree with you that many people like colons. What bothers me is
> that some people dislike them but are not given the choice to avoid
> them. We don't like semicolons in Python, but what would stop
> hard-core C users from ending every statement with a semicolon? They
> have the choice.
>
> And I would also argue that many of those who like colons do so not
> because they really feel colons improve readability, but because they
> have gotten used to colons in the first place. You like colons, I
> don't. How do you know another Python user will like them or not? By
> making trailing colons OPTIONAL, we can probably have the chance to
> field test. If people really think colons improve readability that
> much, they can still use them, just like we feel semicolons are line
> noise and avoid them if possible, even though we CAN use them. I don't
> think we will ever lose anything by making colons optional.

By making colon syntax flexible, we also lose a consistent look and
feel of the language. [1]

I think it's a good indicator for optional syntax if you can formulate
new rules for PEP 8 that state when to use it. In the case of colons,
you'd have to either forbid or mandate them; I'd be at a loss to find
another consistent rule.
So, making them optional is pointless; we should either keep them or remove them. And removing is out of the question. Applying that indicator to semicolons, there is a clear rule in PEP 8 that states when to use them: to separate two statements on one line. >>> - What problems do you think will occur if colons are made optional? >> >> I don't think it would lead to any problems, but I think it would >> make Python less elegant. >> > > I think omitting colons makes Python more elegant--more uniform, less > clutter. It's an itch every time I see a piece of Ruby code with lots > of def's and if's without trailing colons ... That's your prerogative. However, the only person around here whose itches alone, in the face of a wall of disagreeing users, can lead to a change in the language, is Guido. cheers, Georg [1] Yes, I have read all those remarks about semicolons. What you all fail to recognize is that the majority of Python users like the colons, and wouldn't program without them. You can call it habit if you can't understand that it fits their thinking process, that's all the same: the colon won't just go away, just because it's made optional. Therefore, consistency is lost, not gained. -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From rocky at gnu.org Fri Feb 6 19:58:17 2009 From: rocky at gnu.org (rocky at gnu.org) Date: Fri, 6 Feb 2009 13:58:17 -0500 Subject: [Python-ideas] Add a cryptographic hash (e.g SHA1) of source toPython Compiled objects? 
In-Reply-To: References: <18824.5343.933820.298039@panix5.panix.com> <7C24D4B5998D43AA9289972758E9B426@RaymondLaptop1> <18825.5815.559562.370640@panix5.panix.com> <18825.26238.914919.705377@panix5.panix.com> <9bfc700a0902040218m3c95a620sced72084e76c2ad4@mail.gmail.com> <18827.45209.826238.805391@panix5.panix.com> Message-ID: <18828.34889.865240.983179@panix5.panix.com> Brett Cannon writes: > On Thu, Feb 5, 2009 at 19:38, wrote: > > Brett Cannon writes: > > > On Wed, Feb 4, 2009 at 02:18, Arnaud Delobelle wrote: > > > > 2009/2/4 : > > > > > > > >> There's also the mtime that needs to be ignored mentioned in prior > > > >> posts. And is there a table which converts a magic number version back > > > >> into a string with the Python version number? Thanks. > > > > > > > > You can look at Python/import.c, near the top of the file. > > > > > > The other option to see how all of this works is importlib as found in > > > the py3k branch. That's in pure Python so it's easier to follow. > > > > > > -Brett > > > > > > > Sorry for the delayed response - I finally had a chance to check out the > > py3k code and look. > > > > Perhaps I'm missing something. Although there is some really cool, > > well-written and neat Python code there (and some of the private > > methods there seem to me like they should public and somewhere else, > > Still working on exposing the API. > > > perhaps in os or os.path), > > Nothing in that module belongs in os. There's probably some confusion as to what I was referring to or what I took you to mean when you mentioned importlib. I took that to mean the files in that directory "importlib". At any rate that's what I looked at. 
One of the files is _bootstrap.py which has:

def _path_join(*args):
    """Replacement for os.path.join."""
    return path_sep.join(x[:-len(path_sep)] if x.endswith(path_sep) else x
                         for x in args)

def _path_exists(path):
    """Replacement for os.path.exists."""
    try:
        _os.stat(path)
    except OSError:
        return False
    else:
        return True

def _path_is_mode_type(path, mode):
    """Test whether the path is the specified mode type."""
    try:
        stat_info = _os.stat(path)
    except OSError:
        return False
    return (stat_info.st_mode & 0o170000) == mode

For _path_join, posixpath.py has something similar and perhaps even
the same functionality, although it's different code.

_path_is_mode_type doesn't exist in posixpath.py.

_path_exists seems to be almost a duplicate of lexists, which uses
lstat instead of _os.stat.

> > > As Arnaud mentioned, Python/import.c has this magic-number mapping in
> > comments near the top of the file. Of course one could take those
> > comments and turn it into a dictionary, but I was hoping Python had
> > such a dictionary/function built in already since it needs to be
> > maintained along with changes to the magic number.
>
> It actually doesn't need to be maintained.

I meant the mapping between magic number and version that it
represents. For a use case, recall again what the problem is: you are
given compiled Python code and a file that purports to be the source,
and want to verify. The current proposal (with its current weaknesses)
requires the compiler to be the same. When it's not, one could say
"sorry, Python compiler version mismatch -- go figure it out", but
more helpful would be to indicate that you compiled with version X (as
a string rather than a magic number) and the Python code was compiled
with version Y. This means the source might be the same; we just don't
really know.
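The magic-number check being discussed in this thread can be sketched in a few lines. This is an illustrative sketch using today's importlib and py_compile APIs (in 2009 the magic number came from imp.get_magic()); the helper name pyc_magic is invented for the example.

```python
# Sketch: read a .pyc file's first four bytes (the magic number) and
# compare them against the running interpreter's own magic number.
import importlib.util
import os
import py_compile
import tempfile

def pyc_magic(path):
    """Return the magic number stored at the start of a compiled file."""
    with open(path, "rb") as f:
        return f.read(4)

# Make a tiny module, byte-compile it, and verify the .pyc against
# this interpreter. A mismatch here would be the "compiler version
# mismatch" case described above.
with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "m.py")
    with open(src, "w") as f:
        f.write("x = 1\n")
    pyc = py_compile.compile(src, cfile=os.path.join(d, "m.pyc"))
    assert pyc_magic(pyc) == importlib.util.MAGIC_NUMBER
```

Mapping a mismatched magic number back to a human-readable version string would still require the table rocky is asking for; CPython only ships the current interpreter's own magic number.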
From qrczak at knm.org.pl Fri Feb 6 20:06:53 2009 From: qrczak at knm.org.pl (Marcin 'Qrczak' Kowalczyk) Date: Fri, 6 Feb 2009 20:06:53 +0100 Subject: [Python-ideas] binding vs rebinding In-Reply-To: <498C4B7A.3060902@latte.ca> References: <20090205142211.77ba2568@o> <498B5F52.1080200@pearwood.info> <20090206113145.6b2227bd@o> <498C4B7A.3060902@latte.ca> Message-ID: <3f4107910902061106g774d599fs4705a4a3ffb0fa3@mail.gmail.com> On Fri, Feb 6, 2009 at 15:38, Blake Winton wrote: > How many other languages support this split between binding and rebinding? > How many of them are popular? (What's the highest rank that one of them > has on, say, http://www.tiobe.com/content/paperinfo/tpci/index.html ?) 1. Java - split 2. C - split 3. C++ - split 4. (Visual) Basic - split 5. PHP - not split 6. C# - split 7. Python - not split 8. Perl - split 9. JavaScript - split 10. Delphi - split 11. Ruby - not split -- Marcin Kowalczyk qrczak at knm.org.pl http://qrnik.knm.org.pl/~qrczak/ From brett at python.org Fri Feb 6 20:39:59 2009 From: brett at python.org (Brett Cannon) Date: Fri, 6 Feb 2009 11:39:59 -0800 Subject: [Python-ideas] Add a cryptographic hash (e.g SHA1) of source toPython Compiled objects? In-Reply-To: <18828.34889.865240.983179@panix5.panix.com> References: <18824.5343.933820.298039@panix5.panix.com> <18825.5815.559562.370640@panix5.panix.com> <18825.26238.914919.705377@panix5.panix.com> <9bfc700a0902040218m3c95a620sced72084e76c2ad4@mail.gmail.com> <18827.45209.826238.805391@panix5.panix.com> <18828.34889.865240.983179@panix5.panix.com> Message-ID: On Fri, Feb 6, 2009 at 10:58, wrote: > Brett Cannon writes: > > On Thu, Feb 5, 2009 at 19:38, wrote: > > > Brett Cannon writes: > > > > On Wed, Feb 4, 2009 at 02:18, Arnaud Delobelle wrote: > > > > > 2009/2/4 : > > > > > > > > > >> There's also the mtime that needs to be ignored mentioned in prior > > > > >> posts. 
And is there a table which converts a magic number version back > > > > >> into a string with the Python version number? Thanks. > > > > > > > > > > You can look at Python/import.c, near the top of the file. > > > > > > > > The other option to see how all of this works is importlib as found in > > > > the py3k branch. That's in pure Python so it's easier to follow. > > > > > > > > -Brett > > > > > > > > > > Sorry for the delayed response - I finally had a chance to check out the > > > py3k code and look. > > > > > > Perhaps I'm missing something. Although there is some really cool, > > > well-written and neat Python code there (and some of the private > > > methods there seem to me like they should public and somewhere else, > > > > Still working on exposing the API. > > > > > perhaps in os or os.path), > > > > Nothing in that module belongs in os. > > There's probably some confusion as to what I was referring to or what > I took you to mean when you mentioned importlib. I took that to mean > the files in that directory "importlib". No, that's right. > At any rate that's what I looked at. > One of the files is _bootstrap.py which has: > > def _path_join(*args): > """Replacement for os.path.join.""" > return path_sep.join(x[:-len(path_sep)] if x.endswith(path_sep) else x > for x in args) > > def _path_exists(path): > """Replacement for os.path.exists.""" > try: > _os.stat(path) > except OSError: > return False > else: > return True > > def _path_is_mode_type(path, mode): > """Test whether the path is the specified mode type.""" > try: > stat_info = _os.stat(path) > except OSError: > return False > return (stat_info.st_mode & 0o170000) == mode > > For _path_join, posixpath.py has something similar and perhaps even the same > functionality although it's different code. > > _path_is_mode_type doesn't exist in posixpath.py > > _path_exists seems to be almost a duplicate of lexists using which > uses lstat instead of _os.stat. 
> All of that code is duplicated, most of it copy-and-paste, from some code from the os module or its helper modules. The only reason it is there is for bootstrapping reasons when that code will be used as the implementation of import (module can't rely on non-builtin modules). > > > > > > As Arnaud mentioned, Python/import.c has this magic-number mapping in > > > comments near the top of the file. Of course one could take those > > > comments and turn it into a dictionary, but I was hoping Python had > > > such a dictionary/function built in already since needs to be > > > maintained along with changes to the magic number. > > > > It actually doesn't need to be maintained. > > I meant the mapping between magic number and version that it > represents. For a use case, recall again what the problem is: you are > given python code and a file that purports to be the source and want > to verify. The current proposal (with its current weaknesses) requires > getting the compiler the same. When that's not the same one could say > "sorry, Python compiler version mismatch -- go figure it out", but > more helpful would be to indicate that you compiled with version X (as > a string rather than a magic number) and the python code was compiled > with version Y. This means the source might be the same, we just don't > really know. I still don't see the benefit of knowing what version of Python a magic number matches to. So I know some bytecode was compiled by Python 2.5 while I am running Python 2.6. What benefit do I derive from knowing that compared to just knowing that it was not compiled by Python 2.6? I mean are you ultimately planning on launching a different interpreter based on what generated the bytecode? 
-Brett

From scott+python-ideas at scottdial.com  Fri Feb 6 20:43:06 2009
From: scott+python-ideas at scottdial.com (Scott Dial)
Date: Fri, 06 Feb 2009 14:43:06 -0500
Subject: [Python-ideas] binding vs rebinding
In-Reply-To: <3f4107910902061106g774d599fs4705a4a3ffb0fa3@mail.gmail.com>
References: <20090205142211.77ba2568@o>
	<498B5F52.1080200@pearwood.info>
	<20090206113145.6b2227bd@o>
	<498C4B7A.3060902@latte.ca>
	<3f4107910902061106g774d599fs4705a4a3ffb0fa3@mail.gmail.com>
Message-ID: <498C92CA.5060308@scottdial.com>

Marcin 'Qrczak' Kowalczyk wrote:
> 1. Java - split
> 2. C - split
> 3. C++ - split
> 6. C# - split

Are you counting the implicit lack of a binding as a binding? "int x;"
does not bind anything to "x"; technically, the first time "x = ..."
appears is the first binding. Maybe I am splitting hairs, but this
whole topic is a hair splitter.

> 4. (Visual) Basic - split
> 8. Perl - split
> 9. JavaScript - split
> 10. Delphi - split

In addition to the debate above, the declarations are optional in
these languages, and in that case the semantics are very much the same
as Python's.

> 5. PHP - not split
> 7. Python - not split
> 11. Ruby - not split

With regard to *all* of these languages, there is *not* a separate
assignment operator, only possibly a special statement (which in none
of these languages *requires* the initial assignment). And since the
debate is about a special operator and not a special statement, all of
these languages support dropping this discussion. It really is that
foreign.
-- Scott Dial scott at scottdial.com scodial at cs.indiana.edu From curt at hagenlocher.org Fri Feb 6 21:04:17 2009 From: curt at hagenlocher.org (Curt Hagenlocher) Date: Fri, 6 Feb 2009 12:04:17 -0800 Subject: [Python-ideas] binding vs rebinding In-Reply-To: <498C92CA.5060308@scottdial.com> References: <20090205142211.77ba2568@o> <498B5F52.1080200@pearwood.info> <20090206113145.6b2227bd@o> <498C4B7A.3060902@latte.ca> <3f4107910902061106g774d599fs4705a4a3ffb0fa3@mail.gmail.com> <498C92CA.5060308@scottdial.com> Message-ID: On Fri, Feb 6, 2009 at 11:43 AM, Scott Dial wrote: > Marcin 'Qrczak' Kowalczyk wrote: >> 1. Java - split >> 2. C - split >> 3. C++ - split >> 6. C# - split > > Are you counting the implicit lack of a binding as a binding? "int x;" > does not bind anything to "x", technically the first time "x = ..." > appears is the first binding. Maybe I am splitting hairs, but this whole > topic is a hair splitter. Java and C# will always bind a default value on the declaration (even without the "="); C++ sometimes does, and C never does (at least not according to whichever standard I vaguely remember). But there's not really any point in discussing this as a language feature independently of scoping, and none of these languages have scoping semantics that are a good match for Python. -- Curt Hagenlocher curt at hagenlocher.org From rocky at gnu.org Fri Feb 6 21:10:34 2009 From: rocky at gnu.org (rocky at gnu.org) Date: Fri, 6 Feb 2009 15:10:34 -0500 Subject: [Python-ideas] Add a cryptographic hash (e.g SHA1) of source toPython Compiled objects? 
In-Reply-To: References: <18824.5343.933820.298039@panix5.panix.com> <18825.5815.559562.370640@panix5.panix.com> <18825.26238.914919.705377@panix5.panix.com> <9bfc700a0902040218m3c95a620sced72084e76c2ad4@mail.gmail.com> <18827.45209.826238.805391@panix5.panix.com> <18828.34889.865240.983179@panix5.panix.com> Message-ID: <18828.39226.268367.520484@panix5.panix.com> > I still don't see the benefit of knowing what version of Python a > magic number matches to. So I know some bytecode was compiled by > Python 2.5 while I am running Python 2.6. Yep. Not uncommon for me to have several versions of Python available. It so happens that the computer where this email is being sent has at least 9 versions, possibly more because I didn't check if python.new and python.old are one of those other 9. (I don't maintain this box, but pay someone else to; clearly this is a pathological case, but it's kind of interesting to me that there are more than 9 versions installed and I did not contrive this case.) > What benefit do I derive > from knowing that compared to just knowing that it was not compiled by > Python 2.6? I mean are you ultimately planning on launching a > different interpreter based on what generated the bytecode? If there's a mismatch in the first place, it means there's confusion on someone's part. Don't you want to foster development of programs that try to minimize confusion? A subsidiary effect, when a magic-number-to-version-string mapping is not readily available in situations where it would be helpful, is back-and-forth dialog in bug reports: asking for, and telling folks how to get, the version number (because it's not in the error message, because it's not readily available to a programmer). Never underestimate the level of users, especially if you are working on something like a debugger.
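The check under discussion can be sketched in a few lines. This uses today's importlib.util.MAGIC_NUMBER (the modern spelling of what was imp.get_magic() in 2009) and reads only the first four bytes of the .pyc, since the rest of the header layout has changed across versions; the stdlib still offers no magic-number-to-version-string dictionary, so only "was this compiled by the running interpreter" is easy.

```python
import importlib.util
import os
import py_compile
import tempfile

def pyc_magic(path):
    # The first four bytes of a .pyc are the magic number identifying
    # which interpreter version produced the bytecode.
    with open(path, "rb") as f:
        return f.read(4)

def compiled_by_this_interpreter(path):
    # Compare a .pyc's magic number against the running interpreter's.
    return pyc_magic(path) == importlib.util.MAGIC_NUMBER

# Demo: compile a trivial module and check its magic number.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "m.py")
    with open(src, "w") as f:
        f.write("x = 1\n")
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, "m.pyc"))
    print(compiled_by_this_interpreter(pyc))  # True
```

A tool that wanted to report *which* version produced a mismatched file would still have to carry its own table of historical magic numbers, which is exactly the maintenance burden debated below.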
If we hope that someone's going to know about and read that comment in the C file, turn it into a dictionary, and maintain it any time the magic number gets updated -- it's probably not going to happen often. Again, although I see specific uses in a debugger, this is really an issue regarding code tools or programs that deal with Python code. I know there's a disassembler, but you mean there isn't a dump tool which shows all of the information inside a compiled object including a disassembly, the Python version the program was compiled with, the mtime in human-readable format, and whatnot? From qrczak at knm.org.pl Fri Feb 6 21:47:26 2009 From: qrczak at knm.org.pl (Marcin 'Qrczak' Kowalczyk) Date: Fri, 6 Feb 2009 21:47:26 +0100 Subject: [Python-ideas] binding vs rebinding In-Reply-To: <498C92CA.5060308@scottdial.com> References: <20090205142211.77ba2568@o> <498B5F52.1080200@pearwood.info> <20090206113145.6b2227bd@o> <498C4B7A.3060902@latte.ca> <3f4107910902061106g774d599fs4705a4a3ffb0fa3@mail.gmail.com> <498C92CA.5060308@scottdial.com> Message-ID: <3f4107910902061247r4ec438cboaa9cef88abd0d4ee@mail.gmail.com> On Fri, Feb 6, 2009 at 20:43, Scott Dial wrote: >> 1. Java - split >> 2. C - split >> 3. C++ - split >> 6. C# - split > > Are you counting the implicit lack of a binding as a binding? I am treating as "split" languages which distinguish introducing a new variable in the current scope from assignment to an existing variable. Or, in other words, languages where assignment alone cannot be used to create a new local variable. When a variable is introduced, its value can be given explicitly, or it can be assumed to be some default (possibly depending on the type), or it can be unspecified, or it can require further rebinding. Usually there is a choice between the first and some of the others. I do not consider these differences. > "int x;" does not bind anything to "x", It creates an object of type int and binds the name x to it.
The object initially has an unspecified value, and can have its value changed. The association between usages of x in this scope and this object cannot be changed. >> 4. (Visual) Basic - split >> 8. Perl - split >> 9. JavaScript - split >> 10. Delphi - split > > In addition to the debate above, the declarations are optional in these > languages, and in that case, they are very much the same semantics as > Python. In Perl you can assign to a non-existing variable, which is in this case assumed to be global (and you get a warning if warnings are turned on). In non-trivial programs it is local variables which matter. For JavaScript: http://www.webdevelopersnotes.com/tutorials/javascript/global_local_variables_scope_javascript.php3 For Visual Basic: http://msdn.microsoft.com/en-us/library/1t0wsc67(VS.80).aspx For Delphi: http://delphi.about.com/od/beginners/l/aa060899.htm All these languages have a syntax for introducing a variable, distinct from assignment to an existing variable, and that syntax is needed for creating a local variable in a function scope. -- Marcin Kowalczyk qrczak at knm.org.pl http://qrnik.knm.org.pl/~qrczak/ From ben+python at benfinney.id.au Fri Feb 6 22:27:33 2009 From: ben+python at benfinney.id.au (Ben Finney) Date: Sat, 07 Feb 2009 08:27:33 +1100 Subject: [Python-ideas] Making colons optional? References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87skmsvxt2.fsf@benfinney.id.au> <9B3D70F3-673C-465F-BA17-FE95DF7CF6D6@gmail.com> Message-ID: <87fxirw916.fsf@benfinney.id.au> Riobard Zhan writes: > Why do you want a strong association with "here comes a suite" coming > from colons? I want that indication from *something*, because indentation isn't sufficient. A colon ':'
is a good choice because its semantic meaning has a good analogue to the same character in natural language. > Why don't you want a strong association with "here comes a > statement" coming from semicolons? Because "start of a new line at the same indentation level" is sufficient for that. (Not to mention that the very idea of "here comes a statement" is rather foreign; I have so little need for something extra to do that job that I barely recognise it as a job that needs doing.) > Can you explain the inconsistency? Entirely different requirements. If that's not enough for you, I think the gulf of understanding is too wide for my level of interest in exploring the reasons. -- \ "If you go flying back through time and you see somebody else | `\ flying forward into the future, it's probably best to avoid eye | _o__) contact." --Jack Handey | Ben Finney From brett at python.org Fri Feb 6 23:27:36 2009 From: brett at python.org (Brett Cannon) Date: Fri, 6 Feb 2009 14:27:36 -0800 Subject: [Python-ideas] Add a cryptographic hash (e.g SHA1) of source toPython Compiled objects? In-Reply-To: <18828.39226.268367.520484@panix5.panix.com> References: <18824.5343.933820.298039@panix5.panix.com> <18825.26238.914919.705377@panix5.panix.com> <9bfc700a0902040218m3c95a620sced72084e76c2ad4@mail.gmail.com> <18827.45209.826238.805391@panix5.panix.com> <18828.34889.865240.983179@panix5.panix.com> <18828.39226.268367.520484@panix5.panix.com> Message-ID: On Fri, Feb 6, 2009 at 12:10, wrote: > > I still don't see the benefit of knowing what version of Python a > > magic number matches to. So I know some bytecode was compiled by > > Python 2.5 while I am running Python 2.6. > > Yep. Not uncommon for me to have several versions of Python > available. It so happens that the computer where this email is being > sent has at least 9 versions, possibly more because I didn't check if > python.new and python.old are one those other 9.
(I don't maintain > this box, but pay someone else to; clearly this is a pathological > case, but it's kind of interesting to me that there are at more than 9 > versions installed and I did not contrive this case.) > > > What benefit do I derive > from knowing that compared to just knowing that it was not compiled by > Python 2.6? I mean are you ultimately planning on launching a > different interpreter based on what generated the bytecode? > > If there's a mismatch in the first place, it means there's confusion > on someone's part. Don't you want to foster development of programs > that try to minimize confusion? Come on, that is such a baiting question. You view adding a dict of the versions as a way to help deal with confusion in a case where someone actually cares about which version of bytecode is used. I view it as another API someone is going to have to maintain for a use case I do not see as justifying that maintenance. Bytecode is purely a performance benefit, nothing more. This is why we so readily reconstruct it. Heck, in Python 3.0 the __file__ attribute *always* points to the .py file even if the .pyc was used for the load. > Subsidiary effects when support of > magic to version string are not readily available in situations where > it would be helpful is possibly back and forth dialog in bug reports > one is asking what telling folks how to get the version number > (because it's not in the error message because its not readily > available by a programmer). I have never had a bug report come in where I had to care about the magic number of a .pyc file. > Never underestimate the level of users, > especially if you are working on something like a debugger. > I don't, else I would not be a Python developer.
But along with not underestimating also means that if you need to worry about something like what version of Python generates what magic number then you can look at Python/compile.c just as easily without me adding some code that has to be maintained. > If we hope that someone's going to know about and read that comment in > the C file turn it into a dictionary and maintain it anytime the magic > number gets updated, it's probably not going to happen often. > Nope, it probably won't, and honestly I am fine with that. > Again, although I see specific uses in a debugger this really an issue > regarding code tools or programs that deal with Python code. I know > there's a disassembler, but you mean there isn't a dump tool which > shows all of the information inside a compiled object including a > disassembly, Python version the program was compiled with, mtime in > human readable format, and whatnot? > Just so there is no confusion: a .pyc is not a compiled object, but a file containing a magic number, mtime, and a marshaled code object. And no, there is nothing in the standard library that dumps all of this data out for a .pyc. This is somewhat on purpose as we make no hard guarantees we won't change the format of .pyc files at some point (see the python-ideas list for a discussion about changing the format to allowing a variable amount of metadata). -Brett From santagada at gmail.com Sat Feb 7 00:20:54 2009 From: santagada at gmail.com (Leonardo Santagada) Date: Fri, 6 Feb 2009 21:20:54 -0200 Subject: [Python-ideas] Making colons optional? 
In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> <20090206143839.GA5218@panix.com> Message-ID: <04F982A6-04BF-4F8C-B9DF-FEBEF60661F7@gmail.com> On Feb 6, 2009, at 1:11 PM, Riobard Zhan wrote: > > On 6-Feb-09, at 11:08 AM, Aahz wrote: > >> On Fri, Feb 06, 2009, Riobard Zhan wrote: >>> >>> I think the mailing list I am posting to is called "Python Ideas". >> >> That doesn't mean that people will have any interest in your >> ideas. You >> should pay attention to what people are telling you and spend less >> time >> arguing with them. > > > It is pretty clear that colon-supporters do not pay attention to the > inconsistency of semicolons being optional and colons being > mandatory that I mentioned, and try to argue with me using a set of > reasons why they love colons when in fact the same reasons would > obviously make them love semicolons, and ignore the fact that they > are used to colons not because it is a better idea but they are > never allowed to omit. > > >> Keep in mind that Guido is still BDFL, and you have >> effectively zero chance of convincing him to drop the colons. > > It does not stop me giving it a try. I will give you an idea: change the rules of the parser so your idea works (making colons optional), and maybe even making commas optional, and then show us how it works, and then maybe convert the whole stdlib to be colon free (and maybe comma free)... this is probably going to convince more people that you are right... or it will convince you that you are wrong :). I think it will be quite simple to do (as you say, they are there just because, so it should be a simple grammar change).
Good luck, and I will probably play with your demo python -- Leonardo Santagada santagada at gmail.com From steve at pearwood.info Sat Feb 7 00:41:53 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 07 Feb 2009 10:41:53 +1100 Subject: [Python-ideas] Making colons optional? In-Reply-To: <5EE58FCE-3420-4FB3-BB82-0AAEBBA1E645@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <498BEB78.5000400@pearwood.info> <5EE58FCE-3420-4FB3-BB82-0AAEBBA1E645@gmail.com> Message-ID: <498CCAC1.9060304@pearwood.info> Riobard Zhan wrote: > What bothers me here is that colon-lovers seem to assume their choice is > *the* choice. Colons are not the only choice, but they are the choice made nearly twenty years ago. Colons are the status quo. We don't have to justify the choice, you have to justify the change. All we have to do is nothing, and nothing will change. You have to convince others, and either change the interpreter or convince somebody else to change the interpreter, and convince the Python-dev team and Guido to accept that change. Given the opposition on this list, do you think that is likely? -- Steven From steve at pearwood.info Sat Feb 7 00:59:18 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 07 Feb 2009 10:59:18 +1100 Subject: [Python-ideas] Making stars optional? (was: Making colons optional?) In-Reply-To: References: <498B623E.1040206@pearwood.info> Message-ID: <498CCED6.7040209@pearwood.info> Georg Brandl wrote: > Steven D'Aprano schrieb: > >> Mathematicians get away with this sort of ambiguity because they are >> writing for other mathematicians, not for a computer. 
Because
>> mathematical proofs rely on a sequence of equations, not just a single
>> statement, the ambiguity can be resolved:
>>
>> y = a(b+c) - ac    # does this mean a+() or a*() or something else?
>> y = ab + ac - ac   # ah, it must have been a*()
>> y = ab
>
> No context is needed to know what a(b+c) means.

Only if you assume that the mathematician didn't make a typo. Any operator could appear between the a and the bracketed term: you're deducing the presence of an implied multiplication by the absence of an operator, which is the convention in mathematics, but it is an unsafe deduction if you have only a single line in isolation. It only becomes safe in context because any accidental omission of the operator should become obvious in the following line(s) of the proof. I'm not saying this as a criticism of mathematicians. They value brevity over the increased protection from typos, which is a valid choice to make. But it is not a choice available to Python, because we have multi-character variable names, and mathematical expressions stand alone in Python code, they aren't part of an extended proof. > In maths, > you only have single-character variable names (sub-/superscripts > notwithstanding), so ab always means a*b. Except in the presence of typos. This is a small risk for mathematicians, but a large risk for Python programmers. Not because mathematicians are better typists than Python programmers, but because there are more opportunities to catch the typo in a mathematical proof than there are in a program. -- Steven From steve at pearwood.info Sat Feb 7 01:30:38 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 07 Feb 2009 11:30:38 +1100 Subject: [Python-ideas] Making colons optional?
In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> <20090206143839.GA5218@panix.com> Message-ID: <498CD62E.9060401@pearwood.info> Riobard Zhan wrote: > It is pretty clear that colon-supporters do not pay attention to the > inconsistency of semicolons being optional and colons being mandatory No, it is pretty clear that you do not understand the fundamental difference in semantics between colons and semi-colons. It is not an inconsistency. Colons are used as an indicator of *association*: the block following the colon is strongly associated with the line containing the colon, just like in English. This reinforces the association due to indentation, and makes it more obvious in complicated cases:

    if (some very complicated statement) or (another
            big statement which goes over many
            lines) or clause:
        do something here
    else:
        do something else

Semi-colons are used as a *separator*, and they are designed for one-liners. They are permitted in multi-line programs only because there is no need to bother forbidding them, but they are completely redundant if followed by a line break instead of another statement.

    x = 1; y = 2    # the semi-colon is useful

    x = 1;
    y = 2           # the semi-colon is useless line noise

Because colons are used for a very different thing than semi-colons, any inconsistency between the rules for one and the rules for the other is imaginary. Different purposes, different rules. -- Steven From george.sakkis at gmail.com Sat Feb 7 02:14:16 2009 From: george.sakkis at gmail.com (George Sakkis) Date: Fri, 6 Feb 2009 20:14:16 -0500 Subject: [Python-ideas] Making stars optional? (was: Making colons optional?)
In-Reply-To: <498CCED6.7040209@pearwood.info> References: <498B623E.1040206@pearwood.info> <498CCED6.7040209@pearwood.info> Message-ID: <91ad5bf80902061714g527d0c64h55ba781774f56d78@mail.gmail.com> On Fri, Feb 6, 2009 at 6:59 PM, Steven D'Aprano wrote: >> In maths, >> you only have single-character variable names (sub-/superscripts >> notwithstanding), so ab always means a*b. > > Except in the presence of typos. In the presence of typos all bets are off, unless you are aware of any typo-proof writing system. Python certainly isn't one since, say, "x.y" and "x,y" are pretty similar, both visually and in keyboard distance. George From rocky at gnu.org Sat Feb 7 02:14:37 2009 From: rocky at gnu.org (rocky at gnu.org) Date: Fri, 6 Feb 2009 20:14:37 -0500 Subject: [Python-ideas] Add a cryptographic hash (e.g SHA1) of source toPython Compiled objects? In-Reply-To: References: <18824.5343.933820.298039@panix5.panix.com> <18825.26238.914919.705377@panix5.panix.com> <9bfc700a0902040218m3c95a620sced72084e76c2ad4@mail.gmail.com> <18827.45209.826238.805391@panix5.panix.com> <18828.34889.865240.983179@panix5.panix.com> <18828.39226.268367.520484@panix5.panix.com> Message-ID: <18828.57469.249341.43500@panix5.panix.com> Clearly I've failed to make any compelling cases. So be it. Thanks for considering. Brett Cannon writes: > On Fri, Feb 6, 2009 at 12:10, wrote: > > > I still don't see the benefit of knowing what version of Python a > > > magic number matches to. So I know some bytecode was compiled by > > > Python 2.5 while I am running Python 2.6. > > > > Yep. Not uncommon for me to have several versions of Python > > available. It so happens that the computer where this email is being > > sent has at least 9 versions, possibly more because I didn't check if > > python.new and python.old are one those other 9. 
(I don't maintain > > this box, but pay someone else to; clearly this is a pathological > > case, but it's kind of interesting to me that there are at more than 9 > > versions installed and I did not contrive this case.) > > > > > > > What benefit do I derive > > > from knowing that compared to just knowing that it was not compiled by > > > Python 2.6? I mean are you ultimately planning on launching a > > > different interpreter based on what generated the bytecode? > > > > If there's a mismatch in the first place, it means there's confusion > > on someone's part. Don't you want to foster development of programs > > that try to minimize confusion? > > Come on, that is such a baiting question. You view adding a dict of > the versions as a way to help deal with confusion in a case where > someone actually cares about which version of bytecode is used. I view > it as another API someone is going to have to maintain for a use case > I do not see as justifying that maintenance. Bytecode is purely a > performance benefit, nothing more. This is why we so readily > reconstruct it. Heck, in Python 3.0 the __file__ attribute *always* > points to the .py file even if the .pyc was used for the load. > > > Subsidiary effects when support of > > magic to version string are not readily available in situations where > > it would be helpful is possibly back and forth dialog in bug reports > > one is asking what telling folks how to get the version number > > (because it's not in the error message because its not readily > > available by a programmer). > > I have never had a bug report come in where I had to care about he > magic number of a .pyc file. > > > Never underestimate the level of users, > > especially if you are working on something like a debugger. > > > > I don't, else I would not be a Python developer. 
But along with not > underestimating also means that if you need to worry about something > like what version of Python generates what magic number then you can > look at Python/compile.c just as easily without me adding some code > that has to be maintained. > > > If we hope that someone's going to know about and read that comment in > > the C file turn it into a dictionary and maintain it anytime the magic > > number gets updated, it's probably not going to happen often. > > > > Nope, it probably won't, and honestly I am fine with that. > > > Again, although I see specific uses in a debugger this really an issue > > regarding code tools or programs that deal with Python code. I know > > there's a disassembler, but you mean there isn't a dump tool which > > shows all of the information inside a compiled object including a > > disassembly, Python version the program was compiled with, mtime in > > human readable format, and whatnot? > > > > Just so there is no confusion: a .pyc is not a compiled object, but a > file containing a magic number, mtime, and a marshaled code object. > > And no, there is nothing in the standard library that dumps all of > this data out for a .pyc. This is somewhat on purpose as we make no > hard guarantees we won't change the format of .pyc files at some point > (see the python-ideas list for a discussion about changing the format > to allowing a variable amount of metadata). > > -Brett > From steve at pearwood.info Sat Feb 7 03:21:30 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 07 Feb 2009 13:21:30 +1100 Subject: [Python-ideas] Making stars optional? (was: Making colons optional?) 
In-Reply-To: <91ad5bf80902061714g527d0c64h55ba781774f56d78@mail.gmail.com> References: <498B623E.1040206@pearwood.info> <498CCED6.7040209@pearwood.info> <91ad5bf80902061714g527d0c64h55ba781774f56d78@mail.gmail.com> Message-ID: <498CF02A.2070807@pearwood.info> George Sakkis wrote: > On Fri, Feb 6, 2009 at 6:59 PM, Steven D'Aprano wrote: > >>> In maths, >>> you only have single-character variable names (sub-/superscripts >>> notwithstanding), so ab always means a*b. >> Except in the presence of typos. > > In the presence of typos all bets are off, unless you are aware of any > typo-proof writing system. Python certainly isn't one since, say, > "x.y" and "x,y" are pretty similar, both visually and in keyboard > distance. You overstate your case: not *all* bets are off, just some of them. Some typos have greater consequences than others. Some will be discovered earlier than others, and the earlier they are discovered, the more likely they are to be easily fixed without the need for significant debugging effort. E.g. if I type R**$ instead of R**4, such a typo will be picked up in Python immediately. But R**5 instead could be missed for arbitrarily large amounts of time. E.g. if you mean x.y but type x,y instead, then such an error will be discovered *very* soon, unless you happen to also have a name 'y'. But anyway, we're not actually disagreeing. (At least, I don't think we are.) We're just discussing how some conventions encourage errors and others discourage them, and the circumstances of each. 
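The multi-character-name point is easy to demonstrate in current Python: a juxtaposition like ab is always read as one identifier, and putting two names side by side is simply a syntax error rather than a multiplication.

```python
a = 3
b = 4
ab = 5              # a perfectly legal, unrelated two-letter name

print(a * b)        # 12 -- multiplication must be spelled out
print(ab)           # 5  -- 'ab' is one identifier, never a*b

# Juxtaposition is not an implicit operator; it fails to parse at all.
try:
    eval("a b")
except SyntaxError:
    print("juxtaposition is a SyntaxError, not multiplication")
```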
-- Steven From denis.spir at free.fr Sat Feb 7 13:31:34 2009 From: denis.spir at free.fr (spir) Date: Sat, 7 Feb 2009 13:31:34 +0100 Subject: [Python-ideas] binding vs rebinding In-Reply-To: <3f4107910902061247r4ec438cboaa9cef88abd0d4ee@mail.gmail.com> References: <20090205142211.77ba2568@o> <498B5F52.1080200@pearwood.info> <20090206113145.6b2227bd@o> <498C4B7A.3060902@latte.ca> <3f4107910902061106g774d599fs4705a4a3ffb0fa3@mail.gmail.com> <498C92CA.5060308@scottdial.com> <3f4107910902061247r4ec438cboaa9cef88abd0d4ee@mail.gmail.com> Message-ID: <20090207133134.6574444a@o> On Fri, 6 Feb 2009 21:47:26 +0100, "Marcin 'Qrczak' Kowalczyk" wrote: > I am treating as "split" languages which distinguish introducing a new > variable in the current scope from assignment to an existing variable. > > Or, in other words, languages where assignment alone cannot be used to > create a new local variable. You are thus enlarging the topic to languages where it is possible to introduce a name without any *explicit* binding. As the title of the thread shows, this was not my initial intention, but I must recognize this is pertinent -- see bottom of the message. This debate has proved fruitful, I guess, thank you very much. Maybe it's time to close it, as probably we have explored the topic enough and this aspect of Python will not change in the foreseeable future. I will try to sum up fairly the main discussion points -- feel free to criticize ;-). My starting assertion was that (first) binding and rebinding are conceptually different actions, worth being distinguished in syntax (I wrote: "the distinction *makes sense*"):

* Binding: create a name, bind a value to it.
* Rebinding: change the value bound to the name.

An objection came up that both can be seen as "bind a name to a value". We can nevertheless assume, as there was no further debate on this point, that in most cases and/or for most programmers these actions have a different meaning.
Maybe there should be more material to state on that point. I also asserted that these actions are also different at the interpreter level. This proved to be wrong, which someone showed by providing matching bytecodes. [Is this a consequence that namespaces are implemented as dicts? I cannot imagine, at the underlying level, how adding an entry to a hash table can be the same as changing an entry's value.] Among possible advantages of the distinction counts the detection of name spelling errors by throwing NameError (or AttributeError): thus avoiding both erroneous rebinding of an existing name and erroneous creation of a new name. There was little comment on this point, too. Maybe it does not seem important or, on the contrary, you consider it relevant but obvious enough. I also expressed the idea that explicit rebinding is so helpful that it may even avoid the need for 'global' and 'nonlocal' declarations. There was some debate about that point, mostly supporting the idea (right?), but also some opposition. Some talks took place around several coding situations.

    for item in container:
        name = ...

The temporary/utility variables in loops revealed a related topic, namely the absence of loop local scopes in python. This launched a short specific debate: there is clearly a need for that, but then it becomes unclear how to later access a value computed in the loop. [There were threads around this on python lists.]

    if condition:
        name = value
        do_something(name)
    name = value
    do_something_else(name)

In the case of a value that has similar meaning and similar use, both in standard flow and a conditioned branch, it is legitimate to use a unique name. I objected that if a case is special, then the name should be, too. An objection to my objection may be summarized as "naming freedom" ;-)

    name = ...
    name = ...
    name = ...

In this state of mind, the above code is perfectly acceptable.
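The spelling-error advantage mentioned above can be made concrete (Config and update_timeout are names invented here purely for illustration): because assignment both creates and rebinds, a misspelled rebinding silently creates a new name instead of raising AttributeError.

```python
class Config:
    def __init__(self):
        self.timeout = 30

    def update_timeout(self, value):
        # Typo: 'timeuot' silently CREATES a new attribute instead of
        # rebinding 'timeout' -- exactly the class of bug a distinct
        # rebinding syntax would turn into an immediate error.
        self.timeuot = value

c = Config()
c.update_timeout(60)
print(c.timeout)   # still 30: the typo went completely unnoticed
```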
Anyway, this is not really an issue, as it can be simply expressed in the frame of binding/rebinding by using the rebinding syntax in following assignments. (Actually, this would even help legibility when assignments are not directly in a row.) Finally, there was a dense exchange launched by the assumption that most languages, especially the most used ones, do not allow such a distinction. Which is obviously True (which should not mean Good for pythonists, as shown by the syntactic use of indentation ;-). Marcin enlarged the debate by opposing (name) introduction to rebinding instead: this one distinction is indeed allowed and even enforced by most prominent languages, namely C and successors. I find this observation highly pertinent. My personal conclusion derives from Marcin's note: Python was designed to be familiar to C programmers (stated by Guido), adopted its assignment syntax, but got rid of any kind of declaration. As a consequence, there is only one form of name binding left. Languages that have explicit name introduction, with or without implicit default binding, do not need to distinguish rebinding. My point of view is now that declaration-free languages like Python would do better to have it; either enforced, or only allowed. Otherwise we lose all the conceptual and syntactic benefits of such a distinction. Denis [Please erase the debate synthesis if you reply to the conclusion only.] ------ la vida e estranya From santagada at gmail.com Sat Feb 7 15:19:50 2009 From: santagada at gmail.com (Leonardo Santagada) Date: Sat, 7 Feb 2009 12:19:50 -0200 Subject: [Python-ideas] Making stars optional? (was: Making colons optional?)
In-Reply-To: <498CF02A.2070807@pearwood.info> References: <498B623E.1040206@pearwood.info> <498CCED6.7040209@pearwood.info> <91ad5bf80902061714g527d0c64h55ba781774f56d78@mail.gmail.com> <498CF02A.2070807@pearwood.info> Message-ID: <26594513-3A0C-4223-9095-0293CA884663@gmail.com> On Feb 7, 2009, at 12:21 AM, Steven D'Aprano wrote: > George Sakkis wrote: >> On Fri, Feb 6, 2009 at 6:59 PM, Steven D'Aprano >> wrote: >>>> In maths, >>>> you only have single-character variable names (sub-/superscripts >>>> notwithstanding), so ab always means a*b. >>> Except in the presence of typos. >> In the presence of typos all bets are off, unless you are aware of >> any >> typo-proof writing system. Python certainly isn't one since, say, >> "x.y" and "x,y" are pretty similar, both visually and in keyboard >> distance. > > You overstate your case: not *all* bets are off, just some of them. > > Some typos have greater consequences than others. Some will be > discovered earlier than others, and the earlier they are discovered, > the more likely they are to be easily fixed without the need for > significant debugging effort. > > E.g. if I type R**$ instead of R**4, such a typo will be picked up > in Python immediately. But R**5 instead could be missed for > arbitrarily large amounts of time. > > E.g. if you mean x.y but type x,y instead, then such an error will > be discovered *very* soon, unless you happen to also have a name 'y'. > > But anyway, we're not actually disagreeing. (At least, I don't think > we are.) We're just discussing how some conventions encourage errors > and others discourage them, and the circumstances of each. This doesn't really make much sense to me... if you don't use tests to verify your program, both R**5 and R**A are just the same error; if, for example, this code path is a rare case, it will only be discovered when something goes wrong with the program...
And a testable program is much more easily verifiable than a mathematical proof on paper (especially because our brains are used to fixing little problems in reality to match what we think is right). Still, I think that ab meaning a*b in Python is silly, even if Python were only the scripting language of Sage. -- Leonardo Santagada santagada at gmail.com
From greg.ewing at canterbury.ac.nz Sat Feb 7 23:07:11 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 08 Feb 2009 11:07:11 +1300 Subject: [Python-ideas] binding vs rebinding In-Reply-To: <20090207133134.6574444a@o> References: <20090205142211.77ba2568@o> <498B5F52.1080200@pearwood.info> <20090206113145.6b2227bd@o> <498C4B7A.3060902@latte.ca> <3f4107910902061106g774d599fs4705a4a3ffb0fa3@mail.gmail.com> <498C92CA.5060308@scottdial.com> <3f4107910902061247r4ec438cboaa9cef88abd0d4ee@mail.gmail.com> Message-ID: <498E060F.7060102@canterbury.ac.nz> spir wrote: > [Is this a consequence that namespaces are implemented as dicts? > I cannot imagine, at the underlying level, how adding an entry to a hash > table can be the same as changing an entry's value.] There's a difference somewhere, but it's far below the level of bytecode, buried somewhere in the C code that implements the bytecodes.
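[Editor's note: Greg's point can be checked from Python itself. The sketch below (mine, not from the thread) uses the `dis` module's modern `get_instructions` API; opcode names are CPython-specific. It shows that a first binding and a rebinding of the same local name compile to the identical store opcode, so the distinction spir asks about is invisible at the bytecode level.]

```python
import dis

def f():
    x = 1  # first binding of the name
    x = 2  # rebinding: compiles to the very same store opcode

# Collect the opcode names; binding and rebinding are indistinguishable.
ops = [ins.opname for ins in dis.get_instructions(f)]
print(ops.count("STORE_FAST"))  # 2 -- same opcode both times
```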
-- Greg From leif.walsh at gmail.com Sun Feb 8 02:48:46 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Sat, 7 Feb 2009 20:48:46 -0500 Subject: [Python-ideas] [Python-Dev] yield * (Re: Missing operator.call) In-Reply-To: <20090208015007.12555.885411350.divmod.xquotient.4298@weber.divmod.com> References: <498991D9.3060407@avl.com> <498BF062.5000407@canterbury.ac.nz> <87ljskdiyy.fsf@xemacs.org> <498C1D0A.4040200@canterbury.ac.nz> <498D4E9D.6070405@canterbury.ac.nz> <498E2EBF.7090309@canterbury.ac.nz> <20090208015007.12555.885411350.divmod.xquotient.4298@weber.divmod.com> Message-ID: [+python-ideas -python-dev] On Sat, Feb 7, 2009 at 8:50 PM, wrote: > has anyone considered the syntax 'yield from iterable'? i.e. > > def foo(): > yield 1 > yield 2 > > def bar(): > yield from foo() > yield from foo() > > list(bar()) -> [1, 2, 1, 2] > > I suggest this because (1) it's already what I say when I see the 'for' > construct, i.e. "foo then *yield*s all results *from* bar", and (2) no new > keywords are required. I still don't understand why such a construct is necessary. Is >>> for elt in iterable: >>> yield elt really all that bad? Maybe it's a little silly-looking, but at least it's easy to understand and not _that_ hard to type.... -- Cheers, Leif From leif.walsh at gmail.com Sun Feb 8 03:36:51 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Sat, 7 Feb 2009 21:36:51 -0500 Subject: [Python-ideas] [Python-Dev] yield * (Re: Missing operator.call) In-Reply-To: <20090208022406.12555.1343203422.divmod.xquotient.4301@weber.divmod.com> References: <498991D9.3060407@avl.com> <87ljskdiyy.fsf@xemacs.org> <498C1D0A.4040200@canterbury.ac.nz> <498D4E9D.6070405@canterbury.ac.nz> <498E2EBF.7090309@canterbury.ac.nz> <20090208015007.12555.885411350.divmod.xquotient.4298@weber.divmod.com> <20090208022406.12555.1343203422.divmod.xquotient.4301@weber.divmod.com> Message-ID: On Sat, Feb 7, 2009 at 9:24 PM, wrote: > For what it's worth, I don't care if this is added. 
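[Editor's note: for readers skimming the thread, here is the spelled-out equivalence Leif is defending, as a runnable sketch. The plain-`for` delegation only covers simple iteration, not `send`/`throw` forwarding; the proposed syntax was eventually accepted as `yield from` in Python 3.3 via PEP 380.]

```python
def foo():
    yield 1
    yield 2

def bar():
    # Each "for ... yield" stanza is what the proposed
    # "yield from foo()" would abbreviate.
    for elt in foo():
        yield elt
    for elt in foo():
        yield elt

print(list(bar()))  # [1, 2, 1, 2]
```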
I can continue typing > that stanza. I just know that if it *is* added, I'd find it a lot easier to > read "yield from foo()" than "yield *foo()". Similarly conditioned +1. -- Cheers, Leif From bruce at leapyear.org Sun Feb 8 06:23:57 2009 From: bruce at leapyear.org (Bruce Leban) Date: Sat, 7 Feb 2009 21:23:57 -0800 Subject: [Python-ideas] binding vs rebinding In-Reply-To: <20090207133134.6574444a@o> References: <20090205142211.77ba2568@o> <498B5F52.1080200@pearwood.info> <20090206113145.6b2227bd@o> <498C4B7A.3060902@latte.ca> <3f4107910902061106g774d599fs4705a4a3ffb0fa3@mail.gmail.com> <498C92CA.5060308@scottdial.com> <3f4107910902061247r4ec438cboaa9cef88abd0d4ee@mail.gmail.com> <20090207133134.6574444a@o> Message-ID: On Sat, Feb 7, 2009 at 4:31 AM, spir wrote: > We can nevertheless assume, as there was no further debate on this > point, that in most cases and/or for most programmers these actions > have a different meaning. Please don't assume. Just because people don't state their disagreements does not mean they agree. I found the statement that binding and rebinding are "obviously" different to be clearly NOT obvious and not a particularly practical distinction and really not worth discussing. I only respond to this to prevent the meme that "everyone agrees" with this from propagating. There *is* something in Python related to this that I find obviously different and that's local and global variables. I would prefer that all global variables have to be included in a global declaration. I dislike the fact that an assignment to a variable changes other references to that same name from local to global references. This sort of feels like "spooky action at a distance" to me. --- Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From yaogzhan at gmail.com Sun Feb 8 07:47:00 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Sun, 8 Feb 2009 03:17:00 -0330 Subject: [Python-ideas] Making colons optional? 
In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <19D96DF5-6757-4793-B25A-5AE345CB7B4E@gmail.com> Message-ID: <0DB8FCE8-8EBF-44CC-89DF-E5D601E8822A@gmail.com> On 6-Feb-09, at 2:27 PM, Bruce Leban wrote: > So what? Commas are also mandatory. They look like semicolons so we > should make them optional too. Am I joking? You can not generalize that far. Most programming languages require commas (notable exceptions include Lisp and Tcl), but Python is the only language that requires trailing colons. There are some differences between making commas optional and making trailing colons optional. From yaogzhan at gmail.com Sun Feb 8 07:47:03 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Sun, 8 Feb 2009 03:17:03 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <87ab8zedt1.fsf@xemacs.org> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> <20090206143839.GA5218@panix.com> <87ab8zedt1.fsf@xemacs.org> Message-ID: <0F9C1637-3E39-4FD4-9F9D-7B2BA45723C4@gmail.com> On 6-Feb-09, at 12:51 PM, Stephen J. Turnbull wrote: > The inconsistency is quite small. In fact, in English colons and > semicolons have both different syntax and different semantics, and the > same is true of Python. In English, the semicolon expresses a > conjunction of two not necessarily related sentences. Python is the > same, except that since we all run von Neumann machines, conjunction > implies sequencing. The colon, on the other hand, represents a > relationship of some kind. 
A native-speaker simply doesn't write > things like "The sky is blue: mesons are composed of two quarks." > (Well, the coiner of the word "quark" might, which is the exception > that proves the rule.) And the same is true in Python; the code > fragment preceding the colon controls the statement (suite) following > it. Thus, the apparent isomorphism of the semantics is quite > superficial, almost syntactic itself. > > Syntactically, in English a semicolon takes two sentences to conjoin. > A colon, on the other hand, must have a full sentence on one side or > the other, but the other side may be almost anything.[1] Similarly in > Python: both what precedes the semicolon and what follows it must be > complete statements, while what precedes the colon must be a fragment, > and what follows it is a complete statement (suite). So the colon and > semicolon do not have isomorphic syntax, either. > > In fact, the semicolon is indeed redundant, as it is precisely > equivalent to a newline + the current level of indentation. This kind > of redundancy is quite unusual in Python, so I wonder if the BDFL > might regret having permitted semicolons at all, as he has expressed > regret for allowing the extended assignment operators (eg, +=). I have no problem with the different semantics of colons and semicolons. What I do have a problem with is that colons are mandatory but semicolons are optional, while both are completely redundant. Colons are redundant too because they are precisely equivalent to a newline + a deeper level of indentation. > In colon-less Python, the control fragment is insufficiently divided > from the controlled suite. Just as in English, a bit of redundancy to > reflect the hierarchical relationship between the control structure > and the controlled suite is useful. I think indentation is sufficient to divide them. A block indented one level deeper is always associated with the preceding, one-level-shallower code.
We even indent line continuation to reflect such relationships. > In Python, the same character is used, which is helpful to those whose > native languages use colons. This is indeed *good* design. It > expresses, and emphasizes, the hierarchical structure of the control > construct, and does so using a syntax that any English-speaker (at > least) will recognize as analogous. The hierarchical structure is expressed by indentation. Colons simply (over)emphasize it. That's why I think they are redundant. Choosing the colon character is fine. Making it mandatory is bad. > I don't expect to convince you that the required colon is good design, > but I think it puts paid to the notion that colons and semicolons are > isomorphic. It is not true that what we conclude about semicolons is > necessarily what we should conclude about colons. Forgive me if I do not fully understand your point, but it appears to me that you conclude semicolons are optional because they are redundant. I think the conclusion is precisely the same for colons. From yaogzhan at gmail.com Sun Feb 8 07:47:07 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Sun, 8 Feb 2009 03:17:07 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <498CD62E.9060401@pearwood.info> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> <20090206143839.GA5218@panix.com> <498CD62E.9060401@pearwood.info> Message-ID: <7CA31A47-0708-4B97-8421-8F7917D707AC@gmail.com> On 6-Feb-09, at 9:00 PM, Steven D'Aprano wrote: > No, it is pretty clear that you do not understand the fundamental > difference in semantics between colons and semi-colons. It is not an > inconsistency. I'm not arguing it is inconsistent because they have similar semantics. 
I'm arguing it is inconsistent because both are redundant but one is optional while the other is mandatory. You do agree indentation signifies association, and colons reinforce the association. To me, it means indentation is good enough, and colons are redundant if followed by a line break and indentation instead of a statement. > Because colons are used for a very different thing than semi-colons, > any inconsistency between the rules for one and the rules for the > other is imaginary. Different purposes, different rules. It is real. That you have different rules for semicolons and colons is the inconsistency. There should be just one rule instead of two. Different purposes do not automatically imply different rules, otherwise we would have too many rules to hold in memory.
From yaogzhan at gmail.com Sun Feb 8 07:47:09 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Sun, 8 Feb 2009 03:17:09 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <87fxirw916.fsf@benfinney.id.au> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87skmsvxt2.fsf@benfinney.id.au> <9B3D70F3-673C-465F-BA17-FE95DF7CF6D6@gmail.com> <87fxirw916.fsf@benfinney.id.au> Message-ID: On 6-Feb-09, at 5:57 PM, Ben Finney wrote: > Riobard Zhan writes: > >> Why do you want a strong association with "here comes a suite" coming >> from colons? > > I want that indication from *something*, because indentation isn't > sufficient. A colon ":" is a good choice because its semantic meaning > has a good analogue to the same character in natural language. > >> Why don't you want a strong association with "here comes a >> statement" coming from semicolons? > > Because "start of a new line at the same indentation level" is > sufficient for that. (Not to mention that the very idea of "here > comes a statement" is rather foreign; I have so little need for > something extra to do that job that I barely recognise it as a job > that needs doing.) Sorry, I did not make it clear. Please let me put it straight: I think it is unnecessary to have an extra indication of "here comes a suite". The purpose is to introduce a relationship of association, and one deeper level of indentation does the job perfectly. We even indent line continuation to reflect such relationships as well. Thus it should not be insufficient for indentation to give you such a strong indication. >> Can you explain the inconsistency? > > Entirely different requirements. Different requirements do not imply different rules. The point of consistency is that even if we have completely different requirements, we can still have one rule (or similar rules), so that complexity is minimized and simplicity is gained. > If that's not enough for you, I think the gulf of understanding is too > wide for my level of interest in exploring the reasons. You are not forced to explain everything. But I think it would be polite to defend your argument with some detail in a debate.
From yaogzhan at gmail.com Sun Feb 8 07:47:10 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Sun, 8 Feb 2009 03:17:10 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <19D96DF5-6757-4793-B25A-5AE345CB7B4E@gmail.com> Message-ID: <7BB2FA7D-EA80-4A3E-874F-9D610960756B@gmail.com> On 6-Feb-09, at 12:07 PM, Curt Hagenlocher wrote: > > In any event, as several people have suggested, there's basically zero > likelihood of Python ever being changed to make colons optional. And > there doesn't even have to be a rational basis for that.
I understand the nearly impossible likelihood; entrenched habits are really difficult to change, especially when there seems no obvious gain. The last sentence is very true (even in a broader sense; not limited to language design). It sometimes made me very sad. From yaogzhan at gmail.com Sun Feb 8 07:47:14 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Sun, 8 Feb 2009 03:17:14 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> Message-ID: <4801CF66-D8E6-45A0-A5E0-DABF691F4E16@gmail.com> On 6-Feb-09, at 3:16 PM, Georg Brandl wrote: > Riobard Zhan schrieb: >> On 5-Feb-09, at 7:31 AM, Steven D'Aprano wrote: >> >>> Riobard Zhan wrote: >>> >>>> - I noticed a strong tendency to forget colons by new users of >>>> Python in a second-year computer science undergraduate course. The >>>> students seemed not getting used to colons even near the end of the >>>> course. I guess it is probably because they learn Java and C first, >>>> both of which do not have colons. What other languages do you know >>>> that require colons? >>> >>> Pascal uses colons, but not for the exact same purpose as Python. >>> Both languages use colons in similar ways to it's use in English. In >>> particular, Python uses colons as a break between clauses: larger >>> than a comma, smaller than a period. >>> >> >> Pascal is my first language. It has been some years ago, so I cannot >> remember the detail now. I checked wikipedia and did not find colons >> are used after if's. Not sure if you mean declarations? If so, I >> don't >> think that is what we are discussing here; Java and C also use colons >> in switch/case statements. AFAIK, Python is quite unique in requiring >> trailing colons after if's, for's, and function/class definitions. 
> > Python is also quite unique in using indentation to delimit blocks, > so I'm not sure what point you're trying to make. Sorry, I did not make it obvious. Please let me re-state it more clearly. I pointed out that Python is quite unique in requiring trailing colons because I don't think there are any other popular languages do so. Therefore I did not understand why you mentioned Pascal. I was trying to figure out if you mean Pascal uses colons in declarations. > I think it's a good indicator for optional syntax if you can formulate > new rules for PEP 8 that state when to use it. In the case of colons, > you'd have to either forbid or mandate them; I'd be at a loss to find > another consistent rule. So, making them optional is pointless; we > should either keep them or remove them. And removing is out of the > question. > > Applying that indicator to semicolons, there is a clear rule in PEP 8 > that states when to use them: to separate two statements on one line. I thought semicolons and multiple statements on one line are discouraged in PEP 8. Did I miss something? :| > However, the only person around here whose > itches alone, in the face of a wall of disagreeing users, can lead to > a change in the language, is Guido. Agreed. That's what BDFL means. I'm not trying to impose my itches on you. I'm not the BDFL. I'm explaining why I think omitting colons makes Python more elegant. > Yes, I have read all those remarks about semicolons. Thanks very much! I thought you ignored them before. I apologize :) > What you all fail to recognize is that the majority of Python users > like the colons, and wouldn't program without them. I doubt it. Would you not program other languages because they do not have colons? Do you really think the majority of Python users care about colons that much? I bet they will never notice if there is any colons missing in a piece of Python code if colons are made optional. 
From yaogzhan at gmail.com Sun Feb 8 07:47:17 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Sun, 8 Feb 2009 03:17:17 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <04F982A6-04BF-4F8C-B9DF-FEBEF60661F7@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> <20090206143839.GA5218@panix.com> <04F982A6-04BF-4F8C-B9DF-FEBEF60661F7@gmail.com> Message-ID: <49E68272-B71D-4240-8107-48628AC2E2E0@gmail.com> On 6-Feb-09, at 7:50 PM, Leonardo Santagada wrote: > > I will give you an idea, change the rules of the parser so your idea > works (making colon optional) and maybe even making commas optional > and then show us how it works and then maybe convert the whole > stdlib to be colon free (and maybe comma free)... this is probably > going to convince more people that you are right... or it will > convince you that you are wrong :). I think it will be quite simple > to do (as you say it they are there just because so it should be a > simple grammar change). > > Good luck, > and I will probably play with your demo python Thank you very much for the suggestion! I don't think making colons optional is a technical problem at all. Eventually I'll have to convince people on this list that it would be a (slightly) better idea (best would be to kill colons, semicolons, and as a consequence, one-liners, completely); or they convince me the opposite. I'm doing the harder part first.
From yaogzhan at gmail.com Sun Feb 8 07:47:20 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Sun, 8 Feb 2009 03:17:20 -0330 Subject: [Python-ideas] Making colons optional?
In-Reply-To: <498CCAC1.9060304@pearwood.info> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <498BEB78.5000400@pearwood.info> <5EE58FCE-3420-4FB3-BB82-0AAEBBA1E645@gmail.com> <498CCAC1.9060304@pearwood.info> Message-ID: On 6-Feb-09, at 8:11 PM, Steven D'Aprano wrote: > Colons are not the only choice, but they are the choice made nearly > twenty years ago. Colons are the status quo. We don't have to > justify the choice, you have to justify the change. > > All we have to do is nothing, and nothing will change. You have to > convince others, and either change the interpreter or convince > somebody else to change the interpreter, and convince the Python-dev > team and Guido to accept that change. Given the opposition on this > list, do you think that is likely? "Earth-center theory is not the only choice, but they are the choice made nearly two thousand years ago. Earth-center theory is the status quo. We don't have to justify the choice, you have to justify the change. All we have to do is nothing, and nothing will change. You have to convince others, and convince the church leaders and Pope to accept that. Given the opposition in this country, do you think that is likely?" Sorry, the above is not really a proper analogy, but I failed to find a better way to make it obvious how I felt when reading your words. Don't take it wrong. I fully agree with your words. They are very true and convincing. I know it is nearly impossible even before I posted the original proposal. It is very difficult to justify a change for such a trivial issue. People will say, "OK, I'm perfectly fine with colons. Why bother?" And I cannot come up with anything more compelling than it will make the language more consistent and elegant. 
Nevertheless, it can be discussed; I will not be prosecuted for bringing it up, right? :P From yaogzhan at gmail.com Sun Feb 8 07:47:25 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Sun, 8 Feb 2009 03:17:25 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <20090206191154.1b677eda@bhuda.mired.org> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <08A88544-C428-4BAF-8AF2-86FB05056BC6@gmail.com> <20090206191154.1b677eda@bhuda.mired.org> Message-ID: <04091536-0883-451B-A819-EA1667EED6EF@gmail.com> On 6-Feb-09, at 8:41 PM, Mike Meyer wrote: > That this consistency - ignoring trailing separators in list > structures - can be misunderstood to be an optional ending separator > in the degenerate case of a single statement is a good indication of > why consistency isn't a trump property. This is a very strange view of consistency to me. How many different kinds of list separators do we have? I can only think of semicolons and commas. I don't think semicolons are anything like commas. Non- trailing semicolons can be omitted, while non-trailing commas cannot, even if you put each item of [1,2,3] in separate lines. If you don't think consistency counts (at least in this case), I cannot argue with you---that's waaay off topic. From yaogzhan at gmail.com Sun Feb 8 07:47:33 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Sun, 8 Feb 2009 03:17:33 -0330 Subject: [Python-ideas] Making colons optional? 
In-Reply-To: <20090206153139.GA31773@phd.pp.ru> References: <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> <20090206142155.GD13165@phd.pp.ru> <982C396D-82B9-49FA-97EF-9B85844A6D4D@gmail.com> <20090206150951.GA26929@phd.pp.ru> <63E445CF-F77B-4D2C-92D4-789DBE4B959A@gmail.com> <20090206153139.GA31773@phd.pp.ru> Message-ID: On 6-Feb-09, at 12:01 PM, Oleg Broytmann wrote: > It is, because it is now a part of *the language*. Changing the > language > is always a major change. If you don't understand - well, we failed to > communicate, and I am stopping now. It's OK. We just have different understandings of how "big" is "big". > You have to learn to live with small inconsistencies. I'm perfectly fine with them. Have lived with them for a couple of years. >> By comparison, what do we get by making "print" a function? Why not >> create a "put" or "echo" built-in function if we really want the >> flexibility? Isn't that a major, big scale change? > > Yes, it was a major change, and it warranted a new major release - > 3.0. The original context is that you thought it's not worth a "big/major" change to make colons optional because we get nothing, then I asked by comparison what do we get by making "print" a function (a "bigger" change for me) when we really have the option to just add a new built-in function with a slightly different name.
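[Editor's note: on the `print` point, no separately named built-in was needed because Python 2.6 already shipped the chosen migration path; a quick sketch (the `__future__` import is a no-op on Python 3):]

```python
from __future__ import print_function  # enables the function form on 2.6+

# With print as a function, separator and line-ending behaviour become
# ordinary keyword arguments instead of special statement syntax.
print("spam", "eggs", sep=", ", end=".\n")  # prints: spam, eggs.
```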
From arnodel at googlemail.com Sun Feb 8 08:16:01 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Sun, 8 Feb 2009 07:16:01 +0000 Subject: [Python-ideas] binding vs rebinding In-Reply-To: References: <20090205142211.77ba2568@o> <498B5F52.1080200@pearwood.info> <20090206113145.6b2227bd@o> <498C4B7A.3060902@latte.ca> <3f4107910902061106g774d599fs4705a4a3ffb0fa3@mail.gmail.com> <498C92CA.5060308@scottdial.com> <3f4107910902061247r4ec438cboaa9cef88abd0d4ee@mail.gmail.com> Message-ID: <9bfc700a0902072316m677845adw584fc3025954487c@mail.gmail.com> 2009/2/8 Bruce Leban : > There *is* something in Python related to this that I find obviously > different and that's local and global variables. I would prefer that all > global variables have to be included in a global declaration. I dislike the > fact that an assignment to a variable changes other references to that same > name from local to global references. This sort of feels like "spooky action > at a distance" to me. IMHO it would be very tiresome to have to declare all global functions and builtins used in a function as global E.g.

def foo(x):
    return x + 2

def bar(x):
    global str, foo, int
    return str(foo(int(x)))

-- Arnaud
From bruce at leapyear.org Sun Feb 8 08:20:55 2009 From: bruce at leapyear.org (Bruce Leban) Date: Sat, 7 Feb 2009 23:20:55 -0800 Subject: [Python-ideas] Making colons optional? In-Reply-To: <0DB8FCE8-8EBF-44CC-89DF-E5D601E8822A@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <19D96DF5-6757-4793-B25A-5AE345CB7B4E@gmail.com> <0DB8FCE8-8EBF-44CC-89DF-E5D601E8822A@gmail.com> Message-ID: On Sat, Feb 7, 2009 at 10:47 PM, Riobard Zhan wrote: > > On 6-Feb-09, at 2:27 PM, Bruce Leban wrote: >> So what? Commas are also mandatory.
They look like semicolons so we should >> make them optional too. Am I joking? >> > You can not generalize that far. Most programming languages require commas > (notable exceptions include Lisp and Tcl), but Python is the only language > that requires trailing colons. Nope. See http://en.wikipedia.org/wiki/Smalltalk Sure Lisp doesn't use commas. Parentheses can be made optional in Lisp too. Therefore semicolons, colons, commas and parentheses are all equally optional. Woohoo! No pesky syntax! There are some differences between making commas optional and making > trailing colons optional. > "There are some differences between making X optional and making Y optional" for all features X and Y. Clearly the use of the specific semicolon character is confusing you. So let's replace it with a better symbol: \n as in this example: for i in x: foo(i) \n bar(i+1) Sure a \n is optional at the end of any line because a blank line is always allowed. So what? --- Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL:
From arnodel at googlemail.com Sun Feb 8 08:34:57 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Sun, 8 Feb 2009 07:34:57 +0000 Subject: [Python-ideas] binding vs rebinding In-Reply-To: References: <20090205142211.77ba2568@o> <20090206113145.6b2227bd@o> <498C4B7A.3060902@latte.ca> <3f4107910902061106g774d599fs4705a4a3ffb0fa3@mail.gmail.com> <498C92CA.5060308@scottdial.com> <3f4107910902061247r4ec438cboaa9cef88abd0d4ee@mail.gmail.com> <20090207133134.6574444a@o> <9bfc700a0902072316m677845adw584fc3025954487c@mail.gmail.com> Message-ID: <9bfc700a0902072334u6a5e0411ue7c86311cb33bf20@mail.gmail.com> [Redirecting to python-ideas] 2009/2/8 Bruce Leban : > On Sat, Feb 7, 2009 at 11:16 PM, Arnaud Delobelle > wrote: >> >> 2009/2/8 Bruce Leban : >> >> > There *is* something in Python related to this that I find obviously >> > different and that's local and global variables.
I would prefer that all >> > global variables have to be included in a global declaration. I dislike >> > the >> > fact that an assignment to a variable changes other references to that >> > same >> > name from local to global references. This sort of feels like "spooky >> > action >> > at a distance" to me. >> >> IMHO it would be very tiresome to have to declare all global functions >> and builtins used in a function as global E.g.
>>
>> def foo(x):
>>     return x + 2
>>
>> def bar(x):
>>     global str, foo, int
>>     return str(foo(int(x)))
>>
> I don't want it for functions, just for variables. I realize that those may > be the same on some level but I don't think of them that way when I'm > writing code. That's impossible. Functions are Python objects which are bound to variables at runtime. At compile time (when it has to be decided which variable is local and which is global), there is no way to know if a variable will be bound to a function or to another object. Worse, many Python objects are callable without being functions. -- Arnaud
From yaogzhan at gmail.com Sun Feb 8 08:58:24 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Sun, 8 Feb 2009 04:28:24 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <19D96DF5-6757-4793-B25A-5AE345CB7B4E@gmail.com> <0DB8FCE8-8EBF-44CC-89DF-E5D601E8822A@gmail.com> Message-ID: On 8-Feb-09, at 3:50 AM, Bruce Leban wrote: > You can not generalize that far. Most programming languages require > commas (notable exceptions include Lisp and Tcl), but Python is the > only language that requires trailing colons. > > Nope. See http://en.wikipedia.org/wiki/Smalltalk Just checked. Smalltalk's colons seem to have completely different semantics.
Correct me if I'm wrong, but they appear to be at the end of every keyword, including ifTrue and ifElse. > There are some differences between making commas optional and making > trailing colons optional. > > "There are some differences between making X optional and making Y > optional" for all features X and Y. This generalization is meaningless. You deliberately ignored the original context. > Clearly the use of the specific semicolon character is confusing > you. So let's replace it with a better symbol: \n as in this example: > > for i in x: foo(i) \n bar(i+1) > > Sure a \n is optional at the end of any line because a blank line is > always allowed. So what? What's the point you are trying to make? From yaogzhan at gmail.com Sun Feb 8 08:58:31 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Sun, 8 Feb 2009 04:28:31 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <20090208020446.0eec2cf8@bhuda.mired.org> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <08A88544-C428-4BAF-8AF2-86FB05056BC6@gmail.com> <20090206191154.1b677eda@bhuda.mired.org> <04091536-0883-451B-A819-EA1667EED6EF@gmail.com> <20090208020446.0eec2cf8@bhuda.mired.org> Message-ID: On Sun, 8 Feb 2009 03:17:25 -0330 Riobard Zhan wrote: > > On 6-Feb-09, at 8:41 PM, Mike Meyer wrote: >> That this consistency - ignoring trailing separators in list >> structures - can be misunderstood to be an optional ending separator >> in the degenerate case of a single statement is a good indication of >> why consistency isn't a trump property. > > This is a very strange view of consistency to me. How many different > kinds of list separators do we have? I can only think of semicolons > and commas. I don't think semicolons are anything like commas. 
Non-trailing semicolons can be omitted, while non-trailing commas cannot, > even if you put each item of [1,2,3] in separate lines. Oops, I thought I missed a clause. The last sentence should be "Non-trailing semicolons can be omitted [if you put each statement in its own line], while non-trailing commas cannot, even if you put each item of [1,2,3] in separate lines." On 8-Feb-09, at 3:34 AM, Mike Meyer wrote: > You still don't understand the semantics of semicolons. Non-trailing > semicolons are required, and can *not* be omitted. Try it and see: > > bhuda$ python > Python 2.6 (r26:66714, Nov 11 2008, 07:45:20) > [GCC 4.2.1 20070719 [FreeBSD]] on freebsd7 > Type "help", "copyright", "credits" or "license" for more information. >>>> a = 1; b = 2 >>>> a = 1 b = 2 > File "<stdin>", line 1 > a = 1 b = 2 > ^ > SyntaxError: invalid syntax >>>> > > Only *trailing* semicolons - the one following the last statement in > the list - can be omitted. Just like lists in list literals, in tuple > literals (modulo zero & one element tuples), in dictionary literals, > and as arguments to certain types of functions. I'm really confused by your words. Here is a list of statements. a = 1; # non-trailing semicolon of the list of statements b = 2; # trailing semicolon of the list of statements Both semicolons can be omitted. Wait a minute... What do you mean by a "list" of statements? Is this one list of length 2, or two lists of length 1? a = 1 b = 2 -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sun Feb 8 09:05:01 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 08 Feb 2009 19:05:01 +1100 Subject: [Python-ideas] Making colons optional?
In-Reply-To: <04091536-0883-451B-A819-EA1667EED6EF@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <08A88544-C428-4BAF-8AF2-86FB05056BC6@gmail.com> <20090206191154.1b677eda@bhuda.mired.org> <04091536-0883-451B-A819-EA1667EED6EF@gmail.com> Message-ID: <498E922D.8060706@pearwood.info> Riobard Zhan wrote: > > On 6-Feb-09, at 8:41 PM, Mike Meyer wrote: >> That this consistency - ignoring trailing separators in list >> structures - can be misunderstood to be an optional ending separator >> in the degenerate case of a single statement is a good indication of >> why consistency isn't a trump property. > > This is a very strange view of consistency to me. How many different > kinds of list separators do we have? I can only think of semicolons and > commas. In English: commas, semi-colons, slashes and newlines. There may be others, but I can't think of them off the top of my head. Examples: Sandwiches are made of bread, cheese, tomato, ham, and eggs. The hospital was visited by the following dignitaries: the President, Mr Obama; the Queen, Elisabeth II; and a famous actor, Bruce Willis. The invitation is for you and your wife/husband/partner. Shopping List: milk fruit meat In programming languages: commas and semi-colons are usual. OpenOffice spreadsheet uses ; to separate arguments to formulas, which never ceases to annoy me. I've seen at least one Context-Free Grammar format that uses vertical bar | as a list separator. I presume Lisp uses whitespace. If I recall correctly, so does Forth. Hypertalk separates "items" with commas and "words" with spaces, although the item delimiter was configurable in later versions. Tab delimited files use tabs as the item separator. In Python: only commas are item separators. 
Semi-colons and newlines are statement separators. Colons are not separators at all. -- Steven From steve at pearwood.info Sun Feb 8 09:11:48 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 08 Feb 2009 19:11:48 +1100 Subject: [Python-ideas] Making colons optional? In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <498BEB78.5000400@pearwood.info> <5EE58FCE-3420-4FB3-BB82-0AAEBBA1E645@gmail.com> <498CCAC1.9060304@pearwood.info> Message-ID: <498E93C4.6090200@pearwood.info> Riobard Zhan wrote: > I know it is nearly impossible even before I posted the > original proposal. It is very difficult to justify a change for such a > trivial issue. People will say, "OK, I'm perfectly fine with colons. Why > bother?" And I cannot come up with anything more compelling than it will > make the language more consistent and elegant. You haven't convinced anyone that the change will make the language more consistent or elegant. Most people who have replied believe that such a change would make the language FOOLISHLY consistent and LESS elegant. By all means try to get your idea across, that's what this list is for. But in the absence of support from anyone else, you probably should realise that this isn't going anywhere. 
-- Steven From steve at pearwood.info Sun Feb 8 09:14:46 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 08 Feb 2009 19:14:46 +1100 Subject: [Python-ideas] [Python-Dev] yield * (Re: Missing operator.call) In-Reply-To: References: <498991D9.3060407@avl.com> <498BF062.5000407@canterbury.ac.nz> <87ljskdiyy.fsf@xemacs.org> <498C1D0A.4040200@canterbury.ac.nz> <498D4E9D.6070405@canterbury.ac.nz> <498E2EBF.7090309@canterbury.ac.nz> <20090208015007.12555.885411350.divmod.xquotient.4298@weber.divmod.com> Message-ID: <498E9476.80404@pearwood.info> Leif Walsh wrote: > I still don't understand why such a construct is necessary. Is > >>>> for elt in iterable: >>>> yield elt > > really all that bad? Maybe it's a little silly-looking, but at least > it's easy to understand and not _that_ hard to type.... It's not just silly looking, it's the same construct used repeatedly, in many different places in code. It is a basic principle of programming that anytime you have blocks of code that are almost identical, you should factor out the common code into its own routine. See "Don't Repeat Yourself" and "Once And Only Once" for similar ideas: http://c2.com/cgi/wiki?OnceAndOnlyOnce http://c2.com/cgi/wiki?DontRepeatYourself Consider a pure Python implementation of itertools.chain: def chain(*iterables): for it in iterables: for elt in it: yield elt The double for loop obscures the essential nature of chain. From help(itertools.chain): "Return a chain object whose .next() method returns elements from the first iterable until it is exhausted, then elements from the next iterable, until all of the iterables are exhausted." The emphasis is on iterating over the sequence of iterables, not iterating over each iterable itself. This is one place where explicit is *not* better than implicit, as the inner loop exposes too much of the internal detail to the reader.
Instead, chain() could be better written as this: def chain(*iterables): for it in iterables: yield from it Naturally you can use map and filter to transform the results: yield from map(trans, filter(expr, it)) The advantage is even more obvious when married with a generator expression: yield from (3*x for x in seq if x%2 == 1) instead of: for x in seq: if x%2 == 1: yield 3*x or for y in (3*x for x in seq if x%2 == 1): yield y I'm +1 on this suggestion, especially since it requires no new keywords. -- Steven From steve at pearwood.info Sun Feb 8 09:17:36 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 08 Feb 2009 19:17:36 +1100 Subject: [Python-ideas] Making colons optional? In-Reply-To: <0F9C1637-3E39-4FD4-9F9D-7B2BA45723C4@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> <20090206143839.GA5218@panix.com> <87ab8zedt1.fsf@xemacs.org> <0F9C1637-3E39-4FD4-9F9D-7B2BA45723C4@gmail.com> Message-ID: <498E9520.2080206@pearwood.info> Riobard Zhan wrote: > Colons are redundant too > because they are precisely equivalent to a newline + a deeper level of > indentation. You have been given many examples showing that this is not true. At best, it is *often* true, but certainly not always. At this point, the horse is not just dead, it has rotted away to a skeleton. I don't believe there is any point continuing to beat it. If anyone other than Riobard thinks differently, please speak up. -- Steven From stephen at xemacs.org Sun Feb 8 09:21:59 2009 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sun, 08 Feb 2009 17:21:59 +0900 Subject: [Python-ideas] Making colons optional? 
In-Reply-To: <0F9C1637-3E39-4FD4-9F9D-7B2BA45723C4@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> <20090206143839.GA5218@panix.com> <87ab8zedt1.fsf@xemacs.org> <0F9C1637-3E39-4FD4-9F9D-7B2BA45723C4@gmail.com> Message-ID: <87vdrlcp94.fsf@xemacs.org> Riobard Zhan writes: > Forgive me if I do not fully understand your point, You don't. > but it appears to me that you conclude semicolons are optional > because they are redundant. It appears to me that you're responding to something other than what I wrote. I wrote nothing about why semicolons are optional, only about why I believe they are redundant. > I think the conclusion is precisely the same for colons. My point was precisely that two syntaxes are appropriate balance with two semantics, but we have three syntax elements, so one is redundant. If you claim that is an argument for symmetry in treatment of colons and semicolons, I guess you are of the school that 3 - 1 = 1 for sufficiently large values of 1? But no, my point is that we can get rid of one but not both, assuming that syntax should reflect semantics to this extent. And it's pretty obvious which one to get rid of! You pretty clearly disagree with that principle, but I think it's an important aspect of what makes Python attractive to me: syntactic units and dividers do correspond to semantic units, according to my intuition. 
From leif.walsh at gmail.com Sun Feb 8 09:29:18 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Sun, 8 Feb 2009 03:29:18 -0500 Subject: [Python-ideas] [Python-Dev] yield * (Re: Missing operator.call) In-Reply-To: <498E9476.80404@pearwood.info> References: <498991D9.3060407@avl.com> <87ljskdiyy.fsf@xemacs.org> <498C1D0A.4040200@canterbury.ac.nz> <498D4E9D.6070405@canterbury.ac.nz> <498E2EBF.7090309@canterbury.ac.nz> <20090208015007.12555.885411350.divmod.xquotient.4298@weber.divmod.com> <498E9476.80404@pearwood.info> Message-ID: On Sun, Feb 8, 2009 at 3:14 AM, Steven D'Aprano wrote: > It's not just silly looking, it's the same construct used repeatedly, in > many different places in code. It is a basic principle of programming that > anytime you have blocks of code that are almost identical, you should factor > out the common code into its own routine. See "Don't Repeat Yourself" and > "Once And Only Once" for similar ideas: Sure, but it's only factoring out one or two lines. I dunno. If it's not too intrusive to the parser, I guess it's not such a bad idea, but it just seems like it's more work than it's worth. Besides, most applications I can think of that require you to build a generator around a number of other generators also require some kind of manipulation of each data item generated, which this construct doesn't allow. It's a decent proposal, and looks nice enough, but I'm not convinced it's a good use of our time (not that it's up to me though). -- Cheers, Leif From bruce at leapyear.org Sun Feb 8 18:48:39 2009 From: bruce at leapyear.org (Bruce Leban) Date: Sun, 8 Feb 2009 09:48:39 -0800 Subject: [Python-ideas] Making colons optional?
In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <19D96DF5-6757-4793-B25A-5AE345CB7B4E@gmail.com> <0DB8FCE8-8EBF-44CC-89DF-E5D601E8822A@gmail.com> Message-ID: On Sat, Feb 7, 2009 at 11:58 PM, Riobard Zhan wrote: > > On 8-Feb-09, at 3:50 AM, Bruce Leban wrote: > > You can not generalize that far. Most programming languages require commas >> (notable exceptions include Lisp and Tcl), but Python is the only language >> that requires trailing colons. >> >> Nope. See http://en.wikipedia.org/wiki/Smalltalk >> > > Just checked. Smalltalk's colons seem to have completely different > semantics. Correct me if I'm wrong, but they appear to be at the end of > every keyword, including ifTrue and ifElse. > Your statement that no programming language used trailing colons is simply false. I and many others have said that colons and semicolons have completely different semantics as well, and you have chosen to ignore that. > Clearly the use of the specific semicolon character is confusing you. So >> let's replace it with a better symbol: \n as in this example: >> >> for i in x: foo(i) \n bar(i+1) >> >> Sure a \n is optional at the end of any line because a blank line is >> always allowed. So what? >> > > What's the point you are trying to make? > The point is that the \n token and the : token have different semantics entirely. The \n token is used to separate statements on a single line. You seem to think that they are related because they look similar; with \n in place of the semicolon, they no longer do. One final point: in some languages, the semicolon is a statement TERMINATOR as it is in C, and in others it is a statement SEPARATOR as it is in Pascal. I think your time would be better served by working to convince the Pascal people and the C people to reconcile that inconsistency than by this discussion.
--- Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sun Feb 8 19:54:25 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 8 Feb 2009 18:54:25 +0000 (UTC) Subject: [Python-ideas] [Python-Dev] yield * (Re: Missing operator.call) References: <498991D9.3060407@avl.com> <498BF062.5000407@canterbury.ac.nz> <87ljskdiyy.fsf@xemacs.org> <498C1D0A.4040200@canterbury.ac.nz> <498D4E9D.6070405@canterbury.ac.nz> <498E2EBF.7090309@canterbury.ac.nz> <20090208015007.12555.885411350.divmod.xquotient.4298@weber.divmod.com> <498E9476.80404@pearwood.info> Message-ID: Steven D'Aprano writes: > The advantage is even more obvious when married with a generator expression: > > yield from (3*x for x in seq if x%2 == 1) > > instead of: > > for x in seq: > if x%2 == 1: > yield 3*x But the former will be slower than the latter, because it constructs an intermediate generator only to yield it element by element. Regards Antoine. From nate at binkert.org Sun Feb 8 20:02:18 2009 From: nate at binkert.org (nathan binkert) Date: Sun, 8 Feb 2009 11:02:18 -0800 Subject: [Python-ideas] [Python-Dev] yield * (Re: Missing operator.call) In-Reply-To: <498E9476.80404@pearwood.info> References: <498991D9.3060407@avl.com> <87ljskdiyy.fsf@xemacs.org> <498C1D0A.4040200@canterbury.ac.nz> <498D4E9D.6070405@canterbury.ac.nz> <498E2EBF.7090309@canterbury.ac.nz> <20090208015007.12555.885411350.divmod.xquotient.4298@weber.divmod.com> <498E9476.80404@pearwood.info> Message-ID: <217accd40902081102r7f17f4ecp97781c41fe9ed934@mail.gmail.com> > It's not just silly looking, it's the same construct used repeatedly, in > many different places in code. It is a basic principle of programming that > anytime you have blocks of code that are almost identical, you should factor > out the common code into it's own routine. 
See "Don't Repeat Yourself" and > "Once And Only Once" for similar ideas: > > http://c2.com/cgi/wiki?OnceAndOnlyOnce > http://c2.com/cgi/wiki?DontRepeatYourself > Just to give another random user's opinion, I love this idea. When writing code where I factor out lots of generators (for something like cherrypy), I've had to repeat this two line idiom dozens of times in one function. +1 Nate From steve at pearwood.info Sun Feb 8 23:20:48 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 09 Feb 2009 09:20:48 +1100 Subject: [Python-ideas] [Python-Dev] yield * (Re: Missing operator.call) In-Reply-To: References: <498991D9.3060407@avl.com> <498BF062.5000407@canterbury.ac.nz> <87ljskdiyy.fsf@xemacs.org> <498C1D0A.4040200@canterbury.ac.nz> <498D4E9D.6070405@canterbury.ac.nz> <498E2EBF.7090309@canterbury.ac.nz> <20090208015007.12555.885411350.divmod.xquotient.4298@weber.divmod.com> <498E9476.80404@pearwood.info> Message-ID: <498F5AC0.60708@pearwood.info> Antoine Pitrou wrote: > Steven D'Aprano writes: >> The advantage is even more obvious when married with a generator expression: >> >> yield from (3*x for x in seq if x%2 == 1) >> >> instead of: >> >> for x in seq: >> if x%2 == 1: >> yield 3*x > > But the former will be slower than the latter, because it constructs an > intermediate generator only to yield it element by element. The "yield from" syntax hasn't even been approved, let alone implemented, and you're already complaining it's slow? Talk about premature optimization! That's a criticism of *generator expressions*, not the suggested syntax. They're popular because most people prefer the large benefits in readability and convenience over the minuscule cost in generating them, particularly since often that cost is paid somewhere else: the caller builds the generator expression and passes the resulting iterator into your function, which merely iterates over it. 
And the cost is small: [steve at ando ~]$ python -m timeit -s "seq = range(500)" "(3*x for x in seq if x%2 == 1)" 1000000 loops, best of 3: 0.611 usec per loop -- Steven From greg.ewing at canterbury.ac.nz Sun Feb 8 23:46:54 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 09 Feb 2009 11:46:54 +1300 Subject: [Python-ideas] yield * (Re: Missing operator.call) In-Reply-To: <498E3866.2020607@gmail.com> References: <498991D9.3060407@avl.com> <498BC891.4020103@canterbury.ac.nz> <498BCB29.4070005@canterbury.ac.nz> <498BF062.5000407@canterbury.ac.nz> <87ljskdiyy.fsf@xemacs.org> <498C1D0A.4040200@canterbury.ac.nz> <498D4E9D.6070405@canterbury.ac.nz> <498E2EBF.7090309@canterbury.ac.nz> <498E3866.2020607@gmail.com> Message-ID: <498F60DE.7000506@canterbury.ac.nz> Nick Coghlan wrote: > One important question to ask yourself is whether the semantics you want > may make more sense as a new generator method (as happened with the > addition of send() and throw()) rather than as new syntax. > > def f(): > c = g() > yield *c > print c.result() That turns one line into three and makes it impossible to embed it in an expression. It's a very poor substitute for what I have in mind. > In particular, the return value of 'yield *' would likely still by > needed for send() in the case where the subgenerator has already > terminated, so the only sensible destination for the sent value is the > generator that invoked 'yield *' The effect I'm after is the same as what would happen if the subgenerator were yielding directly to the caller of the outer generator. Since, except for the first send(), it's only possible to send() something to a generator when it's suspended in a yield, anything sent to the outer generator after the subgenerator terminates would have to appear as the return value of some later (ordinary) yield in the outer generator itself or another subgenerator. 
The full expansion, taking sends into account, of result = yield *g() would be something like _g = g() try: _v = yield _g.next() while 1: _v = yield _g.send(_v) except StopIteration, _e: result = _e.return_value I think I've got that right. While it may look like the last value assigned to _v gets lost, that's not actually the case, because the last next() or send() call before _g terminates never returns, raising StopIteration instead. (Here I'm assuming the return value is passed back as an argument to the StopIteration exception, something that I think got proposed at one point but never adopted. The return value could alternatively be attached to the generator-iterator itself.) -- Greg From qrczak at knm.org.pl Mon Feb 9 00:15:00 2009 From: qrczak at knm.org.pl (Marcin 'Qrczak' Kowalczyk) Date: Mon, 9 Feb 2009 00:15:00 +0100 Subject: [Python-ideas] [Python-Dev] yield * (Re: Missing operator.call) In-Reply-To: <498F5AC0.60708@pearwood.info> References: <498991D9.3060407@avl.com> <498D4E9D.6070405@canterbury.ac.nz> <498E2EBF.7090309@canterbury.ac.nz> <20090208015007.12555.885411350.divmod.xquotient.4298@weber.divmod.com> <498E9476.80404@pearwood.info> <498F5AC0.60708@pearwood.info> Message-ID: <3f4107910902081515s3bc1b5b1k64bbac14051dbd0b@mail.gmail.com> On Sun, Feb 8, 2009 at 23:20, Steven D'Aprano wrote: > And the cost is small: > > [steve at ando ~]$ python -m timeit -s "seq = range(500)" "(3*x for x in seq if > x%2 == 1)" > 1000000 loops, best of 3: 0.611 usec per loop Because generators are lazy and you don't run it into completion. 
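To make that concrete, a quick sketch (my illustration): the timeit snippet above only builds the generator expression; the per-element work happens only when something consumes it.

```python
seq = range(500)

g = (3 * x for x in seq if x % 2 == 1)  # this is all the timeit snippet measures
# No elements have been produced yet; the real work happens on consumption:
result = list(g)

assert len(result) == 250          # 250 odd numbers in range(500)
assert result[:3] == [3, 9, 15]    # 3*1, 3*3, 3*5
```

So a fair comparison with the plain for-loop version would have to consume the generator as well.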
-- Marcin Kowalczyk qrczak at knm.org.pl http://qrnik.knm.org.pl/~qrczak/ From greg.ewing at canterbury.ac.nz Mon Feb 9 00:19:15 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 09 Feb 2009 12:19:15 +1300 Subject: [Python-ideas] [Python-Dev] yield * (Re: Missing operator.call) In-Reply-To: References: <498991D9.3060407@avl.com> <498BF062.5000407@canterbury.ac.nz> <87ljskdiyy.fsf@xemacs.org> <498C1D0A.4040200@canterbury.ac.nz> <498D4E9D.6070405@canterbury.ac.nz> <498E2EBF.7090309@canterbury.ac.nz> <20090208015007.12555.885411350.divmod.xquotient.4298@weber.divmod.com> Message-ID: <498F6873.6080105@canterbury.ac.nz> Leif Walsh wrote: > I still don't understand why such a construct is necessary. Is > >>>>for elt in iterable: >>>> yield elt > > really all that bad? If all you want is to pass yielded values outwards, it's not all that bad, although it could get a bit tedious if you're doing it a lot. However, if you want values passed back in by send() to go to the right places, it's *considerably* more complicated. The expansion I posted just before shows, I think, that this is not something you want to have to write out longhand every time -- at least not if you want a good chance of getting it right! -- Greg From carl at carlsensei.com Mon Feb 9 01:41:37 2009 From: carl at carlsensei.com (Carl Johnson) Date: Sun, 8 Feb 2009 14:41:37 -1000 Subject: [Python-ideas] Allow lambda decorators Message-ID: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> A few months back there was a discussion of how code like this gives "surprising" results because of the scoping rules: >>> def func_maker(): ... fs = [] ... for i in range(10): ... def f(): ... return i ... fs.append(f) ... return fs ... >>> [f() for f in func_maker()] [9, 9, 9, 9, 9, 9, 9, 9, 9, 9] Various syntax changes were proposed to get around this, but nothing ever came of it. 
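(For reference, the usual workaround is to bind the loop variable as a default argument, so each function captures the value of i at definition time. A quick sketch:)

```python
def func_maker():
    fs = []
    for i in range(10):
        def f(i=i):  # the default argument freezes the current value of i
            return i
        fs.append(f)
    return fs

# Each function now returns the value i had when that function was defined.
assert [f() for f in func_maker()] == list(range(10))
```

It works, but it leaks the captured value into the function's signature, which is part of why people keep looking for nicer spellings.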
Also recently, I tried to propose a new syntax to allow Ruby-like blocks in Python without sacrificing Python's indenting rules. My idea was that "@" would mean "placeholder for a function to be defined on the next line" like so: >>> sorted(range(10), key=@): ... def @(item): ... return -item ... [9, 8, 7, 6, 5, 4, 3, 2, 1, 0] Thinking about it some more, I've realized that the change I proposed is unnecessary, since we already have the decorator syntax. So, for example, the original function can be made to act with the "expected" scoping by using an each_in function defined as follows: >>> def each_in(seq): ... return lambda f: [f(item) for item in seq] ... >>> def func_maker(): ... @each_in(range(10)) ... def fs(i): ... def f(): ... return i ... return f ... return fs #Warning, fs is a list, not a function! ... >>> [f() for f in func_maker()] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] On the one hand, I can imagine some people thinking this is decorator abuse, since the each_in decorator produces a list and not a function. If so, I suppose the lambda might be changed to >>> def each_in(seq): ... return lambda f: lambda: [f(item) for item in seq] ... >>> def func_maker(): ... @each_in(range(10)) ... def fs(i): ... def f(): ... return i ... return f ... return fs() #Warning, fs is a function, not a list ... >>> [f() for f in func_maker()] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] In either version the important thing is that it provides a scoped version of a for-loop, which those from a C++ background might be more conditioned to expect. Possibly, such a function could be added to the functools or the itertools. It would be useful for when scoping issues arise, for example when adding a bunch of properties or attributes to a class. Thinking about it some more though, it's hard to see why such a trivial function is needed for the library. There's no reason it couldn't just be done as an inline lambda instead: >>> def func_maker(): ... @lambda f: [f(i) for i in range(10)] ... 
def fs(i): ... def f(): ... return i ... return f ... return fs #Warning, fs is a list, not a function! ... >>> [f() for f in func_maker()] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] OK, actually there is one reason this couldn't be done as a lambda: >>> @lambda f: [f(i) for i in range(10)] File "<stdin>", line 1 @lambda f: [f(i) for i in range(10)] ^ SyntaxError: invalid syntax This is because the decorator grammar asks for a name, not an expression, as one (well, OK, me a couple months ago) might naively expect. decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE decorators: decorator+ decorated: decorators (classdef | funcdef) and not decorator: '@' test NEWLINE decorators: decorator+ decorated: decorators (classdef | funcdef) Changing the grammar would also allow for the rewriting of the sorted example given earlier: >>> @lambda key: sorted(range(10), key=key) ... def sorted_list(item): ... return -item ... >>> sorted_list [9, 8, 7, 6, 5, 4, 3, 2, 1, 0] Of course, the use of a lambda decorator is, strictly speaking, unnecessary: >>> k = lambda key: sorted(range(10), key=key) >>> @k ... def sorted_list(item): ... return -item ... >>> del k >>> sorted_list [9, 8, 7, 6, 5, 4, 3, 2, 1, 0] But I think that allowing lambda decorators would be convenient in a number of situations, and it would give a simple answer to those asking for a multiline lambda or Ruby-like blocks. What do other people think? -- Carl From yaogzhan at gmail.com Mon Feb 9 03:35:44 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Sun, 8 Feb 2009 23:05:44 -0330 Subject: [Python-ideas] Making colons optional?
In-Reply-To: <20090208031324.070b2073@bhuda.mired.org> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <08A88544-C428-4BAF-8AF2-86FB05056BC6@gmail.com> <20090206191154.1b677eda@bhuda.mired.org> <04091536-0883-451B-A819-EA1667EED6EF@gmail.com> <20090208020446.0eec2cf8@bhuda.mired.org> <20090208031324.070b2073@bhuda.mired.org> Message-ID: On 8-Feb-09, at 4:43 AM, Mike Meyer wrote: >> Wait a minute... What do you mean by a "list" of statements? Is this >> one list of length 2, or two lists of length 1? >> >> a = 1 >> b = 2 > > Two lists of length one. Each list is terminated by the new line. If you think the above code is composed of two lists of length 1 instead of one list of length 2, then I guess we probably have completely different views of how to group things as a list. That probably explains why your definition felt so strange to me in the beginning. I'm not going to argue with you on this. I might never see the point of thinking of the following code a = 1; b = 2 c = 3 as a list of length 2 + a list of length 1, instead of a list of length 3. If I take your approach, I might treat [a, b, c] as two lists, too. From yaogzhan at gmail.com Mon Feb 9 03:35:50 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Sun, 8 Feb 2009 23:05:50 -0330 Subject: [Python-ideas] Making colons optional?
In-Reply-To: <498E922D.8060706@pearwood.info> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <08A88544-C428-4BAF-8AF2-86FB05056BC6@gmail.com> <20090206191154.1b677eda@bhuda.mired.org> <04091536-0883-451B-A819-EA1667EED6EF@gmail.com> <498E922D.8060706@pearwood.info> Message-ID: <2C625082-6EBE-4363-A116-36CCE9E9AD79@gmail.com> On 8-Feb-09, at 4:35 AM, Steven D'Aprano wrote: > Riobard Zhan wrote: >> On 6-Feb-09, at 8:41 PM, Mike Meyer wrote: >>> That this consistency - ignoring trailing separators in list >>> structures - can be misunderstood to be an optional ending separator >>> in the degenerate case of a single statement is a good indication of >>> why consistency isn't a trump property. >> This is a very strange view of consistency to me. How many >> different kinds of list separators do we have? I can only think of >> semicolons and commas. > > In English: commas, semi-colons, slashes and newlines. There may be > others, but I can't think of them off the top of my head. Examples: > > Sandwiches are made of bread, cheese, tomato, ham, and eggs. > > The hospital was visited by the following dignitaries: the > President, Mr Obama; the Queen, Elisabeth II; and a famous actor, > Bruce Willis. > > The invitation is for you and your wife/husband/partner. > > Shopping List: > milk > fruit > meat > > In programming languages: commas and semi-colons are usual. > OpenOffice spreadsheet uses ; to separate arguments to formulas, > which never ceases to annoy me. I've seen at least one Context-Free > Grammar format that uses vertical bar | as a list separator. I > presume Lisp uses whitespace. If I recall correctly, so does Forth. > Hypertalk separates "items" with commas and "words" with spaces, > although the item delimiter was configurable in later versions. 
Tab > delimited files use tabs as the item separator. > > In Python: only commas are item separators. Semi-colons and newlines > are statement separators. Colons are not separators at all. We are talking about Python here. Mike proposed it is consistent if we treat semicolons as separators in the same fashion as commas (another kind of separator), which I do not think is the right way (at least for me). See my other reply to Mike's reply if it is not clear. Nobody said colons are separators. From yaogzhan at gmail.com Mon Feb 9 03:35:56 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Sun, 8 Feb 2009 23:05:56 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <87vdrlcp94.fsf@xemacs.org> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <87ocxgdmc3.fsf@xemacs.org> <727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com> <20090206143839.GA5218@panix.com> <87ab8zedt1.fsf@xemacs.org> <0F9C1637-3E39-4FD4-9F9D-7B2BA45723C4@gmail.com> <87vdrlcp94.fsf@xemacs.org> Message-ID: <499FDF5E-0829-4562-A0DD-483C050F0A5A@gmail.com> On 8-Feb-09, at 4:51 AM, Stephen J. Turnbull wrote: > > It appears to me that you're responding to something other than what I > wrote. I wrote nothing about why semicolons are optional, only about > why I believe they are redundant. I apologize if you do not intend to do so, but the form of your argument really tricked me into thinking so. See here "In fact, the semicolon is indeed redundant, as it is precisely equivalent to a newline + the current level of indentation. This kind of redundancy is quite unusual in Python, so I wonder if the BDFL might regret having permitted semicolons at all" I took it to mean that the semicolons should have been eliminated, but due to a mistake they became merely optional, because they are redundant. >> I think the conclusion is precisely the same for colons.
> > My point was precisely that two syntaxes are appropriate balance with > two semantics, but we have three syntax elements, so one is redundant. > If you claim that is an argument for symmetry in treatment of colons > and semicolons, I guess you are of the school that 3 - 1 = 1 for > sufficiently large values of 1? But no, my point is that we can get > rid of one but not both, assuming that syntax should reflect semantics > to this extent. And it's pretty obvious which one to get rid of! I do not disagree the principle in general. But in this case I think it is not necessary to have extra syntax of colons when indentation works very well. So two are redundant, both semicolons and colons. Even Guido admits requiring colons is redundancy (see here [http://python-history.blogspot.com/2009/02/early-language-design-and-development.html#comments ]); he just thinks this redundancy is good. From yaogzhan at gmail.com Mon Feb 9 03:35:59 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Sun, 8 Feb 2009 23:05:59 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: <498E93C4.6090200@pearwood.info> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <498BEB78.5000400@pearwood.info> <5EE58FCE-3420-4FB3-BB82-0AAEBBA1E645@gmail.com> <498CCAC1.9060304@pearwood.info> <498E93C4.6090200@pearwood.info> Message-ID: <50B1A631-BB03-436A-BCEB-6EB659015B37@gmail.com> On 8-Feb-09, at 4:41 AM, Steven D'Aprano wrote: > Riobard Zhan wrote: > >> I know it is nearly impossible even before I posted the original >> proposal. It is very difficult to justify a change for such a >> trivial issue. People will say, "OK, I'm perfectly fine with >> colons. Why bother?" And I cannot come up with anything more >> compelling than it will make the language more consistent and >> elegant. 
> > You haven't convinced anyone that the change will make the language > more consistent or elegant. Most people who have replied believe > that such a change would make the language FOOLISHLY consistent and > LESS elegant. You haven't convinced me the current choice makes Python SMARTLY inconsistent and MORE elegant, either. I do agree with you that we might probably stay the course, given the difficult nature of changing an entrenched habit for no obvious gain. But that does not necessarily mean the old habit is a good choice in the first place. > By all means try to get your idea across, that's what this list is > for. But in the absence of support from anyone else, you probably > should realise that this isn't going anywhere. You seem to assume everyone else thinks in your way, which I do not think is the case. There are people who think we should kill colons (see the comment area here [http://python-history.blogspot.com/2009/02/early-language-design-and-development.html#comments ] and here [http://python-history.blogspot.com/2009/01/pythons-design-philosophy.html#comments ]). And if you are following the thread entirely, you will see there is an id called spir/Denis who thinks the same. The problem is virtually all those who think colons should be optional will *not* even bother with such a trivial issue, let alone coming here and discussing it. In fact I would not bother either, until I was redirected here by Guido. The primary motive I came here was to throw in the proposal and see if there are any good counter-arguments to the idea. Up till now, I've yet to see one, except perhaps what Curt mentioned as "there doesn't even have to be a rational basis for that". Now spir/Denis gave up. I think I should follow him, too. 
From yaogzhan at gmail.com Mon Feb 9 03:36:07 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Sun, 8 Feb 2009 23:06:07 -0330 Subject: [Python-ideas] Making colons optional? In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <19D96DF5-6757-4793-B25A-5AE345CB7B4E@gmail.com> <0DB8FCE8-8EBF-44CC-89DF-E5D601E8822A@gmail.com> Message-ID: <0A9ABC7B-C4E4-4C11-A087-170DAF67A7FF@gmail.com> On 8-Feb-09, at 2:18 PM, Bruce Leban wrote: > Your statement that no programming language used trailing colons is > simply false. I and many others have said that colons and semicolons > have completely different semantics as well and you have chosen to > ignore that. I apologize for the imprecise statement. It would be more precise if I said Python is one of the very, very few languages that require trailing colons. In either case, Smalltalk's colons are completely different animals (they signify keywords, which are used to build control constructs via message passing) than Python, and it makes no sense to compare the two. And Smalltalk is not as widespread as many other "mainstream" languages; therefore it is rather an exception for Python to require trailing colons (but not so to require commas). That is the difference between making colons optional and making commas optional; that is also why I think your generalization to commas will not work. I fully understand your view that semicolons and colons are somehow different in semantics. It is just that you and many others have chosen to ignore the fact that different semantics do not necessarily imply different rules, otherwise we will have too many rules to worry about. In this case, I do not think the difference between their semantics is huge enough to treat them differently. > Clearly the use of the specific semicolon character is confusing > you. 
So let's replace it with a better symbol: \n as in this example: > > for i in x: foo(i) \n bar(i+1) > > Sure a \n is optional at the end of any line because a blank line is > always allowed. So what? > > What's the point you are trying to make? > > The point is that the \n token and the : token have different > semantics entirely. The \n token is used to separate statements on a > single line. You seem to think that they are related because they > look similar and now they don't. > > One more final point: in some languages, the semicolon is a > statement TERMINATOR as it is in C and in others it is a statement > SEPARATOR as it is in Pascal. I think your time would be better > served by working to convince the Pascal people and the C people to > reconcile that inconsistency than this discussion. I think you misunderstood my point of relating semicolons and colons. My point is that colons are not so much different than semicolons as visual indicators that they deserve a different rule than semicolons (with one being mandatory while the other optional), because we already have indentation as a very effective visual indicator to tell us everything we need to be told. From yaogzhan at gmail.com Mon Feb 9 04:08:48 2009 From: yaogzhan at gmail.com (Riobard Zhan) Date: Sun, 8 Feb 2009 23:38:48 -0330 Subject: [Python-ideas] out of list In-Reply-To: <20090208101820.749fa471@o> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <08A88544-C428-4BAF-8AF2-86FB05056BC6@gmail.com> <20090206191154.1b677eda@bhuda.mired.org> <04091536-0883-451B-A819-EA1667EED6EF@gmail.com> <20090208101820.749fa471@o> Message-ID: To all of you who participated in this (not so fruitful) discussion, I sincerely thank you for your precious time and thought. 
Although we failed to yield anything very useful for later Python users to clear up the issue of colons, I do appreciate your effort and patience. If there are any bad emotions generated during the process, please do not take it personally; that's not the purpose---after all, we all want to make our beloved language better and better. Sincerely yours, Riobard On 8-Feb-09, at 5:48 AM, spir wrote: > Le Sun, 8 Feb 2009 03:17:25 -0330, > Riobard Zhan a écrit : > > Hello Riobard, > > You know that I rather agree with you on the point that colon would > rather be optional, that the purpose is similar to that of semi- > colons, that the opponents' arguments are not convincing at all (I > bet from the form of these arguments, that if colons were optional > in python, most of them would fight against a proposition to make > them obligatory ;-). > Still, this debate goes nowhere now, and you just kill your own > credibility. It's time to stop. Not only python will not change on > that point, but the discussion does not bring any more clue to help > understand the whys and hows of syntax/semantics. > > respectfully, > Denis > > PS: > Have you had a look at cobra? http://cobra-language.com/ > > ------ > la vida e estranya From guido at python.org Mon Feb 9 04:29:35 2009 From: guido at python.org (Guido van Rossum) Date: Sun, 8 Feb 2009 19:29:35 -0800 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> Message-ID: On Sun, Feb 8, 2009 at 4:41 PM, Carl Johnson wrote: > A few months back there was a discussion of how code like this gives > "surprising" results because of the scoping rules: > >>>> def func_maker(): > ... fs = [] > ... for i in range(10): > ... def f(): > ... return i > ... fs.append(f) > ... return fs > ... 
>>>> [f() for f in func_maker()] > [9, 9, 9, 9, 9, 9, 9, 9, 9, 9] > > Various syntax changes were proposed to get around this, but nothing ever > came of it. > > Also recently, I tried to propose a new syntax to allow Ruby-like blocks in > Python without sacrificing Python's indenting rules. My idea was that "@" > would mean "placeholder for a function to be defined on the next line" like > so: > >>>> sorted(range(10), key=@): > ... def @(item): > ... return -item > ... > [9, 8, 7, 6, 5, 4, 3, 2, 1, 0] > > Thinking about it some more, I've realized that the change I proposed is > unnecessary, since we already have the decorator syntax. So, for example, > the original function can be made to act with the "expected" scoping by > using an each_in function defined as follows: > >>>> def each_in(seq): > ... return lambda f: [f(item) for item in seq] > ... >>>> def func_maker(): > ... @each_in(range(10)) > ... def fs(i): > ... def f(): > ... return i > ... return f > ... return fs #Warning, fs is a list, not a function! > ... >>>> [f() for f in func_maker()] > [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] > > On the one hand, I can imagine some people thinking this is decorator abuse, > since the each_in decorator produces a list and not a function. If so, I > suppose the lambda might be changed to > >>>> def each_in(seq): > ... return lambda f: lambda: [f(item) for item in seq] > ... >>>> def func_maker(): > ... @each_in(range(10)) > ... def fs(i): > ... def f(): > ... return i > ... return f > ... return fs() #Warning, fs is a function, not a list > ... >>>> [f() for f in func_maker()] > [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] I'm sorry, but you are using two nested lambdas plus a list comprehension, and three nested functions here, plus one more list comprehension for showing the result. My brain hurts trying to understand all this. I don't think this bodes well as a use case for a proposed feature. 
I'm not trying to be sarcastic here -- I really think this code is too hard to follow for a motivating example. > In either version the important thing is that it provides a scoped version > of a for-loop, which those from a C++ background might be more conditioned > to expect. Possibly, such a function could be added to the functools or the > itertools. It would be useful for when scoping issues arise, for example > when adding a bunch of properties or attributes to a class. > > Thinking about it some more though, it's hard to see why such a trivial > function is needed for the library. There's no reason it couldn't just be > done as an inline lambda instead: > >>>> def func_maker(): > ... @lambda f: [f(i) for i in range(10)] > ... def fs(i): > ... def f(): > ... return i > ... return f > ... return fs #Warning, fs is a list, not a function! > ... >>>> [f() for f in func_maker()] > [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] > > OK, actually there is one reason this couldn't be done as a lambda: > >>>> @lambda f: [f(i) for i in range(10)] > File "", line 1 > @lambda f: [f(i) for i in range(10)] > ^ > SyntaxError: invalid syntax > > This is because the decorator grammar asks for a name, not an expression, as > one (well, OK, me a couple months ago) might naively expect. > > decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE > decorators: decorator+ > decorated: decorators (classdef | funcdef) > > and not > > decorator: '@' test NEWLINE > decorators: decorator+ > decorated: decorators (classdef | funcdef) > > Changing the grammar would also allow for the rewriting of the sorted > example given earlier: > >>>> @lambda key: sorted(range(10), key=key) > ... def sorted_list(item): > ... return -item > ... >>>> sorted_list > [9, 8, 7, 6, 5, 4, 3, 2, 1, 0] > > > Of course, the use of lambda decorator is strictly speaking unnecessary, > >>>> k = lambda key: sorted(range(10), key=key) >>>> @k > ... def sorted_list(item): > ... return -item > ... 
>>>> del k >>>> sorted_list > [9, 8, 7, 6, 5, 4, 3, 2, 1, 0] > > But, I think that allowing lambda decorators would be convenient for > a number of situations and it would give a simple answer to those asking for > a multiline lambda or Ruby-like blocks. > > What do other people think? > > -- Carl > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From carl at carlsensei.com Mon Feb 9 05:08:25 2009 From: carl at carlsensei.com (Carl Johnson) Date: Sun, 8 Feb 2009 18:08:25 -1000 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> Message-ID: On 2009/02/08, at 5:29 pm, Guido van Rossum wrote: > I'm sorry, but you are using two nested lambdas plus a list > comprehension, and three nested functions here, plus one more list > comprehension for showing the result. My brain hurts trying to > understand all this. I don't think this bodes well as a use case for a > proposed feature. > > I'm not trying to be sarcastic here -- I really think this code is too > hard to follow for a motivating example. I will admit that this is getting a bit too functional-language-like for its own good, but (ignoring my proposed solution for a while) at least in the case of the nested scope problem, what other choice is there but to nest functions in order to keep the variable from varying? I myself was bitten by the variable thing when I wanted to write a simple for-loop to add methods to a class, but because of the scoping issue, it ended up that all of the methods were equivalent to the last in the list. At present, there's no way around this but to write a slightly confusing series of nested functions. So, that's my case for an each_in function. 
Going back to the @lambda thing, in general, people are eventually going to run into different situations where they have to write their own decorators. We already have a lot of different situations taken care of for us by the functools and itertools, but I don't think the library will ever be able to do everything for everyone. I think that the concept of writing decorators (as opposed to using them) is definitely confusing at first, since you're nesting one thing inside of another and then flipping it all around, and it doesn't entirely make sense, but the Python community seems to have adapted to it. For that matter, if I think too much about what happens in a series of nested generators, I can confuse myself, but each individual generator makes sense as a kind of "pipe" through which data is flowing and being transformed. Similarly, metaclasses are hard to understand but make things easy to use. So, I guess the key is to keep the parts easy to understand even if how the parts work together is a bit circuitous. Maybe @lambda isn't accomplishing that, but I'm not sure how much easier to understand equivalent solutions would be that is written without it. Easy things easy, hard things possible? -- Carl From aahz at pythoncraft.com Mon Feb 9 06:40:26 2009 From: aahz at pythoncraft.com (Aahz) Date: Sun, 8 Feb 2009 21:40:26 -0800 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> Message-ID: <20090209054026.GA20148@panix.com> On Sun, Feb 08, 2009, Guido van Rossum wrote: > On Sun, Feb 8, 2009 at 4:41 PM, Carl Johnson wrote: >> >> >>> def each_in(seq): >> ... return lambda f: lambda: [f(item) for item in seq] >> ... >> >>> def func_maker(): >> ... @each_in(range(10)) >> ... def fs(i): >> ... def f(): >> ... return i >> ... return f >> ... return fs() #Warning, fs is a function, not a list >> ... 
>> >>> [f() for f in func_maker()] >> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] > > I'm sorry, but you are using two nested lambdas plus a list > comprehension, and three nested functions here, plus one more list > comprehension for showing the result. My brain hurts trying to > understand all this. I don't think this bodes well as a use case for a > proposed feature. > > I'm not trying to be sarcastic here -- I really think this code is too > hard to follow for a motivating example. Maybe I've been corrupted by the functional mindset (I hope not!), but I can follow this. As Carl says, the key is to focus on the scoping problem: def func_maker(): fs = [] for i in range(10): def f(): return i fs.append(f) return fs This creates a list of function objects, but the way scoping works in Python, every single function object returns 9 because that's the final value of ``i`` in func_maker(). Living with this wart is an option; changing Python's scoping rules to fix the wart is probably not worth considering. Carl is suggesting something in-between, allowing the use of lambda combined with decorator syntax to create an intermediate scope that hides func_maker()'s ``i`` from f(). I think that's ugly, but it probably is about the best we can do -- the other options are uglier. What I don't know is whether it's worth considering; abusing scope is not generally considered Pythonic, but we've been slowly catering to that market over the years. Given my general antipathy to decorators as obfuscating code, I don't think Carl's suggestion causes much damage. Carl, let me know if I've accurately summarized you. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Weinberg's Second Law: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization. 
From pyideas at rebertia.com Mon Feb 9 07:02:49 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Sun, 8 Feb 2009 22:02:49 -0800 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: <20090209054026.GA20148@panix.com> References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> <20090209054026.GA20148@panix.com> Message-ID: <50697b2c0902082202h2ea97343x8422801231e08b83@mail.gmail.com> On Sun, Feb 8, 2009 at 9:40 PM, Aahz wrote: > Maybe I've been corrupted by the functional mindset (I hope not!), but I > can follow this. As Carl says, the key is to focus on the scoping > problem: > > def func_maker(): > fs = [] > for i in range(10): > def f(): > return i > fs.append(f) > return fs > > This creates a list of function objects, but the way scoping works in > Python, every single function object returns 9 because that's the final > value of ``i`` in func_maker(). Living with this wart is an option; > changing Python's scoping rules to fix the wart is probably not worth > considering. Carl is suggesting something in-between, allowing the use > of lambda combined with decorator syntax to create an intermediate scope > that hides func_maker()'s ``i`` from f(). > > I think that's ugly, but it probably is about the best we can do -- the > other options are uglier. This will likely get shot down in 5 minutes somehow, but I just want to put it forward since I personally didn't find it ugly: Has a declaration been considered (like with `nonlocal`)? Seems a relatively elegant solution to this problem, imho. I know it was mentioned in a thread once (with the keyword as "instantize" or "immediatize" or something similar), but I've been unable to locate it (my Google-chi must not be flowing today). 
The basic idea was (using the keyword "bind" for the sake of argument): def func_maker(): fs = [] for i in range(10): def f(): bind i #this declaration tells Python to grab the value of i as it is *right now* at definition-time return i fs.append(f) return fs The `bind` declaration would effectively tell Python to apply an internal version of the `lambda x=x:` trick/hack/kludge automatically, thus solving the scoping problem. I now expect someone will point me to the thread I couldn't find and/or poke several gaping holes in this idea. Cheers, Chris -- Follow the path of the Iguana... http://rebertia.com From tjreedy at udel.edu Mon Feb 9 07:03:39 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 09 Feb 2009 01:03:39 -0500 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> Message-ID: Carl Johnson wrote: > A few months back there was a discussion of how code like this gives > "surprising" results because of the scoping rules: And solutions were given. > >>> def func_maker(): > ... fs = [] > ... for i in range(10): > ... def f(): > ... return i > ... fs.append(f) > ... return fs > ... > >>> [f() for f in func_maker()] > [9, 9, 9, 9, 9, 9, 9, 9, 9, 9] > > Various syntax changes were proposed to get around this, but nothing > ever came of it. Because a) there is already a trivial way to get the result wanted; b) proposals are wrapped in trollish claims; c) perhaps no proposer is really serious. To me, one pretty obvious way to define default non-parameters would be to follow the signature with "; +". Where is the PEP, though? Enough already. ... > Thinking about it some more, I've realized that the change I proposed is > unnecessary, since we already have the decorator syntax. 
So, for > example, the original function can be made to act with the "expected" > scoping by using an each_in function defined as follows: Such a mess to avoid using the current syntax. Like Guido, my head hurt trying to read it, so I quit. > What do other people think? -1 Terry Jan Reedy From carl at carlsensei.com Mon Feb 9 09:36:51 2009 From: carl at carlsensei.com (Carl Johnson) Date: Sun, 8 Feb 2009 22:36:51 -1000 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> Message-ID: <81F15FFB-4953-40EE-B1F9-10BE4712F257@carlsensei.com> > Carl, let me know if I've accurately summarized you. The only problem I can see with your summary is that it's better stated than what I would say. Thanks, Carl From rhamph at gmail.com Mon Feb 9 10:36:44 2009 From: rhamph at gmail.com (Adam Olsen) Date: Mon, 9 Feb 2009 02:36:44 -0700 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: <50697b2c0902082202h2ea97343x8422801231e08b83@mail.gmail.com> References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> <20090209054026.GA20148@panix.com> <50697b2c0902082202h2ea97343x8422801231e08b83@mail.gmail.com> Message-ID: On Sun, Feb 8, 2009 at 11:02 PM, Chris Rebert wrote: > The basic idea was (using the keyword "bind" for the sake of argument): > > def func_maker(): > fs = [] > for i in range(10): > def f(): > bind i #this declaration tells Python to grab the value of > i as it is *right now* at definition-time > return i > fs.append(f) > return fs > > The `bind` declaration would effectively tell Python to apply an > internal version of the `lambda x=x:` trick/hack/kludge automatically, > thus solving the scoping problem. > > I now expect someone will point me to the thread I couldn't find > and/or poke several gaping holes in this idea. How about it's icky? It feels like it's evaluating part of the inner function when the function is defined. 
There's a simple solution, which is to put it in the argument list: def f(bind i): return i However, when compared to the existing def f(i=i): it just seems too petty to be worth the increased language complexity. That's where it dies: too little benefit with too large a cost. The only way I'd reconsider it is if default values were changed to reevaluate with each call, but nobody should be proposing that again until at least Python 4000, and only then if they come up with a much better argument than we have today. I doubt it'll ever happen. -- Adam Olsen, aka Rhamphoryncus From pyideas at rebertia.com Mon Feb 9 10:50:25 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Mon, 9 Feb 2009 01:50:25 -0800 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> <20090209054026.GA20148@panix.com> <50697b2c0902082202h2ea97343x8422801231e08b83@mail.gmail.com> Message-ID: <50697b2c0902090150l3e752d9dmc2d41a270116c8fe@mail.gmail.com> On Mon, Feb 9, 2009 at 1:36 AM, Adam Olsen wrote: > The only way I'd reconsider it is if default values were changed to > reevaluate with each call, but nobody should be proposing that again > until at least Python 4000, and only then if they come up with a much > better argument than we have today. I doubt it'll ever happen. That much I completely agree with, and I speak from personal experience :-) Cheers, Chris -- Follow the path of the Iguana... http://rebertia.com From denis.spir at free.fr Mon Feb 9 11:09:30 2009 From: denis.spir at free.fr (spir) Date: Mon, 9 Feb 2009 11:09:30 +0100 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> Message-ID: <20090209110930.00e8328e@o> Le Mon, 09 Feb 2009 01:03:39 -0500, Terry Reedy a écrit : > To me, one pretty obvious way to define default non-parameters would be > to follow the signature with "; +". Where is the PEP, though? 
I guess you mean the following? >>> def func_maker(): ... fs = [] ... for i in range(10): ... def f(n=i): ... return n ... fs.append(f) ... return fs ... >>> fs = func_maker() >>> for f in fs: print f(), ... 0 1 2 3 4 5 6 7 8 9 As I understand it, the issue only happens when yielding functions. For instance, the following works as expected: class C(object): def __init__(self,n): self.n = n def obj_maker(): objs = [] for i in range(10): obj = C(i) objs.append(obj) return objs The pseudo-parameter trick used to generate funcs is only a workaround, but it works fine and is not overly complicated (I guess). Maybe this case should be documented in introductory literature; not only as a possible trap, also because it helps understanding python's (non-)scoping rules which allow (this is not obvious when expecting iteration-specific scope): # name 'item' is silently introduced here for item in container: if test(item): break print item Denis ------ la vida e estranya From qrczak at knm.org.pl Mon Feb 9 11:21:21 2009 From: qrczak at knm.org.pl (Marcin 'Qrczak' Kowalczyk) Date: Mon, 9 Feb 2009 11:21:21 +0100 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: <50697b2c0902082202h2ea97343x8422801231e08b83@mail.gmail.com> References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> <20090209054026.GA20148@panix.com> <50697b2c0902082202h2ea97343x8422801231e08b83@mail.gmail.com> Message-ID: <3f4107910902090221l50948336y20f537def45ca002@mail.gmail.com> On Mon, Feb 9, 2009 at 07:02, Chris Rebert wrote: > The basic idea was (using the keyword "bind" for the sake of argument): > > def func_maker(): > fs = [] > for i in range(10): > def f(): > bind i #this declaration tells Python to grab the value of > i as it is *right now* at definition-time > return i > fs.append(f) > return fs When is "now"? The bind, whatever it does, is not executed until the function is called, which is too late. Letting it influence the interpretation of f itself is confusing. 
The problem stems from the fact that local variables are only local to the enclosing function, without the possibility of finer scopes. If variable declarations were explicit and scoped to the enclosing block, this would work: def func_maker(): var fs = [] for i in range(10): var j = i fs.append(lambda: j) return fs or possibly even the original, depending on the semantics of for (whether it binds a new variable on each iteration, scoped over the body, or it rebinds the same variable each time). -- Marcin Kowalczyk qrczak at knm.org.pl http://qrnik.knm.org.pl/~qrczak/ From pyideas at rebertia.com Mon Feb 9 11:33:15 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Mon, 9 Feb 2009 02:33:15 -0800 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: <3f4107910902090221l50948336y20f537def45ca002@mail.gmail.com> References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> <20090209054026.GA20148@panix.com> <50697b2c0902082202h2ea97343x8422801231e08b83@mail.gmail.com> <3f4107910902090221l50948336y20f537def45ca002@mail.gmail.com> Message-ID: <50697b2c0902090233i5efef6dbncba7e3e0a64d9e21@mail.gmail.com> On Mon, Feb 9, 2009 at 2:21 AM, Marcin 'Qrczak' Kowalczyk wrote: > On Mon, Feb 9, 2009 at 07:02, Chris Rebert wrote: > >> The basic idea was (using the keyword "bind" for the sake of argument): >> >> def func_maker(): >> fs = [] >> for i in range(10): >> def f(): >> bind i #this declaration tells Python to grab the value of >> i as it is *right now* at definition-time >> return i >> fs.append(f) >> return fs > > When is "now"? The bind, whatever it does, is not executed until the > function is called, which is too late. Letting it influence the > interpretation of f itself is confusing. Well, it's a declaration of sorts, so it would hypothetically be taken into account at definition-time rather than call-time (that's it's entire raison detre!); surely we can agree that 'global' can also be viewed as having definition-time rather than call-time effects? 
It is somewhat confusing, but then so is the problem it's trying to solve (the unintuitive behavior of functions defined in a loop). But yes, I agree the idea has been thoroughly thrashed (as I predicted) and at least now the discussion is preserved for posterity should similar ideas ever resurface. :) Cheers, Chris -- Follow the path of the Iguana... http://rebertia.com From steve at pearwood.info Mon Feb 9 11:57:32 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 09 Feb 2009 21:57:32 +1100 Subject: [Python-ideas] Making colons optional? In-Reply-To: <50B1A631-BB03-436A-BCEB-6EB659015B37@gmail.com> References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498AC701.6030300@pearwood.info> <9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com> <87ocxgy33o.fsf@benfinney.id.au> <6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com> <498BE583.4040701@scottdial.com> <498BEB78.5000400@pearwood.info> <5EE58FCE-3420-4FB3-BB82-0AAEBBA1E645@gmail.com> <498CCAC1.9060304@pearwood.info> <498E93C4.6090200@pearwood.info> <50B1A631-BB03-436A-BCEB-6EB659015B37@gmail.com> Message-ID: <49900C1C.2040609@pearwood.info> Riobard Zhan wrote: > The problem is virtually all those who think colons should be optional > will *not* even bother with such a trivial issue, let alone coming here > and discussing it. That may be true, but most people who think colons should not be optional also don't bother coming here and discussing it. > In fact I would not bother either, until I was > redirected here by Guido. The primary motive I came here was to throw in > the proposal and see if there are any good counter-arguments to the > idea. Up till now, I've yet seen one, So you say. > except perhaps what Curt mentioned > as "there doesn't even have to be a rational basis for that". > > Now spir/Denis gave up. I think I should follow him, too. Earlier you said that you didn't think it would be difficult to make the Python compiler accept optional colons. 
If you really believe that the majority of people would prefer your suggestion, why don't you come back when you have a patch, then see how many people are interested to try it out? Who knows, you might prove the nay-sayers wrong. (But I doubt it.) -- Steven From steve at pearwood.info Mon Feb 9 12:00:25 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 09 Feb 2009 22:00:25 +1100 Subject: [Python-ideas] [Python-Dev] yield * (Re: Missing operator.call) In-Reply-To: <3f4107910902081515s3bc1b5b1k64bbac14051dbd0b@mail.gmail.com> References: <498991D9.3060407@avl.com> <498D4E9D.6070405@canterbury.ac.nz> <498E2EBF.7090309@canterbury.ac.nz> <20090208015007.12555.885411350.divmod.xquotient.4298@weber.divmod.com> <498E9476.80404@pearwood.info> <498F5AC0.60708@pearwood.info> <3f4107910902081515s3bc1b5b1k64bbac14051dbd0b@mail.gmail.com> Message-ID: <49900CC9.10504@pearwood.info> Marcin 'Qrczak' Kowalczyk wrote: > On Sun, Feb 8, 2009 at 23:20, Steven D'Aprano wrote: > >> And the cost is small: >> >> [steve at ando ~]$ python -m timeit -s "seq = range(500)" "(3*x for x in seq if >> x%2 == 1)" >> 1000000 loops, best of 3: 0.611 usec per loop > > Because generators are lazy and you don't run it into completion. We were talking about the cost of *making* the generator, not the cost of running it to completion. 
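The point that construction is cheap because generators are lazy can be seen directly: building a generator expression executes none of its body — a sketch:

```python
def boom():
    raise RuntimeError("the body never runs at construction time")

# Constructing the generator does no work, so it cannot raise...
g = (boom() for _ in range(3))

# ...only consuming it runs the body.
try:
    next(g)
except RuntimeError:
    print("body ran only on iteration")
```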
-- Steven From lie.1296 at gmail.com Mon Feb 9 13:44:20 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Mon, 9 Feb 2009 12:44:20 +0000 (UTC) Subject: [Python-ideas] Allow lambda decorators References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> Message-ID: On Sun, 08 Feb 2009 18:08:25 -1000, Carl Johnson wrote: I can't really comprehend the feature you're describing (TLDR), but knowing that a decorator is merely syntactic sugar for applying a function to a function: @lambdafunc def func(foo): pass is equivalent to def func(foo): pass func = lambdafunc(func) why don't you do this instead: lambdafunc(lambda foo: -foo) it is perfectly readable and is a more functional approach than a decorator. Also for the first example you gave: >>> def func_maker(): ... fs = [] ... for i in range(10): ... def f(): ... return i ... fs.append(f) ... return fs why not? (untested) >>> from functools import partial >>> def func_maker(): ... def f(j): ... return j ... fs = [] ... for i in range(10): ... fs.append(partial(f, i)) ... return fs ... Closures are already confusing enough for most people. Dynamic creation of closures, which Python's way of defining functions allows, is an abuse of closures IMHO.
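Lie's `partial` version, flagged "(untested)" above, does behave as intended: each `partial` binds the current value of `i` immediately instead of closing over the loop variable. A runnable form:

```python
from functools import partial

def func_maker():
    def f(j):
        return j
    fs = []
    for i in range(10):
        # partial freezes the current value of i at append time
        fs.append(partial(f, i))
    return fs

print([f() for f in func_maker()])
```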
From solipsis at pitrou.net Mon Feb 9 14:10:54 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 9 Feb 2009 13:10:54 +0000 (UTC) Subject: [Python-ideas] [Python-Dev] yield * (Re: Missing operator.call) References: <498991D9.3060407@avl.com> <498D4E9D.6070405@canterbury.ac.nz> <498E2EBF.7090309@canterbury.ac.nz> <20090208015007.12555.885411350.divmod.xquotient.4298@weber.divmod.com> <498E9476.80404@pearwood.info> <498F5AC0.60708@pearwood.info> <3f4107910902081515s3bc1b5b1k64bbac14051dbd0b@mail.gmail.com> <49900CC9.10504@pearwood.info> Message-ID: Steven D'Aprano writes: > > Marcin 'Qrczak' Kowalczyk wrote: > > On Sun, Feb 8, 2009 at 23:20, Steven D'Aprano wrote: > > > >> And the cost is small: > >> > >> [steve ando ~]$ python -m timeit -s "seq = range(500)" "(3*x for x in seq if > >> x%2 == 1)" > >> 1000000 loops, best of 3: 0.611 usec per loop > > > > Because generators are lazy and you don't run it into completion. > > We were talking about the cost of *making* the generator, not the cost > of running it to completion. No, I was talking about the cost of running it to completion. A generator is executed in a separated frame. Therefore, if you "yield from" a generator, there is a frame switch at each iteration between the generator frame and the frame of the "yield from". Which is not the case with an inline "for" loop containing a "yield". Regards Antoine. From lie.1296 at gmail.com Mon Feb 9 14:12:53 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Mon, 9 Feb 2009 13:12:53 +0000 (UTC) Subject: [Python-ideas] Allow lambda decorators References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> Message-ID: On Mon, 09 Feb 2009 12:44:20 +0000, Lie Ryan wrote: > @lambdafunc > def func(foo): pass > > is equal to > > def func(foo): pass > func = lambdafunc(func) > > why don't you do this instead: > > lambdafunc(lambda foo: -foo) > > it is perfectly readable and is a more functional approach than > decorator. OK, disregard that. 
I've just realized you want something really different. From arnodel at googlemail.com Mon Feb 9 15:27:45 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Mon, 9 Feb 2009 14:27:45 +0000 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: <50697b2c0902082202h2ea97343x8422801231e08b83@mail.gmail.com> References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> <20090209054026.GA20148@panix.com> <50697b2c0902082202h2ea97343x8422801231e08b83@mail.gmail.com> Message-ID: <9bfc700a0902090627ya40a516k1d6e5532de5e5fc7@mail.gmail.com> 2009/2/9 Chris Rebert : > This will likely get shot down in 5 minutes somehow, but I just want > to put it forward since I personally didn't find it ugly: Has a > declaration been considered (like with `nonlocal`)? Seems a relatively > elegant solution to this problem, imho. I know it was mentioned in a > thread once (with the keyword as "instantize" or "immediatize" or > something similar), but I've been unable to locate it (my Google-chi > must not be flowing today). You can get to a related discussion if you search for 'localize' or 'immanentize'. Is this what you were looking for? -- Arnaud From rrr at ronadam.com Mon Feb 9 15:32:46 2009 From: rrr at ronadam.com (Ron Adam) Date: Mon, 09 Feb 2009 08:32:46 -0600 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> Message-ID: <49903E8E.90102@ronadam.com> Carl Johnson wrote: > A few months back there was a discussion of how code like this gives > "surprising" results because of the scoping rules: > > >>> def func_maker(): > ... fs = [] > ... for i in range(10): > ... def f(): > ... return i > ... fs.append(f) > ... return fs > ... > >>> [f() for f in func_maker()] > [9, 9, 9, 9, 9, 9, 9, 9, 9, 9] > > Various syntax changes were proposed to get around this, but nothing > ever came of it. This isn't really a scoping rule problem. 
It occurs because the value of i in f() isn't evaluated until f is called, which is after the loop is finished. >>> funcs = [lambda i=i:i for i in range(10)] >>> [f() for f in funcs] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] Here i's default value is set at the time f() is defined, so it avoids the issue. ra at Gutsy:~$ python Python 2.5.2 (r252:60911, Oct 5 2008, 19:29:17) [GCC 4.3.2] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> def f(): return i ... >>> Notice that we can define a function with an i variable even before i exists. The requirement is that i exists when the function f is called. Another way to do this is to use a proper objective approach. >>> class obj(object): ... def __init__(self): ... self.i = i ... def __call__(self): ... return self.i ... >>> objs = [obj() for i in range(10)] >>> [v() for v in objs] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] Because the value is stored on the object at the time the object is created, vs when it is called, this works as expected. For clarity and readability, I would change the __init__ line to accept a proper argument for i, but as you see it still works. Cheers, Ron From denis.spir at free.fr Mon Feb 9 16:23:13 2009 From: denis.spir at free.fr (spir) Date: Mon, 9 Feb 2009 16:23:13 +0100 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: <49903E8E.90102@ronadam.com> References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> <49903E8E.90102@ronadam.com> Message-ID: <20090209162313.0289b187@o> Le Mon, 09 Feb 2009 08:32:46 -0600, Ron Adam a écrit : > >>> class obj(object): > ... def __init__(self): > ... self.i = i > ... def __call__(self): > ... return self.i > ... > >>> objs = [obj() for i in range(10)] > >>> [v() for v in objs] > [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] > > Because the value is stored on the object at the time the object is > created, vs when it is called, this works as expected. I see this formulation as equivalent to "def f(i=i0): return i", I mean conceptually.
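That claimed equivalence can be checked concretely — a small sketch, with the class taking i as an explicit argument as Ron suggests (names are mine):

```python
class Capture:
    """Store the value at creation time, like Ron's obj class."""
    def __init__(self, i):
        self.i = i

    def __call__(self):
        return self.i

# Both the object attribute and the default argument are evaluated
# once, when the callable is created, not when it is called.
objs = [Capture(i) for i in range(10)]
funcs = [lambda i=i: i for i in range(10)]

assert [o() for o in objs] == [f() for f in funcs]
```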
From an OO point of view, a default/keyword arg is indeed an (implicit) attribute of a function, evaluated at definition/creation time. Actually, isn't the above code an excellent explanation of what a default argument is in python? Denis ------ la vida e estranya From guido at python.org Mon Feb 9 18:09:18 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 9 Feb 2009 09:09:18 -0800 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> Message-ID: On Sun, Feb 8, 2009 at 8:08 PM, Carl Johnson wrote: > > On 2009/02/08, at 5:29 pm, Guido van Rossum wrote: > >> I'm sorry, but you are using two nested lambdas plus a list >> comprehension, and three nested functions here, plus one more list >> comprehension for showing the result. My brain hurts trying to >> understand all this. I don't think this bodes well as a use case for a >> proposed feature. >> >> I'm not trying to be sarcastic here -- I really think this code is too >> hard to follow for a motivating example. > > I will admit that this is getting a bit too functional-language-like for its > own good, but (ignoring my proposed solution for a while) at least in the > case of the nested scope problem, what other choices is there but to nest > functions in order to keep the variable from varying? I myself was bitten by > the variable thing when I wanted to write a simple for-loop to add methods > to a class, but because of the scoping issue, it ended up that all of the > methods were equivalent to the last in the list. At present, there's no way > around this but to write a slightly confusing series of nested functions. Are you unaware of or rejecting the solution of using a default argument value? [(lambda x, _i=i: x+i) for i in range(10)] is a list of 10 functions that add i to their argument, for i in range(10). > So, that's my case for an each_in function. 
Going back to the @lambda thing, > in general, people are eventually going to run into different situations > where they have to write their own decorators. We already have a lot of > different situations taken care of for us by the functools and itertools, > but I don't think the library will ever be able to do everything for > everyone. I think that the concept of writing decorators (as opposed to > using them) is definitely confusing at first, since you're nesting one thing > inside of another and then flipping it all around, and it doesn't entirely > make sense, but the Python community seems to have adapted to it. For that > matter, if I think too much about what happens in a series of nested > generators, I can confuse myself, but each individual generator makes sense > as a kind of "pipe" through which data is flowing and being transformed. > Similarly, metaclasses are hard to understand but make things easy to use. > So, I guess the key is to keep the parts easy to understand even if how the > parts work together is a bit circuitous. Maybe @lambda isn't accomplishing > that, but I'm not sure how much easier to understand equivalent solutions > would be that is written without it. Easy things easy, hard things possible? I don't see what @lambda does that you can't already do with several other forms of syntax. The reason for adding decorators to the language is to have easier syntax for common manipulations/modifications of functions and methods. A decorator using lambda would be a one-off, which kind of defeats the purpose. For example, instead of this: >>> def func_maker(): ... @lambda f: [f(i) for i in range(10)] ... def fs(i): ... def f(): ... return i ... return f ... return fs #Warning, fs is a list, not a function! ... 
I would write this: def func_maker(): def fi(i): def f(): return i return f fs = [fi(i) for i in range(10)] return fs In regard to the proposal of "bind i" syntax, I have a counter-proposal (as long as we're in free association mode :-): Define new 'for' syntax so that you can write [lambda: i for new i in range(10)] or e.g. fs = [] for new i in range(10): def f(): return i fs.append(f) The rule would be that "for new in ..." defines a new "cell" each time around the loop, whose scope is limited to the for loop. So e.g. this wouldn't work: for new i in range(10): if i%7 == 6: break print i # NameError I'm not saying I like this all that much, but it seems a more Pythonic solution than "bind i", and it moves the special syntax closer to the source of the problem. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From phd at phd.pp.ru Mon Feb 9 18:12:28 2009 From: phd at phd.pp.ru (Oleg Broytmann) Date: Mon, 9 Feb 2009 20:12:28 +0300 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> Message-ID: <20090209171228.GA17020@phd.pp.ru> On Mon, Feb 09, 2009 at 09:09:18AM -0800, Guido van Rossum wrote: > [(lambda x, _i=i: x+i) for i in range(10)] Either [(lambda x, i=i: x+i) for i in range(10)] or [(lambda x, _i=i: x+_i) for i in range(10)] :-) Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From dangyogi at gmail.com Mon Feb 9 18:33:52 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Mon, 09 Feb 2009 12:33:52 -0500 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> Message-ID: <49906900.10006@gmail.com> Carl Johnson wrote: > A few months back there was a discussion of how code like this gives > "surprising" results because of the scoping rules: > > >>> def func_maker(): > ... fs = [] > ...
for i in range(10): > ... def f(): > ... return i > ... fs.append(f) > ... return fs > ... > >>> [f() for f in func_maker()] > [9, 9, 9, 9, 9, 9, 9, 9, 9, 9] > > [...] > > Thinking about it some more, I've realized that the change I proposed > is unnecessary, since we already have the decorator syntax. So, for > example, the original function can be made to act with the "expected" > scoping by using an each_in function defined as follows: > > >>> def each_in(seq): > ... return lambda f: [f(item) for item in seq] > ... > >>> def func_maker(): > ... @each_in(range(10)) > ... def fs(i): > ... def f(): > ... return i > ... return f > ... return fs #Warning, fs is a list, not a function! > ... > >>> [f() for f in func_maker()] > [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] > It seems like a more general solution to the variable scoping in for statements would be: >>> def for_each(body): ... def loop(seq): ... for i in seq: body(i) ... return loop >>> @for_each ... def body(i): ... def f(): return i ... fs.append(f) >>> fs = [] >>> body(range(10)) >>> [f() for f in fs] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] But it's not clear that the decorator is even required: >>> def repeat(body, seq): ... for i in seq: body(i) >>> def body(i): ... def f(): return i ... fs.append(f) >>> fs = [] >>> repeat(body, range(10)) >>> [f() for f in fs] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] The repeat function should be fully general. -bruce From aahz at pythoncraft.com Mon Feb 9 18:43:36 2009 From: aahz at pythoncraft.com (Aahz) Date: Mon, 9 Feb 2009 09:43:36 -0800 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> Message-ID: <20090209174335.GB20816@panix.com> On Mon, Feb 09, 2009, Guido van Rossum wrote: > > In regard to the proposal of "bind i" syntax, I have a > counter-proposal (as log as we're in free association mode :-): > > Define new 'for' syntax so that you can write > > [lambda: i for new i in range(10)] > > or e.g. 
> > fs = [] > for new i in range(10): > def f(): > return i > fs.append(f) > > The rule would be that "for new in ..." defines a new "cell" > each time around the loop, whose scope is limited to the for loop. So > e.g. this wouldn't work: > > for new i in range(10): > if i%7 == 6: > break > print i # NameError > > I'm not saying I like this all that much, but it seems a more Pythonic > solution than "bind i", and it moves the special syntax closer to the > source of the problem. Nice! This does seem like it would work to quell the complaints about for loop scoping while still maintaining current semantics for those who rely on them. Anyone wanna write a PEP for this?... Anyone who does tackle this should make sure the PEP addresses this: for new x, y in foo: versus for new x, new y in foo: (I'll channel Guido and say that the second form will be rejected; we don't want to allow multiple loop targets to have different scope. IOW, ``for new`` becomes its own construct, functionally speaking.) -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Weinberg's Second Law: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization. From denis.spir at free.fr Mon Feb 9 19:05:54 2009 From: denis.spir at free.fr (spir) Date: Mon, 9 Feb 2009 19:05:54 +0100 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> Message-ID: <20090209190554.1b297117@o> Le Mon, 9 Feb 2009 09:09:18 -0800, Guido van Rossum a écrit : > fs = [] > for new i in range(10): > def f(): > return i > fs.append(f) The difference I see between such an iteration-specific loop variable and a "declarative" version is that in the latter case it is possible to choose which name(s), among the ones that depend on the loop var, will actually get one "cell" per iteration -- or not. Hem, maybe it's not clear...
For instance, using a declaration, it may be possible to write the following loop (I do not pretend 'local' to be a good lexical choice ;-): funcs = [] for item in seq: local prod # this name only is iteration specific prod = product(item) # ==> one prod per item def f(): return prod funcs.append(f) if test(item): final = whatever(item) # non-local name break print "func results:\n%s\nend result:%s" \ %([f() for f in funcs],final) I do not like global & non-local declaration (do not fit well the overall python style, imo). So I would not like such a proposal, neither. But I like the idea to select names rather than a loop-level on/off switch. Denis ------ la vida e estranya From guido at python.org Mon Feb 9 19:12:43 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 9 Feb 2009 10:12:43 -0800 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: <20090209190554.1b297117@o> References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> <20090209190554.1b297117@o> Message-ID: On Mon, Feb 9, 2009 at 10:05 AM, spir wrote: > Le Mon, 9 Feb 2009 09:09:18 -0800, > Guido van Rossum a écrit : > >> fs = [] >> for new i in range(10): >> def f(): >> return i >> fs.append(f) > > The difference I see between such an iteration-specific loop variable and a "declarative" version is that in the latter case it is possible to choose which name(s), among the ones that depend on the loop var, will actually get one "cell" per iteration -- or not. Hem, maybe it's not clear...
> > For instance, using a declaration, it may be possible to write the following loop (I do not pretend 'local' to be a good lexical choice ;-): > > funcs = [] > for item in seq: > local prod # this name only is iteration specific > prod = product(item) # ==> one prod per item > def f(): > return prod > funcs.append(f) > if test(item): > final = whatever(item) # non-local name > break > print "func results:\n%s\nend result:%s" \ > %([f() for f in funcs],final) > > I do not like global & non-local declaration (do not fit well the overall python style, imo). So I would not like such a proposal, neither. But I like the idea to select names rather than a loop-level on/off switch. You could use "new prod" for this. Or you could use "var prod" and change the "for" proposal to "for var". It's a slippery slope though -- how often do you really need this vs. how much more confusion does it cause. You would need rules telling the scope of such a variable: with "for new" that is easy, but what if "var foo" is not nested inside a loop? Then it would essentially become a no-op. Or would it introduce a scope out of the nearest indented block? This quickly becomes too hairy to be attractive. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From rrr at ronadam.com Mon Feb 9 20:10:28 2009 From: rrr at ronadam.com (Ron Adam) Date: Mon, 09 Feb 2009 13:10:28 -0600 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: <20090209174335.GB20816@panix.com> References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> <20090209174335.GB20816@panix.com> Message-ID: <49907FA4.5090707@ronadam.com> Aahz wrote: > On Mon, Feb 09, 2009, Guido van Rossum wrote: >> In regard to the proposal of "bind i" syntax, I have a >> counter-proposal (as log as we're in free association mode :-): >> >> Define new 'for' syntax so that you can write >> >> [lambda: i for new i in range(10)] >> >> or e.g. 
>> >> fs = [] >> for new i in range(10): >> def f(): >> return i >> fs.append(f) >> >> The rule would be that "for new in ..." defines a new "cell" >> each time around the loop, whose scope is limited to the for loop. So >> e.g. this wouldn't work: >> >> for new i in range(10): >> if i%7 == 6: >> break >> print i # NameError >> >> I'm not saying I like this all that much, but it seems a more Pythonic >> solution than "bind i", and it moves the special syntax closer to the >> source of the problem. > > Nice! This does seem like it would work to quell the complaints about > for loop scoping while still maintaining current semantics for those who > rely on them. Anyone wanna write a PEP for this?... > > Anyone who does tackle this should make sure the PEP addresses this: > > for new x, y in foo: > > versus > > for new x, new y in foo: > > (I'll channel Guido and say that the second form will be rejected; we > don't want to allow multiple loop targets to have different scope. IOW, > ``for new`` becomes its own construct, functionally speaking.) Personally, I think whenever state or data is to be stored, objects should be used instead of trying to get functions to do what objects already do. Here's something that looks like it works, but... >>> f = {} >>> for i in range(10): f[i] = lambda:i ... >>> for i in f: f[i]() ... 0 1 2 3 4 5 6 7 8 9 It only worked because I reused i as a variable name. >>> for j in f: f[j]() ...
Guido's 'new' syntax would help, but I'm undecided on weather it's a good feature or not. I'd prefer to have an optional feature to turn on closures myself. Maybe an "__allow_closures__ = True" at the beginning of a program? Shrug, Ron From arnodel at googlemail.com Mon Feb 9 20:13:12 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Mon, 9 Feb 2009 19:13:12 +0000 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> Message-ID: <9bfc700a0902091113u80abaddh345fbe0024e2b792@mail.gmail.com> 2009/2/9 Guido van Rossum : > Define new 'for' syntax so that you can write > > [lambda: i for new i in range(10)] > > or e.g. > > fs = [] > for new i in range(10): > def f(): > return i > fs.append(f) > > The rule would be that "for new in ..." defines a new "cell" > each time around the loop, whose scope is limited to the for loop. So > e.g. this wouldn't work: > > for new i in range(10): > if i%7 == 6: > break > print i # NameError Cool! I'll be able to introduce a new scope at the drop of a hat with a new handy 'idiom': def foo(x): a = x + 1 for new a in a**2,: print a print a >>> foo(5) 36 6 >>> -- Arnaud From rrr at ronadam.com Mon Feb 9 20:28:46 2009 From: rrr at ronadam.com (Ron Adam) Date: Mon, 09 Feb 2009 13:28:46 -0600 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: <49907FA4.5090707@ronadam.com> References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> <20090209174335.GB20816@panix.com> <49907FA4.5090707@ronadam.com> Message-ID: <499083EE.4080506@ronadam.com> Correction/Note: Both examples I gave use globals and not closures, but I still hold my preference against them because it's still too easy to make subtle mistakes with them. One of which is to use a global where a closure is intended. 
;-) Ron Ron Adam wrote: > > > Aahz wrote: >> On Mon, Feb 09, 2009, Guido van Rossum wrote: >>> In regard to the proposal of "bind i" syntax, I have a >>> counter-proposal (as log as we're in free association mode :-): >>> >>> Define new 'for' syntax so that you can write >>> >>> [lambda: i for new i in range(10)] >>> >>> or e.g. >>> >>> fs = [] >>> for new i in range(10): >>> def f(): >>> return i >>> fs.append(f) >>> >>> The rule would be that "for new in ..." defines a new "cell" >>> each time around the loop, whose scope is limited to the for loop. So >>> e.g. this wouldn't work: >>> >>> for new i in range(10): >>> if i%7 == 6: >>> break >>> print i # NameError >>> >>> I'm not saying I like this all that much, but it seems a more Pythonic >>> solution than "bind i", and it moves the special syntax closer to the >>> source of the problem. >> >> Nice! This does seem like it would work to quell the complaints about >> for loop scoping while still maintaining current semantics for those who >> rely on them. Anyone wanna write a PEP for this?... >> >> Anyone who does tackle this should make sure the PEP addresses this: >> >> for new x, y in foo: >> >> versus >> >> for new x, new y in foo: >> >> (I'll channel Guido and say that the second form will be rejected; we >> don't want to allow multiple loop targets to have different scope. IOW, >> ``for new`` becomes its own construct, functionally speaking.) > > Personally, I think when ever a state or data is to be stored, objects > should be used instead of trying to get functions to do what objects > already do. > > > Here's something that looks like it works, but... > > >>> f = {} > >>> for i in range(10): f[i] = lambda:i > ... > > >>> for i in f: f[i]() > ... > 0 > 1 > 2 > 3 > 4 > 5 > 6 > 7 > 8 > 9 > > It was only worked because I reused i as a variable name. > > > >>> for j in f: f[j]() > ... 
> 9 > 9 > 9 > 9 > 9 > 9 > 9 > 9 > 9 > 9 > > I've always thought closures cause more problems than they solve, but I > doubt closures can be removed because of both how many programs are > already written that use them, and because of the number of programmers > who like them. > > One of the reasons I dislike closures is a function that uses a closure > looks exactly like a function that uses a global value and as you see > above is susceptible to side effects depending on how and when they are > used. > > Guido's 'new' syntax would help, but I'm undecided on weather it's a > good feature or not. > > I'd prefer to have an optional feature to turn on closures myself. > Maybe an "__allow_closures__ = True" at the beginning of a program? > > Shrug, > Ron From steve at pearwood.info Tue Feb 10 00:04:04 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 10 Feb 2009 10:04:04 +1100 Subject: [Python-ideas] [Python-Dev] yield * (Re: Missing operator.call) In-Reply-To: References: <498991D9.3060407@avl.com> <498D4E9D.6070405@canterbury.ac.nz> <498E2EBF.7090309@canterbury.ac.nz> <20090208015007.12555.885411350.divmod.xquotient.4298@weber.divmod.com> <498E9476.80404@pearwood.info> <498F5AC0.60708@pearwood.info> <3f4107910902081515s3bc1b5b1k64bbac14051dbd0b@mail.gmail.com> <49900CC9.10504@pearwood.info> Message-ID: <4990B664.7060807@pearwood.info> Antoine Pitrou wrote: > Steven D'Aprano writes: >> Marcin 'Qrczak' Kowalczyk wrote: >>> On Sun, Feb 8, 2009 at 23:20, Steven D'Aprano wrote: >>> >>>> And the cost is small: >>>> >>>> [steve ando ~]$ python -m timeit -s "seq = range(500)" "(3*x for x in > seq if >>>> x%2 == 1)" >>>> 1000000 loops, best of 3: 0.611 usec per loop >>> Because generators are lazy and you don't run it into completion. >> We were talking about the cost of *making* the generator, not the cost >> of running it to completion. > > No, I was talking about the cost of running it to completion. 
Perhaps I misunderstood you, because what you actually said was: "But the former will be slower than the latter, because it constructs an intermediate generator only to yield it element by element." Since a for loop will also yield element by element, the only difference I saw was constructing the generator expression, which is cheap. > A generator is > executed in a separated frame. Therefore, if you "yield from" a generator, there > is a frame switch at each iteration between the generator frame and the frame of > the "yield from". > Which is not the case with an inline "for" loop containing a "yield". I'm afraid I don't understand the relevance of this. If it's a criticism, it's a criticism of generators in general, not of the proposed syntax. Don't we already carry the cost of the frame switch when iterating over a generator? for el in generator: yield el If that is replaced with the proposed syntax yield from generator what's the difference, performance-wise? In both cases, you can optimise by unrolling the generator into an inline for loop, at the cost of readability, convenience, and the ability to pass generator objects around. In Python 2.4 at least, the optimisation is not to be sneered at (modulo the usual warnings about premature optimisation): unrolling is about 30-40% faster: $ python -m timeit -s "def f():" -s " for x in (i+1 for i in xrange(20)): yield x" "list(f())" # using gen expr 100000 loops, best of 3: 12.9 usec per loop $ python -m timeit -s "def f():" -s " for x in xrange(20): yield x+1" "list(f())" # unrolled into body of the loop 100000 loops, best of 3: 8.09 usec per loop Since people are already choosing to use generator expressions instead of unrolling them into for loops, I don't believe that your objection is relevant to the proposal. "yield from expression" would (presumably) be a shorter, neater way of saying "for x in expression: yield x" except that it doesn't create a new name x. 
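For plain iteration (no send() or throw()), the equivalence Steven describes can be checked directly; `yield from` later landed in Python 3.3 via PEP 380, so this sketch needs 3.3+:

```python
def source():
    # stand-in for any generator being delegated to
    for i in range(5):
        yield 3 * i

def unrolled():
    # explicit element-by-element delegation
    for x in source():
        yield x

def delegated():
    # the delegation syntax under discussion (Python 3.3+)
    yield from source()

assert list(unrolled()) == list(delegated())
```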
-- Steven From pyideas at rebertia.com Tue Feb 10 01:43:01 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Mon, 9 Feb 2009 16:43:01 -0800 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: <9bfc700a0902090627ya40a516k1d6e5532de5e5fc7@mail.gmail.com> References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> <20090209054026.GA20148@panix.com> <50697b2c0902082202h2ea97343x8422801231e08b83@mail.gmail.com> <9bfc700a0902090627ya40a516k1d6e5532de5e5fc7@mail.gmail.com> Message-ID: <50697b2c0902091643s1e4984cbjc375d9232fa03df7@mail.gmail.com> On Mon, Feb 9, 2009 at 6:27 AM, Arnaud Delobelle wrote: > 2009/2/9 Chris Rebert : > >> This will likely get shot down in 5 minutes somehow, but I just want >> to put it forward since I personally didn't find it ugly: Has a >> declaration been considered (like with `nonlocal`)? Seems a relatively >> elegant solution to this problem, imho. I know it was mentioned in a >> thread once (with the keyword as "instantize" or "immediatize" or >> something similar), but I've been unable to locate it (my Google-chi >> must not be flowing today). > > You can get to a related discussion if you search for 'localize' or > 'immanentize'. Is this what you were looking for? Yes that is, thanks! As the starter of the thread said, "immanentize" really is a terrible name. :) http://mail.python.org/pipermail/python-ideas/2008-October/002149.html Cheers, Chris -- Follow the path of the Iguana... http://rebertia.com From carl at carlsensei.com Tue Feb 10 01:46:42 2009 From: carl at carlsensei.com (Carl Johnson) Date: Mon, 9 Feb 2009 14:46:42 -1000 Subject: [Python-ideas] variable binding [was lambda decorators] Message-ID: The proposal to add bind to the function definition is silly, since we can do the equivalent of def f(bind i, ...) already using decorators: >>> class bind(object): ... def __init__(self, *args, **kwargs): ... self.args, self.kwargs = args, kwargs ... ... def __call__(self, f): ... def inner(*args, **kwargs): ...
return f(self, *args, **kwargs)
...         return inner
...
>>> l = []
>>> for i in range(10):
...     @bind(i)
...     def my_func(bound_vars, *args, **kwargs):
...         return bound_vars.args[0]
...     l.append(my_func)
...
>>> [f() for f in l]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> [f() for f in l]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Just put the bind decorator into functools and the problem is solved. This
is better than the (UGLY!) default values hack, since in this case it is
impossible for your caller to accidentally overwrite the value you wanted
bound (or at least not without some stackframe manipulation, at which point
you get what you deserve).

I also don't like Guido's proposed var and new keywords. With all due
respect, it looks like JavaScript. Besides, we already have a perfectly
good tool for adding new scopes: functions. Just use a decorator like this
map_maker to make an imap of the for-loop you wanted to have a separate
scope in.

>>> class map_maker(object):
...     def __init__(self, f):
...         self.f = f
...
...     def __call__(self, seq):
...         return (self.f(item) for item in seq)
...
>>> a = 1
>>>
>>> @map_maker
... def my_map(a):
...     print("Look ma, the letter a equals", a)
...
>>> list(my_map(range(10)))
Look ma, the letter a equals 0
Look ma, the letter a equals 1
Look ma, the letter a equals 2
Look ma, the letter a equals 3
Look ma, the letter a equals 4
Look ma, the letter a equals 5
Look ma, the letter a equals 6
Look ma, the letter a equals 7
Look ma, the letter a equals 8
Look ma, the letter a equals 9
[None, None, None, None, None, None, None, None, None, None]
>>>
>>> a
1

So, my proposal is that the API of the bind decorator be cleaned up
considerably and then added to functools. The map_maker API seems to be good
enough and could also go into functools.
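For the narrow loop-capture case, it's worth noting that the stdlib's existing functools.partial already freezes a value per iteration with no new machinery at all. A minimal sketch (the squaring function is just a hypothetical stand-in for real per-iteration work):

```python
from functools import partial

def work(i):
    # hypothetical stand-in for whatever the loop body really does
    return i * i

# one callable per iteration, each with its own frozen i
funcs = [partial(work, i) for i in range(10)]

assert [f() for f in funcs] == [i * i for i in range(10)]
```
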
-- Carl From guido at python.org Tue Feb 10 01:55:03 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 9 Feb 2009 16:55:03 -0800 Subject: [Python-ideas] variable binding [was lambda decorators] In-Reply-To: References: Message-ID: On Mon, Feb 9, 2009 at 4:46 PM, Carl Johnson wrote: > The proposal to add bind to the function definition is silly, since we can > do the equivalent of def f(bind i, ?) already using decorators: > >>>> class bind(object): > ... def __init__(self, *args, **kwargs): > ... self.args, self.kwargs = args, kwargs > ... > ... def __call__(self, f): > ... def inner(*args, **kwargs): > ... return f(self, *args, **kwargs) > ... return inner > ... >>>> l = [] >>>> for i in range(10): > ... @bind(i) > ... def my_func(bound_vars, *args, **kwargs): > ... return bound_vars.args[0] > ... l.append(my_func) > ... >>>> [f() for f in l] > [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] >>>> [f() for f in l] > [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] > > Just put the bind decorator into functools and the problem is solved. This > is better than the (UGLY!) default values hack, since in this case it is > impossible for your caller to accidentally overwrite the value you wanted > bound (or at least not without some stackframe manipulation, at which point > you get what you deserve). Sure, if you want to manufacture a whole new object each time through the loop and access the bound variable by digging deep into an extra argument, this might be considered less ugly. Personally if I had to go to such lengths I would prefer to just switch to an OO style of coding and make the function a genuine method (effectively you're making it a method anyway by implicitly passing self). As others implied, I don't see why you should go to such lengths to avoid having to write OO code. -1 on adding this UGLY hack to functools. > I also don't like Guido's proposed var and new keywords. With all due > respect, it looks like JavaScript. 
Besides, we already have a perfectly
> good tool for adding new scopes: functions. Just use a decorator like this
> map_maker to make an imap of the for-loop you wanted to have a separate
> scope in.

I guess UGLY is subjective. :-)

>>>> class map_maker(object):
> ...     def __init__(self, f):
> ...         self.f = f
> ...
> ...     def __call__(self, seq):
> ...         return (self.f(item) for item in seq)
> ...
>>>> a = 1
>>>>
>>>> @map_maker
> ... def my_map(a):
> ...     print("Look ma, the letter a equals", a)
> ...
>>>> list(my_map(range(10)))
> Look ma, the letter a equals 0
> Look ma, the letter a equals 1
> Look ma, the letter a equals 2
> Look ma, the letter a equals 3
> Look ma, the letter a equals 4
> Look ma, the letter a equals 5
> Look ma, the letter a equals 6
> Look ma, the letter a equals 7
> Look ma, the letter a equals 8
> Look ma, the letter a equals 9
> [None, None, None, None, None, None, None, None, None, None]
>>>>
>>>> a
> 1

Such games with decorators look UGLY to me. How am I, poor reader,
supposed to guess what the signature of my_map() is from reading its
definition?

> So, my proposal is that the API of the bind decorator be cleaned up
> considerably and then added to functools. The map_maker API seems to be good
> enough and could also go into functools.

-1 again.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rhamph at gmail.com  Tue Feb 10 02:36:29 2009
From: rhamph at gmail.com (Adam Olsen)
Date: Mon, 9 Feb 2009 18:36:29 -0700
Subject: [Python-ideas] variable binding [was lambda decorators]
In-Reply-To: 
References: 
Message-ID: 

Of course a much simpler decorator is possible:

for i in range(10):
    @bind(i=i)
    def my_func(i):
        print i

# Implementing bind is left as an exercise for the reader

i is passed to bind as a keyword argument, which it saves and later
passes to my_func as a keyword argument. No default values are used so
it's no longer considered "ugly". It will also give an error if
somebody accidentally passes in i themselves.
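One way to fill in the exercise -- a hypothetical sketch, not an existing stdlib API -- is a decorator factory that saves its keyword arguments and merges them back into each call, raising if a caller tries to supply one of the bound names (written with a return instead of a print so the result is checkable):

```python
import functools

def bind(**bound):
    # hypothetical decorator factory: remember keyword arguments and
    # merge them back into every call of the decorated function
    def decorator(f):
        @functools.wraps(f)
        def inner(*args, **kwargs):
            clashes = set(bound) & set(kwargs)
            if clashes:
                raise TypeError("already bound: %s" % ", ".join(sorted(clashes)))
            kwargs.update(bound)
            return f(*args, **kwargs)
        return inner
    return decorator

funcs = []
for i in range(10):
    @bind(i=i)
    def my_func(i):
        return i
    funcs.append(my_func)

# every callable remembers the i it was created with
assert [f() for f in funcs] == list(range(10))
```

Calling one of the resulting functions with an explicit i= raises a TypeError, mirroring the "gives an error if somebody accidentally passes in i" behaviour described above.
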
I stand by my previous comment on it not being worthwhile. -- Adam Olsen, aka Rhamphoryncus From solipsis at pitrou.net Tue Feb 10 03:12:47 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 10 Feb 2009 02:12:47 +0000 (UTC) Subject: [Python-ideas] [Python-Dev] yield * (Re: Missing operator.call) References: <498991D9.3060407@avl.com> <498D4E9D.6070405@canterbury.ac.nz> <498E2EBF.7090309@canterbury.ac.nz> <20090208015007.12555.885411350.divmod.xquotient.4298@weber.divmod.com> <498E9476.80404@pearwood.info> <498F5AC0.60708@pearwood.info> <3f4107910902081515s3bc1b5b1k64bbac14051dbd0b@mail.gmail.com> <49900CC9.10504@pearwood.info> <4990B664.7060807@pearwood.info> Message-ID: Steven D'Aprano writes: > > Since people are already choosing to use generator expressions instead > of unrolling them into for loops, I don't believe that your objection is > relevant to the proposal. I don't know about other people, but when I write a generator expression, it's usually for passing it around. That is, I write a generator expression in places where I'd otherwise have to write a full generator function; both are probably equivalent performance-wise. I don't think writing a generator expression in situations where you could simply inline the equivalent loop is very common, because it doesn't seem to bring anything (and, as you observed, it's slower). Regards Antoine. From lie.1296 at gmail.com Tue Feb 10 05:19:41 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Tue, 10 Feb 2009 04:19:41 +0000 (UTC) Subject: [Python-ideas] Making colons optional? 
References: <498AC701.6030300@pearwood.info>
	<9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com>
	<87ocxgy33o.fsf@benfinney.id.au>
	<6A75D969-850F-4C6F-98EA-E1041E916EAC@gmail.com>
	<87ocxgdmc3.fsf@xemacs.org>
	<727F165C-16E2-404F-9A47-9A9806E7C2B1@gmail.com>
	<20090206142155.GD13165@phd.pp.ru>
	<982C396D-82B9-49FA-97EF-9B85844A6D4D@gmail.com>
	<20090206150951.GA26929@phd.pp.ru>
	<63E445CF-F77B-4D2C-92D4-789DBE4B959A@gmail.com>
	<20090206153139.GA31773@phd.pp.ru>
Message-ID: 

On Sun, 08 Feb 2009 03:17:33 -0330, Riobard Zhan wrote:

> 
> The original context is that you thought it's not worth a "big/major" 
> change to make colons optional because we get nothing, then I asked by 
> comparison what do we get by making "print" a function (a "bigger" 
> change for me) when we really have the option to just add a new built- 
> in function with a slightly different name.

No, it is not an option because:
1) it is a duplication of features
2) having two different print constructs with subtle differences
   confuses the hell out of newbies
3) people won't use that new built-in function
4) if people don't use the new built-in function, overriding print
   would involve some dark magic
5) to override the print statement you need to mess with sys.stdout,
   which is messy as you need to keep a reference to the original
   sys.stdout or face infinite recursion.

The case for colons is different from the case of semicolons. Statements
separated by semicolons have a sibling relationship, while the statement
before a colon and the statements after it have a parent-child
relationship. A sibling relationship is not significant enough to require
anything more than a newline or -- in the case of several statements on
one line -- a semicolon. A parent-child relationship is much more
important, and having a colon before the child statements makes it
visually easy to distinguish the parent from the children.
The case for requesting optional colons is like asking for code like this
to be legal:

def foo(bar):
    print 'foo'
       print 'bar'
      spit()
        antigravity(bar)
    print 'koo'

which would be interpreted as

def foo(bar):
    print 'foo'
    print 'bar'
    spit()
    antigravity(bar)
    print 'koo'

and saying: "consistent indentation is unnecessary, as the computer can
syntactically distinguish the statements as long as the number of spaces
is equal to or more than the first indentation of the suite"

yeah, although computers can distinguish them easily, humans can't.

From lie.1296 at gmail.com  Tue Feb 10 05:40:53 2009
From: lie.1296 at gmail.com (Lie Ryan)
Date: Tue, 10 Feb 2009 04:40:53 +0000 (UTC)
Subject: [Python-ideas] Making colons optional?
References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com>
	<498AC701.6030300@pearwood.info>
	<9A64E831-1D8E-44CC-A3B4-DEF6578DC308@gmail.com>
	<4801CF66-D8E6-45A0-A5E0-DABF691F4E16@gmail.com>
Message-ID: 

On Sun, 08 Feb 2009 03:17:14 -0330, Riobard Zhan wrote:

> 
>> I think it's a good indicator for optional syntax if you can formulate 
>> new rules for PEP 8 that state when to use it. In the case of colons, 
>> you'd have to either forbid or mandate them; I'd be at a loss to find 
>> another consistent rule. So, making them optional is pointless; we 
>> should either keep them or remove them. And removing is out of the 
>> question. 
>> 
>> Applying that indicator to semicolons, there is a clear rule in PEP 8 
>> that states when to use them: to separate two statements on one line. 
> 
> I thought semicolons and multiple statements on one line are discouraged 
> in PEP 8. Did I miss something? :| 
> 

Discouraged doesn't mean you can't use them. In fact the first section of
PEP 8 after the "Introduction" is "A Foolish Consistency is the Hobgoblin
of Little Minds". lambda is discouraged, but when you think using lambda
is more readable than defining a function, use it.
PEP 8 is made to increase the readability of code; if any PEP 8 rule
decreases readability in your code, violate the rule.

>> However, the only person around here whose itches alone, in the face of 
>> a wall of disagreeing users, can lead to a change in the language, is 
>> Guido. 
> 
> Agreed. That's what BDFL means. 
> 
> I'm not trying to impose my itches on you. I'm not the BDFL. I'm 
> explaining why I think omitting colons makes Python more elegant. 
> 

And you have failed to convince others that it is "more" elegant.

>> Yes, I have read all those remarks about semicolons. 
> 
> Thanks very much! I thought you ignored them before. I apologize :) 

Vice versa, we expect you to read others' posts.

>> What you all fail to recognize is that the majority of Python users 
>> like the colons, and wouldn't program without them. 
> 
> I doubt it. Would you not program other languages because they do not 
> have colons? Do you really think the majority of Python users care about 
> colons that much? I bet they will never notice if there is any colons 
> missing in a piece of Python code if colons are made optional. 

They might not notice while writing it, but several days later, when
fixing an unrelated bug, they would feel an itch.
From greg.ewing at canterbury.ac.nz Tue Feb 10 06:02:27 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 10 Feb 2009 18:02:27 +1300 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: <20090209190554.1b297117@o> References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> <20090209190554.1b297117@o> Message-ID: <49910A63.4010302@canterbury.ac.nz> spir wrote: > For instance, using a declaration, it may be possible to write the following loop (I do not pretend 'local' to be a good lexical choice ;-): > > funcs = [] > for item in seq: > local prod # this name only is iteration specific > prod = product(item) # ==> one prod per item My conception of the 'new' idea is that you would be able to use it with any assignment to a bare name, so you could write for item in seq: new prod = product(item) ... -- Greg From greg.ewing at canterbury.ac.nz Tue Feb 10 06:13:39 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 10 Feb 2009 18:13:39 +1300 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> <20090209190554.1b297117@o> Message-ID: <49910D03.6060104@canterbury.ac.nz> Guido van Rossum wrote: > You would need rules telling the scope of such a variable: > with "for new" that is easy, but what if "var foo" is not nested > inside a loop? Then it would essentially become a no-op. Or would it > introduce a scope out of the nearest indented block? In my version of it, 'new' doesn't affect scoping at all -- the scope of the name that it qualifies is the same as it would be if the 'new' wasn't there. It's not necessarily true that it would be a no-op outside of a loop: x = 17 def f(): ... # sees x == 17 new x = 42 def g(): ... # sees x == 42 although practical uses for that are probably rather thin on the ground. > This quickly > becomes too hairy to be attractive. It all seems rather clear and straightforward to me, once you understand what it does. 
Some RTFMing on the part of newcomers would be required, but nothing worse than what we already have with things like generators, decorators and with-statements. -- Greg From greg.ewing at canterbury.ac.nz Tue Feb 10 06:25:50 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 10 Feb 2009 18:25:50 +1300 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: <9bfc700a0902091113u80abaddh345fbe0024e2b792@mail.gmail.com> References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> <9bfc700a0902091113u80abaddh345fbe0024e2b792@mail.gmail.com> Message-ID: <49910FDE.3050205@canterbury.ac.nz> Arnaud Delobelle wrote: > Cool! I'll be able to introduce a new scope at the drop of a hat with > a new handy 'idiom': > > def foo(x): > a = x + 1 > for new a in a**2,: > print a > print a Before you get too carried away with that, I'd like to make it clear that in my version of the semantics of 'new', it *not* work that way -- 'a' would remain bound to the last cell, and therefore the last value, that it was given by the for-loop. I considered that a feature of my proposal, i.e. it preserves the ability to use the loop variable's value after exiting the loop, which some people seem to like. -- Greg From lie.1296 at gmail.com Tue Feb 10 08:04:52 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Tue, 10 Feb 2009 07:04:52 +0000 (UTC) Subject: [Python-ideas] Making colons optional? References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498B1CDB.3000804@scottdial.com> Message-ID: On Thu, 05 Feb 2009 12:07:39 -0500, Scott Dial wrote: >> if (some() and some_other() or some_more(complex=(True,)) >> and a_final_call(egg=(1,2,3))): >> do_something() > > This example uses the same mechanism as above. 
BTW, I tend to indent > this as: > > if (some() and some_other() or some_more(complex=(True,)) > and a_final_call(egg=(1,2,3))): > do_something() > > With or without the colon, and it's more readable than your version of course a more pythonic approach would format it as this: if (some() and some_other() or some_more(complex=(True,)) and a_final_call(egg=(1,2,3))): do_something() It is clear that the if-clause continues to the next line because there is an unsatisfied "and" in the end of it and it is also consistent with PEP 8's guideline for having operators/keywords before breaking. alternatively: if some() and some_other() or some_more(complex=(True,)) and \ a_final_call(egg=(1,2,3)): do_something() it is even clearer since there is a line continuation token "\". personally, I like this: if some() and some_other() or \ some_more(complex=(True,)) and \ a_final_call(egg=(1,2,3)) \ : do_something() anything more complex than a very simple expression in a multi-line if should really be on its own line, irrespective of whether it has reached the character limit or not. Even more preferably though, that expression shouldn't be that complex, it becomes hard to follow the logic of the if. The program require a little bit of factoring if I saw such expression. From lie.1296 at gmail.com Tue Feb 10 08:08:48 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Tue, 10 Feb 2009 07:08:48 +0000 (UTC) Subject: [Python-ideas] Making colons optional? References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <20090205181039.6a7bfd73@o> Message-ID: On Thu, 05 Feb 2009 14:59:59 -0330, Riobard Zhan wrote: >>> The second example makes it even more obvious: >>> >>> if (some() and some_other() or some_more(complex=(True,)) >>> and a_final_call(egg=(1,2,3))): >> >> Do(some(), some_other(), some_more(complex=(True,)), >> and_final_call(egg=(1,2,3,zzzzz=False))); # endof >> statement is obvious > > > This is exactly the point in my mind! :) What is that piece of code supposed to mean? 
Is it supposed to mean the same as the easy to understand if-clause above it? Among the parentheses clouds, I can't see the end-of-line easily... From arnodel at googlemail.com Tue Feb 10 08:15:55 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Tue, 10 Feb 2009 07:15:55 +0000 Subject: [Python-ideas] Allow lambda decorators In-Reply-To: <49910FDE.3050205@canterbury.ac.nz> References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com> <9bfc700a0902091113u80abaddh345fbe0024e2b792@mail.gmail.com> <49910FDE.3050205@canterbury.ac.nz> Message-ID: On 10 Feb 2009, at 05:25, Greg Ewing wrote: > Arnaud Delobelle wrote: > >> Cool! I'll be able to introduce a new scope at the drop of a hat with >> a new handy 'idiom': >> def foo(x): >> a = x + 1 >> for new a in a**2,: >> print a >> print a > > Before you get too carried away with that, I'd like > to make it clear that in my version of the semantics > of 'new', it *not* work that way -- 'a' would remain > bound to the last cell, and therefore the last value, > that it was given by the for-loop. > > I considered that a feature of my proposal, i.e. it > preserves the ability to use the loop variable's value > after exiting the loop, which some people seem to like. Sorry I should have made it clearer that my 'idiom' was tongue in cheek. Indeed I think that it's better to let the value of the inner 'a' spill outside the loop. -- Arnaud From lie.1296 at gmail.com Tue Feb 10 08:19:58 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Tue, 10 Feb 2009 07:19:58 +0000 (UTC) Subject: [Python-ideas] Making stars optional? (was: Making colons optional?) 
References: <498B623E.1040206@pearwood.info> <498CCED6.7040209@pearwood.info> <91ad5bf80902061714g527d0c64h55ba781774f56d78@mail.gmail.com> <498CF02A.2070807@pearwood.info> <26594513-3A0C-4223-9095-0293CA884663@gmail.com> Message-ID: On Sat, 07 Feb 2009 12:19:50 -0200, Leonardo Santagada wrote: > On Feb 7, 2009, at 12:21 AM, Steven D'Aprano wrote: > >> George Sakkis wrote: >>> On Fri, Feb 6, 2009 at 6:59 PM, Steven D'Aprano >>> wrote: >>>>> In maths, >>>>> you only have single-character variable names (sub-/superscripts >>>>> notwithstanding), so ab always means a*b. >>>> Except in the presence of typos. >>> In the presence of typos all bets are off, unless you are aware of any >>> typo-proof writing system. Python certainly isn't one since, say, >>> "x.y" and "x,y" are pretty similar, both visually and in keyboard >>> distance. >> >> You overstate your case: not *all* bets are off, just some of them. >> >> Some typos have greater consequences than others. Some will be >> discovered earlier than others, and the earlier they are discovered, >> the more likely they are to be easily fixed without the need for >> significant debugging effort. >> >> E.g. if I type R**$ instead of R**4, such a typo will be picked up in >> Python immediately. But R**5 instead could be missed for arbitrarily >> large amounts of time. >> >> E.g. if you mean x.y but type x,y instead, then such an error will be >> discovered *very* soon, unless you happen to also have a name 'y'. >> >> But anyway, we're not actually disagreeing. (At least, I don't think we >> are.) We're just discussing how some conventions encourage errors and >> others discourage them, and the circumstances of each. > > > This doesn't really makes much sense to me... if you don't use tests to > verify your program both R**5 or R**A are just the same error if for > example this code path is a rare case, it will only be discovered when > something goes wrong with the program... 
If that piece of code is in a rarely walked path, then it is even worse,
since the bug could easily creep into a production release without anyone
noticing. It happens that there are three classifications of errors in
programming:

Syntax Error - easily found; impossible to creep into a production
    release since the compiler/interpreter would complain loudly
Runtime Error - found when the code path is walked; comprehensive
    testing would reveal such errors
Logic Error - hard to find, hard to debug

It happens that most typos would become SyntaxErrors or LogicErrors.
Implicit multiplication currently becomes a SyntaxError, since Python
recognizes no implicit multiplication; OTOH if we introduce implicit
multiplication, it'll invariably become a Logic Error or a Runtime Error.

From lie.1296 at gmail.com  Tue Feb 10 11:43:34 2009
From: lie.1296 at gmail.com (Lie Ryan)
Date: Tue, 10 Feb 2009 10:43:34 +0000 (UTC)
Subject: [Python-ideas] String formatting and namedtuple
Message-ID: 

I've been experimenting with namedtuple; it seems that string formatting
doesn't recognize a namedtuple as a mapping.

from collections import namedtuple
Nt = namedtuple('Nt', ['x', 'y'])
nt = Nt(12, 32)
print 'one = %(x)s, two = %(y)s' % nt
# output should be: one = 12, two = 32

currently, it is possible to use nt._asdict() as a workaround, but I
think it would be easier and more intuitive to be able to use a
namedtuple directly with string interpolation

Sorry if this issue has been discussed before.

From denis.spir at free.fr  Tue Feb 10 13:39:21 2009
From: denis.spir at free.fr (spir)
Date: Tue, 10 Feb 2009 13:39:21 +0100
Subject: [Python-ideas] __data__
Message-ID: <20090210133921.540edda7@o>

Hello,

I stepped on an issue recently that led me to reconsider what we
actually do and mean when subtyping, especially when deriving from
built-in types. The issue, basically, is that I need to add some
information and behaviour to "base data" that can be of any type. (See
PS for details.)
When subtyping int, I mean instances of MyInt to basically "act as" an
int. This does not mean, in fact, inheriting int's attributes. It means
instead that the integer-specific operations apply to a very special
attribute instead of applying to the whole object. For instance, if n is
an instance of MyInt with a value of 2, then:
* "print n" will print "2" instead of "<__main__.MyInt object at 0xb7d56ecc>"
* "1 + n" will compute "2 + 1" instead of trying to add 1 to the object
itself.
Relevant method calls on the object are passed on to an implicit data
attribute. Concretely, this shows up as the nice side effect of not
having to write "n.data" (or "n.value") all the time. But in fact, from
the user's point of view, an instance of a derived type such as MyInt
basically *is* an int with some additional or some changed features. So
that implicit indirection is not only nice: it really reflects the
intention.

In other words, an implicit indirection operates. This indirection is
precisely what we need to explicitly write down when, instead of
deriving from int, we simulate the built-in type's behaviour:

class MyInt(object):
    def __init__(self, value):
        self.value = value
    def __add__(self, other):
        return self.value + other  # or more precisely: int.__add__(self.value, other)
    def __str__(self):
        return str(self.value)

Whatever their actual implementation, built-in types will target
operations toward the real, basic data item that represents the value.
Ordinary inheritance instead does not cause objects to "act as" whatever
base, silently indirecting relevant operations to an implicit data field
-- if not explicitly written. There is no special data at all, and no
specific behaviour targeted at that data. Moreover, as shown in the
simulation case above, making a custom type does not imply any
inheritance; rather, it simply requires properly targeting the relevant
methods. There is *no base type*: there is *base data* instead -- which
has its own type.
Deriving to let instances "act as" base type instances finally requires
only telling the interpreter which attribute holds the base data. Then,
when a method is called on an instance, if this method is defined
neither for its type nor for this very instance, the interpreter can
(try to) apply it to the basic data instead. If deriving can actually
work that way, then there is no reason to declare a parent class at
definition time, thus forever closing off what kind of data is allowed.
The fictional code snippets below give the interpreter all the
information it needs:

class MyInt(object):    # no specific base
    def __init__(self, value):
        int.__init__(self, value)

class MyInt(object):
    __data__ = "value"
    def __init__(self, value):
        self.value = value

class MyInt(object):
    def __init__(self, value):
        self.__data__ = value

The first one has the drawback of allowing several base.__init__ calls
(desirable feature?), obviously leading to method name clashes. Also, if
ever the type is not predictable, we need to use an even more complicated
syntax such as type(value).__init__(self, value), which by the way shows
how redundant such a call is (value has to be uselessly specified twice).
The second form simply tells which field is to be considered as holding
the base data, so that the interpreter can properly process the
indirection. The third version, even more simply, stores the base data
into a magic field name. In both cases, the absence of a __data__
attribute means there is no basic data, so inheritance (from object or
any base class or classes) and method calls operate normally.

Probably some will consider that the present way of achieving the same
goal (combining inheritance with a possible call to base.__init__ if
needed) is simpler, clearer, etc... I will not argue on that point.
Still, consider the following:

class MyInt(int):
    def __init__(self, value):
        data = int(value)
        int.__init__(self, data)

class MyInt(int):
    def __init__(self, value):
        data = int(value)
        super(MyInt, self).__init__(data)

class MyInt(object):
    __data__ = "data"
    def __init__(self, value):
        self.data = int(value)

class MyInt(object):
    def __init__(self, value):
        self.__data__ = int(value)

I find the third and fourth versions self-commenting. The fourth one is
simpler and more straightforward, but the third one makes the existence
of __data__ more obvious (so we know at first sight that this type is
able to operate silent indirection). Moreover --this is the reason why I
first had to study that point closer--, the present syntax requires the
base type to be unique *and* known at design time. Which need not be the
case. I do not know however if there are many use cases for flexible
base types. Still, versions 3 and 4 allow it without any overhead:
__data__ (or data) can be of any type, and can even change at runtime --
so what is the point of a fixed base type?

Denis
------
la vita e estranya

PS: use case. I started to write a kind of parser generator based on a
grammar inspired by PEG
(http://en.wikipedia.org/wiki/Parsing_expression_grammar), like
pyparsing. The way I see it implies several requirements for the parse
result type:
-0- Basically, valid results of successful matches can be "empty" (e.g.
from an optional or lookahead expression), "single" (simple string), or
"sequential" (when generated from repetition or sequence patterns).
-1- Each (sub)result knows its own "nature" (implemented as a reference
to the pattern that generated it), so that when patterns contain choices
the client code does not need to partially reparse the result just to
know what the result actually is -- which the parser has already
determined! Moreover, the client code then needs only minimal knowledge
of the grammar, as long as the parser passes along all the information
it has collected.
-2- Global and partial results can be output in several practical
formats, including a "treeView" that +/- represents the parse tree.
-3- Results can be transformed to prepare further processing -- which
may well be only final output. There are several standard methods needed
for this (glue a result sequence into a single string, extract nested
results, keep only the parse tree's leaves); results can also be
converted (e.g. an int representation into an int) or changed in
whatever manner using ad hoc functions: eventually result data can well
be of custom types. (This is actually a very interesting feature: for
instance when parsing wikitext, a "styled_text" result may instantiate a
StyledText object as an abstract representation of it, that can e.g.
write itself to a DB, another wiki lang, x/html...) The conceptual
nature of a result requires the object to implicitly indirect the proper
methods to the real result data it holds. For instance, the client code
should be able to write an addition on a result if adding this result
makes sense, and have the operation implicitly applied to the underlying
data.

The first and second points require a custom type with additional
information and behaviour. The third point requires flexibility on the
base type.
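Pending any such __data__ support, the indirection described above can be approximated today with ordinary __getattr__ delegation, independently of the base data's type. A rough sketch of the general pattern (names are illustrative, not a proposed API):

```python
class Wrapper(object):
    """Wrap arbitrary 'base data' and delegate unknown lookups to it."""

    def __init__(self, data):
        self.data = data          # the base data; may be of any type

    def __getattr__(self, name):
        # only called when normal attribute lookup fails:
        # forward the lookup to the wrapped data
        return getattr(self.data, name)

    def __str__(self):
        return str(self.data)

    def tagged(self):
        # behaviour added on top of the base data
        return "<%s>" % self

n = Wrapper(2)
assert str(n) == "2"                   # behaves like its data when printed
assert n.bit_length() == 2             # int method, reached by delegation
assert Wrapper("ab").upper() == "AB"   # same class, str data
```

The catch -- and the reason the idea seems to need language support to be complete -- is that special methods such as __add__ are looked up on the type rather than the instance, so "1 + n" still fails unless each operator is forwarded explicitly, exactly as in the MyInt simulation above.
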
From denis.spir at free.fr  Tue Feb 10 14:05:30 2009
From: denis.spir at free.fr (spir)
Date: Tue, 10 Feb 2009 14:05:30 +0100
Subject: [Python-ideas] Allow lambda decorators
In-Reply-To: <49910A63.4010302@canterbury.ac.nz>
References: <3FBBB95C-E9EC-4ECA-9126-E6F37B3911EA@carlsensei.com>
	<20090209190554.1b297117@o>
	<49910A63.4010302@canterbury.ac.nz>
Message-ID: <20090210140530.2933acd4@o>

Le Tue, 10 Feb 2009 18:02:27 +1300,
Greg Ewing a écrit :

> spir wrote:
> 
> > For instance, using a declaration, it may be possible to write the following loop (I do not pretend 'local' to be a good lexical choice ;-):
> > 
> > funcs = []
> > for item in seq:
> >     local prod  # this name only is iteration specific
> >     prod = product(item)  # ==> one prod per item
> 
> My conception of the 'new' idea is that you would be able to
> use it with any assignment to a bare name, so you could write
> 
>    for item in seq:
>      new prod = product(item)
>      ...
> 

Yes, this is different indeed. As Guido's example was to tag the loop
variable itself with the kw 'new', all names inside the loop body
depending on it would also be "iteration specific", so that it would no
longer be possible to retrieve final-state data outside the loop, as is
commonly done in Python. Your proposal solves the problem. Yet, as Guido
noted, it is probably not a big enough issue.

------
la vida e estranya

From steve at pearwood.info  Tue Feb 10 14:42:50 2009
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 11 Feb 2009 00:42:50 +1100
Subject: [Python-ideas] __data__
In-Reply-To: <20090210133921.540edda7@o>
References: <20090210133921.540edda7@o>
Message-ID: <4991845A.3000501@pearwood.info>

spir wrote:
> Hello,
> 
> I stepped on an issue recently that led me to reconsider what we
> actually do and mean when subtyping, especially when deriving from
> built-in types. The issue, basically, is that I need to add some
> information and behaviour to "base data" that can be of any type. (See
> PS for details.)
> > When subtyping int, I mean instances of MyInt to basically "act as" > an int. This does not mean, in fact, inheriting int's attributes. It doesn't? How bizarre. > This means instead the integer specific operations to apply on a very > special attribute instead of applying on the global object. That sounds vaguely like delegation. ... > Moreover --this is the reason why I first had to study that point > closer--, the present syntax requires the base type to be unique > *and* known at design time. I don't think so. Just write a class factory. >>> def factory(base): ... class MyThing(base): ... def method(self): ... return "self is a %s" % base.__name__ ... return MyThing ... >>> x = factory(int)() >>> x.method() 'self is a int' >>> y = factory(str)() >>> y.method() 'self is a str' -- Steven From denis.spir at free.fr Tue Feb 10 15:38:06 2009 From: denis.spir at free.fr (spir) Date: Tue, 10 Feb 2009 15:38:06 +0100 Subject: [Python-ideas] __data__ In-Reply-To: <4991845A.3000501@pearwood.info> References: <20090210133921.540edda7@o> <4991845A.3000501@pearwood.info> Message-ID: <20090210153806.2f0fb7a5@o> Le Wed, 11 Feb 2009 00:42:50 +1100, Steven D'Aprano a écrit : > > Moreover --this is the reason why I first had to study that point > > closer--, the present syntax requires the base type to be unique > > *and* known at design time. > > I don't think so. Just write a class factory. > > >>> def factory(base): > ... class MyThing(base): > ... def method(self): > ... return "self is a %s" % base.__name__ > ... return MyThing > ... > >>> x = factory(int)() > >>> x.method() > 'self is a int' > >>> y = factory(str)() > >>> y.method() > 'self is a str' I considered this approach already. It does not solve the problem at all. It's not the class's base type that is undefined at design time; it's the "base data"'s type. Each instance's base data may be of any type. In other words base==type(data) for each instance.
Research on the topic has revealed nothing except for a thread on SWIG's site. They were confronted with precisely the same issue, and seemed to have no more clue about a possible workaround. ------ la vida e estranya From guido at python.org Tue Feb 10 15:53:52 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Feb 2009 06:53:52 -0800 Subject: [Python-ideas] __data__ In-Reply-To: <20090210153806.2f0fb7a5@o> References: <20090210133921.540edda7@o> <4991845A.3000501@pearwood.info> <20090210153806.2f0fb7a5@o> Message-ID: Seems to me you are describing delegation. On Tue, Feb 10, 2009 at 6:38 AM, spir wrote: > Le Wed, 11 Feb 2009 00:42:50 +1100, > Steven D'Aprano a écrit : > >> > Moreover --this is the reason why I first had to study that point >> > closer--, the present syntax requires the base type to be unique >> > *and* known at design time. >> >> I don't think so. Just write a class factory. >> >> >>> def factory(base): >> ... class MyThing(base): >> ... def method(self): >> ... return "self is a %s" % base.__name__ >> ... return MyThing >> ... >> >>> x = factory(int)() >> >>> x.method() >> 'self is a int' >> >>> y = factory(str)() >> >>> y.method() >> 'self is a str' > > I considered this approach already. It does not solve the problem at all. It's not the class's base type that is undefined at design time; it's the "base data"'s type. Each instance's base data may be of any type. In other words base==type(data) for each instance. > Research on the topic has revealed nothing except for a thread on SWIG's site. They were confronted with precisely the same issue, and seemed to have no more clue about a possible workaround.
> > ------ > la vida e estranya > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From rrr at ronadam.com Tue Feb 10 17:54:52 2009 From: rrr at ronadam.com (Ron Adam) Date: Tue, 10 Feb 2009 10:54:52 -0600 Subject: [Python-ideas] Making colons optional? In-Reply-To: References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com> <498B1CDB.3000804@scottdial.com> Message-ID: <4991B15C.7070904@ronadam.com> Lie Ryan wrote: > On Thu, 05 Feb 2009 12:07:39 -0500, Scott Dial wrote: > personally, I like this: > > if some() and some_other() or \ > some_more(complex=(True,)) and \ > a_final_call(egg=(1,2,3)) \ > : > do_something() My preference is to lead with keywords or symbols when it's convenient: if (some() and some_other() or some_more(complex=(True,)) and a_final_call(egg=(1, 2, 3))): do_something() For long math expressions spanning multiple lines I usually split before addition or subtraction signs. if (some_long_value * another_long_value + some_long_value * a_special_quantity - an_offset_of_some_size): do_something() With syntax highlighting these become even more readable. The eye and mind just follow the vertical line of highlighted keywords or operations to the end. As far as optional things go, I'd like the '\' to be optional for multiple lines that end with a ':'. Adding parentheses around the expression works, but it seems like a compromise to me. Cheers, Ron From aahz at pythoncraft.com Tue Feb 10 18:03:19 2009 From: aahz at pythoncraft.com (Aahz) Date: Tue, 10 Feb 2009 09:03:19 -0800 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: Message-ID: <20090210170319.GA1763@panix.com> On Tue, Feb 10, 2009, Lie Ryan wrote: > > I've been experimenting with namedtuple, it seems that string formatting > doesn't recognize namedtuple as mapping.
> > from collections import namedtuple > Nt = namedtuple('Nt', ['x', 'y']) > nt = Nt(12, 32) > print 'one = %(x)s, two = %(y)s' % nt > > # output should be: > one = 12, two = 32 > > currently, it is possible to use nt._asdict() as a workaround, but I > think it will be easier and more intuitive to be able to use namedtuple > directly with string interpolation This makes perfect sense to me; I think you should feel free to go ahead and submit a patch. (Regardless of whether the patch gets accepted, at least we would have a good record -- python-ideas is not really a good place to record ideas, and this is something nice and simple.) -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Weinberg's Second Law: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization. From carl at carlsensei.com Wed Feb 11 02:21:33 2009 From: carl at carlsensei.com (Carl Johnson) Date: Tue, 10 Feb 2009 15:21:33 -1000 Subject: [Python-ideas] String formatting and namedtuple Message-ID: <18909BF3-EA4B-4F2C-9C2C-7CE42E676B29@carlsensei.com> See if you can add a patch for this too: >>> print( 'one = {x}s, two = {y}s'.format(**nt)) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: format() argument after ** must be a mapping, not Nt From rhamph at gmail.com Wed Feb 11 09:41:21 2009 From: rhamph at gmail.com (Adam Olsen) Date: Wed, 11 Feb 2009 01:41:21 -0700 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: Message-ID: On Tue, Feb 10, 2009 at 3:43 AM, Lie Ryan wrote: > I've been experimenting with namedtuple, it seems that string formatting > doesn't recognize namedtuple as mapping.
> > from collections import namedtuple > Nt = namedtuple('Nt', ['x', 'y']) > nt = Nt(12, 32) > print 'one = %(x)s, two = %(y)s' % nt > > # output should be: > one = 12, two = 32 > > currently, it is possible to use nt._asdict() as a workaround, but I > think it will be easier and more intuitive to be able to use namedtuple > directly with string interpolation > > Sorry if this issue has been discussed before. Exposing a dict API would pollute namedtuple. What's more, it's unnecessary in python 2.6/3.0: >>> print('one = {0.x}, two = {0.y}'.format(nt)) one = 12, two = 32 (And before you ask, no, I don't think it's worth adding the dict API just to make old-style formatting a tiny bit easier.) -- Adam Olsen, aka Rhamphoryncus From aahz at pythoncraft.com Wed Feb 11 16:38:55 2009 From: aahz at pythoncraft.com (Aahz) Date: Wed, 11 Feb 2009 07:38:55 -0800 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: <20090210170319.GA1763@panix.com> References: <20090210170319.GA1763@panix.com> Message-ID: <20090211153855.GA13187@panix.com> On Tue, Feb 10, 2009, Aahz wrote: > On Tue, Feb 10, 2009, Lie Ryan wrote: >> >> I've been experimenting with namedtuple, it seems that string formatting >> doesn't recognize namedtuple as mapping. >> >> from collections import namedtuple >> Nt = namedtuple('Nt', ['x', 'y']) >> nt = Nt(12, 32) >> print 'one = %(x)s, two = %(y)s' % nt >> >> # output should be: >> one = 12, two = 32 >> >> currently, it is possible to use nt._asdict() as a workaround, but I >> think it will be easier and more intuitive to be able to use namedtuple >> directly with string interpolation > > This makes perfect sense to me; I think you should feel free to go ahead > and submit a patch. (Regardless of whether the patch gets accepted, at > least we would have a good record -- python-ideas is not really a good > place to record ideas, and this is something nice and simple.)
Actually, I just realized that this can't work because then you couldn't have print 'one = %s, two = %s' % nt due to the overloading on __getitem__(). Even if you could somehow special-case this, tuple index sequencing needs to continue working as the fast path. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Weinberg's Second Law: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization. From python at rcn.com Wed Feb 11 21:11:28 2009 From: python at rcn.com (Raymond Hettinger) Date: Wed, 11 Feb 2009 12:11:28 -0800 Subject: [Python-ideas] String formatting and namedtuple References: Message-ID: <4E46CE2777954AB696472EFFB9197A76@RaymondLaptop1> [Lie Ryan] > I've been experimenting with namedtuple, it seems that string formatting > doesn't recognize namedtuple as mapping. That's because a named tuple isn't a mapping ;-) It's a tuple that also supports getattr() style access. > from collections import namedtuple > Nt = namedtuple('Nt', ['x', 'y']) > nt = Nt(12, 32) > print 'one = %(x)s, two = %(y)s' % nt > > # output should be: > one = 12, two = 32 > > currently, it is possible to use nt._asdict() as a workaround, but I > think it will be easier and more intuitive to be able to use namedtuple > directly with string interpolation This is not unique to named tuples.
String interpolation and the string format do not use getattr() style access with any kind of object: print '<%(real)s, %(imag)s>' % (3+4j) # doesn't find real/imag attributes Raymond From guido at python.org Wed Feb 11 21:53:15 2009 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Feb 2009 12:53:15 -0800 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: <4E46CE2777954AB696472EFFB9197A76@RaymondLaptop1> References: <4E46CE2777954AB696472EFFB9197A76@RaymondLaptop1> Message-ID: On Wed, Feb 11, 2009 at 12:11 PM, Raymond Hettinger wrote: > > [Lie Ryan] >> >> I've been experimenting with namedtuple, it seems that string formatting >> doesn't recognize namedtuple as mapping. > > That's because a named tuple isn't a mapping ;-) > It's a tuple that also supports getattr() style access. > > >> from collections import namedtuple >> Nt = namedtuple('Nt', ['x', 'y']) >> nt = Nt(12, 32) >> print 'one = %(x)s, two = %(y)s' % nt >> >> # output should be: >> one = 12, two = 32 >> >> currently, it is possible to use nt._asdict() as a workaround, but I think >> it will be easier and more intuitive to be able to use namedtuple directly >> with string interpolation > > This is not unique to named tuples. String interpolation and the string > format do not use getattr() style access with any kind of object: > > print '<%(real)s, %(imag)s>' % (3+4j) # doesn't find real/imag attributes Hm... I see a feature request brewing. In some use cases it might make a *lot* of sense to have a variant of .format() that uses __getattr__ instead of __getitem__... 
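One possible shape for such a variant, sketched here with the string.Formatter subclassing hook (a rough idea, not a settled API):

```python
from string import Formatter

class AttrFormatter(Formatter):
    """Sketch of a .format() variant that resolves {name} fields
    via getattr() on the single positional argument."""
    def get_value(self, key, args, kwargs):
        if isinstance(key, str):
            # look string field names up as attributes of the target object
            return getattr(args[0], key)
        return args[key]

fmt = AttrFormatter()
print(fmt.format("re={real} im={imag}", 3 + 4j))  # re=3.0 im=4.0
```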
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From ironfroggy at gmail.com Wed Feb 11 22:05:25 2009 From: ironfroggy at gmail.com (Calvin Spealman) Date: Wed, 11 Feb 2009 16:05:25 -0500 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: <4E46CE2777954AB696472EFFB9197A76@RaymondLaptop1> Message-ID: <76fd5acf0902111305u2c51b432h7d3b13fa2e84cb49@mail.gmail.com> On Wed, Feb 11, 2009 at 3:53 PM, Guido van Rossum wrote: > On Wed, Feb 11, 2009 at 12:11 PM, Raymond Hettinger wrote: >> >> [Lie Ryan] >>> >>> I've been experimenting with namedtuple, it seems that string formatting >>> doesn't recognize namedtuple as mapping. >> >> That's because a named tuple isn't a mapping ;-) >> It's a tuple that also supports getattr() style access. >> >> >>> from collections import namedtuple >>> Nt = namedtuple('Nt', ['x', 'y']) >>> nt = Nt(12, 32) >>> print 'one = %(x)s, two = %(y)s' % nt >>> >>> # output should be: >>> one = 12, two = 32 >>> >>> currently, it is possible to use nt._asdict() as a workaround, but I think >>> it will be easier and more intuitive to be able to use namedtuple directly >>> with string interpolation >> >> This is not unique to named tuples. String interpolation and the string >> format do not use getattr() style access with any kind of object: >> >> print '<%(real)s, %(imag)s>' % (3+4j) # doesn't find real/imag attributes > > Hm... I see a feature request brewing. In some use cases it might make > a *lot* of sense to have a variant of .format() that uses __getattr__ > instead of __getitem__... Perhaps the feature request here should be that vars() be able to work on built-in types like these, so we could just use it as a simple wrapper. > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Read my blog! 
I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy From guido at python.org Wed Feb 11 22:10:17 2009 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Feb 2009 13:10:17 -0800 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: <76fd5acf0902111305u2c51b432h7d3b13fa2e84cb49@mail.gmail.com> References: <4E46CE2777954AB696472EFFB9197A76@RaymondLaptop1> <76fd5acf0902111305u2c51b432h7d3b13fa2e84cb49@mail.gmail.com> Message-ID: On Wed, Feb 11, 2009 at 1:05 PM, Calvin Spealman wrote: > On Wed, Feb 11, 2009 at 3:53 PM, Guido van Rossum wrote: >> On Wed, Feb 11, 2009 at 12:11 PM, Raymond Hettinger wrote: >>> This is not unique to named tuples. String interpolation and the string >>> format do not use getattr() style access with any kind of object: >>> >>> print '<%(real)s, %(imag)s>' % (3+4j) # doesn't find real/imag attributes >> >> Hm... I see a feature request brewing. In some use cases it might make >> a *lot* of sense to have a variant of .format() that uses __getattr__ >> instead of __getitem__... > > Perhaps the feature request here should be that vars() be able to work > on built-in types like these, so we could just use it as a simple > wrapper. I don't think you need vars(). vars() is for discovery of *all* attributes, but .format() doesn't need that. 
An adaptor like this would do it, but it would be nice if there was a shorthand for something like this (untested) snippet: class GetItemToGetAttrAdaptor: def __init__(self, target): self.target = target def __getitem__(self, key): try: return getattr(self.target, key) except AttributeError as e: raise KeyError(str(e)) You could then use "re={real} im={imag}".format(GetItemToGetAttrAdaptor(1j+2)) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From rhamph at gmail.com Wed Feb 11 22:18:19 2009 From: rhamph at gmail.com (Adam Olsen) Date: Wed, 11 Feb 2009 14:18:19 -0700 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: Message-ID: On Wed, Feb 11, 2009 at 5:09 AM, Lie Ryan wrote: > On Wed, Feb 11, 2009 at 7:41 PM, Adam Olsen wrote: > >> Exposing a dict API would pollute namedtuple. > > What do you mean by "pollute namedtuple"? I just mean that we don't want to add a bunch of extra options and features that'll only be appropriate for rare special cases. >> What's more, it's >> unnecessary in python 2.6/3.0: >> >> >>> print('one = {0.x}, two = {0.y}'.format(nt)) >> one = 12, two = 32 >> >> (And before you ask, no, I don't think it's worth adding the dict API >> just to make old-style formatting a tiny bit easier.) > > Is % interpolation deprecated? I've never heard of that... There's no plans to remove it in the near future, so it's not deprecated, and there's no problem using it when it's convenient. However, .format() obsoletes it. .format() is significantly more powerful for reasons just like this. There's no reason to add new features to % interpolation with .format() around. 
-- Adam Olsen, aka Rhamphoryncus From ironfroggy at gmail.com Wed Feb 11 22:19:03 2009 From: ironfroggy at gmail.com (Calvin Spealman) Date: Wed, 11 Feb 2009 16:19:03 -0500 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: <4E46CE2777954AB696472EFFB9197A76@RaymondLaptop1> <76fd5acf0902111305u2c51b432h7d3b13fa2e84cb49@mail.gmail.com> Message-ID: <76fd5acf0902111319y5d802a8aud09acfc94cb9dec1@mail.gmail.com> On Wed, Feb 11, 2009 at 4:10 PM, Guido van Rossum wrote: > On Wed, Feb 11, 2009 at 1:05 PM, Calvin Spealman wrote: >> On Wed, Feb 11, 2009 at 3:53 PM, Guido van Rossum wrote: >>> On Wed, Feb 11, 2009 at 12:11 PM, Raymond Hettinger wrote: >>>> This is not unique to named tuples. String interpolation and the string >>>> format do not use getattr() style access with any kind of object: >>>> >>>> print '<%(real)s, %(imag)s>' % (3+4j) # doesn't find real/imag attributes >>> >>> Hm... I see a feature request brewing. In some use cases it might make >>> a *lot* of sense to have a variant of .format() that uses __getattr__ >>> instead of __getitem__... >> >> Perhaps the feature request here should be that vars() be able to work >> on built-in types like these, so we could just use it as a simple >> wrapper. > > I don't think you need vars(). vars() is for discovery of *all* > attributes, but .format() doesn't need that. An adaptor like this > would do it, but it would be nice if there was a shorthand for > something like this (untested) snippet: > > class GetItemToGetAttrAdaptor: > def __init__(self, target): > self.target = target > def __getitem__(self, key): > try: > return getattr(self.target, key) > except AttributeError as e: > raise KeyError(str(e)) > > You could then use "re={real} im={imag}".format(GetItemToGetAttrAdaptor(1j+2)) What if non-__dict__-having objects were treated exactly like that by vars(), returning this adapter and providing a uniform interface? 
Note that it would not, especially when invoked via vars(), allow setitem->setattr mapping. -- Read my blog! I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy From guido at python.org Wed Feb 11 22:23:50 2009 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Feb 2009 13:23:50 -0800 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: <76fd5acf0902111319y5d802a8aud09acfc94cb9dec1@mail.gmail.com> References: <4E46CE2777954AB696472EFFB9197A76@RaymondLaptop1> <76fd5acf0902111305u2c51b432h7d3b13fa2e84cb49@mail.gmail.com> <76fd5acf0902111319y5d802a8aud09acfc94cb9dec1@mail.gmail.com> Message-ID: On Wed, Feb 11, 2009 at 1:19 PM, Calvin Spealman wrote: > On Wed, Feb 11, 2009 at 4:10 PM, Guido van Rossum wrote: >> On Wed, Feb 11, 2009 at 1:05 PM, Calvin Spealman wrote: >>> On Wed, Feb 11, 2009 at 3:53 PM, Guido van Rossum wrote: >>>> On Wed, Feb 11, 2009 at 12:11 PM, Raymond Hettinger wrote: >>>>> This is not unique to named tuples. String interpolation and the string >>>>> format do not use getattr() style access with any kind of object: >>>>> >>>>> print '<%(real)s, %(imag)s>' % (3+4j) # doesn't find real/imag attributes >>>> >>>> Hm... I see a feature request brewing. In some use cases it might make >>>> a *lot* of sense to have a variant of .format() that uses __getattr__ >>>> instead of __getitem__... >>> >>> Perhaps the feature request here should be that vars() be able to work >>> on built-in types like these, so we could just use it as a simple >>> wrapper. >> >> I don't think you need vars(). vars() is for discovery of *all* >> attributes, but .format() doesn't need that. 
An adaptor like this >> would do it, but it would be nice if there was a shorthand for >> something like this (untested) snippet: >> >> class GetItemToGetAttrAdaptor: >> def __init__(self, target): >> self.target = target >> def __getitem__(self, key): >> try: >> return getattr(self.target, key) >> except AttributeError as e: >> raise KeyError(str(e)) >> >> You could then use "re={real} im={imag}".format(GetItemToGetAttrAdaptor(1j+2)) > > What if non-__dict__-having objects were treated exactly like that by > vars(), returning this adapter and providing a uniform interface? Note > that it would not, especially when invoked via vars(), allow > setitem->setattr mapping. Having a __dict__ is not the deciding factor, it would be whether __getitem__ is defined. You could try try: val = x[key] except AttributeError: val = getattr(x, key) but I worry this is likely to mask bugs where one simply passed the wrong object. (Also I realize that my example .format(Adaptor(...)) doesn't work -- it would have to be .format(**Adaptor(...)).) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Feb 11 22:24:55 2009 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Feb 2009 13:24:55 -0800 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: Message-ID: On Wed, Feb 11, 2009 at 1:18 PM, Adam Olsen wrote: > On Wed, Feb 11, 2009 at 5:09 AM, Lie Ryan wrote: >> Is % interpolation deprecated? I've never heard of that... > > There's no plans to remove it in the near future, so it's not > deprecated, and there's no problem using it when it's convenient. > > However, .format() obsoletes it. .format() is significantly more > powerful for reasons just like this. There's no reason to add new > features to % interpolation with .format() around. I thought the plan was to start deprecating % in 3.1 and remove it at some later release (maybe 3.3). Or did I miss a reversal on this? 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From rhamph at gmail.com Wed Feb 11 22:27:39 2009 From: rhamph at gmail.com (Adam Olsen) Date: Wed, 11 Feb 2009 14:27:39 -0700 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: <4E46CE2777954AB696472EFFB9197A76@RaymondLaptop1> <76fd5acf0902111305u2c51b432h7d3b13fa2e84cb49@mail.gmail.com> Message-ID: On Wed, Feb 11, 2009 at 2:10 PM, Guido van Rossum wrote: > class GetItemToGetAttrAdaptor: > def __init__(self, target): > self.target = target > def __getitem__(self, key): > try: > return getattr(self.target, key) > except AttributeError as e: > raise KeyError(str(e)) > > You could then use "re={real} im={imag}".format(GetItemToGetAttrAdaptor(1j+2)) I'm confused, we can already satisfy this use case: >>> "re={0.real} im={0.imag}".format(1j+2) 're=2.0 im=1.0' -- Adam Olsen, aka Rhamphoryncus From rhamph at gmail.com Wed Feb 11 22:28:49 2009 From: rhamph at gmail.com (Adam Olsen) Date: Wed, 11 Feb 2009 14:28:49 -0700 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: Message-ID: On Wed, Feb 11, 2009 at 2:24 PM, Guido van Rossum wrote: > On Wed, Feb 11, 2009 at 1:18 PM, Adam Olsen wrote: >> On Wed, Feb 11, 2009 at 5:09 AM, Lie Ryan wrote: >>> Is % interpolation deprecated? I've never heard of that... >> >> There's no plans to remove it in the near future, so it's not >> deprecated, and there's no problem using it when it's convenient. >> >> However, .format() obsoletes it. .format() is significantly more >> powerful for reasons just like this. There's no reason to add new >> features to % interpolation with .format() around. > > I thought the plan was to start deprecating % in 3.1 and remove it at > some later release (maybe 3.3). Or did I miss a reversal on this? I'm going to hide behind my vague use of "near future". Although I didn't realize removal was even that specific yet. 
-- Adam Olsen, aka Rhamphoryncus From python at rcn.com Wed Feb 11 22:34:48 2009 From: python at rcn.com (Raymond Hettinger) Date: Wed, 11 Feb 2009 13:34:48 -0800 Subject: [Python-ideas] String formatting and namedtuple References: <4E46CE2777954AB696472EFFB9197A76@RaymondLaptop1><76fd5acf0902111305u2c51b432h7d3b13fa2e84cb49@mail.gmail.com> Message-ID: <23F74D75DB2E47D98169F7FBD9ABA6F6@RaymondLaptop1> >> You could then use "re={real} im={imag}".format(GetItemToGetAttrAdaptor(1j+2)) > > I'm confused, we can already satisfy this use case: > >>>> "re={0.real} im={0.imag}".format(1j+2) > 're=2.0 im=1.0' Interesting how a quest for new tools blinds us to the ones we already have ;-) Raymond From python at rcn.com Wed Feb 11 22:44:51 2009 From: python at rcn.com (Raymond Hettinger) Date: Wed, 11 Feb 2009 13:44:51 -0800 Subject: [Python-ideas] String formatting and namedtuple References: Message-ID: [Guido van Rossum] > I thought the plan was to start deprecating % in 3.1 and remove it at > some later release (maybe 3.3). Or did I miss a reversal on this? I thought we had backed-off on this for a number of reasons. * waiting to see if users actually adopt and prefer the new way * no compelling reason to force people to convert right away * need a tool for automatic conversion (this may not be easy) * the code and api for the new-way hasn't had a chance to be shaken-out and battle-tested in real-world apps yet. both the api and implementation are not yet mature (i.e. how well does it work in templating apps and whatnot). Raymond From guido at python.org Wed Feb 11 22:48:09 2009 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Feb 2009 13:48:09 -0800 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: Message-ID: On Wed, Feb 11, 2009 at 1:44 PM, Raymond Hettinger wrote: > > [Guido van Rossum] >> >> I thought the plan was to start deprecating % in 3.1 and remove it at >> some later release (maybe 3.3). Or did I miss a reversal on this? 
> > I thought we had backed-off on this for a number of reasons. Can you refer me to a thread? > * waiting to see if users actually adopt and prefer the new way Without deprecation as the stick I doubt that they will bother to even try it. > * no compelling reason to force people to convert right away 3.3 is not right away. :-) > * need a tool for automatic conversion (this may not be easy) I believe we punted on this, otherwise we would have removed % from 3.0. > * the code and api for the new-way hasn't had a chance to be shaken-out and > battle-tested in real-world apps yet. both > the api and implementation are not yet mature (i.e. how well > does it work in templating apps and whatnot). Fair enough. But again, 3.3 is a long time away. > Raymond > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Wed Feb 11 22:51:10 2009 From: aahz at pythoncraft.com (Aahz) Date: Wed, 11 Feb 2009 13:51:10 -0800 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: Message-ID: <20090211215110.GA17105@panix.com> On Wed, Feb 11, 2009, Guido van Rossum wrote: > > I thought the plan was to start deprecating % in 3.1 and remove it at > some later release (maybe 3.3). Or did I miss a reversal on this? Well, I thought it had been reversed. If it's not reversed, I'll start campaigning to keep it. I think any decision to deprecate % should come *after* 3.x has had significant uptake among the broader community. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Weinberg's Second Law: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization. 
From guido at python.org Wed Feb 11 22:58:48 2009 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Feb 2009 13:58:48 -0800 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: <20090211215110.GA17105@panix.com> References: <20090211215110.GA17105@panix.com> Message-ID: On Wed, Feb 11, 2009 at 1:51 PM, Aahz wrote: > On Wed, Feb 11, 2009, Guido van Rossum wrote: >> I thought the plan was to start deprecating % in 3.1 and remove it at >> some later release (maybe 3.3). Or did I miss a reversal on this? > > Well, I thought it had been reversed. If it's not reversed, I'll start > campaigning to keep it. I think any decision to deprecate % should come > *after* 3.x has had significant uptake among the broader community. That probably means we'll still be supporting % 5 years from now. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Wed Feb 11 23:15:35 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 11 Feb 2009 17:15:35 -0500 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: Message-ID: Guido van Rossum wrote: > On Wed, Feb 11, 2009 at 1:18 PM, Adam Olsen wrote: >> On Wed, Feb 11, 2009 at 5:09 AM, Lie Ryan wrote: >>> Is % interpolation deprecated? I've never heard of that... >> There's no plans to remove it in the near future, so it's not >> deprecated, and there's no problem using it when it's convenient. >> >> However, .format() obsoletes it. .format() is significantly more >> powerful for reasons just like this. There's no reason to add new >> features to % interpolation with .format() around. I agree. > I thought the plan was to start deprecating % in 3.1 and remove it at > some later release (maybe 3.3). Or did I miss a reversal on this? I am aware of this as your idea and possibly even as your 'plan', but not of it as a consensus plan. I sense that there may be some quiet opposition. There is certainly some disconnect on the issue. 
So, unless you intend a quick BDFL pronouncement, I recommend you sometime start a "Deprecate % interpolation" thread to get arguments pro and con exposed and argued. tjr From steve at pearwood.info Wed Feb 11 23:49:05 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 12 Feb 2009 09:49:05 +1100 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: Message-ID: <499355E1.7030209@pearwood.info> Guido van Rossum wrote: > On Wed, Feb 11, 2009 at 1:18 PM, Adam Olsen wrote: >> On Wed, Feb 11, 2009 at 5:09 AM, Lie Ryan wrote: >>> Is % interpolation deprecated? I've never heard of that... >> There's no plans to remove it in the near future, so it's not >> deprecated, and there's no problem using it when it's convenient. >> >> However, .format() obsoletes it. .format() is significantly more >> powerful for reasons just like this. There's no reason to add new >> features to % interpolation with .format() around. > > I thought the plan was to start deprecating % in 3.1 and remove it at > some later release (maybe 3.3). Or did I miss a reversal on this? Judging by responses on comp.lang.python deprecating % will make a lot of people deeply unhappy. Personally, my own feeling is that they're (mostly) just objecting to change, but they do make some good points. In a later post, Guido wrote: "Without deprecation as the stick I doubt that they will bother to even try [format]." I think you under-estimate just how much many programmers like to play with shiny new tools. I installed Python 2.6 specifically so I could try format. My conclusion is that it is a nice, powerful *heavyweight* solution for text formatting. It's great for complex tasks, but it's hard to beat % for simple ones. 
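A trivial pair makes the contrast concrete:

```python
x = 3.14159
# the simple case: % is hard to beat for brevity
print("x = %.2f" % x)           # x = 3.14
print("x = {0:.2f}".format(x))  # x = 3.14
# but format() reaches places % cannot, e.g. attribute access on the argument
print("re={0.real} im={0.imag}".format(3 - 4j))  # re=3.0 im=-4.0
```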
-- Steven From python at rcn.com Wed Feb 11 23:50:36 2009 From: python at rcn.com (Raymond Hettinger) Date: Wed, 11 Feb 2009 14:50:36 -0800 Subject: [Python-ideas] String formatting and namedtuple References: Message-ID: <529E126629B145BFAFE322648952AE11@RaymondLaptop1> [Terry Reedy] > There is certainly some disconnect on the issue. FWIW, whenever I do talks on 3.0, it is common to get an aversive reaction when the new syntax is shown. I pitch it in a positive light, but you can sense churning stomachs. > get arguments pro > and con exposed and argued. One risk is that maintainers of third-party modules may be disincentivized from converting to 3.x. For heavy users of %-formatting, it may be easiest just to stay in the 2.x world (unless an automated conversion tool emerges). Another thought is that it is premature to mandate that others convert until we ourselves have updated the standard library. That exercise may be informative and provide some evidence about how easy, how hard, or how possible it is to convert. Also, conversion may be difficult in cases where the existing syntax has been exposed to end-users and is a guaranteed part of the API (perhaps in config files, templates, file renamers, mail mergers, etc.) In our own code, logging formatters spring to mind. Raymond From guido at python.org Thu Feb 12 00:01:32 2009 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Feb 2009 15:01:32 -0800 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: Message-ID: On Wed, Feb 11, 2009 at 2:15 PM, Terry Reedy wrote: > Guido van Rossum wrote: >> I thought the plan was to start deprecating % in 3.1 and remove it at >> some later release (maybe 3.3). Or did I miss a reversal on this? > > I am aware of this as your idea and possibly even as your 'plan', but not of > it as a consensus plan. I sense that there may be some quiet opposition. It's not quiet any more. > There is certainly some disconnect on the issue.
> > So, unless you intend a quick BDFL pronouncement, I recommend you sometime > start a "Deprecate % interpolation" thread to get arguments pro and con > exposed and argued. I don't intend to force the issue. I'm disappointed though -- .format() fixes several common stumbling blocks with %(name)s and at least one with %s. Let someone else start a new thread. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Thu Feb 12 01:14:07 2009 From: aahz at pythoncraft.com (Aahz) Date: Wed, 11 Feb 2009 16:14:07 -0800 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: <529E126629B145BFAFE322648952AE11@RaymondLaptop1> References: <529E126629B145BFAFE322648952AE11@RaymondLaptop1> Message-ID: <20090212001407.GA23285@panix.com> On Wed, Feb 11, 2009, Raymond Hettinger wrote: > > Also, conversion may be difficult in cases where the existing syntax has > been exposed to end-users and is a guaranteed part of the API > (perhaps in config files, templates, file renamers, mail mergers, etc.) > In our own code, logging formatters spring to mind. My previous company exposed raw Python code to end-users, so this is not a trivial concern. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Weinberg's Second Law: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization. From tjreedy at udel.edu Thu Feb 12 02:06:37 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 11 Feb 2009 20:06:37 -0500 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: Message-ID: I am combining responses to several posts. Guido van Rossum wrote: >> [Raymond] I thought we had backed-off on this for a number of reasons. > > Can you refer me to a thread? I believe there was one over a year ago on backing off from deprecation in 3.0. I also believe the alternative time frame was subject to differential interpretation.
*** > That probably means we'll still be supporting % 5 years from now. Declare % interpolation 'frozen': no new features. [As Raymond suggested with respect to attribute lookup.] *** > Without deprecation as the stick I doubt that they will bother to > even try it. ... > .format() fixes several common stumbling blocks with %(name)s and at > least one with %s. If there is a concise article on "Advantages of .format over % interpolation" to encourage use of the former, I am not aware of it. Attribute lookup with field.name is one of many to promote. One thing probably not mentioned in the PEP is the possibility of bound methods, reduces typing of '.format' for reused formats. >>> msg = "{0} == {1}".format >>> print(msg('.format', 'improvement')) .format == improvement >>> msg('Python', 'greatness') 'Python == greatness' [Steven D'Aprano] >It's great for complex tasks, but it's hard to beat % for simple ones. This seems to be a common feeling. PROPOSAL: Allow the simple case to stay simple. Allow field names to be omitted for all fields in a string and then default to 0, 1, ... so that example above could be written as >>> msg = "{} == {}".format Given that computers are glorified counting machines, it *is* a bit annoying to be required to do the counting manually. I think this is at least half the objection to the new scheme! Terry Jan Reedy From rhamph at gmail.com Thu Feb 12 02:23:38 2009 From: rhamph at gmail.com (Adam Olsen) Date: Wed, 11 Feb 2009 18:23:38 -0700 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: <20090211215110.GA17105@panix.com> Message-ID: On Wed, Feb 11, 2009 at 2:58 PM, Guido van Rossum wrote: > On Wed, Feb 11, 2009 at 1:51 PM, Aahz wrote: >> On Wed, Feb 11, 2009, Guido van Rossum wrote: >>> I thought the plan was to start deprecating % in 3.1 and remove it at >>> some later release (maybe 3.3). Or did I miss a reversal on this? >> >> Well, I thought it had been reversed. 
If it's not reversed, I'll start >> campaigning to keep it. I think any decision to deprecate % should come >> *after* 3.x has had significant uptake among the broader community. > > That probably means we'll still be supporting % 5 years from now. My concern is not so much with general users but rather projects attempting to support many versions at once. I know 3.0 tries to be a big jump, but I much prefer an incremental transition with as much overlap as we can stomach. I'd suggest waiting until we stop releasing 2.x before removing any more features. Of course that assumes 2.x won't stretch out into 10 years. -- Adam Olsen, aka Rhamphoryncus From denis.spir at free.fr Thu Feb 12 14:10:40 2009 From: denis.spir at free.fr (spir) Date: Thu, 12 Feb 2009 14:10:40 +0100 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: Message-ID: <20090212141040.0c89e0fc@o> Le Wed, 11 Feb 2009 20:06:37 -0500, Terry Reedy a écrit : > One thing probably not mentioned in the PEP is the possibility of bound > methods, reduces typing of '.format' for reused formats. > > >>> msg = "{0} == {1}".format > >>> print(msg('.format', 'improvement')) > .format == improvement > >>> msg('Python', 'greatness') > 'Python == greatness' This is relevant. I've read the PEP and was not aware of such a wide open door. It allows building collections of string formats. Why not make them public? Why not e.g. start a wiki page for common useful formats? Why not then store them into a standard module? I see loads of uses in the sole field of UI. "Please, enter a {0}." "Hello, {0}! Ausgeschlafen? Press enter to continue..." (slept well?)
"name: {0} -- phone:{1} -- email{2}" Denis ------ la vida e estranya From ddborowitz at gmail.com Thu Feb 12 19:29:15 2009 From: ddborowitz at gmail.com (David Borowitz) Date: Thu, 12 Feb 2009 10:29:15 -0800 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: <20090212141040.0c89e0fc@o> References: <20090212141040.0c89e0fc@o> Message-ID: <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> Can't roughly the same thing be achieved with % substitution? >>> msg = '%s == %s' >>> print(msg % ('.format', 'improvement')) .format == improvement >>> msg % ('Python', 'greatness') 'Python == greatness' The main non-syntactic difference here is that msg is a normal string object rather than a bound method. Not arguing against moving to .format, just that it doesn't seem inherently more powerful than % in this case. (I guess you could still argue that this pattern alleviates a concern with .format, namely repeated typing of .format, but that's never been an issue with % to begin with.) On Thu, Feb 12, 2009 at 05:10, spir wrote: > Le Wed, 11 Feb 2009 20:06:37 -0500, > Terry Reedy a écrit : > > > One thing probably not mentioned in the PEP is the possibility of bound > > methods, reduces typing of '.format' for reused formats. > > > > >>> msg = "{0} == {1}".format > > >>> print(msg('.format', 'improvement')) > > .format == improvement > > >>> msg('Python', 'greatness') > > 'Python == greatness' > > This is relevant. I've read the PEP and was not aware of such a wide open > door. It allows building collections of string formats. Why not make them > public? Why not e.g. start a wiki page for common useful formats? Why not > then store them into a standard module? > > I see loads of uses in the sole field of UI. > "Please, enter a {0}." > "Hello, {0}! Ausgeschlafen? Press enter to continue..." (slept well?)
> "name: {0} -- phone:{1} -- email{2}" > > Denis > ------ > la vida e estranya > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- It is better to be quotable than to be honest. -Tom Stoppard Borowitz From guido at python.org Thu Feb 12 19:46:00 2009 From: guido at python.org (Guido van Rossum) Date: Thu, 12 Feb 2009 10:46:00 -0800 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> References: <20090212141040.0c89e0fc@o> <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> Message-ID: On Thu, Feb 12, 2009 at 10:29 AM, David Borowitz wrote: > Can't roughly the same thing be achieved with % substitution? >>>> msg = '%s == %s' >>>> print(msg % ('.format', 'improvement')) > .format == improvement >>>> msg % ('Python', 'greatness') > 'Python == greatness' > > The main non-syntactic difference here is that msg is a normal string object > rather than a bound method. > > Not arguing against moving to .format, just that it doesn't seem inherently > more powerful than % in this case. (I guess you could still argue that this > pattern alleviates a concern with .format, namely repeated typing of > .format, but that's never been an issue with % to begin with.) It has the classic issue that % always has: if there's a single %s in the format string it will do the wrong thing when the argument is a tuple.
>>> s = "[%s]" >>> mystery_object = () >>> s % mystery_object Traceback (most recent call last): File "", line 1, in TypeError: not enough arguments for format string >>> -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Thu Feb 12 20:24:25 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 12 Feb 2009 14:24:25 -0500 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: <20090212141040.0c89e0fc@o> <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> Message-ID: Guido van Rossum wrote: > On Thu, Feb 12, 2009 at 10:29 AM, David Borowitz wrote: >> Can't roughly the same thing be achieved with % substitution? >>>>> msg = '%s == %s' >>>>> print(msg % ('.format', 'improvement')) >> .format == improvement >>>>> msg % ('Python', 'greatness') >> 'Python == greatness' >> >> The main non-syntactic difference here is that msg is a normal string object >> rather than a bound method. >> >> Not arguing against moving to .format, just that it doesn't seem inherently >> more powerful than % in this case. (I guess you could still argue that this >> pattern alleviates a concern with .format, namely repeated typing of >> .format, but that's never been an issue with % to begin with.) > > It has the classic issue that % always has: if there's a single %s in > the format string it will do the wrong thing when the argument is a > tuple. > >>>> s = "[%s]" >>>> mystery_object = () >>>> s % mystery_object > Traceback (most recent call last): > File "", line 1, in > TypeError: not enough arguments for format string Using bound methods was, to me, a lesser point of my post. I still am curious what anyone thinks of PROPOSAL: Allow the simple case to stay simple. Allow field names to be omitted for all fields in a string and then default to 0, 1, ... 
so that example above could be written as >>> msg = "{} == {}".format Given that computers are glorified counting machines, it *is* a bit annoying to be required to do the counting manually. I think this is at least half the objection to switching to .format. Terry Jan Reedy From aahz at pythoncraft.com Thu Feb 12 20:45:21 2009 From: aahz at pythoncraft.com (Aahz) Date: Thu, 12 Feb 2009 11:45:21 -0800 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: <20090212141040.0c89e0fc@o> <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> Message-ID: <20090212194520.GA12955@panix.com> On Thu, Feb 12, 2009, Terry Reedy wrote: > > PROPOSAL: Allow the simple case to stay simple. Allow field names to be > omitted for all fields in a string and then default to 0, 1, ... so that > example above could be written as > > >>> msg = "{} == {}".format > > Given that computers are glorified counting machines, it *is* a bit > annoying to be required to do the counting manually. I think this is at > least half the objection to switching to .format. +1 -- given this, I would remove half my objection to .format(); the rest has mostly to do with backward compatibility as explained previously. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Weinberg's Second Law: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization. From python at rcn.com Thu Feb 12 21:01:52 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Feb 2009 12:01:52 -0800 Subject: [Python-ideas] String formatting and namedtuple References: <20090212141040.0c89e0fc@o><70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> <20090212194520.GA12955@panix.com> Message-ID: >> PROPOSAL: Allow the simple case to stay simple. Allow field names to be >> omitted for all fields in a string and then default to 0, 1, ... 
so that >> example above could be written as >> >> >>> msg = "{} == {}".format >> >> Given that computers are glorified counting machines, it *is* a bit >> annoying to be required to do the counting manually. I think this is at >> least half the objection to switching to .format. > > +1 -- given this, I would remove half my objection to .format(); the rest > has mostly to do with backward compatibility as explained previously. +1 from here also. Raymond From grosser.meister.morti at gmx.net Thu Feb 12 21:06:30 2009 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Thu, 12 Feb 2009 21:06:30 +0100 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: <20090212194520.GA12955@panix.com> References: <20090212141040.0c89e0fc@o> <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> <20090212194520.GA12955@panix.com> Message-ID: <49948146.20904@gmx.net> Aahz wrote: > On Thu, Feb 12, 2009, Terry Reedy wrote: >> PROPOSAL: Allow the simple case to stay simple. Allow field names to be >> omitted for all fields in a string and then default to 0, 1, ... so that >> example above could be written as >> >>>>> msg = "{} == {}".format >> Given that computers are glorified counting machines, it *is* a bit >> annoying to be required to do the counting manually. I think this is at >> least half the objection to switching to .format. > > +1 -- given this, I would remove half my objection to .format(); the rest > has mostly to do with backward compatibility as explained previously. +1 from me, too. From rwgk at yahoo.com Thu Feb 12 23:19:03 2009 From: rwgk at yahoo.com (Ralf W. Grosse-Kunstleve) Date: Thu, 12 Feb 2009 14:19:03 -0800 (PST) Subject: [Python-ideas] set.add() return value Message-ID: <118486.47943.qm@web111413.mail.gq1.yahoo.com> Has this come up before? 
Python 2.6.1 behavior: >>> s = set() >>> print s.add(1) None >>> print s.add(1) None >>> Desired behavior: >>> s = set() >>> print s.add(1) True >>> print s.add(1) False >>> Motivation: Instead of if (1 not in s): # O(N log N) lookup s.add(1) # O(N log N) lookup again do_something_else() or prev_len = len(s) s.add(1) # O(N log N) lookup if (len(s) != prev_len): do_something_else() one could write if (s.add(1)): do_something_else() which would be as fast as the second form and the most concise of all alternatives. From george.sakkis at gmail.com Thu Feb 12 23:40:10 2009 From: george.sakkis at gmail.com (George Sakkis) Date: Thu, 12 Feb 2009 17:40:10 -0500 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: <20090212141040.0c89e0fc@o> <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> Message-ID: <91ad5bf80902121440j62fd3bcar8abef4bdca492ea@mail.gmail.com> On Thu, Feb 12, 2009 at 2:24 PM, Terry Reedy wrote: > PROPOSAL: Allow the simple case to stay simple. Allow field names to be > omitted for all fields in a string and then default to 0, 1, ... so that > example above could be written as > >>>> msg = "{} == {}".format > > Given that computers are glorified counting machines, it *is* a bit annoying > to be required to do the counting manually. I think this is at least half > the objection to switching to .format. What happens when both empty and non-empty fields appear ? E.g. would 'I love {} with {1} and {} with {1}'.format('bacon', 'eggs', 'sausage') return 'I love bacon with eggs and eggs with eggs', or it would be smarter and see that 1 is used explicitly and skip over it, giving 'I love bacon with eggs and sausage with eggs' ? 
George From python at rcn.com Thu Feb 12 23:40:02 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Feb 2009 14:40:02 -0800 Subject: [Python-ideas] set.add() return value References: <118486.47943.qm@web111413.mail.gq1.yahoo.com> Message-ID: <2F8C0565097D4CA690BB83CC5B6A2C19@RaymondLaptop1> [Ralf W. Grosse-Kunstleve] > Instead of > > if (1 not in s): # O(N log N) lookup > s.add(1) # O(N log N) lookup again > do_something_else() Three thoughts: * insertion and lookup times are O(1), not O(n log n). * because of caching the second lookup is very cheap. * the bdfl frowns on mutating methods returning anything at all > if (s.add(1)): > do_something_else() I would find this to be useful but don't find it to be a significant improvement over the original. Also, I find it to be a bit tricky. Currently, sets have a nearly zero learning curve and are not imbued with non-obvious behaviors. The proposed form requires that the reader knows about the special boolean found/notfound behavior. Also, since some people do want mutating methods to return a copy of the collection (i.e. s.add(1).add(2).add(3)), those folks will find your suggestion to be counter-intuitive. Raymond From aahz at pythoncraft.com Thu Feb 12 23:55:12 2009 From: aahz at pythoncraft.com (Aahz) Date: Thu, 12 Feb 2009 14:55:12 -0800 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: <91ad5bf80902121440j62fd3bcar8abef4bdca492ea@mail.gmail.com> References: <20090212141040.0c89e0fc@o> <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> <91ad5bf80902121440j62fd3bcar8abef4bdca492ea@mail.gmail.com> Message-ID: <20090212225512.GA10700@panix.com> On Thu, Feb 12, 2009, George Sakkis wrote: > On Thu, Feb 12, 2009 at 2:24 PM, Terry Reedy wrote: >> >> PROPOSAL: Allow the simple case to stay simple. Allow field names to be >> omitted for all fields in a string and then default to 0, 1, ... 
so that >> example above could be written as >> >> >>> msg = "{} == {}".format >> >> Given that computers are glorified counting machines, it *is* a bit annoying >> to be required to do the counting manually. I think this is at least half >> the objection to switching to .format. > > What happens when both empty and non-empty fields appear ? E.g. would > > 'I love {} with {1} and {} with {1}'.format('bacon', 'eggs', 'sausage') > > return 'I love bacon with eggs and eggs with eggs', or it would be > smarter and see that 1 is used explicitly and skip over it, giving 'I > love bacon with eggs and sausage with eggs' ? I'd favor raising an exception. Alternatively, we could do the equivalent of what % formatting does, which would be the first option (that is, '{#}' is considered equivalent to mapped interpolation in % formatting). -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Weinberg's Second Law: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization. From rwgk at yahoo.com Thu Feb 12 23:59:08 2009 From: rwgk at yahoo.com (Ralf W. Grosse-Kunstleve) Date: Thu, 12 Feb 2009 14:59:08 -0800 (PST) Subject: [Python-ideas] set.add() return value References: <118486.47943.qm@web111413.mail.gq1.yahoo.com> <2F8C0565097D4CA690BB83CC5B6A2C19@RaymondLaptop1> Message-ID: <203230.34503.qm@web111401.mail.gq1.yahoo.com> > * insertion and lookup times are O(1), not O(n log n). Sorry, I was thinking the .add() is happening inside a loop with N iterations, with N also being the eventual size of the set. Is O(N log N) correct then? http://en.wikipedia.org/wiki/Big_O_notation says: O(log N) example: finding an item in a sorted array with a binary search. Isn't that what set is doing? 
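One way to check this empirically: set operations need only __hash__ and __eq__, never an ordering comparison, which rules out a sorted-array-plus-binary-search layout. A small experiment (the class name is made up for illustration):

```python
class NoCompare(object):
    """Hashable, but refuses ordering comparisons outright."""
    def __init__(self, key):
        self.key = key
    def __hash__(self):
        return hash(self.key)
    def __eq__(self, other):
        return isinstance(other, NoCompare) and self.key == other.key
    def __lt__(self, other):
        raise TypeError("ordering not supported")
    __gt__ = __le__ = __ge__ = __lt__

s = set()
for k in range(1000):
    s.add(NoCompare(k))          # insertion never orders the elements

assert NoCompare(500) in s       # lookup: hash bucket + equality check
assert NoCompare(2000) not in s  # a binary search would need __lt__ and fail
```

If sets did binary search, every add or membership test above would have raised TypeError; instead they all succeed using only hashing and equality.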
From pyideas at rebertia.com Fri Feb 13 00:12:37 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Thu, 12 Feb 2009 15:12:37 -0800 Subject: [Python-ideas] set.add() return value In-Reply-To: <203230.34503.qm@web111401.mail.gq1.yahoo.com> References: <118486.47943.qm@web111413.mail.gq1.yahoo.com> <2F8C0565097D4CA690BB83CC5B6A2C19@RaymondLaptop1> <203230.34503.qm@web111401.mail.gq1.yahoo.com> Message-ID: <50697b2c0902121512h6749333tb2a5cf5ff1e54899@mail.gmail.com> On Thu, Feb 12, 2009 at 2:59 PM, Ralf W. Grosse-Kunstleve wrote: > >> * insertion and lookup times are O(1), not O(n log n). > > > Sorry, I was thinking the .add() is happening inside a loop with N iterations, > with N also being the eventual size of the set. Is O(N log N) correct then? > > http://en.wikipedia.org/wiki/Big_O_notation says: > > O(log N) example: finding an item in a sorted array with a binary search. > > Isn't that what set is doing? Python's set uses an unsorted hash table internally and is thus O(1), not O(N). Cheers, Chris -- Follow the path of the Iguana... http://rebertia.com From greg.ewing at canterbury.ac.nz Fri Feb 13 00:21:27 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 13 Feb 2009 12:21:27 +1300 Subject: [Python-ideas] Proto-PEP on a 'yield from' statement Message-ID: <4994AEF7.2080703@canterbury.ac.nz> Comments are invited on the following proto-PEP. PEP: XXX Title: Syntax for Delegating to a Subgenerator Version: $Revision$ Last-Modified: $Date$ Author: Gregory Ewing Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 13-Feb-2009 Python-Version: 2.7 Post-History: Abstract ======== A syntax is proposed to allow a generator to easily delegate part of its operations to another generator, with the subgenerator yielding directly to the delegating generator's caller and receiving values sent to the delegating generator using send(). 
Additionally, the subgenerator is allowed to return with a value and the value is made available to the delegating generator. The new syntax also opens up some opportunities for optimisation when one generator re-yields values produced by another. Proposal ======== The following new expression syntax will be allowed in the body of a generator: :: yield from where is an expression evaluating to an iterator. The effect is to run the iterator to exhaustion, with any values that it yields being passed directly to the caller of the generator containing the ``yield from`` expression (the "delegating generator"), and any values sent to the delegating generator using ``send()`` being sent directly to the iterator. (If the iterator does not have a ``send()`` method, values sent in are ignored.) The value of the ``yield from`` expression is the first argument to the ``StopIteration`` exception raised by the iterator when it terminates. Additionally, generators will be allowed to execute a ``return`` statement with a value, and that value will be passed as an argument to the ``StopIteration`` exception. Formal Semantics ---------------- The statement :: result = yield from iterator is semantically equivalent to :: _i = iterator try: _v = _i.next() while 1: if hasattr(_i, 'send'): _v = _i.send(_v) else: _v = _i.next() except StopIteration, _e: _a = _e.args if len(_a) > 0: result = _a[0] else: result = None Rationale ========= A Python generator is a form of coroutine, but has the limitation that it can only yield to its immediate caller. This means that a piece of code containing a ``yield`` cannot be factored out and put into a separate function in the same way as other code. Performing such a factoring causes the called function to itself become a generator, and it is necessary to explicitly iterate over this second generator and re-yield any values that it produces. 
If yielding of values is the only concern, this is not very arduous and can be performed with a loop such as :: for v in g: yield v However, if the subgenerator is to receive values sent to the outer generator using ``send()``, it is considerably more complicated. As the formal expansion presented above illustrates, the necessary code is very longwinded, and it is tricky to handle all the corner cases correctly. In this case, the advantages of a specialised syntax should be clear. The particular syntax proposed has been chosen as suggestive of its meaning, while not introducing any new keywords and clearly standing out as being different from a plain ``yield``. Furthermore, using a specialised syntax opens up possibilities for optimisation when there is a long chain of generators. Such chains can arise, for instance, when recursively traversing a tree structure. The overhead of passing ``next()`` calls and yielded values down and up the chain can cause what ought to be an O(n) operation to become O(n\*\*2). A possible strategy is to add a slot to generator objects to hold a generator being delegated to. When a ``next()`` or ``send()`` call is made on the generator, this slot is checked first, and if it is nonempty, the generator that it references is resumed instead. If it raises StopIteration, the slot is cleared and the main generator is resumed. This would reduce the delegation overhead to a chain of C function calls involving no Python code execution. A possible enhancement would be to traverse the whole chain of generators in a loop and directly resume the one at the end, although the handling of StopIteration is more complicated then. Alternative Proposals ===================== Proposals along similar lines have been made before, some using the syntax ``yield *`` instead of ``yield from``. While ``yield *`` is more concise, it could be argued that it looks too similar to an ordinary ``yield`` and the difference might be overlooked when reading code. 
To the author's knowledge, previous proposals have focused only on yielding values, and thereby suffered from the criticism that the two-line for-loop they replace is not sufficiently tiresome to write to justify a new syntax. By dealing with sent values as well as yielded ones, this proposal provides considerably more benefit. Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: -- Greg From tjreedy at udel.edu Fri Feb 13 00:32:29 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 12 Feb 2009 18:32:29 -0500 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: <20090212141040.0c89e0fc@o><70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> <20090212194520.GA12955@panix.com> Message-ID: Raymond Hettinger wrote: >>> PROPOSAL: Allow the simple case to stay simple. Allow field names to >>> be omitted for all fields in a string and then default to 0, 1, ... >>> so that example above could be written as >>> >>> >>> msg = "{} == {}".format >>> >>> Given that computers are glorified counting machines, it *is* a bit >>> annoying to be required to do the counting manually. I think this is >>> at least half the objection to switching to .format. >> >> +1 -- given this, I would remove half my objection to .format(); the rest >> has mostly to do with backward compatibility as explained previously. > > +1 from here also. http://bugs.python.org/issue5237 This idea was pulled to full consciousness by your comment that we should tweak .format(), based on use experience, before deprecating % interpolation. 
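Since that tracker issue was eventually accepted, the proposed behavior can be sketched against a real interpreter (Python 2.7/3.1 or later): bare {} fields auto-number left to right, and mixing bare fields with explicit indices is rejected rather than guessed at.

```python
# Bare fields count themselves: "{} == {}" behaves like "{0} == {1}"
msg = "{} == {}".format
assert msg("Python", "greatness") == "Python == greatness"

# Mixing automatic and manual numbering is an error, which sidesteps
# the ambiguity George Sakkis raised earlier in the thread
try:
    "I love {} with {1}".format("bacon", "eggs")
except ValueError:
    pass  # cannot switch from automatic to manual field numbering
else:
    raise AssertionError("expected ValueError")
```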
tjr From solipsis at pitrou.net Fri Feb 13 01:13:59 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 13 Feb 2009 00:13:59 +0000 (UTC) Subject: [Python-ideas] Proto-PEP on a 'yield from' statement References: <4994AEF7.2080703@canterbury.ac.nz> Message-ID: Greg Ewing writes: > > Additionally, generators will be allowed to execute a ``return`` > statement with a value, and that value will be passed as an argument > to the ``StopIteration`` exception. What is the use of it? The problem I can see is that in normal iteration forms (e.g. a "for" loop), the argument to StopIteration is ignored. Therefore, a generator executing such a return statement and expecting the caller to use the return value wouldn't be usable in normal iteration contexts. > :: > > result = yield from iterator > > is semantically equivalent to > > :: > > _i = iterator > try: > _v = _i.next() > while 1: > if hasattr(_i, 'send'): > _v = _i.send(_v) > else: > _v = _i.next() > except StopIteration, _e: > _a = _e.args > if len(_a) > 0: > result = _a[0] > else: > result = None There seems to lack at least a "yield" statement in this snippet. Also, why doesn't it call iter() first? Does it mean one couldn't write e.g. "yield from my_list"? Besides, the idea of getting the "result" from the /inner/ generator goes against the current semantics of "result = yield value", where the result comes from the /outer/ calling routine. Regards Antoine. From greg.ewing at canterbury.ac.nz Fri Feb 13 01:33:48 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 13 Feb 2009 13:33:48 +1300 Subject: [Python-ideas] Proto-PEP on a 'yield from' statement In-Reply-To: References: <4994AEF7.2080703@canterbury.ac.nz> Message-ID: <4994BFEC.1040108@canterbury.ac.nz> Antoine Pitrou wrote: > The problem I can see is that in normal iteration forms (e.g. a "for" loop), the > argument to StopIteration is ignored. 
Therefore, a generator executing such a > return statement and expecting the caller to use the return value wouldn't be > usable in normal iteration contexts. How is this different from an ordinary function returning a value that is ignored by the caller? It's up to the caller to decide whether to use the return value. If it wants the return value, then it has to either use a 'yield from' or catch the StopIteration itself and extract the value. > There seems to lack at least a "yield" statement in this snippet. You're right, there should be some yields there. I'll post a fixed version. > Also, why doesn't it call iter() first? Does it mean one couldn't write e.g. > "yield from my_list"? I'm not sure about that. Since you can't send anything to a list iterator, there's not a huge advantage over 'for x in my_list: yield x', but I suppose it's a logical and useful thing to be able to do. > Besides, the idea of getting the "result" from the /inner/ generator goes > against the current semantics of "result = yield value", where the result comes > from the /outer/ calling routine. I'm going to add some more material to the Rationale section that will hopefully make it clearer why I want it to work this way. -- Greg From steve at pearwood.info Fri Feb 13 01:34:06 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 13 Feb 2009 11:34:06 +1100 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: <20090212141040.0c89e0fc@o> <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> Message-ID: <4994BFFE.8000903@pearwood.info> Terry Reedy wrote: > PROPOSAL: Allow the simple case to stay simple. Allow field names to be > omitted for all fields in a string and then default to 0, 1, ... so that > example above could be written as > > >>> msg = "{} == {}".format > > Given that computers are glorified counting machines, it *is* a bit > annoying to be required to do the counting manually. 
> I think this is at least half the objection to switching to .format.

+1 from me. Just to make it explicit: omitting field names will be an
all-or-nothing proposition: you can't omit some of them unless you
omit them all. Correct?

-- 
Steven

From greg.ewing at canterbury.ac.nz  Fri Feb 13 01:42:39 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 13 Feb 2009 13:42:39 +1300
Subject: [Python-ideas] Corrections to Proto-PEP on a 'yield from' statement
In-Reply-To: <4994AEF7.2080703@canterbury.ac.nz>
References: <4994AEF7.2080703@canterbury.ac.nz>
Message-ID: <4994C1FF.3030807@canterbury.ac.nz>

There were some bugs in the expansion code. Here is a corrected
version.

    result = yield from iterable

expands to

    _i = iter(iterable)
    try:
        _v = yield _i.next()
        while 1:
            if hasattr(_i, 'send'):
                _v = yield _i.send(_v)
            else:
                _v = yield _i.next()
    except StopIteration, _e:
        _a = _e.args
        if len(_a) > 0:
            result = _a[0]
        else:
            result = None

-- 
Greg

From greg.ewing at canterbury.ac.nz  Fri Feb 13 02:15:03 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 13 Feb 2009 14:15:03 +1300
Subject: [Python-ideas] Proto-PEP on a 'yield from' statement
In-Reply-To: <20090212194129.576e2fff@bhuda.mired.org>
References: <4994AEF7.2080703@canterbury.ac.nz>
	<4994BFEC.1040108@canterbury.ac.nz>
	<20090212194129.576e2fff@bhuda.mired.org>
Message-ID: <4994C997.3060602@canterbury.ac.nz>

Mike Meyer wrote:

> While you're working on that, you might consider dealing with the
> behavior of StopIteration exceptions. They're odd enough before this
> gets added; it's not at all clear to me what's going on with them
> here.

I'm not sure what I should add. The PEP is written on the assumption
that the reader understands how iterators and generators currently
work, including StopIteration. Given that, I think it ought to be
fairly clear how StopIteration is being used in the proposal.
It's not much different from what you would do if you wanted to
iterate over an iterator manually, without using a for-loop:

    try:
        while 1:
            x = myiter.next()
            ...
    except StopIteration:
        pass

The only difference is that instead of ignoring the StopIteration
instance itself, we use it to carry the return value of the
generator.

I don't think the PEP is the right place to put tutorial information
about the current StopIteration mechanism.

-- 
Greg

From dangyogi at gmail.com  Fri Feb 13 02:57:16 2009
From: dangyogi at gmail.com (Bruce Frederiksen)
Date: Thu, 12 Feb 2009 20:57:16 -0500
Subject: [Python-ideas] Proto-PEP on a 'yield from' statement
In-Reply-To: <4994AEF7.2080703@canterbury.ac.nz>
References: <4994AEF7.2080703@canterbury.ac.nz>
Message-ID: <4994D37C.2000509@gmail.com>

I would think that in addition to forwarding send values to the
subgenerator, that throw exceptions sent to the delegating generator
also be forwarded to the subgenerator. If the subgenerator does not
handle the exception, then it should be re-raised in the delegating
generator. Also, the subgenerator close method should be called by
the delegating generator. Thus, the new expansion code would look
something like:

    _i = iter(iterable)
    try:
        _value_to_yield = next(_i)
        while True:
            try:
                _value_to_send = yield _value_to_yield
            except Exception as _throw_from_caller:
                if hasattr(_i, 'throw'):
                    _value_to_yield = _i.throw(_throw_from_caller)
                else:
                    raise
            else:
                if hasattr(_i, 'send'):
                    _value_to_yield = _i.send(_value_to_send)
                else:
                    _value_to_yield = next(_i)
    except StopIteration as _exc_from_i:
        _a = _exc_from_i.args
        if len(_a) > 0:
            result = _a[0]
        else:
            result = None
    finally:
        if hasattr(_i, 'close'):
            _i.close()

I'm also against using return as the syntax for a final value from
the subgenerator. I've accidentally used return inside generators
many times and appreciate getting an error for this. I would be OK
with removing the ability to return a final value from the
subgenerator.
Or create a new syntax that won't be used accidentally. Perhaps:
finally return expr?

-bruce frederiksen

Greg Ewing wrote:
> Comments are invited on the following proto-PEP.

From greg.ewing at canterbury.ac.nz  Fri Feb 13 03:04:00 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 13 Feb 2009 15:04:00 +1300
Subject: [Python-ideas] Revised PEP on yield-from expression
Message-ID: <4994D510.4070406@canterbury.ac.nz>

Here's a revision of the PEP incorporating the corrections posted
earlier. There is also a new section titled "Generators as Threads"
providing additional motivation.

PEP: XXX
Title: Syntax for Delegating to a Subgenerator
Version: $Revision$
Last-Modified: $Date$
Author: Gregory Ewing
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 13-Feb-2009
Python-Version: 2.7
Post-History:


Abstract
========

A syntax is proposed to allow a generator to easily delegate part of
its operations to another generator, with the subgenerator yielding
directly to the delegating generator's caller and receiving values
sent to the delegating generator using send(). Additionally, the
subgenerator is allowed to return with a value and the value is made
available to the delegating generator.

The new syntax also opens up some opportunities for optimisation when
one generator re-yields values produced by another.


Proposal
========

The following new expression syntax will be allowed in the body of a
generator:

::

    yield from <expr>

where <expr> is an expression evaluating to an iterable, from which
an iterator is extracted. The effect is to run the iterator to
exhaustion, with any values that it yields being passed directly to
the caller of the generator containing the ``yield from`` expression
(the "delegating generator"), and any values sent to the delegating
generator using ``send()`` being sent directly to the iterator. (If
the iterator does not have a ``send()`` method, values sent in are
ignored.)
The value of the ``yield from`` expression is the first argument to
the ``StopIteration`` exception raised by the iterator when it
terminates.

Additionally, generators will be allowed to execute a ``return``
statement with a value, and that value will be passed as an argument
to the ``StopIteration`` exception.


Formal Semantics
----------------

The statement

::

    result = yield from expr

is semantically equivalent to

::

    _i = iter(expr)
    try:
        _v = yield _i.next()
        while 1:
            if hasattr(_i, 'send'):
                _v = yield _i.send(_v)
            else:
                _v = yield _i.next()
    except StopIteration, _e:
        _a = _e.args
        if len(_a) > 0:
            result = _a[0]
        else:
            result = None


Rationale
=========

A Python generator is a form of coroutine, but has the limitation
that it can only yield to its immediate caller. This means that a
piece of code containing a ``yield`` cannot be factored out and put
into a separate function in the same way as other code. Performing
such a factoring causes the called function to itself become a
generator, and it is necessary to explicitly iterate over this
second generator and re-yield any values that it produces.

If yielding of values is the only concern, this is not very arduous
and can be performed with a loop such as

::

    for v in g:
        yield v

However, if the subgenerator is to receive values sent to the outer
generator using ``send()``, it is considerably more complicated. As
the formal expansion presented above illustrates, the necessary code
is very longwinded, and it is tricky to handle all the corner cases
correctly. In this case, the advantages of a specialised syntax
should be clear.


Generators as Threads
---------------------

A motivating use case for generators being able to return values
concerns the use of generators to implement lightweight threads.
When using generators in that way, it is reasonable to want to
spread the computation performed by the lightweight thread over many
functions.
One would like to be able to call subgenerators as though they were
ordinary functions, passing them parameters and receiving a returned
value.

Using the proposed syntax, a function call such as

::

    y = f(x)

where f is an ordinary function, can be transformed into an
equivalent generator call

::

    y = yield from g(x)

where g is a generator.


Syntax
------

The particular syntax proposed has been chosen as suggestive of its
meaning, while not introducing any new keywords and clearly standing
out as being different from a plain ``yield``.


Optimisations
-------------

Using a specialised syntax opens up possibilities for optimisation
when there is a long chain of generators. Such chains can arise, for
instance, when recursively traversing a tree structure. The overhead
of passing ``next()`` calls and yielded values down and up the chain
can cause what ought to be an O(n) operation to become O(n\*\*2).

A possible strategy is to add a slot to generator objects to hold a
generator being delegated to. When a ``next()`` or ``send()`` call
is made on the generator, this slot is checked first, and if it is
nonempty, the generator that it references is resumed instead. If it
raises StopIteration, the slot is cleared and the main generator is
resumed.

This would reduce the delegation overhead to a chain of C function
calls involving no Python code execution. A possible enhancement
would be to traverse the whole chain of generators in a loop and
directly resume the one at the end, although the handling of
StopIteration is more complicated then.


Alternative Proposals
=====================

Proposals along similar lines have been made before, some using the
syntax ``yield *`` instead of ``yield from``. While ``yield *`` is
more concise, it could be argued that it looks too similar to an
ordinary ``yield`` and the difference might be overlooked when
reading code.
To the author's knowledge, previous proposals have focused only on yielding values, and thereby suffered from the criticism that the two-line for-loop they replace is not sufficiently tiresome to write to justify a new syntax. By dealing with sent values as well as yielded ones, this proposal provides considerably more benefit. Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: -- Greg From greg.ewing at canterbury.ac.nz Fri Feb 13 03:12:37 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 13 Feb 2009 15:12:37 +1300 Subject: [Python-ideas] Proto-PEP on a 'yield from' statement In-Reply-To: <4994D37C.2000509@gmail.com> References: <4994AEF7.2080703@canterbury.ac.nz> <4994D37C.2000509@gmail.com> Message-ID: <4994D715.7020707@canterbury.ac.nz> Bruce Frederiksen wrote: > I would think that in addition to forwarding send values to the > subgenerator, that throw exceptions sent to the delegating generator > also be forwarded to the subgenerator. Urg, I'd forgotten about that feature! You're quite right. > Also, the subgenerator close method should be called by the > delegating generator. Right again. I'll add these to the next next revision, thanks. > I'm also against using return as the syntax for a final value from the > subgenerator. Can you look at what I said in the last revision about "Generators as Threads" and tell me whether you still feel that way? 
-- Greg From python at rcn.com Fri Feb 13 04:08:41 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Feb 2009 19:08:41 -0800 Subject: [Python-ideas] Proto-PEP on a 'yield from' statement References: <4994AEF7.2080703@canterbury.ac.nz> <4994D37C.2000509@gmail.com> Message-ID: <26E3F4C63E8F41FC953C6393DA15393F@RaymondLaptop1> >I would think that in addition to forwarding send values to the > subgenerator, that throw exceptions sent to the delegating generator > also be forwarded to the subgenerator. If the subgenerator does not > handle the exception, then it should be re-raised in the delegating > generator. Also, the subgenerator close method should be called by the > delegating generator. I recommend dropping the notion of forwarding from the proposal. The idea is use-case challenged, complicated, and should not be hidden behind new syntax. Would hate for this to become a trojan horse proposal when most folks just want a fast iterator pass-through mechasism: def threesomes(vars) for var in vars: yield from itertools.repeat(var, n) Raymond From george.sakkis at gmail.com Fri Feb 13 04:12:10 2009 From: george.sakkis at gmail.com (George Sakkis) Date: Thu, 12 Feb 2009 22:12:10 -0500 Subject: [Python-ideas] Proto-PEP on a 'yield from' statement In-Reply-To: <4994D37C.2000509@gmail.com> References: <4994AEF7.2080703@canterbury.ac.nz> <4994D37C.2000509@gmail.com> Message-ID: <91ad5bf80902121912i577734b5qa9c9b8dcbec8cf97@mail.gmail.com> On Thu, Feb 12, 2009 at 8:57 PM, Bruce Frederiksen wrote: > I'm also against using return as the syntax for a final value from the > subgenerator. Thirded, for two reasons: - A "yield x" expression has completely different semantics from "yield from x"; that's a bad idea given how similar they look. - Returning a value by stuffing it in the StopIteration abuses the exception mechanism. Without a compelling, concrete, example I'm -1 on the return part; +1 for the rest. 
George From rwgk at yahoo.com Fri Feb 13 04:36:35 2009 From: rwgk at yahoo.com (Ralf W. Grosse-Kunstleve) Date: Thu, 12 Feb 2009 19:36:35 -0800 (PST) Subject: [Python-ideas] set.add() return value References: <118486.47943.qm@web111413.mail.gq1.yahoo.com> <2F8C0565097D4CA690BB83CC5B6A2C19@RaymondLaptop1> <203230.34503.qm@web111401.mail.gq1.yahoo.com> <50697b2c0902121512h6749333tb2a5cf5ff1e54899@mail.gmail.com> Message-ID: <171757.71077.qm@web111402.mail.gq1.yahoo.com> > > Sorry, I was thinking the .add() is happening inside a loop with N iterations, > > with N also being the eventual size of the set. Is O(N log N) correct then? > > > > http://en.wikipedia.org/wiki/Big_O_notation says: > > > > O(log N) example: finding an item in a sorted array with a binary search. > > > > Isn't that what set is doing? > > Python's set uses an unsorted hash table internally and is thus O(1), not O(N). This is at odds with the results of a simple experiment. Please try the attached script. On my machine I get these results: lookup repeats: 1000000 timing set lookup N: 10000 overhead: 1.16 lookup contained: 0.07 lookup missing: 0.12 timing set lookup N: 10000000 overhead: 1.13 lookup contained: 0.42 lookup missing: 0.34 timing dict lookup N: 10000 overhead: 1.16 lookup contained: 0.06 lookup missing: 0.11 timing dict lookup N: 10000000 overhead: 1.12 lookup contained: 0.44 lookup missing: 0.44 N is the size of a set or dict. The number of lookups of random elements is the same in both cases. That's why the "overhead" is constant. Of course, the memory cache does have an influence here, but your claim is also at odds with anything I could find in the literature. A dictionary or set has to do some kind of search to find a key or element. Dictionary and set lookups have O(log N) runtime complexity, where N is the size of the dict or set at the time of the lookup. To come back to my original suggestion, Raymond's reply... > * because of caching the second lookup is very cheap. 
makes me think that this...

> if (1 not in s):
>     s.add(1)
>     do_something_else()

is basically as fast as I was hoping to get from:

> if (s.add(1)):
>     do_something_else()

So I guess that kills my runtime argument.

> * the bdfl frowns on mutating methods returning anything at all

Shock! What about dict.setdefault()? Honestly, I found it very odd
from the start that set.add() doesn't tell me what it did.

import random
import time
import os
import sys

repeats = 1000000
print "lookup repeats:", repeats
for now_with_dict in [False, True]:
  random.seed(0)
  for N in [10000, 10000000]:
    if (not now_with_dict):
      print "timing set lookup"
      s = set(xrange(N))
    else:
      print "timing dict lookup"
      s = {}
      for i in xrange(N):
        s[i] = 0
    print "  N:", len(s)
    t0 = os.times()[0]
    for i in xrange(repeats):
      j = random.randrange(N)
      assert j < N
    oh = os.times()[0]-t0
    print "  overhead: %.2f" % oh
    t0 = os.times()[0]
    for i in xrange(repeats):
      j = random.randrange(N)
      assert j in s
    print "  lookup contained: %.2f" % (os.times()[0]-t0-oh)
    t0 = os.times()[0]
    for i in xrange(repeats):
      j = N + random.randrange(N)
      assert j not in s
    print "  lookup missing: %.2f" % (os.times()[0]-t0-oh)

From rhamph at gmail.com  Fri Feb 13 04:49:00 2009
From: rhamph at gmail.com (Adam Olsen)
Date: Thu, 12 Feb 2009 20:49:00 -0700
Subject: [Python-ideas] set.add() return value
In-Reply-To: <171757.71077.qm@web111402.mail.gq1.yahoo.com>
References: <118486.47943.qm@web111413.mail.gq1.yahoo.com>
	<2F8C0565097D4CA690BB83CC5B6A2C19@RaymondLaptop1>
	<203230.34503.qm@web111401.mail.gq1.yahoo.com>
	<50697b2c0902121512h6749333tb2a5cf5ff1e54899@mail.gmail.com>
	<171757.71077.qm@web111402.mail.gq1.yahoo.com>
Message-ID: 

On Thu, Feb 12, 2009 at 8:36 PM, Ralf W. Grosse-Kunstleve wrote:
> Of course, the memory cache does have an influence here, but
> your claim is also at odds with anything I could find in the
> literature. A dictionary or set has to do some kind of search to
> find a key or element.
Dictionary and set lookups have O(log N) > runtime complexity, where N is the size of the dict or set at > the time of the lookup. The literature you found was assuming a binary tree implementation, which is O(log n). Python's dict and set use a hash table, which is O(1). The performance effect you found was because of the CPU's cache. Besides, you really should use more than 2 samples if you want to show an O(log n) performance curve. -- Adam Olsen, aka Rhamphoryncus From eric at trueblade.com Fri Feb 13 04:49:13 2009 From: eric at trueblade.com (Eric Smith) Date: Thu, 12 Feb 2009 22:49:13 -0500 Subject: [Python-ideas] String formatting and namedtuple Message-ID: <4994EDB9.4040500@trueblade.com> [sorry for starting a new thread, but I just subscribed and can't figure out how to respond to an earlier message] Raymond wrote: >[Terry Reedy] >> There is certainly some disconnect on the issue. > >FWIW, whenever I done talks on 3.0, it is common to get >an aversive reaction when the new syntax is shown. >I pitch it in a positive light, but you can sense churning stomachs. [not picking on Raymond here at all, his message was just convenient] There are a number of comments in this thread that lead me to think not everyone is aware that .format is fully supported in 2.6. I just want to make sure everyone knows that's the case. If you want to support 2.6+ and 3.0+, you can certainly use .format. Eric. 
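[A present-day editorial footnote on Eric's point: explicit indexes already worked in 2.6/3.0, and the blank-field form Terry proposed in this thread was later accepted for Python 2.7 and 3.1, so the behaviour under discussion — including the all-or-nothing rule — can now be demonstrated directly. Sketch in modern Python 3 syntax:]

```python
# Explicit indexes, as required at the time of this thread, work in 2.6/3.0:
explicit = "{0} == {1}".format(2 + 2, 4)

# The blank-field form proposed in this thread was later accepted
# (Python 2.7 / 3.1): empty fields auto-number as 0, 1, ...
implicit = "{} == {}".format(2 + 2, 4)

# Mixing blank and explicit fields became an error, matching the
# all-or-nothing rule debated in this thread:
try:
    "{} and {1}".format("x", "y")
    mixed_is_error = False
except ValueError:
    mixed_is_error = True

print(explicit, implicit, mixed_is_error)
```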
From tjreedy at udel.edu Fri Feb 13 05:29:33 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 12 Feb 2009 23:29:33 -0500 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: <20090212225512.GA10700@panix.com> References: <20090212141040.0c89e0fc@o> <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> <91ad5bf80902121440j62fd3bcar8abef4bdca492ea@mail.gmail.com> <20090212225512.GA10700@panix.com> Message-ID: Aahz wrote: > On Thu, Feb 12, 2009, George Sakkis wrote: >> On Thu, Feb 12, 2009 at 2:24 PM, Terry Reedy wrote: >>> PROPOSAL: Allow the simple case to stay simple. Allow field names to be >>> omitted for all fields in a string and then default to 0, 1, ... so that >>> example above could be written as >>> >>>>>> msg = "{} == {}".format >>> Given that computers are glorified counting machines, it *is* a bit annoying >>> to be required to do the counting manually. I think this is at least half >>> the objection to switching to .format. >> What happens when both empty and non-empty fields appear ? E.g. would >> >> 'I love {} with {1} and {} with {1}'.format('bacon', 'eggs', 'sausage') >> >> return 'I love bacon with eggs and eggs with eggs', or it would be >> smarter and see that 1 is used explicitly and skip over it, giving 'I >> love bacon with eggs and sausage with eggs' ? > > I'd favor raising an exception. Alternatively, we could do the > equivalent of what % formatting does, which would be the first option > (that is, '{#}' is considered equivalent to mapped interpolation in % > formatting). From http://bugs.python.org/issue5237 """This proposal is currently all or nothing for simplicity of description and presumed ease of implementation. The patch to the doc could then be "If all replacement fields are left blank, then sequential indexes 0,1, ... will be automatically inserted." inserted after [Each replacement field contains either the numeric index of a positional argument, or the name of a keyword argument.]. 
Mixing blank and non-blank specs would then be an error and raise an exception. """ I think mixing implicit and explicit indexes would be confusing. Mixing implicit indexes and keywords could perhaps work, but I won't propose that. It would be a rare usage, while my goal is to make the common case '%s' format as easy to write as it is now, by replacing '%s' with '{}' [two keystrokes each]. Terry Jan Reedy From tjreedy at udel.edu Fri Feb 13 05:37:44 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 12 Feb 2009 23:37:44 -0500 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: <4994BFFE.8000903@pearwood.info> References: <20090212141040.0c89e0fc@o> <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> <4994BFFE.8000903@pearwood.info> Message-ID: Steven D'Aprano wrote: > Terry Reedy wrote: > >> PROPOSAL: Allow the simple case to stay simple. Allow field names to >> be omitted for all fields in a string and then default to 0, 1, ... so >> that example above could be written as >> >> >>> msg = "{} == {}".format >> >> Given that computers are glorified counting machines, it *is* a bit >> annoying to be required to do the counting manually. I think this is >> at least half the objection to switching to .format. > > +1 from me. Just to make it explicit: omitting field names will be an > all-or-nothing proposition: you can't omit some of them unless you omit > them all. Correct? That is my proposal. I should have made that clearer, as I now did on the tracker issue I filed. My idea was that the function could switch from current behavior to alternative behavior depending on the presence or absence of anything in the first {}. But that is up to the patch writer. 
tjr From steve at pearwood.info Fri Feb 13 05:54:12 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 13 Feb 2009 15:54:12 +1100 Subject: [Python-ideas] set.add() return value In-Reply-To: <171757.71077.qm@web111402.mail.gq1.yahoo.com> References: <118486.47943.qm@web111413.mail.gq1.yahoo.com> <2F8C0565097D4CA690BB83CC5B6A2C19@RaymondLaptop1> <203230.34503.qm@web111401.mail.gq1.yahoo.com> <50697b2c0902121512h6749333tb2a5cf5ff1e54899@mail.gmail.com> <171757.71077.qm@web111402.mail.gq1.yahoo.com> Message-ID: <4994FCF4.3030301@pearwood.info> Ralf W. Grosse-Kunstleve wrote: >> Python's set uses an unsorted hash table internally and is thus O(1), not O(N). > > This is at odds with the results of a simple experiment. Timing experiments are tricky to get right on multi-processing machines, which is virtually all PCs these days. For small code snippets, you are better off using the timeit module rather than re-inventing the wheel. That will have the very desirable outcome that others reading your code are dealing with a known and trusted component, rather than having to work out how you are doing your timing. > Please try > the attached script. On my machine I get these results: > > lookup repeats: 1000000 [...] How do you determine that something fits a log N curve from just two data points? There's an infinite number of curves that pass through two data points. It's true that the results you found aren't consistent with O(1), but as I understand it, Python dicts are O(1) amortized ("on average over the long term"). Sometimes dicts resize, which is not a constant time operation, and sometimes the dict has to walk a short linked list, which depends on the proportion of hashes that lead to a collisions. But more importantly, I don't think you're necessarily measuring what you think you're measuring. I see that you include a call to random.randrange(N) within the timing loop. 
I don't think there is any guarantee that randrange(N) will take the
same amount of time for any N. I'm not sure if that is actually the
cause of your results, but it is a potential issue. When timing, you
should try to time the barest minimum of code. This gives a quick
demonstration of constant look-up time for sets:

>>> import timeit
>>> setup = """s = set(range(%(N)d))
... found = range(%(N)d//4, %(N)d//4+10)
... missing = range(%(N)d*2, %(N)d*2+10)
... """ # assumes N is at least 14
>>>
>>> timeit.Timer('for i in found: i in s',
...              setup % {'N':1000}).repeat()
[2.0811450481414795, 2.1155159473419189, 2.0662739276885986]
>>> timeit.Timer('for i in found: i in s',
...              setup % {'N':10000000}).repeat()
[2.0981149673461914, 2.0697150230407715, 2.0843479633331299]
>>>
>>> timeit.Timer('for i in missing: i in s',
...              setup % {'N':1000}).repeat()
[1.5208888053894043, 1.5102288722991943, 1.5023901462554932]
>>> timeit.Timer('for i in missing: i in s',
...              setup % {'N':10000000}).repeat()
[1.6430721282958984, 1.6344850063323975, 1.6358041763305664]

Regards,

-- 
Steven

From talin at acm.org  Fri Feb 13 06:13:23 2009
From: talin at acm.org (Talin)
Date: Thu, 12 Feb 2009 21:13:23 -0800
Subject: [Python-ideas] String formatting and namedtuple
In-Reply-To: 
References: <20090212141040.0c89e0fc@o>
	<70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com>
	<4994BFFE.8000903@pearwood.info>
Message-ID: <49950173.9010703@acm.org>

Some reactions:

First, let me say that PEP 3101 was always meant to be an expression
of the python-dev consensus rather than the ideas of one specific
person. So as far as I am concerned, the consensus rules.

That being said, I can think of two objections to your proposal,
although I don't feel strongly about either one.

The first issue is that the character sequence '{}' isn't by itself
suggestive of anything (except perhaps an empty dict.)
The logical extrapolation from {ddd} to {} is a subtractive one, meaning that it only makes sense if you are starting from {ddd} to begin with. From a pedagogical perspective, I have a slight preference for language constructs that are naturally additive, meaning that I can teach the shortcut first and it makes intuitive sense. (However, there are lots of language features that violate that rule, which is why this argument is somewhat weak.) Secondly, there is an argument to be made towards moving away from any syntactical pattern that requires the programmer to synchronize two lists, in this case the set of '%' field markers in the string and the sequence of replacement values. Having to maintain a correspondence between lists is almost never a problem when code is first written, but I think we can all remember instances where bugs have been introduced by maintainers who added a new item to one list but forgot to add the corresponding item to the other list. More generally, I think that when you evaluate the goodness of either syntax, you should do so from the perspective of both the original code author as well as someone who is dealing with an existing code base that has undergone a large degree of churn. Thus, a case could be made that by forcing programmers to explicitly state the mapping between field marker and replacement value (either numerically, or preferably by name) that the code becomes a fraction more robust as a result. BTW, although the use of unbound methods is not mentioned in the PEP, I do recall discussions on the mailing list around that time that proposed a whole lot of different ways of wrapping format() to create various formatting APIs. Terry Reedy wrote: > Steven D'Aprano wrote: >> Terry Reedy wrote: >> >>> PROPOSAL: Allow the simple case to stay simple. Allow field names to >>> be omitted for all fields in a string and then default to 0, 1, ... 
>>> so that example above could be written as >>> >>> >>> msg = "{} == {}".format >>> >>> Given that computers are glorified counting machines, it *is* a bit >>> annoying to be required to do the counting manually. I think this is >>> at least half the objection to switching to .format. >> >> +1 from me. Just to make it explicit: omitting field names will be an >> all-or-nothing proposition: you can't omit some of them unless you >> omit them all. Correct? > > That is my proposal. I should have made that clearer, as I now did on > the tracker issue I filed. My idea was that the function could switch > from current behavior to alternative behavior depending on the presence > or absence of anything in the first {}. But that is up to the patch > writer. > > tjr > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From greg.ewing at canterbury.ac.nz Fri Feb 13 07:13:20 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 13 Feb 2009 19:13:20 +1300 Subject: [Python-ideas] Revised revised PEP on yield-from Message-ID: <49950F80.5050309@canterbury.ac.nz> Third draft of the PEP, incorporating throw() and close() handling, and other feedback that I have received. PEP: XXX Title: Syntax for Delegating to a Subgenerator Version: $Revision$ Last-Modified: $Date$ Author: Gregory Ewing Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 13-Feb-2009 Python-Version: 2.7 Post-History: Abstract ======== A syntax is proposed to allow a generator to easily delegate part of its operations to another generator, the subgenerator interacting directly with the main generator's caller for as long as it runs. Additionally, the subgenerator is allowed to return with a value, and the value is made available to the delegating generator. 
The new syntax also opens up some opportunities for optimisation when
one generator re-yields values produced by another.


Proposal
========

The following new expression syntax will be allowed in the body of a
generator:

::

    yield from <expr>

where <expr> is an expression evaluating to an iterable, from which
an iterator is extracted. The effect is to run the iterator to
exhaustion, during which time it behaves as though it were
communicating directly with the caller of the generator containing
the ``yield from`` expression (the "delegating generator").

In detail:

* Any values that the iterator yields are passed directly to the
  caller.

* Any values sent to the delegating generator using ``send()`` are
  sent directly to the iterator. (If the iterator does not have a
  ``send()`` method, values sent in are ignored.)

* Calls to the ``throw()`` method of the delegating generator are
  forwarded to the iterator. (If the iterator does not have a
  ``throw()`` method, the thrown-in exception is raised in the
  delegating generator.)

* If the delegating generator's ``close()`` method is called, the
  iterator is finalised before finalising the delegating generator.

The value of the ``yield from`` expression is the first argument to
the ``StopIteration`` exception raised by the iterator when it
terminates.

Additionally, generators will be allowed to execute a ``return``
statement with a value, and that value will be passed as an argument
to the ``StopIteration`` exception.
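[Editorial aside: the semantics specified above can be exercised with the syntax as eventually adopted in Python 3.3 (PEP 380). A minimal sketch of the send() forwarding and return-value rules, using illustrative generator names:]

```python
def adder():
    # Subgenerator: values sent to the delegating generator arrive here.
    x = yield "need x"
    y = yield "need y"
    return x + y  # carried to the delegating generator via StopIteration

def delegating():
    # The value of the yield from expression is adder()'s return value.
    result = yield from adder()
    yield "sum=%d" % result

g = delegating()
seen = [next(g), g.send(1), g.send(2)]
print(seen)  # ['need x', 'need y', 'sum=3']
```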
Formal Semantics
----------------

The statement

::

    result = yield from expr

is semantically equivalent to

::

    _i = iter(expr)
    try:
        _u = _i.next()
        while 1:
            try:
                _v = yield _u
            except Exception, _e:
                if hasattr(_i, 'throw'):
                    _u = _i.throw(_e)
                else:
                    raise
            else:
                if hasattr(_i, 'send'):
                    _u = _i.send(_v)
                else:
                    _u = _i.next()
    except StopIteration, _e:
        _a = _e.args
        if len(_a) > 0:
            result = _a[0]
        else:
            result = None
    finally:
        if hasattr(_i, 'close'):
            _i.close()


Rationale
=========

A Python generator is a form of coroutine, but has the limitation
that it can only yield to its immediate caller. This means that a
piece of code containing a ``yield`` cannot be factored out and put
into a separate function in the same way as other code. Performing
such a factoring causes the called function to itself become a
generator, and it is necessary to explicitly iterate over this
second generator and re-yield any values that it produces.

If yielding of values is the only concern, this is not very arduous
and can be performed with a loop such as

::

    for v in g:
        yield v

However, if the subgenerator is to interact properly with the caller
in the case of calls to ``send()``, ``throw()`` and ``close()``,
things become considerably more complicated. As the formal expansion
presented above illustrates, the necessary code is very longwinded,
and it is tricky to handle all the corner cases correctly. In this
situation, the advantages of a specialised syntax should be clear.


Generators as Threads
---------------------

A motivating use case for generators being able to return values
concerns the use of generators to implement lightweight threads.
When using generators in that way, it is reasonable to want to
spread the computation performed by the lightweight thread over many
functions.
Using the proposed syntax, a statement such as ::

    y = f(x)

where f is an ordinary function, can be transformed into an equivalent
delegation call ::

    y = yield from g(x)

where g is a generator, and the behaviour of the resulting code can be
reasoned about by thinking of it as a function that can be suspended.

It is also reasonable to expect that if an exception is thrown into the
lightweight thread from outside using ``throw()``, it should first be
raised in the innermost generator where the thread is suspended, and
propagate outwards from there; and that if the lightweight thread is
terminated from outside by calling ``close()``, the chain of active
generators should be finalised from the innermost outwards.


Syntax
------

The particular syntax proposed has been chosen as suggestive of its
meaning, while not introducing any new keywords and clearly standing
out as being different from a plain ``yield``.


Optimisations
-------------

Using a specialised syntax opens up possibilities for optimisation when
there is a long chain of generators. Such chains can arise, for
instance, when recursively traversing a tree structure. The overhead of
passing ``next()`` calls and yielded values down and up the chain can
cause what ought to be an O(n) operation to become O(n\*\*2).

A possible strategy is to add a slot to generator objects to hold a
generator being delegated to. When a ``next()`` or ``send()`` call is
made on the generator, this slot is checked first, and if it is
nonempty, the generator that it references is resumed instead. If it
raises StopIteration, the slot is cleared and the main generator is
resumed.

This would reduce the delegation overhead to a chain of C function
calls involving no Python code execution. A possible enhancement would
be to traverse the whole chain of generators in a loop and directly
resume the one at the end, although the handling of StopIteration is
more complicated in that case.
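[Editorial note: the generator chains described here are easy to reproduce. A recursive tree traversal written with the proposed syntax (shown in the form later adopted in Python 3.3) builds one delegation link per level of the tree; the tuple-based tree encoding below is invented for illustration.]

```python
def walk(tree):
    # In-order traversal; each recursive call adds one link to the
    # chain of delegating generators discussed above.
    if tree is None:
        return
    left, value, right = tree
    yield from walk(left)
    yield value
    yield from walk(right)

# A small tree: nodes are (left, value, right), leaves are None.
tree = ((None, 1, None), 2, ((None, 3, None), 4, None))
assert list(walk(tree)) == [1, 2, 3, 4]
```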
Criticisms
==========

Under this proposal, the value of a ``yield from`` expression would be
derived in a very different way from that of an ordinary ``yield``
expression. This suggests that some other syntax not containing the
word ``yield`` might be more appropriate, but no alternative has so far
been proposed, other than ``call``, which has already been rejected by
the BDFL.

It has been suggested that some mechanism other than ``return`` in the
subgenerator should be used to establish the value returned by the
``yield from`` expression. However, this would interfere with the goal
of being able to think of the subgenerator as a suspendable function,
since it would not be able to return values in the same way as other
functions.

The use of an argument to StopIteration to pass the return value has
been criticised as an "abuse of exceptions", without any concrete
justification of this claim. In any case, this is only one suggested
implementation; another mechanism could just as well be used to achieve
the desired result, namely that the return value of the subgenerator
appears as the value of the ``yield from`` expression.


Alternative Proposals
=====================

Proposals along similar lines have been made before, some using the
syntax ``yield *`` instead of ``yield from``. While ``yield *`` is more
concise, it could be argued that it looks too similar to an ordinary
``yield`` and the difference might be overlooked when reading code.

To the author's knowledge, previous proposals have focused only on
yielding values, and thereby suffered from the criticism that the
two-line for-loop they replace is not sufficiently tiresome to write to
justify a new syntax. By dealing with sent values as well as yielded
ones, this proposal provides considerably more benefit.


Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

-- 
Greg

From greg.ewing at canterbury.ac.nz  Fri Feb 13 07:25:16 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 13 Feb 2009 19:25:16 +1300
Subject: [Python-ideas] Proto-PEP on a 'yield from' statement
In-Reply-To: <26E3F4C63E8F41FC953C6393DA15393F@RaymondLaptop1>
References: <4994AEF7.2080703@canterbury.ac.nz> <4994D37C.2000509@gmail.com>
	<26E3F4C63E8F41FC953C6393DA15393F@RaymondLaptop1>
Message-ID: <4995124C.5040007@canterbury.ac.nz>

Raymond Hettinger wrote:

> I recommend dropping the notion of forwarding from the proposal.
> The idea is use-case challenged, complicated, and should not be
> hidden behind new syntax.

I don't think it's conceptually complicated -- it just seems that way when you write out the Python code necessary to implement it *without* new syntax. The essential concept is that, while the subgenerator is running, everything behaves as though it were communicating directly with whatever is calling the outer generator.

If you leave out some of the ways that the caller can interact with a generator, such as send(), throw() and close(), then it doesn't behave exactly that way, and I think that would actually make it more complicated to understand.

I should perhaps point out that the way I would implement all this would *not* be by emitting bytecode equivalent to the expansion in the PEP. It would be more along the lines of the suggested optimisation, and all the next(), send(), throw() etc. calls would go more or less directly to the subgenerator until it terminates. Done that way, I expect the implementation would actually be fairly simple and straightforward.

> Would hate for this to become a trojan horse proposal
> when most folks just want a fast iterator pass-through mechanism:

You can use it that way if you want, without having to think about any of the other complications.
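[Editorial note: the "fast iterator pass-through" case really does reduce to a two-line loop under the proposal. The sketch below uses the ``yield from`` form as it was later accepted into Python 3.3; ``chain`` is an invented stand-in, not the itertools function.]

```python
def chain(*iterables):
    # Each delegation runs one iterable to exhaustion, passing every
    # yielded value straight through to the caller.
    for it in iterables:
        yield from it

assert list(chain('ab', [1, 2], range(2))) == ['a', 'b', 1, 2, 0, 1]
assert list(chain()) == []
```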
-- 
Greg

From greg.ewing at canterbury.ac.nz  Fri Feb 13 07:25:34 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 13 Feb 2009 19:25:34 +1300
Subject: [Python-ideas] Proto-PEP on a 'yield from' statement
In-Reply-To: <91ad5bf80902121912i577734b5qa9c9b8dcbec8cf97@mail.gmail.com>
References: <4994AEF7.2080703@canterbury.ac.nz> <4994D37C.2000509@gmail.com>
	<91ad5bf80902121912i577734b5qa9c9b8dcbec8cf97@mail.gmail.com>
Message-ID: <4995125E.3010004@canterbury.ac.nz>

George Sakkis wrote:

> - A "yield x" expression has completely different semantics from
> "yield from x"; that's a bad idea given how similar they look.

If that's a concern, I would take it as an indication that 'yield from' is perhaps not the best syntax to use, and maybe it should be something completely new, such as

    y = delegate f(args)

But then you lose the connection with generators that the word 'yield' gives you.

> - Returning a value by stuffing it in the StopIteration abuses the
> exception mechanism.

I don't see why. StopIteration is already being used as an out-of-band return value to signal the end of iteration. Attaching further information to that return value doesn't seem an unreasonable thing to do.

In any case, that's an implementation detail. There are other ways that the desired result could be achieved -- the desired result being the appearance of the return value as the value of the 'yield from' expression.
-- 
Greg

From dangyogi at gmail.com  Fri Feb 13 07:27:20 2009
From: dangyogi at gmail.com (Bruce Frederiksen)
Date: Fri, 13 Feb 2009 01:27:20 -0500
Subject: [Python-ideas] Proto-PEP on a 'yield from' statement
In-Reply-To: <26E3F4C63E8F41FC953C6393DA15393F@RaymondLaptop1>
References: <4994AEF7.2080703@canterbury.ac.nz> <4994D37C.2000509@gmail.com>
	<26E3F4C63E8F41FC953C6393DA15393F@RaymondLaptop1>
Message-ID: <499512C8.6080700@gmail.com>

Raymond Hettinger wrote:

>> I would think that in addition to forwarding send values to the
>> subgenerator, that throw exceptions sent to the delegating generator
>> also be forwarded to the subgenerator. If the subgenerator does not
>> handle the exception, then it should be re-raised in the delegating
>> generator. Also, the subgenerator close method should be called by
>> the delegating generator.
>
> I recommend dropping the notion of forwarding from the proposal.
> The idea is use-case challenged, complicated, and should not be
> hidden behind new syntax.
>
> Would hate for this to become a trojan horse proposal
> when most folks just want a fast iterator pass-through mechanism:

I don't really understand your objection. How does adding the ability to forward send/throw values and closing the subgenerator in any way whatsoever get in the way of you using this as a fast iterator pass-through mechanism?

I agree that 98% of the time the simple pass-through mechanism is all that will be required of this new feature. And I agree that this alone is sufficient motivation to want to see this feature added. But I have done quite a bit of work with nested generators and end up having to use itertools.chain, which also doesn't support the full generator behavior. Specifically, in my case, I needed itertools.chain to close the subgenerator so that finally clauses in the subgenerator get run when they should on jython and ironpython. I put in a request for this and was turned down.
I found an alternative way to do it, but it's somewhat ugly:

    class chain_context(object):
        def __init__(self, outer_it):
            self.outer_it = outer_iterable(outer_it)
        def __enter__(self):
            return itertools.chain.from_iterable(self.outer_it)
        def __exit__(self, type, value, tb):
            self.outer_it.close()

    class outer_iterable(object):
        def __init__(self, outer_it):
            self.outer_it = iter(outer_it)
            self.inner_it = None
        def __iter__(self):
            return self
        def close(self):
            if hasattr(self.inner_it, '__exit__'):
                self.inner_it.__exit__(None, None, None)
            elif hasattr(self.inner_it, 'close'):
                self.inner_it.close()
            if hasattr(self.outer_it, 'close'):
                self.outer_it.close()
        def next(self):
            ans = self.outer_it.next()
            if hasattr(ans, '__enter__'):
                self.inner_it = ans
                return ans.__enter__()
            ans = iter(ans)
            self.inner_it = ans
            return ans

and then use as:

    with chain_context(gen(x) for x in iterable) as it:
        for y in it:
            ...

So from my own experience, I would strongly argue that the new yield from should at least honor the generator close method. Perhaps some people here have never run python with a different garbage collector that doesn't immediately reclaim garbage objects, so they don't understand the need for this. Jython and ironpython are both just coming out with their 2.5 support; so expect to hear more of these complaints in the not too distant future from that crowd...

But I am baffled why the python community adopts these extra methods on generators and then refuses to support them anywhere else (for loops, itertools)? Is this a case of "well, I didn't vote for them, so I'm not going to play ball"? If that's the case, then perhaps send and throw should be retracted. I know that close is necessary when you move away from the reference counting collector, so I'll fight to keep that; as well as fight to get the rest of python to play ball with it. I haven't seen a need for send or throw myself.
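[Editorial note: the behaviour Bruce is asking for, prompt propagation of ``close()`` so that ``finally`` clauses run without relying on refcounting, is exactly what the proposal's semantics give. A minimal sketch using the ``yield from`` form later accepted into Python 3.3; ``chain_from_iterable`` and ``sub`` are invented names.]

```python
log = []

def sub(x):
    try:
        yield x
    finally:
        # Runs as soon as the chain is closed, even on a runtime
        # without immediate refcounting collection.
        log.append(('closed', x))

def chain_from_iterable(outer):
    # Delegation forwards close() to whichever subgenerator is active.
    for inner in outer:
        yield from inner

g = chain_from_iterable(sub(i) for i in range(3))
assert next(g) == 0
g.close()                        # finalises sub(0) immediately
assert log == [('closed', 0)]
```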
I've played a lot with send and it always seems to get too complicated, so I wouldn't fight for that one. I can imagine possible uses for throw, but haven't hit them yet myself in actual practice; so I'd only fight somewhat for throw.

If send/throw were mistakes, let's document that and urge people not to use them and make a plan for deprecating them and removing them from the language; and figure out what the right answers are. But if send/throw/close were not mistakes and are done deals, then let's support them!

In all of these cases, adding full support for send/throw/close does not require that you use any of them. It does not prevent using simple iterators rather than full blown generators. It does not diminish in any way the current capabilities of these other language features. It simply supports and allows the use of send/throw/close when needed. Otherwise, why did we put send/throw/close into the language in the first place?

I would dearly love to see the for statement fully support close and throw, since that's where you use generators 99% of the time. Maybe this one needs different syntax to not break existing code. I'm not very good with clever syntax, so you may be able to improve on these:

    for i from gen(x):
    for i finally in gen(x):
    for i in gen(x) closing throwing:
    for i in final gen(x):
    for gen(x) yielding i:
    for gen(x) as i:

The idea is that close should be called when the for loop terminates (for any reason), and uncaught exceptions in the for body should be sent to the generator using throw, and then only propagated outside of the for statement if they are not handled by throw. And, yes, the for statement should not do these things if a simple iterator is used rather than a generator.

If you wanted to support the send method too, then maybe something like:

    for gen1(x) | gen2(y) as i:

where the values yielded by gen1 are sent to gen2 with send, and then the values yielded by gen2 are bound to i.
If this were adopted, I would also recommend that if gen2 were a function rather than a generator, then the function be called on each value yielded by gen1 and the results of the function bound to i. Then

    for gen(x) | fun as i:

would be like:

    for map(fun, gen(x)) as i:

Of course, this leads to simply using map rather than | to combine generators by making map use send if passed a generator as its first argument:

    for map(gen2(y), gen1(x)) as i:

But this doesn't scale as well syntactically when you want to chain several generators together:

    for map(gen3(z), map(gen2(y), gen1(x))) as i:

vs

    for gen1(x) | gen2(y) | gen3(z) as i:

Unfortunately, the way that send is currently defined, gen2 can't skip values to act as a filter or generate multiple values for one value sent in. To do this would require that the operations of getting another value sent in and yielding values be separated, rather than combined as they are for send.

One way to do this is to use callbacks for getting another value. This could be done using the current next semantics by simply treating the callback as an iterator and passing it as another parameter to the generator:

    for gen2(y, gen1(x)) as i:

This is exactly what's currently being done by the itertools functions. But this also doesn't scale well syntactically when stacking up several generators.

A better way would be to allow send and next to raise a new NextValue exception when the generator wants another value sent in. Then a new receive expression would be used in the generator to get the value.
This would act like an iterator within the generator:

    def filter(pred):
        for var in receive:
            if pred(var):
                yield var

which would be used like this down at the basic iterator level:

    it = filter(some_pred)
    for x in some_iterable:
        try:
            value = it.send(x)
            while True:
                process(value)
                value = next(it)
        except NextValue:
            pass

and this would be done automatically by the new for statement:

    for some_iterable | filter(some_pred) as value:
        process(value)

This also allows generators to generate multiple values for each value received:

    def repeat(n):
        for var in receive:
            for i in range(n):
                yield var

    for some_iterable | repeat(3) as value:
        process(value)

With the new yield from syntax, your threesomes example becomes:

    def threesomes():
        yield from receive | repeat(3)

Or even just:

    def threesomes():
        return repeat(3)

Other functions can be done in this style too:

    def map(fn):
        for var in receive:
            yield fn(var)

So that stacking these all up is much more readable syntactically:

    for gen1(x) | filter(some_pred) | map(add_1) | threesomes() as i:

You have to admit that this is much more readable than:

    for threesomes(map(add_1, filter(some_pred, gen1(x)))) as i:

-bruce frederiksen

From dangyogi at gmail.com  Fri Feb 13 07:30:50 2009
From: dangyogi at gmail.com (Bruce Frederiksen)
Date: Fri, 13 Feb 2009 01:30:50 -0500
Subject: [Python-ideas] Proto-PEP on a 'yield from' statement
In-Reply-To: <91ad5bf80902121912i577734b5qa9c9b8dcbec8cf97@mail.gmail.com>
References: <4994AEF7.2080703@canterbury.ac.nz> <4994D37C.2000509@gmail.com>
	<91ad5bf80902121912i577734b5qa9c9b8dcbec8cf97@mail.gmail.com>
Message-ID: <4995139A.7040400@gmail.com>

George Sakkis wrote:

> - Returning a value by stuffing it in the StopIteration abuses the
> exception mechanism.
>
> Without a compelling, concrete, example I'm -1 on the return part; +1
> for the rest.

In thinking about this some more, what I think makes more sense is to simply return the final value from close rather than attaching it to StopIteration.
This still leaves open the syntax to use for this inside the generator. Perhaps:

    return finally some_value

-bruce frederiksen

From rwgk at yahoo.com  Fri Feb 13 07:46:10 2009
From: rwgk at yahoo.com (Ralf W. Grosse-Kunstleve)
Date: Thu, 12 Feb 2009 22:46:10 -0800 (PST)
Subject: [Python-ideas] set.add() return value
References: <118486.47943.qm@web111413.mail.gq1.yahoo.com>
	<2F8C0565097D4CA690BB83CC5B6A2C19@RaymondLaptop1>
	<203230.34503.qm@web111401.mail.gq1.yahoo.com>
	<50697b2c0902121512h6749333tb2a5cf5ff1e54899@mail.gmail.com>
	<171757.71077.qm@web111402.mail.gq1.yahoo.com>
	<4994FCF4.3030301@pearwood.info>
Message-ID: <779748.53402.qm@web111403.mail.gq1.yahoo.com>

> It's true that the results you found aren't consistent with O(1),
> but as I understand it, Python dicts are O(1) amortized ("on average
> over the long term"). Sometimes dicts resize, which is not a constant
> time operation, and sometimes the dict has to walk a short linked list,
> which depends on the proportion of hashes that lead to collisions.

Thanks for the insight. I didn't know such a process is considered O(1). I agree it is fair because in practice it seems to work very well, but the "collisions" part can turn it into O(N) as demonstrated (in a crude way) by the script below. Therefore the O(1) classification is a bit misleading.

My other script was simplified. I did look at more data points. The curve is amazingly flat but not a constant function.

It is frustrating that simple requests like wanting a useful True or False, that costs nothing, instead of a useless None, tend to digress like this ...
import os

class normal(object):
    def __init__(O, value):
        O.value = value
    def __hash__(O):
        return O.value.__hash__()

class bad(object):
    def __init__(O, value):
        O.value = value
    def __hash__(O):
        return 0

for N in [1000, 10000]:
    t0 = os.times()[0]
    s = set([normal(i) for i in xrange(N)])
    print os.times()[0]-t0
    assert len(s) == N
    t0 = os.times()[0]
    s = set([bad(i) for i in xrange(N)])
    print os.times()[0]-t0
    assert len(s) == N

From pyideas at rebertia.com  Fri Feb 13 08:00:16 2009
From: pyideas at rebertia.com (Chris Rebert)
Date: Thu, 12 Feb 2009 23:00:16 -0800
Subject: [Python-ideas] set.add() return value
In-Reply-To: <779748.53402.qm@web111403.mail.gq1.yahoo.com>
References: <118486.47943.qm@web111413.mail.gq1.yahoo.com>
	<2F8C0565097D4CA690BB83CC5B6A2C19@RaymondLaptop1>
	<203230.34503.qm@web111401.mail.gq1.yahoo.com>
	<50697b2c0902121512h6749333tb2a5cf5ff1e54899@mail.gmail.com>
	<171757.71077.qm@web111402.mail.gq1.yahoo.com>
	<4994FCF4.3030301@pearwood.info>
	<779748.53402.qm@web111403.mail.gq1.yahoo.com>
Message-ID: <50697b2c0902122300q27e82716y76a9a5b13e3f1766@mail.gmail.com>

On Thu, Feb 12, 2009 at 10:46 PM, Ralf W. Grosse-Kunstleve wrote:
>
>> It's true that the results you found aren't consistent with O(1),
>> but as I understand it, Python dicts are O(1) amortized ("on average
>> over the long term"). Sometimes dicts resize, which is not a constant
>> time operation, and sometimes the dict has to walk a short linked list,
>> which depends on the proportion of hashes that lead to collisions.
>
> Thanks for the insight. I didn't know such a process is considered
> O(1). I agree it is fair because in practice it seems to work very
> well, but the "collisions" part can turn it into O(N) as demonstrated
> (in a crude way) by the script below. Therefore the O(1) classification
> is a bit misleading.
>
> My other script was simplified. I did look at more data points. The
> curve is amazingly flat but not a constant function.
>
> It is frustrating that simple requests like wanting a useful True or
> False, that costs nothing, instead of a useless None, tend to digress
> like this ...

It may seem like just a simple feature request, but by not allowing it, we preserve the consistency, cleanness, and relative purity of Python, and I wouldn't trade these aspects of Python for anything, least of all a small feature request that would only yield a very small gain if approved.

As The Zen says: "Special cases aren't special enough to break the rules." You're asking for the "special case" of a mutating method of 'set' to break the general "rule" that mutating methods normally return None; it's simply not special /enough/ to justify breaking this very handy (and imho, intuitive) rule of thumb; or at least that seems to be the consensus.

Cheers,
Chris
-- 
Follow the path of the Iguana...
http://rebertia.com

From bruce at leapyear.org  Fri Feb 13 08:17:12 2009
From: bruce at leapyear.org (Bruce Leban)
Date: Thu, 12 Feb 2009 23:17:12 -0800
Subject: [Python-ideas] Proto-PEP on a 'yield from' statement
In-Reply-To: <499512C8.6080700@gmail.com>
References: <4994AEF7.2080703@canterbury.ac.nz> <4994D37C.2000509@gmail.com>
	<26E3F4C63E8F41FC953C6393DA15393F@RaymondLaptop1>
	<499512C8.6080700@gmail.com>
Message-ID: 

On Thu, Feb 12, 2009 at 10:27 PM, Bruce Frederiksen wrote:

> Raymond Hettinger wrote:
>
>> I recommend dropping the notion of forwarding from the proposal.
>> The idea is use-case challenged, complicated, and should not be
>> hidden behind new syntax.
>>
>> Would hate for this to become a trojan horse proposal
>> when most folks just want a fast iterator pass-through mechanism:
>
> I don't really understand your objection. How does adding the ability to
> forward send/throw values and closing the subgenerator in any way whatsoever
> get in the way of you using this as a fast iterator pass-through mechanism?
>
> I agree that 98% of the time the simple pass-through mechanism is all that
> will be required of this new feature. And I agree that this alone is
> sufficient motivation to want to see this feature added. But I have done
> quite a bit of work with nested generators and end up having to use
> itertools.chain, which also doesn't support the full generator behavior.

One of the advantages of full support of generators in a new 'yield from' is future proofing. Given the statement:

    The effect is to run the iterator to exhaustion, during which time it
    behaves as though it were communicating directly with the caller of the
    generator containing the ``yield from`` expression (the "delegating
    generator").

If, hypothetically, Python were to add some new feature to generators, then those would automatically work in this context. Every place in current code which implements this kind of mechanism is not future proof.

I didn't follow all the variations on the for loop, but regarding send, it seems to me that a natural case is this:

    for x in foo:
        bar = process(x)
        foo.send(bar)

which sends the value bar to the generator and the value that comes back is used in the next iteration of the loop. I know that what I wrote doesn't do that, so what I really mean is something like this but easier to write:

    try:
        x = foo.next()
        while True:
            bar = process(x)
            x = foo.send(bar)
    except StopIteration:
        pass

and the syntax that occurs to me is:

    for x in foo:
        bar = process(x)
        continue bar

As to chaining generators, I don't see that as a for loop-specific feature. If that's useful in a for, then it's useful outside and should stand on its own. (And I withhold judgment on that as I don't yet see a benefit to the new syntax vs. what's available today.)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From denis.spir at free.fr  Fri Feb 13 10:03:59 2009
From: denis.spir at free.fr (spir)
Date: Fri, 13 Feb 2009 10:03:59 +0100
Subject: [Python-ideas] String formatting and namedtuple
In-Reply-To: 
References: <20090212141040.0c89e0fc@o>
	<70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com>
	<91ad5bf80902121440j62fd3bcar8abef4bdca492ea@mail.gmail.com>
	<20090212225512.GA10700@panix.com>
Message-ID: <20090213100359.1bb3e0bf@o>

On Thu, 12 Feb 2009 23:29:33 -0500,
Terry Reedy wrote:

> Aahz wrote:
> > On Thu, Feb 12, 2009, George Sakkis wrote:
> >> On Thu, Feb 12, 2009 at 2:24 PM, Terry Reedy wrote:
> >>> PROPOSAL: Allow the simple case to stay simple. Allow field names to be
> >>> omitted for all fields in a string and then default to 0, 1, ... so that
> >>> example above could be written as
> >>>
> >>>>>> msg = "{} == {}".format
> >>>
> >>> Given that computers are glorified counting machines, it *is* a bit annoying
> >>> to be required to do the counting manually. I think this is at least half
> >>> the objection to switching to .format.
> >>
> >> What happens when both empty and non-empty fields appear ? E.g. would
> >>
> >> 'I love {} with {1} and {} with {1}'.format('bacon', 'eggs', 'sausage')
> >>
> >> return 'I love bacon with eggs and eggs with eggs', or it would be
> >> smarter and see that 1 is used explicitly and skip over it, giving 'I
> >> love bacon with eggs and sausage with eggs' ?
> >
> > I'd favor raising an exception. Alternatively, we could do the
> > equivalent of what % formatting does, which would be the first option
> > (that is, '{#}' is considered equivalent to mapped interpolation in %
> > formatting).
>
> From
>
> http://bugs.python.org/issue5237
>
> """This proposal is currently all or nothing for simplicity of
> description and presumed ease of implementation. The patch to the doc
> could then be "If all replacement fields are left blank, then sequential
> indexes 0,1, ... will be automatically inserted."
> inserted after [Each
> replacement field contains either the numeric index of a positional
> argument, or the name of a keyword argument.]. Mixing blank and
> non-blank specs would then be an error and raise an exception. """
>
> I think mixing implicit and explicit indexes would be confusing. Mixing
> implicit indexes and keywords could perhaps work, but I won't propose
> that. It would be a rare usage, while my goal is to make the common
> case '%s' format as easy to write as it is now, by replacing '%s' with
> '{}' [two keystrokes each].
>
> Terry Jan Reedy

Agree with the base {} proposal. Not really with forbidding mixing: a side-advantage is that this principle is consistent with the tuple/dict opposition: tuple items are implicitly numbered. Note that this is even more analogous to positional vs keyword function arguments. So there is probably no need to forbid {} and {name} in the same format string, as long as names come after positionally identified sub-strings. We need instead a rule stating that empty {} come first and that names comply with the common identifier pattern, in order to avoid conflicts with implicit indexes.

Denis
------
la vida e estranya

From denis.spir at free.fr  Fri Feb 13 10:20:19 2009
From: denis.spir at free.fr (spir)
Date: Fri, 13 Feb 2009 10:20:19 +0100
Subject: [Python-ideas] String formatting and namedtuple
In-Reply-To: <49950173.9010703@acm.org>
References: <20090212141040.0c89e0fc@o>
	<70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com>
	<4994BFFE.8000903@pearwood.info> <49950173.9010703@acm.org>
Message-ID: <20090213102019.29532390@o>

On Thu, 12 Feb 2009 21:13:23 -0800,
Talin wrote:

> Secondly, there is an argument to be made towards moving away from any
> syntactical pattern that requires the programmer to synchronize two
> lists, in this case the set of '%' field markers in the string and the
> sequence of replacement values.
> Having to maintain a correspondence
> between lists is almost never a problem when code is first written, but
> I think we can all remember instances where bugs have been introduced by
> maintainers who added a new item to one list but forgot to add the
> corresponding item to the other list.

I fully agree. But then the only solution is probably (re)considering a format a la Cobra:

    print 'Hello. My name is [name] and I am [age].'

[] pairs can actually hold any valid expression:

    print "Found [count(n)*3+1] items."

There were probably *very* good reasons to refuse such an obvious format.

Denis
------
la vida e estranya

From tanzer at swing.co.at  Fri Feb 13 09:58:37 2009
From: tanzer at swing.co.at (Christian Tanzer)
Date: Fri, 13 Feb 2009 08:58:37 -0000
Subject: [Python-ideas] String formatting and namedtuple
In-Reply-To: Your message of "Thu, 12 Feb 2009 22:49:13 -0500"
	<4994EDB9.4040500@trueblade.com>
References: <4994EDB9.4040500@trueblade.com>
Message-ID: 

Eric Smith wrote at Thu, 12 Feb 2009 22:49:13 -0500:

> Raymond wrote:
> > [Terry Reedy]
> >> There is certainly some disconnect on the issue.
> >
> > FWIW, whenever I've done talks on 3.0, it is common to get
> > an aversive reaction when the new syntax is shown.
> > I pitch it in a positive light, but you can sense churning stomachs.
>
> [not picking on Raymond here at all, his message was just convenient]
>
> There are a number of comments in this thread that lead me to think not
> everyone is aware that .format is fully supported in 2.6. I just want to
> make sure everyone knows that's the case.
>
> If you want to support 2.6+ and 3.0+, you can certainly use .format.

I think the main problem is the huge amount of existing code that uses `%` for formatting. As long as there is no easy way to migrate that code to `.format`, moves to deprecate `%`-formatting are bound to cause friction.
Counting lines containing `%` in my code base gives 14534 -- a few of these are numeric (but I'd be surprised if the numeric ones are more than a couple of hundred).

-- 
Christian Tanzer                                    http://www.c-tanzer.at/

From phd at phd.pp.ru  Fri Feb 13 10:30:19 2009
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Fri, 13 Feb 2009 12:30:19 +0300
Subject: [Python-ideas] String formatting and namedtuple
In-Reply-To: <20090213102019.29532390@o>
References: <20090212141040.0c89e0fc@o>
	<70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com>
	<4994BFFE.8000903@pearwood.info> <49950173.9010703@acm.org>
	<20090213102019.29532390@o>
Message-ID: <20090213093019.GA22886@phd.pp.ru>

On Fri, Feb 13, 2009 at 10:20:19AM +0100, spir wrote:
> print 'Hello. My name is [name] and I am [age].'

I found this exceptionally funny.

    print 'Hello. My name is {0} and I am {1}.'

uses indexes where

    print 'Hello. My name is [name] and I am [age].'

uses strings; time to talk about consistency with tuples/lists and dicts!

Oleg.
-- 
Oleg Broytmann  http://phd.pp.ru/  phd at phd.pp.ru
Programmers don't die, they just GOSUB without RETURN.

From solipsis at pitrou.net  Fri Feb 13 15:19:52 2009
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 13 Feb 2009 14:19:52 +0000 (UTC)
Subject: [Python-ideas] String formatting and namedtuple
References: 
Message-ID: 

Guido van Rossum writes:
>
> I don't intend to force the issue. I'm disappointed though --
> .format() fixes several common stumbling blocks with %(name)s and at
> least one with %s.

Every time I try to experiment a bit with format codes, I find them unintuitively complex:

    >>> "{0!r}".format(2.5)
    '2.5'
    >>> "{0:r}".format(2.5)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: Unknown conversion type r
    >>> "{0!f}".format(2.5)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: Unknown conversion specifier f
    >>> "{0:f}".format(2.5)
    '2.500000'

Why must the 'f' code be preceded by a colon rather than by an exclamation mark?
There is surely a rational explanation, but in day-to-day use it is really confusing. Add to this the annoyance of typing ".format" and of adding curly braces everywhere, and "%" is clearly handier despite the lonely tuple problem.

Regards

Antoine.

From solipsis at pitrou.net  Fri Feb 13 15:39:27 2009
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 13 Feb 2009 14:39:27 +0000 (UTC)
Subject: [Python-ideas] set.add() return value
References: <118486.47943.qm@web111413.mail.gq1.yahoo.com>
	<2F8C0565097D4CA690BB83CC5B6A2C19@RaymondLaptop1>
	<203230.34503.qm@web111401.mail.gq1.yahoo.com>
	<50697b2c0902121512h6749333tb2a5cf5ff1e54899@mail.gmail.com>
	<171757.71077.qm@web111402.mail.gq1.yahoo.com>
	<4994FCF4.3030301@pearwood.info>
Message-ID: 

Steven D'Aprano writes:
>
> It's true that the results you found aren't consistent with O(1), but as
> I understand it, Python dicts are O(1) amortized ("on average over the
> long term").

They are not O(1) amortized, but O(1) best case. The worst case being O(N) (if all keys fall in the same hash bucket). The whole game with hash tables is to find a sufficiently smart hash function such that keys are equiprobably distributed amongst buckets and lookup cost is O(1) rather than O(n). Hash functions can be improved over time; a recent example can be found at http://bugs.python.org/issue5186.

Regards

Antoine.

From guido at python.org  Fri Feb 13 18:56:55 2009
From: guido at python.org (Guido van Rossum)
Date: Fri, 13 Feb 2009 09:56:55 -0800
Subject: [Python-ideas] String formatting and namedtuple
In-Reply-To: 
References: <4994EDB9.4040500@trueblade.com>
Message-ID: 

On Fri, Feb 13, 2009 at 12:58 AM, Christian Tanzer wrote:
> I think the main problem is the huge amount of existing code that uses
> `%` for formatting. As long as there is no easy way to migrate that
> code to `.format`, moves to deprecate `%`-formatting are bound to
> cause friction.

Yes, that was our concern too when we decided to keep % without deprecation in 3.0.
My guess is that *most* of these use string literals, and we *can* write a 2to3 fixer for those. It is the cases where the format is being passed in as an argument or precomputed somehow where 2to3 falls down. It would be useful to have an idea how frequently that happens. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Feb 13 19:01:19 2009 From: guido at python.org (Guido van Rossum) Date: Fri, 13 Feb 2009 10:01:19 -0800 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: Message-ID: On Fri, Feb 13, 2009 at 6:19 AM, Antoine Pitrou wrote:

> Guido van Rossum writes:
>>
>> I don't intend to force the issue. I'm disappointed though --
>> .format() fixes several common stumbling blocks with %(name)s and at
>> least one with %s.
>
> Every time I try to experiment a bit with format codes, I find them
> unintuitively complex:
>
>>>> "{0!r}".format(2.5)
> '2.5'
>>>> "{0:r}".format(2.5)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: Unknown conversion type r
>>>> "{0!f}".format(2.5)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: Unknown conversion specifier f
>>>> "{0:f}".format(2.5)
> '2.500000'
>
> Why must the 'f' code be preceded by a colon rather than by an exclamation mark?

Actually, !r is the exception. The rule is that a colon is followed by a formatting language specific to the type, e.g. {:f} (which is only supported by floating point numbers and means fixed-point), whereas an exclamation point is followed by a single letter that bypasses the type-specific formatting -- {!r} is really the only one you need to learn.

> There is surely a rational explanation, but in day-to-day use it is really
> confusing. Add to this the annoyance of typing ".format" and of adding curly
> braces everywhere, and "%" is clearly handier despite the lonely tuple problem.
It is true that the advantages of .format() are probably more appreciated when you are writing a larger program. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From dangyogi at gmail.com Fri Feb 13 20:11:42 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Fri, 13 Feb 2009 14:11:42 -0500 Subject: [Python-ideas] Proto-PEP on a 'yield from' statement In-Reply-To: <4994D715.7020707@canterbury.ac.nz> References: <4994AEF7.2080703@canterbury.ac.nz> <4994D37C.2000509@gmail.com> <4994D715.7020707@canterbury.ac.nz> Message-ID: <4995C5EE.3040700@gmail.com> Greg Ewing wrote: > Bruce Frederiksen wrote: > >> I'm also against using return as the syntax for a final value from >> the subgenerator. > > Can you look at what I said in the last revision about > "Generators as Threads" and tell me whether you still > feel that way? > I don't really understand the "Generators as Threads". You say that a function, such as: y = f(x) could be translated into "an equivalent" generator call: y = yield from g(x) But the yield from form causes g(x) to send output to the caller, which f(x) doesn't do. It seems like I would either want one or the other: either yield from g(x) to send g's output to my caller, or y = sum(g(x)) to get a final answer myself of the generated values from g(x). On the other hand, if you're thinking that g(x) is going to be taking values from my caller (passed on to it through send) and producing a final answer, then we have a problem because g(x) will be using a yield expression to accept the values, but the yield expression also produces results which will be sent back to my caller. These results going back is probably not what I want. This is why I think that it's important to separate the aspects of sending and receiving values to/from generators. That's why I proposed receive, rather than the yield expression, to accept values in the generator. I would propose deprecating the yield expression. 
I would also propose changing send to only send a value into the generator and not return a result. Then you could get a sum of your input values by:

    y = sum(receive)

without generating bogus values back to your caller. I don't know if this helps, or if I've completely missed the point that you were trying to make? ... -bruce frederiksen From dangyogi at gmail.com Fri Feb 13 20:18:03 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Fri, 13 Feb 2009 14:18:03 -0500 Subject: [Python-ideas] Revised revised PEP on yield-from In-Reply-To: <49950F80.5050309@canterbury.ac.nz> References: <49950F80.5050309@canterbury.ac.nz> Message-ID: <4995C76B.4070904@gmail.com> An HTML attachment was scrubbed... URL: From dangyogi at gmail.com Fri Feb 13 22:08:24 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Fri, 13 Feb 2009 16:08:24 -0500 Subject: [Python-ideas] Proto-PEP on a 'yield from' statement In-Reply-To: References: <4994AEF7.2080703@canterbury.ac.nz> <4994D37C.2000509@gmail.com> <26E3F4C63E8F41FC953C6393DA15393F@RaymondLaptop1> <499512C8.6080700@gmail.com> Message-ID: <4995E148.8090905@gmail.com> Bruce Leban wrote:

> I didn't follow all the variations on the for loop, but regarding
> send, it seems to me that a natural case is this:
>
>     for x in foo:
>         bar = process(x)
>         foo.send(bar)
>
> which sends the value bar to the generator and the value that comes
> back is used in the next iteration of the loop. I know that what I
> wrote doesn't do that so what I really mean is something like this but
> easier to write:
>
>     try:
>         x = foo.next()
>         while True:
>             bar = process(x)
>             x = foo.send(bar)
>     except StopIteration:
>         pass
>
> and the syntax that occurs to me is:
>
>     for x in foo:
>         bar = process(x)
>         continue bar

In thinking more about this, here's how it's shaping up. The current generators become "old-style generators" and a "new-style generator" is added. The new-style generator is the same as the old-style generators w.r.t.
next/throw/close and yield *statements*. But new-style generators separate the operations of getting values into the generator vs getting values out of the generator. Thus, the yield *expression* is not allowed in new-style generators and is replaced by some kind of marker (reserved word, special identifier, return from builtin function, ??) that is used to receive values into the generator. I'll call this simply receive here for now. The presence of receive is what marks the generator as a new-style generator. Receive looks like an iterator and can be used and passed around as an iterator to anything requiring an iterator within the generator. It does not return values to the caller like yield expressions do. Commensurate with receive, the send method is changed in new-style generators. It still provides a value to the generator, but no longer returns anything. This covers the use case where you want to interact with the generator, as you've indicated above. Thus, a new-style generator would work just like you show in your first example, which is more intuitive than the current definition of send as passing a value both directions. So there would not be a need to change the continue statement.

> As to chaining generators, I don't see that as a for loop-specific
> feature. If that's useful in a for, then it's useful outside and
> should stand on its own.

Agreed. Also, a new method, called using, is added to new-style generators to provide an iterator to be used as its receive object. This takes an iterator, attaches it to the generator, and returns the generator so that you can do for i in gen(x).using(iterable). This covers the use case where you have all of the input values ahead of time. And then, as a little extra syntactic sugar, the | operator would be overloaded on generators and iterators to call this using method:

    class iterator:
        ...
        def __or__(self, gen_b):
            return gen_b.using(self)

Thus, when chaining generators together, you can use either:

    for gen1(x) | gen2(y) | gen3(z) as i:

or

    for gen3(z).using(gen2(y).using(gen1(x))) as i:

This also introduces a "new-style" for statement that properly honors the generator interface (calls close and throw like you'd expect) vs the "old-style" for statement that doesn't. The reason for the different syntax is that there may be code out there that uses a generator in a for statement with a break in it and then wants to continue with the generator in a subsequent for statement:

    g = gen(x)
    for i in g:    # doesn't close g
        ...
        if cond: break
    for i in g:    # process the rest of g's elements
        ...

This could be done with the new-style for statement as:

    g = gen(x)
    for somelib.notclosing(g) as i:
        ...
        if cond: break
    for g as i:
        ...

Comments? Should this be part of the yield from PEP? -bruce frederiksen From greg.ewing at canterbury.ac.nz Fri Feb 13 22:56:38 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 14 Feb 2009 10:56:38 +1300 Subject: [Python-ideas] Proto-PEP on a 'yield from' statement In-Reply-To: <4995C5EE.3040700@gmail.com> References: <4994AEF7.2080703@canterbury.ac.nz> <4994D37C.2000509@gmail.com> <4994D715.7020707@canterbury.ac.nz> <4995C5EE.3040700@gmail.com> Message-ID: <4995EC96.9040408@canterbury.ac.nz> Bruce Frederiksen wrote:

> But the yield from form causes g(x) to send output to the caller, which
> f(x) doesn't do.

In the usage I'm talking about there, you're not interested in the values being yielded. You're using yields without arguments as a way of suspending the thread. So you're not calling g() for the purpose of yielding values. You're calling it for the side effects it produces, and/or the value it returns using a return statement -- the same reasons you were calling f() in the non-thread version. There are also cases where you do want to use the yielded values.
For example if you have a function acting as a consumer, and a generator acting as a producer. The producer may want to spread its computation over several functions, but all the produced values should still go to the consumer. The same consideration applies if you're using send() to push values in the other direction. In that case, the outer function is the producer and the generator is the consumer. Whenever the consumer wants to get another value, it does a yield -- and the value should come from the producer, however deeply nested the yield call is. There are, of course, cases where this is not what you want. But in those cases, you don't use a 'yield from' expression -- you use a for-loop or explicit next() and send() calls to do whatever you want to do with the values being passed in and out. > This is why I think that it's important to > separate the aspects of sending and receiving values to/from > generators. That's why I proposed receive, rather than the yield > expression, to accept values in the generator. There may be merit in that, but it's a separate issue, outside the scope of this PEP. And as has been pointed out, if such a change is ever made, it will carry over naturally into the semantics of 'yield from'. -- Greg From greg.ewing at canterbury.ac.nz Fri Feb 13 23:06:33 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 14 Feb 2009 11:06:33 +1300 Subject: [Python-ideas] Revised revised PEP on yield-from In-Reply-To: <4995C76B.4070904@gmail.com> References: <49950F80.5050309@canterbury.ac.nz> <4995C76B.4070904@gmail.com> Message-ID: <4995EEE9.6060404@canterbury.ac.nz> Bruce Frederiksen wrote: > 1. I don't believe that you want the first yield statement (line 4). > I think that this line should be deleted. You're right, that's a mistake. > 2. I would suggest returning the final value from close rather than > attached to StopIteration. 
The advantage of using StopIteration is that any iterator can take part in the protocol without having to grow a close() method. I also suspect the implementation will be more straightforward, since the point at which the return value from a generator becomes available is the same point at which StopIteration is raised. If close() is used for this, the value would have to be stored somewhere until such time as close() is called. Taking all this into account, using StopIteration to carry the return value seems the most elegant solution to me. -- Greg From greg.ewing at canterbury.ac.nz Fri Feb 13 23:38:57 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 14 Feb 2009 11:38:57 +1300 Subject: [Python-ideas] Revised revised revised PEP on yield-from Message-ID: <4995F681.20702@canterbury.ac.nz> Fourth draft of the PEP. Corrected an error in the expansion and added a bit more to the Rationale.

PEP: XXX
Title: Syntax for Delegating to a Subgenerator
Version: $Revision$
Last-Modified: $Date$
Author: Gregory Ewing
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 13-Feb-2009
Python-Version: 2.7
Post-History:

Abstract
========

A syntax is proposed to allow a generator to easily delegate part of its operations to another generator, the subgenerator interacting directly with the main generator's caller for as long as it runs. Additionally, the subgenerator is allowed to return with a value, and the value is made available to the delegating generator. The new syntax also opens up some opportunities for optimisation when one generator re-yields values produced by another.

Proposal
========

The following new expression syntax will be allowed in the body of a generator:

::

    yield from <expr>

where <expr> is an expression evaluating to an iterable, from which an iterator is extracted.
The effect is to run the iterator to exhaustion, during which time it behaves as though it were communicating directly with the caller of the generator containing the ``yield from`` expression (the "delegating generator"). In detail:

* Any values that the iterator yields are passed directly to the caller.

* Any values sent to the delegating generator using ``send()`` are sent directly to the iterator. (If the iterator does not have a ``send()`` method, values sent in are ignored.)

* Calls to the ``throw()`` method of the delegating generator are forwarded to the iterator. (If the iterator does not have a ``throw()`` method, the thrown-in exception is raised in the delegating generator.)

* If the delegating generator's ``close()`` method is called, the iterator is finalised before finalising the delegating generator.

The value of the ``yield from`` expression is the first argument to the ``StopIteration`` exception raised by the iterator when it terminates. Additionally, generators will be allowed to execute a ``return`` statement with a value, and that value will be passed as an argument to the ``StopIteration`` exception.

Formal Semantics
----------------

The statement

::

    result = yield from expr

is semantically equivalent to

::

    _i = iter(expr)
    try:
        _u = _i.next()
        while 1:
            try:
                _v = yield _u
            except Exception, _e:
                if hasattr(_i, 'throw'):
                    _i.throw(_e)
                else:
                    raise
            else:
                if hasattr(_i, 'send'):
                    _u = _i.send(_v)
                else:
                    _u = _i.next()
    except StopIteration, _e:
        _a = _e.args
        if len(_a) > 0:
            result = _a[0]
        else:
            result = None
    finally:
        if hasattr(_i, 'close'):
            _i.close()

Rationale
=========

A Python generator is a form of coroutine, but has the limitation that it can only yield to its immediate caller. This means that a piece of code containing a ``yield`` cannot be factored out and put into a separate function in the same way as other code.
Performing such a factoring causes the called function to itself become a generator, and it is necessary to explicitly iterate over this second generator and re-yield any values that it produces. If yielding of values is the only concern, this is not very arduous and can be performed with a loop such as

::

    for v in g:
        yield v

However, if the subgenerator is to interact properly with the caller in the case of calls to ``send()``, ``throw()`` and ``close()``, things become considerably more complicated. As the formal expansion presented above illustrates, the necessary code is very longwinded, and it is tricky to handle all the corner cases correctly. In this situation, the advantages of a specialised syntax should be clear.

Generators as Threads
---------------------

A motivating use case for generators being able to return values concerns the use of generators to implement lightweight threads. When using generators in that way, it is reasonable to want to spread the computation performed by the lightweight thread over many functions. One would like to be able to call a subgenerator as though it were an ordinary function, passing it parameters and receiving a returned value. Using the proposed syntax, a statement such as

::

    y = f(x)

where f is an ordinary function, can be transformed into a delegation call

::

    y = yield from g(x)

where g is a generator. One can reason about the behaviour of the resulting code by thinking of g as an ordinary function that can be suspended using a ``yield`` statement. When using generators as threads in this way, typically one is not interested in the values being passed in or out of the yields. However, there are use cases for this as well, where the thread is seen as a producer or consumer of items.
The ``yield from`` expression allows the logic of the thread to be spread over as many functions as desired, with the production or consumption of items occurring in any subfunction, and the items are automatically routed to or from their ultimate source or destination. Concerning ``throw()`` and ``close()``, it is reasonable to expect that if an exception is thrown into the thread from outside, it should first be raised in the innermost generator where the thread is suspended, and propagate outwards from there; and that if the thread is terminated from outside by calling ``close()``, the chain of active generators should be finalised from the innermost outwards.

Syntax
------

The particular syntax proposed has been chosen as suggestive of its meaning, while not introducing any new keywords and clearly standing out as being different from a plain ``yield``.

Optimisations
-------------

Using a specialised syntax opens up possibilities for optimisation when there is a long chain of generators. Such chains can arise, for instance, when recursively traversing a tree structure. The overhead of passing ``next()`` calls and yielded values down and up the chain can cause what ought to be an O(n) operation to become O(n\*\*2). A possible strategy is to add a slot to generator objects to hold a generator being delegated to. When a ``next()`` or ``send()`` call is made on the generator, this slot is checked first, and if it is nonempty, the generator that it references is resumed instead. If it raises StopIteration, the slot is cleared and the main generator is resumed. This would reduce the delegation overhead to a chain of C function calls involving no Python code execution. A possible enhancement would be to traverse the whole chain of generators in a loop and directly resume the one at the end, although the handling of StopIteration is more complicated then.
Use of StopIteration to return values
-------------------------------------

There are a variety of ways that the return value from the generator could be passed back. Some alternatives include storing it as an attribute of the generator-iterator object, or returning it as the value of the ``close()`` call to the subgenerator. However, the proposed mechanism is attractive for a couple of reasons:

* Using the StopIteration exception makes it easy for other kinds of iterators to participate in the protocol without having to grow a close() method.

* It simplifies the implementation, because the point at which the return value from the subgenerator becomes available is the same point at which StopIteration is raised. Delaying until any later time would require storing the return value somewhere.

Criticisms
==========

Under this proposal, the value of a ``yield from`` expression would be derived in a very different way from that of an ordinary ``yield`` expression. This suggests that some other syntax not containing the word ``yield`` might be more appropriate, but no alternative has so far been proposed, other than ``call``, which has already been rejected by the BDFL. It has been suggested that some mechanism other than ``return`` in the subgenerator should be used to establish the value returned by the ``yield from`` expression. However, this would interfere with the goal of being able to think of the subgenerator as a suspendable function, since it would not be able to return values in the same way as other functions. The use of an argument to StopIteration to pass the return value has been criticised as an "abuse of exceptions", without any concrete justification of this claim. In any case, this is only one suggested implementation; another mechanism could be used without losing any essential features of the proposal.
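A short example may make the proposed semantics concrete. This is a sketch only, runnable on an interpreter that implements the proposal; the generator names are made up for illustration:

```python
def subgen():
    yield 1
    yield 2
    return 10   # under this proposal, 10 rides back on StopIteration

def delegator():
    # The yield from expression evaluates to subgen's return value
    # once subgen is exhausted; its yielded values pass straight through.
    result = yield from subgen()
    yield result

assert list(delegator()) == [1, 2, 10]
```

A plain for-loop over delegator() would still see 1, 2 and 10 as ordinary yielded values; only the delegating generator sees the return value specially.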
Alternative Proposals
=====================

Proposals along similar lines have been made before, some using the syntax ``yield *`` instead of ``yield from``. While ``yield *`` is more concise, it could be argued that it looks too similar to an ordinary ``yield`` and the difference might be overlooked when reading code. To the author's knowledge, previous proposals have focused only on yielding values, and thereby suffered from the criticism that the two-line for-loop they replace is not sufficiently tiresome to write to justify a new syntax. By dealing with sent values as well as yielded ones, this proposal provides considerably more benefit.

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

-- Greg From greg.ewing at canterbury.ac.nz Fri Feb 13 23:43:42 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 14 Feb 2009 11:43:42 +1300 Subject: [Python-ideas] set.add() return value In-Reply-To: <779748.53402.qm@web111403.mail.gq1.yahoo.com> References: <118486.47943.qm@web111413.mail.gq1.yahoo.com> <2F8C0565097D4CA690BB83CC5B6A2C19@RaymondLaptop1> <203230.34503.qm@web111401.mail.gq1.yahoo.com> <50697b2c0902121512h6749333tb2a5cf5ff1e54899@mail.gmail.com> <171757.71077.qm@web111402.mail.gq1.yahoo.com> <4994FCF4.3030301@pearwood.info> <779748.53402.qm@web111403.mail.gq1.yahoo.com> Message-ID: <4995F79E.3050601@canterbury.ac.nz> Ralf W. Grosse-Kunstleve wrote:

> I didn't know such a process is considered
> O(1). I agree it is fair because in practice it seems to work very
> well, but the "collisions" part can turn it into O(N)

The very worst possible case is probably something greater than O(1), but it's so unlikely to happen in practice that this is usually ignored.
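That worst case is easy enough to provoke on purpose, though; a sketch with a deliberately pathological hash (the class name here is made up for illustration):

```python
class Clash(object):
    """Illustrative only: every instance lands in the same hash bucket."""
    def __init__(self, n):
        self.n = n
    def __hash__(self):
        return 0                # constant hash: all keys collide
    def __eq__(self, other):
        return self.n == other.n

s = set(Clash(i) for i in range(1000))
# Membership tests still give the right answer, but each one now
# degenerates to a linear scan of a single 1000-entry bucket.
assert Clash(999) in s
assert len(s) == 1000
```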
-- Greg From greg.ewing at canterbury.ac.nz Fri Feb 13 23:46:40 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 14 Feb 2009 11:46:40 +1300 Subject: [Python-ideas] set.add() return value In-Reply-To: <50697b2c0902122300q27e82716y76a9a5b13e3f1766@mail.gmail.com> References: <118486.47943.qm@web111413.mail.gq1.yahoo.com> <2F8C0565097D4CA690BB83CC5B6A2C19@RaymondLaptop1> <203230.34503.qm@web111401.mail.gq1.yahoo.com> <50697b2c0902121512h6749333tb2a5cf5ff1e54899@mail.gmail.com> <171757.71077.qm@web111402.mail.gq1.yahoo.com> <4994FCF4.3030301@pearwood.info> <779748.53402.qm@web111403.mail.gq1.yahoo.com> <50697b2c0902122300q27e82716y76a9a5b13e3f1766@mail.gmail.com> Message-ID: <4995F850.2040802@canterbury.ac.nz> Chris Rebert wrote: > As The Zen says: "Special cases aren't special enough to break the rules." What might be more acceptable is to add a new method for this, with a name suggesting that it's more than just a plain mutating operation, e.g. was_it_there = myset.test_and_add(42) -- Greg From steve at pearwood.info Sat Feb 14 02:24:41 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 14 Feb 2009 12:24:41 +1100 Subject: [Python-ideas] set.add() return value In-Reply-To: <4995F850.2040802@canterbury.ac.nz> References: <118486.47943.qm@web111413.mail.gq1.yahoo.com> <2F8C0565097D4CA690BB83CC5B6A2C19@RaymondLaptop1> <203230.34503.qm@web111401.mail.gq1.yahoo.com> <50697b2c0902121512h6749333tb2a5cf5ff1e54899@mail.gmail.com> <171757.71077.qm@web111402.mail.gq1.yahoo.com> <4994FCF4.3030301@pearwood.info> <779748.53402.qm@web111403.mail.gq1.yahoo.com> <50697b2c0902122300q27e82716y76a9a5b13e3f1766@mail.gmail.com> <4995F850.2040802@canterbury.ac.nz> Message-ID: <49961D59.1050208@pearwood.info> Greg Ewing wrote: > Chris Rebert wrote: > >> As The Zen says: "Special cases aren't special enough to break the >> rules." 
> What might be more acceptable is to add a new method
> for this, with a name suggesting that it's more than
> just a plain mutating operation, e.g.
>
>     was_it_there = myset.test_and_add(42)

What's the use-case for this? What's wrong with doing this?

    if 42 not in myset:
        myset.add(42)

Short, sweet, and doesn't require any new methods. The OP's original use-case was based on his misapprehension that key lookup in a set was O(N log N). I don't see any useful advantage to a new method, let alone a change in semantics to the existing method. -- Steven From python at rcn.com Sat Feb 14 02:31:54 2009 From: python at rcn.com (Raymond Hettinger) Date: Fri, 13 Feb 2009 17:31:54 -0800 Subject: [Python-ideas] Revised revised revised PEP on yield-from References: <4995F681.20702@canterbury.ac.nz> Message-ID: <0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1> From: "Greg Ewing"

> The statement
>
> ::
>
>     result = yield from expr
>
> is semantically equivalent to
>
> ::
>
>     _i = iter(expr)
>     try:
>         _u = _i.next()
>         while 1:
>             try:
>                 _v = yield _u
>             except Exception, _e:
>                 if hasattr(_i, 'throw'):
>                     _i.throw(_e)
>                 else:
>                     raise
>             else:
>                 if hasattr(_i, 'send'):
>                     _u = _i.send(_v)
>                 else:
>                     _u = _i.next()
>     except StopIteration, _e:
>         _a = _e.args
>         if len(_a) > 0:
>             result = _a[0]
>         else:
>             result = None
>     finally:
>         if hasattr(_i, 'close'):
>             _i.close()

Are there any use cases that warrant all this complexity? I have not yet seen a single piece of real-world code that would benefit from yield-from having pass-throughs for send/throw/close. So far, this seems to have been a purely theoretical exercise in what is possible, but it doesn't seem to include investigation as to whether it is actually useful.
In the absence of real-world use cases, it might still be helpful to look at some contrived, hypothetical use cases so we can see if the super-powered version actually provides a better solution (is the code more self-evidently correct, is the construct easy to use and understand, is it awkward to use)? The proto-pep seems heavy on specification and light on showing that this is actually something we want to have. Plenty of folks have shown an interest in a basic version of yield-every or yield-from, but prior to this protoPEP, I've never seen any request for or discussion of a version that does pass-throughs for send/throw/close. Raymond From greg.ewing at canterbury.ac.nz Sat Feb 14 02:44:17 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 14 Feb 2009 14:44:17 +1300 Subject: [Python-ideas] set.add() return value In-Reply-To: <49961D59.1050208@pearwood.info> References: <118486.47943.qm@web111413.mail.gq1.yahoo.com> <2F8C0565097D4CA690BB83CC5B6A2C19@RaymondLaptop1> <203230.34503.qm@web111401.mail.gq1.yahoo.com> <50697b2c0902121512h6749333tb2a5cf5ff1e54899@mail.gmail.com> <171757.71077.qm@web111402.mail.gq1.yahoo.com> <4994FCF4.3030301@pearwood.info> <779748.53402.qm@web111403.mail.gq1.yahoo.com> <50697b2c0902122300q27e82716y76a9a5b13e3f1766@mail.gmail.com> <4995F850.2040802@canterbury.ac.nz> <49961D59.1050208@pearwood.info> Message-ID: <499621F1.3040209@canterbury.ac.nz> Steven D'Aprano wrote:

> Greg Ewing wrote:
>
>> was_it_there = myset.test_and_add(42)
>
> What's the use-case for this? What's wrong with doing this?
>
>     if 42 not in myset:
>         myset.add(42)

Nothing, other than it feels a little bit wrong looking the value up twice when once would be enough. I'm not particularly worried about this, I was just pointing out that adding a new method would be less of a violation of the Zen than changing an existing one.
-- Greg From greg.ewing at canterbury.ac.nz Sat Feb 14 02:54:15 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 14 Feb 2009 14:54:15 +1300 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: <0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1> References: <4995F681.20702@canterbury.ac.nz> <0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1> Message-ID: <49962447.3070503@canterbury.ac.nz> Raymond Hettinger wrote: > Are there any use cases that warrant all this complexity? Have you read the latest version of the Rationale? I've tried to explain more clearly where I'm coming from. As for real-world use cases, I've seen at least two frameworks that people have come up with for using generators as threads, where you make calls by writing things like result = yield Call(f(x, y)) There is a "driver" at the top that's maintaining a stack of generators and managing all the plumbing, so that you can pretend the above statement is just doing the same as result = f(x, y) except that f is suspendable. So some people *are* actually doing this sort of thing in real life, in a rather ad-hoc way. My proposal would standardise and streamline it, and make it more efficient. It would also free up the values being passed in and out of the yields so you can use them for your own purposes, instead of using them to implement the coroutine machinery. -- Greg From carl at carlsensei.com Sat Feb 14 02:57:30 2009 From: carl at carlsensei.com (Carl Johnson) Date: Fri, 13 Feb 2009 15:57:30 -1000 Subject: [Python-ideas] Proto-PEP on a 'yield from' statement Message-ID: <2A13452A-8B49-46A4-AA67-45C1C6517822@carlsensei.com> Greg Ewing wrote: > Antoine Pitrou wrote: > > > The problem I can see is that in normal iteration forms (e.g. a > "for" loop), the > > argument to StopIteration is ignored. Therefore, a generator > executing such a > > return statement and expecting the caller to use the return value > wouldn't be > > usable in normal iteration contexts. 
> How is this different from an ordinary function returning a
> value that is ignored by the caller?
>
> It's up to the caller to decide whether to use the return
> value. If it wants the return value, then it has to either
> use a 'yield from' or catch the StopIteration itself and
> extract the value.

I think there is a difference. If I were to

>>> def my_gen():
...     yield 1
...     return 2  # or "return finally 2" or whatever

I would be very surprised at the result when putting it into a for-loop:

>>> for i in my_gen():
...     print(i)
...
1
>>>

Where did the two go? Why did it disappear? Well, the answer is that the for-loop body ignored it, the same way the None emitted by a mutating function gets ignored. But that answer doesn't seem very compelling to me, just confusing. -1 for returning from a generator. -- Carl From steve at pearwood.info Sat Feb 14 05:14:00 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 14 Feb 2009 15:14:00 +1100 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: <0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1> References: <4995F681.20702@canterbury.ac.nz> <0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1> Message-ID: <49964508.5020207@pearwood.info> Raymond Hettinger wrote:

> The proto-pep seems heavy on specification and light on
> showing that this is actually something we want to have.
> Plenty of folks have shown an interest in a basic version
> of yield-every or yield-from, but prior to this protoPEP,
> I've never seen any request for or discussion of a version
> that does pass-throughs for send/throw/close.

What he said. I'm +1 on a basic pass-through "yield from". I understand the motivation in the protoPEP (factoring out parts of a generator into other generators), but it's not clear how genuinely useful this is in practice. I haven't used threads, and the motivating use case doesn't mean anything to me.
If I've understood the protoPEP, it wraps four distinct pieces of functionality:

    "yield from" pass-through
    pass-through for send
    pass-through for throw
    pass-through for close

I think each one needs to be justified, or at least explained, individually. I'm afraid I'm not even clear on what pass-through for send/throw/close would even mean, let alone why they would be useful. Basic yield pass-through is obvious, and even if we decide that it's nothing more than syntactic sugar for "for x in gen: yield x", I think it's a clear win for readability. But the rest needs some clear, simple examples of how they would be used. -- Steven From eric at trueblade.com Sat Feb 14 11:40:39 2009 From: eric at trueblade.com (Eric Smith) Date: Sat, 14 Feb 2009 05:40:39 -0500 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: Message-ID: <49969FA7.1090708@trueblade.com> Guido van Rossum wrote:

> It is true that the advantages of .format() are probably more
> appreciated when you are writing a larger program.

True. I think that the auto-number feature goes a little way in helping replace casual uses of %-formatting. To that end, I've implemented a string.Formatter subclass that mostly implements this suggestion, just so people can try it out if they want. I believe it's complete, except it doesn't handle escaping '{' and '}'. It's attached to http://bugs.python.org/issue5237 as auto_number_formatter_3.py.

$ ./python
Python 2.7a0 (trunk:69608, Feb 14 2009, 04:51:18)
[GCC 4.1.2 20070626 (Red Hat 4.1.2-13)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from auto_number_formatter_3 import formatter as _
>>> _('{} {} {}').format(3, 'pi', 3.14)
'3 pi 3.14'
>>>

It also lets you add in format specifiers, conversion specifiers, and object access. Once you use those, the improvement of leaving out the index numbers is less clear. I'll leave it for debate if this is useful.
I think it probably is, if only because it's easier to explain the
behavior: If you leave out the 'field name', a sequential number is
added in front of the 'replacement string' (using PEP 3101
nomenclature).

>>> _('{} {} {}').format(3, 'pi', 3.14)
'3 pi 3.14'
>>> _('{:#b} {!r:^16} {.imag}').format(3, set([14,3]), 3j+1)
'0b11   set([3, 14])   3.0'
>>>

It also supports all of the regular ''.format() behavior, you just
can't mix-and-match using field names and omitting them.

>>> _('{foo:10}').format(foo='bar')
'bar       '
>>> _('{0} {}').format(1, 2)
Traceback (most recent call last):
  ...
ValueError: cannot mix and match auto indexing
>>>

As I said, the pure-Python example doesn't handle escaping '{' and '}'
(because the parsing is tedious and already implemented in the C
version), but is otherwise complete.

If the consensus is that this is useful, I'll implement it in
''.format(), otherwise I'm done with this issue.

Eric.

PS: Just as a plug, I should note this is the first use I've found for
the under-appreciated string.Formatter.

From bruce at leapyear.org  Sun Feb 15 02:17:21 2009
From: bruce at leapyear.org (Bruce Leban)
Date: Sat, 14 Feb 2009 17:17:21 -0800
Subject: [Python-ideas] Revised revised revised PEP on yield-from
In-Reply-To: <49964508.5020207@pearwood.info>
References: <4995F681.20702@canterbury.ac.nz>
	<0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1>
	<49964508.5020207@pearwood.info>
Message-ID: 

I'd like to understand better what this function would do:

def generate_concatenate(generator_list):
    for g in generator_list:
        yield from g

in particular, what does generate_concatenate.close() do?

--- Bruce
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From greg.ewing at canterbury.ac.nz  Sun Feb 15 04:58:12 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 15 Feb 2009 16:58:12 +1300
Subject: [Python-ideas] Revised revised revised PEP on yield-from
In-Reply-To: 
References: <4995F681.20702@canterbury.ac.nz>
	<0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1>
	<49964508.5020207@pearwood.info>
Message-ID: <499792D4.3070101@canterbury.ac.nz>

Bruce Leban wrote:

> I'd like to understand better what this function would do:
>
> def generate_concatenate(generator_list):
>     for g in generator_list:
>         yield from g
>
> in particular, what does generate_concatenate.close() do?

If it happens to be in the midst of iterating over one of the
generators from the list at the time (i.e. executing the 'yield from')
then that generator is finalized, then generate_concatenate itself is
finalized.
Otherwise, nothing special happens.

From dangyogi at gmail.com  Sun Feb 15 20:41:42 2009
From: dangyogi at gmail.com (Bruce Frederiksen)
Date: Sun, 15 Feb 2009 14:41:42 -0500
Subject: [Python-ideas] PEP on yield-from: throw example
In-Reply-To: <49964508.5020207@pearwood.info>
References: <4995F681.20702@canterbury.ac.nz>
	<0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1>
	<49964508.5020207@pearwood.info>
Message-ID: <49986FF6.3060707@gmail.com>

Steven D'Aprano wrote:
> If I've understood the protoPEP, it wraps four distinct pieces of
> functionality:
>
>     "yield from" pass-through
>     pass-through for send
>     pass-through for throw
>     pass-through for close
>
> I think each one needs to be justified, or at least explained,
> individually. I'm afraid I'm not even clear on what pass-through for
> send/throw/close would even mean, let alone why they would be useful.
> Basic yield pass-through is obvious, and even if we decide that it's
> nothing more than syntactic sugar for "for x in gen: yield x", I think
> it's a clear win for readability. But the rest needs some clear,
> simple examples of how they would be used.

OK, let's give this a try. I'm going to do several posts, one on each
item above, in an attempt to demonstrate what we're talking about here.

First of all, to be clear on this, the send, throw and close mechanisms
were proposed in PEP 342 and adopted in Python 2.5. For some reason
though, these new mechanisms didn't seem to make it into the standard
Python documentation. So you'll need to read PEP 342 if you have any
question on how these work.

This post is on "pass-through for close". I've tried to make these as
simple as possible, but there's still a little bit to it, so please
bear with me. Let's get started.

We're going to do a little loan application program. We're going to
process a list of loan applications. Each loan application consists of
a list of people. If any of the people on the list qualify, then they
get the loan.
If none of the people qualify, they don't get the loan.

We're going to have a generator that generates the individual names.
If the name does not qualify, then DoesntQualify is raised by the
caller using the throw method:

class DoesntQualify(Exception): pass

Names = [['Raymond'], ['Bruce', 'Marilyn'], ['Jack', 'Jill']]

def gen(l):
    count = 0
    try:
        for names in l:
            count += 1
            for name in names:
                try:
                    yield name
                    break
                except DoesntQualify:
                    pass
            else:
                print names, "don't qualify"
    finally:
        print "processed", count, "applications"

Now we need a function that gets passed this generator and checks each
name to see if it qualifies. I would expect to be able to write:

def process(generator):
    for name in generator:
        if len(name) > 5:
            print name, "qualifies"
        else:
            raise DoesntQualify

But running this gives:

>>> g = gen(Names)
>>> process(g)
Raymond qualifies
Traceback (most recent call last):
  File "throw2.py", line 34, in <module>
    process(g)
  File "throw2.py", line 31, in process
    raise DoesntQualify
__main__.DoesntQualify

What I expected was that the for statement in process would forward the
DoesntQualify exception to the generator. But it doesn't do this, so
I'm left to do it myself. My next try developing this example was:

def process(generator):
    for name in generator:
        while True:
            if len(name) > 5:
                print name, "qualifies"
                break
            else:
                name = generator.throw(DoesntQualify)

But running this gives:

Raymond qualifies
Marilyn qualifies
['Jack', 'Jill'] don't qualify
processed 3 applications
Traceback (most recent call last):
  File "throw2.py", line 46, in <module>
    process2(gen(Names))
  File "throw2.py", line 43, in process2
    name = iterable.throw(DoesntQualify)
StopIteration

Oops, the final throw raised StopIteration when it hit the end of
Names.
So I end up with:

def process(generator):
    try:
        for name in generator:
            while True:
                if len(name) > 5:
                    print name, "qualifies"
                    break
                else:
                    name = generator.throw(DoesntQualify)
    except StopIteration:
        pass

This one works:

Raymond qualifies
Marilyn qualifies
['Jack', 'Jill'] don't qualify
processed 3 applications

But by this time, it's probably more clear if I just abandon the for
statement entirely:

def process(generator):
    name = generator.next()
    while True:
        try:
            if len(name) > 5:
                print name, "qualifies"
                name = generator.next()
            else:
                name = generator.throw(DoesntQualify)
        except StopIteration:
            break

But now I need to change process to add a limit to the number of
accepted applications:

def process(generator, limit):
    name = generator.next()
    count = 1
    while count <= limit:
        try:
            if len(name) > 5:
                print name, "qualifies"
                name = generator.next()
                count += 1
            else:
                name = generator.throw(DoesntQualify)
        except StopIteration:
            break

Seems easy enough, except that this is broken again because the final
"processed N applications" message won't come out if the limit is hit
(unless you are running CPython and call it in such a way that the
generator is immediately collected -- but this doesn't work on jython
or ironpython). That's what the close method is for, and I forgot to
call it:

def process(generator, limit):
    name = generator.next()
    count = 1
    while count <= limit:
        try:
            if len(name) > 5:
                print name, "qualifies"
                name = generator.next()
                count += 1
            else:
                name = generator.throw(DoesntQualify)
        except StopIteration:
            break
    generator.close()

So what starts out conceptually simple ends up more complicated and
error prone than I had expected; and the reason is that the for
statement doesn't support these new generator methods.
If it did, I would have:

def process(generator, limit):
    count = 1
    for generator as name:        # new syntax doesn't break old code
        if len(name) > 5:
            print name, "qualifies"
            count += 1
            if count > limit: break
        else:
            raise DoesntQualify   # new for passes this to generator.throw
    # new for remembers to call generator.close for me.

Now, we need to extend this because there are several lists of
applications. I'd like to be able to use the same gen function on each
list, and the same process function, and just introduce an intermediate
generator that gathers up the output of several generators. This is
exactly what itertools.chain does! So this should be very easy:

>>> g1 = gen(Names1)
>>> g2 = gen(Names2)
>>> g3 = gen(Names3)
>>> process(itertools.chain(g1, g2, g3), limit=5)

But, nope, itertools.chain doesn't honor the extra generator methods
either. If we had yield from, then I could use that instead of
itertools.chain:

def multi_gen(gen_list):
    for gen in gen_list:
        yield from gen

When I use yield from, it sets multi_gen aside and lets process talk
directly to each generator. So I would expect that not only would
objects yielded by each generator be passed directly back to process,
but that exceptions passed in by process with throw would be passed
directly to the generator. Why would this *not* be the case?

With the for statement, I can see that doing the throw/close processing
might break some legacy code and understand the reservation in doing so
there. But here we have a new language construct where we don't need to
worry about legacy code. It's also a construct dealing directly and
exclusively with generators.

If I can't use yield from, and itertools.chain doesn't work, and the
for statement doesn't work, then I'm faced once again with having to
code everything again myself:

def multi_gen(gen_list):
    for gen in gen_list:
        while True:
            try:
                yield gen.next()
            except DoesntQualify, e:
                yield gen.throw(e)
            except StopIteration:
                gen.close()

Yuck! Did I get this one right?
Nope, same StopIteration problem with gen.throw... Let's try:

def multi_gen(gen_list):
    for gen in gen_list:
        try:
            while True:
                try:
                    yield gen.next()
                except DoesntQualify, e:
                    yield gen.throw(e)
        except StopIteration:
            pass
        finally:
            gen.close()

Even more yuck! This feels more like programming in assembler than
python :-(

-bruce frederiksen

From dangyogi at gmail.com  Sun Feb 15 21:12:20 2009
From: dangyogi at gmail.com (Bruce Frederiksen)
Date: Sun, 15 Feb 2009 15:12:20 -0500
Subject: [Python-ideas] PEP on yield-from, send example
In-Reply-To: <49964508.5020207@pearwood.info>
References: <4995F681.20702@canterbury.ac.nz>
	<0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1>
	<49964508.5020207@pearwood.info>
Message-ID: <49987724.4070604@gmail.com>

Steven D'Aprano wrote:
> If I've understood the protoPEP, it wraps four distinct pieces of
> functionality:
>
>     "yield from" pass-through
>     pass-through for send
>     pass-through for throw
>     pass-through for close
>
> I think each one needs to be justified, or at least explained,
> individually. I'm afraid I'm not even clear on what pass-through for
> send/throw/close would even mean, let alone why they would be useful.

Here's the send example. I want to write a function that plays "guess
this number" by making successive guesses and getting a high/low
response.
My first version will generate random guesses:

def rand_guesser(limit):
    lo = 0            # answer is > lo
    hi = limit + 1    # answer is < hi
    num_tries = 0
    while lo + 2 < hi:
        guess = random.randint(lo + 1, hi - 1)
        num_tries += 1
        result = yield guess
        if result == 0: break
        if result < 0:
            lo = guess
        else:
            hi = guess
    else:
        guess = lo + 1
    print "rand_guesser: got", guess, "in", num_tries, "tries"

and then the function that calls it:

def test(guesser, limit):
    n = random.randint(1, limit)
    print "the secret number is", n
    try:
        guess = guesser.next()
        while True:
            print "got", guess
            guess = guesser.send(cmp(guess, n))
    except StopIteration:
        pass
    # guesser.close() isn't necessary if we got StopIteration,
    # because the generator has already finalized.

>>> test(rand_guesser(100), 100)
answer is 67
got 33
got 81
got 69
got 47
got 56
got 68
got 58
got 62
got 64
got 67
rand_guesser: got 67 in 10 tries

So far, so good. But how does binary_search compare with rand_guesser?

def binary_search(limit):
    lo = 0
    hi = limit + 1
    num_tries = 0
    while lo + 2 < hi:
        guess = (hi + lo) // 2
        num_tries += 1
        result = yield guess
        if result == 0: break
        if result < 0:
            lo = guess
        else:
            hi = guess
    else:
        guess = lo + 1
    print "binary_search: got", guess, "in", num_tries, "tries"

>>> test(binary_search(100), 100)
answer is 73
got 50
got 75
got 62
got 68
got 71
got 73
binary_search: got 73 in 6 tries

Hmmm, but to compare these, I need to run them on the same answer
number. I know, I can just chain them together. Then test will just see
both sets of guesses back to back... Another obvious choice for
itertools.chain!
>>> test(itertools.chain(rand_guesser(100), binary_search(100)), 100)
answer is 86
got 62
Traceback (most recent call last):
  File "throw2.py", line 134, in <module>
    test(itertools.chain(rand_guesser(100), binary_search(100)), 100)
  File "throw2.py", line 128, in test
    guess = guesser.send(cmp(guess, n))
AttributeError: 'itertools.chain' object has no attribute 'send'

Oops, that's right, itertools.chain doesn't play nicely with advanced
generators... :-(

So I guess I have to write my own intermediate multi_guesser...
Luckily, we have yield from!

def multi_guesser(l, limit):
    for gen in l:
        yield from gen(limit)

What does yield from do? It sets multi_guesser aside so that test can
communicate directly with each gen. Objects yielded by the gen go
directly back to test. And I would expect that objects sent from test
(with send) would go directly to the gen. If that's the case, this
works fine! If not, then I'm sad again and have to do something like:

def multi_guesser(l, limit):
    for gen in l:
        g = gen(limit)
        try:
            guess = g.next()
            while True:
                guess = g.send((yield guess))
        except StopIteration:
            pass

Which one do you think is more pythonic? Which one would you rather get
stuck maintaining? (Personally, I'd vote for itertools.chain!)
>>> test(multi_guesser((rand_guesser, binary_search), 100), 100)
answer is 95
got 39
got 99
got 80
got 98
got 93
got 94
got 97
got 96
rand_guesser: got 95 in 8 tries
got 50
got 75
got 88
got 94
got 97
got 95
binary_search: got 95 in 6 tries

-bruce frederiksen

From george.sakkis at gmail.com  Mon Feb 16 03:25:27 2009
From: george.sakkis at gmail.com (George Sakkis)
Date: Sun, 15 Feb 2009 21:25:27 -0500
Subject: [Python-ideas] PEP on yield-from: throw example
In-Reply-To: <49986FF6.3060707@gmail.com>
References: <4995F681.20702@canterbury.ac.nz>
	<0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1>
	<49964508.5020207@pearwood.info>
	<49986FF6.3060707@gmail.com>
Message-ID: <91ad5bf80902151825w61215fd6g6aac11949e9df71d@mail.gmail.com>

On Sun, Feb 15, 2009 at 2:41 PM, Bruce Frederiksen wrote:
> I'm going to do several posts, one on each item above, in
> an attempt to demonstrate what we're talking about here.

Thanks for the examples, they gave a good idea of what we're *really*
talking about :)

> So what starts out conceptually simple, ends up more complicated and
> error prone than I had expected; and the reason is that the for
> statement doesn't support these new generator methods. If it did, I
> would have:
>
> def process(generator, limit):
>     count = 1
>     for generator as name:        # new syntax doesn't break old code
>         if len(name) > 5:
>             print name, "qualifies"
>             count += 1
>             if count > limit: break
>         else:
>             raise DoesntQualify   # new for passes this to generator.throw
>     # new for remembers to call generator.close for me.

Backwards compatibility is not the (only) issue here. Calling
implicitly the extra generator methods is optional at best and
non-intuitive at worst. For close() it's usually desirable to be called
when a loop exits naturally, although that's debatable for prematurely
ended loops; the caller may still have a use for the non-exhausted
generator.
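To be concrete about what close() itself does (a small sketch of mine,
not from Bruce's post): it raises GeneratorExit at the point where the
generator is suspended, so any pending finally blocks run.

```python
log = []

def numbered():
    try:
        for i in range(5):
            yield i
    finally:
        log.append('finalized')   # cleanup pending inside the generator

g = numbered()
next(g)      # the generator is now suspended inside the try block
g.close()    # raises GeneratorExit at the yield; the finally runs
# log is now ['finalized'], even though g was never exhausted
```

The question is only whether a loop construct should make that
g.close() call implicitly, or leave the non-exhausted generator alive
for its caller.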
For throw() however, I strongly disagree that a raise statement in a
loop should implicitly call generator.throw(), regardless of what "for"
syntax is used. When I read "raise Exception", I expect the control to
flow out of the current frame to the caller, not into an unrelated
frame of some generator. The only viable option would perhaps be a new
statement, say "throw Exception", that distinguishes it clearly from
raise.

> If I can't use yield from, and itertools.chain doesn't work, and the
> for statement doesn't work, then I'm faced once again with having to
> code everything again myself:

As I said, I don't think that the for statement can or should be made
to "work", but would updating chain(), or all itertools.* for that
matter, so that they play well with the new methods solve most real
world cases? If so, that's probably better than adding new syntax;
practicality-beats-purity and all that.

George

From lie.1296 at gmail.com  Mon Feb 16 08:07:11 2009
From: lie.1296 at gmail.com (Lie Ryan)
Date: Mon, 16 Feb 2009 07:07:11 +0000 (UTC)
Subject: [Python-ideas] Making colons optional?
References: <2C0D0123-D691-42CD-A52D-7B5284857721@gmail.com>
	<498B1CDB.3000804@scottdial.com>
	<4991B15C.7070904@ronadam.com>
Message-ID: 

On Tue, 10 Feb 2009 10:54:52 -0600, Ron Adam wrote:

> Lie Ryan wrote:
>> On Thu, 05 Feb 2009 12:07:39 -0500, Scott Dial wrote:
>
>> personally, I like this:
>>
>> if some() and some_other() or \
>>         some_more(complex=(True,)) and \
>>         a_final_call(egg=(1,2,3)) \
>>         :
>>     do_something()
>
> My preference is to lead with keywords or symbols when it's
> convenient:
>
>     if (some()
>           and some_other()
>           or some_more(complex=(True,))
>           and a_final_call(egg=(1, 2, 3))):
>         do_something()
>
> For long math expressions spanning multiple lines I usually split
> before addition or subtraction signs.
>     if (some_long_value * another_long_value
>             + some_long_value * a_special_quantity -
>             an_offset_of_some_size):
>         do_something()
>
> With syntax highlighting these become even more readable. The eye and
> mind just follow along the vertical line of highlighted keywords or
> operations until it reaches the end.
>
> As far as optional things go, I'd like the '\' to be optional for
> multiple lines that end with a ':'. Adding parentheses around the
> expression works, but it seems like a compromise to me.
>
> Cheers,
> Ron

Under ideal circumstances, when I see code like that I'll turn it into
a function. Such a complex expression decreases readability
exponentially; turning it into a function would make it much more
readable. Worse is if the expression uses a lot of parentheses: it
becomes a nightmare trying to count the parentheses, keep them
balanced, and keep the expression logically correct.

    r = 1.0 / ( (n**p * p**n)**(n*p) + l**w )

where

    r = readability
    n = number of terms
    p = number of parentheses used to change evaluation order
    l = number of lines
    w = average width of each line

With n = 4, l = 4, p = 3, w = 20, the readability of that expression
is:

    r = 2.6547283374476086e-45

From tanzer at swing.co.at  Mon Feb 16 09:48:14 2009
From: tanzer at swing.co.at (Christian Tanzer)
Date: Mon, 16 Feb 2009 08:48:14 -0000
Subject: [Python-ideas] String formatting and namedtuple
In-Reply-To: Your message of "Fri, 13 Feb 2009 09:56:55 -0800"
References: 
Message-ID: 

Guido van Rossum wrote at Fri, 13 Feb 2009 09:56:55 -0800:

> On Fri, Feb 13, 2009 at 12:58 AM, Christian Tanzer wrote:
> > I think the main problem is the huge amount of existing code that
> > uses `%` for formatting. As long as there is no easy way to migrate
> > that code to `.format`, moves to deprecate `%`-formatting are bound
> > to cause friction.
>
> Yes, that was our concern too when we decided to keep % without
> deprecation in 3.0.
> My guess is that *most* of these use string literals, and we *can*
> write a 2to3 fixer for those.
>
> It is the cases where the format is being passed in as an argument or
> precomputed somehow where 2to3 falls down. It would be useful to have
> an idea how frequently that happens.

A fair amount of my use cases involve a non-literal format string
(i.e., passed in as argument, defined as module or class variable, or
even doc-strings used as format strings). I'd guess that most
non-literal format strings are used together with dictionaries.

Unfortunately, it's hard to grep for this :-(, so I can't give you
hard numbers.

Another, probably fairly common, use case involving non-literal
strings is %-formatting in I18N strings -- though it might be possible
to fix these automatically.

--
Christian Tanzer                          http://www.c-tanzer.at/

From greg.ewing at canterbury.ac.nz  Mon Feb 16 10:29:24 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 16 Feb 2009 22:29:24 +1300
Subject: [Python-ideas] PEP on yield-from: throw example
In-Reply-To: <91ad5bf80902151825w61215fd6g6aac11949e9df71d@mail.gmail.com>
References: <4995F681.20702@canterbury.ac.nz>
	<0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1>
	<49964508.5020207@pearwood.info>
	<49986FF6.3060707@gmail.com>
	<91ad5bf80902151825w61215fd6g6aac11949e9df71d@mail.gmail.com>
Message-ID: <499931F4.8050601@canterbury.ac.nz>

George Sakkis wrote:

> For throw() however, I strongly disagree that
> a raise statement in a loop should implicitly call generator.throw(),
> regardless of what "for" syntax is used.

Just in case it's not clear, the behaviour being suggested here is
*not* part of my proposal. As far as yield-from is concerned,
propagation of exceptions into the subgenerator would only occur when
throw() was called on the generator containing the yield-from, and then
only when it's suspended in the midst of it. Raise statements within
the delegating generator have nothing to do with the matter and aren't
affected at all.
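To make that concrete, here is a small sketch of the intended
behaviour, written as if the proposal were already implemented (the
sketch is mine, not part of the PEP draft, and it needs an interpreter
that accepts yield-from syntax): throw() called on the outer generator,
while it is suspended inside the yield-from, is delivered to the
subgenerator.

```python
def inner():
    try:
        yield 1
        yield 2
    except ValueError:
        yield 'caught in inner'

def outer():
    yield 'start'
    yield from inner()   # outer is "transparent" while suspended here
    yield 'end'

g = outer()
next(g)               # 'start' -- not yet inside the yield-from
next(g)               # 1 -- now suspended inside inner, via yield-from
g.throw(ValueError)   # delivered to inner, which catches it and yields
                      # 'caught in inner' as the result of throw()
next(g)               # inner finishes; outer resumes and yields 'end'
```

A raise statement executed inside outer's own body would behave exactly
as it always has.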
Having some examples to look at is a good idea, but Bruce seems to be
going off on a tangent and making some proposals of his own for
enhancing the for-loop. I fear that this will only confuse the
discussion further.

Perhaps I should also point out that yield-from is *not* intended to
help things like itertools.chain manage the cleanup of its generators,
so examples involving things with chain-like behaviour are probably not
going to help clarify what it *is* intended for.

It would be nice to have a language feature to help with things like
that, but I have no idea at the moment what such a thing would be like.

--
Greg

From guido at python.org  Mon Feb 16 17:00:30 2009
From: guido at python.org (Guido van Rossum)
Date: Mon, 16 Feb 2009 08:00:30 -0800
Subject: [Python-ideas] Revised revised revised PEP on yield-from
In-Reply-To: <0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1>
References: <4995F681.20702@canterbury.ac.nz>
	<0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1>
Message-ID: 

On Fri, Feb 13, 2009 at 5:31 PM, Raymond Hettinger wrote:
> Are there any use cases that warrant all this complexity?
> I've not yet seen a single piece of real-world code that would
> benefit from yield-from having pass-throughs for send/throw/close.
> So far, this seems to have been a purely theoretical exercise in
> what is possible, but it doesn't seem to include investigation as to
> whether it is actually useful.
> In the absence of real-world use cases, it might still be helpful
> to look at some contrived, hypothetical use cases so we can
> see if the super-powered version actually provides a better
> solution (is the code more self-evidently correct, is the construct
> easy to use and understand, is it awkward to use)?
>
> The proto-pep seems heavy on specification and light on
> showing that this is actually something we want to have.
> Plenty of folks have shown an interest in a basic version
> of yield-every or yield-from, but prior to this protoPEP,
> I've never seen any request for or discussion of a version
> that does pass-throughs for send/throw/close.

While I haven't read the PEP thoroughly, I believe I understand the
concept of pass-through and I think I have a compelling use case, at
least for passing through .send(). The rest then shouldn't be a
problem. Let's also not forget that 99% of all uses of generators don't
involve .send(), .throw() or .close().

My use case is flattening of trees, for example parse trees. For
concreteness, assume a node has a label and a list of children. The
iteration should receive ENTER and LEAVE pseudo-labels when entering a
level. We can then write a pre-order iterator like this, using
yield-from without caring about pass-through:

def __iter__(self):
    yield self.label
    if self.children:
        yield ENTER
        for child in self.children:
            yield from child
        yield LEAVE

Now suppose the caller of the iteration wants to be able to
occasionally truncate the traversal, e.g. it may not be interested in
the subtree for certain labels, or it may want to skip very deep trees.
It's not possible to anticipate what the caller wants to truncate, so
we don't want to build direct support for e.g. skip-lists or
level-control into the iterator. Instead, the caller now uses
.send(SKIP) when it wants to skip a subtree. The iterator responds with
a SKIPPED pseudo-label. For example:

def __iter__(self):
    skip = yield self.label
    if skip == SKIP:
        yield SKIPPED
    else:
        skip = yield ENTER
        if skip == SKIP:
            yield SKIPPED
        else:
            for child in self.children:
                yield from child
            yield LEAVE
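A hypothetical driver for this protocol might look like the following
sketch (the Node class and the event collector are mine, not part of
the PEP; ENTER/LEAVE/SKIP/SKIPPED are plain string sentinels here, and
it assumes the proposed send() pass-through, so it needs an interpreter
with yield-from):

```python
ENTER, LEAVE, SKIP, SKIPPED = 'ENTER', 'LEAVE', 'SKIP', 'SKIPPED'

class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

    def __iter__(self):
        # Same shape as the iterator sketched above.
        skip = yield self.label
        if skip == SKIP:
            yield SKIPPED
        else:
            skip = yield ENTER
            if skip == SKIP:
                yield SKIPPED
            else:
                for child in self.children:
                    yield from child
                yield LEAVE

def events(tree, skip_labels=()):
    # Collect all traversal events, answering each label with SKIP or
    # None. Each send() travels through every active yield-from straight
    # to the innermost generator -- that is the proposed pass-through.
    out = []
    it = iter(tree)
    try:
        value = next(it)
        while True:
            out.append(value)
            value = it.send(SKIP if value in skip_labels else None)
    except StopIteration:
        pass
    return out
```

With tree = Node('a', [Node('b', [Node('c')]), Node('d')]),
events(tree, skip_labels={'b'}) skips b's whole subtree: the SKIP sent
by the driver passes through the outer __iter__'s yield-from straight
into b's generator, which answers with SKIPPED and finishes.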
Without it, the for-loop would have to be written like this: for child in self.children: it = iter(child) while True: try: value = it.send(skip) except StopIteration: break skip = yield value Other remarks: (a) I don't know if the PEP proposes that "yield from expr" should return the last value returned by (i.e. sent to) a yield somewhere deeply nested; I think this would be useful. (b) I hope the PEP also explains what to do if "expr" is not a generator but some other kind of iterator. IMO it should work as long as .send() etc. are not used. I think it would probably be safest to raise an exception is .send() is used and the receiving iterator is not a generator. For .throw() and .close() it would probably be most useful to let them have their effect in the last generator on the stack. (c) A quick skim of the PEP didn't show suggestions for how to implement this. I think this needs to be addressed. I don't think it will be possible to literally replace the outer generator with the inner one while that is running; the treatment of StopIteration probably requires some kind of chaining, so that there is still a cost associated with deeply nested yield-from clauses. However it could be much more efficient than explicit for-loops. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Feb 16 17:04:48 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Feb 2009 08:04:48 -0800 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: Message-ID: On Mon, Feb 16, 2009 at 12:48 AM, Christian Tanzer wrote: > A fair amount of my use cases involve a non-literal format string > (i.e., passed in as argument, defined as module or class variable, or > even doc-strings used as format strings). I'd guess that most > non-literal format strings are used together with dictionaries. > > Unfortunately, it's hard to grep for this :-(, so I can't give you > hard numbers. 
It would be pretty simple to rig up 2to3 to report any string literals
containing e.g. '%(...)s' that are not immediately followed by a %
operator.

> Another, probably fairly common, use case involving non-literal
> strings is %-formatting in I18N strings -- though it might be possible
> to fix these automatically.

Plus, i18n is one of the motivators for .format() -- less chance of
forgetting to type the trailing 's' in '%(foobar)s' and the ability to
rearrange positional arguments a la "foo {1} bar {0} baz".

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bruce at leapyear.org  Mon Feb 16 20:22:56 2009
From: bruce at leapyear.org (Bruce Leban)
Date: Mon, 16 Feb 2009 11:22:56 -0800
Subject: [Python-ideas] String formatting and namedtuple
In-Reply-To: 
References: <20090212141040.0c89e0fc@o>
	<70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com>
Message-ID: 

On Thu, Feb 12, 2009 at 11:24 AM, Terry Reedy wrote:
>
> PROPOSAL: Allow the simple case to stay simple. Allow field names to
> be omitted for all fields in a string and then default to 0, 1, ...
> so that example above could be written as
>
> >>> msg = "{} == {}".format
>
> Given that computers are glorified counting machines, it *is* a bit
> annoying to be required to do the counting manually.

Explicit syntax is better imho:

    "The answers are {.} and {.}.".format(x,y)

I'm suggesting a bare dot because it looks like something rather than
nothing and this syntax is currently invalid.

--- Bruce
From janssen at parc.com  Mon Feb 16 20:47:07 2009
From: janssen at parc.com (Bill Janssen)
Date: Mon, 16 Feb 2009 11:47:07 PST
Subject: [Python-ideas] set.add() return value
In-Reply-To: <49961D59.1050208@pearwood.info>
References: <118486.47943.qm@web111413.mail.gq1.yahoo.com>
	<2F8C0565097D4CA690BB83CC5B6A2C19@RaymondLaptop1>
	<203230.34503.qm@web111401.mail.gq1.yahoo.com>
	<50697b2c0902121512h6749333tb2a5cf5ff1e54899@mail.gmail.com>
	<171757.71077.qm@web111402.mail.gq1.yahoo.com>
	<4994FCF4.3030301@pearwood.info>
	<779748.53402.qm@web111403.mail.gq1.yahoo.com>
	<50697b2c0902122300q27e82716y76a9a5b13e3f1766@mail.gmail.com>
	<4995F850.2040802@canterbury.ac.nz>
	<49961D59.1050208@pearwood.info>
Message-ID: <6292.1234813627@parc.com>

Steven D'Aprano wrote:

> Greg Ewing wrote:
> > Chris Rebert wrote:
> >
> > > As The Zen says: "Special cases aren't special enough to break the
> > > rules."
> >
> > What might be more acceptable is to add a new method
> > for this, with a name suggesting that it's more than
> > just a plain mutating operation, e.g.
> >
> >     was_it_there = myset.test_and_add(42)
>
> What's the use-case for this? What's wrong with doing this?
>
> if 42 in myset:
>     myset.add(42)

Well, for me, what's wrong is that it's complex to write and debug,
mainly. Don't you mean to say,

    was_it_there = (42 in myset)
    if not was_it_there:
        myset.add(42)

for instance? Not that I'm pushing for the addition of this method, but
I see the point.

Bill

From tjreedy at udel.edu  Mon Feb 16 20:59:09 2009
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 16 Feb 2009 14:59:09 -0500
Subject: [Python-ideas] String formatting and namedtuple
In-Reply-To: 
References: <20090212141040.0c89e0fc@o>
	<70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com>
Message-ID: 

Bruce Leban wrote:
>
> On Thu, Feb 12, 2009 at 11:24 AM, Terry Reedy wrote:
>
>     PROPOSAL: Allow the simple case to stay simple.
>     Allow field names to be omitted for all fields in a string and
>     then default to 0, 1, ... so that example above could be written
>     as
>
>     >>> msg = "{} == {}".format
>
>     Given that computers are glorified counting machines, it *is* a
>     bit annoying to be required to do the counting manually.
>
> Explicit syntax is better imho:
>
>     "The answers are {.} and {.}.".format(x,y)
>
> I'm suggesting a bare dot because it looks like something rather than
> nothing and this syntax is currently invalid.

-1

There is nothing 'explicit' about '.'. {} is just as currently invalid.

The purpose of my proposal is to make the simple case simple. In terms
of keystrokes, unshift - . - shift is as bad as unshift - 0 - shift.

tjr

From eric at trueblade.com  Mon Feb 16 21:50:12 2009
From: eric at trueblade.com (Eric Smith)
Date: Mon, 16 Feb 2009 15:50:12 -0500
Subject: [Python-ideas] String formatting and namedtuple
In-Reply-To: 
References: <20090212141040.0c89e0fc@o>
	<70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com>
Message-ID: <4999D184.3080105@trueblade.com>

Terry Reedy wrote:

> The purpose of my proposal is to make the simple case simple.
> In terms of keystrokes, unshift - . - shift is as bad as
> unshift - 0 - shift.
From g.brandl at gmx.net Mon Feb 16 22:16:19 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 16 Feb 2009 22:16:19 +0100 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: Message-ID: Guido van Rossum schrieb: > On Mon, Feb 16, 2009 at 12:48 AM, Christian Tanzer wrote: >> A fair amount of my use cases involve a non-literal format string >> (i.e., passed in as argument, defined as module or class variable, or >> even doc-strings used as format strings). I'd guess that most >> non-literal format strings are used together with dictionaries. >> >> Unfortunately, it's hard to grep for this :-(, so I can't give you >> hard numbers. > > It would be pretty simple to rig up 2to3 to report any string literals > containing e.g. '%(...)s' that are not immediately followed by a % > operator. The major problems I see are 1) __mod__ application with a single right-hand operand (a tuple makes it a string formatting to 100%, at least without other types overloading %) 2) format strings coming from external sources The first can't be helped easily. For the second, a helper function that converts %s format strings to {0} format strings could be imagined. A call of the form fmtstr % (a, b) would then be converted to _mod2format(fmtstr).format(a, b) To fix 1), _mod2format could even return a wrapper that executes .__mod__ on .format() if fmtstr is not a string. >> Another, probably fairly common, use case involving non-literal >> strings is %-formatting in I18N strings -- though it might be possible >> to fix these automatically. > > Plus, i18n is one of the motivators for .format() -- less chance of > forgetting to type the trailing 's' in '%(foobar)s' and the ability to > rearrange positional arguments a la "foo {1} bar {0} baz". I expect future generations of translators will be thankful. Current generation may be angry when they have to revise their .po files :) Georg -- Thus spake the Lord: Thou shalt indent with four spaces. 
No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From greg.ewing at canterbury.ac.nz Mon Feb 16 22:52:42 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Feb 2009 10:52:42 +1300 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: References: <4995F681.20702@canterbury.ac.nz> <0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1> Message-ID: <4999E02A.8050103@canterbury.ac.nz> Guido van Rossum wrote: > (a) I don't know if the PEP proposes that "yield from expr" should > return the last value returned by (i.e. sent to) a yield somewhere > deeply nested; I think this would be useful. No, it doesn't. Its value is the value passed to the 'return' statement that terminates the subgenerator (and generators are enhanced to permit return with a value). My reason for doing this is so you can use subgenerators like functions in a generator that's being used as a lightweight thread. > (b) I hope the PEP also explains what to do if "expr" is not a > generator but some other kind of iterator. Yes, it does. Currently I'm proposing that, if the relevant methods are not defined, send() is treated like next(), and throw() and close() do what they would have done normally on the parent generator. > (c) A quick skim of the PEP didn't show suggestions for how to > implement this. One way would be to simply emit the bytecode corresponding to the presented expansion, although that wouldn't be very efficient in terms of either speed or code size. The PEP also sketches an optimised implementation in which the generator has a slot which refers to the generator being delegated to. Calls to next(), send(), throw() and close() are forwarded via this slot if it is nonempty. 
There will still be a small overhead involved in the delegation, but it's only a chain of C function calls instead of Python ones, which ought to be a big improvement. It might be possible to reduce the overhead even further by following the chain of delegation pointers in a loop until reaching the end and then calling the end generator directly. It would be trickier to get right, though, because you'd have to be prepared to back up and try earlier generators in the face of StopIterations. -- Greg From guido at python.org Mon Feb 16 22:51:18 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Feb 2009 13:51:18 -0800 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: <20090212141040.0c89e0fc@o> <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> Message-ID: On Mon, Feb 16, 2009 at 11:59 AM, Terry Reedy wrote: > Bruce Leban wrote: >> >> >> On Thu, Feb 12, 2009 at 11:24 AM, Terry Reedy > > wrote: >> >> >> PROPOSAL: Allow the simple case to stay simple. Allow field names >> to be omitted for all fields in a string and then default to 0, 1, >> ... so that example above could be written as >> >> > >> msg = "{} == {}".format >> >> Given that computers are glorified counting machines, it *is* a bit >> annoying to be required to do the counting manually. >> >> >> Explicit syntax is better imho: >> "The answers are {.} and {.}.".format(x,y) >> >> I'm suggesting a bare dot because it looks like something rather than >> nothing and this syntax is currently invalid. > > -1 > There is nothing 'explicit' about '.'. > {} is just as currently invalid. > The purpose of my proposal is to make the simple case simple. > In terms of keystrokes, unshift - . - shift is as bad as unshift - 0 - > shift. Well said. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Mon Feb 16 23:00:12 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Feb 2009 11:00:12 +1300 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: <20090212141040.0c89e0fc@o> <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> Message-ID: <4999E1EC.6020900@canterbury.ac.nz> Bruce Leban wrote: > "The answers are {.} and {.}.".format(x,y) If you're going to the trouble of putting in a dot, it's not much more hardship to put a number there instead, is it? -- Greg From guido at python.org Tue Feb 17 02:20:10 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Feb 2009 17:20:10 -0800 Subject: [Python-ideas] Revised revised revised PEP on yield-from Message-ID: [Resend, hopefully bag.python.org is fixed again] On Mon, Feb 16, 2009 at 1:52 PM, Greg Ewing wrote: > Guido van Rossum wrote: >> (a) I don't know if the PEP proposes that "yield from expr" should >> return the last value returned by (i.e. sent to) a yield somewhere >> deeply nested; I think this would be useful. > > No, it doesn't. Its value is the value passed to the > 'return' statement that terminates the subgenerator > (and generators are enhanced to permit return with a > value). > > My reason for doing this is so you can use subgenerators > like functions in a generator that's being used as > a lightweight thread. There better be a pretty darn good reason to do this. I really don't like overloading return this way -- normally returning from a generator is equivalent to falling off the end and raises StopIteration, and I don't think you can change that easily. >> (b) I hope the PEP also explains what to do if "expr" is not a >> generator but some other kind of iterator. > > Yes, it does. 
Currently I'm proposing that, if the > relevant methods are not defined, send() is treated > like next(), and throw() and close() do what they > would have done normally on the parent generator. I'm not sure I like this interpretation of .send() -- it looks asymmetrical with the way .send() to non-generator iterators is treated in other contexts, where it is an error. I'm fine with the other two though, so I'm not sure how strong my opposition should be. >> (c) A quick skim of the PEP didn't show suggestions for how to >> implement this. > > One way would be to simply emit the bytecode corresponding > to the presented expansion, although that wouldn't be > very efficient in terms of either speed or code size. Also pretty complex given the special cases for .send(), .throw() and .close() -- if it weren't for pass-through it could be a simplified for-loop (leaving out the variable assignment), but because of the pass-through it seems it would have to be ugly. > The PEP also sketches an optimised implementation in > which the generator has a slot which refers to the > generator being delegated to. Calls to next(), send(), > throw() and close() are forwarded via this slot if it > is nonempty. And that could in turn be a generator with another such slot, right? Hopefully the testing for the presence of .throw, .send and .close could be done once at the start of the yield-from and represented as a set of flags (or perhaps the test could be delayed until the first time it's needed). I recommend that you produce a working implementation of this; who knows what other issues you might run into (including whether your proposed interpretation of return from a generator above makes sense). > There will still be a small overhead involved in the > delegation, but it's only a chain of C function calls > instead of Python ones, which ought to be a big > improvement. Agreed.
> It might be possible to reduce the overhead even further > by following the chain of delegation pointers in a loop > until reaching the end and then calling the end > generator directly. It would be trickier to get right, > though, because you'd have to be prepared to back up > and try earlier generators in the face of StopIterations. Well there you have a question that could be answered by trying to implement it. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Feb 17 02:23:35 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Feb 2009 17:23:35 -0800 Subject: [Python-ideas] set.add() return value Message-ID: [Resend, hopefully bag.python.org is fixed now.] On Fri, Feb 13, 2009 at 5:24 PM, Steven D'Aprano wrote: > Greg Ewing wrote: >> >> Chris Rebert wrote: >> >>> As The Zen says: "Special cases aren't special enough to break the >>> rules." >> >> What might be more acceptable is to add a new method >> for this, with a name suggesting that it's more than >> just a plain mutating operation, e.g. >> >> was_it_there = myset.test_and_add(42) >> > > What's the use-case for this? What's wrong with doing this? > > if 42 in myset: > myset.add(42) > > Short, sweet, and doesn't require any new methods. This example also has a bug, which neither of the two posters responding caught (unless Bill J was being *very* subtle). > The OP's original use-case was based on his misapprehension that key lookup > in a set was O(N log N). I don't see any useful advantage to a new method, > let alone a change in semantics to the existing method. Regardless of the OP's misunderstanding, I might be +0 on changing .add() and similar methods to returning True if the set was changed and False if it wasn't. I don't see a serious API incompatibility looming here (I'm assuming that the None which it currently returns is discarded by almost all code calling it rather than relied upon). 
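[Editorial note: the behaviour Guido sketches here -- add() reporting whether the set changed -- can be prototyped today in a subclass, without touching the built-in type. The class name is illustrative, not a real API.]

```python
# A sketch of the semantics being debated, as a subclass of set.
class ReportingSet(set):
    def add(self, item):
        """Add item; return True if the set changed, False if it was already present."""
        changed = item not in self
        set.add(self, item)
        return changed

s = ReportingSet()
assert s.add(42) is True    # 42 was not there: the set changed
assert s.add(42) is False   # already present: no change
```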
It seems cheap enough to make this change; the internal implementation has this information available (e.g. look at set_insert_key()). Before I go to +1 though I would have to see that there are enough examples found in real code that could benefit from this small optimization. There's also the nagging concern that once we do this for set operations people might ask to do this for dict operations too, and then what's next. Plus, the abstract base classes would have to be adjusted and then existing implementations thereof would become invalid as a result. IOW the slippery-slope argument. So maybe +0 is too strong still; I'm wavering somewhere between -0 and +0 ATM. PS Note that this is not the same case as the list.sort() method -- the latter's return value would be redundant (and IMO misleading) while the return value for .add() provides new information. My position on that use case is not wavering. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Tue Feb 17 03:13:25 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Feb 2009 18:13:25 -0800 Subject: [Python-ideas] set.add() return value References: <118486.47943.qm@web111413.mail.gq1.yahoo.com> <2F8C0565097D4CA690BB83CC5B6A2C19@RaymondLaptop1> <203230.34503.qm@web111401.mail.gq1.yahoo.com> <50697b2c0902121512h6749333tb2a5cf5ff1e54899@mail.gmail.com> <171757.71077.qm@web111402.mail.gq1.yahoo.com> <4994FCF4.3030301@pearwood.info> <779748.53402.qm@web111403.mail.gq1.yahoo.com> <50697b2c0902122300q27e82716y76a9a5b13e3f1766@mail.gmail.com> <4995F850.2040802@canterbury.ac.nz> <49961D59.1050208@pearwood.info> Message-ID: <476B74AB12994003AFE1A697CD13AB57@RaymondLaptop1> [Steven D'Aprano] > What's the use-case for this? What's wrong with doing this? > > if 42 in myset: > myset.add(42) > > Short, sweet, and doesn't require any new methods. I think you meant: "if 42 not in myset". Am -1 on the proposed change. AFAICT, it will never save more than one line of code.
The current version is explicit and clear. Moreover, it does not require special knowledge to read. Currently, the set API has a zero learning curve and doesn't demand that you remember a rule particular to Python. In contrast, the proposed version requires that you know that Python has a special API and you need to remember that when you're reading someone else's code. Give this snippet to five Python programmers (not in this thread) and see if they correctly deduce when f(x), f(y), f(z), and f(m) will be called. Also, see if they can correctly assert known post-conditions (after each if-statement and before each function call).

if(not myset.add(x)):
    f(x)
else:
    g(x)
if(myset.discard(y)):
    f(y)
else:
    g(y)
if(myset.update(z)):
    f(z)
else:
    g(z)
if(myset):
    f(m)
else:
    g(m)

As I mentioned in another post, this API is counterintuitive for anyone who is used to other languages where operations like set.add() always returns self and lets them write code like: myset.add(x).add(y).add(z) Essentially the argument against boils down to there not being an intuitively obvious correct interpretation of what the code does while the existing approach is crystal clear. The set API currently has no rough edges. It is one of the cleanest APIs in the language and I hope it stays that way.
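[Editorial note: the chaining style Raymond refers to can be emulated in Python with a small wrapper. This is illustration only -- the real set.add() returns None, so chaining does not work on the built-in type.]

```python
# A fluent wrapper in the Smalltalk/Ruby style: mutators return the receiver.
class FluentSet(set):
    def add(self, item):
        set.add(self, item)
        return self            # return self so calls can be chained

s = FluentSet().add('x').add('y').add('z')
assert s == {'x', 'y', 'z'}
```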
Raymond From guido at python.org Tue Feb 17 03:32:55 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Feb 2009 18:32:55 -0800 Subject: [Python-ideas] set.add() return value In-Reply-To: <476B74AB12994003AFE1A697CD13AB57@RaymondLaptop1> References: <118486.47943.qm@web111413.mail.gq1.yahoo.com> <203230.34503.qm@web111401.mail.gq1.yahoo.com> <50697b2c0902121512h6749333tb2a5cf5ff1e54899@mail.gmail.com> <171757.71077.qm@web111402.mail.gq1.yahoo.com> <4994FCF4.3030301@pearwood.info> <779748.53402.qm@web111403.mail.gq1.yahoo.com> <50697b2c0902122300q27e82716y76a9a5b13e3f1766@mail.gmail.com> <4995F850.2040802@canterbury.ac.nz> <49961D59.1050208@pearwood.info> <476B74AB12994003AFE1A697CD13AB57@RaymondLaptop1> Message-ID: On Mon, Feb 16, 2009 at 6:13 PM, Raymond Hettinger wrote: > [Steven D'Aprano] >> >> What's the use-case for this? What's wrong with doing this? >> >> if 42 in myset: >> myset.add(42) >> >> Short, sweet, and doesn't require any new methods. > > I think you meant: "if 42 not in myset". > > Am -1 on the proposed change. > AFAICT, it will never save more than one line of code. > The current version is explicit and clear. > Moreover, it does not require special knowledge to read. > > Currently, the set API has a zero learning curve and doesn't > demand that you remember a rule particular to Python. > In contrast, the proposed version requires that you know > that Python has a special API and you need to remember that > when you're reading someone else's code. > > Give this snippet to five Python programmers (not in this thread) > and see if they correctly deduce when f(x), f(y), f(z), and f(m) will > be called. Also, see if they can correctly assert known post-conditions > (after each if-statement and before each function call). > > if(not myset.add(x)): > f(x) > else: > g(x) > if(myset.discard(y)): > f(y) > else: > g(y) > if(myset.update(z): > f(z) > else: > g(z) > if(myset): > f(m) > else: > g(m) Point taken, but... 
> As I mentioned in another post, this API is counterintuitive for > anyone who is used to other languages where operations like > set.add() always returns self and lets them write code like: > > myset.add(x).add(y).add(z) Which language has that? > Essentially the argument against boils down to there not being > an intuitively obvious correct interpretation of what the code > does while the existing approach is crystal clear. > > The set API currently has no rough edges. It is one of the cleanest > APIs in the language and I hope it stays that way. Hey Raymond, relax. You don't have to take it personal. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Tue Feb 17 03:51:32 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Feb 2009 18:51:32 -0800 Subject: [Python-ideas] Revised revised revised PEP on yield-from References: <4995F681.20702@canterbury.ac.nz> Message-ID: <8F04B80DFCFE433DA740128DB5E2AEDF@RaymondLaptop1> [Greg Ewing] > * Any values that the iterator yields are passed directly to the > caller. > > * Any values sent to the delegating generator using ``send()`` > are sent directly to the iterator. (If the iterator does not > have a ``send()`` method, values sent in are ignored.) > > * Calls to the ``throw()`` method of the delegating generator are > forwarded to the iterator. (If the iterator does not have a > ``throw()`` method, the thrown-in exception is raised in the > delegating generator.) > > * If the delegating generator's ``close()`` method is called, the > iterator is finalised before finalising the delegating generator. > > The value of the ``yield from`` expression is the first argument to the > ``StopIteration`` exception raised by the iterator when it terminates. > > Additionally, generators will be allowed to execute a ``return`` > statement with a value, and that value will be passed as an argument > to the ``StopIteration`` exception. 
Looks like a language construct where only a handful of python programmers will be able to correctly describe what it does. I've only seen requests for the functionality in the first bullet point. The rest seems like unnecessary complexity -- something that will take a page in the docs rather than a couple lines. > if hasattr(_i, 'throw'): > _i.throw(_e) > else: > raise This seems like it is awkwardly trying to cater to two competing needs. It recognized that the outer generator may have a legitimate need to catch an exception and that the inner generator might want it too. Unfortunately, only one can be caught and there is no way to have both the inner and outer generator/iterator each do their part in servicing an exception. Also, am concerned that this slows down the more common case of _i not having a throw method. The same thoughts apply to send() and close(). Potentially, both an inner and outer generator will need a close function but will have no way of doing both. Raymond From charlie137 at gmail.com Tue Feb 17 04:05:09 2009 From: charlie137 at gmail.com (Guillaume Chereau) Date: Tue, 17 Feb 2009 11:05:09 +0800 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: References: Message-ID: <8e9327d40902161905j237a64dei468a811cf42539b@mail.gmail.com> I think we can already create coroutines and lightweight tasks without the need of the "yield from" syntax. The trick is to assume that when a coroutine yields another coroutine, the behaviour is the same as using "yield from".
The code somehow looks like this :

# This is the actual coroutine we want to write
@coroutine
def func():
    yield func2()  # just like 'yield from func2()'
    yield 10       # That would be the return value in a coroutine

# This is the code -probably wrong, and missing many things, but that
# gives the idea- of the coroutine library
class Coroutine:
    def __init__(self, generator):
        self.generator = generator

    def run(self):
        for value in self.generator:
            if not isinstance(value, Coroutine):
                yield value
            else:
                # XXX: should also do all the proper checking as 'yield from' do
                for subvalue in value.run():
                    yield subvalue

def coroutine(func):
    def ret():
        return Coroutine(func())
    return ret

The proposal could make things faster, and also slightly more coherent. But what I don't really like with it, is that when you start to write coroutines, you have to use yield every time you call an other coroutine, and it make the code full of yield statement ; the proposal if adopted would make it even worst. Cheers, Guillaume Chereau -- http://charlie137.blogspot.com/ From python at rcn.com Tue Feb 17 04:14:31 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Feb 2009 19:14:31 -0800 Subject: [Python-ideas] set.add() return value References: <118486.47943.qm@web111413.mail.gq1.yahoo.com> <203230.34503.qm@web111401.mail.gq1.yahoo.com> <50697b2c0902121512h6749333tb2a5cf5ff1e54899@mail.gmail.com> <171757.71077.qm@web111402.mail.gq1.yahoo.com> <4994FCF4.3030301@pearwood.info> <779748.53402.qm@web111403.mail.gq1.yahoo.com> <50697b2c0902122300q27e82716y76a9a5b13e3f1766@mail.gmail.com> <4995F850.2040802@canterbury.ac.nz> <49961D59.1050208@pearwood.info> <476B74AB12994003AFE1A697CD13AB57@RaymondLaptop1> Message-ID: >> As I mentioned in another post, this API is counterintuitive for >> anyone who is used to other languages where operations like >> set.add() always returns self and lets them write code like: >> >> myset.add(x).add(y).add(z) > > Which language has that? I think there are several.
Smalltalk comes to mind: ''' addAll: aCollection Adds all the elements of 'aCollection' to the receiver, answer aCollection ''' Looks like Objective C takes the same approach: ''' Method: IndexedCollection @keyword{-insertObject:} newObject @keyword{atIndex:} (unsigned)index @deftypemethodx IndexedCollecting {} @keyword{-insertElement:} (elt)newElement @keyword{atIndex:} (unsigned)index returns self. ''' Also, Ruby uses the style of having mutating methods return self: ''' arr.fill( anObject ) -> arr arr.fill( anObject, start [, length ] ) -> arr arr.fill( anObject, aRange ) -> arr hsh.update( anOtherHash ) -> hsh Adds the contents of anOtherHash to hsh, overwriting entries with duplicate keys with those from anOtherHash. h1 = { "a" => 100, "b" => 200 } h2 = { "b" => 254, "c" => 300 } h1.update(h2) -> {"a"=>100, "b"=>254, "c"=>300} ''' I haven't checked other languages but am sure I've seen this style show up in a number of places. Raymond From guido at python.org Tue Feb 17 05:31:06 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Feb 2009 20:31:06 -0800 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: <8F04B80DFCFE433DA740128DB5E2AEDF@RaymondLaptop1> References: <4995F681.20702@canterbury.ac.nz> <8F04B80DFCFE433DA740128DB5E2AEDF@RaymondLaptop1> Message-ID: On Mon, Feb 16, 2009 at 6:51 PM, Raymond Hettinger wrote: > Looks like a language construct where only a handful > of python programmers will be able to correctly describe > what it does. That doesn't necessarily matter. It's true for quite a few Python constructs that many Python programmers use without knowing every little semantic detail. If you don't use .send(), .throw(), .close(), the semantics of "yield from" are very simple to explain and remember. All the rest is there to make advanced uses possible.
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Feb 17 05:40:47 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Feb 2009 20:40:47 -0800 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: <8e9327d40902161905j237a64dei468a811cf42539b@mail.gmail.com> References: <8e9327d40902161905j237a64dei468a811cf42539b@mail.gmail.com> Message-ID: On Mon, Feb 16, 2009 at 7:05 PM, Guillaume Chereau wrote: > But what I don't really like with it, is that when you start to write > coroutines, you have to use yield every time you call an other > coroutine, and it make the code full of yield statement ; the proposal > if adopted would make it even worst. I don't see how the proposal would make it *worse* (assuming that's what you meant, and "worst" was a typo). Coroutines make my head hurt more than metaclasses, but I see value in the proposal besides coroutines. Recursive generators are pretty compelling and the "for x in A: yield x" idiom gets tiresome to read and write. Personally I care more about having to mentally parse it and realize "oh, it's a recursive generator" than about the extra typing it requires (only seven characters more :-). The mental parsing is a relatively big burden: the whole first line gives no indication that it's a recursive iterator, because "for x in A:" could start a loop doing something quite different. Your code-reading brain has to find the correspondence between the loop control variable of the for-loop header and the variable yielded in the loop body, see that nothing else is going on, and *then* it can draw the conclusion about the recursive generator. Once learned, "yield from" is a mental shortcut that saves your code-reading brain time and effort, which in turn helps understanding the larger picture of which this construct is a part -- you don't have to push anything onto your mental stack to see what's going on with "yield from".
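[Editorial note: the "recursive generator" idiom Guido describes looks like this in pre-"yield from" code. List-flattening is an illustrative example, not taken from the thread.]

```python
# Flattening a nested list with the manual re-yield idiom.
def flatten(items):
    for item in items:
        if isinstance(item, list):
            for x in flatten(item):   # manually re-yield each element
                yield x
        else:
            yield item

# Under the proposal (eventually PEP 380, Python 3.3), the inner loop
# collapses to a single line that reads as delegation:
#     yield from flatten(item)

assert list(flatten([1, [2, [3, 4]], 5])) == [1, 2, 3, 4, 5]
```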
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Tue Feb 17 05:53:57 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Feb 2009 17:53:57 +1300 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: References: <4995F681.20702@canterbury.ac.nz> <0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1> <4999E02A.8050103@canterbury.ac.nz> Message-ID: <499A42E5.3010204@canterbury.ac.nz> Guido van Rossum wrote: > There better be a pretty darn good reason to do this. I think that making it easy to use generators as lightweight threads is a good enough reason. > I really don't > like overloading return this way -- normally returning from a > generator is equivalent to falling off the end and raises > StopIteration It still is. It's just that if you happen to return a value, it gets attached to the StopIteration for the use of anything that wants to care. It will make no difference at all to anything already existing. Also, if a generator that returns something gets called in a context that doesn't know about generator return values, the value is simply discarded, just as with an ordinary function call that ignores the return value. > I'm not sure I like this interpretation of .send() -- it looks > asymmetrical with the way .send() to non-generator iterators is > treated in other contexts, where it is an error. I wouldn't object to raising an exception in that case. Come to think of it, doing that would be more consistent with the idea of the caller talking directly to the subgenerator. > And that could in turn be a generator with another such slot, right? That's right. > Hopefully the testing for the presence of .throw, .send and .close > could be done once at the start of the yield-from and represented as a > set of flags. Yes. You could even cache bound methods for these if you wanted.
> I recommend that you produce a working implementation of this; who > knows what other issues you might run into Good idea. I'll see what I can come up with. -- Greg From guido at python.org Tue Feb 17 06:06:08 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Feb 2009 21:06:08 -0800 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: <499A42E5.3010204@canterbury.ac.nz> References: <4995F681.20702@canterbury.ac.nz> <0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1> <4999E02A.8050103@canterbury.ac.nz> <499A42E5.3010204@canterbury.ac.nz> Message-ID: On Mon, Feb 16, 2009 at 8:53 PM, Greg Ewing wrote: > Guido van Rossum wrote: > >> There better be a pretty darn good reason to do this. > > I think that making it easy to use generators as > lightweight threads is a good enough reason. I still expect that even with the new syntax this will be pretty cumbersome, and require the user to be aware of all sorts of oddities and restrictions. I think it may be better to leave this to libraries like Greenlets and systems like Stackless which manage to hide the mechanics much better. Also, the asymmetry between "yield expr" (which returns a value passed in by the caller using .send()) and "yield from expr" (which returns a value coming from the sub-generator) really bothers me. Finally, your PEP currently doesn't really do this use case justice; can you provide a more complete motivating example? I don't quite understand how I would write the function that is delegated to as "yield from g(x)" nor do I quite see what the caller of the outer generator should expect from successive next() or .send() calls. >> I really don't >> like overloading return this way -- normally returning from a >> generator is equivalent to falling off the end and raises >> StopIteration > > It still is. It's just that if you happen to return a > value, it gets attached to the StopIteration for the > use of anything that wants to care.
It will make no > difference at all to anything already existing. So, "return" is equivalent to "raise StopIteration" and "return " is equivalent to "raise StopIteration()"? I suppose I could live with that. > Also, if a generator that returns something gets called > in a context that doesn't know about generator return > values, the value is simply discarded, just as with > an ordinary function call that ignores the return > value. > >> I'm not sure I like this interpretation of .send() -- it looks >> asymmetrical with the way .send() to non-generator iterators is >> treated in other contexts, where it is an error. > > I wouldn't object to raising an exception in that case. > Come to think of it, doing that would me more consistent > with the idea of the caller talking directly to the > subgenerator. > >> And that could in turn be a generator with another such slot, right? > > That's right. > >> Hopefully the testing for the presence of .throw, .send and .close >> could be done once at the start of the yield-from and represented as a >> set of flags. > > Yes. You could even cache bound methods for these if you > wanted. > >> I recommend that you produce a working implementation of this; who >> knows what other issues you might run into > > Good idea. I'll see what I can come up with. Sounds good. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Tue Feb 17 06:16:27 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Feb 2009 18:16:27 +1300 Subject: [Python-ideas] set.add() return value In-Reply-To: References: Message-ID: <499A482B.2050203@canterbury.ac.nz> Guido van Rossum wrote: > There's also the nagging concern that once we do this for set > operations people might ask to do this for dict operations too, and > then what's next. I think there could be some theoretical justification to do it for sets at least. 
The pattern of "if something isn't already in some set, then add it and do some further processing" turns up fairly frequently in various algorithms. With dicts it's a bit different -- usually you're looking in the dict first to get something out, and if you don't find it, you then do something to manufacture a value and put it in. This is covered by setdefault() or whatever the modern replacement for it is. -- Greg From python at rcn.com Tue Feb 17 06:27:02 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Feb 2009 21:27:02 -0800 Subject: [Python-ideas] set.add() return value References: <499A482B.2050203@canterbury.ac.nz> Message-ID: <2A463F1306254841912360D75DEB0305@RaymondLaptop1> [Greg Ewing] > I think there could be some theoretical justification > to do it for sets at least. The pattern of "if something > isn't already in some set, then add it and do some further > processing" turns up fairly frequently in various algorithms. If we have to do this, then a separate method with a very clear name is better than mucking-up the signature for set.add() and set.discard(). Of course, we don't have to do this. It saves only one line. Raymond From greg.ewing at canterbury.ac.nz Tue Feb 17 06:37:01 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Feb 2009 18:37:01 +1300 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: <8F04B80DFCFE433DA740128DB5E2AEDF@RaymondLaptop1> References: <4995F681.20702@canterbury.ac.nz> <8F04B80DFCFE433DA740128DB5E2AEDF@RaymondLaptop1> Message-ID: <499A4CFD.7010806@canterbury.ac.nz> Raymond Hettinger wrote: > Looks like a language construct where only a handful > of python programmers will be able to correctly describe > what it does. The whole area of generators is one where I think only a minority of programmers will ever fully understand all the gory details. Those paragraphs are there for the purpose of providing a complete and rigorous specification of what's being proposed. 
For most people, almost all the essential information is summed up in this one sentence near the top: The effect is to run the iterator to exhaustion, during which time it behaves as though it were communicating directly with the caller of the generator containing the ``yield from`` expression. As I've said, I believe it's actually quite simple conceptually. It's just that it gets messy trying to explain it by way of expansion into currently existing Python code and concepts. > This seems like it is awkwardly trying to cater to two competing needs. > It recognized that the outer generator may have a legitimate need > to catch an exception and that the inner generator might want it too. > Unfortunately, only one can be caught and there is no way to have > both the inner and outer generator/iterator each do their part in > servicing an exception. I don't understand what you mean by that. If you were making an ordinary function call, you'd expect that the called function would get first try at catching any exception occurring while it's running, and if it doesn't, it propagates out to the calling function. Also it's not true that only one of them can catch the exception. The inner one might catch it, do some processing and then re-raise it. Or it might do something in a finally block. My intent is for all these things to work the same way when one generator delegates to another using yield-from.
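Nothing in a released Python implements these semantics yet, but the intended exception routing can be sketched today with a hand-written delegation loop (roughly the kind of loop 'yield from' would replace): the subgenerator gets first try at any exception, and whatever it doesn't handle propagates out through the delegating generator.

```python
def inner():
    try:
        yield "a"
        yield "b"
    except ValueError:
        # The subgenerator gets first try at the exception, then
        # re-raises, just like an ordinary called function would.
        raise

def outer():
    it = inner()
    try:
        value = next(it)
        while True:
            try:
                sent = yield value
            except BaseException as e:
                # Hand the exception to the subgenerator first.
                value = it.throw(e)
            else:
                value = it.send(sent)
    except StopIteration:
        return  # subgenerator finished normally

g = outer()
next(g)                 # outer yields "a" on inner's behalf
try:
    g.throw(ValueError)
except ValueError:
    pass                # inner re-raised, so it propagated out to us
```

This is only an approximation of the proposal (it ignores close() and subgenerators without throw/send), but it shows the "first try at catching" behaviour described above.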
-- Greg From greg.ewing at canterbury.ac.nz Tue Feb 17 06:44:29 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Feb 2009 18:44:29 +1300 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: <8e9327d40902161905j237a64dei468a811cf42539b@mail.gmail.com> References: <8e9327d40902161905j237a64dei468a811cf42539b@mail.gmail.com> Message-ID: <499A4EBD.5080003@canterbury.ac.nz> Guillaume Chereau wrote: > But what I don't really like with it, is that when you start to write > coroutines, you have to use yield every time you call another > coroutine, Yes, but that's unavoidable as long as you're faking things with generators instead of using real threads, unless some other construct is introduced that's tantamount to 'yield' by another name -- and then you have to remember to use that. > and it makes the code full of yield statements; the proposal > if adopted would make it even worse. I don't see how it would be any worse. Your code at first glance looks incomprehensible to me -- how am I supposed to know that the first 'yield' is a blocking operation while the second one is returning a value? It relies on obscure conventions implemented by some kind of wrapper that you have to learn about. -- Greg From greg.ewing at canterbury.ac.nz Tue Feb 17 07:31:42 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Feb 2009 19:31:42 +1300 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: References: <4995F681.20702@canterbury.ac.nz> <0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1> <4999E02A.8050103@canterbury.ac.nz> <499A42E5.3010204@canterbury.ac.nz> Message-ID: <499A59CE.8030509@canterbury.ac.nz> Guido van Rossum wrote: > I don't quite > understand how I would write the function that is delegated to as > "yield from g(x)" nor do I quite see what the caller of the outer > generator should expect from successive next() or .send() calls.
It should be able to expect whatever would happen if the body of the delegated-to generator were inlined into the delegating generator. That's the core idea behind all of this -- being able to take a chunk of code containing yields, abstract it out and put it in another function, without the outside world being any the wiser. We do this all the time with ordinary functions and don't ever question the utility of being able to do so. I'm at a bit of a loss to understand why people can't see the utility in being able to do the same thing with generator code. I take your point about needing a better generators-as-threads example, though, and I'll see if I can come up with something. > So, "return" is equivalent to "raise StopIteration" and "return <value>" is equivalent to "raise StopIteration(<value>)"? Yes. -- Greg From python at rcn.com Tue Feb 17 08:04:48 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Feb 2009 23:04:48 -0800 Subject: [Python-ideas] Revised revised revised PEP on yield-from References: <4995F681.20702@canterbury.ac.nz><0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1><4999E02A.8050103@canterbury.ac.nz><499A42E5.3010204@canterbury.ac.nz> <499A59CE.8030509@canterbury.ac.nz> Message-ID: <536DB665F131475CA5BCD33F642BE15D@RaymondLaptop1> [Greg Ewing] > It should be able to expect whatever would happen if the > body of the delegated-to generator were inlined into the > delegating generator. > > That's the core idea behind all of this -- being able to > take a chunk of code containing yields, abstract it out > and put it in another function, without the outside world > being any the wiser. That's a very nice synopsis. It should probably be right at the top of the PEP.
Raymond From charlie137 at gmail.com Tue Feb 17 08:46:12 2009 From: charlie137 at gmail.com (Guillaume Chereau) Date: Tue, 17 Feb 2009 15:46:12 +0800 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: <499A4EBD.5080003@canterbury.ac.nz> References: <8e9327d40902161905j237a64dei468a811cf42539b@mail.gmail.com> <499A4EBD.5080003@canterbury.ac.nz> Message-ID: <8e9327d40902162346o413f6a25see1253ff0962ba54@mail.gmail.com> On Tue, Feb 17, 2009 at 1:44 PM, Greg Ewing wrote: > Guillaume Chereau wrote: > >> But what I don't really like with it, is that when you start to write >> coroutines, you have to use yield every time you call another >> coroutine, > > Yes, but that's unavoidable as long as you're faking > things with generators instead of using real threads, > unless some other construct is introduced that's > tantamount to 'yield' by another name -- and then > you have to remember to use that. Yes, I thought about this idea, like adding an attribute to a function to tell the interpreter to magically yield its result, but that totally breaks the mental parsing of the code (not talking about possible implementation problems). > >> and it makes the code full of yield statements; the proposal >> if adopted would make it even worse. > > I don't see how it would be any worse. Your code at > first glance looks incomprehensible to me -- how am I > supposed to know that the first 'yield' is a blocking > operation while the second one is returning a value? > It relies on obscure conventions implemented by some > kind of wrapper that you have to learn about. I agree, and I don't like it either. It was just the easiest way I found to implement micro-threads using generators. The proposal would indeed make things more logical, especially if we can use 'return' in the generators.
The point I wanted to make was that we would then need to write "yield from" every time we call a coroutine from another one, which is probably a lot, and that made me unhappy about the syntax. In the context of a coroutine, 'yield from' means: "we start this other coroutine, and return to the current coroutine when it is done", and I was expecting the syntax to somehow express this idea. On the other hand, the other usage of "yield from" (to replace: "for x in a: yield x") is totally fine. I tried to think of some other keywords to suggest that would suit both usages, but couldn't find anything. Best regards, Guillaume From tanzer at swing.co.at Tue Feb 17 08:47:57 2009 From: tanzer at swing.co.at (Christian Tanzer) Date: Tue, 17 Feb 2009 07:47:57 -0000 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: Your message of "Mon, 16 Feb 2009 22:16:19 +0100" References: Message-ID: Georg Brandl wrote at Mon, 16 Feb 2009 22:16:19 +0100: > Guido van Rossum schrieb: > > On Mon, Feb 16, 2009 at 12:48 AM, Christian Tanzer wrote: > >> A fair amount of my use cases involve a non-literal format string > >> (i.e., passed in as argument, defined as module or class variable, or > >> even doc-strings used as format strings). I'd guess that most > >> non-literal format strings are used together with dictionaries. > >> > >> Unfortunately, it's hard to grep for this :-(, so I can't give you > >> hard numbers. > > > > It would be pretty simple to rig up 2to3 to report any string literals > > containing e.g. '%(...)s' that are not immediately followed by a % > > operator. > > The major problems I see are > > 1) __mod__ application with a single right-hand operand (a tuple makes it > a string formatting to 100%, at least without other types overloading %) > 2) format strings coming from external sources > > The first can't be helped easily. For the second, a helper function that > converts %s format strings to {0} format strings could be imagined.
A call > of the form > > fmtstr % (a, b) > > would then be converted to > > _mod2format(fmtstr).format(a, b) > > To fix 1), _mod2format could even return a wrapper that executes .__mod__ > on .format() if fmtstr is not a string. Please note that the right-hand operand can be a dictionary (more specifically, any object implementing `__getitem__()`). For objects implementing `__getitem__`, the stuff inside the parentheses of `%(...)s` can get pretty fancy. -- Christian Tanzer http://www.c-tanzer.at/ From steve at pearwood.info Tue Feb 17 08:36:39 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 17 Feb 2009 18:36:39 +1100 Subject: [Python-ideas] set.add() return value In-Reply-To: <2A463F1306254841912360D75DEB0305@RaymondLaptop1> References: <499A482B.2050203@canterbury.ac.nz> <2A463F1306254841912360D75DEB0305@RaymondLaptop1> Message-ID: <200902171836.39520.steve@pearwood.info> On Tue, 17 Feb 2009 04:27:02 pm Raymond Hettinger wrote: > [Greg Ewing] > > > I think there could be some theoretical justification > > to do it for sets at least. The pattern of "if something > > isn't already in some set, then add it and do some further > > processing" turns up fairly frequently in various algorithms. > > If we have to do this, then a separate method with > a very clear name is better than mucking-up the signature > for set.add() and set.discard(). > > Of course, we don't have to do this. It saves only one line. I'm with Raymond on this. I don't think it adds anything. It's not that saving one line is unimportant, sometimes saving one line is worth it. But there has to be a significant mental cost to that line to make it worth while, and I don't think that applies to any variation of:

if x in set:
    print "already there"
else:
    set.add(x)

versus:

if set.add(x): # or another method name if you prefer
    print "already there"
The first clearly and obviously only performs the add if x is not in the set; the second is unclear whether the add is attempted first or not, and an implementation could choose either.) Contrast Guido's description of why he likes "yield from A" compared to "for x in A: yield x": [quote] Your code-reading brain has to find the correspondence between the loop control variable of the for-loop header and the variable yielded in the loop body, see that nothing else is going on, and *then* it can draw the conclusion about the recursive generator. [end quote] As far as I can see, there is no similar significant mental shortcut added by giving set.add() a return value. -- Steven D'Aprano Operations Manager Cybersource Pty Ltd, ABN 13 053 904 082 Level 1, 130-132 Stawell St, Richmond VIC 3121 Tel: +61 3 9428 6922 Fax: +61 3 9428 6944 Web: http://www.cybersource.com.au From bmintern at gmail.com Tue Feb 17 09:22:08 2009 From: bmintern at gmail.com (Brandon Mintern) Date: Tue, 17 Feb 2009 03:22:08 -0500 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: References: Message-ID: <4c0fccce0902170022y47cb6beak9d41729619dd1d9d@mail.gmail.com> On Tue, Feb 17, 2009 at 2:47 AM, Christian Tanzer wrote: > Please note that the right-hand operand can be a dictionary > (more specifically, any object implementing `__getitem__()`) > > For objects implementing `__getitem__`, the stuff inside the > parentheses of `%(...)s` can get pretty fancy. Indeed it can. I had some functionality for templating C programs that relied on just this. The files would be mostly C code, but then %-formatting was used to specify certain chunks to be generated by Python code. The custom class I wrote implemented a __getitem__ class that broke down the given key into arguments to an indicated function, eval-ed Python code, and generated the desired C code. 
This was super-useful for things like spitting out static const arrays and specifying the array sizes in the header file without requiring duplicate human effort. An example would be the following snippet from a C-template, num_to_words.ctemplate:

%(array; char * ones; "zero one two three four five six seven eight nine".split())s

Taking this file, reading it in, and doing %-interpolation with my custom class would result in the following output:

num_to_words.h:

extern char * ones[10];

num_to_words.c:

#ifndef NUM_TO_WORDS_C
#define NUM_TO_WORDS_C
#include "num_to_words.h"

char * ones[10] = {
    "zero", "one", "two", "three", "four",
    "five", "six", "seven", "eight", "nine"
};

#endif

So there are some pretty crazy things possible with %-formatting and custom __getitem__ classes. As long as format can do similar things, though, I don't think there is a problem. Brandon From bmintern at gmail.com Tue Feb 17 09:37:06 2009 From: bmintern at gmail.com (Brandon Mintern) Date: Tue, 17 Feb 2009 03:37:06 -0500 Subject: [Python-ideas] set.add() return value In-Reply-To: <200902171836.39520.steve@pearwood.info> References: <499A482B.2050203@canterbury.ac.nz> <2A463F1306254841912360D75DEB0305@RaymondLaptop1> <200902171836.39520.steve@pearwood.info> Message-ID: <4c0fccce0902170037x375cb14dp6b50c2bbdd05cf91@mail.gmail.com> On Tue, Feb 17, 2009 at 2:36 AM, Steven D'Aprano wrote: > if x in set: > print "already there" > else: > set.add(x) > > versus: > > if set.add(x): # or another method name if you prefer > print "already there" These are truly different in terms of runtime performance, though. In the first example, let's examine both cases. If x isn't in the set, we hash x, perform a worst-case set lookup (because we have to scan the entire list for whatever bucket that x hashes to), hash x again, and then perform that same lookup again before adding x to the set. If x is in the set, we perform one hash and one lookup.
In the second example, we always hash and lookup x exactly once, adding it if necessary. This could turn out to be a non-trivial difference in runtime if the set is large enough or x is an object with a non-trivial hash. Sure the difference is a constant factor, but it's a significant one, somewhere between 1.25-2 depending on the percentage of "add"s that are unique items (if every item is unique, we're looking at a 2x speedup for set operations). But maybe that's just my premature optimization coming through, Brandon From greg.ewing at canterbury.ac.nz Tue Feb 17 09:45:33 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Feb 2009 21:45:33 +1300 Subject: [Python-ideas] Yield-from example: A parser Message-ID: <499A792D.9090701@canterbury.ac.nz> Here's an attempt at providing a fleshed-out use case for some of the more contentious parts of the yield-from proposal.

Example using send() and return-with-value to implement a parser
================================================================

I'm going to develop two versions of a simple XML parser. The first version will use current Python facilities to do it in "pull" mode, where the scanner is a generator and the parser is a set of recursive functions that pull tokens out of it as needed. The second version will turn it around so that it operates in "push" mode, with the scanner being an outer loop, and the parser being a generator that gets tokens pushed into it using "push". Using yield-from, it will be seen that the parser code is almost identical to the first version, the main difference being that some function calls are replaced by yield-from statements.

Version 1 - Pull-mode, recursive functions
------------------------------------------

First, let's write a scanner and set it to work on some test data.
import re

pat = re.compile(r"(\S+)|(<[^>]*>)")

def scanner(text):
    for m in pat.finditer(text):
        token = m.group(0)
        print "Feeding:", repr(token)
        yield token
    yield None # to signal EOF

text = "<foo> This is a <b> foo file </b> you know. </foo>"
token_stream = scanner(text)

I'm using a global variable 'token_stream' to hold the source of tokens here. It could just as well be passed around as a parameter to the parsing functions, but doing it this way will bring out the symmetry with the other version better later on.

Now let's write a function to parse a sequence of items, each of which is either a plain word or a compound element surrounded by opening and closing tags. We will return the result as a suitably-structured parse tree.

def parse_items(closing_tag = None):
    elems = []
    while 1:
        token = token_stream.next()
        if not token:
            break # EOF
        if is_opening_tag(token):
            elems.append(parse_elem(token))
        elif token == closing_tag:
            break
        else:
            elems.append(token)
    return elems

This makes use of a small utility function to see if a token is an opening tag:

def is_opening_tag(token):
    return token.startswith("<") and not token.startswith("</")

Next, a function to parse a single compound element, given its opening tag:

def parse_elem(opening_tag):
    name = opening_tag[1:-1]
    closing_tag = "</%s>" % name
    items = parse_items(closing_tag)
    return (name, items)

Now we can call the parser and print the result.

tree = parse_items()
print tree

This is all currently-valid Python, and when run it produces the following output.

Feeding: '<foo>'
Feeding: 'This'
Feeding: 'is'
Feeding: 'a'
Feeding: '<b>'
Feeding: 'foo'
Feeding: 'file'
Feeding: '</b>'
Feeding: 'you'
Feeding: 'know.'
Feeding: '</foo>'
[('foo', ['This', 'is', 'a', ('b', ['foo', 'file']), 'you', 'know.'])]

Version 2 - Push-mode, recursive generators
-------------------------------------------

For this version, we turn things around so that the scanner is on the outside.
The top-level code will now be like this:

def run():
    parser = parse_items()
    try:
        for m in pat.finditer(text):
            token = m.group(0)
            print "Feeding:", repr(token)
            parser.send(token)
        parser.send(None) # to signal EOF
    except StopIteration, e:
        tree = e.args[0]
        print tree

The parser is going to be a generator that returns the parse tree when it exits. To the outside, this manifests as a StopIteration exception with an argument, which we extract. (We're assuming that there are no syntax errors in the input, so the parser is ready to exit by the time the input is exhausted. A more robust implementation would take care of the case that it isn't.)

Now the only thing we need to do to our two parsing functions parse_items and parse_elem is to replace any calls to get a token with 'yield', and insert 'yield from' into each of the recursive calls. The changes are marked with arrows below.

def parse_elem(opening_tag):
    name = opening_tag[1:-1]
    closing_tag = "</%s>" % name
    items = yield from parse_items(closing_tag) # <---
    return (name, items)

def parse_items(closing_tag = None):
    elems = []
    while 1:
        token = yield # <---
        if not token:
            break # EOF
        if is_opening_tag(token):
            elems.append(yield from parse_elem(token)) # <---
        elif token == closing_tag:
            break
        else:
            elems.append(token)
    return elems

That's all -- everything else remains the same. I can't run this, because yield-from doesn't exist, but if I could, the output would be the same as before. I hope this helps to give an idea of what I'm talking about with generators-as-threads. The whole parser can be seen as a lightweight thread, implemented as a collection of mutually recursive functions. Whenever any of the functions needs to get a token, it yields. To return a value to the calling function, it uses a return statement just like any other function. I hope it is also clear why returning values via yield, or having 'yield from' return any of the yielded values, would be the wrong thing to do.
The send/yield channel and the return channel are being used for completely different purposes, and conflating them would be disastrous. -- Greg From greg.ewing at canterbury.ac.nz Tue Feb 17 09:57:35 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Feb 2009 21:57:35 +1300 Subject: [Python-ideas] set.add() return value In-Reply-To: <200902171836.39520.steve@pearwood.info> References: <499A482B.2050203@canterbury.ac.nz> <2A463F1306254841912360D75DEB0305@RaymondLaptop1> <200902171836.39520.steve@pearwood.info> Message-ID: <499A7BFF.9020204@canterbury.ac.nz> Steven D'Aprano wrote: > It's not > that saving one line is unimportant, sometimes saving one line is > worth it. But there has to be a significant mental cost to that line > to make it worth while There could be cases where the execution speed difference is important, especially if testing whether the element is in the set is expensive. Back when I was developing Plex, I found that the biggest bottleneck in building a scanner object was the NFA to DFA conversion. That algorithm consists almost entirely of testing whether things are in sets and adding them if they're not, and the set elements themselves are other sets, so they're not cheap to do membership tests on. If I'd had an atomic test-and-add operation for sets, I could imagine that part running about twice as fast as it did. 
-- Greg From greg.ewing at canterbury.ac.nz Tue Feb 17 10:04:37 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Feb 2009 22:04:37 +1300 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: <8e9327d40902162346o413f6a25see1253ff0962ba54@mail.gmail.com> References: <8e9327d40902161905j237a64dei468a811cf42539b@mail.gmail.com> <499A4EBD.5080003@canterbury.ac.nz> <8e9327d40902162346o413f6a25see1253ff0962ba54@mail.gmail.com> Message-ID: <499A7DA5.5070900@canterbury.ac.nz> Guillaume Chereau wrote: > In the context of a coroutine, 'yield from' means : "we start this > other coroutine, and return to the current coroutine when it is done", > and I was expecting the syntax to somehow express this idea. For what it's worth, I tend to feel the same way. I was originally going to call it 'call': y = call g(x) but Guido convinced me that 'call' is far too generic a word and doesn't convey any connection with generators at all. If anyone has any other suggestions, I'll gladly consider them. -- Greg From python at rcn.com Tue Feb 17 10:07:15 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 17 Feb 2009 01:07:15 -0800 Subject: [Python-ideas] set.add() return value References: <499A482B.2050203@canterbury.ac.nz><2A463F1306254841912360D75DEB0305@RaymondLaptop1><200902171836.39520.steve@pearwood.info> <499A7BFF.9020204@canterbury.ac.nz> Message-ID: <1AF71C53F36A442BB12A06EF2A0F956A@RaymondLaptop1> [Greg Ewing] > If I'd had an atomic test-and-add operation for sets, > I could imagine that part running about twice as fast > as it did. Imagining and timing are two different things. See Object/dictnotes.txt for the results of a month-long effort to study ways to improve hash table performance. One result was that two successive accesses for the same key in a large table did *not* double the time. Due to cache effects, the second access was nearly free. 
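The cache-effects claim can be probed directly with a micro-benchmark along these lines (a sketch only; no particular numbers are claimed, and results will vary by machine and Python version):

```python
import timeit

setup = "s = set(range(10 ** 6)); x = 999999"

# Membership test followed by add: two lookups of x.
check_then_add = timeit.timeit(
    "if x not in s: s.add(x)", setup=setup, number=100000)

# Unconditional add: one lookup of x.
add_only = timeit.timeit(
    "s.add(x)", setup=setup, number=100000)

# If the second access really is nearly free due to caching, the
# ratio check_then_add / add_only will be well under 2.
```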
Also, I think this performance discussion is over-focusing on the set lookup operation as if it were the dominant part of typical programs. Remember that every time there is dotted access, there is a hash table lookup. Just writing s.add(x) results in a lookup for "add" as well as the lookup/insertion of x. The performance argument is a red herring. Raymond P.S. The lookup time does nearly double if there is an expensive hash function. But then, it would make sense to cache the results of that hash just like we do with strings. From solipsis at pitrou.net Tue Feb 17 11:19:18 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 17 Feb 2009 10:19:18 +0000 (UTC) Subject: [Python-ideas] Yield-from example: A parser References: <499A792D.9090701@canterbury.ac.nz> Message-ID: Greg Ewing writes: > > I hope this helps to give an idea of what I'm talking about > with generators-as-threads. The whole parser can be seen as > a lightweight thread, implemented as a collection of mutually > recursive functions. Whenever any of the functions needs to > get a token, it yields. Perhaps that's a stupid question but what is the point of version 2 over version 1? If I sum it up:

- in version 1, the parsing functions pull the tokens using scanner.next() (*)
- in version 2, the parsing functions pull the tokens using "yield from"

They may be syntactically different, but semantically they look completely the same. (*) by the way, using a "for" loop would probably have felt more natural Regards Antoine. From sturla at molden.no Tue Feb 17 12:57:55 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 17 Feb 2009 12:57:55 +0100 Subject: [Python-ideas] Porting os.fork to Windows? Message-ID: <499AA643.5090208@molden.no> To my astonishment, I just found out this: The Windows-port of tcsh has an implementation of fork on top of the Windows API, and the license is BSD (unlike Cygwin's GPL'd fork). It would be interesting to try this with Python.
Here is the source: http://tcsh.sourcearchive.com/documentation/6.14.00-7/fork_8c-source.html http://tcsh.sourcearchive.com/documentation/6.14.00-7/forkdata_8h-source.html Sturla Molden From sturla at molden.no Tue Feb 17 13:31:43 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 17 Feb 2009 13:31:43 +0100 Subject: [Python-ideas] Porting os.fork to Windows? In-Reply-To: <499AA643.5090208@molden.no> References: <499AA643.5090208@molden.no> Message-ID: <499AAE2F.6010107@molden.no> On 2/17/2009 12:57 PM, Sturla Molden wrote: > http://tcsh.sourcearchive.com/documentation/6.14.00-7/fork_8c-source.html > http://tcsh.sourcearchive.com/documentation/6.14.00-7/forkdata_8h-source.html ftp://ftp.funet.fi/pub/unix/shells/tcsh/ From eric at trueblade.com Tue Feb 17 13:36:30 2009 From: eric at trueblade.com (Eric Smith) Date: Tue, 17 Feb 2009 07:36:30 -0500 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: <4999D184.3080105@trueblade.com> References: <20090212141040.0c89e0fc@o> <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> <4999D184.3080105@trueblade.com> Message-ID: <499AAF4E.3020506@trueblade.com> Eric Smith wrote: > I have this mostly implemented in ''.format(), despite my earlier > statement that I was done playing with it after the sample that's > attached to http://bugs.python.org/issue5237. The one issue that's causing me problems is what to do with format specifiers that themselves need expanding.

>>> '{0:>{1}}'.format('foo', 5)
'  foo'

Should:

>>> '{:{}}'.format('foo', 5)

produce the same output, or should it be an error? I think it should probably work, but it complicates the implementation sufficiently that I probably won't be able to finish it up for a couple of weeks. I know this is a not-so-useful corner case, but the implementation has to do something here. I could easily throw an exception, but I don't see how that's more desirable than just making it work.
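As it turned out, later releases of CPython (from around 2.7/3.1 onward) made this work: auto-numbering extends into nested replacement fields, so the implicit and explicit spellings mean the same thing:

```python
# Explicit numbering, as in the example above:
assert '{0:>{1}}'.format('foo', 5) == '  foo'

# Auto-numbering recursing into the format spec -- accepted by
# later CPython releases and equivalent to the explicit form:
assert '{:>{}}'.format('foo', 5) == '  foo'

# Without an alignment flag, strings are left-justified by default:
assert '{:{}}'.format('foo', 5) == 'foo  '
```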
From qrczak at knm.org.pl Tue Feb 17 14:43:09 2009 From: qrczak at knm.org.pl (Marcin 'Qrczak' Kowalczyk) Date: Tue, 17 Feb 2009 14:43:09 +0100 Subject: [Python-ideas] set.add() return value In-Reply-To: <1AF71C53F36A442BB12A06EF2A0F956A@RaymondLaptop1> References: <499A482B.2050203@canterbury.ac.nz> <2A463F1306254841912360D75DEB0305@RaymondLaptop1> <200902171836.39520.steve@pearwood.info> <499A7BFF.9020204@canterbury.ac.nz> <1AF71C53F36A442BB12A06EF2A0F956A@RaymondLaptop1> Message-ID: <3f4107910902170543g708fb477we627414da8e1f12c@mail.gmail.com> What about:

old_len = len(s)
s.add(elem)
if len(s) != old_len:
    ...

-- Marcin Kowalczyk qrczak at knm.org.pl http://qrnik.knm.org.pl/~qrczak/ From tjreedy at udel.edu Tue Feb 17 16:27:07 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 17 Feb 2009 10:27:07 -0500 Subject: [Python-ideas] set.add() return value In-Reply-To: <3f4107910902170543g708fb477we627414da8e1f12c@mail.gmail.com> References: <499A482B.2050203@canterbury.ac.nz> <2A463F1306254841912360D75DEB0305@RaymondLaptop1> <200902171836.39520.steve@pearwood.info> <499A7BFF.9020204@canterbury.ac.nz> <1AF71C53F36A442BB12A06EF2A0F956A@RaymondLaptop1> Message-ID: Marcin 'Qrczak' Kowalczyk wrote:

> What about:
>
> old_len = len(s)
> s.add(elem)
> if len(s) != old_len:
>     ...

Nice! From tjreedy at udel.edu Tue Feb 17 16:28:10 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 17 Feb 2009 10:28:10 -0500 Subject: [Python-ideas] set.add() return value In-Reply-To: <499A7BFF.9020204@canterbury.ac.nz> References: <499A482B.2050203@canterbury.ac.nz> <2A463F1306254841912360D75DEB0305@RaymondLaptop1> <200902171836.39520.steve@pearwood.info> <499A7BFF.9020204@canterbury.ac.nz> Message-ID: Greg Ewing wrote:

> Steven D'Aprano wrote:
>> It's not that saving one line is unimportant, sometimes saving one
>> line is worth it.
>> But there has to be a significant mental cost to
>> that line to make it worth while
>
> There could be cases where the execution speed difference
> is important, especially if testing whether the element is
> in the set is expensive.
>
> Back when I was developing Plex, I found that the biggest
> bottleneck in building a scanner object was the NFA to
> DFA conversion. That algorithm consists almost entirely
> of testing whether things are in sets and adding them if
> they're not, and the set elements themselves are other
> sets, so they're not cheap to do membership tests on.

But why not just unconditionally add them? Marcin just posted a nice solution for proceeding differently if really necessary.

> If I'd had an atomic test-and-add operation for sets,
> I could imagine that part running about twice as fast
> as it did.

From tjreedy at udel.edu Tue Feb 17 16:39:37 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 17 Feb 2009 10:39:37 -0500 Subject: [Python-ideas] set.add() return value In-Reply-To: <200902171836.39520.steve@pearwood.info> References: <499A482B.2050203@canterbury.ac.nz> <2A463F1306254841912360D75DEB0305@RaymondLaptop1> <200902171836.39520.steve@pearwood.info> Message-ID: Steven D'Aprano wrote: > I'm with Raymond on this. Me too. Some languages return self from mutating methods (allowing chaining). Python's builtins and stdlib consistently (afaik) return None (preventing chaining). Returning True or False depending on whether a mutation was done seems like an innovation. It is an intriguing idea, but doing so just occasionally would add mental burden. Why not change list.extend()? (False if the extension is empty.) or list.sort() (False if the list is already sorted.) List.append(), of course, would always return True. Of course, in Python, having *any* consistent return other than None is difficult because much mutation is done thru special methods wrapped in syntax.
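The existing convention described above is easy to demonstrate:

```python
lst = [3, 1, 2]
assert lst.sort() is None      # mutates in place, returns None
assert lst == [1, 2, 3]

s = set()
assert s.add(1) is None        # same convention for set.add()
assert s.discard(99) is None   # and for set.discard()

d = {}
assert d.update(a=1) is None   # and for dict.update()
```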
tjr From george.sakkis at gmail.com Tue Feb 17 17:28:47 2009 From: george.sakkis at gmail.com (George Sakkis) Date: Tue, 17 Feb 2009 11:28:47 -0500 Subject: [Python-ideas] set.add() return value In-Reply-To: References: <499A482B.2050203@canterbury.ac.nz> <2A463F1306254841912360D75DEB0305@RaymondLaptop1> <200902171836.39520.steve@pearwood.info> Message-ID: <91ad5bf80902170828k77da6a8hc0c325f12366da58@mail.gmail.com> On Tue, Feb 17, 2009 at 10:39 AM, Terry Reedy wrote: > Returning True or False depending on whether a mutation was done seems like an innovation. Not really, all Java containers and STL sets and maps do it. > It is an intriguing idea, but > doing so just occasionally would add mental burden. Why now change > list.extend()? (False if the extension is empty.) or list.sort() (False if > the list is already sorted.) List.append(), of course, would always return > True. The mental burden is not bigger than, say, learning that you append() to a list but you add() to a set. Consistency for consistency's sake is IMHO misguided here; there's only so much you can do by treating all containers uniformly. Unlike sequences, the motivation for sets (and dicts for that matter) comes from real use cases so we shouldn't dismiss this request by an all-or-nothing argument. George From solipsis at pitrou.net Tue Feb 17 18:02:38 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 17 Feb 2009 17:02:38 +0000 (UTC) Subject: [Python-ideas] Porting os.fork to Windows? References: <499AA643.5090208@molden.no> Message-ID: Sturla Molden writes: > > To my astonishment, I just found out this: > > The Windows-port of tcsh has an implementation of fork on top of the > Windows API, and the license is BSD (unlike Cygwin's GPL'd fork). Could you post an entry on http://bugs.python.org, so that it doesn't get lost? Thanks Antoine. 
From janssen at parc.com Tue Feb 17 19:02:20 2009 From: janssen at parc.com (Bill Janssen) Date: Tue, 17 Feb 2009 10:02:20 PST Subject: [Python-ideas] set.add() return value In-Reply-To: References: Message-ID: <15657.1234893740@parc.com> Guido van Rossum wrote: > This example also has a bug, which neither of the two posters > responding caught (unless Bill J was being *very* subtle). Sorry, I should have been more brutal. To my mind, the fact that Steve got it wrong was a nice illustration of how much extra mental effort needed to be expended because the feature Ralf suggests wasn't available. You have to write a test, the test has to include an inversion, you have to introduce a new variable to hold the result of the test. That's something like 3 function points, instead of one. Functionality "built in" into the standard library is much better for everyone than functionality a programmer has to generate himself, mainly because of the extra review it gets. I think that's the real payoff for "batteries included". 
Bill

From santagada at gmail.com  Tue Feb 17 19:06:11 2009
From: santagada at gmail.com (Leonardo Santagada)
Date: Tue, 17 Feb 2009 15:06:11 -0300
Subject: [Python-ideas] set.add() return value
In-Reply-To: 
References: <118486.47943.qm@web111413.mail.gq1.yahoo.com>
	<203230.34503.qm@web111401.mail.gq1.yahoo.com>
	<50697b2c0902121512h6749333tb2a5cf5ff1e54899@mail.gmail.com>
	<171757.71077.qm@web111402.mail.gq1.yahoo.com>
	<4994FCF4.3030301@pearwood.info>
	<779748.53402.qm@web111403.mail.gq1.yahoo.com>
	<50697b2c0902122300q27e82716y76a9a5b13e3f1766@mail.gmail.com>
	<4995F850.2040802@canterbury.ac.nz>
	<49961D59.1050208@pearwood.info>
	<476B74AB12994003AFE1A697CD13AB57@RaymondLaptop1>
Message-ID: <858A97AE-30D2-4E37-A2B8-F62A122CFDA4@gmail.com>

On Feb 17, 2009, at 12:14 AM, Raymond Hettinger wrote:

>>> As I mentioned in another post, this API is counterintuitive for
>>> anyone who is used to other languages where operations like
>>> set.add() always returns self and lets them write code like:
>>>
>>> myset.add(x).add(y).add(z)
>>
>> Which language has that?
>
> I think there are several.
>
> Smalltalk comes to mind:
> '''
> addAll: aCollection
>     Adds all the elements of 'aCollection' to the receiver,
>     answer aCollection
> '''
>
> Looks like Objective C takes the same approach:
> '''
> Method: IndexedCollection @keyword{-insertObject:} newObject
> @keyword{atIndex:} (unsigned)index
> @deftypemethodx IndexedCollecting {} @keyword{-insertElement:}
> (elt)newElement @keyword{atIndex:} (unsigned)index
> returns self.
> '''
>
> Also, Ruby uses the style of having mutating methods return self:
>
> '''
> arr.fill( anObject ) -> arr
> arr.fill( anObject, start [, length ] ) -> arr
> arr.fill( anObject, aRange ) -> arr
>
> hsh.update( anOtherHash ) -> hsh
>
> Adds the contents of anOtherHash to hsh, overwriting entries with
> duplicate keys with those from anOtherHash.
>
> h1 = { "a" => 100, "b" => 200 }
> h2 = { "b" => 254, "c" => 300 }
> h1.update(h2) -> 
{"a"=>100, "b"=>254, "c"=>300}
> '''
>
> I haven't checked other languages but am sure I've seen this style
> show-up in a number of places.

jQuery also does this. Can I ask why Python doesn't do it? Seems a more
interesting change than returning a boolean.

--
Leonardo Santagada
santagada at gmail.com

From guido at python.org  Tue Feb 17 19:32:17 2009
From: guido at python.org (Guido van Rossum)
Date: Tue, 17 Feb 2009 10:32:17 -0800
Subject: [Python-ideas] String formatting and namedtuple
In-Reply-To: <4c0fccce0902170022y47cb6beak9d41729619dd1d9d@mail.gmail.com>
References: <4c0fccce0902170022y47cb6beak9d41729619dd1d9d@mail.gmail.com>
Message-ID: 

You can do similar things with .format(), but inside {} the : and !
characters always end the key.

On Tue, Feb 17, 2009 at 12:22 AM, Brandon Mintern wrote:
> On Tue, Feb 17, 2009 at 2:47 AM, Christian Tanzer wrote:
>> Please note that the right-hand operand can be a dictionary
>> (more specifically, any object implementing `__getitem__()`)
>>
>> For objects implementing `__getitem__`, the stuff inside the
>> parentheses of `%(...)s` can get pretty fancy.
>
> Indeed it can. I had some functionality for templating C programs that
> relied on just this. The files would be mostly C code, but then
> %-formatting was used to specify certain chunks to be generated by
> Python code. The custom class I wrote implemented a __getitem__ method
> that broke down the given key into arguments to an indicated function,
> eval-ed Python code, and generated the desired C code. This was
> super-useful for things like spitting out static const arrays and
> specifying the array sizes in the header file without requiring
> duplicate human effort.
>
> An example would be the following snippet from a C-template,
> num_to_words.ctemplate:
>
> %(array; char * ones; "zero one two three four five six seven eight
> nine".split())s
>
> Taking this file, reading it in, and doing %-interpolation with my
> custom class would result in the following output:
>
> num_to_words.h:
>
> extern char * ones[10];
>
>
> num_to_words.c:
>
> #ifndef NUM_TO_WORDS_C
> #define NUM_TO_WORDS_C
>
> #include "num_to_words.h"
>
> char * ones[10] = {
>     "zero",
>     "one",
>     "two",
>     "three",
>     "four",
>     "five",
>     "six",
>     "seven",
>     "eight",
>     "nine"
> };
>
> #endif
>
>
> So there are some pretty crazy things possible with %-formatting and
> custom __getitem__ classes. As long as format can do similar things,
> though, I don't think there is a problem.
>
>
> Brandon
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bruce at leapyear.org  Tue Feb 17 20:23:00 2009
From: bruce at leapyear.org (Bruce Leban)
Date: Tue, 17 Feb 2009 11:23:00 -0800
Subject: [Python-ideas] Revised revised revised PEP on yield-from
In-Reply-To: <499A7DA5.5070900@canterbury.ac.nz>
References: <8e9327d40902161905j237a64dei468a811cf42539b@mail.gmail.com>
	<499A4EBD.5080003@canterbury.ac.nz>
	<8e9327d40902162346o413f6a25see1253ff0962ba54@mail.gmail.com>
	<499A7DA5.5070900@canterbury.ac.nz>
Message-ID: 

Reading the last few messages, what came to mind was 'yield to'. In
coroutine contexts it might be more appropriate to think of yielding
execution 'to' the other generator. But in other contexts it seems more
appropriate to think of yielding values 'from' the other generator. Of
course it does both of these things. And 'yield from' makes more sense
to me.

It reminds me of a Honeymooners skit with Ralph and Ed arguing about the
doors between two rooms labeled 'in' and 'out'.
Ralph: "You are supposed to go in the door marked 'In.'" Ed: "I wasn't going in that room. I was coming out of this room." Ralph: "You were not going out of this room. You were going in that room." Ed: "How could I go into that room without coming out of this room?" --- Bruce On Tue, Feb 17, 2009 at 1:04 AM, Greg Ewing wrote: > Guillaume Chereau wrote: > > In the context of a coroutine, 'yield from' means : "we start this >> other coroutine, and return to the current coroutine when it is done", >> and I was expecting the syntax to somehow express this idea. >> > > For what it's worth, I tend to feel the same way. > I was originally going to call it 'call': > > y = call g(x) > > but Guido convinced me that 'call' is far too generic > a word and doesn't convey any connection with generators > at all. > > If anyone has any other suggestions, I'll gladly > consider them. > > -- > Greg > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Feb 17 20:47:13 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 17 Feb 2009 11:47:13 -0800 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: <499A59CE.8030509@canterbury.ac.nz> References: <4995F681.20702@canterbury.ac.nz> <0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1> <4999E02A.8050103@canterbury.ac.nz> <499A42E5.3010204@canterbury.ac.nz> <499A59CE.8030509@canterbury.ac.nz> Message-ID: On Mon, Feb 16, 2009 at 10:31 PM, Greg Ewing wrote: > Guido van Rossum wrote: > >> I don't quite >> understand how I would write the function that is delegated to as >> "yield from g(x)" nor do I quite see what the caller of the outer >> generator should expect from successive next() or .send() calls. 
> It should be able to expect whatever would happen if the
> body of the delegated-to generator were inlined into the
> delegating generator.

I understand that when I'm thinking of generators (as you saw in the
tree traversal example I posted). My question was in the context of
lightweight threads and your proposal for the value returned by "yield
from". I believe I now understand what you are trying to do, but the
way to think about it in this case seems very different than when
you're refactoring generators.

IIUC there will be some kind of "scheduler" that manages a number of
lightweight threads, each represented by a suspended stack of
generators, and a number of blocking resources like sockets or
mutexes. The scheduler knows what resource each thread is waiting for
(could also be ready to run or sleeping until a specific time) and
when the resource is ready it resumes the generator passing along
whatever value is required using .send(). E.g. on input, it could read
the data from the socket, or it could just pass a flag indicating that
the resource is ready and let the generator make the actual recv()
call. When a generator wants to access a resource, it uses "yield"
(not "yield from"!) to send a description of the resource needed to
the scheduler. When a generator wants to call another function that
might block, the other function must be written as a generator too,
and it is called using "yield from". The other function uses "yield"
to access blocking resources, and "return" to return a value to its
"caller" (really the generator that used "yield from").

I believe that Twisted has a similar scheme that doesn't have the
benefit of arbitrarily nested generators; I recall Phillip Eby talking
about this too. I've never used lightweight threads myself -- I'm a
bit "old school" and would typically either use real OS threads, like
Java, or event-driven programming possibly with callbacks, like
Tcl/Tk.
But I can see the utility of this approach and reluctantly admit that the proposed semantics for the "yield from" return value are just right for this approach. I do think that it is still requires the user to be quite aware of what is going on behind the scenes, for example to remember when to use "yield from" (for functions that have been written to cooperate with the scheduler) and when to use regular calls (for functions that cannot block) -- messing this up is quite painful, e.g. forgetting to use "yield from" will probably produce a pretty confusing error message. Also, it would seem you cannot write functions running in lightweight threads that are also "ordinary" generators, since yield is reserved for "calling" the scheduler. I have a little example in my head that I might as well show here: suppose we have a file-like object with a readline() method that calls a read() method which in turn calls a fillbuf() function. If I want to read a line from the file, I might write (assuming I am executing inside a generator that is really used for light-weight threading, so that "yield" communicates with the scheduler): line = yield from f.readline() The readline() method could naively be implemented as: def readline(self): line = [] while True: c = self.read(1) if not c: break line.append(c) if c == '\n': break return ''.join(line) The read() method could be: def read(self, n): if len(self.buf) < n: yield from self.fillbuf(n - len(self.buf)) result, self.buf = self.buf[:n], self.buf[n:] return result I'm leaving fillbuf() to the imagination of the reader; its implementation depends on the protocol with the scheduler to actually read data. Or there might be a lower-level unbuffered read() generator that encapsulates the scheduler protocol. 
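A runnable sketch of the scheme described above: fillbuf() here is a hypothetical in-memory stand-in for the real scheduler protocol, and run() a minimal single-resource scheduler that answers every ('read', n) request from a string. The "yield from" syntax behaves as proposed on Python 3.3+, where PEP 380 eventually landed:

```python
class AsyncFile:
    # Toy file object: read() and readline() are generators that
    # "block" by yielding a ('read', n) request to the scheduler.
    def __init__(self):
        self.buf = ''

    def fillbuf(self, n):
        # Hypothetical stand-in: ask the scheduler for up to n bytes.
        self.buf += yield ('read', n)

    def read(self, n):
        if len(self.buf) < n:
            yield from self.fillbuf(n - len(self.buf))
        result, self.buf = self.buf[:n], self.buf[n:]
        return result

    def readline(self):
        line = []
        while True:
            c = yield from self.read(1)
            if not c:
                break
            line.append(c)
            if c == '\n':
                break
        return ''.join(line)


def run(task, source):
    # Minimal scheduler: satisfy every ('read', n) request from `source`.
    value = None
    try:
        while True:
            kind, n = task.send(value)
            value, source = source[:n], source[n:]
    except StopIteration as e:
        # The generator's "return" value arrives via StopIteration.
        return e.value


f = AsyncFile()
assert run(f.readline(), 'hello\nworld\n') == 'hello\n'
```

Note how the readline/read nesting needs no special cooperation from run(): the requests from fillbuf() tunnel through both "yield from" frames, and the return values tunnel back.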
I don't think I could add a generator to the file-like class that
would call readline() until the file is exhausted though, at least not
easily; code that is processing lines will have to use a while-loop
like this:

while True:
    line = yield from f.readline()
    if not line: break
    ...process line...

Trying to turn this into a generator like I can do with an ordinary
file-like object doesn't work:

def __iter__(self):
    while True:
        line = yield from self.readline()
        if not line: break
        yield line  ## ???????

This is because lightweight threads use yield to communicate with the
scheduler, and they cannot easily also use it to yield successive
values to their caller. I could imagine some kind of protocol where
yield always returns a tuple whose first value is a string or token
indicating what kind of yield it is, e.g. "yield" when it is returning
the next value from the readline-loop, and "scheduler" when it is
wanting to talk to the scheduler, but the caller would have to look
for this and it would become much uglier than just writing out the
while-loop.

> That's the core idea behind all of this -- being able to
> take a chunk of code containing yields, abstract it out
> and put it in another function, without the outside world
> being any the wiser.
>
> We do this all the time with ordinary functions and
> don't ever question the utility of being able to do so.
> I'm at a bit of a loss to understand why people can't
> see the utility in being able to do the same thing
> with generator code.

I do, I do. It's the complication with the return value that I am
still questioning, since that goes beyond simply refactoring
generators.

> I take your point about needing a better generators-as-threads
> example, though, and I'll see if I can come up with something.

Right.

>> So, "return" is equivalent to "raise StopIteration" and "return
>> <expr>" is equivalent to "raise StopIteration(<expr>)"?
>
> Yes.

I apologize for even asking that bit, it was very clear in the PEP.
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Feb 17 20:56:17 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 17 Feb 2009 11:56:17 -0800 Subject: [Python-ideas] set.add() return value In-Reply-To: <4c0fccce0902170037x375cb14dp6b50c2bbdd05cf91@mail.gmail.com> References: <499A482B.2050203@canterbury.ac.nz> <2A463F1306254841912360D75DEB0305@RaymondLaptop1> <200902171836.39520.steve@pearwood.info> <4c0fccce0902170037x375cb14dp6b50c2bbdd05cf91@mail.gmail.com> Message-ID: On Tue, Feb 17, 2009 at 12:37 AM, Brandon Mintern wrote: > On Tue, Feb 17, 2009 at 2:36 AM, Steven D'Aprano wrote: >> if x in set: >> print "already there" >> else: >> set.add(x) >> >> versus: >> >> if set.add(x): # or another method name if you prefer >> print "already there" > > These are truly different in terms of runtime performance, though. In > the first example, let's examine both cases. If x isn't in the set, we > hash x, perform a worst-case set lookup (because we have to scan the > entire list for whatever bucket that x hashes to), hash x again, and > then perform that same lookup again before adding x to the set. If x > is in the set, we perform one hash and one lookup. > > In the second example, we always hash and lookup x exactly once, > adding it if necessary. This could turn out to be a non-trivial > difference in runtime if the set is large enough or x is an object > with a non-trivial hash. Sure the difference is a constant factor, but > it's a significant one, somewhere between 1.25-2 depending on the > percentage of "add"s that are unique items (if every item is unique, > we're looking at a 2x speedup for set operations). > > But maybe that's just my premature optimization coming through, Another way to optimize this without changing the API would be to cache the key, hash and result of the last lookup attempted. Then the second lookup would be pretty much free. 
It would usually only take a single pointer comparison to decide
whether the cache is valid. Oh, and an INCREF/DECREF pair somewhere,
though I suspect there's already such a pair and we'd simply be
delaying the DECREF.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Tue Feb 17 21:41:18 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 18 Feb 2009 09:41:18 +1300
Subject: [Python-ideas] set.add() return value
In-Reply-To: <1AF71C53F36A442BB12A06EF2A0F956A@RaymondLaptop1>
References: <499A482B.2050203@canterbury.ac.nz>
	<2A463F1306254841912360D75DEB0305@RaymondLaptop1>
	<200902171836.39520.steve@pearwood.info>
	<499A7BFF.9020204@canterbury.ac.nz>
	<1AF71C53F36A442BB12A06EF2A0F956A@RaymondLaptop1>
Message-ID: <499B20EE.7060402@canterbury.ac.nz>

Raymond Hettinger wrote:

> One result was that two successive accesses for the
> same key in a large table did *not* double the time.

What kind of keys, though? In my case the keys aren't
strings, they're tuples. Even after you've found the
right slot in the hash table, you still need to compare
the tuples.

> Remember, that every time there is
> dotted access, there is a hash table lookup.

For a string, which is probably interned. As far as
I'm aware, there is no notion of interning for tuples.

--
Greg

From greg.ewing at canterbury.ac.nz  Tue Feb 17 21:48:37 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 18 Feb 2009 09:48:37 +1300
Subject: [Python-ideas] Yield-from example: A parser
In-Reply-To: 
References: <499A792D.9090701@canterbury.ac.nz>
Message-ID: <499B22A5.8020506@canterbury.ac.nz>

Antoine Pitrou wrote:

> They may be syntactically different, but semantically they look
> completely the same.

The control structure is inverted. Sometimes it's more
convenient to push instead of pull. If everything could
always be done using pulls, there would be no need for
send() in the first place.
> (*) by the way, using a "for" loop would probably have felt more natural

Not sure what you mean by that. There's no single place
in the parser that loops until the input is exhausted.

--
Greg

From python at rcn.com  Tue Feb 17 22:03:40 2009
From: python at rcn.com (Raymond Hettinger)
Date: Tue, 17 Feb 2009 13:03:40 -0800
Subject: [Python-ideas] set.add() return value
References: <499A482B.2050203@canterbury.ac.nz>
	<2A463F1306254841912360D75DEB0305@RaymondLaptop1>
	<200902171836.39520.steve@pearwood.info>
	<499A7BFF.9020204@canterbury.ac.nz>
	<1AF71C53F36A442BB12A06EF2A0F956A@RaymondLaptop1>
	<499B20EE.7060402@canterbury.ac.nz>
Message-ID: <7D87C9B3313E40EE9877772353E1206E@RaymondLaptop1>

>> One result was that two successive accesses for the
>> same key in a large table did *not* double the time.
>
> What kind of keys, though? In my case the keys aren't
> strings, they're tuples. Even after you've found the
> right slot in the hash table, you still need to compare
> the tuples.

Even if the hash has to be recomputed, processor caching makes the
second pass very cheap compared to the first.

I spent a month trying to optimize dicts and learned that this is a
dead-end. It is certainly *not* worth mucking-up the API for sets.
Even if it did improve your one app, that would be atypical. Most apps
that test and add will do something inside the branch that consumes far
more time than the contains test.

Raymond

From greg.ewing at canterbury.ac.nz  Tue Feb 17 22:18:19 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 18 Feb 2009 10:18:19 +1300
Subject: [Python-ideas] set.add() return value
In-Reply-To: 
References: <499A482B.2050203@canterbury.ac.nz>
	<2A463F1306254841912360D75DEB0305@RaymondLaptop1>
	<200902171836.39520.steve@pearwood.info>
	<499A7BFF.9020204@canterbury.ac.nz>
Message-ID: <499B299B.9030600@canterbury.ac.nz>

Terry Reedy wrote:
> Greg Ewing wrote [concerning NFA to DFA]:
> But why not just unconditionally add them?
Because you also need to know whether it was already in the set, so you can terminate the recursion. > Marcin just posted a nice solution for proceeding differently if really > necessary. Yes, that's a neat trick! It does obfuscate the code a little bit, though. -- Greg From lists at cheimes.de Tue Feb 17 23:28:15 2009 From: lists at cheimes.de (Christian Heimes) Date: Tue, 17 Feb 2009 23:28:15 +0100 Subject: [Python-ideas] Porting os.fork to Windows? In-Reply-To: <499AA643.5090208@molden.no> References: <499AA643.5090208@molden.no> Message-ID: Sturla Molden schrieb: > > To my astonishment, I just found out this: > > The Windows-port of tcsh has an implementation of fork on top of the > Windows API, and the license is BSD (unlike Cygwin's GPL'd fork). The code implements fork() on top of the NT API, not the Win32 API. The NT Kernel supports all necessary bits and pieces for fork(). However the features aren't exposed in the Win32 API layer. According to several internet sites only the POSIX layer for NT exposes fork(). fork() is an efficient and fast op on Unix systems like Linux and BSD, because it uses a technique called copy on write (cow). I couldn't find any information if NT uses cow, too. Christian From python at rcn.com Tue Feb 17 23:28:52 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 17 Feb 2009 14:28:52 -0800 Subject: [Python-ideas] set.add() return value References: <499A482B.2050203@canterbury.ac.nz><2A463F1306254841912360D75DEB0305@RaymondLaptop1><200902171836.39520.steve@pearwood.info><499A7BFF.9020204@canterbury.ac.nz><1AF71C53F36A442BB12A06EF2A0F956A@RaymondLaptop1><499B20EE.7060402@canterbury.ac.nz> <7D87C9B3313E40EE9877772353E1206E@RaymondLaptop1> Message-ID: <8967B14AC0F04ED89CFAA4D4A0E4D3FA@RaymondLaptop1> The main argument against set.add() having a return value is that someone reading the code has to remember or guess the meaning of the return value for s.add(x). 
Possible interpretations:

* Always None            # This is what we have now throughout the language

* Always Self            # What is done in Ruby, Objective C, and Smalltalk
                         # to support method chaining.  We also do this in
                         # some of our own APIs like Tkinter.

* True if x existed

* True if x was added    # Opposite meaning from the previous interpretation

* True if x was mutated  # Opposite from if-existed and same as was-added

Each of those interpretations is reasonable, so we should recognize
that "explicit is better than implicit" and "in the face of ambiguity,
refuse the temptation to guess".

The situation with set.discard() is similar but the last three cases
are no longer opposite (if-existed is the same as was-discarded and
was-mutated).

I looked for examples in real-world code and found that the test/set
pattern is not as uniform as posited.  For example,
collections.namedtuple() has a typical use case (special handling for
duplicate values) that is not helped by the proposal:

    seen_names = set()
    for name in field_names:
        if name.startswith('_') and not rename:
            raise ValueError('Field names cannot start with an underscore: %r' % name)
        if name in seen_names:
            raise ValueError('Encountered duplicate field name: %r' % name)
        seen_names.add(name)

The performance argument is specious.  Greg's expected twofold speed-up
is based on speculation, not on objective timings of an actual test/set
implementation.  To achieve a twofold speed-up, all of the following
would have to be true:

* Every s.add() is True.  Otherwise, there is no duplicate step to be
  saved.

* There are zero hardware cache effects on the table search and the hash
  value computation.  But in real life, the cost of doing something
  twice can be twentyfold less the second time.  According to Intel, the
  cost of a memory access is a half-cycle but if there is a cache miss,
  it is as expensive as a floating point divide.

* The inner-loop does no other work.  This is unusual.  Typically, the
  if-seen path does some useful work.
Hence the test/set operation does not dominate.

I spent a couple of man-months writing and optimizing the set module.
If the test/set pairing had proven viable, I would have included a
separate method for it.  My research on hash tables (documented in
Objects/dictnotes.txt) showed that real-world performance benefits were
elusive.  The hardware cache has already done much of our work for us.
We should take that freebie.

Take it from mr-i-want-to-optimize-everything, this proposal won't pay
off in terms of meaningful speedups in real code.

Currently, when I teach sets, I take pleasure in the near-zero learning
curve and absence of special cases.  Set code becomes a prime example of
"readability counts" and "simple is better than complex."  More
importantly, I think there is value in API simplicity for lists, sets,
and dicts, our basic tools.  The ABCs for sets currently reflect that
simplicity and it would be sad if that started to get lost in order to
save one line here or there.

Summary:
  -1 on having set.add() get a new return value
  -0 on having a new, explicitly named method for an atomic test/add

okay-i-will-shut-up-now-ly yours,

Raymond

From rhamph at gmail.com  Tue Feb 17 23:31:43 2009
From: rhamph at gmail.com (Adam Olsen)
Date: Tue, 17 Feb 2009 15:31:43 -0700
Subject: [Python-ideas] Revised revised revised PEP on yield-from
In-Reply-To: 
References: <4995F681.20702@canterbury.ac.nz>
	<0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1>
	<4999E02A.8050103@canterbury.ac.nz>
	<499A42E5.3010204@canterbury.ac.nz>
	<499A59CE.8030509@canterbury.ac.nz>
Message-ID: 

On Tue, Feb 17, 2009 at 12:47 PM, Guido van Rossum wrote:
> My question was in the context of lightweight threads and your
> proposal for the value returned by "yield from". I believe I now
> understand what you are trying to do, but the way to think about it in
> this case seems very different than when you're refactoring
> generators.
IIUC there will be some kind of "scheduler" that manages a > number of lightweight threads, each represented by a suspended stack > of generators, and a number of blocking resources like sockets or > mutexes. The scheduler knows what resource each thread is waiting for > (could also be ready to run or sleeping until a specific time) and > when the resource is ready it resumes the generator passing along > whatever value is required using .send(). E.g. on input, it could read > the data from the socket, or it could just pass a flag indicating that > the resource is ready and let the generator make the actual recv() > call. When a generator wants to access a resource, it uses "yield" > (not "yield from"!) to send a description of the resource needed to > the scheduler. When a generator wants to call another function that > might block, the other function must be written as a generator too, > and it is called using "yield from". The other function uses "yield" > to access blocking resources, and "return" to return a value to its > "caller" (really the generator that used "yield from"). If a scheduler is used it can treat a chained function as just another resource, either because it has a decorator or simply by default. I can't see any need for new syntax. Overloading return has more merit. The typical approach today would be "yield Return(val)" or "raise Return(val)". However, I'm quite bothered by the risk of silently swallowing the return argument when not using a scheduler. 
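The pre-"yield from" idiom Adam alludes to — `raise Return(val)` — can be sketched with a minimal driver (all names hypothetical):

```python
class Return(Exception):
    """Signal a coroutine's result to whatever is driving it."""
    def __init__(self, value):
        self.value = value


def child():
    yield 'working'    # ordinary yields go to the scheduler
    raise Return(42)   # "return" a value to whoever drove us


def run(gen):
    # Minimal driver: exhaust the generator, capturing its Return value.
    try:
        for _ in gen:
            pass
    except Return as e:
        return e.value
    return None


assert run(child()) == 42
```

This also illustrates the failure mode Adam worries about, in reverse: if nothing catches it, a Return escapes noisily as an ordinary exception, whereas a value attached to a plain `return` could be silently swallowed by a caller that isn't expecting it.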
--
Adam Olsen, aka Rhamphoryncus

From steve at pearwood.info  Wed Feb 18 00:01:22 2009
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 18 Feb 2009 10:01:22 +1100
Subject: [Python-ideas] set.add() return value
In-Reply-To: <15657.1234893740@parc.com>
References: <15657.1234893740@parc.com>
Message-ID: <499B41C2.7020702@pearwood.info>

Bill Janssen wrote:
> Guido van Rossum wrote:
>
>> This example also has a bug, which neither of the two posters
>> responding caught (unless Bill J was being *very* subtle).
>
> Sorry, I should have been more brutal. To my mind, the fact that Steve
> got it wrong was a nice illustration of how much extra mental effort
> needed to be expended because the feature Ralf suggests wasn't
> available.

I'm not excusing my sloppiness, but I think that my mistake is an
argument against the proposal that add (or in Greg's case,
test_and_add) should return True if the object was *not* added (because
it was already there). In other words, despite Greg's suggested name:

was_it_there = myset.test_and_add(42)

I was confused by my intuition that the return flag should mean the
object was added, and then made the mental translation:

test_and_add() returns True => 42 was added to the set

became:

42 in set returns True => 42 is to be added to the set

which is of course the opposite of the proposal's intention, but an
easy mistake to make in the brief time I spent writing the email.

I believe that some people will expect to use it like this:

if set.test_and_add(42):
    print "42 was not added because it was already there."

and others like this:

if set.test_and_add(42):
    print "42 was added because it wasn't there."

and yet others (I expect the majority, including me) will flip from one
to the other and consequently find that 50% of the time they get it
wrong. I think that test_and_add is naturally two operations, not one,
and forcing it to be one operation will just lead to confusion.
--
Steven

From bruce at leapyear.org  Wed Feb 18 00:17:12 2009
From: bruce at leapyear.org (Bruce Leban)
Date: Tue, 17 Feb 2009 15:17:12 -0800
Subject: [Python-ideas] set.add() return value
In-Reply-To: <499B41C2.7020702@pearwood.info>
References: <15657.1234893740@parc.com>
	<499B41C2.7020702@pearwood.info>
Message-ID: 

On Tue, Feb 17, 2009 at 3:01 PM, Steven D'Aprano wrote:
>
> I think that test_and_add is naturally two operations, not one, and
> forcing it to be one operation will just lead to confusion.

There is one scenario where testing membership and adding a new member
*is* a single operation: multiple threads. That's a good reason to *not*
support a test/change operation in regular sets as it may lead people to
think (falsely) that it provides some kind of concurrency protection.

--- Bruce
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From solipsis at pitrou.net  Wed Feb 18 00:18:09 2009
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 17 Feb 2009 23:18:09 +0000 (UTC)
Subject: [Python-ideas] Yield-from example: A parser
References: <499A792D.9090701@canterbury.ac.nz>
	<499B22A5.8020506@canterbury.ac.nz>
Message-ID: 

Greg Ewing writes:

> If everything could always be done using pulls, there
> would be no need for send() in the first place.

Sorry if I'm missing something, but send() allows a bidirectional
exchange of values; it doesn't seem to matter whether you pull or you
push.

> Not sure what you mean by that. There's no single place
> in the parser that loops until the input is exhausted.

How about:

def scanner(text):
    for m in pat.finditer(text):
        token = m.group(0)
        print "Feeding:", repr(token)
        yield token
    yield None  # to signal EOF

and:

def parse_items(closing_tag = None):
    elems = []
    while 1:
        token = token_stream.next()
        if not token:
            break  # EOF
        [etc.]

It looks like parse_items pulls from token_stream until exhaustion.
From jimjjewett at gmail.com Wed Feb 18 01:38:16 2009 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 17 Feb 2009 19:38:16 -0500 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: <499AAF4E.3020506@trueblade.com> References: <20090212141040.0c89e0fc@o> <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> <4999D184.3080105@trueblade.com> <499AAF4E.3020506@trueblade.com> Message-ID: On 2/17/09, Eric Smith wrote: > The one issue that's causing me problems is what to do with format > specifiers that themselves need expanding. > >>> '{0:>{1}}'.format('foo', 5) > ' foo' > Should: > >>> '{:{}}'.format('foo', 5) > produce the same output, or should it be an error? I think it should > probably work, but it complicates the implementation sufficiently that I > probably won't be able to finish it up for a couple of weeks. > I know this is a not-so-useful corner case, but the implementation has > to do something here. I could easily throw an exception, but I don't see > how that's more desirable than just making it work. Then go ahead and throw an exception; it can always be made legal later. In 2.4, the original decorator patch applied to classes as well as functions. The restriction to only functions was made because the use case for classes wasn't as clear-cut, and it is much easier (from a policy standpoint) to add functionality than to take it away. (The restriction was removed in 2.6.) Those extra .format extensions would *probably* be useful, but there is little harm in waiting another month (or even another two releases) before adding them. 
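For reference, the corner case under discussion did end up working: auto-numbering of fields, including inside a nested format specifier, shipped later (Python 2.7 and 3.1):

```python
# Explicit numbering: argument 1 supplies the width for argument 0.
assert '{0:>{1}}'.format('foo', 5) == '  foo'

# Auto-numbering inside the spec was ultimately made legal too,
# consuming arguments left to right.
assert '{:>{}}'.format('foo', 5) == '  foo'
```

So the "make it work" side of the discussion prevailed over raising an exception.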
-jJ

From dangyogi at gmail.com  Wed Feb 18 02:29:00 2009
From: dangyogi at gmail.com (Bruce Frederiksen)
Date: Tue, 17 Feb 2009 20:29:00 -0500
Subject: [Python-ideas] Yield-from example: A parser
In-Reply-To: 
References: <499A792D.9090701@canterbury.ac.nz> <499B22A5.8020506@canterbury.ac.nz>
Message-ID: <499B645C.8030609@gmail.com>

Antoine Pitrou wrote:
> How about:
>
> def scanner(text):
>     for m in pat.finditer(text):
>         token = m.group(0)
>         print "Feeding:", repr(token)
>         yield token
>     yield None # to signal EOF
>
> and:
>
> def parse_items(closing_tag = None):
>     elems = []
>     while 1:
>         token = token_stream.next()
>         if not token:
>             break # EOF
>         [etc.]

Or just:

def scanner(text):
    for m in pat.finditer(text):
        token = m.group(0)
        print "Feeding:", repr(token)
        yield token

and:

def parse_items(closing_tag = None):
    elems = []
    for token in token_stream:
        [etc.]

-bruce frederiksen

From lie.1296 at gmail.com  Wed Feb 18 03:20:07 2009
From: lie.1296 at gmail.com (Lie Ryan)
Date: Wed, 18 Feb 2009 02:20:07 +0000 (UTC)
Subject: [Python-ideas] String formatting and namedtuple
References: <4c0fccce0902170022y47cb6beak9d41729619dd1d9d@mail.gmail.com>
Message-ID: 

On Tue, 17 Feb 2009 10:32:17 -0800, Guido van Rossum wrote:

> You can do similar things with .format(), but inside {} the : and !
> characters always end the key.

Why not keep something like str.old_format(formatcode, tuple_or_object)
in py3k for backward compatibility purposes, then completely remove it in
Python 4.0? Add a note that .old_format is obsolete and will be removed,
and that code should use the newer formatters.

That way the 2to3 tool would become simpler (just convert most things to
old_format), while for the most common use cases (e.g. 'old%sliteral' %
item) it can automatically be converted to 'new{}literal'.format(item) or
$-substitution.
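The mechanical conversion Lie sketches is straightforward for the simple cases; here are the three equivalent spellings side by side, using string.Template for the $-substitution he mentions:

```python
from string import Template

item = 42

old = 'value: %s!' % item            # %-style formatting
new = 'value: {}!'.format(item)      # str.format
dollar = Template('value: $item!').substitute(item=item)  # $-substitution

print(old, new, dollar)  # value: 42! value: 42! value: 42!
```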
From charlie137 at gmail.com  Wed Feb 18 03:51:59 2009
From: charlie137 at gmail.com (Guillaume Chereau)
Date: Wed, 18 Feb 2009 10:51:59 +0800
Subject: [Python-ideas] Revised revised revised PEP on yield-from
In-Reply-To: 
References: <4995F681.20702@canterbury.ac.nz>
	<0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1>
	<4999E02A.8050103@canterbury.ac.nz> <499A42E5.3010204@canterbury.ac.nz>
	<499A59CE.8030509@canterbury.ac.nz>
Message-ID: <8e9327d40902171851u52ddd973wdb796d072ec99026@mail.gmail.com>

On Wed, Feb 18, 2009 at 3:47 AM, Guido van Rossum wrote:
> Trying to turn this into a generator like I can do with an ordinary
> file-like object doesn't work:
>
> def __iter__(self):
>     while True:
>         line = yield from self.readline()
>         if not line: break
>         yield line  ## ???????
>
> This is because lightweight threads use yield to communicate with the
> scheduler, and they cannot easily also use it to yield successive
> values to their caller. I could imagine some kind of protocol where
> yield always returns a tuple whose first value is a string or token
> indicating what kind of yield it is, e.g. "yield" when it is returning
> the next value from the readline-loop, and "scheduler" when it is
> wanting to talk to the scheduler, but the caller would have to look
> for this and it would become much uglier than just writing out the
> while-loop.

Yes, that was the problem I had when I worked with this kind of thing. My
solution was to enforce that yielding another lightweight thread means
that we start it and wait for the result, while yielding anything else is
a return to the caller. (I am not very satisfied with this, and that is
why I follow this thread with attention, looking for a better way to do
it.)

On a side note, my library didn't use any scheduler; it simply turned the
generator into a function that can be called with two callback arguments,
starting this function with the send and throw methods of the calling
thread as arguments.
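Guillaume's rule, where yielding another lightweight thread means "start it and wait for its result" and yielding anything else returns a value to the caller, can be sketched as a small trampoline. The Return class and run driver below are illustrative inventions, not his tasklet library:

```python
import types

class Return(object):
    """Marks the value a tasklet hands back to whoever started it."""
    def __init__(self, value):
        self.value = value

def run(task, sink):
    """Drive one tasklet: append its ordinary outputs to sink and
    return its Return value (or None if it just runs off the end)."""
    result = None
    while True:
        try:
            yielded = task.send(result)
        except StopIteration:
            return None
        result = None
        if isinstance(yielded, Return):
            task.close()
            return yielded.value
        elif isinstance(yielded, types.GeneratorType):
            # A sub-tasklet: run it and send its result back in.
            result = run(yielded, sink)
        else:
            sink.append(yielded)  # ordinary output to the caller

def readline():
    yield Return("hello\n")     # pretend we waited for I/O here

def reader():
    line = yield readline()     # yield a tasklet: start it, wait for result
    yield line.upper()          # yield a plain value: output to the caller
    yield Return(None)

out = []
run(reader(), out)
print(out)  # ['HELLO\n']
```

This makes the ambiguity Guido describes concrete: the trampoline has to inspect every yielded value to decide whether it is scheduler traffic or output.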
I believe that twisted is using a similar mechanism, but I can't tell for sure for I never had a deep look at it. If some people are interested in the code, you can have a look at it here [0], there are a few use examples at the bottom of the file. [0] http://git.openmoko.org/?p=tichy.git;a=blob;f=tichy/tasklet.py; -- http://charlie137.blogspot.com/ From greg.ewing at canterbury.ac.nz Wed Feb 18 03:58:19 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 18 Feb 2009 15:58:19 +1300 Subject: [Python-ideas] set.add() return value In-Reply-To: References: <15657.1234893740@parc.com> <499B41C2.7020702@pearwood.info> Message-ID: <499B794B.6050308@canterbury.ac.nz> Bruce Leban wrote: > That's a good reason to *not* support a test/change operation in regular > sets as it may lead people to think (falsely) that it provide some kind > of concurrency protection. Well, perhaps it should provide concurrency protection. If it's implemented in C with the GIL held, it probably wouldn't be hard to make that guarantee. -- Greg From greg.ewing at canterbury.ac.nz Wed Feb 18 04:07:09 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 18 Feb 2009 16:07:09 +1300 Subject: [Python-ideas] Yield-from example: A parser In-Reply-To: References: <499A792D.9090701@canterbury.ac.nz> <499B22A5.8020506@canterbury.ac.nz> Message-ID: <499B7B5D.3090506@canterbury.ac.nz> Antoine Pitrou wrote: > def parse_items(closing_tag = None): > elems = [] > while 1: > token = token_stream.next() > if not token: > break # EOF > [etc.] > > It looks like parse_items pulls from token_stream until exhaustion. Only the outermost call does that. The recursive calls are made with a closing_tag value, and parse_items stops as soon as it reaches that. Writing a for-loop would give the erroneous impression that we were looping to exhaustion, when most of the time we're not. 
Also, even in the outer call, it stops when it gets a token of None, which is half a loop sooner than the scanner becomes exhausted. If a for-loop were used, it would always be exited via a break. Thirdly, if I had used a for-loop there, I would have to have turned it into a while loop for the push version, and that would have obscured the symmetry between the two versions. -- Greg From guido at python.org Wed Feb 18 04:12:05 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 17 Feb 2009 19:12:05 -0800 Subject: [Python-ideas] set.add() return value In-Reply-To: <499B794B.6050308@canterbury.ac.nz> References: <15657.1234893740@parc.com> <499B41C2.7020702@pearwood.info> <499B794B.6050308@canterbury.ac.nz> Message-ID: On Tue, Feb 17, 2009 at 6:58 PM, Greg Ewing wrote: > Bruce Leban wrote: >> That's a good reason to *not* support a test/change operation in regular >> sets as it may lead people to think (falsely) that it provide some kind of >> concurrency protection. > > Well, perhaps it should provide concurrency protection. > If it's implemented in C with the GIL held, it probably > wouldn't be hard to make that guarantee. That may depend on whether the hash or equality test is implemented in Python. Also, I've heard of lock-free thread-safe dict implementations in Java where it may not be so easy to decide whether you actually inserted something or not -- and it may or may not be there when you look for it again. All in all, I am now siding with Raymond -- let's leave well enough alone. Hopefully we can now retire this thread, I don't think there's much more that can be said that hasn't been said already. 
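For completeness, the consenting-adults way to get an atomic test-and-add is to make the locking explicit in a wrapper rather than build a guarantee into set.add(); a minimal sketch (AtomicSet and add_unique are invented names for illustration, not a stdlib API):

```python
import threading

class AtomicSet:
    """A set with an atomic test-and-add (illustrative only)."""
    def __init__(self):
        self._items = set()
        self._lock = threading.Lock()

    def add_unique(self, item):
        """Add item; return True if it was not already present."""
        with self._lock:
            if item in self._items:
                return False
            self._items.add(item)
            return True

s = AtomicSet()
print(s.add_unique('x'))  # True  (first insertion)
print(s.add_unique('x'))  # False (already present)
```

The lock also sidesteps Guido's point above: even if the membership test calls Python-level __hash__ or __eq__, the whole test-and-add still happens under one explicit lock.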
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Wed Feb 18 04:29:20 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 18 Feb 2009 16:29:20 +1300 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: References: <4995F681.20702@canterbury.ac.nz> <0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1> <4999E02A.8050103@canterbury.ac.nz> <499A42E5.3010204@canterbury.ac.nz> <499A59CE.8030509@canterbury.ac.nz> Message-ID: <499B8090.80109@canterbury.ac.nz> Guido van Rossum wrote: > Also, it would seem you cannot write > functions running in lightweight threads that are also "ordinary" > generators, since yield is reserved for "calling" the scheduler. Yes, that's a problem. I don't have a good answer for that at the moment. BTW, I have another idea for an example (a thread scheduler and an example using it to deal with sockets asynchronously). Is anyone still interested, or have you all seen enough already? -- Greg From jnoller at gmail.com Wed Feb 18 04:38:07 2009 From: jnoller at gmail.com (Jesse Noller) Date: Tue, 17 Feb 2009 22:38:07 -0500 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: <499B8090.80109@canterbury.ac.nz> References: <4995F681.20702@canterbury.ac.nz> <0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1> <4999E02A.8050103@canterbury.ac.nz> <499A42E5.3010204@canterbury.ac.nz> <499A59CE.8030509@canterbury.ac.nz> <499B8090.80109@canterbury.ac.nz> Message-ID: <4222a8490902171938o490b5c6ew680810145b651c@mail.gmail.com> On Tue, Feb 17, 2009 at 10:29 PM, Greg Ewing wrote: > Guido van Rossum wrote: >> >> Also, it would seem you cannot write >> functions running in lightweight threads that are also "ordinary" >> generators, since yield is reserved for "calling" the scheduler. > > Yes, that's a problem. I don't have a good answer > for that at the moment. 
> > BTW, I have another idea for an example (a thread > scheduler and an example using it to deal with > sockets asynchronously). Is anyone still interested, > or have you all seen enough already? > > -- > Greg Oooh, oooh I am! I am! (still interested that is). jesse From guido at python.org Wed Feb 18 05:22:51 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 17 Feb 2009 20:22:51 -0800 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: <4222a8490902171938o490b5c6ew680810145b651c@mail.gmail.com> References: <4995F681.20702@canterbury.ac.nz> <4999E02A.8050103@canterbury.ac.nz> <499A42E5.3010204@canterbury.ac.nz> <499A59CE.8030509@canterbury.ac.nz> <499B8090.80109@canterbury.ac.nz> <4222a8490902171938o490b5c6ew680810145b651c@mail.gmail.com> Message-ID: On Tue, Feb 17, 2009 at 7:38 PM, Jesse Noller wrote: > On Tue, Feb 17, 2009 at 10:29 PM, Greg Ewing > wrote: >> Guido van Rossum wrote: >>> >>> Also, it would seem you cannot write >>> functions running in lightweight threads that are also "ordinary" >>> generators, since yield is reserved for "calling" the scheduler. >> >> Yes, that's a problem. I don't have a good answer >> for that at the moment. >> >> BTW, I have another idea for an example (a thread >> scheduler and an example using it to deal with >> sockets asynchronously). Is anyone still interested, >> or have you all seen enough already? >> >> -- >> Greg > > Oooh, oooh I am! I am! (still interested that is). I think it would be a better example than the parser example you gave before. Somehow the parser example is not very convincing, perhaps because it feels a bit unnatural to see a parser as a bunch of threads or coroutines -- the conventional way to write (which you presented as a starting point) works just fine. 
-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From sturla at molden.no  Wed Feb 18 11:10:27 2009
From: sturla at molden.no (Sturla Molden)
Date: Wed, 18 Feb 2009 11:10:27 +0100
Subject: [Python-ideas] Porting os.fork to Windows?
In-Reply-To: 
References: <499AA643.5090208@molden.no>
Message-ID: <499BDE93.4050205@molden.no>

On 2/17/2009 6:02 PM, Antoine Pitrou wrote:
> Could you post an entry on http://bugs.python.org, so that it doesn't get lost?

It's not a bug. If I register it as a bug, a lot of people might be
annoyed.

I've looked more carefully at the tcsh fork now. It requires a special
version of malloc, as one must know the top and bottom of the heap. I
don't think msvcr71 (or whatever version) reports that. Second, I am not
sure how well it handles DLLs and memory mappings. But Cygwin's fork
does, so we could get inspiration from there.

S.M.

From sturla at molden.no  Wed Feb 18 11:11:37 2009
From: sturla at molden.no (Sturla Molden)
Date: Wed, 18 Feb 2009 11:11:37 +0100
Subject: [Python-ideas] Porting os.fork to Windows?
In-Reply-To: 
References: <499AA643.5090208@molden.no>
Message-ID: <499BDED9.20703@molden.no>

On 2/17/2009 11:28 PM, Christian Heimes wrote:
> The code implements fork() on top of the NT API, not the Win32 API.

The tcsh code implements fork on top of the Win32 API, like Cygwin.

> fork() is an efficient and fast op on Unix systems like Linux and BSD,
> because it uses a technique called copy on write (cow). I couldn't
> find any information if NT uses cow, too.

NT's fork (in the SUA subsystem on Vista) is copy-on-write optimized. It
uses ZwCreateProcess in ntdll.dll to produce a copy-on-write clone of a
process. But that is beside the point. The fork() in tcsh does not do
copy-on-write optimization, nor does Cygwin's fork(). There is nothing
that mandates that fork() must be implemented with copy-on-write. The
raison d'etre for pthreads was the non-cow fork() implementation in
Solaris. Linux originally had a non-cow fork too.
Albeit slower, this version of fork is still useful.

S.M.

From gagsl-py2 at yahoo.com.ar  Wed Feb 18 11:31:38 2009
From: gagsl-py2 at yahoo.com.ar (Gabriel Genellina)
Date: Wed, 18 Feb 2009 08:31:38 -0200
Subject: [Python-ideas] Porting os.fork to Windows?
References: <499AA643.5090208@molden.no> <499BDE93.4050205@molden.no>
Message-ID: 

En Wed, 18 Feb 2009 08:10:27 -0200, Sturla Molden escribió:
> On 2/17/2009 6:02 PM, Antoine Pitrou wrote:
>> Could you post an entry on http://bugs.python.org, so that it doesn't
>> get lost?
> It's not a bug. If I register it as a bug, a lot of people might be
> annoyed.

The tracker isn't used for bugs only - one of the available categories is
"feature request".

-- 
Gabriel Genellina

From solipsis at pitrou.net  Wed Feb 18 13:58:27 2009
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 18 Feb 2009 12:58:27 +0000 (UTC)
Subject: [Python-ideas] set.add() return value
References: <499A482B.2050203@canterbury.ac.nz>
	<2A463F1306254841912360D75DEB0305@RaymondLaptop1>
	<200902171836.39520.steve@pearwood.info>
	<499A7BFF.9020204@canterbury.ac.nz>
	<1AF71C53F36A442BB12A06EF2A0F956A@RaymondLaptop1>
	<499B20EE.7060402@canterbury.ac.nz>
	<7D87C9B3313E40EE9877772353E1206E@RaymondLaptop1>
Message-ID: 

Raymond Hettinger writes:
>
> Even if it did improve your one app, that would be
> atypical. Most apps that test and add will do
> something inside the branch that consumes far more
> time than the contains test.

Sorry, that argument could be turned into an argument against *any*
optimization. For example, "there's no need to optimize building dict
literals, since most apps will do something with the dict that consumes
far more time than building the dict."

The assumption being that only optimizations which benefit a large class
of applications should be attempted, which is bogus (are large classes of
applications bottlenecked by dict literal creation? yet we have
specialized opcodes for dict literal creation, and so on for lots of
other things).
From guido at python.org Wed Feb 18 16:28:35 2009 From: guido at python.org (Guido van Rossum) Date: Wed, 18 Feb 2009 07:28:35 -0800 Subject: [Python-ideas] set.add() return value In-Reply-To: References: <499A482B.2050203@canterbury.ac.nz> <2A463F1306254841912360D75DEB0305@RaymondLaptop1> <200902171836.39520.steve@pearwood.info> <499A7BFF.9020204@canterbury.ac.nz> <1AF71C53F36A442BB12A06EF2A0F956A@RaymondLaptop1> <499B20EE.7060402@canterbury.ac.nz> <7D87C9B3313E40EE9877772353E1206E@RaymondLaptop1> Message-ID: On Wed, Feb 18, 2009 at 4:58 AM, Antoine Pitrou wrote: > Raymond Hettinger writes: >> >> Even if it did improve you one app, that would be >> atypical. Most apps that test and add will do >> something inside the branch that consumes far more >> time than the contains test. > > Sorry, that argument could be turned into an argument against *any* optimization. Well, the long form would be something like the benefits for a small class of apps vs. the effort in implementing it plus the cost for other apps. While the implementation cost of this feature is small (which makes it attractive to some), I think that the cost of developers guessing the returned value wrong is prohibitive. > For example, "there's no need to optimize building dict literals, since most > apps will do something with the dict that consumes far more time than building > the dict". > > The assumption being that only optimizations which benefit a large class of > applications should be attempted, which is bogus (are large classes of > applications bottlenecked by dict literal creation? yet we have specialized > opcodes for dict literal creation, and so on for lots of other things). Here you make a logic error. The presence of an opcode for anything in Python does not mean that it is considered to be worth optimizing -- it just means it is a primitive operation in the language, with syntax associated with it. 
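Guido's distinction is easy to see with the dis module: a dict display compiles to its own BUILD_MAP opcode simply because it is a primitive with dedicated syntax, not because dict creation was judged worth optimizing (the exact bytecode varies across CPython versions):

```python
import dis

# Disassemble an empty dict display; BUILD_MAP appears because the
# syntax is a language primitive, independent of any optimization claim.
ops = [ins.opname for ins in dis.get_instructions(compile("{}", "<demo>", "eval"))]
print(ops)  # something like ['BUILD_MAP', 'RETURN_VALUE']
```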
-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From dangyogi at gmail.com  Wed Feb 18 17:45:33 2009
From: dangyogi at gmail.com (Bruce Frederiksen)
Date: Wed, 18 Feb 2009 11:45:33 -0500
Subject: [Python-ideas] Yield-from example: A parser
In-Reply-To: <499B22A5.8020506@canterbury.ac.nz>
References: <499A792D.9090701@canterbury.ac.nz> <499B22A5.8020506@canterbury.ac.nz>
Message-ID: <499C3B2D.2060600@gmail.com>

Greg Ewing wrote:
> Antoine Pitrou wrote:
>> (*) by the way, using a "for" loop would probably have felt more natural
>
> Not sure what you mean by that. There's no single place
> in the parser that loops until the input is exhausted.

I guess I'm not the only one whose intuitive feel for "for" loops is that
they run to exhaustion. (Yes, it feels right to use for loops to search a
list for something and break when it's found, but even then it feels like
the program should be done with the iterable when the for loop
terminates.) So it's interesting that the for statement is instinctively
avoided in this situation when it doesn't need to be (as it's currently
defined).

But this could also be seen as a counter-example to my proposal for a
new-style for statement (if it had been written with a for statement :-).
-bruce frederiksen From dangyogi at gmail.com Wed Feb 18 18:38:21 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Wed, 18 Feb 2009 12:38:21 -0500 Subject: [Python-ideas] PEP on yield-from: throw example In-Reply-To: <499931F4.8050601@canterbury.ac.nz> References: <4995F681.20702@canterbury.ac.nz> <0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1> <49964508.5020207@pearwood.info> <49986FF6.3060707@gmail.com> <91ad5bf80902151825w61215fd6g6aac11949e9df71d@mail.gmail.com> <499931F4.8050601@canterbury.ac.nz> Message-ID: <499C478D.4010902@gmail.com> Greg Ewing wrote: > George Sakkis wrote: > >> For throw() however, I strongly disagree that >> a raise statement in a loop should implicitly call generator.throw(), >> regardless of what "for" syntax is used. > > Just in case it's not clear, the behaviour being suggested > here is *not* part of my proposal. As far as yield-from is > concerned, propagation of exceptions into the subgenerator > would only occur when throw() was called on the generator > containing the yield-from, and then only when it's suspended > in the midst of it. Raise statements within the delegating > generator have nothing to do with the matter and aren't > affected at all. > > Having some examples to look at is a good idea, but > Bruce seems to be going off on a tangent and making some > proposals of his own for enhancing the for-loop. I fear > that this will only confuse the discussion further. > > Perhaps I should also point out that yield-from is *not* > intended to help things like itertools.chain manage the > cleanup of its generators, so examples involving things > with chain-like behaviour are probably not going to help > clarify what it *is* intended for. > > It would be nice to have a language feature to help with > things like that, but I have no idea at the moment what > such a thing would be like. > Guilty! I apologize for any side-tracking of the yield from discussion. 
As people are asking for real world examples and I've done a lot with generators, and I didn't see many other people offering examples, I thought I could offer some. But my code obviously doesn't use yield from, so I'm looking to use of the for statement or itertools.chain, which are the two that would be replaced by yield from. So I'm thinking, on the one hand, that examples where for or chain should forward send/throw/close should transfer to yield from. But I'm also thinking that the same arguments apply to for/chain. OTOH, the desire to use "yield from" for a "poor man's" cooperative threading facility also brings me to think that generators have 3 fatal design flaws that will prevent them from growing into something much more useful (like threading): 1. The double use of send/throw and the yield expression for simultaneous input and output to/from the generator; rather than separating input and output as two different constructs. Sending one value in does not always correspond to getting one value out. 2. The absence of an object (even an implicit one like sys.stdin and sys.stdout are for input and print) representing the target of the yield/throw/send that can be passed on to other functions, allowing them to contribute to the generator's output stream in a much more natural way. * I'm thinking here of a pair of cooperating pipe objects, read and write, and a pair of built-in functions, something like input and print that get and send an object to implicit pipein and pipeout objects (one for each "thread"). These would replace send and yield. * But I think that the iterator interface is very successful, should be kept intact, and is what the read pipe object should look like. 3. The double use of yield to indicate rendezvoused output to the parent "thread", as well as to flag its containing function as one that always starts a new "thread" when executed. * This prevents us from having generator A simply call generator B to have B yield objects for A. 
In other words, calling B as a normal function that doesn't start another
thread would mean that B yields to the current thread's pipeout, while
starting B in a new thread with its own pipeout would do what current
generators do. Thus generator A would have the option to run B in two
ways: as a new generator thread to yield values back to A, or within A's
thread as a normal function to yield values to the same place that A
yields values to.

  * I'm thinking that there would be a builtin generate function (or some
    special syntax) used to run a function in a new thread. Thus
    generate(gen_b, arg1, arg2, ...) would return a read pipe (which is
    an iterable) connected to the write pipe for the new thread:

        for x in generate(gen_b, arg1, arg2, ...):

    or maybe:

        for x in gen_b(arg1, arg2, ...)&:

    or whatever, is different than:

        gen_b(arg1, arg2, ...)

This would accomplish what yield from is trying to do in a more flexible
and readable way.

So the question in my mind is: do we move towards adopting some new kind
of generator/threading capability (and eventually deprecating current
generators) that doesn't have these limitations, or do we stick with
generators? If we want to stick with the current generators, then I'm in
favor of the proposed "yield from" (with the possible exception of the
new "return"). But even if we want to move towards a new-style generator
capability, "yield from" could be fielded much more quickly than a whole
new-style generator capability, so ???

If people are interested in discussing this further, I'm open to that.
Otherwise, sorry for the side-tracking...
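Bruce's generate() idea maps fairly directly onto threads plus a queue from the standard library. In this sketch, generate, the sentinel, and the pipe protocol are inventions following his description, not an existing API:

```python
import threading, queue

_DONE = object()  # sentinel: the writing end has finished

def generate(func, *args):
    """Run the generator func(*args) in its own thread and return an
    iterable 'read pipe' over the values it yields."""
    pipe = queue.Queue()

    def worker():
        try:
            for value in func(*args):
                pipe.put(value)
        finally:
            pipe.put(_DONE)  # always signal EOF, even on error

    threading.Thread(target=worker).start()

    def read_pipe():
        while True:
            value = pipe.get()
            if value is _DONE:
                return
            yield value

    return read_pipe()

def gen_b(n):
    for i in range(n):
        yield i * i

# The caller iterates the pipe just like any other iterable.
print(list(generate(gen_b, 4)))  # [0, 1, 4, 9]
```

Note that the queue gives the pipe real buffering, which is exactly the difference from generators' rendezvous-style yield that Greg points out further down the thread.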
-bruce frederiksen From greg.ewing at canterbury.ac.nz Wed Feb 18 20:23:42 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 19 Feb 2009 08:23:42 +1300 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: References: <4995F681.20702@canterbury.ac.nz> <4999E02A.8050103@canterbury.ac.nz> <499A42E5.3010204@canterbury.ac.nz> <499A59CE.8030509@canterbury.ac.nz> <499B8090.80109@canterbury.ac.nz> <4222a8490902171938o490b5c6ew680810145b651c@mail.gmail.com> Message-ID: <499C603E.6040600@canterbury.ac.nz> Guido van Rossum wrote: > I think it would be a better example than the parser example you gave > before. Somehow the parser example is not very convincing, perhaps > because it feels a bit unnatural to see a parser as a bunch of threads > or coroutines -- the conventional way to write (which you presented as > a starting point) works just fine. Yes, that's true. I wanted to start with something a bit simpler and easier to follow, though. The scheduler is going to be rather more convoluted. -- Greg From greg.ewing at canterbury.ac.nz Wed Feb 18 20:38:39 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 19 Feb 2009 08:38:39 +1300 Subject: [Python-ideas] Yield-from example: A parser In-Reply-To: <499C3B2D.2060600@gmail.com> References: <499A792D.9090701@canterbury.ac.nz> <499B22A5.8020506@canterbury.ac.nz> <499C3B2D.2060600@gmail.com> Message-ID: <499C63BF.8050306@canterbury.ac.nz> Bruce Frederiksen wrote: > I guess I'm not the only one who's intuitive feel for "for" loops is > that they run to exhaustion (and, yes, it feels right to use for loops > to search a list for something and break when it's found, but even then > it feels like the program should be done with the iterable when the for > loop terminates). Yes, the early break from a searching loop is more of an optimisation -- you *could* carry on and scan the rest of the list, remembering what you found, but there's no point in doing so. 
It's different when the iterator has side effects, though. My instincts tell me not to use a for-loop then unless the iterator really is going to be exhausted. -- Greg From greg.ewing at canterbury.ac.nz Wed Feb 18 21:04:48 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 19 Feb 2009 09:04:48 +1300 Subject: [Python-ideas] PEP on yield-from: throw example In-Reply-To: <499C478D.4010902@gmail.com> References: <4995F681.20702@canterbury.ac.nz> <0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1> <49964508.5020207@pearwood.info> <49986FF6.3060707@gmail.com> <91ad5bf80902151825w61215fd6g6aac11949e9df71d@mail.gmail.com> <499931F4.8050601@canterbury.ac.nz> <499C478D.4010902@gmail.com> Message-ID: <499C69E0.5010409@canterbury.ac.nz> Bruce Frederiksen wrote: > 1. The double use of send/throw and the yield expression for > simultaneous input and output to/from the generator; rather than > separating input and output as two different constructs. Sending > one value in does not always correspond to getting one value out. You might not be interested in sending or receiving a value every time, but you do have to suspend the generator each time you want to send and/or receive a value. Currently, there is only one way to suspend a generator, which for historical reasons is called 'yield'. Each time you use it, you have the opportunity to send a value, and an opportunity to receive a value, but you don't have to use both of these (or either of them) if you don't want to. What you seem to be proposing is having two aliases for 'yield', one of which only sends and the other only receives. Is that right? If so, I don't see much point in it other than making code read slightly better. > * I'm thinking here of a pair of cooperating pipe objects, > read and write, Pipes are different in an important way -- they have queueing. Writes to one end don't have to interleave perfectly with reads at the other. 
But generators aren't like that -- there is no buffer to hold sent/yielded values until the other end is ready for them. Or are you suggesting that there should be such buffering? I would say that's a higher-level facility that should be provided by library code using yield, or something like it, as a primitive. -- Greg From greg.ewing at canterbury.ac.nz Wed Feb 18 21:22:24 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 19 Feb 2009 09:22:24 +1300 Subject: [Python-ideas] Alternative name for yield-from Message-ID: <499C6E00.2030602@canterbury.ac.nz> I've had another idea about what to call yield-from: y = pass g(x) which means "run this generator, passing through any sent/yielded values etc." It's short, it's suggestive, it doesn't use any new keywords, and there's no danger of confusing it with 'yield'. Now, you're probably reaching for the -1 button at this point, thinking "WTF? That's completely different from the existing meaning of pass!" But there's a sense in which the existing 'pass' can be seen as a degenerate case. Consider the generator def nada(): if False: yield Since it never yields anything, doing pass nada() is effectively a no-op. Thus, 'pass' with no generator at all is a no-op as well. There's still one remaining difference -- the presence of 'pass' with a value would make the containing function into a generator, whereas plain 'pass' wouldn't. We'd just have to live with that inconsistency. -- Greg From guido at python.org Wed Feb 18 22:16:48 2009 From: guido at python.org (Guido van Rossum) Date: Wed, 18 Feb 2009 13:16:48 -0800 Subject: [Python-ideas] Alternative name for yield-from In-Reply-To: <499C6E00.2030602@canterbury.ac.nz> References: <499C6E00.2030602@canterbury.ac.nz> Message-ID: On Wed, Feb 18, 2009 at 12:22 PM, Greg Ewing wrote: > I've had another idea about what to call yield-from: > > y = pass g(x) > > which means "run this generator, passing through > any sent/yielded values etc." 
It's short, it's > suggestive, it doesn't use any new keywords, and > there's no danger of confusing it with 'yield'. > > Now, you're probably reaching for the -1 button > at this point, thinking "WTF? That's completely > different from the existing meaning of pass!" > > But there's a sense in which the existing 'pass' > can be seen as a degenerate case. Consider the > generator > > def nada(): > if False: > yield > > Since it never yields anything, doing > > pass nada() > > is effectively a no-op. Thus, 'pass' with no > generator at all is a no-op as well. > > There's still one remaining difference -- the > presence of 'pass' with a value would make the > containing function into a generator, whereas > plain 'pass' wouldn't. We'd just have to live > with that inconsistency. -1000 because of that last one. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From rhamph at gmail.com Wed Feb 18 22:27:02 2009 From: rhamph at gmail.com (Adam Olsen) Date: Wed, 18 Feb 2009 14:27:02 -0700 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: <499C603E.6040600@canterbury.ac.nz> References: <4995F681.20702@canterbury.ac.nz> <499A42E5.3010204@canterbury.ac.nz> <499A59CE.8030509@canterbury.ac.nz> <499B8090.80109@canterbury.ac.nz> <4222a8490902171938o490b5c6ew680810145b651c@mail.gmail.com> <499C603E.6040600@canterbury.ac.nz> Message-ID: On Wed, Feb 18, 2009 at 12:23 PM, Greg Ewing wrote: > Guido van Rossum wrote: > >> I think it would be a better example than the parser example you gave >> before. Somehow the parser example is not very convincing, perhaps >> because it feels a bit unnatural to see a parser as a bunch of threads >> or coroutines -- the conventional way to write (which you presented as >> a starting point) works just fine. > > Yes, that's true. I wanted to start with something a > bit simpler and easier to follow, though. The scheduler > is going to be rather more convoluted. 
Implementing a scheduler perhaps, but that can be omitted. Just give us
the usage of the scheduler and how it's better than "yield Return(val)".

-- 
Adam Olsen, aka Rhamphoryncus

From sturla at molden.no  Thu Feb 19 00:34:04 2009
From: sturla at molden.no (Sturla Molden)
Date: Thu, 19 Feb 2009 00:34:04 +0100 (CET)
Subject: [Python-ideas] Parallel processing with Python
Message-ID: <7b5638cfdfafda1efc99e63b0c0ae9e7.squirrel@webmail.uio.no>

About a year ago, I posted a scheme to comp.lang.python describing how to
use isolated interpreters to circumvent the GIL on SMPs:

http://groups.google.no/group/comp.lang.python/msg/0351c532aad97c5e?hl=no&dmode=source

In the following, an "appdomain" will be defined as a thread associated
with a unique embedded Python interpreter. One interpreter per thread is
how Tcl works. Erlang also uses isolated threads that only communicate
through messages (as opposed to shared objects). Appdomains are also
available in the .NET framework, and in Java as "Java isolates". They are
potentially very useful as multicore CPUs become abundant. They allow one
process to run one independent Python interpreter on each available CPU
core.

In Python, "appdomains" can be created by embedding the Python interpreter
multiple times in a process. For this to work, we have to make multiple
copies of the Python DLL and rename them (e.g. Python25-0.dll,
Python25-1.dll, Python25-2.dll, etc.). Otherwise the dynamic loader will
just return a handle to the already imported DLL. As DLLs can be accessed
with ctypes, we don't even have to program a line of C to do this. We can
start up a Python interpreter and use ctypes to embed more interpreters
into it, associating each interpreter with its own thread. ctypes takes
care of releasing the GIL in the parent interpreter, so calls to these
sub-interpreters become asynchronous. I had a mock-up of this scheme
working.
Martin Löwis replied he doubted this would work, and pointed out that Python extension libraries (.pyd files) are DLLs as well. They would only be imported once, and their global states would thus clash, producing havoc: http://groups.google.no/group/comp.lang.python/msg/0a7a22910c1d5bf5?hl=no&dmode=source He was right, of course, but also wrong. In fact I had already proven him wrong by importing a DLL multiple times. If it can be done for Python25.dll, it can be done for any other DLL as well - including .pyd files - in exactly the same way. Thus what remains is to change Python's dynamic loader to use the same "copy and import" scheme. This can either be done by changing Python's C code, or (at least on Windows) by redirecting the LoadLibrary API call from kernel32.dll to a custom DLL. Both are quite easy and require minimal C coding. Thus it is quite easy to make multiple, independent Python interpreters live isolated lives in the same process. As opposed to multiple processes, they can communicate without involving any IPC. It would also be possible to design proxy objects allowing one interpreter access to an object in another. Immutable objects such as strings would be particularly easy to share. This very simple scheme should allow parallel processing with Python similar to how it's done in Erlang, without the GIL getting in our way. At least on Windows this can be done without touching the CPython source at all. I am not sure about Linux though. It may be necessary to patch the CPython source to make it work there.
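The proxy idea can be sketched in pure Python. This is a toy, single-interpreter model only (`ReadOnlyProxy` is a made-up name); a real cross-interpreter proxy would also have to acquire the owning interpreter's GIL and manage the target's refcount around each access:

```python
class ReadOnlyProxy:
    """Forward attribute *reads* to a target object owned elsewhere.

    Toy model: we only show the read-only forwarding that makes
    immutable objects (strings etc.) easy to share; the GIL/refcount
    bookkeeping described above is omitted.
    """
    def __init__(self, target):
        object.__setattr__(self, '_target', target)

    def __getattr__(self, name):
        # Reads are delegated to the shared, immutable target.
        return getattr(object.__getattribute__(self, '_target'), name)

    def __setattr__(self, name, value):
        # Mutation is refused: sharing is only safe for immutables.
        raise AttributeError("proxy is read-only")

p = ReadOnlyProxy("hello")
print(p.upper())   # HELLO
```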
Sturla Molden From greg.ewing at canterbury.ac.nz Thu Feb 19 02:31:47 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 19 Feb 2009 14:31:47 +1300 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: References: <4995F681.20702@canterbury.ac.nz> <499A42E5.3010204@canterbury.ac.nz> <499A59CE.8030509@canterbury.ac.nz> <499B8090.80109@canterbury.ac.nz> <4222a8490902171938o490b5c6ew680810145b651c@mail.gmail.com> <499C603E.6040600@canterbury.ac.nz> Message-ID: <499CB683.1070009@canterbury.ac.nz> Adam Olsen wrote: > Implementing a scheduler perhaps, but that can be omitted. Just give > us the usage of the scheduler and how it's better than "yield > Return(val)". That's not really going to reveal much more than what you've seen in the previous example. I've started working on the scheduler, and it's actually turning out to be fairly simple. Including the implementation isn't going to make the example much longer, and I think it will be instructive. I'll explain what's going on at each step, so it won't be a code dump. -- Greg From rhamph at gmail.com Thu Feb 19 03:34:30 2009 From: rhamph at gmail.com (Adam Olsen) Date: Wed, 18 Feb 2009 19:34:30 -0700 Subject: [Python-ideas] Parallel processing with Python In-Reply-To: <7b5638cfdfafda1efc99e63b0c0ae9e7.squirrel@webmail.uio.no> References: <7b5638cfdfafda1efc99e63b0c0ae9e7.squirrel@webmail.uio.no> Message-ID: On Wed, Feb 18, 2009 at 4:34 PM, Sturla Molden wrote: > Thus it is quite easy to make multiple, independent Python interpreters > live isolated lives in the same process. As opposed to multiple processes, > they can communicate without involving any IPC. It would also be possible > to design proxy objects allowing one interpreter access to an object in > another. Immutable object such as strings would be particularly easy to > share. > > This very simple scheme should allow parallel processing with Python > similar to how it's done in Erlang, without the GIL getting in our way. 
At > least on Windows this can be done without touching the CPython source at > all. I am not sure about Linux though. It may be necessary to patch the > CPython source to make it work there.

To clarify:

* Erlang's modules/classes/functions are not first-class objects, so it doesn't need a copy of them. Python does, so each interpreter would have a memory footprint about the same as a true process.

* Any communication requires a serialize/copy/deserialize sequence. You don't need a full context switch, but it's still not cheap.

* It's probably not worth sharing even str objects. You'd need atomic refcounting and a hack in Py_TYPE to always give the local type, both of which would slow everything down.

The real use case here is when you have a large, existing library that you're not willing to modify to use a custom (shared memory) allocator. That library must have a large data set (too large to duplicate in each process), and not an external database, must be multithreaded in a scalable way, yet be too performance-sensitive for real IPC. Also, any data shared between interpreters must be part of that large, existing library, rather than Python objects. Finally, since each interpreter uses as much memory as a process, you must only need a small, fixed number of interpreters, preferably long-running (or at least a thread pool). If that describes your use case then I'm happy for you; go ahead and use this DLL trick.

-- Adam Olsen, aka Rhamphoryncus

From rrr at ronadam.com Thu Feb 19 03:41:44 2009 From: rrr at ronadam.com (Ron Adam) Date: Wed, 18 Feb 2009 20:41:44 -0600 Subject: [Python-ideas] Alternative name for yield-from In-Reply-To: <499C6E00.2030602@canterbury.ac.nz> References: <499C6E00.2030602@canterbury.ac.nz> Message-ID: <499CC6E8.5050104@ronadam.com> Greg Ewing wrote: > I've had another idea about what to call yield-from:
>
>     y = pass g(x)
>
Is y a list of values sent to g? Excuse me if I'm a bit lost there.
I haven't played around with the new generator features yet. I'm thinking if a in "a = yield expr" is the value sent in, and 'y = pass g(x)' iterates all of g before continuing, then y would have all of the values sent to generator g. Or would it be the last value sent to g, or the value returned from g?

> which means "run this generator, passing through > any sent/yielded values etc." It's short, it's > suggestive, it doesn't use any new keywords, and > there's no danger of confusing it with 'yield'.

How about something like this instead:

    yield expr for next_val from g(X)

Where next_val comes from g(), and expr is the value yielded out. Having it resemble a generator expression makes it clearer that it iterates until complete.

The idea of generators working as simple threads is cool, but my mental picture of that has probably been a bit unrealistically simple. But I'm still hopeful for a simple way to do it. My wish is to be able to define a generator, and use a decorator or some syntax at define time to make it run in such a way as to buffer at least one value. That is, it runs (in the background) until it reaches a yield statement, instead of waiting for the .next() method to be called and running to the next yield statement.

One thing that always bothered me with generator functions is that they are not explicitly defined. Having the presence of yield determine whether they are a generator or not seems a bit implicit to me. I'd much prefer a new keyword, 'defgen' or something like it, instead of reusing 'def'. To me, generators are different enough from functions to merit their own defining keyword. Or maybe it's better to define them as an object to start with?

    class mygen(generator): ...

    class mygen(threaded_generator): ...

I would be +1 for an objective way to define generators. (Is there a way to do that already?)
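For what it's worth, the closest existing "objective" spelling is the iterator protocol itself: a small class with `__iter__` and `next` is interchangeable with a generator function. A sketch (Python 3 spelling, where `next` is `__next__`):

```python
class CountDown:
    """An explicitly defined 'generator': a plain class implementing
    the iterator protocol, equivalent to a generator function that
    yields n, n-1, ..., 1."""
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):            # spelled 'next' in Python 2
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value

print(list(CountDown(3)))   # [3, 2, 1]
```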
Cheers, Ron From guido at python.org Thu Feb 19 05:34:51 2009 From: guido at python.org (Guido van Rossum) Date: Wed, 18 Feb 2009 20:34:51 -0800 Subject: [Python-ideas] Revised revised revised PEP on yield-from In-Reply-To: <499CB683.1070009@canterbury.ac.nz> References: <4995F681.20702@canterbury.ac.nz> <499A59CE.8030509@canterbury.ac.nz> <499B8090.80109@canterbury.ac.nz> <4222a8490902171938o490b5c6ew680810145b651c@mail.gmail.com> <499C603E.6040600@canterbury.ac.nz> <499CB683.1070009@canterbury.ac.nz> Message-ID: On Wed, Feb 18, 2009 at 5:31 PM, Greg Ewing wrote: > Adam Olsen wrote: > >> Implementing a scheduler perhaps, but that can be omitted. Just give >> us the usage of the scheduler and how it's better than "yield >> Return(val)". > > That's not really going to reveal much more than > what you've seen in the previous example. > > I've started working on the scheduler, and it's > actually turning out to be fairly simple. Including > the implementation isn't going to make the example > much longer, and I think it will be instructive. > > I'll explain what's going on at each step, so it > won't be a code dump. I'm eagerly looking forward to it. We should get some Twisted folks to buy in. After that, I really encourage you to start working on a reference implementation. (Oh, and please send your PEP to the PEP editors. See PEP 1 for how to format it, where to send it, etc.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From rwgk at yahoo.com Thu Feb 19 10:01:39 2009 From: rwgk at yahoo.com (Ralf W. Grosse-Kunstleve) Date: Thu, 19 Feb 2009 01:01:39 -0800 (PST) Subject: [Python-ideas] set.add() return value References: <15657.1234893740@parc.com> <499B41C2.7020702@pearwood.info> <499B794B.6050308@canterbury.ac.nz> Message-ID: <209121.20920.qm@web111401.mail.gq1.yahoo.com> > All in all, I am now siding with Raymond -- let's leave well enough > alone. 
Hopefully we can now retire this thread, I don't think there's > much more that can be said that hasn't been said already. FWIW, a few thoughts anyway: - I don't actually care about performance**. It was a mistake on my part to try to abuse the performance argument for what I was really after: elegance (defined as power/volume while still looking simple). - I always assumed it must be an oversight that set.add() doesn't return a trivial piece of information that it has. Somebody has made the arbitrary decision to withhold this information from me, like you withhold certain information from a child. It feels very wrong. I'm grown up and wish to take responsibility myself. "If I don't like it, you cannot use it" is forceful; "If you don't like it, just don't use it" is a much better tone between partners. - The purity argument was used a lot. In my opinion, set would be pure if it didn't arbitrarily withhold information from me that would allow me to make my algorithms more concise. Or as Einstein put it, "Make everything as simple as possible, but not simpler." set.add() returning None is failing the "but" part. ** 1. If performance really matters I will go to C++ anyway. 2. These days, I can simply put "if (i % chunk_n != chunk_i): continue" into some loop and throw the calculation at a few hundred (chunk_n) CPUs. From greg.ewing at canterbury.ac.nz Thu Feb 19 10:19:55 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 19 Feb 2009 22:19:55 +1300 Subject: [Python-ideas] Alternative name for yield-from In-Reply-To: <499CC6E8.5050104@ronadam.com> References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> Message-ID: <499D243B.8080801@canterbury.ac.nz> Ron Adam wrote: > Is y a list of values sent to g? Excuse me if I'm a bit lost there. I > haven't played around with the new generator features yet. This relates to the discussion about my proposal for a 'yield from' statement. 
Part of it is that 'return x' in a generator will be equivalent to 'raise StopIteration(x)', with x becoming the value of the 'yield from' expression (or in this case the 'pass' expression).

> Having the presence of yield to determine > if they are a generator or not, seems a bit implicit to me. I'd much > prefer a new keyword, 'defgen' or something like it

That argument was debated at great length and lost long ago, when generators were first invented. I don't think there's any chance of it being changed now. I don't know that it would help anything much anyhow, since you also need to be aware that it's a generator every time you use it, and even if it's defined differently, that doesn't help at the call site. The only solution to that is naming conventions and/or documentation.

-- Greg

From greg.ewing at canterbury.ac.nz Thu Feb 19 11:12:22 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 19 Feb 2009 23:12:22 +1300 Subject: [Python-ideas] Revised^4 PEP on yield-from In-Reply-To: <499D2740.1070408@canterbury.ac.nz> References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> Message-ID: <499D3086.6020706@canterbury.ac.nz>

Fifth draft of the PEP. Re-worded a few things slightly to hopefully make the proposal a bit clearer up front. Anyone have any further suggested changes before I send it to the pepmeister?

PEP: XXX
Title: Syntax for Delegating to a Subgenerator
Version: $Revision$
Last-Modified: $Date$
Author: Gregory Ewing
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 13-Feb-2009
Python-Version: 2.7
Post-History:

Abstract
========

A syntax is proposed for a generator to delegate part of its operations to another generator. This allows a section of code containing 'yield' to be factored out and placed in another generator.
Additionally, the subgenerator is allowed to return with a value, and the value is made available to the delegating generator. The new syntax also opens up some opportunities for optimisation when one generator re-yields values produced by another.

Proposal
========

The following new expression syntax will be allowed in the body of a generator:

::

    yield from <expr>

where <expr> is an expression evaluating to an iterable, from which an iterator is extracted. The iterator is run to exhaustion, during which time it behaves as though it were communicating directly with the caller of the generator containing the ``yield from`` expression (the "delegating generator").

When the iterator is another generator, the effect is the same as if the body of the subgenerator were inlined at the point of the 'yield from' expression. The subgenerator is allowed to execute a 'return' statement with a value, and that value becomes the value of the ``yield from`` expression.

In terms of the iterator protocol:

* Any values that the iterator yields are passed directly to the caller.

* Any values sent to the delegating generator using ``send()`` are sent directly to the iterator. (If the iterator does not have a ``send()`` method, it remains to be decided whether the value sent in is ignored or an exception is raised.)

* Calls to the ``throw()`` method of the delegating generator are forwarded to the iterator. (If the iterator does not have a ``throw()`` method, the thrown-in exception is raised in the delegating generator.)

* If the delegating generator's ``close()`` method is called, the iterator is finalised before finalising the delegating generator.

* The value of the ``yield from`` expression is the first argument to the ``StopIteration`` exception raised by the iterator when it terminates.

* ``return expr`` in a generator is equivalent to ``raise StopIteration(expr)``.
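For illustration, this return-value convention can be driven by hand today with an ordinary iterator class, since nothing prevents ``StopIteration`` from carrying a value (toy example, using the modern ``except ... as`` spelling):

```python
class UpTo:
    """Yields 1..n, then signals a 'return value' (the running sum)
    by attaching it to StopIteration, as described above."""
    def __init__(self, n):
        self.i, self.n, self.total = 0, n, 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.n:
            raise StopIteration(self.total)   # the 'return value'
        self.i += 1
        self.total += self.i
        return self.i

it = UpTo(3)
values = []
try:
    while True:
        values.append(next(it))
except StopIteration as e:
    result = e.args[0] if e.args else None

print(values, result)   # [1, 2, 3] 6
```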
Formal Semantics
----------------

The statement

::

    result = yield from expr

is semantically equivalent to

::

    _i = iter(expr)
    try:
        _u = _i.next()
        while 1:
            try:
                _v = yield _u
            except Exception, _e:
                if hasattr(_i, 'throw'):
                    _u = _i.throw(_e)
                else:
                    raise
            else:
                if hasattr(_i, 'send'):
                    _u = _i.send(_v)
                else:
                    _u = _i.next()
    except StopIteration, _e:
        _a = _e.args
        if len(_a) > 0:
            result = _a[0]
        else:
            result = None
    finally:
        if hasattr(_i, 'close'):
            _i.close()

Rationale
=========

A Python generator is a form of coroutine, but has the limitation that it can only yield to its immediate caller. This means that a piece of code containing a ``yield`` cannot be factored out and put into a separate function in the same way as other code. Performing such a factoring causes the called function to itself become a generator, and it is necessary to explicitly iterate over this second generator and re-yield any values that it produces.

If yielding of values is the only concern, this is not very arduous and can be performed with a loop such as

::

    for v in g:
        yield v

However, if the subgenerator is to interact properly with the caller in the case of calls to ``send()``, ``throw()`` and ``close()``, things become considerably more complicated. As the formal expansion presented above illustrates, the necessary code is very longwinded, and it is tricky to handle all the corner cases correctly. In this situation, the advantages of a specialised syntax should be clear.

Generators as Threads
---------------------

A motivating use case for generators being able to return values concerns the use of generators to implement lightweight threads. When using generators in that way, it is reasonable to want to spread the computation performed by the lightweight thread over many functions. One would like to be able to call a subgenerator as though it were an ordinary function, passing it parameters and receiving a returned value.
Using the proposed syntax, a statement such as

::

    y = f(x)

where f is an ordinary function, can be transformed into a delegation call

::

    y = yield from g(x)

where g is a generator. One can reason about the behaviour of the resulting code by thinking of g as an ordinary function that can be suspended using a ``yield`` statement.

When using generators as threads in this way, typically one is not interested in the values being passed in or out of the yields. However, there are use cases for this as well, where the thread is seen as a producer or consumer of items. The ``yield from`` expression allows the logic of the thread to be spread over as many functions as desired, with the production or consumption of items occurring in any subfunction, and the items are automatically routed to or from their ultimate source or destination.

Concerning ``throw()`` and ``close()``, it is reasonable to expect that if an exception is thrown into the thread from outside, it should first be raised in the innermost generator where the thread is suspended, and propagate outwards from there; and that if the thread is terminated from outside by calling ``close()``, the chain of active generators should be finalised from the innermost outwards.

Syntax
------

The particular syntax proposed has been chosen as suggestive of its meaning, while not introducing any new keywords and clearly standing out as being different from a plain ``yield``.

Optimisations
-------------

Using a specialised syntax opens up possibilities for optimisation when there is a long chain of generators. Such chains can arise, for instance, when recursively traversing a tree structure. The overhead of passing ``next()`` calls and yielded values down and up the chain can cause what ought to be an O(n) operation to become O(n\*\*2).

A possible strategy is to add a slot to generator objects to hold a generator being delegated to.
When a ``next()`` or ``send()`` call is made on the generator, this slot is checked first, and if it is nonempty, the generator that it references is resumed instead. If it raises StopIteration, the slot is cleared and the main generator is resumed.

This would reduce the delegation overhead to a chain of C function calls involving no Python code execution. A possible enhancement would be to traverse the whole chain of generators in a loop and directly resume the one at the end, although the handling of StopIteration is more complicated in that case.

Use of StopIteration to return values
-------------------------------------

There are a variety of ways that the return value from the generator could be passed back. Some alternatives include storing it as an attribute of the generator-iterator object, or returning it as the value of the ``close()`` call to the subgenerator. However, the proposed mechanism is attractive for a couple of reasons:

* Using the StopIteration exception makes it easy for other kinds of iterators to participate in the protocol without having to grow extra attributes or a close() method.

* It simplifies the implementation, because the point at which the return value from the subgenerator becomes available is the same point at which StopIteration is raised. Delaying until any later time would require storing the return value somewhere.

Criticisms
==========

Under this proposal, the value of a ``yield from`` expression would be derived in a very different way from that of an ordinary ``yield`` expression. This suggests that some other syntax not containing the word ``yield`` might be more appropriate, but no acceptable alternative has so far been proposed.

It has been suggested that some mechanism other than ``return`` in the subgenerator should be used to establish the value returned by the ``yield from`` expression.
However, this would interfere with the goal of being able to think of the subgenerator as a suspendable function, since it would not be able to return values in the same way as other functions.

The use of an argument to StopIteration to pass the return value has been criticised as an "abuse of exceptions", without any concrete justification of this claim. In any case, this is only one suggested implementation; another mechanism could be used without losing any essential features of the proposal.

Alternative Proposals
=====================

Proposals along similar lines have been made before, some using the syntax ``yield *`` instead of ``yield from``. While ``yield *`` is more concise, it could be argued that it looks too similar to an ordinary ``yield`` and the difference might be overlooked when reading code.

To the author's knowledge, previous proposals have focused only on yielding values, and thereby suffered from the criticism that the two-line for-loop they replace is not sufficiently tiresome to write to justify a new syntax. By also dealing with calls to ``send()``, ``throw()`` and ``close()``, this proposal provides considerably more benefit.

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From sturla at molden.no Thu Feb 19 11:53:30 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 19 Feb 2009 11:53:30 +0100 Subject: [Python-ideas] Parallel processing with Python In-Reply-To: References: <7b5638cfdfafda1efc99e63b0c0ae9e7.squirrel@webmail.uio.no> Message-ID: <499D3A2A.2040706@molden.no> On 2/19/2009 3:34 AM, Adam Olsen wrote: > * Erlang's modules/classes/functions are not first-class objects, so > it doesn't need a copy of them. Python does, so each interpreter > would have a memory footprint about the same as a true process. Yes, each interpreter will have the memory footprint of an interpreter.
So the memory use would be about the same as with multiprocessing. > * Any communication requires a serialize/copy/deserialize sequence. No it does not, and this is why embedded interpreters are better than multiple processes (cf. multiprocessing). Since the interpreters share virtual memory, many objects can be shared without any serialization. That is, C pointers will be valid in both interpreters, so it should in many cases be possible to pass a PyObject* from one interpreter to another. This kind of communication would be easiest to achieve with immutable objects. Another advantage is that there will be just one process to kill, whenever that is required. S.M. From solipsis at pitrou.net Thu Feb 19 14:39:09 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 19 Feb 2009 13:39:09 +0000 (UTC) Subject: [Python-ideas] Revised^4 PEP on yield-from References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> Message-ID: Greg Ewing writes: > > Using a specialised syntax opens up possibilities for optimisation > when there is a long chain of generators. Such chains can arise, for > instance, when recursively traversing a tree structure. The overhead > of passing ``next()`` calls and yielded values down and up the chain > can cause what ought to be an O(n) operation to become O(n\*\*2). It should be relatively easy to avoid O(n**2) behaviour when traversing a tree, so I find this argument quite artificial. > It has been suggested that some mechanism other than ``return`` in > the subgenerator should be used to establish the value returned by > the ``yield from`` expression. However, this would interfere with > the goal of being able to think of the subgenerator as a suspendable > function, since it would not be able to return values in the same way > as other functions.
The problem I have with allowing "return" in generators is that it makes things much more confusing (try explaining to a beginner that he has the right to return a value from a generator but the value can't be retrieved through any conventional means: a "for" loop, a builtin function or method consuming the iterator, etc.). I can imagine it being useful in some Twisted-like situation: the inner generator would first yield a bunch of intermediate Deferreds to wait for the completion of some asynchronous thing, and then return the final Deferred for its caller to retrieve. But I think it wouldn't need the "yield from" construct to function, just the "return" thing. Regards Antoine. From guido at python.org Thu Feb 19 16:06:05 2009 From: guido at python.org (Guido van Rossum) Date: Thu, 19 Feb 2009 07:06:05 -0800 Subject: [Python-ideas] set.add() return value In-Reply-To: <209121.20920.qm@web111401.mail.gq1.yahoo.com> References: <15657.1234893740@parc.com> <499B41C2.7020702@pearwood.info> <499B794B.6050308@canterbury.ac.nz> <209121.20920.qm@web111401.mail.gq1.yahoo.com> Message-ID: On Thu, Feb 19, 2009 at 1:01 AM, Ralf W. Grosse-Kunstleve wrote: >> All in all, I am now siding with Raymond -- let's leave well enough >> alone. Hopefully we can now retire this thread, I don't think there's >> much more that can be said that hasn't been said already. > > FWIW, a few thoughts anyway: > > - I don't actually care about performance**. It was a mistake on my part to try to abuse > the performance argument for what I was really after: elegance (defined as power/volume > while still looking simple). > > - I always assumed it must be an oversight that set.add() doesn't return a trivial > piece of information that it has. Somebody has made the arbitrary decision to withhold > this information from me, like you withhold certain information from a child. It feels > very wrong. I'm grown up and wish to take responsibility myself.
"If I don't like it, > you cannot use it" is forceful; "If you don't like it, just don't use it" is a much > better tone between partners. > > - The purity argument was used a lot. In my opinion, set would be pure if it didn't arbitrarily > withhold information from me that would allow me to make my algorithms more concise. Or as > Einstein put it, "Make everything as simple as possible, but not simpler." > set.add() returning None is failing the "but" part. I'm sorry, but your choice of words in the last two bullets feels very condescending to me, and despite my vow to end the thread I cannot let the last words spoken be the words of a sore loser. Your position completely ignores the argument about people getting confused about what the return value would mean: "it was added" vs. "it was already there". It also seems to counter the elegance argument: if we cram as much information into every API as we can, certainly our APIs will feel more crowded and less elegant. Why not have list.append() return the new length of the list? Etc., etc. The overarching design principle in Python is to encourage the most maintainable code. This usually means to encourage concise code, but not so concise as to be hard to understand (or easily misread) for the next person maintaining it. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Feb 19 16:10:27 2009 From: guido at python.org (Guido van Rossum) Date: Thu, 19 Feb 2009 07:10:27 -0800 Subject: [Python-ideas] Parallel processing with Python In-Reply-To: <499D3A2A.2040706@molden.no> References: <7b5638cfdfafda1efc99e63b0c0ae9e7.squirrel@webmail.uio.no> <499D3A2A.2040706@molden.no> Message-ID: On Thu, Feb 19, 2009 at 2:53 AM, Sturla Molden wrote: > On 2/19/2009 3:34 AM, Adam Olsen wrote: >> * Any communication requires a serialize/copy/deserialize sequence. > > No it does not, and this is why embedded interpreters are better than multiple > processes (cf. multiprocessing).
Since the interpreters share virtual > memory, many objects can be shared without any serialization. > That is, C pointers will be valid in both interpreters, so it should in many > cases be possible to pass a PyObject* from one interpreter to another. This > kind of communication would be easiest to achieve with immutable objects. Only if you have an approach to GC that does not require locking. The current reference counting macros are not thread-safe so every INCREF and DECREF would have to be somehow protected by a lock or turned into an atomic operation. Recall that many frequently used objects, like None, small integers, and interned strings for common identifiers, are shared and constantly INCREFed and DECREFed. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Feb 19 16:12:16 2009 From: guido at python.org (Guido van Rossum) Date: Thu, 19 Feb 2009 07:12:16 -0800 Subject: [Python-ideas] Revised^4 PEP on yield-from In-Reply-To: <499D3086.6020706@canterbury.ac.nz> References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> Message-ID: On Thu, Feb 19, 2009 at 2:12 AM, Greg Ewing wrote: > Fifth draft of the PEP. Re-worded a few things slightly > to hopefully make the proposal a bit clearer up front. Wow, how I long for the days when we routinely put things like this under revision control so it's easy to compare versions.
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From solipsis at pitrou.net Thu Feb 19 16:28:34 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 19 Feb 2009 15:28:34 +0000 (UTC) Subject: [Python-ideas] Revised^4 PEP on yield-from References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> Message-ID: Greg Ewing writes: > > Use of StopIteration to return values > ------------------------------------- Why not a dedicated exception (e.g. GeneratorReturn) instead? Two advantages to doing so: * people mistakenly doing a "for" loop over such a generator would be reminded that they are missing something (the return value) * you could take advantage of existing iterator-consuming features (e.g. "yield from map(str, innergenerator())"), since they would just forward the exception instead of silencing it Regards Antoine. From sturla at molden.no Thu Feb 19 16:54:25 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 19 Feb 2009 16:54:25 +0100 (CET) Subject: [Python-ideas] Parallel processing with Python In-Reply-To: References: <7b5638cfdfafda1efc99e63b0c0ae9e7.squirrel@webmail.uio.no> <499D3A2A.2040706@molden.no> Message-ID: <8a655477a6a34bf62d4373023f6c0dcc.squirrel@webmail.uio.no> > On Thu, Feb 19, 2009 at 2:53 AM, Sturla Molden wrote: > Only if you have an approach to GC that does not require locking. The > current reference counting macros are not thread-safe so every INCREF > and DECREF would have to be somehow protected by a lock or turned into > an atomic operation. Recall that many frequently used objects, like > None, small integers, and interned strings for common identifiers, are > shared and constantly INCREFed and DECREFed. Thanks for the info. I think this would be just a minor inconvenience.
Sending a message in the form of PyObject *x from A to B could perhaps be solved like this:

Interpreter A:
   increfs immutable pyobj x
   acquires the GIL of interpreter B
   messages pyobj x to interpreter B
   releases the GIL of interpreter B

Interpreter B:
   creates a proxy object p for reading attributes of x
   increfs & decrefs p (refcounts for x or its attributes are not touched by p)
   when p is collected:
      acquires the GIL of interpreter A
      decrefs x
      releases the GIL of interpreter A

Synchronization of reference counting is obviously needed (and the GIL is great for that). But serialization of the whole object would be avoided. This would depend on immutability of the message object.

S.M.

From guido at python.org Thu Feb 19 17:08:08 2009 From: guido at python.org (Guido van Rossum) Date: Thu, 19 Feb 2009 08:08:08 -0800 Subject: [Python-ideas] Revised^4 PEP on yield-from In-Reply-To: References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> Message-ID: On Thu, Feb 19, 2009 at 7:28 AM, Antoine Pitrou wrote: > Greg Ewing writes: >> >> Use of StopIteration to return values >> ------------------------------------- > > Why not a dedicated exception (e.g. GeneratorReturn) instead? > Two advantages to doing so: > * people mistakenly doing a "for" loop over such a generator would be > reminded that they are missing something (the return value) > * you could take advantage of existing iterator-consuming features (e.g. > "yield from map(str, innergenerator())"), since they would just forward > the exception instead of silencing it Seconded -- but I would make it inherit from StopIteration so that the for-loop (unless modified) would just ignore it.
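A sketch of that arrangement (``GeneratorReturn`` is hypothetical; a plain iterator class is used here so nothing special is required of the interpreter):

```python
class GeneratorReturn(StopIteration):
    """Hypothetical dedicated 'return value' exception; subclassing
    StopIteration keeps unmodified for-loops working."""
    def __init__(self, value=None):
        StopIteration.__init__(self, value)
        self.value = value

class OneTwo:
    """Toy iterator that yields 1, 2 and then 'returns' the value 'done'."""
    def __init__(self):
        self.i = 0
    def __iter__(self):
        return self
    def __next__(self):
        self.i += 1
        if self.i > 2:
            raise GeneratorReturn('done')
        return self.i

# An unmodified for-loop just stops -- the return value is ignored:
print([x for x in OneTwo()])   # [1, 2]

# A caller that knows the convention can retrieve the value:
it = OneTwo()
try:
    while True:
        next(it)
except GeneratorReturn as e:
    print(e.value)             # done
```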
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From veloso at verylowsodium.com Thu Feb 19 17:16:01 2009 From: veloso at verylowsodium.com (Greg Falcon) Date: Thu, 19 Feb 2009 11:16:01 -0500 Subject: [Python-ideas] Revised^4 PEP on yield-from In-Reply-To: <499D3086.6020706@canterbury.ac.nz> References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> Message-ID: <3cdcefb80902190816l7cc5494dl7f1cd2c833ae855f@mail.gmail.com> On Thu, Feb 19, 2009 at 5:12 AM, Greg Ewing wrote: > * ``return expr`` in a generator is equivalent to ``raise > StopIteration(expr)``. It seems to me equivalence here might not be what you want. This parallel does not exist today between "return" and "raise StopIteration()", where the former can't be intercepted and blocked by a try/except block, but the latter can. I think it would be confusing for a return statement to be swallowed by code intended as an error handler. Greg F From guido at python.org Thu Feb 19 18:21:01 2009 From: guido at python.org (Guido van Rossum) Date: Thu, 19 Feb 2009 09:21:01 -0800 Subject: [Python-ideas] Revised^4 PEP on yield-from In-Reply-To: <3cdcefb80902190816l7cc5494dl7f1cd2c833ae855f@mail.gmail.com> References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> <3cdcefb80902190816l7cc5494dl7f1cd2c833ae855f@mail.gmail.com> Message-ID: On Thu, Feb 19, 2009 at 8:16 AM, Greg Falcon wrote: > On Thu, Feb 19, 2009 at 5:12 AM, Greg Ewing wrote: >> * ``return expr`` in a generator is equivalent to ``raise >> StopIteration(expr)``. > > It seems to me equivalence here might not be what you want. 
> > This parallel does not exist today between "return" and "raise > StopIteration()", where the former can't be intercepted and blocked by > a try/except block, but the latter can. Technically, 'return' is treated as an uncatchable exception -- but an exception nevertheless, since you *do* get to intercept it with try/finally. > I think it would be confusing > for a return statement to be swallowed by code intended as an error > handler. Only marginally though, since once the generator returns, it *does* raise StopIteration. But all in all I agree it would be better to keep the existing return semantics and only turn it into StopIteration(expr) after all try/except and try/finally blocks have been left -- IOW at the moment the frame is being cleared up. That would minimize surprises IMO. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From dangyogi at gmail.com Thu Feb 19 19:17:52 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Thu, 19 Feb 2009 13:17:52 -0500 Subject: [Python-ideas] PEP on yield-from: throw example In-Reply-To: <499C69E0.5010409@canterbury.ac.nz> References: <4995F681.20702@canterbury.ac.nz> <0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1> <49964508.5020207@pearwood.info> <49986FF6.3060707@gmail.com> <91ad5bf80902151825w61215fd6g6aac11949e9df71d@mail.gmail.com> <499931F4.8050601@canterbury.ac.nz> <499C478D.4010902@gmail.com> <499C69E0.5010409@canterbury.ac.nz> Message-ID: <499DA250.7040901@gmail.com> Greg Ewing wrote: > Bruce Frederiksen wrote: > >> 1. The double use of send/throw and the yield expression for >> simultaneous input and output to/from the generator; rather than >> separating input and output as two different constructs. Sending >> one value in does not always correspond to getting one value out. > > You might not be interested in sending or receiving > a value every time, but you do have to suspend the > generator each time you want to send and/or receive > a value. 
> Currently, there is only one way to suspend a
> generator, which for historical reasons is called
> 'yield'. Each time you use it, you have the opportunity
> to send a value, and an opportunity to receive a
> value, but you don't have to use both of these (or
> either of them) if you don't want to.
>
> What you seem to be proposing is having two aliases
> for 'yield', one of which only sends and the other
> only receives. Is that right? If so, I don't see
> much point in it other than making code read
> slightly better.

I'm thinking the yield goes away (both the statement and expression form). This would be replaced by builtin functions. I would propose that the builtins take optional pipe arguments that would default to the current thread's pipein/pipeout. I would also propose that each thread be allowed multiple input and/or output pipes and that the selection of which to use could be done by passing an integer value for the pipe argument. For example:

send(obj, pipeout = None)
send_from(iterable, pipeout = None)   # does what "yield from" is supposed to do
next(iterator = None)
num_input_pipes()
num_output_pipes()

You may need a few more functions to round this out:

pipein(index = 0)    # returns the current thread's pipein[index] object, could also use iter() for this.
pipeout(index = 0)   # returns the current thread's pipeout[index] object
throwforward(exc_type, exc_value = None, traceback = None, pipeout = None)
throwback(exc_type, exc_value = None, traceback = None, pipein = None)

Thus:

yield expr

becomes:

send(expr)

which doesn't mean "this is a generator" or that control will *necessarily* be transferred to another thread here. It depends on whether the other thread has already done a next on the corresponding pipein.

I'm thinking that the C code (byte interpreter) that manages Python stack frame objects becomes detached from the Python stack, so that a Python-to-Python call does not grow the C stack.
This would allow the C code to fork the Python stack and switch between branches quite easily. This separation of input and output would clean up most generator examples. Guido's tree flattener has special code to yield SKIP in response to SKIP, because he doesn't really want a value returned from sending a SKIP in. This would no longer be necessary. def __iter__(self): skip = yield self.label if skip == SKIP: yield SKIPPED else: skip = yield ENTER if skip == SKIP: yield SKIPPED else: for child in self.children: yield from child yield LEAVE # I guess a SKIP can't be returned here? becomes: def __iter__(self): return generate(self.flatten) def flatten(self): send(self.label) if next() != SKIP: send(ENTER) if next() != SKIP: for child in self.children: child.flatten() send(LEAVE) Also, the caller could then simply look like: for token in tree(): if too_deep: send(SKIP) else: send(None) rather than: response = None gen = tree() try: while True: token = gen.send(response) if too_deep: response = SKIP else: response = None except StopIteration: pass The reason for this extra complexity is that send returns a value. Separating send from yielding values lets you call send from within for statements without having another value land in your lap that you really would rather have sent to the for statement. The same thing applies to throw. If throw didn't return a value, then it could be easily called within for statements. 
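[Editor's note: the "extra complexity" Bruce describes is visible with today's generators. A minimal sketch, using an invented `numbers` example rather than Guido's tree flattener: because `send()` both delivers the response and hands back the next yielded value, the consumer cannot be a plain for-loop and must use the prime-then-send driver.]

```python
def numbers():
    """Yield 0, 1, 2, ... until the consumer sends back True ('stop')."""
    i = 0
    while True:
        stop = yield i          # the value passed to send() lands here
        if stop:
            return
        i += 1

gen = numbers()
out = []
value = next(gen)               # prime the generator to its first yield
try:
    while True:
        out.append(value)
        # The response rides back on send(), and a new yielded value
        # lands in our lap at the same time -- the double use at issue.
        value = gen.send(value >= 3)
except StopIteration:
    pass

assert out == [0, 1, 2, 3]
```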
The parsing example goes from:

def scanner(text):
    for m in pat.finditer(text):
        token = m.group(0)
        print "Feeding:", repr(token)
        yield token
    yield None  # to signal EOF

def parse_items(closing_tag = None):
    elems = []
    while 1:
        token = token_stream.next()
        if not token:
            break   # EOF
        if is_opening_tag(token):
            elems.append(parse_elem(token))
        elif token == closing_tag:
            break
        else:
            elems.append(token)
    return elems

def parse_elem(opening_tag):
    name = opening_tag[1:-1]
    closing_tag = "</%s>" % name
    items = parse_items(closing_tag)
    return (name, items)

to

def scanner(text):
    for m in pat.finditer(text):
        token = m.group(0)
        print "Feeding:", repr(token)
        send(token)

def parse_items(closing_tag = None):
    for token in next():
        if is_opening_tag(token):
            send(parse_elem(token))
        elif token == closing_tag:
            break
        else:
            send(token)

def parse_elem(opening_tag):
    name = opening_tag[1:-1]
    closing_tag = "</%s>" % name
    items = list(generate(parse_items(closing_tag), pipein=pipein()))
    return (name, items)

and perhaps called as:

tree = list(scanner(text) | parse_items())

This also obviates the need to do an initial next call when pushing (sending) to generators which are acting as consumers, a need which is difficult to explain and to understand.

> >> * I'm thinking here of a pair of cooperating pipe objects,
> >> read and write,
>
> Pipes are different in an important way -- they
> have queueing. Writes to one end don't have to
> interleave perfectly with reads at the other.
> But generators aren't like that -- there is no
> buffer to hold sent/yielded values until the
> other end is ready for them.
>
> Or are you suggesting that there should be such
> buffering? I would say that's a higher-level facility
> that should be provided by library code using
> yield, or something like it, as a primitive.

I didn't mean to imply that buffering was required, or even desired. With no buffering, the sender and receiver stay in sync, just like generators. A write would suspend until a matching read, and vice versa.
Only when the pipe sees both a write and a read would the object be transferred from the writer to the reader. Thus, write/read replaces yield as the way to suspend the current "thread". This avoids the confusion about whether we're "pushing" or "pulling" to/from a generator.

For example, itertools.tee is currently designed as a generator that "pulls" values from its iterable parameter. But then it can't switch roles to "push" values to its consumers, and so must be prepared to store values in case the consumers aren't synchronized with each other. With this new approach, the consumer waiting for the send value would be activated by the pipe connecting it to tee. And if that consumer wasn't ready for a value yet, tee would be suspended until it was. So tee would not have to store any values.

def tee():
    num_outputs = num_output_pipes()
    for input in next():
        for i in range(num_outputs):
            send(input, i)

Does this help?

-bruce frederiksen

From greg.ewing at canterbury.ac.nz  Thu Feb 19 21:36:52 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 20 Feb 2009 09:36:52 +1300
Subject: [Python-ideas] Revised^4 PEP on yield-from
In-Reply-To:
References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz>
Message-ID: <499DC2E4.5040903@canterbury.ac.nz>

Antoine Pitrou wrote:

> It should be relatively easy to avoid O(n**2) behaviour when traversing a tree,

How?

> The problem I have with allowing "return" in generators is that it makes things
> much more confusing (try explaining to a beginner that he has the right to return a
> value from a generator but the value can't be retrieved through any conventional
> means

I don't think it will be any harder than explaining why they get a syntax error if they try to return something from a generator at present.
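[Editor's note: the storage Bruce attributes to itertools.tee is observable in the standard library today. A quick check, using stdlib calls only: tee must buffer every item that one branch has consumed ahead of the other.]

```python
import itertools

src = iter(range(5))
a, b = itertools.tee(src)

# Drain branch 'a' completely; tee internally buffers all five items
# so that 'b' can still see them later.
assert list(a) == [0, 1, 2, 3, 4]

# 'b' replays the buffered items even though the source is exhausted.
assert list(b) == [0, 1, 2, 3, 4]
```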
--
Greg

From greg.ewing at canterbury.ac.nz  Thu Feb 19 21:48:15 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 20 Feb 2009 09:48:15 +1300
Subject: [Python-ideas] Revised^4 PEP on yield-from
In-Reply-To:
References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz>
Message-ID: <499DC58F.9060305@canterbury.ac.nz>

Antoine Pitrou wrote:

> Why not a dedicated exception (e.g. GeneratorReturn) instead?
> Two advantages to doing so:
> * people mistakenly doing a "for" loop over such a generator would be
> reminded that they are missing something (the return value)

You don't get a warning that you are "missing something" if you ignore the return value from an ordinary function call, so I don't find this argument very convincing.

> * you could take advantage of existing iterator-consuming features (e.g.
> "yield from map(str, innergenerator())"), since they would just forward
> the exception instead of silencing it

You also don't get ordinary return values of functions that you call forwarded to your caller, and I don't see how it would be any more useful to do so here.

There is one possible reason it might be useful, and that's if you want to catch a StopIteration that may or may not have a value attached, i.e.

try:
    # some iteration thing
except GeneratorReturn, e:
    result = e.args[0]
except StopIteration:
    result = None

However, that would be better addressed by enhancing StopIteration with a 'value' attribute that is None if it wasn't given an argument.
--
Greg

From rhamph at gmail.com  Thu Feb 19 21:49:28 2009
From: rhamph at gmail.com (Adam Olsen)
Date: Thu, 19 Feb 2009 13:49:28 -0700
Subject: [Python-ideas] Parallel processing with Python
In-Reply-To: <499D3A2A.2040706@molden.no>
References: <7b5638cfdfafda1efc99e63b0c0ae9e7.squirrel@webmail.uio.no> <499D3A2A.2040706@molden.no>
Message-ID:

On Thu, Feb 19, 2009 at 3:53 AM, Sturla Molden wrote:
> On 2/19/2009 3:34 AM, Adam Olsen wrote:
>> * Any communication requires a serialize/copy/deserialize sequence.
>
> No it does not, and this is why embedded interpreters are better than multiple
> processes (cf. multiprocessing). Since the interpreters share virtual
> memory, many objects can be shared without any serialization.
> That is, C pointers will be valid in both interpreters, so it should in many
> cases be possible to pass a PyObject* from one interpreter to another. This
> kind of communication would be easiest to achieve with immutable objects.

If you could directly use another interpreter's PyObject in the current interpreter then they wouldn't be separate interpreters. You need to apply it to the type objects too, and if you start sharing those you'll kill any performance advantage of this whole scheme.

The realistic scenario is you treat each interpreter as a monitor: you can call into another interpreter quite cheaply (release your GIL, set your current interpreter to be them, acquire their GIL). However, since you are only in one at any given point in time, you need to copy anything you want to transmit.

To put it another way, your proxy objects can hold pointers to the other interpreter's objects, but you can't use them until you go back into that other interpreter.
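[Editor's note: Adam's monitor model can be sketched in pure Python. This is a toy illustration, not a real CPython mechanism; the `Interp`, `call_into`, and `mailbox` names are all invented here, with a `threading.Lock` standing in for each interpreter's GIL and a deep copy modelling the rule that data may only be used inside the interpreter that owns it.]

```python
import copy
import threading

class Interp:
    """Toy stand-in for an embedded interpreter: one 'GIL' plus some state."""
    def __init__(self, name):
        self.name = name
        self.gil = threading.Lock()
        self.mailbox = []

def call_into(src, dst, payload):
    """Monitor-style call following Adam's recipe: release your GIL,
    make the other interpreter current, acquire its GIL, and copy
    (never share) the data you want to transmit."""
    src.gil.release()                            # release your GIL
    dst.gil.acquire()                            # acquire theirs: now "inside" dst
    dst.mailbox.append(copy.deepcopy(payload))   # copy, don't share
    dst.gil.release()                            # leave dst again...
    src.gil.acquire()                            # ...and re-enter src

a, b = Interp("A"), Interp("B")
a.gil.acquire()                  # we start out running "inside" A
call_into(a, b, {"msg": "hello"})
assert b.mailbox == [{"msg": "hello"}]
a.gil.release()
```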
-- Adam Olsen, aka Rhamphoryncus From greg.ewing at canterbury.ac.nz Thu Feb 19 21:54:27 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 20 Feb 2009 09:54:27 +1300 Subject: [Python-ideas] Parallel processing with Python In-Reply-To: <8a655477a6a34bf62d4373023f6c0dcc.squirrel@webmail.uio.no> References: <7b5638cfdfafda1efc99e63b0c0ae9e7.squirrel@webmail.uio.no> <499D3A2A.2040706@molden.no> <8a655477a6a34bf62d4373023f6c0dcc.squirrel@webmail.uio.no> Message-ID: <499DC703.8060107@canterbury.ac.nz> Sturla Molden wrote: > Interpreter B: > Creates a proxy object p for reading attributes of x This is not sufficient. When code running in interpreter B reads an attribute of x via the proxy, it will get a reference to some other object belonging to interpreter A. > This would depend on immutability of the message object. Immutability doesn't save you. Even immutable objects get their refcounts adjusted just like any other object. -- Greg From greg.ewing at canterbury.ac.nz Thu Feb 19 21:58:46 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 20 Feb 2009 09:58:46 +1300 Subject: [Python-ideas] Revised^4 PEP on yield-from In-Reply-To: <3cdcefb80902190816l7cc5494dl7f1cd2c833ae855f@mail.gmail.com> References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> <3cdcefb80902190816l7cc5494dl7f1cd2c833ae855f@mail.gmail.com> Message-ID: <499DC806.7000203@canterbury.ac.nz> Greg Falcon wrote: > It seems to me equivalence here might not be what you want. > > This parallel does not exist today between "return" and "raise > StopIteration()", where the former can't be intercepted and blocked by > a try/except block, but the latter can. Hmmm, you're right, it's not exactly equivalent. I'll adjust the wording of that part -- it's not my intention to make returns catchable. 
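[Editor's note: the asymmetry being preserved here is checkable in present-day Python, standard behaviour only: try/finally does intercept a return, try/except cannot catch one, and in a generator the return surfaces as StopIteration only after the finally clauses have run.]

```python
log = []

def f():
    try:
        return "value"
    except BaseException:
        return "caught"        # never runs: a 'return' cannot be caught
    finally:
        log.append("cleanup")  # ...but try/finally does intercept it

assert f() == "value"
assert log == ["cleanup"]

# In a generator, the return becomes StopIteration only after the
# frame's finally blocks have run:
def gen():
    try:
        yield 1
        return
    finally:
        log.append("gen cleanup")

it = gen()
assert next(it) == 1
try:
    next(it)
except StopIteration:
    pass
assert log == ["cleanup", "gen cleanup"]
```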
-- Greg From greg.ewing at canterbury.ac.nz Thu Feb 19 22:06:39 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 20 Feb 2009 10:06:39 +1300 Subject: [Python-ideas] Revised^4 PEP on yield-from In-Reply-To: References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> Message-ID: <499DC9DF.8040502@canterbury.ac.nz> Guido van Rossum wrote: > Seconded -- but I would make it inherit from StopIteration so that the > for-loop (unless modified) would just ignore it. But is there really any good reason to use a different exception? Currently, 'return' without a value in an ordinary function is equivalent to 'return None'. If this is done, they wouldn't be equivalent in generators, since they would raise different exceptions. -- Greg From greg.ewing at canterbury.ac.nz Thu Feb 19 22:12:52 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 20 Feb 2009 10:12:52 +1300 Subject: [Python-ideas] PEP on yield-from: throw example In-Reply-To: <499DA250.7040901@gmail.com> References: <4995F681.20702@canterbury.ac.nz> <0A5E3DAFC3204EA08954903202DB45F3@RaymondLaptop1> <49964508.5020207@pearwood.info> <49986FF6.3060707@gmail.com> <91ad5bf80902151825w61215fd6g6aac11949e9df71d@mail.gmail.com> <499931F4.8050601@canterbury.ac.nz> <499C478D.4010902@gmail.com> <499C69E0.5010409@canterbury.ac.nz> <499DA250.7040901@gmail.com> Message-ID: <499DCB54.2050704@canterbury.ac.nz> Bruce Frederiksen wrote: > I would propose > that the builtins take optional pipe arguments that would default to the > current thread's pipein/pipeout. I would also propose that each thread > be allowed multiple input and/or output pipes... All this is higher-level stuff that can be built on the primitive operation of yielding. For instance it could easily be added to the scheduling library I'm about to post (I tried to post it yesterday, but it bounced). 
-- Greg From greg.ewing at canterbury.ac.nz Thu Feb 19 22:32:23 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 20 Feb 2009 10:32:23 +1300 Subject: [Python-ideas] Yield-From Example 2: Scheduler Message-ID: <499DCFE7.2020407@canterbury.ac.nz> (I've tried to send this twice before and got smtp 554 errors. I'm trying again without the attachment.) Yield-From Example 2: A Scheduler for Generator-Based Threads ============================================================= Just for fun, let's write a thread scheduler. Each thread will be a generator. Whenever a thread wants to suspend itself, it will do a 'yield'. We won't make any use of values sent or received by yields in this example. We'll start with a global variable to hold the currently running thread. current = None We'll also want a queue of threads that are waiting to run. ready_list = [] The first thing we'll want is a way of getting a thread into the scheduling system. def schedule(g): ready_list.append(g) The core loop of the scheduler will repeatedly take the thread at the head of the queue and run it until it yields. def run(): global current while ready_list: g = ready_list[0] current = g try: g.next() except StopIteration: unschedule(g) else: expire_timeslice(g) If the thread is still at the head of the ready list after it has yielded, we move it to the end, so that the ready threads will run round-robin fashion. def expire_timeslice(g): if ready_list and ready_list[0] is g: del ready_list[0] ready_list.append(g) When the thread finishes, we use the following function to remove it from the scheduling system. def unschedule(g): if g in ready_list: ready_list.remove(g) We've got enough so far to try a simple test. 
def person(name, count): for i in xrange(count): print name, "running" yield schedule(person("John", 2)) schedule(person("Michael", 3)) schedule(person("Terry", 4)) run() We can run this with present-day Python, and we get John running Michael running Terry running John running Michael running Terry running Michael running Terry running Terry running Waiting for Resources --------------------- Things get more interesting when our threads do something non-trivial. Let's turn our people into dining philosophers. For this we'll need a way to represent forks (the eating kind, not the unix kind) and a way for a thread to wait for one to become available. But before launching into that, let's add two more functions to our scheduler that will come in useful. def block(queue): queue.append(current) unschedule(current) This removes the currently running thread from the ready list and adds it to a list that you specify. def unblock(queue): if queue: g = queue.pop(0) schedule(g) This removes the thread at the head of the specified list, if any, and adds it to the ready list. Now we can start implementing an eating utensil. class Utensil: def __init__(self, id): self.id = id self.available = True self.queue = [] The utensil has a flag indicating whether it's available, and a queue of threads waiting to use it. To acquire a utensil, we first check to see whether it is available. If not, we block the current thread on the queue, and then yield. When we get to run again, it's our turn, so we mark the utensil as being in use. def acquire(self): if not self.available: block(self.queue) yield self.available = False To release the utensil, we mark it as available and then unblock the thread at the head of the queue, if any. def release(self): self.available = True unblock(self.queue) Next we need a life cycle for a philosopher. 
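[Editor's note: Greg's scheduler core needs only spelling changes to run on Python 3 (`next(g)` for `g.next()`, `range` for `xrange`, no print statements). This condensed port, collecting names into a list instead of printing, reproduces the round-robin trace of the simple test above.]

```python
ready_list = []
current = None

def schedule(g):
    ready_list.append(g)

def unschedule(g):
    if g in ready_list:
        ready_list.remove(g)

def expire_timeslice(g):
    # If g is still at the head of the ready list, rotate it to the end.
    if ready_list and ready_list[0] is g:
        del ready_list[0]
        ready_list.append(g)

def run():
    global current
    while ready_list:
        g = ready_list[0]
        current = g
        try:
            next(g)              # was g.next() in Python 2
        except StopIteration:
            unschedule(g)
        else:
            expire_timeslice(g)

trace = []

def person(name, count):
    for i in range(count):       # was xrange
        trace.append(name + " running")
        yield

schedule(person("John", 2))
schedule(person("Michael", 3))
schedule(person("Terry", 4))
run()

assert trace == [
    "John running", "Michael running", "Terry running",
    "John running", "Michael running", "Terry running",
    "Michael running", "Terry running",
    "Terry running",
]
```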
def philosopher(name, lifetime, think_time, eat_time, left_fork, right_fork): for i in xrange(lifetime): for j in xrange(think_time): print name, "thinking" yield print name, "waiting for fork", left_fork.id yield from left_fork.acquire() print name, "acquired fork", left_fork.id print name, "waiting for fork", right_fork.id yield from right_fork.acquire() print name, "acquired fork", right_fork.id for j in xrange(eat_time): # They're Python philosophers, so they eat spam rather than spaghetti print name, "eating spam" yield print name, "releasing forks", left_fork.id, "and", right_fork.id left_fork.release() right_fork.release() print name, "leaving the table" Now we can set up a scenario. forks = [Utensil(i) for i in xrange(3)] schedule(philosopher("Plato", 7, 2, 3, forks[0], forks[1])) schedule(philosopher("Socrates", 8, 3, 1, forks[1], forks[2])) schedule(philosopher("Euclid", 5, 1, 4, forks[2], forks[0])) run() We can't run this in current Python, because of the yield-froms. However, we can test it by substituting for-loops such as for _ in left_fork.acquire(): yield After doing this, the output is Plato thinking Socrates thinking Euclid thinking Plato thinking Socrates thinking Euclid waiting for fork 2 Euclid acquired fork 2 Euclid waiting for fork 0 Euclid acquired fork 0 Euclid eating spam ...etc... Waiting for External Events --------------------------- So far our thread system has been completely self-absorbed and unable to deal with the outside world. Let's arrange things so that, if there are no threads ready to run, the scheduler will wait for some file to become readable or writable using select(). It's easiest to do this by writing a new main loop that builds on the previous one. def run2(): while 1: run() if not wait_for_event(): return We will need a data structure to hold threads waiting for files. Each file needs two queues associated with it, for threads waiting to read and write respectively. 
class FdQueues: def __init__(self): self.readq = [] self.writeq = [] We will keep a mapping from file objects to their associated FdQueue instances. fd_queues = {} The following function retrieves the queues for a given fd, creating new ones if they don't already exist. def get_fd_queues(fd): q = fd_queues.get(fd) if not q: q = FdQueues() fd_queues[fd] = q return q Now we can write a new pair of scheduling primitives to block on a file. def block_for_reading(fd): block(get_fd_queues(fd).readq) def block_for_writing(fd): block(get_fd_queues(fd).writeq) It's expected that the thread calling these will immediately yield afterwards. We could incorporate the yield into these functions, but we'll be building higher level functions on top of these shortly, and it will be more convenient to do the yield there. We'll also want a way of removing a file from the fd_queues when we've finished with it, so we'll add a function to close it and clean up. def close_fd(fd): if fd in fd_queues: del fd_queues[fd] fd.close() Now we can write wait_for_event(). It's a bit longwinded, but fairly straightforward. We build lists of file objects having nonempty read or write queues, pass them to select(), and for each one that's ready, we unblock the thread at the head of the relevant queue. If there are no threads waiting on any files, we return False to tell the scheduler there's no more work to do. def wait_for_event(): from select import select read_fds = [] write_fds = [] for fd, q in fd_queues.iteritems(): if q.readq: read_fds.append(fd) if q.writeq: write_fds.append(fd) if not (read_fds or write_fds): return False read_fds, write_fds, _ = select(read_fds, write_fds, []) for fd in read_fds: unblock(fd_queues[fd].readq) for fd in write_fds: unblock(fd_queues[fd].writeq) return True At this point we can try a quick test to see if everything works so far. 
def loop(): while 1: print "Waiting for input" block_for_reading(stdin) yield print "Input is ready" line = stdin.readline() print "Input was:", repr(line) if not line: break schedule(loop()) run2() Sample session: Waiting for input asdf Input is ready Input was: 'asdf\n' Waiting for input qwer Input is ready Input was: 'qwer\n' Waiting for input Input is ready Input was: '' It's not a very convincing test yet, though, since there's only one thread, so let's play around with some sockets and build a multithreaded server. A Spam Server ------------- We're going to implement the following protocol. The client sends the word "SPAM" followed by a number, and the server replies with "100 SPAM FOLLOWS" and the corresponding number of repetitions of the phrase "spam glorious spam". If the requested number is not greater than zero or the request is malformed, the server replies "400 WE ONLY SERVE SPAM". We could do with some higher-level functions for blocking operations on sockets, so let's write a few. First, accepting a connection from a listening socket. def sock_accept(sock): block_for_reading(sock) yield return sock.accept() Now reading a line of text from a socket. We keep reading until the data ends with a newline or EOF is reached. (We're assuming that the client will wait for a reply before sending another line, so we don't have to worry about reading too much.) We also close the socket on EOF, since we won't be reading from it again after that. def sock_readline(sock): buf = "" while buf[-1:] != "\n": block_for_reading(sock) yield data = sock.recv(1024) if not data: break buf += data if not buf: close_fd(sock) return buf Writing data to a socket. We loop until all the data has been written. We don't use sendall(), because it might block, and we don't want to hold up other threads. def sock_write(sock, data): while data: block_for_writing(sock) yield n = sock.send(data) data = data[n:] Now we're ready to write the main loop of the server. 
It will set up a listening socket, then repeatedly accept connections and spawn a thread to handle each one. port = 4200 def listener(): lsock = socket(AF_INET, SOCK_STREAM) lsock.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1) lsock.bind(("", port)) lsock.listen(5) while 1: csock, addr = yield from sock_accept(lsock) print "Listener: Accepted connection from", addr schedule(handler(csock)) The handler function handles the interaction with one client session. def handler(sock): while 1: line = yield from sock_readline(sock) if not line: break try: n = parse_request(line) yield from sock_write(sock, "100 SPAM FOLLOWS\n") for i in xrange(n): yield from sock_write(sock, "spam glorious spam\n") except BadRequest: yield from sock_write(sock, "400 WE ONLY SERVE SPAM\n") The handler uses the following function to parse the request and check it for validity. class BadRequest(Exception): pass def parse_request(line): tokens = line.split() if len(tokens) != 2 or tokens[0] != "SPAM": raise BadRequest try: n = int(tokens[1]) except ValueError: raise BadRequest if n < 1: raise BadRequest return n All we need to do now is spawn the main loop and run the scheduler. schedule(listener()) run2() At this point, I got fed up with expanding yield-from statements by hand, and wrote a program to do it for me. Here's a sample client session: % telnet localhost 4200 Trying 127.0.0.1... Connected to localhost. Escape character is '^]'. SPAM 3 100 SPAM FOLLOWS spam glorious spam spam glorious spam spam glorious spam EGGS 400 WE ONLY SERVE SPAM ^] telnet> Connection closed. % Conclusions ----------- The yield-from statement makes it possible to write thread code using generators almost the same way as you would write ordinary code. Whether it's any easier or clearer than using things like yield Call(g(x)) and yield Return(x) is debatable. 
However, I think this example does show that the implementation of a generator-based scheduler can be very clean and simple when yield-from is available, and if it is suitably optimised, probably more efficient as well. I was going to attach a zip file containing the code I used for testing all this, but the python-ideas mail server seems to be bouncing the message when I do that. I'll try sending it in a separate message. -- Greg From greg.ewing at canterbury.ac.nz Thu Feb 19 22:54:43 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 20 Feb 2009 10:54:43 +1300 Subject: [Python-ideas] Yield-From: Thread testing code Message-ID: <499DD523.5060104@canterbury.ac.nz> It seems that the python-ideas mail server really doesn't like my attachment, so it's time for Plan C. You can download the code from here: http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/Threads.zip Is the mail server configured not to accept attachments or something? The error message I'm getting is Your message cannot be delivered to the following recipients: Recipient address: python-ideas at python.org Reason: SMTP transmission failure has occurred Diagnostic code: smtp;554 permanent error Remote system: dns;mail.python.org (TCP|132.181.2.71|1204|194.109.207.14|25) (bag.python.org ESMTP ) -- Greg From greg.ewing at canterbury.ac.nz Thu Feb 19 22:56:07 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 20 Feb 2009 10:56:07 +1300 Subject: [Python-ideas] Revised^4 PEP on yield-from In-Reply-To: <20090219213757.GA5621@mcnabbs.org> References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> <20090219213757.GA5621@mcnabbs.org> Message-ID: <499DD577.8060106@canterbury.ac.nz> Andrew McNabb wrote: > I think that "e.args[0]" > looks mystical whereas "e.value" is elegant and straightforward. Okay, I'll add this to the PEP. 
-- Greg From greg.ewing at canterbury.ac.nz Thu Feb 19 23:16:44 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 20 Feb 2009 11:16:44 +1300 Subject: [Python-ideas] Revised**5 PEP on yield-from Message-ID: <499DDA4C.8090906@canterbury.ac.nz> Revision 6 of the PEP. * Changed wording slightly to avoid implying that StopIteration(value) can be caught inside the returning generator. * Added a suggestion that StopIteration be given a 'value' attribute. * Mentioned the idea of a GeneratorReturn exception in the Criticisms section. PEP: XXX Title: Syntax for Delegating to a Subgenerator Version: $Revision$ Last-Modified: $Date$ Author: Gregory Ewing Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 13-Feb-2009 Python-Version: 2.7 Post-History: Abstract ======== A syntax is proposed for a generator to delegate part of its operations to another generator. This allows a section of code containing 'yield' to be factored out and placed in another generator. Additionally, the subgenerator is allowed to return with a value, and the value is made available to the delegating generator. The new syntax also opens up some opportunities for optimisation when one generator re-yields values produced by another. Proposal ======== The following new expression syntax will be allowed in the body of a generator: :: yield from where is an expression evaluating to an iterable, from which an iterator is extracted. The iterator is run to exhaustion, during which time it behaves as though it were communicating directly with the caller of the generator containing the ``yield from`` expression (the "delegating generator"). When the iterator is another generator, the effect is the same as if the body of the subgenerator were inlined at the point of the 'yield from' expression. The subgenerator is allowed to execute a 'return' statement with a value, and that value becomes the value of the 'yield from' expression. 
In terms of the iterator protocol:

* Any values that the iterator yields are passed directly to the
  caller.

* Any values sent to the delegating generator using ``send()`` are
  sent directly to the iterator. (If the iterator does not have a
  ``send()`` method, it remains to be decided whether the value sent
  in is ignored or an exception is raised.)

* Calls to the ``throw()`` method of the delegating generator are
  forwarded to the iterator. (If the iterator does not have a
  ``throw()`` method, the thrown-in exception is raised in the
  delegating generator.)

* If the delegating generator's ``close()`` method is called, the
  iterator is finalised before finalising the delegating generator.

* The value of the ``yield from`` expression is the first argument to
  the ``StopIteration`` exception raised by the iterator when it
  terminates.

* ``return expr`` in a generator causes ``StopIteration(expr)`` to be
  raised.

For convenience, the ``StopIteration`` exception will be given a
``value`` attribute that holds its first argument, or None if there
are no arguments.

Formal Semantics
----------------

The statement

::

    result = yield from expr

is semantically equivalent to

::

    _i = iter(expr)
    try:
        _u = _i.next()
        while 1:
            try:
                _v = yield _u
            except Exception, _e:
                if hasattr(_i, 'throw'):
                    _u = _i.throw(_e)
                else:
                    raise
            else:
                if hasattr(_i, 'send'):
                    _u = _i.send(_v)
                else:
                    _u = _i.next()
    except StopIteration, _e:
        _a = _e.args
        if len(_a) > 0:
            result = _a[0]
        else:
            result = None
    finally:
        if hasattr(_i, 'close'):
            _i.close()

Rationale
=========

A Python generator is a form of coroutine, but has the limitation that
it can only yield to its immediate caller. This means that a piece of
code containing a ``yield`` cannot be factored out and put into a
separate function in the same way as other code. Performing such a
factoring causes the called function to itself become a generator, and
it is necessary to explicitly iterate over this second generator and
re-yield any values that it produces.
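(The formal expansion above can be transcribed almost mechanically into
a modern Python 3 helper generator; this is an illustrative sketch, not
the PEP's reference implementation, and ``yield_from``/``counter`` are
invented names.)

```python
# Hand-written stand-in for 'yield from', following the expansion
# above: forward next()/send()/throw() to the iterator, and turn the
# argument of its terminating StopIteration into our own return value.
def yield_from(iterable):
    _i = iter(iterable)
    try:
        _u = next(_i)
        while 1:
            try:
                _v = yield _u
            except Exception as _e:
                if hasattr(_i, 'throw'):
                    _u = _i.throw(_e)
                else:
                    raise
            else:
                if hasattr(_i, 'send'):
                    _u = _i.send(_v)
                else:
                    _u = next(_i)
    except StopIteration as _e:
        return _e.value
    finally:
        if hasattr(_i, 'close'):
            _i.close()

def counter(n):
    for i in range(n):
        yield i
    return n                      # the subgenerator's "result"

# Drive the helper by hand and collect its return value.
vals = []
it = yield_from(counter(3))
try:
    while True:
        vals.append(next(it))
except StopIteration as e:
    ret = e.value
print(vals, ret)                  # [0, 1, 2] 3
```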
If yielding of values is the only concern, this is not very arduous
and can be performed with a loop such as

::

    for v in g:
        yield v

However, if the subgenerator is to interact properly with the caller
in the case of calls to ``send()``, ``throw()`` and ``close()``,
things become considerably more complicated. As the formal expansion
presented above illustrates, the necessary code is very longwinded,
and it is tricky to handle all the corner cases correctly. In this
situation, the advantages of a specialised syntax should be clear.

Generators as Threads
---------------------

A motivating use case for generators being able to return values
concerns the use of generators to implement lightweight threads. When
using generators in that way, it is reasonable to want to spread the
computation performed by the lightweight thread over many functions.
One would like to be able to call a subgenerator as though it were an
ordinary function, passing it parameters and receiving a returned
value.

Using the proposed syntax, a statement such as

::

    y = f(x)

where f is an ordinary function, can be transformed into a delegation
call

::

    y = yield from g(x)

where g is a generator. One can reason about the behaviour of the
resulting code by thinking of g as an ordinary function that can be
suspended using a ``yield`` statement.

When using generators as threads in this way, typically one is not
interested in the values being passed in or out of the yields.
However, there are use cases for this as well, where the thread is
seen as a producer or consumer of items. The ``yield from`` expression
allows the logic of the thread to be spread over as many functions as
desired, with the production or consumption of items occurring in any
subfunction, and the items are automatically routed to or from their
ultimate source or destination.
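(A toy sketch of the lightweight-thread style described here, in the
eventual Python 3 syntax; the "scheduler" is just three manual send()
calls, and all the names are invented.)

```python
# A subgenerator is "called" like a function and its return value
# received, while its yields (simulated I/O requests) still reach the
# outermost driver.
def get_data(source):
    chunk1 = yield ('read', source)    # suspend until "I/O" completes
    chunk2 = yield ('read', source)
    return chunk1 + chunk2

def task(source):
    data = yield from get_data(source) # reads like: data = get_data(source)
    yield ('write', data)

t = task('sock')
print(t.send(None))       # ('read', 'sock')
print(t.send('hello '))   # ('read', 'sock')
print(t.send('world'))    # ('write', 'hello world')
```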
Concerning ``throw()`` and ``close()``, it is reasonable to expect
that if an exception is thrown into the thread from outside, it should
first be raised in the innermost generator where the thread is
suspended, and propagate outwards from there; and that if the thread
is terminated from outside by calling ``close()``, the chain of active
generators should be finalised from the innermost outwards.

Syntax
------

The particular syntax proposed has been chosen as suggestive of its
meaning, while not introducing any new keywords and clearly standing
out as being different from a plain ``yield``.

Optimisations
-------------

Using a specialised syntax opens up possibilities for optimisation
when there is a long chain of generators. Such chains can arise, for
instance, when recursively traversing a tree structure. The overhead
of passing ``next()`` calls and yielded values down and up the chain
can cause what ought to be an O(n) operation to become O(n\*\*2).

A possible strategy is to add a slot to generator objects to hold a
generator being delegated to. When a ``next()`` or ``send()`` call is
made on the generator, this slot is checked first, and if it is
nonempty, the generator that it references is resumed instead. If it
raises StopIteration, the slot is cleared and the main generator is
resumed.

This would reduce the delegation overhead to a chain of C function
calls involving no Python code execution. A possible enhancement would
be to traverse the whole chain of generators in a loop and directly
resume the one at the end, although the handling of StopIteration is
more complicated in that case.

Use of StopIteration to return values
-------------------------------------

There are a variety of ways that the return value from the generator
could be passed back. Some alternatives include storing it as an
attribute of the generator-iterator object, or returning it as the
value of the ``close()`` call to the subgenerator.
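(The innermost-outwards finalisation described at the start of this
section can be observed directly in the Python 3 behaviour that was
eventually adopted; a small sketch with invented names:)

```python
# close() on the delegating generator finalises the innermost
# generator first, then the outer one.
order = []

def inner():
    try:
        yield 1
    finally:
        order.append('inner')

def outer():
    try:
        yield from inner()
    finally:
        order.append('outer')

g = outer()
next(g)        # now suspended inside inner()
g.close()
print(order)   # ['inner', 'outer']
```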
However, the proposed mechanism is attractive for a couple of reasons:

* Using the StopIteration exception makes it easy for other kinds of
  iterators to participate in the protocol without having to grow
  extra attributes or a close() method.

* It simplifies the implementation, because the point at which the
  return value from the subgenerator becomes available is the same
  point at which StopIteration is raised. Delaying until any later
  time would require storing the return value somewhere.

Criticisms
==========

Under this proposal, the value of a ``yield from`` expression would be
derived in a very different way from that of an ordinary ``yield``
expression. This suggests that some other syntax not containing the
word ``yield`` might be more appropriate, but no acceptable
alternative has so far been proposed.

It has been suggested that some mechanism other than ``return`` in the
subgenerator should be used to establish the value returned by the
``yield from`` expression. However, this would interfere with the goal
of being able to think of the subgenerator as a suspendable function,
since it would not be able to return values in the same way as other
functions.

The use of an argument to StopIteration to pass the return value has
been criticised as an "abuse of exceptions", without any concrete
justification of this claim. In any case, this is only one suggested
implementation; another mechanism could be used without losing any
essential features of the proposal.

It has been suggested that a different exception, such as
GeneratorReturn, should be used instead of StopIteration to return a
value. However, no convincing practical reason for this has been put
forward, and the addition of a ``value`` attribute to StopIteration
mitigates any difficulties in extracting a return value from a
StopIteration exception that may or may not have one.
Also, using a different exception would mean that, unlike ordinary
functions, 'return' without a value in a generator would not be
equivalent to 'return None'.

Alternative Proposals
=====================

Proposals along similar lines have been made before, some using the
syntax ``yield *`` instead of ``yield from``. While ``yield *`` is
more concise, it could be argued that it looks too similar to an
ordinary ``yield`` and the difference might be overlooked when reading
code.

To the author's knowledge, previous proposals have focused only on
yielding values, and thereby suffered from the criticism that the
two-line for-loop they replace is not sufficiently tiresome to write
to justify a new syntax. By also dealing with calls to ``send()``,
``throw()`` and ``close()``, this proposal provides considerably more
benefit.

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From rhamph at gmail.com  Thu Feb 19 23:49:46 2009
From: rhamph at gmail.com (Adam Olsen)
Date: Thu, 19 Feb 2009 15:49:46 -0700
Subject: [Python-ideas] Yield-From Example 2: Scheduler
In-Reply-To: <499DCFE7.2020407@canterbury.ac.nz>
References: <499DCFE7.2020407@canterbury.ac.nz>
Message-ID:

On Thu, Feb 19, 2009 at 2:32 PM, Greg Ewing wrote:
> Conclusions
> -----------
>
> The yield-from statement makes it possible to write thread code using
> generators almost the same way as you would write ordinary code.
>
> Whether it's any easier or clearer than using things like yield
> Call(g(x)) and yield Return(x) is debatable. However, I think this
> example does show that the implementation of a generator-based
> scheduler can be very clean and simple when yield-from is available,
> and if it is suitably optimised, probably more efficient as well.

You don't use plain "yield" except internally, so you don't even need
"yield Call(g(x))".
You only need "yield g(x)" and your scheduler can assume it's a
generator. "yield from g(x)" provides no value to the user (it's
simply a longer spelling of the same thing).

The debatable value to the scheduler is a one time cost. It'd have to
be quite massive to warrant imposing new syntax on everybody else. The
fact that such schedulers already exist is a pretty strong argument
that it's not a massive cost.

--
Adam Olsen, aka Rhamphoryncus

From solipsis at pitrou.net  Thu Feb 19 23:54:02 2009
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 19 Feb 2009 22:54:02 +0000 (UTC)
Subject: [Python-ideas] Revised^4 PEP on yield-from
References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> <499DC2E4.5040903@canterbury.ac.nz>
Message-ID:

Greg Ewing writes:
>
> > It should be relatively easy to avoid O(n**2) behaviour when traversing
> > a tree,
>
> How?

By doing the traversal iteratively rather than recursively. Well, I
admit the following function took a couple of attempts to get right:

def traverse_depth_first(tree):
    stack = []
    yield tree.value
    it = iter(tree.children)
    while True:
        try:
            child = it.next()
        except StopIteration:
            if not stack:
                raise
            it, tree = stack.pop()
        else:
            stack.append((it, tree))
            tree = child
            yield tree.value
            it = iter(tree.children)

From greg.ewing at canterbury.ac.nz  Fri Feb 20 00:09:02 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 20 Feb 2009 12:09:02 +1300
Subject: [Python-ideas] Revised^4 PEP on yield-from
In-Reply-To:
References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> <499DC2E4.5040903@canterbury.ac.nz>
Message-ID: <499DE68E.2020605@canterbury.ac.nz>

Antoine Pitrou wrote:

> By doing the traversal iteratively rather than recursively. Well, I admit
> the following function took a couple of attempts to get right:

It's also a totally unreasonable amount of obfuscation to endure just
to be able to traverse the tree with a generator.

--
Greg

From solipsis at pitrou.net  Fri Feb 20 00:16:26 2009
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 19 Feb 2009 23:16:26 +0000 (UTC)
Subject: [Python-ideas] Revised^4 PEP on yield-from
References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> <499DC2E4.5040903@canterbury.ac.nz> <499DE68E.2020605@canterbury.ac.nz>
Message-ID:

Greg Ewing writes:
>
> It's also a totally unreasonable amount of obfuscation to
> endure just to be able to traverse the tree with a generator.

Greg, I find this qualification ("obfuscation") a bit offensive...
It's certainly a matter of taste, and, while it's less straightforward
than an explicitly recursive traversal, I don't find that particular
chunk of code obfuscated at all. To me, it's not harder to understand
than the various examples of "yield from" use you have posted to
justify that feature.

(and, actually, I don't understand how "yield from" helps for a
depth-first traversal. Could you post an example of it?)

Regards

Antoine.

From greg.ewing at canterbury.ac.nz  Fri Feb 20 01:24:32 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 20 Feb 2009 13:24:32 +1300
Subject: [Python-ideas] Revised^4 PEP on yield-from
In-Reply-To:
References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> <499DC2E4.5040903@canterbury.ac.nz> <499DE68E.2020605@canterbury.ac.nz>
Message-ID: <499DF840.3050100@canterbury.ac.nz>

Antoine Pitrou wrote:

> Greg, I find this qualification ("obfuscation") a bit offensive...

Sorry, I don't mean that personally.
The fact is that it does look obfuscated to my eyes, and I'd be
surprised if I were the only person who thinks so.

> (and, actually, I don't understand how "yield from" helps for a depth-first
> traversal. Could you post an example of it?)

Traversing a binary tree with a non-generator:

def traverse(node):
    if node:
        process_node(node)
        traverse(node.left)
        traverse(node.right)

Traversing it with a generator:

def traverse(node):
    if node:
        yield process_node(node)
        yield from traverse(node.left)
        yield from traverse(node.right)

Do you still think an unrolled version would be equally clear? If so,
you have extremely different tastes from me!

--
Greg

From solipsis at pitrou.net  Fri Feb 20 01:39:56 2009
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 20 Feb 2009 00:39:56 +0000 (UTC)
Subject: [Python-ideas] Revised^4 PEP on yield-from
References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> <499DC2E4.5040903@canterbury.ac.nz> <499DE68E.2020605@canterbury.ac.nz> <499DF840.3050100@canterbury.ac.nz>
Message-ID:

Greg Ewing writes:
>
> Traversing it with a generator:
>
> def traverse(node):
>     if node:
>         yield process_node(node)
>         yield from traverse(node.left)
>         yield from traverse(node.right)
>
> Do you still think an unrolled version would be
> equally clear? If so, you have extremely different
> tastes from me!

Of course, I admit the "yield from" version is simpler :) However, if
there isn't a specialized optimization in the interpreter, it will
also probably be slower (because it switches between frames a lot,
which is likely expensive, although I don't know of any timings).

Besides, my point was to show that you didn't /need/ "yield from" to
write a linear traversal generator, and the 15 or so lines of that
generator are sufficiently generic to be reused from project to
project.
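(For what it's worth, Greg's generator version above runs unchanged on a
modern Python 3 interpreter, where ``yield from`` was eventually added;
here it is made self-contained with a stand-in Node class and a
process_node that simply returns the node's value — both invented names.)

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def process_node(node):
    return node.value          # stand-in for whatever processing is wanted

def traverse(node):
    if node:
        yield process_node(node)
        yield from traverse(node.left)
        yield from traverse(node.right)

# Pre-order traversal of a small tree.
tree = Node(1, Node(2, Node(4)), Node(3))
print(list(traverse(tree)))    # [1, 2, 4, 3]
```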
(my opinion on your PEP being that it brings the complication inside the interpreter itself, especially if you want to implement the feature in an optimized way. I haven't read the scheduler example yet, though...) Regards Antoine. From greg.ewing at canterbury.ac.nz Fri Feb 20 01:48:29 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 20 Feb 2009 13:48:29 +1300 Subject: [Python-ideas] Revised^4 PEP on yield-from In-Reply-To: References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> <499DC2E4.5040903@canterbury.ac.nz> <499DE68E.2020605@canterbury.ac.nz> <499DF840.3050100@canterbury.ac.nz> Message-ID: <499DFDDD.5050504@canterbury.ac.nz> Antoine Pitrou wrote: > However, if there > isn't a specialized optimization in the interpreter, it will also probably be > slower (because it switches between frames a lot That's true, but I'm 99.9% sure that if it's implemented at all then it will be implemented fairly efficiently, because doing so is actually easier than implementing it inefficiently.:-) > (my opinion on your PEP being that it brings the complication inside the > interpreter itself, especially if you want to implement the feature in an > optimized way. Bringing it into the interpreter is what makes it possible, and fairly straightforward, to implement it efficiently. Hopefully this will become clearer when I get a reference implementation going. 
--
Greg

From jh at improva.dk  Fri Feb 20 01:18:24 2009
From: jh at improva.dk (Jacob Holm)
Date: Fri, 20 Feb 2009 01:18:24 +0100
Subject: [Python-ideas] Revised^4 PEP on yield-from
In-Reply-To:
References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> <499DC2E4.5040903@canterbury.ac.nz> <499DE68E.2020605@canterbury.ac.nz>
Message-ID: <499DF6D0.3070300@improva.dk>

Antoine Pitrou wrote:

> (and, actually, I don't understand how "yield from" helps for a depth-first
> traversal. Could you post an example of it?)

Antoine, I expect something like:

def traverse_depth_first(tree):
    yield tree.value
    for child in tree.children:
        yield from traverse_depth_first(child)

to be semantically equivalent and *much* easier to read than your
version.

If we use the expansion listed in the PEP as the implementation of
"yield from", we have the O(n**2) performance mentioned. I *know* we
can do better than that, but I don't (yet) know enough about the
python internals to tell you how.
I am +1 on the PEP assuming we find a way around the O(n**2) behavior,
+0.75 if not :)

Regards
Jacob

From rhamph at gmail.com  Fri Feb 20 01:50:26 2009
From: rhamph at gmail.com (Adam Olsen)
Date: Thu, 19 Feb 2009 17:50:26 -0700
Subject: [Python-ideas] Revised^4 PEP on yield-from
In-Reply-To: <499DF840.3050100@canterbury.ac.nz>
References: <499C6E00.2030602@canterbury.ac.nz> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> <499DC2E4.5040903@canterbury.ac.nz> <499DE68E.2020605@canterbury.ac.nz> <499DF840.3050100@canterbury.ac.nz>
Message-ID:

On Thu, Feb 19, 2009 at 5:24 PM, Greg Ewing wrote:
> Traversing a binary tree with a non-generator:
>
> def traverse(node):
>     if node:
>         process_node(node)
>         traverse(node.left)
>         traverse(node.right)
>
> Traversing it with a generator:
>
> def traverse(node):
>     if node:
>         yield process_node(node)
>         yield from traverse(node.left)
>         yield from traverse(node.right)
>
> Do you still think an unrolled version would be
> equally clear? If so, you have extremely different
> tastes from me!

This is a pretty good example, IMO. However, I'd like to see what a
trampoline would look like to support something like this:

@trampoline
def traverse(node):
    if node:
        yield leaf(process_node(node))
        yield traverse(node.left)
        yield traverse(node.right)

If the use case is sufficiently common we can consider putting such a
trampoline in the stdlib. If not it should at least go in the
cookbook.

And FWIW, a C implementation of such a trampoline should be almost
identical to what the PEP proposes. It's just substituting a type
check for the new syntax.
-- Adam Olsen, aka Rhamphoryncus From guido at python.org Fri Feb 20 01:53:27 2009 From: guido at python.org (Guido van Rossum) Date: Thu, 19 Feb 2009 16:53:27 -0800 Subject: [Python-ideas] Revised^4 PEP on yield-from In-Reply-To: <499DFDDD.5050504@canterbury.ac.nz> References: <499C6E00.2030602@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> <499DC2E4.5040903@canterbury.ac.nz> <499DE68E.2020605@canterbury.ac.nz> <499DF840.3050100@canterbury.ac.nz> <499DFDDD.5050504@canterbury.ac.nz> Message-ID: On Thu, Feb 19, 2009 at 4:48 PM, Greg Ewing wrote: > Antoine Pitrou wrote: >> >> However, if there >> isn't a specialized optimization in the interpreter, it will also probably >> be >> slower (because it switches between frames a lot > > That's true, but I'm 99.9% sure that if it's implemented > at all then it will be implemented fairly efficiently, > because doing so is actually easier than implementing > it inefficiently.:-) Hey Greg, I think your efforts now should be focused on the implementation and not on continued arguing with unbelievers. Code speaks. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Fri Feb 20 02:04:48 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 20 Feb 2009 14:04:48 +1300 Subject: [Python-ideas] Yield-From Example 2: Scheduler In-Reply-To: References: <499DCFE7.2020407@canterbury.ac.nz> Message-ID: <499E01B0.3090102@canterbury.ac.nz> Adam Olsen wrote: > You don't use plain "yield" except internally, so you don't even need > "yield Call(g(x))". You only need "yield g(x)" and your scheduler can > assume it's a generator. I was about to answer that you need some way to distinguish between calls and returns, but if we have StopIteration(value) then that's not true. So you're right, this is a viable way of implementing generator scheduling. So there perhaps isn't a very strong case for using 'yield from' in a scheduler setting. 
However it still illustrates the utility of being able to return a value from a generator, since otherwise you would either have to explicitly raise the StopIteration or resort to some convention such as yield Call(g)/yield Return(x). Also there may be a performance advantage, since there will be some cost involved in using Python code to manage the generator stack and route returned values to the right places. In a wider setting, I still think there is benefit in being able to abstract out pieces of generator code without needing to have a special driver. -- Greg From rhamph at gmail.com Fri Feb 20 02:16:45 2009 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 19 Feb 2009 18:16:45 -0700 Subject: [Python-ideas] Revised^4 PEP on yield-from In-Reply-To: References: <499C6E00.2030602@canterbury.ac.nz> <499DC2E4.5040903@canterbury.ac.nz> <499DE68E.2020605@canterbury.ac.nz> <499DF840.3050100@canterbury.ac.nz> <499DFDDD.5050504@canterbury.ac.nz> Message-ID: On Thu, Feb 19, 2009 at 5:53 PM, Guido van Rossum wrote: > Hey Greg, I think your efforts now should be focused on the > implementation and not on continued arguing with unbelievers. Code > speaks. Speaking of code... 
# trampoline implementation

class trampoline(object):
    def __init__(self, func, instance=None):
        self.func = func
        self.instance = instance
    def __get__(self, obj, type=None):
        return trampoline(self.func, obj)
    def __call__(self, *args, **kwargs):
        if self.instance is not None:
            args = (self.instance,) + args
        return trampolineiter(self.func(*args, **kwargs))

class trampolineiter(object):
    def __init__(self, iterable):
        self.stack = [iterable]
    def __iter__(self):
        return self
    def next(self):
        while True:
            try:
                x = self.stack[-1].next()
            except StopIteration:
                self.stack.pop(-1)
                if not self.stack:
                    raise
            else:
                if isinstance(x, trampolineiter):
                    assert len(x.stack) == 1
                    self.stack.append(x.stack[0])
                elif isinstance(x, leaf):
                    return x.data
                else:
                    raise TypeError("Unexpected type yielded to trampoline")

class leaf(object):
    def __init__(self, data):
        self.data = data

# traverse helper (assumed): trampolined function referenced by
# Tree.__iter__ below, needed for the example session to work
@trampoline
def traverse(node):
    if node:
        yield leaf(node)
        yield traverse(node.left)
        yield traverse(node.right)

# Example usage
class Tree(object):
    def __init__(self, name, left, right):
        self.name = name
        self.left = left
        self.right = right
    @trampoline
    def __iter__(self):
        if self:
            yield leaf(self)
            yield traverse(self.left)
            yield traverse(self.right)

>>> mytree = Tree(0, Tree(1, None, Tree(2, None, None)), Tree(3, None, None))
>>> for node in mytree:
...     print "Found:", node.name
...
Found: 0
Found: 1
Found: 2
Found: 3

--
Adam Olsen, aka Rhamphoryncus

From rhamph at gmail.com  Fri Feb 20 02:33:20 2009
From: rhamph at gmail.com (Adam Olsen)
Date: Thu, 19 Feb 2009 18:33:20 -0700
Subject: [Python-ideas] Yield-From Example 2: Scheduler
In-Reply-To: <499E01B0.3090102@canterbury.ac.nz>
References: <499DCFE7.2020407@canterbury.ac.nz> <499E01B0.3090102@canterbury.ac.nz>
Message-ID:

On Thu, Feb 19, 2009 at 6:04 PM, Greg Ewing wrote:
> So there perhaps isn't a very strong case for using 'yield
> from' in a scheduler setting.
However it still illustrates
> the utility of being able to return a value from a generator,
> since otherwise you would either have to explicitly raise
> the StopIteration or resort to some convention such as
> yield Call(g)/yield Return(x).

If you can propose semantics for return that don't risk silently
eating values then I'll at least give -0. Until then I'm -1.

> In a wider setting, I still think there is benefit in
> being able to abstract out pieces of generator code
> without needing to have a special driver.

Microthreads need the special driver anyway, to handle all the other
kinds of resources.

--
Adam Olsen, aka Rhamphoryncus

From guido at python.org  Fri Feb 20 02:47:03 2009
From: guido at python.org (Guido van Rossum)
Date: Thu, 19 Feb 2009 17:47:03 -0800
Subject: [Python-ideas] Yield-From Example 2: Scheduler
In-Reply-To:
References: <499DCFE7.2020407@canterbury.ac.nz> <499E01B0.3090102@canterbury.ac.nz>
Message-ID:

On Thu, Feb 19, 2009 at 5:33 PM, Adam Olsen wrote:
> On Thu, Feb 19, 2009 at 6:04 PM, Greg Ewing wrote:
>> So there perhaps isn't a very strong case for using 'yield
>> from' in a scheduler setting. However it still illustrates
>> the utility of being able to return a value from a generator,
>> since otherwise you would either have to explicitly raise
>> the StopIteration or resort to some convention such as
>> yield Call(g)/yield Return(x).
>
> If you can propose semantics for return that don't risk silently
> eating values then I'll at least give -0. Until then I'm -1.

Can you explain why you feel so strongly about this? As has been
pointed out, returned values are routinely ignored in Python, it is
not an error.

>> In a wider setting, I still think there is benefit in
>> being able to abstract out pieces of generator code
>> without needing to have a special driver.
>
> Microthreads need the special driver anyway, to handle all the other
> kinds of resources.
I consider "yield from A" *primarily* a shorthand (with optimization
potential) for the oft-repeated clause

    for x in A:
        yield x

Given that this is a useful thing to have anyways, we might as well
give it subtle extra semantics for pass-through and returning values.
After all that is what .send(), .throw() and .close() already do for
generators: most generators don't use these, but they allow certain
advanced things to be built out of generators with more ease and
grace.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Fri Feb 20 03:13:58 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 20 Feb 2009 15:13:58 +1300
Subject: [Python-ideas] Yield-From Example 2: Scheduler
In-Reply-To:
References: <499DCFE7.2020407@canterbury.ac.nz> <499E01B0.3090102@canterbury.ac.nz>
Message-ID: <499E11E6.3060905@canterbury.ac.nz>

Adam Olsen wrote:

> If you can propose semantics for return that don't risk silently
> eating values then I'll at least give -0. Until then I'm -1.

I'm still far from convinced that "eating" return values is a bad
thing. Do you worry that a value returned from an ordinary function
gets eaten if the caller ignores it? If not, why should you worry
about this with generators?

> Microthreads need the special driver anyway, to handle all the other
> kinds of resources.

Yes, but if all you want is to recurse over a tree, and don't want to
block on resources etc. in it anywhere, having to use a thread
framework seems overkill.

--
Greg

From greg.ewing at canterbury.ac.nz  Fri Feb 20 03:24:13 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 20 Feb 2009 15:24:13 +1300
Subject: [Python-ideas] More alternative names for yield-from
Message-ID: <499E144D.2020900@canterbury.ac.nz>

I still wish I could find a better name for it.
Some more ideas:

    y = suspendable g(x)

    y = suspending g(x)

    y = sus g(x)   # if you want something short & snappy

--
Greg

From scott+python-ideas at scottdial.com  Fri Feb 20 04:01:51 2009
From: scott+python-ideas at scottdial.com (Scott Dial)
Date: Thu, 19 Feb 2009 22:01:51 -0500
Subject: [Python-ideas] More alternative names for yield-from
In-Reply-To: <499E144D.2020900@canterbury.ac.nz>
References: <499E144D.2020900@canterbury.ac.nz>
Message-ID: <499E1D1F.8080606@scottdial.com>

Greg Ewing wrote:
> I still wish I could find a better name for it.
> Some more ideas:
>
> y = suspendable g(x)
>
> y = suspending g(x)
>
> y = sus g(x) # if you want something short & snappy
>

Along with "call", all of these names don't seem to cater to the
majority of people who are going to use it as a replacement for the
"for x in i: yield x" pattern. I haven't seen anyone (with that use
case in mind) who has argued that "yield from" was a bad name. OTOH, I
have seen several arguments about the threading use case being
questionable, but maybe I mistakenly associate those words with that
use.

Nevertheless, I think there would be a sizable number of people
disappointed to find out "suspendable" is a reserved word. But, I
think if you were going down the other path with names, you would end
up thinking about things like:

    y = yield for g(x)

-Scott

--
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu

From charlie137 at gmail.com  Fri Feb 20 04:02:13 2009
From: charlie137 at gmail.com (Guillaume Chereau)
Date: Fri, 20 Feb 2009 11:02:13 +0800
Subject: [Python-ideas] More alternative names for yield-from
In-Reply-To: <499E144D.2020900@canterbury.ac.nz>
References: <499E144D.2020900@canterbury.ac.nz>
Message-ID: <8e9327d40902191902x15ab268dw2accbea928a99f3c@mail.gmail.com>

What about:

    y = redirect g(x)

    y = yields g(x)

-Gui

On Fri, Feb 20, 2009 at 10:24 AM, Greg Ewing wrote:
> I still wish I could find a better name for it.
> Some more ideas:
>
> y = suspendable g(x)
>
> y = suspending g(x)
>
> y = sus g(x) # if you want something short & snappy
>
> --
> Greg
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

--
http://charlie137.blogspot.com/

From greg.ewing at canterbury.ac.nz  Fri Feb 20 04:09:25 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 20 Feb 2009 16:09:25 +1300
Subject: [Python-ideas] Yield-from: Suspendable generators
Message-ID: <499E1EE5.2020909@canterbury.ac.nz>

Whether you're using yield-from or not, it doesn't seem to be possible
to have a for-loop iterating over something that is also suspendable
in a generator-based thread setting. The basic problem is that we have
one channel and two different things we want to use it for.

The obvious answer is that we need to multiplex. Since we're already
using the entire bandwidth of the channel (anything could be a valid
yielded value) we need to introduce some out-of-band data somehow.

Suppose we have a new expression

    suspend [<expr>] [with <tag>]

This is a lot like a yield, except that it sends a tuple (value, tag).
The existing yield expression

    yield <expr>

becomes equivalent to

    suspend <expr> with 'yield'

There is a new generator method to go along with this:

    g.resume(value, tag)

If the generator is suspended at a suspend expression, the value of
the suspend expression becomes (value, tag). If it is suspended at a
yield, and the tag is 'yield' then the value becomes the value of the
yield expression. (Not sure what to do in other cases, maybe raise an
exception.)

The existing send() method is mapped to resume() as follows:

    def send(self, value):
        value2, tag2 = self.resume(value, 'yield')
        if tag2 == 'yield':
            return value2
        else:
            pass  # What to do here? Ignore? Raise an exception?
So we've generalised the yield channel into a suspend channel, which can have any number of sub-channels. We have reserved one of these sub-channels, tagged with 'yield', for carrying yielded values. The rest of the channels are free for use by other things such as thread-scheduling libraries. To complete this, we also need a variant of the for-loop that is willing to pass values from the other channels on to the caller. Picking a random syntax for illustration, for y from g(x): # body would be roughly equivalent to it = g(x) value = None tag = 'yield' try: while 1: value, tag = it.resume(value, tag) if tag == 'yield': y = value value = None # body else: value, tag = suspend value with tag except StopIteration: pass plus suitable handling of 'throw' and 'close'. -- Greg From greg.ewing at canterbury.ac.nz Fri Feb 20 04:15:22 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 20 Feb 2009 16:15:22 +1300 Subject: [Python-ideas] More alternative names for yield-from In-Reply-To: <499E1D1F.8080606@scottdial.com> References: <499E144D.2020900@canterbury.ac.nz> <499E1D1F.8080606@scottdial.com> Message-ID: <499E204A.8050500@canterbury.ac.nz> Scott Dial wrote: > Along with "call", all of these names don't seem to cater to the > majority of people who are going use it as a replacement for the > "for x in i: yield x" pattern. Yes, it seems to be difficult to find a single name that fits all the use cases. I want something that means "Do what that stuff would have done if I'd written it all out right here." Anyone got a really good thesaurus handy? -- Greg From turnbull at sk.tsukuba.ac.jp Fri Feb 20 04:17:10 2009 From: turnbull at sk.tsukuba.ac.jp (Stephen J. 
Turnbull) Date: Fri, 20 Feb 2009 12:17:10 +0900 Subject: [Python-ideas] A suggestion: Do proto-PEPs in Google Docs In-Reply-To: References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> Message-ID: <87tz6ppzk9.fsf@xemacs.org> [Aside to Guido: Oops, I think I accidentally sent you a contentless reply. Sorry!] As a suggestion, I think this is relevant to everybody who might be writing a PEP, so I'm cross-posting to Python-Dev. Probably no discussion is needed, but Reply-To is set to Python-Ideas. On Python-Ideas, Guido van Rossum writes: > On Thu, Feb 19, 2009 at 2:12 AM, Greg Ewing wrote: > > Fifth draft of the PEP. Re-worded a few things slightly > > to hopefully make the proposal a bit clearer up front. > > Wow, how I long for the days when we routinely put things like this > under revision control so its easy to compare versions. FWIW, Google Docs is almost there. Working with Brett et al on early drafts of PEP 0374 was easy and pleasant, and Google Docs gives control of access to the document to the editor, not the Subversion admin. The ability to make comments that are not visible to non-editors was nice. Now that it's in Subversion it's much less convenient for me (a non-committer). I actually have to *decide* to work on it, rather than simply raising a browser window, hitting "refresh" and fixing a typo or two (then back to "day job" work). The main problem with Google Docs is that it records a revision automatically every so often (good) but doesn't prune the automatic commits (possibly hard to do efficiently) OR mark user saves specially (easy to do). This lack of marking "important" revisions makes the diff functionality kind of tedious. I don't know how automatic the conversion to reST was, but the PEP in Subversion is a quite accurate conversion of the Google Doc version.
Overall, I recommend use of Google Docs for "Python-Ideas" level of PEP drafts. From charles_c_balkon at yahoo.com Fri Feb 20 04:14:30 2009 From: charles_c_balkon at yahoo.com (charles balkon) Date: Thu, 19 Feb 2009 19:14:30 -0800 (PST) Subject: [Python-ideas] a portable python Message-ID: <786791.8081.qm@web51510.mail.re2.yahoo.com> I would like to use python but I really hate the way you guys change versions like underwear. If it could be possible could there be a separate version of python like "IcePython" that would be a executable with a bz2 file containing all the py files for the modules. Then when i run the script i would run this single executable and it would dig into it's own version of py files hidden in the bz2. This would make life much easier since I could ignore the rapid changes in the api of python until i'm ready to move code to a new version. I think there's a reason corporations want long 5 year versions of linux. This would help when you have many different boxes running many different oses and many different python versions. it would be more of a write once with this executable version and it's bz2 file tagging along. Then no matter what box i want to run it on i just drop 3 files and it will run there. -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Fri Feb 20 05:45:55 2009 From: brett at python.org (Brett Cannon) Date: Thu, 19 Feb 2009 20:45:55 -0800 Subject: [Python-ideas] a portable python In-Reply-To: <786791.8081.qm@web51510.mail.re2.yahoo.com> References: <786791.8081.qm@web51510.mail.re2.yahoo.com> Message-ID: On Thu, Feb 19, 2009 at 19:14, charles balkon wrote: > I would like to use python but I really hate the way you guys change > versions > like underwear. > Clean underwear is a good thing. > > If it could be possible could there be a separate version of python like > "IcePython" > that would be a executable with a bz2 file containing all the py files for > the modules. 
No, it's not worth our time when third-parties have already solved this. > > Then when i run the script i would run this single executable and it would > dig into it's own version of py files hidden in the bz2. > Sounds like you are on UNIX, so look at virtualenv or some other similar solution to py2exe/py2app for UNIX. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: From collinw at gmail.com Fri Feb 20 06:03:59 2009 From: collinw at gmail.com (Collin Winter) Date: Thu, 19 Feb 2009 21:03:59 -0800 Subject: [Python-ideas] [Python-Dev] A suggestion: Do proto-PEPs in Google Docs In-Reply-To: <87tz6ppzk9.fsf@xemacs.org> References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> <87tz6ppzk9.fsf@xemacs.org> Message-ID: <43aa6ff70902192103g70d282e0p20fcb0c961d407d0@mail.gmail.com> On Thu, Feb 19, 2009 at 7:17 PM, Stephen J. Turnbull wrote: > On Python-Ideas, Guido van Rossum writes: > > > On Thu, Feb 19, 2009 at 2:12 AM, Greg Ewing wrote: > > > > Fifth draft of the PEP. Re-worded a few things slightly > > > to hopefully make the proposal a bit clearer up front. > > > > Wow, how I long for the days when we routinely put things like this > > under revision control so its easy to compare versions. > > FWIW, Google Docs is almost there. Working with Brett et al on early > drafts of PEP 0374 was easy and pleasant, and Google Docs gives > control of access to the document to the editor, not the Subversion > admin. The ability to make comments that are not visible to > non-editors was nice. Now that it's in Subversion it's much less > convenient for me (a non-committer). I actually have to *decide* to > work on it, rather than simply raising a browser window, hitting > "refresh" and fixing a typo or two (then back to "day job" work). 
> > The main problem with Google Docs is that is records a revision > automatically every so often (good) but doesn't prune the automatic > commits (possibly hard to do efficiently) OR mark user saves specially > (easy to do). This lack of marking "important" revisions makes the > diff functionality kind of tedious. > > I don't know how automatic the conversion to reST was, but the PEP in > Subversion is a quite accurate conversion of the Google Doc version. > > Overall, I recommend use of Google Docs for "Python-Ideas" level of > PEP drafts. Rietveld would also be a good option: it offers more at-will revision control (rather than "whenever Google Docs decides"), allows you to attach comments to the revisions, and will give you nice diffs between PEP iterations. Collin From rhamph at gmail.com Fri Feb 20 06:07:20 2009 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 19 Feb 2009 22:07:20 -0700 Subject: [Python-ideas] Yield-From Example 2: Scheduler In-Reply-To: References: <499DCFE7.2020407@canterbury.ac.nz> <499E01B0.3090102@canterbury.ac.nz> Message-ID: On Thu, Feb 19, 2009 at 6:47 PM, Guido van Rossum wrote: > On Thu, Feb 19, 2009 at 5:33 PM, Adam Olsen wrote: >> On Thu, Feb 19, 2009 at 6:04 PM, Greg Ewing wrote: >>> So there perhaps isn't a very strong case for using 'yield >>> from' in a scheduler setting. However it still illustrates >>> the utility of being able to return a value from a generator, >>> since otherwise you would either have to explicitly raise >>> the StopIteration or resort to some convention such as >>> yield Call(g)/yield Return(x). >> >> If you can propose semantics for return that don't risk silently >> eating values then I'll at least gives -0. Until then I'm -1. > > Can you explain why you feel so strongly about this? As has been > pointed out, returned values are routinely ignored in Python, it is > not an error. It's more of a gut instinct thing. It'd be hard to make ignored returns into an error, but it's easy here. 
-- Adam Olsen, aka Rhamphoryncus From rwgk at yahoo.com Fri Feb 20 06:52:51 2009 From: rwgk at yahoo.com (Ralf W. Grosse-Kunstleve) Date: Thu, 19 Feb 2009 21:52:51 -0800 (PST) Subject: [Python-ideas] TypeError: 'module' object is not callable Message-ID: <419932.53278.qm@web111414.mail.gq1.yahoo.com> I see there have been discussion about module __call__ about three years ago: http://mail.python.org/pipermail/python-list/2006-February/thread.html#366176 Is there an existing pronouncement on this subject? __call__ would help avoiding strange things like from StringIO import StringIO or having to come up with silly names like run, driver, manager, etc. Ideally, __call__ could be either a function or class. I imagine, nothing special, except that a module object looks for __call__ instead of producing a type error. From pyideas at rebertia.com Fri Feb 20 08:11:48 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Thu, 19 Feb 2009 23:11:48 -0800 Subject: [Python-ideas] TypeError: 'module' object is not callable In-Reply-To: <419932.53278.qm@web111414.mail.gq1.yahoo.com> References: <419932.53278.qm@web111414.mail.gq1.yahoo.com> Message-ID: <50697b2c0902192311s57c0121bj6f8f0c2e0a70b485@mail.gmail.com> On Thu, Feb 19, 2009 at 9:52 PM, Ralf W. Grosse-Kunstleve wrote: > > I see there have been discussion about module __call__ about three years ago: > > http://mail.python.org/pipermail/python-list/2006-February/thread.html#366176 > > Is there an existing pronouncement on this subject? > > __call__ would help avoiding strange things like from StringIO import StringIO > or having to come up with silly names like run, driver, manager, etc. > > Ideally, __call__ could be either a function or class. > > I imagine, nothing special, except that a module object looks for __call__ instead > of producing a type error. IMHO, that seems like it would unreasonably blur the line between whether a module is a class or an instance. 
If it has a __call__ definition in it, that would seem to imply that it is somehow a class. But since you want to be able to call it on the module itself, that seems to suggest that the module is an instance; but in that case, lookup would start in the class 'module', not the module itself, and thus fail. Seems your proposal would require modules to be some strange hybrid and an exception to the normal Python rules. On this, I defer to the Zen: " Special cases aren't special enough to break the rules." I don't find the StringIO case very odd, though I do think renaming the module (and others in similar situations) to comply with PEP8 (i.e. "stringio" ) would help. Cheers, Chris -- Follow the path of the Iguana... http://rebertia.com From denis.spir at free.fr Fri Feb 20 09:07:21 2009 From: denis.spir at free.fr (spir) Date: Fri, 20 Feb 2009 09:07:21 +0100 Subject: [Python-ideas] More alternative names for yield-from In-Reply-To: <499E1D1F.8080606@scottdial.com> References: <499E144D.2020900@canterbury.ac.nz> <499E1D1F.8080606@scottdial.com> Message-ID: <20090220090721.4b9e0041@o> Le Thu, 19 Feb 2009 22:01:51 -0500, Scott Dial s'exprima ainsi: > But, I think if you were going down the other path with names, you would > end up thinking about things like: > > y = yield for g(x) + .99 ------ la vita e estrany From cmjohnson.mailinglist at gmail.com Fri Feb 20 12:39:16 2009 From: cmjohnson.mailinglist at gmail.com (Carl Johnson) Date: Fri, 20 Feb 2009 01:39:16 -1000 Subject: [Python-ideas] More alternative names for yield-from In-Reply-To: <3bdda690902200338m15f52f12l89aad6498a922b4b@mail.gmail.com> References: <499E144D.2020900@canterbury.ac.nz> <499E1D1F.8080606@scottdial.com> <499E204A.8050500@canterbury.ac.nz> <3bdda690902200338m15f52f12l89aad6498a922b4b@mail.gmail.com> Message-ID: <3bdda690902200339i59b24df7l18f518ec98d87821@mail.gmail.com> On Thu, Feb 19, 2009 at 5:15 PM, Greg Ewing wrote: > I want something that means "Do what that stuff > would have done if I'd 
written it all out right > here." Anyone got a really good thesaurus handy? > "Macro". I wonder if this isn't going in the wrong direction. This syntax change is being considered because there's no way to change the control flow inside of a function (so that, for example, we can yield multiple items) besides using one of the existing statements or executing some other function. If there were, one could just write: from itertools import yield_from yield_all_from(gen) or whatever. With functions the lack of external regulation on control flow doesn't seem to be a big deal, but apparently for generators it is? So maybe we need to think more clearly about what kinds of control flow changes are appropriate for generators? Is yield from really going to solve all our problems? Or will we be back for a new keyword in 6 months? Inconclusively-yours, -- Carl -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Fri Feb 20 12:50:31 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 20 Feb 2009 11:50:31 +0000 (UTC) Subject: [Python-ideas] More alternative names for yield-from References: <499E144D.2020900@canterbury.ac.nz> Message-ID: Greg Ewing writes: > > I still wish I could find a better name for it. To me, "yield from" is a very good name for what you are proposing. No need to find something else. From lie.1296 at gmail.com Fri Feb 20 13:35:29 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Fri, 20 Feb 2009 12:35:29 +0000 (UTC) Subject: [Python-ideas] a portable python References: <786791.8081.qm@web51510.mail.re2.yahoo.com> Message-ID: On Thu, 19 Feb 2009 19:14:30 -0800, charles balkon wrote: > I would like to use python but I really hate the way you guys change > versions like underwear. > > If it could be possible could there be a separate version of python like > "IcePython" that would be a executable with a bz2 file containing all > the py files for the modules. 
Then when i run the script i would run > this single executable and it would dig into it's own version of py > files hidden in the bz2. Do you know that you can simply just not install a new version... and if you need multiple versions, do you know that it is not hard to do multiple version installation of python... > This would make life much easier since I could ignore the rapid changes > in the api of python until i'm ready to move code to a new version. > > I think there's a reason corporations want long 5 year versions of > linux. Some people still uses python1.5 which is all the way back into time. > This would help when you have many different boxes running many > different oses and many different python versions. > > it would be more of a write once with this executable version and it's > bz2 file tagging along. > > Then no matter what box i want to run it on i just drop 3 files and it > will run there. But I think the idea of having multiple portable python interpreter is quite interesting, it can be used for things like bringing python in a USB drive to a foreign machine and be able to run python scripts from that machine without the need to do installations (e.g. portableapps.com style) And... Google turned out a USB-drive version of python: http:// www.portablepython.com/ (Windows-only) From g.brandl at gmx.net Fri Feb 20 14:05:40 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 20 Feb 2009 14:05:40 +0100 Subject: [Python-ideas] More alternative names for yield-from In-Reply-To: References: <499E144D.2020900@canterbury.ac.nz> Message-ID: Antoine Pitrou schrieb: > Greg Ewing writes: >> >> I still wish I could find a better name for it. > > To me, "yield from" is a very good name for what you are proposing. No need to > find something else. Agreed. It even fit quite well in the schedular example. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. 
Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From arnodel at googlemail.com Fri Feb 20 17:01:50 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Fri, 20 Feb 2009 16:01:50 +0000 Subject: [Python-ideas] More alternative names for yield-from In-Reply-To: <499E144D.2020900@canterbury.ac.nz> References: <499E144D.2020900@canterbury.ac.nz> Message-ID: <9bfc700a0902200801v42df9b54l4d43e4d83bff46dd@mail.gmail.com> 2009/2/20 Greg Ewing : > I still wish I could find a better name for it. I think 'yield from' is better than all the other suggestions apart from 'yield *' which would be my choice. A while ago I created a function to almost exactly add the functionality you describe in your PEP including the return behaviour, which I implemented using raise RETURN( ... ). I've slightly modified it so it agrees with your specification and include it in this message after some very simple examples. Of course it would not help you implement the PEP but I thought it may be handy to play with some examples in order to fine tune the PEP. The function is called 'co' and applied to a generator function it turns it into a generator that implements the special behavior of yield FROM ( ... ) raise RETURN ( ... ). Here are some very simple examples and the implementation of co() follows at the end. # # List flattener. 
# def coflatten(x): if isinstance(x, list): for y in x: yield FROM( coflatten(y) ) else: yield x flatten = co(coflatten) >>> list(flatten([[[[1,2,3]]],4,5])) [1, 2, 3, 4, 5] # # Something like Guido's tree example # class Tree(object): def __init__(self, label, children=()): self.label = label self.children = children @co def __iter__(self): skip = yield self.label if skip: yield 'SKIPPED' else: yield 'ENTER' for child in self.children: yield FROM( child ) yield 'LEAVE' >>> tree = Tree('A', [Tree('B'), Tree('C')]) >>> >>> list(tree) ['A', 'ENTER', 'B', 'ENTER', 'LEAVE', 'C', 'ENTER', 'LEAVE', 'LEAVE'] >>> i = iter(tree) >>> skip = None >>> try: ... while True: ... a, skip = i.send(skip), None ... print a ... if a == 'B': ... skip = True ... except StopIteration: ... pass ... A ENTER B SKIPPED C ENTER LEAVE LEAVE # # Simple example with raise RETURN ( ... ) # @co def gen(): a = yield FROM( co_nested() ) yield a def co_nested(): yield FROM( [1, 2] ) raise RETURN( 3 ) >>> list(gen()) [1, 2, 3] # # Implementation of co(), FROM, RETURN # class FROM(object): def __init__(self, gen, val=None): self.data = gen, val class RETURN(StopIteration): def __init__(self, val=None): self.val = val def co(cogen, val=None): def gen(*args, **kwargs): gen = cogen(*args, **kwargs) val = None callstack = [] while True: try: ret = gen.next() if val is None else gen.send(val) except StopIteration, e: if callstack: gen, val = callstack.pop(), getattr(e, 'val', None) continue raise if type(ret) is FROM: callstack.append(gen) gen, val = ret.data gen = iter(gen) else: try: val = yield ret except Exception, e: if hasattr(gen, 'throw'): val = yield gen.throw(e) else: raise return gen -- Arnaud From lists at cheimes.de Fri Feb 20 18:35:00 2009 From: lists at cheimes.de (Christian Heimes) Date: Fri, 20 Feb 2009 18:35:00 +0100 Subject: [Python-ideas] TypeError: 'module' object is not callable In-Reply-To: <419932.53278.qm@web111414.mail.gq1.yahoo.com> References: 
<419932.53278.qm@web111414.mail.gq1.yahoo.com> Message-ID: Ralf W. Grosse-Kunstleve wrote: > I see there have been discussion about module __call__ about three years ago: > > http://mail.python.org/pipermail/python-list/2006-February/thread.html#366176 > > Is there an existing pronouncement on this subject? > > __call__ would help avoiding strange things like from StringIO import StringIO > or having to come up with silly names like run, driver, manager, etc. > > Ideally, __call__ could be either a function or class. > > I imagine, nothing special, except that a module object looks for __call__ instead > of producing a type error. It's technically not possible without jumping through several loops. Magic methods are looked up on the class. Modules are instances of the ModuleType class. Christian From guido at python.org Fri Feb 20 18:47:28 2009 From: guido at python.org (Guido van Rossum) Date: Fri, 20 Feb 2009 09:47:28 -0800 Subject: [Python-ideas] TypeError: 'module' object is not callable In-Reply-To: References: <419932.53278.qm@web111414.mail.gq1.yahoo.com> Message-ID: Oh, we could easily add a __call__ to the module type that looks for a __call__ function. I just don't think it's a good idea. On Fri, Feb 20, 2009 at 9:35 AM, Christian Heimes wrote: > Ralf W. Grosse-Kunstleve wrote: >> I see there have been discussion about module __call__ about three years ago: >> >> http://mail.python.org/pipermail/python-list/2006-February/thread.html#366176 >> >> Is there an existing pronouncement on this subject? >> >> __call__ would help avoiding strange things like from StringIO import StringIO >> or having to come up with silly names like run, driver, manager, etc. >> >> Ideally, __call__ could be either a function or class. >> >> I imagine, nothing special, except that a module object looks for __call__ instead >> of producing a type error. > > > It's technically not possible without jumping through several loops. > Magic methods are looked up on the class. 
Modules are instances of the > ModuleType class. > > Christian > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From steve at pearwood.info Fri Feb 20 18:52:14 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 21 Feb 2009 04:52:14 +1100 Subject: [Python-ideas] TypeError: 'module' object is not callable In-Reply-To: References: <419932.53278.qm@web111414.mail.gq1.yahoo.com> Message-ID: <499EEDCE.6010509@pearwood.info> Christian Heimes wrote: > Ralf W. Grosse-Kunstleve wrote: >> I see there have been discussion about module __call__ about three years ago: >> >> http://mail.python.org/pipermail/python-list/2006-February/thread.html#366176 >> >> Is there an existing pronouncement on this subject? >> >> __call__ would help avoiding strange things like from StringIO import StringIO >> or having to come up with silly names like run, driver, manager, etc. >> >> Ideally, __call__ could be either a function or class. >> >> I imagine, nothing special, except that a module object looks for __call__ instead >> of producing a type error. > > > It's technically not possible without jumping through several loops. > Magic methods are looked up on the class. Modules are instances of the > ModuleType class. It shouldn't be that difficult to implement. Something like: class ModuleType: # probably implemented in C? def __call__(self, *args): return self.__dict__['__call__'](*args) The question is, should it be implemented? -- Steven From venkat83 at gmail.com Fri Feb 20 19:07:22 2009 From: venkat83 at gmail.com (Venkatraman S) Date: Fri, 20 Feb 2009 23:37:22 +0530 Subject: [Python-ideas] Register based interpreter Message-ID: Hi, Has there been any discussion or effort on moving from a stack based interpreter to a more register based one? 
(my limited search in the archives did not yield me any results) regards, -V- -------------- next part -------------- An HTML attachment was scrubbed... URL: From pyideas at rebertia.com Fri Feb 20 19:15:52 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Fri, 20 Feb 2009 10:15:52 -0800 Subject: [Python-ideas] Register based interpreter In-Reply-To: References: Message-ID: <50697b2c0902201015g7572d645hbfe8b0fe60960f8d@mail.gmail.com> On Fri, Feb 20, 2009 at 10:07 AM, Venkatraman S wrote: > Hi, > > Has there been any discussion or effort on moving from a stack based > interpreter to a more register based one? > (my limited search in the archives did not yield me any results) It's possible PyPy will investigate that, and the Python implementation on Parrot (which was stagnant last I checked) would obviously be register-based, but as for CPython, I don't think it's going to happen (at least not any time soon). The bytecode is stack-oriented through-and-through; it'd be quite a major change to become register-based. Cheers, Chris -- Follow the path of the Iguana... http://rebertia.com From santagada at gmail.com Fri Feb 20 20:41:43 2009 From: santagada at gmail.com (Leonardo Santagada) Date: Fri, 20 Feb 2009 16:41:43 -0300 Subject: [Python-ideas] Register based interpreter In-Reply-To: <50697b2c0902201015g7572d645hbfe8b0fe60960f8d@mail.gmail.com> References: <50697b2c0902201015g7572d645hbfe8b0fe60960f8d@mail.gmail.com> Message-ID: On Feb 20, 2009, at 3:15 PM, Chris Rebert wrote: > On Fri, Feb 20, 2009 at 10:07 AM, Venkatraman S > wrote: >> Hi, >> >> Has there been any discussion or effort on moving from a stack based >> interpreter to a more register based one? 
>> (my limited search in the archives did not yield me any results) > > It's possible PyPy will investigate that, and the Python > implementation on Parrot (which was stagnant last I checked) would > obviously be register-based, but as for CPython, I don't think it's > going to happen (at least not any time soon). The bytecode is > stack-oriented through-and-through; it'd be quite a major change to > become register-based. Antonio Cuni made some experiments on PyPy about this, If you ask at the pypy-dev mailing list or on irc (#pypy on freenode.net) he or others can explain what happened. If I remember correctly there weren't any significant improvements in performance as dispatch and memory copies is not the problem on python, the bytecodes are very complex. []'s -- Leonardo Santagada santagada at gmail.com From brett at python.org Fri Feb 20 20:46:56 2009 From: brett at python.org (Brett Cannon) Date: Fri, 20 Feb 2009 11:46:56 -0800 Subject: [Python-ideas] Register based interpreter In-Reply-To: References: Message-ID: On Fri, Feb 20, 2009 at 10:07, Venkatraman S wrote: > Hi, > > Has there been any discussion or effort on moving from a stack based > interpreter to a more register based one? > (my limited search in the archives did not yield me any results) Very limited work done way back in the day by I believe Skip Montanaro, but it didn't get far enough for a complete implementation and it didn't pan out anyway. -Brett -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From solipsis at pitrou.net Fri Feb 20 20:48:53 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 20 Feb 2009 19:48:53 +0000 (UTC) Subject: [Python-ideas] Register based interpreter References: <50697b2c0902201015g7572d645hbfe8b0fe60960f8d@mail.gmail.com> Message-ID: Leonardo Santagada writes: > > Antonio Cuni made some experiments on PyPy about this, If you ask at > the pypy-dev mailing list or on irc (#pypy on freenode.net) he or > others can explain what happened. If I remember correctly there > weren't any significant improvements in performance as dispatch and > memory copies is not the problem on python, the bytecodes are very > complex. If bytecode dispatch were not a problem, I wonder how enabling computed gotos on the py3k branch could yield up to a 15% overall speedup on pybench :-) The biggest complication I can think of with a register-based VM is that you have to decref objects as soon as they aren't used anymore, which means you have to track the actual lifetime of registers (while it's done automatically with a stack-based design). I don't know how much it could slow down an implementation (perhaps not at all, if a clever implementation is devised...). Regards Antoine. From collinw at gmail.com Fri Feb 20 21:46:23 2009 From: collinw at gmail.com (Collin Winter) Date: Fri, 20 Feb 2009 12:46:23 -0800 Subject: [Python-ideas] Register based interpreter In-Reply-To: References: <50697b2c0902201015g7572d645hbfe8b0fe60960f8d@mail.gmail.com> Message-ID: <43aa6ff70902201246t3cd97e07k2f8a7c80d61e59c0@mail.gmail.com> On Fri, Feb 20, 2009 at 11:48 AM, Antoine Pitrou wrote: > Leonardo Santagada writes: >> >> Antonio Cuni made some experiments on PyPy about this, If you ask at >> the pypy-dev mailing list or on irc (#pypy on freenode.net) he or >> others can explain what happened. 
If I remember correctly there >> weren't any significant improvements in performance as dispatch and >> memory copies is not the problem on python, the bytecodes are very >> complex. > > If bytecode dispatch were not a problem, I wonder how enabling computed gotos on > the py3k branch could yield up to a 15% overall speedup on pybench :-) > > The biggest complication I can think of with a register-based VM is that you > have to decref objects as soon as they aren't used anymore, which means you have > to track the actual lifetime of registers (while it's done automatically with a > stack-based design). I don't know how much it could slow down an implementation > (perhaps not at all, if a clever implementation is devised...). FYI, some relevant reading on converting a stack-based JVM to use register-based bytecode: http://www.usenix.org/events/vee05/full_papers/p153-yunhe.pdf Collin Winter From tjreedy at udel.edu Fri Feb 20 22:00:28 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 20 Feb 2009 16:00:28 -0500 Subject: [Python-ideas] Register based interpreter In-Reply-To: References: Message-ID: Brett Cannon wrote: > > > On Fri, Feb 20, 2009 at 10:07, Venkatraman S > wrote: > > Hi, > > Has there been any discussion or effort on moving from a stack based > interpreter to a more register based one? > (my limited search in the archives did not yield me any results) > > > Very limited work done way back in the day by I believe Skip Montanaro, > but it didn't get far enough for a complete implementation and it didn't > pan out anyway. It was about 10 years ago, possibly before pydev, so google search in c.l.p might have something. 
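[The stack orientation the thread keeps referring to is easy to observe from CPython itself with the `dis` module. A minimal sketch — the toy function is illustrative, and exact opcode names vary between CPython releases:

```python
import dis

def add(x, y):
    return x + y

# CPython evaluates this by pushing both locals onto a value
# stack and popping them for the addition -- there are no
# named registers anywhere in the instruction stream.
ops = [ins.opname for ins in dis.get_instructions(add)]
print(ops)
```

On older interpreters this prints something like ['LOAD_FAST', 'LOAD_FAST', 'BINARY_ADD', 'RETURN_VALUE']; newer releases rename or fuse opcodes, but the push/pop shape stays the same.]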
From greg.ewing at canterbury.ac.nz Fri Feb 20 22:09:07 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 21 Feb 2009 10:09:07 +1300 Subject: [Python-ideas] Yield-From Example 2: Scheduler In-Reply-To: References: <499DCFE7.2020407@canterbury.ac.nz> <499E01B0.3090102@canterbury.ac.nz> Message-ID: <499F1BF3.2000907@canterbury.ac.nz> Adam Olsen wrote: > It's more of a gut instinct thing. It'd be hard to make ignored > returns into an error, But why would you even *want* to make ignored returns into errors? Some people like to run lint over their C code and get it to warn about unused return values that haven't been cast away, but that's because C uses return values to signal errors, and not paying attention to them can cause real problems. But Python uses exceptions for errors, so ignoring return values is usually harmless. If you igore something that you shouldn't have ignored, you will usually find out about it pretty quickly, because your program will produce incorrect results. Maybe this is where your gut instinct is coming from? -- Greg From greg.ewing at canterbury.ac.nz Fri Feb 20 22:47:08 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 21 Feb 2009 10:47:08 +1300 Subject: [Python-ideas] Register based interpreter In-Reply-To: <50697b2c0902201015g7572d645hbfe8b0fe60960f8d@mail.gmail.com> References: <50697b2c0902201015g7572d645hbfe8b0fe60960f8d@mail.gmail.com> Message-ID: <499F24DC.7030206@canterbury.ac.nz> Chris Rebert wrote: > The bytecode is > stack-oriented through-and-through; it'd be quite a major change to > become register-based. Also, I have a hard time believing it would make much difference. Python bytecodes are mostly quite large units of functionality, so the time taken manipulating the stack is not likely to be a significant component. This is different from something like Java where you have bytecodes that directly manipulate low-level things such as integers. 
-- Greg From greg.ewing at canterbury.ac.nz Fri Feb 20 22:59:50 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 21 Feb 2009 10:59:50 +1300 Subject: [Python-ideas] Register based interpreter In-Reply-To: References: <50697b2c0902201015g7572d645hbfe8b0fe60960f8d@mail.gmail.com> Message-ID: <499F27D6.6060909@canterbury.ac.nz> Antoine Pitrou wrote: > If bytecode dispatch were not a problem, I wonder how enabling computed gotos on > the py3k branch could yield up to a 15% overall speedup on pybench :-) Perhaps there is room for a hybrid approach where you take sequences of instructions such as LOAD_LOCAL x LOAD_LOCAL y BINARY_ADD STORE_GLOBAL z and turn them into 3-operand macroinstructions TRINARY_ADD LOCAL(x), LOCAL(y), GLOBAL(z) This would actually do all the same pushing and popping as before (at least conceptually), but it would reduce the number of bytecodes being executed, and therefore the number of unpredictable branches, from 4 to 1. So it would be kind of like a register-based instruction set, except that there aren't any registers, so there wouldn't be any problem with managing refcounts. (BTW, calling them "bytecodes" might become something of a misnomer -- they're going to be more like "longcodes".:-) -- Greg From greg.ewing at canterbury.ac.nz Fri Feb 20 23:15:59 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 21 Feb 2009 11:15:59 +1300 Subject: [Python-ideas] Yield-from: Details to be decided Message-ID: <499F2B9F.7060004@canterbury.ac.nz> I've got to the point in the implementation where I need to decide what to do if you send() a value to a generator that's delegating to something that doesn't have a send() method. Possibilities include: * Ignore the value and call next() instead * Raise an exception What do people think? I'm inclined to raise an exception for the time being, since we can always relax it later if we want. 
Also, doing so is more consistent with the idea of the caller talking directly to the sub-iterator. -- Greg From jnoller at gmail.com Fri Feb 20 23:15:18 2009 From: jnoller at gmail.com (Jesse Noller) Date: Fri, 20 Feb 2009 17:15:18 -0500 Subject: [Python-ideas] Yield-from: Details to be decided In-Reply-To: <499F2B9F.7060004@canterbury.ac.nz> References: <499F2B9F.7060004@canterbury.ac.nz> Message-ID: <126292BD-B8F8-4612-821F-F2CCEDC59651@gmail.com> Raising an exception seems more sane right now On Feb 20, 2009, at 5:15 PM, Greg Ewing wrote: > I've got to the point in the implementation where I > need to decide what to do if you send() a value to > a generator that's delegating to something that > doesn't have a send() method. > > Possibilities include: > > * Ignore the value and call next() instead > > * Raise an exception > > What do people think? I'm inclined to raise an > exception for the time being, since we can always > relax it later if we want. Also, doing so is more > consistent with the idea of the caller talking > directly to the sub-iterator. > > -- > Greg > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From guido at python.org Fri Feb 20 23:30:13 2009 From: guido at python.org (Guido van Rossum) Date: Fri, 20 Feb 2009 14:30:13 -0800 Subject: [Python-ideas] Yield-from: Details to be decided In-Reply-To: <126292BD-B8F8-4612-821F-F2CCEDC59651@gmail.com> References: <499F2B9F.7060004@canterbury.ac.nz> <126292BD-B8F8-4612-821F-F2CCEDC59651@gmail.com> Message-ID: In general .send() is picky when it knows for sure the value won't be used -- try .send() on a generator suspended before the first time it yields, that raises an exception too. So yes, an exception, please. (Doesn't the PEP specify this? I told you it would be useful to start coding. 
:-)) --Guido On Fri, Feb 20, 2009 at 2:15 PM, Jesse Noller wrote: > Raising an exception seems more sane right now > > On Feb 20, 2009, at 5:15 PM, Greg Ewing wrote: > >> I've got to the point in the implementation where I >> need to decide what to do if you send() a value to >> a generator that's delegating to something that >> doesn't have a send() method. >> >> Possibilities include: >> >> * Ignore the value and call next() instead >> >> * Raise an exception >> >> What do people think? I'm inclined to raise an >> exception for the time being, since we can always >> relax it later if we want. Also, doing so is more >> consistent with the idea of the caller talking >> directly to the sub-iterator. >> >> -- >> Greg >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From dangyogi at gmail.com Fri Feb 20 23:31:14 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Fri, 20 Feb 2009 17:31:14 -0500 Subject: [Python-ideas] Yield-from: Details to be decided In-Reply-To: <499F2B9F.7060004@canterbury.ac.nz> References: <499F2B9F.7060004@canterbury.ac.nz> Message-ID: <499F2F32.2020900@gmail.com> Raise an exception. -bruce frederiksen Greg Ewing wrote: > I've got to the point in the implementation where I > need to decide what to do if you send() a value to > a generator that's delegating to something that > doesn't have a send() method. > > Possibilities include: > > * Ignore the value and call next() instead > > * Raise an exception > > What do people think? I'm inclined to raise an > exception for the time being, since we can always > relax it later if we want. 
Also, doing so is more
> consistent with the idea of the caller talking
> directly to the sub-iterator.
>

From dangyogi at gmail.com  Fri Feb 20 23:41:33 2009
From: dangyogi at gmail.com (Bruce Frederiksen)
Date: Fri, 20 Feb 2009 17:41:33 -0500
Subject: [Python-ideas] use context managers for new-style "for" statement
Message-ID: <499F319D.2040707@gmail.com>

An idea occurred to me about another way to achieve "new-style" for
statements that finalize generators properly:

   1. Modify for statements to accept context managers in addition to
      iterables.  If it gets a context manager, it calls __exit__ when
      the loop terminates for any reason.  Otherwise, it does what it
      does now.
   2. Add an optional __throw__ method to context managers.  If the
      context manager has a __throw__ method, the for statement forwards
      uncaught exceptions within its body to the context manager.

Thus, you can use:

    for i in closing(gen(x)):

if you want the generator closed automatically.  This would also work
for files:

    for line in open(filename):

We might also add a new context manager to contextlib to do both the
close and the throw.  Maybe call it throwing_to?

    for i in throwing_to(gen(x)):

I would think that throwing_to would also do what closing does.

This somewhat simplifies the common "with closing"/"for" pattern as well
as adds support for the new close/throw generator methods without any
new syntax.

Comments?

-bruce frederiksen

From dangyogi at gmail.com  Fri Feb 20 23:46:18 2009
From: dangyogi at gmail.com (Bruce Frederiksen)
Date: Fri, 20 Feb 2009 17:46:18 -0500
Subject: [Python-ideas] Yield-from: Details to be decided
In-Reply-To: <499F2B9F.7060004@canterbury.ac.nz>
References: <499F2B9F.7060004@canterbury.ac.nz>
Message-ID: <499F32BA.2030209@gmail.com>

Actually, PEP 342 specifies that send(None) is like next():

"Calling send(None) is exactly equivalent to calling a generator's
next() method."
So to honor this, you would need to have send(None) call next, while send(anything_else) raises an exception... -bruce frederiksen Greg Ewing wrote: > I've got to the point in the implementation where I > need to decide what to do if you send() a value to > a generator that's delegating to something that > doesn't have a send() method. > > Possibilities include: > > * Ignore the value and call next() instead > > * Raise an exception > > What do people think? I'm inclined to raise an > exception for the time being, since we can always > relax it later if we want. Also, doing so is more > consistent with the idea of the caller talking > directly to the sub-iterator. > From guido at python.org Fri Feb 20 23:49:44 2009 From: guido at python.org (Guido van Rossum) Date: Fri, 20 Feb 2009 14:49:44 -0800 Subject: [Python-ideas] Yield-from: Details to be decided In-Reply-To: <499F32BA.2030209@gmail.com> References: <499F2B9F.7060004@canterbury.ac.nz> <499F32BA.2030209@gmail.com> Message-ID: Good point. On Fri, Feb 20, 2009 at 2:46 PM, Bruce Frederiksen wrote: > Actually, PEP 342 specifies that send(None) is like next(): > > "Calling send(None) is exactly equivalent to calling a generator's next() > method." > > So to honor this, you would need to have send(None) call next, while > send(anything_else) raises an exception... > > -bruce frederiksen > > Greg Ewing wrote: >> >> I've got to the point in the implementation where I >> need to decide what to do if you send() a value to >> a generator that's delegating to something that >> doesn't have a send() method. >> >> Possibilities include: >> >> * Ignore the value and call next() instead >> >> * Raise an exception >> >> What do people think? I'm inclined to raise an >> exception for the time being, since we can always >> relax it later if we want. Also, doing so is more >> consistent with the idea of the caller talking >> directly to the sub-iterator. 
>> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From solipsis at pitrou.net Sat Feb 21 00:01:37 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 20 Feb 2009 23:01:37 +0000 (UTC) Subject: [Python-ideas] Register based interpreter References: <50697b2c0902201015g7572d645hbfe8b0fe60960f8d@mail.gmail.com> <499F27D6.6060909@canterbury.ac.nz> Message-ID: Greg Ewing writes: > > Perhaps there is room for a hybrid approach where you > take sequences of instructions such as > > LOAD_LOCAL x > LOAD_LOCAL y > BINARY_ADD > STORE_GLOBAL z > > and turn them into 3-operand macroinstructions > > TRINARY_ADD LOCAL(x), LOCAL(y), GLOBAL(z) I was thinking of something like that, although in a simpler form of 2-operand macroinstructions: BINARY_ADD 5, 8 where 5 and 8 are indices into a "super array" which would be equal to: [local variables] + [code object constants] (fast random access to the array would imply copying the constants table each time a new frame is created for the given code object, but it would hopefully remain quite cheap) We can keep the current instruction format and address up to 256 local variables and constants by splitting the 16-bit operand in two bytes. The result would be stored on top of stack. (if we want to access the top of the stack using the macroinstructions, we could reserve 255 as a special index value for popping the top of the stack...) Regards Antoine. From python at rcn.com Sat Feb 21 00:23:45 2009 From: python at rcn.com (Raymond Hettinger) Date: Fri, 20 Feb 2009 15:23:45 -0800 Subject: [Python-ideas] use context managers for new-style "for" statement References: <499F319D.2040707@gmail.com> Message-ID: > for i in closing(gen(x)): > > if you want the generator closed automatically. 
That doesn't really improve on what we have now:

   with closing(gen(x)) as g:
       for i in g:

The proposed syntax puts too much on one line and unnecessarily
complicates another one of Python's fundamental tools.


> We might also add a new context manager to contextlib to do both the
> close and the throw.  Maybe call it throwing_to?
>
>    for i in throwing_to(gen(x)):

This looks somewhat unattractive to my eyes.


> Comments?

I think the "problem" you're solving isn't worth solving.


Raymond


## untested recipe

def closeme(iterable):
   it = iter(iterable)
   try:
       for i in it:
           yield i
   finally:
       it.close()


# doesn't this do this same thing without any interpreter magic?
for i in closeme(gen(x)):
    ...

for i in chain.from_iterable(map(closeme, [it1, it2, it3, it4])):
    ...

From arnodel at googlemail.com  Sat Feb 21 00:55:50 2009
From: arnodel at googlemail.com (Arnaud Delobelle)
Date: Fri, 20 Feb 2009 23:55:50 +0000
Subject: [Python-ideas] Sending to a generator being looped over in a for-loop
Message-ID: <9bfc700a0902201555j50530d03jf44d6fbe0d13ea8@mail.gmail.com>

When giving some examples in a previous post, I gave this
tree-traversing example:

class Tree(object):
    def __init__(self, label, children=()):
        self.label = label
        self.children = children
    def __iter__(self):
        skip = yield self.label
        if skip == 'SKIP':
            yield 'SKIPPED'
        else:
            yield 'ENTER'
            for child in self.children:
                yield from child
            yield 'LEAVE'

Here is a tree:

tree = Tree('A', [Tree('B'), Tree('C')])

Here is an example of how to traverse it, avoiding the children of
nodes called 'B'.  I can't use a for-loop as I need to send the
skip-value to the tree iterator and this makes the loop look quite
complicated.  Here's one way to do it:

i = iter(tree)
skip = None
try:
    while True:
        a = i.send(skip)
        print a
        skip = 'SKIP' if a == 'B' else None
except StopIteration:
    pass

Now imagine that the 'continue' statement within a for-loop has an
optional argument that is sent to the generator being looped over at
the next iteration step.
I would then be able to write the loop above much more simply: for a in tree: print a if a == 'B': continue 'SKIP' -- Arnaud From grosser.meister.morti at gmx.net Sat Feb 21 01:03:37 2009 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Sat, 21 Feb 2009 01:03:37 +0100 Subject: [Python-ideas] Sending to a generator being looped over in a for-loop In-Reply-To: <9bfc700a0902201555j50530d03jf44d6fbe0d13ea8@mail.gmail.com> References: <9bfc700a0902201555j50530d03jf44d6fbe0d13ea8@mail.gmail.com> Message-ID: <499F44D9.3050405@gmx.net> Ok, this is a good example (IMHO). So +1 from me. But I'm not sure about the syntax. Maybe "yield for child"? Arnaud Delobelle wrote: > When giving some examples in a previous post, I gave this > tree-traversing example: > > class Tree(object): > def __init__(self, label, children=()): > self.label = label > self.children = children > def __iter__(self): > skip = yield self.label > if skip == 'SKIP': > yield 'SKIPPED' > else: > yield 'ENTER' > for child in self.children: > yield from child > yield 'LEAVE' > > Here is a tree: > > tree = Tree('A', [Tree('B'), Tree('C')]) > > Here is an example of how to traverse it, avoiding the children of > nodes called 'B'. I can't use a for-loop as I need to send the > skip-value to the tree iterator and this makes the loop look quite > complicated. Here's one way to do it: > > i = iter(tree) > skip = None > try: > while True: > a = i.send(skip) > print a > skip = 'SKIP' if a == 'B' else None > except StopIteration: > pass > > Now imagine that the 'continue' statement within a for-loop has an > optional argument that is sent to the generator being looped over at > the next iteration step. 
I would then be able to write the loop above > much more simply: > > for a in tree: > print a > if a == 'B': > continue 'SKIP' > From jh at improva.dk Sat Feb 21 03:00:51 2009 From: jh at improva.dk (Jacob Holm) Date: Sat, 21 Feb 2009 03:00:51 +0100 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <499DDA4C.8090906@canterbury.ac.nz> References: <499DDA4C.8090906@canterbury.ac.nz> Message-ID: <499F6053.40407@improva.dk> Hi Greg A few comments. First, I think the suggested expansion is not quite right: > _i = iter(expr) > try: > _u = _i.next() > while 1: > try: > _v = yield _u > except Exception, _e: > if hasattr(_i, 'throw'): > _i.throw(_e) Shouldn't this be "_u = _i.throw(e)" instead? > else: > raise > else: > if hasattr(_i, 'send'): > _u = _i.send(_v) > else: > _u = _i.next() > except StopIteration, _e: > _a = _e.args > if len(_a) > 0: > result = _a[0] > else: > result = None > finally: > if hasattr(_i, 'close'): > _i.close() Second, I spend a looong time today thinking about how to make this fast. > > Optimisations > ------------- > > Using a specialised syntax opens up possibilities for optimisation > when there is a long chain of generators. Such chains can arise, for > instance, when recursively traversing a tree structure. The overhead > of passing ``next()`` calls and yielded values down and up the chain > can cause what ought to be an O(n) operation to become O(n\*\*2). True, but as long as it is only chains of generators I can sketch a solution that would give amortized constant time for finding the generator to resume. Unfortunately it is easy to construct a situation where what we have is actually a tree of generators, each using "yield from" its parent. 
Simple example: def foo(): yield 1 yield 2 yield 3 def bar(it): yield from it root = foo() child1 = bar(root) child2 = bar(root) print child1.next() # prints 1, yielded from root print child2.next() # prints 2, yielded from root In this example, both "child1" and "child2" are yielding from "root". So what we are dealing with is a set of dynamic trees under the operations "find_root" (to find the target when calling the generator methods), "link" (when doing yield from) and "delete_root" (each time a generator is exhausted). I strongly suspect that the lower bound for this problem is O(logn/loglogn) per operation, and I can see a number of ways to do it in O(logn) time. I am not sure any of them are fast enough to beat the simple O(n) solution in real code though. (I will see if I can find a version that has a chance). A thought just occurred to me... would it be acceptable to disallow "yield from" with a generator that is not "fresh"? That would make "child2.next()" above illegal and remove the problem with dealing with trees of generators. > > A possible strategy is to add a slot to generator objects to hold a > generator being delegated to. When a ``next()`` or ``send()`` call is > made on the generator, this slot is checked first, and if it is > nonempty, the generator that it references is resumed instead. If it > raises StopIteration, the slot is cleared and the main generator is > resumed. > > This would reduce the delegation overhead to a chain of C function > calls involving no Python code execution. A possible enhancement would > be to traverse the whole chain of generators in a loop and directly > resume the one at the end, although the handling of StopIteration is > more complicated then. I hope you can get that second version to work. All further optimization would probably have to start with something like that anyway. 
Best regards Jacob From rhamph at gmail.com Sat Feb 21 04:10:17 2009 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 20 Feb 2009 20:10:17 -0700 Subject: [Python-ideas] Yield-From Example 2: Scheduler In-Reply-To: <499F1BF3.2000907@canterbury.ac.nz> References: <499DCFE7.2020407@canterbury.ac.nz> <499E01B0.3090102@canterbury.ac.nz> <499F1BF3.2000907@canterbury.ac.nz> Message-ID: On Fri, Feb 20, 2009 at 2:09 PM, Greg Ewing wrote: > But Python uses exceptions for errors, so ignoring > return values is usually harmless. If you igore something > that you shouldn't have ignored, you will usually find > out about it pretty quickly, because your program will > produce incorrect results. > > Maybe this is where your gut instinct is coming from? No, it's more about expect return to behave like yield, as in this: def x(): yield 1 yield 2 return 3 I realize I'm losing ground though. My argument is not as strong as I thought it was. -- Adam Olsen, aka Rhamphoryncus From dangyogi at gmail.com Sat Feb 21 04:30:21 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Fri, 20 Feb 2009 22:30:21 -0500 Subject: [Python-ideas] use context managers for new-style "for" statement In-Reply-To: References: <499F319D.2040707@gmail.com> Message-ID: <499F754D.90701@gmail.com> Raymond Hettinger wrote: > ## untested recipe > > def closeme(iterable): > it = iter(iterable) > try: > for i in it: > yield i > finally: > it.close() > > > # doesn't this do this same thing without any interpreter magic? > for i in closeme(gen(x)): > ... > > for i in chain.from_iterable(map(closeme, [it1, it2, it3, it4])): > ... No, this only works on CPython because of the referencing counting collector and the fact that PEP 342 specifies that __del__ calls close on generators. This will not work reliably on Jython, IronPython or Pypy because none of these have reference counting collectors. 
The closeme generator really adds nothing here, because it is just another generator that relies on either running off the end of the generator, or its close or throw methods to be called to activate the finally clause. This is identical to the generators that it is being mapped over. *Nothing* in python is defined to call the close or throw methods on generators, except for the generator __del__ method -- and that is *only* called reliably in CPython, and not in any of the other python implementations which may never garbage collect the generator if it's allocated near the end of the program run! I had generators with try/finally, and these fail on Jython and IronPython. What I ended up doing to get my program working on Jython was to convert all of my generators to return context managers. That way I could not accidentally forget to use a with statement with them. Thus: def gen(x): return itertools.chain.from_iterable(...) for i in gen(x): ... becomes the following hack: class chain_context(object): def __init__(self, outer_it): self.outer_it = outer_iterable(outer_it) def __enter__(self): return itertools.chain.from_iterable(self.outer_it) def __exit__(self, type, value, tb): self.outer_it.close() class outer_iterable(object): def __init__(self, outer_it): self.outer_it = iter(outer_it) self.inner_it = None def __iter__(self): return self def close(self): if hasattr(self.inner_it, '__exit__'): self.inner_it.__exit__(None, None, None) elif hasattr(self.inner_it, 'close'): self.inner_it.close() if hasattr(self.outer_it, 'close'): self.outer_it.close() def next(self): ans = self.outer_it.next() if hasattr(ans, '__enter__'): self.inner_it = ans return ans.__enter__() ans = iter(ans) self.inner_it = ans return ans def gen(x): return chain_context(...) with gen(x) as it: for i in it: ... Most of my generators used chain. Those that didn't went from: def gen(x): ... to: def gen(x): def gen2(x): ... 
    return contextlib.closing(gen2(x))

This got the program working on Jython in a way that future maintenance
on the program can't screw up, but it sure doesn't feel "pythonic"...

-bruce frederiksen

From aahz at pythoncraft.com  Sat Feb 21 04:53:38 2009
From: aahz at pythoncraft.com (Aahz)
Date: Fri, 20 Feb 2009 19:53:38 -0800
Subject: [Python-ideas] use context managers for new-style "for" statement
In-Reply-To: 
References: <499F319D.2040707@gmail.com>
Message-ID: <20090221035338.GA2873@panix.com>

On Fri, Feb 20, 2009, Raymond Hettinger wrote:
>Bruce Frederiksen:
>>
>> for i in closing(gen(x)):
>>
>> if you want the generator closed automatically.
>
> That doesn't really improve on what we have now:
>
>    with closing(gen(x)) as g:
>        for i in g:
>
> The proposed syntax puts too much on one line and unnecessarily
> complicates another one of Python's fundamental tools.

In addition to Bruce's other followup, saving a level of indentation
does have some utility.  That's not enough by itself, of course.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

Weinberg's Second Law: If builders built buildings the way programmers wrote
programs, then the first woodpecker that came along would destroy civilization.

From greg.ewing at canterbury.ac.nz  Sat Feb 21 04:59:29 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 21 Feb 2009 16:59:29 +1300
Subject: [Python-ideas] Yield-from: Mysterious doctest failures
Message-ID: <499F7C21.2060109@canterbury.ac.nz>

I'm getting this from tests/test_generators.py. As far as
I can see, the SyntaxError messages are identical. Does
anyone know what doctest is complaining about here?

**********************************************************************
File "/Local/Projects/D/Python-YieldFrom/Python-2.6.1/Lib/test/test_generators.py",
line ?, in test.test_generators.__test__.coroutine
Failed example:
    def f(): x = yield = y
Expected:
    Traceback (most recent call last):
    ...
    SyntaxError: assignment to yield expression not possible (<doctest test.test_generators.__test__.coroutine[23]>, line 1)
Got:
    Traceback (most recent call last):
      File "/Local/Projects/D/Python-YieldFrom/Python-2.6.1/Lib/doctest.py", line
1231, in __run
        compileflags, 1) in test.globs
      File "<doctest test.test_generators.__test__.coroutine[22]>", line 1
    SyntaxError: assignment to yield expression not possible (<doctest test.test_generators.__test__.coroutine[22]>, line 1)
ast: yield_expr: an = 0x10d9ed8
**********************************************************************

-- 
Greg

From greg.ewing at canterbury.ac.nz  Sat Feb 21 05:09:01 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 21 Feb 2009 17:09:01 +1300
Subject: [Python-ideas] Yield-from: Details to be decided
In-Reply-To: 
References: <499F2B9F.7060004@canterbury.ac.nz>
	<126292BD-B8F8-4612-821F-F2CCEDC59651@gmail.com>
Message-ID: <499F7E5D.6020207@canterbury.ac.nz>

Guido van Rossum wrote:

> So yes, an exception, please.
> (Doesn't the PEP specify this?

It's remarked upon as an unresolved issue in the PEP
right now. I'll change it to specify an exception if
you think that's the right thing to do.

-- 
Greg

From greg.ewing at canterbury.ac.nz  Sat Feb 21 06:39:53 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 21 Feb 2009 18:39:53 +1300
Subject: [Python-ideas] Revised**5 PEP on yield-from
In-Reply-To: <499F6053.40407@improva.dk>
References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk>
Message-ID: <499F93A9.9070500@canterbury.ac.nz>

Jacob Holm wrote:

>> try:
>> _v = yield _u
>> except Exception, _e:
>> if hasattr(_i, 'throw'):
>> _i.throw(_e)
>
> Shouldn't this be "_u = _i.throw(e)" instead?

Yes, I think possibly it should... but I need to wait
until my brain stops hurting from thinking about all
this before I'll know for certain...

I'm no longer sure that the implementation I'm working
on can be exactly described in terms of any Python
expansion, anyway.
One problem is that if the generator gets None from a yield, it has no way of knowing whether it came from a next() or a send(None), so it doesn't know which method to call on the subiterator. The best that it could do is if _v is None: _u = _i.next() else: _u = _i.send(_v) but I would have to modify my implementation as it currently stands to make it work like that. This may be the right thing to do, as it would bring things into line with the advertised equivalence of next() and send(None), and you would still get an exception if you tried to send something non-None to an object without a send() method. > I strongly suspect that the lower bound for this problem is O(logn/loglogn) > per operation, and I can see a number of ways to do it in O(logn) time. Before you bust too many neurons on O() calculations, be aware that the constant factors are important here. I'm relying on traversing a delegation chain using C calls taking negligible time compared to doing it in Python, so that it can be treated as a constant-time operation for all practical purposes regardless of the length of the chain. Testing will reveal whether this turns out to be true in real life... -- Greg From greg.ewing at canterbury.ac.nz Sat Feb 21 07:09:51 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 21 Feb 2009 19:09:51 +1300 Subject: [Python-ideas] Yield-from: Details to be decided In-Reply-To: <499F32BA.2030209@gmail.com> References: <499F2B9F.7060004@canterbury.ac.nz> <499F32BA.2030209@gmail.com> Message-ID: <499F9AAF.8000802@canterbury.ac.nz> Bruce Frederiksen wrote: > Actually, PEP 342 specifies that send(None) is like next(): > > "Calling send(None) is exactly equivalent to calling a generator's > next() method." Hmmm, yes, but... that's talking about what happens when you call send() on a *generator*. 
But when yield-from is delegating to some iterator that's
not a generator, according to the current wording in the
PEP, things are supposed to behave as though you were
talking directly to the iterator. If it doesn't have a
send() method, and you tried to call it directly, you
would get an exception.

We're in unprecedented territory here, and it's hard to
tell what will turn out to be the most useful behaviour
without more experience. Raising an exception for now
seems like the safest thing to do.

-- 
Greg

From steve at pearwood.info  Sat Feb 21 08:55:01 2009
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 21 Feb 2009 18:55:01 +1100
Subject: [Python-ideas] Yield-from: Mysterious doctest failures
In-Reply-To: <499F7C21.2060109@canterbury.ac.nz>
References: <499F7C21.2060109@canterbury.ac.nz>
Message-ID: <499FB355.4000908@pearwood.info>

Greg Ewing wrote:
> I'm getting this from tests/test_generators.py. As far as
> I can see, the SyntaxError messages are identical. Does
> anyone know what doctest is complaining about here?
>
> **********************************************************************
> File
> "/Local/Projects/D/Python-YieldFrom/Python-2.6.1/Lib/test/test_generators.py",
> line ?, in test.test_generators.__test__.coroutine
> Failed example:
>     def f(): x = yield = y
> Expected:
>     Traceback (most recent call last):
>     ...
>     SyntaxError: assignment to yield expression not possible (<doctest
> test.test_generators.__test__.coroutine[23]>, line 1)
> Got:
>     Traceback (most recent call last):
>       File
> "/Local/Projects/D/Python-YieldFrom/Python-2.6.1/Lib/doctest.py", line
> 1231, in __run
>         compileflags, 1) in test.globs
>       File "<doctest test.test_generators.__test__.coroutine[22]>", line 1
>     SyntaxError: assignment to yield expression not possible (<doctest
> test.test_generators.__test__.coroutine[22]>, line 1)
> ast: yield_expr: an = 0x10d9ed8
> **********************************************************************

The indexes in the coroutines are different: 23 expected, 22 got.
Also, unless it's an artifact of either your or my mail client, there's
a spurious space *before* the SyntaxError exception. With spaces
replaced by # for visibility:

Expected:
####Traceback (most recent call last):
######...
####SyntaxError: assignment to yield expression not possible (<doctest
test.test_generators.__test__.coroutine[23]>, line 1)
Got:
####Traceback (most recent call last):
######spam spam spam spam
#####SyntaxError: assignment to yield expression not possible (<doctest
test.test_generators.__test__.coroutine[22]>, line 1)

Assuming this is genuine, I have no idea how a SyntaxError exception
ends up putting a space before the exception name!

-- 
Steven

From greg.ewing at canterbury.ac.nz  Sat Feb 21 09:03:20 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 21 Feb 2009 21:03:20 +1300
Subject: [Python-ideas] Yield-from: Nonexistent throw() and close()?
Message-ID: <499FB548.9090003@canterbury.ac.nz>

I'm happy to raise exceptions in the face of nonexistent
send() methods, but I'm not so sure about throw() and
close().

The purpose of close() is to make sure the generator is
cleaned up before discarding it. If it's delegating to
something that doesn't have a close(), then presumably
that thing doesn't need to do any cleaning up, in which
case we should just ignore it and close() the generator
as usual.

In the case of throw(), if we raise an AttributeError
when the iterator doesn't have throw(), we're still
going to end up raising an exception in the delegating
generator, just not the one that was thrown in.

One use case I can think of is for killing a generator-
based thread -- you could define a KillThread exception
for this and throw it into the thread you want to kill.
The main loop of your scheduler would then catch
KillThread exceptions and silently remove the thread
from the system.

But if the KillThread gets turned into something else,
it won't get caught and everything will fall over for
no good reason.
So at least in that case I think it
makes more sense to ignore the lack of a throw() method
and raise the thrown-in exception in the delegating
generator.

Does this reasoning make sense?

-- 
Greg

From greg.ewing at canterbury.ac.nz  Sat Feb 21 09:23:36 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 21 Feb 2009 21:23:36 +1300
Subject: [Python-ideas] Yield-from: Mysterious doctest failures
In-Reply-To: <499FB355.4000908@pearwood.info>
References: <499F7C21.2060109@canterbury.ac.nz> <499FB355.4000908@pearwood.info>
Message-ID: <499FBA08.5060102@canterbury.ac.nz>

Steven D'Aprano wrote:
> Greg Ewing wrote:
>> SyntaxError: assignment to yield expression not possible (<doctest
>> test.test_generators.__test__.coroutine[23]>, line 1)
>> SyntaxError: assignment to yield expression not possible
>> (<doctest test.test_generators.__test__.coroutine[22]>, line 1)

> The indexes in the coroutines are different: 23 expected, 22 got.

Okay, got it now. I had removed some tests concerning
return with value in generators, and that changed the
numbering of the tests.

> Also, unless it's an artifact of either your or my mail client, there's
> a spurious space *before* the SyntaxError exception.

It's not an artifact -- I noticed that too, and I'm
just as mystified. I haven't touched anything anywhere
near whatever produces those messages!

It doesn't seem to stop the tests from passing, though,
so I'm happy now.

Thanks,
Greg

From jh at improva.dk  Sat Feb 21 09:26:44 2009
From: jh at improva.dk (Jacob Holm)
Date: Sat, 21 Feb 2009 09:26:44 +0100
Subject: [Python-ideas] Yield-from: Nonexistent throw() and close()?
In-Reply-To: <499FB548.9090003@canterbury.ac.nz>
References: <499FB548.9090003@canterbury.ac.nz>
Message-ID: <499FBAC4.9040202@improva.dk>

Greg Ewing wrote:
> I'm happy to raise exceptions in the face of nonexistent
> send() methods, but I'm not so sure about throw() and
> close().

I'm not.
I'd hate for this: def foo(): for i in xrange(5): yield i to behave different from this: def foo(): yield from xrange(5) I think a missing send should be converted to a next, just as the PEP proposed. > > The purpose of close() is to make sure the generator is > cleaned up before discarding it. If it's delegating to > something that doesn't have a close(), then presumably > that thing doesn't need to do any cleaning up, in which > case we should just ignore it and close() the generator > as usual. > > In the case of throw(), if we raise an AttributeError > when the iterator doesn't have throw(), we're still > going to end up raising an exception in the delegating > generator, just not the one that was thrown in. > > One use case I can think of is for killing a generator- > based thread -- you could define a KillThread exception > for this and throw it into the thread you want to kill. > The main loop of your scheduler would then catch > KillThread exceptions and silently remove the thread > from the system. > > But if the KillThread gets turned into something else, > it won't get caught and everything will fall over for > no good reason. So at least in that case I think it > makes more sense to ignore the lack of a throw() method > and raise the thrown-in exception in the delegating > generator. > > Does this reasoning make sense? > Yup. Works for me. Regards Jacob From jh at improva.dk Sat Feb 21 10:06:43 2009 From: jh at improva.dk (Jacob Holm) Date: Sat, 21 Feb 2009 10:06:43 +0100 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <499F93A9.9070500@canterbury.ac.nz> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> Message-ID: <499FC423.6080500@improva.dk> Greg Ewing wrote: > I'm no longer sure that the implementation I'm working > on can be exactly described in terms of any Python > expansion, anyway. 
:( > > One problem is that if the generator gets None from > a yield, it has no way of knowing whether it came from > a next() or a send(None), This is only a problem with the recursive implementation though, isn't it? > so it doesn't know which > method to call on the subiterator. The best that it > could do is > > if _v is None: > _u = _i.next() > else: > _u = _i.send(_v) > > but I would have to modify my implementation as it > currently stands to make it work like that. > > This may be the right thing to do, as it would bring > things into line with the advertised equivalence of > next() and send(None), and you would still get an > exception if you tried to send something non-None to > an object without a send() method. I still (see example in another thread) think that a missing 'send' should be treated as a 'next'. To me, the "communicating directly with the caller" bit is less important than the argument that the caller is still talking to a generator that *has* a send but may ignore the values sent. >> I strongly suspect that the lower bound for this problem is >> O(logn/loglogn) >> per operation, and I can see a number of ways to do it in O(logn) time. > > Before you bust too many neurons on O() calculations, > be aware that the constant factors are important here. > I'm relying on traversing a delegation chain using > C calls taking negligible time compared to doing it > in Python, so that it can be treated as a constant-time > operation for all practical purposes regardless of the > length of the chain. Testing will reveal whether this > turns out to be true in real life... > Too late, but don't worry. I'm doing this for fun :) I think I can actually see a way to do it that should be fast enough, but I'd like to work out the details first. If it works it will be O(1) with low constants as long as you don't build trees, and similar to traversing a delegation chain in the worst case. 
All this depends on getting it working using delegation chains first though, as most of the StopIteration and Exception handling would be the same. Best regards Jacob From greg.ewing at canterbury.ac.nz Sat Feb 21 10:23:14 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 21 Feb 2009 22:23:14 +1300 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <499FC423.6080500@improva.dk> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> Message-ID: <499FC802.9080409@canterbury.ac.nz> Jacob Holm wrote: >> One problem is that if the generator gets None from >> a yield, it has no way of knowing whether it came from >> a next() or a send(None), > This is only a problem with the recursive implementation though, isn't it? It's a problem when trying to specify the semantics in terms of an expansion into currently valid Python. It's not necessarily a problem in the actual implementation, which isn't constrained that way. > I still (see example in another thread) think that a missing 'send' > should be > treated as a 'next'. To me, the "communicating directly with the caller" > bit > is less important than the argument that the caller is still talking to a > generator that *has* a send but may ignore the values sent. Yes, I'm starting to think that way, too. -- Greg From dangyogi at gmail.com Sat Feb 21 14:37:18 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Sat, 21 Feb 2009 08:37:18 -0500 Subject: [Python-ideas] Yield-from: Nonexistent throw() and close()? In-Reply-To: <499FB548.9090003@canterbury.ac.nz> References: <499FB548.9090003@canterbury.ac.nz> Message-ID: <49A0038E.7050401@gmail.com> Greg Ewing wrote: > Does this reasoning make sense? Yes. As far as I know, there are no cases where a generator would have a close, but not a throw method. But if I am wrong, then a throw should call close on the subgenerator in this case. 
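Bruce's observation is easy to check empirically: the three PEP 342 methods come as a package on every generator object, while a plain iterator has none of them. A quick sanity check (shown here in Python 3 syntax, though the result is the same on 2.5+):

```python
# Every generator object carries send(), throw() and close() together;
# a plain iterator (here a list iterator) has none of the three.

def gen():
    yield 1

g = gen()
print([hasattr(g, name) for name in ("send", "throw", "close")])
# [True, True, True]

it = iter([1, 2, 3])
print([hasattr(it, name) for name in ("send", "throw", "close")])
# [False, False, False]
```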
From dangyogi at gmail.com Sat Feb 21 14:47:36 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Sat, 21 Feb 2009 08:47:36 -0500 Subject: [Python-ideas] Yield-from: Nonexistent throw() and close()? In-Reply-To: <499FBAC4.9040202@improva.dk> References: <499FB548.9090003@canterbury.ac.nz> <499FBAC4.9040202@improva.dk> Message-ID: <49A005F8.2010001@gmail.com> Jacob Holm wrote: > Greg Ewing wrote: >> I'm happy to raise exceptions in the face of nonexistent >> send() methods, but I'm not so sure about throw() and >> close(). > I'm not. I'd hate for this: > > def foo(): > for i in xrange(5): > yield i > > to behave different from this: > > def foo(): > yield from xrange(5) These two forms already behave differently when generators are used (rather than xrange), why should they not also behave differently when non-generators are used? "In the face of ambiguity, refuse the temptation to guess." I think that an exception makes more sense, otherwise, we are guessing as to what the programmer intended by using send in your example. From dangyogi at gmail.com Sat Feb 21 14:50:35 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Sat, 21 Feb 2009 08:50:35 -0500 Subject: [Python-ideas] Yield-from: Details to be decided In-Reply-To: <499F9AAF.8000802@canterbury.ac.nz> References: <499F2B9F.7060004@canterbury.ac.nz> <499F32BA.2030209@gmail.com> <499F9AAF.8000802@canterbury.ac.nz> Message-ID: <49A006AB.7080309@gmail.com> Greg Ewing wrote: > Bruce Frederiksen wrote: >> Actually, PEP 342 specifies that send(None) is like next(): >> >> "Calling send(None) is exactly equivalent to calling a generator's >> next() method." > > Hmmm, yes, but... that's talking about what happens > when you call send() on a *generator*. OK, that argument makes sense. And I can't think of any counter-examples off of the top of my head where send(None) needs to call next... 
-bruce frederiksen From dangyogi at gmail.com Sat Feb 21 15:02:49 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Sat, 21 Feb 2009 09:02:49 -0500 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <499FC802.9080409@canterbury.ac.nz> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <499FC802.9080409@canterbury.ac.nz> Message-ID: <49A00989.2070609@gmail.com> Greg Ewing wrote: > It's a problem when trying to specify the semantics > in terms of an expansion into currently valid Python. > It's not necessarily a problem in the actual > implementation, which isn't constrained that way. Let's not let the limitations of what can be expressed directly in Python influence the implementation. This is just a documentation issue. You can use comments or include some extra explanation to clarify the PEP. Or you could define it in the PEP as a class (in Python), rather than a generator. > >> I still (see example in another thread) think that a missing 'send' >> should be >> treated as a 'next'. To me, the "communicating directly with the >> caller" bit >> is less important than the argument that the caller is still talking >> to a >> generator that *has* a send but may ignore the values sent. > > Yes, I'm starting to think that way, too. Has anybody shown a use-case for this? - bruce frederiksen From jh at improva.dk Sat Feb 21 15:08:54 2009 From: jh at improva.dk (Jacob Holm) Date: Sat, 21 Feb 2009 15:08:54 +0100 Subject: [Python-ideas] Yield-from: Nonexistent throw() and close()?
In-Reply-To: <49A005F8.2010001@gmail.com> References: <499FB548.9090003@canterbury.ac.nz> <499FBAC4.9040202@improva.dk> <49A005F8.2010001@gmail.com> Message-ID: <49A00AF6.3030403@improva.dk> Hi Bruce Bruce Frederiksen wrote: > Jacob Holm wrote: >> I'd hate for this: >> >> def foo(): >> for i in xrange(5): >> yield i >> >> to behave different from this: >> >> def foo(): >> yield from xrange(5) > These two forms already behave differently when generators are used > (rather than xrange), why should they not also behave differently when > non-generators are used? Not sure in what way you think they behave differently? foo is a generator in both cases, and as such has a send method. I am thinking of #2 as a simple rewrite/refactoring using the nifty new feature. Why should foo().send('bar') ignore the value in #1 and raise an exception in #2? > > "In the face of ambiguity, refuse the temptation to guess." > > I think that an exception makes more sense, otherwise, we are guessing > as to what the programmer intended by using send in your example. I disagree. The principle of least surprise tells me that #1 and #2 should be the same. Regards Jacob From dangyogi at gmail.com Sat Feb 21 15:36:05 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Sat, 21 Feb 2009 09:36:05 -0500 Subject: [Python-ideas] use context managers for new-style "for" statement In-Reply-To: <20090221035338.GA2873@panix.com> References: <499F319D.2040707@gmail.com> <20090221035338.GA2873@panix.com> Message-ID: <49A01155.2090807@gmail.com> Aahz wrote: > On Fri, Feb 20, 2009, Raymond Hettinger wrote: > >> Bruce Frederikson: >> >>> for i in closing(gen(x)): >>> >>> if you want the generator closed automatically. >>> >> That doesn't really improve on what we have now: >> >> with closing(gen(x)) as g: >> for i in g: >> >> The proposed syntax puts to much on one-line and unnecessarily >> complicates another one of Python's fundamental tools. 
>> > > In addition to Bruce's other followup, saving a level of indention does > have some utility. That's not enough by itself, of course. > Yes, the *close* capability of generators can be exercised with an external "with closing" statement (except for itertools.chain, which has an inner generator which is inaccessible to the caller, but this proposal doesn't fix that problem either). But the improvement comes when you want to exercise the *throw* capability of generators within a for statement. Adding an optional __throw__ to context managers and honoring it in with statements doesn't let the generator convert the exception into a new yielded value to be tried ("oh, you didn't like the last value I yielded, try this one instead") which is one use of generator.throw. And this solves both issues (close and throw) with the same mechanism, so it's conceptually simpler for the user than two separate solutions. It also naturally extends the context manager capability to be able to encapsulate try/except patterns into re-usable context managers, in addition to encapsulating try/finally patterns. Here, I'm referring to the standard use of context managers in with statements, irrespective of the for statement. Oh, and, yes, it does allow the programmer to elide the use of the with statement in combination with the for statement. In my practice, 90% of the with statements I use are this with/for pattern so I imagine that this would also be appreciated by the Python community. But this is not the sole benefit, nor even the most important benefit. So while I might agree with "That's not enough by itself", I wish to point out that this last benefit is not "by itself". -bruce frederiksen From ziade.tarek at gmail.com Sat Feb 21 15:40:37 2009 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sat, 21 Feb 2009 15:40:37 +0100 Subject: [Python-ideas] ABCs : adding an implements alias ? 
Message-ID: <94bdd2610902210640ue5cb31dn4d51d852b8a1dd58@mail.gmail.com> Hello I am playing with ABCs and I don't find the usage of the built-in issubclass() natural to test whether a class implements an ABC. I think it's just the name of the built-in, or maybe the fact that it is hidden in the mro. (or maybe because I have used zope.interfaces a lot). What about introducing two aliases ? - implements, that points to issubclass - implemented_by, that points to __subclasshook__ So we could write things like : >>> from collections import Sized >>> implements(list, Sized) True >>> Sized.implemented_by(list) True Instead of: >>> from collections import Sized >>> issubclass(list, Sized) True >>> Sized.__subclasshook__(list) # mmm, maybe no one would write this.. True Regards Tarek -- Tarek Ziadé | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/ From dangyogi at gmail.com Sat Feb 21 16:11:43 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Sat, 21 Feb 2009 10:11:43 -0500 Subject: [Python-ideas] Yield-from: Nonexistent throw() and close()? In-Reply-To: <49A00AF6.3030403@improva.dk> References: <499FB548.9090003@canterbury.ac.nz> <499FBAC4.9040202@improva.dk> <49A005F8.2010001@gmail.com> <49A00AF6.3030403@improva.dk> Message-ID: <49A019AF.6090905@gmail.com> Jacob Holm wrote: > Hi Bruce > > Bruce Frederiksen wrote: >> Jacob Holm wrote: >>> I'd hate for this: >>> >>> def foo(): >>> for i in xrange(5): >>> yield i >>> >>> to behave different from this: >>> >>> def foo(): >>> yield from xrange(5) >> These two forms already behave differently when generators are used >> (rather than xrange), why should they not also behave differently >> when non-generators are used? > Not sure in what way you think they behave differently? foo is a > generator in both cases, and as such has a send method. True, but I was referring to the subgenerator/subiterable (xrange in your example); not to foo itself.
If xrange were a generator, #2 behaves intentionally differently than #1. > I am thinking of #2 as a simple rewrite/refactoring using the nifty > new feature. Why should foo().send('bar') ignore the value in #1 and > raise an exception in #2? Thinking of #2 as a simple rewrite of #1 is not how this PEP defines yield from. Unfortunately, AFAICT the *reason* for the difference is that the for statement was defined well before PEP 342 was adopted. When PEP 342 was adopted, changing the for statement to honor the new generator methods would break legacy code. But the "yield from" is being defined after PEP 342, so it may safely include these new capabilities. Your example does not show why you would want to use send with foo. As it is currently defined, there is no reason to do so. If the yield in #1 is meant to be a yield *expression* whose values are acted on by foo somehow, then "yield from" won't work here, no matter how it's defined, because the yielded values are inaccessible to foo. The same is true if xrange were a generator. -bruce frederiksen From guido at python.org Sat Feb 21 17:31:21 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 21 Feb 2009 08:31:21 -0800 Subject: [Python-ideas] Yield-from: Details to be decided In-Reply-To: <499F9AAF.8000802@canterbury.ac.nz> References: <499F2B9F.7060004@canterbury.ac.nz> <499F32BA.2030209@gmail.com> <499F9AAF.8000802@canterbury.ac.nz> Message-ID: On Fri, Feb 20, 2009 at 10:09 PM, Greg Ewing wrote: > Bruce Frederiksen wrote: >> >> Actually, PEP 342 specifies that send(None) is like next(): >> >> "Calling send(None) is exactly equivalent to calling a generator's next() >> method." > > Hmmm, yes, but... that's talking about what happens > when you call send() on a *generator*. > > But when yield-from is delegating to some iterator > that's not a generator, according the current wording > in the PEP, things are supposed to behave as though > you were talking directly to the iterator.
If it > doesn't have a send() method, and you tried to call > it directly, you would get an exception. > > We're in unprecedented territory here, and it's hard > to tell what will turn out to be the most useful > behaviour without more experience. Raising an > exception for now seems like the safest thing to > do. Agreed after all. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sat Feb 21 17:38:46 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 21 Feb 2009 08:38:46 -0800 Subject: [Python-ideas] ABCs : adding an implements alias ? In-Reply-To: <94bdd2610902210640ue5cb31dn4d51d852b8a1dd58@mail.gmail.com> References: <94bdd2610902210640ue5cb31dn4d51d852b8a1dd58@mail.gmail.com> Message-ID: On Sat, Feb 21, 2009 at 6:40 AM, Tarek Ziad? wrote: > I am playing with ABCs and I don't find the usage of the built-in > issubclass() natural to test wheter a class implements an ABC. > > I think it's just the name of the built-in, or maybe the fact that it > is hidden in the mro. (or maybe because I have used zope.interfaces a > lot). Probably the latter. Please try getting used to it. Zope didn't choose its API because it was the best but because in the days issubclass() and isinstance() weren't overloadable. We have an API for this already, and it won't go away. No need to create another one. Remember TOOWTDI. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From arnodel at googlemail.com Sat Feb 21 20:12:14 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Sat, 21 Feb 2009 19:12:14 +0000 Subject: [Python-ideas] Yield-from: Details to be decided In-Reply-To: <499F9AAF.8000802@canterbury.ac.nz> References: <499F2B9F.7060004@canterbury.ac.nz> <499F32BA.2030209@gmail.com> <499F9AAF.8000802@canterbury.ac.nz> Message-ID: <9bfc700a0902211112s69628ab0tae38a7994dfbc924@mail.gmail.com> 2009/2/21 Greg Ewing : > Bruce Frederiksen wrote: >> >> Actually, PEP 342 specifies that send(None) is like next(): >> >> "Calling send(None) is exactly equivalent to calling a generator's next() >> method." > > Hmmm, yes, but... that's talking about what happens > when you call send() on a *generator*. A generator containing a yield-from expression is still a generator though. > But when yield-from is delegating to some iterator > that's not a generator, according the current wording > in the PEP, things are supposed to behave as though > you were talking directly to the iterator. If it > doesn't have a send() method, and you tried to call > it directly, you would get an exception. To my eyes, this means that there is an inconsistency between this PEP and PEP 342. > We're in unprecedented territory here, and it's hard > to tell what will turn out to be the most useful > behaviour without more experience. Raising an > exception for now seems like the safest thing to > do. It will mean that you will need to be aware of the implementation of a generator in order to know whether it is OK to use send(None) as an alternative spelling of next(). In some cases it is handy to use send(None) rather than next, and PEP 342 guarantees that it will work on generators. This will break that guarantee. A way to go round this would be to make the objects returned by generator functions containing yield-from expressions something else than generators - maybe 'delegators' ? 
-- Arnaud From greg.ewing at canterbury.ac.nz Sat Feb 21 22:52:49 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 22 Feb 2009 10:52:49 +1300 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49A00989.2070609@gmail.com> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <499FC802.9080409@canterbury.ac.nz> <49A00989.2070609@gmail.com> Message-ID: <49A077B1.7040400@canterbury.ac.nz> Bruce Frederiksen wrote: > Let's not let the limitations of what can be expressed directly in > Python influence the implementation. No, but I have an instinct that if something *can* be expressed using basic Python, there's a good chance it's conceptually sound. Anyway, for other reasons I've settled on an interpretation that can be expressed in Python, as it happens. I'll be releasing another draft of the PEP with suitable modifications. -- Greg From greg.ewing at canterbury.ac.nz Sat Feb 21 23:11:04 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 22 Feb 2009 11:11:04 +1300 Subject: [Python-ideas] Yield-from: Nonexistent throw() and close()? In-Reply-To: <49A00AF6.3030403@improva.dk> References: <499FB548.9090003@canterbury.ac.nz> <499FBAC4.9040202@improva.dk> <49A005F8.2010001@gmail.com> <49A00AF6.3030403@improva.dk> Message-ID: <49A07BF8.8070504@canterbury.ac.nz> Jacob Holm wrote: > Hi Bruce > Bruce Frederiksen wrote: >> Jacob Holm wrote: >> >>> I'd hate for this: >>> >>> def foo(): >>> for i in xrange(5): >>> yield i >>> >>> to behave different from this: >>> >>> def foo(): >>> yield from xrange(5) >> >> These two forms already behave differently when generators are used Since yield-from doesn't exist yet, there's no 'already behave'. The issue is whether they *should* behave differently or not. 
The issue can perhaps be brought into better focus by considering this: def dontsendtome(): y = yield 42 if y is not None: raise AttributeError("dontsendtome has no method 'send'") Now by the "inlining" interpretation (which I think is very good and want to keep if at all possible), def icantbesenttoeither(): yield from dontsendtome() should also raise an AttributeError if you send() it something that isn't None. Now, from the outside, there's little observable difference between a generator such as dontsendtome() and some other iterator that doesn't have a send() method -- both raise an AttributeError if you try to send() something other than None. So it's reasonable to expect them to behave similarly when used in a 'yield from'. (There is one observable difference, namely that send(None) to an iterator with no send() raises an AttributeError, whereas it's impossible to write a generator which can tell the difference. But I see this as an artifact due to the lack of a send(iter, value) protocol function, which really ought to exist and translate send(None) into next() for all iterators). -- Greg From greg.ewing at canterbury.ac.nz Sat Feb 21 23:16:32 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 22 Feb 2009 11:16:32 +1300 Subject: [Python-ideas] ABCs : adding an implements alias ? In-Reply-To: <94bdd2610902210640ue5cb31dn4d51d852b8a1dd58@mail.gmail.com> References: <94bdd2610902210640ue5cb31dn4d51d852b8a1dd58@mail.gmail.com> Message-ID: <49A07D40.6040404@canterbury.ac.nz> Tarek Ziad? wrote: > I am playing with ABCs and I don't find the usage of the built-in > issubclass() natural to test wheter a class implements an ABC. One of the main points of ABCs, as I understand things, is so that you can declare a class as being a subclass of something "after the fact", and thus fool existing code that uses isinstance and issubclass tests into treating your objects in a duck-typed way. 
Requiring a different function to test for ABC inheritance would defeat that. -- Greg From greg.ewing at canterbury.ac.nz Sat Feb 21 23:28:16 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 22 Feb 2009 11:28:16 +1300 Subject: [Python-ideas] Yield-from: Details to be decided In-Reply-To: <9bfc700a0902211112s69628ab0tae38a7994dfbc924@mail.gmail.com> References: <499F2B9F.7060004@canterbury.ac.nz> <499F32BA.2030209@gmail.com> <499F9AAF.8000802@canterbury.ac.nz> <9bfc700a0902211112s69628ab0tae38a7994dfbc924@mail.gmail.com> Message-ID: <49A08000.5070701@canterbury.ac.nz> Arnaud Delobelle wrote: > It will mean that you will need to be aware of the implementation of a > generator in order to know whether it is OK to use send(None) as an > alternative spelling of next(). Yes, and I've now decided that send(None) will be converted to next() upon delegation in all cases. I'm no longer going to describe the semantics in terms of "direct communication", since that's not exactly true any more (and probably never really was). -- Greg From greg.ewing at canterbury.ac.nz Sat Feb 21 23:43:28 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 22 Feb 2009 11:43:28 +1300 Subject: [Python-ideas] Revised**6 PEP on yield-from Message-ID: <49A08390.1020405@canterbury.ac.nz> I've re-worded things yet again to nail the semantics down as per recent discussions. Briefly: * send(None) converted to next() upon delegation * send(not_None) raises exception if no send() method * throw() and close() ignore missing methods No longer describing semantics in terms of "direct communication". Fixed a bug in the expansion (return value of throw() was getting lost). 
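The formal expansion in the PEP below can be exercised before the syntax exists by hand-inlining it into an ordinary generator. Here is one such sketch, in Python 3 syntax (next() instead of .next(), ``except ... as``); the ``subgen``/``delegator`` names are invented for illustration:

```python
def subgen():
    x = yield 1
    yield x * 2
    return "done"          # under the proposal: raises StopIteration("done")

def delegator():
    # Hand-expanded equivalent of:  result = yield from subgen()
    _i = iter(subgen())
    try:
        _u = next(_i)
        while 1:
            try:
                _v = yield _u
            except Exception as _e:
                if hasattr(_i, "throw"):
                    _u = _i.throw(_e)
                else:
                    raise
            else:
                if _v is None:
                    _u = next(_i)
                else:
                    _u = _i.send(_v)
    except StopIteration as _e:
        result = _e.value   # the subgenerator's return value
    finally:
        if hasattr(_i, "close"):
            _i.close()
    yield "result was %r" % result

g = delegator()
print(next(g))     # 1
print(g.send(5))   # 10  (5 sent straight through to subgen)
print(next(g))     # result was 'done'
```

Note how the sent value 5 reaches ``subgen`` untouched, and how the ``return`` value surfaces as the ``value`` attribute of the ``StopIteration`` caught in the delegating frame.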
PEP: XXX Title: Syntax for Delegating to a Subgenerator Version: $Revision$ Last-Modified: $Date$ Author: Gregory Ewing <greg.ewing at canterbury.ac.nz> Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 13-Feb-2009 Python-Version: 2.7 Post-History: Abstract ======== A syntax is proposed for a generator to delegate part of its operations to another generator. This allows a section of code containing 'yield' to be factored out and placed in another generator. Additionally, the subgenerator is allowed to return with a value, and the value is made available to the delegating generator. The new syntax also opens up some opportunities for optimisation when one generator re-yields values produced by another. Proposal ======== The following new expression syntax will be allowed in the body of a generator: :: yield from <expr> where <expr> is an expression evaluating to an iterable, from which an iterator is extracted. The iterator is run to exhaustion, during which time it yields and receives values directly to or from the caller of the generator containing the ``yield from`` expression (the "delegating generator"). When the iterator is another generator, the effect is the same as if the body of the subgenerator were inlined at the point of the ``yield from`` expression. Furthermore, the subgenerator is allowed to execute a ``return`` statement with a value, and that value becomes the value of the ``yield from`` expression. In general, the semantics can be understood in terms of the iterator protocol as follows: * Any values that the iterator yields are passed directly to the caller. * Any values sent to the delegating generator using ``send()`` are passed directly to the iterator. If the sent value is None, the iterator's ``next()`` method is called. If the sent value is not None, the iterator's ``send()`` method is called if it has one, otherwise an exception is raised in the delegating generator. * Calls to the ``throw()`` method of the delegating generator are forwarded to the iterator.
If the iterator does not have a ``throw()`` method, the thrown-in exception is raised in the delegating generator. * If the delegating generator's ``close()`` method is called, the ``close()`` method of the iterator is called first if it has one, then the delegating generator is finalised. * The value of the ``yield from`` expression is the first argument to the ``StopIteration`` exception raised by the iterator when it terminates. * ``return expr`` in a generator causes ``StopIteration(expr)`` to be raised. For convenience, the ``StopIteration`` exception will be given a ``value`` attribute that holds its first argument, or None if there are no arguments. Formal Semantics ---------------- 1. The statement :: result = yield from expr is semantically equivalent to :: _i = iter(expr) try: _u = _i.next() while 1: try: _v = yield _u except Exception, _e: if hasattr(_i, 'throw'): _u = _i.throw(_e) else: raise else: if _v is None: _u = _i.next() else: _u = _i.send(_v) except StopIteration, _e: result = _e.value finally: if hasattr(_i, 'close'): _i.close() 2. In a generator, the statement :: return value is semantically equivalent to raise StopIteration(value) except that, as currently, the exception cannot be caught by 'except' clauses within the returning generator. 3. The StopIteration exception behaves as though defined thusly: :: class StopIteration(Exception): def __init__(self, *args): if len(args) > 0: self.value = args[0] else: self.value = None Exception.__init__(self, *args) Rationale ========= A Python generator is a form of coroutine, but has the limitation that it can only yield to its immediate caller. This means that a piece of code containing a ``yield`` cannot be factored out and put into a separate function in the same way as other code. Performing such a factoring causes the called function to itself become a generator, and it is necessary to explicitly iterate over this second generator and re-yield any values that it produces.
If yielding of values is the only concern, this is not very arduous and can be performed with a loop such as :: for v in g: yield v However, if the subgenerator is to interact properly with the caller in the case of calls to ``send()``, ``throw()`` and ``close()``, things become considerably more complicated. As the formal expansion presented above illustrates, the necessary code is very longwinded, and it is tricky to handle all the corner cases correctly. In this situation, the advantages of a specialised syntax should be clear. Generators as Threads --------------------- A motivating use case for generators being able to return values concerns the use of generators to implement lightweight threads. When using generators in that way, it is reasonable to want to spread the computation performed by the lightweight thread over many functions. One would like to be able to call a subgenerator as though it were an ordinary function, passing it parameters and receiving a returned value. Using the proposed syntax, a statement such as :: y = f(x) where f is an ordinary function, can be transformed into a delegation call :: y = yield from g(x) where g is a generator. One can reason about the behaviour of the resulting code by thinking of g as an ordinary function that can be suspended using a ``yield`` statement. When using generators as threads in this way, typically one is not interested in the values being passed in or out of the yields. However, there are use cases for this as well, where the thread is seen as a producer or consumer of items. The ``yield from`` expression allows the logic of the thread to be spread over as many functions as desired, with the production or consumption of items occurring in any subfunction, and the items are automatically routed to or from their ultimate source or destination.
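A concrete sketch of the consumer case just described, runnable on Python 3.3+ where the proposal was eventually adopted (the ``average``/``report`` names are illustrative only): the items sent to the outer generator are routed straight into the innermost one, and the subgenerator's ``return`` value comes back like an ordinary function result.

```python
def average(count):
    # A lightweight "thread" that consumes `count` items sent in from
    # the outermost caller and returns their average.
    total = 0.0
    for _ in range(count):
        total += yield
    return total / count       # becomes the value of the yield-from

def report():
    # Calls the subgenerator as though it were an ordinary function.
    avg = yield from average(4)
    yield "average: %s" % avg

r = report()
next(r)                        # prime; now suspended inside average()
for v in (10, 20, 30):
    r.send(v)
print(r.send(40))              # average: 25.0
```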
Concerning ``throw()`` and ``close()``, it is reasonable to expect that if an exception is thrown into the thread from outside, it should first be raised in the innermost generator where the thread is suspended, and propagate outwards from there; and that if the thread is terminated from outside by calling ``close()``, the chain of active generators should be finalised from the innermost outwards. Syntax ------ The particular syntax proposed has been chosen as suggestive of its meaning, while not introducing any new keywords and clearly standing out as being different from a plain ``yield``. Optimisations ------------- Using a specialised syntax opens up possibilities for optimisation when there is a long chain of generators. Such chains can arise, for instance, when recursively traversing a tree structure. The overhead of passing ``next()`` calls and yielded values down and up the chain can cause what ought to be an O(n) operation to become O(n\*\*2). A possible strategy is to add a slot to generator objects to hold a generator being delegated to. When a ``next()`` or ``send()`` call is made on the generator, this slot is checked first, and if it is nonempty, the generator that it references is resumed instead. If it raises StopIteration, the slot is cleared and the main generator is resumed. This would reduce the delegation overhead to a chain of C function calls involving no Python code execution. A possible enhancement would be to traverse the whole chain of generators in a loop and directly resume the one at the end, although the handling of StopIteration is more complicated then. Use of StopIteration to return values ------------------------------------- There are a variety of ways that the return value from the generator could be passed back. Some alternatives include storing it as an attribute of the generator-iterator object, or returning it as the value of the ``close()`` call to the subgenerator. 
However, the proposed mechanism is attractive for a couple of
reasons:

* Using the StopIteration exception makes it easy for other kinds
  of iterators to participate in the protocol without having to
  grow extra attributes or a close() method.

* It simplifies the implementation, because the point at which the
  return value from the subgenerator becomes available is the same
  point at which StopIteration is raised. Delaying until any later
  time would require storing the return value somewhere.


Criticisms
==========

Under this proposal, the value of a ``yield from`` expression would
be derived in a very different way from that of an ordinary ``yield``
expression. This suggests that some other syntax not containing the
word ``yield`` might be more appropriate, but no acceptable
alternative has so far been proposed.

It has been suggested that some mechanism other than ``return`` in
the subgenerator should be used to establish the value returned by
the ``yield from`` expression. However, this would interfere with
the goal of being able to think of the subgenerator as a suspendable
function, since it would not be able to return values in the same
way as other functions.

The use of an argument to StopIteration to pass the return value has
been criticised as an "abuse of exceptions", without any concrete
justification of this claim. In any case, this is only one suggested
implementation; another mechanism could be used without losing any
essential features of the proposal.

It has been suggested that a different exception, such as
GeneratorReturn, should be used instead of StopIteration to return a
value. However, no convincing practical reason for this has been put
forward, and the addition of a ``value`` attribute to StopIteration
mitigates any difficulties in extracting a return value from a
StopIteration exception that may or may not have one.
Also, using a different exception would mean that, unlike ordinary
functions, 'return' without a value in a generator would not be
equivalent to 'return None'.


Alternative Proposals
=====================

Proposals along similar lines have been made before, some using the
syntax ``yield *`` instead of ``yield from``. While ``yield *`` is
more concise, it could be argued that it looks too similar to an
ordinary ``yield`` and the difference might be overlooked when
reading code.

To the author's knowledge, previous proposals have focused only on
yielding values, and thereby suffered from the criticism that the
two-line for-loop they replace is not sufficiently tiresome to write
to justify a new syntax. By also dealing with calls to ``send()``,
``throw()`` and ``close()``, this proposal provides considerably more
benefit.


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:


From aahz at pythoncraft.com  Sun Feb 22 01:03:56 2009
From: aahz at pythoncraft.com (Aahz)
Date: Sat, 21 Feb 2009 16:03:56 -0800
Subject: [Python-ideas] String formatting and namedtuple
In-Reply-To: <49950173.9010703@acm.org>
References: <20090212141040.0c89e0fc@o>
	<70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com>
	<4994BFFE.8000903@pearwood.info> <49950173.9010703@acm.org>
Message-ID: <20090222000355.GA7468@panix.com>

On Thu, Feb 12, 2009, Talin wrote:
>
> Secondly, there is an argument to be made towards moving away from any
> syntactical pattern that requires the programmer to synchronize two
> lists, in this case the set of '%' field markers in the string and the
> sequence of replacement values.
Having to maintain a correspondence > between lists is almost never a problem when code is first written, but > I think we can all remember instances where bugs have been introduced by > maintainers who added a new item to one list but forgot to add the > corresponding item to the other list. This is still problematic with {#} syntax when lists are re-ordered, which creates a more difficult editing job. Moreover, precisely because {#} syntax doesn't blow up in the face of adding elements, it creates more silent bugs IMO. Altogether, although you have a point I think it ends up being the same total pain, just shifted around. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Weinberg's Second Law: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization. From ryan.freckleton at gmail.com Sun Feb 22 03:04:04 2009 From: ryan.freckleton at gmail.com (Ryan Freckleton) Date: Sat, 21 Feb 2009 19:04:04 -0700 Subject: [Python-ideas] Revised**6 PEP on yield-from In-Reply-To: <49A08390.1020405@canterbury.ac.nz> References: <49A08390.1020405@canterbury.ac.nz> Message-ID: <318072440902211804j64cd5180vdbcba21530a792f8@mail.gmail.com> I haven't been following the discussion too closely, but shouldn't it be "return value" is semantically equivalent to raise StopIteration(value) instead of "raise value"? in bullet 2 of formal semantics? ===== --Ryan E. Freckleton On Sat, Feb 21, 2009 at 3:43 PM, Greg Ewing wrote: > I've re-worded things yet again to nail the semantics > down as per recent discussions. Briefly: > > * send(None) converted to next() upon delegation > * send(not_None) raises exception if no send() method > * throw() and close() ignore missing methods > > No longer describing semantics in terms of "direct > communication". > > Fixed a bug in the expansion (return value of > throw() was getting lost). 
> > > PEP: XXX > Title: Syntax for Delegating to a Subgenerator > Version: $Revision$ > Last-Modified: $Date$ > Author: Gregory Ewing > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 13-Feb-2009 > Python-Version: 2.7 > Post-History: > > > Abstract > ======== > > A syntax is proposed for a generator to delegate part of its > operations to another generator. This allows a section of code > containing 'yield' to be factored out and placed in another > generator. Additionally, the subgenerator is allowed to return with a > value, and the value is made available to the delegating generator. > > The new syntax also opens up some opportunities for optimisation when > one generator re-yields values produced by another. > > > Proposal > ======== > > The following new expression syntax will be allowed in the body of a > generator: > > :: > > yield from > > where is an expression evaluating to an iterable, from which an > iterator is extracted. The iterator is run to exhaustion, during which > time it yields and receives values directly to or from the caller of > the generator containing the ``yield from`` expression (the > "delegating generator"). > > When the iterator is another generator, the effect is the same as if > the body of the subgenerator were inlined at the point of the ``yield > from`` expression. Furthermore, the subgenerator is allowed to execute > a ``return`` statement with a value, and that value becomes the value of > the ``yield from`` expression. > > In general, the semantics can be understood in terms of the iterator > protocol as follows: > > * Any values that the iterator yields are passed directly to the > caller. > > * Any values sent to the delegating generator using ``send()`` > are passed directly to the iterator. If the sent value is None, > the iterator's ``next()`` method is called. 
If the sent value is > not None, the iterator's ``send()`` method is called if it has > one, otherwise an exception is raised in the delegating generator. > > * Calls to the ``throw()`` method of the delegating generator are > forwarded to the iterator. If the iterator does not have a > ``throw()`` method, the thrown-in exception is raised in the > delegating generator. > > * If the delegating generator's ``close()`` method is called, the > ``close() method of the iterator is called first if it has one, > then the delegating generator is finalised. > > * The value of the ``yield from`` expression is the first argument > to the ``StopIteration`` exception raised by the iterator when it > terminates. > > * ``return expr`` in a generator causes ``StopIteration(expr)`` to > be raised. > > For convenience, the ``StopIteration`` exception will be given a > ``value`` attribute that holds its first argument, or None if there > are no arguments. > > > Formal Semantics > ---------------- > > 1. The statement > > :: > > result = yield from expr > > is semantically equivalent to > > :: > > _i = iter(expr) > try: > _u = _i.next() > while 1: > try: > _v = yield _u > except Exception, _e: > if hasattr(_i, 'throw'): > _u = _i.throw(_e) > else: > raise > else: > if _v is None: > _u = _i.next() > else: > _u = _i.send(_v) > except StopIteration, _e: > result = _e.value > finally: > if hasattr(_i, 'close'): > _i.close() > > 2. In a generator, the statement > > :: > > raise value > > is semantically equivalent to > > raise StopIteration(value) > > except that, as currently, the exception cannot be caught by 'except' > clauses within the returning generator. > > 3. 
The StopIteration exception behaves as though defined thusly: > > :: > > class StopIteration(Exception): > > def __init__(self, *args): > if len(args) > 0: > self.value = args[0] > else: > self.value = None > Exception.__init__(self, *args) > > > Rationale > ========= > > A Python generator is a form of coroutine, but has the limitation that > it can only yield to its immediate caller. This means that a piece of > code containing a ``yield`` cannot be factored out and put into a > separate function in the same way as other code. Performing such a > factoring causes the called function to itself become a generator, and > it is necessary to explicitly iterate over this second generator and > re-yield any values that it produces. > > If yielding of values is the only concern, this is not very arduous > and can be performed with a loop such as > > :: > > for v in g: > yield v > > However, if the subgenerator is to interact properly with the caller > in the case of calls to ``send()``, ``throw()`` and ``close()``, things > become considerably more complicated. As the formal expansion presented > above illustrates, the necessary code is very longwinded, and it is tricky > to handle all the corner cases correctly. In this situation, the advantages > of a specialised syntax should be clear. > > > Generators as Threads > --------------------- > > A motivating use case for generators being able to return values > concerns the use of generators to implement lightweight threads. When > using generators in that way, it is reasonable to want to spread the > computation performed by the lightweight thread over many functions. > One would like to be able to call a subgenerator as though it were > an ordinary function, passing it parameters and receiving a returned > value. > > Using the proposed syntax, a statement such as > > :: > > y = f(x) > > where f is an ordinary function, can be transformed into a delegation > call > > :: > > y = yield from g(x) > > where g is a generator. 
One can reason about the behaviour of the > resulting code by thinking of g as an ordinary function that can be > suspended using a ``yield`` statement. > > When using generators as threads in this way, typically one is not > interested in the values being passed in or out of the yields. > However, there are use cases for this as well, where the thread is > seen as a producer or consumer of items. The ``yield from`` > expression allows the logic of the thread to be spread over as > many functions as desired, with the production or consumption of > items occuring in any subfunction, and the items are automatically > routed to or from their ultimate source or destination. > > Concerning ``throw()`` and ``close()``, it is reasonable to expect > that if an exception is thrown into the thread from outside, it should > first be raised in the innermost generator where the thread is suspended, > and propagate outwards from there; and that if the thread is terminated > from outside by calling ``close()``, the chain of active generators > should be finalised from the innermost outwards. > > > Syntax > ------ > > The particular syntax proposed has been chosen as suggestive of its > meaning, while not introducing any new keywords and clearly standing > out as being different from a plain ``yield``. > > > Optimisations > ------------- > > Using a specialised syntax opens up possibilities for optimisation > when there is a long chain of generators. Such chains can arise, for > instance, when recursively traversing a tree structure. The overhead > of passing ``next()`` calls and yielded values down and up the chain > can cause what ought to be an O(n) operation to become O(n\*\*2). > > A possible strategy is to add a slot to generator objects to hold a > generator being delegated to. When a ``next()`` or ``send()`` call is > made on the generator, this slot is checked first, and if it is > nonempty, the generator that it references is resumed instead. 
If it > raises StopIteration, the slot is cleared and the main generator is > resumed. > > This would reduce the delegation overhead to a chain of C function > calls involving no Python code execution. A possible enhancement would > be to traverse the whole chain of generators in a loop and directly > resume the one at the end, although the handling of StopIteration is > more complicated then. > > > Use of StopIteration to return values > ------------------------------------- > > There are a variety of ways that the return value from the generator > could be passed back. Some alternatives include storing it as an > attribute of the generator-iterator object, or returning it as the > value of the ``close()`` call to the subgenerator. However, the proposed > mechanism is attractive for a couple of reasons: > > * Using the StopIteration exception makes it easy for other kinds > of iterators to participate in the protocol without having to > grow extra attributes or a close() method. > > * It simplifies the implementation, because the point at which the > return value from the subgenerator becomes available is the same > point at which StopIteration is raised. Delaying until any later > time would require storing the return value somewhere. > > > Criticisms > ========== > > Under this proposal, the value of a ``yield from`` expression would > be derived in a very different way from that of an ordinary ``yield`` > expression. This suggests that some other syntax not containing the > word ``yield`` might be more appropriate, but no acceptable alternative > has so far been proposed. > > It has been suggested that some mechanism other than ``return`` in > the subgenerator should be used to establish the value returned by > the ``yield from`` expression. However, this would interfere with > the goal of being able to think of the subgenerator as a suspendable > function, since it would not be able to return values in the same way > as other functions. 
> > The use of an argument to StopIteration to pass the return value > has been criticised as an "abuse of exceptions", without any > concrete justification of this claim. In any case, this is only > one suggested implementation; another mechanism could be used > without losing any essential features of the proposal. > > It has been suggested that a different exception, such as > GeneratorReturn, should be used instead of StopIteration to return a > value. However, no convincing practical reason for this has been put > forward, and the addition of a ``value`` attribute to StopIteration > mitigates any difficulties in extracting a return value from a > StopIteration exception that may or may not have one. Also, using a > different exception would mean that, unlike ordinary functions, > 'return' without a value in a generator would not be equivalent to > 'return None'. > > > Alternative Proposals > ===================== > > Proposals along similar lines have been made before, some using the > syntax ``yield *`` instead of ``yield from``. While ``yield *`` is > more concise, it could be argued that it looks too similar to an > ordinary ``yield`` and the difference might be overlooked when reading > code. > > To the author's knowledge, previous proposals have focused only on > yielding values, and thereby suffered from the criticism that the > two-line for-loop they replace is not sufficiently tiresome to write > to justify a new syntax. By also dealing with calls to ``send()``, > ``throw()`` and ``close()``, this proposal provides considerably more > benefit. > > > Copyright > ========= > > This document has been placed in the public domain. > > > > .. 
> Local Variables: > mode: indented-text > indent-tabs-mode: nil > sentence-end-double-space: t > fill-column: 70 > coding: utf-8 > End: > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From dangyogi at gmail.com Sun Feb 22 05:02:17 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Sat, 21 Feb 2009 23:02:17 -0500 Subject: [Python-ideas] Revised**6 PEP on yield-from In-Reply-To: <49A08390.1020405@canterbury.ac.nz> References: <49A08390.1020405@canterbury.ac.nz> Message-ID: <49A0CE49.3000905@gmail.com> Greg Ewing wrote: > * Any values sent to the delegating generator using ``send()`` > are passed directly to the iterator. If the sent value is None, > the iterator's ``next()`` method is called. If the sent value is > not None, the iterator's ``send()`` method is called if it has > one, otherwise an exception is raised in the delegating generator. Shouldn't this define which exception is raised? Also, raising the exception within the delegating generator will (unless caught there) finalize the generator. This may cause surprising results if the caller catches the exception and tries to continue to use the generator. Intuitively, I would expect that the delegating generator would not see this exception; as if the delegating generator itself lacked a send method. The reasoning is that the error is with the caller and not the delegating generator. Also, given that the send method may work while the delegating generator is outside of any yield from, but not work during a yield from; not raising the exception within the delegating generator gives the caller a safe way to test the waters without finalizing the delegating generator. OTOH, this may be a reason to just translate send to next for non-None values too??? 
Perhaps the justification in that case would be to think of it like sending to a yield *statement* (which can't accept the sent value) -- which is not an error in generators. (Is it too late to change my vote for send(non-None) generating an exception? :-) > [snip] > > 2. In a generator, the statement > > :: > > raise value > > is semantically equivalent to > > raise StopIteration(value) Did you mean "return value" rather than "raise value" here? -bruce frederiksen From greg.ewing at canterbury.ac.nz Sun Feb 22 10:14:42 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 22 Feb 2009 22:14:42 +1300 Subject: [Python-ideas] Revised**6 PEP on yield-from In-Reply-To: <318072440902211804j64cd5180vdbcba21530a792f8@mail.gmail.com> References: <49A08390.1020405@canterbury.ac.nz> <318072440902211804j64cd5180vdbcba21530a792f8@mail.gmail.com> Message-ID: <49A11782.6070406@canterbury.ac.nz> Ryan Freckleton wrote: > I haven't been following the discussion too closely, but shouldn't it be > "return value" is semantically equivalent to raise > StopIteration(value) Yes, that was a typo. -- Greg From greg.ewing at canterbury.ac.nz Sun Feb 22 11:25:33 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 22 Feb 2009 23:25:33 +1300 Subject: [Python-ideas] Revised**6 PEP on yield-from In-Reply-To: <49A0CE49.3000905@gmail.com> References: <49A08390.1020405@canterbury.ac.nz> <49A0CE49.3000905@gmail.com> Message-ID: <49A1281D.20101@canterbury.ac.nz> Bruce Frederiksen wrote: > Greg Ewing wrote: > If the sent value is >> not None, the iterator's ``send()`` method is called if it has >> one, otherwise an exception is raised in the delegating generator. > > Shouldn't this define which exception is raised? To put it more precisely, whatever exception results from attempting to call the non-existent send() method is propagated into the delegating generator. 
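Under the expansion in the PEP draft being discussed, the "whatever
exception results" would be the ``AttributeError`` raised by the
attempted ``_i.send(_v)`` call. A minimal sketch of the behaviour
Greg describes, surfacing inside the delegating generator — hedged:
this runs on Python 3.3+, where the semantics as eventually adopted
match this point, and the generator names are invented for
illustration:

```python
def delegator():
    try:
        yield from iter([10, 20, 30])   # plain iterator: no send()
    except AttributeError:
        # The failed send() is raised at the yield-from expression,
        # i.e. inside the delegating generator, so it can be
        # observed (or cleaned up after) here.
        yield "no send() on the subiterator"

g = delegator()
print(next(g))       # -> 10
print(g.send(None))  # None is forwarded as next(): -> 20
print(g.send(42))    # non-None send fails; caught inside delegator
```

The last ``send`` yields the fallback string rather than propagating
the error to the caller, which is exactly what "raised in the
delegating generator" buys you over raising it directly out of the
``send()`` call.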
> Intuitively, I would expect that the delegating generator would not see > this exception; as if the delegating generator itself lacked a send > method. This would introduce an inconsistency between delegating to a generator and delegating to some other kind of iterator. When delegating to another generator, the inlining principle requires that any exceptions raised by the subgenerator must be propagated through the delegating generator. This includes whatever exceptions might result from attempting to send values to the subgenerator. My feeling is that other iterators should behave the same way as generators, as closely as possible, when delegated to. There's also the consideration that the semantics you propose can't be expressed in terms of a Python expansion, since there's no way for a generator to throw an exception right out of itself without triggering any except or finally blocks on the way. While that's not a fatal flaw, I think it's highly desirable to be able to specify the semantics in terms of an expansion, because of its precision. Currently the expansion in the PEP is the only precise and complete specification. It's very hard to express all the nuances and ramifications in words and be sure that you've covered everything -- as witnessed by your comments above! > OTOH, this may be a reason to just translate send to next for non-None > values too??? Perhaps the justification in that case would be to think > of it like sending to a yield *statement* (which can't accept the sent > value) -- which is not an error in generators. That's a distinct possibility. Guido pointed out that there is an existing case where send() refuses to accept anything other than None, and that's when you call it immediately after the generator starts. But that case doesn't apply here, because the first call to a delegated iterator is always made implicitly by the yield-from expression itself. 
So a send() that gets delegated to a subiterator is *never* the first call, and therefore it should ignore any sent values that it doesn't care about. In other words, go back to what I had in the first draft of the PEP: if hasattr(_i, 'send'): _u = _i.send(_v) else: _u = _i.next() or perhaps if _v is not None and hasattr(_i, 'send'): _u = _i.send(_v) else: _u = _i.next() Guido, what do you think about this? -- Greg From greg.ewing at canterbury.ac.nz Sun Feb 22 12:55:49 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 23 Feb 2009 00:55:49 +1300 Subject: [Python-ideas] Yield-from: Implementation available Message-ID: <49A13D45.7080608@canterbury.ac.nz> I've got a prototype implementation working. You can get it here in the form of a patch to Python 2.6.1. http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/Python-2.6.1-yieldfrom.zip -- Greg From guido at python.org Sun Feb 22 16:57:52 2009 From: guido at python.org (Guido van Rossum) Date: Sun, 22 Feb 2009 07:57:52 -0800 Subject: [Python-ideas] Revised**6 PEP on yield-from In-Reply-To: <49A1281D.20101@canterbury.ac.nz> References: <49A08390.1020405@canterbury.ac.nz> <49A0CE49.3000905@gmail.com> <49A1281D.20101@canterbury.ac.nz> Message-ID: On Sun, Feb 22, 2009 at 2:25 AM, Greg Ewing wrote: > Bruce Frederiksen wrote: >> >> Greg Ewing wrote: >> If the sent value is >>> >>> not None, the iterator's ``send()`` method is called if it has >>> one, otherwise an exception is raised in the delegating generator. >> >> Shouldn't this define which exception is raised? > > To put it more precisely, whatever exception results from > attempting to call the non-existent send() method is propagated > into the delegating generator. > >> Intuitively, I would expect that the delegating generator would not see >> this exception; as if the delegating generator itself lacked a send method. > > This would introduce an inconsistency between delegating to a > generator and delegating to some other kind of iterator. 
> > When delegating to another generator, the inlining principle > requires that any exceptions raised by the subgenerator must be > propagated through the delegating generator. This includes > whatever exceptions might result from attempting to send values > to the subgenerator. > > My feeling is that other iterators should behave the same way > as generators, as closely as possible, when delegated to. > > There's also the consideration that the semantics you propose > can't be expressed in terms of a Python expansion, since there's > no way for a generator to throw an exception right out of itself > without triggering any except or finally blocks on the way. And that would be bad -- if the yield-from is inside a try/finally, I'd expect the finally clause to be run. > While that's not a fatal flaw, I think it's highly desirable > to be able to specify the semantics in terms of an expansion, > because of its precision. Currently the expansion in the PEP > is the only precise and complete specification. It's very > hard to express all the nuances and ramifications in words > and be sure that you've covered everything -- as witnessed > by your comments above! > >> OTOH, this may be a reason to just translate send to next for non-None >> values too??? Perhaps the justification in that case would be to think of >> it like sending to a yield *statement* (which can't accept the sent value) >> -- which is not an error in generators. > > That's a distinct possibility. Guido pointed out that there > is an existing case where send() refuses to accept anything > other than None, and that's when you call it immediately > after the generator starts. > > But that case doesn't apply here, because the first call to a > delegated iterator is always made implicitly by the yield-from > expression itself. So a send() that gets delegated to a subiterator > is *never* the first call, and therefore it should ignore any sent > values that it doesn't care about. 
> > In other words, go back to what I had in the first draft of > the PEP: > > if hasattr(_i, 'send'): > _u = _i.send(_v) > else: > _u = _i.next() > > or perhaps > > if _v is not None and hasattr(_i, 'send'): > _u = _i.send(_v) > else: > _u = _i.next() > > Guido, what do you think about this? I think it's all pretty academic as long as it is specified with sufficient exactness that someone else reimplementing it will arrive at the same choices. I don't particularly like the LBYL (Look Before You Leap) idiom, so let's do this: if _v is None: _u = _i.next() # Or, in Py3k, _u = next(i) else: _u = _i.send(_v) This means that sending a non-None value into a generator that delegates to a non-generator iterator will fail. I doubt there will be too many use cases that are inconvenienced by this. After all, the presence of a 'send' attribute doesn't mean it can be called anyway, and even if it can, it doesn't mean the call will succeed. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From benjamin at python.org Sun Feb 22 19:03:00 2009 From: benjamin at python.org (Benjamin Peterson) Date: Sun, 22 Feb 2009 18:03:00 +0000 (UTC) Subject: [Python-ideas] Yield-from: Implementation available References: <49A13D45.7080608@canterbury.ac.nz> Message-ID: Greg Ewing writes: > > I've got a prototype implementation working. You > can get it here in the form of a patch to > Python 2.6.1. Do you want code reviews? If so, please post it to Rietveld. From greg.ewing at canterbury.ac.nz Sun Feb 22 20:45:07 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 23 Feb 2009 08:45:07 +1300 Subject: [Python-ideas] Revised**6 PEP on yield-from In-Reply-To: References: <49A08390.1020405@canterbury.ac.nz> <49A0CE49.3000905@gmail.com> <49A1281D.20101@canterbury.ac.nz> Message-ID: <49A1AB43.2020605@canterbury.ac.nz> Guido van Rossum wrote: > This means that sending a non-None value into a generator that > delegates to a non-generator iterator will fail. 
I doubt there will be > too many use cases that are inconvenienced by this. Okay, I'll leave it that way -- we can always relax it later if need be. I'll also update the PEP to be more clear about what exception is raised and where it goes. -- Greg From solipsis at pitrou.net Sun Feb 22 21:50:01 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 22 Feb 2009 20:50:01 +0000 (UTC) Subject: [Python-ideas] Yield-from: Implementation available References: <49A13D45.7080608@canterbury.ac.nz> Message-ID: Greg Ewing writes: > > I've got a prototype implementation working. You > can get it here in the form of a patch to > Python 2.6.1. Out of curiosity, did you do some timings on some of the examples you presented (e.g. the tree-walking one :-)) ? From greg.ewing at canterbury.ac.nz Sun Feb 22 22:00:21 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 23 Feb 2009 10:00:21 +1300 Subject: [Python-ideas] Yield-from: Implementation available In-Reply-To: References: <49A13D45.7080608@canterbury.ac.nz> Message-ID: <49A1BCE5.1010104@canterbury.ac.nz> Antoine Pitrou wrote: > Out of curiosity, did you do some timings on some of the examples you presented > (e.g. the tree-walking one :-)) ? Not yet, I've been concentrating on getting the basic functionality working, but I'll be doing this soon. -- Greg From greg.ewing at canterbury.ac.nz Sun Feb 22 22:44:04 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 23 Feb 2009 10:44:04 +1300 Subject: [Python-ideas] Yield-from: Timings Message-ID: <49A1C724.6030807@canterbury.ac.nz> I've done some timings, and preliminary results suggest that the overhead of delegating via a yield-from is about 10% of what it is using a for-loop with a yield in it. That's somewhat higher than I was expecting, but still a considerable improvement. 
--
Greg


From greg.ewing at canterbury.ac.nz  Sun Feb 22 23:52:58 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 23 Feb 2009 11:52:58 +1300
Subject: [Python-ideas] Yield-from: More timings
Message-ID: <49A1D74A.4040404@canterbury.ac.nz>

I neglected to turn off cyclic gc while performing the timings.
Doing so, the overhead is measured at about 5%.

I've also tried some experiments with traversing a binary tree. For
very small trees the yield-from version starts off slightly slower
overall than the for-loop version. Break-even occurs at about 31
nodes in the tree, and after that the yield-from version steadily
gains ground, until at 1 million nodes it's 1.6 times faster.

The columns below are: depth of the tree, for-loop time, yield-from
time and the ratio of the times (> 1 meaning yield-from is faster).

 1   0.000049   0.000060  0.805263
 2   0.000051   0.000072  0.706534
 3   0.000069   0.000108  0.637564
 4   0.000130   0.000148  0.878364
 5   0.000265   0.000253  1.04675
 6   0.000457   0.000458  0.997223
 7   0.000988   0.000920  1.07379
 8   0.002025   0.001743  1.162
 9   0.003985   0.003437  1.15947
10   0.008345   0.006848  1.2185
11   0.017398   0.013902  1.25145
12   0.036223   0.027844  1.30091
13   0.075175   0.055327  1.35874
14   0.154206   0.110290  1.39819
15   0.318121   0.220421  1.44324
16   0.663929   0.442529  1.50031
17   1.369077   0.891843  1.53511
18   2.827645   1.798144  1.57254
19   5.883996   3.661280  1.60709
20  12.093364   7.424631  1.62882

def forloop(node):
    if node == 1:
        yield node
    else:
        for x in forloop(node[0]):
            yield x
        for x in forloop(node[1]):
            yield x

def yieldfrom(node):
    if node == 1:
        yield node
    else:
        yield from yieldfrom(node[0])
        yield from yieldfrom(node[1])

--
Greg


From greg.ewing at canterbury.ac.nz  Mon Feb 23 06:14:51 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 23 Feb 2009 18:14:51 +1300
Subject: [Python-ideas] New version of yield-from patch
Message-ID: <49A230CB.6090703@canterbury.ac.nz>

I've fixed a few bugs and included some more tests and examples.
http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/Python-2.6.1-yieldfrom-rev2.zip

--
Greg

From venkat83 at gmail.com  Mon Feb 23 11:44:50 2009
From: venkat83 at gmail.com (Venkatraman S)
Date: Mon, 23 Feb 2009 16:14:50 +0530
Subject: [Python-ideas] Register based interpreter
In-Reply-To:
References: <50697b2c0902201015g7572d645hbfe8b0fe60960f8d@mail.gmail.com>
Message-ID:

On Sat, Feb 21, 2009 at 1:18 AM, Antoine Pitrou wrote:
> > Antonio Cuni made some experiments on PyPy about this. If you ask at
> > the pypy-dev mailing list or on irc (#pypy on freenode.net) he or
> > others can explain what happened.

Looks like there was no attempt on that front.

> The biggest complication I can think of with a register-based VM is that you
> have to decref objects as soon as they aren't used anymore, which means you
> have to track the actual lifetime of registers (while it's done automatically
> with a stack-based design).

I hope to take this on as a 'fun' project and try it out. Not sure how far I will reach.

-V-
http://twitter.com/venkat83

From qrczak at knm.org.pl  Mon Feb 23 14:05:09 2009
From: qrczak at knm.org.pl (Marcin 'Qrczak' Kowalczyk)
Date: Mon, 23 Feb 2009 14:05:09 +0100
Subject: [Python-ideas] Revised**6 PEP on yield-from
In-Reply-To: <49A08390.1020405@canterbury.ac.nz>
References: <49A08390.1020405@canterbury.ac.nz>
Message-ID: <3f4107910902230505q22ec3402q57d6ddfd3698187a@mail.gmail.com>

Given that the expansion is quite complicated... Imagine that we want to alter the values while passing them from the inner generator. For example:

    for x in expr:
        yield x+1

How can this be made as transparent to send/close/etc. as yield from?
If the transformation is an expression, one can use a generator comprehension:

    yield from (x+1 for x in expr)

but this does not work if we have some statements inside:

    seen = set()
    for x in expr:
        if x not in seen:
            seen.add(x)
            yield x

(I'm not familiar with the details of how send works, but I hope the point is valid.)

--
Marcin Kowalczyk
qrczak at knm.org.pl
http://qrnik.knm.org.pl/~qrczak/

From collinw at gmail.com  Tue Feb 24 01:54:56 2009
From: collinw at gmail.com (Collin Winter)
Date: Mon, 23 Feb 2009 16:54:56 -0800
Subject: [Python-ideas] Register based interpreter
In-Reply-To:
References: <50697b2c0902201015g7572d645hbfe8b0fe60960f8d@mail.gmail.com>
Message-ID: <43aa6ff70902231654v404e839boe0a22055f729c68b@mail.gmail.com>

On Mon, Feb 23, 2009 at 2:44 AM, Venkatraman S wrote:
>
> On Sat, Feb 21, 2009 at 1:18 AM, Antoine Pitrou wrote:
>>
>> > Antonio Cuni made some experiments on PyPy about this. If you ask at
>> > the pypy-dev mailing list or on irc (#pypy on freenode.net) he or
>> > others can explain what happened.
>
> Looks like there was no attempt on that front.
>
>> The biggest complication I can think of with a register-based VM is that
>> you have to decref objects as soon as they aren't used anymore, which means
>> you have to track the actual lifetime of registers (while it's done
>> automatically with a stack-based design).
>
> I hope to take this on as a 'fun' project and try it out. Not sure how far I
> will reach.

I think this would be interesting to attempt. Lua switched from a stack-based VM to a register-based VM for Lua 5.0. http://www.tecgraf.puc-rio.br/~lhf/ftp/doc/sblp2005.pdf includes some benchmark numbers.
Collin Winter

From greg.ewing at canterbury.ac.nz  Wed Feb 25 11:39:45 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 25 Feb 2009 23:39:45 +1300
Subject: [Python-ideas] Revised^4 PEP on yield-from
In-Reply-To:
References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz>
Message-ID: <49A51FF1.9060000@canterbury.ac.nz>

Raymond Hettinger wrote:
> The hasattr() tests can be expensive for repeated throwers and senders.
> Any merit to caching the result of the test?

The expansion is a specification of the semantics, not an implementation. My current implementation does something like

    _m = getattr(_i, 'throw', None)
    if _m:
        _m(_e)

and a further optimisation would be to cache the bound method.

--
Greg

From ironfroggy at gmail.com  Wed Feb 25 14:55:53 2009
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Wed, 25 Feb 2009 08:55:53 -0500
Subject: [Python-ideas] Super, Hooks, and Aspect Oriented Programming
Message-ID: <76fd5acf0902250555y52db25ccve7d7d6fd7a37fb0f@mail.gmail.com>

I've been giving some thought to super-calls, and I'm looking to consolidate the common patterns of its use. I think if we can do that, we can find a more concise expression of the different uses. So far I've come up with every use being one or more of the following:

- Do something before the super method is called, optionally changing the parameters
- Do something after the super method is called, optionally changing the return value
- Do something with an exception the function raises
- Consume new parameters, used by one or more other hooks in place
- Do something instead of calling the super method, if some condition is true (such as a particular parameter being present of a specific type)

Can anyone add to this? Are there any restrictions to the combination of usage patterns? Obviously, this doesn't capture them all, but can we capture most of them?
If we could, could we wrap them up in some hook library (3rd party or otherwise)?

@before and @after are common enough ideas for decorators already. @catching could be used to catch exceptions raised by the super (or another) method.

How could these patterns consume extra parameters to hand over to one or more hooks? How do we do this in a way that the MRO doesn't mess up what parameters we consume at what point? I feel the common patterns we see now are error prone, and finding this concise solution might give us better control.

I'd love to see something like this included with a future release, but right now I'm trying to get ideas for what it would look like, so I can build a library and use it, and others can use it, and it can be seen if it's even useful in real world practice.

--
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://techblog.ironfroggy.com/
Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy

From guido at python.org  Wed Feb 25 18:24:55 2009
From: guido at python.org (Guido van Rossum)
Date: Wed, 25 Feb 2009 09:24:55 -0800
Subject: [Python-ideas] Super, Hooks, and Aspect Oriented Programming
In-Reply-To: <76fd5acf0902250555y52db25ccve7d7d6fd7a37fb0f@mail.gmail.com>
References: <76fd5acf0902250555y52db25ccve7d7d6fd7a37fb0f@mail.gmail.com>
Message-ID:

I think no exploration of super() can be complete without considering the difference between overriding a constructor (__init__ or __new__) and overriding a regular method. Another thing to include is the use of keyword arguments. Also, have you looked at how other languages do this yet?

--Guido

On Wed, Feb 25, 2009 at 5:55 AM, Calvin Spealman wrote:
> I've been giving some thought to super-calls, and I'm looking to
> consolidate the common patterns of its use. I think if we can do that,
> we can find a more concise expression of the different uses.
So far > I've come up with every use being one or more of the following: > > - Do something before the super method is called, optionally changing > the parameters > - Do something after the super method is called, optionally changing > the return value > - Do something with an exception the function raises > - Consume new parameters, used by one or more other hooks in place > - Do something instead of calling the super method, if some condition > is true (such as a particular parameter being present of a specific > type) > > Can anyone add to this? Are there any restrictions to the combination > of usage patterns? Obviously, this doesn't capture them all, but can > we capture most of them? If we could, could we wrap them up in some > hook library (3rd party or otherwise) > > @before and @after are common enough ideas for decorators already > @catching could be used to catch exceptions raised by the super (or > another) method > > How could these patterns consume extra parameters to hand over to one > or more hooks? > How do we do this in a way that we the MRO doesn't mess up what > parameters we consume at what point? I feel the common patterns we see > now are error prone, and finding this concise solution might give us > better control. > > I'd love to see something like this included with a future release, > but right now I'm trying to get ideas for what it would look like, so > I can build a library and use it, and others can use it, and it can be > seen if its even useful in real world practice. > > -- > Read my blog! I depend on your acceptance of my opinion! I am interesting! 
> http://techblog.ironfroggy.com/ > Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Wed Feb 25 20:47:13 2009 From: python at rcn.com (Raymond Hettinger) Date: Wed, 25 Feb 2009 11:47:13 -0800 Subject: [Python-ideas] Revised^4 PEP on yield-from References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> <49A51FF1.9060000@canterbury.ac.nz> Message-ID: <8E5DFA88F9874BCEA828C390E9BF945F@RaymondLaptop1> [Greg Ewing] >> The hasattr() tests can be expensive for repeated throwers and senders. >> Any merit to caching the result of the test? > > The expansion is a specification of the semantics, not > an implementation. I would have thought that running hasattr() no more than once is a semantic detail. Also, speed is a relevant consideration whenever users have the choice of continuing to do things the old way. If the new syntax is slower for key use cases, why would anyone adopt it? my-two-cents-ly yours, Raymond From greg.ewing at canterbury.ac.nz Wed Feb 25 21:53:40 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Feb 2009 09:53:40 +1300 Subject: [Python-ideas] Super, Hooks, and Aspect Oriented Programming In-Reply-To: References: <76fd5acf0902250555y52db25ccve7d7d6fd7a37fb0f@mail.gmail.com> Message-ID: <49A5AFD4.8010608@canterbury.ac.nz> Guido van Rossum wrote: > I think no exploration of super() can be complete without considering > the difference between overriding a constructor (__init__ or __new__) > and overriding a regular method. Another thing to include is the use > of keyword arguments. 
I'm not a big fan of @before and @after kinds of things. They gain little over writing out the equivalent code in full, and add more mental baggage to the language to keep in your head.

--
Greg

From leif.walsh at gmail.com  Thu Feb 26 09:08:00 2009
From: leif.walsh at gmail.com (Leif Walsh)
Date: Thu, 26 Feb 2009 03:08:00 -0500
Subject: [Python-ideas] [Python-Dev] A suggestion: Do proto-PEPs in Google Docs
In-Reply-To: <87tz6ppzk9.fsf@xemacs.org>
References: <499C6E00.2030602@canterbury.ac.nz> <499CC6E8.5050104@ronadam.com> <499D243B.8080801@canterbury.ac.nz> <499D2740.1070408@canterbury.ac.nz> <499D3086.6020706@canterbury.ac.nz> <87tz6ppzk9.fsf@xemacs.org>
Message-ID:

On Thu, Feb 19, 2009 at 10:17 PM, Stephen J. Turnbull wrote:
> Overall, I recommend use of Google Docs for "Python-Ideas" level of
> PEP drafts.

+1! I also like Google Sites for collaborative editing.

--
Cheers, Leif

From daniel at stutzbachenterprises.com  Thu Feb 26 15:36:29 2009
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Thu, 26 Feb 2009 08:36:29 -0600
Subject: [Python-ideas] "try with" syntactic sugar
Message-ID:

Around two-thirds of the time, whenever I use the wonderful new "with" construct, it's enclosed in a "try-except" block, like this:

    try:
        with something as f:
            many lines of code
    except some_error:
        handle error

The "with" statement is great, but it results in the bulk of the code being indented twice. I'd like to propose a little syntactic sugar, the "try with":

    try with something as f:
        many lines of code
    except some_error:
        handle error

It saves one line of vertical space, and gets rid of an indentation level for the bulk of the code that rests within the "with" statement. Thoughts?
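For concreteness, here is a runnable instance of the try/with pattern under discussion; the path below is a deliberate stand-in that should not exist, so the handler fires:

```python
def read_config(path):
    try:
        with open(path) as f:
            # "many lines of code" would normally go here
            return f.read()
    except IOError:
        # handle the error -- here, fall back to an empty config
        return ''

print(repr(read_config('/nonexistent/example.cfg')))
```

The body sits two levels deep even though only the `with` line can realistically raise the exception being handled, which is exactly the indentation cost the proposal targets.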
Here's a short script to count how many uses of "with" within your code are immediately preceded by "try":

#!/usr/bin/python
import re, sys

re_with = re.compile(r'(try:[ \t]*)?[\r\n]+[ \t]+with ')

try_with = 0
total = 0
for fname in sys.argv[1:]:
    data = open(fname).read()
    for match in re_with.findall(data):
        if match: try_with += 1
        total += 1

print 'try-with:', try_with, 'out of:', total, '(', try_with*100.0/total, '%)'

Usage:

Cashew:~$ /tmp/count_try_with.py *.py
try-with: 17 out of: 25 ( 68.0 %)

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC

From ironfroggy at gmail.com  Thu Feb 26 15:44:00 2009
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Thu, 26 Feb 2009 09:44:00 -0500
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To:
References:
Message-ID: <76fd5acf0902260644k2d28b17fw9944881f4e6a173a@mail.gmail.com>

+0

But... Do we also add try for:, try if:, try while:, etc.? Not terrible, but potentially terrible.

On Thu, Feb 26, 2009 at 9:36 AM, Daniel Stutzbach wrote:
> Around two-thirds of the time, whenever I use the wonderful new "with"
> construct, it's enclosed in a "try-except" block, like this:
>
>     try:
>         with something as f:
>             many lines of code
>     except some_error:
>         handle error
>
> The "with" statement is great, but it results in the bulk of the code being
> indented twice. I'd like to propose a little syntactic sugar, the "try
> with":
>
>     try with something as f:
>         many lines of code
>     except some_error:
>         handle error
>
> It saves one line of vertical space, and gets rid of an indentation level
> for the bulk of the code that rests within the "with" statement. Thoughts?
>
> Here's a short script to count how many uses of "with" within your code are
> immediately preceded by "try":
>
> #!/usr/bin/python
> import re, sys
>
> re_with = re.compile(r'(try:[ \t]*)?[\r\n]+[ \t]+with ')
>
> try_with = 0
> total = 0
> for fname in sys.argv[1:]:
>     data = open(fname).read()
>     for match in re_with.findall(data):
>         if match: try_with += 1
>         total += 1
>
> print 'try-with:', try_with, 'out of:', total, '(', try_with*100.0/total, '%)'
>
> Usage:
> Cashew:~$ /tmp/count_try_with.py *.py
> try-with: 17 out of: 25 ( 68.0 %)
>
> --
> Daniel Stutzbach, Ph.D.
> President, Stutzbach Enterprises, LLC
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

--
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://techblog.ironfroggy.com/
Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy

From daniel at stutzbachenterprises.com  Thu Feb 26 16:09:38 2009
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Thu, 26 Feb 2009 09:09:38 -0600
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To: <76fd5acf0902260644k2d28b17fw9944881f4e6a173a@mail.gmail.com>
References: <76fd5acf0902260644k2d28b17fw9944881f4e6a173a@mail.gmail.com>
Message-ID:

On Thu, Feb 26, 2009 at 8:44 AM, Calvin Spealman wrote:
> But... Do we also add try for:, try if:, try while:, etc.? Not
> terrible, but potentially terrible.

No, because they're rarely useful. Modifying my script a little, less than 1% of my "if" and "for" statements are immediately preceded by "try", compared to 68% of my "with" statements. "with" blocks very frequently define the scope of an object that might generate a particular type of exception (e.g., file objects may throw IOErrors, SQL transactions might throw SQL exceptions, etc.).

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC

From eric at trueblade.com  Thu Feb 26 16:10:40 2009
From: eric at trueblade.com (Eric Smith)
Date: Thu, 26 Feb 2009 10:10:40 -0500
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To:
References:
Message-ID: <49A6B0F0.4000107@trueblade.com>

Daniel Stutzbach wrote:
...
> The "with" statement is great, but it results in the bulk of the code
> being indented twice. I'd like to propose a little syntactic sugar, the
> "try with":
>
>     try with something as f:
>         many lines of code
>     except some_error:
>         handle error

Not commenting on how useful the proposal is (I'm mostly stuck in Python 2.4), but couldn't this be:

    with something as f:
        many lines of code
    except some_error:
        handle error

That is, you don't really need the "try", since the above syntax is currently illegal?

Eric.

From alec at swapoff.org  Thu Feb 26 16:14:44 2009
From: alec at swapoff.org (Alec Thomas)
Date: Fri, 27 Feb 2009 02:14:44 +1100
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To: <49A6B0F0.4000107@trueblade.com>
References: <49A6B0F0.4000107@trueblade.com>
Message-ID: <5b52d53b0902260714j12efb93fw1f7b7c0ec2654650@mail.gmail.com>

2009/2/27 Eric Smith :
> Daniel Stutzbach wrote:
> ...
>> The "with" statement is great, but it results in the bulk of the code
>> being indented twice. I'd like to propose a little syntactic sugar, the
>> "try with":
>>
>>     try with something as f:
>>         many lines of code
>>     except some_error:
>>         handle error
>
> Not commenting on how useful the proposal is (I'm mostly stuck in Python
> 2.4), but couldn't this be:
>
>     with something as f:
>         many lines of code
>     except some_error:
>         handle error

+1

Like the OP, I find myself doing this same sequence of code over and over. The syntax you describe here would clean the code up considerably.

--
"Life? Don't talk to me about life."
- Marvin From cesare.dimauro at a-tono.com Thu Feb 26 16:19:13 2009 From: cesare.dimauro at a-tono.com (Cesare Di Mauro) Date: Thu, 26 Feb 2009 16:19:13 +0100 Subject: [Python-ideas] "try with" syntactic sugar In-Reply-To: <5b52d53b0902260714j12efb93fw1f7b7c0ec2654650@mail.gmail.com> References: <49A6B0F0.4000107@trueblade.com> <5b52d53b0902260714j12efb93fw1f7b7c0ec2654650@mail.gmail.com> Message-ID: On Feb 26, 2009 at 16:14 PM, Alec Thomas wrote: > 2009/2/27 Eric Smith : >> Daniel Stutzbach wrote: >> ... >>> >>> The "with" statement is great, but it results in the bulk of the code >>> being indented twice. ?I'd like to propose a little syntactic sugar, the >>> "try with": >>> >>> try with something as f: >>> ? ?many lines of code >>> except some_error: >>> ? ?handle error >> >> Not commenting on how useful the proposal is (I'm mostly stuck in Python >> 2.4), but couldn't this be: >> >> with something as f: >> ? ?many lines of code >> except some_error: >> ? ?handle error > > +1 > > Like the OP, I find myself doing this same sequence of code over and > over. The syntax you describe here would clean the code up > considerably. +1 Same for me. It avoids an additional try: except: surrounding the with. From daniel at stutzbachenterprises.com Thu Feb 26 16:33:52 2009 From: daniel at stutzbachenterprises.com (Daniel Stutzbach) Date: Thu, 26 Feb 2009 09:33:52 -0600 Subject: [Python-ideas] "try with" syntactic sugar In-Reply-To: References: Message-ID: On Thu, Feb 26, 2009 at 8:36 AM, Daniel Stutzbach < daniel at stutzbachenterprises.com> wrote: > re_with = re.compile(r'(try:[ \t]*)?[\r\n]+[ \t]+with ') > Here's an updated regular expression that does a better job of ignoring comments and strings: re_with = re.compile(r'(try:[ \t]*)?[\r\n]+[ \t]+with .*:') -- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises, LLC -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From dangyogi at gmail.com  Thu Feb 26 19:29:23 2009
From: dangyogi at gmail.com (Bruce Frederiksen)
Date: Thu, 26 Feb 2009 13:29:23 -0500
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To:
References:
Message-ID: <49A6DF83.1070802@gmail.com>

Daniel Stutzbach wrote:
> Here's an updated regular expression that does a better job of
> ignoring comments and strings:
>
> re_with = re.compile(r'(try:[ \t]*)?[\r\n]+[ \t]+with .*:')

Here's an even better version:

#!/usr/bin/python
import re, sys

re_with = re.compile(r'''
    ^(?P<indent>\s*)                    # capture the indent
    try:
    (?:[ \t]*(?:\#.*)?[\r\n]+)?         # newlines (ignoring comments)
    (?P<with>
        (?P=indent)[ \t]+               # at higher indent level
        with\s
        (?:[^#:]*(?:(?:\#.*)?[\r\n]+)?)*:   # with .*: (ignoring comments
                                            # and newlines before :)
    )?                                  # end (?P<with>)
''', re.MULTILINE | re.VERBOSE)

try_with = 0
total = 0
for fname in sys.argv[1:]:
    data = open(fname).read()
    for match in re_with.finditer(data):
        if match.group('with'): try_with += 1
        total += 1

print 'try-with:', try_with, 'out of:', total, '(', try_with*100.0/total, '%)'

When I run this on a project of mine, I get:

try-with: 1 out of: 87 ( 1.14942528736 %)

The pattern that I find myself using is with/for, which can be counted by this program:

#!/usr/bin/python
import re, sys

re_with = re.compile(r'''
    ^(?P<indent>\s*)                    # capture the indent
    with\s
    (?:[^#:]*(?:(?:\#.*)?[\r\n]+)?)*:   # with .*: (ignoring comments
                                        # and newlines before :)
    (?:[ \t]*(?:\#.*)?[\r\n]+)?         # newlines (ignoring comments)
    (?P<for>
        (?P=indent)[ \t]+               # at higher indent level
        for\s
        (?:[^#:]*(?:(?:\#.*)?[\r\n]+)?)*:   # for .*: (ignoring comments
                                            # and newlines before :)
    )?                                  # end (?P<for>)
''', re.MULTILINE | re.VERBOSE)

with_for = 0
total = 0
for fname in sys.argv[1:]:
    data = open(fname).read()
    for match in re_with.finditer(data):
        if match.group('for'): with_for += 1
        total += 1

print 'with-for:', with_for, 'out of:', total, '(', with_for*100.0/total, '%)'

On my code, I get:

with-for: 38 out of: 47 ( 80.8510638298 %)

A few days ago, I proposed that a __throw__ method be added to context managers so that context managers could be used to capture common try/except usage patterns along with their current ability to capture common try/finally patterns. I am curious whether you see common try/except patterns that could be captured in a context manager so that you could simply write:

    with my_except_handling(something) as x:
        many lines of code

rather than:

    try:
        with something as f:
            many lines of code
    except some_error:
        handle error

and be able to replace several occurrences of try/except/handle error with the same my_except_handling context manager?

-bruce frederiksen

From daniel at stutzbachenterprises.com  Thu Feb 26 19:55:09 2009
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Thu, 26 Feb 2009 12:55:09 -0600
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To: <49A6DF83.1070802@gmail.com>
References: <49A6DF83.1070802@gmail.com>
Message-ID:

Hi Bruce,

I bow to your superior regular expression knowledge. :) However, your version counts the number of "try"s that are followed by "with". Mine counts the number of "with"s that are preceded by "try" (which I think is the more relevant metric?). Any chance you could alter your script?

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC
From dangyogi at gmail.com  Fri Feb 27 02:28:29 2009
From: dangyogi at gmail.com (Bruce Frederiksen)
Date: Thu, 26 Feb 2009 20:28:29 -0500
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To:
References: <49A6DF83.1070802@gmail.com>
Message-ID: <49A741BD.1050303@gmail.com>

Daniel Stutzbach wrote:
> Hi Bruce,
>
> I bow to your superior regular expression knowledge. :) However,
> your version counts the number of "try"s that are followed by "with".
> Mine counts the number of "with"s that are preceded by "try" (which I
> think is the more relevant metric?). Any chance you could alter your script?

Sorry, my mistake. How about this?

#!/usr/bin/python
import re, sys

re_with = re.compile(r'''
    ^(?P<indent>\s*)                    # capture the indent
                                        # (might be try, might be with)
    (?P<try>
        try:
        (?:[ \t]*(?:\#.*)?[\r\n]+)?     # newlines (ignoring comments)
        (?P=indent)[ \t]+               # kick out the indent level
    )?
    with\s
    (?:[^#:]*(?:(?:\#.*)?[\r\n]+)?)*:   # with .*: (ignoring comments
                                        # and newlines before :)
''', re.MULTILINE | re.VERBOSE)

try_with = 0
total = 0
for fname in sys.argv[1:]:
    data = open(fname).read()
    for match in re_with.finditer(data):
        if match.group('try'): try_with += 1
        total += 1

print 'try-with:', try_with, 'out of:', total, '(', try_with*100.0/total, '%)'

On my code, I now get:

try-with: 1 out of: 47 ( 2.12765957447 %)

On my last response, I mentioned a suggestion to add __throw__ to context managers. But then I remembered that the __exit__ method is already given the exception information if an exception is raised. So you can already do what I was suggesting now.

I'm still curious as to how often you could share try/except cases by writing your own context managers.
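Bruce's point that `__exit__` already receives the exception can be sketched as a reusable handler; `my_except_handling` below follows the name used in the thread but is purely illustrative, not a real API:

```python
class my_except_handling(object):
    """Swallow a given exception type, running a handler instead."""
    def __init__(self, error_type, handler):
        self.error_type = error_type
        self.handler = handler

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # __exit__ receives the exception info, so it can play the
        # role of an except clause.
        if exc_type is not None and issubclass(exc_type, self.error_type):
            self.handler(exc_value)
            return True   # suppress the exception
        return False      # anything else propagates

handled = []
with my_except_handling(ValueError, handled.append):
    raise ValueError('boom')

print(len(handled))  # the ValueError was caught by __exit__
```

Returning a true value from `__exit__` is what suppresses the exception; returning false lets unrelated exceptions propagate, mirroring an `except some_error:` clause.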
-bruce frederiksen From daniel at stutzbachenterprises.com Fri Feb 27 03:00:59 2009 From: daniel at stutzbachenterprises.com (Daniel Stutzbach) Date: Thu, 26 Feb 2009 20:00:59 -0600 Subject: [Python-ideas] "try with" syntactic sugar In-Reply-To: <49A741BD.1050303@gmail.com> References: <49A6DF83.1070802@gmail.com> <49A741BD.1050303@gmail.com> Message-ID: On Thu, Feb 26, 2009 at 7:28 PM, Bruce Frederiksen wrote: > How about this? I tried it, but it matches quite a few comments and strings. Try putting a "print repr(match.group(0))" in the innermost loop to debug it. > On my last response, I mentioned a suggestion to add __throw__ to context > managers. But then I remembered that the __exit__ method is already given > the exception information if an exception is raised. So you can already do > what I was suggesting now. > > I'm still curious as to how often you could share try/except cases by > writing your own context managers. > Not particularly often. Much of the time, the exception handler has to clean up the mess when a file is unexpectedly unreadable or SQL explodes, and the clean-up bit is tied to the immediately surrounding code. YMMV. -- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises, LLC -------------- next part -------------- An HTML attachment was scrubbed... URL: From arnodel at googlemail.com Fri Feb 27 11:49:24 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Fri, 27 Feb 2009 10:49:24 +0000 Subject: [Python-ideas] "try with" syntactic sugar In-Reply-To: References: Message-ID: <9bfc700a0902270249w70ae5740yeec1259728867d03@mail.gmail.com> 2009/2/26 Daniel Stutzbach : > Around two-thirds of the time, whenever I use the wonderful new "with" > construct, it's enclosed in a "try-except" block, like this: > > try: > ?? with something as f: > ??????? many lines of code > except some_error: > ??? handle error > > The "with" statement is great, but it results in the bulk of the code being > indented twice.? 
I'd like to propose a little syntactic sugar, the "try
> with":
>
>     try with something as f:
>         many lines of code
>     except some_error:
>         handle error
>
> It saves one line of vertical space, and gets rid of an indentation level
> for the bulk of the code that rests within the "with" statement. Thoughts?

Every compound statement could be made into an implicit try, i.e.

    <compound statement>:
        <body>
    except:
        <handler>

would mean:

    try:
        <compound statement>:
            <body>
    except:
        <handler>

So you could write:

    with something as f:
        many lines of code
    except some_error:
        handle error

--
Arnaud

From gagsl-py2 at yahoo.com.ar  Fri Feb 27 12:14:28 2009
From: gagsl-py2 at yahoo.com.ar (Gabriel Genellina)
Date: Fri, 27 Feb 2009 09:14:28 -0200
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To:
References: <49A6DF83.1070802@gmail.com> <49A741BD.1050303@gmail.com>
Message-ID:

On Fri, 27 Feb 2009 00:00:59 -0200, Daniel Stutzbach wrote:

> On Thu, Feb 26, 2009 at 7:28 PM, Bruce Frederiksen
> wrote:
>
>> How about this?
>
> I tried it, but it matches quite a few comments and strings. Try
This script uses the tokenize module and should be immune to those false positives (but it's much slower) import sys, os from tokenize import generate_tokens from token import NAME def process(path): nfiles = nwith = ntrywith = 0 for base, dirs, files in os.walk(path): print base print '%d "try+with" out of %d "with" (%.1f%%) in %d files (partial)' % ( ntrywith, nwith, ntrywith*100.0/nwith if nwith else 0, nfiles) print for fn in files: if fn[-3:]!='.py': continue if 'CVS' in dirs: dirs.remove('CVS') if '.svn' in dirs: dirs.remove('.svn') fullfn = os.path.join(base, fn) #print fn nfiles += 1 with open(fullfn) as f: try: was_try = False for toknum, tokval, _, _, _ in generate_tokens(f.readline): if toknum==NAME: is_with = tokval == 'with' if is_with: nwith += 1 if was_try: ntrywith += 1 was_try = tokval == 'try' except Exception, e: print e print '%d "try+with" out of %d "with" (%.1f%%) in %d files' % ( ntrywith, nwith, ntrywith*100.0/nwith if nwith else 0, nfiles) process(sys.argv[1]) I got 2/25 on my code (1500 files, but many are rather old to use "with") -- Gabriel Genellina From lists at cheimes.de Fri Feb 27 12:18:58 2009 From: lists at cheimes.de (Christian Heimes) Date: Fri, 27 Feb 2009 12:18:58 +0100 Subject: [Python-ideas] "try with" syntactic sugar In-Reply-To: References: Message-ID: Daniel Stutzbach wrote > try with something as f: > many lines of code > except some_error: > handle error > > It saves one line of vertical space, and gets rid of an indentation level > for the bulk of the code that rests within the "with" statement. Thoughts? Your proposal sounds like a very good idea. +1 from me. I like to push your proposal even further and add the full set of exept, else, finally blocks to the with statement. Additionally I'd like to have multiple arguments, too. 
Example
-------

    log.info("Starting copy")
    with somelock, open(infile, 'r') as fin, open(outfile, 'w') as fout:
        fout.write(fin.read())
    except Exception:
        log.exception("An error has occurred")
    else:
        log.info("Copy successful")
    finally:
        log.info("All done")

Equivalent
----------

    log.info("Starting copy")
    try:
        with somelock:
            with open(infile, 'r') as fin:
                with open(outfile, 'w') as fout:
                    fout.write(fin.read())
    except Exception:
        log.exception("An error has occurred")
    else:
        log.info("Copy successful")
    finally:
        log.info("All done")

Christian

From curt at hagenlocher.org  Fri Feb 27 13:49:58 2009
From: curt at hagenlocher.org (Curt Hagenlocher)
Date: Fri, 27 Feb 2009 04:49:58 -0800
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To: <9bfc700a0902270249w70ae5740yeec1259728867d03@mail.gmail.com>
References: <9bfc700a0902270249w70ae5740yeec1259728867d03@mail.gmail.com>
Message-ID:

On Fri, Feb 27, 2009 at 2:49 AM, Arnaud Delobelle wrote:
> Every compound statement could be made into an implicit try, i.e.

That way lies madness. What distinguishes "with" from other compound statements is that it's already about resource management in the face of possible exceptions.

--
Curt Hagenlocher
curt at hagenlocher.org

From daniel at stutzbachenterprises.com  Fri Feb 27 16:06:43 2009
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Fri, 27 Feb 2009 09:06:43 -0600
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To:
References: <49A6DF83.1070802@gmail.com> <49A741BD.1050303@gmail.com>
Message-ID:

On Fri, Feb 27, 2009 at 5:14 AM, Gabriel Genellina wrote:
> I think a r.e. cannot handle this query well enough.
> This script uses the tokenize module and should be immune to those false
> positives (but it's much slower)

Thanks, Gabriel. Using the tokenize module is, indeed, much better. I think the performance problems were caused by the O(n**2) cost of reading through the directories and then removing them selectively.
I modified it to have an O(n) cost and it's quite snappy.

#!/usr/bin/python
from __future__ import with_statement
import sys, os
from tokenize import generate_tokens
from token import NAME

def process(paths):
    nfiles = nwith = ntrywith = 0
    for path in paths:
        for base, dirs, files in os.walk(path):
            if nfiles:
                print '%d "try+with" out of %d "with" (%.1f%%) in %d files (so far)' % (
                    ntrywith, nwith, ntrywith*100.0/nwith if nwith else 0, nfiles)
            print base
            newdirs = []
            for d in list(dirs):
                if d == 'CVS' or d == '_darcs' or d[0] == '.':
                    continue
                newdirs.append(d)
            dirs[:] = newdirs
            for fn in files:
                if fn[-3:] != '.py':
                    continue
                fullfn = os.path.join(base, fn)
                #print fn
                nfiles += 1
                with open(fullfn) as f:
                    try:
                        was_try = False
                        for toknum, tokval, _, _, _ in generate_tokens(f.readline):
                            if toknum == NAME:
                                is_with = tokval == 'with'
                                if is_with:
                                    nwith += 1
                                    if was_try:
                                        ntrywith += 1
                                was_try = tokval == 'try'
                    except Exception, e:
                        print e
    print '%d "try+with" out of %d "with" (%.1f%%) in %d files' % (
        ntrywith, nwith, ntrywith*100.0/nwith if nwith else 0, nfiles)

process(sys.argv[1:])

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From g.brandl at gmx.net Fri Feb 27 17:18:41 2009
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 27 Feb 2009 17:18:41 +0100
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To: 
References: 
Message-ID: 

Christian Heimes schrieb:
> Daniel Stutzbach wrote:
>> try with something as f:
>>     many lines of code
>> except some_error:
>>     handle error
>>
>> It saves one line of vertical space, and gets rid of an indentation level
>> for the bulk of the code that rests within the "with" statement. Thoughts?
>
> Your proposal sounds like a very good idea. +1 from me.
>
> I like to push your proposal even further and add the full set of except,
> else, finally blocks to the with statement.

I'd assumed this is already implicit in the proposal.
> Additionally I'd like to have multiple arguments, too.

+1 to that (it should have been there in the beginning).

> Example
> -------
> log.info("Starting copy")
> with somelock, open(infile, 'r') as fin, open(outfile, 'w') as fout:
>     fout.write(fin.read())
> except Exception:
>     log.exception("An error has occurred")
> else:
>     log.info("Copy successful")
> finally:
>     log.info("All done")

I still like it better with the "try" before the "with".

Georg

--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.

From Scott.Daniels at Acm.Org Fri Feb 27 20:04:10 2009
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Fri, 27 Feb 2009 11:04:10 -0800
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To: 
References: <49A6DF83.1070802@gmail.com> <49A741BD.1050303@gmail.com>
Message-ID: 

Daniel Stutzbach wrote:
> It seems to me the question is, among the single-statement try blocks,
> what kinds of (non-function call, non-assignment) statements are the
> inner statement, and what frequency does this occur.

On Python 2.5, 2.6, and 3.0, I get quite similar numbers:

     2.5  2.6  3.0
      10   15   14  with
      15   11   13  yield
      25   26   22  class
      32   23   *8  print   # print is a function in 3.0, called 8 times
      37   33   24  while
      44   35  *26  exec    # exec is a function in 3.0, called 26 times
       9   42   41  pass
      49   44   64  raise
      67   45   29  del
      95   61   40  for
     142   68   42  if
     182  105   63  from
     230  163  107  import
     588  307  150  return

These numbers are pulled from the output of the attached imperfect program,
and hand-edited together. The numbers seem to me to say that try-for and
try-while substantially exceed try-with statements.

--Scott David Daniels
Scott.Daniels at Acm.Org
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: hunt_try.py
URL: 

From guido at python.org Fri Feb 27 20:20:14 2009
From: guido at python.org (Guido van Rossum)
Date: Fri, 27 Feb 2009 11:20:14 -0800
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To: 
References: <9bfc700a0902270249w70ae5740yeec1259728867d03@mail.gmail.com>
Message-ID: 

On Fri, Feb 27, 2009 at 4:49 AM, Curt Hagenlocher wrote:
> On Fri, Feb 27, 2009 at 2:49 AM, Arnaud Delobelle wrote:
>>
>> Every compound statement could be made into an implicit try, i.e.
>
> That way lies madness. What distinguishes "with" from other compound
> statements is that it's already about resource management in the face
> of possible exceptions.

Still, a firm -1 from me. Once we have "try with" I'm sure people are
going to clamor for "try if", "try while", "try for", even (oh horror :-)
"try try". I don't think we should complicate the syntax just to save
one level of indentation occasionally.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jh at improva.dk Sat Feb 28 21:54:32 2009
From: jh at improva.dk (Jacob Holm)
Date: Sat, 28 Feb 2009 21:54:32 +0100
Subject: [Python-ideas] Revised**5 PEP on yield-from
In-Reply-To: <499FC423.6080500@improva.dk>
References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk>
Message-ID: <49A9A488.4070308@improva.dk>
I have now worked out the details, and it is indeed possible to get O(1) for simple cases and amortized O(logN) in general, all with fairly low constants. I have implemented the tree structure as a python module and added a trampoline-based pure-python implementation of "yield-from" to try it out. It seems that this version beats a normal "for v in it: yield v" when the delegation chains get around 90 generators deep. This may sound like much, but keep in mind that this is all in python. I expect a C implementation of this would break even much sooner than that (hoping for <5). If you (or anyone else) is interested I can post the code here. (Or you can suggest a good place to upload it). Regards Jacob