From greg at krypto.org Sun Apr 1 05:50:17 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Sat, 31 Mar 2012 20:50:17 -0700
Subject: [Python-ideas] Thread stopping
In-Reply-To: References: Message-ID:

On Sat, Mar 31, 2012 at 3:33 AM, Michael Foord wrote:
> An "uninterruptable context manager" would be nice - but would probably
> need extra vm support and isn't essential.

I'm not so sure that would need much vm support for an uninterruptable context manager, at least in CPython 3.2 and 3.3. Isn't something like this about all it takes, assuming you were only talking about being uninterruptable within the context of native Python code rather than whatever other extension modules or interpreter embedding code may be running on their own in C/C++/Java/C# thread land:

class UninterruptableContext:
    def __enter__(self):
        self._orig_switchinterval = sys.getswitchinterval()
        sys.setswitchinterval(1000000000)  # 31 years with no Python thread switching

    def __exit__(self, exc_type, exc_val, exc_tb):
        sys.setswitchinterval(self._orig_switchinterval)

The danger with that, of course, is that you could be saving an obsolete switch interval value, but I suspect it is rare to change that other than near process start time. You could document the caveat and suggest that the switch interval be set to its desired setting before using any of these context managers, or monkeypatch setswitchinterval out with a dummy when this library is imported so that it becomes the sole user and owner of that api. All of which are pretty evil hacks to expunge from one's memory and pretend you didn't read.

The _other_ big caveat to the above is that if you do any blocking operations that release the GIL within such a context manager, I think you just voluntarily give up your right to not be interrupted. Plus it depends on setswitchinterval(), which is an API that we could easily discard in the future with different threading and GIL implementations.

Brainstorming... it's what python-ideas is for.
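[Editor's note: spelled out as a self-contained, runnable sketch — standard `__enter__`/`__exit__` signatures filled in, plus a quick check that the interval is restored. The same caveats about blocking calls and stale saved intervals apply.]

```python
import sys

class UninterruptableContext:
    """Stretch CPython's switch interval so forced thread switches stop.

    Best-effort only: blocking calls that release the GIL can still switch
    threads, and a switch interval changed concurrently by other code would
    be clobbered on exit.
    """
    def __enter__(self):
        self._orig_switchinterval = sys.getswitchinterval()
        sys.setswitchinterval(1000000000)  # ~31 years between forced switches
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        sys.setswitchinterval(self._orig_switchinterval)
        return False  # never swallow exceptions

# Quick check: the original interval survives the round trip.
before = sys.getswitchinterval()
with UninterruptableContext():
    pass
assert sys.getswitchinterval() == before
```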
I have zero use cases for the uninterruptable context manager within Python. For tiny sections of C code, sure. Within a high level language... not so much. Please use finer grained locks. An uninterruptible context manager is essentially a context manager around the GIL.

-gps
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From xorninja at gmail.com Sun Apr 1 16:34:42 2012
From: xorninja at gmail.com (Itzik Kotler)
Date: Sun, 1 Apr 2012 17:34:42 +0300
Subject: [Python-ideas] Pythonect 0.1.0 Release
Message-ID:

Hi All,

I'm pleased to announce the first beta release of the Pythonect interpreter.

Pythonect is a new, experimental, general-purpose dataflow programming language based on Python.

It aims to combine the intuitive feel of shell scripting (and all of its perks like implicit parallelism) with the flexibility and agility of Python.

The Pythonect interpreter (and reference implementation) is written in Python, and is available under the BSD license.

Here's a quick tour of Pythonect:

The canonical "Hello, world" example program in Pythonect:

>>> "Hello World" -> print
: Hello World
Hello World
>>>

'->' and '|' are both Pythonect operators.

The pipe operator (i.e. '|') passes one item at a time, while the other operator passes all items at once.

Python statements and other None-returning functions act as a pass-through:

>>> "Hello World" -> print -> print
: Hello World
: Hello World
Hello World
>>>

>>> 1 -> import math -> math.log
0.0
>>>

Parallelization in Pythonect:

>>> "Hello World" -> [ print , print ]
: Hello World
: Hello World
['Hello World', 'Hello World']

>>> range(0,3) -> import math -> math.sqrt
[0.0, 1.0, 1.4142135623730951]
>>>

In the future, I am planning on adding support for multi-processing, and even distributed computing.
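[Editor's note: the '->' chaining above can be loosely imitated in plain Python. The helper below is purely an illustration of the dataflow idea — it is not Pythonect's actual implementation — folding a list of items through each stage and treating None results (e.g. from print) as a pass-through, as the examples show.]

```python
from functools import reduce

def pipeline(value, *stages):
    """Loose imitation of Pythonect's '->' operator.

    Each stage is mapped over the current items; a stage returning None
    passes the incoming item through unchanged.
    """
    items = value if isinstance(value, list) else [value]

    def step(items, stage):
        results = [stage(x) for x in items]
        # None results act as pass-through, like the print examples above.
        return [x if r is None else r for x, r in zip(items, results)]

    return reduce(step, stages, items)

import math
assert pipeline(list(range(0, 3)), math.sqrt) == [0.0, 1.0, 1.4142135623730951]
```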
The '_' identifier allows access to the current item:

>>> "Hello World" -> [ print , print ] -> _ + " and Python"
: Hello World
: Hello World
['Hello World and Python', 'Hello World and Python']
>>>

>>> [ 1 , 2 ] -> _**_
[1, 4]
>>>

True/False return values as filters:

>>> "Hello World" -> _ == "Hello World" -> print
: Hello World
>>>

>>> "Hello World" -> _ == "Hello World1" -> print
False
>>>

>>> range(1,10) -> _ % 2 == 0
[2, 4, 6, 8]
>>>

Last but not least, I have also added extra syntax for making remote procedure calls easy:

>>> 1 -> inc@xmlrpc://localhost:8000 -> print
: 2
2
>>>

Download Pythonect v0.1.0 from:
http://github.com/downloads/ikotler/pythonect/Pythonect-0.1.0.tar.gz

More information can be found at: http://www.pythonect.org

I will appreciate any input / feedback that you can give me.

Also, for those interested in working on the project, I'm actively interested in welcoming and supporting both new developers and new users. Feel free to contact me.

Regards,
Itzik Kotler | http://www.ikotler.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jkbbwr at gmail.com Sun Apr 1 17:58:22 2012
From: jkbbwr at gmail.com (Jakob Bowyer)
Date: Sun, 1 Apr 2012 16:58:22 +0100
Subject: [Python-ideas] Pythonect 0.1.0 Release
In-Reply-To: References: Message-ID:

You might want to PEP8 your code, move imports to the top, and lose some of the un-needed lines.

On Sun, Apr 1, 2012 at 3:34 PM, Itzik Kotler wrote:
> Hi All,
>
> I'm pleased to announce the first beta release of Pythonect interpreter.
>
> Pythonect is a new, experimental, general-purpose dataflow programming
> language based on Python.
>
> It aims to combine the intuitive feel of shell scripting (and all of its
> perks like implicit parallelism) with the flexibility and agility of Python.
>
> Pythonect interpreter (and reference implementation) is written in Python,
> and is available under the BSD license.
> > Here's a quick tour of Pythonect: > > The canonical "Hello, world" example program in Pythonect: > >>>> "Hello World" -> print > : Hello World > Hello World >>>> > > '->' and '|' are both Pythonect operators. > > The pipe operator (i.e. '|') passes one item at a item, while the other > operator passes all items at once. > > > Python statements and other None-returning function are acting as a > pass-through: > >>>> "Hello World" -> print -> print > : Hello World > : Hello World > Hello World >>>> > >>>> 1 -> import math -> math.log > 0.0 >>>> > > > Parallelization in Pythonect: > >>>> "Hello World" -> [ print , print ] > : Hello World > : Hello World > ['Hello World', 'Hello World'] > >>>> range(0,3) -> import math -> math.sqrt > [0.0, 1.0, 1.4142135623730951] >>>> > > In the future, I am planning on adding support for multi-processing, and > even distributed computing. > > > The '_' identifier allow access to current item: > >>>> "Hello World" -> [ print , print ] -> _ + " and Python" > : Hello World > : Hello World > ['Hello World and Python', 'Hello World and Python'] >>>> > >>>> [ 1 , 2 ] -> _**_ > [1, 4] >>>> > > > True/False return values as filters: > >>>> "Hello World" -> _ == "Hello World" -> print > : Hello World >>>> > >>>> "Hello World" -> _ == "Hello World1" -> print > False >>>> > >>>> range(1,10) -> _ % 2 == 0 > [2, 4, 6, 8] >>>> > > > Last but not least, I have also added extra syntax for making remote > procedure call easy: > >>> 1 -> inc at xmlrpc://localhost:8000 -> print > : 2 > 2 >>>> > > Download Pythonect v0.1.0 from: > http://github.com/downloads/ikotler/pythonect/Pythonect-0.1.0.tar.gz > > More information can be found at: http://www.pythonect.org > > > I will appreciate any input / feedback that you can give me. > > Also, for those interested in working on the project, I'm actively > interested in welcoming and supporting both new developers and new users. > Feel free to contact me. 
> > > Regards, > Itzik Kotler | http://www.ikotler.org > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From guido at python.org Sun Apr 1 18:05:22 2012 From: guido at python.org (Guido van Rossum) Date: Sun, 1 Apr 2012 09:05:22 -0700 Subject: [Python-ideas] Pythonect 0.1.0 Release In-Reply-To: References: Message-ID: April fool, right? On Sunday, April 1, 2012, Itzik Kotler wrote: > Hi All, > > I'm pleased to announce the first beta release of Pythonect interpreter. > > Pythonect is a new, experimental, general-purpose dataflow programming > language based on Python. > > It aims to combine the intuitive feel of shell scripting (and all of its > perks like implicit parallelism) with the flexibility and agility of > Python. > > Pythonect interpreter (and reference implementation) is written in Python, > and is available under the BSD license. > > Here's a quick tour of Pythonect: > > The canonical "Hello, world" example program in Pythonect: > > >>> "Hello World" -> print > : Hello World > Hello World > >>> > > '->' and '|' are both Pythonect operators. > > The pipe operator (i.e. '|') passes one item at a item, while the other > operator passes all items at once. > > > Python statements and other None-returning function are acting as a > pass-through: > > >>> "Hello World" -> print -> print > : Hello World > : Hello World > Hello World > >>> > > >>> 1 -> import math -> math.log > 0.0 > >>> > > > Parallelization in Pythonect: > > >>> "Hello World" -> [ print , print ] > : Hello World > : Hello World > ['Hello World', 'Hello World'] > > >>> range(0,3) -> import math -> math.sqrt > [0.0, 1.0, 1.4142135623730951] > >>> > > In the future, I am planning on adding support for multi-processing, and > even distributed computing. 
> > > The '_' identifier allow access to current item: > > >>> "Hello World" -> [ print , print ] -> _ + " and Python" > : Hello World > : Hello World > ['Hello World and Python', 'Hello World and Python'] > >>> > > >>> [ 1 , 2 ] -> _**_ > [1, 4] > >>> > > > True/False return values as filters: > > >>> "Hello World" -> _ == "Hello World" -> print > : Hello World > >>> > > >>> "Hello World" -> _ == "Hello World1" -> print > False > >>> > > >>> range(1,10) -> _ % 2 == 0 > [2, 4, 6, 8] > >>> > > > Last but not least, I have also added extra syntax for making remote > procedure call easy: > > >> 1 -> inc at xmlrpc://localhost:8000 -> print > : 2 > 2 > >>> > > Download Pythonect v0.1.0 from: > http://github.com/downloads/ikotler/pythonect/Pythonect-0.1.0.tar.gz > > More information can be found at: http://www.pythonect.org > > > I will appreciate any input / feedback that you can give me. > > Also, for those interested in working on the project, I'm actively > interested in welcoming and supporting both new developers and new users. > Feel free to contact me. > > > Regards, > Itzik Kotler | http://www.ikotler.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From xorninja at gmail.com Sun Apr 1 20:22:58 2012 From: xorninja at gmail.com (Itzik Kotler) Date: Sun, 1 Apr 2012 21:22:58 +0300 Subject: [Python-ideas] Pythonect 0.1.0 Release In-Reply-To: References: Message-ID: It might be April fools, but its not a fool's concept :-) Regards, Itzik Kotler | http://www.ikotler.org On Sun, Apr 1, 2012 at 7:05 PM, Guido van Rossum wrote: > April fool, right? > > > On Sunday, April 1, 2012, Itzik Kotler wrote: > >> Hi All, >> >> I'm pleased to announce the first beta release of Pythonect interpreter. >> >> Pythonect is a new, experimental, general-purpose dataflow programming >> language based on Python. 
>> >> It aims to combine the intuitive feel of shell scripting (and all of its >> perks like implicit parallelism) with the flexibility and agility of >> Python. >> >> Pythonect interpreter (and reference implementation) is written in >> Python, and is available under the BSD license. >> >> Here's a quick tour of Pythonect: >> >> The canonical "Hello, world" example program in Pythonect: >> >> >>> "Hello World" -> print >> : Hello World >> Hello World >> >>> >> >> '->' and '|' are both Pythonect operators. >> >> The pipe operator (i.e. '|') passes one item at a item, while the other >> operator passes all items at once. >> >> >> Python statements and other None-returning function are acting as a >> pass-through: >> >> >>> "Hello World" -> print -> print >> : Hello World >> : Hello World >> Hello World >> >>> >> >> >>> 1 -> import math -> math.log >> 0.0 >> >>> >> >> >> Parallelization in Pythonect: >> >> >>> "Hello World" -> [ print , print ] >> : Hello World >> : Hello World >> ['Hello World', 'Hello World'] >> >> >>> range(0,3) -> import math -> math.sqrt >> [0.0, 1.0, 1.4142135623730951] >> >>> >> >> In the future, I am planning on adding support for multi-processing, and >> even distributed computing. 
>> >> >> The '_' identifier allow access to current item: >> >> >>> "Hello World" -> [ print , print ] -> _ + " and Python" >> : Hello World >> : Hello World >> ['Hello World and Python', 'Hello World and Python'] >> >>> >> >> >>> [ 1 , 2 ] -> _**_ >> [1, 4] >> >>> >> >> >> True/False return values as filters: >> >> >>> "Hello World" -> _ == "Hello World" -> print >> : Hello World >> >>> >> >> >>> "Hello World" -> _ == "Hello World1" -> print >> False >> >>> >> >> >>> range(1,10) -> _ % 2 == 0 >> [2, 4, 6, 8] >> >>> >> >> >> Last but not least, I have also added extra syntax for making remote >> procedure call easy: >> >> >> 1 -> inc at xmlrpc://localhost:8000 -> print >> : 2 >> 2 >> >>> >> >> Download Pythonect v0.1.0 from: >> http://github.com/downloads/ikotler/pythonect/Pythonect-0.1.0.tar.gz >> >> More information can be found at: http://www.pythonect.org >> >> >> I will appreciate any input / feedback that you can give me. >> >> Also, for those interested in working on the project, I'm actively >> interested in welcoming and supporting both new developers and new users. >> Feel free to contact me. >> >> >> Regards, >> Itzik Kotler | http://www.ikotler.org >> > > > -- > --Guido van Rossum (python.org/~guido ) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ram.rachum at gmail.com Sun Apr 1 22:25:41 2012 From: ram.rachum at gmail.com (Ram Rachum) Date: Sun, 1 Apr 2012 13:25:41 -0700 (PDT) Subject: [Python-ideas] with *context_managers: Message-ID: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21> I'd like to be able to do this: with *context_managers: pass # Some suite. This is useful when you have an unknown number of context managers that you want to use. I currently use `contextlib.nested`, but I'd like the *star syntax much better. What do you think? Ram. -------------- next part -------------- An HTML attachment was scrubbed... 
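[Editor's note: absent new syntax, a helper object can enter an arbitrary number of context managers and unwind exactly the ones already entered if an `__enter__` fails. A sketch of the idea only — contextlib2's ContextStack, later standardized as contextlib.ExitStack, does this properly, including exception-suppression semantics that this sketch ignores.]

```python
import sys

class MultiContext:
    """Enter several context managers as one; exit them in reverse order.

    If one of the __enter__ calls raises, only the managers that were
    actually entered get their __exit__ called (unlike contextlib.nested).
    """
    def __init__(self, *managers):
        self._managers = managers
        self._entered = []

    def __enter__(self):
        try:
            for cm in self._managers:
                self._entered.append((cm, cm.__enter__()))
        except BaseException:
            # Unwind whatever was entered so far, then re-raise.
            self.__exit__(*sys.exc_info())
            raise
        return tuple(value for _, value in self._entered)

    def __exit__(self, exc_type, exc_val, exc_tb):
        while self._entered:
            cm, _ = self._entered.pop()
            cm.__exit__(exc_type, exc_val, exc_tb)
        return False
```

Usage: `with MultiContext(*context_managers) as results: ...` — results is the tuple of `__enter__` values.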
URL: From maxmoroz at gmail.com Mon Apr 2 10:30:26 2012 From: maxmoroz at gmail.com (Max Moroz) Date: Mon, 2 Apr 2012 01:30:26 -0700 Subject: [Python-ideas] comparison of operator.itemgetter objects Message-ID: Currently, __eq__() method is not defined in class operator.itemgetter, hence non-identical itemgetter objects compare as non-equal. I wanted to propose defining __eq__() method that would return the result of comparison for equality of the list of arguments submitted at initialization. This would make operator.itemgetter('name') compare as equal to operator.itemgetter('name'). The motivation for this is that sorted data structure (such as blist.sortedset) might want to verify if two arguments (say, lhs and rhs) of a binary operation (such as union) have the same sort key (a callable object passed to the constructor of the sorted data structure). Such a verification is useful because the desirable behavior of such binary operations is to use the common sort key if the lhs and rhs have the same sort key; and to raise an exception (or at least use a default value of the sort key) otherwise. I think that comparing sort keys for equality works well in many useful cases: (a) Named function. These compare as equal only if they are identical. If lhs and rhs were initialized with distinct named functions, I would argue that the programmer did not intend them to be compatible for the purpose of binary operations, even if they happen to be identical in behavior (e.g., if both functions return back the argument passed to them). In a well-designed program, there is no need to duplicate the named function definition if the two are expected to always have the same behavior. Therefore, the two distinct functions are intended to be different in behavior at least in some situations, and therefore the sorted data structure objects that use them as keys should be considered incompatible. (b) User-defined callable class. 
The author of such class should define __eq__() in a way that would compare as equal callable objects that behave identically, assuming it's not prohibitively expensive. Unfortunately, in two cases comparing keys for equality does not work well. (c) itemgetter. Suppose a programmer passed `itemgetter('name')` as the sort key argument to the sorted data structure's constructor. The resulting data structures would seem incompatible for the purposes of binary operations. This is likely to be confusing and undesirable. (d) lambda functions. Similarly, suppose a programmer passed `lambda x : -x` as the sort key argument to the sorted data structure's constructor. Since two lambda functions are not identical, they would compare as unequal. It seems to be very easy to address the undesirable behavior described in (c): add method __eq__() to operator.itemgetter, which would compare the list of arguments received at initialization. This would only break code that relies on an undocumented fact that distinct itemgetter instances compare as non-equal. The alternative is for each sorted data structure to handle this comparison on its own. This is repetitive and error-prone. Furthermore, it is expensive for an outsider to find out what arguments were given to an itemgetter at initialization. It is far harder to address the undesirable behavior described in (d). If it can be addressed at all, it would have to done in the sorted data structure implementation, since I don't think anyone would want lambda function comparison behavior to change. So for the purposes of this discussion, I ignore case (d). Is this a reasonable idea? Is it useful enough to be considered? Are there any downsides I didn't think of? Are there any other callables created by Python's builtin or standard library functions where __eq__ might be useful to define? Thanks, Max -------------- next part -------------- An HTML attachment was scrubbed... 
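[Editor's note: the proposal for case (c) can be prototyped today with a thin wrapper that remembers its constructor arguments — a hypothetical helper for illustration, not part of the operator module:]

```python
import operator

class EqItemgetter:
    """itemgetter lookalike whose equality compares the keys it was built with."""

    def __init__(self, *items):
        self._items = items
        self._getter = operator.itemgetter(*items)

    def __call__(self, obj):
        return self._getter(obj)

    def __eq__(self, other):
        return isinstance(other, EqItemgetter) and self._items == other._items

    def __hash__(self):
        return hash(self._items)

# Two instances built from the same key compare equal, unlike plain itemgetter.
assert EqItemgetter('name') == EqItemgetter('name')
assert operator.itemgetter('name') != operator.itemgetter('name')
```

A sorted data structure could then compare two such keys cheaply before a union, instead of inspecting itemgetter internals.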
URL: From steve at pearwood.info Mon Apr 2 11:38:05 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 02 Apr 2012 19:38:05 +1000 Subject: [Python-ideas] comparison of operator.itemgetter objects In-Reply-To: References: Message-ID: <4F79737D.6030508@pearwood.info> Max Moroz wrote: > Currently, __eq__() method is not defined in class operator.itemgetter, > hence non-identical itemgetter objects compare as non-equal. > > I wanted to propose defining __eq__() method that would return the result > of comparison for equality of the list of arguments submitted at > initialization. This would make operator.itemgetter('name') compare as > equal to operator.itemgetter('name'). In general, I think that having equality tests fall back on identity test is so rarely what you actually want that sometimes I wonder why we bother. In this case I was going to say just write your own subclass, but: py> from operator import itemgetter py> class MyItemgetter(itemgetter): ... pass ... Traceback (most recent call last): File "", line 1, in TypeError: type 'operator.itemgetter' is not an acceptable base type -- Steven From maxmoroz at gmail.com Mon Apr 2 12:39:47 2012 From: maxmoroz at gmail.com (Max Moroz) Date: Mon, 2 Apr 2012 03:39:47 -0700 Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 3 In-Reply-To: References: Message-ID: Steven D'Aprano wrote: > In this case I was going to say just write your own subclass, but: > > py> from operator import itemgetter > py> class MyItemgetter(itemgetter): > ... ? ? pass > ... > Traceback (most recent call last): > ? File "", line 1, in > TypeError: type 'operator.itemgetter' is not an acceptable base type I suspect it's the same reason that bool or generator can't be subclassed: there is no obvious use case for subclassing it, and an attempt to do so is more likely to create mistakes than produce anything useful. 
I actually agree that itemgetter is a very specific callable class that is unlikely to be extensible in any meaningful way. Subclassing to add an __eq__() method seems to be adding what really belongs in the base class, rather than truly extending the base class. But that's just my opinion.

Even if it could be done, it's not cheap. I like this recipe on SO (after a minor fix): http://stackoverflow.com/a/9970405/336527. An alternative would be to create a dummy class that defines only a __getitem__ method, and use an instance of that class to collect all the values. Either approach involves creating a new object, calling the itemgetter, collecting the values into a set-like data structure, and then comparing them.

From g.rodola at gmail.com Mon Apr 2 13:40:43 2012
From: g.rodola at gmail.com (Giampaolo Rodolà)
Date: Mon, 2 Apr 2012 13:40:43 +0200
Subject: [Python-ideas] with *context_managers:
In-Reply-To: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21>
References: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21>
Message-ID:

On 1 April 2012 22:25, Ram Rachum wrote:
> I'd like to be able to do this:
>
> with *context_managers:
>     pass # Some suite.
>
> This is useful when you have an unknown number of context managers that you
> want to use. I currently use `contextlib.nested`, but I'd like the *star
> syntax much better.
>
> What do you think?
>
> Ram.

I believe writing a specialized context manager object which is able to hold multiple context managers altogether is better than introducing a new syntax for such a use case, which should be pretty rare/uncommon. Also, it's not clear what to expect from "with *context_managers as ctx: ...".
Regards, --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ http://code.google.com/p/pysendfile/ From p.f.moore at gmail.com Mon Apr 2 13:44:39 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 2 Apr 2012 12:44:39 +0100 Subject: [Python-ideas] comparison of operator.itemgetter objects In-Reply-To: <4F79737D.6030508@pearwood.info> References: <4F79737D.6030508@pearwood.info> Message-ID: On 2 April 2012 10:38, Steven D'Aprano wrote: > TypeError: type 'operator.itemgetter' is not an acceptable base type Quite apart from the question of whether you might want to subclass operator.itemgetter, that's a really rubbish error message. Why is it not acceptable? Searching the source, it appears that types can say they can't be subclassed by setting Py_TPFLAGS_BASETYPE, so maybe a better error would be "the designer of type '%s' has disallowed subclassing". Still doesn't say why they did, but at least it gives a hint as to what's going on... Paul. From fuzzyman at gmail.com Mon Apr 2 13:53:17 2012 From: fuzzyman at gmail.com (Michael Foord) Date: Mon, 2 Apr 2012 12:53:17 +0100 Subject: [Python-ideas] Thread stopping In-Reply-To: References: Message-ID: On 30 March 2012 05:53, Eli Bendersky wrote: > On Thu, Mar 29, 2012 at 21:48, Andrew Svetlov wrote: > >> I propose to add Thread.interrupt() function. >> > > > Could you specify some use cases where you believe this would be better > than explicitly asking the thread to stop? > What do you mean by "asking the thread to stop?". What is proposed is precisely that. The usual suggestion is a flag, and have the thread check if it has been "asked to stop". This is only suitable for fine grained tasks (e.g. computationally bound loops) where there is a suitable place to check. Any coarse grained task, or code with multiple loops for example, may not have any place to check - or may need checking code in *many* places. 
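[Editor's note: the fine-grained "flag" approach described above is easy to sketch with threading.Event — and the sketch also shows its weakness: the worker only stops when its loop reaches the check.]

```python
import threading

def worker(stop: threading.Event, results: list) -> None:
    """Cooperative cancellation: each loop iteration is a checkpoint."""
    count = 0
    while not stop.is_set():   # the thread must reach this check to stop
        count += 1             # ... one unit of work ...
    results.append(count)

stop = threading.Event()
results: list = []
t = threading.Thread(target=worker, args=(stop, results))
t.start()
stop.set()        # "ask the thread to stop"
t.join(timeout=5)
assert not t.is_alive() and results
```

Coarse-grained user code with no such loop has nowhere to put the `is_set()` check — which is exactly the case thread interruption is meant to cover.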
One concrete example - at Resolver Systems we implemented a spreadsheet application where multiple documents could be calculating simultaneously in separate threads. (This was in IronPython with no GIL and true free threading.) As we were executing *user code* there was no way for the code to check if it had been requested to stop. (Unless we transformed the code and annotated it with checks everywhere.) With .NET threads we could simply request the thread to exit (if the user wanted to halt a calculation - for example because they had updated the code / spreadsheet) and it worked very well. Thread interruption is a useful feature. Michael > > Eli > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From fuzzyman at gmail.com Mon Apr 2 13:56:05 2012 From: fuzzyman at gmail.com (Michael Foord) Date: Mon, 2 Apr 2012 12:56:05 +0100 Subject: [Python-ideas] Thread stopping In-Reply-To: References: Message-ID: On 1 April 2012 04:50, Gregory P. Smith wrote: > > On Sat, Mar 31, 2012 at 3:33 AM, Michael Foord wrote: > >> An "uninterruptable context manager" would be nice - but would probably >> need extra vm support and isn't essential. 
> > > I'm not so sure that would need much vm support for an uninterruptable > context manager, at least in CPython 3.2 and 3.3: > > Isn't something like this about it; assuming you were only talking about > uninteruptable within the context of native Python code rather than > whatever other extension modules or interpreter embedding code may be > running on their own in C/C++/Java/C# thread land: > > class UninterruptableContext: > def __enter__(self, ...): > self._orig_switchinterval = sys.getswitchinterval() > sys.setswitchinterval(1000000000) # 31 years with no Python thread > switching > > def __exit__(self, ...): > sys.setswitchinterval(self._orig_switchinterval) > > the danger with that of course is that you could be saving an obsolete > switch interval value. but I suspect it is rare to change that other than > near process start time. you could document the caveat and suggest that the > switch interval be set to its desired setting before using any of these > context managers. or monkeypatch setswitchinterval out with a dummy when > this library is imported so that it becomes the sole user and owner of that > api. all of which are pretty evil-hacks to expunge from ones memory and > pretend you didn't read. > > the _other_ big caveat to the above is that if you do any blocking > operations that release the GIL within such a context manager I think you > just voluntarily give up your right to not be interrupted. Plus it depends > on setswitchinterval() which is an API that we could easily discard in the > future with different threading and GIL implementations. > > brainstorming... its what python-ideas is for. > > I have zero use cases for the uninterruptable context manager within > Python. for tiny sections of C code, sure. Within a high level language... > not so much. Please use finer grained locks. An uninterruptible context > manager is essentially a context manager around the GIL. 
> > Hello Gregory, I think you misunderstand what we mean by uninterruptable. It has nothing to do with *thread switching*, the interruption we are talking about is the proposed new feature where threads can be terminated by raising a ThreadInterrupt exception inside them. An uninterruptable context manager (which I'm not convinced is needed or easy to implement) simply means that a ThreadInterrupt won't be raised whilst code inside the context manager is executing. It *does not* mean that execution can't switch to another thread via the normal means. Michael > -gps > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From fuzzyman at gmail.com Mon Apr 2 14:00:06 2012 From: fuzzyman at gmail.com (Michael Foord) Date: Mon, 2 Apr 2012 13:00:06 +0100 Subject: [Python-ideas] with *context_managers: In-Reply-To: References: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21> Message-ID: On 2 April 2012 12:40, Giampaolo Rodol? wrote: > Il 01 aprile 2012 22:25, Ram Rachum ha scritto: > > I'd like to be able to do this: > > > > with *context_managers: > > pass # Some suite. > > > > > > This is useful when you have an unknown number of context managers that > you > > want to use. I currently use `contextlib.nested`, but I'd like the *star > > syntax much better. > > > > What do you think? > > > > > > Ram. > > I believe writing a specialized context manager object which is able > to hold multiple context managers altogheter is better than > introducing a new syntax for such a use case which should be pretty > rare/uncommon. > There's now an example of a need for this in the standard library. 
mock.patch collects together an arbitrary number of context managers that need to be entered sequentially (together). As there is no replacement for contextlib.nested it has custom code calling __enter__ and __exit__ on all the context managers and keeping track of which ones have been successfully entered (because if there is an exception whilst entering one, only those that have *already* been entered should have __exit__ called). > Also, it's not clear what to expect from "with *context_managers as ctx: > ...". > > It is clear. It should be a tuple of results (what else *could* it be). Michael > Regards, > > --- Giampaolo > http://code.google.com/p/pyftpdlib/ > http://code.google.com/p/psutil/ > http://code.google.com/p/pysendfile/ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Mon Apr 2 14:05:54 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 2 Apr 2012 14:05:54 +0200 Subject: [Python-ideas] with *context_managers: References: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21> Message-ID: <20120402140554.75f61281@pitrou.net> On Mon, 2 Apr 2012 13:00:06 +0100 Michael Foord wrote: > > > Also, it's not clear what to expect from "with *context_managers as ctx: > > ...". > > > > > It is clear. It should be a tuple of results (what else *could* it be). A timedelta, obviously. Regards Antoine. From eric at trueblade.com Mon Apr 2 14:24:32 2012 From: eric at trueblade.com (Eric V. 
Smith) Date: Mon, 02 Apr 2012 08:24:32 -0400 Subject: [Python-ideas] with *context_managers: In-Reply-To: References: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21> Message-ID: <4F799A80.6090604@trueblade.com> On 4/2/2012 7:40 AM, Giampaolo Rodol? wrote: > Il 01 aprile 2012 22:25, Ram Rachum ha scritto: >> I'd like to be able to do this: >> >> with *context_managers: >> pass # Some suite. >> >> >> This is useful when you have an unknown number of context managers that you >> want to use. I currently use `contextlib.nested`, but I'd like the *star >> syntax much better. >> >> What do you think? >> >> >> Ram. > > I believe writing a specialized context manager object which is able > to hold multiple context managers altogheter is better than > introducing a new syntax for such a use case which should be pretty > rare/uncommon. > Also, it's not clear what to expect from "with *context_managers as ctx: ...". See http://bugs.python.org/issue13585 From guido at python.org Mon Apr 2 16:54:22 2012 From: guido at python.org (Guido van Rossum) Date: Mon, 2 Apr 2012 07:54:22 -0700 Subject: [Python-ideas] Thread stopping In-Reply-To: References: Message-ID: Perhaps off-topic, but the one thing that isn't easy to do is stopping a thread that's blocked (perhaps forever) in some blocking operation -- e.g. acquiring a lock that's been forgotten or a read on a malfunctioning socket (it happens!). Having to code those operations consistently with timeouts is a pain, so if there was a way to make those system calls return an error I'd really like that. I'm not super worried about skipping finally-clauses, we can figure out a hack for that. 
-- --Guido van Rossum (python.org/~guido) From andrew.svetlov at gmail.com Mon Apr 2 17:15:08 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Mon, 2 Apr 2012 18:15:08 +0300 Subject: [Python-ideas] Thread stopping In-Reply-To: References: Message-ID: On Mon, Apr 2, 2012 at 5:54 PM, Guido van Rossum wrote: > Perhaps off-topic, but the one thing that isn't easy to do is stopping > a thread that's blocked (perhaps forever) in some blocking operation > -- e.g. acquiring a lock that's been forgotten or a read on a > malfunctioning socket (it happens!). Having to code those operations > consistently with timeouts is a pain, so if there was a way to make > those system calls return an error I'd really like that. > > I'm not super worried about skipping finally-clauses, we can figure > out a hack for that. > Python already has support for processing EINTR in threading synchronization objects. It's done to switch the GIL to the main thread if a signal is received while the GIL is held by some background thread. That mechanism can easily be extended for the thread interruption case, I think. Windows is also not a problem. From sven at marnach.net Mon Apr 2 17:34:52 2012 From: sven at marnach.net (Sven Marnach) Date: Mon, 2 Apr 2012 16:34:52 +0100 Subject: [Python-ideas] with *context_managers: In-Reply-To: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21> References: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21> Message-ID: <20120402153451.GA2470@bagheera> Ram Rachum schrieb am Sun, 01. Apr 2012, um 13:25:41 -0700: > I'd like to be able to do this: > > with *context_managers: > pass # Some suite. > > This is useful when you have an unknown number of context managers that you > want to use. I currently use `contextlib.nested`, but I'd like the *star > syntax much better. 'contextlib.nested()' is broken and not available in Python 3.x (the language this list is about). The only replacement so far is with CM1() as cm1, CM2() as cm2: ...
which only works for a fixed number of context managers. And there is a class 'ContextStack' in Nick Coghlan's 'contextlib2' library [1], which might be included in Python 3.3. With this class, you could write your code as with ContextStack() as stack: for cm in context_managers: stack.enter_context(cm) This still leaves the question whether your proposed syntax would be preferable, also with regard to issue 2292 [2]. [1]: http://readthedocs.org/docs/contextlib2/en/latest/#contextlib2.ContextStack [2]: http://bugs.python.org/issue2292 Cheers, Sven From paul at colomiets.name Mon Apr 2 21:43:52 2012 From: paul at colomiets.name (Paul Colomiets) Date: Mon, 2 Apr 2012 22:43:52 +0300 Subject: [Python-ideas] Protecting finally clauses of interruptions Message-ID: Hi, I'd like to propose a way to protect `finally` clauses from interruptions (either by KeyboardInterrupt or by timeout, or any other way). I think frame may be extended to have `f_in_finally` attribute (or pick a better name). Internally it should probably be implemented as a counter of nested finally clauses, but interface should probably expose only boolean attribute. For `__exit__` method some flag in `co_flags` should be introduced, which says that for whole function `f_in_finally` should be true. Having this attribute you can then inspect stack and check whether it's safe to interrupt it or not. Coroutine library which interrupts by timeout, can then sleep a bit and try again (probably for finite number of retries). For signal handler there are also several options to wait when thread escapes finally clause: use another thread, use alert signal, use sys.settrace, or exit only inside main loop. To be clear: I do not propose to change default SIGINT behavior, only to implement a frame flag, and give library developers experiment with the rest. 
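A library-side check against the proposed flag might look like the sketch below. Note that `f_in_finally` does not exist in CPython; the `Frame` class here is a stand-in used purely to show the control flow a trampoline or signal handler would follow:

```python
class Frame:
    # Stand-in for a real frame object: 'f_in_finally' is the proposed
    # (hypothetical) attribute, 'f_back' mirrors the existing one.
    def __init__(self, in_finally, back=None):
        self.f_in_finally = in_finally
        self.f_back = back

def safe_to_interrupt(frame):
    # A trampoline would walk the stack outward and refuse to interrupt
    # while any frame is currently inside a finally clause.
    while frame is not None:
        if frame.f_in_finally:
            return False
        frame = frame.f_back
    return True

outer = Frame(in_finally=True)
inner = Frame(in_finally=False, back=outer)
assert safe_to_interrupt(inner) is False        # a caller is in finally: wait
assert safe_to_interrupt(Frame(False)) is True  # nothing to protect: interrupt
```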
-- Paul From yselivanov.ml at gmail.com Mon Apr 2 22:37:35 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 2 Apr 2012 16:37:35 -0400 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: References: Message-ID: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> On 2012-04-02, at 3:43 PM, Paul Colomiets wrote: > Hi, > > I'd like to propose a way to protect `finally` clauses from > interruptions (either by KeyboardInterrupt or by timeout, or any other > way). > > I think frame may be extended to have `f_in_finally` attribute (or > pick a better name). Internally it should probably be implemented as a > counter of nested finally clauses, but interface should probably > expose only boolean attribute. For `__exit__` method some flag in > `co_flags` should be introduced, which says that for whole function > `f_in_finally` should be true. Paul, First of all sorry for not replying to your previous email in the thread. I've been thinking about the mechanism that will be both useful for thread interruption + for the new emerging coroutine libraries. And I think that we need to draft a PEP. Your current approach with only 'f_in_finally' flag is a half measure, as you will have to somehow monitor frame execution. I think a better solution would be to: 1. Implement a mechanism to throw exceptions in running threads. It should be possible to wake up thread if it waits on a lock, or any other syscall. 2. Add 'f_in_finally' counter, as you proposed. 3. Either add a special base exception, that can be thrown in a currently executing frame to interrupt it, or add a special method to frame object 'f_interrupt()'. Once a frame is attempted to be interrupted, it checks its 'f_in_finally' counter. If it is 0, then throw exception, if not - wait till it sets back to 0 and throw exception immediately. This approach would give you enough flexibility to cover the following cases: 1. Thread interruption 2. 
Greenlet-based coroutines (throw exception in your event hub) 3. Generator-based coroutines Plus, proper 'finally' statements execution will be guaranteed by the interpreter. - Yury From paul at colomiets.name Mon Apr 2 22:49:21 2012 From: paul at colomiets.name (Paul Colomiets) Date: Mon, 2 Apr 2012 23:49:21 +0300 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> Message-ID: Hi Yury, On Mon, Apr 2, 2012 at 11:37 PM, Yury Selivanov wrote: > 1. Implement a mechanism to throw exceptions in running threads. It should > be possible to wake up thread if it waits on a lock, or any other syscall. > It's complex, because if a thread waits on a lock you can't determine whether it was interrupted after acquiring the lock or before. E.g. it's common to write: l.lock() try: ... finally: l.unlock() Which will break if you interrupted just after lock is acquired. > 2. Add 'f_in_finally' counter, as you proposed. > Ack > 3. Either add a special base exception, that can be thrown in a currently > executing frame to interrupt it, or add a special method to frame object > 'f_interrupt()'. Once a frame is attempted to be interrupted, it checks > its 'f_in_finally' counter. If it is 0, then throw exception, if not - > wait till it sets back to 0 and throw exception immediately. > Not sure how it's supposed to work. If it's a coroutine it may yield while in finally, and you want it to be interrupted only when it exits from the finally.
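The hazard described here can be reproduced with plain generators today: throwing into a generator that is suspended *inside* its finally clause cuts the remaining cleanup short.

```python
log = []

def coro():
    try:
        yield "work"
    finally:
        log.append("cleanup-1")
        yield "paused in finally"   # a coroutine yielding mid-cleanup
        log.append("cleanup-2")

g = coro()
next(g)                   # suspended at 'yield "work"'
g.throw(RuntimeError)     # finally starts, then suspends at its own yield
try:
    g.throw(ValueError)   # interrupt *inside* the finally clause...
except ValueError:
    pass
assert log == ["cleanup-1"]   # ...and "cleanup-2" never runs
```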
-- Paul From ncoghlan at gmail.com Mon Apr 2 22:52:01 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 3 Apr 2012 06:52:01 +1000 Subject: [Python-ideas] with *context_managers: In-Reply-To: <20120402153451.GA2470@bagheera> References: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21> <20120402153451.GA2470@bagheera> Message-ID: On Tue, Apr 3, 2012 at 1:34 AM, Sven Marnach wrote: > which only works for a fixed number of context managers. And there is > a class 'ContextStack' in Nick Coghlan's 'contextlib2' library [1], > which might be included in Python 3.3. With this class, you could > write your code as > > with ContextStack() as stack: > for cm in context_managers: > stack.enter_context(cm) > > This still leaves the question whether your proposed syntax would be > preferable, also with regard to issue 2292 [2]. Both "with *(iterable)" and "for cm in iterable: stack.enter(cm)" are flawed in exactly the same way that contextlib.nested() is flawed: they encourage creating the iterable of context managers first, which means that inner __init__ methods are not covered by outer __exit__ methods. This breaks as soon as you have resources (such as files) where the acquire/release resource management pairing is actually __init__/__exit__ with __enter__ just returning self rather than acquiring the resource. If the iterable of context managers is created first, then the outer resources *will be leaked* if any of the inner constructors fail.
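The leak is easy to demonstrate with a hypothetical resource that acquires in __init__. The sketch below uses contextlib.ExitStack, the name under which contextlib2's ContextStack eventually landed in Python 3.3:

```python
from contextlib import ExitStack

class Resource:
    # Hypothetical resource acquired in __init__ (like open()) and
    # released in __exit__; __enter__ just returns self.
    open_count = 0

    def __init__(self, fail=False):
        if fail:
            raise OSError("acquisition failed")
        Resource.open_count += 1

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        Resource.open_count -= 1
        return False

# Building the iterable of context managers first: the second
# constructor fails and the first resource is never released.
try:
    cms = [Resource(), Resource(fail=True)]
except OSError:
    pass
assert Resource.open_count == 1   # leaked

# Entering each manager as it is created keeps every acquisition
# covered by the stack's unwinding, so nothing leaks.
Resource.open_count = 0
try:
    with ExitStack() as stack:
        for make_cm in [Resource, lambda: Resource(fail=True)]:
            stack.enter_context(make_cm())
except OSError:
    pass
assert Resource.open_count == 0   # cleaned up
```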
The only way to write code that handles an arbitrary number of arbitrary context managers in a robust fashion is to ensure the initialisation steps are also covered by the outer context managers: with CallbackStack() as stack: for make_cm in cm_factories: stack.enter(make_cm()) (Note that I'm not particularly happy with the class and method names for contextlib2.ContextStack, and plan to redesign it a bit before adding it to the stdlib module: https://bitbucket.org/ncoghlan/contextlib2/issue/8/rename-contextstack-to-callbackstack-and) The only time you can get away with a contextlib.nested() style API where the iterable of context managers is created first is when you *know* that all of the context managers involved do their resource acquisition in __enter__ rather than __init__. In the general case, though, any such API is broken because it doesn't reliably clean up files and similar acquired-on-initialisation resources. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From yselivanov.ml at gmail.com Mon Apr 2 23:15:38 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 2 Apr 2012 17:15:38 -0400 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> Message-ID: <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> On 2012-04-02, at 4:49 PM, Paul Colomiets wrote: > It's complex, because if thread waits on a lock you can't determine if it's > interrupted after lock or before. E.g. it's common to write: > > l.lock() > try: > ... > finally: > l.unlock() > > Which will break if you interrupted just after lock is acquired. Yes, that's a good question. However, I fail to see how just adding 'f_in_finally' solves the problem. >> 3. Either add a special base exception, that can be thrown in a currently >> executing frame to interrupt it, or add a special method to frame object >> 'f_interrupt()'. 
Once a frame is attempted to be interrupted, it checks >> its 'f_in_finally' counter. If it is 0, then throw exception, if not - >> wait till it sets back to 0 and throw exception immediately. >> > > Not sure how it supposed to work. If it's coroutine it may yield > while in finally, and you want it be interrupted only when it exits from > finally. And what's the problem with that? It should be able to yield in its finally freely. @coroutine def read_data(connection): try: yield connection.recv() finally: yield connection.close() print("this shouldn't be printed if a timeout occurs") yield read_data().with_timeout(0.1) In the above example, if 'connection.recv()' takes longer than 0.1s to execute, the scheduler (trampoline) should interrupt the coroutine, the 'connection.close()' line will be executed, and once the connection is closed, it should stop the coroutine immediately. As of now, if you throw an exception while the generator is in its 'try' block, everything will work as I explained. The interpreter will execute the 'finally' block, and propagate the exception at the end of it. However, if you throw an exception while the generator is in its 'finally' block (!), then your coroutine will be aborted too early. With your 'f_in_finally' flag, the scheduler simply won't try to interrupt the coroutine, but then the 'print(...)' line will be executed (!!) (and it shouldn't really). So, we need to shift the control of when a frame is best to be interrupted to the interpreter, not the user code. - Yury From yselivanov.ml at gmail.com Mon Apr 2 23:26:40 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 2 Apr 2012 17:26:40 -0400 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> Message-ID: On 2012-04-02, at 4:49 PM, Paul Colomiets wrote: > l.lock() > try: > ... > finally: > l.unlock() > > Which will break if you interrupted just after lock is acquired.
I guess the best way to solve this puzzle, is to track all locks that the thread acquires and release them in case of forced interruption. - Yury From paul at colomiets.name Tue Apr 3 00:23:31 2012 From: paul at colomiets.name (Paul Colomiets) Date: Tue, 3 Apr 2012 01:23:31 +0300 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> Message-ID: Hi Yury, >On 2012-04-02, at 4:49 PM, Paul Colomiets wrote: >> l.lock() >> try: >> ... >> finally: >> l.unlock() >> >> Which will break if you interrupted just after lock is acquired. > >I guess the best way to solve this puzzle, is to track all locks that >the thread acquires and release them in case of forced interruption. Same with open files, and with all other kinds of contexts. I'd go the route of making __enter__ also uninterruptable (and make the timeout inside the lock itself). On Tue, Apr 3, 2012 at 12:15 AM, Yury Selivanov wrote: >>> 3. Either add a special base exception, that can be thrown in a currently >>> executing frame to interrupt it, or add a special method to frame object >>> 'f_interrupt()'. Once a frame is attempted to be interrupted, it checks >>> its 'f_in_finally' counter. If it is 0, then throw exception, if not - >>> wait till it sets back to 0 and throw exception immediately. >>> >> >> Not sure how it supposed to work. If it's coroutine it may yield >> while in finally, and you want it be interrupted only when it exits from >> finally. > > And what's the problem with that? It should be able to yield in its > finally freely. > > @coroutine > def read_data(connection): > try: > yield connection.recv() > finally: >
yield connection.close() > print("this shouldn't be printed if a timeout occurs") > > yield read_data().with_timeout(0.1) > > In the above example, if 'connection.recv()' takes longer than 0.1s to > execute, the scheduler (trampoline) should interrupt the coroutine, > the 'connection.close()' line will be executed, and once the connection is > closed, it should stop the coroutine immediately. > > As of now, if you throw an exception while the generator is in its 'try' > block, everything will work as I explained. The interpreter will > execute the 'finally' block, and propagate the exception at the end > of it. > > However, if you throw an exception while the generator is in its 'finally' > block (!), then your coroutine will be aborted too early. With your > 'f_in_finally' flag, the scheduler simply won't try to interrupt the > coroutine, but then the 'print(...)' line will be executed (!!) > (and it shouldn't really). So, we need to shift the control of when > a frame is best to be interrupted to the interpreter, not the user > code. You've probably not explained your proposal well. If I call frame.f_interrupt(), what should it do? Return whatever the generator yields? And how are you supposed to continue generator iteration in this case? Or are you going to iterate the result of `f_interrupt()`? What should it do if it's not the topmost frame? In all my use cases it doesn't matter if "print" is executed, just like it doesn't matter if the timeout occurred after 1000 ms or after 1001 or 1010 ms or even after 1500 ms, as it actually could. So sleeping a bit and trying again is OK. You need to make all __exit__ and finally clauses fast, but that's usually not a problem.
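The "sleep a bit and try again" strategy described here reduces to a short sketch (the Task class is a stand-in for whatever coroutine wrapper a library would use; none of these names are real APIs):

```python
import time

class Task:
    # Hypothetical coroutine wrapper; 'in_finally' stands in for the
    # proposed frame flag, 'throw' for the library's delivery mechanism.
    def __init__(self):
        self.in_finally = False
        self.raised = None

    def throw(self, exc):
        self.raised = exc

def interrupt_with_retries(task, exc, retries=5, delay=0.001):
    # Deliver exc unless the task is in a finally clause; back off and
    # retry a bounded number of times, then give up.
    for _ in range(retries):
        if not task.in_finally:
            task.throw(exc)
            return True
        time.sleep(delay)
    return False

t = Task()
assert interrupt_with_retries(t, TimeoutError()) is True
t2 = Task()
t2.in_finally = True
assert interrupt_with_retries(t2, TimeoutError()) is False  # still cleaning up
```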
-- Paul From paul at colomiets.name Tue Apr 3 00:24:24 2012 From: paul at colomiets.name (Paul Colomiets) Date: Tue, 3 Apr 2012 01:24:24 +0300 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> Message-ID: Hi Yury, On Tue, Apr 3, 2012 at 1:20 AM, Yury Selivanov wrote: > On 2012-04-02, at 6:00 PM, Paul Colomiets wrote: > >> Hi Yury, >> >>> On 2012-04-02, at 4:49 PM, Paul Colomiets wrote: >>>> l.lock() >>>> try: >>>> ? ... >>>> finally: >>>> ? l.unlock() >>>> >>>> Which will break if you interrupted just after lock is acquired. >>> >>> I guess the best way to solve this puzzle, is to track all locks that >>> the thread acquires and release them in case of forced interruption. >> >> Same with open files, and with all other kinds of contexts. I'd go >> he route of making __enter__ also uninterruptable (and make timeout >> inside a lock itself). > > > I still don't get how exactly do you propose to handle sudden thread > interruption in your own example: > > l.lock() > # (!) the thread may be interrupted at this point > try: > ? ... > finally: > ? l.unlock() > > You don't have a 'with' statement here. > By wrapping lock into a context manager. -- Paul From yselivanov.ml at gmail.com Tue Apr 3 00:28:11 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 2 Apr 2012 18:28:11 -0400 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> Message-ID: <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> On 2012-04-02, at 6:24 PM, Paul Colomiets wrote: >> I still don't get how exactly do you propose to handle sudden thread >> interruption in your own example: >> >> l.lock() >> # (!) the thread may be interrupted at this point >> try: >> ... 
>> finally: >> l.unlock() >> >> You don't have a 'with' statement here. >> > > By wrapping lock into a context manager. How's that going to work for tons of existing code? - Yury From paul at colomiets.name Tue Apr 3 00:33:02 2012 From: paul at colomiets.name (Paul Colomiets) Date: Tue, 3 Apr 2012 01:33:02 +0300 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> Message-ID: Hi Yury, On Tue, Apr 3, 2012 at 1:28 AM, Yury Selivanov wrote: > On 2012-04-02, at 6:24 PM, Paul Colomiets wrote: >>> I still don't get how exactly do you propose to handle sudden thread >>> interruption in your own example: >>> >>> l.lock() >>> # (!) the thread may be interrupted at this point >>> try: >>> ? ... >>> finally: >>> ? l.unlock() >>> >>> You don't have a 'with' statement here. >>> >> >> By wrapping lock into a context manager. > > How's that going to work for tons of existing code? > It isn't. But it doesn't break code any more than it already is. Your proposal doesn't solve any problems with existing code too. But anyway I don't propose any new ways to interrupt code I only propose a way to inform trampoline when it's unsafe to interrupt code. -- Paul From yselivanov.ml at gmail.com Tue Apr 3 01:04:22 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 2 Apr 2012 19:04:22 -0400 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> Message-ID: <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> On 2012-04-02, at 6:33 PM, Paul Colomiets wrote: >> How's that going to work for tons of existing code? >> > > It isn't. 
But it doesn't break code any more than it > already is. Your proposal doesn't solve any problems > with existing code too. > > But anyway I don't propose any new ways to interrupt > code I only propose a way to inform trampoline when it's > unsafe to interrupt code. Well, if we're thinking only about interrupting coroutines (not threads), then it's going to work, yes. My initial desire to use a special exception for the purpose, was because of: - it's easier to throw exception in the thread (the C-API function already exists, and we need to think about consequences of using it) - PyPy disables the JIT when working with frames (if I recall correctly). That's why I wanted 'f_in_finally' to be an implementation detail of CPython, hidden from the user code. Perhaps PyPy could implement the handling of our special exception in a more efficient way, without the side-effect of disabling the JIT. What do you think? - Yury From greg at krypto.org Tue Apr 3 01:10:22 2012 From: greg at krypto.org (Gregory P. Smith) Date: Mon, 2 Apr 2012 16:10:22 -0700 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> Message-ID: On Mon, Apr 2, 2012 at 3:28 PM, Yury Selivanov wrote: > On 2012-04-02, at 6:24 PM, Paul Colomiets wrote: > >> I still don't get how exactly do you propose to handle sudden thread > >> interruption in your own example: > >> > >> l.lock() > >> # (!) the thread may be interrupted at this point > >> try: > >> ... > >> finally: > >> l.unlock() > >> > >> You don't have a 'with' statement here. > >> > > > > By wrapping lock into a context manager. > > How's that going to work for tons of existing code? > A context manager doesn't solve this interruption "race condition" issue anyways. 
If the __enter__ method is interrupted it won't have returned a context and thus __exit__ will never be called. -gps > - > Yury > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov.ml at gmail.com Tue Apr 3 01:13:47 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 2 Apr 2012 19:13:47 -0400 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> Message-ID: <15E33FC2-CB5D-4713-9F54-C711BAF73B98@gmail.com> On 2012-04-02, at 7:10 PM, Gregory P. Smith wrote: > If the __enter__ method is interrupted it won't have returned a context and thus __exit__ will never be called. To address that Paul proposed to make __enter__ non-interruptable as well. - Yury From paul at colomiets.name Tue Apr 3 01:36:42 2012 From: paul at colomiets.name (Paul Colomiets) Date: Tue, 3 Apr 2012 02:36:42 +0300 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> Message-ID: Hi Yury, On Tue, Apr 3, 2012 at 2:04 AM, Yury Selivanov wrote: > On 2012-04-02, at 6:33 PM, Paul Colomiets wrote: >>> How's that going to work for tons of existing code? >>> >> >> It isn't. But it doesn't break code any more than it >> already is. Your proposal doesn't solve any problems >> with existing code too. 
>> >> But anyway I don't propose any new ways to interrupt >> code I only propose a way to inform trampoline when it's >> unsafe to interrupt code. > > Well, if we're thinking only about interrupting coroutines > (not threads), then it's going to work, yes. > Yes the threading stuff is more complex. For the main thread there are few possible implementations, e.g. using signals. If thread signals were ever implemented in python they can be used too. The real problem is inspecting a stack from another thread. But my solution by itself gives a pretty big field of experimentation. Like you can wrap every blocking call, and check the stack on EINTR, and either send an exception or wait a bit, like with coroutines (and you can emit EINTR with pthread_kill, and implement waiting either using another thread or using sys.settrace(), as I don't think performance really matter here) > My initial desire to use a special exception for the purpose, > was because of: > > - it's easier to throw exception in the thread (the C-API > function already exists, and we need to think about > consequences of using it) > It's nice for python to have finally protection built-in, but I don't see how it can be implemented in a generic way. E.g. I usually don't want to break finally unless it executes too long, or if I hit Ctrl+C multiple times. And that is too subjective, to be proposed as common behavior. I also doubt how it can work with stack of yield-based coroutines, as it knows nothing about this kind of stack. > - PyPy disables the JIT when working with frames (if I recall > correctly). ?That's why I wanted 'f_in_finally' to be an > implementation detail of CPython, hidden from the user code. > Perhaps PyPy could implement the handling of our special > exception in a more efficient way, without the side-effect > of disabling the JIT. > Yes, I think PyPy can make some exception. 
Or maybe until PyPy implement support of 3.3, some library may end up with nice high level API, which both python implementation can include. -- Paul From yselivanov.ml at gmail.com Tue Apr 3 02:02:06 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 2 Apr 2012 20:02:06 -0400 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> Message-ID: On 2012-04-02, at 7:36 PM, Paul Colomiets wrote: > It's nice for python to have finally protection built-in, > but I don't see how it can be implemented in a generic way. How about adding some sort of 'interruption protocol'? Say Threads, generators, and Greenlets will have a special method called '_interrupt'. While it will be implemented differently for each of them, it will work on the same principle: - check if the underlying code block is in one of its 'finally' statements - if it is: set a special flag to abort when the frame's counter of 'finally' blocks reaches 0 to raise the ExecutionInterrupt exception - if it is not: raise the ExecutionInterrupt exception right away - Yury From paul at colomiets.name Tue Apr 3 09:16:22 2012 From: paul at colomiets.name (Paul Colomiets) Date: Tue, 3 Apr 2012 10:16:22 +0300 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> Message-ID: Hi Yury, On Tue, Apr 3, 2012 at 3:02 AM, Yury Selivanov wrote: > On 2012-04-02, at 7:36 PM, Paul Colomiets wrote: > >> It's nice for python to have finally protection built-in, >> but I don't see how it can be implemented in a generic way. 
> > How about adding some sort of 'interruption protocol'? > > Say Threads, generators, and Greenlets will have a special > method called '_interrupt'. While it will be implemented > differently for each of them, it will work on the same > principle: > At first look, it's a nice proposal. On a second look, we have the following problems: 1. For yield-based coroutines you must inspect stack anyway, since the interpreter doesn't have a stack, you build it yourself (although, I don't know how `yield from` changes that) 2. For greenlet-based coroutines it is unclear what the stack is. For example: def f1(): try: pass finally: g1.switch() def f2(): sleep(1.0) g1 = greenlet(f1) g2 = greenlet(f2) g1.switch() Is it safe to interrupt g2 while it's in `sleep`? (If you wonder how I fix this problem with f_in_finally stack, it's easy. I usually switch to a coroutine from trampoline, so this is a boundary of the stack which should be checked for f_in_finally). 3. For threads it was discussed several times and rejected. This proposal may make thread interruptions slightly safer, but I'm not sure it's enough to convince people. Also, in a first implementation we may overlook some places where it's unsafe to break. Like for some objects __init__/__exit__ is the safe pair of functions, not __enter__/__exit__. So we might need the `with` expression to be uninterruptable, for the code like: with open('something') as f: ... It may break if interrupted inside `open()`. (Although, if __enter__ is protected you can fix the problem with a simple wrapper). So I still propose adding a frame flag, which doesn't break anything, and lets us experiment with interruptions without putting some experimental code into the core.
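For reference, the '_interrupt' protocol quoted at the top of this message reduces to the following bookkeeping. An explicit counter simulates what the proposal would track inside the interpreter; all names are illustrative:

```python
class InterruptibleTask:
    def __init__(self):
        self._finally_depth = 0   # the proposed f_in_finally counter
        self._pending = None

    def enter_finally(self):
        self._finally_depth += 1

    def exit_finally(self):
        self._finally_depth -= 1
        if self._finally_depth == 0 and self._pending is not None:
            exc, self._pending = self._pending, None
            raise exc             # deferred interruption fires here

    def _interrupt(self, exc):
        if self._finally_depth == 0:
            raise exc             # not in a finally: interrupt right away
        self._pending = exc       # in a finally: defer until it completes

task = InterruptibleTask()
task.enter_finally()
task._interrupt(TimeoutError("deferred"))   # stored, not raised
try:
    task.exit_finally()                     # finally done: now it fires
except TimeoutError:
    fired = True
assert fired
```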
-- Paul From alon at horev.net Tue Apr 3 10:02:25 2012 From: alon at horev.net (Alon Horev) Date: Tue, 3 Apr 2012 11:02:25 +0300 Subject: [Python-ideas] with *context_managers: In-Reply-To: References: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21> <20120402153451.GA2470@bagheera> Message-ID: Another proposal, change nested to work with generators: with nested(open(path) for path in files): .... pros: 1. lazy evaluation of context manager creation (which is everything that is bad with today's nested). 2. shorter than ContextStack. cons: 1. generators are not always an option, in these cases ContextStack is the way to go. 2. a little implicit - I can imagine a python newbie swearing my mom because he didn't know he should use a generator instead of a list. thanks, Alon Horev On Mon, Apr 2, 2012 at 11:52 PM, Nick Coghlan wrote: > On Tue, Apr 3, 2012 at 1:34 AM, Sven Marnach wrote: > > which only works for a fixed number of context managers. And there is > > a class 'ContextStack' in Nick Coghlan's 'contextlib2' library [1], > > which might be included in Python 3.3. With this class, you could > > write your code as > > > > with ContextStack() as stack: > > for cm in context_managers: > > stack.enter_context(cm) > > > > This still leaves the question whether your proposed syntax would be > > preferable, also with regard to issue 2292 [2]. > > Both "with *(iterable)" and "for cm in iterable: stack.enter(cm)" are > flawed in exactly the same way that contextlib.nested() is flawed: > they encourage creating the iterable of context managers first, which > means that inner __init__ methods are not covered by outer __exit__ > methods. > > This breaks as soon as you have resources (such as files) where the > acquire/release resource management pairing is actually > __init__/__exit__ with __enter__ just returning self rather than > acquiring the resource. 
If the iterable of context managers is created > first, then the outer resources *will be leaked* if any of the inner > constructors fail. The only way to write code that handles an > arbitrary number of arbitrary context managers in a robust fashion is > to ensure the initialisation steps are also covered by the outer > context managers: > > with CallbackStack() as stack: > for make_cm in cm_factories: > stack.enter(make_cm()) > > (Note that I'm not particularly happy with the class and method names > for contextlib2.ContextStack, and plan to redesign it a bit before > adding it to the stdlib module: > > https://bitbucket.org/ncoghlan/contextlib2/issue/8/rename-contextstack-to-callbackstack-and > ) > > The only time you can get away with a contextlib.nested() style API > where the iterable of context managers is created first is when you > *know* that all of the context managers involved do their resource > acquisition in __enter__ rather than __init__. In the general case, > though, any such API is broken because it doesn't reliably clean up > files and similar acquired-on-initialisation resources. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov.ml at gmail.com Tue Apr 3 16:09:07 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Tue, 3 Apr 2012 10:09:07 -0400 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> Message-ID: <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com> On 2012-04-03, at 3:16 AM, Paul Colomiets wrote: > 1. 
For yield-based coroutines you must inspect the stack > anyway; since the interpreter doesn't keep one, you > build it yourself (although I don't know how `yield from` > changes that)
>
> 2. For greenlet-based coroutines it is unclear what > the stack is. For example:
>
> def f1():
>     try:
>         pass
>     finally:
>         g1.switch()
>
> def f2():
>     sleep(1.0)
>
> g1 = greenlet(f1)
> g2 = greenlet(f2)
> g1.switch()
>
> Is it safe to interrupt g2 while it's in `sleep`? (If you wonder > how I fix this problem with the f_in_finally stack, it's easy. I > usually switch to a coroutine from the trampoline, so this is > a boundary of the stack which should be checked for > f_in_finally).

Wait. So you're tracing the whole coroutine execution stack to check if the current coroutine was called in a finally block of some other coroutine? For handling timeouts I don't think that is necessary (maybe there are other use cases?) In the example below you actually have to interrupt g2:

def g1():
    try:
        ...
    finally:
        g2().with_timeout(0.1)

def g2():
    sleep(2)

You shouldn't guarantee that the *whole* chain of functions/coroutines/etc will be safe in their finally statements; you just need to protect the top coroutines in the timeouts queue. Hence, in the above example, if you run g1() with a timeout, the trampoline should ensure that it won't interrupt it while it is in its finally block. But it can interrupt g2() in any context at any point of its execution. And if g2() gets interrupted, g1()'s finally statement will be broken, yes. But it is the responsibility of the developer to ensure that the code in 'finally' handles exceptions within it correctly. That's just my approach to handling timeouts; I'm not claiming it is the only right one. Are there any other use cases where you have to inspect the execution stack? Because if there are none, an 'interrupt()' method is sufficient and implementable, as both generators and greenlets are well aware of the code frames they are holding.

> 3.
For threads it was discussed several times and rejected. > This proposal may make thread interruptions slightly safer, > but I'm not sure it's enough to convince people.

That's why I'm advocating for a PEP. Thread interruption isn't a safe feature in the .NET CLR either. You may break things with it there too. And it doesn't protect the chain of functions calling each other from their 'finally' statements; it just protects the top frame. The 'abort' and 'interrupt' methods aren't advertised to be used in .NET; use them at your own risk. So I don't think that we can, or should, ensure 100% safety when interrupting a thread. And that's why I think it is worth proposing a mechanism that will work for many concurrency primitives.

> So I still propose adding a frame flag, which doesn't break > anything, and lets us experiment with interruptions > without putting some experimental code into the core.

There are pros and cons in your solution.

Pros
----
- can be used right away in coroutine libraries.
- somewhat simple and small CPython patch.

Cons
----
- you have to work with frames almost throughout the execution of the program. In PyPy you will simply have the JIT disabled. And I'm not sure how frame access works in Jython and IronPython from the performance point of view.
- no mechanism for interrupting a running thread. In almost any coroutine library you will have a thread pool, and sometimes you need a way to interrupt workers. So it's not enough even for coroutines.
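For generator-based coroutines, the 'interrupt()' primitive discussed here maps directly onto the existing generator protocol: `throw()` raises the exception at the coroutine's suspension point, and its finally clause runs on the way out. A minimal sketch (the names `worker` and `log` are invented here; the open question in this thread is what happens if a *second* interruption arrives while the cleanup itself is running):

```python
def worker(log):
    try:
        while True:
            yield "working"
    finally:
        # The cleanup runs even when we are interrupted via throw();
        # the whole debate is about protecting *this* block from
        # being interrupted in turn.
        log.append("cleanup")

log = []
g = worker(log)
next(g)                    # advance to the first yield
try:
    g.throw(TimeoutError)  # interrupt the coroutine at its yield point
except TimeoutError:
    pass                   # the exception propagates back to the caller
```

After the `throw()`, `log` contains the cleanup record, showing that the finally clause did execute.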
- Yury From andrew.svetlov at gmail.com Tue Apr 3 18:11:36 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Tue, 3 Apr 2012 19:11:36 +0300 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com> References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com> Message-ID: Instead of looking it up from the nested frame, it's possible to propagate the flag down to called frames.

On Tue, Apr 3, 2012 at 5:09 PM, Yury Selivanov wrote: > On 2012-04-03, at 3:16 AM, Paul Colomiets wrote: >> 1. For yield-based coroutines you must inspect stack >> anyway, since interpreter doesn't have a stack, you >> build it yourself (although, I don't know how `yield from` >> changes that) >> >> 2. For greenlet based coroutines it is unclear what >> the stack is. For example:
>>
>> def f1():
>>     try:
>>         pass
>>     finally:
>>         g1.switch()
>>
>> def f2():
>>     sleep(1.0)
>>
>> g1 = greenlet(f1)
>> g2 = greenlet(f2)
>> g1.switch()
>>
>> Is it safe to interrupt g2 while it's in `sleep`? (If you wonder >> how I fix this problem with f_in_finally stack, it's easy. I >> usually switch to a coroutine from trampoline, so this is >> a boundary of the stack which should be checked for >> f_in_finally). > > Wait. So you're tracing the whole coroutine execution stack to > check if the current coroutine was called in a finally block of > some other coroutine? For handling timeouts I don't think that > is necessary (maybe there are other use cases?) > > In the example below you actually have to interrupt g2:
>
> def g1():
>     try:
>         ...
>     finally:
>         g2().with_timeout(0.1)
>
> def g2():
>
>     sleep(2)
>
> You shouldn't guarantee that the *whole* chain of functions/ > coroutines/etc will be safe in their finally statements, you just > need to protect the top coroutines in the timeouts queue. > > Hence, in the above example, if you run g1() with a timeout, the > trampoline should ensure that it won't interrupt it while it is > in its finally block. But it can interrupt g2() in any context > at any point of its execution. And if g2() gets interrupted, > g1()'s finally statement will be broken, yes. But that's the > responsibility of the developer to ensure that the code in > 'finally' handles exceptions within it correctly. > > That's just my approach to handle timeouts, I'm not advocating > it to be the very right one. > > Are there any other use-cases when you have to inspect the > execution stack? Because if there are none, an 'interrupt()' method > is sufficient and implementable, as both generators and > greenlets are well aware of the code frames they are holding. > >> 3. For threads it was discussed several times and rejected. >> This proposal may make thread interruptions slightly safer, >> but I'm not sure it's enough to convince people. > > That's why I'm advocating for a PEP. Thread interruption isn't > a safe feature in the .NET CLR either. You may break things with > it there too. And it doesn't protect the chain of functions > calling each other from their 'finally' statements, it just > protects the top frame. The 'abort' and 'interrupt' methods > aren't advertised to be used in .NET, use them at your own risk. > > So I don't think that we can, or should ensure 100% safety when > interrupting a thread. And that's why I think it is worth > proposing a mechanism that will work for many concurrency > primitives. > >> So I still propose adding a frame flag, which doesn't break > anything, and lets us experiment with interruptions >> without putting some experimental code into the core. > > There are pros and cons in your solution.
>
> Pros
> ----
> - can be used right away in coroutine libraries.
> - somewhat simple and small CPython patch.
>
> Cons
> ----
> - you have to work with frames almost throughout the execution > of the program. In PyPy you will simply have the JIT disabled. > And I'm not sure how frame access works in Jython and IronPython > from the performance point of view.
> - no mechanism for interrupting a running thread. In almost any > coroutine library you will have a thread pool, and sometimes you > need a way to interrupt workers. So it's not enough even for > coroutines.
>
> - Yury
> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas

-- Thanks, Andrew Svetlov

From paul at colomiets.name Tue Apr 3 21:22:50 2012 From: paul at colomiets.name (Paul Colomiets) Date: Tue, 3 Apr 2012 22:22:50 +0300 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com> References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com> Message-ID: Hi Yury, On Tue, Apr 3, 2012 at 5:09 PM, Yury Selivanov wrote: > You shouldn't guarantee that the *whole* chain of functions/ > coroutines/etc will be safe in their finally statements, you just > need to protect the top coroutines in the timeouts queue.

For yield-based coroutines the common case is the following:

yield lock.acquire()
try:
    # something
finally:
    yield lock.release()

The implementation of lock.release() is something which goes to the distributed locking manager, so it should not be interrupted. I can think of some ways of fixing this using tight coupling of the locks with the trampoline, but having an obvious way to do it without hacks is much better.
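The pattern above can be made concrete with a toy trampoline and an in-memory lock (all names here are invented for illustration; there is no real network I/O, and a real scheduler would of course do much more):

```python
events = []

def lock_acquire():
    events.append("acquire")
    yield  # stand-in for waiting on the lock server

def lock_release():
    events.append("release")
    yield  # stand-in for talking to the lock server; must not be interrupted

def add_money():
    yield lock_acquire()
    try:
        events.append("increment")   # the critical section
    finally:
        yield lock_release()         # runs inside the finally clause

def trampoline(root):
    # Minimal driver: a yielded generator is pushed on a stack and the
    # caller is not resumed until the nested coroutine has finished.
    stack = [root]
    value = None
    while stack:
        try:
            result = stack[-1].send(value)
        except StopIteration:
            stack.pop()
            value = None
            continue
        if hasattr(result, "send"):  # a nested coroutine
            stack.append(result)
            value = None
        else:
            value = result

trampoline(add_money())
```

A timeout in this scheme would be delivered by calling throw() on the generator at the top of the stack - which is exactly the problem: if the top of the stack happens to be lock_release(), throwing there leaves the lock held.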
(Although I don't know how `yield from` changes working with yield-based coroutines; maybe its behavior is quite different.) For greenlets the situation is a bit different, as Python knows the stack there, but you still need to traverse it (or, as Andrew mentioned, you can just propagate the flag).

> - no mechanism for interrupting a running thread.

Yes. That was intentional, to have a greater chance of success. Interruption may be a separate proposal.

> In almost any coroutine library you will have a thread pool, > and sometimes you need a way to interrupt workers.

Which one? I know that twisted uses a thread pool to handle: 1. DNS (which is just silly) 2. A few other protocols which don't have asynchronous libraries (which should be fixed) The whole intention of using a coroutine library is not to have a thread pool. Could you describe your use case with more details?

> So it's not enough even for coroutines.

Very subjective, and doesn't match my expectations.

-- Paul

From yselivanov.ml at gmail.com Wed Apr 4 03:23:34 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Tue, 3 Apr 2012 21:23:34 -0400 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com> Message-ID: On 2012-04-03, at 3:22 PM, Paul Colomiets wrote: > (Although, I don't know how `yield from` changes working with > yield-based coroutines, may be it's behavior is quite different) > > For greenlets situation is a bit different, as Python knows the > stack there, but you still need to traverse it (or as Andrew > mentioned, you can just propagate flag). Why traverse? Why propagate? As I explained in my previous posts here, you need to protect only the top-stack coroutines in the timeouts or trampoline execution queues.
You should illustrate your logic with a clearer example - say three or four coroutines that call each other, plus a glimpse of how your trampoline works. But I'm not sure that is really necessary.

>> - no mechanism for interrupting a running thread. > > Yes. That was intentionally to have greater chance to success. > Interruption may be separate proposal. > >> In almost any coroutine library you will have a thread pool, >> and sometimes you need a way to interrupt workers. > > Which one? I know that twisted uses thread pool to handle:

Besides Twisted? eventlet; gevent will have them in 1.0, etc.

> 1. DNS (which is just silly) > 2. Few other protocols which doesn't have asynchronous > libraries (which should be fixed) > > The whole intention of using coroutine library is to not to > have thread pool. Could you describe your use case > with more details?

Well, our company has been using coroutines for like 2.5 years now (the framework is not yet opensourced). And in our practice a threadpool is really handy, as it allows you to:
- use non-asynchronous libraries, which you don't want to monkeypatch with greensockets (or are even unable to monkeypatch)
- wrap some functions that are usually very fast, but once in a while may take some time. And sometimes you don't want to offload them to a separate process
- and yes, do DNS lookups if you don't have a compiled cpython extension that wraps c-ares or something alike.

Please let's avoid shifting further discussion to proving or disproving the necessity of threadpools. They are being actively used and there is a demand for (more or less) graceful thread interruption or abortion.

>> So it's not enough even for coroutines. >> > > Very subjective, and doesn't match my expectations.

As I said -- we've been working with coroutines (combined generators + greenlets) for a few years, and apparently have different experience, opinions and expectations from what you have.
And I suppose developers and users of eventlet, gevent, twisted (@inlineCallbacks) and other libraries have their own opinions and ideas too. Not to mention, it would be interesting to hear from the PyPy, Jython and IronPython teams. It also seems that neither of us has enough experience working with 'yield from' style coroutines. Please write a PEP and we'll continue discussion from that point. Hopefully, it will get more attention than this thread. - Yury From greg.ewing at canterbury.ac.nz Wed Apr 4 02:03:29 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 04 Apr 2012 12:03:29 +1200 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> Message-ID: <4F7B8FD1.3040308@canterbury.ac.nz> Paul Colomiets wrote: > So I still propose add a frame flag, which doesn't break > anything, and gives us experiment with interruptions > without putting some experimental code into the core. I don't think a frame flag on its own is quite enough. You don't just want to prevent interruptions while in a finally block, you want to defer them until the finally counter gets back to zero. Making the interrupter sleep and try again in that situation is rather ugly. So perhaps there could also be a callback that gets invoked when the counter goes down to zero.
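Greg's counter-plus-callback idea can be sketched single-threaded (the class and method names here are invented; a real implementation would hang this state off the frame or thread state):

```python
class InterruptGuard:
    """Defer interruption requests made while the finally counter is
    nonzero; deliver them when the counter returns to zero."""
    def __init__(self):
        self._depth = 0
        self._pending = None

    def __enter__(self):               # entering a finally block
        self._depth += 1
        return self

    def __exit__(self, *exc):          # leaving a finally block
        self._depth -= 1
        if self._depth == 0 and self._pending is not None:
            callback, self._pending = self._pending, None
            callback()                 # deliver the deferred interruption
        return False

    def interrupt(self, callback):
        if self._depth == 0:
            callback()                 # counter is zero: deliver immediately
        else:
            self._pending = callback   # defer until the counter hits zero
```

With this shape the interrupter never has to sleep and retry: it registers the callback once and the protected code delivers it on the way out.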
-- Greg From paul at colomiets.name Wed Apr 4 10:04:04 2012 From: paul at colomiets.name (Paul Colomiets) Date: Wed, 4 Apr 2012 11:04:04 +0300 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com> Message-ID: Hi, On Wed, Apr 4, 2012 at 4:23 AM, Yury Selivanov wrote: > On 2012-04-03, at 3:22 PM, Paul Colomiets wrote: >> (Although, I don't know how `yield from` changes working with >> yield-based coroutines, may be it's behavior is quite different) >> >> For greenlets situation is a bit different, as Python knows the >> stack there, but you still need to traverse it (or as Andrew >> mentioned, you can just propagate flag). > > Why traverse? Why propagate? As I explained in my previous posts > here, you need to protect only the top-stack coroutines in the > timeouts or trampoline execution queues. You should illustrate > your logic with a more clear example - say three or four coroutines > that call each other + with a glimpse of how your trampoline works. > But I'm not sure that is really necessary.
Here is a more detailed version of the previous example (although still simplified):

@coroutine
def add_money(user_id, money):
    yield redis_lock(user_id)
    try:
        yield redis_incr('user:'+user_id+':money', money)
    finally:
        yield redis_unlock(user_id)

# this one is crucial to show the point of discussion
# other functions are similar:
@coroutine
def redis_unlock(lock):
    yield redis_socket.wait_write()  # yields back when socket is ready for writing
    cmd = ('DEL user:'+lock+'\n').encode('ascii')
    redis_socket.write(cmd)  # should be a loop here, actually
    yield redis_socket.wait_read()
    result = redis_socket.read(1024)  # here a loop too
    assert result == 'OK\n'

The trampoline, when it gets a coroutine from the `next()` or `send()` method, puts it on top of the stack and doesn't dispatch the original one until the topmost one has exited. The point is that if a timeout arrives inside the `redis_unlock` function, we must wait until the finally clause of `add_money` has finished.

>> >> The whole intention of using coroutine library is to not to >> have thread pool. Could you describe your use case >> with more details? > > Well, our company has been using coroutines for like 2.5 years > now (the framework is not yet opensourced). And in our practice > threadpool is really handy, as it allows you to: > > - use non-asynchronous libraries, which you don't want to > monkeypatch with greensockets (or are even unable to monkeypatch) >

And we rewrite them in python. It seems to be more useful.

> - wrap some functions that are usually very fast, but once in > a while may take some time. And sometimes you don't want to > offload them to a separate process >

Ack.

> - and yes, do DNS lookups if you don't have a compiled cpython > extension that wraps c-ares or something alike. >

Maybe let's propose an asynchronous DNS library for python? We have the same problem, although we do not resolve hosts at runtime (only at startup), so a synchronous one is good enough for our purposes.
> Please let's avoid shifting further discussion to proving or > disproving the necessity of threadpools.

Agreed.

> They are being actively used and there is a demand for > (more or less) graceful threads interruption or abortion.

Given the use cases, what stops you from adding explicit interruption points?

> > Please write a PEP and we'll continue discussion from that > point. Hopefully, it will get more attention than this thread.

I don't see the point in writing a PEP until I have an idea what the PEP should propose. If you have one, you can do it. Again, you want to implement thread interruption, and that's not my point; there is another thread for that.

On Wed, Apr 4, 2012 at 3:03 AM, Greg Ewing wrote: > > I don't think a frame flag on its own is quite enough. > You don't just want to prevent interruptions while in > a finally block, you want to defer them until the finally > counter gets back to zero. Making the interrupter sleep > and try again in that situation is rather ugly. > > So perhaps there could also be a callback that gets > invoked when the counter goes down to zero.

Do you mean putting a callback in a frame, which gets executed at the next bytecode just like a signal handler, except that it waits until the finally clause is executed? It would work, except it may have a slight performance impact on each bytecode. But I'm not sure if it will be noticeable.

-- Paul

From victor.varvariuc at gmail.com Wed Apr 4 10:07:55 2012 From: victor.varvariuc at gmail.com (Victor Varvariuc) Date: Wed, 4 Apr 2012 11:07:55 +0300 Subject: [Python-ideas] dict.items to accept optional iterable with keys to use Message-ID: Sometimes you want a dict which is a subset of another dict. It would be nice if dict.items accepted an optional list of keys to return. If no keys are given, use the default behavior - get all items.
class NewDict(dict):
    def items(self, keys=()):
        """Another version of dict.items() which accepts specific keys to use."""
        for key in keys or self.keys():
            yield key, self[key]

a = NewDict({1: 'one', 2: 'two', 3: 'three', 4: 'four', 5: 'five'})
print(dict(a.items()))
print(dict(a.items((1, 3, 5))))

vic at ubuntu:~/Desktop$ python test.py
{1: 'one', 2: 'two', 3: 'three', 4: 'four', 5: 'five'}
{1: 'one', 3: 'three', 5: 'five'}

Thanks for the attention. -- *Victor Varvariuc* -------------- next part -------------- An HTML attachment was scrubbed... URL: From pyideas at rebertia.com Wed Apr 4 10:22:05 2012 From: pyideas at rebertia.com (Chris Rebert) Date: Wed, 4 Apr 2012 01:22:05 -0700 Subject: [Python-ideas] dict.items to accept optional iterable with keys to use In-Reply-To: References: Message-ID: On Wed, Apr 4, 2012 at 1:07 AM, Victor Varvariuc wrote: > Sometimes you want a dict which is a subset of another dict. It would be nice if > dict.items accepted an optional list of keys to return. If no keys are given > - use default behavior - get all items. > print(dict(a.items((1, 3, 5)))) In that use case, why not just write a dict comprehension?:

print({k: a[k] for k in (1, 3, 5)})

Completely explicit, and only a mere few characters longer. Cheers, Chris From steve at pearwood.info Wed Apr 4 10:23:01 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 4 Apr 2012 18:23:01 +1000 Subject: [Python-ideas] dict.items to accept optional iterable with keys to use In-Reply-To: References: Message-ID: <20120404082301.GB19862@ando> On Wed, Apr 04, 2012 at 11:07:55AM +0300, Victor Varvariuc wrote: > Sometimes you want a dict which is a subset of another dict. It would be nice if > dict.items accepted an optional list of keys to return. If no keys are > given - use default behavior - get all items. Too trivial to bother with.
# I want a list of keys/values:
items = [(key, mydict[key]) for key in list_of_keys]

# I want them generated on demand:
iterable = ((key, mydict[key]) for key in list_of_keys)

# I want a new dict:
newdict = dict((key, mydict[key]) for key in list_of_keys)

All of those require that list_of_keys contains only keys that actually exist. If you want to skip missing keys:

[(k,v) for (k,v) in mydict.items() if k in list_of_keys]

To be even more efficient, use a set of keys instead of a list. -- Steven From techtonik at gmail.com Wed Apr 4 10:25:57 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 4 Apr 2012 11:25:57 +0300 Subject: [Python-ideas] Python probe: execute code in isolation (subinterpreter?) and get results Message-ID: Hi, Is there a standard way to execute Python code and inspect the results without spawning an external Python process? If there is no such way, I'd like to propose the feature, and there are two user stories. Both are about application environment probing. Story #1: Choosing the best library in a safe manner Probing the environment is required for Python applications to make component selection logic explicit and less error-prone. I can tell from my experience with Spyder IDE that the startup procedure is the most fragile part of this cross-platform application, which makes use of optionally installed components on the user's system. The implicit nature of imports and the inability to revert an import operation make the situation complicated. Below is an example. Note that this is not about packaging. Spyder IDE is a Qt application that optionally embeds an IPython console. Qt has two bindings - PyQt4 and PySide. The PyQt4 binding has two APIs - #1 and #2. If PyQt4 is used and the version of the installed IPython is >= 0.11, API #2 must be chosen. So, the IPython version probing should come first. A standard way to detect the IPython version is to import IPython before the rest of the application, but IPython may detect PyQt4 itself and import it too for probing its version.
And if Spyder uses PySide we now have a conflict with the Qt libraries loaded. If there were a way to execute a Python script in a subinterpreter to probe all installed component versions and return the results, the selection logic would be much more readable and sane. Story #2: Get settings from user script Blender uses SCons to automate builds. The SCons script is written in Python and it uses execfile(filename, globals, locals) to read a platform-specific user script with settings. Unfortunately, execfile() is a hack that doesn't treat Python scripts the same way the interpreter treats them - for example, globals access will throw exceptions - http://bugs.python.org/issue14049 More importantly, users won't be able to troubleshoot the exceptions, because the standalone script works as expected. Executing user script code in a subprocess will most likely negatively affect performance, which is rather critical for a build tool. Pickling and unpickling objects with global state through a communication pipe may also be a source of bugs. So, having a cheap way to communicate with a Python subinterpreter and get a simple dict as a result would make Python more snappy. -- anatoly t. From mal at egenix.com Wed Apr 4 11:20:20 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 04 Apr 2012 11:20:20 +0200 Subject: [Python-ideas] Python probe: execute code in isolation (subinterpreter?) and get results In-Reply-To: References: Message-ID: <4F7C1254.5070703@egenix.com> anatoly techtonik wrote: > Hi, > > Is there a standard way to execute a Python code and inspect the > results without spawning an external Python process? If there is no > such way, I'd like to propose the feature, and there are two user > stories. Both are about application environment probing. > > > Story #1: Choosing the best library in a safe manner > > Probing environment is required for Python applications to make > component selection logic explicit and less error-prone.
I can tell > from my experience with Spyder IDE that startup procedure is the most > fragile part for this cross-platform application, which makes use of > optionally installed components on user system. Implicit import nature > and inability to revert import operation makes situation complicated. > Below is an example. Take a note that this is not about packaging. > > Spyder IDE is a Qt application that optionally embeds IPython console. > Qt has two bindings - PyQt4 and PySide. PyQt4 binding has two APIs - > #1 and #2. If PyQt4 is used and version of installed IPython >= 0.11, > the API #2 must be chosen. So, the IPython version probing should come > first. A standard way to detect IPython version is to import IPython > before the rest of the application, but IPython may detect PyQt4 > itself and import it too for probing version. And if Spyder uses > PySide we now have a conflict with Qt libraries loaded. If there was a > way to execute Python script in subinterpreter to probe all installed > component versions and return results, the selection logic would be > much more readable and sane. Given that you are also loading external shared libraries, I don't see how you could do this within the same process. Unloading shared libs is possible (even if fragile), but if you don't even know which libs have been loaded, it is likely impossible to do in a cross-platform way. > Story #2: Get settings from user script > > Blender uses SCons to automate builds. SCons script is written in > Python and it uses execfile(filename, globals, locals) to read > platform specific user script with settings. Unfortunately, execfile() > is a hack that doesn't treat Python scripts the same way the > interpreter treats them - for example, globals access will throw > exceptions - http://bugs.python.org/issue14049 More important that > users won't be able to troubleshoot the exceptions, because standalone > script works as expected.
You're not using execfile() correctly: if you want a script to be run in the same way as a module, then the local and global namespace dictionaries have to be the same. So the second story already works in vanilla Python :-) Lots of Python applications read configuration data from user supplied (Python) config files. It's less secure than e.g. INI files, but gives you a lot of flexibility in defining what you need. > Executing user script code in a subprocess will most likely negatively > affect performance, which is rather critical for a build tool. > Pickling and unpickling objects with global state through > communication pipe may also be the source of bugs. So, having a cheap > way to communicate with Python subinterpreter and get a simple dict in > result will make Python more snappy. I don't see how you could get story #1 working without a subprocess. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 04 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2012-04-03: Python Meeting Duesseldorf today ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From yselivanov.ml at gmail.com Wed Apr 4 18:59:57 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Wed, 4 Apr 2012 12:59:57 -0400 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com> Message-ID: <7007EA8B-09BB-46B8-A5D1-52CAB128DAB0@gmail.com> On 2012-04-04, at 4:04 AM, Paul Colomiets wrote: > Hi, > > On Wed, Apr 4, 2012 at 4:23 AM, Yury Selivanov wrote: >> On 2012-04-03, at 3:22 PM, Paul Colomiets wrote: >>> (Although, I don't know how `yield from` changes working with >>> yield-based coroutines, may be it's behavior is quite different) >>> >>> For greenlets situation is a bit different, as Python knows the >>> stack there, but you still need to traverse it (or as Andrew >>> mentioned, you can just propagate flag). >> >> Why traverse? Why propagate? As I explained in my previous posts >> here, you need to protect only the top-stack coroutines in the >> timeouts or trampoline execution queues. You should illustrate >> your logic with a more clear example - say three or four coroutines >> that call each other + with a glimpse of how your trampoline works. >> But I'm not sure that is really necessary. 
>> Here is more detailed previous example (although, still simplified):
>>
>> @coroutine
>> def add_money(user_id, money):
>>     yield redis_lock(user_id)
>>     try:
>>         yield redis_incr('user:'+user_id+':money', money)
>>     finally:
>>         yield redis_unlock(user_id)
>>
>> # this one is crucial to show the point of discussion
>> # other functions are similar:
>> @coroutine
>> def redis_unlock(lock):
>>     yield redis_socket.wait_write()  # yields back when socket is ready for writing
>>     cmd = ('DEL user:'+lock+'\n').encode('ascii')
>>     redis_socket.write(cmd)  # should be a loop here, actually
>>     yield redis_socket.wait_read()
>>     result = redis_socket.read(1024)  # here a loop too
>>     assert result == 'OK\n'
>>
>> The trampoline when gets coroutine from `next()` or `send()` method >> puts it on top of stack and doesn't dispatch original one until topmost >> one is exited. >> >> The point is that if timeout arrives inside a `redis_unlock` function, we >> must wait until finally from `add_money` is finished

How can it "arrive" inside "redis_unlock"? Let's assume you called "add_money" as such:

yield add_money(1, 10).with_timeout(10)

Then it's the 'add_money' coroutine that should be in the timeouts queue/tree! 'add_money' specifically is what should be interrupted when your 10s timeout is reached. And if 'add_money' is in its 'finally' statement - you simply postpone its interruption, meaning that 'redis_unlock' will end its execution nicely. Again, I'm not sure how exactly you manage your timeouts. The way I do it, simplified: I have a timeouts heapq with pointers to those coroutines that were *explicitly* executed with a timeout. So I'm protecting only the coroutines in that queue, because only they can be interrupted. And the coroutines they call are protected *automatically*. If you do it differently, can you please elaborate on how your scheduler is actually designed? >>> >>> The whole intention of using coroutine library is to not to >>> have thread pool.
>>> Could you describe your use case in more detail?
>>
>> Well, our company has been using coroutines for like 2.5 years now (the framework is not yet open-sourced). And in our practice a threadpool is really handy, as it allows you to:
>>
>> - use non-asynchronous libraries, which you don't want to monkeypatch with greensockets (or are even unable to monkeypatch)
>>
>
> And we rewrite them in python. It seems to be more useful.

Sometimes you can't afford the luxury ;)

>
>> - wrap some functions that are usually very fast, but once in a while may take some time. And sometimes you don't want to offload them to a separate process
>>
>
> Ack.
>
>> - and yes, do DNS lookups if you don't have a compiled cpython extension that wraps c-ares or something alike.
>>
>
> Maybe let's propose an asynchronous DNS library for python? We have the same problem, although we do not resolve hosts at runtime (only at startup), so a synchronous one is good enough for our purposes.
>
>> Please let's avoid shifting further discussion to proving or disproving the necessity of threadpools.
>
> Agreed.
>
>> They are being actively used and there is a demand for (more or less) graceful thread interruption or abortion.
>>
>
> Given the use cases, what stops you from adding explicit interruption points?
>
>>
>> Please write a PEP and we'll continue discussion from that point. Hopefully, it will get more attention than this thread.
>>
>
> I don't see the point in writing a PEP until I have an idea of what the PEP should propose. If you have one, you can do it.

Again, OK, point taken. Please give me a couple of days to at least come up with a summary document. I still don't like your solution because it works directly with frames. With the upcoming PyPy support of python 3, I don't think I want to lose the JIT support. I also want to take a look at the new PyPy continuations.
Ideally, as I proposed earlier, we should introduce some sort of interruption protocol -- a method 'interrupt()', perhaps with a callback.

> you want to implement thread interruption, and that's not my point, there is another thread for that.

We have two requests: the ability to safely interrupt a python function or generator (1); the ability to safely interrupt python threads (2). Both (1) and (2) share the same requirement of safe 'finally' statements. To me, both features are similar enough to come up with a single solution, rather than inventing different approaches.

> On Wed, Apr 4, 2012 at 3:03 AM, Greg Ewing wrote:
>>
>> I don't think a frame flag on its own is quite enough. You don't just want to prevent interruptions while in a finally block, you want to defer them until the finally counter gets back to zero. Making the interrupter sleep and try again in that situation is rather ugly.

That's the second reason I don't like your proposal.

def foo():
    try:
        ..
    finally:
        yield unlock()
    # <--- the ideal point to interrupt foo

    f = open('a', 'w')
    # what if we interrupt it here?
    try:
        ..
    finally:
        f.close()

>> So perhaps there could also be a callback that gets invoked when the counter goes down to zero.
>
> Do you mean putting a callback in a frame, which gets executed at the next bytecode just like a signal handler, except it waits until the finally clause is executed?
>
> It would work, except it may have a light performance impact on each bytecode. But I'm not sure if it will be noticeable.

That's essentially the way we currently do it. We transform the coroutine's __code__ object, turning:

def a():
    try:
        # code1
    finally:
        # code2

into:

def a():
    __self__ = __get_current_coroutine()
    try:
        # code1
    finally:
        __self__.enter_finally()
        try:
            # code2
        finally:
            __self__.exit_finally()

'enter_finally' and 'exit_finally' maintain the internal counter of finally blocks. If a coroutine needs to be interrupted, we check that counter. If it is 0 - throw in a special exception.
If not - wait till it becomes 0 and throw the exception in 'exit_finally'. Works flawlessly, but with the high cost of having to patch __code__ objects.

- Yury

From sven at marnach.net Wed Apr 4 19:01:58 2012
From: sven at marnach.net (Sven Marnach)
Date: Wed, 4 Apr 2012 18:01:58 +0100
Subject: [Python-ideas] dict.items to accept optional iterable with keys to use
In-Reply-To:
References:
Message-ID: <20120404170158.GB2470@bagheera>

Victor Varvariuc schrieb am Wed, 04. Apr 2012, um 11:07:55 +0300:
> Sometimes you want a dict which is a subset of another dict. It would be nice if dict.items accepted an optional list of keys to return. If no keys are given - use the default behavior - get all items.

How about using `operator.itemgetter()`?

    from operator import itemgetter
    itemgetter(*keys)(my_dict)

It will return a tuple of the values corresponding to the given keys.

Cheers,
    Sven

From paul at colomiets.name Wed Apr 4 20:44:30 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Wed, 4 Apr 2012 21:44:30 +0300
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <7007EA8B-09BB-46B8-A5D1-52CAB128DAB0@gmail.com>
References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com> <7007EA8B-09BB-46B8-A5D1-52CAB128DAB0@gmail.com>
Message-ID:

Hi Yury,

On Wed, Apr 4, 2012 at 7:59 PM, Yury Selivanov wrote:
>> Here is the previous example in more detail (although still simplified):
>>
>> @coroutine
>> def add_money(user_id, money):
>>     yield redis_lock(user_id)
>>     try:
>>         yield redis_incr('user:'+user_id+':money', money)
>>     finally:
>>         yield redis_unlock(user_id)
>>
>> # this one is crucial to show the point of discussion
>> # other functions are similar:
>> @coroutine
>> def redis_unlock(lock):
>>     yield redis_socket.wait_write()  # yields back when socket is ready for writing
>>     cmd = ('DEL user:'+lock+'\n').encode('ascii')
>>     redis_socket.write(cmd)  # should be a loop here, actually
>>     yield redis_socket.wait_read()
>>     result = redis_socket.read(1024)  # here a loop too
>>     assert result == 'OK\n'
>>
>> When the trampoline gets a coroutine from the `next()` or `send()` method, it puts it on top of the stack and doesn't dispatch the original one until the topmost one has exited.
>>
>> The point is that if a timeout arrives inside the `redis_unlock` function, we must wait until the finally from `add_money` is finished
>
> How can it "arrive" inside "redis_unlock"? Let's assume you called "add_money" as such:
>
> yield add_money(1, 10).with_timeout(10)
>
> Then it's the 'add_money' coroutine that should be in the timeouts queue/tree! 'add_money' specifically is what should be interrupted when your 10s timeout is reached. And if 'add_money' is in its 'finally' statement - you simply postpone its interruption, meaning that 'redis_unlock' will end its execution nicely.
>
> Again, I'm not sure how exactly you manage your timeouts. The way I do, simplified: I have a timeouts heapq with pointers to those coroutines that were *explicitly* executed with a timeout. So I'm protecting only the coroutines in that queue, because only they can be interrupted. And the coroutines they call are protected *automatically*.
>
> If you do it differently, can you please elaborate on how your scheduler is actually designed?
>

I have a global timeout for processing a single request. It's actually higher up in the chain of generator calls. So the dispatcher looks like:

def dispatcher(self, method, args):
    with timeout(10):
        yield getattr(self, method)(*args)

And all the local timeouts, like the timeout for a single request, are usually applied at the socket level, where the specific protocol is implemented:

def redis_unlock(lock):
    yield redis_socket.wait_write(2)  # wait two seconds
    # TimeoutError may have been raised in wait_write()
    cmd = ('DEL user:'+lock+'\n').encode('ascii')
    redis_socket.write(cmd)  # should be a loop here, actually
    yield redis_socket.wait_read(2)  # another two seconds
    result = redis_socket.read(1024)  # here a loop too
    assert result == 'OK\n'

So they are not interruptions. Although we don't use them much with coroutines; the global timeout per request is usually enough.

But anyway, I don't see a reason to protect a single frame, because even if you have a simple mutex without coroutines you end up with:

def something():
    lock.acquire()
    try:
        pass
    finally:
        lock.release()

And if the lock's implementation is something along the lines of:

def release(self):
    self._native_lock.release()

How can you be sure that the interruption is not executed when the interpreter has resolved `self._native_lock.release` but not yet called it?

> OK, point taken. Please give me a couple of days to at least come up with a summary document.

No hurry.

> I still don't like your solution because it works directly with frames. With the upcoming PyPy support of python 3, I don't think I want to lose the JIT support.
>

It's also an interesting question. I don't think it's possible to interrupt JIT'ed code at an arbitrary location.

> Ideally, as I proposed earlier, we should introduce some sort of interruption protocol -- a method 'interrupt()', perhaps with a callback.
>

On which object? Is it sys.interrupt()? Or is it thread.interrupt()?

>> you want to implement thread interruption, and that's not my point, there is another thread for that.
>
> We have two requests: the ability to safely interrupt a python function or generator (1); the ability to safely interrupt python threads (2). Both (1) and (2) share the same requirement of safe 'finally' statements. To me, both features are similar enough to come up with a single solution, rather than inventing different approaches.
>

Again, I do not propose the described point (1). I propose a way to *inspect* the stack to see whether it's safe to interrupt.

>> On Wed, Apr 4, 2012 at 3:03 AM, Greg Ewing wrote:
>>>
>>> I don't think a frame flag on its own is quite enough. You don't just want to prevent interruptions while in a finally block, you want to defer them until the finally counter gets back to zero. Making the interrupter sleep and try again in that situation is rather ugly.
>
> That's the second reason I don't like your proposal.
>
> def foo():
>     try:
>         ..
>     finally:
>         yield unlock()
>     # <--- the ideal point to interrupt foo
>
>     f = open('a', 'w')
>     # what if we interrupt it here?
>     try:
>         ..
>     finally:
>         f.close()
>

And which one fixes this problem? There is no guarantee that your timeout code hasn't interrupted at "# what if we interrupt it here?". If it's a bit less likely, that's not a real solution. Please don't present it as such.

>>> So perhaps there could also be a callback that gets invoked when the counter goes down to zero.
>>
>> Do you mean putting a callback in a frame, which gets executed at the next bytecode just like a signal handler, except it waits until the finally clause is executed?
>>
>> It would work, except it may have a light performance impact on each bytecode. But I'm not sure if it will be noticeable.
>
> That's essentially the way we currently do it. We transform the coroutine's __code__ object, turning:
>
> def a():
>     try:
>         # code1
>     finally:
>         # code2
>
> into:
>
> def a():
>     __self__ = __get_current_coroutine()
>     try:
>         # code1
>     finally:
>         __self__.enter_finally()
>         try:
>             # code2
>         finally:
>             __self__.exit_finally()
>
> 'enter_finally' and 'exit_finally' maintain the internal counter of finally blocks. If a coroutine needs to be interrupted, we check that counter. If it is 0 - throw in a special exception.
?If not - > wait till it becomes 0 and throw the exception in 'exit_finally'. > The problem is in interruption of another thread. You must inspect stack only with C code holding GIL. Implementation might be more complex, but yes, it's probably can be done, without noticeable slow down. -- Paul From tjreedy at udel.edu Wed Apr 4 21:05:34 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 04 Apr 2012 15:05:34 -0400 Subject: [Python-ideas] Python probe: execute code in isolation (subinterpreter?) and get results In-Reply-To: References: Message-ID: On 4/4/2012 4:25 AM, anatoly techtonik wrote: > Story #2: Get settings from user script > > Blender uses SCons to automate builds. SCons script is written in > Python and it uses execfile(filename, globals, locals) to read > platform specific user script with settings. Unfortunately, execfile() > is a hack that doesn't treat Python scripts the same way the > interpreter treats them - for example, globals access will throw > exceptions - http://bugs.python.org/issue14049 Please stop misrepresenting Python. I clearly explained the issue in 14049 before closing it. If you did not understand, re-read until you do. Trying to get an object via an unbound name (non-existent variable), always results in a NameError. There is nothing unique about execfile. If Blender's scons script is using execfile wrong (by passing separate globals and locals), tell them to fix *their* bug. 
--
Terry Jan Reedy

From yselivanov.ml at gmail.com Wed Apr 4 21:37:16 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 4 Apr 2012 15:37:16 -0400
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To:
References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com> <7007EA8B-09BB-46B8-A5D1-52CAB128DAB0@gmail.com>
Message-ID: <61FF0CD2-A658-4E90-A2CF-975D972774D4@gmail.com>

On 2012-04-04, at 2:44 PM, Paul Colomiets wrote:
> I have a global timeout for processing a single request. It's actually higher up in the chain of generator calls. So the dispatcher looks like:
>
> def dispatcher(self, method, args):
>     with timeout(10):
>         yield getattr(self, method)(*args)

How does it work? To what object are you actually attaching the timeout? I'm just curious now how your 'timeout' context manager works. And what's the advantage of having some "global" timeout instead of a timeout specifically bound to some coroutine?

Do you have that code publicly released somewhere? I just really want to understand how exactly your architecture works to come up with a better proposal (if there is one possible ;).

As an off-topic: it would be interesting to have various coroutine approaches and architectures listed somewhere, to understand how python programmers actually do it.
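To make the heapq approach I described earlier concrete, a stripped-down sketch (illustrative only, not our production code; all names are mine):

```python
# Illustrative sketch of a timeouts heapq: only coroutines run *explicitly*
# with a timeout are registered, so only they are candidates for
# interruption; everything they call is protected automatically by simply
# never appearing in the heap.
import heapq
import itertools

class TimeoutRegistry:
    def __init__(self):
        self._heap = []                    # (deadline, tiebreak, coroutine)
        self._counter = itertools.count()  # tiebreak: coroutines don't compare

    def add(self, coro, seconds, now):
        heapq.heappush(self._heap, (now + seconds, next(self._counter), coro))

    def pop_expired(self, now):
        """Return coroutines whose deadline has passed, earliest first."""
        expired = []
        while self._heap and self._heap[0][0] <= now:
            expired.append(heapq.heappop(self._heap)[2])
        return expired
```

The scheduler would call pop_expired() on each tick and try to interrupt each returned coroutine, postponing any that report being inside a finally block.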
> And all the local timeouts, like the timeout for a single request, are usually applied at the socket level, where the specific protocol is implemented:
>
> def redis_unlock(lock):
>     yield redis_socket.wait_write(2)  # wait two seconds
>     # TimeoutError may have been raised in wait_write()
>     cmd = ('DEL user:'+lock+'\n').encode('ascii')
>     redis_socket.write(cmd)  # should be a loop here, actually
>     yield redis_socket.wait_read(2)  # another two seconds
>     result = redis_socket.read(1024)  # here a loop too
>     assert result == 'OK\n'

So you have explicit timeouts in 'redis_unlock', but you want them to be ignored if it was called from some 'finally' block?

> So they are not interruptions. Although we don't use them much with coroutines; the global timeout per request is usually enough.

Don't really follow you here.

> But anyway, I don't see a reason to protect a single frame, because even if you have a simple mutex without coroutines you end up with:
>
> def something():
>     lock.acquire()
>     try:
>         pass
>     finally:
>         lock.release()
>
> And if the lock's implementation is something along the lines of:
>
> def release(self):
>     self._native_lock.release()
>
> How can you be sure that the interruption is not executed when the interpreter has resolved `self._native_lock.release` but not yet called it?

Is it in the context of coroutines or threads? If the former, then perhaps because you want to interrupt 'something()'? And it is a separate frame from the frame where 'release()' is running?

> It's also an interesting question. I don't think it's possible to interrupt JIT'ed code at an arbitrary location.

I guess that should really be asked on the pypy-dev mail-list, once we have a proposal.

>> That's the second reason I don't like your proposal.
>>
>> def foo():
>>     try:
>>         ..
>>     finally:
>>         yield unlock()
>>     # <--- the ideal point to interrupt foo
>>
>>     f = open('a', 'w')
>>     # what if we interrupt it here?
>>     try:
>>         ..
>>     finally:
>>         f.close()
>>
> And which one fixes this problem?
> There is no guarantee that your timeout code hasn't interrupted at "# what if we interrupt it here?". If it's a bit less likely, that's not a real solution. Please don't present it as such.

Sorry, I should have explained it in more detail. Right now we interrupt code only where we have a 'yield', a greenlet.switch(), or at the end of a finally block, not at some arbitrary opcode.

- Yury

From yselivanov.ml at gmail.com Wed Apr 4 22:07:19 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 4 Apr 2012 16:07:19 -0400
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To:
References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com> <7007EA8B-09BB-46B8-A5D1-52CAB128DAB0@gmail.com>
Message-ID: <7B674147-30DD-4ACD-A62A-3B1220994F31@gmail.com>

On 2012-04-04, at 2:44 PM, Paul Colomiets wrote:
> But anyway, I don't see a reason to protect a single frame, because even if you have a simple mutex without coroutines you end up with:

BTW, for instance, in our framework each coroutine is a special object that wraps a generator/plain function. It controls everything that the underlying generator/function yields/returns. But the actual execution, propagation of returned values and raised errors is the scheduler's job. So when you are yielding a coroutine from another coroutine, the frames are not even connected to each other, since the actual execution of the callee will be performed by the scheduler. It's not like a regular python call.

For us, having 'f_in_finally' somehow propagated would be completely useless.

I think even if it's decided to implement just your proposal, I feel that 'f_in_finally' should indicate the state of only its *own* frame.
- Yury

From paul at colomiets.name Wed Apr 4 22:43:05 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Wed, 4 Apr 2012 23:43:05 +0300
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <61FF0CD2-A658-4E90-A2CF-975D972774D4@gmail.com>
References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com> <7007EA8B-09BB-46B8-A5D1-52CAB128DAB0@gmail.com> <61FF0CD2-A658-4E90-A2CF-975D972774D4@gmail.com>
Message-ID:

Hi Yury,

On Wed, Apr 4, 2012 at 10:37 PM, Yury Selivanov wrote:
> On 2012-04-04, at 2:44 PM, Paul Colomiets wrote:
>> I have a global timeout for processing a single request. It's actually higher up in the chain of generator calls. So the dispatcher looks like:
>>
>> def dispatcher(self, method, args):
>>     with timeout(10):
>>         yield getattr(self, method)(*args)
>
> How does it work? To what object are you actually attaching the timeout?
>

There is basically a "Coroutine" object. It's actually a list of paused generators, with the top one currently running (or stopped for doing IO). It represents a stack, because there is no built-in stack for generators.

> And what's the advantage of having some "global" timeout instead of a timeout specifically bound to some coroutine?
>

We get a guaranteed time of request processing. Or, to be more precise, a guaranteed time at which a request stops processing, so we don't have a lot of coroutines hanging forever. It lets us avoid placing timeouts all over the code. Maybe your use case is very different. E.g. this pattern doesn't work well with batch processing of big data. We process many tiny user requests per second.

> Do you have that code publicly released somewhere? I just really want to understand how exactly your architecture works to come up with a better proposal (if there is one possible ;).
>

This framework does timeout handling in the described way:

https://github.com/tailhook/zorro

Although it's using greenlets. The difference is that we don't need to keep a stack in our own scheduler when using greenlets, but everything else applies.

> As an off-topic: it would be interesting to have various coroutine approaches and architectures listed somewhere, to understand how python programmers actually do it.
>

Sure :)

>> And all the local timeouts, like the timeout for a single request, are usually applied at the socket level, where the specific protocol is implemented:
>>
>> def redis_unlock(lock):
>>     yield redis_socket.wait_write(2)  # wait two seconds
>>     # TimeoutError may have been raised in wait_write()
>>     cmd = ('DEL user:'+lock+'\n').encode('ascii')
>>     redis_socket.write(cmd)  # should be a loop here, actually
>>     yield redis_socket.wait_read(2)  # another two seconds
>>     result = redis_socket.read(1024)  # here a loop too
>>     assert result == 'OK\n'
>
> So you have explicit timeouts in 'redis_unlock', but you want them to be ignored if it was called from some 'finally' block?
>

No! I'd just omit them if I wanted that. What I don't want is the interruption of `add_money`, which calls `redis_unlock` in its finally clause.

>> So they are not interruptions. Although we don't use them much with coroutines; the global timeout per request is usually enough.
>
> Don't really follow you here.
>

You may think of it as a socket with a timeout set:

socket.settimeout(2)
socket.recv(1024)

It will raise TimeoutError, which should propagate as a normal exception. As opposed to being externally interrupted, e.g. with SIGINT or SIGALRM.

>> But anyway, I don't see a reason to protect a single frame, because even if you have a simple mutex without coroutines you end up with:
>>
>> def something():
>>     lock.acquire()
>>     try:
>>         pass
>>     finally:
>>         lock.release()
>>
>> And if the lock's implementation is something along the lines of:
>>
>> def release(self):
>>     self._native_lock.release()
>>
>> How can you be sure that the interruption is not executed when the interpreter has resolved `self._native_lock.release` but not yet called it?
>
> Is it in the context of coroutines or threads?

I don't see a difference, except for the code which maintains the stack. I'd say both are a problem if you neither propagate f_in_finally nor traverse the stack (although the way of propagation may differ).

> If the former, then perhaps because you want to interrupt 'something()'?

I want to interrupt a thread. Or a "Coroutine" in the definition described above (having a stack of frames), or in greenlet's definition.

> And it is a separate frame from the frame where 'release()' is running?

Of course (how could it be inlined? :))

>>> That's the second reason I don't like your proposal.
>>>
>>> def foo():
>>>     try:
>>>         ..
>>>     finally:
>>>         yield unlock()
>>>     # <--- the ideal point to interrupt foo
>>>
>>>     f = open('a', 'w')
>>>     # what if we interrupt it here?
>>>     try:
>>>         ..
>>>     finally:
>>>         f.close()
>>>
>> And which one fixes this problem? There is no guarantee that your timeout code hasn't interrupted at "# what if we interrupt it here?". If it's a bit less likely, that's not a real solution. Please don't present it as such.
>
> Sorry, I should have explained it in more detail. Right now we interrupt code only where we have a 'yield', a greenlet.switch(), or at the end of a finally block, not at some arbitrary opcode.
>

Sure, I do something similar. But it doesn't work with threads, as they have no explicit yield or switch.

On Wed, Apr 4, 2012 at 11:07 PM, Yury Selivanov wrote:
> On 2012-04-04, at 2:44 PM, Paul Colomiets wrote:
>
>> But anyway, I don't see a reason to protect a single frame, because even if you have a simple mutex without coroutines you end up with:
>
> BTW, for instance, in our framework each coroutine is a special object that wraps a generator/plain function.
> It controls everything that the underlying generator/function yields/returns. But the actual execution, propagation of returned values and raised errors is the scheduler's job. So when you are yielding a coroutine from another coroutine, the frames are not even connected to each other, since the actual execution of the callee will be performed by the scheduler. It's not like a regular python call.
>

Same applies here. But you propagate the return value/error, right? So you can't say "frames are not connected". They aren't from the interpreter's point of view, but they are logically connected. So for example:

def a():
    yield b()

def b():
    yield

If `a().with_timeout(0.1)` is interrupted when it's waiting for the value of `b()`, will `b()` continue its execution?

> For us, having 'f_in_finally' somehow propagated would be completely useless.
>

I hope I can convince you with this email :)

> I think even if it's decided to implement just your proposal, I feel that 'f_in_finally' should indicate the state of only its *own* frame.

That was the original intention. But it requires stack traversal. Andrew proposed to propagate this flag, which is another point of view on the same thing (not sure which one to pick, though).

--
Paul

From yselivanov.ml at gmail.com Wed Apr 4 23:24:45 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 4 Apr 2012 17:24:45 -0400
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To:
References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com> <7007EA8B-09BB-46B8-A5D1-52CAB128DAB0@gmail.com> <61FF0CD2-A658-4E90-A2CF-975D972774D4@gmail.com>
Message-ID:

On 2012-04-04, at 4:43 PM, Paul Colomiets wrote:
>> How does it work? To what object are you actually attaching the timeout?
>>
>
> There is basically a "Coroutine" object.
> It's actually a list of paused generators, with the top one currently running (or stopped for doing IO). It represents a stack, because there is no built-in stack for generators.

Interesting. I decided to go with simple coroutine objects with a '_caller' pointer to maintain the stack virtually.

> This framework does timeout handling in the described way:
>
> https://github.com/tailhook/zorro
>
> Although it's using greenlets. The difference is that we don't need to keep a stack in our own scheduler when using greenlets, but everything else applies.

Are you using that particular framework (zorro)? Or some modification of it that uses generators too?

>> Sorry, I should have explained it in more detail. Right now we interrupt code only where we have a 'yield', a greenlet.switch(), or at the end of a finally block, not at some arbitrary opcode.
>>
>
> Sure, I do something similar. But it doesn't work with threads, as they have no explicit yield or switch.

OK.

> On Wed, Apr 4, 2012 at 11:07 PM, Yury Selivanov wrote:
>> On 2012-04-04, at 2:44 PM, Paul Colomiets wrote:
>>
>>> But anyway, I don't see a reason to protect a single frame, because even if you have a simple mutex without coroutines you end up with:
>>
>> BTW, for instance, in our framework each coroutine is a special object that wraps a generator/plain function. It controls everything that the underlying generator/function yields/returns. But the actual execution, propagation of returned values and raised errors is the scheduler's job. So when you are yielding a coroutine from another coroutine, the frames are not even connected to each other, since the actual execution of the callee will be performed by the scheduler. It's not like a regular python call.
>>
>
> Same applies here. But you propagate the return value/error, right? So you can't say "frames are not connected". They aren't from the interpreter's point of view, but they are logically connected.
OK, we're on the same page. '"frames are not connected" from the interpreter's point of view' essentially means that 'f_in_finally' will always be a flag related only to its own frame, right? Any 'propagation' of this flag is the responsibility of framework developers.

> So for example:
>
> def a():
>     yield b()
>
> def b():
>     yield
>
> If `a().with_timeout(0.1)` is interrupted when it's waiting for the value of `b()`, will `b()` continue its execution?

Well, in our framework, if a() is getting aborted after it has scheduled b(), but before it received the result of b(), we interrupt both of them (and those that b() might have called).

>> I think even if it's decided to implement just your proposal, I feel that 'f_in_finally' should indicate the state of only its *own* frame.
>
> That was the original intention. But it requires stack traversal. Andrew proposed to propagate this flag, which is another point of view on the same thing (not sure which one to pick, though).

Again, if coroutines' frames aren't connected on the interpreter level (it's the responsibility of a framework), about what exact propagation are you and Andrew talking (in the sole context of the patch to cpython)?

- Yury

From paul at colomiets.name Wed Apr 4 23:46:13 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Thu, 5 Apr 2012 00:46:13 +0300
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To:
References: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com> <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com> <7007EA8B-09BB-46B8-A5D1-52CAB128DAB0@gmail.com> <61FF0CD2-A658-4E90-A2CF-975D972774D4@gmail.com>
Message-ID:

Hi Yury,

On Thu, Apr 5, 2012 at 12:24 AM, Yury Selivanov wrote:
> On 2012-04-04, at 4:43 PM, Paul Colomiets wrote:
>>> How does it work? To what object are you actually attaching the timeout?
>>>
>>
>> There is basically a "Coroutine" object. It's actually a list of paused generators, with the top one currently running (or stopped for doing IO). It represents a stack, because there is no built-in stack for generators.
>
> Interesting. I decided to go with simple coroutine objects with a '_caller' pointer to maintain the stack virtually.
>

It doesn't matter. IIRC, that was to draw a tree of calls starting with the roots. But that's irrelevant to the topic of discussion.

>> This framework does timeout handling in the described way:
>>
>> https://github.com/tailhook/zorro
>>
>> Although it's using greenlets. The difference is that we don't need to keep a stack in our own scheduler when using greenlets, but everything else applies.
>
> Are you using that particular framework (zorro)? Or some modification of it that uses generators too?
>

Currently we are experimenting with greenlets and zorro. My description of yield-based coroutines is from an earlier project (unfortunately a non-public one).

> OK, we're on the same page. '"frames are not connected" from the interpreter's point of view' essentially means that 'f_in_finally' will always be a flag related only to its own frame, right? Any 'propagation' of this flag is the responsibility of framework developers.
>

Yes, that's ok for me.

>> So for example:
>>
>> def a():
>>     yield b()
>>
>> def b():
>>     yield
>>
>> If `a().with_timeout(0.1)` is interrupted when it's waiting for the value of `b()`, will `b()` continue its execution?
>
> Well, in our framework, if a() is getting aborted after it has scheduled b(), but before it received the result of b(), we interrupt both of them (and those that b() might have called).
>

Exactly! And you don't want them to be interrupted in the case where `a()` is rewritten as:

def a():
    try:
        pass
    finally:
        yield b()

Same with threads and greenlets.
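The hazard is easy to demonstrate with plain generators (an illustrative sketch, not framework code): naively throwing into a generator that is suspended inside its finally block cuts the cleanup short:

```python
# Illustrative sketch: interrupting a generator that is suspended *inside*
# its finally block cuts the cleanup short -- exactly the problem above.
def add_money_like(log):
    try:
        yield "working"
    finally:
        log.append("cleanup started")
        yield "unlocking"        # a scheduler would run redis_unlock() here
        log.append("cleanup finished")

log = []
g = add_money_like(log)
next(g)                          # -> "working"
next(g)                          # -> "unlocking"; now suspended mid-finally
try:
    g.throw(TimeoutError)        # naive interruption lands inside finally
except TimeoutError:
    pass
print(log)                       # ['cleanup started'] -- unlock never finished
```

Without something like f_in_finally, the interrupter has no way to know it is about to land in the middle of that cleanup.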
>>> I think even if it's decided to implement just your proposal, I feel >>> that 'f_in_finally' should indicate the state of only its *own* frame. >> >> That was original intention. But it requires stack traversing. Andrew >> proposed to propagate this flag, which is another point of view >> on the same thing (not sure which one to pick though) > > Again, if coroutines' frames aren't connected on the interpreter level > (it's the responsibility of a framework), about what exact propagation > are you and Andrew talking (in the sole context of the patch to cpython)? > For threads and greenlets flag can be implicitly propagated, and for yield-based coroutines f_in_finally can be made writable, so you can propagate it in you own scheduler Not sure It's good idea, just describing it. -- Paul From greg.ewing at canterbury.ac.nz Thu Apr 5 00:18:46 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 05 Apr 2012 10:18:46 +1200 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: References: <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com> Message-ID: <4F7CC8C6.1030902@canterbury.ac.nz> Paul Colomiets wrote: > Do you mean put callback in a frame, which get > executed at next bytecode just like signal handler, > except it waits until finally clause is executed? It wouldn't be in each frame -- probably it would just be a global hook that gets called whenever a finally-counter anywhere gets decremented from 1 to 0. It would be passed the relevant frame so it could decide what to do from there. I don't think it would have much performance impact, since it would only get triggered by exiting a finally block. Nothing would need to happen per bytecode or anything like that. 
-- Greg From paul at colomiets.name Thu Apr 5 00:45:42 2012 From: paul at colomiets.name (Paul Colomiets) Date: Thu, 5 Apr 2012 01:45:42 +0300 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: <4F7CC8C6.1030902@canterbury.ac.nz> References: <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com> <4F7CC8C6.1030902@canterbury.ac.nz> Message-ID: Hi Greg, On Thu, Apr 5, 2012 at 1:18 AM, Greg Ewing wrote: > Paul Colomiets wrote: > >> Do you mean put callback in a frame, which get >> executed at next bytecode just like signal handler, >> except it waits until finally clause is executed? > > > It wouldn't be in each frame -- probably it would just > be a global hook that gets called whenever a finally-counter > anywhere gets decremented from 1 to 0. It would be passed > the relevant frame so it could decide what to do from > there. > It's similar to sys.settrace() except it only executed when finally counter decremented to 0, right? Flag `f_in_finally` is still there, right? It solves my use case well. Yury, is it ok if l'll start a PEP with this idea, and when it will have some support (or be rejected), you'll come up with thread interruption proposal? 
-- Paul From yselivanov.ml at gmail.com Thu Apr 5 01:19:41 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Wed, 4 Apr 2012 19:19:41 -0400 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: References: <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com> <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com> <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com> <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com> <4F7CC8C6.1030902@canterbury.ac.nz> Message-ID: <052F12E9-BAD4-4B51-A850-261B46E8A8C8@gmail.com> On 2012-04-04, at 6:45 PM, Paul Colomiets wrote: > It's similar to sys.settrace() except it only executed when > finally counter decremented to 0, right? > Flag `f_in_finally` is still there, right? Yes, please keep it. With your current proposal, it's the only way to see if it is safe to interrupt coroutine right now, or we have to wait until the callback gets called. > Yury, is it ok if l'll start a PEP with this idea, and when > it will have some support (or be rejected), you'll come > up with thread interruption proposal? Sure, go ahead. If I'm lucky enough to come up with a better proposal I promise to shout about it loud ;) - Yury From jimjjewett at gmail.com Thu Apr 5 18:03:34 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 5 Apr 2012 12:03:34 -0400 Subject: [Python-ideas] comparison of operator.itemgetter objects In-Reply-To: <4F79737D.6030508@pearwood.info> References: <4F79737D.6030508@pearwood.info> Message-ID: On Mon, Apr 2, 2012 at 5:38 AM, Steven D'Aprano wrote: > In general, I think that having equality tests fall back on identity test is > so rarely what you actually want that sometimes I wonder why we bother. Because identity ==> equality. (There are exceptions, like NaN, but that behavior is buggy.) And for objects without a comparison function, the most commonly made comparison (e.g., as a dict key) is one where identity is desired. 
-jJ From jimjjewett at gmail.com Thu Apr 5 18:33:19 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 5 Apr 2012 12:33:19 -0400 Subject: [Python-ideas] comparison of operator.itemgetter objects In-Reply-To: References: Message-ID: On Mon, Apr 2, 2012 at 4:30 AM, Max Moroz wrote: > I think that comparing sort keys for equality works well in many useful > cases: > (a) Named function. These compare as equal only?if they are identical. If > lhs and rhs were initialized with distinct named functions, I would argue > that the programmer did not intend them to be compatible for the purpose of > binary operations, even if they happen to be identical in behavior (e.g., if > both functions return back the argument passed to them). In a well-designed > program, there is no need to duplicate the named function definition if the > two are expected to always have the same behavior. It may be that they were created as inner functions, and the reason to duplicate was either to avoid creating the function at all unless it was needed, or to keep the smaller function's logic near where it was needed. In a sense, you are already recognizing this by asking that different but equivalent functions produced by the itemgetter factor compare equal. > (c) itemgetter. Suppose a programmer passed `itemgetter('name')` as the sort > key argument to the sorted data structure's constructor. The resulting data > structures would seem incompatible for the purposes of binary operations. > This is likely to be confusing and undesirable. operator.attrgetter seems similar. > (d) lambda functions. Similarly, suppose a programmer passed `lambda x : -x` > as the sort key argument to the sorted data structure's constructor. Since > two lambda functions are not identical, they would compare as unequal. > It seems to be very easy to address the undesirable behavior described in > (c): add method __eq__() to operator.itemgetter, which would compare the > list of arguments received at initialization. 
Agreed. I think this may just be a case of someone assuming YAGNI, but if you do need it, and submit a patch, it should be OK. > It is far harder to address the undesirable behavior described in (d). If it > can be addressed at all, it would have to done in the sorted data structure > implementation, since I don't think anyone would want lambda function > comparison behavior to change. So for the purposes of this discussion, I > ignore case (d). Why not? If you really care about identity for a lambda function, then you should be using "is", and if you don't, then equivalent behavior should be enough. I would support a change to function.__eq__ (which would fall through to lambda) such that they were equal if they had the same bytecode, signature, and execution context (defaults, globals, etc). I would also support making functions and methods orderable, for more easily replicated reprs. I'm not volunteering to write the patch, at least today. > Is this a reasonable idea? Is it useful enough to be considered? Are there > any downsides I didn't think of? Caring that two functions are identical is probably even less common than sticking a function in a dict, and the "nope, these are not equal" case would get a bit slower. -jJ From tjreedy at udel.edu Thu Apr 5 21:14:28 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 05 Apr 2012 15:14:28 -0400 Subject: [Python-ideas] comparison of operator.itemgetter objects In-Reply-To: References: Message-ID: On 4/5/2012 12:33 PM, Jim Jewett wrote: > Why not? If you really care about identity for a lambda function, A 'lambda function' is simply a function whose .__name__ attribute is "". There is no difference otherwise. Hence cases '(a) function' and '(d) lambda function' (in snipped portion) are the same class and > I would support a change to function.__eq__ (which would fall through > to lambda) 'falling through' cannot happen as there is nothing other to fall through to. 
-- Terry Jan Reedy From paul at colomiets.name Fri Apr 6 23:04:28 2012 From: paul at colomiets.name (Paul Colomiets) Date: Sat, 7 Apr 2012 00:04:28 +0300 Subject: [Python-ideas] Draft PEP on protecting finally clauses Message-ID: Hi, I've finally made a PEP. Any feedback is appreciated. -- Paul PEP: XXX Title: Protecting cleanup statements from interruptions Version: $Revision$ Last-Modified: $Date$ Author: Paul Colomiets Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 06-Apr-2012 Python-Version: 3.3 Abstract ======== This PEP proposes a way to protect python code from being interrupted inside finally statement or context manager. Rationale ========= Python has two nice ways to do cleanup. One is a ``finally`` statement and the other is context manager (or ``with`` statement). Although, neither of them is protected from ``KeyboardInterrupt`` or ``generator.throw()``. For example:: lock.acquire() try: print('starting') do_someting() finally: print('finished') lock.release() If ``KeyboardInterrupt`` occurs just after ``print`` function is executed, lock will not be released. Similarly the following code using ``with`` statement is affected:: from threading import Lock class MyLock: def __init__(self): self._lock_impl = lock def __enter__(self): self._lock_impl.acquire() print("LOCKED") def __exit__(self): print("UNLOCKING") self._lock_impl.release() lock = MyLock() with lock: do_something If ``KeyboardInterrupt`` occurs near any of the ``print`` statements, lock will never be released. Coroutine Use Case ------------------ Similar case occurs with coroutines. Usually coroutine libraries want to interrupt coroutine with a timeout. There is a ``generator.throw()`` method for this use case, but there are no method to know is it currently yielded from inside a ``finally``. Example that uses yield-based coroutines follows. Code looks similar using any of the popular coroutine libraries Monocle [1]_, Bluelet [2]_, or Twisted [3]_. 
:: def run_locked() yield connection.sendall('LOCK') try: yield do_something() yield do_something_else() finally: yield connection.sendall('UNLOCK') with timeout(5): yield run_locked() In the example above ``yield something`` means pause executing current coroutine and execute coroutine ``something`` until it finished execution. So that library keeps stack of generators itself. The ``connection.sendall`` waits until socket is writable and does thing similar to what ``socket.sendall`` does. The ``with`` statement ensures that all that code is executed within 5 seconds timeout. It does so by registering a callback in main loop, which calls ``generator.throw()`` to the top-most frame in the coroutine stack when timeout happens. The ``greenlets`` extension works in similar way, except it doesn't need ``yield`` to enter new stack frame. Otherwise considerations are similar. Specification ============= Frame Flag 'f_in_cleanup' ------------------------- A new flag on frame object is proposed. It is set to ``True`` if this frame is currently in the ``finally`` suite. Internally it must be implemented as a counter of nested finally statements currently executed. The internal counter is also incremented when entering ``WITH_SETUP`` bytecode and ``WITH_CLEANUP`` bytecode, and is decremented when leaving that bytecode. This allows to protect ``__enter__`` and ``__exit__`` methods too. Function 'sys.setcleanuphook' ----------------------------- A new function for the ``sys`` module is proposed. This function sets a callback which is executed every time ``f_in_cleanup`` becomes ``False``. Callbacks gets ``frame`` as it's sole argument so it can get some evindence where it is called from. The setting is thread local and is stored inside ``PyThreadState`` structure. Inspect Module Enhancements --------------------------- Two new functions are proposed for ``inspect`` module: ``isframeincleanup`` and ``getcleanupframe``. 
``isframeincleanup`` given ``frame`` object or ``generator`` object as sole argument returns the value of ``f_in_cleanup`` attribute of a frame itself or of the ``gi_frame`` attribute of a generator. ``getcleanupframe`` given ``frame`` object as sole argument returns the innermost frame which has true value of ``f_in_cleanup`` or ``None`` if no frames in the stack has the attribute set. It starts to inspect from specified frame and walks to outer frames using ``f_back`` pointers, just like ``getouterframes`` does. Example ======= Example implementation of ``SIGINT`` handler that interrupts safely might look like:: import inspect, sys, functools def sigint_handler(sig, frame) if inspect.getcleanupframe(frame) is None: raise KeyboardInterrupt() sys.setcleanuphook(functools.partial(sigint_handler, 0)) Coroutine example is out of scope of this document, because it's implemention depends very much on a trampoline (or main loop) used by coroutine library. Unresolved Issues ================= Interruption Inside With Statement Expression --------------------------------------------- Given the statement:: with open(filename): do_something() Python can be interrupted after ``open`` is called, but before ``SETUP_WITH`` bytecode is executed. There are two possible decisions: * Protect expression inside ``with`` statement. This would need another bytecode, since currently there is no delimiter at the start of ``with`` expression * Let user write a wrapper if he considers it's important for his use-case. 
Safe wrapper code might look like the following:: class FileWrapper(object): def __init__(self, filename, mode): self.filename = filename self.mode = mode def __enter__(self): self.file = open(self.filename, self.mode) def __exit__(self): self.file.close() Alternatively it can be written using context manager:: @contextmanager def open_wrapper(filename, mode): file = open(filename, mode) try: yield file finally: file.close() This code is safe, as first part of generator (before yield) is executed inside ``WITH_SETUP`` bytecode of caller Exception Propagation --------------------- Sometimes ``finally`` block or ``__enter__/__exit__`` method can be exited with an exception. Usually it's not a problem, since more important exception like ``KeyboardInterrupt`` or ``SystemExit`` should be thrown instead. But it may be nice to be able to keep original exception inside a ``__context__`` attibute. So cleanup hook signature may grow an exception argument:: def sigint_handler(sig, frame) if inspect.getcleanupframe(frame) is None: raise KeyboardInterrupt() sys.setcleanuphook(retry_sigint) def retry_sigint(frame, exception=None): if inspect.getcleanupframe(frame) is None: raise KeyboardInterrupt() from exception .. note:: No need to have three arguments like in ``__exit__`` method since we have a ``__traceback__`` attribute in exception in Python 3.x Although, this will set ``__cause__`` for the exception, which is not exactly what's intended. So some hidden interpeter logic may be used to put ``__context__`` attribute on every exception raised in cleanup hook. Interruption Between Acquiring Resource and Try Block ----------------------------------------------------- Example from the first section is not totally safe. Let's look closer:: lock.acquire() try: do_something() finally: lock.release() There is no way it can be fixed without modifying the code. The actual fix of this code depends very much on use case. 
Usually code can be fixed using a ``with`` statement:: with lock: do_something() Although, for coroutines you usually can't use ``with`` statement because you need to ``yield`` for both aquire and release operations. So code might be rewritten as following:: try: yield lock.acquire() do_something() finally: yield lock.release() The actual lock code might need more code to support this use case, but implementation is usually trivial, like check if lock has been acquired and unlock if it is. Setting Interruption Context Inside Finally Itself -------------------------------------------------- Some coroutine libraries may need to set a timeout for the finally clause itself. For example:: try: do_something() finally: with timeout(0.5): try: yield do_slow_cleanup() finally: yield do_fast_cleanup() With current semantics timeout will either protect the whole ``with`` block or nothing at all, depending on the implementation of a library. What the author is intended is to treat ``do_slow_cleanup`` as an ordinary code, and ``do_fast_cleanup`` as a cleanup (non-interruptible one). Similar case might occur when using greenlets or tasklets. This case can be fixed by exposing ``f_in_cleanup`` as a counter, and by calling cleanup hook on each decrement. Corouting library may then remember the value at timeout start, and compare it on each hook execution. But in practice example is considered to be too obscure to take in account. Alternative Python Implementations Support ========================================== We consider ``f_in_cleanup`` and implementation detail. The actual implementation may have some fake frame-like object passed to signal handler, cleanup hook and returned from ``getcleanupframe``. The only requirement is that ``inspect`` module functions work as expected on that objects. For this reason we also allow to pass a ``generator`` object to a ``isframeincleanup`` function, this disables need to use ``gi_frame`` attribute. 
It may need to be specified that ``getcleanupframe`` must return the same object that will be passed to cleanup hook at next invocation. Alternative Names ================= Original proposal had ``f_in_finally`` flag. The original intention was to protect ``finally`` clauses. But as it grew up to protecting ``__enter__`` and ``__exit__`` methods too, the ``f_in_cleanup`` method seems better. Although ``__enter__`` method is not a cleanup routine, it at least relates to cleanup done by context managers. ``setcleanuphook``, ``isframeincleanup`` and ``getcleanupframe`` can be unobscured to ``set_cleanup_hook``, ``is_frame_in_cleanup`` and ``get_cleanup_frame``, althought they follow convention of their respective modules. Alternative Proposals ===================== Propagating 'f_in_cleanup' Flag Automatically ----------------------------------------------- This can make ``getcleanupframe`` unnecessary. But for yield based coroutines you need to propagate it yourself. Making it writable leads to somewhat unpredictable behavior of ``setcleanuphook`` Add Bytecodes 'INCR_CLEANUP', 'DECR_CLEANUP' -------------------------------------------- These bytecodes can be used to protect expression inside ``with`` statement, as well as making counter increments more explicit and easy to debug (visible inside a disassembly). Some middle ground might be chosen, like ``END_FINALLY`` and ``SETUP_WITH`` imlicitly decrements counter (``END_FINALLY`` is present at end of ``with`` suite). Although, adding new bytecodes must be considered very carefully. Expose 'f_in_cleanup' as a Counter ---------------------------------- The original intention was to expose minimum needed functionality. Although, as we consider frame flag ``f_in_cleanup`` as an implementation detail, we may expose it as a counter. Similarly, if we have a counter we may need to have cleanup hook called on every counter decrement. 
It's unlikely have much performance impact as nested finally clauses are unlikely common case. Add code object flag 'CO_CLEANUP' --------------------------------- As an alternative to set flag inside ``WITH_SETUP``, and ``WITH_CLEANUP`` bytecodes we can introduce a flag ``CO_CLEANUP``. When interpreter starts to execute code with ``CO_CLEANUP`` set, it sets ``f_in_cleanup`` for the whole function body. This flag is set for code object of ``__enter__`` and ``__exit__`` special methods. Technically it might be set on functions called ``__enter__`` and ``__exit__``. This seems to be less clear solution. It also covers the case where ``__enter__`` and ``__exit__`` are called manually. This may be accepted either as feature or as a unnecessary side-effect (unlikely as a bug). It may also impose a problem when ``__enter__`` or ``__exit__`` function are implemented in C, as there usually no frame to check for ``f_in_cleanup`` flag. Have Cleanup Callback on Frame Object Itself ---------------------------------------------- Frame may be extended to have ``f_cleanup_callback`` which is called when ``f_in_cleanup`` is reset to 0. It would help to register different callbacks to different coroutines. Despite apparent beauty. This solution doesn't add anything. As there are two primary use cases: * Set callback in signal handler. The callback is inherently single one for this case * Use single callback per loop for coroutine use case. And in almost all cases there is only one loop per thread No Cleanup Hook --------------- Original proposal included no cleanup hook specification. As there are few ways to achieve the same using current tools: * Use ``sys.settrace`` and ``f_trace`` callback. It may impose some problem to debugging, and has big performance impact (although, interrupting doesn't happen very often) * Sleep a bit more and try again. For coroutine library it's easy. For signals it may be achieved using ``alert``. 
Both methods are considered too impractical and a way to catch exit from ``finally`` statement is proposed. References ========== .. [1] Monocle https://github.com/saucelabs/monocle .. [2] Bluelet https://github.com/sampsyo/bluelet .. [3] Twisted: inlineCallbacks http://twistedmatrix.com/documents/8.1.0/api/twisted.internet.defer.html .. [4] Original discussion http://mail.python.org/pipermail/python-ideas/2012-April/014705.html Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: From victor.stinner at gmail.com Sat Apr 7 12:04:28 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Sat, 7 Apr 2012 12:04:28 +0200 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: References: Message-ID: > I'd like to propose a way to protect `finally` clauses from > interruptions (either by KeyboardInterrupt or by timeout, or any other > way). With Python 3.3, you can easily write a context manager disabling interruptions using signal.pthread_sigmask(). If a signal is send, the signal will be waiting in a queue, and the signal handler will be called when the signals are unblocked. (On some OSes, the signal handler is not called immediatly.) pthread_sigmask() only affects the current thread. If you have two threads, and you block all signals in thread A, the C signal handler will be called in the thread B. But if I remember correctly, the Python signal handler is always called in the main thread. pthread_sigmask() is not available on all platforms (e.g. not on Windows), and some OSes have a poor support of signals+threads (e.g. OpenBSD and old versions of FreeBSD). Calling pthread_sigmask() twice (enter and exit the final block) has a cost, I don't think that it should be done by default. It may also have unexpected behaviours. I prefer to make it explicit. 
-- You may hack ceval.c to not call the Python signal handler in a final block, but system calls will still be interrupted (EINTR). Victor From paul at colomiets.name Sat Apr 7 13:11:00 2012 From: paul at colomiets.name (Paul Colomiets) Date: Sat, 7 Apr 2012 14:11:00 +0300 Subject: [Python-ideas] Protecting finally clauses of interruptions In-Reply-To: References: Message-ID: Hi Victor, On Sat, Apr 7, 2012 at 1:04 PM, Victor Stinner wrote: >> I'd like to propose a way to protect `finally` clauses from >> interruptions (either by KeyboardInterrupt or by timeout, or any other >> way). > > With Python 3.3, you can easily write a context manager disabling > interruptions using signal.pthread_sigmask(). If a signal is send, the > signal will be waiting in a queue, and the signal handler will be > called when the signals are unblocked. (On some OSes, the signal > handler is not called immediatly.) > And now you need to patch every library which happens to use `finally` statement, to make it work. Doesn't seem to be realistic. > > You may hack ceval.c to not call the Python signal handler in a final > block, but system calls will still be interrupted (EINTR). > This is not a problem for networking IO as it is always prepared for EINTR, and posix mutexes never return EINTR. So for the primary use-cases it's ok. But at least I'll add this consideration to the PEP. -- Paul From andrew.svetlov at gmail.com Sat Apr 7 23:08:50 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Sun, 8 Apr 2012 00:08:50 +0300 Subject: [Python-ideas] Draft PEP on protecting finally clauses In-Reply-To: References: Message-ID: I've published this PEP as PEP-419: http://www.python.org/dev/peps/pep-0419/ Thank you, Paul. On Sat, Apr 7, 2012 at 12:04 AM, Paul Colomiets wrote: > Hi, > > I've finally made a PEP. Any feedback is appreciated. 
> > -- > Paul > > > PEP: XXX > Title: Protecting cleanup statements from interruptions > Version: $Revision$ > Last-Modified: $Date$ > Author: Paul Colomiets > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 06-Apr-2012 > Python-Version: 3.3 > > > Abstract > ======== > > This PEP proposes a way to protect python code from being interrupted inside > finally statement or context manager. > > > Rationale > ========= > > Python has two nice ways to do cleanup. One is a ``finally`` statement > and the other is context manager (or ``with`` statement). Although, > neither of them is protected from ``KeyboardInterrupt`` or > ``generator.throw()``. For example:: > > ? ?lock.acquire() > ? ?try: > ? ? ? ?print('starting') > ? ? ? ?do_someting() > ? ?finally: > ? ? ? ?print('finished') > ? ? ? ?lock.release() > > If ``KeyboardInterrupt`` occurs just after ``print`` function is > executed, lock will not be released. Similarly the following code > using ``with`` statement is affected:: > > ? ?from threading import Lock > > ? ?class MyLock: > > ? ? ? ?def __init__(self): > ? ? ? ? ? ?self._lock_impl = lock > > ? ? ? ?def __enter__(self): > ? ? ? ? ? ?self._lock_impl.acquire() > ? ? ? ? ? ?print("LOCKED") > > ? ? ? ?def __exit__(self): > ? ? ? ? ? ?print("UNLOCKING") > ? ? ? ? ? ?self._lock_impl.release() > > ? ?lock = MyLock() > ? ?with lock: > ? ? ? ?do_something > > If ``KeyboardInterrupt`` occurs near any of the ``print`` statements, > lock will never be released. > > > Coroutine Use Case > ------------------ > > Similar case occurs with coroutines. Usually coroutine libraries want > to interrupt coroutine with a timeout. There is a > ``generator.throw()`` method for this use case, but there are no > method to know is it currently yielded from inside a ``finally``. > > Example that uses yield-based coroutines follows. Code looks > similar using any of the popular coroutine libraries Monocle [1]_, > Bluelet [2]_, or Twisted [3]_. :: > > ? 
?def run_locked() > ? ? ? ?yield connection.sendall('LOCK') > ? ? ? ?try: > ? ? ? ? ? ?yield do_something() > ? ? ? ? ? ?yield do_something_else() > ? ? ? ?finally: > ? ? ? ? ? ?yield connection.sendall('UNLOCK') > > ? ?with timeout(5): > ? ? ? ?yield run_locked() > > In the example above ``yield something`` means pause executing current > coroutine and execute coroutine ``something`` until it finished > execution. So that library keeps stack of generators itself. The > ``connection.sendall`` waits until socket is writable and does thing > similar to what ``socket.sendall`` does. > > The ``with`` statement ensures that all that code is executed within 5 > seconds timeout. It does so by registering a callback in main loop, > which calls ``generator.throw()`` to the top-most frame in the > coroutine stack when timeout happens. > > The ``greenlets`` extension works in similar way, except it doesn't > need ``yield`` to enter new stack frame. Otherwise considerations are > similar. > > > Specification > ============= > > Frame Flag 'f_in_cleanup' > ------------------------- > > A new flag on frame object is proposed. It is set to ``True`` if this > frame is currently in the ``finally`` suite. ?Internally it must be > implemented as a counter of nested finally statements currently > executed. > > The internal counter is also incremented when entering ``WITH_SETUP`` > bytecode and ``WITH_CLEANUP`` bytecode, and is decremented when > leaving that bytecode. This allows to protect ``__enter__`` and > ``__exit__`` methods too. > > > Function 'sys.setcleanuphook' > ----------------------------- > > A new function for the ``sys`` module is proposed. This function sets > a callback which is executed every time ``f_in_cleanup`` becomes > ``False``. Callbacks gets ``frame`` as it's sole argument so it can > get some evindence where it is called from. > > The setting is thread local and is stored inside ``PyThreadState`` > structure. 
>
>
> Inspect Module Enhancements
> ---------------------------
>
> Two new functions are proposed for the ``inspect`` module:
> ``isframeincleanup`` and ``getcleanupframe``.
>
> ``isframeincleanup``, given a ``frame`` object or a ``generator``
> object as its sole argument, returns the value of the ``f_in_cleanup``
> attribute of the frame itself or of the ``gi_frame`` attribute of the
> generator.
>
> ``getcleanupframe``, given a ``frame`` object as its sole argument,
> returns the innermost frame which has a true value of
> ``f_in_cleanup``, or ``None`` if no frame in the stack has the
> attribute set. It starts to inspect from the specified frame and walks
> to outer frames using ``f_back`` pointers, just like
> ``getouterframes`` does.
>
>
> Example
> =======
>
> An example implementation of a ``SIGINT`` handler that interrupts
> safely might look like::
>
>     import inspect, sys, functools
>
>     def sigint_handler(sig, frame):
>         if inspect.getcleanupframe(frame) is None:
>             raise KeyboardInterrupt()
>         sys.setcleanuphook(functools.partial(sigint_handler, 0))
>
> A coroutine example is out of the scope of this document, because its
> implementation depends very much on the trampoline (or main loop) used
> by the coroutine library.
>
>
> Unresolved Issues
> =================
>
> Interruption Inside With Statement Expression
> ---------------------------------------------
>
> Given the statement::
>
>     with open(filename):
>         do_something()
>
> Python can be interrupted after ``open`` is called, but before the
> ``SETUP_WITH`` bytecode is executed. There are two possible decisions:
>
> * Protect the expression inside the ``with`` statement. This would
>   need another bytecode, since currently there is no delimiter at the
>   start of the ``with`` expression.
>
> * Let the user write a wrapper if they consider it important for their
>   use case. Safe wrapper code might look like the following::
>
>     class FileWrapper(object):
>
>         def __init__(self, filename, mode):
>             self.filename = filename
>             self.mode = mode
>
>         def __enter__(self):
>             self.file = open(self.filename, self.mode)
>             return self.file
>
>         def __exit__(self, exc_type, exc_value, traceback):
>             self.file.close()
>
>   Alternatively it can be written using a context manager::
>
>     from contextlib import contextmanager
>
>     @contextmanager
>     def open_wrapper(filename, mode):
>         file = open(filename, mode)
>         try:
>             yield file
>         finally:
>             file.close()
>
>   This code is safe, as the first part of the generator (before
>   ``yield``) is executed inside the ``SETUP_WITH`` bytecode of the
>   caller.
>
>
> Exception Propagation
> ---------------------
>
> Sometimes a ``finally`` block or an ``__enter__``/``__exit__`` method
> can be exited with an exception. Usually it's not a problem, since a
> more important exception like ``KeyboardInterrupt`` or ``SystemExit``
> should be raised instead. But it may be nice to be able to keep the
> original exception inside a ``__context__`` attribute. So the cleanup
> hook signature may grow an exception argument::
>
>     def sigint_handler(sig, frame):
>         if inspect.getcleanupframe(frame) is None:
>             raise KeyboardInterrupt()
>         sys.setcleanuphook(retry_sigint)
>
>     def retry_sigint(frame, exception=None):
>         if inspect.getcleanupframe(frame) is None:
>             raise KeyboardInterrupt() from exception
>
> .. note::
>
>     There is no need for three arguments as in the ``__exit__``
>     method, since exceptions carry a ``__traceback__`` attribute in
>     Python 3.x.
>
> However, this will set ``__cause__`` for the exception, which is not
> exactly what's intended. So some hidden interpreter logic may be used
> to set the ``__context__`` attribute on every exception raised in the
> cleanup hook.
>
>
> Interruption Between Acquiring Resource and Try Block
> -----------------------------------------------------
>
> The example from the first section is not totally safe. Let's look
> closer::
>
>     lock.acquire()
>     try:
>         do_something()
>     finally:
>         lock.release()
>
> There is no way it can be fixed without modifying the code. The actual
> fix of this code depends very much on the use case.
>
> Usually the code can be fixed using a ``with`` statement::
>
>     with lock:
>         do_something()
>
> However, for coroutines you usually can't use the ``with`` statement
> because you need to ``yield`` for both the acquire and release
> operations. So the code might be rewritten as follows::
>
>     try:
>         yield lock.acquire()
>         do_something()
>     finally:
>         yield lock.release()
>
> The actual lock code might need more code to support this use case,
> but the implementation is usually trivial: check whether the lock has
> been acquired, and release it if so.
>
>
> Setting Interruption Context Inside Finally Itself
> --------------------------------------------------
>
> Some coroutine libraries may need to set a timeout for the finally
> clause itself. For example::
>
>     try:
>         do_something()
>     finally:
>         with timeout(0.5):
>             try:
>                 yield do_slow_cleanup()
>             finally:
>                 yield do_fast_cleanup()
>
> With the current semantics the timeout will either protect the whole
> ``with`` block or nothing at all, depending on the implementation of
> the library. What the author intended is to treat ``do_slow_cleanup``
> as ordinary code, and ``do_fast_cleanup`` as cleanup (a
> non-interruptible one).
>
> A similar case might occur when using greenlets or tasklets.
>
> This case can be fixed by exposing ``f_in_cleanup`` as a counter, and
> by calling the cleanup hook on each decrement. A coroutine library may
> then remember the value at timeout start, and compare it on each hook
> execution.
>
> But in practice this example is considered too obscure to take into
> account.
>
>
> Alternative Python Implementations Support
> ==========================================
>
> We consider ``f_in_cleanup`` an implementation detail.
> The actual implementation may have some fake frame-like object passed
> to the signal handler and the cleanup hook, and returned from
> ``getcleanupframe``. The only requirement is that the ``inspect``
> module functions work as expected on those objects. For this reason we
> also allow passing a ``generator`` object to the ``isframeincleanup``
> function, which removes the need to use the ``gi_frame`` attribute.
>
> It may need to be specified that ``getcleanupframe`` must return the
> same object that will be passed to the cleanup hook at its next
> invocation.
>
>
> Alternative Names
> =================
>
> The original proposal had an ``f_in_finally`` flag. The original
> intention was to protect ``finally`` clauses. But as the proposal grew
> to protecting the ``__enter__`` and ``__exit__`` methods too, the
> ``f_in_cleanup`` name seems better. Although the ``__enter__`` method
> is not a cleanup routine, it at least relates to the cleanup done by
> context managers.
>
> ``setcleanuphook``, ``isframeincleanup`` and ``getcleanupframe`` could
> be spelled more clearly as ``set_cleanup_hook``,
> ``is_frame_in_cleanup`` and ``get_cleanup_frame``, although the
> current names follow the convention of their respective modules.
>
>
> Alternative Proposals
> =====================
>
> Propagating 'f_in_cleanup' Flag Automatically
> ---------------------------------------------
>
> This could make ``getcleanupframe`` unnecessary. But for yield-based
> coroutines you need to propagate it yourself. Making it writable leads
> to somewhat unpredictable behavior of ``setcleanuphook``.
>
>
> Add Bytecodes 'INCR_CLEANUP', 'DECR_CLEANUP'
> --------------------------------------------
>
> These bytecodes could be used to protect the expression inside the
> ``with`` statement, as well as making counter increments more explicit
> and easier to debug (visible in a disassembly). Some middle ground
> might be chosen, e.g. ``END_FINALLY`` and ``SETUP_WITH`` implicitly
> decrement the counter (``END_FINALLY`` is present at the end of the
> ``with`` suite).
>
> However, adding new bytecodes must be considered very carefully.
>
>
> Expose 'f_in_cleanup' as a Counter
> ----------------------------------
>
> The original intention was to expose the minimum functionality needed.
> However, as we consider the frame flag ``f_in_cleanup`` an
> implementation detail, we may expose it as a counter.
>
> Similarly, if we have a counter, we may need to have the cleanup hook
> called on every counter decrement. This is unlikely to have much
> performance impact, as nested ``finally`` clauses are not a common
> case.
>
>
> Add code object flag 'CO_CLEANUP'
> ---------------------------------
>
> As an alternative to setting the flag inside the ``SETUP_WITH`` and
> ``WITH_CLEANUP`` bytecodes, we can introduce a flag ``CO_CLEANUP``.
> When the interpreter starts to execute code with ``CO_CLEANUP`` set,
> it sets ``f_in_cleanup`` for the whole function body. This flag would
> be set for the code objects of the ``__enter__`` and ``__exit__``
> special methods. Technically it might be set on any function called
> ``__enter__`` or ``__exit__``.
>
> This seems to be a less clear solution. It also covers the case where
> ``__enter__`` and ``__exit__`` are called manually. This may be
> regarded either as a feature or as an unnecessary side effect (though
> unlikely as a bug).
>
> It may also pose a problem when the ``__enter__`` or ``__exit__``
> functions are implemented in C, as there is usually no frame to check
> for the ``f_in_cleanup`` flag.
>
>
> Have Cleanup Callback on Frame Object Itself
> --------------------------------------------
>
> The frame may be extended to have an ``f_cleanup_callback`` attribute
> which is called when ``f_in_cleanup`` is reset to 0. It would help to
> register different callbacks for different coroutines.
>
> Despite its apparent beauty, this solution doesn't add anything, as
> there are two primary use cases:
>
> * Setting the callback in a signal handler. The callback is inherently
>   a single one for this case.
>
> * Using a single callback per loop for the coroutine use case.
>   And in almost all cases there is only one loop per thread.
>
>
> No Cleanup Hook
> ---------------
>
> The original proposal included no cleanup hook specification, as there
> are a few ways to achieve the same using current tools:
>
> * Use ``sys.settrace`` and the ``f_trace`` callback. It may cause some
>   problems for debugging, and has a big performance impact (although
>   interruption doesn't happen very often).
>
> * Sleep a bit more and try again. For a coroutine library it's easy.
>   For signals it may be achieved using ``alarm``.
>
> Both methods are considered too impractical, so a way to catch the
> exit from a ``finally`` statement is proposed.
>
>
> References
> ==========
>
> .. [1] Monocle
>    https://github.com/saucelabs/monocle
>
> .. [2] Bluelet
>    https://github.com/sampsyo/bluelet
>
> .. [3] Twisted: inlineCallbacks
>    http://twistedmatrix.com/documents/8.1.0/api/twisted.internet.defer.html
>
> .. [4] Original discussion
>    http://mail.python.org/pipermail/python-ideas/2012-April/014705.html
>
>
> Copyright
> =========
>
> This document has been placed in the public domain.
>
>
>
> ..
>    Local Variables:
>    mode: indented-text
>    indent-tabs-mode: nil
>    sentence-end-double-space: t
>    fill-column: 70
>    coding: utf-8
>    End:
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

--
Thanks,
Andrew Svetlov

From andrew.svetlov at gmail.com Sat Apr 7 23:09:37 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Sun, 8 Apr 2012 00:09:37 +0300
Subject: [Python-ideas] Draft PEP on protecting finally clauses
In-Reply-To:
References:
Message-ID:

What about a reference implementation?

On Sun, Apr 8, 2012 at 12:08 AM, Andrew Svetlov wrote:
> I've published this PEP as PEP-419: http://www.python.org/dev/peps/pep-0419/
> Thank you, Paul.
>
> On Sat, Apr 7, 2012 at 12:04 AM, Paul Colomiets wrote:
>> Hi,
>>
>> I've finally made a PEP.
Any feedback is appreciated.
>>
>> --
>> Paul
>>
>>
>> PEP: XXX
>> Title: Protecting cleanup statements from interruptions
>> Version: $Revision$
>> Last-Modified: $Date$
>> Author: Paul Colomiets
>> Status: Draft
>> Type: Standards Track
>> Content-Type: text/x-rst
>> Created: 06-Apr-2012
>> Python-Version: 3.3
>>
>>
>> Abstract
>> ========
>>
>> This PEP proposes a way to protect Python code from being interrupted
>> inside a ``finally`` statement or a context manager.
>>
>>
>> Rationale
>> =========
>>
>> Python has two nice ways to do cleanup. One is the ``finally``
>> statement and the other is the context manager (or ``with``
>> statement). However, neither of them is protected from
>> ``KeyboardInterrupt`` or ``generator.throw()``. For example::
>>
>>     lock.acquire()
>>     try:
>>         print('starting')
>>         do_something()
>>     finally:
>>         print('finished')
>>         lock.release()
>>
>> If ``KeyboardInterrupt`` occurs just after the ``print`` function is
>> executed, the lock will not be released. Similarly the following code
>> using the ``with`` statement is affected::
>>
>>     from threading import Lock
>>
>>     class MyLock:
>>
>>         def __init__(self):
>>             self._lock_impl = Lock()
>>
>>         def __enter__(self):
>>             self._lock_impl.acquire()
>>             print("LOCKED")
>>
>>         def __exit__(self, exc_type, exc_value, traceback):
>>             print("UNLOCKING")
>>             self._lock_impl.release()
>>
>>     lock = MyLock()
>>     with lock:
>>         do_something()
>>
>> If ``KeyboardInterrupt`` occurs near any of the ``print`` statements,
>> the lock will never be released.
>>
>>
>> Coroutine Use Case
>> ------------------
>>
>> A similar case occurs with coroutines. Usually coroutine libraries
>> want to interrupt a coroutine with a timeout. There is a
>> ``generator.throw()`` method for this use case, but there is no way
>> to know whether the generator is currently suspended inside a
>> ``finally``.
>>
>> An example that uses yield-based coroutines follows.
>> Code looks similar using any of the popular coroutine libraries
>> Monocle [1]_, Bluelet [2]_, or Twisted [3]_. ::
>>
>>     def run_locked():
>>         yield connection.sendall('LOCK')
>>         try:
>>             yield do_something()
>>             yield do_something_else()
>>         finally:
>>             yield connection.sendall('UNLOCK')
>>
>>     with timeout(5):
>>         yield run_locked()
>>
>> In the example above ``yield something`` means: pause executing the
>> current coroutine and execute the coroutine ``something`` until it
>> finishes execution. So the library keeps a stack of generators
>> itself. The ``connection.sendall`` waits until the socket is writable
>> and does something similar to what ``socket.sendall`` does.
>>
>> The ``with`` statement ensures that all that code is executed within
>> a 5 second timeout. It does so by registering a callback in the main
>> loop, which calls ``generator.throw()`` on the top-most frame in the
>> coroutine stack when the timeout happens.
>>
>> The ``greenlets`` extension works in a similar way, except that it
>> doesn't need ``yield`` to enter a new stack frame. Otherwise the
>> considerations are similar.
>>
>>
>> Specification
>> =============
>>
>> Frame Flag 'f_in_cleanup'
>> -------------------------
>>
>> A new flag on the frame object is proposed. It is set to ``True`` if
>> this frame is currently in the ``finally`` suite. Internally it must
>> be implemented as a counter of nested finally statements currently
>> being executed.
>>
>> The internal counter is also incremented when entering the
>> ``SETUP_WITH`` bytecode and the ``WITH_CLEANUP`` bytecode, and is
>> decremented when leaving that bytecode. This allows protecting the
>> ``__enter__`` and ``__exit__`` methods too.
>>
>>
>> Function 'sys.setcleanuphook'
>> -----------------------------
>>
>> A new function for the ``sys`` module is proposed. This function sets
>> a callback which is executed every time ``f_in_cleanup`` becomes
>> ``False``.
>> The callback gets the ``frame`` as its sole argument, so it can get
>> some evidence of where it is called from.
>>
>> The setting is thread local and is stored inside the
>> ``PyThreadState`` structure.
>>
>>
>> Inspect Module Enhancements
>> ---------------------------
>>
>> Two new functions are proposed for the ``inspect`` module:
>> ``isframeincleanup`` and ``getcleanupframe``.
>>
>> ``isframeincleanup``, given a ``frame`` object or a ``generator``
>> object as its sole argument, returns the value of the
>> ``f_in_cleanup`` attribute of the frame itself or of the
>> ``gi_frame`` attribute of the generator.
>>
>> ``getcleanupframe``, given a ``frame`` object as its sole argument,
>> returns the innermost frame which has a true value of
>> ``f_in_cleanup``, or ``None`` if no frame in the stack has the
>> attribute set. It starts to inspect from the specified frame and
>> walks to outer frames using ``f_back`` pointers, just like
>> ``getouterframes`` does.
>>
>>
>> Example
>> =======
>>
>> An example implementation of a ``SIGINT`` handler that interrupts
>> safely might look like::
>>
>>     import inspect, sys, functools
>>
>>     def sigint_handler(sig, frame):
>>         if inspect.getcleanupframe(frame) is None:
>>             raise KeyboardInterrupt()
>>         sys.setcleanuphook(functools.partial(sigint_handler, 0))
>>
>> A coroutine example is out of the scope of this document, because
>> its implementation depends very much on the trampoline (or main
>> loop) used by the coroutine library.
>>
>>
>> Unresolved Issues
>> =================
>>
>> Interruption Inside With Statement Expression
>> ---------------------------------------------
>>
>> Given the statement::
>>
>>     with open(filename):
>>         do_something()
>>
>> Python can be interrupted after ``open`` is called, but before the
>> ``SETUP_WITH`` bytecode is executed. There are two possible
>> decisions:
>>
>> * Protect the expression inside the ``with`` statement.
This would need >> ?another bytecode, since currently there is no delimiter at the start >> ?of ``with`` expression >> >> * Let user write a wrapper if he considers it's important for his >> ?use-case. Safe wrapper code might look like the following:: >> >> ? ?class FileWrapper(object): >> >> ? ? ? ?def __init__(self, filename, mode): >> ? ? ? ? ? ?self.filename = filename >> ? ? ? ? ? ?self.mode = mode >> >> ? ? ? ?def __enter__(self): >> ? ? ? ? ? ?self.file = open(self.filename, self.mode) >> >> ? ? ? ?def __exit__(self): >> ? ? ? ? ? ?self.file.close() >> >> ?Alternatively it can be written using context manager:: >> >> ? ?@contextmanager >> ? ?def open_wrapper(filename, mode): >> ? ? ? ?file = open(filename, mode) >> ? ? ? ?try: >> ? ? ? ? ? ?yield file >> ? ? ? ?finally: >> ? ? ? ? ? ?file.close() >> >> ?This code is safe, as first part of generator (before yield) is >> ?executed inside ``WITH_SETUP`` bytecode of caller >> >> >> Exception Propagation >> --------------------- >> >> Sometimes ``finally`` block or ``__enter__/__exit__`` method can be >> exited with an exception. Usually it's not a problem, since more >> important exception like ``KeyboardInterrupt`` or ``SystemExit`` >> should be thrown instead. But it may be nice to be able to keep >> original exception inside a ``__context__`` attibute. So cleanup hook >> signature may grow an exception argument:: >> >> ? ?def sigint_handler(sig, frame) >> ? ? ? ?if inspect.getcleanupframe(frame) is None: >> ? ? ? ? ? ?raise KeyboardInterrupt() >> ? ? ? ?sys.setcleanuphook(retry_sigint) >> >> ? ?def retry_sigint(frame, exception=None): >> ? ? ? ?if inspect.getcleanupframe(frame) is None: >> ? ? ? ? ? ?raise KeyboardInterrupt() from exception >> >> .. note:: >> >> ? ?No need to have three arguments like in ``__exit__`` method since >> ? ?we have a ``__traceback__`` attribute in exception in Python 3.x >> >> Although, this will set ``__cause__`` for the exception, which is not >> exactly what's intended. 
So some hidden interpeter logic may be used >> to put ``__context__`` attribute on every exception raised in cleanup >> hook. >> >> >> Interruption Between Acquiring Resource and Try Block >> ----------------------------------------------------- >> >> Example from the first section is not totally safe. Let's look closer:: >> >> ? ?lock.acquire() >> ? ?try: >> ? ? ? ?do_something() >> ? ?finally: >> ? ? ? ?lock.release() >> >> There is no way it can be fixed without modifying the code. The actual >> fix of this code depends very much on use case. >> >> Usually code can be fixed using a ``with`` statement:: >> >> ? ?with lock: >> ? ? ? ?do_something() >> >> Although, for coroutines you usually can't use ``with`` statement >> because you need to ``yield`` for both aquire and release operations. >> So code might be rewritten as following:: >> >> ? ?try: >> ? ? ? ?yield lock.acquire() >> ? ? ? ?do_something() >> ? ?finally: >> ? ? ? ?yield lock.release() >> >> The actual lock code might need more code to support this use case, >> but implementation is usually trivial, like check if lock has been >> acquired and unlock if it is. >> >> >> Setting Interruption Context Inside Finally Itself >> -------------------------------------------------- >> >> Some coroutine libraries may need to set a timeout for the finally >> clause itself. For example:: >> >> ? ?try: >> ? ? ? ?do_something() >> ? ?finally: >> ? ? ? ?with timeout(0.5): >> ? ? ? ? ? ?try: >> ? ? ? ? ? ? ? ?yield do_slow_cleanup() >> ? ? ? ? ? ?finally: >> ? ? ? ? ? ? ? ?yield do_fast_cleanup() >> >> With current semantics timeout will either protect >> the whole ``with`` block or nothing at all, depending on the >> implementation of a library. What the author is intended is to treat >> ``do_slow_cleanup`` as an ordinary code, and ``do_fast_cleanup`` as a >> cleanup (non-interruptible one). >> >> Similar case might occur when using greenlets or tasklets. 
>> >> This case can be fixed by exposing ``f_in_cleanup`` as a counter, and >> by calling cleanup hook on each decrement. ?Corouting library may then >> remember the value at timeout start, and compare it on each hook >> execution. >> >> But in practice example is considered to be too obscure to take in >> account. >> >> >> Alternative Python Implementations Support >> ========================================== >> >> We consider ``f_in_cleanup`` and implementation detail. The actual >> implementation may have some fake frame-like object passed to signal >> handler, cleanup hook and returned from ``getcleanupframe``. The only >> requirement is that ``inspect`` module functions work as expected on >> that objects. For this reason we also allow to pass a ``generator`` >> object to a ``isframeincleanup`` function, this disables need to use >> ``gi_frame`` attribute. >> >> It may need to be specified that ``getcleanupframe`` must return the >> same object that will be passed to cleanup hook at next invocation. >> >> >> Alternative Names >> ================= >> >> Original proposal had ``f_in_finally`` flag. The original intention >> was to protect ``finally`` clauses. But as it grew up to protecting >> ``__enter__`` and ``__exit__`` methods too, the ``f_in_cleanup`` >> method seems better. Although ``__enter__`` method is not a cleanup >> routine, it at least relates to cleanup done by context managers. >> >> ``setcleanuphook``, ``isframeincleanup`` and ``getcleanupframe`` can >> be unobscured to ``set_cleanup_hook``, ``is_frame_in_cleanup`` and >> ``get_cleanup_frame``, althought they follow convention of their >> respective modules. >> >> >> Alternative Proposals >> ===================== >> >> Propagating 'f_in_cleanup' Flag Automatically >> ----------------------------------------------- >> >> This can make ``getcleanupframe`` unnecessary. But for yield based >> coroutines you need to propagate it yourself. 
Making it writable leads >> to somewhat unpredictable behavior of ``setcleanuphook`` >> >> >> Add Bytecodes 'INCR_CLEANUP', 'DECR_CLEANUP' >> -------------------------------------------- >> >> These bytecodes can be used to protect expression inside ``with`` >> statement, as well as making counter increments more explicit and easy >> to debug (visible inside a disassembly). Some middle ground might be >> chosen, like ``END_FINALLY`` and ``SETUP_WITH`` imlicitly decrements >> counter (``END_FINALLY`` is present at end of ``with`` suite). >> >> Although, adding new bytecodes must be considered very carefully. >> >> >> Expose 'f_in_cleanup' as a Counter >> ---------------------------------- >> >> The original intention was to expose minimum needed functionality. >> Although, as we consider frame flag ``f_in_cleanup`` as an >> implementation detail, we may expose it as a counter. >> >> Similarly, if we have a counter we may need to have cleanup hook >> called on every counter decrement. It's unlikely have much performance >> impact as nested finally clauses are unlikely common case. >> >> >> Add code object flag 'CO_CLEANUP' >> --------------------------------- >> >> As an alternative to set flag inside ``WITH_SETUP``, and >> ``WITH_CLEANUP`` bytecodes we can introduce a flag ``CO_CLEANUP``. >> When interpreter starts to execute code with ``CO_CLEANUP`` set, it >> sets ``f_in_cleanup`` for the whole function body. ?This flag is set >> for code object of ``__enter__`` and ``__exit__`` special methods. >> Technically it might be set on functions called ``__enter__`` and >> ``__exit__``. >> >> This seems to be less clear solution. It also covers the case where >> ``__enter__`` and ``__exit__`` are called manually. This may be >> accepted either as feature or as a unnecessary side-effect (unlikely >> as a bug). 
>> >> It may also impose a problem when ``__enter__`` or ``__exit__`` >> function are implemented in C, as there usually no frame to check for >> ``f_in_cleanup`` flag. >> >> >> Have Cleanup Callback on Frame Object Itself >> ---------------------------------------------- >> >> Frame may be extended to have ``f_cleanup_callback`` which is called >> when ``f_in_cleanup`` is reset to 0. It would help to register >> different callbacks to different coroutines. >> >> Despite apparent beauty. This solution doesn't add anything. As there >> are two primary use cases: >> >> * Set callback in signal handler. The callback is inherently single >> ?one for this case >> >> * Use single callback per loop for coroutine use case. And in almost >> ?all cases there is only one loop per thread >> >> >> No Cleanup Hook >> --------------- >> >> Original proposal included no cleanup hook specification. As there are >> few ways to achieve the same using current tools: >> >> * Use ``sys.settrace`` and ``f_trace`` callback. It may impose some >> ?problem to debugging, and has big performance impact (although, >> ?interrupting doesn't happen very often) >> >> * Sleep a bit more and try again. For coroutine library it's easy. For >> ?signals it may be achieved using ``alert``. >> >> Both methods are considered too impractical and a way to catch exit >> from ``finally`` statement is proposed. >> >> >> References >> ========== >> >> .. [1] Monocle >> ? https://github.com/saucelabs/monocle >> >> .. [2] Bluelet >> ? https://github.com/sampsyo/bluelet >> >> .. [3] Twisted: inlineCallbacks >> ? http://twistedmatrix.com/documents/8.1.0/api/twisted.internet.defer.html >> >> .. [4] Original discussion >> ? http://mail.python.org/pipermail/python-ideas/2012-April/014705.html >> >> >> Copyright >> ========= >> >> This document has been placed in the public domain. >> >> >> >> .. >> ? Local Variables: >> ? mode: indented-text >> ? indent-tabs-mode: nil >> ? sentence-end-double-space: t >> ? 
>>    fill-column: 70
>>    coding: utf-8
>>    End:
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>
>
>
> --
> Thanks,
> Andrew Svetlov

--
Thanks,
Andrew Svetlov

From g.brandl at gmx.net Sun Apr 8 00:45:46 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 08 Apr 2012 00:45:46 +0200
Subject: [Python-ideas] Draft PEP on protecting finally clauses
In-Reply-To:
References:
Message-ID:

On 04/07/2012 11:08 PM, Andrew Svetlov wrote:
> I've published this PEP as PEP-419: http://www.python.org/dev/peps/pep-0419/
> Thank you, Paul.
>
> On Sat, Apr 7, 2012 at 12:04 AM, Paul Colomiets wrote:
>> Hi,
>>
>> I've finally made a PEP. Any feedback is appreciated.

NB: After a PEP is checked in, it should be posted to python-dev for
general discussion.

Georg

From luoyonggang at gmail.com Sun Apr 8 09:39:56 2012
From: luoyonggang at gmail.com (Yonggang Luo)
Date: Sun, 8 Apr 2012 15:39:56 +0800
Subject: [Python-ideas] I found to detect if an object is GCed is very hard within python.
Message-ID:

For example, I write a C extension with two objects, Parent and Child,
which reference each other as members, so they are circularly
referenced. How can I use Python unittest to detect that?

2012/4/7, Alec Taylor :
> Has been withdrawn... and implemented
>
> http://www.python.org/dev/peps/pep-0274/
> --
> http://mail.python.org/mailman/listinfo/python-list
>

--
Yours sincerely,
Yonggang Luo

From phd at phdru.name Sun Apr 8 12:57:29 2012
From: phd at phdru.name (Oleg Broytman)
Date: Sun, 8 Apr 2012 14:57:29 +0400
Subject: [Python-ideas] I found to detect if an object is GCed is very hard within python.
In-Reply-To:
References:
Message-ID: <20120408105729.GA23012@iskra.aviel.ru>

Hello. We are sorry but we cannot help you.
This mailing list is to discuss new Python ideas; if you're having
problems learning, understanding or using Python, please find another
forum. Probably the python-list/comp.lang.python mailing list/news
group is the best place; there are Python developers who participate
in it; you may get a faster, and probably more complete, answer there.
See http://www.python.org/community/ for other lists/news groups/fora.

Thank you for understanding.

On Sun, Apr 08, 2012 at 03:39:56PM +0800, Yonggang Luo wrote:
> For example, I write a C extension with two objects, Parent and Child,
> which reference each other as members, so they are circularly
> referenced. How can I use Python unittest to detect that?

   I think a weak reference with a callback can help.

Oleg.
--
 Oleg Broytman            http://phdru.name/           phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.

From ubershmekel at gmail.com Mon Apr 9 16:20:22 2012
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Mon, 9 Apr 2012 17:20:22 +0300
Subject: [Python-ideas] pdb.set_trace may not seem long
Message-ID:

Proposal:

pdb.st = pdb.set_trace
-----------

I find myself typing this a lot:

    import pdb;pdb.set_trace()

It's the int3 of python. When I want to debug an exact point in the
code I use the above line.

I hope I don't come off as spoiled, it's just that import pdb;pdb.pm()
is so short that I can't help but wonder how much better my life would
be if I could do:

    import pdb;pdb.st()

What do you guys think? I know aliasing isn't cool since TSBOAPOOOWTDI
but practicality beats purity....

Yuval
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guido at python.org Mon Apr 9 17:01:11 2012
From: guido at python.org (Guido van Rossum)
Date: Mon, 9 Apr 2012 08:01:11 -0700
Subject: [Python-ideas] pdb.set_trace may not seem long
In-Reply-To:
References:
Message-ID:

I think this is unnecessary; if you find yourself typing that so much,
create a module with a one-letter name and put a bunch of one-letter
convenience functions in it. Or figure out how to create macros for
your editor.

On Mon, Apr 9, 2012 at 7:20 AM, Yuval Greenfield wrote:
> Proposal:
>
> pdb.st = pdb.set_trace
> -----------
>
> I find myself typing this a lot:
>
>     import pdb;pdb.set_trace()
>
> It's the int3 of python. When I want to debug an exact point in the code I
> use the above line.
>
> I hope I don't come off as spoiled, it's just that import pdb;pdb.pm() is so
> short that I can't help but wonder how better my life would be if I could
> do:
>
>     import pdb;pdb.st()
>
> What do you guys think? I know aliasing isn't cool since TSBOAPOOOWTDI but
> practicality beats purity....
>
> Yuval
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

--
--Guido van Rossum (python.org/~guido)

From sven at marnach.net Mon Apr 9 16:43:33 2012
From: sven at marnach.net (Sven Marnach)
Date: Mon, 9 Apr 2012 15:43:33 +0100
Subject: [Python-ideas] pdb.set_trace may not seem long
In-Reply-To:
References:
Message-ID: <20120409144333.GA24979@bagheera>

Yuval Greenfield schrieb am Mon, 09. Apr 2012, um 17:20:22 +0300:
> I find myself typing this a lot:
>
>     import pdb;pdb.set_trace()

What I do instead: I start pdb in Emacs, set a break point on the line
and run the script. This should work in any other interactive debugger
for Python, too.

If for some reason you prefer to insert the above-mentioned line into
your source code, how about defining an editor macro for this purpose,
so you could do it with a single shortcut?
Cheers,
Sven

From ned at nedbatchelder.com Mon Apr 9 18:26:17 2012
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Mon, 09 Apr 2012 12:26:17 -0400
Subject: [Python-ideas] pdb.set_trace may not seem long
In-Reply-To:
References:
Message-ID: <4F830DA9.30703@nedbatchelder.com>

On 4/9/2012 10:20 AM, Yuval Greenfield wrote:
> Proposal:
>
> pdb.st = pdb.set_trace
> -----------
>
> I find myself typing this a lot:
>
>     import pdb;pdb.set_trace()
>
> It's the int3 of python. When I want to debug an exact point in the
> code I use the above line.
>
> I hope I don't come off as spoiled, it's just that import pdb;pdb.pm()
> is so short that I can't help but wonder how better
> my life would be if I could do:
>
>     import pdb;pdb.st()
>
> What do you guys think? I know aliasing isn't cool since TSBOAPOOOWTDI
> but practicality beats purity....
>
I have exactly one abbreviation defined in my .vimrc:

    abbrev pdbxx import pdb;pdb.set_trace()

--Ned.

> Yuval
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From techtonik at gmail.com Mon Apr 9 19:45:28 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Mon, 9 Apr 2012 20:59:43 +0300
Subject: [Python-ideas] pdb.set_trace may not seem long
In-Reply-To:
References:
Message-ID:

On Mon, Apr 9, 2012 at 5:20 PM, Yuval Greenfield wrote:
> Proposal:
>
> pdb.st = pdb.set_trace
> -----------
>
> I find myself typing this a lot:
>
>     import pdb;pdb.set_trace()

How about?

    import pdb.trace

--
anatoly t.
From guido at python.org Mon Apr 9 20:13:10 2012
From: guido at python.org (Guido van Rossum)
Date: Mon, 9 Apr 2012 11:13:10 -0700
Subject: [Python-ideas] pdb.set_trace may not seem long
In-Reply-To:
References:
Message-ID:

On Mon, Apr 9, 2012 at 10:45 AM, anatoly techtonik wrote:
> On Mon, Apr 9, 2012 at 5:20 PM, Yuval Greenfield wrote:
>> Proposal:
>>
>> pdb.st = pdb.set_trace
>> -----------
>>
>> I find myself typing this a lot:
>>
>>     import pdb;pdb.set_trace()
>
> How about?
>
>     import pdb.trace

Yuck. An import intended to have a side effect. This also won't work
if pdb.trace was imported before.

--
--Guido van Rossum (python.org/~guido)

From techtonik at gmail.com Mon Apr 9 20:59:43 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Mon, 9 Apr 2012 21:59:43 +0300
Subject: [Python-ideas] pdb.set_trace may not seem long
In-Reply-To:
References:
Message-ID:

On Mon, Apr 9, 2012 at 9:13 PM, Guido van Rossum wrote:
> On Mon, Apr 9, 2012 at 10:45 AM, anatoly techtonik wrote:
>> On Mon, Apr 9, 2012 at 5:20 PM, Yuval Greenfield wrote:
>>> Proposal:
>>>
>>> pdb.st = pdb.set_trace
>>> -----------
>>>
>>> I find myself typing this a lot:
>>>
>>>     import pdb;pdb.set_trace()
>>
>> How about?
>>
>>     import pdb.trace
>
> Yuck. An import intended to have a side effect.

On the other hand, it is an import intended to debug side effects.
Syntax sugar that makes debugging in Python more intuitive. I always
land on stackoverflow when I need to recall the structure of this pdb
import call. In an ideal world it would be even something like:

    import debug.start

but of course, a builtin debug(), which calls a registered debugger
for an application or pdb by default, could be even shorter and easier
for newbies, at the cost of added magic.

> This also won't work
> if pdb.trace was imported before.

Can pdb.trace remove itself from sys.modules while being imported?
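[Editor's note: Guido's suggestion earlier in this thread, a personal module
with one-letter convenience names, can be sketched in a couple of lines. The
module name ``d`` and the chosen aliases below are illustrative choices, not
an established convention:]

```python
# d.py -- a hypothetical one-letter personal helper module
import pdb

st = pdb.set_trace  # so "import d; d.st()" drops into the debugger at the call site
pm = pdb.pm         # so "import d; d.pm()" starts a post-mortem on the last traceback
```

Dropped anywhere on ``PYTHONPATH``, this makes ``import d; d.st()`` as short
as the proposed ``pdb.st()`` without changing the stdlib.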
From victor.varvariuc at gmail.com Tue Apr 10 11:30:34 2012
From: victor.varvariuc at gmail.com (Victor Varvariuc)
Date: Tue, 10 Apr 2012 12:30:34 +0300
Subject: [Python-ideas] Improve import mechanism for circular imports
Message-ID: 

Consider the following directory structure (Python 3):

[test]
    main.py
    [tree]
        __init__.py  # empty
        branch.py
        root.py

main.py:

    print('main: Entered the module')

    print('main: from tree import root')
    from tree import root
    # this first finishes importing module 'tree.root'
    # then creates local name 'root' which references that imported module
    # Why not create local name 'root' at the same time sys.modules['tree.root'] is created?
    # If import fails:
    # - either delete the attribute 'root'
    # - or leave it - how does it affect the runtime?
    from tree import branch

    print('main: Creating a root and a leaf attached to it')
    _root = root.Root()
    _branch = branch.Branch(_root)

branch.py:

    print('tree.branch: Entered the module')

    import tree, sys
    print("tree.branch: 'root' in dir(tree) ->", 'root' in dir(tree))
    print("tree.branch: 'tree.root' in sys.modules ->", 'tree.root' in sys.modules)

    print('tree.branch: from tree import root')
    #from tree import root  # ImportError: cannot import name root
    import tree.root  # workaround
    root = sys.modules['tree.root']  # though name 'root' does not exist yet in module 'tree', sys.modules['tree.root'] already does

    print('tree.branch: Defining class branch.Branch()')

    class Branch():
        def __init__(self, _parent):
            assert isinstance(_parent, root.Root), 'Pass a `Root` instance'

root.py:

    print('tree.root: Entered the module')

    print('tree.root: from tree import branch')
    from tree import branch

    print('tree.root: Defining class Root()')

    class Root():
        def attach(self, _branch):
            assert isinstance(_branch, branch.Branch), 'Pass a `Branch` instance'
            self.branch = _branch

Running it:

vic at wic:~/projects/test$ python3 main.py
main: Entered the module
main: from tree import root
tree.root: Entered the module
tree.root: from tree import branch
tree.branch: Entered the module
tree.branch: 'root' in dir(tree) -> False
tree.branch: 'tree.root' in sys.modules -> True
tree.branch: from tree import root
tree.branch: Defining class branch.Branch()
tree.root: Defining class Root()
main: Creating a root and a leaf attached to it

So, there are circular imports in this example code.
Currently, the `root = sys.modules['tree.root']` hack in branch.py works.
Wouldn't it be useful to create attribute `root` in `main` at the same time `sys.modules['tree.root']` is created when doing `from tree import root` in main?
This would solve more complex cases when circular imports are involved, without applying such hacks.

Thank you

-- 
*Victor*
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From phd at phdru.name Tue Apr 10 11:55:51 2012
From: phd at phdru.name (Oleg Broytman)
Date: Tue, 10 Apr 2012 13:55:51 +0400
Subject: [Python-ideas] Avoid (was: Improve) circular import
In-Reply-To: 
References: 
Message-ID: <20120410095551.GC22368@iskra.aviel.ru>

Hi!

On Tue, Apr 10, 2012 at 12:30:34PM +0300, Victor Varvariuc wrote:
> Consider the following directory structure (Python 3):
>
> [test]
>     main.py
>     [tree]
>         __init__.py  # empty
>         branch.py
>         root.py
>
> branch.py:
>
>     import tree

My opinion is - restructure code to avoid circular import instead of hacking import machinery.

Why does a submodule import the entire package instead of importing just root?

Oleg.
-- 
Oleg Broytman http://phdru.name/ phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.
From victor.varvariuc at gmail.com Tue Apr 10 13:24:18 2012 From: victor.varvariuc at gmail.com (Victor Varvariuc) Date: Tue, 10 Apr 2012 14:24:18 +0300 Subject: [Python-ideas] Avoid (was: Improve) circular import In-Reply-To: <20120410095551.GC22368@iskra.aviel.ru> References: <20120410095551.GC22368@iskra.aviel.ru> Message-ID: On Tue, Apr 10, 2012 at 12:55 PM, Oleg Broytman wrote: > My opinion is - restructure code to avoid circular import instead of hacking import machinery. It's not feasible sometimes. See: http://stackoverflow.com/questions/1556387/circular-import-dependency-in-python Yes, they could be considered the same package. But if this results in a massively huge file then it's impractical. I agree that frequently, circular dependencies mean the design should be thought through again. But there ARE some design patterns where it's appropriate (and where merging the files together would result in a huge file) so I think it's dogmatic to say that the packages should either be combined or the design should be re-evaluated. ? Matthew Lund Dec 1 '11 at 21:49 > Why does a submodule import the entire package instead of importing just root? import tree, sys print("tree.branch: 'root' in dir(tree) ->", 'root' in dir(tree)) print("tree.branch: 'tree.root' in sys.modules ->", 'tree.root' in sys.modules) Ignore this part - my fault. -- *Victor* -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.varvariuc at gmail.com Tue Apr 10 13:30:00 2012 From: victor.varvariuc at gmail.com (Victor Varvariuc) Date: Tue, 10 Apr 2012 14:30:00 +0300 Subject: [Python-ideas] Improve circular import Message-ID: On Tue, Apr 10, 2012 at 12:55 PM, Oleg Broytman wrote: > Why does a submodule import the entire package instead of importing just root? import tree, sys print("tree.branch: 'root' in dir(tree) ->", 'root' in dir(tree)) print("tree.branch: 'tree.root' in sys.modules ->", 'tree.root' in sys.modules) Ignore this part - my fault. 
It should have been: import sys print("tree.branch: 'root' in dir(main) ->", 'root' in dir(sys.modules['__main__'])) print("tree.branch: 'tree.root' in sys.modules ->", 'tree.root' in sys.modules) This shows that `root` attribute does not exist yet in main, though 'tree.root' exists in `sys.modules`. -- *Victor* -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Tue Apr 10 13:57:11 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 10 Apr 2012 12:57:11 +0100 Subject: [Python-ideas] Improve import mechanism for circular imports In-Reply-To: References: Message-ID: On 10 April 2012 10:30, Victor Varvariuc wrote: > There are circular imports in this example code. > Currently, `root = sys.modules['tree.root']` hack in branch.py works. > Wouldn't it be useful to create attribute `root` in `main` at the same time > `sys.modules['tree.root']` is created when doing `from tree import root` in > main? > This would solve more complex cases with when circular imports are involved, > without applying such hacks. Why does "tree" even exist? Why not just have 2 top-level modules "root" and "branch". ("It's only an example" isn't a good answer - it isn't a good example if it doesn't demonstrate why "tree" is needed for your real use case). Can you explain the purpose of "tree" in your real code? Also, having things happen at import time (as opposed to simply defining classes, functions, etc) is not good form, precisely because partially-imported modules are in an odd state, and problems like this can easily arise if you don't know what you are doing. Instead of giving a made-up example, if you describe what you are trying to achieve, I'm fairly certain someone here (or more likely on python-list) could show you a better way to do it, without needing circular imports. Paul. 
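For readers who want to poke at the behaviour under discussion, the `sys.modules` workaround from the original post can be reproduced self-contained. The sketch below writes a throwaway two-module package to a temporary directory; the layout mirrors the example earlier in the thread, and all names are illustrative:

```python
import os
import sys
import tempfile
import textwrap

# Build a tiny package with a circular import on disk.
pkg_parent = tempfile.mkdtemp()
os.mkdir(os.path.join(pkg_parent, "tree"))
sources = {
    "__init__.py": "",
    "root.py": """\
        from tree import branch   # starts the cycle
        class Root:
            pass
    """,
    "branch.py": """\
        import sys
        import tree.root                   # plain import works mid-cycle
        root = sys.modules['tree.root']    # the workaround: grab the partially
                                           # initialized module by name
        class Branch:
            pass
    """,
}
for name, body in sources.items():
    with open(os.path.join(pkg_parent, "tree", name), "w") as f:
        f.write(textwrap.dedent(body))

sys.path.insert(0, pkg_parent)
from tree import root   # triggers root -> branch -> (partial) root

print(root.Root)
print(sys.modules["tree.branch"].root is sys.modules["tree.root"])  # -> True
```

The key point is that `sys.modules['tree.root']` exists (in a partially initialized state) before `tree.root` finishes executing, which is exactly what `import tree.root` plus the dictionary lookup exploits.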
From phd at phdru.name Tue Apr 10 14:03:15 2012 From: phd at phdru.name (Oleg Broytman) Date: Tue, 10 Apr 2012 16:03:15 +0400 Subject: [Python-ideas] Avoid circular import In-Reply-To: References: <20120410095551.GC22368@iskra.aviel.ru> Message-ID: <20120410120315.GD22368@iskra.aviel.ru> On Tue, Apr 10, 2012 at 02:24:18PM +0300, Victor Varvariuc wrote: > On Tue, Apr 10, 2012 at 12:55 PM, Oleg Broytman wrote: > > > My opinion is - restructure code to avoid circular import instead of > hacking import machinery. > > It's not feasible sometimes. > > See: > http://stackoverflow.com/questions/1556387/circular-import-dependency-in-python I don't see anything interesting there. Without deeper knowledge about the code I'd recommend to import b.d in a/__init__.py before importing c. > Yes, they could be considered the same package. But if this results in a > massively huge file then it's impractical. I agree that frequently, > circular dependencies mean the design should be thought through again. But > there ARE some design patterns where it's appropriate (and where merging > the files together would result in a huge file) so I think it's dogmatic to > say that the packages should either be combined or the design should be > re-evaluated. ? Matthew Lund Dec 1 '11 at 21:49 I didn't and do not recommend merging code into one huge module. Call me dogmatic but I recommend to refactor and move common parts to avoid circular import. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. 
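Besides restructuring, the other common escape hatch is to delay one of the imports to function level, so it only runs once both modules are fully initialized. A minimal simulation using synthetic modules (the module names here are invented purely for illustration):

```python
import sys
import types

# Stand-ins for two mutually dependent modules, built without files.
mod_a = types.ModuleType("mod_a")
mod_b = types.ModuleType("mod_b")
sys.modules["mod_a"] = mod_a
sys.modules["mod_b"] = mod_b

def helper():
    # Deferred import: resolved at call time, when mod_a is complete,
    # so it cannot fail the way a top-level circular import can.
    import mod_a
    return mod_a.VALUE * 2

mod_b.helper = helper
mod_a.VALUE = 21   # mod_a finishes initializing after mod_b's function exists

print(mod_b.helper())  # -> 42
```

Function-level imports are cheap after the first call (just a `sys.modules` lookup), but as the rest of the thread notes, not everyone likes them stylistically.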
From techtonik at gmail.com Tue Apr 10 14:35:16 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Tue, 10 Apr 2012 15:35:16 +0300 Subject: [Python-ideas] Avoid circular import In-Reply-To: <20120410120315.GD22368@iskra.aviel.ru> References: <20120410095551.GC22368@iskra.aviel.ru> <20120410120315.GD22368@iskra.aviel.ru> Message-ID: On Tue, Apr 10, 2012 at 3:03 PM, Oleg Broytman wrote: > On Tue, Apr 10, 2012 at 02:24:18PM +0300, Victor Varvariuc wrote: >> On Tue, Apr 10, 2012 at 12:55 PM, Oleg Broytman wrote: >> >> > My opinion is - restructure code to avoid circular import instead of >> hacking import machinery. >> >> It's not feasible sometimes. JFY here is an example how to make it in general case. http://codereview.appspot.com/5449109/ >> See: >> http://stackoverflow.com/questions/1556387/circular-import-dependency-in-python Do you really fighting with this specific case? There is a solution to that with delayed import. From ncoghlan at gmail.com Tue Apr 10 15:45:04 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 10 Apr 2012 23:45:04 +1000 Subject: [Python-ideas] Improve import mechanism for circular imports In-Reply-To: References: Message-ID: On Tue, Apr 10, 2012 at 9:57 PM, Paul Moore wrote: > Instead of giving a made-up example, if you describe what you are > trying to achieve, I'm fairly certain someone here (or more likely on > python-list) could show you a better way to do it, without needing > circular imports. There isn't actually a *strong* philosophical objection to improving the circular import support. It's just a sufficiently hard problem that the rote answer is "nobody has cared enough about the problem to come up with a fix that works properly, is backwards compatible and doesn't hurt the performance of regular imports". 
The relevant tracker issue is http://bugs.python.org/issue992389
(yes, that issue is approaching its 8th birthday later this year)

With import.c going away soon (courtesy of the migration to importlib
as the main import implementation), it may become easier to devise a
solution (or at least generate a better error message).

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From sven at marnach.net Tue Apr 10 15:56:36 2012
From: sven at marnach.net (Sven Marnach)
Date: Tue, 10 Apr 2012 14:56:36 +0100
Subject: [Python-ideas] Avoid circular import
In-Reply-To: <20120410120315.GD22368@iskra.aviel.ru>
References: <20120410095551.GC22368@iskra.aviel.ru> <20120410120315.GD22368@iskra.aviel.ru>
Message-ID: <20120410135636.GA30763@bagheera>

Oleg Broytman schrieb am Tue, 10. Apr 2012, um 16:03:15 +0400:
> I didn't and do not recommend merging code into one huge module. Call
> me dogmatic but I recommend to refactor and move common parts to avoid
> circular import.

I actually did run into cases where this was not possible. (I just
wrote a lengthy description of such a case, but I figured it wasn't
too helpful, so it's not included here.)

A point to consider is that there are cases of circular imports that
used to work fine with implicit relative imports. When using explicit
relative imports though, they would stop working -- see [1] for a
minimal example demonstrating this problem.

[1]: http://stackoverflow.com/questions/6351805/cyclic-module-dependencies-and-relative-imports-in-python

This issue can be easily overcome with function-level imports, but
some people don't like function-level imports either.

The same issue turned up when porting the Python Imaging Library to
Python 3. PIL uses implicit relative, circular imports which have to
be turned into function-level imports to work properly on Python 3,
see [2] for details.
[2]: https://github.com/sloonz/pil-py3k/pull/2

Cheers,
Sven

From phd at phdru.name Tue Apr 10 16:11:05 2012
From: phd at phdru.name (Oleg Broytman)
Date: Tue, 10 Apr 2012 18:11:05 +0400
Subject: [Python-ideas] Avoid circular import
In-Reply-To: <20120410135636.GA30763@bagheera>
References: <20120410095551.GC22368@iskra.aviel.ru> <20120410120315.GD22368@iskra.aviel.ru> <20120410135636.GA30763@bagheera>
Message-ID: <20120410141105.GA30439@iskra.aviel.ru>

On Tue, Apr 10, 2012 at 02:56:36PM +0100, Sven Marnach wrote:
> This issue can be easily overcome with function-level imports, but
> some people don't like function-level imports either.

Can I say I doubt it's a good reason to change Python?

> The same issue turned up when porting the Python Imaging Library to
> Python 3. PIL uses implicit relative, circular imports which have to
> be turned into function-level imports to work properly on Python 3,
> see [2] for details.
>
> [2]: https://github.com/sloonz/pil-py3k/pull/2

Was there any major problem in fixing that?

Oleg.
-- 
Oleg Broytman http://phdru.name/ phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From sven at marnach.net Tue Apr 10 16:53:01 2012
From: sven at marnach.net (Sven Marnach)
Date: Tue, 10 Apr 2012 15:53:01 +0100
Subject: [Python-ideas] Avoid circular import
In-Reply-To: <20120410141105.GA30439@iskra.aviel.ru>
References: <20120410095551.GC22368@iskra.aviel.ru> <20120410120315.GD22368@iskra.aviel.ru> <20120410135636.GA30763@bagheera> <20120410141105.GA30439@iskra.aviel.ru>
Message-ID: <20120410145301.GB30763@bagheera>

Oleg Broytman schrieb am Tue, 10. Apr 2012, um 18:11:05 +0400:
> > This issue can be easily overcome with function-level imports, but
> > some people don't like function-level imports either.
>
> Can I say I doubt it's a good reason to change Python?

Sorry, I actually didn't mean to argue in favour of any proposal.
I just meant to point out issues relevant to this thread that some
people probably are not aware of.

> > The same issue turned up when porting the Python Imaging Library to
> > Python 3. PIL uses implicit relative, circular imports which have to
> > be turned into function-level imports to work properly on Python 3,
> > see [2] for details.
> >
> > [2]: https://github.com/sloonz/pil-py3k/pull/2
>
> Was there any major problem in fixing that?

This was meant as an example that these issues arise in practice, even
in libraries that can hardly be considered obscure.

Cheers,
Sven

From steven.samuel.cole at gmail.com Tue Apr 10 17:22:29 2012
From: steven.samuel.cole at gmail.com (Steven Samuel Cole)
Date: Wed, 11 Apr 2012 01:22:29 +1000
Subject: [Python-ideas] all, any - why no none ?
Message-ID: <4F845035.4020907@gmail.com>

hello,

i'm aware they've been around for quite a while, but for some reason, i
did not have the builtins all(seq) and any(seq) on the radar thus far.
when i used them today, i just assumed there was a corresponding
none(seq), but was surprised to learn that is not true.

why is that ? has this been considered ? discussed ? dismissed ? i did
search, but the net being neither case-sensitive nor semantic, the
results were off topic.

sure, i can do not all or not any or any of these:
http://stackoverflow.com/q/6518394/217844
http://stackoverflow.com/q/3583860/217844
but none(seq): True if all items in seq are None
would IMHO be the pythonic way to do this.

what do you think ?

# ssc

From anacrolix at gmail.com Tue Apr 10 17:44:04 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Tue, 10 Apr 2012 23:44:04 +0800
Subject: [Python-ideas] all, any - why no none ?
In-Reply-To: <4F845035.4020907@gmail.com>
References: <4F845035.4020907@gmail.com>
Message-ID: 

Try not any().
On Apr 10, 2012 11:23 PM, "Steven Samuel Cole" wrote: > hello, > > i'm aware they've been around for quite a while, but for some reason, i > did not have the builtins all(seq) and any(seq) on the radar thus far. when > i used them today, i just assumed there was a corresponding none(seq), but > was surprised to learn that is not true. > > why is that ? has this been considered ? discussed ? dismissed ? i did > search, but the net being neither case-sensitive nor semantic, the results > were off topic. > > sure, i can do not all or not any or any of these: > http://stackoverflow.com/q/**6518394/217844 > http://stackoverflow.com/q/**3583860/217844 > but none(seq): True if all items in seq are None > would IMHO be the pythonic way to do this. > > what do you think ? > > # ssc > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Apr 10 17:45:22 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 11 Apr 2012 01:45:22 +1000 Subject: [Python-ideas] all, any - why no none ? In-Reply-To: <4F845035.4020907@gmail.com> References: <4F845035.4020907@gmail.com> Message-ID: On Wed, Apr 11, 2012 at 1:22 AM, Steven Samuel Cole wrote: > hello, > > i'm aware they've been around for quite a while, but for some reason, i did > not have the builtins all(seq) and any(seq) on the radar thus far. when i > used them today, i just assumed there was a corresponding none(seq), but was > surprised to learn that is not true. > > why is that ? has this been considered ? discussed ? dismissed ? i did > search, but the net being neither case-sensitive nor semantic, the results > were off topic. 
> > sure, i can do not all or not any or any of these: > http://stackoverflow.com/q/6518394/217844 > http://stackoverflow.com/q/3583860/217844 > but none(seq): True if all items in seq are None > what do you think ? I think it doesn't come up often enough to be worth special casing over the more explicit "all(x is None for x in seq)". Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From bruce at leapyear.org Tue Apr 10 17:46:18 2012 From: bruce at leapyear.org (Bruce Leban) Date: Tue, 10 Apr 2012 08:46:18 -0700 Subject: [Python-ideas] all, any - why no none ? In-Reply-To: <4F845035.4020907@gmail.com> References: <4F845035.4020907@gmail.com> Message-ID: If you really mean "if all items in seq are None" then all(x is None for x in seq) is very clear and explicit that you don't mean all(x == None for x in seq) which is not exactly the same thing. If you don't care about exactly being None and you just want falsenes, then not any(seq) works already. If I saw none(seq) I would think it meant "none of seq is true" as that is a more common phrase. You'd need a name like all_none(seq). But then I want any_none() and none_none() too. And all_true() and all_false(), etc. Not enough value here. --- Bruce Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com On Tue, Apr 10, 2012 at 8:22 AM, Steven Samuel Cole < steven.samuel.cole at gmail.com> wrote: > hello, > > i'm aware they've been around for quite a while, but for some reason, i > did not have the builtins all(seq) and any(seq) on the radar thus far. when > i used them today, i just assumed there was a corresponding none(seq), but > was surprised to learn that is not true. > > why is that ? has this been considered ? discussed ? dismissed ? i did > search, but the net being neither case-sensitive nor semantic, the results > were off topic. 
> > sure, i can do not all or not any or any of these: > http://stackoverflow.com/q/**6518394/217844 > http://stackoverflow.com/q/**3583860/217844 > but none(seq): True if all items in seq are None > would IMHO be the pythonic way to do this. > > what do you think ? > > # ssc > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From oscar.j.benjamin at gmail.com Tue Apr 10 17:49:01 2012 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Tue, 10 Apr 2012 16:49:01 +0100 Subject: [Python-ideas] all, any - why no none ? In-Reply-To: <4F845035.4020907@gmail.com> References: <4F845035.4020907@gmail.com> Message-ID: none looks similar to None. The code below rightly gives a NameError. If none were a builtin function, not only would it allow the bug below but it would evaluate to True. >>> def f(x=none): ... if x: ... do_stuff() ... Traceback (most recent call last): File "", line 1, in NameError: name 'none' is not defined On Tue, Apr 10, 2012 at 4:22 PM, Steven Samuel Cole < steven.samuel.cole at gmail.com> wrote: > hello, > > i'm aware they've been around for quite a while, but for some reason, i > did not have the builtins all(seq) and any(seq) on the radar thus far. when > i used them today, i just assumed there was a corresponding none(seq), but > was surprised to learn that is not true. > > why is that ? has this been considered ? discussed ? dismissed ? i did > search, but the net being neither case-sensitive nor semantic, the results > were off topic. > > sure, i can do not all or not any or any of these: > http://stackoverflow.com/q/**6518394/217844 > http://stackoverflow.com/q/**3583860/217844 > but none(seq): True if all items in seq are None > would IMHO be the pythonic way to do this. > > what do you think ? 
> > # ssc > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ubershmekel at gmail.com Tue Apr 10 17:50:59 2012 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Tue, 10 Apr 2012 18:50:59 +0300 Subject: [Python-ideas] all, any - why no none ? In-Reply-To: <4F845035.4020907@gmail.com> References: <4F845035.4020907@gmail.com> Message-ID: On Tue, Apr 10, 2012 at 6:22 PM, Steven Samuel Cole < steven.samuel.cole at gmail.com> wrote: > .... > what do you think ? > > I don't like how a missed capitalization can be so deadly. if x is none: return assert x is not none I can see that happening a lot and people being terribly confused and annoyed. Strongly against this idea. Yuval -------------- next part -------------- An HTML attachment was scrubbed... URL: From sven at marnach.net Tue Apr 10 20:27:11 2012 From: sven at marnach.net (Sven Marnach) Date: Tue, 10 Apr 2012 19:27:11 +0100 Subject: [Python-ideas] Allow imports to a global name Message-ID: <20120410182711.GC30763@bagheera> Sometimes it is useful to do a import to a global name inside a function. A common use case is the 'pylab' module, which must be imported *after* the backend has been set using 'matplotlib.use()'. If the backend is configuration-dependent, the statement import pylab will usually be inside a function, but the module should be available globally, so you would do global pylab import pylab While this code works (at least in CPython), the current language specification forbids it [1], so the correct code should be global pylab import pylab as _pylab pylab = _pylab I don't see why we shouldn't allow the shorter version -- it is certainly easier to read. The behaviour of pylab might be considered a questionable design choice. 
I've encountered the above non-conforming, but working code out in the wild several times, though, so the language specification might as well allow it. Cheers, Sven [1]: http://docs.python.org/dev/reference/simple_stmts.html#the-global-statement From nathan.alexander.rice at gmail.com Tue Apr 10 23:10:45 2012 From: nathan.alexander.rice at gmail.com (Nathan Rice) Date: Tue, 10 Apr 2012 17:10:45 -0400 Subject: [Python-ideas] make __closure__ writable In-Reply-To: References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com> <8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com> <4F684F32.5080006@hotpy.org> <8406714B-876E-435E-84AB-716804C92387@gmail.com> Message-ID: As one of the resident crackpot/idealists I don't know that my +1 means much, but I have a decent amount of code where I jump through a lot of hoops to emulate writable __closure__, including copying a function using FunctionType(), then replace one instance of a function with another multiple places in a function graph; I also do a lot of lambda wrapping and unwrapping for the same purpose. This is primarily relevant relative to symbolic computation graphs, such as dataflow structures, computer algebra systems, etc. From ben+python at benfinney.id.au Wed Apr 11 00:34:46 2012 From: ben+python at benfinney.id.au (Ben Finney) Date: Wed, 11 Apr 2012 08:34:46 +1000 Subject: [Python-ideas] all, any - why no none ? References: <4F845035.4020907@gmail.com> Message-ID: <87ehrvgqrd.fsf@benfinney.id.au> Steven Samuel Cole writes: > i'm aware they've been around for quite a while, but for some reason, > i did not have the builtins all(seq) and any(seq) on the radar thus > far. when i used them today, i just assumed there was a corresponding > none(seq), but was surprised to learn that is not true. And then you realised ?not any(seq)? 
works fine, and continued on satisfied. Right?

-- 
 \     "Not to be absolutely certain is, I think, one of the essential |
  `\   things in rationality." --Bertrand Russell |
_o__) |
Ben Finney

From steve at pearwood.info Wed Apr 11 02:22:31 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 11 Apr 2012 10:22:31 +1000
Subject: [Python-ideas] Allow imports to a global name
In-Reply-To: <20120410182711.GC30763@bagheera>
References: <20120410182711.GC30763@bagheera>
Message-ID: <4F84CEC7.9090402@pearwood.info>

Sven Marnach wrote:
> Sometimes it is useful to do a import to a global name inside a
> function. A common use case is the 'pylab' module, which must be
> imported *after* the backend has been set using 'matplotlib.use()'.
> If the backend is configuration-dependent, the statement
>
> import pylab
>
> will usually be inside a function, but the module should be available
> globally, so you would do
>
> global pylab
> import pylab
>
> While this code works (at least in CPython), the current language
> specification forbids it [1]

I quote:

    Names listed in a global statement must not be defined as
    formal parameters or in a for loop control target, class
    definition, function definition, or import statement.

http://docs.python.org/dev/reference/simple_stmts.html#the-global-statement

I can understand that it makes no sense to declare a function parameter
as global, and I can see an argument in favour of optimizing for loops
by ensuring that the target is always a local rather than global. But
what is the rationale for prohibiting globals being used for classes,
functions, and imports?

It seems like an unnecessary restriction, particularly since CPython
doesn't bother to enforce it. The semantics of "global x; import x" is
simple and obvious.

+1 on removing the unenforced prohibition on global class/def/import
inside functions.
-- Steven From guido at python.org Wed Apr 11 02:36:46 2012 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Apr 2012 17:36:46 -0700 Subject: [Python-ideas] Allow imports to a global name In-Reply-To: <4F84CEC7.9090402@pearwood.info> References: <20120410182711.GC30763@bagheera> <4F84CEC7.9090402@pearwood.info> Message-ID: +1 --Guido van Rossum (sent from Android phone) On Apr 10, 2012 5:23 PM, "Steven D'Aprano" wrote: > Sven Marnach wrote: > >> Sometimes it is useful to do a import to a global name inside a >> function. A common use case is the 'pylab' module, which must be >> imported *after* the backend has been set using 'matplotlib.use()'. >> If the backend is configuration-dependent, the statement >> >> import pylab >> >> will usually be inside a function, but the module should be available >> globally, so you would do >> >> global pylab >> import pylab >> >> While this code works (at least in CPython), the current language >> specification forbids it [1] >> > > > I quote: > > Names listed in a global statement must not be defined as > formal parameters or in a for loop control target, class > definition, function definition, or import statement. > > http://docs.python.org/dev/**reference/simple_stmts.html#** > the-global-statement > > I can understand that it makes no sense to declare a function parameter as > global, and I can an argument in favour of optimizing for loops by ensuring > that the target is always a local rather than global. But what is the > rationale for prohibiting globals being used for classes, functions, and > imports? > > It seems like an unnecessary restriction, particularly since CPython > doesn't bother to enforce it. The semantics of "global x; import x" is > simple and obvious. > > +1 on removing the unenforced prohibition on global class/def/import > inside functions. 
> > > -- > Steven > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Apr 11 02:59:47 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 11 Apr 2012 10:59:47 +1000 Subject: [Python-ideas] Allow imports to a global name In-Reply-To: References: <20120410182711.GC30763@bagheera> <4F84CEC7.9090402@pearwood.info> Message-ID: On Wed, Apr 11, 2012 at 10:36 AM, Guido van Rossum wrote: > +1 Ditto. FWIW, I'm actually in favour of dropping everything after the "or" in that paragraph from the language spec, since we don't enforce *any* of it. Aside from formal parameter definitions (which explicitly declare local variables), name binding operations are just name binding operations regardless of the specific syntax. With global: >>> def f(): ... global x ... for x in (): pass ... class x: pass ... def x(): pass ... import sys as x ... >>> f() >>> x With nonlocal: >>> def outer(): ... x = 1 ... def inner(): ... nonlocal x ... for x in (): pass ... class x: pass ... def x(): pass ... import sys as x ... inner() ... return x ... >>> outer() By contrast: >>> def f(x): ... global x ... File "", line 1 SyntaxError: name 'x' is parameter and global >>> def outer(x): ... def inner(x): ... nonlocal x ... SyntaxError: name 'x' is parameter and nonlocal Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From raymond.hettinger at gmail.com Wed Apr 11 05:29:36 2012 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Tue, 10 Apr 2012 23:29:36 -0400 Subject: [Python-ideas] Allow imports to a global name In-Reply-To: References: <20120410182711.GC30763@bagheera> <4F84CEC7.9090402@pearwood.info> Message-ID: On Apr 10, 2012, at 8:59 PM, Nick Coghlan wrote: > . 
FWIW, I'm actually in favour of dropping everything after the > "or" in that paragraph from the language spec, since we don't enforce > *any* of it. +1 The restriction seems unnecessary to me. That being said, we should check to make sure that the other implementations don't need the restrictions for some reason or other. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Apr 11 06:48:35 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 11 Apr 2012 14:48:35 +1000 Subject: [Python-ideas] Allow imports to a global name In-Reply-To: References: <20120410182711.GC30763@bagheera> <4F84CEC7.9090402@pearwood.info> Message-ID: On Wed, Apr 11, 2012 at 1:29 PM, Raymond Hettinger wrote: > > On Apr 10, 2012, at 8:59 PM, Nick Coghlan wrote: > > . FWIW, I'm actually in favour of dropping everything after the > "or" in that paragraph from the language spec, since we don't enforce > *any* of it. > > > +1 ?The restriction seems unnecessary to me. > > That being said, we should check to make sure that the other > implementations don't need the restrictions for some reason or other. Agreed. I created http://bugs.python.org/issue14544 to provide a location for people to object (and will share the link around a bit). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From tjreedy at udel.edu Wed Apr 11 07:08:50 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 11 Apr 2012 01:08:50 -0400 Subject: [Python-ideas] Allow imports to a global name In-Reply-To: References: <20120410182711.GC30763@bagheera> <4F84CEC7.9090402@pearwood.info> Message-ID: On 4/10/2012 8:59 PM, Nick Coghlan wrote: > On Wed, Apr 11, 2012 at 10:36 AM, Guido van Rossum wrote: >> +1 > > Ditto. FWIW, I'm actually in favour of dropping everything after the > "or" in that paragraph from the language spec, since we don't enforce > *any* of it. 
Aside from formal parameter definitions (which explicitly > declare local variables), name binding operations are just name > binding operations regardless of the specific syntax. > > With global: > >>>> def f(): > ... global x > ... for x in (): pass > ... class x: pass > ... def x(): pass > ... import sys as x > ... >>>> f() >>>> x > I found this slightly surprising, but on checking with dis, each of the 'implicit' assignments is implemented as the equivalent of x = internal_function_call(args). (For for-loops, the assignment is within the loop.) Which is to say, each bytecode sequence ends with store_xxx where xxx is 'fast' or 'global'. So now I am not surprised. -- Terry Jan Reedy From ncoghlan at gmail.com Wed Apr 11 07:23:12 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 11 Apr 2012 15:23:12 +1000 Subject: [Python-ideas] Allow imports to a global name In-Reply-To: References: <20120410182711.GC30763@bagheera> <4F84CEC7.9090402@pearwood.info> Message-ID: On Wed, Apr 11, 2012 at 3:08 PM, Terry Reedy wrote: > I found this slightly surprising, but on checking with dis, each of the > 'implicit' assignments is implemented as the equivalent of x = > internal_function_call(args). (For for-loops, the assignment is within the > loop.) Which is to say, each bytecode sequence ends with store_xxx where xxx > is 'fast' or 'global'. So now I am not surprised. By contrast, I already knew that CPython's underlying implementation of the name binding step was the same in all these cases, so I was surprised by the documented restriction in the language spec. I'm now wondering if the initial restriction that prompted the note in the language spec was something that existed in the pre-AST version of the compiler (I only started learning the code generation machinery when I was helping to get the AST based compiler branch ready for inclusion in Python 2.5, so I know very little about how the compiler used to work in 2.4 and earlier) Regards, Nick. -- Nick Coghlan |
ncoghlan at gmail.com | Brisbane, Australia From anacrolix at gmail.com Wed Apr 11 11:38:39 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Wed, 11 Apr 2012 17:38:39 +0800 Subject: [Python-ideas] generator.close Message-ID: Why can't generator.close() return the value if a StopIteration is raised? The PEPs mentioned that it was proposed before, but I can't find any definitive reason, and it's terribly convenient if it does. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at hotpy.org Wed Apr 11 12:04:29 2012 From: mark at hotpy.org (Mark Shannon) Date: Wed, 11 Apr 2012 11:04:29 +0100 Subject: [Python-ideas] generator.close In-Reply-To: References: Message-ID: <4F85572D.9000608@hotpy.org> Matt Joiner wrote: > Why can't generator.close() return the value if a StopIteration is raised? No reason as far as I can see. The semantics are clear enough. From an implementation point of view it would be a simple patch. > > The PEPs mentioned that it was proposed before, but I can't find any > definitive reason, and it's terribly convenient if it does. I'm sure it is convenient, but I believe it is convention to provide a use case ;) Cheers, Mark. From steve at pearwood.info Wed Apr 11 12:42:02 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 11 Apr 2012 20:42:02 +1000 Subject: [Python-ideas] Allow imports to a global name In-Reply-To: References: <20120410182711.GC30763@bagheera> <4F84CEC7.9090402@pearwood.info> Message-ID: <4F855FFA.7060505@pearwood.info> Raymond Hettinger wrote: > On Apr 10, 2012, at 8:59 PM, Nick Coghlan wrote: > >> . FWIW, I'm actually in favour of dropping everything after the >> "or" in that paragraph from the language spec, since we don't enforce >> *any* of it. > > +1 The restriction seems unnecessary to me. > > That being said, we should check to make sure that the other > implementations don't need the restrictions for some reason or other.
Seems to work with Jython: steve at runes:~$ jython Jython 2.5.1+ (Release_2_5_1, Aug 4 2010, 07:18:19) [OpenJDK Client VM (Sun Microsystems Inc.)] on java1.6.0_18 Type "help", "copyright", "credits" or "license" for more information. >>> def test(): ... global math ... import math ... >>> >>> test() >>> math -- Steven From jh at improva.dk Wed Apr 11 12:50:28 2012 From: jh at improva.dk (Jacob Holm) Date: Wed, 11 Apr 2012 12:50:28 +0200 Subject: [Python-ideas] generator.close In-Reply-To: References: Message-ID: <4F8561F4.9060806@improva.dk> Hello Matt On 04/11/2012 11:38 AM, Matt Joiner wrote: > Why can't generator.close() return the value if a StopIteration is raised? > > The PEPs mentioned that it was proposed before, but I can't find any > definitive reason, and it's terribly convenient if it does. > What should be returned when you call close on an already-exhausted generator? You can't return the value of the final StopIteration unless you arrange to have that value stored somewhere. Storing the value was deemed undesirable by the powers that be. The alternative is to return None if the generator is already exhausted. That would work, but severely reduces the usefulness of the change. If you don't care about the performance of yield-from, it is quite easy to write a class you can use to wrap your generator-iterators and get the desired result (see untested example below).
- Jacob

import functools

class generator_result_wrapper(object):
    __slots__ = ('_it', '_result')

    def __init__(self, it):
        self._it = it

    def __iter__(self):
        return self

    def next(self):
        try:
            return self._it.next()
        except StopIteration as e:
            self._result = e.result
            raise

    def send(self, value):
        try:
            return self._it.send(value)
        except StopIteration as e:
            self._result = e.result
            raise

    def throw(self, *args, **kwargs):
        try:
            return self._it.throw(*args, **kwargs)
        except StopIteration as e:
            self._result = e.result
            raise

    def close(self):
        try:
            return self._result
        except AttributeError:
            pass
        try:
            self._it.throw(GeneratorExit)
        except StopIteration as e:
            self._result = e.result
            return self._result
        except GeneratorExit:
            pass

def close_result(func):
    @functools.wraps(func)
    def factory(*args, **kwargs):
        return generator_result_wrapper(func(*args, **kwargs))
    return factory

From anacrolix at gmail.com Wed Apr 11 13:28:18 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Wed, 11 Apr 2012 19:28:18 +0800 Subject: [Python-ideas] generator.close In-Reply-To: <4F8561F4.9060806@improva.dk> References: <4F8561F4.9060806@improva.dk> Message-ID: You make an excellent point. I'm inclined to agree with you. Cheers On Apr 11, 2012 6:50 PM, "Jacob Holm" wrote: > Hello Matt > > On 04/11/2012 11:38 AM, Matt Joiner wrote: > >> Why can't generator.close() return the value if a StopIteration is raised? >> >> The PEPs mentioned that it was proposed before, but I can't find any >> definitive reason, and it's terribly convenient if it does. >> >> > What should be returned when you call close on an already-exhausted > generator? > > You can't return the value of the final StopIteration unless you arrange > to have that value stored somewhere. Storing the value was deemed > undesirable by the powers that be. > > The alternative is to return None if the generator is already exhausted. > That would work, but severely reduces the usefulness of the change.
> > If you don't care about the performance of yield-from, it is quite easy to > write a class you can use to wrap your generator-iterators and get the > desired result (see untested example below). > > > - Jacob > > > import functools > > class generator_result_wrapper(object): > __slots__ = ('_it', '_result') > > def __init__(self, it): > self._it = it > > def __iter__(self): > return self > > def next(self): > try: > return self._it.next() > except StopIteration as e: > self._result = e.result > raise > > def send(self, value): > try: > return self._it.send(value) > except StopIteration as e: > self._result = e.result > raise > > def throw(self, *args, **kwargs): > try: > return self._it.throw(*args, **kwargs) > except StopIteration as e: > self._result = e.result > raise > > def close(self): > try: > return self._result > except AttributeError: > pass > try: > self._it.throw(GeneratorExit) > except StopIteration as e: > self._result = e.result > return self._result > except GeneratorExit: > pass > > def close_result(func): > @functools.wraps(func) > def factory(*args, **kwargs): > return generator_result_wrapper(func(*args, **kwargs)) > return factory > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Apr 11 14:26:43 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 11 Apr 2012 22:26:43 +1000 Subject: [Python-ideas] generator.close In-Reply-To: References: <4F8561F4.9060806@improva.dk> Message-ID: On Wed, Apr 11, 2012 at 9:28 PM, Matt Joiner wrote: > You make an excellent point. I'm inclined to agree with you.
While Jacob does make a valid point about the question of what to do when close() is called multiple times (or on a generator that has already been exhausted through iteration), the specific reason that got the idea killed in PEP 380 [1] is that generators shouldn't be raising StopIteration as a result of a close() invocation anyway - they should usually just reraise the GeneratorExit that gets thrown in to finalise the generator body. If inner generators start making a habit of converting GeneratorExit to StopIteration, then intervening "yield from" operations may yield another value instead of terminating the way they're supposed to in response to close(). Cheers, Nick. [1] http://www.python.org/dev/peps/pep-0380/#rejected-ideas -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tshepang at gmail.com Wed Apr 11 22:35:54 2012 From: tshepang at gmail.com (Tshepang Lekhonkhobe) Date: Wed, 11 Apr 2012 22:35:54 +0200 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple Message-ID: Hi, I find the fact that 'prefix' in str.startswith(prefix) accepts a tuple quite useful. That's because one can do a match on more than one pattern at a time, without ugliness. Would it be a good idea to do the same for str.replace(old, new)? before >>> 'foo bar baz'.replace('foo', 'baz').replace('bar', 'baz') baz baz baz after >>> 'foo bar baz'.replace(('foo', 'bar'), 'baz') baz baz baz From sven at marnach.net Thu Apr 12 02:41:53 2012 From: sven at marnach.net (Sven Marnach) Date: Thu, 12 Apr 2012 01:41:53 +0100 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: References: Message-ID: <20120412004153.GE30763@bagheera> Tshepang Lekhonkhobe schrieb am Wed, 11.
Apr 2012, um 22:35:54 +0200: > before > >>> 'foo bar baz'.replace('foo', 'baz').replace('bar', 'baz') > baz baz baz > > after > >>> 'foo bar baz'.replace(('foo', 'bar'), 'baz') > baz baz baz The usual current solution is to use `re.sub`: >>> re.sub("foo|bar", "baz", "foo bar baz") 'baz baz baz' or, for a general iterable of patterns re.sub("|".join(map(re.escape, patterns)), repl, string) Cheers, Sven From ben+python at benfinney.id.au Thu Apr 12 03:47:07 2012 From: ben+python at benfinney.id.au (Ben Finney) Date: Thu, 12 Apr 2012 11:47:07 +1000 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple References: Message-ID: <87lim1g1r8.fsf@benfinney.id.au> Tshepang Lekhonkhobe writes: > >>> 'foo bar baz'.replace(('foo', 'bar'), 'baz') > baz baz baz How about: 'foo bar baz'.replace(('foo', 'bar'), 'foobar') You can't replace multiple matches "at the same time", as you're implying. The order of replacements is important, since it will affect the outcome in many cases. Do you think it's important to allow a set as the first argument to str.replace()? search_strings = set(['foo', 'bar']) 'foo bar baz'.replace(search_strings, 'foobar') I think that would be at least as desirable as your proposal; but what would be the order of replacements? -- \ "Shepherds ... look after their sheep so they can, first, fleece | `\ them and second, turn them into meat. That's much more like the | _o__) priesthood as I know it." -- Christopher Hitchens, 2008-10-29 | Ben Finney From cmjohnson.mailinglist at gmail.com Thu Apr 12 03:59:34 2012 From: cmjohnson.mailinglist at gmail.com (Carl M.
Johnson) Date: Wed, 11 Apr 2012 15:59:34 -1000 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: <87lim1g1r8.fsf@benfinney.id.au> References: <87lim1g1r8.fsf@benfinney.id.au> Message-ID: On Apr 11, 2012, at 3:47 PM, Ben Finney wrote: > 'foo bar baz'.replace(('foo', 'bar'), 'foobar') > > You can't replace multiple matches "at the same time", as you're > implying. The order of replacements is important, since it will affect > the outcome in many cases. Can't you say the same about 'a b c'.replace("a", "aa")? I think the case of the needles overlapping is more to your point though. "abc".replace( ("ab", "bc"), "b") What should that produce? "bc"? "b"? "ab" even (if we ignore the order of the tuple)? From ben+python at benfinney.id.au Thu Apr 12 04:46:19 2012 From: ben+python at benfinney.id.au (Ben Finney) Date: Thu, 12 Apr 2012 12:46:19 +1000 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple References: <87lim1g1r8.fsf@benfinney.id.au> Message-ID: <878vi13bwk.fsf@benfinney.id.au> "Carl M. Johnson" writes: > On Apr 11, 2012, at 3:47 PM, Ben Finney wrote: > > You can't replace multiple matches "at the same time", as you're > > implying. The order of replacements is important, since it will > > affect the outcome in many cases. > > Can't you say the same about 'a b c'.replace("a", "aa")? Not the same thing. The matches *can* be all "at the same time", in every case, since only a single pattern is being matched. Then, once all those matches are found, they're all replaced. So it's not a problem. I'm pointing out that, if distinct patterns are being matched and replaced, then the order of replacement matters. > I think the case of the needles overlapping is more to your point > though. > > "abc".replace( ("ab", "bc"), "b") > > What should that produce? "bc"? "b"? "ab" even (if we ignore the order > of the tuple)?
Yes, these and other cases make it problematic to think in terms of "replace them all at the same time". The replacements should be done in an order predictable by the person reading the code. And if they should be done in order, then that order should be explicit. I think the existing solution helps with that. -- \ "... it's best to confuse only one issue at a time." -- Brian W. | `\ Kernighan and Dennis M. Ritchie, _The C programming language_, | _o__) 1988 | Ben Finney From greg.ewing at canterbury.ac.nz Thu Apr 12 04:47:42 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 12 Apr 2012 14:47:42 +1200 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: <87lim1g1r8.fsf@benfinney.id.au> References: <87lim1g1r8.fsf@benfinney.id.au> Message-ID: <4F86424E.7090403@canterbury.ac.nz> Ben Finney wrote: > Tshepang Lekhonkhobe > writes: >>>>>'foo bar baz'.replace(('foo', 'bar'), 'baz') > You can't replace multiple matches "at the same time", as you're > implying. An obvious thing to do is to try them in the order they appear in the sequence. That would argue against allowing an unordered collection. Not quite so obvious is whether the replacements should be considered as candidates for further replacements. I would say not, because it complicates the algorithm and in my experience is rarely needed. If you want that, you would just have to do multiple replace calls like you do now. And how about allowing a sequence of (old, new) pairs instead of just a single replacement? That would be even more useful.
-- Greg From cs at zip.com.au Thu Apr 12 05:58:36 2012 From: cs at zip.com.au (Cameron Simpson) Date: Thu, 12 Apr 2012 13:58:36 +1000 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: <4F86424E.7090403@canterbury.ac.nz> References: <4F86424E.7090403@canterbury.ac.nz> Message-ID: <20120412035836.GA19824@cskk.homeip.net> On 12Apr2012 14:47, Greg Ewing wrote: | Ben Finney wrote: | > Tshepang Lekhonkhobe | > writes: | >>>>>'foo bar baz'.replace(('foo', 'bar'), 'baz') | | > You can't replace multiple matches "at the same time", as you're | > implying. | | An obvious thing to do is to try them in the order they | appear in the sequence. That would argue against allowing | an unordered collection. And likewise with Ben's set() suggestion. I for one would allow it. If the order matters, the caller can produce a sequence with the required order. If the order doesn't matter (you know no replacement overlaps, and no replacement introduces text that itself should get replaced), then why not allow a set? I vote for any iterable if this goes ahead. The specification should say that replacements happen in the order items come from the iterable, leaving the choice of control up to the caller but providing predictable behaviour if the caller provides a predictable sequence. | Not quite so obvious is whether the replacements should | be considered as candidates for further replacements. | I would say not, because it complicates the algorithm | and in my experience is rarely needed. Not to mention recursion! | If you want that, | you would just have to do multiple replace calls like | you do now. | | And how about allowing a sequence of (old, new) pairs | instead of just a single replacement? That would be even | more useful. Sure. But doesn't that break the function signature? I suppose we're already there though. Do you want to special case the single string replacement or require callers to use zip(repls, [ "foo" for s in repls ])?
Personally, I would require the zip; the, um, flexibility of the %-format operator with string-vs-list has long bothered me to the point of always providing a sequence, even a single element tuple. -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ Mountain rescue teams insist that all climbers wear helmets, and fall headfirst. They are then impacted into a small globular mass easily stowed in a rucsac. - Tom Patey, who didn't, and wasn't From cs at zip.com.au Thu Apr 12 06:01:02 2012 From: cs at zip.com.au (Cameron Simpson) Date: Thu, 12 Apr 2012 14:01:02 +1000 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: <87lim1g1r8.fsf@benfinney.id.au> References: <87lim1g1r8.fsf@benfinney.id.au> Message-ID: <20120412040102.GA21051@cskk.homeip.net> On 12Apr2012 11:47, Ben Finney wrote: | Tshepang Lekhonkhobe | writes: | | > >>> 'foo bar baz'.replace(('foo', 'bar'), 'baz') | > baz baz baz | | How about: | | 'foo bar baz'.replace(('foo', 'bar'), 'foobar') | | You can't replace multiple matches "at the same time", as you're | implying. The order of replacements is important, since it will affect | the outcome in many cases. "At the same time" might imply something equivalent to the cited "re.sub('foo|bar',...)" suggestion. And that is different to an iterated "replace foo, then replace bar" if the possible matches overlap. Just a thought about what semantics the OP may have envisaged. Personally, given re.sub and the ease of running replace a few times in a loop, I'm -0.3 on the suggestion itself. Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ A software engineering discussion from Perl-Porters: Chip Salzenberg: The wise one has seen the calamity, and has proceeded to hide himself. - Ecclesiastes Gurusamy Sarathy: He that observeth the wind shall not sow; and he that regardeth the clouds shall not reap.
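To make the ordered, single-pass semantics being debated in this thread concrete, here is a minimal sketch. The helper name `multi_replace` and its exact rules are assumptions for illustration only, not a proposed API: it takes an ordered iterable of (old, new) pairs, tries the pairs in order at each position, and never rescans replacement text.

```python
def multi_replace(s, pairs):
    """Replace in one left-to-right pass.

    'pairs' is an ordered iterable of (old, new) tuples. At each
    position the first matching 'old' wins, and replacement text is
    never itself eligible for further replacement.
    """
    pairs = [(old, new) for old, new in pairs if old]  # skip empty needles
    out = []
    i = 0
    while i < len(s):
        for old, new in pairs:
            if s.startswith(old, i):
                out.append(new)
                i += len(old)
                break
        else:
            out.append(s[i])
            i += 1
    return "".join(out)
```

Under these rules the overlapping-needles question gets a deterministic answer: "abc" with the pairs [("ab", "b"), ("bc", "b")] yields "bc", because "ab" is consumed at position 0 and "bc" never gets a chance to match.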
From ben+python at benfinney.id.au Thu Apr 12 06:33:51 2012 From: ben+python at benfinney.id.au (Ben Finney) Date: Thu, 12 Apr 2012 14:33:51 +1000 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple References: <87lim1g1r8.fsf@benfinney.id.au> <4F86424E.7090403@canterbury.ac.nz> Message-ID: <874nsp36xc.fsf@benfinney.id.au> Greg Ewing writes: > An obvious thing to do is to try them in the order they appear in the > sequence. That would argue against allowing an unordered collection. For that reason, I'm -0.5 on the proposal. If we're to specify multiple match patterns and do them all in a single operation, I'd prefer to specify them in e.g. a set or some other efficient non-ordered collection. -- \ "On the other hand, you have different fingers." -- Steven Wright | `\ | _o__) | Ben Finney From ben+python at benfinney.id.au Thu Apr 12 06:37:01 2012 From: ben+python at benfinney.id.au (Ben Finney) Date: Thu, 12 Apr 2012 14:37:01 +1000 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple References: <87lim1g1r8.fsf@benfinney.id.au> <20120412040102.GA21051@cskk.homeip.net> Message-ID: <87zkah1s7m.fsf@benfinney.id.au> Cameron Simpson writes: > "At the same time" might imply something equivalent to the cited > "re.sub('foo|bar',...)" suggestion. And that is different to an iterated > "replace foo, then replace bar" if the possible matches overlap. Yes, it is; but the OP presented a proposal as though it were to have the same semantics as a sequence of replace operations. If the OP wants to specify different semantics, let's hear it. -- \ "A child of five could understand this. Fetch me a child of | `\ five."
-- Groucho Marx | _o__) | Ben Finney From ncoghlan at gmail.com Thu Apr 12 06:56:16 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 12 Apr 2012 14:56:16 +1000 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: <87zkah1s7m.fsf@benfinney.id.au> References: <87lim1g1r8.fsf@benfinney.id.au> <20120412040102.GA21051@cskk.homeip.net> <87zkah1s7m.fsf@benfinney.id.au> Message-ID: On Thu, Apr 12, 2012 at 2:37 PM, Ben Finney wrote: > If the OP wants to specify different semantics, let's hear it. Whatever semantics were chosen, they would end up being confusing to *someone*. With prefix and suffix matching, the implicit OR is simple and obvious. The same can't be said for the replacement command, particularly if it can be used with unordered collections. Far better to leave this task to re.sub (which uses regex syntax to avoid ambiguity) or to explicit flow control and multiple invocations of replace(). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rob.cliffe at btinternet.com Thu Apr 12 09:32:44 2012 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Thu, 12 Apr 2012 08:32:44 +0100 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: References: <87lim1g1r8.fsf@benfinney.id.au> <20120412040102.GA21051@cskk.homeip.net> <87zkah1s7m.fsf@benfinney.id.au> Message-ID: <4F86851C.9060504@btinternet.com> On 12/04/2012 05:56, Nick Coghlan wrote: > On Thu, Apr 12, 2012 at 2:37 PM, Ben Finney wrote: >> If the OP wants to specify different semantics, let's hear it. > Whatever semantics were chosen, they would end up being confusing to *someone*. > > With prefix and suffix matching, the implicit OR is simple and > obvious. The same can't be said for the replacement command, > particularly if it can be used with unordered collections.
> > Far better to leave this task to re.sub (which uses regex syntax to > avoid ambiguity) or to explicit flow control and multiple invocations > of replace(). > > Cheers, > Nick. > I rather like this proposal. The semantics for s.replace(strings, replacementString) could be: 'strings', if not a string, must be a tuple, for consistency with str.startswith (although I don't see why a list shouldn't be allowed for both). Scan s from left to right; whenever a match is found with any member of 'strings' (tested in the order specified by 'strings'), do the replacement. The replaced text is not eligible for further replacement. But the real value for such proposals is not the complicated cases where the precise semantics matter, but the convenience in simple cases (almost any language feature CAN be used in an obscure way), e.g. def dequote(s): singlequote = "'" doublequote = '"' return s.replace((singlequote, doublequote), '') +0.8 Rob Cliffe From tshepang at gmail.com Thu Apr 12 11:39:45 2012 From: tshepang at gmail.com (Tshepang Lekhonkhobe) Date: Thu, 12 Apr 2012 11:39:45 +0200 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: <87zkah1s7m.fsf@benfinney.id.au> References: <87lim1g1r8.fsf@benfinney.id.au> <20120412040102.GA21051@cskk.homeip.net> <87zkah1s7m.fsf@benfinney.id.au> Message-ID: On Thu, Apr 12, 2012 at 06:37, Ben Finney wrote: > Cameron Simpson writes: > >> "At the same time" might imply something equivalent to the cited >> "re.sub('foo|bar',...)" suggestion. And that is different to an iterated >> "replace foo, then replace bar" if the possible matched overlap. > > Yes, it is; but the OP presented a proposal as though it were to have > the same semantics as a sequence of replace operations. > > If the OP wants to specify different semantics, let's hear it. You guys are thinking more deeply about this than I was. 
I don't even see a difference between the 2: >>> 'foo bar baz'.replace('foo', 'baz').replace('bar', 'baz') == re.sub('foo|bar', 'baz', 'foo bar baz') True I was not even thinking about ordering, but it would help to have it to avoid confusion I think. The example I gave was just the closest I could think of. From songofacandy at gmail.com Thu Apr 12 13:32:45 2012 From: songofacandy at gmail.com (INADA Naoki) Date: Thu, 12 Apr 2012 20:32:45 +0900 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: References: <87lim1g1r8.fsf@benfinney.id.au> <20120412040102.GA21051@cskk.homeip.net> <87zkah1s7m.fsf@benfinney.id.au> Message-ID: I want multiple replace at once. For example html escape looks like: >>> "<>&".replace('<', '&lt;', '>', '&gt;', '&', '&amp;') '&lt;&gt;&amp;' or >>> "<>&".replace( ('<', '&lt;'), ('>', '&gt;'), ('&', '&amp;') ) '&lt;&gt;&amp;' On Thu, Apr 12, 2012 at 6:39 PM, Tshepang Lekhonkhobe wrote: > On Thu, Apr 12, 2012 at 06:37, Ben Finney wrote: >> Cameron Simpson writes: >> >>> "At the same time" might imply something equivalent to the cited >>> "re.sub('foo|bar',...)" suggestion. And that is different to an iterated >>> "replace foo, then replace bar" if the possible matches overlap. >> >> Yes, it is; but the OP presented a proposal as though it were to have >> the same semantics as a sequence of replace operations. >> >> If the OP wants to specify different semantics, let's hear it. > > You guys are thinking more deeply about this than I was. I don't even > see a difference between the 2: > >>>> 'foo bar baz'.replace('foo', 'baz').replace('bar', 'baz') == re.sub('foo|bar', 'baz', 'foo bar baz') > True > > I was not even thinking about ordering, but it would help to have it > to avoid confusion I think. The example I gave was just the closest I > could think of. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- INADA Naoki
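One detail worth spelling out for the chained-replace version of HTML escaping discussed above: the order of the calls matters. '&' has to be replaced first, or the '&' characters introduced by the other replacements are escaped a second time. The helper name `escape` below is only for this sketch, not a standard function:

```python
def escape(s):
    # '&' must go first: the later replacements introduce new '&'
    # characters that must not themselves be escaped again.
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")

# With the wrong order, "<>&" becomes "&amp;lt;&amp;gt;&amp;"
# instead of "&lt;&gt;&amp;".
```

This is one concrete case where a sequence of replace() calls and an "all at once" multi-replacement genuinely differ.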
From sven at marnach.net Thu Apr 12 15:10:23 2012 From: sven at marnach.net (Sven Marnach) Date: Thu, 12 Apr 2012 14:10:23 +0100 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: References: <87lim1g1r8.fsf@benfinney.id.au> <20120412040102.GA21051@cskk.homeip.net> <87zkah1s7m.fsf@benfinney.id.au> Message-ID: <20120412131023.GF30763@bagheera> INADA Naoki schrieb am Thu, 12. Apr 2012, um 20:32:45 +0900: > >>> "<>&".replace( ('<', '&lt;'), ('>', '&gt;'), ('&', '&amp;') ) > '&lt;&gt;&amp;' In current Python, it's >>> t = str.maketrans({"<": "&lt;", ">": "&gt;", "&": "&amp;"}) >>> "<>&".translate(t) '&lt;&gt;&amp;' Looks good enough for me. Cheers, Sven From songofacandy at gmail.com Thu Apr 12 15:17:30 2012 From: songofacandy at gmail.com (INADA Naoki) Date: Thu, 12 Apr 2012 22:17:30 +0900 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: <20120412131023.GF30763@bagheera> References: <87lim1g1r8.fsf@benfinney.id.au> <20120412040102.GA21051@cskk.homeip.net> <87zkah1s7m.fsf@benfinney.id.au> <20120412131023.GF30763@bagheera> Message-ID: Oh, I didn't know that. Thank you. But what about unescape? str.translate accepts only one character key. On Thu, Apr 12, 2012 at 10:10 PM, Sven Marnach wrote: > INADA Naoki schrieb am Thu, 12. Apr 2012, um 20:32:45 +0900: >> >>> "<>&".replace( ('<', '&lt;'), ('>', '&gt;'), ('&', '&amp;') ) >> '&lt;&gt;&amp;' > > In current Python, it's > >    >>> t = str.maketrans({"<": "&lt;", ">": "&gt;", "&": "&amp;"}) >    >>> "<>&".translate(t) >    '&lt;&gt;&amp;' > > Looks good enough for me. > > Cheers, >    Sven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- INADA Naoki
From sven at marnach.net Thu Apr 12 18:20:13 2012 From: sven at marnach.net (Sven Marnach) Date: Thu, 12 Apr 2012 17:20:13 +0100 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: References: <87lim1g1r8.fsf@benfinney.id.au> <20120412040102.GA21051@cskk.homeip.net> <87zkah1s7m.fsf@benfinney.id.au> <20120412131023.GF30763@bagheera> Message-ID: <20120412162013.GG30763@bagheera> INADA Naoki schrieb am Thu, 12. Apr 2012, um 22:17:30 +0900: > Oh, I didn't know that. Thank you. > But what about unescape? str.translate accepts only one character key. You'd currently need to use the `re` module: >>> d = {"&amp;": "&", "&gt;": ">", "&lt;": "<"} >>> re.sub("|".join(d), lambda m: d[m.group()], "&lt;&gt;&amp;") '<>&' Cheers, Sven From songofacandy at gmail.com Thu Apr 12 18:32:58 2012 From: songofacandy at gmail.com (INADA Naoki) Date: Fri, 13 Apr 2012 01:32:58 +0900 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: <20120412162013.GG30763@bagheera> References: <87lim1g1r8.fsf@benfinney.id.au> <20120412040102.GA21051@cskk.homeip.net> <87zkah1s7m.fsf@benfinney.id.au> <20120412131023.GF30763@bagheera> <20120412162013.GG30763@bagheera> Message-ID: Yes, I know it. But if str.replace() or str.translate() can do it, it is simpler and faster than re.sub(). On Fri, Apr 13, 2012 at 1:20 AM, Sven Marnach wrote: > INADA Naoki schrieb am Thu, 12. Apr 2012, um 22:17:30 +0900: >> Oh, I didn't know that. Thank you. >> But what about unescape? str.translate accepts only one character key. > > You'd currently need to use the `re` module: > >    >>> d = {"&amp;": "&", "&gt;": ">", "&lt;": "<"} >    >>> re.sub("|".join(d), lambda m: d[m.group()], "&lt;&gt;&amp;") >    '<>&' > > Cheers, >    Sven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- INADA Naoki
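A caveat worth noting about the `"|".join(d)` pattern used in this thread: dict keys that contain regex metacharacters need `re.escape`, and when one key is a prefix of another, the longer key should come first in the alternation, since `re` alternation takes the first alternative that matches rather than the longest. A hedged sketch of a more defensive version (the helper name `replace_many` is illustrative only):

```python
import re

def replace_many(s, mapping):
    # Sort longest keys first so a longer needle is tried before any
    # needle that is its prefix; re.escape protects keys that contain
    # regex metacharacters such as '.' or '|'.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = "|".join(map(re.escape, keys))
    return re.sub(pattern, lambda m: mapping[m.group()], s)
```

For the plain entity mappings in this thread the extra care makes no difference, but it keeps the idiom safe for arbitrary keys.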
From stefan_ml at behnel.de Thu Apr 12 19:08:58 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 12 Apr 2012 19:08:58 +0200 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: References: <87lim1g1r8.fsf@benfinney.id.au> <20120412040102.GA21051@cskk.homeip.net> <87zkah1s7m.fsf@benfinney.id.au> <20120412131023.GF30763@bagheera> <20120412162013.GG30763@bagheera> Message-ID: INADA Naoki, 12.04.2012 18:32: > On Fri, Apr 13, 2012 at 1:20 AM, Sven Marnach wrote: >> INADA Naoki schrieb am Thu, 12. Apr 2012, um 22:17:30 +0900: >>> Oh, I didn't know that. Thank you. >>> But what about unescape? str.translate accepts only one character key. >> >> You'd currently need to use the `re` module: >> >> >>> d = {"&amp;": "&", "&gt;": ">", "&lt;": "<"} >> >>> re.sub("|".join(d), lambda m: d[m.group()], "&lt;&gt;&amp;") >> '<>&' > > Yes, I know it. > But if str.replace() or str.translate() can do it, it is simpler and > faster than re.sub(). Simpler, maybe, at least at the API level. But faster? Not necessarily. It could use Aho-Corasick, but that means it needs to construct the search graph on each call, which is fairly expensive. And str.replace() isn't the right interface for anything but a one-shot operation if the intention is to pass in a sequence of keywords. Stefan From greg.ewing at canterbury.ac.nz Fri Apr 13 01:01:46 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 13 Apr 2012 11:01:46 +1200 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: References: <87lim1g1r8.fsf@benfinney.id.au> <20120412040102.GA21051@cskk.homeip.net> <87zkah1s7m.fsf@benfinney.id.au> <20120412131023.GF30763@bagheera> <20120412162013.GG30763@bagheera> Message-ID: <4F875EDA.500@canterbury.ac.nz> Stefan Behnel wrote: > And str.replace() isn't the > right interface for anything but a one-shot operation if the intention is > to pass in a sequence of keywords.
So maybe a better approach would be to enhance maketrans so that both keys and replacements can be more than one character long? Behind the scenes, it could build a DFA or whatever is needed to do it efficiently. -- Greg From greg at krypto.org Fri Apr 13 02:40:01 2012 From: greg at krypto.org (Gregory P. Smith) Date: Thu, 12 Apr 2012 17:40:01 -0700 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: References: <87lim1g1r8.fsf@benfinney.id.au> <20120412040102.GA21051@cskk.homeip.net> <87zkah1s7m.fsf@benfinney.id.au> Message-ID: On Wed, Apr 11, 2012 at 9:56 PM, Nick Coghlan wrote: > On Thu, Apr 12, 2012 at 2:37 PM, Ben Finney > wrote: > > If the OP wants to specify different semantics, let's hear it. > > Whatever semantics were chosen, they would end up being confusing to > *someone*. > > With prefix and suffix matching, the implicit OR is simple and > obvious. The same can't be said for the replacement command, > particularly if it can be used with unordered collections. > > Far better to leave this task to re.sub (which uses regex syntax to > avoid ambiguity) or to explicit flow control and multiple invocations > of replace(). > > Agreed. Which is why I'm -1 on the proposed change to str.replace(). -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Fri Apr 13 03:00:33 2012 From: songofacandy at gmail.com (INADA Naoki) Date: Fri, 13 Apr 2012 10:00:33 +0900 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: References: <87lim1g1r8.fsf@benfinney.id.au> <20120412040102.GA21051@cskk.homeip.net> <87zkah1s7m.fsf@benfinney.id.au> <20120412131023.GF30763@bagheera> <20120412162013.GG30763@bagheera> Message-ID: > Simpler, maybe, at least at the API level. But faster? Not necessarily. It > could use Aho-Corasick, but that means it needs to construct the search > graph on each call, which is fairly expensive. You're right.
But in simple situations, the overhead of creating match objects and calling the callback is more expensive. (ex. https://gist.github.com/2369648 ) I think chaining replace is not so bad for such simple cases. So the problem is that there is no "one obvious way" to replace multiple keywords. On Fri, Apr 13, 2012 at 2:08 AM, Stefan Behnel wrote: > INADA Naoki, 12.04.2012 18:32: >> On Fri, Apr 13, 2012 at 1:20 AM, Sven Marnach wrote: >>> INADA Naoki schrieb am Thu, 12. Apr 2012, um 22:17:30 +0900: >>>> Oh, I didn't know that. Thank you. >>>> But what about unescape? str.translate accepts only one character key. >>> >>> You'd currently need to use the `re` module: >>> >>> >>> d = {"&amp;": "&", "&gt;": ">", "&lt;": "<"} >>> >>> re.sub("|".join(d), lambda m: d[m.group()], "&lt;&gt;&amp;") >>> '<>&' >> >> Yes, I know it. >> But if str.replace() or str.translate() can do it, it is simpler and >> faster than re.sub(). > > Simpler, maybe, at least at the API level. But faster? Not necessarily. It > could use Aho-Corasick, but that means it needs to construct the search > graph on each call, which is fairly expensive. And str.replace() isn't the > right interface for anything but a one-shot operation if the intention is > to pass in a sequence of keywords. > > Stefan > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- INADA Naoki From raymond.hettinger at gmail.com Fri Apr 13 04:03:00 2012 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Thu, 12 Apr 2012 22:03:00 -0400 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: References: Message-ID: <19BE1705-7076-4E1E-AF5A-7C4C99800032@gmail.com> On Apr 11, 2012, at 4:35 PM, Tshepang Lekhonkhobe wrote: > I find the fact that 'prefix' in str.startswith(prefix) accepts a tuple > quite useful. That's because one can do a match on more than one > pattern at a time, without ugliness.
Would it be a good idea to do the > same for str.replace(old, new)? > > before >>>> 'foo bar baz'.replace('foo', 'baz').replace('bar', 'baz') > baz baz baz > > after >>>> 'foo bar baz'.replace(('foo', 'bar'), 'baz') > baz baz baz It seems to me that it is a rare use case to want to replace many things with a single replacement string. I can't remember a single case of ever needing this. The only thing that comes to mind is automated redaction. What I have needed and have seen others need is a dictionary based replace: {'customer': 'client', 'headquarters': 'office', 'now': 'soon'}. Even that case is fraught with peril -- I would want "now" to change to "soon" but not have "snow" change to "ssoon". In the end, I think what people want is to have the power and control afforded by re.sub() but without having to learn regular expressions. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Fri Apr 13 05:14:51 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 13 Apr 2012 12:14:51 +0900 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: <19BE1705-7076-4E1E-AF5A-7C4C99800032@gmail.com> References: <19BE1705-7076-4E1E-AF5A-7C4C99800032@gmail.com> Message-ID: On Fri, Apr 13, 2012 at 11:03 AM, Raymond Hettinger wrote: > What I have needed and have seen others need is a dictionary > based replace: {'customer': 'client', 'headquarters': 'office', 'now': > 'soon'}. > Even that case is fraught with peril -- I would want "now" to change > to "soon" but not have "snow" change to "ssoon". > > In the end, I think what people want is to have the power > and control afforded by re.sub() but without having > to learn regular expressions. There is one very attractive special case, however, which is an invertible translation like URL-escaping (or HTML-escaping), where at least one side of the transform is single characters. Then there is no ambiguity.
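[Editorial note: Raymond's "now"/"snow" peril can be handled with word boundaries. A minimal sketch; the `word_replace` helper and the `\b` approach are an illustration, not a proposal from the thread.]

```python
import re

def word_replace(text, mapping):
    # \b word-boundary anchors keep "now" from matching inside "snow".
    pattern = re.compile(r"\b(?:%s)\b" % "|".join(map(re.escape, mapping)))
    return pattern.sub(lambda m: mapping[m.group()], text)

print(word_replace("snow falls now", {"now": "soon", "customer": "client"}))
# prints: snow falls soon
```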
Nevertheless, I think that case is special enough that it may as well be done in the modules that deal with URLs and HTML respectively. From breamoreboy at yahoo.co.uk Fri Apr 13 05:43:40 2012 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Fri, 13 Apr 2012 04:43:40 +0100 Subject: [Python-ideas] in str.replace(old, new), allow 'old' to accept a tuple In-Reply-To: <878vi13bwk.fsf@benfinney.id.au> References: <87lim1g1r8.fsf@benfinney.id.au> <878vi13bwk.fsf@benfinney.id.au> Message-ID: On 12/04/2012 03:46, Ben Finney wrote: > And if they should be done in order, then that order should be explicit. > I think the existing solution helps with that. > Something along the lines of

>>> mystr = 'foo bar baz'
>>> for old in 'foo', 'bar':
...     mystr = mystr.replace(old, 'baz')
...
>>> mystr
'baz baz baz'

Or can this be simplified with the Python Swiss Army Knife aka the itertools module? :) -- Cheers. Mark Lawrence. From xorninja at gmail.com Fri Apr 13 17:41:21 2012 From: xorninja at gmail.com (Itzik Kotler) Date: Fri, 13 Apr 2012 18:41:21 +0300 Subject: [Python-ideas] Pythonect 0.1.0 Release In-Reply-To: References: Message-ID: Hi, I have just committed PEP8 fixes to Pythonect ( https://github.com/ikotler/pythonect). And, I also made a Pythonect Tutorial: Learn By Example Regards, Itzik Kotler | http://www.ikotler.org On Sun, Apr 1, 2012 at 6:58 PM, Jakob Bowyer wrote: > You might want to PEP8 your code, move imports to the top and lose some of > the un-needed lines > > On Sun, Apr 1, 2012 at 3:34 PM, Itzik Kotler wrote: > > Hi All, > > > > I'm pleased to announce the first beta release of Pythonect interpreter. > > > > Pythonect is a new, experimental, general-purpose dataflow programming > > language based on Python. > > > > It aims to combine the intuitive feel of shell scripting (and all of its > > perks like implicit parallelism) with the flexibility and agility of > Python.
> > > > Pythonect interpreter (and reference implementation) is written in > Python, > > and is available under the BSD license. > > > > Here's a quick tour of Pythonect: > > > > The canonical "Hello, world" example program in Pythonect: > > > >>>> "Hello World" -> print > > : Hello World > > Hello World > >>>> > > > > '->' and '|' are both Pythonect operators. > > > > The pipe operator (i.e. '|') passes one item at a item, while the other > > operator passes all items at once. > > > > > > Python statements and other None-returning function are acting as a > > pass-through: > > > >>>> "Hello World" -> print -> print > > : Hello World > > : Hello World > > Hello World > >>>> > > > >>>> 1 -> import math -> math.log > > 0.0 > >>>> > > > > > > Parallelization in Pythonect: > > > >>>> "Hello World" -> [ print , print ] > > : Hello World > > : Hello World > > ['Hello World', 'Hello World'] > > > >>>> range(0,3) -> import math -> math.sqrt > > [0.0, 1.0, 1.4142135623730951] > >>>> > > > > In the future, I am planning on adding support for multi-processing, and > > even distributed computing. 
> > > > > > The '_' identifier allow access to current item: > > > >>>> "Hello World" -> [ print , print ] -> _ + " and Python" > > : Hello World > > : Hello World > > ['Hello World and Python', 'Hello World and Python'] > >>>> > > > >>>> [ 1 , 2 ] -> _**_ > > [1, 4] > >>>> > > > > > > True/False return values as filters: > > > >>>> "Hello World" -> _ == "Hello World" -> print > > : Hello World > >>>> > > > >>>> "Hello World" -> _ == "Hello World1" -> print > > False > >>>> > > > >>>> range(1,10) -> _ % 2 == 0 > > [2, 4, 6, 8] > >>>> > > > > > > Last but not least, I have also added extra syntax for making remote > > procedure call easy: > > > >>> 1 -> inc at xmlrpc://localhost:8000 -> print > > : 2 > > 2 > >>>> > > > > Download Pythonect v0.1.0 from: > > http://github.com/downloads/ikotler/pythonect/Pythonect-0.1.0.tar.gz > > > > More information can be found at: http://www.pythonect.org > > > > > > I will appreciate any input / feedback that you can give me. > > > > Also, for those interested in working on the project, I'm actively > > interested in welcoming and supporting both new developers and new users. > > Feel free to contact me. > > > > > > Regards, > > Itzik Kotler | http://www.ikotler.org > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > http://mail.python.org/mailman/listinfo/python-ideas > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From maxmoroz at gmail.com Wed Apr 18 03:23:10 2012 From: maxmoroz at gmail.com (Max Moroz) Date: Tue, 17 Apr 2012 18:23:10 -0700 Subject: [Python-ideas] Providing a guarantee that instances of user-defined classes have distinct identities In-Reply-To: References: Message-ID: Suppose I model a game of cards, where suits don't matter. I might find the integer representation of cards (14 for "Ace", 13 for "King", ..., 2 for "2") to be convenient. The deck has 4 copies of each card, which I need to distinguish. 
I was thinking to model a card as follows:

class Card(int):
    __hash__ = int.__hash__
    def __eq__(self, other):
        return self is other

This works precisely as I want (at least in CPython 3.2):

x = Card(14)
y = Card(14)
assert x != y  # x and y are two different Aces
z = x
assert x == z  # x and z are bound to the same Ace

But this behavior is implementation dependent, so the above code may one day break (very painfully for whoever happens to maintain it at the time). Is it possible to add a guarantee to the language that would make the above code safe to use? Currently the language promises: "For immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed." Nowhere in the documentation is it clearly defined which objects are considered "immutable" for the purpose of this promise. As a result, a Python implementation, now or in the future, may decide that it's ok to return a reference to an existing object when a Card instance is created - since arguably, class Card is immutable (since it derives from an immutable base class, and doesn't add any new attributes). Perhaps a change like this would be fine (it obviously won't break any existing code): "For certain types, operations that compute new values may actually return a reference to an existing object with the same type and value. The only types for which this may happen are:

- built-in immutable types
- user-defined classes that explicitly override __new__ to return a reference to an existing object

Note that a user-defined class that inherits from a built-in immutable type, without overriding __new__, will not exhibit this behavior." -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ncoghlan at gmail.com Wed Apr 18 03:44:47 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 Apr 2012 11:44:47 +1000 Subject: [Python-ideas] Providing a guarantee that instances of user-defined classes have distinct identities In-Reply-To: References: Message-ID: On Wed, Apr 18, 2012 at 11:23 AM, Max Moroz wrote: > "For immutable types, operations that compute new values may actually return > a reference to any existing object with the same type and value, while for > mutable objects this is not allowed." > > Nowhere in the documentation is it clearly defined which objects are > considered "immutable" for the purpose of this promise. As a result, a > Python implementation, now or in the future, may decide that it's ok to > return a reference to an existing object when a Card instance is created - > since arguably, class Card is immutable (since it derives from an immutable > base class, and doesn't add any new attributes). It's up to the objects themselves (and their metaclasses) - any such optimisation must be implemented in cls.__new__ or metacls.__call__. So, no, you're not going to get a stronger guarantee than is already in place (and you'd be better off just writing Card properly - inheriting from int for an object that should model a "value, suit" 2-tuple is a bad idea). Using collections.namedtuple would be a much better option. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From simon.sapin at kozea.fr Wed Apr 18 07:26:06 2012 From: simon.sapin at kozea.fr (Simon Sapin) Date: Wed, 18 Apr 2012 07:26:06 +0200 Subject: [Python-ideas] Providing a guarantee that instances of user-defined classes have distinct identities In-Reply-To: References: Message-ID: <4F8E506E.7040606@kozea.fr> Le 18/04/2012 03:23, Max Moroz a écrit : > Nowhere in the documentation is it clearly defined which objects are > considered "immutable" for the purpose of this promise.
As a result, a > Python implementation, now or in the future, may decide that it's ok to > return a reference to an existing object when a Card instance is created > - since arguably, class Card is immutable (since it derives from an > immutable base class, and doesn't add any new attributes). Hi, I agree that the definition of "immutable" is not very clear, but I don't think that your Card class is immutable. As Card inherits without __slots__, it gets a __dict__ and can hold arbitrary attributes. Even if none of its methods do so, this is perfectly okay:

x = Card(14)
y = Card(14)
x.foo = 42
y.foo  # AttributeError

Because of their __dict__, Card objects can never be considered immutable. (Now, I'm not sure what would happen with an empty __slots__.) Regards, -- Simon Sapin From maxmoroz at gmail.com Wed Apr 18 11:57:37 2012 From: maxmoroz at gmail.com (Max Moroz) Date: Wed, 18 Apr 2012 02:57:37 -0700 Subject: [Python-ideas] Providing a guarantee that instances of user-defined classes have distinct identities In-Reply-To: <4F8E506E.7040606@kozea.fr> References: <4F8E506E.7040606@kozea.fr> Message-ID: Simon Sapin wrote: > I agree that the definition of "immutable" is not very clear, but I don't > think that your Card class is immutable. I shouldn't have used the word "immutable"; it is a bit confusing and distracts from my real concern. I'm really just trying to get a guarantee from the language that would make my original code safe. As is, it relies on a very reasonable, but undocumented, assumption about the behavior of built-in classes' __new__ method. The exact guarantee I need is: "Any built-in class' __new__ method called with the cls argument set to a user-defined subclass, will always return a new instance of type cls." (Emphasis on "new instance" - as opposed to a reference to an existing object.)
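[Editorial note: the assumption under discussion can be written down as a small executable check. This is observed CPython behavior rather than anything the language documents; note that `!=` is left out of the sketch because, with an int subclass, `!=` can still resolve to the inherited int.__ne__ unless `__ne__` is overridden as well.]

```python
class Card(int):
    __hash__ = int.__hash__
    def __eq__(self, other):
        return self is other

x = Card(14)
y = Card(14)

# int.__new__ allocates a fresh Card on each call; CPython's small-int
# cache only returns objects of exact type int, never of a subclass.
assert x is not y
assert not (x == y)      # identity-based equality: two distinct Aces
assert x == x
assert len({x, y}) == 2  # equal hashes, unequal objects: both live in a set
```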
Nick Coghlan wrote: > and you'd be better of just writing Card properly - > inheriting from int for an object that should model a "value, suit" > 2-tuple is a bad idea. Using collections.namedtuple would be a much > better option. My example might have been poor. Still, I have use cases for objects nearly identical to `int`, `tuple`, etc., but where I want to distinguish two objects created at different times (place them both in a set, compare them as unequal, etc.). Thanks for your comments. Max From mwm at mired.org Wed Apr 18 15:19:17 2012 From: mwm at mired.org (Mike Meyer) Date: Wed, 18 Apr 2012 09:19:17 -0400 Subject: [Python-ideas] Providing a guarantee that instances of user-defined classes have distinct identities In-Reply-To: References: Message-ID: <20120418091917.0e201797@bhuda.mired.org> On Wed, 18 Apr 2012 11:44:47 +1000 Nick Coghlan wrote: > On Wed, Apr 18, 2012 at 11:23 AM, Max Moroz wrote: > > "For immutable types, operations that compute new values may actually return > > a reference to any existing object with the same type and value, while for > > mutable objects this is not allowed." > > > > Nowhere in the documentation is it clearly defined which objects are > > considered "immutable" for the purpose of this promise. As a result, a > > Python implementation, now or in the future, may decide that it's ok to > > return a reference to an existing object when a Card instance is created - > > since arguably, class Card is immutable (since it derives from an immutable > > base class, and doesn't add any new attributes). > > It's up to the objects themselves (and their metaclasses) - any such > optimisation must be implemented in cls.__new__ or metacls.__call__. First, the Python docs don't clearly tell you what objects are immutable because, well, it's an extensible language. 
With that constraint, the best you can do about that is what it says not far above the section you quoted: An object's mutability is determined by its type; for instance, numbers, strings and tuples are immutable, while dictionaries and lists are mutable. I.e. - that this is determined by its type, and a list of builtin types that are immutable. > So, no, you're not going to get a stronger guarantee than is already > in place I believe that this guarantee is strong enough to guarantee that classes that inherit from immutable types won't share values unless the class code does something to make that happen. The type of such an object is *not* the type that it inherits from, it's a Python class type. As demonstrated, such classes aren't immutable, so Python needs to make different instances different objects even if they share the same value. If you want behavior different from that, the class or metaclass has to provide it. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From sven at marnach.net Wed Apr 18 14:55:15 2012 From: sven at marnach.net (Sven Marnach) Date: Wed, 18 Apr 2012 13:55:15 +0100 Subject: [Python-ideas] Providing a guarantee that instances of user-defined classes have distinct identities In-Reply-To: References: <4F8E506E.7040606@kozea.fr> Message-ID: <20120418125515.GN30763@bagheera> Max Moroz schrieb am Wed, 18. Apr 2012, um 02:57:37 -0700: > I'm really just trying to get a guarantee from the language that would > make my original code safe. As is, it relies on a very reasonable, but > undocumented, assumption about the behavior of built-in classes' > __new__ method. Simon's point is that your current code *is* safe since your instances are not immutable. > The exact guarantee I need is: "Any built-in class' __new__ method > called with the cls argument set to a user-defined subclass, will > always return a new instance of type cls."
As long as your class does not set `__slots__` to an empty sequence, you already have this guarantee, since your type is not immutable. And while the current documentation might suggest that built-in types would be allowed to check for empty `__slots__` and reuse already created instances of a subclass in that case, it's very unlikely they will ever implement such a mechanism. So just don't define `__slots__` if you want this kind of guarantee, or better even, add an ID to your instances to make the differences you rely on explicit. Cheers, Sven From steve at pearwood.info Wed Apr 18 15:22:55 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 18 Apr 2012 23:22:55 +1000 Subject: [Python-ideas] Providing a guarantee that instances of user-defined classes have distinct identities In-Reply-To: References: <4F8E506E.7040606@kozea.fr> Message-ID: <4F8EC02F.3090009@pearwood.info> Max Moroz wrote: > I'm really just trying to get a guarantee from the language that would > make my original code safe. As is, it relies on a very reasonable, but > undocumented, assumption about the behavior of built-in classes' > __new__ method > > The exact guarantee I need is: "Any built-in class' __new__ method > called with the cls argument set to a user-defined subclass, will > always return a new instance of type cls." > > (Emphasis on "new instance" - as opposed to a reference to an existing object.) I can't help feel that you are worrying about nothing. Why would a built-in class ever return an existing instance of a sub-class? While technically it would be possible, it would require the built-in class to keep a cache of instances for each subclass. Who is going to do that, and why would they bother? It seems to me that you're asking for a guarantee like: "Calling len() on a list will never randomly shuffle the list as a side-effect." The fact that len() doesn't shuffle the list as a side-effect is not a documented promise of the language. But does it have to be? 
Some things would be just stupid for any implementation to do. There is no limit to the number of stupid things an implementation might do, and for the language to specify that it doesn't do any of them is impossible. I think that __new__ returning an existing instance of a subclass would be one of those stupid things. After all, it is a *constructor*, it is supposed to construct a new instance, if it doesn't do so in the case of being called with a subclass argument it isn't living up to the implied contract. I guess what this comes down to is that I'm quite satisfied with the implied promise that constructors will construct new instances and don't think it is necessary to make that explicit. It's not that I object to your request as that I think it's unnecessary. -- Steven From maxmoroz at gmail.com Wed Apr 18 18:49:43 2012 From: maxmoroz at gmail.com (Max Moroz) Date: Wed, 18 Apr 2012 09:49:43 -0700 Subject: [Python-ideas] Providing a guarantee that instances of user-defined classes have distinct identities In-Reply-To: <4F8EC02F.3090009@pearwood.info> References: <4F8E506E.7040606@kozea.fr> <4F8EC02F.3090009@pearwood.info> Message-ID: After reading the comments, and especially the one below, I am now persuaded that this implied guarantee is sufficient. Part of the problem was that I didn't have a clear picture of what __new__ is supposed to do when called with a (proper) subclass argument. Now I (hopefully correctly) understand that a well-behaved __new__ should in this case simply pass the call to object.__new__, or at least do something very similar: the subclass has every right to expect this behavior to remain unchanged whether or not one of its parent classes defined a custom __new__. No explicit guarantee is needed to confirm that this is what built-in classes do. Steven D'Aprano wrote: > "Calling len() on a list will never randomly shuffle the list as a > side-effect." 
> > The fact that len() doesn't shuffle the list as a side-effect is not a > documented promise of the language. But does it have to be? Some things > would be just stupid for any implementation to do. There is no limit to the > number of stupid things an implementation might do, and for the language to > specify that it doesn't do any of them is impossible. From sven at marnach.net Thu Apr 19 12:35:13 2012 From: sven at marnach.net (Sven Marnach) Date: Thu, 19 Apr 2012 11:35:13 +0100 Subject: [Python-ideas] Providing a guarantee that instances of user-defined classes have distinct identities In-Reply-To: <4F8EC02F.3090009@pearwood.info> References: <4F8E506E.7040606@kozea.fr> <4F8EC02F.3090009@pearwood.info> Message-ID: <20120419103513.GO30763@bagheera> Steven D'Aprano schrieb am Wed, 18. Apr 2012, um 23:22:55 +1000: > I can't help feel that you are worrying about nothing. Why would a > built-in class ever return an existing instance of a sub-class? > While technically it would be possible, it would require the > built-in class to keep a cache of instances for each subclass. This was also my first reaction; there is one case, though, which you wouldn't need a cache for: if the constructor is called with an instance of the subclass as an argument. As an example, the tuple implementation does not have a cache of instances, and reuses only tuples that are directly passed to the constructor:

>>> a = 1, 2
>>> b = 1, 2
>>> a is b
False
>>> b = tuple(a)
>>> a is b
True

It wouldn't be completely unthinkable that a Python implementation chooses to extend this behaviour to immutable subclasses of immutable types. I don't think there is any reason to disallow such an implementation either.
Cheers, Sven From mwm at mired.org Thu Apr 19 14:58:28 2012 From: mwm at mired.org (Mike Meyer) Date: Thu, 19 Apr 2012 08:58:28 -0400 Subject: [Python-ideas] Providing a guarantee that instances of user-defined classes have distinct identities In-Reply-To: <20120419103513.GO30763@bagheera> References: <4F8E506E.7040606@kozea.fr> <4F8EC02F.3090009@pearwood.info> <20120419103513.GO30763@bagheera> Message-ID: <20120419085828.6c58185d@bhuda.mired.org> On Thu, 19 Apr 2012 11:35:13 +0100 Sven Marnach wrote: > It wouldn't be completely unthinkable that a Python implementation > chooses to extend this behaviour to immutable subclasses of immutable > types. I don't think there is any reason to disallow such an > implementation either. How would the implementation determine that the subclass was immutable? http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From simon.sapin at kozea.fr Thu Apr 19 15:31:02 2012 From: simon.sapin at kozea.fr (Simon Sapin) Date: Thu, 19 Apr 2012 15:31:02 +0200 Subject: [Python-ideas] Providing a guarantee that instances of user-defined classes have distinct identities In-Reply-To: <20120419085828.6c58185d@bhuda.mired.org> References: <4F8E506E.7040606@kozea.fr> <4F8EC02F.3090009@pearwood.info> <20120419103513.GO30763@bagheera> <20120419085828.6c58185d@bhuda.mired.org> Message-ID: <4F901396.5000506@kozea.fr> Le 19/04/2012 14:58, Mike Meyer a écrit : > On Thu, 19 Apr 2012 11:35:13 +0100 > Sven Marnach wrote: >> > It wouldn't be completely unthinkable that a Python implementation >> > chooses to extend this behaviour to immutable subclasses of immutable >> > types. I don't think there is any reason to disallow such an >> > implementation either. > How would the implementation determine that the subclass was > immutable? When all classes in the MRO except 'object' have an empty __slots__?
(object behaves just as if it had an empty __slots__, but accessing object.__slots__ raises an AttributeError.) -- Simon Sapin From amauryfa at gmail.com Thu Apr 19 15:48:42 2012 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Thu, 19 Apr 2012 15:48:42 +0200 Subject: [Python-ideas] Providing a guarantee that instances of user-defined classes have distinct identities In-Reply-To: <4F901396.5000506@kozea.fr> References: <4F8E506E.7040606@kozea.fr> <4F8EC02F.3090009@pearwood.info> <20120419103513.GO30763@bagheera> <20120419085828.6c58185d@bhuda.mired.org> <4F901396.5000506@kozea.fr> Message-ID: 2012/4/19 Simon Sapin > Le 19/04/2012 14:58, Mike Meyer a écrit : > > On Thu, 19 Apr 2012 11:35:13 +0100 >> Sven Marnach wrote: >>> > It wouldn't be completely unthinkable that a Python implementation >>> > chooses to extend this behaviour to immutable subclasses of immutable >>> > types. I don't think there is any reason to disallow such an >>> > implementation either. >>> >> How would the implementation determine that the subclass was >> immutable? >> > > When all classes in the MRO except 'object' have an empty __slots__? > But you can still change the __class__ of an object, even immutable ones. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From contact at xavierho.com Fri Apr 20 14:32:55 2012 From: contact at xavierho.com (Xavier Ho) Date: Fri, 20 Apr 2012 22:32:55 +1000 Subject: [Python-ideas] Have dict().update() return its own reference. Message-ID: Hello, What's the rationale behind the fact that `dict().update()` returns nothing? If it returned the dictionary reference, at least we could chain methods, or assign it to another variable, or pass it into a function, etc.. What's the design decision made behind this? Cheers, Xav -------------- next part -------------- An HTML attachment was scrubbed...
URL: From contact at xavierho.com Fri Apr 20 14:37:45 2012 From: contact at xavierho.com (Xavier Ho) Date: Fri, 20 Apr 2012 22:37:45 +1000 Subject: [Python-ideas] Have dict().update() return its own reference. In-Reply-To: References: Message-ID: Thanks, that's fair, for consistency. One use case for my question was a stackoverflow question regarding merging two dict's. If update() returned its own reference, and if we explicitly wanted a copy (instead of an in-place modification), we could have used dict(x).update(y) given x and y are both dict() instances. Cheers, Xav On 20 April 2012 22:35, Laurens Van Houtven <_ at lvh.cc> wrote: > As a general rule, methods/functions in Python either *mutate* or > *return*. (Obviously, mutating methods also return, they just return None) > > For example: random.shuffle shuffles in place so doesn't return anything > list.sort sorts in place so doesn't return anything > sorted creates a new sorted thing, so returns that sorted thing > > cheers > lvh > > > > On 20 Apr 2012, at 14:32, Xavier Ho wrote: > > > Hello, > > > > What's the rationale behind the fact that `dict().update()` return > nothing? If it returned the dictionary reference, at least we could chain > methods, or assign it to another variable, or pass it into a function, etc.. > > > > What's the design decision made behind this? > > > > Cheers, > > Xav > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From masklinn at masklinn.net Fri Apr 20 14:47:34 2012 From: masklinn at masklinn.net (Masklinn) Date: Fri, 20 Apr 2012 14:47:34 +0200 Subject: [Python-ideas] Have dict().update() return its own reference. 
In-Reply-To: References: Message-ID: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> On 2012-04-20, at 14:37 , Xavier Ho wrote: > Thanks, that's fair, for consistency. > > One use case for my question was a stackoverflow question regarding merging > two dict's. If update() returned its own reference, and if we explicitly > wanted a copy (instead of an in-place modification), we could have used > > dict(x).update(y) > > given x and y are both dict() instances. If you start from dict instances, you could always use: merged = dict(x, **y) From contact at xavierho.com Fri Apr 20 14:48:38 2012 From: contact at xavierho.com (Xavier Ho) Date: Fri, 20 Apr 2012 22:48:38 +1000 Subject: [Python-ideas] Have dict().update() return its own reference. In-Reply-To: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> References: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> Message-ID: On 20 April 2012 22:47, Masklinn wrote: > > If you start from dict instances, you could always use: > > merged = dict(x, **y) > I heard that Guido wasn't a fan of this. Cheers, Xav -------------- next part -------------- An HTML attachment was scrubbed... URL: From masklinn at masklinn.net Fri Apr 20 14:58:46 2012 From: masklinn at masklinn.net (Masklinn) Date: Fri, 20 Apr 2012 14:58:46 +0200 Subject: [Python-ideas] Have dict().update() return its own reference. In-Reply-To: References: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> Message-ID: <0D667BF1-9EB0-447C-AF8F-01E262F98B65@masklinn.net> On 2012-04-20, at 14:48 , Xavier Ho wrote: > On 20 April 2012 22:47, Masklinn wrote: >> >> If you start from dict instances, you could always use: >> >> merged = dict(x, **y) >> > > I heard that Guido wasn't a fan of this. Works to merge two dicts in a single expression, if you don't want to define a wrapper function and find a name for it. 
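[Editorial note: the wrapper function Masklinn mentions is only a few lines. A sketch, with the `merged` name chosen for illustration; later mappings win on duplicate keys, and no restriction is placed on key types.]

```python
def merged(*dicts):
    """Return a new dict combining the arguments; later mappings win."""
    result = {}
    for d in dicts:
        result.update(d)
    return result

print(merged({"a": 1, "b": 2}, {"b": 20, "c": 30}))
# prints: {'a': 1, 'b': 20, 'c': 30}
```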
From sven at marnach.net Fri Apr 20 15:37:10 2012 From: sven at marnach.net (Sven Marnach) Date: Fri, 20 Apr 2012 14:37:10 +0100 Subject: [Python-ideas] Have dict().update() return its own reference. In-Reply-To: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> References: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> Message-ID: <20120420133710.GR30763@bagheera> Masklinn schrieb am Fri, 20. Apr 2012, um 14:47:34 +0200: > If you start from dict instances, you could always use: > > merged = dict(x, **y) No, not always. Only if all keys of `y` are strings (and probably they should also be valid Python identifiers.) Cheers, Sven From stefan_ml at behnel.de Fri Apr 20 16:28:28 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 20 Apr 2012 16:28:28 +0200 Subject: [Python-ideas] Have dict().update() return its own reference. In-Reply-To: <20120420133710.GR30763@bagheera> References: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> <20120420133710.GR30763@bagheera> Message-ID: Sven Marnach, 20.04.2012 15:37: > Masklinn schrieb am Fri, 20. Apr 2012, um 14:47:34 +0200: >> If you start from dict instances, you could always use: >> >> merged = dict(x, **y) > > No, not always. Only if all keys of `y` are strings (and probably > they should also be valid Python identifiers.) Also, it's not immediately clear from the expression what happens for duplicate keys, and the intended behaviour for that case may be different from what the above does. Stefan From alexander.belopolsky at gmail.com Fri Apr 20 16:35:55 2012 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 20 Apr 2012 10:35:55 -0400 Subject: [Python-ideas] Have dict().update() return its own reference. In-Reply-To: <20120420133710.GR30763@bagheera> References: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> <20120420133710.GR30763@bagheera> Message-ID: On Fri, Apr 20, 2012 at 9:37 AM, Sven Marnach wrote: >> If you start from dict instances, you could always use: >> >> ? ? 
merged = dict(x, **y) > > No, not always. ?Only if all keys of `y` are strings (and probably > they should also be valid Python identifiers.) >>> a = {} >>> b = {1:2} >>> dict(a, **b) {1: 2} From _ at lvh.cc Fri Apr 20 16:41:24 2012 From: _ at lvh.cc (Laurens Van Houtven) Date: Fri, 20 Apr 2012 16:41:24 +0200 Subject: [Python-ideas] Have dict().update() return its own reference. In-Reply-To: <20120420133710.GR30763@bagheera> References: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> <20120420133710.GR30763@bagheera> Message-ID: <61BA1F20-2717-4CEF-9AFA-8582D5DC23A2@lvh.cc> That's not actually true. **kwargs can contain things that aren't strings :) cheers lvh On 20 Apr 2012, at 15:37, Sven Marnach wrote: > Masklinn schrieb am Fri, 20. Apr 2012, um 14:47:34 +0200: >> If you start from dict instances, you could always use: >> >> merged = dict(x, **y) > > No, not always. Only if all keys of `y` are strings (and probably > they should also be valid Python identifiers.) > > Cheers, > Sven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From stefan_ml at behnel.de Fri Apr 20 16:49:17 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 20 Apr 2012 16:49:17 +0200 Subject: [Python-ideas] Have dict().update() return its own reference. In-Reply-To: References: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> <20120420133710.GR30763@bagheera> Message-ID: Alexander Belopolsky, 20.04.2012 16:35: > On Fri, Apr 20, 2012 at 9:37 AM, Sven Marnach wrote: >>> If you start from dict instances, you could always use: >>> >>> merged = dict(x, **y) >> >> No, not always. Only if all keys of `y` are strings (and probably >> they should also be valid Python identifiers.) > > >>> a = {} > >>> b = {1:2} > >>> dict(a, **b) > {1: 2} That's no guaranteed behaviour, though. 
It doesn't work in PyPy, for example: >>>> a={} >>>> b={1:2} >>>> dict(a,**b) Traceback (most recent call last): File "", line 1, in TypeError: keywords must be strings (and, no, it's not PyPy that's wrong here) Stefan From masklinn at masklinn.net Fri Apr 20 16:56:04 2012 From: masklinn at masklinn.net (Masklinn) Date: Fri, 20 Apr 2012 16:56:04 +0200 Subject: [Python-ideas] Have dict().update() return its own reference. In-Reply-To: References: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> <20120420133710.GR30763@bagheera> Message-ID: On 2012-04-20, at 16:28 , Stefan Behnel wrote: > Sven Marnach, 20.04.2012 15:37: >> Masklinn schrieb am Fri, 20. Apr 2012, um 14:47:34 +0200: >>> If you start from dict instances, you could always use: >>> >>> merged = dict(x, **y) >> >> No, not always. Only if all keys of `y` are strings (and probably >> they should also be valid Python identifiers.) > > Also, it's not immediately clear from the expression what happens for > duplicate keys Not sure why, as with `dict.update` `dict` is defined as setting from the first argument, then setting from the keyword arguments (overriding keys originally set if any). Now of course that might not be obvious to people who don't know how dict works, but I fail to see why an other function which they don't know either will be any more "immediately clear". You may counter that a function taking (and merging) a sequence of mappings would "obviously" apply a left fold in merging the mappings, but in that case the dict constructor would "obviously" copy the positional then apply the keywords (which are after the positional). Which is exactly what happens. From alexander.belopolsky at gmail.com Fri Apr 20 17:00:28 2012 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 20 Apr 2012 11:00:28 -0400 Subject: [Python-ideas] Have dict().update() return its own reference. 
In-Reply-To: References: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> <20120420133710.GR30763@bagheera> Message-ID: On Fri, Apr 20, 2012 at 10:49 AM, Stefan Behnel wrote: >> >>> a = {} >> >>> b = {1:2} >> >>> dict(a, **b) >> {1: 2} > > That's no guaranteed behaviour, though. It doesn't work in PyPy, for example. I seem to recall that CPython had a similar limitation in the past, but it was removed at some point. I will try to dig out the relevant discussion, but I think the consensus was that ** should not attempt to validate the keys. From sven at marnach.net Fri Apr 20 17:47:05 2012 From: sven at marnach.net (Sven Marnach) Date: Fri, 20 Apr 2012 16:47:05 +0100 Subject: [Python-ideas] Have dict().update() return its own reference. In-Reply-To: References: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> <20120420133710.GR30763@bagheera> Message-ID: <20120420154705.GS30763@bagheera> Alexander Belopolsky schrieb am Fri, 20. Apr 2012, um 11:00:28 -0400: > I seem to recall that CPython had a similar limitation in the past, > but it was removed at some point. I will try to dig out the relevant > discussion, but I think the consensus was that ** should not attempt to > validate the keys. It's the other way around. Your code used to work in Python 2.x, but it doesn't work in Python 3.x. Cheers, Sven From guido at python.org Fri Apr 20 18:21:50 2012 From: guido at python.org (Guido van Rossum) Date: Fri, 20 Apr 2012 09:21:50 -0700 Subject: [Python-ideas] Have dict().update() return its own reference.
> > I seem to recall that CPython had a similar limitation in the past, > but it was removed at some point. I will try to dig out the relevant > discussion, but I think the consensus was that ** should not attempt to > validate the keys. They should be strings. There is no requirement that they are valid identifiers. -- --Guido van Rossum (python.org/~guido) From victor.varvariuc at gmail.com Fri Apr 20 20:30:09 2012 From: victor.varvariuc at gmail.com (Victor Varvariuc) Date: Fri, 20 Apr 2012 21:30:09 +0300 Subject: [Python-ideas] Have dict().update() return its own reference. In-Reply-To: References: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> <20120420133710.GR30763@bagheera> Message-ID: >>> a = {} >>> b = {1:2} >>> dict(a, **b) If b is a huge dict - not a good approach -- *Victor Varvariuc* From masklinn at masklinn.net Fri Apr 20 21:49:10 2012 From: masklinn at masklinn.net (Masklinn) Date: Fri, 20 Apr 2012 21:49:10 +0200 Subject: [Python-ideas] Have dict().update() return its own reference. In-Reply-To: References: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> <20120420133710.GR30763@bagheera> Message-ID: On 2012-04-20, at 20:30 , Victor Varvariuc wrote: >>>> a = {} >>>> b = {1:2} >>>> dict(a, **b) > > If b is a huge dict - not a good approach If they're huge mappings, you probably don't want to go around copying them either way[0] and would instead use more custom mappings, either some sort of joining proxy or something out of Okasaki (a clojure-style tree-based map with structural sharing for instance) [0] I'm pretty sure "being fast to copy when bloody huge" is not at the forefront of Python's dict priorities. From p.f.moore at gmail.com Fri Apr 20 23:41:42 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 20 Apr 2012 22:41:42 +0100 Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: References: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> <20120420133710.GR30763@bagheera> Message-ID: On 20 April 2012 20:49, Masklinn wrote: > If they're huge mappings, you probably don't want to go around copying > them either way[0] and would instead use more custom mappings, either > some sort of joining proxy or something out of Okasaki (a clojure-style > tree-based map with structural sharing for instance) Python 3.3 has collections.ChainMap for this sort of case. Paul. From masklinn at masklinn.net Sat Apr 21 01:10:35 2012 From: masklinn at masklinn.net (Masklinn) Date: Sat, 21 Apr 2012 01:10:35 +0200 Subject: [Python-ideas] Have dict().update() return its own reference. In-Reply-To: References: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> <20120420133710.GR30763@bagheera> Message-ID: <27E61C7A-0F78-4825-A235-227882A64C6A@masklinn.net> On 2012-04-20, at 23:41 , Paul Moore wrote: > On 20 April 2012 20:49, Masklinn wrote: >> If they're huge mappings, you probably don't want to go around copying >> them either way[0] and would instead use more custom mappings, either >> some sort of joining proxy or something out of Okasaki (a clojure-style >> tree-based map with structural sharing for instance) > > Python 3.3 has collections.ChainMap for this sort of case. Yeah, it's an example of the "joining proxy" thing. Though I'm not sure I like it being editable, or the lookup order when providing a sequence of maps (I haven't tested it but it appears the maps sequence is traversed front-to-back, I'd have found the other way around more "obvious", as if each sub-mapping was applied to a base through an update call). Another potential weirdness of this solution -- I don't know how ChainMap behaves there, the documentation is unclear --
is iteration over the map and mapping.items() versus [(key, mapping[key]) for key in mapping] potentially having very different behaviors/values since the former is going to return all key:value pairs but the latter is only going to return the key:(first value for key) pairs which may lead to significant repetitions any time a key is present in multiple contexts. From steve at pearwood.info Sat Apr 21 02:24:41 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 21 Apr 2012 10:24:41 +1000 Subject: [Python-ideas] Have dict().update() return its own reference. In-Reply-To: <27E61C7A-0F78-4825-A235-227882A64C6A@masklinn.net> References: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> <20120420133710.GR30763@bagheera> <27E61C7A-0F78-4825-A235-227882A64C6A@masklinn.net> Message-ID: <4F91FE49.2000704@pearwood.info> Masklinn wrote: > On 2012-04-20, at 23:41 , Paul Moore wrote: > >> On 20 April 2012 20:49, Masklinn wrote: >>> If they're huge mappings, you probably don't want to go around copying >>> them either way[0] and would instead use more custom mappings, either >>> some sort of joining proxy or something out of Okasaki (a clojure-style >>> tree-based map with structural sharing for instance) >> Python 3.3 has collections.ChainMap for this sort of case. > > Yeah, it's an example of the "joining proxy" thing. Though I'm not sure > I like it being editable, or the lookup order when providing a sequence > of maps (I haven't tested it but it appears the maps sequence is traversed > front-to-back, I'd have found the other way around more "obvious", as if > each sub-mapping was applied to a base through an update call). ChainMap is meant to emulate scoped lookups, e.g. builtins + globals + nonlocals + locals. Hence, newer scopes mask older scopes. "Locals" should be fast, hence it is at the front. As for being editable, I'm not sure what you mean here, but surely you don't object to it being mutable? > An other potential weirdness of this solution ? 
I don't know how ChainMap > behaves there, the documentation is unclear ? is iteration over the map > and mapping.items() versus [(key, mapping[key]) for key in mapping] > potentially having very different behaviors/values since the former is > going to return all key:value pairs but the latter is only going to return > the key:(first value for key) pairs which may lead to significant repetitions > any time a key is present in multiple contexts. No, iteration over the ChainMap returns unique keys, not duplicates. >>> from collections import ChainMap >>> mapping = ChainMap(dict(a=1, b=2, c=3, d=4)) >>> mapping = mapping.new_child() >>> mapping.update(dict(d=5, e=6, f=7)) >>> mapping = mapping.new_child() >>> mapping.update(dict(f=8, g=9, h=10)) >>> >>> len(mapping) 8 >>> mapping ChainMap({'h': 10, 'g': 9, 'f': 8}, {'e': 6, 'd': 5, 'f': 7}, {'a': 1, 'c': 3, 'b': 2, 'd': 4}) >>> list(mapping.keys()) ['h', 'a', 'c', 'b', 'e', 'd', 'g', 'f'] >>> list(mapping.values()) [10, 1, 3, 2, 6, 5, 9, 8] -- Steven From masklinn at masklinn.net Sat Apr 21 14:53:27 2012 From: masklinn at masklinn.net (Masklinn) Date: Sat, 21 Apr 2012 14:53:27 +0200 Subject: [Python-ideas] Have dict().update() return its own reference. 
In-Reply-To: <4F91FE49.2000704@pearwood.info> References: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> <20120420133710.GR30763@bagheera> <27E61C7A-0F78-4825-A235-227882A64C6A@masklinn.net> <4F91FE49.2000704@pearwood.info> Message-ID: <55920183-6C39-49F6-9017-8D358F7ED739@masklinn.net> On 2012-04-21, at 02:24 , Steven D'Aprano wrote: > Masklinn wrote: >> On 2012-04-20, at 23:41 , Paul Moore wrote: >>> On 20 April 2012 20:49, Masklinn wrote: >>>> If they're huge mappings, you probably don't want to go around copying >>>> them either way[0] and would instead use more custom mappings, either >>>> some sort of joining proxy or something out of Okasaki (a clojure-style >>>> tree-based map with structural sharing for instance) >>> Python 3.3 has collections.ChainMap for this sort of case. >> Yeah, it's an example of the "joining proxy" thing. Though I'm not sure >> I like it being editable, or the lookup order when providing a sequence >> of maps (I haven't tested it but it appears the maps sequence is traversed >> front-to-back, I'd have found the other way around more "obvious", as if >> each sub-mapping was applied to a base through an update call). > > ChainMap is meant to emulate scoped lookups yes, my notes were in the context of the thread considering chainmap as a proxy for multiple mappings, I understand this was not the primary use case for chainmap > , e.g. builtins + globals + nonlocals + locals. Hence, newer scopes mask older scopes. "Locals" should be fast, hence it is at the front. That's just a question of traversal order for the maps sequence, if the sequence is in the order you specify there: [builtins, globals, nonlocals, locals] then it can be traversed from the back for locals to have the highest priority. The difference in speed should be almost nil > As for being editable, I'm not sure what you mean here, but surely you don't object to it being mutable? 
I do, though again that's considering the usage of chainmap as a proxy, not as a scope chain. >> An other potential weirdness of this solution ? I don't know how ChainMap >> behaves there, the documentation is unclear ? is iteration over the map >> and mapping.items() versus [(key, mapping[key]) for key in mapping] >> potentially having very different behaviors/values since the former is >> going to return all key:value pairs but the latter is only going to return >> the key:(first value for key) pairs which may lead to significant repetitions >> any time a key is present in multiple contexts. > > No, iteration over the ChainMap returns unique keys, not duplicates. Ah, that's good. Would probably warrant mention in the documentation though. From stefan_ml at behnel.de Sat Apr 21 15:40:37 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 21 Apr 2012 15:40:37 +0200 Subject: [Python-ideas] Have dict().update() return its own reference. In-Reply-To: <55920183-6C39-49F6-9017-8D358F7ED739@masklinn.net> References: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net> <20120420133710.GR30763@bagheera> <27E61C7A-0F78-4825-A235-227882A64C6A@masklinn.net> <4F91FE49.2000704@pearwood.info> <55920183-6C39-49F6-9017-8D358F7ED739@masklinn.net> Message-ID: Masklinn, 21.04.2012 14:53: > On 2012-04-21, at 02:24 , Steven D'Aprano wrote: >> Masklinn wrote: >>> An other potential weirdness of this solution ? I don't know how ChainMap >>> behaves there, the documentation is unclear ? is iteration over the map >>> and mapping.items() versus [(key, mapping[key]) for key in mapping] >>> potentially having very different behaviors/values since the former is >>> going to return all key:value pairs but the latter is only going to return >>> the key:(first value for key) pairs which may lead to significant repetitions >>> any time a key is present in multiple contexts. >> >> No, iteration over the ChainMap returns unique keys, not duplicates. > > Ah, that's good. 
Would probably warrant mention in the documentation though. What would you want to see there? "This class works as expected even when iterating over it"? Stefan From ericsnowcurrently at gmail.com Sun Apr 22 10:59:54 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sun, 22 Apr 2012 02:59:54 -0600 Subject: [Python-ideas] sys.implementation In-Reply-To: References: Message-ID: On Thu, Mar 22, 2012 at 7:22 PM, Nick Coghlan wrote: > On Thu, Mar 22, 2012 at 4:37 AM, Benjamin Peterson wrote: >> Eric Snow writes: >>> >>> I'd like to move this forward, so any objections or feedback at this >>> point would be helpful. >> >> I would like to see a concrete proposal of what would get put in there. > > +1 > > A possible starting list: > > - impl name (with a view to further standardising the way we check for > impl specific tests in the regression test suite) > - impl version (official place to store the implementation version, > potentially independent of the language version as it already is in > PyPy) > - cache tag (replacement for imp.get_tag()) This is a great start, Nick. Having a solid sys.implementation would (for one) help with the import machinery, as your list suggests. I'm planning on reviewing that old thread over the next few days. [1] In the meantime, any additional thoughts by anyone on what would go into sys.implementation would be very helpful. -eric p.s. is python-ideas the right place for this discussion? Also, ultimately this topic should go into a PEP, right? 
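A sketch of the kind of namespace Nick's list describes (the values and the exact shape here are illustrative only; the real design would be for the PEP to settle):

```python
import sys
from types import SimpleNamespace

# Hypothetical shape for a sys.implementation attribute, following
# the list above: impl name, impl version, and cache tag.
implementation = SimpleNamespace(
    name="cpython",                   # implementation name, lowercase
    version=sys.version_info[:3],     # implementation version; may differ
                                      # from the language version (cf. PyPy)
    cache_tag="cpython-33",           # would replace imp.get_tag()
)

print(implementation.name)
print(implementation.cache_tag)
```

Code that today sniffs `platform.python_implementation()` or `sys.pypy_version_info` could then test `sys.implementation.name` instead.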
[1] http://mail.python.org/pipermail/python-dev/2009-October/092893.html From mark at hotpy.org Sun Apr 22 11:57:13 2012 From: mark at hotpy.org (Mark Shannon) Date: Sun, 22 Apr 2012 10:57:13 +0100 Subject: [Python-ideas] sys.implementation In-Reply-To: References: Message-ID: <4F93D5F9.4050402@hotpy.org> Eric Snow wrote: > On Thu, Mar 22, 2012 at 7:22 PM, Nick Coghlan wrote: >> On Thu, Mar 22, 2012 at 4:37 AM, Benjamin Peterson wrote: >>> Eric Snow writes: >>>> I'd like to move this forward, so any objections or feedback at this >>>> point would be helpful. >>> I would like to see a concrete proposal of what would get put in there. >> +1 >> >> A possible starting list: >> >> - impl name (with a view to further standardising the way we check for >> impl specific tests in the regression test suite) >> - impl version (official place to store the implementation version, >> potentially independent of the language version as it already is in >> PyPy) >> - cache tag (replacement for imp.get_tag()) One more: GC (reference-counting, copying, mark-and-sweep, generational, etc.) > > This is a great start, Nick. Having a solid sys.implementation would > (for one) help with the import machinery, as your list suggests. I'm > planning on reviewing that old thread over the next few days. [1] In > the meantime, any additional thoughts by anyone on what would go into > sys.implementation would be very helpful. > > -eric > > p.s. is python-ideas the right place for this discussion? Also, > ultimately this topic should go into a PEP, right? 
> > > [1] http://mail.python.org/pipermail/python-dev/2009-October/092893.html > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From solipsis at pitrou.net Sun Apr 22 13:30:24 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 22 Apr 2012 13:30:24 +0200 Subject: [Python-ideas] sys.implementation References: <4F93D5F9.4050402@hotpy.org> Message-ID: <20120422133024.71125555@pitrou.net> On Sun, 22 Apr 2012 10:57:13 +0100 Mark Shannon wrote: > > One more: > > GC (reference-counting, copying, mark-and-sweep, generational, etc.) I think this would sound more natural in the gc module itself. Regards Antoine. From nestornissen at gmail.com Mon Apr 23 03:07:36 2012 From: nestornissen at gmail.com (Nestor) Date: Sun, 22 Apr 2012 21:07:36 -0400 Subject: [Python-ideas] Haskell envy Message-ID: The other day a colleague of mine submitted this challenge taken from some website to us coworkers: Have the function ArrayAddition(arr) take the array of numbers stored in arr and print true if any combination of numbers in the array can be added up to equal the largest number in the array, otherwise print false. For example: if arr contains [4, 6, 23, 10, 1, 3] the output should print true because 4 + 6 + 10 + 3 = 23. The array will not be empty, will not contain all the same elements, and may contain negative numbers. 
Examples: 5,7,16,1,2 3,5,-1,8,12 I proposed this solution: from itertools import combinations, chain def array_test(arr): biggest = max(arr) arr.remove(biggest) my_comb = chain(*(combinations(arr,i) for i in range(1,len(arr)+1))) for one_comb in my_comb: if sum(one_comb) == biggest: return True, one_comb return False But then somebody else submitted this Haskell solution: import Data.List f :: (Ord a,Num a) => [a] -> Bool f lst = (\y -> elem y $ map sum $ subsequences $ [ x | x <- lst, x /= y ]) $ maximum lst I wonder if we should add a subsequences function to itertools or make the "r" parameter of combinations optional to return all combinations up to len(iterable). From ncoghlan at gmail.com Mon Apr 23 03:26:38 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 23 Apr 2012 11:26:38 +1000 Subject: [Python-ideas] Haskell envy In-Reply-To: References: Message-ID: On Mon, Apr 23, 2012 at 11:07 AM, Nestor wrote: > I wonder if we should add a subsequences function to itertools or make > the "r" parameter of combinations optional to return all combinations > up to len(iterable). Why? It just makes itertools using code that much harder to read, since you'd have yet another variant to learn. If you need it, then just define a separate "all_combinations()" that makes it clear what is going on (replace yield from usage with itertools.chain() for Python < 3.3): from itertools import combinations def all_combinations(data): for num_items in range(1, len(data)+1): yield from combinations(data, num_items) def array_test(arr): biggest = max(arr) data = [x for x in arr if x != biggest] for combo in all_combinations(data): if sum(combo) == biggest: return True, combo return False, None When a gain in brevity increases the necessary level of assumed knowledge for future maintainers, it isn't a clear win from a language design point of view. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From tjreedy at udel.edu Mon Apr 23 04:55:49 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 22 Apr 2012 22:55:49 -0400 Subject: [Python-ideas] Haskell envy In-Reply-To: References: Message-ID: On 4/22/2012 9:07 PM, Nestor wrote: > The other day a colleague of mine submitted this challenge taken from > some website to us coworkers: This is more of a python-list post than a python-ideas post. So is my response. > Have the function ArrayAddition(arr) take the array of numbers stored > in arr and print true if any combination of numbers in the array can > be added up to equal the largest number in the array, otherwise print > false. For example: if arr contains [4, 6, 23, 10, 1, 3] the output > should print true because 4 + 6 + 10 + 3 = 23. The array will not be > empty, will not contain all the same elements, and may contain > negative numbers. Since the order of the numbers is arbitrary and irrelevant to the problem, it should be formulated in terms of a set of numbers. Non-empty means that max(set) exists. No duplicates means that max(set) is unique and there is no question of whether to remove one or all copies. > Examples: > 5,7,16,1,2 False (5+7+1+2 == 15) > 3,5,-1,8,12 True (8+5-1 == 12) > I proposed this solution: > > from itertools import combinations, chain > def array_test(arr): > biggest = max(arr) > arr.remove(biggest) > my_comb = chain(*(combinations(arr,i) for i in range(1,len(arr)+1))) I believe the above works for sets as well as lists. > for one_comb in my_comb: > if sum(one_comb) == biggest: > return True, one_comb > return False The above is much clearer to me than the Haskell (or my equivalent) below, and I suspect that would be true even if I really knew Haskell. If you want something more like the Haskell version, try return any(filter(x == biggest, map(sum, my_comb))) To make it even more like a one-liner, replace my_comb with its expression.
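Spelled out runnably (note that `filter`'s first argument must be a callable, so the working equivalent of that one-liner is a generator expression; the `all_combinations` helper name here is illustrative, not from the thread):

```python
from itertools import chain, combinations

def all_combinations(data):
    # Every combination of 1..len(data) elements, i.e. the expression
    # chain(*(combinations(data, i) for i in range(1, len(data) + 1))).
    return chain.from_iterable(
        combinations(data, i) for i in range(1, len(data) + 1))

def array_test(arr):
    biggest = max(arr)
    rest = list(arr)
    rest.remove(biggest)
    # any() over a genexp replaces any(filter(...)): filter would need
    # a callable such as (lambda s: s == biggest), not a bare comparison.
    return any(sum(combo) == biggest for combo in all_combinations(rest))

print(array_test([4, 6, 23, 10, 1, 3]))  # True
print(array_test([5, 7, 16, 1, 2]))      # False
print(array_test([3, 5, -1, 8, 12]))     # True
```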
> But then somebody else submitted this Haskell solution: > > import Data.List > f :: (Ord a,Num a) => [a] -> Bool > f lst = (\y -> elem y $ map sum $ subsequences $ [ x | x<- lst, x /= > y ]) $ maximum lst If you really want one logical line that I believe matches the above ;-) def sum_to_max(numbers): return ( (lambda num_max: (lambda nums: any(filter(lambda n: n == num_max, map(sum, chain(*(combinations(nums,i) for i in range(1,len(nums)+1)))))) ) (numbers - {num_max}) # because numbers.remove(num_max)is None ) (max(numbers)) ) print(sum_to_max({5,7,16,1,2}) == False, sum_to_max({3,5,-1,8,12}) == True) # sets because of numbers - {num_max} True, True Comment 1: debugging nested expressions is harder than debugging a sequence of statements because cannot insert print statements. Comment 2: the function should really return an iterator that yields the sets that add to the max. Then the user can decide whether or not to collapse the information to True/False and stop after the first. > I wonder if we should add a subsequences function to itertools or make > the "r" parameter of combinations optional to return all combinations > up to len(iterable). I think not. The goal of itertools is to provide basic iterators that can be combined to produce specific iterators, just as you did. -- Terry Jan Reedy From pyideas at rebertia.com Mon Apr 23 05:18:36 2012 From: pyideas at rebertia.com (Chris Rebert) Date: Sun, 22 Apr 2012 20:18:36 -0700 Subject: [Python-ideas] Haskell envy In-Reply-To: References: Message-ID: On Sun, Apr 22, 2012 at 7:55 PM, Terry Reedy wrote: > On 4/22/2012 9:07 PM, Nestor wrote: >> Have the function ArrayAddition(arr) take the array of numbers stored >> in arr and print true if any combination of numbers in the array can >> be added up to equal the largest number in the array, otherwise print >> false. For example: if arr contains [4, 6, 23, 10, 1, 3] the output >> should print true because 4 + 6 + 10 + 3 = 23. 
The array will not be >> empty, will not contain all the same elements, and may contain >> negative numbers. > > Since the order of the numbers is arbitrary and irrelevant to the problem, > it should be formulated in terms of a set of numbers. Er, multiplicity still matters, so it should be a multiset/bag. One possible representation thereof would be a list... Cheers, Chris From jeanpierreda at gmail.com Mon Apr 23 06:30:43 2012 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Mon, 23 Apr 2012 00:30:43 -0400 Subject: [Python-ideas] Haskell envy In-Reply-To: References: Message-ID: On Sun, Apr 22, 2012 at 10:55 PM, Terry Reedy wrote: > Comment 1: debugging nested expressions is harder than debugging a sequence > of statements because cannot insert print statements. Also, pdb has no support for debugging expressions. -- Devin From hobsonlane at gmail.com Mon Apr 23 07:21:10 2012 From: hobsonlane at gmail.com (Hobson Lane) Date: Mon, 23 Apr 2012 13:21:10 +0800 Subject: [Python-ideas] Anyone working on a platform-agnostic os.startfile() Message-ID: There is significant interest in a cross-platform file-launcher.[1][2][3][4] The ideal implementation would be an operating-system-agnostic interface that launches a file for editing or viewing, similar to the way os.startfile() works for Windows, but generalized to allow caller-specification of view vs. edit preference and support all registered os.name operating systems, not just 'nt'. Mercurial has a mature python implementation for cross-platform launching of an editor (either GUI editor or terminal-based editor like vi).[5][6] The python std lib os.startfile obviously works for Windows. The Mercurial functionality could be rolled into os.startfile() with additional named parameters for edit or view preference and gui or non-gui preference. Perhaps that would enable backporting below Python 3.x. Or is there a better place to incorporate this multi-platform file launching capability? [1]: http://stackoverflow.com/questions/1856792/intelligently-launching-the-default-editor-from-inside-a-python-cli-program [2]: http://stackoverflow.com/questions/434597/open-document-with-default-application-in-python [3]: http://stackoverflow.com/questions/1442841/lauch-default-editor-like-webbrowser-module [4]: http://stackoverflow.com/questions/434597/open-document-with-default-application-in-python [5]: http://selenic.com/repo/hg-stable/file/2770d03ae49f/mercurial/ui.py [6]: http://selenic.com/repo/hg-stable/file/2770d03ae49f/mercurial/util.py From pyideas at rebertia.com Mon Apr 23 07:57:30 2012 From: pyideas at rebertia.com (Chris Rebert) Date: Sun, 22 Apr 2012 22:57:30 -0700 Subject: [Python-ideas] Anyone working on a platform-agnostic os.startfile()
[1]: http://stackoverflow.com/questions/1856792/intelligently-launching-the-default-editor-from-inside-a-python-cli-program [2]: http://stackoverflow.com/questions/434597/open-document-with-default-application-in-python [3]: http://stackoverflow.com/questions/1442841/lauch-default-editor-like-webbrowser-module [4]: http://stackoverflow.com/questions/434597/open-document-with-default-application-in-python [5]: http://selenic.com/repo/hg-stable/file/2770d03ae49f/mercurial/ui.py [6]: http://selenic.com/repo/hg-stable/file/2770d03ae49f/mercurial/util.py -------------- next part -------------- An HTML attachment was scrubbed... URL: From pyideas at rebertia.com Mon Apr 23 07:57:30 2012 From: pyideas at rebertia.com (Chris Rebert) Date: Sun, 22 Apr 2012 22:57:30 -0700 Subject: [Python-ideas] Anyone working on a platform-agnostic os.startfile() In-Reply-To: References: Message-ID: On Sun, Apr 22, 2012 at 10:21 PM, Hobson Lane wrote: > There is significant interest in a cross-platform > file-launcher.[1][2][3][4]??The ideal implementation would be > an?operating-system-agnostic interface that launches a file for editing or > viewing, similar to the way os.startfile() works for Windows, but > generalized to allow caller-specification of view vs. edit preference and > support all registered os.name operating systems, not just 'nt'. There is an existing open bug that requests such a feature: "Add shutil.open": http://bugs.python.org/issue3177 Cheers, Chris From hobsonlane at gmail.com Mon Apr 23 13:10:27 2012 From: hobsonlane at gmail.com (Hobson Lane) Date: Mon, 23 Apr 2012 19:10:27 +0800 Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 43 In-Reply-To: References: Message-ID: Formatted and finished Rebert's solution to this issue http://bugs.python.org/issue3177 But the question of where to put it is still open ( shutil.open vs. shutil.launch vs. os.startfile ): 1. 
`shutil.open()` will break anyone that does `from shutil import *` or edits the shutil.py file and tries to use the builtin open() after the shutil.open() definition. 2. `shutil.launch()` is better than shutil.open() due to reduced breakage, but not as simple or DRY or reverse-compatible as putting it in os.startfile() in my mind. This fix just implements the functionality of os.startfile() for non-Windows OSes. 3. `shutil.startfile()` was recommended against by a developer or two on this mailing list, but seems appropriate to me. The only upstream "breakage" for an os.startfile() location that I can think of is the failure to raise exceptions on non-Windows OSes. Any legacy (<3.0) code that relies on os.startfile() exceptions in order to detect a non-windows OS is misguided and needs re-factoring anyway, IMHO. Though their only indication of a "problem" in their code would be the successful launching of a viewer for whatever path they pointed to... 4. `os.launch()` anyone? Not me. On Mon, Apr 23, 2012 at 6:00 PM, wrote: > Send Python-ideas mailing list submissions to > python-ideas at python.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://mail.python.org/mailman/listinfo/python-ideas > or, via email, send a message with subject or body 'help' to > python-ideas-request at python.org > > You can reach the person managing the list at > python-ideas-owner at python.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of Python-ideas digest..." > > > Today's Topics: > > 1. Anyone working on a platform-agnostic os.startfile() (Hobson Lane) > 2. 
Re: Anyone working on a platform-agnostic os.startfile() > (Chris Rebert) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Mon, 23 Apr 2012 13:21:10 +0800 > From: Hobson Lane > To: python-ideas at python.org > Cc: Hobson's Totalgood Aliases > Subject: [Python-ideas] Anyone working on a platform-agnostic > os.startfile() > Message-ID: > > > Content-Type: text/plain; charset="iso-8859-1" > > There is significant interest in a cross-platform > file-launcher.[1][2][3][4] The ideal implementation would be > an operating-system-agnostic interface that launches a file for editing or > viewing, similar to the way os.startfile() works for Windows, but > generalized to allow caller-specification of view vs. edit preference and > support all registered os.name operating systems, not just 'nt'. > > Mercurial has a mature python implementation for cross-platform launching > of an editor (either GUI editor or terminal-based editor like vi).[5][6] > The python std lib os.startfile obviously works for Windows. > > The Mercurial functionality could be rolled into os.startfile() with > additional named parameters for edit or view preference and gui or non-gui > preference. Perhaps that would enable backporting belwo Python 3.x. Or is > there a better place to incorporate this multi-platform file launching > capability? 
> > [1]: > > http://stackoverflow.com/questions/1856792/intelligently-launching-the-default-editor-from-inside-a-python-cli-program > [2]: > > http://stackoverflow.com/questions/434597/open-document-with-default-application-in-python > [3]: > > http://stackoverflow.com/questions/1442841/lauch-default-editor-like-webbrowser-module > [4]: > > http://stackoverflow.com/questions/434597/open-document-with-default-application-in-python > [5]: http://selenic.com/repo/hg-stable/file/2770d03ae49f/mercurial/ui.py > [6]: > http://selenic.com/repo/hg-stable/file/2770d03ae49f/mercurial/util.py > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: < > http://mail.python.org/pipermail/python-ideas/attachments/20120423/e672411c/attachment-0001.html > > > > ------------------------------ > > Message: 2 > Date: Sun, 22 Apr 2012 22:57:30 -0700 > From: Chris Rebert > Cc: python-ideas at python.org > Subject: Re: [Python-ideas] Anyone working on a platform-agnostic > os.startfile() > Message-ID: > > > Content-Type: text/plain; charset=UTF-8 > > On Sun, Apr 22, 2012 at 10:21 PM, Hobson Lane > wrote: > > There is significant interest in a cross-platform > > file-launcher.[1][2][3][4]??The ideal implementation would be > > an?operating-system-agnostic interface that launches a file for editing > or > > viewing, similar to the way os.startfile() works for Windows, but > > generalized to allow caller-specification of view vs. edit preference and > > support all registered os.name operating systems, not just 'nt'. 
> > There is an existing open bug that requests such a feature: > "Add shutil.open": http://bugs.python.org/issue3177 > > Cheers, > Chris > > > ------------------------------ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > > End of Python-ideas Digest, Vol 65, Issue 43 > ******************************************** > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pyideas at rebertia.com Mon Apr 23 14:01:15 2012 From: pyideas at rebertia.com (Chris Rebert) Date: Mon, 23 Apr 2012 05:01:15 -0700 Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 43 In-Reply-To: References: Message-ID: On Mon, Apr 23, 2012 at 4:10 AM, Hobson Lane wrote: > On Mon, Apr 23, 2012 at 6:00 PM, wrote: >> >> Send Python-ideas mailing list submissions to >> ? ? ? ?python-ideas at python.org >> >> To subscribe or unsubscribe via the World Wide Web, visit >> ? ? ? ?http://mail.python.org/mailman/listinfo/python-ideas >> or, via email, send a message with subject or body 'help' to >> ? ? ? ?python-ideas-request at python.org >> >> You can reach the person managing the list at >> ? ? ? ?python-ideas-owner at python.org >> >> When replying, please edit your Subject line so it is more specific >> than "Re: Contents of Python-ideas digest..." Please avoid replying to the digest; it breaks conversation threading. Switch to a non-digest mailing list subscription when not lurking. Cheers, Chris From solipsis at pitrou.net Mon Apr 23 14:02:33 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 23 Apr 2012 14:02:33 +0200 Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 43 References: Message-ID: <20120423140233.052ed5e0@pitrou.net> Or at least, if you do reply to the digest, please change the e-mail subject line to something informative. Regards Antoine. 
On Mon, 23 Apr 2012 05:01:15 -0700 Chris Rebert wrote: > On Mon, Apr 23, 2012 at 4:10 AM, Hobson Lane wrote: > > > On Mon, Apr 23, 2012 at 6:00 PM, wrote: > >> > >> Send Python-ideas mailing list submissions to > >> ? ? ? ?python-ideas at python.org > >> > >> To subscribe or unsubscribe via the World Wide Web, visit > >> ? ? ? ?http://mail.python.org/mailman/listinfo/python-ideas > >> or, via email, send a message with subject or body 'help' to > >> ? ? ? ?python-ideas-request at python.org > >> > >> You can reach the person managing the list at > >> ? ? ? ?python-ideas-owner at python.org > >> > >> When replying, please edit your Subject line so it is more specific > >> than "Re: Contents of Python-ideas digest..." > > Please avoid replying to the digest; it breaks conversation threading. > Switch to a non-digest mailing list subscription when not lurking. > > Cheers, > Chris > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From solipsis at pitrou.net Mon Apr 23 14:04:32 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 23 Apr 2012 14:04:32 +0200 Subject: [Python-ideas] shutil.launch / open/ startfile References: Message-ID: <20120423140432.2980d52b@pitrou.net> On Mon, 23 Apr 2012 19:10:27 +0800 Hobson Lane wrote: > Formatted and finished Rebert's solution to this issue > > http://bugs.python.org/issue3177 > > But the question of where to put it is still open ( shutil.open vs. > shutil.launch vs. os.startfile ): > > 1. `shutil.open()` will break anyone that does `from shutil import *` or > edits the shutil.py file and tries to use the builtin open() after the > shutil.open() definition. > > 2. `shutil.launch()` is better than shutil.open() due to reduced breakage, > but not as simple or DRY or reverse-compatible as putting it in > os.startfile() in my mind. This fix just implements the functionality of > os.startfile() for non-Windows OSes. 
> > 3. `shutil.startfile()` was recommended against by a developer or two on > this mailing list, but seems appropriate to me. The only upstream > "breakage" for an os.startfile() location that I can think of is the > failure to raise exceptions on non-Windows OSes. Any legacy (<3.0) code > that relies on os.startfile() exceptions in order to detect a non-windows > OS is misguided and needs re-factoring anyway, IMHO. Though their only > indication of a "problem" in their code would be the successful launching > of a viewer for whatever path they pointed to... Both shutil.launch() and shutil.startfile() are fine to me. I must admit launch() sounds a bit more obvious than startfile(). I am -1 on shutil.open(). Regards Antoine. From ncoghlan at gmail.com Mon Apr 23 18:43:59 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 24 Apr 2012 02:43:59 +1000 Subject: [Python-ideas] shutil.launch / open/ startfile In-Reply-To: <20120423140432.2980d52b@pitrou.net> References: <20120423140432.2980d52b@pitrou.net> Message-ID: On Mon, Apr 23, 2012 at 10:04 PM, Antoine Pitrou wrote: > On Mon, 23 Apr 2012 19:10:27 +0800 > Hobson Lane wrote: >> Formatted and finished Rebert's solution to this issue >> >> http://bugs.python.org/issue3177 >> >> But the question of where to put it is still open ( shutil.open vs. >> shutil.launch vs. os.startfile ): >> >> 1. `shutil.open()` will break anyone that does `from shutil import *` or >> edits the shutil.py file and tries to use the builtin open() after the >> shutil.open() definition. >> >> 2. `shutil.launch()` is better than shutil.open() due to reduced breakage, >> but not as simple or DRY or reverse-compatible as putting it in >> os.startfile() in my mind. This fix just implements the functionality of >> os.startfile() for non-Windows OSes. >> >> 3. `shutil.startfile()` was recommended against by a developer or two on >> this mailing list, but seems appropriate to me. 
The only upstream >> "breakage" for an os.startfile() location that I can think of is the >> failure to raise exceptions on non-Windows OSes. Any legacy (<3.0) code >> that relies on os.startfile() exceptions in order to detect a non-windows >> OS is misguided and needs re-factoring anyway, IMHO. Though their only >> indication of a "problem" in their code would be the successful launching >> of a viewer for whatever path they pointed to... > > Both shutil.launch() and shutil.startfile() are fine to me. I must > admit launch() sounds a bit more obvious than startfile(). > > I am -1 on shutil.open(). Indeed, my preference is the same (i.e. shutil.launch()). The file isn't being started - an application is being started to read the file. And, despite the existence of os.startfile(), this functionality feels too high level to be generally appropriate for the os module. Cheers. Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From steve at pearwood.info Mon Apr 23 18:52:16 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 24 Apr 2012 02:52:16 +1000 Subject: [Python-ideas] Haskell envy In-Reply-To: References: Message-ID: <4F9588C0.2020005@pearwood.info> Nestor wrote: [...] > But then somebody else submitted this Haskell solution: > > import Data.List > f :: (Ord a,Num a) => [a] -> Bool > f lst = (\y -> elem y $ map sum $ subsequences $ [ x | x <- lst, x /= > y ]) $ maximum lst > > > I wonder if we should add a subsequences function to itertools or make > the "r" parameter of combinations optional to return all combinations > up to len(iterable). You really don't do your idea any favours by painting it as "Haskell envy". I have mixed ideas on this. On the one hand, subsequences is a natural function to include along with permutations and combinations in any combinatorics tool set. As you point out, Haskell has it. So does Mathematica, under the name "subsets". 
But on the other hand, itertools is arguably not the right place for it. If we add subsequences, will people then ask for partitions and derangements next? Where will it end? At some point the line needs to be drawn, with some functions declared "too specialised" for the general itertools module. (Perhaps there should be a separate combinatorics module.) And on the third hand, what you want is a simple two-liner, easy enough to write in place:

from itertools import chain, combinations
allcombs = chain(*(combinations(data, i) for i in range(len(data)+1)))

which is close to what your code used. However, the discoverability of this solution is essentially zero (if you don't know how to do this, it's hard to find out), and it is hardly as self-documenting as

from itertools import subsequences
allcombs = subsequences(data)

Overall, I would vote +0.5 on a subsequences function. -- Steven From steve at pearwood.info Mon Apr 23 19:24:54 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 24 Apr 2012 03:24:54 +1000 Subject: [Python-ideas] Haskell envy In-Reply-To: References: Message-ID: <4F959066.6090504@pearwood.info> Nestor wrote: > The other day a colleague of mine submitted this challenge taken from > some website to us coworkers: > > Have the function ArrayAddition(arr) take the array of numbers stored > in arr and print true if any combination of numbers in the array can > be added up to equal the largest number in the array, otherwise print > false. For example: if arr contains [4, 6, 23, 10, 1, 3] the output > should print true because 4 + 6 + 10 + 3 = 23. The array will not be > empty, will not contain all the same elements, and may contain > negative numbers. By the way, your solution is wrong. Consider this sample data: -2, -5, 0, -1, -3 In this case, the Haskell solution should correctly print true, while yours will print false, because you skip the empty subset. The empty sum equals the maximum value of the set, 0.
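Steven's empty-subset point is easy to check with the powerset recipe from the itertools docs; a quick sketch of the challenge (the name `array_addition` is made up for illustration):

```python
from itertools import chain, combinations

def powerset(iterable):
    # All subsets, including the empty tuple (the itertools docs recipe).
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def array_addition(arr):
    # True if some combination of the remaining numbers sums to the maximum.
    largest = max(arr)
    rest = list(arr)
    rest.remove(largest)  # drop one occurrence only; duplicates matter
    return any(sum(combo) == largest for combo in powerset(rest))
```

For [-2, -5, 0, -1, -3] the maximum is 0 and the empty subset sums to 0, so this version correctly returns True.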
The ease with which people can get this wrong is an argument in favour of a standard solution. -- Steven From raymond.hettinger at gmail.com Mon Apr 23 21:55:42 2012 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Mon, 23 Apr 2012 12:55:42 -0700 Subject: [Python-ideas] Haskell envy In-Reply-To: <4F9588C0.2020005@pearwood.info> References: <4F9588C0.2020005@pearwood.info> Message-ID: <647E82A1-48ED-4025-8195-19982D8BC441@gmail.com> On Apr 23, 2012, at 9:52 AM, Steven D'Aprano wrote: > However, the discoverability of this solution is essentially zero That exact code has been in the documentation for years:

def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

The whole purpose of the itertools recipes is to teach how the itertools can be readily combined to build new tools. http://docs.python.org/library/itertools.html#module-itertools Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Mon Apr 23 22:14:11 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 23 Apr 2012 16:14:11 -0400 Subject: [Python-ideas] Anyone working on a platform-agnostic os.startfile() In-Reply-To: References: Message-ID: [subject corrected from digest...] On 4/23/2012 7:10 AM, Hobson Lane wrote: > Formatted and finished Rebert's solution to this issue > > http://bugs.python.org/issue3177 > > But the question of where to put it is still open ( shutil.open vs. > shutil.launch vs. os.startfile ): > > 1. `shutil.open()` will break anyone that does `from shutil import *` or > edits the shutil.py file and tries to use the builtin open() after the > shutil.open() definition. > > 2. `shutil.launch()` is better than shutil.open() due to reduced > breakage, but not as simple or DRY or reverse-compatible as putting it > in os.startfile() in my mind.
This fix just implements the functionality > of os.startfile() for non-Windows OSes. > > 3. `shutil.startfile()` was recommended against by a developer or two on > this mailing list, but seems appropriate to me. The only upstream > "breakage" for an os.startfile() location that I can think of is the > failure to raise exceptions on non-Windows OSes. Any legacy (<3.0) code > that relies on os.startfile() exceptions in order to detect a > non-windows OS is misguided and needs re-factoring anyway, IMHO. Though > their only indication of a "problem" in their code would be the > successful launching of a viewer for whatever path they pointed to... > > 4. `os.launch()` anyone? Not me. The functions in os are intended to be thin wrappers around os functions, and a launcher is not. So shutil seems a better place. Since launch or startfile means more than just 'open', but open to edit or run, I would not use 'open'. -- Terry Jan Reedy From tjreedy at udel.edu Mon Apr 23 22:28:50 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 23 Apr 2012 16:28:50 -0400 Subject: [Python-ideas] Haskell envy In-Reply-To: References: Message-ID: On 4/22/2012 11:18 PM, Chris Rebert wrote: > On Sun, Apr 22, 2012 at 7:55 PM, Terry Reedy wrote: >> On 4/22/2012 9:07 PM, Nestor wrote: > >>> Have the function ArrayAddition(arr) take the array of numbers stored >>> in arr and print true if any combination of numbers in the array can >>> be added up to equal the largest number in the array, otherwise print >>> false. For example: if arr contains [4, 6, 23, 10, 1, 3] the output >>> should print true because 4 + 6 + 10 + 3 = 23. The array will not be >>> empty, will not contain all the same elements, and may contain >>> negative numbers. >> >> Since the order of the numbers is arbitrary and irrelevant to the problem, >> it should be formulated in term of a set of numbers. > > Er, multiplicity still matters, so it should be a multiset/bag. One > possible representation thereof would be a list... 
Er, yes. Given the examples, I (too quickly) misread 'will not contain all the same elements' as 'no duplicates'. In any case, a set was needed for the functional version as there is no 'list1 - list2' expression that returns the list1 minus the items in list2. (Well, I could have defined an auxiliary list sub function, but that is beside the point of the example.) -- Terry Jan Reedy From ethan at stoneleaf.us Mon Apr 23 21:51:19 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 23 Apr 2012 12:51:19 -0700 Subject: [Python-ideas] shutil.launch / open/ startfile In-Reply-To: References: Message-ID: <4F95B2B7.6030409@stoneleaf.us> Hobson Lane wrote: > Formatted and finished Rebert's solution to this issue > > http://bugs.python.org/issue3177 > > But the question of where to put it is still open ( shutil.open vs. > shutil.launch vs. os.startfile ): > > 1. `shutil.open()` will break anyone that does `from shutil import *` or > edits the shutil.py file and tries to use the builtin open() after the > shutil.open() definition. `from ... import *` should not be used with any module that does not explicitly support it, and shutil does not. How often do people modify library modules? > 2. `shutil.launch()` is better than shutil.open() due to reduced > breakage, but not as simple or DRY or reverse-compatible as putting it > in os.startfile() in my mind. This fix just implements the functionality > of os.startfile() for non-Windows OSes. 
+1 for shutil.launch() ~Ethan~ From tjreedy at udel.edu Mon Apr 23 22:30:16 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 23 Apr 2012 16:30:16 -0400 Subject: [Python-ideas] Haskell envy In-Reply-To: <4F959066.6090504@pearwood.info> References: <4F959066.6090504@pearwood.info> Message-ID: On 4/23/2012 1:24 PM, Steven D'Aprano wrote: > Nestor wrote: >> The other day a colleague of mine submitted this challenge taken from >> some website to us coworkers: >> >> Have the function ArrayAddition(arr) take the array of numbers stored >> in arr and print true if any combination of numbers in the array can >> be added up to equal the largest number in the array, otherwise print >> false. For example: if arr contains [4, 6, 23, 10, 1, 3] the output >> should print true because 4 + 6 + 10 + 3 = 23. The array will not be >> empty, will not contain all the same elements, and may contain >> negative numbers. > > By the way, your solution is wrong. Consider this sample data: > > -2, -5, 0, -1, -3 > > In this case, the Haskell solution should correctly print true, while > yours will print false, because you skip the empty subset. The empty sum > equals the maximum value of the set, 0. > > The ease at which people can get this wrong is an argument in favour of > a standard solution. I call it an argument for writing tests first, starting with the most simple and trivial case(s). 
-- Terry Jan Reedy From ned at nedbatchelder.com Tue Apr 24 01:45:37 2012 From: ned at nedbatchelder.com (Ned Batchelder) Date: Mon, 23 Apr 2012 19:45:37 -0400 Subject: [Python-ideas] Haskell envy In-Reply-To: <647E82A1-48ED-4025-8195-19982D8BC441@gmail.com> References: <4F9588C0.2020005@pearwood.info> <647E82A1-48ED-4025-8195-19982D8BC441@gmail.com> Message-ID: <4F95E9A1.3070001@nedbatchelder.com> On 4/23/2012 3:55 PM, Raymond Hettinger wrote: > > On Apr 23, 2012, at 9:52 AM, Steven D'Aprano wrote: > >> However, the discoverability of this solution is essentially zero > > That exact code has been in the documentation for years: > > def powerset(iterable): > "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)" > s = list(iterable) > return chain.from_iterable(combinations(s, r) for r in range(len(s)+1)) > > The whole purpose of the itertools recipes are to teach how > the itertools can be readily combined to build new tools. > > http://docs.python.org/library/itertools.html#module-itertools > > Raymond's "that code has been in the docs for years," and Steven's "the discoverability of this solution is essentially zero" are not contradictions. It sounds like we need a better way to find the information in the itertools docs. For example, there is no index entry for "powerset", and I don't know what term Steven tried looking it up with. Sounds like you two could work together to make people more aware of the tools we've already got. --Ned. > Raymond > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ericsnowcurrently at gmail.com Tue Apr 24 08:42:54 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 24 Apr 2012 00:42:54 -0600 Subject: [Python-ideas] sys.implementation In-Reply-To: References: Message-ID: On Mon, Mar 19, 2012 at 10:41 PM, Eric Snow wrote: > In October 2009 there was a short flurry of interest in adding > "sys.implementation" as an object to encapsulate some > implementation-specific information [1]. ?Does anyone recollect where > this proposal went? ?Would anyone object to reviving it (or a > variant)? The premise is that sys.implementation would be a "namedtuple" (like sys.version_info). It would contain (as much as is practical) the information that is specific to a particular implementation of Python. "Required" attributes of sys.implementation would be those that the standard library makes use of. For instance, importlib would make use of sys.implementation.name (or sys.implementation.cache_tag) if there were one. The thread from 2009 covered a lot of this ground already. [1] Here are the "required" attributes of sys.implementation that I advocate: * name (mixed case; sys.implementation.name.lower() used as an identifier) * version (of the implementation, not of the targeted language version; structured like sys.version_info?) Here are other variables that _could_ go in sys.implementation: * cache_tag (e.g. 'cpython33' for CPython 3.3) * repository * repository_revision * build_toolchain * url (or website) * site_prefix * runtime Let's start with a minimum set of expected attributes, which would have an immediate purpose in the stdlib. However, let's not disallow implementations from adding whatever other attributes are meaningful for them. 
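To make the proposed shape concrete, a hypothetical sketch of such a namedtuple (all values are illustrative; this is not an actual sys attribute here, and the field names are taken from the lists above):

```python
from collections import namedtuple

# "Required" fields plus one of the optional extras from the proposal.
Implementation = namedtuple("Implementation", "name version cache_tag")

implementation = Implementation(
    name="cpython",          # lower-cased, usable as an identifier
    version=(3, 3, 0),       # version of the implementation itself
    cache_tag="cpython33",   # e.g. used for bytecode cache file names
)

# The kind of generic use importlib could then make of it:
def cached_name(module_name, impl=implementation):
    return "{0}.{1}.pyc".format(module_name, impl.cache_tag)
```

With this shape, importlib never needs to hard-code per-implementation knowledge; it just reads the fields.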
-eric [1] http://mail.python.org/pipermail/python-dev/2009-October/092893.html From p.f.moore at gmail.com Mon Apr 23 20:50:14 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 23 Apr 2012 19:50:14 +0100 Subject: [Python-ideas] Haskell envy In-Reply-To: <4F9588C0.2020005@pearwood.info> References: <4F9588C0.2020005@pearwood.info> Message-ID: On 23 April 2012 17:52, Steven D'Aprano wrote: > I have mixed ideas on this. On the one hand, subsequences is a natural > function to include along with permutations and combinations in any > combinatorics tool set. As you point out, Haskell has it. So does > Mathematica, under the name "subsets". > > But on the other hand, itertools is arguably not the right place for it. If > we add subsequences, will people then ask for partitions and derangements > next? Where will it end? At some point the line needs to be drawn, with some > functions declared "too specialised" for the general itertools module. > > (Perhaps there should be a separate combinatorics module.) It seems to me that you're precisely right here - it's *not* right to keep adding this sort of thing to itertools, or as you say it will never end. On the other hand, it's potentially useful, and not necessarily immediately obvious. So I'd say it's exactly right for a 3rd party "combinatorics" module on PyPI. Or if no-one thinks it's sufficiently useful to write and maintain one, then as a simple named function in your application. Once your application has a combinatorics module within it, that's the point where you should think about releasing that module separately... Of course, I'm arguing theoretically here. I've never even used itertools.combinations, so I have no real need for *any* of this. If people who do use this sort of thing on a regular basis have other opinions, then that's a much stronger argument :-) Paul. 
From ncoghlan at gmail.com Tue Apr 24 13:37:18 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 24 Apr 2012 21:37:18 +1000 Subject: [Python-ideas] Haskell envy In-Reply-To: References: <4F9588C0.2020005@pearwood.info> Message-ID: On Tue, Apr 24, 2012 at 4:50 AM, Paul Moore wrote: > Of course, I'm arguing theoretically here. I've never even used > itertools.combinations, so I have no real need for *any* of this. If > people who do use this sort of thing on a regular basis have other > opinions, then that's a much stronger argument :-) In practice, you'll be doing prefiltering and other conditioning on your combinations to weed out nonsensical variants cheaply, or your combinations will be coming from a database query or map reduce result. So I personally think it's the kind of thing that comes up in toy programming exercises and various academic explorations rather than something that solves a significant practical need. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From techtonik at gmail.com Tue Apr 24 17:42:24 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Tue, 24 Apr 2012 18:42:24 +0300 Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 43 In-Reply-To: References: Message-ID: On Mon, Apr 23, 2012 at 3:01 PM, Chris Rebert wrote: > On Mon, Apr 23, 2012 at 4:10 AM, Hobson Lane wrote: > >> On Mon, Apr 23, 2012 at 6:00 PM, wrote: >>> >>> Send Python-ideas mailing list submissions to >>> ? ? ? ?python-ideas at python.org >>> >>> To subscribe or unsubscribe via the World Wide Web, visit >>> ? ? ? ?http://mail.python.org/mailman/listinfo/python-ideas >>> or, via email, send a message with subject or body 'help' to >>> ? ? ? ?python-ideas-request at python.org >>> >>> You can reach the person managing the list at >>> ? ? ? ?python-ideas-owner at python.org >>> >>> When replying, please edit your Subject line so it is more specific >>> than "Re: Contents of Python-ideas digest..." 
> > Please avoid replying to the digest; it breaks conversation threading. > Switch to a non-digest mailing list subscription when not lurking. But to reply to a non-digest message you need to receive it in non-digest mode, which didn't happen already. The only way it makes sense is when you ask Mailman to resend the message again. I don't know if that's possible. -- anatoly t. From ericsnowcurrently at gmail.com Tue Apr 24 21:23:53 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 24 Apr 2012 13:23:53 -0600 Subject: [Python-ideas] breaking out of module execution Message-ID: In a function you can use a return statement to break out of execution in the middle of the function. With modules you have no recourse. This is akin to return statements being allowed only at the end of a function. There are a small number of ways you can work around this, but they aren't great. This includes using wrapper modules or import hooks or sometimes from-import-*. Otherwise, if your module's execution is conditional, you end up indenting everything inside an if/else statement. Proposal: introduce a non-error mechanism to break out of module execution. This could be satisfied by a statement like break or return, though those specific ones could be confusing. It could also involve raising a special subclass of ImportError that the import machinery simply handles as not-an-error. This came up last year on python-list with mixed results. [1] However, time has not dimmed the appeal for me so I'm rebooting here. While the proposal seems relatively minor, the use cases are not extensive. The three main ones I've encountered are these:

1. C extension module with fallback to pure Python:

    try:
        from _extension_module import *
    except ImportError:
        pass
    else:
        break  # or whatever color the bikeshed is
    # pure python implementation goes here

2. module imported under different name:

    if __name__ != "expected_name":
        from expected_name import *
        break
    # business as usual

3. module already imported under a different name:

    if "other_module" in sys.modules:
        exec("from other_module import *", globals())
        break
    # module code here

Thoughts? -eric [1] http://mail.python.org/pipermail/python-list/2011-June/1274424.html From solipsis at pitrou.net Tue Apr 24 21:40:58 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 24 Apr 2012 21:40:58 +0200 Subject: [Python-ideas] breaking out of module execution References: Message-ID: <20120424214058.22540616@pitrou.net> On Tue, 24 Apr 2012 13:23:53 -0600 Eric Snow wrote: > In a function you can use a return statement to break out of execution > in the middle of the function. With modules you have no recourse. > This is akin to return statements being allowed only at the end of a > function. > > There are a small number of ways you can work around this, but they > aren't great. This includes using wrapper modules or import hooks or > sometimes from-import-*. Otherwise, if your module's execution is > conditional, you end up indenting everything inside an if/else > statement. I think good practice should lead you to put your initialization code in a dedicated function that you call from your module toplevel. In this case, breaking out of execution is a matter of adding a return statement. I'm not sure the particular use cases you brought up are a good enough reason to add a syntactical construct. Regards Antoine. From mal at egenix.com Tue Apr 24 21:58:35 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 24 Apr 2012 21:58:35 +0200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <20120424214058.22540616@pitrou.net> References: <20120424214058.22540616@pitrou.net> Message-ID: <4F9705EB.1070702@egenix.com> Antoine Pitrou wrote: > On Tue, 24 Apr 2012 13:23:53 -0600 > Eric Snow > wrote: >> In a function you can use a return statement to break out of execution >> in the middle of the function. With modules you have no recourse.
>> This is akin to return statements being allowed only at the end of a >> function. >> >> There are a small number of ways you can work around this, but they >> aren't great. This includes using wrapper modules or import hooks or >> sometimes from-import-*. Otherwise, if your module's execution is >> conditional, you end up indenting everything inside an if/else >> statement. > > I think good practice should lead you to put your initialization code > in a dedicated function that you call from your module toplevel. In > this case, breaking out of execution is a matter of adding a return > statement. True, but that doesn't prevent import from being run, functions and classes from being defined and resources being bound which are not going to get used. Think of code like this (let's assume the "break" statement is used for stopping module execution): """ # # MyModule # ### Try using the fast variant try: from MyModule_C_Extension import * except ImportError: pass else: # Stop execution of the module code object right here break ### Ah, well, so go ahead with the slow version import os, sys from MyOtherPackage import foo, bar, baz class MyClass: ... def MyFunc(a,b,c): ... def main(): ... if __name__ == '__main__': main() """ You can solve this by using two separate modules and a top-level module to switch between the implementations, but that's cumbersome if you have more than just a few of such modules in a package. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 24 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2012-04-28: PythonCamp 2012, Cologne, Germany 4 days to go ::: Try our new mxODBC.Connect Python Database Interface for free ! 
:::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From g.brandl at gmx.net Tue Apr 24 22:11:53 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 24 Apr 2012 22:11:53 +0200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F9705EB.1070702@egenix.com> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> Message-ID: On 24.04.2012 21:58, M.-A. Lemburg wrote: > Antoine Pitrou wrote: >> On Tue, 24 Apr 2012 13:23:53 -0600 >> Eric Snow >> wrote: >>> In a function you can use a return statement to break out of execution >>> in the middle of the function. With modules you have no recourse. >>> This is akin to return statements being allowed only at the end of a >>> function. >>> >>> There are a small number of ways you can work around this, but they >>> aren't great. This includes using wrapper modules or import hooks or >>> sometimes from-import-*. Otherwise, if your module's execution is >>> conditional, you end up indenting everything inside an if/else >>> statement. >> >> I think good practice should lead you to put your initialization code >> in a dedicated function that you call from your module toplevel. In >> this case, breaking out of execution is a matter of adding a return >> statement. > > True, but that doesn't prevent import from being run, functions and > classes from being defined and resources being bound which are not > going to get used. What's wrong with an if statement on module level, if you even care about this? 
> Think of code like this (let's assume the "break" statement is used > for stopping module execution): > > """ > # > # MyModule > # > > ### Try using the fast variant > > try: > from MyModule_C_Extension import * > except ImportError: > pass > else: > # Stop execution of the module code object right here > break > > ### Ah, well, so go ahead with the slow version > > import os, sys > from MyOtherPackage import foo, bar, baz > > class MyClass: > ... > > def MyFunc(a,b,c): > ... > > def main(): > ... > > if __name__ == '__main__': > main() > """ There's a subtle bug here that shows that the proposed feature has its awkward points: you probably want to execute the "if __name__ == '__main__'" block in the C extension case as well. Georg From mark at hotpy.org Tue Apr 24 22:15:57 2012 From: mark at hotpy.org (Mark Shannon) Date: Tue, 24 Apr 2012 21:15:57 +0100 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F9705EB.1070702@egenix.com> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> Message-ID: <4F9709FD.4060901@hotpy.org> This has to be the only feature request that can be implemented by removing code :) I implemented this by deleting 2 lines of code from the compiler. M.-A. Lemburg wrote: > Antoine Pitrou wrote: >> On Tue, 24 Apr 2012 13:23:53 -0600 >> Eric Snow >> wrote: >>> In a function you can use a return statement to break out of execution >>> in the middle of the function. With modules you have no recourse. >>> This is akin to return statements being allowed only at the end of a >>> function. >>> >>> There are a small number of ways you can work around this, but they >>> aren't great. This includes using wrapper modules or import hooks or >>> sometimes from-import-*. Otherwise, if your module's execution is >>> conditional, you end up indenting everything inside an if/else >>> statement.
>> I think good practice should lead you to put your initialization code >> in a dedicated function that you call from your module toplevel. In >> this case, breaking out of execution is a matter of adding a return >> statement. > > True, but that doesn't prevent import from being run, functions and > classes from being defined and resources being bound which are not > going to get used. > > Think of code like this (let's assume the "break" statement is used > for stopping module execution): > > """ > # > # MyModule > # > > ### Try using the fast variant > > try: > from MyModule_C_Extension import * > except ImportError: > pass > else: > # Stop execution of the module code object right here > break It will have to be "return" not "break". > > ### Ah, well, so go ahead with the slow version > > import os, sys > from MyOtherPackage import foo, bar, baz > > class MyClass: > ... > > def MyFunc(a,b,c): > ... > > def main(): > ... > > if __name__ == '__main__': > main() > """ > > You can solve this by using two separate modules and a top-level > module to switch between the implementations, but that's cumbersome > if you have more than just a few of such modules in a package. > From mal at egenix.com Tue Apr 24 22:20:38 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 24 Apr 2012 22:20:38 +0200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> Message-ID: <4F970B16.2020702@egenix.com> Georg Brandl wrote: > On 24.04.2012 21:58, M.-A. Lemburg wrote: >> Antoine Pitrou wrote: >>> On Tue, 24 Apr 2012 13:23:53 -0600 >>> Eric Snow >>> wrote: >>>> In a function you can use a return statement to break out of execution >>>> in the middle of the function. With modules you have no recourse. >>>> This is akin to return statements being allowed only at the end of a >>>> function. >>>> >>>> There are a small number of ways you can work around this, but they >>>> aren't great. 
This includes using wrapper modules or import hooks or >>>> sometimes from-import-*. Otherwise, if your module's execution is >>>> conditional, you end up indenting everything inside an if/else >>>> statement. >>> >>> I think good practice should lead you to put your initialization code >>> in a dedicated function that you call from your module toplevel. In >>> this case, breaking out of execution is a matter of adding a return >>> statement. >> >> True, but that doesn't prevent import from being run, functions and >> classes from being defined and resources being bound which are not >> going to get used. > > What's wrong with an if statement on module level, if you even care > about this? You'd have to indent the whole module. Been there, done that, doesn't look nice :-) >> Think of code like this (let's assume the "break" statement is used >> for stopping module execution): >> >> """ >> # >> # MyModule >> # >> >> ### Try using the fast variant >> >> try: >> from MyModule_C_Extension import * >> except ImportError: >> pass >> else: >> # Stop execution of the module code object right here >> break >> >> ### Ah, well, so go ahead with the slow version >> >> import os, sys >> from MyOtherPackage import foo, bar, baz >> >> class MyClass: >> ... >> >> def MyFunc(a,b,c): >> ... >> >> def main(): >> ... >> >> if __name__ == '__main__': >> main() >> """ > > There's a subtle bug here that shows that the proposed feature has its > awkward points: you probably want to execute the "if __name__ == '__main__'" > block in the C extension case as well. No, you don't :-) If you would have wanted that to happen, you'd put the "if __name__..." into the else: branch. You think of the "break" as having the same functionality as a "return" in a function. If reusing a statement is too much trouble, the same functionality could be had with an exception that get's caught by the executing (import) code. 
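The exception-based variant sketched above can be approximated by hand today with exec(); a rough sketch, where StopModuleExecution and run_module_source are hypothetical names, not anything the import machinery actually provides:

```python
import types

# All names here are hypothetical -- a hand-rolled sketch of the
# "special ImportError subclass that the importer treats as a clean
# early exit" idea.
class StopModuleExecution(ImportError):
    """Raised inside a module body to end its execution early."""

def run_module_source(source, name="example"):
    # Execute module source; swallow StopModuleExecution as a
    # non-error early exit rather than propagating it.
    mod = types.ModuleType(name)
    mod.__dict__["StopModuleExecution"] = StopModuleExecution
    try:
        exec(compile(source, name, "exec"), mod.__dict__)
    except StopModuleExecution:
        pass
    return mod

mod = run_module_source(
    "x = 1\n"
    "raise StopModuleExecution\n"
    "y = 2\n"  # never executed
)
print(hasattr(mod, "x"), hasattr(mod, "y"))  # True False
```

A real implementation would of course live in the import machinery rather than in a helper like this, but the control flow would be the same.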
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 24 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2012-04-28: PythonCamp 2012, Cologne, Germany 4 days to go ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From g.brandl at gmx.net Tue Apr 24 22:29:28 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 24 Apr 2012 22:29:28 +0200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F970B16.2020702@egenix.com> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> Message-ID: On 24.04.2012 22:20, M.-A. Lemburg wrote: >>> Think of code like this (let's assume the "break" statement is used >>> for stopping module execution): >>> >>> """ >>> # >>> # MyModule >>> # >>> >>> ### Try using the fast variant >>> >>> try: >>> from MyModule_C_Extension import * >>> except ImportError: >>> pass >>> else: >>> # Stop execution of the module code object right here >>> break >>> >>> ### Ah, well, so go ahead with the slow version >>> >>> import os, sys >>> from MyOtherPackage import foo, bar, baz >>> >>> class MyClass: >>> ... >>> >>> def MyFunc(a,b,c): >>> ... >>> >>> def main(): >>> ... >>> >>> if __name__ == '__main__': >>> main() >>> """ >> >> There's a subtle bug here that shows that the proposed feature has its >> awkward points: you probably want to execute the "if __name__ == '__main__'" >> block in the C extension case as well. 
> > No, you don't :-) If you would have wanted that to happen, you'd > put the "if __name__..." into the else: branch. Not sure I understand. Your example code is flawed because it doesn't execute the main() for the C extension case. Of course you can duplicate the code in the else branch, but you didn't do it in the first place, which was the bug. Georg From mal at egenix.com Tue Apr 24 22:38:14 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 24 Apr 2012 22:38:14 +0200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> Message-ID: <4F970F36.2070208@egenix.com> Georg Brandl wrote: > On 24.04.2012 22:20, M.-A. Lemburg wrote: > >>>> Think of code like this (let's assume the "break" statement is used >>>> for stopping module execution): >>>> >>>> """ >>>> # >>>> # MyModule >>>> # >>>> >>>> ### Try using the fast variant >>>> >>>> try: >>>> from MyModule_C_Extension import * >>>> except ImportError: >>>> pass >>>> else: >>>> # Stop execution of the module code object right here >>>> break >>>> >>>> ### Ah, well, so go ahead with the slow version >>>> >>>> import os, sys >>>> from MyOtherPackage import foo, bar, baz >>>> >>>> class MyClass: >>>> ... >>>> >>>> def MyFunc(a,b,c): >>>> ... >>>> >>>> def main(): >>>> ... >>>> >>>> if __name__ == '__main__': >>>> main() >>>> """ >>> >>> There's a subtle bug here that shows that the proposed feature has its >>> awkward points: you probably want to execute the "if __name__ == '__main__'" >>> block in the C extension case as well. >> >> No, you don't :-) If you would have wanted that to happen, you'd >> put the "if __name__..." into the else: branch. > > Not sure I understand. Your example code is flawed because it doesn't execute > the main() for the C extension case. Of course you can duplicate the code in > the else branch, but you didn't do it in the first place, which was the bug. 
Ok, you got me :-) Should've paid more attention. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 24 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2012-04-28: PythonCamp 2012, Cologne, Germany 4 days to go ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ethan at stoneleaf.us Tue Apr 24 22:43:44 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 24 Apr 2012 13:43:44 -0700 Subject: [Python-ideas] breaking out of module execution In-Reply-To: References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> Message-ID: <4F971080.4000303@stoneleaf.us> Georg Brandl wrote: > On 24.04.2012 22:20, M.-A. Lemburg wrote: > >>>> Think of code like this (let's assume the "break" statement is used >>>> for stopping module execution): >>>> >>>> """ >>>> # >>>> # MyModule >>>> # >>>> >>>> ### Try using the fast variant >>>> >>>> try: >>>> from MyModule_C_Extension import * >>>> except ImportError: >>>> pass >>>> else: >>>> # Stop execution of the module code object right here >>>> break >>>> >>>> ### Ah, well, so go ahead with the slow version >>>> >>>> import os, sys >>>> from MyOtherPackage import foo, bar, baz >>>> >>>> class MyClass: >>>> ... >>>> >>>> def MyFunc(a,b,c): >>>> ... >>>> >>>> def main(): >>>> ... 
>>>> >>>> if __name__ == '__main__': >>>> main() >>>> """ >>> There's a subtle bug here that shows that the proposed feature has its >>> awkward points: you probably want to execute the "if __name__ == '__main__'" >>> block in the C extension case as well. >> No, you don't :-) If you would have wanted that to happen, you'd >> put the "if __name__..." into the else: branch. > > Not sure I understand. Your example code is flawed because it doesn't execute > the main() for the C extension case. Of course you can duplicate the code in > the else branch, but you didn't do it in the first place, which was the bug. It's only a bug if you *want* main() to execute in the C extension case -- so for M.A.L. it's not a bug (apparently he meant "I" when he wrote "you" ;) ~Ethan~ From fuzzyman at gmail.com Tue Apr 24 23:45:28 2012 From: fuzzyman at gmail.com (Michael Foord) Date: Tue, 24 Apr 2012 22:45:28 +0100 Subject: [Python-ideas] breaking out of module execution In-Reply-To: References: Message-ID: On 24 April 2012 20:23, Eric Snow wrote: > In a function you can use a return statement to break out of execution > in the middle of the function. With modules you have no recourse. > This is akin to return statements being allowed only at the end of a > function. > > There are a small number of ways you can work around this, but they > aren't great. This includes using wrapper modules or import hooks or > sometimes from-import-*. Otherwise, if your module's execution is > conditional, you end up indenting everything inside an if/else > statement. > > Proposal: introduce a non-error mechanism to break out of module > execution. This could be satisfied by a statement like break or > return, though those specific ones could be confusing. It could also > involve raising a special subclass of ImportError that the import > machinery simply handles as not-an-error. > > This came up last year on python-list with mixed results. 
[1] > However, time has not dimmed the appeal for me so I'm rebooting here. > > For what it's worth I've wanted this a couple of times. There are always workarounds of course (but not particularly pretty sometimes). Michael > While the proposal seems relatively minor, the use cases are not > extensive. The three main ones I've encountered are these: > > 1. C extension module with fallback to pure Python: > > try: > from _extension_module import * > except ImportError: > pass > else: > break # or whatever color the bikeshed is > > # pure python implementation goes here > > 2. module imported under different name: > > if __name__ != "expected_name": > from expected_name import * > break > > # business as usual > > 3. module already imported under a different name: > > if "other_module" in sys.modules: > exec("from other_module import *", globals()) > break > > # module code here > > Thoughts? > > -eric > > > [1] http://mail.python.org/pipermail/python-list/2011-June/1274424.html > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mwm at mired.org Wed Apr 25 00:45:47 2012 From: mwm at mired.org (Mike Meyer) Date: Tue, 24 Apr 2012 18:45:47 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 43 In-Reply-To: References: Message-ID: <20120424184547.0bca79f5@bhuda.mired.org> On Tue, 24 Apr 2012 18:42:24 +0300 anatoly techtonik wrote: > On Mon, Apr 23, 2012 at 3:01 PM, Chris Rebert wrote: > > On Mon, Apr 23, 2012 at 4:10 AM, Hobson Lane wrote: > > > >> On Mon, Apr 23, 2012 at 6:00 PM, wrote: > >>> > >>> Send Python-ideas mailing list submissions to > >>> python-ideas at python.org > >>> > >>> To subscribe or unsubscribe via the World Wide Web, visit > >>> http://mail.python.org/mailman/listinfo/python-ideas > >>> or, via email, send a message with subject or body 'help' to > >>> python-ideas-request at python.org > >>> > >>> You can reach the person managing the list at > >>> python-ideas-owner at python.org > >>> > >>> When replying, please edit your Subject line so it is more specific > >>> than "Re: Contents of Python-ideas digest..." > > > > Please avoid replying to the digest; it breaks conversation threading. > > Switch to a non-digest mailing list subscription when not lurking. > > But to reply to a non-digest message you need to receive it in > non-digest mode, which didn't happen already. The only way it makes > sense is when you ask the Mailman to resend message again. I don't > know if that's possible. Your initial statement is - or at least may be - wrong. If the digest is in one of the well-known formats, a good MUA will let you burst a digest into individual messages and reply to them just like any other message to a list. mh, nmh and GUIs built on top of them can do this. I haven't subscribed to a digest of any kind in years, so don't even know if any of my current mail readers can do that.
The reason I don't is a different solution to the same problem: current mail agents are sufficiently powerful that I can automatically mark/tag/file all messages to a list and then deal with them outside the normal flow of email, just like a digest. There are other, more esoteric solutions (like the formail program) available as well. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From hobsonlane at gmail.com Wed Apr 25 01:10:49 2012 From: hobsonlane at gmail.com (Hobson Lane) Date: Wed, 25 Apr 2012 07:10:49 +0800 Subject: [Python-ideas] Haskell envy (Terry Reedy) Message-ID: > On 4/22/2012 11:18 PM, Chris Rebert wrote: > > On Sun, Apr 22, 2012 at 7:55 PM, Terry Reedy wrote: > >> On 4/22/2012 9:07 PM, Nestor wrote: > > > >>> false. For example: if arr contains [4, 6, 23, 10, 1, 3] the output > >>> should print true because 4 + 6 + 10 + 3 = 23. > >> Since the order of the numbers is arbitrary and irrelevant to the > problem, > >> it should be formulated in terms of a set of numbers. > > > > Er, multiplicity still matters, so it should be a multiset/bag. One > > possible representation thereof would be a list... > > Er, yes. Given the examples, I (too quickly) misread 'will not contain > all the same elements' as 'no duplicates'. In any case, a set was needed > And doesn't ordering matter too (for efficiency). A sorted list of the positive integers may solve in much less time, right? -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Wed Apr 25 02:09:18 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 24 Apr 2012 20:09:18 -0400 Subject: [Python-ideas] Haskell envy (Terry Reedy) In-Reply-To: References: Message-ID: On 4/24/2012 7:10 PM, Hobson Lane wrote: > And doesn't ordering matter too (for efficiency). A sorted list of the > positive integers may solve in much less time, right?
The OP gave the brute-force solution of testing all subsets as a justification for adding a powerset iterator. With negative numbers allowed (as in this problem), that may be the best one can do. But if negatives are excluded, sorting allows a pruning strategy. For instance, all subsets can be separated in to those with and without the 2nd highest. For those with the 2nd highest, one only need consider the initial slice of numbers <= hi - 2nd_hi. If two numbers add to more than the target, then all larger subsets containing the pair can be excluded and not generated and summed. One can apply this idea recursively. But there is no guarantee of any saving. -- Terry Jan Reedy From tjreedy at udel.edu Wed Apr 25 02:17:21 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 24 Apr 2012 20:17:21 -0400 Subject: [Python-ideas] Haskell envy (Terry Reedy) In-Reply-To: References: Message-ID: On 4/24/2012 8:09 PM, Terry Reedy wrote: > On 4/24/2012 7:10 PM, Hobson Lane wrote: > >> And doesn't ordering matter too (for efficiency). A sorted list of the >> positive integers may solve in much less less time, right? > > The OP gave the brute-force solution of testing all subsets as a > justification for adding a powerset iterator. With negative numbers > allowed (as in this problem), that may be the best one can do. > > But if negatives are excluded, sorting allows a pruning strategy. For > instance, all subsets can be separated in to those with and without the > 2nd highest. For those with the 2nd highest, one only need consider the > initial slice of numbers <= hi - 2nd_hi. If two numbers add to more than > the target, then all larger subsets containing the pair can be excluded > and not generated and summed. One can apply this idea recursively. But > there is no guarantee of any saving. As an algorithm question and answer, this is off topic. 
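The pruning strategy quoted above can be sketched roughly like this, assuming non-negative integers (function and variable names are illustrative only):

```python
def has_subset_sum(nums, target):
    # Sort in descending order and abandon any branch whose candidate
    # element already exceeds the remaining target -- the pruning idea
    # described in the quoted message.
    nums = sorted(nums, reverse=True)

    def search(i, remaining):
        if remaining == 0:
            return True
        for j in range(i, len(nums)):
            if nums[j] > remaining:
                continue  # overshoots; smaller candidates follow
            if search(j + 1, remaining - nums[j]):
                return True
        return False

    return target >= 0 and search(0, target)

# The thread's example: 4 + 6 + 10 + 3 == 23
print(has_subset_sum([4, 6, 10, 1, 3], 23))  # True
```

As noted, there is no guarantee of a saving in the worst case; the pruning only helps when many branches overshoot early.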
However, it reinforces Nick's claim that powerset() is not too useful because "In practice, you'll be doing prefiltering and other conditioning on your combinations to weed out nonsensical variants cheaply," -- Terry Jan Reedy From steve at pearwood.info Wed Apr 25 02:40:32 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 25 Apr 2012 10:40:32 +1000 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F9705EB.1070702@egenix.com> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> Message-ID: <4F974800.5000301@pearwood.info> M.-A. Lemburg wrote: > Antoine Pitrou wrote: >> I think good practice should lead you to put your initialization code >> in a dedicated function that you call from your module toplevel. In >> this case, breaking out of execution is a matter of adding a return >> statement. > > True, but that doesn't prevent import from being run, functions and > classes from being defined and resources being bound which are not > going to get used. Premature micro-optimizations. Just starting the interpreter runs imports, defines functions and classes, and binds resources which your script may never use. Lists and dicts are over-allocated, which you may never need. Ints and strings are boxed. Python is a language that takes Knuth seriously: "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." In the majority of cases, a few more small inefficiencies, especially those that are one-off costs at startup, are not a big deal. In those cases where it is a big deal, you can place code inside if-blocks, factor it out into separate modules, or use delayed execution by putting it inside functions. It may not be quite so pretty, but it gets the job done, and it requires no new features. 
> Think of code like this (let's assume the "break" statement is used > for stopping module execution): > > """ > # > # MyModule > # > > ### Try using the fast variant > > try: > from MyModule_C_Extension import * > except ImportError: > pass > else: > # Stop execution of the module code object right here > break > > ### Ah, well, so go ahead with the slow version > > import os, sys > from MyOtherPackage import foo, bar, baz > > class MyClass: > ... > > def MyFunc(a,b,c): > ... > > def main(): > ... > > if __name__ == '__main__': > main() > """ > > You can solve this by using two separate modules and a top-level > module to switch between the implementations, but that's cumbersome > if you have more than just a few of such modules in a package. You can also solve it by defining the slow version first, as a fall-back, then replace it with the fast version, if it exists: import os, sys from MyOtherPackage import foo, bar, baz class MyClass: ... def MyFunc(a,b,c): ... def main(): ... try: from MyModule_C_Extension import * except ImportError: pass if __name__ == '__main__': main() The advantage of this is that MyModule_C_Extension doesn't need to supply everything or nothing. It may only supply (say) MyClass, and MyFunc will naturally fall back on the pre-defined Python versions. This is the idiom used by at least the bisect and pickle modules, and I think it is the Pythonic way. -- Steven From steve at pearwood.info Wed Apr 25 02:55:07 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 25 Apr 2012 10:55:07 +1000 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F970F36.2070208@egenix.com> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> Message-ID: <4F974B6B.3040208@pearwood.info> M.-A. Lemburg wrote: > Georg Brandl wrote: >> On 24.04.2012 22:20, M.-A. 
Lemburg wrote: >>>> There's a subtle bug here that shows that the proposed feature has its >>>> awkward points: you probably want to execute the "if __name__ == '__main__'" >>>> block in the C extension case as well. >>> No, you don't :-) If you would have wanted that to happen, you'd >>> put the "if __name__..." into the else: branch. >> Not sure I understand. Your example code is flawed because it doesn't execute >> the main() for the C extension case. Of course you can duplicate the code in >> the else branch, but you didn't do it in the first place, which was the bug. > > Ok, you got me :-) Should've paid more attention. I think you have inadvertently demonstrated that that this proposed feature is hard to use correctly. Possibly even harder to use than existing idioms for solving the problems this is meant to solve. If the user does use it, they will likely need to duplicate code, which encourages copy-and-paste programming. Even if break at the module level is useful on rare occasions, I think the usefulness is far outweighed by the costs: - hard to use correctly, hence code using this feature risks being buggy - encourages premature micro-optimization, or at least the illusion of optimization - encourages or requires duplicate code and copy-and-paste programming - complicates the top-level program flow Today, if you successfully import a module, you know that all the top-level code in that module was executed. If this feature is added, you cannot be sure what top-level code was reached unless you scan through all the code above it. In my opinion, this is an attractive nuisance. -1 on the feature. -- Steven From nathan.alexander.rice at gmail.com Wed Apr 25 04:03:43 2012 From: nathan.alexander.rice at gmail.com (Nathan Rice) Date: Tue, 24 Apr 2012 22:03:43 -0400 Subject: [Python-ideas] Haskell envy In-Reply-To: References: <4F9588C0.2020005@pearwood.info> Message-ID: Why not repeat the same practice as is currently in place for range, enumerate, etc. 
Take 3 arguments, the iterables, the repeat minimum size and the repeat maximum size. If the third argument is None, just treat the minimum as the maximum (yielding the current behavior). Most things in python have gobs of keyword arguments and most of itertools has relatively few. I don't think it is going to harm anyone to give them a few more toys to play with. From ncoghlan at gmail.com Wed Apr 25 07:33:54 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 Apr 2012 15:33:54 +1000 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F974B6B.3040208@pearwood.info> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> Message-ID: On Wed, Apr 25, 2012 at 10:55 AM, Steven D'Aprano wrote: > In my opinion, this is an attractive nuisance. > > -1 on the feature. Agreed (and my preferred idiom for all the cited cases is also "always define the Python version, override at the end with the accelerated version"). Although, if we *did* do it, I think allowing "return" at module level would be the way to proceed (as Mark Shannon noted, that's only *disallowed* now because the compiler specifically prevents it. The eval loop itself understands it just fine and ceases execution as soon as it encounters the relevant bytecode. It isn't quite as simple as just deleting those lines though, since we likely still wouldn't want to allow return statements in class bodies). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From greg.ewing at canterbury.ac.nz Wed Apr 25 09:58:34 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 25 Apr 2012 19:58:34 +1200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> Message-ID: <4F97AEAA.90009@canterbury.ac.nz> Nick Coghlan wrote: > It isn't quite as simple as > just deleting those lines though, since we likely still wouldn't want > to allow return statements in class bodies. I'm sure there's someone out there with a twisted enough mind to think of a use for that... -- Greg From mark at hotpy.org Wed Apr 25 10:11:10 2012 From: mark at hotpy.org (Mark Shannon) Date: Wed, 25 Apr 2012 09:11:10 +0100 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F97AEAA.90009@canterbury.ac.nz> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97AEAA.90009@canterbury.ac.nz> Message-ID: <4F97B19E.7000604@hotpy.org> Greg Ewing wrote: > Nick Coghlan wrote: >> It isn't quite as simple as >> just deleting those lines though, since we likely still wouldn't want >> to allow return statements in class bodies. > > I'm sure there's someone out there with a twisted enough > mind to think of a use for that... > It's not that twisted. class C: def basic_feature(self): ... if use_simple_api: return def advanced_feature(self): ... Not that this is a good use, but it is a use :) Cheers, Mark. 
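The conditional-API case above doesn't actually need a return inside the class body; a sketch of one way to do it today (the flag name is hypothetical):

```python
use_simple_api = True  # hypothetical configuration flag

class C:
    def basic_feature(self):
        return "basic"

# Decide outside the class body which parts of the API to expose,
# instead of returning out of the class body itself.
if not use_simple_api:
    def advanced_feature(self):
        return "advanced"
    C.advanced_feature = advanced_feature

print(hasattr(C, "advanced_feature"))  # False while use_simple_api is True
```
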
From anacrolix at gmail.com Wed Apr 25 10:24:30 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Wed, 25 Apr 2012 16:24:30 +0800 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F97B19E.7000604@hotpy.org> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97AEAA.90009@canterbury.ac.nz> <4F97B19E.7000604@hotpy.org> Message-ID: if not use_simple_api: class C: On Apr 25, 2012 4:19 PM, "Mark Shannon" wrote: > Greg Ewing wrote: > >> Nick Coghlan wrote: >> >>> It isn't quite as simple as >>> just deleting those lines though, since we likely still wouldn't want >>> to allow return statements in class bodies. >>> >> >> I'm sure there's someone out there with a twisted enough >> mind to think of a use for that... >> >> > It's not that twisted. > > class C: > > def basic_feature(self): > ... > > if use_simple_api: > return > > def advanced_feature(self): > ... > > Not that this is a good use, but it is a use :) > > Cheers, > Mark. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From mal at egenix.com Wed Apr 25 11:41:35 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 25 Apr 2012 11:41:35 +0200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> Message-ID: <4F97C6CF.8060106@egenix.com> Nick Coghlan wrote: > On Wed, Apr 25, 2012 at 10:55 AM, Steven D'Aprano wrote: >> In my opinion, this is an attractive nuisance. >> >> -1 on the feature.
> > Agreed (and my preferred idiom for all the cited cases is also "always > define the Python version, override at the end with the accelerated > version"). IMO, defining things twice in the same module is not a very Pythonic way of designing Python software. Leaving aside the resource leakage, it also makes it difficult to find the implementation that actually gets used, bypasses "explicit is better than implicit", and it doesn't address possible side-effects of the definitions that you eventually override at the end of the module. Python is normally written with a top-to-bottom view in mind, where you don't expect things to suddenly change near the end. This is why we introduced decorators before the function definition, rather than place them after the function definition. It's also why we tend to put imports, globals, helpers at the top of the file. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 25 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2012-04-28: PythonCamp 2012, Cologne, Germany 3 days to go 2012-04-25: Released eGenix mx Base 3.2.4 http://egenix.com/go27 ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ncoghlan at gmail.com Wed Apr 25 11:52:17 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 Apr 2012 19:52:17 +1000 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F97C6CF.8060106@egenix.com> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> Message-ID: On Wed, Apr 25, 2012 at 7:41 PM, M.-A. Lemburg wrote: > Nick Coghlan wrote: >> On Wed, Apr 25, 2012 at 10:55 AM, Steven D'Aprano wrote: >>> In my opinion, this is an attractive nuisance. >>> >>> -1 on the feature. >> >> Agreed (and my preferred idiom for all the cited cases is also "always >> define the Python version, override at the end with the accelerated >> version"). > > IMO, defining things twice in the same module is not a very Pythonic > way of designing Python software. > > Left aside the resource leakage, it also makes if difficult to find > the implementation that actually gets used, bypasses "explicit is > better than implicit", and it doesn't address possible side-effects > of the definitions that you eventually override at the end of the > module. > > Python is normally written with a top-to-bottom view in mind, where > you don't expect things to suddenly change near the end. > > This is why we introduced decorators before the function definition, > rather than place them after the function definition. It's also why > we tend to put imports, globals, helpers at the top of the file. I agree overwriting at the end isn't ideal, but I don't think allowing returns at module level is a significant improvement. 
I'd rather see a higher level approach that specifically set out to tackle the problem of choosing between multiple implementations of a module at runtime that cleanly supported *testing* all the implementations in a single process, while still having one implementation that was used by default. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ubershmekel at gmail.com Wed Apr 25 12:22:41 2012 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Wed, 25 Apr 2012 13:22:41 +0300 Subject: [Python-ideas] breaking out of module execution In-Reply-To: References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> Message-ID: On Wed, Apr 25, 2012 at 12:52 PM, Nick Coghlan wrote: > I agree overwriting at the end isn't ideal, but I don't think allowing > returns at module level is a significant improvement. I'd rather see a > higher level approach that specifically set out to tackle the problem > of choosing between multiple implementations of a module at runtime > that *cleanly supported *testing* all the implementations in a single > process,* while still having one implementation that was used by > default. > +1 for Nick's remark. -1 for the current proposal. From mal at egenix.com Wed Apr 25 12:37:54 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 25 Apr 2012 12:37:54 +0200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> Message-ID: <4F97D402.5050701@egenix.com> Nick Coghlan wrote: > On Wed, Apr 25, 2012 at 7:41 PM, M.-A.
Lemburg wrote: >> Nick Coghlan wrote: >>> On Wed, Apr 25, 2012 at 10:55 AM, Steven D'Aprano wrote: >>>> In my opinion, this is an attractive nuisance. >>>> >>>> -1 on the feature. >>> >>> Agreed (and my preferred idiom for all the cited cases is also "always >>> define the Python version, override at the end with the accelerated >>> version"). >> >> IMO, defining things twice in the same module is not a very Pythonic >> way of designing Python software. >> >> Left aside the resource leakage, it also makes if difficult to find >> the implementation that actually gets used, bypasses "explicit is >> better than implicit", and it doesn't address possible side-effects >> of the definitions that you eventually override at the end of the >> module. >> >> Python is normally written with a top-to-bottom view in mind, where >> you don't expect things to suddenly change near the end. >> >> This is why we introduced decorators before the function definition, >> rather than place them after the function definition. It's also why >> we tend to put imports, globals, helpers at the top of the file. > > I agree overwriting at the end isn't ideal, but I don't think allowing > returns at module level is a significant improvement. I'd rather see a > higher level approach that specifically set out to tackle the problem > of choosing between multiple implementations of a module at runtime > that cleanly supported *testing* all the implementations in a single > process, while still having one implementation that was used be > default. Isn't that an application developer choice to make rather than one that we force upon the developer and one which only addresses a single use case (having multiple implementation variants in a module) ? What about other use cases, where you e.g. * know that the subsequent function/class definitions are going to fail, because your runtime environment doesn't provide the needed functionality ? 
* want to limit the available defined APIs based on flags or other settings ? * want to make modules behave more like functions or classes ? * want to debug import loops ? Since the module body is run more or less like a function or class body, it seems natural to allow the same statements available there in modules as well. Esp. with the new importlib, tapping into the wealth of functionality in that area has become a lot easier than before. Only the compiler is preventing it. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 25 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2012-04-28: PythonCamp 2012, Cologne, Germany 3 days to go 2012-04-25: Released eGenix mx Base 3.2.4 http://egenix.com/go27 ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From steve at pearwood.info Wed Apr 25 13:49:06 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 25 Apr 2012 21:49:06 +1000 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F97C6CF.8060106@egenix.com> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> Message-ID: <4F97E4B2.6070302@pearwood.info> M.-A. Lemburg wrote: > Nick Coghlan wrote: >> On Wed, Apr 25, 2012 at 10:55 AM, Steven D'Aprano wrote: >>> In my opinion, this is an attractive nuisance. >>> >>> -1 on the feature. 
>> Agreed (and my preferred idiom for all the cited cases is also "always >> define the Python version, override at the end with the accelerated >> version"). > > IMO, defining things twice in the same module is not a very Pythonic > way of designing Python software. You're not defining things twice in the same module. You're defining them twice in two modules, one in Python and one in a C extension module. Your own example does the same thing. The only difference is that you try to avoid creating the pure-Python versions if you don't need them, but you still have the source code for them in the module. > Leaving aside the resource leakage, What resource leakage? If I do this: def f(): pass from module import f then the first function f is garbage collected and there is no resource leakage. As for the rest of your post, I'm afraid that I find most of it "not even wrong" so I won't address it directly. I will say this: I'm sure you can come up with all sorts of reasons for not liking the current idiom of "define pure-Python code first, then replace with accelerated C version if available", e.g. extremely unlikely scenarios for code that has side-effects that you might want to avoid. That's all fine. The argument is not that there is never a use for a top level return, or that alternatives are perfect. The argument is that a top level return has more disadvantages than advantages. Unless you address the disadvantages and costs of top level return, you won't convince me, and I doubt you will convince many others.
-- Steven From mark at hotpy.org Wed Apr 25 14:11:03 2012 From: mark at hotpy.org (Mark Shannon) Date: Wed, 25 Apr 2012 13:11:03 +0100 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F97E4B2.6070302@pearwood.info> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> Message-ID: <4F97E9D7.5090804@hotpy.org> Steven D'Aprano wrote: [snip] > Unless you address the disadvantages and costs of top level return, you > won't convince me, and I doubt you will convince many others. > What costs? From steve at pearwood.info Wed Apr 25 14:30:32 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 25 Apr 2012 22:30:32 +1000 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F97E9D7.5090804@hotpy.org> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> Message-ID: <4F97EE68.4080102@pearwood.info> Mark Shannon wrote: > Steven D'Aprano wrote: > > [snip] > >> Unless you address the disadvantages and costs of top level return, >> you won't convince me, and I doubt you will convince many others. >> > > What costs? Quoting from my post earlier today: [quote] Even if break at the module level is useful on rare occasions, I think the usefulness is far outweighed by the costs: - hard to use correctly, hence code using this feature risks being buggy - encourages premature micro-optimization, or at least the illusion of optimization - encourages or requires duplicate code and copy-and-paste programming - complicates the top-level program flow Today, if you successfully import a module, you know that all the top-level code in that module was executed. 
If this feature is added, you cannot be sure what top-level code was reached unless you scan through all the code above it. [end quote] And to see the context: http://mail.python.org/pipermail/python-ideas/2012-April/014897.html -- Steven From mark at hotpy.org Wed Apr 25 14:43:08 2012 From: mark at hotpy.org (Mark Shannon) Date: Wed, 25 Apr 2012 13:43:08 +0100 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F97EE68.4080102@pearwood.info> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> <4F97EE68.4080102@pearwood.info> Message-ID: <4F97F15C.6000509@hotpy.org> Steven D'Aprano wrote: > Mark Shannon wrote: >> Steven D'Aprano wrote: >> >> [snip] >> >>> Unless you address the disadvantages and costs of top level return, >>> you won't convince me, and I doubt you will convince many others. >>> >> >> What costs? > > Quoting from my post earlier today: > > > [quote] > Even if break at the module level is useful on rare occasions, I think > the usefulness is far outweighed by the costs: > > - hard to use correctly, hence code using this feature risks being buggy > - encourages premature micro-optimization, or at least the illusion of > optimization > - encourages or requires duplicate code and copy-and-paste programming > - complicates the top-level program flow I would have taken these to be the disadvantages, rather than costs. By costs, I assumed you meant implementation effort or runtime overhead. Also, I don't see the difference between a return in a module, and a return in a function. Both terminate execution and return to the caller. Why do these four points apply to module-level returns any more than function-level returns?
> > Today, if you successfully import a module, you know that all the > top-level code in that module was executed. If this feature is added, > you cannot be sure what top-level code was reached unless you scan > through all the code above it. > [end quote] I don't know if this is a good idea or not, but the fact that it can be implemented by removing a single restriction in the compiler suggests it might have some merit. Cheers, Mark. From ronaldoussoren at mac.com Wed Apr 25 14:37:56 2012 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Wed, 25 Apr 2012 14:37:56 +0200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F97E9D7.5090804@hotpy.org> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> Message-ID: <4A8CACF1-860D-48F7-9538-596A3EEB4445@mac.com> On 25 Apr, 2012, at 14:11, Mark Shannon wrote: > Steven D'Aprano wrote: > > [snip] > >> Unless you address the disadvantages and costs of top level return, you won't convince me, and I doubt you will convince many others. > > What costs? Harder to understand code is one disadvantage. The "return" that ends execution can easily be hidden in a list of definitions, such as ... some definitions ... if sys.platform != 'win32': return ... more definitions for win32 specific functionality ... That's easy to read with a 10 line module, but not when the module gets significantly larger.
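The compiler restriction being discussed is easy to observe directly; a minimal runnable sketch using only the built-in `compile()` confirms that `return` is rejected at compile time at module scope and in class bodies, but accepted inside functions:

```python
# "return" outside a function is a compile-time SyntaxError in CPython;
# no bytecode is ever executed.
def rejects(source):
    """Return True if the compiler refuses the given source."""
    try:
        compile(source, "<test>", "exec")
    except SyntaxError:
        return True
    return False

print(rejects("return"))                    # module level -> True
print(rejects("class C:\n    return\n"))    # class body -> True
print(rejects("def f():\n    return\n"))    # function body -> False
```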
Also, why use the proposed module-scope return instead of an if-statement with nested definitions, this works just fine: : def foo(): pass : : if sys.platform == 'linux': : : def linux_bar(): pass > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 4788 bytes Desc: not available URL: From mal at egenix.com Wed Apr 25 16:00:58 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 25 Apr 2012 16:00:58 +0200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4A8CACF1-860D-48F7-9538-596A3EEB4445@mac.com> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> <4A8CACF1-860D-48F7-9538-596A3EEB4445@mac.com> Message-ID: <4F98039A.3090301@egenix.com> Ronald Oussoren wrote: > > Also, why use the proposed module-scope return instead of an if-statement with nested definitions, this works just fine: > > : def foo(): pass > : > : if sys.platform == 'linux': > : > : def linux_bar(): pass Because this only works reasonably if you have a few lines of code to indent. As soon as you have hundreds of lines, this becomes both unreadable and difficult to edit. The above is how the thread was started, BTW :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 25 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ 2012-04-28: PythonCamp 2012, Cologne, Germany 3 days to go 2012-04-25: Released eGenix mx Base 3.2.4 http://egenix.com/go27 ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From arnodel at gmail.com Wed Apr 25 16:44:11 2012 From: arnodel at gmail.com (Arnaud Delobelle) Date: Wed, 25 Apr 2012 15:44:11 +0100 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F98039A.3090301@egenix.com> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> <4A8CACF1-860D-48F7-9538-596A3EEB4445@mac.com> <4F98039A.3090301@egenix.com> Message-ID: (sent from my phone) On Apr 25, 2012 3:01 PM, "M.-A. Lemburg" wrote: > > Ronald Oussoren wrote: > > > > Also, why use the proposed module-scope return instead of an if-statement with nested definitions, this works just fine: > > > > : def foo(): pass > > : > > : if sys.platform == 'linux': > > : > > : def linux_bar(): pass > > Because this only works reasonably if you have a few lines of code > to indent. As soon as you have hundreds of lines, this becomes > both unreadable and difficult to edit. OTOH the return statement becomes really hard to spot... Arnaud -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Wed Apr 25 17:09:16 2012 From: mal at egenix.com (M.-A. 
Lemburg) Date: Wed, 25 Apr 2012 17:09:16 +0200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> <4A8CACF1-860D-48F7-9538-596A3EEB4445@mac.com> <4F98039A.3090301@egenix.com> Message-ID: <4F98139C.1020003@egenix.com> Arnaud Delobelle wrote: > (sent from my phone) > On Apr 25, 2012 3:01 PM, "M.-A. Lemburg" wrote: >> >> Ronald Oussoren wrote: >>> >>> Also, why use the proposed module-scope return instead of an > if-statement with nested definitions, this works just fine: >>> >>> : def foo(): pass >>> : >>> : if sys.platform == 'linux': >>> : >>> : def linux_bar(): pass >> >> Because this only works reasonably if you have a few lines of code >> to indent. As soon as you have hundreds of lines, this becomes >> both unreadable and difficult to edit. > > OTOH the return statement becomes really hard to spot... People don't appear to have a problem with this in long functions or methods, so I'm not sure how well that argument qualifies. The programmer can of course add an easy to spot comment where the return is used. Just a question of programming style. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 25 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2012-04-28: PythonCamp 2012, Cologne, Germany 3 days to go 2012-04-25: Released eGenix mx Base 3.2.4 http://egenix.com/go27 ::: Try our new mxODBC.Connect Python Database Interface for free !
:::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ericsnowcurrently at gmail.com Wed Apr 25 17:28:36 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Wed, 25 Apr 2012 09:28:36 -0600 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F97EE68.4080102@pearwood.info> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> <4F97EE68.4080102@pearwood.info> Message-ID: I'm actually fine with the way things are and would _really_ have used a module "break" perhaps once. As I alluded to originally, there just aren't enough good use cases to make it worth it over the status quo. [1][2] At the very least, the mailing list archives will have a pretty good discussion on the idea (of which I could not find one previously). To recap, this idea is about making the intent/context of a module explicit at the beginning, rather than the end -- without resorting to the extra level(s) of indent that an if/else solution would require. Decorators and the with statement (both targeting code blocks) came about for the same reason. However, a simple return/break statement would allow much more than that. As Nick suggested, a more specific, targeted solution would be better. ("import-else" doesn't fit the bill. [3]) For the record, I still think the status quo is sub-optimal. My original post lists what I think are legitimate (if uncommon) use cases. Here's my list of (nitpick-ish?) concerns with the current solutions: 1. if/else to make context explicit: one extra level of indent (or ever-so-slightly-possibly more) 2. 
conditionally replace module contents at the end: without a clear comment at the beginning, may miscommunicate the final contents of the module 3. put the code in the else of #1 in a separate module: one more import involved (weak, I know), and one more level of FS indirection ("flat is better than nested") 4. special exception + import hook: not worth the trouble -eric [1] http://www.boredomandlaziness.org/2011/02/justifying-python-language-changes.html [2] http://www.boredomandlaziness.org/2011/02/status-quo-wins-stalemate.html [3] An idea that has come up before (and at least once recently): import cdecimal else decimal as decimal <==> try: import cdecimal as decimal except ImportError: import decimal From ethan at stoneleaf.us Wed Apr 25 17:31:34 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 25 Apr 2012 08:31:34 -0700 Subject: [Python-ideas] breaking out of module execution In-Reply-To: References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97AEAA.90009@canterbury.ac.nz> <4F97B19E.7000604@hotpy.org> Message-ID: <4F9818D6.5010808@stoneleaf.us> Matt Joiner wrote: > If not use_simple_api: > class C: More like: class C: def basic_method(self): pass if not use_simple_gui: def advanced_method(self, this, that): pass ~Ethan~ From ron3200 at gmail.com Wed Apr 25 17:52:49 2012 From: ron3200 at gmail.com (Ron Adam) Date: Wed, 25 Apr 2012 10:52:49 -0500 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F97AEAA.90009@canterbury.ac.nz> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97AEAA.90009@canterbury.ac.nz> Message-ID: On 04/25/2012 02:58 AM, Greg Ewing wrote: > Nick Coghlan wrote: >> It isn't quite as simple as >> just deleting those lines though, since we likely still wouldn't want >> to 
allow return statements in class bodies. > I'm sure there's someone out there with a twisted enough > mind to think of a use for that... Currently return isn't allowed in class bodies defined inside functions. So it probably won't work at the top level either. >>> def foo(): ... class A: ... return ... File "<stdin>", line 3 SyntaxError: 'return' outside function As far as the feature goes, it wouldn't be consistent with class behaviour unless you allow the returns in classes to work too. Think of modules as a type of class where ... import module is equivalent to ... module module_name: Like classes, the module body would execute to define the module, and return inside the module body would be a syntax error. Of course since modules aren't specifically defined in this way, there is the option to not follow that consistency. Cheers, Ron From ron3200 at gmail.com Wed Apr 25 18:06:23 2012 From: ron3200 at gmail.com (Ron Adam) Date: Wed, 25 Apr 2012 11:06:23 -0500 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F9709FD.4060901@hotpy.org> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F9709FD.4060901@hotpy.org> Message-ID: On 04/24/2012 03:15 PM, Mark Shannon wrote: > > This has to be the only feature request that can be implemented by > removing code :) > > I implemented this by deleting 2 lines of code from the compiler. Does it also allow returns in class bodies when you do that? >>> class A: ... return ... File "<stdin>", line 2 SyntaxError: 'return' outside function Whether or not it does, I think modules are closer to class bodies and return should be a syntax error in that case as well. They aren't functions and we shouldn't think of them that way. IMHO it would make Python harder to learn when the lines between them get blurred.
Cheers, Ron From ericsnowcurrently at gmail.com Wed Apr 25 18:28:38 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Wed, 25 Apr 2012 10:28:38 -0600 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F97EE68.4080102@pearwood.info> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> <4F97EE68.4080102@pearwood.info> Message-ID: On Wed, Apr 25, 2012 at 6:30 AM, Steven D'Aprano wrote: > - hard to use correctly, hence code using this feature risks being buggy > - encourages premature micro-optimization, or at least the illusion of > optimization > - encourages or requires duplicate code and copy-and-paste programming > - complicates the top-level program flow > > Today, if you successfully import a module, you know that all the top-level code in > that module was executed. If this feature is added, you cannot be sure what top-level > code was reached unless you scan through all the code above it. This got me thinking "well, you get the same thing with functions and the return statement". Then I realized there's a problem with that line of thinking and stepped back. Modules and functions have distinct purposes (by design) and we shouldn't help make that distinction blurrier. We should (and mostly do) teach the concept of a module as a top-level (singleton) namespace definition. The idioms presented in this thread mostly bear this out. Python doesn't force the distinction syntactically, nor am I suggesting it should. However, it seems to me that this is not how most people think of modules. The culprit here is the lack of distinction between modules and scripts. If a module is like a class, a script is like a function.
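The module-versus-script distinction Eric draws is what the standard `__main__` guard idiom already encodes; a minimal sketch of a file that plays both roles:

```python
# mytool.py -- importable as a module, runnable as a script.

def greet(name):
    """Library API: importing this module has no side effects."""
    return "Hello, %s!" % name

def main():
    # Script-only behaviour lives here, not at module top level.
    print(greet("world"))

if __name__ == "__main__":
    # True only when the file is executed directly, never on import.
    main()
```

Importers get just the definitions; `python mytool.py` additionally runs `main()`.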
Perhaps we should consider ways of making the difference between scripts and modules clearer, whether in the docs, with syntax, or otherwise. -eric From rob.cliffe at btinternet.com Wed Apr 25 18:34:44 2012 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Wed, 25 Apr 2012 17:34:44 +0100 Subject: [Python-ideas] breaking out of module execution In-Reply-To: References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> <4F97EE68.4080102@pearwood.info> Message-ID: <4F9827A4.6070901@btinternet.com> Not, please, with adding differences to the syntax allowed in scripts and in modules. (More to learn and remember.) But in the docs: absolutely. Rob Cliffe On 25/04/2012 17:28, Eric Snow wrote: > > Perhaps we should consider ways of making the > difference between scripts and modules clearer, whether in the docs, > with syntax, or otherwise. > > -eric > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > From steve at pearwood.info Wed Apr 25 18:54:58 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 26 Apr 2012 02:54:58 +1000 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F97F15C.6000509@hotpy.org> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> <4F97EE68.4080102@pearwood.info> <4F97F15C.6000509@hotpy.org> Message-ID: <4F982C62.9040405@pearwood.info> Mark Shannon wrote: [...] > I would have taken these to be the disadvantages, rather costs. 
> By costs, I assumed you meant implementation effort or runtime overhead. No, costs as in "costs versus benefits". > Also, I don't see the difference between a return in a module, and a > return in a function. Both terminate execution and return to the caller. > Why do these four points apply to module-level returns any more than > function-level returns? A very good point. There is a school of thought that functions should always have a single entry (the top of the function) and a single exit (the bottom). If I recall correctly, Pascal is like that: there's no way to return early from a function in Pascal. However, code inside a function normally performs a calculation and returns a value, so once that value is calculated there's no point hanging around. The benefit of early return in functions outweighs the cost. It is my argument that this is not the case for top level module code. Some differences between return in a function and return in a module: 1) Modules don't have a caller as such, so it isn't clear what you are returning to. (If the module is being imported, I suppose you could call the importing module the caller; but when the module is being run instead, there is no importing module.) So a top level return is more like an exit than a return, except it doesn't actually exit the Python interpreter. 2) When functions unexpectedly return early, you can sometimes get a clue why by inspecting the return value. Modules don't have a return value. 3) Functions tend to be relatively small (or at least, they should be relatively small), so while an early return in the middle of a function can be surprising, the cost of discovering that is not very high. In contrast, modules tend to be relatively large, hundreds or even thousands of lines. An early return could be anywhere. 4) Code at the top level of modules is usually transparent: the details of what gets done are important.
People will want to know which functions, classes and global variables are actually created, and which are skipped due to an early return. In contrast, functions are usually treated as opaque black boxes: people usually care about the interface, not the implementation. So typically they don't care whether the function returns early or not. There may be other differences. >> Today, if you successfully import a module, you know that all the >> top-level code in that module was executed. If this feature is added, >> you cannot be sure what top-level code was reached unless you scan >> through all the code above it. >> [end quote] > > I don't know if this is a good idea or not, but the fact that it can > be implemented by removing a single restriction in the compiler suggests > it might have some merit. Do you really mean to say that *because* something is easy, it therefore might be a good idea?

rm -rf /

Easy, and therefore a good idea, yes? *wink* -- Steven From steve at pearwood.info Wed Apr 25 18:58:28 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 26 Apr 2012 02:58:28 +1000 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F98039A.3090301@egenix.com> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> <4A8CACF1-860D-48F7-9538-596A3EEB4445@mac.com> <4F98039A.3090301@egenix.com> Message-ID: <4F982D34.4010709@pearwood.info> M.-A. Lemburg wrote: > Ronald Oussoren wrote: >> Also, why use the proposed module-scope return instead of an if-statement with nested definitions, this works just fine: >> >> : def foo(): pass >> : >> : if sys.platform == 'linux': >> : >> : def linux_bar(): pass > > Because this only works reasonably if you have a few lines of code > to indent.
As soon as you have hundreds of lines, this becomes > both unreadable and difficult to edit. I think that is wrong. Why would hundreds of lines suddenly become unreadable and hard to edit because they have a little bit of leading whitespace in front of them? -- Steven From mwm at mired.org Wed Apr 25 19:18:30 2012 From: mwm at mired.org (Mike Meyer) Date: Wed, 25 Apr 2012 13:18:30 -0400 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F982C62.9040405@pearwood.info> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> <4F97EE68.4080102@pearwood.info> <4F97F15C.6000509@hotpy.org> <4F982C62.9040405@pearwood.info> Message-ID: <20120425131830.3f58c130@bhuda.mired.org> On Thu, 26 Apr 2012 02:54:58 +1000 Steven D'Aprano wrote: > Mark Shannon wrote: > > I don't know if this is a good idea or not, but the fact that it can > > be implemented by removing a single restriction in the compiler suggests > > it might have some merit. > Do you really mean to say that *because* something is easy, it therefore might > be a good idea? I read it as an expression of the language design philosophy that the best way to add power is to remove restrictions. Personally, I agree with that philosophy, as removing a single restriction is a much better alternative than having a flock of tools, syntax and special cases to compensate. Compare Python - where functions are first-class objects and can be trivially passed as arguments - to pretty much any modern language that restricts such usage. That said, "more power" is not always the best choice from a language design point of view.
In this case there's really only one use case for lifting the restriction against return in classes and modules, and the problems already pointed out that lifting this restriction creates outweigh the benefits of that use case. -1. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From anacrolix at gmail.com Wed Apr 25 20:35:11 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Thu, 26 Apr 2012 02:35:11 +0800 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <20120425131830.3f58c130@bhuda.mired.org> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> <4F97EE68.4080102@pearwood.info> <4F97F15C.6000509@hotpy.org> <4F982C62.9040405@pearwood.info> <20120425131830.3f58c130@bhuda.mired.org> Message-ID: If this is to be done I'd like to see all special methods supported. One of particular interest to modules is __getattr__... I think the idea is crazy and will lead to chicken and egg discussions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.brandl at gmx.net Wed Apr 25 21:31:05 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 25 Apr 2012 21:31:05 +0200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97AEAA.90009@canterbury.ac.nz> Message-ID: On 25.04.2012 17:52, Ron Adam wrote: > Think of modules as a type of class where ... > > import module > > is equivalent to ... 
> > module module_name: > > > Like classes the module body would execute to define the module, and return > inside the module body would be a syntax error. No, sorry, that's not a good equivalence. It reinforces the impression some people have of "import" working like "#include" in C or (God forbid) "require" in PHP. Georg From bboe at cs.ucsb.edu Wed Apr 25 21:27:03 2012 From: bboe at cs.ucsb.edu (Bryce Boe) Date: Wed, 25 Apr 2012 12:27:03 -0700 Subject: [Python-ideas] Structured Error Output Message-ID: Hi, I looked through the man page for python's interpreter and it appears that there is no way to properly distinguish between error messages output to stderr by the interpreter and output produced by a user program to stderr. What I would really like to have are two things: 1) an option to output interpreter generated messages to a specified file, whether these messages are uncatchable syntax errors, or catchable runtime errors that result in the termination of the interpreter. This feature would allow a wrapper program to distinguish between user-output and python interpreter output. 2) an option to provide a structured error output in some common easy-to-parse and extendable format that can be used to associate the file, line number, error type/number in some post-processing error handler. This feature would make the parsing of error messages more deterministic, and would be of significant benefit if other compilers/interpreters also provide the same functionality in the same common format. Does anyone know if there is already such a way to do what I've asked? If not, do you think having such features added to python would be something that would actually be included?
Thanks, Bryce Boe From g.brandl at gmx.net Wed Apr 25 21:37:24 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 25 Apr 2012 21:37:24 +0200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F98039A.3090301@egenix.com> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> <4A8CACF1-860D-48F7-9538-596A3EEB4445@mac.com> <4F98039A.3090301@egenix.com> Message-ID: On 25.04.2012 16:00, M.-A. Lemburg wrote: > Ronald Oussoren wrote: >> >> Also, why use the proposed module-scope return instead of an if-statement with nested definitions, this works just fine: >> >> : def foo(): pass >> : >> : if sys.platform == 'linux': >> : >> : def linux_bar(): pass > > Because this only works reasonably if you have a few lines of code > to indent. As soon as you have hundreds of lines, this becomes > both unreadable and difficult to edit. So you don't have any classes that span hundreds of lines? I don't see how this is different in terms of editing difficulty. As for readability, at least you can see from the indentation that there's something special about the module code in question. You don't see that if there's a "return" scattered somewhere. cheers, Georg PS: I don't buy the "it's no problem with functions" argument. Even though I'm certainly guilty of writing 100+ line functions, I find them quite ungraspable (is that a word?) and usually try to limit functions and methods to a screenful. 
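The idiom being debated above, sketched minimally (function name hypothetical): a platform-specific definition that shadows a portable fallback, with all the platform-specific code carrying the extra level of indentation that M.-A. Lemburg objects to and Georg defends.

```python
import sys

def backend():
    """Portable pure-Python fallback."""
    return "generic"

# Everything platform-specific lives inside this block.  In real code this
# block might grow to hundreds of lines, all indented one extra level.
if sys.platform.startswith("linux"):
    def backend():
        """Platform-specific override; shadows the fallback above."""
        return "linux"
```

A module-level early return would instead let the remaining definitions sit at the left margin, at the cost of the surprises discussed in this thread.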
From g.brandl at gmx.net Wed Apr 25 21:38:27 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 25 Apr 2012 21:38:27 +0200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F97F15C.6000509@hotpy.org> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> <4F97EE68.4080102@pearwood.info> <4F97F15C.6000509@hotpy.org> Message-ID: On 25.04.2012 14:43, Mark Shannon wrote: >> Today, if you successfully import a module, you know that all the >> top-level code in that module was executed. If this feature is added, >> you cannot be sure what top-level code was reached unless you scan >> through all the code above it. >> [end quote] > > I don't know if this is a good idea or not, but the fact that it can > be implemented by removing a single restriction in the compiler suggests > it might have some merit. That is a strange argument. The restriction was placed there for a reason. (With the same argument you could argue for "None = 1" to be valid code.) Georg From ron3200 at gmail.com Wed Apr 25 22:27:33 2012 From: ron3200 at gmail.com (Ron Adam) Date: Wed, 25 Apr 2012 15:27:33 -0500 Subject: [Python-ideas] breaking out of module execution In-Reply-To: References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97AEAA.90009@canterbury.ac.nz> Message-ID: On 04/25/2012 02:31 PM, Georg Brandl wrote: > On 25.04.2012 17:52, Ron Adam wrote: >> Think of modules as a type of class where ... >> >> import module >> >> is equivalent to ... >> >> module module_name: >> >> >> Like classes the module body would execute to define the module, and return >> inside the module body would be a syntax error.
> No, sorry, that's not a good equivalence. It reinforces the impression some > people have of "import" working like "#include" in C or (God forbid) "require" > in PHP. Not quite the same thing, but I see how you would think that from the way I wrote the example. I didn't mean the file to be inserted, but instead as if it was written in the module statement body. That is, if we even had a "module" keyword, which we don't. ;-) The point is that a module's contents execute from beginning to end to create a module, in the same way a class's contents execute from beginning to end to create a class. There are also differences, such as where a module is stored and how its contents are accessed, and so they are not the same thing. You can't just change a class into a module and vice versa by just changing its header or moving its body into a separate file. I was just trying to point out a module is closer to a class than it is to a function, and that is a good thing. Allowing a return or break in a module could make things more confusing. Also, by not allowing return or breaks, it catches errors where the indentation is lost in functions or methods quicker. Cheers, Ron From tjreedy at udel.edu Wed Apr 25 23:33:21 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 25 Apr 2012 17:33:21 -0400 Subject: [Python-ideas] Structured Error Output In-Reply-To: References: Message-ID: On 4/25/2012 3:27 PM, Bryce Boe wrote: > Hi, > > I looked through the man page for python's interpreter and it appears > that there is no way to properly distinguish between error messages > output to stderr by the interpreter and output produced by a > user program to stderr. That should be a false distinction. User programs should only print error messages to stderr. Some modify error messages before they get printed. Some raise exceptions themselves with messages. The interpreter makes no distinction between user code, 3rd party code, and stdlib code.
> What I would really like to have are two things: > > 1) an option to output interpreter generated messages to a specified > file, whether these messages are uncatchable syntax errors, or > catchable runtime errors that result in the termination of the > interpreter. This feature would allow a wrapper program to distinguish > between user-output and python interpreter output. 'Raw' interpreter error messages start with 'SyntaxError' or 'Traceback'. Runtime errors do not seem to go to the normal stderr channel. > 2) an option to provide a structured error output in some common > easy-to-parse and extendable format that can be used to associate the > file, line number, error type/number in some post-processing error > handler. This feature would make the parsing of error messages more > deterministic, and would be of significant benefit if other > compilers/interpreters also provide the same functionality in the same > common format. Exception instances have a .__traceback__ attribute that is used to print the default traceback message. So it has or can generate much of what you request. I believe traceback objects are documented somewhere. Some apps wrap everything in

    try:
        run_app()
    except Exception as e:
        custom_handle(e)

-- Terry Jan Reedy From ckaynor at zindagigames.com Wed Apr 25 23:46:22 2012 From: ckaynor at zindagigames.com (Chris Kaynor) Date: Wed, 25 Apr 2012 14:46:22 -0700 Subject: [Python-ideas] Structured Error Output In-Reply-To: References: Message-ID: On Wed, Apr 25, 2012 at 2:33 PM, Terry Reedy wrote: > On 4/25/2012 3:27 PM, Bryce Boe wrote: >> 2) an option to provide a structured error output in some common >> easy-to-parse and extendable format that can be used to associate the >> file, line number, error type/number in some post-processing error >> handler.
This feature would make the parsing of error messages more >> deterministic, and would be of significant benefit if other >> compilers/interpreters also provide the same functionality in the same >> common format. >> > > Exception instances have a .__traceback__ instance that is used to print > the default traceback message. So it has or can generate much of what you > request. I believe traceback objects are documented somewhere. Some apps > wrap everything in > > try: run_app > except Exception as e: custom_handle(e) The sys module also has an excepthook (1) which can be overridden to customize the exception handling. Note that it does not function with the threading module, however. (1) http://docs.python.org/library/sys.html#sys.excepthook > > > -- > Terry Jan Reedy > > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Wed Apr 25 23:51:41 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 25 Apr 2012 17:51:41 -0400 Subject: [Python-ideas] breaking out of module execution In-Reply-To: References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97AEAA.90009@canterbury.ac.nz> Message-ID: On 4/25/2012 4:27 PM, Ron Adam wrote: > I was just trying to point out a module is closer to a class than it is > to a function, and that is a good thing. Each should only be executed once, and this is enforced for modules by the sys.modules cache. > Allowing a return or break in a > module could make things more confusing. Agreed. It is not actually necessary for functions to have an explicit return statement. I believe that there are languages that define the function return value as the last value assigned to the function name. 
    def fact(n):
        if n > 1:
            fact = n*fact(n-1)
        else:
            fact = 1

(This is pretty close to how mathematicians might write the definition. It does, however, require special-casing the function name on the left of '='.) 'return' is, however, useful for returning out of more deeply nested constructs, especially loops, without setting flag variables or having multi-level breaks. This consideration does not really apply to the use case for module return. A virtue of defining everything in Python and then trying to import the accelerated override is that different implementations and versions thereof can use the same file even though they accelerate different parts and amounts (even none) of the module. -- Terry Jan Reedy From bboe at cs.ucsb.edu Thu Apr 26 00:06:16 2012 From: bboe at cs.ucsb.edu (Bryce Boe) Date: Wed, 25 Apr 2012 15:06:16 -0700 Subject: [Python-ideas] Structured Error Output In-Reply-To: References: Message-ID: >> I looked through the man page for python's interpreter and it appears >> that there is no way to properly distinguish between error messages >> output to stderr by the interpreter and output produced by a >> user program to stderr. > > > That should be a false distinction. User programs should only print error > messages to stderr. Some modify error messages before they get printed. Some > raise exceptions themselves with messages. The interpreter makes no > distinction between user code, 3rd party code, and stdlib code. Perhaps I wasn't very clear. I want to write a tool to collect error messages when I run a program. Ideally the tool should be agnostic to what language is used and should be able to identify syntax errors, parser errors, and runtime errors.
While I can parse both the stdout and stderr streams to find this information, from what I can tell there is no way to distinguish between a real syntax error (output to stderr): File "./test.py", line 5 class ^ SyntaxError: invalid syntax and a program that outputs that exact output to stderr and exits with status 1. This "channel" sharing of control (error messages) and data is a problem that affects more than just the python interpreter. I am hoping to start with python and provide a way to separate the control and data information so I can be certain that output on the "control" file descriptor is guaranteed to be generated by the interpreter. > Exception instances have a .__traceback__ instance that is used to print the default traceback message I am aware I can obtain this information and output it however I want from my own program (less syntax errors), however, the goal is to run third party code and provide a more detailed error report. -Bryce From zuo at chopin.edu.pl Thu Apr 26 00:27:47 2012 From: zuo at chopin.edu.pl (Jan Kaliszewski) Date: Thu, 26 Apr 2012 00:27:47 +0200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F97D402.5050701@egenix.com> References: <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97D402.5050701@egenix.com> Message-ID: <20120425222747.GA1806@chopin.edu.pl> M.-A. Lemburg dixit (2012-04-25, 12:37): > Isn't that an application developer choice to make rather than > one that we force upon the developer and one which only addresses > a single use case (having multiple implementation variants in a > module) ? > > What about other use cases, where you e.g. > > * know that the subsequent function/class definitions are going to > fail, because your runtime environment doesn't provide the > needed functionality ? IMHO then you should refactor your module into a few smaller ones. 
> * want to limit the available defined APIs based on flags or > other settings ? As above if there are many and/or long variants. Otherwise top-level if/else should be sufficient. > * want to make modules behave more like functions or classes ? What for? Use functions or classes then. > * want to debug import loops ? Don't we have some good ways and tools to do it anyway? > Since the module body is run more or less like a function or > class body, it seems natural to allow the same statements > available there in modules as well. Function body and class body are completely different stories. * The former can contain return, the latter not (and IMHO this limitation is a good thing -- see below...). * The former is executed many times (each time creating a new local namespace); the latter is executed immediately and only once. * The former takes some input (arguments) and returns some output; the latter is a container for some callables and/or some data. * The former is mostly relatively small; the latter is often quite large (breaking execution flow of large bodies is often described as a bad practice, and not without good reasons). Modules are somehow similar to singleton classes (instantiated with the first import). Cheers. *j From greg.ewing at canterbury.ac.nz Thu Apr 26 01:14:39 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Apr 2012 11:14:39 +1200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F97F15C.6000509@hotpy.org> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> <4F97EE68.4080102@pearwood.info> <4F97F15C.6000509@hotpy.org> Message-ID: <4F98855F.1000704@canterbury.ac.nz> Mark Shannon wrote: [...] > Also, I don't see the difference between a return in a module, and a > return in a function.
Both terminate execution and return to the caller. > Why do these four points apply to module-level returns any more than > function-level returns? I think the difference is that functions are usually small, whereas modules can be very large. So there is much more scope for surprises due to a return lurking in a module. -- Greg From mwm at mired.org Thu Apr 26 01:15:00 2012 From: mwm at mired.org (Mike Meyer) Date: Wed, 25 Apr 2012 19:15:00 -0400 Subject: [Python-ideas] Structured Error Output In-Reply-To: References: Message-ID: <20120425191500.081ce8f0@bhuda.mired.org> On Wed, 25 Apr 2012 15:06:16 -0700 Bryce Boe wrote: > Perhaps I wasn't very clear. I want to write a tool to collect error > messages when I run a program. Ideally the tool should be agnostic to > what language is used and should be able to identify syntax errors, > parser errors, and runtime errors. While I can parse both the stdout > and stderr streams to find this information, from what I can tell > there is no way to distinguish between a real syntax error (output to > stderr): > > File "./test.py", line 5 > class > ^ > SyntaxError: invalid syntax > > and a program that outputs that exact output to stderr and exits with status 1. Correct. And it isn't a problem, because it shouldn't matter if the programmer wrote a bit of code that caused the compiler to raise a syntax error, raised the syntax error themselves, evaled a string that raised the syntax error, or wrote the message explicitly to standard error: all those cases represent a syntax error to the programmer. > This "channel" sharing of control (error messages) and data is a > problem that affects more than just the python interpreter. I am > hoping to start with python and provide a way to separate the control > and data information so I can be certain that output on the "control" > file descriptor is guaranteed to be generated by the interpreter. If your program is writing *data* to stderr, it's badly designed. 
You should fix it instead of trying to get Python changed to accommodate it. Or maybe you're trying to draw a distinction between messages purposely generated by the programmer, and messages the programmer didn't want? I think that's an artificial distinction. An error is an error is an error, and by any other name would be just as smelly. Whether it's an exception raised by the interpreter, an exception raised by the programmer, or just a message printed by the programmer really doesn't matter. Python programmers have complete control over all of this, and can make it do what they want. Trying to make distinctions based on default behaviors is misguided. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From greg.ewing at canterbury.ac.nz Thu Apr 26 01:24:50 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Apr 2012 11:24:50 +1200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F98139C.1020003@egenix.com> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> <4A8CACF1-860D-48F7-9538-596A3EEB4445@mac.com> <4F98039A.3090301@egenix.com> <4F98139C.1020003@egenix.com> Message-ID: <4F9887C2.6090803@canterbury.ac.nz> M.-A. Lemburg wrote: > People don't appear to have a problem with this in long functions > or methods, so I'm not sure how well that argument qualifies. Well, I have a problem with long functions whether they have embedded returns or not. But another difference is that functions usually consist of a set of related statements that need to be read in order from top to bottom. Modules, on the other hand, typically consist of multiple relatively small units whose order mostly doesn't matter. 
Sticking a return in the middle introduces an unexpected ordering dependency. -- Greg From cs at zip.com.au Thu Apr 26 01:32:10 2012 From: cs at zip.com.au (Cameron Simpson) Date: Thu, 26 Apr 2012 09:32:10 +1000 Subject: [Python-ideas] breaking out of module execution In-Reply-To: References: Message-ID: <20120425233210.GA1109@cskk.homeip.net> On 24Apr2012 13:23, Eric Snow wrote: | In a function you can use a return statement to break out of execution | in the middle of the function. With modules you have no recourse. | This is akin to return statements being allowed only at the end of a | function. [...] Something very similar came up some months ago I think. My personal suggestion was: raise StopImport to cleanly exit a module without causing an ImportError, just as exiting a generator raises StopIteration in the caller. In this case the caller is the import machinery, which should consider this exception a clean completion of the import. Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ Disclaimer: Opinions expressed here are CORRECT, mine, and not PSLs or NMSUs.. - Larry Cunningham From greg.ewing at canterbury.ac.nz Thu Apr 26 01:42:51 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Apr 2012 11:42:51 +1200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F982C62.9040405@pearwood.info> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> <4F97EE68.4080102@pearwood.info> <4F97F15C.6000509@hotpy.org> <4F982C62.9040405@pearwood.info> Message-ID: <4F988BFB.5020103@canterbury.ac.nz> Steven D'Aprano wrote: > If I recall correctly, Pascal is like that: there's no way > to return early from a function in Pascal.
-- Greg From jimjjewett at gmail.com Thu Apr 26 02:04:25 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 25 Apr 2012 20:04:25 -0400 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F974B6B.3040208@pearwood.info> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> Message-ID: On Tue, Apr 24, 2012 at 8:55 PM, Steven D'Aprano wrote: > - encourages or requires duplicate code and copy-and-paste programming How does stopping a module definition early encourage duplicate code? Just because you might have to react differently after a full import vs a partial? > Today, if you successfully import a module, you know that all the top-level > code in that module was executed. No, you don't. On the other hand, the counterexamples (circular imports, some lazy imports, module-name clashes, top-level definitions dependent on what was already imported by other modules, etc) are already painful to debug. -jJ From jimjjewett at gmail.com Thu Apr 26 02:27:54 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 25 Apr 2012 20:27:54 -0400 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F97C6CF.8060106@egenix.com> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> Message-ID: On Wed, Apr 25, 2012 at 5:41 AM, M.-A. Lemburg wrote: > Nick Coghlan wrote: >> On Wed, Apr 25, 2012 at 10:55 AM, Steven D'Aprano wrote: > IMO, defining things twice in the same module is not a very Pythonic > way of designing Python software. Agreed, which is one problem with accelerator modules. > Left aside the resource leakage, it also makes if difficult to find > the implementation that actually gets used, Agreed, which is another problem with accelerator modules. 
> bypasses "explicit is better than implicit", Agreed, which is another problem with import * from accelerator So far, these reasons have been saying "Don't do that if you can help it", and if you do it anyhow, making the code stick out is better than nothing. (That said, making it stick out by having top-level definitions that are *not* at the far left is ... unfortunate.) > and it doesn't address possible side-effects > of the definitions that you eventually override at the end of the > module. That is a real problem, but the answer is to be more explicit.

    try:
        from _foo import A
    except ImportError:
        ...  # define exactly A, as it will be needed

or

    try:
        A
    except NameError:
        ...  # define exactly A, as it will be needed

or even create a wrapper that imports or defines your object the first time it is called... > Python is normally written with a top-to-bottom view in mind, where > you don't expect things to suddenly change near the end. Actually, I tend to (wrongly) view the module-level code as fixed declarations rather than commands, so that leaving a name undefined or defining it conditionally is just as bad. Having an alternative definition in the same location is less of a problem. I have learned to look for import * at the beginning and end of a module, but more special cases would not be helpful. -jJ From bboe at cs.ucsb.edu Thu Apr 26 02:30:45 2012 From: bboe at cs.ucsb.edu (Bryce Boe) Date: Wed, 25 Apr 2012 17:30:45 -0700 Subject: [Python-ideas] Structured Error Output In-Reply-To: <20120425191500.081ce8f0@bhuda.mired.org> References: <20120425191500.081ce8f0@bhuda.mired.org> Message-ID: > Correct. And it isn't a problem, because it shouldn't matter if the > programmer wrote a bit of code that caused the compiler to raise a > syntax error, raised the syntax error themselves, evaled a string that > raised the syntax error, or wrote the message explicitly to standard > error: all those cases represent a syntax error to the programmer.
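[Editor's note: the "wrapper that imports or defines your object the first time it is called" that Jim alludes to might look like the sketch below. _LazyFallback, PurePythonA, and the accelerator module name "_no_such_accel_mod" are all invented for illustration; this is one possible shape, not an established idiom from the thread.]

```python
# A pure-Python fallback class, standing in for the slow implementation.
class PurePythonA:
    def ping(self):
        return "pure"

class _LazyFallback:
    """Defers the accelerator import until the object is first used."""
    def __init__(self, module_name, attr, fallback):
        self._module_name = module_name
        self._attr = attr
        self._fallback = fallback
        self._resolved = None

    def resolve(self):
        # First use: try the accelerator module, fall back to pure Python.
        if self._resolved is None:
            try:
                mod = __import__(self._module_name)
                self._resolved = getattr(mod, self._attr)
            except ImportError:
                self._resolved = self._fallback
        return self._resolved

    def __call__(self, *args, **kwargs):
        return self.resolve()(*args, **kwargs)

A = _LazyFallback("_no_such_accel_mod", "A", PurePythonA)
obj = A()  # the import is only attempted here, on first call
```

Since the hypothetical accelerator module does not exist, the call falls back to the pure-Python class, and no import work happens at module definition time.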
Perhaps you cannot envision a case where it matters because you don't work with people that intentionally try to cheat or mislead a system. I am merely pointing out that currently there is no way to distinguish between these behaviors, and I would personally like to add that support because I have a need to deterministically differentiate between them. I don't want to break backwards compatibility, so what I propose would only take place via a command line argument. >> This "channel" sharing of control (error messages) and data is a >> problem that affects more than just the python interpreter. I am >> hoping to start with python and provide a way to separate the control >> and data information so I can be certain that output on the "control" >> file descriptor is guaranteed to be generated by the interpreter. > If your program is writing *data* to stderr, it's badly designed. You > should fix it instead of trying to get Python changed to accommodate > it. That is a generalization which is not true. Counter example: I have students writing a simple interpreter in python and their compiler should output syntax errors to stderr when the program their interpreter is interpreting contains one. Now say their errors look very similar to Python errors: how does one distinguish between an error in their implementation of their interpreter and an error raised by their interpreter? Furthermore, having this separation is somewhat pointless without the structured part, as ideally I would like it if all compilers and interpreters produced similar output so I could easily measure how many errors beginning programmers have in various languages and group them by type. I have to start somewhere with this project and I was hoping the python community would be in favor of adding such support as I feel the changes are relatively trivial.
-Bryce From ncoghlan at gmail.com Thu Apr 26 02:58:04 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 26 Apr 2012 10:58:04 +1000 Subject: [Python-ideas] Structured Error Output In-Reply-To: References: <20120425191500.081ce8f0@bhuda.mired.org> Message-ID: On Thu, Apr 26, 2012 at 10:30 AM, Bryce Boe wrote: > Furthermore, having this separation is somewhat pointless without the > structured part, as ideally I would like it if all compilers and > interpreters produced similar output so I could easily measure how > many errors beginning programmers have in various languages and group > them by type. I have to start somewhere with this project and I was > hoping the python community would be in favor of adding such support > as I feel the changes are relatively trivial. And we're telling you that no, the changes you're interested in are not trivial - the use of stderr is embedded deep within many parts of the interpreter. I suggest just raising the bar for your students and requiring that they write their errors to both stderr *and* to an error log. Interpreter-generated errors will show up in the former, but will never appear in the latter. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From bboe at cs.ucsb.edu Thu Apr 26 04:04:15 2012 From: bboe at cs.ucsb.edu (Bryce Boe) Date: Wed, 25 Apr 2012 19:04:15 -0700 Subject: [Python-ideas] Structured Error Output In-Reply-To: References: <20120425191500.081ce8f0@bhuda.mired.org> Message-ID: > And we're telling you that no, the changes you're interested in are > not trivial - the use of stderr is embedded deep within many parts of > the interpreter. This is the first constructive comment I've received thus far. Perhaps I am a bit optimistic that grepping for output to stderr and replacing the write or fprintf calls with a function call would be appropriate. Maybe a tedious procedure, but it still seems trivial.
It's not like trying to replace the GIL ;) > I suggest just raising the bar for your students and require that they > write their errors to both stderr *and* to an error log. Interpreter > generated errors will show up in the former, but will never appear in > the latter. The example I gave was simply an example. Sure we could ask the students to do more, but ideally this tool would work on any recent python source without requiring source modification. Many of these solutions have been presented before; however, they fail to work in the general case. > Oh, I work with such cases often enough to know that if they've got a > complete programming language as a tool, you've already lost. > And if the people writing the code are antagonistic, there is no way > to differentiate those behaviors. Anything the python interpreter can > do the programmer can also do. And, for that matter, suppress. I realize that even if there was another output stream the user could write to it via os.write(3, "foobar"); however, static checks can be made on the code to detect such function calls. Also, I'm curious, can a python program suppress later syntax errors? >> That is a generalization which is not true. Counter example: I have >> students writing a simple interpreter in python and their compiler >> should output syntax errors to stderr when the program their >> interpreter is interpreting. Now say their errors look very similar >> to Python errors, how does one distinguish between an error in their >> implementation of their interpreter, and an error raised by their >> interpreter? > That's not writing data to stderr, that's writing errors. The problem > in this case is that the program in question isn't handling errors in > the implementation properly. If your students aren't bright enough to > figure out how to catch errors in their implementation and flag them > as such, flunk them. Again this was just a contrived example that demonstrates my desire to differentiate between the two.
Whether or not the students get stuck isn't the problem; the problem is that it's not possible to do the differentiation. But simply put, why not allow for such differentiation? What is lost by doing so? > Trying to get all language processors to produce similar error > messages is tilting at windmills. The existence of IDE's that parse > error message and let the user go through them in order hasn't been > sufficient to cause that to happen. Some abstract wish to study > beginners' errors will have even less effect. It is my opinion that most people make do with what they have available to them. Of course, I can do exactly what I want to do without modifying the interpreter; however, it suffers from the ambiguity problem I've already mentioned. One of the great things about open source software is the ability to adapt it to suit your own needs, and thus I prefer to take the approach of making things better. Perhaps no one before me has even considered separating interpreter output from the programs that the interpreter interprets, but I find that quite hard to believe. This really isn't a problem with compiled code, because the compilation and type checking process is separate from the execution process, though I'll admit there is the same problem with runtime errors such as segmentation faults, but these can be discovered with proper signal handling. > But you claim the structure is the important part. Want to give an > example of how you would "structure the error output" so that errors > in a program processing program source can be distinguished from > errors in the processed source, yet at the same time be similar enough > so that some tool could be used on both sets of errors? First, the two changes should work in tandem; thus both interpreters would have a flag, say --structured-error-output, that takes a filename. With such a flag, directing the error output to different files is quite trivial.
However, even if they went to the same stream (poor design in my opinion) the structured messages can have an attribute indicating which interpreter produced the message, thus allowing for differentiation. Of course, if they went to the same stream, then you still have the possibility of spoofing the other program, which is why the separation is necessary. Anyway, I appreciate the argument. It is fairly clear that if I were to implement this support it is not something that would be integrated in Python, thus it's not worth my time. I'll take the band-aid approach as everyone before me has. Thanks, Bryce From steve at pearwood.info Thu Apr 26 04:44:01 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 26 Apr 2012 12:44:01 +1000 Subject: [Python-ideas] Structured Error Output In-Reply-To: References: <20120425191500.081ce8f0@bhuda.mired.org> Message-ID: <20120426024401.GB30490@ando> On Wed, Apr 25, 2012 at 07:04:15PM -0700, Bryce Boe wrote: > I realize that even if there was another output stream the user could > write to it via os.write(3, "foobar"); however, static checks can be > made on the code to detect such function calls. I don't think so. Or at least, not easily.

    funcs = [len, str, eval, map]
    value = funcs[2]("__imp" + "port"[1:] + "__('sys')")
    f = getattr(value, ''.join(map(chr, (115, 116, 100, 101, 114, 114))))
    y = getattr(f, ''.join(map(chr, (119, 114, 105, 116, 101))))
    y("statically check this!\n")

> Also, I'm curious, can > a python program suppress later syntax errors?

    try:
        exec "x = )("
    except SyntaxError:
        print("nothing to see here, move along")

[...] > Again this was just a contrived example that demonstrates my want to > differentiate between the two. Whether or not the students get stuck > isn't the problem, the problem is that it's not possible to do the > differentiation. But simply put, why not allow for such > differentiation? What is lost by doing so? Apart from simplicity? You risk infinite regress.
stderr exists to differentiate "good" output from "error" output. So you propose a new stream to differentiate "good errors" (those raised by the students' interpreter, call it stderr2) from "bad errors" (those raised by the interpreter running the students' interpreter). At some point, someone will have some compelling (to them, not necessarily everyone else) use-case that needs to distinguish between "real" good errors going to stderr2 and "fake" good errors going to stderr2, and propose stderr3. And deeper and deeper down the rabbit hole we go... At some point, people will just say "Enough!". I suggest that the distinction between stdout and stderr is exactly that point. (Although, having said that, I wish there was a stdinfo for informational messages that are neither the intended program output nor unintended program errors, e.g. status messages, progress indicators, etc.) -- Steven From greg.ewing at canterbury.ac.nz Thu Apr 26 06:04:17 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Apr 2012 16:04:17 +1200 Subject: [Python-ideas] Structured Error Output In-Reply-To: References: <20120425191500.081ce8f0@bhuda.mired.org> Message-ID: <4F98C941.2050406@canterbury.ac.nz> On 26/04/12 14:04, Bryce Boe wrote: > Perhaps > I am a bit optimistic that grepping for output to stderr and replacing > the write or fprintf calls with a function call would be appropriate. I don't see how that would solve your problem anyway. If your student's interpreter code, written in Python, raises e.g. a TypeError, and nothing catches it, the error message for it will get printed by the very same printf call as any other uncaught exception. Seems to me the solution to your problem lies in sandboxing the student's code inside something that catches any exceptions emanating from it and logs them in a distinctive way. You could also replace sys.stdout and sys.stderr with objects that perform a similar function. 
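[Editor's note: the sandboxing Greg suggests could be sketched roughly as follows. run_sandboxed is a hypothetical helper, not an existing API, and real isolation of untrusted code needs far more than this; the point is only that the host can capture the student program's streams and log its uncaught exceptions distinctly from the host interpreter's own output.]

```python
import io
import sys
import traceback

def run_sandboxed(source):
    # Run student code with stdout/stderr captured, recording any
    # uncaught exception separately from the host's own streams.
    captured_out, captured_err = io.StringIO(), io.StringIO()
    old_out, old_err = sys.stdout, sys.stderr
    sys.stdout, sys.stderr = captured_out, captured_err
    student_error = None
    try:
        try:
            exec(source, {"__name__": "__student__"})
        except BaseException:
            student_error = traceback.format_exc()
    finally:
        sys.stdout, sys.stderr = old_out, old_err
    return captured_out.getvalue(), captured_err.getvalue(), student_error

out, err, exc = run_sandboxed("print('hi')\n1/0")
# out holds the student's output; exc holds its traceback, clearly
# separated from anything the host interpreter prints itself
```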
-- Greg From ncoghlan at gmail.com Thu Apr 26 06:07:38 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 26 Apr 2012 14:07:38 +1000 Subject: [Python-ideas] Structured Error Output In-Reply-To: <20120426024401.GB30490@ando> References: <20120425191500.081ce8f0@bhuda.mired.org> <20120426024401.GB30490@ando> Message-ID: On Thu, Apr 26, 2012 at 12:44 PM, Steven D'Aprano wrote: > (Although, having said that, I wish there was a stdinfo for > informational messages that are neither the intended program output nor > unintended program errors, e.g. status messages, progress indicators, > etc.) And, indeed, such a channel exists: it's called the logging system, which separates event *generation* (calls to the logging message API) from event *display* (configuration of logging handlers). In particular, see the following table in the logging HOWTO guide: http://docs.python.org/howto/logging.html#when-to-use-logging Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mwm at mired.org Thu Apr 26 06:18:12 2012 From: mwm at mired.org (Mike Meyer) Date: Thu, 26 Apr 2012 00:18:12 -0400 Subject: [Python-ideas] Structured Error Output In-Reply-To: References: <20120425191500.081ce8f0@bhuda.mired.org> Message-ID: <8851db61-4657-4368-9eae-78dbb2ad046c@email.android.com> Bryce Boe wrote: >I realize that even if there was another output stream the user could >write to it via os.write(3, "foobar"); however, static checks can be >made on the code to detect such function calls. Also, I'm curious, can >a python program suppress later syntax errors? Yes, a python program can suppress later syntax errors. >This really isn't a problem with compiled >code, because the compilation and type checking process is separate >from the execution process For sorting out error messages in modern languages, compilation and execution are not necessarily separate. You can get compilation errors at pretty much any point in the execution of the program.
Most such languages - including Python, but also languages that compile to machine or JVM code - include both the ability to import uncompiled source, compiling it along the way, and the ability to compile and run code fragments (aka "eval") in the program. Both of these can generate compilation errors in the middle of runtime. If my program imports a config module the user provided and it has a syntax error in it, is that syntax error a runtime error or a compilation error? >> But you claim the structure is the important part. Want to give an >> example of how you would "structure the error output" so that errors >> in a program processing program source can be distinguished from >> errors in the processed source, yet at the same time be similar >enough >> so that some tool could be used on both sets of errors?> >First, the two changes should work in tandem thus both interpreters >would have a flag, say --structured-error-output that takes a >filename. With such a flag, directing the different errors to different >files is quite trivial. It is? How do you distinguish between an actual syntax error and a syntax error raised by an explicit raise statement? And which of those two cases would a syntax error raised by passing a bad code fragment to eval be, or is that a third case requiring yet another flag? >Anyway, I appreciate the argument. It is fairly clear that if I were >to implement this support it is not something that would be integrated >in python thus it's not worth my time. I'll take the band-aid approach >as everyone before me has. I'm still waiting for a proposal solid enough to evaluate. I like the idea of more structured error output, and think it might be a nice addition to the interpreter. A python programmer already has complete control over all error messages, though it can be hard to get to. Making that easier is a worthy goal, but it's got to be more than the ability to send some ill-defined set of exceptions to a different output stream.
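[Editor's note: Mike's question about telling an actual syntax error from a hand-raised one can be illustrated with a short sketch. Both arrive as the same exception type, so the traceback text that reaches stderr carries no reliable marker of which case occurred; the function names below are invented for the demonstration.]

```python
def real_syntax_error():
    # a genuine parse failure, produced by the compiler itself
    compile(")(", "<user input>", "exec")

def fake_syntax_error():
    # the same exception type, raised explicitly by program code
    raise SyntaxError("invalid syntax")

seen = []
for fn in (real_syntax_error, fake_syntax_error):
    try:
        fn()
    except SyntaxError as exc:
        seen.append(type(exc).__name__)
# both cases record "SyntaxError" -- nothing in the exception type
# distinguishes compiler-generated from programmer-raised errors
```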
References: Message-ID: On Tue, Apr 24, 2012 at 12:42 AM, Eric Snow wrote: > The premise is that sys.implementation would be a "namedtuple" (like > sys.version_info). It would contain (as much as is practical) the > information that is specific to a particular implementation of Python. > "Required" attributes of sys.implementation would be those that the > standard library makes use of. For instance, importlib would make use > of sys.implementation.name (or sys.implementation.cache_tag) if there > were one. The thread from 2009 covered a lot of this ground already. > [1] > > Here are the "required" attributes of sys.implementation that I advocate: > > * name (mixed case; sys.implementation.name.lower() used as an identifier) > * version (of the implementation, not of the targeted language > version; structured like sys.version_info?) > > Here are other variables that _could_ go in sys.implementation: > > * cache_tag (e.g. 'cpython33' for CPython 3.3) > * repository > * repository_revision > * build_toolchain > * url (or website) > * site_prefix > * runtime > > Let's start with a minimum set of expected attributes, which would > have an immediate purpose in the stdlib. However, let's not disallow > implementations from adding whatever other attributes are meaningful > for them. FYI, I've created a tracker ticket with a patch and moved this over to python-dev. -eric [1] http://bugs.python.org/issue14673 From jimjjewett at gmail.com Thu Apr 26 16:36:04 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 26 Apr 2012 10:36:04 -0400 Subject: [Python-ideas] breaking out of module execution In-Reply-To: References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> Message-ID: On Wed, Apr 25, 2012 at 5:52 AM, Nick Coghlan wrote: > ...
tackle the problem of choosing between multiple > implementations of a module at runtime ... I like the idea, but I'm not sure I would like the result. The obvious path (at least for experimenting) is to replace the import statement with an import function. (a) At the interactive prompt, the function would return the module, and therefore allow it to be referenced as _. (I don't always remember the "as shortname" until I've already hit enter. And often all I really want is to see help(module), but without switching to another window.) (b) The functional interface could expose a configuration object, so that in addition to deciding between alternate implementations, a single implementation could set up objects differently. (Do I really need to define that type of handler? What loggers should I enable initially, even while setting up the rest of the logging machinery? OK, let me open that database connection before I define this class.) These are both features I have often wanted. But I don't want to deal with the resulting questions, like "Wait, how can that logger not exist? Oh, someone else imported logging first..." And I'm not sure it is possible to get the freedom of (b) without those problems, unless modules stop being singletons. -jJ From jimjjewett at gmail.com Thu Apr 26 22:30:32 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 26 Apr 2012 16:30:32 -0400 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F9818D6.5010808@stoneleaf.us> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97AEAA.90009@canterbury.ac.nz> <4F97B19E.7000604@hotpy.org> <4F9818D6.5010808@stoneleaf.us> Message-ID: On Wed, Apr 25, 2012 at 11:31 AM, Ethan Furman wrote:
> Matt Joiner wrote:
>>
>> if not use_simple_api:
>>     class C:
> More like:
>
> class C:
>     def basic_method(self):
>         pass
>     if not use_simple_gui:
>         def advanced_method(self, this, that):
>             pass

He may have been thinking of a replacement idiom

    class C:
        ...

    if system_has_propertyX():
        class C(C):  # Extend the Base class, but keep the name...

-jJ From techtonik at gmail.com Fri Apr 27 07:45:20 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 27 Apr 2012 08:45:20 +0300 Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 43 In-Reply-To: <20120424184547.0bca79f5@bhuda.mired.org> References: <20120424184547.0bca79f5@bhuda.mired.org> Message-ID: On Wed, Apr 25, 2012 at 1:45 AM, Mike Meyer wrote: > On Tue, 24 Apr 2012 18:42:24 +0300 > anatoly techtonik wrote: > >> On Mon, Apr 23, 2012 at 3:01 PM, Chris Rebert wrote: >> > On Mon, Apr 23, 2012 at 4:10 AM, Hobson Lane wrote: >> > >> >> On Mon, Apr 23, 2012 at 6:00 PM, wrote: >> >>> >> >>> Send Python-ideas mailing list submissions to >> >>> python-ideas at python.org >> >>> >> >>> To subscribe or unsubscribe via the World Wide Web, visit >> >>> http://mail.python.org/mailman/listinfo/python-ideas >> >>> or, via email, send a message with subject or body 'help' to >> >>> python-ideas-request at python.org >> >>> >> >>> You can reach the person managing the list at >> >>> python-ideas-owner at python.org >> >>> >> >>> When replying, please edit your Subject line so it is more specific >> >>> than "Re: Contents of Python-ideas digest..." >> > >> > Please avoid replying to the digest; it breaks conversation threading. >> > Switch to a non-digest mailing list subscription when not lurking. >> >> But to reply to a non-digest message you need to receive it in >> non-digest mode, which didn't happen already. The only way it makes >> sense is when you ask the Mailman to resend the message again. I don't >> know if that's possible. > > Your initial statement is - or at least may be - wrong.
If the digest > is in one of the well-known formats, a good MUA will let you burst a > digest into individual messages and reply to them just like any other > message to a list. mh, nmh and GUIs built on top of them can do this. I use GMail, which is a good MUA to me (much better than mh, nmh and their GUIs). I believe most users feel comfortable with emails in their browsers and won't install some awkward terminal mh, nmh just to reply to the digest messages. The requirement to "use something proper beforehand" was neither a solution nor an alternative. I am not sure that `programs` nowadays make any sense if you cannot access your data from all the entry points. -- anatoly t. From phd at phdru.name Fri Apr 27 08:25:57 2012 From: phd at phdru.name (Oleg Broytman) Date: Fri, 27 Apr 2012 10:25:57 +0400 Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 43 In-Reply-To: References: <20120424184547.0bca79f5@bhuda.mired.org> Message-ID: <20120427062557.GA32663@iskra.aviel.ru> Hello! On Fri, Apr 27, 2012 at 08:45:20AM +0300, anatoly techtonik wrote: > won't install some awkward terminal "awkward" indeed! Trolling as usual, yeah? > Requirement to "use something proper > beforehand" was neither a solution nor an alternative. It was not a requirement, just advice. But it was advice to solve a real problem. Replying to the digest brings a lot of problems and thus prevents effective communication. Email, mailing lists and archives don't make sense if they don't help to communicate. And it was only AN advice, not THE advice. Another solution for the same problem would be not to reply to the digest. I am sure there are other solutions. > I am not sure that `programs` nowadays makes any sense if you can not > access your data from all the entry points. Do you believe - those "awkward" terminal programs work remotely quite fine?! Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN.
From ericsnowcurrently at gmail.com Fri Apr 27 08:36:26 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 27 Apr 2012 00:36:26 -0600 Subject: [Python-ideas] PEP 4XX: Adding sys.implementation Message-ID: I've written up a PEP for the sys.implementation idea. Feedback is welcome! You'll notice some gaps which I'll be working to fill in over the next couple days. Don't mind the gaps. They are in less critical (?) portions and I wanted to get this out to you before the weekend. Thanks! -eric -------------------------------------------------------------- PEP: 4XX Title: Adding sys.implementation Version: $Revision$ Last-Modified: $Date$ Author: Eric Snow Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 26-April-2012 Python-Version: 3.3 Abstract ======== This PEP introduces a new variable for the sys module: ``sys.implementation``. The variable holds consolidated information about the implementation of the running interpreter. Thus ``sys.implementation`` is the source to which the standard library may look for implementation-specific information. The proposal in this PEP is in line with a broader emphasis on making Python friendlier to alternate implementations. It describes the new variable and the constraints on what that variable contains. The PEP also explains some immediate use cases for ``sys.implementation``. Motivation ========== For a number of years now, the distinction between Python-the-language and CPython (the reference implementation) has been growing. Most of this change is due to the emergence of Jython, IronPython, and PyPy as viable alternate implementations of Python. Consider, however, the nearly two decades of CPython-centric Python (i.e. most of its existence). That focus had understandably contributed to quite a few CPython-specific artifacts both in the standard library and exposed in the interpreter.
Though the core developers have made an effort in recent years to address this, quite a few of the artifacts remain. Part of the solution is presented in this PEP: a single namespace on which to consolidate implementation specifics. This will help focus efforts to differentiate the implementation specifics from the language. Additionally, it will foster a multiple-implementation mindset. Proposal ======== We will add ``sys.implementation``, in the sys module, as a namespace to contain implementation-specific information. The contents of this namespace will remain fixed during interpreter execution and through the course of an implementation version. This ensures that behaviors which depend on variables in ``sys.implementation`` don't change between versions. ``sys.implementation`` is a dictionary, as opposed to any form of "named" tuple (a la ``sys.version_info``). This is partly because it doesn't have meaning as a sequence, and partly because it's a potentially more variable data structure. The namespace will contain at least the variables described in the `Required Variables`_ section below. However, implementations are free to add other implementation information there. Some possible extra variables are described in the `Other Possible Variables`_ section. This proposal takes a conservative approach in requiring only two variables. As more become appropriate, they may be added with discretion. Required Variables -------------------- These are variables in ``sys.implementation`` on which the standard library would rely, meaning they would need to be defined: name the name of the implementation (case sensitive). version the version of the implementation, as opposed to the version of the language it implements. This would use a standard format, similar to ``sys.version_info`` (see `Version Format`_).
Other Possible Variables ------------------------ These variables could be useful, but don't necessarily have a clear use case presently: cache_tag a string used for the PEP 3147 cache tag (e.g. 'cpython33' for CPython 3.3). The name and version from above could be used to compose this, though an implementation may want something else. However, module caching is not a requirement of implementations, nor is the use of cache tags. repository the implementation's repository URL. repository_revision the revision identifier for the implementation. build_toolchain identifies the tools used to build the interpreter. url (or website) the URL of the implementation's site. site_prefix the preferred site prefix for this implementation. runtime the run-time environment in which the interpreter is running. gc_type the type of garbage collection used. Version Format -------------- XXX same as sys.version_info? Rationale ========= The status quo for implementation-specific information gives us that information in a more fragile, harder to maintain way. It's spread out over different modules or inferred from other information, as we see with ``platform.python_implementation()``. This PEP is the main alternative to that approach. It consolidates the implementation-specific information into a single namespace and makes explicit that which was implicit. With the single-namespace-under-sys so straightforward, no alternatives have been considered for this PEP. Discussion ========== The topic of ``sys.implementation`` came up on the python-ideas list in 2009, where the reception was broadly positive [1]_. I revived the discussion recently while working on a pure-python ``imp.get_tag()`` [2]_. The messages in `issue 14673`_ are also relevant. Use-cases ========= ``platform.python_implementation()`` ------------------------------------ "explicit is better than implicit" The platform module guesses the python implementation by looking for clues in a couple different sys variables [3]_. 
However, this approach is fragile. Beyond that, it's limited to those implementations that core developers have blessed by special-casing them in the platform module. With ``sys.implementation``, the various implementations would *explicitly* set the values in their own version of the sys module. Aside from the guessing, another concern is that the platform module is part of the stdlib, which ideally would minimize implementation details such as would be moved to ``sys.implementation``. Any overlap between ``sys.implementation`` and the platform module would simply defer to ``sys.implementation`` (with the same interface in platform wrapping it). Cache Tag Generation in Frozen Importlib ---------------------------------------- PEP 3147 defined the use of a module cache and cache tags for file names. The importlib bootstrap code, frozen into the Python binary as of 3.3, uses the cache tags during the import process. Part of the project to bootstrap importlib has been to clean out of Lib/import.c any code that did not need to be there. The cache tag defined in Lib/import.c was hard-coded to ``"cpython" MAJOR MINOR`` [4]_. For importlib the options are either hard-coding it in the same way, or guessing the implementation in the same way as does ``platform.python_implementation()``. As long as the hard-coded tag is limited to CPython-specific code, it's livable. However, inasmuch as other Python implementations use the importlib code to work with the module cache, a hard-coded tag would become a problem. Directly using the platform module in this case is a non-starter. Any module used in the importlib bootstrap must be built-in or frozen, neither of which apply to the platform module. This is the point that led to the recent interest in ``sys.implementation``. Regardless of how the implementation name is gotten, the version to use for the cache tag is more likely to be the implementation version rather than the language version.
That implementation version is not readily identified anywhere in the standard library. Implementation-Specific Tests ----------------------------- XXX http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l509 http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1246 http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1252 http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1275 Jython's ``os.name`` Hack ------------------------- XXX http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l512 Impact on CPython ================= XXX Feedback From Other Python Implementors ========================================= IronPython ---------- XXX Jython ------ XXX PyPy ---- XXX Past Efforts ============ XXX PEP 3139 XXX PEP 399 Open Issues =========== * What are the long-term objectives for sys.implementation? - pull in implementation details from the main sys namespace and elsewhere (PEP 3137 lite). * Alternatives to the approach dictated by this PEP? * ``sys.implementation`` as a proper namespace rather than a dict. It would be its own module or an instance of a concrete class. Implementation ============== The implementation of this PEP is covered in `issue 14673`_. References ========== .. [1] http://mail.python.org/pipermail/python-dev/2009-October/092893.html .. [2] http://mail.python.org/pipermail/python-ideas/2012-April/014878.html .. [3] http://hg.python.org/cpython/file/2f563908ebc5/Lib/platform.py#l1247 .. [4] http://hg.python.org/cpython/file/2f563908ebc5/Python/import.c#l121 .. _issue 14673: http://bugs.python.org/issue14673 Copyright ========= This document has been placed in the public domain.
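The ``cache_tag`` entry sketched above is easy to picture in code. Here is a hedged illustration of composing a PEP 3147-style tag from the proposed ``name`` and ``version`` variables; the dict literal and the joining rule are assumptions for illustration, not anything the PEP mandates:

```python
def compose_cache_tag(implementation):
    # Join the implementation name with the first two version components,
    # e.g. {'name': 'cpython', 'version': (3, 3, 0)} -> 'cpython33'.
    major, minor = implementation["version"][:2]
    return "{0}{1}{2}".format(implementation["name"].lower(), major, minor)

example = {"name": "cpython", "version": (3, 3, 0)}
print(compose_cache_tag(example))  # cpython33
```

An implementation remains free to publish a completely different tag, which is exactly why the PEP lists ``cache_tag`` as a separate variable rather than deriving it.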
Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: From stefan_ml at behnel.de Fri Apr 27 09:05:26 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 27 Apr 2012 09:05:26 +0200 Subject: [Python-ideas] breaking out of module execution In-Reply-To: References: Message-ID: Eric Snow, 24.04.2012 21:23: > In a function you can use a return statement to break out of execution > in the middle of the function. With modules you have no recourse. > This is akin to return statements being allowed only at the end of a > function. > > There are a small number of ways you can work around this, but they > aren't great. This includes using wrapper modules or import hooks or > sometimes from-import-*. Otherwise, if your module's execution is > conditional, you end up indenting everything inside an if/else > statement. > > Proposal: introduce a non-error mechanism to break out of module > execution. This could be satisfied by a statement like break or > return, though those specific ones could be confusing. It could also > involve raising a special subclass of ImportError that the import > machinery simply handles as not-an-error. > > This came up last year on python-list with mixed results. [1] > However, time has not dimmed the appeal for me so I'm rebooting here. > > While the proposal seems relatively minor, the use cases are not > extensive. The three main ones I've encountered are these: > > 1. C extension module with fallback to pure Python: > > try: > from _extension_module import * > except ImportError: > pass > else: > break # or whatever color the bikeshed is > > # pure python implementation goes here > > 2. module imported under different name: > > if __name__ != "expected_name": > from expected_name import * > break > > # business as usual > > 3. 
module already imported under a different name: > > if "other_module" in sys.modules: > exec("from other_module import *", globals()) > break > > # module code here > > Thoughts? Without having read through the thread, I think that code that needs this is just badly structured. All of the above cases can be fixed by moving the code into a separate (and appropriately named) module and importing conditionally from that. I'm generally -1 on anything that would allow non-error code at an arbitrary place further up in a module to prevent the non-indented module code I'm looking at from being executed. Whether the result of that execution makes it into the module API or not is a different question that is commonly answered either by "__all__" at the very top of a module or by the code at the very end of the module, not in between. Stefan From jimjjewett at gmail.com Fri Apr 27 15:19:26 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 27 Apr 2012 09:19:26 -0400 Subject: [Python-ideas] breaking out of module execution In-Reply-To: <4F982D34.4010709@pearwood.info> References: <20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com> <4F970B16.2020702@egenix.com> <4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info> <4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info> <4F97E9D7.5090804@hotpy.org> <4A8CACF1-860D-48F7-9538-596A3EEB4445@mac.com> <4F98039A.3090301@egenix.com> <4F982D34.4010709@pearwood.info> Message-ID: On Wed, Apr 25, 2012 at 12:58 PM, Steven D'Aprano wrote: > Why would hundreds of lines suddenly become > unreadable and hard to edit because they have a little bit of leading > whitespace in front of them? A single indent isn't that bad for reading or editing; the problem is with skimming. def X... class A.... ZZZ= .... Anything not on the far left is part of the undifferentiated "...". 
-jJ From jimjjewett at gmail.com Fri Apr 27 15:31:02 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 27 Apr 2012 09:31:02 -0400 Subject: [Python-ideas] Module __getattr__ [Was: breaking out of module execution] Message-ID: On Wed, Apr 25, 2012 at 2:35 PM, Matt Joiner wrote: > If this is to be done I'd like to see all special methods supported. One of > particular interest to modules is __getattr__... For What It's Worth, supporting __setattr__ and __getattr__ is one of the few reasons that I have considered subclassing modules. The workarounds of either offering public set_varX and get_varX functions, or moving configuration to a separate singleton, just feel kludgy. Since those module methods would be defined at the far left, I don't think it would mess up understanding any more than they already do on regular classes. (There is always *some* surprise, just because they are unusual.) That said, I personally tend to view modules as a special case of classes, so I wouldn't be shocked if others found it more confusing than I would -- particularly as to whether or not the module's __getattr__ would somehow affect the lookup chain for classes defined within the module. -jJ From techtonik at gmail.com Fri Apr 27 20:12:15 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 27 Apr 2012 21:12:15 +0300 Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 43 In-Reply-To: <20120427062557.GA32663@iskra.aviel.ru> References: <20120424184547.0bca79f5@bhuda.mired.org> <20120427062557.GA32663@iskra.aviel.ru> Message-ID: On Fri, Apr 27, 2012 at 9:25 AM, Oleg Broytman wrote: > Hell! Ok. Let's discuss. > On Fri, Apr 27, 2012 at 08:45:20AM +0300, anatoly techtonik wrote: >> won't install some awkward terminal > > "awkward" indeed! Trolling as usual, yeah? You won't win this fight. =) "Awkward (Adjective): Causing difficulty; hard to do or deal with." I can't see what's wrong with that.
You don't want to say that installing an unknown program and learning how to work with the whole terminal toolchain and syncing your mail archive across nodes is as easy as working with GMail in Chrome, do you? You can record a session at shelr.tv and try to convince me, but I have experience that tells me that Linux keyboard terminal input is sick, and that is the reason why terminal programs are awkward to deal with - the final explanation I reached rests at http://www.leonerd.org.uk/hacks/fixterms/ - I wish the library was a part of the Python distribution. >> Requirement to "use something proper >> beforehand" was neither a solution nor an alternative. > > It was not a requirement, just advice. But that was advice to > solve a real problem. Replying to digests brings a lot of problems and > thus prevents effective communication. Email, mailing lists and archives > don't make sense if they don't help to communicate. That's why I prefer Google Groups. You can use email, subscribe as a mailing list, read the web, have a searchable archive and reply to any thread you haven't been subscribed to. Everything from a single interface - no need to carry your mail archive around anymore if you want to search it without 3rd party services. That is my definition of an effective communication platform. The constructive advice - research a tutorial on how to properly integrate full sync between Mailman and Google Groups and ban usage of digests altogether. In fact I've asked in the Mailman group how to properly set it up to automatically accept the mails from Groups subscribers, but the stuff got too complicated, so it was postponed for a better time to learn email protocol intricacies. > And it was only AN advice, not THE advice. Another solution for the > same problem would be not to reply to digests. I am sure there are other > solutions. Well, sorry for my tone. It seems I've entered "that" favorite style again.
Of course, I accept the idea that mh (which is Public Domain and that's awesome) can solve the problem with digest reading, but the story is too exotic for me, and I certainly won't sacrifice features of web based mail services to make sure I can properly reply to digests. I'd better not reply to them at all next time, just because my mail agent doesn't allow it. That's my personal choice, but it is also the choice that makes people feel excluded when they are faced with such requirements. >> I am not sure that `programs` nowadays makes any sense if you can not >> access your data from all the entry points. > > Do you believe - those "awkward" terminal programs work remotely > quite fine?! But I can't use them from my (imaginary) tablet version 3. =) And for me any SSH session interaction is still slow - maybe I am too picky, but the typing delay in comparison with the browser is more than enough to feel the difference. -- anatoly t. From mwm at mired.org Fri Apr 27 21:57:12 2012 From: mwm at mired.org (Mike Meyer) Date: Fri, 27 Apr 2012 15:57:12 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 43 In-Reply-To: References: <20120424184547.0bca79f5@bhuda.mired.org> <20120427062557.GA32663@iskra.aviel.ru> Message-ID: <20120427155712.2929246d@bhuda.mired.org> On Fri, 27 Apr 2012 21:12:15 +0300 anatoly techtonik wrote: > On Fri, Apr 27, 2012 at 9:25 AM, Oleg Broytman wrote: > > Hell! > Ok. Let's discuss. Since I made the suggestion, I'll step in here. > > On Fri, Apr 27, 2012 at 08:45:20AM +0300, anatoly techtonik wrote: > >> won't install some awkward terminal > > "awkward" indeed! Trolling as usual, yeah? > You won't win this fight. =) A truism in any web discussion. > "Awkward (Adjective): Causing difficulty; hard to do or deal with." > I can't see what's wrong with that.
You don't want to say that > installing an unknown program and learning how to work with the whole > terminal toolchain and syncing your mail archive across nodes is as > easy as working with GMail in Chrome, do you? Yes, it is. The setup process is a relatively straightforward, one-time-per-node thing. If you're willing to do a little research instead of expecting to only use the work of others, you also might be able to avoid the terminal toolchain issues. The GMail web client has a number of problems that crop up over and over again. Its mail marking facilities are subpar, making simply reading mail more difficult than it is on other clients *every time you read mail*. Its mail quoting mechanism is fundamentally broken, requiring hand-editing the quote *every time you reply to mail*. And, of course, GMail can't burst a digest, meaning you either don't reply to digest posts, fix the headers by hand, or piss other people off - all creating unneeded difficulty *every time you reply to a digest message*. So eventually using only GMail in a web browser will cause more difficulty than setting up a proper mail client. Unless you read very little mail, in which case - why are you getting the digests? > >> Requirement to "use something proper > >> beforehand" was neither a solution nor an alternative. > > It was not a requirement, just advice. But that was advice to > > solve a real problem. Replying to digests brings a lot of problems and > > thus prevents effective communication. Email, mailing lists and archives > > don't make sense if they don't help to communicate. > That's why I prefer Google Groups. You can use email, subscribe as a > mailing list, read the web, have a searchable archive and reply to any > thread you haven't been subscribed to. Everything from a single > interface - no need to carry your mail archive around anymore if you > want to search it without 3rd party services. That is my definition of > an effective communication platform.
The constructive advice - research a > tutorial on how to properly integrate full sync between Mailman and > Google Groups and ban usage of digests altogether. Google Groups is nothing more than a web interface to a netnews system. IIUC, one with a broken news<->mail gateway. It doesn't provide anything that the mail list doesn't provide, except the ability to reply from Google Groups. And that's broken because Google Groups is broken. If you really want to use a news or web interface and reply properly, take the time to find one that doesn't have a broken mail gateway. > > And it was only AN advice, not THE advice. Another solution for the > > same problem would be not to reply to digests. I am sure there are other > > solutions. > Well, sorry for my tone. It seems I've entered "that" favorite style > again. Of course, I accept the idea that mh (which is Public Domain > and that's awesome) can solve the problem with digest reading, but the > story is too exotic for me, and I certainly won't sacrifice features > of web based mail services to make sure I can properly reply to > digests. Nobody said you had to make that sacrifice. There aren't any features of generic web based mail services that aren't available in proper mail readers. Sure, mh isn't one of those. But it may not be the only mail reader that can burst a digest. So long as you waste effort trying to change the world rather than changing the part you can change, you'll never find out if that's true or not. And of course, you always have the option of only using mh (or a GUI wrapper for same) to read digests. Treating a digest as a single message is awkward enough that the difficulty of setting up mh and a GUI wrapper will be lost in the noise if you read enough digests. > I'd better not reply to them at all next time, just because > my mail agent doesn't allow it. That's my personal choice, but it is also > the choice that makes people feel excluded when they are faced with such > requirements.
If you choose to act in a way that makes you feel excluded, that's your problem. Sure, it's not the one I want people following, but I'm not going to waste time trying to change your behavior. I'll point out your errors to help other people avoid them, but you can do what you want with that information. > >> I am not sure that `programs` nowadays makes any sense if you can not > >> access your data from all the entry points. > > Do you believe - those "awkward" terminal programs work remotely > > quite fine?! > But I can't use them from my (imaginary) tablet version 3. =) I could, if I really wanted to. But here you're fundamentally correct - the mh mail readers only have one interaction with mail servers: "Load unread mail". That makes using them in an environment where you want to deal with mail from multiple machines problematical, at best. That's why I quit using them. Of course, there are other mail readers besides GMail that don't have that problem. They also don't have the problems that GMail has. They may well have other problems, but only you can figure out what those are and change the things you can control to best deal with them. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From miki.tebeka at gmail.com Sat Apr 28 00:52:17 2012 From: miki.tebeka at gmail.com (Miki Tebeka) Date: Fri, 27 Apr 2012 15:52:17 -0700 (PDT) Subject: [Python-ideas] Anyone working on a platform-agnostic os.startfile() In-Reply-To: References: Message-ID: <2596653.232.1335567137261.JavaMail.geo-discussion-forums@ynbv36> There's http://pypi.python.org/pypi/desktop/0.4, but it seems to be unmaintained. It provides an "open" command.
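For reference, the core of such a launcher is small. A rough sketch of the platform dispatch being discussed follows; the command choices are assumptions about each platform's conventions, not the ``desktop`` package's actual API:

```python
import os
import subprocess
import sys

def launcher_argv(path):
    """Return the argv for the platform's default "open" command, or None
    where os.startfile() should be used instead (sketch only)."""
    if os.name == "nt":
        return None  # Windows: os.startfile() handles this natively
    if sys.platform == "darwin":
        return ["open", path]  # OS X LaunchServices helper
    return ["xdg-open", path]  # assumed default for freedesktop-style systems

def open_path(path):
    # Dispatch to the native mechanism chosen above.
    argv = launcher_argv(path)
    if argv is None:
        os.startfile(path)
    else:
        subprocess.call(argv)
```

The hard parts the thread identifies - edit vs. view preference, GUI vs. terminal editors - sit on top of this dispatch, which is where a stdlib version would earn its keep.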
On Sunday, April 22, 2012 10:21:10 PM UTC-7, Hobson Lane wrote: > > There is significant interest in a cross-platform > file-launcher.[1][2][3][4] The ideal implementation would be > an operating-system-agnostic interface that launches a file for editing or > viewing, similar to the way os.startfile() works for Windows, but > generalized to allow caller-specification of view vs. edit preference and > support all registered os.name operating systems, not just 'nt'. > > Mercurial has a mature python implementation for cross-platform launching > of an editor (either GUI editor or terminal-based editor like vi).[5][6] > The python std lib os.startfile obviously works for Windows. > > The Mercurial functionality could be rolled into os.startfile() with > additional named parameters for edit or view preference and gui or non-gui > preference. Perhaps that would enable backporting below Python 3.x. Or is > there a better place to incorporate this multi-platform file launching > capability? > > [1]: > http://stackoverflow.com/questions/1856792/intelligently-launching-the-default-editor-from-inside-a-python-cli-program > [2]: > http://stackoverflow.com/questions/434597/open-document-with-default-application-in-python > [3]: > http://stackoverflow.com/questions/1442841/lauch-default-editor-like-webbrowser-module > [4]: > http://stackoverflow.com/questions/434597/open-document-with-default-application-in-python > [5]: http://selenic.com/repo/hg-stable/file/2770d03ae49f/mercurial/ui.py > [6]: > http://selenic.com/repo/hg-stable/file/2770d03ae49f/mercurial/util.py From ericsnowcurrently at gmail.com Sat Apr 28 08:06:29 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 28 Apr 2012 00:06:29 -0600 Subject: [Python-ideas] PEP 4XX: Adding sys.implementation In-Reply-To: References: Message-ID: Here's an update to the PEP.
Though I have indirect or old feedback already, I'd love to hear from the other main Python implementations, particularly regarding the version variable. Thanks. -eric ------------------------------------------------------------- PEP: 421 Title: Adding sys.implementation Version: $Revision$ Last-Modified: $Date$ Author: Eric Snow Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 26-April-2012 Post-History: 26-April-2012 Abstract ======== This PEP introduces a new variable for the ``sys`` module: ``sys.implementation``. The variable holds consolidated information about the implementation of the running interpreter. Thus ``sys.implementation`` is the source to which the standard library may look for implementation-specific information. The proposal in this PEP is in line with a broader emphasis on making Python friendlier to alternate implementations. It describes the new variable and the constraints on what that variable contains. The PEP also explains some immediate use cases for ``sys.implementation``. Motivation ========== For a number of years now, the distinction between Python-the-language and CPython (the reference implementation) has been growing. Most of this change is due to the emergence of Jython, IronPython, and PyPy as viable alternate implementations of Python. Consider, however, the nearly two decades of CPython-centric Python (i.e. most of its existence). That focus had understandably contributed to quite a few CPython-specific artifacts both in the standard library and exposed in the interpreter. Though the core developers have made an effort in recent years to address this, quite a few of the artifacts remain. Part of the solution is presented in this PEP: a single namespace on which to consolidate implementation specifics. This will help focus efforts to differentiate the implementation specifics from the language. Additionally, it will foster a multiple-implementation mindset.
Proposal ======== We will add ``sys.implementation``, in the ``sys`` module, as a namespace to contain implementation-specific information. The contents of this namespace will remain fixed during interpreter execution and through the course of an implementation version. This ensures that behaviors which depend on variables in ``sys.implementation`` don't change between versions. ``sys.implementation`` will be a dictionary, as opposed to any form of "named" tuple (a la ``sys.version_info``). This is partly because it doesn't have meaning as a sequence, and partly because it's a potentially more variable data structure. The namespace will contain at least the variables described in the `Required Variables`_ section below. However, implementations are free to add other implementation information there. Some possible extra variables are described in the `Other Possible Variables`_ section. This proposal takes a conservative approach in requiring only two variables. As more become appropriate, they may be added with discretion. Required Variables ------------------ These are variables in ``sys.implementation`` on which the standard library would rely, meaning implementors must define them: name the name of the implementation (case sensitive). version the version of the implementation, as opposed to the version of the language it implements. This would use a standard format, similar to ``sys.version_info`` (see `Version Format`_). Other Possible Variables ------------------------ These variables could be useful, but don't necessarily have a clear use case presently: cache_tag a string used for the PEP 3147 cache tag (e.g. 'cpython33' for CPython 3.3). The name and version from above could be used to compose this, though an implementation may want something else. However, module caching is not a requirement of implementations, nor is the use of cache tags. repository the implementation's repository URL. repository_revision the revision identifier for the implementation.
build_toolchain identifies the tools used to build the interpreter. url (or website) the URL of the implementation's site. site_prefix the preferred site prefix for this implementation. runtime the run-time environment in which the interpreter is running. gc_type the type of garbage collection used. Version Format -------------- A main point of ``sys.implementation`` is to contain information that will be used in the standard library. In order to facilitate the usefulness of a version variable, its value should be in a consistent format across implementations. XXX Subject to feedback As such, the format of ``sys.implementation['version']`` must follow that of ``sys.version_info``, which is effectively a named tuple. It is a familiar format and generally consistent with normal version format conventions. Rationale ========= The status quo for implementation-specific information gives us that information in a more fragile, harder to maintain way. It's spread out over different modules or inferred from other information, as we see with ``platform.python_implementation()``. This PEP is the main alternative to that approach. It consolidates the implementation-specific information into a single namespace and makes explicit that which was implicit. The ``sys`` module should hold the new namespace because ``sys`` is the depot for interpreter-centric variables and functions. With the single-namespace-under-sys so straightforward, no alternatives have been considered for this PEP. Discussion ========== The topic of ``sys.implementation`` came up on the python-ideas list in 2009, where the reception was broadly positive [1]_. I revived the discussion recently while working on a pure-python ``imp.get_tag()`` [2]_. The messages in `issue #14673`_ are also relevant.
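To make the version-format discussion above concrete, here is a hedged sketch of the two required variables for a hypothetical CPython 3.3, stored in a plain dict standing in for ``sys.implementation``; all values are illustrative, not mandated by this PEP:

```python
# Illustrative stand-in for the proposed sys.implementation dict;
# 'version' follows the shape of sys.version_info.
implementation = {
    "name": "cpython",
    "version": (3, 3, 0, "final", 0),
}

# A stdlib consumer could then branch explicitly instead of guessing:
def python_implementation(info=implementation):
    return info["name"]

print(python_implementation())  # cpython
```

The point of the fixed ``version`` shape is exactly this kind of consumer: code can slice ``info["version"][:2]`` the same way it slices ``sys.version_info`` today, regardless of which implementation is running.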
Use-cases ========= ``platform.python_implementation()`` ------------------------------------ "explicit is better than implicit" The ``platform`` module guesses the Python implementation by looking for clues in a couple different ``sys`` variables [3]_. However, this approach is fragile. Beyond that, it's limited to those implementations that core developers have blessed by special-casing them in the ``platform`` module. With ``sys.implementation`` the various implementations would *explicitly* set the values in their own version of the ``sys`` module. Aside from the guessing, another concern is that the ``platform`` module is part of the stdlib, which ideally would minimize implementation details such as would be moved to ``sys.implementation``. Any overlap between ``sys.implementation`` and the ``platform`` module would simply defer to ``sys.implementation`` (with the same interface in ``platform`` wrapping it). Cache Tag Generation in Frozen Importlib ---------------------------------------- PEP 3147 defined the use of a module cache and cache tags for file names. The importlib bootstrap code, frozen into the Python binary as of 3.3, uses the cache tags during the import process. Part of the project to bootstrap importlib has been to clean out of `Python/import.c` any code that did not need to be there. The cache tag defined in `Python/import.c` was hard-coded to ``"cpython" MAJOR MINOR`` [4]_. For importlib the options are either hard-coding it in the same way, or guessing the implementation in the same way as does ``platform.python_implementation()``. As long as the hard-coded tag is limited to CPython-specific code, it's livable. However, inasmuch as other Python implementations use the importlib code to work with the module cache, a hard-coded tag would become a problem. Directly using the ``platform`` module in this case is a non-starter.
Any module used in the importlib bootstrap must be built-in or frozen, neither of which applies to the ``platform`` module. This is the point that led to the recent interest in ``sys.implementation``. Regardless of the outcome for the implementation name used, another problem relates to the version used in the cache tag. That version is likely to be the implementation version rather than the language version. However, the implementation version is not readily identified anywhere in the standard library. Implementation-Specific Tests ----------------------------- Currently there are a number of implementation-specific tests in the test suite under ``Lib/test``. The test support module (`Lib/test/support.py`_) provides some functionality for dealing with these tests. However, like the ``platform`` module, ``test.support`` must do some guessing that ``sys.implementation`` would render unnecessary. Jython's ``os.name`` Hack ------------------------- XXX http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l512 Feedback From Other Python Implementors ========================================= IronPython ---------- XXX Jython ------ XXX PyPy ---- XXX Past Efforts ============ PEP 3139 -------- This PEP from 2008 recommended a clean-up of the ``sys`` module in part by extracting implementation-specific variables and functions into a separate module. PEP 421 is a much lighter version of that idea. While PEP 3139 was rejected, its goals are reflected in PEP 421 to a large extent. PEP 399 ------- This informational PEP dictates policy regarding the standard library, helping to make it friendlier to alternate implementations. PEP 421 is proposed in that same spirit. Open Issues =========== * What are the long-term objectives for ``sys.implementation``? - possibly pull in implementation details from the main ``sys`` namespace and elsewhere (PEP 3137 lite). * Alternatives to the approach dictated by this PEP?
* ``sys.implementation`` as a proper namespace rather than a dict. It would be its own module or an instance of a concrete class. Implementation ============== The implementation of this PEP is covered in `issue #14673`_. References ========== .. [1] http://mail.python.org/pipermail/python-dev/2009-October/092893.html .. [2] http://mail.python.org/pipermail/python-ideas/2012-April/014878.html .. [3] http://hg.python.org/cpython/file/2f563908ebc5/Lib/platform.py#l1247 .. [4] http://hg.python.org/cpython/file/2f563908ebc5/Python/import.c#l121 .. [5] Examples of implementation-specific handling in test.support: | http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l509 | http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1246 | http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1252 | http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1275 .. _issue #14673: http://bugs.python.org/issue14673 .. _Lib/test/support.py: http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py Copyright ========= This document has been placed in the public domain. From pyideas at rebertia.com Sat Apr 28 08:22:25 2012 From: pyideas at rebertia.com (Chris Rebert) Date: Fri, 27 Apr 2012 23:22:25 -0700 Subject: [Python-ideas] PEP 4XX: Adding sys.implementation In-Reply-To: References: Message-ID: On Fri, Apr 27, 2012 at 11:06 PM, Eric Snow wrote: > Proposal > ======== > > We will add ``sys.implementation``, in the ``sys`` module, as a namespace > to contain implementation-specific information. > > The contents of this namespace will remain fixed during interpreter > execution and through the course of an implementation version. This > ensures behaviors don't change between versions which depend on variables > in ``sys.implementation``. > > ``sys.implementation`` will be a dictionary, as opposed to any form of > "named" tuple (a la ``sys.version_info``).
This is partly because it > doesn't have meaning as a sequence, and partly because it's a potentially > more variable data structure. > Open Issues > =========== > > * What are the long-term objectives for ``sys.implementation``? > > - possibly pull in implementation details from the main ``sys`` namespace > and elsewhere (PEP 3137 lite). > > * Alternatives to the approach dictated by this PEP? > > * ``sys.implementation`` as a proper namespace rather than a dict. It > would be its own module or an instance of a concrete class. So, what's the justification for it being a dict rather than an object with attributes? The PEP merely (sensibly) concludes that it cannot be considered a sequence. Relatedly, I find the PEP's use of the term "namespace" in reference to a dict to be somewhat confusing. Cheers, Chris From victor.stinner at gmail.com Sun Apr 29 03:39:53 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 29 Apr 2012 03:39:53 +0200 Subject: [Python-ideas] PEP 4XX: Adding sys.implementation In-Reply-To: References: Message-ID: > I've written up a PEP for the sys.implementation idea. Feedback is welcome! Cool, it's better with a PEP! Even if the change looks trivial. > name > the name of the implementation (case sensitive). It would help if the PEP (and the documentation of sys.implementation) lists at least the most common names. I suppose that we would have something like: "CPython", "PyPy", "Jython", "IronPython". > version > the version of the implementation, as opposed to the version of the > language it implements. This would use a standard format, similar to > ``sys.version_info`` (see `Version Format`_). Dummy question: what is sys.version/sys.version_info? The version of the implementation or the version of the Python language? The PEP should explain that, and maybe also the documentation of sys.implementation.version (something like "use sys.version_info to get the version of the Python language").
> cache_tag

Why not add this information to the imp module?

Victor

From mwm at mired.org Mon Apr 30 19:33:38 2012
From: mwm at mired.org (Mike Meyer)
Date: Mon, 30 Apr 2012 13:33:38 -0400
Subject: [Python-ideas] argparse FileType v.s default arguments...
Message-ID: <20120430133338.33b2f75d@bhuda.mired.org>

While I really like the argparse module, I've run into a case I think it ought to handle that it doesn't.

So I'm asking here to see if 1) I've overlooked something and it can do this, or 2) there's a good reason for it not to do this, or maybe 3) this is a bad idea.

The usage I ran into looks like this:

    parser.add_argument('configfile', default='/my/default/config',
                        type=FileType('r'), nargs='?')

If I provide the argument, everything works fine, and it opens the named file for me. If I don't, the parsed configfile attribute is set to the string, which doesn't work very well when I try to use its read method. Unfortunately, setting default to open('/my/default/config') has the side effect of opening the file, or raising an exception if the file doesn't exist (which is a common reason for wanting to provide an alternative!).

Could default handling be made smarter, so that if 1) type is set and 2) the value of default is a string, the value of default is passed to type? Or maybe a flag to make that happen, or even a default_factory argument (incompatible with default) that would accept something like default_factory=lambda: open('/my/default/config')?

Thanks,

http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

From greg at krypto.org Mon Apr 30 21:59:09 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Mon, 30 Apr 2012 12:59:09 -0700
Subject: [Python-ideas] argparse FileType v.s default arguments...
In-Reply-To: <20120430133338.33b2f75d@bhuda.mired.org>
References: <20120430133338.33b2f75d@bhuda.mired.org>
Message-ID: 

On Mon, Apr 30, 2012 at 10:33 AM, Mike Meyer wrote:
> While I really like the argparse module, I've run into a case I think
> it ought to handle that it doesn't.
>
> So I'm asking here to see if 1) I've overlooked something and it can
> do this, or 2) there's a good reason for it not to do this, or maybe 3)
> this is a bad idea.
>
> The usage I ran into looks like this:
>
>     parser.add_argument('configfile', default='/my/default/config',
>                         type=FileType('r'), nargs='?')
>
> If I provide the argument, everything works fine, and it opens the
> named file for me. If I don't, the parsed configfile attribute is set
> to the string, which doesn't work very well when I try to use its read
> method. Unfortunately, setting default to open('/my/default/config')
> has the side effect of opening the file, or raising an exception if the
> file doesn't exist (which is a common reason for wanting to provide an
> alternative!).
>
> Could default handling be made smarter, so that if 1) type is set
> and 2) the value of default is a string, the value of default is passed
> to type? Or maybe a flag to make that happen, or even a
> default_factory argument (incompatible with default) that would accept
> something like default_factory=lambda: open('/my/default/config')?
>
> Thanks,

This makes sense to me as described. I suggest going ahead and filing an issue on bugs.python.org with the above.

-gps (who hasn't actually used argparse enough)

From barry at python.org Mon Apr 30 23:04:54 2012
From: barry at python.org (Barry Warsaw)
Date: Mon, 30 Apr 2012 17:04:54 -0400
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
References: 
Message-ID: <20120430170454.08d73f74@resist.wooz.org>

On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:

>I've written up a PEP for the sys.implementation idea.
>Feedback is welcome!

Thanks for working on this PEP, Eric!

>``sys.implementation`` is a dictionary, as opposed to any form of "named"
>tuple (a la ``sys.version_info``). This is partly because it doesn't
>have meaning as a sequence, and partly because it's a potentially more
>variable data structure.

I agree that sequence semantics are meaningless here. Presumably, a dictionary is proposed because this

    cache_tag = sys.implementation.get('cache_tag')

is nicer than

    cache_tag = getattr(sys.implementation, 'cache_tag', None)

OTOH, maybe we need a nameddict type!

>repository
>  the implementation's repository URL.

What does this mean? Oh, I think you mean the URL for the VCS used to develop this version of the implementation. Maybe vcs_url (and even then there could be alternative blessed mirrors in other VCSs). A Debian analog is the Vcs-* header set (e.g. Vcs-Git, Vcs-Bzr, etc.).

>repository_revision
>  the revision identifier for the implementation.

I'm not sure what this is. Is it like the hexgoo you see in the banner of a from-source build that identifies the revision used to build this interpreter? Is this key a replacement for that?

>build_toolchain
>  identifies the tools used to build the interpreter.

As a tuple of free-form strings?

>url (or website)
>  the URL of the implementation's site.

Maybe 'homepage' (another Debian analog).

>site_prefix
>  the preferred site prefix for this implementation.
>
>runtime
>  the run-time environment in which the interpreter is running.

I'm not sure what this means either. ;)

>gc_type
>  the type of garbage collection used.

Another free-form string? What would the values be, say, for CPython and Jython?

>Version Format
>--------------
>
>XXX same as sys.version_info?

Why not? :) It might also be useful to have something similar to sys.hexversion, which I often find convenient.

>* What are the long-term objectives for sys.implementation?
>
>  - pull in implementation detail from the main sys namespace and
>    elsewhere (PEP 3137 lite).
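Barry's "nameddict" aside above could be sketched roughly as follows. This is a hypothetical helper, not an existing stdlib type: a dict whose keys are also reachable as attributes, so both access styles he compares work.

```python
class NamedDict(dict):
    """Hypothetical 'nameddict': dict whose keys double as attributes."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails; fall back to keys.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

impl = NamedDict(name='CPython', cache_tag='cpython-33')

# Both spellings from the message above now work:
assert impl.get('cache_tag') == 'cpython-33'
assert getattr(impl, 'cache_tag', None) == 'cpython-33'
assert getattr(impl, 'missing', None) is None
```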
That's where this seems to be leaning. Even if it's a good idea, I bet it will be a long time before the old sys names can be removed.

>* Alternatives to the approach dictated by this PEP?
>
>* ``sys.implementation`` as a proper namespace rather than a dict. It
>  would be its own module or an instance of a concrete class.

Which might make sense, as would perhaps a top-level `implementation` module. IOW, why situate it in sys?

>The implementatation of this PEP is covered in `issue 14673`_.

s/implementatation/implementation

Nicely done! Let's see how those placeholders shake out.

Cheers,
-Barry

From greg at krypto.org Mon Apr 30 23:21:08 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Mon, 30 Apr 2012 14:21:08 -0700
Subject: [Python-ideas] Module __getattr__ [Was: breaking out of module execution]
In-Reply-To: 
References: 
Message-ID: 

On Fri, Apr 27, 2012 at 6:31 AM, Jim Jewett wrote:
> On Wed, Apr 25, 2012 at 2:35 PM, Matt Joiner wrote:
> > If this is to be done I'd like to see all special methods supported.
> > One of particular interest to modules is __getattr__...
>
> For What It's Worth, supporting __setattr__ and __getattr__ is one of
> the few reasons that I have considered subclassing modules.
>
> The workarounds of either offering public set_varX and get_varX
> functions, or moving configuration to a separate singleton, just feel
> kludgy.
>
> Since those module methods would be defined at the far left, I don't
> think it would mess up understanding any more than they already do on
> regular classes. (There is always *some* surprise, just because they
> are unusual.)
>
> That said, I personally tend to view modules as a special case of
> classes, so I wouldn't be shocked if others found it more confusing
> than I would -- particularly as to whether or not the module's
> __getattr__ would somehow affect the lookup chain for classes defined
> within the module.
>
> -jJ

Making modules "simply" be a class that could be subclassed, rather than their own thing, _would_ be nice for one particular project I've worked on, where the project, including APIs and basic implementations, was open source but allowed site-specific code to override many/most of those base implementations as a way of customizing it for your own specific (non-open-source) environment. Any API that was unfortunately defined as a module with a bunch of functions in it was a real pain to make site-specific overrides for. Any API that was thankfully defined as a class within a module, even when there wasn't a real need for a class, was much easier to make site-specific.

But this is not an easy thing to do. I wouldn't want functions in a module to need to be declared as class methods with a cls parameter, nor would I want an implicit named equivalent of cls; or does that already exist through an existing __name__ style variable today that I've been ignoring? This could lead to basically treating a module's globals() dict as the class dict, which at first glance seems surprising, but I'd have to ponder this more. (And yes, note that I am thinking of a module as a class, not an instance.)

-gps
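The site-specific overriding Gregory describes can be approximated today, without language changes, by subclassing types.ModuleType and swapping the instance into sys.modules, a known (if surprising-looking) trick. The sketch below uses invented names ('siteconfig', _ConfigModule, override) purely for illustration; it is not anyone's actual API.

```python
import sys
import types


class _ConfigModule(types.ModuleType):
    """A module subclass whose attribute lookups can be intercepted."""

    _overrides = {}  # site-specific replacements (hypothetical)

    def __getattr__(self, name):
        # Called only when the attribute is not found normally.
        try:
            return self._overrides[name]
        except KeyError:
            raise AttributeError(name)

    def override(self, name, value):
        self._overrides[name] = value


# Install the instance under a module name. Normally this would be done
# at the bottom of the module being customized; it is shown inline here.
mod = _ConfigModule('siteconfig')
sys.modules['siteconfig'] = mod
mod.override('timeout', 30)

import siteconfig  # resolved from sys.modules, so it finds our instance
print(siteconfig.timeout)  # 30
```

The same mechanism lets a deployment shadow default functions with site-specific ones, which is roughly what a module-level __getattr__ would make first-class.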